Partition - Discrete dimension example (along a column)

Solved!
n0thing233
Level 3

I have a file-based dataset (CSV format). I want to partition this dataset based on the value of a column; the column has 5 values (0, 1, 2, 3, 4).



I followed the tutorial but cannot partition it.



My column name is 'partition'. I clicked on "add discrete dimension", then filled "partition" into the box, and it generated the pattern "%{partition}/.*"



But then I clicked the "list partitions" button, and it shows me:

"Detected 0 partitions

  Found 1 unmatched file:

  • /out-s0.csv"



Can anyone help me?

1 Solution
Mattsco
Dataiker

 



Hi, 



Your dataset is not yet partitioned. You need to rebuild it to see the generated partitions.
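
As a rough illustration of why "list partitions" reports 0 partitions here, the sketch below mimics how a pattern such as "%{partition}/.*" can be read: the placeholder stands for one folder name, so a file sitting directly at the root (like /out-s0.csv) has no folder to take a partition value from. This is not Dataiku's actual matching code; the helper names pattern_to_regex and detect_partition are hypothetical and only serve the example.

```python
import re


def pattern_to_regex(pattern):
    """Turn a pattern such as '%{partition}/.*' into a compiled regex.

    Assumption for this sketch: each %{name} placeholder stands for exactly
    one path segment, and the rest of the pattern is ordinary regex text.
    """
    # Replace each %{name} with a named group that cannot span a '/'.
    body = re.sub(r"%\{(\w+)\}", r"(?P<\1>[^/]+)", pattern)
    # Allow an optional leading '/' because file paths are listed that way.
    return re.compile("^/?" + body + "$")


def detect_partition(pattern, path):
    """Return {dimension: value} if the path matches the pattern, else None."""
    match = pattern_to_regex(pattern).match(path)
    return match.groupdict() if match else None


if __name__ == "__main__":
    pattern = "%{partition}/.*"
    # The file from the question sits at the root, so nothing matches.
    print(detect_partition(pattern, "/out-s0.csv"))
    # A file stored under a folder named after the column value does match.
    print(detect_partition(pattern, "/0/out-s0.csv"))
```

Under these assumptions, the files only become matchable once they are laid out as one folder per partition value, which is what rebuilding the dataset with redispatching produces.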

To generate this partitioned dataset, the parent recipe should be a sync recipe (Configuration tab) or a prepare recipe (Advanced tab) with the "redispatch partitioning according to input columns" option activated.





[Screenshot: Redispatch setting]
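
Conceptually, redispatching reads the unpartitioned input and writes each row to the partition named by the value in the chosen column. The sketch below imitates that idea outside Dataiku with the Python standard library. It is an assumption-laden illustration, not the recipe's implementation: the function name redispatch_csv, the output file name out-s0.csv and the one-folder-per-value layout are chosen only to line up with the "%{partition}/.*" pattern from the question.

```python
import csv
import os
from collections import defaultdict


def redispatch_csv(src_path, out_dir, column="partition"):
    """Split one CSV into <out_dir>/<value>/out-s0.csv, one folder per value.

    Hypothetical helper for illustration only; returns the written paths.
    """
    # Group the input rows by the value found in the partitioning column.
    rows_by_value = defaultdict(list)
    with open(src_path, newline="") as src:
        reader = csv.DictReader(src)
        fieldnames = reader.fieldnames
        for row in reader:
            rows_by_value[row[column]].append(row)

    written = []
    for value, rows in sorted(rows_by_value.items()):
        # One folder per distinct value, e.g. out_dir/0, out_dir/1, ...
        part_dir = os.path.join(out_dir, value)
        os.makedirs(part_dir, exist_ok=True)
        part_path = os.path.join(part_dir, "out-s0.csv")
        with open(part_path, "w", newline="") as dst:
            writer = csv.DictWriter(dst, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(rows)
        written.append(part_path)
    return written
```

With a column holding the values 0 to 4, this would produce five folders, each containing the rows for one value, and a pattern like "%{partition}/.*" would then have one folder per partition to match.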



 



 



Matt



 



 

Mattsco

9 Replies
n0thing233
Level 3
Author
Sorry, I don't see "redispatch partitioning according to input columns" in my Advanced tab. What could be the reason? My settings say "No settings required".
Mattsco
Dataiker
To see it, you need to have the output dataset of the recipe partitioned.
I added a picture on the main answer.
Mattsco
n0thing233
Level 3
Author
Sorry, I might be asking some stupid questions, but the picture you showed me is different from my Dataiku.
In my sync recipe -> Advanced -> Settings there is no checkbox for "redispatch partitioning according to input columns". I really want to attach my screenshot here. How should I do that?
Mattsco
Dataiker
Sorry, in the sync recipe it's in the Configuration tab.
Mattsco
n0thing233
Level 3
Author
In the sync -> Configuration tab -> Settings, I have only two options: "Free output schema (name-based matching)" and "Maintain strict schema equality". There is still no "redispatch partitioning according to input columns".
Mattsco
Dataiker
In the sync -> Configuration tab -> Output, can you confirm the dataset is partitioned by something?
You should see that mentioned below the name of the dataset.
Mattsco
n0thing233
Level 3
Author
Thanks. I finally made some progress. I'm able to do the partition.
Valengo
Level 1

Hello @n0thing233, I'm in the same situation as you, and I'm really interested in how you did the partition in the sync recipe. Could you please explain the steps to follow?

Thank you very much in advance.
