Is there a way to split a dataset based on a column value?

Hello all,

I have a dataset that contains many rows per event, with the event's ID stored in the "event_id" column. There are, of course, many events in the dataset.

Is there a way to split this dataset more easily than manually defining the output datasets using the split visual recipe? There are hundreds of events... (it would be a bit painful, or at least time-consuming).

I am using DSS 2.2.2

Thanks in advance!
Community Manager

Hi Alex,

There is not really a better way to do this than with the split recipe. If you want one dataset per event, you have to create each of those datasets anyway. You could perhaps create the datasets through the DSS API, but that is still not ideal.

The best option would be to change your strategy: keep a single dataset and partition it on the event_id column. To learn more, you can read "Working with partitions" and "Repartitioning a non-partitioned dataset".
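To make the idea concrete, here is a minimal plain-Python sketch of what grouping rows by event_id looks like. It is only an illustration of the concept, not the DSS API or how DSS implements partitions: the function name `split_rows_by_column` and the sample data are hypothetical, and only the "event_id" column name comes from the question.

```python
import csv
import io
from collections import defaultdict


def split_rows_by_column(rows, column):
    """Group dict rows by the value found in `column` (hypothetical helper)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[column]].append(row)
    return dict(groups)


# Hypothetical sample data standing in for the real dataset.
SAMPLE_CSV = """event_id,value
e1,10
e2,20
e1,30
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
partitions = split_rows_by_column(rows, "event_id")

# One group per distinct event_id, with no need to list the IDs by hand.
for event_id, event_rows in sorted(partitions.items()):
    print(event_id, len(event_rows))
```

The point is that the groups are derived from the data itself, which is what partitioning on event_id gives you inside a single dataset, instead of declaring hundreds of output datasets manually.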

I hope that helps,


Jeremy, Product Manager at Dataiku