
Split - Keep same proportions


Hi all!

I need to split my dataset into a training set and a test set. I would like the class proportions of the original dataset to be preserved in the training set. For example, if the original dataset is 55% women and 45% men, the training set should have the same proportions, and the same should hold for several other classes.

Which type of split should I use to ensure this? Does a fully random split do it by default, or do I need to add filters?

Many thanks!



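Random sampling is fine in this case. On a reasonably large dataset, a random split keeps the class proportions approximately the same in the training and test sets, so no extra filters are needed. The match is only approximate, and it gets looser for small datasets or rare classes.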
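
If the proportions need to match as closely as possible rather than only on average, the usual technique is a stratified split: split each class separately, then combine the pieces. Below is a minimal sketch in plain Python, not Dataiku's own API. The function name `stratified_split`, the `key` argument, and the toy `gender` field are all hypothetical and only serve to illustrate the idea.

```python
import random
from collections import defaultdict


def stratified_split(rows, key, train_ratio=0.8, seed=0):
    """Split rows into (train, test), splitting each class separately.

    `key` is a function returning the class of a row (hypothetical helper,
    not part of any library).
    """
    rng = random.Random(seed)

    # Group the rows by class.
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row)

    train, test = [], []
    for members in groups.values():
        # Shuffle within the class, then cut at the requested ratio.
        rng.shuffle(members)
        cut = round(len(members) * train_ratio)
        train.extend(members[:cut])
        test.extend(members[cut:])
    return train, test


# Toy data (made up for illustration): 55 women and 45 men.
rows = [{"id": i, "gender": "F" if i < 55 else "M"} for i in range(100)]

train, test = stratified_split(rows, key=lambda r: r["gender"])
share_f = sum(r["gender"] == "F" for r in train) / len(train)
print(len(train), len(test), share_f)
```

To keep the proportions of several columns at once, `key` can return a tuple of values, for example `lambda r: (r["gender"], r["region"])` with a hypothetical `region` field. Very small groups will then be split less evenly because of rounding.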