Hi all!
I need to split my dataset into a training set and a testing set. I would like the class proportions of my original dataset to be preserved in the training set. For example, if the original dataset has 55% women and 45% men, the training set should have the same proportions, and the same should hold for several other classes.
Which type of splitting should I use to ensure this? Is it the default behavior with a fully random split, or should I add some filters?
Many thanks!
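Hi,
Random sampling is fine in this case: with a reasonably large dataset, a fully random split keeps the class proportions approximately the same as in the original dataset, so no extra filters are needed.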
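If you want to check this on your own data, here is a minimal sketch in plain Python. It is not tied to any particular tool: the function names (`random_split`, `stratified_split`, `proportions`), the 80/20 ratio, and the 10,000-row example are all assumptions made for illustration. It compares a plain random split with a stratified split, which splits each class separately and is an option if the proportions need to match more exactly than a random split gives you.

```python
import random
from collections import Counter, defaultdict


def random_split(rows, train_fraction=0.8, seed=0):
    """Shuffle all rows together, then cut them into train and test lists."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def stratified_split(rows, key, train_fraction=0.8, seed=0):
    """Split each class on its own so the train set keeps the class mix."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row)
    train, test = [], []
    for members in groups.values():
        rng.shuffle(members)
        cut = round(len(members) * train_fraction)
        train.extend(members[:cut])
        test.extend(members[cut:])
    return train, test


def proportions(rows, key):
    """Return the share of each class label among the given rows."""
    counts = Counter(key(row) for row in rows)
    total = len(rows)
    return {label: count / total for label, count in counts.items()}


def gender(row):
    """Class label of a row; rows here are (id, gender) tuples."""
    return row[1]


# Hypothetical dataset: 55% women ("F") and 45% men ("M").
data = [(i, "F") for i in range(5500)] + [(i, "M") for i in range(5500, 10000)]

random_train, random_test = random_split(data)
strat_train, strat_test = stratified_split(data, key=gender)

print("original:  ", proportions(data, gender))
print("random:    ", proportions(random_train, gender))
print("stratified:", proportions(strat_train, gender))
```

With a dataset of this size the random split should land very close to 55/45, while the stratified split reproduces it up to rounding. To preserve several classes at once, `key` can return a tuple of columns (for example gender and age group), although very small combined groups will not split cleanly.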