Shuffling before Training

malalearning

Hi everyone, I would like to shuffle a dataset before training an ML model (random forest, LightGBM, and XGBoost), that is, randomly reorder the observations in the training dataset. Is there a way to do this in Dataiku without using code recipes? Thank you all.

JordanB
Dataiker

Hi @malalearning,

With the DSS UI, you can use a Sort recipe: generate a random column under computed columns, and then sort by that column:

[Screenshot: Screen Shot 2023-03-30 at 12.41.51 PM.png]
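
For readers who want to see what the random-column-then-sort approach does to the data, here is a minimal sketch in plain Python. It is only an illustration of the idea, not what the Sort recipe runs internally; the function name `shuffle_by_random_key`, the `seed` parameter, and the sample rows are all hypothetical.

```python
import random


def shuffle_by_random_key(rows, seed=None):
    """Return a new list of rows in random order by sorting on a random key."""
    rng = random.Random(seed)  # seed is optional; set it for a reproducible order
    # Step 1: attach a random value to each row (the "computed column").
    keyed = [(rng.random(), row) for row in rows]
    # Step 2: sort on the random value only.
    keyed.sort(key=lambda pair: pair[0])
    # Step 3: drop the helper column and keep the rows.
    return [row for _, row in keyed]


# Hypothetical training rows, initially ordered by id.
rows = [{"id": i, "label": i % 2} for i in range(10)]
shuffled = shuffle_by_random_key(rows, seed=42)
print([row["id"] for row in shuffled])
```

The input list is left untouched and every row appears exactly once in the result, which is the same guarantee you get from sorting on a random column.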

Other than that, if you use artificial neural networks, you have the option to shuffle the data between epochs.
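
As a rough picture of what shuffling between epochs means, the sketch below redraws the sample order at the start of every epoch before cutting it into mini-batches. It is a generic illustration and does not show how DSS implements this option; `epoch_batches` and its parameters are hypothetical names.

```python
import random


def epoch_batches(samples, batch_size, epochs, seed=0):
    """Yield (epoch, batch) pairs, reshuffling the sample order each epoch."""
    rng = random.Random(seed)
    for epoch in range(epochs):
        # Draw a fresh random order of indices for this epoch.
        order = list(range(len(samples)))
        rng.shuffle(order)
        # Cut the shuffled order into consecutive mini-batches.
        for start in range(0, len(order), batch_size):
            yield epoch, [samples[i] for i in order[start:start + batch_size]]


# Six hypothetical samples, batches of four, two epochs.
batches = list(epoch_batches(list(range(6)), batch_size=4, epochs=2))
for epoch, batch in batches:
    print(epoch, batch)
```

Each epoch still visits every sample exactly once; only the order, and therefore the composition of each batch, changes.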

I hope this helps!

Thanks,

Jordan

 

 
