I am using the community version. I have a dataset with 200k rows. I split it and performed feature engineering separately on the train and test sets, to prevent data from leaking between them. Is there a way to run training with two separate datasets, one for training and the other for testing? I did see an option for this, but it still asks for a split ratio. My goal is to use 100% of the train dataset for training and 100% of the test dataset for testing. They are in separate folders.
Operating system used: macOS
Hi,
You should be able to use the option "Explicit extracts from two datasets" to achieve what you are looking for.
You can change this under Model > Design > Train / test set.
Let me know if that helps!
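For anyone wondering why the two sets are kept separate in the first place, here is a minimal sketch in plain Python, outside of the tool. It assumes a single numeric feature and a simple standardization step; the function names and the sample values are made up for illustration. The point is that the statistics are learned from the train data only and then reused unchanged on the test data, so nothing about the test set leaks into training.

```python
from statistics import mean, pstdev


def fit_scaler(train_values):
    """Learn the mean and standard deviation from the training values only."""
    mu = mean(train_values)
    # Fall back to 1.0 so a constant column does not cause a division by zero.
    sigma = pstdev(train_values) or 1.0
    return mu, sigma


def apply_scaler(values, mu, sigma):
    """Standardize values with statistics that were learned elsewhere."""
    return [(v - mu) / sigma for v in values]


# Hypothetical feature columns from two separate datasets.
train_feature = [10.0, 12.0, 14.0, 16.0]
test_feature = [11.0, 20.0]

# Fit on 100% of the train data, then apply the same parameters to the test data.
mu, sigma = fit_scaler(train_feature)
train_scaled = apply_scaler(train_feature, mu, sigma)
test_scaled = apply_scaler(test_feature, mu, sigma)
```

If the scaler were fitted on train and test together, the mean and standard deviation would already contain information about the test rows, which is the kind of leakage the question is trying to avoid.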
Thanks for your help. On a separate note, is there a way to handle the train/test split depicted in the attached photo?
attached picture?
Are you asking about cross-validation (CV)? 5-fold CV is "on" by default and can be changed in the Hyperparameters section under MODELING.
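In case k-fold cross-validation is indeed what the picture shows, here is a rough sketch of how the folds are formed, in plain Python. This is not how the tool implements it; the function name `k_fold_indices` is made up for illustration, and the rows are split in order without shuffling to keep the example short.

```python
def k_fold_indices(n_rows, k=5):
    """Yield (train_idx, test_idx) pairs so each row is tested exactly once."""
    if not 2 <= k <= n_rows:
        raise ValueError("k must be between 2 and the number of rows")
    # Spread the remainder over the first folds so sizes differ by at most one.
    base, extra = divmod(n_rows, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_rows))
        yield train_idx, test_idx
        start += size


# Example: 10 rows split into 5 folds of 2 test rows each.
folds = list(k_fold_indices(10, k=5))
for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)
```

Each fold trains on the rows outside the current test block and evaluates on the block itself, so every row is used for testing once and for training in the other folds.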