Test / Train Split

al-gharak · August 2022

I am using the community version. I have a dataset with 200k rows. I split the dataset and did the feature engineering separately on the train and test sets, to prevent data from leaking between them. Is there a way to run training with two separate datasets, one for training and the other for testing? I did see an option for this, but it still asks for a split ratio. My goal is to use 100% of the train dataset for training and 100% of the test dataset for testing. They are in separate folders.

Operating system used: macOS

Alexandru · August 2022

Hi,

You should be able to use the option "Explicit extracts from two datasets" to achieve what you are looking for.

You can change this under Model > Design > Train / test set.

Screenshot 2022-08-22 at 18.09.40.png
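For anyone who wants to see the underlying idea outside the visual interface, here is a minimal sketch in plain Python. It is not the product's API. The column values and the helper names `fit_scaler` and `apply_scaler` are made up for illustration. The point is only that any statistic used in feature engineering is learned from the train data and then reused unchanged on the test data, so nothing from the test set leaks into training.

```python
from statistics import mean, pstdev


def fit_scaler(train_column):
    """Learn the mean and standard deviation from the training values only."""
    mu = mean(train_column)
    # Fall back to 1.0 so a constant column does not cause a division by zero.
    sigma = pstdev(train_column) or 1.0
    return mu, sigma


def apply_scaler(column, mu, sigma):
    """Standardize a column using statistics that were learned elsewhere."""
    return [(x - mu) / sigma for x in column]


# Hypothetical stand-ins for one numeric column from each of the two datasets.
train_values = [1.0, 2.0, 3.0, 4.0]
test_values = [2.5, 10.0]

# Fit on 100% of the train data, then apply the same parameters to the test data.
mu, sigma = fit_scaler(train_values)
train_scaled = apply_scaler(train_values, mu, sigma)
test_scaled = apply_scaler(test_values, mu, sigma)
```

The test values are never passed to `fit_scaler`, which is the property that matters when the two sets are prepared separately.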

Let me know if that helps!

al-gharak · August 2022

Thanks for your help. On a separate note, is there a way to handle Train/Test split depicted in the attached photo?

attached picture?

Rickh008 · August 2022

Are you asking about cross-validation (CV)? 5-fold CV is "on" by default and can be changed from the Hyperparameters section under MODELING.
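In case it helps to picture what k-fold CV does, here is a rough sketch in plain Python. It is only an illustration of the general technique, not how the tool implements it, and the function name `k_fold_indices` is made up. Each row lands in the validation fold exactly once and is used for training in the other folds. In practice the rows are usually shuffled first, which this sketch skips.

```python
def k_fold_indices(n_rows, k=5):
    """Yield (train_idx, validation_idx) pairs for k contiguous folds."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_rows // k + (1 if i < n_rows % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        validation_idx = list(range(start, stop))
        train_idx = list(range(0, start)) + list(range(stop, n_rows))
        yield train_idx, validation_idx
        start = stop


# Example with a toy dataset of 10 rows and 5 folds.
folds = list(k_fold_indices(10, k=5))
for train_idx, validation_idx in folds:
    print("train:", train_idx, "validation:", validation_idx)
```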
