I am currently building a flow for a binary classification model. After training on my training data, I want to compare the results of the top three models on my test dataset (accuracy, precision, recall, etc.). I know how to do this on the train dataset, but I am unsure how to do it on the test data and compare the results via model comparisons.
Also, after running my flow on the entire set of features, is there a way to only select the top 5 features to run a new model on?
To evaluate on the test dataset, you would need to perform the split using the Split recipe and then use explicit extracts for your train and test sets.
You can set this under Visual Analysis > select the model > Design > Train/Test Set by choosing "Explicit extracts from two datasets".
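If it helps to see the idea outside the visual interface, here is a minimal sketch of the same workflow in plain scikit-learn (this is not the Dataiku API). It trains three candidate models on an explicit train set and scores each one on a separate test set. The synthetic data, the choice of three models, and the helper name `evaluate_on_test` are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data (a stand-in for the real dataset).
X, y = make_classification(
    n_samples=500, n_features=20, n_informative=5, random_state=0
)

# Explicit train/test split, playing the role of the two separate extracts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Three hypothetical candidate models to compare.
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}


def evaluate_on_test(models, X_train, y_train, X_test, y_test):
    """Fit each model on the train set and score it on the held-out test set."""
    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        predictions = model.predict(X_test)
        results[name] = {
            "accuracy": accuracy_score(y_test, predictions),
            "precision": precision_score(y_test, predictions, zero_division=0),
            "recall": recall_score(y_test, predictions, zero_division=0),
        }
    return results


test_metrics = evaluate_on_test(models, X_train, y_train, X_test, y_test)
for name, metrics in test_metrics.items():
    print(name, {k: round(v, 3) for k, v in metrics.items()})
```

The key point is that every metric is computed on rows the models never saw during training, which is what the explicit-extracts setting gives you.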
To reduce the number of features you can have a look at: https://doc.dataiku.com/dss/latest/machine-learning/supervised/settings.html#settings-feature-reduct...
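As a rough illustration of the "top 5 features" idea, here is a minimal scikit-learn sketch (not the Dataiku feature reduction setting itself). It ranks features by random forest importance, keeps the five highest, and retrains on those only. The synthetic data and the use of tree-based importance as the ranking criterion are assumptions; other criteria may suit your data better.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic binary classification data (a stand-in for the real dataset).
X, y = make_classification(
    n_samples=500, n_features=20, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Step 1: train on all features and rank them by importance.
full_model = RandomForestClassifier(n_estimators=100, random_state=0)
full_model.fit(X_train, y_train)

# Column indices of the five most important features, highest first.
top5 = np.argsort(full_model.feature_importances_)[::-1][:5]

# Step 2: retrain a new model using only those five columns.
reduced_model = RandomForestClassifier(n_estimators=100, random_state=0)
reduced_model.fit(X_train[:, top5], y_train)

# Step 3: score the reduced model on the held-out test set.
reduced_accuracy = accuracy_score(y_test, reduced_model.predict(X_test[:, top5]))
print("Top 5 feature indices:", top5.tolist())
print("Test accuracy with 5 features:", round(reduced_accuracy, 3))
```

Note that the importances are computed on the train set only, so the test set stays untouched until the final score.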
Let me know if that helps.