May 21, 2020
When training a model using the Visual ML interface, have you ever noticed that the reported value of your optimization metric for a given algorithm does not always exactly match the final value in the line chart, or that the algorithm that visually performs the best in the chart is not necessarily the one that is reported as the model champion for that session?
For example, in the image above, the Random Forest algorithm shows an AUC of 0.780, beating the Logistic Regression’s AUC of 0.753. You can see this score of 0.780 in three places (highlighted with pink boxes), yet if you hover over the individual data points in the line chart, the AUC is reported as 0.793 (gold box). Why the difference?
This is because the line chart plots the cross-validation score of each individual experiment (each one executed with a specific set of hyperparameters), computed on the cross-validation set, which is a subset of the train data.
Once grid-search is complete and DSS finds the optimal set of hyperparameters, it retrains the model on the whole train set and scores the holdout sample to produce the final results for that session (pink boxes).
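Therefore, the values in the line chart cannot be directly compared to the test scores that you see on the other parts of the page, because they are not computed on the same data (the cross-validation set versus the holdout sample) and not with the same model (the final model is retrained on the whole train set).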
So while it is likely that the algorithm that looks best on the chart will also perform best on the holdout data and that the metric values will match, it is not always the case.
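To make the distinction concrete, here is a minimal sketch in plain Python of the two kinds of score. It is not DSS code: the synthetic data, the hand-written k-nearest-neighbors classifier, the hyperparameter grid, and the use of accuracy in place of AUC are all assumptions chosen to keep the example short and self-contained.

```python
import random

def make_data(n, rng):
    """Two noisy, overlapping 1-D classes (synthetic data, not from DSS)."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        x = rng.gauss(1.0 if label else -1.0, 1.5)
        data.append((x, label))
    return data

def knn_predict(train, x, k):
    """Majority vote among the k nearest training points (k is odd)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > k else 0

def accuracy(train, test, k):
    """Share of test points classified correctly by a model fit on train."""
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)

def cross_val_score(train, k, n_folds=3):
    """Mean accuracy over folds; each model sees only part of the train set."""
    scores = []
    for i in range(n_folds):
        valid = train[i::n_folds]
        fit = [p for j, p in enumerate(train) if j % n_folds != i]
        scores.append(accuracy(fit, valid, k))
    return sum(scores) / n_folds

rng = random.Random(0)
data = make_data(300, rng)
train, holdout = data[:240], data[240:]

# Grid search: one cross-validation score per hyperparameter value.
# These play the role of the points plotted in the line chart.
grid = [1, 3, 5, 9, 15]
cv_scores = {k: cross_val_score(train, k) for k in grid}
best_k = max(cv_scores, key=cv_scores.get)

# Retrain on the whole train set with the best value, then score the holdout.
# This plays the role of the final metric reported for the session.
holdout_score = accuracy(train, holdout, best_k)

print("cross-validation scores:", cv_scores)
print("best k:", best_k, "cv score:", round(cv_scores[best_k], 3))
print("holdout score:", round(holdout_score, 3))
```

The best cross-validation score and the holdout score come from different models evaluated on different rows, so they will typically be close but not identical.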
We suggest you instead use this line chart to guide you in the following ways:
Visit our documentation for a complete description of the visualization of grid search results.
DSS offers many settings to tune how the search for the best hyperparameters is performed. Consult the reference documentation to learn more about advanced models optimization.