XGB Model Training over x+hours
Hi Team,
I have a tabular dataset of 300K rows by 15 columns (all numerical). I am using a visual recipe to train an XGB model for prediction, and I have selected a range of hyperparameters for tuning (no early stopping).
The strategy is Random Search with max iterations set to 0 but a max search time of 30 minutes.
I expected the model to finish training with the best parameters found within 30 minutes, but it's still training... (I can view the logs and see the RMSE changing.)
There is no option to interrupt, only to abort directly, which I believe will kill the training and not produce any model.
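For context, a time-budgeted random search often looks like the sketch below (an illustrative assumption, not Dataiku's actual implementation): the elapsed-time check typically happens between candidate fits, so whichever fit is in progress when the budget expires still runs to completion, which can push the total time past the cap.

```python
# Hedged sketch of a time-budgeted random search (NOT DSS's real code):
# the budget is checked BETWEEN fits, so an in-flight fit finishes anyway.
import random
import time

def random_search(fit_one, sample_params, budget_seconds):
    start = time.monotonic()
    best_score, best_params = float("inf"), None
    while time.monotonic() - start < budget_seconds:
        params = sample_params()
        score = fit_one(params)          # this call may overshoot the budget
        if score < best_score:
            best_score, best_params = score, params
    return best_params, best_score

# Toy usage: "fitting" is simulated by a short sleep plus a random RMSE.
params, rmse = random_search(
    fit_one=lambda p: time.sleep(0.01) or random.random(),
    sample_params=lambda: {"max_depth": random.choice([3, 6, 9])},
    budget_seconds=0.1,
)
print(params, rmse)
```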
Since it's not mission-critical, I haven't killed it yet - Any advice?
(No GPU, 16GB RAM)
Thanks,
Nikita
Answers
tgb417
There are several parameters in each algorithm that will cause many models to be built, and the more hyperparameter values you search, the more models get built. For example, one hyperparameter tuning setup I chose caused 6 different models to be built.
If each of those 6 models takes 30 minutes, that hyperparameter search alone would take about 3 hours to complete.
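The wall time multiplies out quickly because each candidate is usually also fitted once per cross-validation fold. The numbers below are illustrative assumptions, not measurements:

```python
# Rough budget estimate: total time grows multiplicatively with the
# number of combinations, CV folds, and per-fit time (all assumed here).
n_combinations = 6      # e.g. 3 values of max_depth x 2 of learning_rate
n_cv_folds = 3          # k-fold cross-validation during tuning
minutes_per_fit = 10    # assumed time to train one XGBoost model

total_minutes = n_combinations * n_cv_folds * minutes_per_fit
print(total_minutes)    # 180 minutes, i.e. 3 hours
```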
I tend to use early stopping when building preliminary, exploratory models. Only when producing a final model might I try turning early stopping off.
In recent versions of DSS, some model types can be suspended, while others can only be aborted.
My advice: check the count of models you are building. It's easy to turn on every parameter search option and end up with hundreds of potential models to build. Start with the default parameters, worry about your features and data enrichment first, and only then spend time on parameter searching.
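To see how quickly a grid multiplies out, here is a small sketch (the hyperparameter names and values are illustrative, not a recommended grid): each extra value multiplies the number of candidate models, before cross-validation folds are even counted.

```python
# Sketch: counting candidate models in a hyperparameter grid.
# Every additional value multiplies the total (folds not included).
from itertools import product

grid = {
    "max_depth": [3, 6, 9],
    "learning_rate": [0.01, 0.1, 0.3],
    "subsample": [0.7, 1.0],
    "min_child_weight": [1, 5],
}
candidates = list(product(*grid.values()))
print(len(candidates))  # 3 * 3 * 2 * 2 = 36 models to evaluate
```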
Hope that helps.