XGB Model Training over x+hours

ntibdiwal
Level 1

Hi Team,

I have a tabular dataset of 300K rows by 15 columns (numerical data). I am using a visual recipe to train an XGB model for prediction. I have selected a range of hyperparameters for tuning (no early stopping).

The strategy is Random Search with max iterations set to 0 and a max search time of 30 minutes.

I expected the model to train to the best parameters it could find within 30 minutes, but it's still training... (I can view the logs and the changes in RMSE.)

There is no option to interrupt, only to abort, which I believe will kill the training and not produce anything.

Since it's not mission-critical, I haven't killed it yet. Any advice?

(No GPU, 16GB RAM)

Thanks,

Nikita

2 Replies
tgb417
Neuron

@ntibdiwal ,

Several algorithm settings will cause many models to be built, and the more hyperparameter values you search, the more models will be built. In the example below, the hyperparameter tuning that I chose caused 6 different models to be built.

[Image: part of the Model Results for XGBoost. In this case, 6 hyperparameter combinations were searched. A speech bubble points to the 6 hyperparameter search results and says: "If each one of these takes 30 minutes, we are talking about more than 3 hours for a model run."]

If each of the 6 models takes 30 minutes, the hyperparameter search above would take at least 3 hours to complete.
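To make that arithmetic concrete, here is a rough back-of-the-envelope sketch. The function name and the `n_folds` multiplier are my own illustrative assumptions (this is not how DSS reports or schedules anything); the 6 models and 30 minutes are the figures from the example above.

```python
def estimated_search_minutes(n_combinations, minutes_per_model, n_folds=1):
    """Rough lower bound on total search time, in minutes.

    n_folds is a hypothetical multiplier for cross-validation: with k folds,
    each combination is typically trained k times. Overhead such as a final
    retrain is not counted.
    """
    return n_combinations * minutes_per_model * n_folds


total_minutes = estimated_search_minutes(n_combinations=6, minutes_per_model=30)
print(f"{total_minutes} minutes = {total_minutes / 60:.1f} hours")
```

Anything on top of the per-model training time only pushes the total higher, which is why a run like this ends up taking more than 3 hours.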

I tend to use early stopping when building preliminary exploratory models. Only when I'm building a final production model might I try turning off early stopping.
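For anyone unfamiliar with the idea, early stopping just means: stop adding boosting rounds once the validation metric has not improved for a set number of rounds. The toy sketch below only illustrates that rule on a made-up list of validation RMSE values. The function name, the `patience` parameter and the numbers are all assumptions for illustration, not DSS or XGBoost internals.

```python
def early_stopping_round(val_rmse, patience):
    """Return (best_round, stopped_at) for a sequence of validation RMSE values.

    Training stops at the first round where the best score is `patience`
    rounds old; if that never happens, it runs through the last round.
    """
    best_round = 0
    best_score = float("inf")
    for i, score in enumerate(val_rmse):
        if score < best_score:
            # New best validation score: remember where it happened.
            best_score = score
            best_round = i
        elif i - best_round >= patience:
            # No improvement for `patience` rounds: stop here.
            return best_round, i
    return best_round, len(val_rmse) - 1


# Made-up RMSE history: improves for three rounds, then stalls.
history = [1.00, 0.80, 0.70, 0.71, 0.72, 0.73, 0.69]
best, stopped = early_stopping_round(history, patience=3)
print(f"best round: {best}, stopped at round: {stopped}")
```

Note the trade-off in this made-up history: the run stops before the slightly better score at the last round is ever seen. That is why it can make sense to use early stopping for exploration and reconsider it for a final model.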

In recent versions of DSS, some model types can be suspended, and there are some that you can only abort.

Advice: check the count of models you are building. It's easy to turn on all parameter searching and end up with hundreds of potential models to build. Start with default parameters, focus on your features and data enrichment, then spend the time on parameter searching.
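A quick way to sanity-check that count before launching a run is to multiply the number of values you selected for each hyperparameter. The grid below is purely hypothetical (the parameter names are common XGBoost ones, but the values are invented for illustration), so substitute whatever you ticked in the design screen.

```python
from math import prod

# Hypothetical search space: each key is a hyperparameter, each list holds
# the candidate values selected for it.
search_space = {
    "max_depth": [3, 5, 7],
    "learning_rate": [0.01, 0.1, 0.3],
    "n_estimators": [100, 300, 500],
    "subsample": [0.7, 1.0],
    "colsample_bytree": [0.7, 1.0],
}


def count_combinations(space):
    """Number of distinct hyperparameter combinations in a discrete grid."""
    return prod(len(values) for values in space.values())


n_models = count_combinations(search_space)
print(f"{n_models} candidate models")
```

Five modest-looking lists already give over a hundred candidates. A random search samples from that space rather than enumerating all of it, so the product is an upper bound on distinct combinations, but it shows how quickly the space grows.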

Hope that helps.

--Tom
ntibdiwal
Level 1
Author

Hi Tom @tgb417 

Thank you for your prompt reply. I somehow missed acknowledging this. 
Your inputs were helpful.

 

Regards,

Nikita
