Retrain only the best model from a visual analysis

Partner, L2 Designer, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Registered Posts: 8 Partner

I'm using a visual analysis to train two machine learning algorithms (Random Forest and XGBoost), with a 20-iteration grid search for each. XGBoost turns out to perform better, so I deploy it to the Flow. So far so good.

If I retrain the model by clicking the "Retrain" button on the diamond-shaped saved model, DSS takes a long time, which seems strange for a single model. Judging from the logs and the new model version, DSS is running a 20-iteration grid search on the XGBoost algorithm again, whereas I would like it to retrain only the model with the best hyperparameters identified in the visual analysis (they are also available in the Model Information > Algorithm menu).

What is the proper way to do that? One idea is to run a new session inside the visual analysis, selecting only the XGBoost algorithm and manually entering the best set of hyperparameters, but maybe there's a more efficient way.

Best Answer

  • Dataiker Posts: 37 Dataiker
    Answer ✓

    Hello,

    When you deploy a model from the Lab to the Flow, there is an advanced setting that lets you retrain with only the best set of hyperparameters.

    Go back to the Lab, click on your model > Deploy, select whether you want to deploy it to an existing saved model or create a new one, then click on "Advanced" at the bottom left of the modal.

    Then, for "Model parameters", select "Use already detected parameters".

    Then click on "Create". This should create a train recipe with the behaviour you are looking for.

    Hope this helps,

    Best regards
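
To illustrate why the two retraining behaviours differ so much in cost, here is a minimal sketch in plain Python. It is not DSS code and does not reflect how DSS is implemented: `ToyTrainer`, `retrain_with_search`, `retrain_with_known_params`, the one-dimensional ridge model, and the synthetic data are all hypothetical stand-ins chosen for the example. It only counts how many model fits each strategy needs.

```python
import random


class ToyTrainer:
    """Hypothetical stand-in for a training engine; counts the fits it runs."""

    def __init__(self):
        self.fit_count = 0

    def fit(self, rows, reg):
        # One-dimensional ridge regression without intercept:
        # weight = sum(x*y) / (sum(x*x) + reg)
        self.fit_count += 1
        sxy = sum(x * y for x, y in rows)
        sxx = sum(x * x for x, _ in rows)
        return sxy / (sxx + reg)


def mse(weight, rows):
    # Mean squared error of the prediction weight * x
    return sum((y - weight * x) ** 2 for x, y in rows) / len(rows)


def retrain_with_search(trainer, train, valid, grid):
    # Re-run the whole search: one fit per candidate, then a final fit
    best_reg = min(grid, key=lambda reg: mse(trainer.fit(train, reg), valid))
    return best_reg, trainer.fit(train + valid, best_reg)


def retrain_with_known_params(trainer, train, valid, best_reg):
    # Reuse the hyperparameter found earlier: a single fit
    return trainer.fit(train + valid, best_reg)


# Synthetic data (assumption for the example): y = 3x + noise
rng = random.Random(0)
xs = [rng.uniform(-1, 1) for _ in range(200)]
rows = [(x, 3.0 * x + rng.gauss(0, 0.5)) for x in xs]
train, valid = rows[:150], rows[150:]
grid = [0.1 * i for i in range(20)]  # 20 candidates, like the 20 iterations

search_trainer = ToyTrainer()
best_reg, weight_search = retrain_with_search(search_trainer, train, valid, grid)

fixed_trainer = ToyTrainer()
weight_fixed = retrain_with_known_params(fixed_trainer, train, valid, best_reg)

print("fits with search:", search_trainer.fit_count)
print("fits with known parameters:", fixed_trainer.fit_count)
```

In this toy setup both strategies end with the same model, but the search variant pays for every candidate in the grid on each retrain, which matches the slowdown described in the question.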
