Retrain only the best model from a visual analysis

RicSpd Partner, Registered Posts: 8

I'm using a visual analysis to train some machine learning algorithms (Random Forest and XGBoost), with a grid search of 20 iterations for each algorithm. XGBoost turns out to perform better, so I deploy it to the Flow. So far so good.

If I retrain the model by clicking the "Retrain" button on the diamond-shaped saved model, DSS takes a long time, which seems strange for a single model. Judging by the logs and the new model version, DSS is running a 20-iteration grid search on the XGBoost algorithm again, whereas I would like it to retrain only the model with the best hyperparameters identified in the visual analysis (they are also available in the Model Information > Algorithm menu).

What is the proper way to do that? One idea is to run a new session inside the visual analysis, selecting only the XGBoost algorithm and entering the best set of hyperparameters manually, but maybe there's a more efficient way.


Best Answer

  • Nicolas_Servel, Dataiker, Posts: 37
    Answer ✓


    When you deploy a model from the Lab to the Flow, there is an advanced setting that lets you retrain using only the best set of hyperparameters.

    Go back to the Lab, click on your model > Deploy > choose whether to deploy it to an existing saved model or create a new one > click on "Advanced" at the bottom left of the modal.


    Then, for "Model parameters", select "Use already detected parameters".


    Then click on "Create". This should create a train recipe with the behaviour you are looking for.
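For anyone who prefers scripting the deployment, the steps above can be roughly sketched with the Dataiku Python API. This is a minimal sketch, not a confirmed equivalent of the UI option: it assumes that the `redo_optimization` argument of `DSSMLTask.deploy_to_flow` plays the same role as "Use already detected parameters". The helper name `deploy_with_detected_params` and all IDs and names in the usage comment are hypothetical placeholders.

```python
def deploy_with_detected_params(mltask, model_id, model_name, train_dataset):
    """Deploy a Lab model to the Flow without redoing the hyperparameter search.

    `mltask` is assumed to be a dataikuapi DSSMLTask (or any object exposing a
    compatible `deploy_to_flow` method). Passing redo_optimization=False asks
    the resulting train recipe to reuse the hyperparameters already found in
    the Lab instead of running the grid search again.
    """
    return mltask.deploy_to_flow(
        model_id,
        model_name,
        train_dataset,
        redo_optimization=False,
    )


# Hypothetical usage from a notebook inside DSS (all IDs are placeholders):
#   import dataiku
#   project = dataiku.api_client().get_default_project()
#   mltask = project.get_ml_task("ANALYSIS_ID", "MLTASK_ID")
#   model_id = mltask.get_trained_models_ids()[0]  # pick the model you want
#   info = deploy_with_detected_params(mltask, model_id, "xgb_best", "train_data")
#   print(info)
```

Taking the ML task as a parameter keeps the helper independent of how the task is looked up. Check the resulting train recipe's settings in the UI to confirm that it behaves as expected on your DSS version.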

    Hope this helps,

    Best regards

