
Redeploy the same model after retraining.

Solved!
gnaldi62
Level 4

Hi all,

We have duplicated a project. It contains a predictive model that, in the original project, was trained with multiple algorithms; the one with the best score was deployed. In the new project a few columns have been deleted from the training dataset, so we need to retrain and redeploy the model (otherwise scoring fails, because the same columns have also been removed from the dataset to score).

Because the project contains many such models, we are trying to automate everything with a script using the Python API.

So far we retrieve all the saved models and, from each of them, the original ML task:

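# this_project is the handle of the duplicated project;
# saved_models_names is the list of saved model names to process
saved_models = this_project.list_saved_models()
for smod in saved_models:
    if smod['name'] in saved_models_names:
        current_model = smod['id']
        current_saved_model = this_project.get_saved_model(current_model)
        # ML task the saved model was originally deployed from
        current_ml_task = current_saved_model.get_origin_ml_task()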

Training the model also works:

try:
    list_trained = current_ml_task.train()
....

The question is: how can we redeploy, from the list of trained models, the one that corresponds to the algorithm originally deployed? For example, if it was "Lasso (L1)" (which we don't know in advance), is it feasible, and how:

a) to retrieve the name of the algorithm originally deployed (from the saved model)?

b) to redeploy the retrained model that uses the same algorithm?

Thanks. Regards.

Giuseppe

arnaudde
Dataiker

Hello Giuseppe,

You can read the lastExportedFrom field of the saved model's settings, which gives you the id of the trained model it was deployed from. From there you can get the trained model details and the name of the algorithm. Here is some sample code:

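import dataiku

client = dataiku.api_client()
saved_model = client.get_project("MYPROJECT").get_saved_model("n1EpkFGp")
# id of the trained model (in the ML task) that the saved model was deployed from
trained_model_id = saved_model.get_settings().get_raw()["lastExportedFrom"]
mltask = saved_model.get_origin_ml_task()
algorithm = mltask.get_trained_model_details(trained_model_id).get_raw()["modeling"]["algorithm"]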


You can then redeploy the model using the deploy_to_flow method.

You can also have a look at the example in the doc 

Hope it helps,
Arnaud
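
To connect the answer above to part (b) of the question, here is a minimal sketch of how the retrained model with the same algorithm could be picked out. It only relies on the get_trained_model_details(...).get_raw()["modeling"]["algorithm"] lookup shown above; the function name find_same_algorithm is hypothetical, and trained_ids is assumed to be the list of model ids returned by train().

```python
def find_same_algorithm(ml_task, trained_ids, algorithm):
    """Return the first id in trained_ids whose algorithm matches, or None.

    ml_task     -- the ML task handle (e.g. from saved_model.get_origin_ml_task())
    trained_ids -- ids of the retrained models (assumed: the list returned by train())
    algorithm   -- algorithm name read from the originally deployed model
    """
    for model_id in trained_ids:
        raw = ml_task.get_trained_model_details(model_id).get_raw()
        if raw["modeling"]["algorithm"] == algorithm:
            return model_id
    # No retrained model uses that algorithm (e.g. it was disabled in the ML task)
    return None
```

With the variables from the posts above, this would be called as find_same_algorithm(mltask, list_trained, algorithm). A None result means the caller has to decide what to deploy instead.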

gnaldi62
Level 4
Author

Hi Arnaud,

Thank you for the details. One doubt: if we use "deploy_to_flow" we get a new recipe, right?

But what if we want to keep using the original recipe (we have many steps downstream)? Would it be OK to use "redeploy_to_flow" instead?

Thanks. Regards.

Giuseppe

arnaudde
Dataiker

Hello Giuseppe,
Indeed, you should use redeploy_to_flow, not deploy_to_flow. My bad.

Best,

Arnaud
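
For readers following along, a small sketch of what the redeploy_to_flow call might look like when the goal is to keep the existing saved model and its training recipe. The wrapper name redeploy_into_saved_model is hypothetical; it assumes the ML task's redeploy_to_flow method accepts the trained model id plus saved_model_id and activate keyword arguments, which is worth checking against the API reference for your DSS version.

```python
def redeploy_into_saved_model(ml_task, model_id, saved_model_id, activate=True):
    """Redeploy a retrained model into an existing saved model.

    Assumption: ml_task.redeploy_to_flow(model_id, saved_model_id=..., activate=...)
    updates the existing saved model instead of creating a new one.
    """
    if not model_id:
        raise ValueError("no trained model id to redeploy")
    if not saved_model_id:
        raise ValueError("no saved model id to redeploy into")
    return ml_task.redeploy_to_flow(
        model_id,
        saved_model_id=saved_model_id,
        activate=activate,  # make the new version the active one
    )
```

In the loop from the original question, saved_model_id would be the smod['id'] value and model_id one of the ids in list_trained.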

gnaldi62
Level 4
Author

Great. Many thanks. Rgds.

Giuseppe

gnaldi62
Level 4
Author

Quick update: when duplicating the project, the lastExportedFrom information is lost (probably it is a unique id, so it stays with the original project?).

The current workaround is to retrain and simply redeploy one of the trained models. At least the scoring recipe does not fail and the script can go on with building all the remaining datasets.

Rgds.

Giuseppe
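
The workaround described above could be sketched roughly as follows: try the lastExportedFrom route first, and fall back to the first retrained model when that information is missing or no longer resolves. This is only an illustration; choose_model_id and the algorithm_of callable are hypothetical names, and "first trained model" is an arbitrary choice of fallback, not a recommendation.

```python
def choose_model_id(raw_settings, trained_ids, algorithm_of):
    """Return (model_id, reason) for the model to redeploy.

    raw_settings -- dict from saved_model.get_settings().get_raw()
    trained_ids  -- ids of the retrained models
    algorithm_of -- callable mapping a trained model id to its algorithm name
    """
    if not trained_ids:
        raise ValueError("no trained models to choose from")

    origin_id = raw_settings.get("lastExportedFrom")  # may be absent after duplication
    wanted = None
    if origin_id:
        try:
            wanted = algorithm_of(origin_id)
        except Exception:
            wanted = None  # origin id does not resolve in this project

    if wanted is not None:
        for model_id in trained_ids:
            if algorithm_of(model_id) == wanted:
                return model_id, "same-algorithm"

    # Fallback: lineage lost or no match, redeploy any retrained model
    return trained_ids[0], "fallback"
```

Inside DSS, algorithm_of could be something like lambda mid: ml_task.get_trained_model_details(mid).get_raw()["modeling"]["algorithm"], following the lookup from the accepted answer. Logging the returned reason makes it easy to see which models ended up on the fallback path.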
