Redeploy the same model after retraining.

gnaldi62
gnaldi62 Partner, L2 Designer, Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Winner, Frontrunner 2022 Participant, Neuron 2023 Posts: 79 Neuron

Hi all,

We have duplicated a project that contains a predictive model. In the original project the model was trained with multiple algorithms, and the one with the best score was deployed. In the new project a few columns have been deleted from the initial training dataset, so we need to retrain and redeploy; otherwise scoring fails, because the dataset to score has the same columns removed.

Because the project contains many such models, we are trying to automate everything with a script using the Python API.

What we have done so far is to get all the saved models and from them the original ml_task.

saved_models = this_project.list_saved_models()
for smod in saved_models:
    if smod['name'] in saved_models_names:
        current_model = smod['id']
        current_saved_model = this_project.get_saved_model(current_model)
        current_ml_task = current_saved_model.get_origin_ml_task()

We are also able to train the model:

try:
    list_trained = current_ml_task.train()
    ....

The question is: how can we redeploy the model from the trained list that corresponds to the algorithm originally deployed? For example, if it was "Lasso (L1)" (which we don't know in advance), how can we (if feasible):

a) retrieve the name of the algorithm originally deployed (from the saved model)?

b) redeploy the retrained model that uses the same algorithm?

Txs. Rgds.

Giuseppe

Best Answer

  • arnaudde
    arnaudde Dataiker Posts: 52 Dataiker
    edited July 17 Answer ✓

    Hello Giuseppe,

    You can read the lastExportedFrom field from the saved model's settings; it gives you the id of the trained model the saved model comes from. From there you can get the trained model details and the name of the algorithm. Here is some sample code:

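    import dataiku
    client = dataiku.api_client()  # when running inside DSS
    saved_model = client.get_project("MYPROJECT").get_saved_model("n1EpkFGp")
    # Id of the trained model the saved model was last deployed from
    trained_model_id = saved_model.get_settings().get_raw()["lastExportedFrom"]
    mltask = saved_model.get_origin_ml_task()
    # Name of the algorithm used by that trained model
    algorithm = mltask.get_trained_model_details(trained_model_id).get_raw()["modeling"]["algorithm"]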


    You can then redeploy the model using the deploy_to_flow method.

    You can also have a look at the example in the doc

    Hope it helps,
    Arnaud
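
As a rough sketch of how the lookup above could be combined with retraining: once train() has returned the list of new model ids, the same get_trained_model_details(...).get_raw()["modeling"]["algorithm"] call can be used to find the retrained model that matches the original algorithm. The helper name find_model_by_algorithm is hypothetical (it is not part of the Dataiku API), and mltask is assumed to be the object returned by get_origin_ml_task().

```python
def find_model_by_algorithm(mltask, trained_ids, algorithm):
    """Return the first id in trained_ids whose algorithm matches, else None.

    Hypothetical helper: it relies only on get_trained_model_details(...)
    and the ["modeling"]["algorithm"] entry used in the sample above.
    """
    for model_id in trained_ids:
        raw = mltask.get_trained_model_details(model_id).get_raw()
        if raw["modeling"]["algorithm"] == algorithm:
            return model_id
    return None
```

It would typically be called as find_model_by_algorithm(mltask, mltask.train(), algorithm), with algorithm taken from the originally deployed model.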

Answers

  • gnaldi62

    Hi Arnaud,

    thank you for the details. One doubt: if we use "deploy_to_flow" we get a new recipe, right? But what if we want to keep the same original recipe (we have many steps downstream)? Would it be OK to use "redeploy_to_flow" instead?

    Thanks. Regards.

    Giuseppe

  • arnaudde

    Hello Giuseppe,
    Indeed you should use redeploy_to_flow not deploy_to_flow, my bad.

    Best,

    Arnaud
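
A minimal sketch of the redeploy step, under stated assumptions: it assumes mltask.redeploy_to_flow accepts the trained model id plus saved_model_id and activate keyword arguments (please check the signature in your DSS version), and the function name redeploy_same_algorithm is hypothetical. The existing saved model, and therefore the existing training recipe and everything downstream of it, is kept; only a new version is added.

```python
def redeploy_same_algorithm(mltask, saved_model_id, trained_ids, algorithm):
    """Redeploy the retrained model that uses `algorithm` onto an existing saved model.

    Hypothetical helper. Assumes redeploy_to_flow(model_id, saved_model_id=...,
    activate=...) is available on the ML task object.
    """
    for model_id in trained_ids:
        raw = mltask.get_trained_model_details(model_id).get_raw()
        if raw["modeling"]["algorithm"] == algorithm:
            # Adds a new version to the existing saved model and activates it
            return mltask.redeploy_to_flow(
                model_id, saved_model_id=saved_model_id, activate=True
            )
    raise LookupError("No retrained model uses algorithm %r" % algorithm)
```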

  • gnaldi62

    Great. Many thanks. Rgds.

    Giuseppe

  • gnaldi62

    Quick update: when the project is duplicated, the lastExportedFrom info is lost (probably because it is a unique id that stays with the original project?).

    Current workaround is to retrain and simply redeploy one of the trained models. At least the scoring recipe does not fail and the script can go on with building all the remaining datasets.

    Rgds.

    Giuseppe
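
The workaround described above could be sketched roughly as follows: try to resolve the original algorithm from lastExportedFrom, and fall back to the first retrained model when that field is missing (as reported for duplicated projects). Both function names are hypothetical, and the choice of "first trained model" as the fallback is only an assumption for illustration; any other selection rule could be substituted.

```python
def original_algorithm_or_none(saved_model, mltask):
    """Return the originally deployed algorithm, or None if lastExportedFrom is absent."""
    raw_settings = saved_model.get_settings().get_raw()
    origin_id = raw_settings.get("lastExportedFrom")
    if not origin_id:
        return None
    details = mltask.get_trained_model_details(origin_id)
    return details.get_raw()["modeling"]["algorithm"]


def choose_model_id(mltask, trained_ids, wanted_algorithm):
    """Pick the retrained model matching wanted_algorithm, else the first one."""
    if not trained_ids:
        raise ValueError("No trained models to choose from")
    if wanted_algorithm is not None:
        for model_id in trained_ids:
            raw = mltask.get_trained_model_details(model_id).get_raw()
            if raw["modeling"]["algorithm"] == wanted_algorithm:
                return model_id
    # Fallback (assumption): simply take the first retrained model
    return trained_ids[0]
```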
