Redeploy the same model after retraining.
Hi all,
we have duplicated a project that contains a predictive model. In the original project the model
was trained with multiple algorithms, and the one with the best score was deployed. Because a few
columns of the training dataset have been deleted in the new project, we need to retrain and
redeploy; otherwise scoring fails, because the dataset to score has the same columns removed.
Because the project contains many such models, we are trying to automate everything with a script
that uses the Python API.
What we have done so far is to get all the saved models and, from them, the original ml_task:
saved_models = this_project.list_saved_models()
for smod in saved_models:
    if smod['name'] in saved_models_names:
        current_model = smod['id']
        current_saved_model = this_project.get_saved_model(current_model)
        current_ml_task = current_saved_model.get_origin_ml_task()
We are also able to train the model:
try:
    list_trained = current_ml_task.train()
    ....
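For reference, the two fragments above can be combined into a single loop. This is only a sketch under stated assumptions: `retrain_saved_models` is a hypothetical helper name, and it assumes that `train()` returns the ids of the models trained in the session, as the `list_trained` variable suggests. Failures are collected rather than raised so that one broken model does not stop the batch.

```python
def retrain_saved_models(project, saved_model_names):
    """Retrain the origin ML task of each named saved model.

    Returns (trained, failed):
      trained maps saved model id -> list of retrained model ids
      failed  maps saved model id -> error message
    """
    trained, failed = {}, {}
    for smod in project.list_saved_models():
        if smod["name"] not in saved_model_names:
            continue
        saved_model = project.get_saved_model(smod["id"])
        ml_task = saved_model.get_origin_ml_task()
        if ml_task is None:
            # No origin ML task is attached to this saved model.
            failed[smod["id"]] = "no origin ML task"
            continue
        try:
            trained[smod["id"]] = ml_task.train()
        except Exception as exc:
            # Record the error and carry on with the next saved model.
            failed[smod["id"]] = str(exc)
    return trained, failed
```

It would be called as `trained, failed = retrain_saved_models(this_project, saved_models_names)`.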
The question is: how can we redeploy the model in the trained list that corresponds to the
algorithm originally deployed? That is, if it was "Lasso (L1)" (which we don't know in advance), how can we (if feasible):
a) retrieve the name of the algorithm originally deployed (from the model)?
b) redeploy the retrained model that uses the same algorithm?
Txs. Rgds.
Giuseppe
Best Answer
-
Hello Giuseppe,
You can read the lastExportedFrom field of the saved model's settings, which gives you the id of the trained model it comes from. From there you can get the trained model details and the name of the algorithm. Here is some sample code:
saved_model = client.get_project("MYPROJECT").get_saved_model("n1EpkFGp")
trained_model_id = saved_model.get_settings().get_raw()["lastExportedFrom"]
mltask = saved_model.get_origin_ml_task()
mltask.get_trained_model_details(trained_model_id).get_raw()["modeling"]["algorithm"]
You can then redeploy the model using the deploy_to_flow method. You can also have a look at the example in the doc.
Hope it helps,
Arnaud
Answers
-
gnaldi62 Partner, L2 Designer, Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Winner, Frontrunner 2022 Participant, Neuron 2023 Posts: 79 Neuron
Hi Arnaud,
thank you for the details. One doubt: if we use "deploy_to_flow" we get a new recipe, right?
But what if we want to keep the original recipe (we have many steps downstream)? Would it
be OK to use "redeploy_to_flow" instead?
Thanks. Regards.
Giuseppe
-
Hello Giuseppe,
Indeed, you should use redeploy_to_flow, not deploy_to_flow, my bad.
Best,
Arnaud
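Putting the two replies together, a minimal sketch of the whole step could look like the following. The helper names `get_algorithm` and `redeploy_same_algorithm` are hypothetical. The sketch assumes that `lastExportedFrom` is present in the saved model settings, that `retrained_ids` is the list returned by `train()`, and that passing `saved_model_id` to `redeploy_to_flow` targets the existing saved model so its training recipe is reused. If several retrained models share the algorithm, it simply takes the first one.

```python
def get_algorithm(ml_task, model_id):
    """Return the algorithm name recorded for one trained model."""
    details = ml_task.get_trained_model_details(model_id)
    return details.get_raw()["modeling"]["algorithm"]


def redeploy_same_algorithm(saved_model, saved_model_id, retrained_ids):
    """Redeploy the retrained model whose algorithm matches the deployed one.

    Returns the id of the redeployed model, or None if nothing matched.
    """
    ml_task = saved_model.get_origin_ml_task()
    # Id of the trained model the saved model was last deployed from.
    deployed_id = saved_model.get_settings().get_raw()["lastExportedFrom"]
    wanted = get_algorithm(ml_task, deployed_id)
    for model_id in retrained_ids:
        if get_algorithm(ml_task, model_id) == wanted:
            # Assumption: saved_model_id points at the existing saved model,
            # so no new recipe is created in the Flow.
            ml_task.redeploy_to_flow(model_id, saved_model_id=saved_model_id)
            return model_id
    return None
```

A `None` result means no retrained model uses the original algorithm, in which case the script has to decide what to deploy instead.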
-
Great. Many thanks. Rgds.
Giuseppe
-
Quick update: when the project is duplicated, the lastExportedFrom information is lost (probably because it is a unique id, so it stays with the original project?).
The current workaround is to retrain and simply redeploy one of the trained models. At least the scoring recipe does not fail, and the script can go on to build all the remaining datasets.
Rgds.
Giuseppe
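One possible way to make the script tolerant of the missing field is sketched below. It has not been verified on a duplicated project. `find_deployed_algorithm` is a hypothetical helper. The fallback assumes that the saved model's `get_active_version()` and `get_version_details()` methods are available in the installed API version, and that the version details of a duplicated saved model still carry a `modeling` section; either assumption may not hold. When nothing can be recovered it returns `None`, and the script can fall back to the workaround above.

```python
def find_deployed_algorithm(saved_model):
    """Best-effort lookup of the algorithm behind a saved model.

    Returns the algorithm name, or None if it cannot be determined.
    """
    raw = saved_model.get_settings().get_raw()
    exported_from = raw.get("lastExportedFrom")
    if exported_from:
        # Normal case: follow the link back to the trained model.
        ml_task = saved_model.get_origin_ml_task()
        details = ml_task.get_trained_model_details(exported_from)
        return details.get_raw()["modeling"]["algorithm"]

    # Fallback (assumption): read the details of the active version instead.
    active = saved_model.get_active_version()
    if active is None:
        return None
    details = saved_model.get_version_details(active["id"])
    return details.get_raw().get("modeling", {}).get("algorithm")
```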