How to run checks on the model metrics and kick off a model retrain

MNOP
Level 3

Hi,
I have a model flow set up in Dataiku, and the underlying dataset updates every month.
How can I check whether a model metric has breached a threshold (say, MAPE > 5%) and trigger model retraining if it has?

If the model's performance is still beyond the threshold even after retraining, I want to rebuild the model, i.e. use some kind of AutoML to create a new model from all possible features.

My idea is to develop a self-adapting model in Dataiku. How feasible is that?


Operating system used: Windows

AlexT
Dataiker

Hi @MNOP,

You can leverage the APIs to implement this. 

You can look at the examples on the page below to obtain the best-performing model after retraining with the same or varied parameters, then save the best one as the active version of the saved model in the flow:

https://developer.dataiku.com/latest/concepts-and-examples/ml.html
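
For instance, a script along these lines retrains the ML task, picks the session's best-performing model by MAPE, and redeploys it to the flow. This is a minimal sketch, not a drop-in solution: the host, API key, project key, analysis/ML task IDs and saved model ID are all placeholders, and the "mape" metric key is an assumption you should verify against what get_performance_metrics() returns for your task.

```python
# Minimal sketch: retrain an ML task, pick the best model by MAPE,
# and redeploy it. All uppercase identifiers are placeholders.
import dataikuapi

client = dataikuapi.DSSClient("https://dss.example.com", "YOUR_API_KEY")
project = client.get_project("PROJECT_KEY")
ml_task = project.get_ml_task("ANALYSIS_ID", "ML_TASK_ID")

# Start a new training session and wait for it to finish
ml_task.start_train()
ml_task.wait_train_complete()

# Find the trained model with the lowest MAPE in this ML task
best_id, best_mape = None, float("inf")
for model_id in ml_task.get_trained_models_ids():
    details = ml_task.get_trained_model_details(model_id)
    # "mape" is an assumption; inspect get_performance_metrics() first
    mape = details.get_performance_metrics().get("mape")
    if mape is not None and mape < best_mape:
        best_id, best_mape = model_id, mape

# Redeploy the winner as the active version of the saved model in the flow
if best_id is not None:
    ml_task.redeploy_to_flow(best_id, saved_model_id="SAVED_MODEL_ID")
```

If you run this inside DSS (e.g. from a scenario step), you can use dataiku.api_client() instead of creating a DSSClient with a URL and API key.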




MNOP
Level 3
Author

@AlexT, Thanks for the reply.
Is it possible to use a scenario to trigger a model retrain if the model metric breaches the threshold?

I saw a post where the author runs checks on the model evaluation AUC and kicks off a model retrain if it falls below a certain threshold. How can I achieve this in Dataiku?

LouisDHulst

Hi @MNOP,

Dataiku Scenarios allow you to run Python scripts as a step, so you can leverage the API that @AlexT linked. Scripted steps are quite powerful: you can continuously retrain your model and only deploy versions that improve performance, like what's done in the post you mentioned.
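
For example, a Python scenario step along these lines could compare the deployed model's MAPE to your threshold and retrain only when it's breached. Again a rough sketch: the saved model ID and the 5% threshold are placeholders, and the "mape" metric key is an assumption to check against your model's performance snippet.

```python
# Sketch of a Python scenario step: retrain the saved model only when the
# active version's MAPE is above the threshold. SAVED_MODEL_ID is a
# placeholder; verify the "mape" key against get_performance_metrics().
import dataiku
from dataiku.scenario import Scenario

SAVED_MODEL_ID = "SAVED_MODEL_ID"
MAPE_THRESHOLD = 0.05  # the 5% from the original question

client = dataiku.api_client()
project = client.get_default_project()
saved_model = project.get_saved_model(SAVED_MODEL_ID)

# Read the performance of the currently active version
active = saved_model.get_active_version()
perf = saved_model.get_version_details(active["id"]).get_performance_metrics()
mape = perf.get("mape")

if mape is None or mape > MAPE_THRESHOLD:
    # Threshold breached (or metric missing): kick off a retrain
    Scenario().train_model(SAVED_MODEL_ID)
```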

If you don't want to use the API, you can try the Evaluate recipe and a Model Evaluation Store (MES). The user in the post you linked actually uses an Evaluate recipe to evaluate their model on a holdout set, and then runs a check. You can set up checks within the Model Evaluation Store (concept + doc).

Your steps would look like:

- set up your evaluation set, recipe and store

- add a metric check in the evaluation store

- add steps in your scenario to build the evaluation store

- add a Run Checks step in the scenario

- add a step to re-train your model in case the Run Checks step fails

An upside of using the MES is input data drift detection, so you'll be able to measure how much your data is changing every month.
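
If you later want to pull those MES results back programmatically (for example, in the same scenario), a sketch like the following could work. The store ID is a placeholder, and the structure of the returned metrics varies across DSS versions, so it's worth printing them once in a notebook first.

```python
# Sketch: read the latest evaluation from a Model Evaluation Store.
# MES_ID is a placeholder; the metrics structure depends on your DSS version.
import dataiku

client = dataiku.api_client()
project = client.get_default_project()
mes = project.get_model_evaluation_store("MES_ID")

evaluations = mes.list_model_evaluations()
if evaluations:
    latest = evaluations[-1]  # verify the ordering for your DSS version
    info = latest.get_full_info()
    print(info.metrics)  # performance metrics and drift measures
```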