How to run checks on the model metrics and kick off a model retrain

Mohammed · Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Registered · Posts: 43 ✭✭✭

Hi,
I have a model flow set up in Dataiku, and the underlying dataset updates every month.
How can I check whether a model metric breaches a threshold (say MAPE > 5%) and trigger a model retrain if it does?

Even after the retrain, if the model performance still doesn't meet the threshold, I want to rebuild the model, i.e. use some kind of AutoML to create a new model from all possible features.

My idea is to develop a self-adapting model in Dataiku. How feasible is that?


Operating system used: Windows

Answers

  • Alexandru · Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered · Posts: 1,225 · Dataiker

    Hi @MNOP,

    You can leverage the APIs to implement this.

    You can look at the example showing how to obtain the best-performing model after retraining with the same or different parameters, and then save the best one as the saved model's active version in the Flow:

    https://developer.dataiku.com/latest/concepts-and-examples/ml.html
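
    Something along these lines could work as a rough sketch of that pattern. The project/analysis/ML task/saved model IDs are placeholders, the "mape" metric key assumes a regression task that reports MAPE, and the exact accessors may differ slightly across DSS versions:

```python
import dataiku

# Placeholders: copy these IDs from your project, visual analysis and saved model URLs
PROJECT_KEY = "MY_PROJECT"
ANALYSIS_ID = "aBcDeFgH"
ML_TASK_ID = "iJkLmNoP"
SAVED_MODEL_ID = "sM_123"
METRIC = "mape"  # assumes a regression ML task that reports MAPE

client = dataiku.api_client()
project = client.get_project(PROJECT_KEY)

# Reopen the ML task behind the saved model and retrain the enabled algorithms
ml_task = project.get_analysis(ANALYSIS_ID).get_ml_task(ML_TASK_ID)
ml_task.start_train()
ml_task.wait_train_complete()

# Rank the trained models in the task by the chosen metric (lower is better for MAPE)
best_id, best_value = None, None
for model_id in ml_task.get_trained_models_ids():
    details = ml_task.get_trained_model_details(model_id)
    value = details.get_performance_metrics().get(METRIC)
    if value is not None and (best_value is None or value < best_value):
        best_id, best_value = model_id, value

# Redeploy the best-performing model as the active version of the saved model in the Flow
if best_id is not None:
    ml_task.redeploy_to_flow(best_id, saved_model_id=SAVED_MODEL_ID, activate=True)
```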




  • Mohammed · Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Registered · Posts: 43 ✭✭✭

    @AlexT, thanks for the reply.
    Is it possible to use a scenario to trigger a model retrain if the model metric breaches the threshold?

    I saw a post where someone runs checks on the model evaluation AUC and kicks off a model retrain if it falls below a certain threshold. How can I achieve this in Dataiku?

  • LouisDHulst · Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Neuron, Registered, Neuron 2023 · Posts: 54 · Neuron

    Hi @MNOP,

    Dataiku Scenarios allow you to run Python scripts as a step, so you can leverage the API that @AlexT linked to. Using scripts is pretty powerful, allowing you to continuously retrain your model and only deploy versions that improve performance, like what's done in that post.
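
    As a minimal sketch of such a scenario Python step: the saved model ID, the "mape" metric key and the version accessors below are assumptions to adapt to your project; check them against the API reference for your DSS version.

```python
# Custom Python step in a Dataiku scenario: retrain only when the metric breaches the threshold
import dataiku
from dataiku.scenario import Scenario

SAVED_MODEL_ID = "sM_123"   # placeholder: ID of the saved model in the Flow
METRIC = "mape"             # assumes a regression model that reports MAPE
THRESHOLD = 0.05            # retrain when MAPE goes above 5%

client = dataiku.api_client()
project = client.get_default_project()

# Read the performance metric of the currently active saved model version
sm = project.get_saved_model(SAVED_MODEL_ID)
active_version_id = sm.get_active_version()["id"]
metrics = sm.get_version_details(active_version_id).get_performance_metrics()
current_value = metrics.get(METRIC)

# Trigger the retrain from the scenario only if the metric is missing or too high
if current_value is None or current_value > THRESHOLD:
    Scenario().train_model(SAVED_MODEL_ID)
```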

    If you don't want to use the API, you can try using the Evaluate recipe and a Model Evaluation Store. The user in the post you linked actually uses an Evaluate recipe to evaluate their model on a holdout set, and then runs a check. You can set up checks within the Model Evaluation Store (concept + doc).

    Your steps would look like:

    - set up your evaluation set, recipe and store

    - add a metric check in the evaluation store

    - add steps in your scenario to build the evaluation store

    - add a Run Checks step in the scenario

    - add a step to re-train your model in case the Run Checks step fails

    An upside of using the MES is input data drift detection, so you'll be able to measure how much your data is changing every month.
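
    If you'd rather keep the threshold logic in a Python step instead of a visual Run Checks step, a minimal sketch reading the Evaluate recipe's metrics output could look like this (the dataset name, the "mape" column and the saved model ID are assumptions; the sketch assumes the recipe appends one metrics row per evaluation run, newest last):

```python
# Python scenario step: branch on the Evaluate recipe's metrics output dataset
import dataiku
from dataiku.scenario import Scenario

METRICS_DATASET = "my_model_metrics"   # placeholder: metrics output of the Evaluate recipe
SAVED_MODEL_ID = "sM_123"              # placeholder: ID of the saved model in the Flow
THRESHOLD = 0.05                       # retrain when MAPE goes above 5%

# Take the most recent evaluation row and read its MAPE value
metrics_df = dataiku.Dataset(METRICS_DATASET).get_dataframe()
latest_mape = metrics_df.iloc[-1]["mape"]

if latest_mape > THRESHOLD:
    # Metric breached the threshold: retrain the saved model from this scenario
    Scenario().train_model(SAVED_MODEL_ID)
```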
