Dataiku MLflow integration

Krupa
Krupa Partner, Registered Posts: 5 Partner

Hi Dataiku Team,

I am trying to integrate MLflow models into DSS, which I have done successfully by following the DSS 10 documentation. I would like to know whether we can connect ground truth data (say, from a Snowflake database) to the models we have imported from MLflow into DSS, so that we can view the performance metrics in the model versions tab. I am attaching a screenshot in which these tabs are disabled because the ground truth data is unavailable.

Answers

  • HarizoR
    HarizoR Dataiker, Alpha Tester, Registered Posts: 138 Dataiker
    edited July 17

    Hi,

    In order to surface the model performance visualizations, you need to create an "evaluation" dataset (what you call "ground truth") and follow step 3 of the documentation's example:

    # 3. Evaluate the saved model
    # (Optional, only for tabular models, mandatory to have access to the saved model performance tab)
    mlflow_version.set_core_metadata(target_column, classes, evaluation_dataset_name)
    mlflow_version.evaluate(evaluation_dataset_name)

    Hope this helps!

    Best,

    Harizo
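The evaluation dataset needs to contain the target column for the evaluation to be meaningful, so it can be worth checking that before running step 3. Below is a minimal sketch of such a check. It assumes the schema is a dict with a "columns" list of {"name": ...} entries, which is the shape I would expect from `project.get_dataset(name).get_schema()` in the DSS Python API; the helper `missing_columns` and the sample column names are hypothetical.

```python
def missing_columns(schema, required):
    """Return the required column names that are absent from a schema dict.

    `schema` is assumed to look like {"columns": [{"name": ..., "type": ...}, ...]}.
    """
    present = {col["name"] for col in schema.get("columns", [])}
    return [name for name in required if name not in present]


# Hypothetical schema of an evaluation dataset (for example one backed by Snowflake).
schema = {
    "columns": [
        {"name": "customer_id", "type": "bigint"},
        {"name": "churn", "type": "string"},
    ]
}

# "churn" is a made-up target column name used only for illustration.
print(missing_columns(schema, ["churn"]))
```

An empty list means every required column is present; anything else names the columns to add to the dataset before evaluating.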

  • Krupa
    Krupa Partner, Registered Posts: 5 Partner
    edited July 17

    Hi Team,

    1) I used the code below to set the metadata for a particular MLflow imported model version:

    mlflow_version.set_core_metadata()

    Here, the model version tab still does not show the charts and other features depicted (such as the confusion matrix and calibration curve).

    I have the ground truth data in Snowflake, the predicted data, and the model itself. Is there a way to get the performance metrics to show in the model version details tab?

  • HarizoR
    HarizoR Dataiker, Alpha Tester, Registered Posts: 138 Dataiker

    Hi,

    Don't forget to run the other line of code, which actually performs the evaluation:

    mlflow_version.evaluate(evaluation_dataset_name)

    Best,

    Harizo
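To make the two-step order harder to get wrong, the calls can be wrapped in one small helper. This is only a sketch: `evaluate_mlflow_version` is a hypothetical name, and `mlflow_version` is assumed to be the handler object from the documentation's example, exposing `set_core_metadata` and `evaluate` as used earlier in this thread.

```python
def evaluate_mlflow_version(mlflow_version, target_column, classes, evaluation_dataset_name):
    """Set the core metadata, then run the evaluation, in that order.

    `mlflow_version` is assumed to expose set_core_metadata() and evaluate()
    with the arguments shown in the documentation's example.
    """
    # Calling set_core_metadata() with no arguments is not enough:
    # the target column and the evaluation dataset must both be given.
    if not target_column:
        raise ValueError("target_column is required")
    if not evaluation_dataset_name:
        raise ValueError("evaluation_dataset_name is required")

    mlflow_version.set_core_metadata(target_column, classes, evaluation_dataset_name)
    # Setting the metadata alone does not run the evaluation;
    # this second call is the one that actually performs it.
    mlflow_version.evaluate(evaluation_dataset_name)
```

It would be called as `evaluate_mlflow_version(mlflow_version, target_column, classes, evaluation_dataset_name)` with the same values as in step 3 of the documentation's example.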

  • prabhasJ
    prabhasJ Registered Posts: 1

    Hi @Krupa

    I'm currently working on a project that involves integrating MLflow with Dataiku DSS 10, and I'm facing some challenges. I would greatly appreciate any assistance or guidance you can provide.

    Could you please provide the steps to integrate it?

    Thanks!

  • tgb417
    tgb417 Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Winner, Dataiku Frontrunner Awards 2021 Participant, Frontrunner 2022 Participant, Neuron 2023 Posts: 1,601 Neuron

    @prabhasJ,

    Welcome to the Dataiku community. We are so glad you have joined us.

    In short, I do not have an answer for you.

    However, I do have a suggestion. Dataiku DSS was recently updated to version 12.0.1, and version 10 is getting a bit old at this point. I also note that there are a fair number of posts that mention version 12 and MLflow. I'm wondering whether your challenges might be related to using an old version of Dataiku with modern MLflow components.

    I also note here that one needs to make sure that certain packages are installed to work with MLflow.

    https://doc.dataiku.com/dss/latest/mlops/mlflow-models/limitations.html

    Finally, if you are using an enterprise license, I'd open a support ticket. The support team is very good.

    I hope this helps a bit. Good luck, and let us all know how you are getting on.
