Dataiku MLFlow integration

Krupa · April 2022

Hi Dataiku Team,

I am trying to integrate MLflow models into DSS, and I have successfully imported them as per the DSS 10 documentation. I would like to know whether we can connect ground truth data (say, from a Snowflake DB) to the models we imported from MLflow into DSS, so that we can view the performance metrics in the model versions tab. I am attaching a screenshot where these tabs are disabled because the ground truth data is unavailable.

HarizoR · April 2022

Hi,

In order to surface the model performance visualizations, you need to create an "evaluation" dataset (what you call "ground truth") and follow step 3 of the documentation's example:

# 3. Evaluate the saved model
# (Optional, only for tabular models, mandatory to have access to the saved model performance tab)
mlflow_version.set_core_metadata(target_column, classes, evaluation_dataset_name)
mlflow_version.evaluate(evaluation_dataset_name)

Hope this helps!

Best,

Harizo

Krupa · April 2022

Hi Team,

1) I used the code below to set the metadata for a particular MLflow imported model version:

mlflow_version.set_core_metadata()

The model version tab still does not show the charts and other features depicted (such as the confusion matrix and calibration curve).

I have the ground truth data in Snowflake, the predicted data, and the model itself. Is there a way to get the performance metrics to show in the model version details tab?

HarizoR · May 2022

Hi,

Don't forget to run the other line of code, which actually performs the evaluation:

mlflow_version.evaluate(evaluation_dataset_name)
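For reference, here is a minimal sketch that groups the two calls from step 3 so neither gets skipped. The helper name `evaluate_imported_version` is hypothetical and not part of the Dataiku API. It assumes `mlflow_version` is the object you got back when importing the MLflow model, and that the evaluation dataset already exists in the project (for example, a dataset backed by your Snowflake table).

```python
def evaluate_imported_version(mlflow_version, target_column, classes,
                              evaluation_dataset_name):
    """Set the core metadata, then evaluate an imported MLflow model version.

    Hypothetical helper: it only chains the two calls quoted in this thread.
    """
    # Declare the target column, the class labels, and the evaluation dataset,
    # using the same positional arguments as the documentation snippet.
    mlflow_version.set_core_metadata(target_column, classes,
                                     evaluation_dataset_name)
    # Run the evaluation itself. Per this thread, setting the metadata alone
    # is not enough for the performance charts to appear.
    mlflow_version.evaluate(evaluation_dataset_name)
    return mlflow_version
```

You would call it with your own values, e.g. `evaluate_imported_version(mlflow_version, target_column, classes, evaluation_dataset_name)`, where all four arguments come from your project.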

Best,

Harizo

prabhasJ · June 2023

Hi @Krupa

I'm currently working on a project that involves integrating MLflow with Dataiku DSS 10, and I'm facing some challenges. I would greatly appreciate any assistance or guidance you can provide.

Can you please provide the steps to integrate it?

Thanks!

tgb417 · June 2023

@prabhasJ,

Welcome to the Dataiku community. We are so glad you have joined us.

In short, I do not have an answer for you.

However, I do have a suggestion. Dataiku DSS was recently updated to version 12.0.1, and version 10 is getting a little old at this point. I also note there are a fair number of posts that mention version 12 and MLflow. I’m wondering whether your challenges might be related to using an old version of Dataiku with modern MLflow components.

I also note here that one needs to make sure that certain packages are installed to work with MLflow.

https://doc.dataiku.com/dss/latest/mlops/mlflow-models/limitations.html
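As a rough way to check this from a notebook running in the relevant code environment, a sketch like the one below lists which packages cannot be found. The `required` list here is only a placeholder assumption; take the actual package names for your DSS version from the limitations page above.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Placeholder list (assumption): replace with the packages the documentation
# requires for your DSS version.
required = ["mlflow"]
print(missing_packages(required))
```

An empty list means every listed package is importable in the current environment.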

Finally, if you are using an enterprise license, I’d open a support ticket. The support team is very good.

Hope this helps a bit. Good luck, and let us all know how you are getting on.
