Dataiku MLflow integration
Hi Dataiku Team,
I am trying to integrate MLflow models into DSS, which I have successfully done as per the DSS 10 documentation. I would like to know whether we can connect ground truth data (let's say from a Snowflake DB) to the models that we have imported from MLflow into DSS, so that we can view the performance metrics in the model versions tab. I am attaching a screenshot where these tabs are disabled because the ground truth data is unavailable.
Answers
-
Hi,
In order to surface the model performance visualizations, you need to create an "evaluation" dataset (what you call "ground truth") and follow step 3 of the documentation's example:
# 3. Evaluate the saved model
# (Optional, only for tabular models, mandatory to have access to the saved model performance tab)
mlflow_version.set_core_metadata(target_column, classes, evaluation_dataset_name)
mlflow_version.evaluate(evaluation_dataset_name)
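To make it harder to skip one of the two calls, they can be wrapped in a small helper. This is only a sketch: the function name `evaluate_imported_version` is hypothetical, and `mlflow_version` is assumed to be the handle of the MLflow model version you already imported by following the documentation.

```python
def evaluate_imported_version(mlflow_version, target_column, classes,
                              evaluation_dataset_name):
    """Run both calls from step 3, in order, on an imported model version.

    Hypothetical helper: `mlflow_version` is assumed to be the handle of the
    imported MLflow model version, as in the documentation's example.
    """
    # Declare the target column, the class labels, and the evaluation dataset.
    mlflow_version.set_core_metadata(target_column, classes,
                                     evaluation_dataset_name)
    # Perform the evaluation itself; without this call no metrics are computed.
    mlflow_version.evaluate(evaluation_dataset_name)
    return mlflow_version


# Example call (all argument values are placeholders, not real names):
# evaluate_imported_version(mlflow_version, "target", ["0", "1"], "my_eval_dataset")
```

The evaluation dataset would be a regular DSS dataset, so ground truth stored in Snowflake should work as long as it is exposed as a dataset in the project.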
Hope this helps!
Best,
Harizo
-
Hi Team,
1) I used the code below to set the metadata for a particular MLflow imported model version:
mlflow_version.set_core_metadata()
Here the model version tab still does not show the charts and other features depicted (like the confusion matrix and calibration curve).
I have the ground truth data in Snowflake, the predicted data, and the model itself. Is there a way to get data for the performance metrics in the model version details tab?
-
Hi,
Don't forget to run the other line of code, which actually performs the evaluation:
mlflow_version.evaluate(evaluation_dataset_name)
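Before running the evaluation, it may be worth checking that the ground truth rows actually contain the target column and only the labels you pass as `classes`. The sketch below is a plain-Python check on rows loaded as dictionaries; `check_ground_truth` is a hypothetical helper and not part of the Dataiku API.

```python
def check_ground_truth(rows, target_column, classes):
    """Return a list of problems found in the ground truth rows.

    Hypothetical helper: `rows` is assumed to be an iterable of dicts,
    one per record of the evaluation dataset.
    """
    problems = []
    allowed = set(classes)
    for i, row in enumerate(rows):
        if target_column not in row:
            # The target column is required to compute performance metrics.
            problems.append(f"row {i}: missing column {target_column!r}")
        elif row[target_column] not in allowed:
            # A label outside `classes` would not match the declared metadata.
            problems.append(f"row {i}: unexpected label {row[target_column]!r}")
    return problems


# Example with made-up data: the second row carries an undeclared label.
sample = [{"churn": "yes"}, {"churn": "maybe"}]
print(check_ground_truth(sample, "churn", ["yes", "no"]))
```

An empty list means the rows look consistent with the target column and classes you intend to declare.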
Best,
Harizo
-
Hi @Krupa
I'm currently working on a project that involves integrating MLflow with Dataiku DSS 10, and I'm facing some challenges. I would greatly appreciate any assistance or guidance you can provide.
Can you please provide the steps to integrate it?
Thanks!
-
tgb417
Welcome to the Dataiku community. We are so glad you have joined us.
In short, I do not have an answer for you.
However, I do have a suggestion. Dataiku DSS was recently updated to version 12.0.1, and version 10 is getting a little old at this point. I also note that there are a fair number of posts that mention version 12 and MLflow. I'm wondering whether your challenges might be related to trying to use an old version of Dataiku with modern MLflow components.
I also note here that one needs to make sure that certain packages are installed to work with MLflow.
https://doc.dataiku.com/dss/latest/mlops/mlflow-models/limitations.html
Finally if you are using an enterprise license, I’d open a support ticket. The support team is very good.
Hope this helps a bit. Good luck, and let us all know how you are getting on.