
Dataiku MLFlow integration

Krupa
Level 2

Hi Dataiku Team,

I am trying to integrate MLflow models into DSS, and I have successfully imported them by following the DSS 10 documentation. I would like to know whether we can connect ground truth data (for example, from a Snowflake database) to the models imported from MLflow into DSS, so that the performance metrics appear in the model versions tab. I am attaching a screenshot where these tabs are disabled because the ground truth data is unavailable.

3 Replies
HarizoR
Developer Advocate

Hi,

In order to surface the model performance visualizations, you need to create an "evaluation" dataset (what you call "ground truth") and follow step 3 of the documentation's example:

# 3. Evaluate the saved model
# (Optional, only for tabular models, mandatory to have access to the saved model performance tab)
mlflow_version.set_core_metadata(target_column, classes, evaluation_dataset_name)
mlflow_version.evaluate(evaluation_dataset_name)

 

Hope this helps!

 

Best,

Harizo

Krupa
Level 2
Author

Hi Team,

1) I used the code below to set the metadata for a particular MLflow-imported model version:

mlflow_version.set_core_metadata()

However, the model version tab still does not show the charts and other features (such as the confusion matrix and calibration curve).

I have the ground truth data in Snowflake, the predicted data, and the model itself. Is there a way to get the performance metrics to show in the model version details tab?

 

HarizoR
Developer Advocate

Hi,

Don't forget to also run the second line of code, which is the one that actually performs the evaluation:

mlflow_version.evaluate(evaluation_dataset_name)
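To make the order explicit, here is a minimal sketch that wraps both calls from step 3 of the documentation's example. The helper name set_metadata_and_evaluate is hypothetical and not part of the Dataiku API. It assumes mlflow_version is the object you got back when importing the MLflow model, and that the evaluation dataset (the one holding your ground truth, for example a Snowflake-backed dataset) already exists in the DSS project. The arguments are passed positionally, exactly as in the snippet above.

```python
def set_metadata_and_evaluate(mlflow_version, target_column, classes,
                              evaluation_dataset_name):
    """Hypothetical helper: run both step-3 calls in order.

    Assumes mlflow_version exposes set_core_metadata() and evaluate()
    as used in the documentation snippet quoted in this thread.
    """
    # First call: declare the target column, the class labels and the
    # evaluation dataset name for this imported model version.
    mlflow_version.set_core_metadata(target_column, classes,
                                     evaluation_dataset_name)
    # Second call: run the evaluation on the dataset. Per this thread,
    # the performance tab stays empty until this has been run.
    mlflow_version.evaluate(evaluation_dataset_name)
    return mlflow_version
```

Calling set_core_metadata() alone, as in the earlier post, only sets the metadata; the evaluate() call is what runs the evaluation.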

 

Best,

Harizo
