
model performance export

Solved!
NR
Level 3

Hello,

Is there any way to export model performance? I'm interested in both the training session results and the performance of the deployed model.

Here are the screenshots:

Sans titre2.png

Sans titre2.png

Thanks

1 Solution
SarinaS
Dataiker

Hi @NR,

For the second screen, you should be able to click on the tiny wheel icon on the right:

Screenshot 2023-05-01 at 5.51.53 PM.png

And click on "Create dataset from metrics data".

For the first screenshot, if you want a dataset/CSV file of the different training sessions, you could use the Python API to get the metrics for each training, add them to a dataframe, and then create a dataset with the full table of data.

Here's an example. You may want slightly different metrics depending on the specific training. You can look at the output of get_performance_metrics() to determine the right keys to put into metric_keys.

import dataiku
import pandas as pd 

client = dataiku.api_client()
project = client.get_default_project()
analysis = project.get_analysis('ANALYSIS_ID')
mltask = analysis.get_ml_task('MLTASK_ID') # can also get this from analysis.list_ml_tasks()

metrics_array = []
metric_keys = ['accuracy', 'precision', 'recall', 'f1', 'auc', 'aucstd', 'logLoss', 'logLossstd', 
'calibrationLoss', 'calibrationLossstd', 'lift']

# for each trained model of the ML task (one row per training)
for model_id in mltask.get_trained_models_ids():
    single_training_array = []
    details = mltask.get_trained_model_details(model_id)
    # this returns the performance metrics of the trained model as a dict
    metrics = details.get_performance_metrics()
    # the first column is the model name, followed by the metrics listed in "metric_keys"
    single_training_array.append(details.get_user_meta()['name'])
    for key in metric_keys:
        # .get() yields None instead of raising KeyError when a model lacks a metric
        single_training_array.append(metrics.get(key))
    # store into an array of arrays, where each row represents a training
    metrics_array.append(single_training_array)

metric_keys.insert(0, 'name')
# point to an existing output managed dataset that you created in the flow
metric_dataset = dataiku.Dataset('managed_metrics')
# turn our metric array of arrays into a dataframe 
df = pd.DataFrame(metrics_array, columns = metric_keys)
# write dataframe to dataset 
metric_dataset.write_with_schema(df)

 

This is what my output dataset looks like:

Screenshot 2023-05-01 at 6.40.23 PM.png

And my analysis training data screen:

Screenshot 2023-05-01 at 6.40.32 PM.png
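
If you are not sure in advance which metrics each training reports, one option is to collect the union of the keys and leave missing values empty. The following is only a minimal sketch in plain Python: collect_metric_rows and rows_to_csv are hypothetical helper names, and the sample dict is made-up data standing in for the name from get_user_meta() and the dict returned by get_performance_metrics() in the snippet above.

```python
import csv
import io


def collect_metric_rows(metrics_by_model, metric_keys=None):
    """Build one row per trained model from {model_name: metrics_dict}.

    If metric_keys is None, the sorted union of all keys seen is used.
    Metrics that a model does not report are stored as None.
    """
    if metric_keys is None:
        metric_keys = sorted({k for m in metrics_by_model.values() for k in m})
    columns = ['name'] + list(metric_keys)
    rows = []
    for name, metrics in metrics_by_model.items():
        row = {'name': name}
        for key in metric_keys:
            row[key] = metrics.get(key)
        rows.append(row)
    return columns, rows


def rows_to_csv(columns, rows):
    """Serialize the rows to CSV text; None values become empty cells."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample values, for illustration only (not real results)
sample = {
    'Random forest': {'accuracy': 0.91, 'auc': 0.95},
    'Logistic regression': {'accuracy': 0.88, 'logLoss': 0.31},
}
columns, rows = collect_metric_rows(sample)
print(rows_to_csv(columns, rows))
```

In a notebook you would fill the dict inside the loop over the trained models instead of using the sample data, and the resulting rows can also be passed to pd.DataFrame(rows, columns=columns) if you prefer to write a dataset rather than a CSV.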

I hope that helps! 

Thanks,
Sarina 

NR
Level 3
Author

Thanks @SarinaS. That addresses all my points 🙂

I do wonder why we need code to export training sessions. Anyway, your answer is perfect.
