
Get Calculated Statistics with Model Evaluation Store API?

Solved!
cmjurs
Level 2

I'm aware that you can retrieve many metrics from the Model Evaluation Store (MES) through the API,

e.g.

mes = dataiku.ModelEvaluationStore("XXX")
mes_info = mes.get_info()

the_MES_list = mes.get_metric_history("XXX")

My question is, how can I extract the calculated values in the MES, like those in the univariate data drift section (image)? I'd like to be able to alarm on these values.

Thanks!

CJ


Operating system used: Ubuntu

1 Solution
fsergot
Dataiker

Hello,

You need to go through the path Project > Model Evaluation Store > Model Evaluation, and then call compute_data_drift() on the Model Evaluation.

Here is some sample code that does that:

import pprint

import dataiku

pp = pprint.PrettyPrinter(indent=4)

# Navigate Project > Model Evaluation Store > Model Evaluation
client = dataiku.api_client()
proj = client.get_project("DKU_ENERGY_CONSUMPTION_2")
mes = proj.get_model_evaluation_store("b0G6ywDN")
print("Found MES '{}'".format(mes.mes_id))

me = mes.get_latest_model_evaluation()
print("Latest ME is '{}'".format(me.full_id))

# Performance metrics stored with the latest Model Evaluation
metrics = me.get_metrics()
#pp.pprint(metrics)
print("Latest ME metrics are: ")
for metric in metrics['metrics']:
    print("     {} = {}".format(metric['meta']['name'], metric['lastValues'][0]['value']))

print("--- Getting last data drift metrics ---")
drift = me.compute_data_drift()
raw_drift = drift.get_raw()
pp.pprint(raw_drift)

# Per-feature (univariate) drift results, keyed by feature
univariateDriftResult = raw_drift['univariateDriftResult']['columns']
for feature in univariateDriftResult:
    feat = univariateDriftResult[feature]
    pp.pprint(feat)
    # Not every statistic is present for every feature, hence the "N/A" fallback
    print("Feature '{}' => KS test = {} | Chi-square test = {} | PSI = {}".format(
        feat['name'],
        feat.get('ksTestPvalue', "N/A"),
        feat.get('chiSquareTestPvalue', "N/A"),
        feat.get('populationStabilityIndex', "N/A"),
    ))
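
Since the goal is to alarm on these values, here is a minimal sketch of how the per-feature results could be checked against thresholds. It only relies on the keys used in the code above (name, ksTestPvalue, chiSquareTestPvalue, populationStabilityIndex). The function name find_drifted_features, the thresholds (0.05 for p-values, 0.2 for PSI) and the sample values are assumptions made up for illustration, so adjust them to your own use case.

```python
def find_drifted_features(columns, p_value_threshold=0.05, psi_threshold=0.2):
    """Return {feature_name: [reasons]} for features that cross a threshold.

    `columns` is expected to look like
    drift.get_raw()['univariateDriftResult']['columns'].
    The default thresholds are illustrative assumptions, not Dataiku defaults.
    """
    alerts = {}
    for key, feat in columns.items():
        reasons = []
        # A low p-value suggests the two distributions differ
        for p_key in ("ksTestPvalue", "chiSquareTestPvalue"):
            p_value = feat.get(p_key)
            if p_value is not None and p_value < p_value_threshold:
                reasons.append("{} = {} < {}".format(p_key, p_value, p_value_threshold))
        # A high PSI suggests a shift in the feature's distribution
        psi = feat.get("populationStabilityIndex")
        if psi is not None and psi > psi_threshold:
            reasons.append("populationStabilityIndex = {} > {}".format(psi, psi_threshold))
        if reasons:
            alerts[feat.get("name", key)] = reasons
    return alerts


# Made-up values for illustration only, shaped like the 'columns' dict
sample_columns = {
    "temperature": {
        "name": "temperature",
        "ksTestPvalue": 0.001,
        "populationStabilityIndex": 0.35,
    },
    "building_type": {
        "name": "building_type",
        "chiSquareTestPvalue": 0.60,
        "populationStabilityIndex": 0.02,
    },
}

alerts = find_drifted_features(sample_columns)
for name, reasons in alerts.items():
    print("Drift alert on '{}': {}".format(name, "; ".join(reasons)))
```

With real data you would pass drift.get_raw()['univariateDriftResult']['columns'] instead of sample_columns. One option is then to raise an exception when alerts is not empty, so that whatever runs the script can treat it as a failure.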

 


2 Replies
cmjurs
Level 2
Author

@fsergot A+ Thanks!
