Model evaluation: Running checks on the most recent scores
Hi all,
I am running the model evaluation recipe and writing the results out to a dataset, and I want to add checks to this dataset to look for model drift.
The evaluation recipe appends to the dataset with every run, creating a handy archive of model performance. How can I configure my checks to look at the most recent scores only? I only want to trigger a retrain based on the most recent performance, not an average.
Ben
Best Answers
-
Hi,
You can create a custom metric that shows you the most recent score, and then base your check on this metric.
To do that, go to the Metrics tab -> Edit -> Python probe and do something like this:
import numpy as np

MY_SCORE = 'my_score_column'  # replace with the name of your score column

def process(dataset, partition_id):
    df = dataset.get_dataframe()  # one row appended per evaluation run
    # keep the row with the most recent date and read its score
    most_recent_score = df[df['date'] == np.max(df['date'])][MY_SCORE].values[0]
    metric_values = {'My most recent score': most_recent_score}
    return metric_values

Cheers,
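For the check itself, here is a minimal sketch of a custom Python check built on that metric, assuming DSS's custom check signature process(last_values, dataset, partition_id). The exact key under which the metric appears in last_values depends on your probe, so inspect last_values.keys() first, and the 0.8 threshold is a purely hypothetical retrain trigger:

def process(last_values, dataset, partition_id):
    # last_values maps metric ids to their latest recorded data points
    score = float(last_values['My most recent score'].get_value())
    threshold = 0.8  # hypothetical value below which a retrain is warranted
    if score < threshold:
        return 'ERROR', 'Latest score %.3f is below the threshold' % score
    return 'OK', 'Latest score %.3f' % score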
-
Once the custom Python probe is added, it can be found in the Metrics list shown in the screenshot; here I have chosen 3 of the 11 available metrics.
Answers
-
Thanks @duphan - when one creates a Python metric like this, where can it then be selected? I don't see it appearing as an option in the available metrics.
I also found another solution to this: syncing to an additional table, which is set to append, and changing the first metrics table to overwrite, which is where I then set my checks:
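For reference, a rough sketch of that append step as a Python recipe, in case anyone prefers it over a Sync recipe. The dataset names are hypothetical, it assumes the archive dataset's schema is already set, and the spec_item append flag is how I understand the write API, so double-check it against your DSS version:

import dataiku

latest = dataiku.Dataset("model_scores_latest")    # overwritten by each evaluation run
archive = dataiku.Dataset("model_scores_archive")  # accumulates the full history

df = latest.get_dataframe()
archive.spec_item["appendMode"] = True  # append rather than overwrite
with archive.get_writer() as writer:
    writer.write_dataframe(df)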
Ben