How do I send an email to the user based on a condition on count of records in a dataset?

Hi,
After reading the documentation, I came across the "Compute metrics" step in Scenarios, but how do I retrieve the record count of the dataset using ${stepOutput_the_metrics}? Then, if the count is more than 0, I want to trigger an email to the user.
I am on DSS version 13.
Any help is appreciated.
Thanks
Operating system used: Windows
Answers
-
Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,252 Neuron
Hi, have a look at this post which covers in detail how to extract values from a metric into a scenario variable:
-
@Turribeach thanks for the quick response. I looked at the solution, but it needs the project ID and dataset ID to be specified in the formula. Can this be done with custom Python code instead? I was doing something like this. The code gives me the metric count, but how do I use the "query_fail_count" variable in the Reporter section to check the condition and send the email? E.g. outcome == 'SUCCESS' && query_fail_count > 0 THEN send an email
-
Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,252 Neuron
Personally I think it’s more elegant without using Python. You can however use Python too. You will need to define a scenario variable to be able to use the value in other scenario steps. This code should do that:
import dataiku
from dataiku.scenario import Scenario

mydataset = dataiku.Dataset("dataset")
Scenario().set_scenario_variables(
    query_fail_count=mydataset.get_last_metric_values().get_global_value('records:COUNT_RECORDS')
)
However, I don't like this solution for two reasons:
- dataset.get_last_metric_values() assumes metrics have been computed successfully at least once for the dataset. It will fail if no metrics / record count have ever been run or if they have never completed successfully.
- dataset.get_last_metric_values() may give you outdated data, as it does not guarantee that the value is the current record count of the dataset. If your last metric computation failed, this call will give you the previous value without any warning!
Fixing the above issues in Python code is possible, but it needs several more lines of code to execute the metric, wait for it to complete, check the result, etc. That is why the solution I proposed in my other post is more elegant, much simpler and more robust.
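For anyone who still wants the Python route, here is a rough sketch of what those extra lines might look like. It is only an illustration, not a tested DSS recipe: fresh_record_count is a hypothetical helper name, and it takes the "compute" and "read" actions as plain callables so the logic can be checked outside DSS. The wiring shown in the comments assumes that Scenario.compute_dataset_metrics is available in your DSS version and that it raises when the computation fails; please verify both against your instance.

```python
from typing import Any, Callable

# Metric id taken from the snippet earlier in this thread.
RECORD_COUNT_METRIC = "records:COUNT_RECORDS"


def fresh_record_count(compute_metrics: Callable[[], Any],
                       read_metrics: Callable[[], Any],
                       metric_id: str = RECORD_COUNT_METRIC) -> int:
    """Recompute metrics, then read one value, failing loudly on any problem.

    Hypothetical helper: `compute_metrics` is expected to raise if the
    computation fails, and `read_metrics` must return an object exposing
    get_global_value(metric_id).
    """
    # Step 1: force a recomputation so we never reuse a value from an older run.
    compute_metrics()

    # Step 2: read the value; turn any lookup problem into one clear error.
    try:
        value = read_metrics().get_global_value(metric_id)
    except Exception as exc:
        raise RuntimeError(f"Metric {metric_id!r} is not available") from exc
    if value is None:
        raise RuntimeError(f"Metric {metric_id!r} has no value")

    # Step 3: coerce defensively in case the value comes back as a string.
    return int(value)


# Possible wiring inside a scenario "Execute Python code" step
# (assumption: these calls behave as described in the lead-in):
#
#   import dataiku
#   from dataiku.scenario import Scenario
#
#   scenario = Scenario()
#   dataset = dataiku.Dataset("dataset")
#   count = fresh_record_count(
#       lambda: scenario.compute_dataset_metrics("dataset"),
#       dataset.get_last_metric_values,
#   )
#   scenario.set_scenario_variables(query_fail_count=count)
```

Passing the two actions in as callables keeps the helper independent of DSS, so the "missing metric" and "stale value" cases can be exercised with simple stand-in objects before running it in a real scenario.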
Add a Compute metrics step to your scenario:
This guarantees the record count has been calculated successfully as part of the scenario run. If the compute metrics step fails, the scenario fails. Then fetch the metric value and define the variable:
Finally, use it in any subsequent scenario steps for conditional execution of the step:
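As a side note, the condition from the question (outcome == 'SUCCESS' && query_fail_count > 0) can be mirrored in plain Python to sanity-check the logic before putting it into a reporter or step run condition. This is only a sketch: should_send_email is a hypothetical name, the real condition in DSS is written as a formula rather than Python, and the string handling below is an assumption in case the variable arrives as text.

```python
def should_send_email(outcome: str, query_fail_count) -> bool:
    """Plain-Python mirror of: outcome == 'SUCCESS' && query_fail_count > 0."""
    # Assumption: the variable may arrive as a string or be missing,
    # so coerce it and treat anything unparseable as "do not send".
    try:
        count = int(query_fail_count)
    except (TypeError, ValueError):
        return False
    return outcome == "SUCCESS" and count > 0


# Example checks of the intended behaviour:
print(should_send_email("SUCCESS", 3))   # scenario succeeded and rows exist
print(should_send_email("SUCCESS", 0))   # scenario succeeded but no rows
print(should_send_email("FAILED", 5))    # scenario did not succeed
```

Only the first call satisfies both parts of the condition, which is the case where the email should go out.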