Hello,
I have a requirement to monitor the percentage of 0s in a dataset, include three weeks of data in a weekly email, and raise an alert in the email notification when more than 90% of the values are 0s.
How can this be done in Dataiku?
Thanks in advance
To keep track of zero percentages in your dataset using Dataiku and send out weekly emails containing three weeks of data, as well as trigger an alert for zero percentages exceeding 90%, follow these steps:
By following these steps, you'll ensure consistent monitoring of zero percentages and receive alerts whenever thresholds are breached, all conveniently managed within Dataiku's workflow.
To send data as part of your scenario, look at scenario reporters:
https://knowledge.dataiku.com/latest/mlops-o16n/automation/tutorial-reporters.html
To generate an email notification for the percentage of zeros in your dataset, you need to set a metric on your dataset. Then retrieve the metric's value in the scenario and use conditional step execution logic on a reporter scenario step to send it. Here is a post covering conditional step execution in detail.
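For illustration only, the quantity the metric has to capture and the alert condition can be sketched in plain Python. The function names and the constant below are hypothetical helpers, not part of the Dataiku API; in a real project the value would come from a dataset metric and the comparison would live in the step's run condition.

```python
from typing import Iterable

# Threshold from the question: alert when more than 90% of the values are 0.
ALERT_THRESHOLD_PCT = 90.0


def zero_percentage(values: Iterable[float]) -> float:
    """Return the percentage (0-100) of values equal to 0; 0.0 for empty input."""
    values = list(values)
    if not values:
        return 0.0
    zeros = sum(1 for v in values if v == 0)
    return 100.0 * zeros / len(values)


def should_alert(pct_zeros: float, threshold: float = ALERT_THRESHOLD_PCT) -> bool:
    """True only when the zero percentage strictly exceeds the threshold."""
    return pct_zeros > threshold
```

The comparison is strict, so a value of exactly 90% does not trigger the alert, which matches the "> 90%" wording of the requirement.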
Hey Turribeach, thanks so much for the answer.
I still have some confusion here. I have created a scenario with steps to calculate the percentage of 0s in a column and store it in a new dataset. Now I'm trying to send an email with the last 3 weeks of data from the new dataset, with some filtering applied. How can I do that?
I tried adding a send message step to the scenario, but it attaches the complete dataset without any filtering.
I also tried using custom Python code to filter the output dataset and store the result in a new DataFrame variable. But how can I use this variable from the Python script in the email body or in the send message step?
@Ruta wrote: I still have some confusion here. I have created a scenario with steps to calculate the percentage of 0s in a column and store it in a new dataset.
In order to take conditional steps in a scenario, you need to have the percentage of 0s as a dataset metric, so that you can retrieve the metric's value and store it in a scenario variable. This is independent of your requirement to send the dataset containing the calculated percentage of 0s as a mail attachment. In other words, you need both things: the metric and the percentage-of-0s dataset. Have another look at my post and the post from @Grixis, as they explain how to do this. If you are still having issues, describe all the steps you followed and where you see an error.
@Ruta wrote: Now I'm trying to send an email with the last 3 weeks of data from the new dataset, with some filtering applied. How can I do that? I tried adding a send message step to the scenario, but it attaches the complete dataset without any filtering.
Mail dataset attachments can't have any filters applied to them, but this is trivial to work around. Just add a new Filter recipe that takes your original percentage-of-0s dataset as input, and set whatever filters you want. Then use the new output dataset as your mail attachment.
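If it helps to see the filter condition spelled out, here is a rough sketch in plain Python of the "last 3 weeks" logic that the Filter recipe would apply. The function name, the run_date and pct_zeros column names, and the sample rows are all assumptions made for illustration; they do not come from the project in this thread.

```python
from datetime import date, timedelta
from typing import Dict, List


def last_n_weeks(rows: List[Dict], today: date, weeks: int = 3,
                 date_key: str = "run_date") -> List[Dict]:
    """Keep rows dated after (today - weeks) and no later than today."""
    cutoff = today - timedelta(weeks=weeks)
    return [r for r in rows if cutoff < r[date_key] <= today]


# Hypothetical history of weekly results.
history = [
    {"run_date": date(2024, 1, 1), "pct_zeros": 95.0},
    {"run_date": date(2024, 1, 14), "pct_zeros": 40.0},
    {"run_date": date(2024, 1, 21), "pct_zeros": 92.5},
    {"run_date": date(2024, 1, 28), "pct_zeros": 10.0},
]

recent = last_n_weeks(history, today=date(2024, 1, 28))
```

With these sample rows, the cutoff is 2024-01-07, so the row from 2024-01-01 is dropped and the other three are kept.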
Is it possible to attach more than one result dataset in a send message step (in a scenario)? If yes, how?
Why do you think you can attach only one result dataset? Show a screenshot of how you are doing it.
Hi,
I am able to attach multiple datasets in a single send message step.
Currently I am facing one more issue: when I run the Python recipe manually to calculate the percentage of zeros in the input dataset, the output dataset is overwritten each time the recipe executes, and the historical data is deleted. Could you please help with this problem?
The following code is not working:
# Write recipe outputs
results_w_brands_no_filter_S3_PROD_predcited_zeros = dataiku.Dataset("results_w_brands_no_filter_S3_PROD_predcited_zeros",ignore_flow=True)
results_w_brands_no_filter_S3_PROD_predcited_zeros.write_with_schema(new_result)
Thanks
Hey, thanks so much for the answer.
How can I apply a filter on the output dataset (a CSV file) to fetch only the last 3 weeks of records and send the mail every weekend?
I also need to highlight records with a warning if the error value is > 90.
How can I do this in Dataiku? Do I need to write a Python recipe for it, or is there some other way to do it? Can you please explain in detail? I'm new to Dataiku.
Thanks
Hello @Ruta
Indeed, there is no explicit example in the documentation for this, but I think the two attachments meet your need.
By first running a compute metrics step on a dataset, you keep the result as 'stepOutput'.
Later in your scenario, you can then add a following step that reads stepOutput_your_name_of_previous_update_metrics_steps to set project variables using a visual step.
In the attached example, I update the metrics of a dataset in my project and name the step the_metrics. Right after it, I set variables from my object stepOutput_the_metrics, storing all of its information as complete_json_example.
Then there are 3 other example variables to show you how to use the Dataiku formula language to filter on precise values: (filter(parseJson(stepOutput_the_metrics)["database.your_dataset_name"]['computed'], x, x["metricId"]=="col_stats:MEAN:build_time_avg")[0].value), with col_stats:MEAN:build_time_avg as the ID of the metric you want to capture.
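To make the formula easier to read, here is a rough Python equivalent of the same lookup. It assumes the step output is a JSON string with the shape implied by the formula above (a top-level key per dataset, holding a 'computed' list of objects with "metricId" and "value"). The function name and the sample payload are made up for illustration, and the real step output may contain more fields.

```python
import json
from typing import Any, Optional


def metric_value(step_output_json: str, result_key: str, metric_id: str) -> Optional[Any]:
    """Return the value of the first computed metric matching metric_id, or None."""
    data = json.loads(step_output_json)
    computed = data[result_key]["computed"]
    for metric in computed:
        if metric.get("metricId") == metric_id:
            return metric.get("value")
    return None


# Hypothetical payload, shaped after the formula above.
sample = json.dumps({
    "database.your_dataset_name": {
        "computed": [
            {"metricId": "records:COUNT_RECORDS", "value": "120"},
            {"metricId": "col_stats:MEAN:build_time_avg", "value": "42.5"},
        ]
    }
})

value = metric_value(sample, "database.your_dataset_name",
                     "col_stats:MEAN:build_time_avg")
```

Unlike the formula's [0] index, this sketch returns None when no metric matches instead of failing, which can be easier to handle in a later condition.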