Set up warning for failure in the scenario

Ruta Registered Posts: 8

Hello,

I need to monitor the percentage of 0s in a dataset, include the last 3 weeks of data in a weekly email, and raise an alert in the email notification when more than 90% of the values are 0.

How can this be done in Dataiku?

Thanks in advance

Best Answer

  • qweenfo
    qweenfo Registered Posts: 3
    Answer ✓

    To keep track of zero percentages in your dataset using Dataiku and send out weekly emails containing three weeks of data, as well as trigger an alert for zero percentages exceeding 90%, follow these steps:

    1. Build a recipe, or define a dataset metric, that computes the percentage of zeros in your dataset.
    2. Create a scenario with a time-based trigger so that the calculation runs weekly.
    3. Add a mail reporter or a Send message step to the scenario to send the weekly email with the calculated data.
    4. Make the alert conditional so that it is sent only when the zero percentage exceeds 90%.

    By following these steps, you'll ensure consistent monitoring of zero percentages and receive alerts whenever thresholds are breached, all conveniently managed within Dataiku's workflow.
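
    The calculation behind steps 1 and 4 can be sketched in plain Python. This is only an illustration, not a Dataiku API: the function names `zero_percentage` and `needs_alert` and the sample values are assumptions made for the example. In a real project the values would come from the dataset column being monitored.

```python
def zero_percentage(values):
    """Return the percentage of entries equal to zero (0.0 for empty input)."""
    values = list(values)
    if not values:
        return 0.0
    zeros = sum(1 for v in values if v == 0)
    return 100.0 * zeros / len(values)


def needs_alert(pct, threshold=90.0):
    """True when the zero percentage is strictly above the threshold."""
    return pct > threshold


# Hypothetical column values: 19 zeros out of 20 rows.
sample = [0] * 19 + [5]
pct = zero_percentage(sample)
alert = needs_alert(pct)
```

    The comparison is strict, so a value of exactly 90% does not raise the alert, which matches the "> 90%" wording of the question.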

Answers

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 1,757 Neuron

    To send data as part of your scenario look at scenario reports:

    https://knowledge.dataiku.com/latest/mlops-o16n/automation/tutorial-reporters.html

    To send an email notification based on the percentage of zeros in your dataset, you need to define a metric on the dataset. Then retrieve the value of the metric in the scenario and use conditional step execution on a reporter scenario step to send the alert. Here is a post covering conditional step execution in detail.

  • Ruta
    Ruta Registered Posts: 8

    Hey Turribeach, thanks so much for the answer.

    I am still a bit confused. I have created a scenario with steps that calculate the percentage of 0s in a column and store it in a new dataset. Now I am trying to send an email with the last 3 weeks of data from that new dataset, with some filtering applied. How can I do that?

    I tried adding a Send message step to the scenario, but it attaches the complete dataset without any filtering.

    I also tried using custom Python code to filter the output dataset and store the result in a new DataFrame variable. But how can I use this variable from the Python script in the email body or in the Send message step?

  • Grixis
    Grixis PartnerApplicant, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 56 ✭✭✭✭✭

    Hello @Ruta

    Indeed, there is no explicit example in the documentation for this but I think the two attachments meet your need.

    If you first run a Compute metrics step on a dataset, its result is kept as a 'stepOutput' variable.

    A following step in the scenario can then read stepOutput_your_name_of_previous_update_metrics_steps and set project variables from it, using a visual step.

    In the attached example, I update the metrics of a dataset in my project in a step named the_metrics. Right after it, I set variables from the stepOutput_the_metrics object, storing the complete output as complete_json_example.

    The 3 other example variables show how to use the Dataiku formula language to filter on specific values: filter(parseJson(stepOutput_the_metrics)["database.your_dataset_name"]['computed'], x, x["metricId"]=="col_stats:MEAN:build_time_avg")[0].value, where col_stats:MEAN:build_time_avg is the ID of the metric you want to capture.
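
    For readers more comfortable with Python than with the formula language, the same lookup can be sketched as below. The JSON payload is hypothetical: its shape is only inferred from the formula above (a dataset key, a 'computed' list, and entries with 'metricId' and 'value'), and the helper name `metric_value` is an assumption for the example. Check the real step output in your own scenario before relying on this structure.

```python
import json


def metric_value(step_output_json, dataset_key, metric_id):
    """Return the value of the first computed metric matching metric_id."""
    payload = json.loads(step_output_json)
    computed = payload[dataset_key]["computed"]
    matches = [m for m in computed if m.get("metricId") == metric_id]
    if not matches:
        raise KeyError(metric_id)
    return matches[0]["value"]


# Hypothetical step output, shaped after the formula in the post above.
sample_output = json.dumps({
    "database.your_dataset_name": {
        "computed": [
            {"metricId": "records:COUNT_RECORDS", "value": "1200"},
            {"metricId": "col_stats:MEAN:build_time_avg", "value": "42.5"},
        ]
    }
})

value = float(metric_value(
    sample_output,
    "database.your_dataset_name",
    "col_stats:MEAN:build_time_avg",
))
```

    The value is converted with float() because the sample stores it as a string; whether that conversion is needed in practice depends on the actual output.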

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 1,757 Neuron


    @Ruta
    wrote:

    I am still a bit confused. I have created a scenario with steps that calculate the percentage of 0s in a column and store it in a new dataset.


    To run steps conditionally in a scenario, the percentage of 0s must be available as a dataset metric, so that you can retrieve its value and store it in a scenario variable. This is independent of your requirement to send the percentage-of-0s dataset as a mail attachment. In other words, you need both: the metric and the percentage-of-0s dataset. Have another look at my post and the post from @Grixis, as they explain how to do this. If you are still having issues, describe all the steps you followed and where you see an error.


    @Ruta
    wrote:

    Now I am trying to send an email with the last 3 weeks of data from that new dataset, with some filtering applied. How can I do that? I tried adding a Send message step to the scenario, but it attaches the complete dataset without any filtering.

    Mail dataset attachments can't have any filters applied to them. But this is a trivial problem to bypass. Just add a new Filter recipe using your original calculate percentage of 0s dataset as an input and then set whatever filters you want. Then use the new output dataset in your mail attachment.
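
    If the filtering is done in a Python recipe rather than a visual Filter recipe, the logic could look like the sketch below. It is a minimal example under stated assumptions: the column names `run_date` and `zero_pct`, the `warning` flag, and the sample rows are all hypothetical, and the rows are plain dictionaries rather than a Dataiku dataset.

```python
from datetime import date, timedelta


def last_three_weeks(rows, today, threshold=90.0):
    """Keep rows dated within the 3 weeks up to `today` and flag high values."""
    cutoff = today - timedelta(weeks=3)
    recent = []
    for row in rows:
        if cutoff < row["run_date"] <= today:
            flagged = dict(row)  # copy so the input rows are left untouched
            flagged["warning"] = row["zero_pct"] > threshold
            recent.append(flagged)
    return sorted(recent, key=lambda r: r["run_date"])


# Hypothetical weekly history of the zero percentage.
history = [
    {"run_date": date(2024, 2, 25), "zero_pct": 40.0},
    {"run_date": date(2024, 3, 10), "zero_pct": 55.0},
    {"run_date": date(2024, 3, 17), "zero_pct": 72.5},
    {"run_date": date(2024, 3, 24), "zero_pct": 95.0},
]
report = last_three_weeks(history, today=date(2024, 3, 24))
```

    The lower bound is exclusive, so with one run per week the result holds exactly three rows. The extra `warning` column gives the mail attachment a simple way to mark rows above the 90% threshold.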

  • Ruta
    Ruta Registered Posts: 8

    Is it possible to attach more than one result dataset in a Send message step (in a scenario)? If yes, how do I do it?

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 1,757 Neuron

    Why do you think you can attach only one result dataset? Show a screenshot of how you are doing it.

  • Ruta
    Ruta Registered Posts: 8

    Hi,

    I am able to attach multiple datasets in a single Send message step.

    I am now facing one more issue: when I manually run the Python recipe that calculates the percentage of zeros in the input dataset, the output dataset is overwritten on each run and the historical data is deleted. Could you please help with this problem?

    The following code is not working:

    # Write recipe outputs
    results_w_brands_no_filter_S3_PROD_predcited_zeros = dataiku.Dataset("results_w_brands_no_filter_S3_PROD_predcited_zeros",ignore_flow=True)
    results_w_brands_no_filter_S3_PROD_predcited_zeros.write_with_schema(new_result)

    Thanks
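
    The snippet above rewrites the whole output on every run, which is consistent with the history being lost. One possible way around it, sketched here in plain Python, is to merge the new rows into the existing history before writing. The helper name `append_history`, the `run_date` key, and the sample rows are assumptions for the example. In a recipe, the existing rows would first have to be read back from the output dataset, and some Dataiku versions also offer an append option on the recipe output, so check what your instance provides.

```python
def append_history(existing_rows, new_rows, key="run_date"):
    """Merge new rows into the history, keeping one row per key value."""
    merged = {row[key]: row for row in existing_rows}
    for row in new_rows:
        merged[row[key]] = row  # a rerun for the same key replaces the old row
    return [merged[k] for k in sorted(merged)]


# Hypothetical history already stored, plus the result of the current run.
stored = [
    {"run_date": "2024-03-10", "zero_pct": 55.0},
    {"run_date": "2024-03-17", "zero_pct": 72.5},
]
current = [{"run_date": "2024-03-24", "zero_pct": 95.0}]

combined = append_history(stored, current)
```

    Keying on the run date means that rerunning the recipe on the same day replaces that day's row instead of duplicating it.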

  • Ruta
    Ruta Registered Posts: 8

    Hey, thanks so much for the answer.

    How can I apply a filter to the output dataset (a CSV file) to fetch only the last 3 weeks of records, and send the mail every weekend?

    I also need to highlight records with a warning if the error value is > 90.

    How can I do this in Dataiku? Do I need to write a Python recipe for it, or is there some other way to do it? Can you please explain in detail? I am new to Dataiku.

    Thanks
