Dataiku Scenario: How to control scenario steps using a variable?

Registered Posts: 1
edited March 31 in Using Dataiku

Hi everyone, I have designed my dataset so that it always holds a single value, either 'true' or 'false'. In my Dataiku scenario, I want to control the flow based on this value.

If the dataset contains 'true', the next step (building datasets in the project) should proceed and an email notification should be triggered.

If the dataset contains 'false', the scenario should stop: the dataset build should not occur and no email should be sent.

Could anyone guide me on the best way to implement this logic in a Dataiku scenario?

Operating system used: Windows

Answers

  • Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,384 Neuron

    It's possible to do this using SQL, but only if your dataset is stored in a technology that supports SQL. I would therefore suggest using a metric, which is a more portable solution that works for any dataset type. Follow these steps:

    1. In the dataset that holds your true/false value, click Metrics, then Edit Metrics, and enable Max under column statistics for your true/false column.
    2. Create a "Compute metrics" scenario step and add the dataset that holds the value to it. You must give this step a name, and the name should not contain spaces (let's call this step Compute_Metrics).
    3. Next, create a "Define scenario variables" scenario step.
    4. On the Define scenario variables step, toggle "Evaluated variables" to ON.
    5. Then define a new variable (let's call it My_Column) with this formula:

      filter(parseJson(stepOutput_Compute_Metrics)['CT_TEST.my_dataset_NP']['computed'], x, x["metricId"]=="col_stats:MAX:my_column")[0].value

      Replace CT_TEST with your project ID, my_dataset with your dataset ID (leave the _NP suffix), and my_column with your column name, which is case sensitive. Note that the "Compute_Metrics" part of stepOutput_Compute_Metrics refers to the name of the earlier step where you computed metrics for the dataset.
    6. Finally, in all your conditional steps (i.e. steps that should only run when the variable is true), set "Run this step" to "If condition is satisfied" and the condition to:

      My_Column == "true" && outcome != 'FAILED'
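
    To make the logic of the formula and the condition easier to follow, here is a rough sketch of the same lookup in plain Python, outside of Dataiku. The sample payload is hypothetical and shaped only after what the formula implies (a JSON object keyed by project and dataset, with a "computed" list of metricId/value entries); real step output will likely carry more fields. The names extract_metric and should_run, and the "records:COUNT_RECORDS" entry, are made up for illustration.

```python
import json

# Hypothetical payload, shaped only after the formula above: a JSON object keyed by
# "<PROJECT>.<dataset>_NP" whose "computed" list holds {"metricId", "value"} entries.
# This is an assumption for illustration, not the exact step output format.
SAMPLE_STEP_OUTPUT = json.dumps({
    "CT_TEST.my_dataset_NP": {
        "computed": [
            {"metricId": "records:COUNT_RECORDS", "value": "1"},
            {"metricId": "col_stats:MAX:my_column", "value": "true"},
        ]
    }
})


def extract_metric(step_output, dataset_key, metric_id):
    """Mirror filter(parseJson(...)[key]['computed'], x, x["metricId"]==id)[0].value."""
    computed = json.loads(step_output)[dataset_key]["computed"]
    matches = [m for m in computed if m["metricId"] == metric_id]
    if not matches:
        # The formula would fail on [0] here; raise a clearer error instead.
        raise KeyError(f"metric {metric_id!r} not found for {dataset_key!r}")
    return matches[0]["value"]


def should_run(my_column, outcome):
    """Mirror the step condition: My_Column == "true" && outcome != 'FAILED'."""
    return my_column == "true" and outcome != "FAILED"


my_column = extract_metric(
    SAMPLE_STEP_OUTPUT, "CT_TEST.my_dataset_NP", "col_stats:MAX:my_column"
)
print(should_run(my_column, "SUCCESS"))
```

    The comparison is against the string "true" rather than a boolean, matching the condition above, so a dataset value of 'false' (or a failed earlier step) causes the conditional steps to be skipped.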

    [Screenshot: sample execution with a true value]

    [Screenshot: sample execution with a false value]
