Dataiku Scenario: How to control the scenario steps using a variable?

Hi everyone, I have designed a dataset that always contains a single value, either 'true' or 'false'. In my Dataiku scenario, I want to control the flow based on this value.
If the dataset contains 'true', the next step (building datasets in the project) should proceed and an email notification should be triggered.
If the dataset contains 'false', the scenario should stop: the dataset build should not occur and no email should be sent.
Could anyone guide me on the best way to implement this logic in a Dataiku scenario?
Operating system used: Windows
Best Answer
-
Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,407 Neuron
"I think declaration in Reporters Mail 'Run Condition' is little different" ⇒ No, it is not. It's exactly the same.
"it is not working" ⇒ You need to be more explicit than that for me to be able to help you. You set the run condition, what's the value of the variable, what's the value of the outcome and did the mail reporter execute or not? I think you are doing something wrong. Here are my tests:
My_Column is true. Scenario outcome is success. I set the mail reporter run condition like this:
Here is my scenario run. As expected, the mail reporter does not execute:
My_Column is true. Scenario outcome is success. I set the mail reporter run condition like this:
Here is my scenario run. The mail reporter executes as expected:
Answers
-
Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,407 Neuron
It's possible to do this using SQL, but only if your dataset is stored in a technology that supports SQL. Therefore I would suggest using a metric, which is a more portable solution that works for any dataset type. Follow these steps:
- In the dataset where you have your true/false variable, click on Metrics, then Edit Metrics, and then enable Max on column statistics for your true/false column
- Create a scenario step to "Compute metrics" and add the dataset that has the variable to it. You must give this step a name and you should not use spaces (let's call this step Compute_Metrics)
- Next create a scenario step to "Define scenario variables"
- On the Define scenario variables step toggle the "Evaluated variables" to ON
- Then define a new variable (let's call it My_Column) with this formula:
filter(parseJson(stepOutput_Compute_Metrics)['CT_TEST.my_dataset_NP']['computed'], x, x["metricId"]=="col_stats:MAX:my_column")[0].value
You should replace CT_TEST with your Project ID, my_dataset with your dataset ID (leave the _NP) and my_column with your column name, which is case sensitive. Note that "Compute_Metrics" refers to the name of the previous step, where you computed metrics for the dataset.
- Finally, in all your conditional steps (i.e. steps that should only run when the variable is true) set "Run this Step" to "If condition is satisfied" and the condition to:
My_Column == "true" && outcome != 'FAILED'
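To make the formula and the run condition easier to reason about, here is a minimal Python sketch of the same logic outside DSS. It is only an illustration: the sample payload is hypothetical and its shape is inferred from the formula above (a JSON object keyed by `PROJECT.dataset_NP` with a `computed` list of metrics), and `extract_metric` and `should_run` are made-up helper names, not Dataiku functions.

```python
import json

# Hypothetical sample of the Compute_Metrics step output. The structure is
# assumed from the formula in this post, not copied from a real DSS run.
sample_step_output = json.dumps({
    "CT_TEST.my_dataset_NP": {
        "computed": [
            {"metricId": "records:COUNT_RECORDS", "value": "1"},
            {"metricId": "col_stats:MAX:my_column", "value": "true"},
        ]
    }
})


def extract_metric(step_output, dataset_key, metric_id):
    """Mimic filter(parseJson(...)[dataset]['computed'], x, x["metricId"]==id)[0].value."""
    computed = json.loads(step_output)[dataset_key]["computed"]
    matches = [m for m in computed if m["metricId"] == metric_id]
    # Like [0] in the formula, this raises if the metric was not computed.
    return matches[0]["value"]


def should_run(my_column, outcome):
    """Mimic the run condition: My_Column == "true" && outcome != 'FAILED'."""
    return my_column == "true" and outcome != "FAILED"


my_column = extract_metric(
    sample_step_output, "CT_TEST.my_dataset_NP", "col_stats:MAX:my_column"
)
print(my_column)
print(should_run(my_column, "SUCCESS"))
```

Note that the comparison is against the string "true", not a boolean, which matches the quoted value used in the condition above.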
Sample execution with true value:
With false value:
-
Thank you so much. It worked out well. You explained each step clearly and it helped a lot. Thank you once again. My final query: how do I set or call this scenario variable (My_Column == "true") in the mail reporter's Run Condition?
-
Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,407 Neuron
You can use the same condition you use for the scenario step in the mail reporter. It works exactly in the same way.
-
I think declaration in Reporters Mail 'Run Condition' is little different. If I declare it like "My_Column == "true" && outcome != 'FAILED'", it is not working.
-
It works, thank you so much.
-
Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,407 Neuron
What was the problem?