Dataiku Scenario

goku
Level 2
I need to run a scenario from within itself (recursion, or a self-call). Essentially, I want to run the scenario in a loop until a condition is met. Please help.

12 Replies
Turribeach

Can you please describe exactly what you are trying to achieve, not how you think you can solve it? That is, describe the goal or the requirement.

goku
Level 2
Author

My goal is to increment a global date variable repeatedly and, for each date, run the steps that populate a dataset, then increment the date and run the scenario again.

My idea: in scenario S1, step 1: populate the dataset; step 2: increment the date (global variable); step 3: run S1 again.

Turribeach

Sorry to pester, but again this seems like a way of achieving your goal, not the goal itself. What exactly is the desired outcome? Why do you need to populate the dataset recursively?

goku
Level 2
Author

I am trying to load historical data through an API call.

The plugin that I created takes only a single date from one project variable (global variable), not a range of dates.

So I want to call the same scenario recursively to backfill the data.

Turribeach

Scenarios can't recursively call themselves, nor can there be multiple executions of the same scenario at the same time. Personally, I would modify your plugin to handle a range of dates. Otherwise, how would you handle the situation where the data load fails for a few days and you need to catch up and load multiple days?

Furthermore, I wouldn't use project variables, because they can be left in an incorrect state and cause data to be duplicated. In DSS there are instance-level variables, project-level variables, and scenario-level variables. Scenario-level variables are only visible for the duration of the scenario run, and are defined either programmatically (in Python) or with a "Define variables" scenario step. All three types of variables are available to the recipes of jobs started in a scenario run.

If you are loading a historical table, the right approach is to check the last date loaded in the table at runtime, as part of your scenario run, and then load all dates from that date until today using a date-range scenario variable that the recipe/plugin can read. This gives the plugin an automatic catch-up capability, and the date-range scenario variable will always hold the correct value because it is calculated at runtime.

Here is how you define a scenario variable:

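from dataiku.scenario import Scenario

# Handle on the currently running scenario
scenario = Scenario()
# Fetch every variable visible to this run, add the new one,
# and write the whole set back as scenario-level variables
all_scenario_vars = scenario.get_all_variables()
all_scenario_vars['new_scenario_var'] = "new value"
scenario.set_scenario_variables(**all_scenario_vars)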

You can run this in a Custom Python step in your scenario.
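The catch-up logic described above (find the last loaded date, then load everything from there until today) could be sketched roughly as follows. This is only an illustration under assumptions: the dataset name, the date column, and the variable names `load_start` and `load_end` are hypothetical, the historical table is assumed to be non-empty, and the plugin would still need to be changed to read the range.

```python
from datetime import date, timedelta


def dates_to_load(last_loaded, today):
    """Return ISO dates strictly after last_loaded, up to and including today."""
    n_days = (today - last_loaded).days
    return [(last_loaded + timedelta(days=i)).isoformat() for i in range(1, n_days + 1)]


def publish_date_range(dataset_name, date_column):
    """Compute the pending range at runtime and expose it as scenario variables.

    Meant to be called from a Custom Python step; dataset_name and
    date_column are hypothetical placeholders for your historical table.
    """
    # Imported lazily so dates_to_load() can be tested outside DSS.
    import pandas as pd
    import dataiku
    from dataiku.scenario import Scenario

    df = dataiku.Dataset(dataset_name).get_dataframe(columns=[date_column])
    # Assumes the table already contains at least one row.
    last_loaded = pd.to_datetime(df[date_column]).max().date()

    pending = dates_to_load(last_loaded, date.today())
    if pending:
        # load_start / load_end are hypothetical variable names.
        Scenario().set_scenario_variables(load_start=pending[0], load_end=pending[-1])
    return pending
```

Because the range is derived from the table itself on every run, a few failed days are picked up automatically by the next successful run.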

Finally, while you can't run scenarios recursively, there is one trick you can use. There is a scenario trigger called "Trigger after scenario", which runs a scenario after another scenario finishes. You could easily set up two scenarios that trigger each other like this:

  • Scenario 1 "Trigger after scenario" Scenario 2
  • Scenario 2 "Trigger after scenario" Scenario 1

Obviously, if you trigger Scenario 1, it will trigger Scenario 2 when it finishes, which will then trigger Scenario 1 again, ad infinitum. You will need to find a way to break the cycle, as otherwise it will never stop.
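If the stop condition is simply "the end date has been reached", one alternative that avoids chaining scenarios is to loop inside a single Custom Python step. The sketch below is not a tested recipe: the variable name `load_date` and the dataset name `history_table` are hypothetical, and the `set_project_variables` and `build_dataset` calls should be checked against the scenario API of your DSS version. It also still relies on a project variable, with the drawback mentioned earlier.

```python
from datetime import date, timedelta


def backfill(start, end, set_date, build):
    """Call set_date(day) and then build() for each day from start to end inclusive.

    The loop stops on its own once the end date has been processed.
    Returns the ISO dates that were processed, in order.
    """
    processed = []
    day = start
    while day <= end:
        iso_day = day.isoformat()
        set_date(iso_day)
        build()
        processed.append(iso_day)
        day += timedelta(days=1)
    return processed


def run_backfill_in_dss(start, end):
    """Wire the loop to a running scenario (call from a Custom Python step)."""
    # Imported lazily so backfill() can be tested outside DSS.
    from dataiku.scenario import Scenario

    scenario = Scenario()
    return backfill(
        start,
        end,
        # load_date is a hypothetical project variable read by the plugin.
        set_date=lambda d: scenario.set_project_variables(load_date=d),
        # history_table is a hypothetical dataset name.
        build=lambda: scenario.build_dataset("history_table"),
    )
```

Each build runs to completion before the date is advanced, so there is never more than one load in flight.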

goku
Level 2
Author

Hello, I don't see any option called "Trigger after scenario" under ADD STEP. I only see "Run another scenario" and "Kill another scenario".

goku
Level 2
Author

I am short on time, so I cannot modify the plugin right now, and there are some other reasons as well.

Thank you very much for suggesting scenario variables as a new solution.

I liked your final point very much, because I have already tried a modified version of what you suggested, i.e. "Run another scenario" rather than "Trigger after scenario". I already have logic in place to stop the loop.

I am going with the final option and will let you know if it works. Thanks once again.

Turribeach

I don't recommend using the "Run another scenario" step, because it means both "chained" scenarios could be running at the same time. "Trigger after scenario" is much safer because it guarantees the two scenarios cannot run at the same time, as one needs to finish before the other starts. You can even set the "Check every (seconds)" and "Grace delay (seconds)" options, which give you a safety gap.

goku
Level 2
Author

"Trigger after scenario" is not available in my instance. Is there a way I can add that option? Please let me know.

Turribeach

What version of Dataiku are you running?

goku
Level 2
Author

version 12

Turribeach

"Trigger after scenario" is not a scenario step; it's in the scenario's Settings, under Triggers.