Programmatically Create Scenario Steps
Hi All,
I'm looking for advice on a solution that I'd like to develop. Specifically, we have a series of projects (19 in total) that produce the same outputs but use slightly different logic to create them.
We want each project's scenario, which is triggered by our users, to be the same. However, we sometimes notice gaps in the scenario that need to be corrected or filled.
At present we are manually replicating changes across all 19 project scenarios, but the obvious flaw is that we might not replicate these changes correctly!
I'm looking for ideas on how to approach this in a better way. I've been exploring whether it's possible to use the Dataiku API to programmatically update the scenarios based on a template scenario. However, I seem to have hit a wall: the 'steps' do not appear to be an updatable attribute of the scenario settings (unlike the reporters or the run-as user, which we are already updating programmatically).
Thanks in advance!
Ben
Operating system used: Windows
Best Answer
-
For completeness, I'm sharing the code that we deployed for our solution:

import dataiku

client = dataiku.api_client()

# Read the steps from the template scenario
template_project = client.get_project("TEMPLATEPROJECTID")
template_scenario = template_project.get_scenario("TEMPLATE_SCENARIO_ID")
template_scenario_settings = template_scenario.get_settings()
template_steps = template_scenario_settings.raw_steps

list_projects = ["PROJECT A", "PROJECT B", "PROJECT C"]

for p in list_projects:
    project = client.get_project(p)
    scenario = project.get_scenario("COMMON_SCENARIO_ID")
    scenario_settings = scenario.get_settings()

    # Remove every existing step from the target scenario
    for i in range(len(scenario_settings.raw_steps)):
        del scenario_settings.raw_steps[0]

    # Append the template steps in order, then save the target scenario
    for step in template_steps:
        scenario_settings.raw_steps.append(step)

    scenario_settings.save()
Thanks again @VitaliyD for bringing us to a solution!
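The same loop can be wrapped in a reusable function. The following is only a sketch: the function name `sync_scenario_steps` and its parameters are hypothetical, and it assumes `client` behaves like the object returned by `dataiku.api_client()`, using only the calls already shown in this thread (`get_project`, `get_scenario`, `get_settings`, `raw_steps`, `save`). It also deep-copies the template steps so that the target scenarios never share step dictionaries with the template, which is a precaution rather than a documented requirement.

```python
import copy


def sync_scenario_steps(client, template_project_key, template_scenario_id,
                        target_project_keys, target_scenario_id):
    """Overwrite each target scenario's steps with a copy of the template's steps.

    Hypothetical helper: `client` is assumed to behave like dataiku.api_client().
    Returns a dict mapping each target project key to the number of steps written.
    """
    template_settings = (client.get_project(template_project_key)
                         .get_scenario(template_scenario_id)
                         .get_settings())
    template_steps = template_settings.raw_steps

    written = {}
    for key in target_project_keys:
        settings = (client.get_project(key)
                    .get_scenario(target_scenario_id)
                    .get_settings())
        # Replace the list contents in place; deep copy so that later edits
        # to one scenario's steps cannot leak into another.
        settings.raw_steps[:] = copy.deepcopy(template_steps)
        settings.save()
        written[key] = len(settings.raw_steps)
    return written
```

Inside DSS this might be called as `sync_scenario_steps(dataiku.api_client(), "TEMPLATEPROJECTID", "TEMPLATE_SCENARIO_ID", list_projects, "COMMON_SCENARIO_ID")`, with the same placeholder IDs as above.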
Answers
-
To add to this, it seems something could be possible using the 'scenario' API, whereby we use a Python script to create our scenario; that script could then be stored in the global shared repository and referenced from each flow.
Unfortunately, that doesn't seem possible in this case, as we require a folder check as one of our steps, and this doesn't yet seem to be supported by the 'scenario' API (file checks are).
I'm still on the hunt for ideas!
-
Hi,
You can copy the steps from one scenario to another using the scenario settings. Please refer to the example below:
import dataiku

client = dataiku.api_client()
project = client.get_default_project()

# Read the raw steps of the source scenario
scenario = project.get_scenario('1')
scenario_settings = scenario.get_settings()
raw_steps = scenario_settings.raw_steps

# Append the first source step to the target scenario and save
scenario2 = project.get_scenario('2')
scenario_settings2 = scenario2.get_settings()
scenario_settings2.raw_steps.append(raw_steps[0])
scenario_settings2.save()
I hope this helps.
Best,
Vitaliy
-
Thank you for this reply @VitaliyD, this was extremely helpful.
Because of the use of 'append', I understand (and have tested) that this adds steps from the template scenario to the bottom of our existing scenario, which already contains steps.
Do you know how the code could be edited for the case where we want to set the raw steps of the second project's scenario to be exactly the same as the first project's steps?
I tried adjusting the code you provided as follows, but unfortunately this did not work:
scenario_settings2.raw_steps = raw_steps
One method I can think of would be to delete the existing scenario in the second project, then create it from scratch before appending each step from the template project's scenario.
Before I develop this, I just want to check whether there is a neater solution. For example, rather than deleting the entire scenario, is it possible to clear all the steps from the scenario before using the append method you provided?
Thanks again.
Ben
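One plausible reason the direct assignment fails is that `raw_steps` is exposed as a read-only property that returns the underlying list; that is an assumption about the client library, not something confirmed in this thread. The sketch below illustrates the behaviour with a hypothetical stand-in class, `FakeScenarioSettings`, which is not part of the Dataiku API. If the assumption holds, slice assignment replaces the contents of the existing list in place without needing a setter.

```python
import copy


class FakeScenarioSettings:
    """Hypothetical stand-in for scenario settings; NOT the Dataiku class."""

    def __init__(self, steps):
        self._data = {"params": {"steps": list(steps)}}

    @property
    def raw_steps(self):
        # Read-only property: returns the underlying list, no setter defined.
        return self._data["params"]["steps"]


template = FakeScenarioSettings([{"name": "A"}, {"name": "B"}])
target = FakeScenarioSettings([{"name": "old1"}, {"name": "old2"}, {"name": "old3"}])

# Rebinding the attribute fails because the property has no setter.
try:
    target.raw_steps = template.raw_steps
    assignment_worked = True
except AttributeError:
    assignment_worked = False

# Slice assignment mutates the existing list instead of rebinding it,
# so the target ends up with exactly the template's steps.
target.raw_steps[:] = copy.deepcopy(template.raw_steps)
```

With real scenario settings, the equivalent line would be `scenario_settings2.raw_steps[:] = copy.deepcopy(raw_steps)` followed by `scenario_settings2.save()`, again assuming `raw_steps` returns the live list.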
-
Hi,
You can copy step by step. Just make sure the step already exists in the target scenario. Example:

# Copy step by step
scenario_settings2.raw_steps[0] = raw_steps[0]

# Or modify the existing step as required
scenario_settings2.raw_steps[0] = {
    'delayBetweenRetries': 10,
    'id': 'reload_schema__d_us_50',
    'maxRetriesOnFail': 0,
    'name': 'Step #1',
    'params': {
        'items': [{'itemId': 'us_50', 'partitionsSpec': '', 'type': 'DATASET'}],
        'proceedOnFailure': False,
    },
    'resetScenarioStatus': False,
    'runConditionExpression': '',
    'runConditionStatuses': ['SUCCESS', 'WARNING'],
    'runConditionType': 'RUN_IF_STATUS_MATCH',
    'type': 'reload_schema',
}
scenario_settings2.save()
Best,
Vitaliy
-
Right, that makes sense. It's more than plausible, though, that we will change the number of steps within the scenario (for example, the template scenario now has five steps but the scenario that we are looking to overwrite has six).
To confirm, there is no delete-step action?
Anyhow, the information you provided is great and has brought me to a solution of sorts: to accommodate the issue outlined at the beginning of this post, delete the scenario, create a new scenario, and then use the append method you provided.
Thanks again!
Ben
-
The steps are just a Python list in the scenario settings, so you can delete a step by removing an element from the list. Example:

# Remove the first step, then save the scenario
del scenario_settings2.raw_steps[0]
scenario_settings2.save()
Best.
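Since the steps are a plain list of dictionaries, ordinary list operations apply. As a small sketch, the hypothetical helper `remove_step_by_id` below (not part of the Dataiku API) removes steps by their 'id' key, the same key that appears in the step dictionary earlier in this thread, and `del steps[:]` clears the list entirely. The sample step IDs are made up for illustration.

```python
def remove_step_by_id(steps, step_id):
    """Remove, in place, every step dict whose 'id' equals step_id.

    Hypothetical helper; returns the number of steps removed.
    """
    before = len(steps)
    # Slice assignment keeps the same list object, so a settings object
    # holding a reference to it would see the change.
    steps[:] = [s for s in steps if s.get("id") != step_id]
    return before - len(steps)


# Made-up step dictionaries standing in for scenario_settings2.raw_steps
steps = [
    {"id": "build_a", "name": "Step #1"},
    {"id": "check_folder", "name": "Step #2"},
    {"id": "build_b", "name": "Step #3"},
]

removed = remove_step_by_id(steps, "check_folder")
remaining_ids = [s["id"] for s in steps]

# Clearing every step in one statement, equivalent to deleting index 0 repeatedly
all_steps = list(steps)
del all_steps[:]
```

On real scenario settings, the list passed in would be `scenario_settings2.raw_steps`, followed by `scenario_settings2.save()` to persist the change.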