Is it possible in recipe code to know what scenario it's running under?

mahbh2001
Level 1

Hi ,

I have a flow defined in Dataiku with a Python recipe and a visual recipe, and I want to automate it using a scenario. I want to run the same flow under multiple scenarios. Is it possible to identify, in the recipe code during execution, which scenario is running it?

regards

Mahesh 

4 Replies
MiguelangelC
Dataiker

Hi,

Depending on your DSS version, you can see in the Jobs view whether a job was triggered by a scenario, and if so, which one:

[Screenshot: jobCapture.PNG - the Jobs view, showing the scenario that triggered the job]

This information can also be extracted from the full job log (same as output.log). On the Jobs page, select an activity > Actions > View full job log. In the start session entry at the top you'll see something like this:

["scenario-run: scenario\u003dDKU_TUTORIAL_BASICS_103.SCENARIOMACTEST runId\u003d2023-02-10-11-23-19-309","admin"],"authSource":"USER_FROM_UI","realUserLogin":"admin","userGroups":["administrators"],"userProfile":"DATA_SCIENTIST"},"stepRun":{"scenarioRun":{"start":1676028199308,"trigger":{"projectKey":"DKU_TUTORIAL_BASICS_103","scenarioId":"SCENARIOMACTEST","trigger":{"id":"manual","type":"manual","name":"Manual 

mahbh2001
Level 1
Author

Thanks for the details. Would you know if I can get this information programmatically in the recipe code?

0 Kudos
MiguelangelC
Dataiker

Hi,

You can use the Python API for projects to list their jobs. In the dict returned for each job you will find, where applicable, the scenario it belongs to under the 'scenarioId' key:

import dataiku

# Connect to the DSS public API and list the jobs of the project
client = dataiku.api_client()
project = client.get_project("<project ID>")
print(project.list_jobs())
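
As a sketch of how that key could be used (hedged: the exact placement of 'scenarioId' in the job dict may vary across DSS versions), you could filter the returned jobs on it:

import dataiku

client = dataiku.api_client()
project = client.get_project("<project ID>")

for job in project.list_jobs():
    # 'scenarioId' is the key mentioned above; depending on the DSS version it may
    # sit at the top level of the job dict or inside the job definition ('def')
    scenario_id = job.get("scenarioId") or job.get("def", {}).get("scenarioId")
    if scenario_id:
        print(job.get("def", {}).get("id"), "was started by scenario", scenario_id)

Jobs that were not started by a scenario are simply skipped by the filter.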

You can also query a specific job for its log:

import dataiku

# Fetch the full log of one specific job
client = dataiku.api_client()
project = client.get_project("<project ID>")
myjob = project.get_job("<job id>")
print(myjob.get_log())
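
Building on the log excerpt earlier in this thread, and purely as a sketch: since the full job log contains a "scenario-run: scenario=... runId=..." entry, you could search the log text for it. The log format is not a stable API, so treat this as fragile:

import re
import dataiku

client = dataiku.api_client()
project = client.get_project("<project ID>")
myjob = project.get_job("<job id>")
log_text = myjob.get_log()

# The '=' signs may appear escaped as \u003d in the raw log, so accept both forms
match = re.search(
    r'scenario-run: scenario(?:=|\\u003d)([^"\s]+) runId(?:=|\\u003d)([^"\s]+)',
    log_text,
)
if match:
    print("Scenario:", match.group(1), "run id:", match.group(2))
else:
    print("No scenario-run entry found; the job may not have been started by a scenario.")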

I am not sure if I understand the question correctly, but running the above code from within the code recipe whose run is that job is not feasible. You'd have to run it from another code recipe or a Jupyter notebook.

mahbh2001
Level 1
Author

Hi ,

I am trying to call scenarios externally from my application using the scenario APIs, and I want the scenarios to share the same flow.

While doing that, I want to ensure the data is partitioned based on the scenario run ID so that each flow run picks up its data correctly. I am able to retrieve data into a folder, e.g.:

\infoder1\<scenario_run_id1>\file1.json

\infoder1\<scenario_run_id2>\file2.json

But when I trigger a build of \out_folder, I want the flow instance running under scenario_run_id1 to process file1.json, and the flow instance running under scenario_run_id2 to process file2.json, which requires access to the scenario run ID in the flow (if it is running under a scenario).
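
To make the intended routing concrete, here is a minimal sketch of what the recipe side could look like. It assumes the scenario (or the external application that triggers it) writes the run ID into a project variable named scenario_run_id before the build, and that the input is a managed folder named in_folder1; both names are hypothetical, and DSS does not expose the scenario run ID to recipes out of the box:

import json
import dataiku

# Assumption: "scenario_run_id" was set as a project variable by the scenario
# (or by the external caller) before the flow build started
variables = dataiku.get_custom_variables()
run_id = variables.get("scenario_run_id")

in_folder = dataiku.Folder("in_folder1")  # hypothetical managed folder name

# Process only the files under this run's subdirectory, e.g. /<scenario_run_id1>/file1.json
for path in in_folder.list_paths_in_partition():
    if run_id and path.lstrip("/").startswith(run_id + "/"):
        with in_folder.get_download_stream(path) as stream:
            record = json.load(stream)
        print("Processed", path)

Note that if several scenario runs can build the flow at the same time, a single shared project variable would collide between runs, so a pattern like this only helps when runs are sequential.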


Is it possible to get this information programmatically so that I can include it in the recipe code?
