Is it possible in recipe code to know which scenario it's running under?

mahbh2001 Registered Posts: 4

Hi,

I have a flow defined in Dataiku with a Python recipe and a visual recipe, and I want to automate it using a scenario. I want to run the same flow under multiple scenarios. Is it possible for the recipe code to identify, during execution, which scenario is running it?

Regards,

Mahesh

Answers

  • Miguel Angel
    Miguel Angel Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 118 Dataiker

    Hi,

    Depending on your DSS version, you can see in the Jobs view whether a job was triggered by a scenario and, if so, which one:

    [Screenshot: jobCapture.PNG]

    As for the code, this information can be extracted from the full job log (the same content as output.log). On the Jobs page, select an activity > Actions > View full job log. In the "start session" entry at the top you'll see something like this:

    ["scenario-run: scenario\u003dDKU_TUTORIAL_BASICS_103.SCENARIOMACTEST runId\u003d2023-02-10-11-23-19-309","admin"],"authSource":"USER_FROM_UI","realUserLogin":"admin","userGroups":["administrators"],"userProfile":"DATA_SCIENTIST"},"stepRun":{"scenarioRun":{"start":1676028199308,"trigger":{"projectKey":"DKU_TUTORIAL_BASICS_103","scenarioId":"SCENARIOMACTEST","trigger":{"id":"manual","type":"manual","name":"Manual 
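As a rough illustration of pulling the scenario and run ID out of a log entry like the one above, here is a minimal sketch using a regular expression. The function name `parse_scenario_run` and the exact pattern are assumptions made for this example; they rely only on the format visible in the excerpt, which may differ between DSS versions.

```python
import re

# Fragment copied from the job log excerpt above. The log writes "=" as the
# JSON escape \u003d, so a raw string is used to keep the backslash literal.
SAMPLE = r'["scenario-run: scenario\u003dDKU_TUTORIAL_BASICS_103.SCENARIOMACTEST runId\u003d2023-02-10-11-23-19-309","admin"]'

# Accept either a plain "=" or the escaped form \u003d.
_EQ = r'(?:=|\\u003d)'
_PATTERN = re.compile(
    r'scenario-run: scenario' + _EQ + r'(?P<scenario>[\w.]+)'
    r'\s+runId' + _EQ + r'(?P<run_id>[\d-]+)'
)


def parse_scenario_run(log_text):
    """Return project key, scenario ID and run ID found in log_text, or None.

    Hypothetical helper: assumes the "PROJECT.SCENARIO" layout seen in the excerpt.
    """
    match = _PATTERN.search(log_text)
    if match is None:
        return None  # the job was probably not started by a scenario
    project_key, _, scenario_id = match.group("scenario").partition(".")
    return {
        "project_key": project_key,
        "scenario_id": scenario_id,
        "run_id": match.group("run_id"),
    }


info = parse_scenario_run(SAMPLE)
print(info)
```

The same function could be applied to the text of a full job log; a `None` result would then suggest the job was not launched from a scenario.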




  • mahbh2001
    mahbh2001 Registered Posts: 4

    Thanks for the details. Would you know if I can get this information programmatically in the recipe code?

  • Miguel Angel
    Miguel Angel Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 118 Dataiker

    Hi,

    You can use the Python API for projects to list a project's jobs. In the dict returned for each job, the 'scenarioId' key tells you which scenario the job belongs to, if any:

    import dataiku

    client = dataiku.api_client()
    project = client.get_project("<project ID>")  # replace with your project key
    print(project.list_jobs())
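To narrow such a listing down to the jobs of one scenario, a small filter along these lines might help. This is only a sketch: `find_key`, `jobs_for_scenario` and the `fake_jobs` sample are hypothetical, and because the exact nesting of 'scenarioId' inside each job dict is an assumption here, the helper searches the whole dict rather than a fixed path.

```python
def find_key(obj, key):
    """Depth-first search for the first non-None value stored under `key`."""
    if isinstance(obj, dict):
        if key in obj:
            return obj[key]
        children = obj.values()
    elif isinstance(obj, list):
        children = obj
    else:
        return None
    for child in children:
        found = find_key(child, key)
        if found is not None:
            return found
    return None


def jobs_for_scenario(jobs, scenario_id):
    """Keep only the job dicts that mention the given scenario ID."""
    return [job for job in jobs if find_key(job, "scenarioId") == scenario_id]


# Hypothetical stand-in for the list returned by project.list_jobs();
# the real dicts carry many more fields and may be laid out differently.
fake_jobs = [
    {"def": {"id": "job_a", "scenarioId": "SCENARIOMACTEST"}},
    {"def": {"id": "job_b"}},  # started manually, no scenario information
]

matching = jobs_for_scenario(fake_jobs, "SCENARIOMACTEST")
print([find_key(job, "id") for job in matching])
```

In DSS, the `fake_jobs` list would be replaced by the result of `project.list_jobs()`.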

    You can also be more specific and query a specific job for its log:

    import dataiku

    client = dataiku.api_client()
    project = client.get_project("<project ID>")  # replace with your project key
    myjob = project.get_job("<job id>")  # replace with the ID of the job to inspect
    print(myjob.get_log())

    I am not sure I understand the question correctly, but running the above code from within the code recipe that spawns the job is not feasible. You would have to run it from another code recipe or a Jupyter notebook.

  • mahbh2001
    mahbh2001 Registered Posts: 4

    Hi,

    I am trying to trigger scenarios externally from my application using the scenario APIs, and I want the scenarios to share the same flows.

    While doing that, I want to ensure the data is partitioned by scenario run ID so that each flow run picks up its own data. I am able to retrieve the data into a folder, e.g.

    \infoder1\<scenario_run_id1>\file1.json

    \infoder1\<scenario_run_id2>\file2.json

    But when I trigger a build of \out_folder, I want the flow instance running under scenario_run_id1 to process file1.json and the one running under scenario_run_id2 to process file2.json. This requires access to the scenario run ID inside the flow (if it's running under a scenario).

    Is it possible to get this information programmatically so that I can use it in the recipe code?
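To make the intended layout concrete, the per-run folder selection could be sketched as below. How the run ID reaches the recipe is exactly the open question in this thread, so the sketch takes it as a plain argument; the helper name `input_dir_for_run` is hypothetical. One possibility, stated here only as an assumption, is a variable set by the scenario and read in the recipe with `dataiku.get_custom_variables()`.

```python
from pathlib import PureWindowsPath


def input_dir_for_run(root, run_id):
    """Build the per-run input folder, e.g. \\infoder1\\<scenario_run_id>.

    Hypothetical helper: the run ID must be supplied by the caller, since
    obtaining it inside the recipe is the unresolved part of the question.
    """
    if not run_id:
        raise ValueError("no scenario run ID available; is this a scenario run?")
    return PureWindowsPath(root) / run_id


# Example with the folder root from the post and a run ID in the format
# shown in the job log earlier in the thread.
run_dir = input_dir_for_run(r"\infoder1", "2023-02-10-11-23-19-309")
print(run_dir)
```

Raising an error when no run ID is present keeps a manually started build from silently reading another run's files.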
