Programmatically Create Scenario Steps

benmoss
benmoss Partner, Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 10 Partner

Hi All,

I'm looking for advice on a solution that I'd like to develop. Specifically, we have a series of projects (19 in total) that contain the same outputs but use slightly different logic to create them.

We want each project's scenario, which is triggered by our users, to be the same. However, we sometimes notice gaps within the scenario that need to be corrected or filled.

At present we are manually replicating changes across all 19 project scenarios, but there is obviously a flaw in this in that we might not correctly replicate these changes!

I'm looking for ideas on how to approach this in a better way. I've been exploring whether it's possible to use the Dataiku API to programmatically update the scenarios based on a template scenario. However, I seem to have hit a wall: the 'steps' do not appear to be an updatable attribute of the scenario settings, unlike things such as the reporters or the run-as user (which we are already updating programmatically).

Thanks in advance!

Ben


Operating system used: Windows

Best Answer

  • benmoss
    benmoss Partner, Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 10 Partner
    edited July 17 Answer ✓

    For completeness, I'm sharing the code that we deployed for our solution:

    import dataiku

    client = dataiku.api_client()

    # Read the steps from the template scenario
    template_project = client.get_project("TEMPLATEPROJECTID")
    template_scenario = template_project.get_scenario("TEMPLATE_SCENARIO_ID")
    template_scenario_settings = template_scenario.get_settings()
    template_steps = template_scenario_settings.raw_steps

    list_projects = ["PROJECT A","PROJECT B","PROJECT C"]

    for p in list_projects:

        project = client.get_project(p)
        scenario = project.get_scenario("COMMON_SCENARIO_ID")
        scenario_settings = scenario.get_settings()

        # Remove every existing step from the target scenario
        for i in range(len(scenario_settings.raw_steps)):
            del scenario_settings.raw_steps[0]

        # Append the template steps, in order
        for step in template_steps:
            scenario_settings.raw_steps.append(step)

        scenario_settings.save()

    Thanks again @VitaliyD for bringing us to a solution!
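
Before overwriting a target scenario, it may be useful to see which steps would be added or dropped. The following is a minimal sketch that compares two step lists by their 'id' key. It works on plain lists of dicts, so it runs without a DSS instance; the helper name `diff_step_ids` and the sample step ids are hypothetical and not part of the Dataiku API.

```python
def diff_step_ids(template_steps, target_steps):
    """Return (missing, extra): ids only in the template, and ids only in the target."""
    template_ids = [step.get("id") for step in template_steps]
    target_ids = [step.get("id") for step in target_steps]
    missing = [i for i in template_ids if i not in target_ids]
    extra = [i for i in target_ids if i not in template_ids]
    return missing, extra


# Hypothetical sample data standing in for scenario_settings.raw_steps
template = [{"id": "step_a", "name": "Step #1"}, {"id": "step_b", "name": "Step #2"}]
target = [{"id": "step_a", "name": "Step #1"}, {"id": "old_step", "name": "Old step"}]

missing, extra = diff_step_ids(template, target)
print("Would be added:", missing)
print("Would be removed:", extra)
```

In the loop above, one could call this with `template_steps` and `scenario_settings.raw_steps` before the delete step and log the result for each project.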

Answers

  • benmoss
    benmoss Partner, Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 10 Partner

    To add to this, it seems like something could be possible using the 'scenario' API, whereby we use a Python script to create our scenario. That script could then be stored and referenced from each flow via the global shared repository.

    Unfortunately, in this case that doesn't seem possible: we require a folder check as one of our steps, and this doesn't yet seem to be supported by the 'scenario' API (file checks are).

    I'm still on the hunt for ideas!

  • VitaliyD
    VitaliyD Dataiker, Dataiku DSS Core Designer, Dataiku DSS Adv Designer Posts: 102 Dataiker
    edited July 17

    Hi,

    You can copy the steps from one scenario to another using the scenario settings. Please refer to the example below:

    import dataiku, json
    from dataiku import pandasutils as pdu
    import pandas as pd
    
    client = dataiku.api_client()
    project = client.get_default_project()
    
    scenario = project.get_scenario('1')
    scenario_settings = scenario.get_settings()
    raw_steps = scenario_settings.raw_steps
    
    scenario2 = project.get_scenario('2')
    scenario_settings2 = scenario2.get_settings()
    scenario_settings2.raw_steps.append(raw_steps[0])
    scenario_settings2.save()

    I hope this helps.

    Best,

    Vitaliy

  • benmoss
    benmoss Partner, Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 10 Partner

    Thank you for this reply @VitaliyD, this was extremely helpful.

    Because of the use of 'append', I understand (and have tested) that this adds steps from the template scenario to the bottom of our existing scenario, which already has steps in it.

    Do you know how the code provided could be edited for the case where we essentially want to set the raw steps of the second project's scenario to be exactly the same as the first project's steps?

    I tried adjusting the code you provided to do the following, but unfortunately this did not work.

    scenario_settings2.raw_steps = raw_steps

    One method I can think of would be to delete the existing scenario on the second project, then create it from scratch before appending each step from the 'template project scenario'.

    Before I develop this, I just want to check whether there is a neater solution. For example, is it possible to clear all the steps from the scenario (rather than delete the entire scenario) before using the append method you provided?

    Thanks again.

    Ben

  • VitaliyD
    VitaliyD Dataiker, Dataiku DSS Core Designer, Dataiku DSS Adv Designer Posts: 102 Dataiker
    edited July 17

    Hi,

    You can copy step by step. Just make sure a step already exists at that index in the target scenario. Example:

    # copy step by step
    scenario_settings2.raw_steps[0] = raw_steps[0]
    # or modify the existing step as required
    scenario_settings2.raw_steps[0] = {'delayBetweenRetries': 10,
      'id': 'reload_schema__d_us_50',
      'maxRetriesOnFail': 0,
      'name': 'Step #1',
      'params': {'items': [{'itemId': 'us_50',
         'partitionsSpec': '',
         'type': 'DATASET'}],
       'proceedOnFailure': False},
      'resetScenarioStatus': False,
      'runConditionExpression': '',
      'runConditionStatuses': ['SUCCESS', 'WARNING'],
      'runConditionType': 'RUN_IF_STATUS_MATCH',
      'type': 'reload_schema'}
    scenario_settings2.save()
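
When only one field of a step needs to change, replacing the whole dict is not strictly necessary. Here is a minimal sketch that looks a step up by its 'id' and updates selected keys in place. It operates on a plain list of dicts shaped like the example above; the helper name `update_step` is hypothetical, and with a real scenario you would still call `save()` on the settings afterwards.

```python
def update_step(steps, step_id, **changes):
    """Update the first step whose 'id' matches; return True if one was found."""
    for step in steps:
        if step.get("id") == step_id:
            step.update(changes)
            return True
    return False


# Sample step reusing keys from the example above
steps = [{
    "id": "reload_schema__d_us_50",
    "name": "Step #1",
    "type": "reload_schema",
    "maxRetriesOnFail": 0,
    "delayBetweenRetries": 10,
}]

found = update_step(steps, "reload_schema__d_us_50", maxRetriesOnFail=3)
print(found, steps[0]["maxRetriesOnFail"])
```

Looking steps up by id rather than by position avoids touching the wrong step when the target scenario has a different number of steps than the template.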

    Best,

    Vitaliy

  • benmoss
    benmoss Partner, Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 10 Partner

    Right, that makes sense. It's more than plausible, though, that we adjust the number of steps within the scenario (for example, the template scenario now has five steps but the scenario that we are looking to override has six steps).

    To confirm, there is no delete-step action?

    Anyhow, the information you provided is great and has brought me to a solution of sorts. To address the issue outlined at the beginning of this post, we delete the scenario, create a new scenario and then use the append method you provided.

    Thanks again!

    Ben

  • VitaliyD
    VitaliyD Dataiker, Dataiku DSS Core Designer, Dataiku DSS Adv Designer Posts: 102 Dataiker
    edited July 17

    The steps are just a Python list in the scenario settings, so you can delete a step by simply removing an element from the list. Example:

    del scenario_settings2.raw_steps[0]
    scenario_settings2.save()
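
Since the steps are a plain list, the whole list can also be emptied and refilled in one go with slice assignment, which mutates the existing list rather than rebinding the attribute. The sketch below assumes `raw_steps` behaves like a read-only property returning the underlying list, which would be consistent with the plain assignment `scenario_settings2.raw_steps = raw_steps` failing earlier in this thread. `FakeScenarioSettings` is a hypothetical stand-in so the example runs without DSS; it is not the real settings class.

```python
import copy


class FakeScenarioSettings:
    """Hypothetical stand-in exposing raw_steps as a read-only property over a list."""

    def __init__(self, steps):
        self._data = {"params": {"steps": steps}}

    @property
    def raw_steps(self):
        return self._data["params"]["steps"]


def replace_steps(settings, template_steps):
    # Slice assignment changes the list in place, so no property setter is needed.
    # deepcopy keeps the target independent of the template's dict objects.
    settings.raw_steps[:] = copy.deepcopy(template_steps)


template_steps = [{"id": "s%d" % i} for i in range(5)]
target = FakeScenarioSettings([{"id": "old%d" % i} for i in range(6)])

try:
    target.raw_steps = template_steps  # rebinding a read-only property fails
    assignment_failed = False
except AttributeError:
    assignment_failed = True

replace_steps(target, template_steps)
print(assignment_failed, len(target.raw_steps))
```

With real settings objects, the equivalent would presumably be `scenario_settings2.raw_steps[:] = copy.deepcopy(raw_steps)` followed by `scenario_settings2.save()`; this handles the case where the template and target have different numbers of steps.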

    Best.

  • CoreyS
    CoreyS Dataiker Alumni, Dataiku DSS Core Designer, Dataiku DSS Core Concepts, Registered Posts: 1,150 ✭✭✭✭✭✭✭✭✭

    Thank you for sharing this solution with our Community, @benmoss!
