Retry and delay parameters in Scenarios - Custom Python Script

pkansal Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Registered Posts: 23 ✭✭✭✭
How do we specify the retry and delay parameters when creating a scenario in a custom Python script? It was not clear from the documentation at https://doc.dataiku.com/dss/latest/python-api/scenarios-inside.html#dataiku.scenario.Scenario.build_dataset

Answers

  • Vitaliy
    Vitaliy Dataiker, Dataiku DSS Core Designer, Dataiku DSS Adv Designer Posts: 102 Dataiker
    edited July 17

    Hi,

    To create a scenario, use the create_scenario method of the Dataiku Project API. Then create a build dataset step and set the delay and retries on that step through the scenario's get_settings method. Please refer to the example below, which can be used as a starting point for your own script.

    import dataiku
    
    client = dataiku.api_client()
    project = client.get_project("PROJECT_NAME")
    new_scenario = project.create_scenario('SCENARIO_NAME', 'step_based')
    
    step_def = [{'id': 'build_0_true_d_DATASET_NAME',
     'maxRetriesOnFail': 3, # set desired max retries
     'delayBetweenRetries': 1000, # set desired delay
     'name': 'Step #1',
     'params': {'builds': [{'itemId': 'DATASET_NAME',
        'partitionsSpec': '',
        'type': 'DATASET'}],
      'jobType': 'RECURSIVE_BUILD',
      'proceedOnFailure': False,
      'refreshHiveMetastore': True},
     'resetScenarioStatus': False,
     'runConditionExpression': '',
     'runConditionStatuses': ['SUCCESS', 'WARNING'],
     'runConditionType': 'RUN_IF_STATUS_MATCH',
     'type': 'build_flowitem'}]
    
    settings_new = new_scenario.get_settings()
    settings_new.get_raw()['params']['steps'] = step_def
    settings_new.save()

    Please note: if you need to check the format required to modify something in the scenario settings through the API, you can always create a scenario or step in the UI with the required settings, then read them back with scenario.get_settings().get_raw() to inspect them.
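
    As a rough illustration of that inspection step, the sketch below pulls the retry-related fields out of a raw settings dictionary. The helper name summarize_retry_settings and the example_raw dictionary are hypothetical: example_raw is a hand-written stand-in shaped like the step definition above, not real output from DSS, and the commented-out lines assume a step-based scenario with the placeholder id SCENARIO_ID.

```python
def summarize_retry_settings(raw_settings):
    """Return (step name, maxRetriesOnFail, delayBetweenRetries) for each step."""
    steps = raw_settings.get("params", {}).get("steps", [])
    return [
        (step.get("name"), step.get("maxRetriesOnFail"), step.get("delayBetweenRetries"))
        for step in steps
    ]


# Hand-written stand-in for the dictionary returned by get_raw() (assumption).
example_raw = {
    "params": {
        "steps": [
            {"name": "Step #1", "type": "build_flowitem",
             "maxRetriesOnFail": 3, "delayBetweenRetries": 1000},
        ]
    }
}

for name, retries, delay in summarize_retry_settings(example_raw):
    print(f"{name}: maxRetriesOnFail={retries}, delayBetweenRetries={delay}")

# Inside DSS, the same helper would be fed the real settings, for example:
# import dataiku
# scenario = dataiku.api_client().get_project("PROJECT_NAME").get_scenario("SCENARIO_ID")
# print(summarize_retry_settings(scenario.get_settings().get_raw()))
```

    Steps that do not define a retry field come back as None, which makes it easy to spot where the defaults apply.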

    Best,

    Vitaliy

  • rnorm
    rnorm Registered Posts: 9 ✭✭✭✭

    Hi @VitaliyD,

    Thanks for the information. I'm in a similar situation, but I'd like to do that within a custom python scenario:

    I can get the definition of a step using the BuildFlowItemsStepDefHelper.get_step() method, but I can't find a way to actually set a new definition with parameters such as the ones in step_def in your example (retries, etc.). Do you know a way to do it?

  • Vitaliy
    Vitaliy Dataiker, Dataiku DSS Core Designer, Dataiku DSS Adv Designer Posts: 102 Dataiker
    edited July 17

    Hi @rnorm,

    A custom Python scenario has no built-in retry or delay settings: because it is custom, it is up to your code to handle that. If a step fails, an exception is thrown, so you can catch it and implement the retries yourself. Please refer to the example below, which I have written so you can better understand the approach:

    from dataiku.scenario import Scenario
    import time
    
    # The Scenario object is the main handle from which you initiate steps
    scenario = Scenario()
    # Build the dataset, making up to `retries` attempts with `delay` seconds between them
    def build_with_retries(retries=3, delay=1000):
        for attempt in range(1, retries + 1):
            try:
                scenario.build_dataset("DATASET_NAME")
                return
            except Exception:
                if attempt == retries:
                    raise  # no attempts left: let the scenario fail
                time.sleep(delay)  # time.sleep takes seconds

    build_with_retries(retries=3, delay=1000) # specify retries and delay
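
    The same idea can be factored into a helper that does not depend on DSS, which makes it testable outside a scenario. This is only a sketch: run_with_retries, its parameters, and flaky_build are hypothetical names introduced for illustration, and flaky_build simulates a step that fails twice before succeeding. In a real scenario the action would be something like lambda: scenario.build_dataset("DATASET_NAME").

```python
import time


def run_with_retries(action, attempts=3, delay=0.0, sleep=time.sleep):
    """Call action() up to `attempts` times, sleeping `delay` seconds between tries.

    Returns action()'s result on the first success and re-raises the last
    exception if every attempt fails.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: propagate the failure
            sleep(delay)


# Demonstration with a stand-in action that fails twice before succeeding.
calls = []


def flaky_build():
    calls.append(1)
    if len(calls) < 3:
        raise RuntimeError("simulated build failure")
    return "built"


result = run_with_retries(flaky_build, attempts=3, delay=0)
print(result, len(calls))
```

    Re-raising after the last attempt matters here: if the exception were swallowed, the scenario would finish as successful even though the build never succeeded.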

    Hope this helps.

    Best.
