
Retry and delay parameters in Scenarios - Custom Python Script

pkansal
Level 3
How do we specify the retry and delay parameters when creating a scenario in a custom Python script? It was not clear from the documentation at https://doc.dataiku.com/dss/latest/python-api/scenarios-inside.html#dataiku.scenario.Scenario.build_...
VitaliyD
Dataiker

Hi,

To create a scenario, use the create_scenario method of the Dataiku project API. Then define a build-dataset step with the desired retries and delay, and write it into the scenario through the settings object returned by the scenario's get_settings method. The example below can serve as a starting point for your own script.

import dataiku

client = dataiku.api_client()
project = client.get_project("PROJECT_NAME")
new_scenario = project.create_scenario('SCENARIO_NAME', 'step_based')

step_def = [{'id': 'build_0_true_d_DATASET_NAME',
 'maxRetriesOnFail': 3, # set desired max retries
 'delayBetweenRetries': 1000, # set desired delay
 'name': 'Step #1',
 'params': {'builds': [{'itemId': 'DATASET_NAME',
    'partitionsSpec': '',
    'type': 'DATASET'}],
  'jobType': 'RECURSIVE_BUILD',
  'proceedOnFailure': False,
  'refreshHiveMetastore': True},
 'resetScenarioStatus': False,
 'runConditionExpression': '',
 'runConditionStatuses': ['SUCCESS', 'WARNING'],
 'runConditionType': 'RUN_IF_STATUS_MATCH',
 'type': 'build_flowitem'}]

settings_new = new_scenario.get_settings()
settings_new.get_raw()['params']['steps'] = step_def
settings_new.save()

 

Please note: if you need to check the format required to modify a scenario setting through the API, you can create a scenario or step in the UI with the required settings and then inspect them by reading scenario.get_settings().get_raw().
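
As a minimal sketch of that inspection, the helper below walks a raw settings dictionary and reports the retry fields of each step. It assumes the layout used in the example above (params, then steps, with maxRetriesOnFail and delayBetweenRetries on each step). The name summarize_retry_settings is hypothetical and not part of the Dataiku API, and sample_raw is a stand-in for what scenario.get_settings().get_raw() would return.

```python
def summarize_retry_settings(raw):
    """Return (step name, max retries, delay) for each step in a raw scenario settings dict."""
    steps = raw.get("params", {}).get("steps", [])
    return [
        (step.get("name"), step.get("maxRetriesOnFail"), step.get("delayBetweenRetries"))
        for step in steps
    ]

# Stand-in for scenario.get_settings().get_raw(); in DSS, pass the real dictionary instead.
sample_raw = {
    "params": {
        "steps": [
            {"name": "Step #1", "type": "build_flowitem",
             "maxRetriesOnFail": 3, "delayBetweenRetries": 1000},
            {"name": "Step #2", "type": "build_flowitem"},  # no retry fields set
        ]
    }
}

for name, retries, delay in summarize_retry_settings(sample_raw):
    print(name, retries, delay)
```

Steps that do not carry the retry fields show up with None, which makes it easy to spot which steps still need them.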

Best,

Vitaliy

rnorm
Level 3

Hi @VitaliyD ,

Thanks for the information. I'm in a similar situation, but I'd like to do that within a custom Python scenario.

I can get the definition of a step using the BuildFlowItemsStepDefHelper.get_step() method, but I can't find a way to actually set a new definition with parameters such as the ones in your step_def example (retries, etc.). Do you know a way to do it?

VitaliyD
Dataiker

Hi @rnorm,

A custom Python scenario has no built-in retry or delay setting; because the scenario is custom, it is up to your code to handle that. A failed step raises an exception, so you can catch it and implement the retries manually. The example below illustrates the approach:

from dataiku.scenario import Scenario
import time

# The Scenario object is the main handle from which you initiate steps
scenario = Scenario()

# Build a dataset, retrying on failure. delay is in seconds (time.sleep expects seconds).
def build_with_retries(retries=3, delay=1000):
    for attempt in range(retries + 1):  # one initial attempt plus `retries` retries
        try:
            scenario.build_dataset("DATASET_NAME")
            return
        except Exception:
            if attempt == retries:
                raise  # all attempts failed: let the scenario fail
            time.sleep(delay)

build_with_retries(retries=3, delay=1000) # specify retries and delay
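
The same pattern can be written as a generic helper that can be tried outside DSS. The sketch below is a variant under stated assumptions, not Dataiku functionality: run_with_retries and flaky_build are hypothetical names, the delay is taken to be in seconds (as time.sleep expects), and the last exception is re-raised so that the scenario still fails once all attempts are used up. Inside a custom Python scenario, action would be something like lambda: scenario.build_dataset("DATASET_NAME").

```python
import time

def run_with_retries(action, retries=3, delay_seconds=30, sleep=time.sleep):
    """Call action(); on failure retry up to `retries` more times, pausing between attempts."""
    attempts = retries + 1  # one initial attempt plus the retries
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception:
            if attempt == attempts:
                raise  # out of retries: let the failure propagate
            sleep(delay_seconds)

# Hypothetical stand-in for a build that fails twice and then succeeds.
calls = {"count": 0}

def flaky_build():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("build failed")
    return "built"

result = run_with_retries(flaky_build, retries=3, delay_seconds=0)
print(result, calls["count"])
```

Passing sleep as a parameter is a design choice for testing: a test can substitute a no-op and avoid waiting for the real delay.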

Hope this helps. 

Best.