Retry and delay parameters in Scenarios - Custom Python Script
Answers
-
Hi,
To create a scenario, use the Dataiku Project API create_scenario method. Then add a build dataset step and set the delay and number of retries on that step through the scenario's get_settings method. The example below can be used as a starting point for your own script.
import dataiku

client = dataiku.api_client()
project = client.get_project("PROJECT_NAME")
new_scenario = project.create_scenario('SCENARIO_NAME', 'step_based')

# Definition of a single "build dataset" step with retry settings
step_def = [{
    'id': 'build_0_true_d_DATASET_NAME',
    'maxRetriesOnFail': 3,  # set desired max retries
    'delayBetweenRetries': 1000,  # set desired delay
    'name': 'Step #1',
    'params': {
        'builds': [{'itemId': 'DATASET_NAME', 'partitionsSpec': '', 'type': 'DATASET'}],
        'jobType': 'RECURSIVE_BUILD',
        'proceedOnFailure': False,
        'refreshHiveMetastore': True,
    },
    'resetScenarioStatus': False,
    'runConditionExpression': '',
    'runConditionStatuses': ['SUCCESS', 'WARNING'],
    'runConditionType': 'RUN_IF_STATUS_MATCH',
    'type': 'build_flowitem',
}]

# Replace the steps of the new scenario and persist the change
settings_new = new_scenario.get_settings()
settings_new.get_raw()['params']['steps'] = step_def
settings_new.save()
Please note that if you need to check the format required to modify something in the scenario settings through the API, you can always create a scenario or step in the UI with the required settings, then read those settings back through the API with scenario.get_settings().get_raw() and inspect them.
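As a rough illustration of that inspect-then-modify workflow, here is a minimal sketch of a helper that sets the retry options on every step of a raw settings dictionary. The helper name set_step_retries and the sample dictionary are hypothetical. The key names ('params', 'steps', 'maxRetriesOnFail', 'delayBetweenRetries') are taken from the example above, and the assumption is that get_raw() returns a dictionary of that shape.

```python
def set_step_retries(raw_settings, max_retries, delay):
    """Set retry options in place on every step; return the number of steps updated.

    Hypothetical helper: assumes raw_settings looks like the dictionary
    returned by scenario.get_settings().get_raw() for a step-based scenario.
    """
    steps = raw_settings.get("params", {}).get("steps", [])
    for step in steps:
        step["maxRetriesOnFail"] = max_retries
        step["delayBetweenRetries"] = delay
    return len(steps)


# Stand-in for a raw settings dictionary, so the sketch runs without a DSS instance
sample_raw = {
    "params": {
        "steps": [
            {"id": "build_0_true_d_DATASET_NAME", "name": "Step #1", "type": "build_flowitem"},
        ]
    }
}

updated = set_step_retries(sample_raw, max_retries=3, delay=1000)
print(updated, sample_raw["params"]["steps"][0])

# Against a real scenario, the usage would presumably look like this:
# settings = project.get_scenario("SCENARIO_ID").get_settings()
# set_step_retries(settings.get_raw(), max_retries=3, delay=1000)
# settings.save()
```

Because the helper changes the dictionary in place, it fits the get_raw() / save() pattern shown earlier, where the raw settings are edited and then saved.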
Best,
Vitaliy
-
Hi @VitaliyD,
Thanks for the information. I'm in a similar situation, but I'd like to do that within a custom Python scenario:
I can get the definition of a step using the BuildFlowItemsStepDefHelper.get_step() method, but I can't find a way to actually set a new definition with parameters such as the ones in your step_def example (retries, etc.). Do you know a way to do it?
-
Hi @rnorm,
A custom Python scenario has no such built-in functionality: because it is custom, it is up to your code to handle retries. If a step fails, an exception is thrown, so you can catch it and implement the retry logic yourself. Please refer to the example below, which I have written to illustrate the approach:
from dataiku.scenario import Scenario
import time

# The Scenario object is the main handle from which you initiate steps
scenario = Scenario()

# Build the dataset, making up to `retries` attempts with `delay` seconds between them
def build_with_retries(retries=3, delay=1000):
    for attempt in range(1, retries + 1):
        try:
            scenario.build_dataset("DATASET_NAME")
            return
        except Exception:
            if attempt == retries:
                raise  # every attempt failed: re-raise so the scenario fails
            time.sleep(delay)

build_with_retries(retries=3, delay=1000)  # specify retries and delay
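The same catch-and-retry idea can be sketched as a generic wrapper that works for any step, not only build_dataset. This is only an illustration: run_with_retries, its parameters, and the flaky_build stand-in are hypothetical names, and the simulated failure replaces a real Dataiku call so the snippet runs anywhere.

```python
import time


def run_with_retries(action, max_attempts=3, delay=1.0, sleep=time.sleep):
    """Call action() until it succeeds or max_attempts is reached.

    Waits `delay` seconds between attempts and re-raises the last
    exception if every attempt fails.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return action()
        except Exception:
            if attempt == max_attempts:
                raise
            sleep(delay)


# Stand-in for a scenario step that fails twice, then succeeds
calls = []

def flaky_build():
    calls.append(1)
    if len(calls) < 3:
        raise RuntimeError("simulated step failure")
    return "built"


result = run_with_retries(flaky_build, max_attempts=3, delay=0)
print(result, len(calls))
```

In a custom Python scenario, the wrapper could presumably be called as run_with_retries(lambda: scenario.build_dataset("DATASET_NAME"), max_attempts=3, delay=60). Re-raising after the last attempt lets the scenario end in a failed state instead of silently succeeding.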
Hope this helps.
Best.