dataikuapi - How to rerun a job through Python using the API?
Hi,
I would like to rerun a job I made in DSS through the API using Python. I used this documentation (https://doc.dataiku.com/dss/latest/python-api/jobs.html), but it doesn't describe how to define the job properly.
I get this error: "dataikuapi.utils.DataikuException: java.lang.IllegalArgumentException: Computable not found or not buildable:TEST.Build_churn_test_tmp_prepared_2020-07-22T15-15-12.332". I didn't find any documentation about it.
Here is my script:
import dataikuapi
import time
host = "http://192.168.1.14:10000"
apiKey = "BTs4pTOGCOtsBgAiSE5YHrS6GPmnBjHh"
client = dataikuapi.DSSClient(host, apiKey)
# client is now a DSSClient and can perform all authorized actions.
# For example, list the project keys for which the API key has access
# dss_projects = client.list_project_keys()
# print(dss_projects)
project = client.get_project('TEST')
dss_job = project.list_jobs()
# print(dss_job)
# failed_jobs = [job for job in dss_job if job['state'] == 'SUCCESS']
# print(failed_jobs)
# Start a job
print("Step 1 - Job definition")
definition = {
    "type": "NON_RECURSIVE_FORCED_BUILD",
    'projectKey': 'TEST',
    'id': 'Build_churn_test_tmp_prepared_2020-07-22T15-15-12.332',
    'name': 'Build churn_test_tmp_prepared',
    'initiator': 'admin',
    'triggeredFrom': 'RECIPE',
    'recipe': 'compute_churn_test_tmp_prepared',
    "outputs": [{
        'id': 'Build_churn_test_tmp_prepared_2020-07-22T15-15-12.332',
        'type': 'DATASET',
        'targetDatasetProjectKey': 'TEST',
        'targetDataset': 'churn_test_tmp_prepared'
    }]
}
print("Step 2 - start job")
job = project.start_job(definition)
Best Answers
-
Hello,
In the next section of the doc you can find the syntax to run a job: https://doc.dataiku.com/dss/latest/python-api/jobs.html#starting-new-jobs
Here is a function that rebuilds a list of given datasets (the variable outputs):
import time
def run_job(project, outputs, job_type="NON_RECURSIVE_FORCED_BUILD"):
    # One output entry per dataset name in outputs
    definition = {
        "type": job_type,
        "outputs": [{
            "id": "%s" % (output_name),
            "partition": "NP"
        } for output_name in outputs]
    }
    job = project.start_job(definition)
    state = ''
    print('Building Datasets %s' % outputs)
    # Poll once per second until the job reaches a terminal state
    while state != 'DONE' and state != 'FAILED' and state != 'ABORTED':
        time.sleep(1)
        state = job.get_status()['baseStatus']['state']
Cheers,
Du
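Applied to the script in the question, the difference appears to be that each output's "id" holds the dataset name rather than the id of an earlier job, which would explain the "Computable not found" error. A minimal sketch of that idea, assuming two hypothetical helpers, build_definition and wait_for_job (neither is part of dataikuapi), and reusing the "NP" partition value from the function above:

```python
import time

# States after which polling stops, as in the loop above
TERMINAL_STATES = {"DONE", "FAILED", "ABORTED"}


def build_definition(dataset_names, job_type="NON_RECURSIVE_FORCED_BUILD"):
    """Return a job definition dict that rebuilds the given datasets.

    Hypothetical helper, not part of dataikuapi.
    """
    return {
        "type": job_type,
        "outputs": [
            # "id" is the dataset name, not the id of a previous job
            {"id": name, "partition": "NP"}
            for name in dataset_names
        ],
    }


def wait_for_job(job, poll_seconds=1):
    """Poll job.get_status() until a terminal state is reached; return it.

    Hypothetical helper; 'job' is whatever project.start_job() returned.
    """
    while True:
        state = job.get_status()["baseStatus"]["state"]
        if state in TERMINAL_STATES:
            return state
        time.sleep(poll_seconds)


definition = build_definition(["churn_test_tmp_prepared"])
print(definition)

# With a live DSS connection (client and project created as in the question):
# job = project.start_job(definition)
# print(wait_for_job(job))
```

Returning the final state lets the caller distinguish 'DONE' from 'FAILED' or 'ABORTED' instead of only waiting for the loop to end.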
-
Hi Du,
Thanks for your answer. It's working.