dataikuapi - How to rerun a job through Python using the API?

etienne_95 Registered Posts: 4 ✭✭✭✭

Hi,

I would like to rerun a job I created in DSS through the API using Python. I followed this documentation (https://doc.dataiku.com/dss/latest/python-api/jobs.html), but it does not describe how to define the job properly.

I get this error: "dataikuapi.utils.DataikuException: java.lang.IllegalArgumentException: Computable not found or not buildable:TEST.Build_churn_test_tmp_prepared_2020-07-22T15-15-12.332", and I could not find any documentation about it.

Here is my script:

import dataikuapi
import time

host = "http://192.168.1.14:10000"
apiKey = "BTs4pTOGCOtsBgAiSE5YHrS6GPmnBjHh"

client = dataikuapi.DSSClient(host, apiKey)

# client is now a DSSClient and can perform all authorized actions.
# For example, list the project keys for which the API key has access
# dss_projects = client.list_project_keys()
# print(dss_projects)

project = client.get_project('TEST')
dss_job = project.list_jobs()
# print(dss_job)

# failed_jobs = [job for job in dss_job if job['state'] == 'SUCCESS']
# print(failed_jobs)

# Start a job
print("Step 1 - Job definition")

definition = {
    "type": "NON_RECURSIVE_FORCED_BUILD",
    'projectKey': 'TEST',
    'id': 'Build_churn_test_tmp_prepared_2020-07-22T15-15-12.332',
    'name': 'Build churn_test_tmp_prepared',
    'initiator': 'admin',
    'triggeredFrom': 'RECIPE',
    'recipe': 'compute_churn_test_tmp_prepared',

    "outputs": [{
        'id': 'Build_churn_test_tmp_prepared_2020-07-22T15-15-12.332',
        'type': 'DATASET',
        'targetDatasetProjectKey': 'TEST',
        'targetDataset': 'churn_test_tmp_prepared'
    }]
}
print("Step 2 - start job")

job = project.start_job(definition)
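
The error message names TEST.Build_churn_test_tmp_prepared_2020-07-22T15-15-12.332, which is the job id that the script passes as the output "id", so DSS appears to be looking for a dataset with that name. A minimal sketch of a pre-flight check follows, assuming the list of dataset names is already known. The helper name unknown_output_ids is hypothetical; on a live instance the names could presumably be collected from project.list_datasets(), which is not exercised here.

```python
def unknown_output_ids(definition, dataset_names):
    """Return the output ids in a job definition that match no known dataset name."""
    known = set(dataset_names)
    return [out["id"] for out in definition.get("outputs", []) if out["id"] not in known]


# The definition from the script above, reduced to the fields the check reads.
definition = {
    "type": "NON_RECURSIVE_FORCED_BUILD",
    "outputs": [{
        "id": "Build_churn_test_tmp_prepared_2020-07-22T15-15-12.332",
        "type": "DATASET",
    }],
}

# Assumed dataset list for project TEST (only the dataset named in the script).
dataset_names = ["churn_test_tmp_prepared"]

print(unknown_output_ids(definition, dataset_names))
# prints ['Build_churn_test_tmp_prepared_2020-07-22T15-15-12.332']
```

An empty list would mean every output id matches a dataset name.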

Best Answers

  • duphan
    duphan Dataiker, Registered Posts: 32 Dataiker
    edited July 17 Answer ✓

    Hello,

    The next section of the documentation shows the syntax for running a job: https://doc.dataiku.com/dss/latest/python-api/jobs.html#starting-new-jobs

    Below is a function that rebuilds a given list of datasets (the outputs variable):

    import time

    def run_job(project, outputs, job_type="NON_RECURSIVE_FORCED_BUILD"):
        # One output entry per dataset name; "NP" is the partition value
        # used for non-partitioned datasets.
        definition = {
            "type": job_type,
            "outputs": [{
                "id": "%s" % output_name,
                "partition": "NP"
            } for output_name in outputs]
        }
        job = project.start_job(definition)
        print('Building datasets %s' % outputs)
        # Poll once per second until the job reaches a terminal state.
        state = ''
        while state not in ('DONE', 'FAILED', 'ABORTED'):
            time.sleep(1)
            state = job.get_status()['baseStatus']['state']
        return state
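
As a variation on the function above, here is a minimal sketch that separates building the definition from polling and stops waiting after a timeout. The names build_definition, run_and_wait, poll_seconds and timeout_seconds are hypothetical, and the timeout is an addition that is not part of the original function. The only calls made on the project and job objects are start_job and get_status, as used above.

```python
import time

TERMINAL_STATES = ("DONE", "FAILED", "ABORTED")


def build_definition(dataset_names, job_type="NON_RECURSIVE_FORCED_BUILD"):
    """Return a job definition with one non-partitioned output per dataset name."""
    return {
        "type": job_type,
        "outputs": [{"id": name, "partition": "NP"} for name in dataset_names],
    }


def run_and_wait(project, dataset_names, poll_seconds=1, timeout_seconds=3600):
    """Start the job, then poll until a terminal state or until the timeout expires."""
    job = project.start_job(build_definition(dataset_names))
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        state = job.get_status()["baseStatus"]["state"]
        if state in TERMINAL_STATES:
            return state
        time.sleep(poll_seconds)
    raise TimeoutError("job did not finish within %s seconds" % timeout_seconds)


# Usage against a real instance (placeholders, not executed here):
# import dataikuapi
# client = dataikuapi.DSSClient("http://<host>:<port>", "<api-key>")
# print(run_and_wait(client.get_project("TEST"), ["churn_test_tmp_prepared"]))
```

For the job in the question, the dataset to pass would be churn_test_tmp_prepared, that is, the dataset name rather than the id of the earlier job.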

    Cheers,

    Du

  • etienne_95
    etienne_95 Registered Posts: 4 ✭✭✭✭
    Answer ✓

    Hi Du,

    Thanks for your answer. It's working.
