Get the run error message from Dataiku API

SanderVW
SanderVW Registered Posts: 44 ✭✭✭✭

Hello,

I'm building an automator for our scenarios that logs the results in a dataset. For this I have been exploring the API but cannot get the error message of a failed scenario/run/job.

Is there a way to get the error message? I can start the scenario no problem and can access the run object (to see outcome) but cannot find the actual error message on a failed run, which I would prefer to log as well.

Many thanks in advance!


Operating system used: Windows

Best Answer

  • SanderVW
    SanderVW Registered Posts: 44 ✭✭✭✭
    edited July 17 Answer ✓

    I found a way to do this! You can get a list of all the jobs that started after the scenario did (by timestamp) and get the logs, filtered by "[ERROR]".

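    import dataiku

    client = dataiku.api_client()
    project = client.get_default_project()
    scenario = project.get_scenario('scenarioId')
    # no_fail=True returns the run even when the scenario fails, instead of raising
    run = scenario.run_and_wait(no_fail=True)
    run_start = run.get_details()['scenarioRun']['trigger']['timestamp']

    # Look at every job initiated after the scenario was triggered
    for job in project.list_jobs():
        if job['def']['initiationTimestamp'] >= run_start:
            failed_job = project.get_job(job['def']['id'])
            lines = failed_job.get_log().splitlines()
            for i, line in enumerate(lines):
                if '[ERROR]' in line:
                    # Print the error line plus the next line, if there is one;
                    # slicing avoids an IndexError on the last line of the log
                    print(''.join(lines[i:i + 2]))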
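
The log-filtering step above can be separated from the Dataiku calls so that it is easy to test and reuse. The following is a minimal sketch, not part of the original answer: `extract_error_lines` is a hypothetical helper name, and the sample log text is made up purely for illustration.

```python
def extract_error_lines(log_text, marker="[ERROR]", context=1):
    """Return each line containing `marker`, joined with up to `context` following lines."""
    lines = log_text.splitlines()
    hits = []
    for i, line in enumerate(lines):
        if marker in line:
            # Slicing never raises IndexError, even when the match is the last line.
            hits.append("\n".join(lines[i:i + 1 + context]))
    return hits


# Made-up log text, only to show the behaviour of the helper.
sample_log = "\n".join([
    "[INFO] starting",
    "[ERROR] Build failed",
    "Dataset X is not built",
    "[INFO] done",
    "[ERROR] trailing error",
])
errors = extract_error_lines(sample_log)
for entry in errors:
    print(entry)
```

In the loop above, the text returned by `get_log()` could be passed to such a helper, and the resulting strings written to the logging dataset instead of being printed.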

Answers

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 1,982 Neuron
    edited July 17

    Something like this should do:

    import dataiku
    from dataiku.scenario import Scenario
    
    current_scenario = Scenario()
    client = dataiku.api_client()
    p = client.get_default_project()
    
    scenario_id = current_scenario.scenario_trigger["scenarioId"]
    run = p.get_scenario(scenario_id).get_current_run()
    run_details = run.get_details()
    
    for step_run in run_details.steps:
        step_name = step_run["step"]["name"]
        print("Step:", step_name)
        
        result_log_tail = step_run.get("result", {}).get("logTail", {}).get("lines")
        if result_log_tail:
            print("Result logTail:", result_log_tail)
        
        # Use .get() so steps without report items do not raise a KeyError
        for report_item in step_run.get("additionalReportItems", []):
            report_type = report_item["type"]
            print("Report type:", report_type)
            
            report_log_tail = report_item.get("logTail", {}).get("lines")
            if report_log_tail:
                print("Report logTail:", report_log_tail)

  • SanderVW
    SanderVW Registered Posts: 44 ✭✭✭✭
    edited July 17

    Thank you for the quick reply. Unfortunately, this does not show the error messages in my case; it only shows the type of each step (e.g. a lot of "Report type: BUILT_DATASET").

    I did manage to get some info through some adjustments, but this is the same info I get from the first_error_details property:

    {'clazz': 'java.lang.NullPointerException', 'message': 'NullPointerException', 'stack': 'java.lang.NullPointerException\n\tat com.dataiku.dip.scheduler.steps.BuildFlowItemStepRunner.run(BuildFlowItemStepRunner.java:245)\n\tat com.dataiku.dip.server.services.ScenariosService.runStep(ScenariosService.java:594)\n\tat com.dataiku.dip.scheduler.scenarios.StepBasedScenarioRunner.run(StepBasedScenarioRunner.java:226)\n\tat com.dataiku.dip.scheduler.ScenarioThread.execute(ScenarioThread.java:146)\n\tat com.dataiku.dip.futures.FutureThreadBase.run(FutureThreadBase.java:96)\n'}
    Step: set buildPeriod

    This does show the type of error (NullPointerException) but not the actual message you would see if you go to the job (where it specifies which dataset is not built yet in this case).

    Is there alternatively a way to get this message from a job instead of the run? And if so, how would you get the job object through the scenario/run?
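
For logging, a dictionary shaped like the one above can be flattened into a few fields. This is only a sketch under the assumption that the dictionary always has the `clazz`, `message` and `stack` keys seen in this post; `summarize_error` is a hypothetical helper name, and the stack trace below is shortened from the one quoted above.

```python
def summarize_error(details):
    """Reduce an error-details dict to its class, message and first stack frame."""
    stack = details.get("stack", "")
    # Stack frames are the lines that start with "at " once whitespace is stripped.
    frames = [ln.strip() for ln in stack.splitlines() if ln.strip().startswith("at ")]
    return {
        "error_class": details.get("clazz", ""),
        "message": details.get("message", ""),
        "top_frame": frames[0] if frames else "",
    }


# Shortened copy of the dictionary quoted in the post above.
details = {
    "clazz": "java.lang.NullPointerException",
    "message": "NullPointerException",
    "stack": "java.lang.NullPointerException\n"
             "\tat com.dataiku.dip.scheduler.steps.BuildFlowItemStepRunner.run(BuildFlowItemStepRunner.java:245)\n"
             "\tat com.dataiku.dip.server.services.ScenariosService.runStep(ScenariosService.java:594)\n",
}
summary = summarize_error(details)
print(summary)
```

This still only captures the exception type and location, not the job-level message asked about above.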

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 1,982 Neuron
    edited July 17

    I just had a look at your code. While it may work for you, it doesn't necessarily do what it sets out to do. First, you are filtering for all jobs that started after the scenario started. This is NOT the correct way to find the jobs of a particular scenario run. Instead, look at the following properties of each job returned by the list_jobs() method: scenarioId, scenarioProjectKey, scenarioRunId. These should be matched against the particular run ID you are interested in. Second, your code implies that it gets a handle to a failed_job, which is incorrect too: it is just a handle to every job in the loop. To know whether a job has failed, you first need to get a handle for the job and then check its status. Below is a sample of what I mean.

    jobs = project.list_jobs()
    for job in jobs:
        job_handle = project.get_job(job['def']['id'])
        # Only read the log of jobs whose status is FAILED
        if job_handle.get_status()['baseStatus']['state'] == 'FAILED':
            lines = job_handle.get_log().splitlines()
            for i, line in enumerate(lines):
                if '[ERROR]' in line:
                    # Slice so a match on the last log line cannot raise IndexError
                    print(''.join(lines[i:i + 2]))

    job_handle.get_status()['baseStatus']['state']
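
The matching on scenarioId, scenarioProjectKey and scenarioRunId described above could look roughly like the sketch below. It works on plain dictionaries so it can be tried without a DSS instance. Where exactly these three properties sit in each `list_jobs()` entry is an assumption here (the helper checks the top level first, then the `def` sub-dictionary); `jobs_for_scenario_run` is a hypothetical name and the sample jobs are made up.

```python
def jobs_for_scenario_run(jobs, project_key, scenario_id, run_id):
    """Keep only the job entries that belong to one specific scenario run."""
    def field(job, name):
        # Assumption: the property is either at the top level or under 'def'.
        if name in job:
            return job[name]
        return job.get("def", {}).get(name)

    return [
        job for job in jobs
        if field(job, "scenarioProjectKey") == project_key
        and field(job, "scenarioId") == scenario_id
        and field(job, "scenarioRunId") == run_id
    ]


# Made-up entries imitating the shape of list_jobs() output.
jobs = [
    {"def": {"id": "job_a"}, "scenarioProjectKey": "PROJ", "scenarioId": "DAILY", "scenarioRunId": "run-1"},
    {"def": {"id": "job_b"}, "scenarioProjectKey": "PROJ", "scenarioId": "DAILY", "scenarioRunId": "run-2"},
    {"def": {"id": "job_c", "scenarioProjectKey": "PROJ", "scenarioId": "DAILY", "scenarioRunId": "run-1"}},
    {"def": {"id": "job_d"}},  # a job that was not started by any scenario
]
matched_ids = [j["def"]["id"] for j in jobs_for_scenario_run(jobs, "PROJ", "DAILY", "run-1")]
print(matched_ids)
```

The filtered entries can then go through the FAILED check and log scan shown in the sample above. The run ID has to be taken from the scenario run object; the exact field that holds it is not shown in this thread, so it is worth printing one `list_jobs()` entry and the run details to confirm the key names.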
