Get the run error message from Dataiku API

SanderVW · ‎02-28-2024

Hello,

I'm building an automator for our scenarios that logs the results in a dataset. For this I have been exploring the API but cannot get the error message of a failed scenario/run/job.

Is there a way to get the error message? I can start the scenario no problem and can access the run object (to see outcome) but cannot find the actual error message on a failed run, which I would prefer to log as well.

Many thanks in advance!

Operating system used: Windows

Turribeach · ‎02-28-2024

Something like this should do:

import dataiku
from dataiku.scenario import Scenario

current_scenario = Scenario()
client = dataiku.api_client()
p = client.get_default_project()

scenario_id = current_scenario.scenario_trigger["scenarioId"]
run = p.get_scenario(scenario_id).get_current_run()
run_details = run.get_details()

for step_run in run_details.steps:
    step_name = step_run["step"]["name"]
    print("Step:", step_name)
    
    result_log_tail = step_run.get("result", {}).get("logTail", {}).get("lines")
    if result_log_tail:
        print("Result logTail:", result_log_tail)
    
    for report_item in step_run.get("additionalReportItems", []):
        report_type = report_item["type"]
        print("Report type:", report_type)
        
        report_log_tail = report_item.get("logTail", {}).get("lines")
        if report_log_tail:
            print("Report logTail:", report_log_tail)

SanderVW · ‎02-29-2024

Thank you for the quick reply. Unfortunately, this does not show the error messages in my case; it only shows the type of step (e.g. a lot of lines like "Report type: BUILT_DATASET").

I did manage to get some info through some adjustments, but this is the same info I get from the first_error_details property:

{'clazz': 'java.lang.NullPointerException', 'message': 'NullPointerException', 'stack': 'java.lang.NullPointerException\n\tat com.dataiku.dip.scheduler.steps.BuildFlowItemStepRunner.run(BuildFlowItemStepRunner.java:245)\n\tat com.dataiku.dip.server.services.ScenariosService.runStep(ScenariosService.java:594)\n\tat com.dataiku.dip.scheduler.scenarios.StepBasedScenarioRunner.run(StepBasedScenarioRunner.java:226)\n\tat com.dataiku.dip.scheduler.ScenarioThread.execute(ScenarioThread.java:146)\n\tat com.dataiku.dip.futures.FutureThreadBase.run(FutureThreadBase.java:96)\n'}
Step: set buildPeriod

This does show the type of error (NullPointerException) but not the actual message you would see if you go to the job (where it specifies which dataset is not built yet in this case).

Is there alternatively a way to get this message from a job instead of the run? And if so, how would you get the job object through the scenario/run?
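For logging purposes, an error dict like the one above can be flattened into a short record. This is only a minimal sketch: summarize_error is a hypothetical helper, not part of the Dataiku API, and it assumes the dict has the 'clazz', 'message' and 'stack' keys seen in the output above.

```python
def summarize_error(details):
    """Flatten an error dict with 'clazz', 'message' and 'stack' keys into a short record."""
    stack_lines = [ln.strip() for ln in details.get("stack", "").splitlines() if ln.strip()]
    # The first line starting with 'at ' is the innermost stack frame
    top_frame = next((ln for ln in stack_lines if ln.startswith("at ")), None)
    return {
        "error_class": details.get("clazz"),
        "message": details.get("message"),
        "top_frame": top_frame,
    }


# Shortened copy of the dict shown above
details = {
    "clazz": "java.lang.NullPointerException",
    "message": "NullPointerException",
    "stack": "java.lang.NullPointerException\n"
             "\tat com.dataiku.dip.scheduler.steps.BuildFlowItemStepRunner.run(BuildFlowItemStepRunner.java:245)\n"
             "\tat com.dataiku.dip.server.services.ScenariosService.runStep(ScenariosService.java:594)\n",
}
print(summarize_error(details))
```

As noted above, this still only gives the exception type and location, not the dataset-level message shown on the job page.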

SanderVW · ‎02-29-2024

I found a way to do this! You can get a list of all the jobs that started after the scenario did (by timestamp) and get the logs, filtered by "[ERROR]".

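import dataiku

client = dataiku.api_client()
project = client.get_default_project()
scenario = project.get_scenario('scenarioId')
run = scenario.run_and_wait(no_fail=True)

# Read the trigger timestamp once instead of on every loop iteration
run_start = run.get_details()['scenarioRun']['trigger']['timestamp']

jobs = project.list_jobs()
for job in jobs:
    # Keep jobs initiated at or after the scenario trigger
    if job['def']['initiationTimestamp'] >= run_start:
        failed_job = project.get_job(job['def']['id'])
        job_log = failed_job.get_log()
        lines = job_log.splitlines()
        for i in range(len(lines)):
            if '[ERROR]' in lines[i]:
                # Append the following line, if there is one (avoids an IndexError on the last line)
                next_line = lines[i + 1] if i + 1 < len(lines) else ''
                print(lines[i] + next_line)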
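If the goal is to write these messages to a dataset rather than print them, the log filtering can be pulled out into a small function that returns the matches. This is just a sketch: extract_error_lines is a hypothetical helper, and the sample log text is made up for illustration and is not real Dataiku output.

```python
def extract_error_lines(log_text, marker="[ERROR]", context=1):
    """Return each line containing `marker`, joined with up to `context` following lines."""
    lines = log_text.splitlines()
    errors = []
    for i, line in enumerate(lines):
        if marker in line:
            # Slicing never raises, even when the marker is on the last line
            errors.append("\n".join(lines[i:i + 1 + context]))
    return errors


# Made-up log text, only to exercise the function
sample_log = "\n".join([
    "[INFO] starting build",
    "[ERROR] build failed",
    "dataset 'orders' is not built",
    "[INFO] cleaning up",
    "[ERROR] job aborted",
])
for message in extract_error_lines(sample_log):
    print(message)
```

In the loop above, the string returned by get_log() would be passed in as log_text.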

Turribeach · ‎03-14-2024

I just had a look at your code. While it may work for you, it's not necessarily doing what it sets out to do. First, you are selecting all jobs that started after the scenario started. This is NOT the correct way of filtering jobs for a particular scenario run, because any other job started in the project after that time is picked up as well. What you should do is look at the following properties in the job as returned by the list_jobs() method: scenarioId, scenarioProjectKey, scenarioRunId. These should be matched to the particular run ID you are interested in.

Second, your code implies that it gets a handle to a failed job (failed_job), which is incorrect too: it is just a handle to every job in the loop. To know whether a job has failed, you first need to get a handle for the job, then check its status. Below is a sample of what I mean.

jobs = project.list_jobs()
for job in jobs:
    job_handle = project.get_job(job['def']['id'])
    # Only read the log of jobs whose state is FAILED
    if job_handle.get_status()['baseStatus']['state'] == 'FAILED':
        lines = job_handle.get_log().splitlines()
        for i, line in enumerate(lines):
            if '[ERROR]' in line:
                # Guard against an [ERROR] on the last line of the log
                next_line = lines[i + 1] if i + 1 < len(lines) else ''
                print(line + next_line)
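The matching on scenarioId, scenarioProjectKey and scenarioRunId described above could look roughly like the sketch below. jobs_for_scenario_run is a hypothetical helper, and where exactly these properties sit in each job dict (top level or under 'def') is an assumption here, so inspect the list_jobs() output on your own instance first. The sample jobs, IDs and project key are invented for illustration.

```python
def jobs_for_scenario_run(jobs, scenario_id, run_id, project_key=None):
    """Keep only the job summaries that reference the given scenario run."""
    matched = []
    for job in jobs:
        # ASSUMPTION: the properties may sit at the top level or under 'def';
        # verify against the real list_jobs() output before relying on this.
        props = {**job.get("def", {}), **job}
        if props.get("scenarioId") != scenario_id:
            continue
        if props.get("scenarioRunId") != run_id:
            continue
        if project_key is not None and props.get("scenarioProjectKey") != project_key:
            continue
        matched.append(job)
    return matched


# Invented job summaries, shaped like the dicts used in the loop above
jobs = [
    {"def": {"id": "job_a", "scenarioId": "BUILD_ALL", "scenarioRunId": "run-1",
             "scenarioProjectKey": "SALES"}},
    {"def": {"id": "job_b", "scenarioId": "BUILD_ALL", "scenarioRunId": "run-2",
             "scenarioProjectKey": "SALES"}},
    {"def": {"id": "job_c"}},  # e.g. a job started manually, with no scenario properties
]
matched = jobs_for_scenario_run(jobs, "BUILD_ALL", "run-1")
print([job["def"]["id"] for job in matched])
```

The FAILED status check and log filtering would then run only over the matched jobs instead of every job in the project.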
