SCENARIO : get the name of dataset that failed in a step

GanNing69
Level 1
Hello, 

I am currently trying to retrieve the names of the datasets that failed in a build step of a scenario. So far I have only found a way to retrieve the first step that fails; however, I specifically want the names of all the datasets that fail, so that I can write them to a separate dataset. Do you have any ideas for doing this in a Python step?

For example, if the build of one of the datasets in a step fails, I want to be able to retrieve its name.

2 Replies
Turribeach

Can you please post the Python code you have so far in a code block (the </> icon on the toolbar)?

GanNing69
Level 1
Author

Of course, here is my current Python code, which aims to retrieve the logs when an error occurs. The rest of the scenario is just a sequence of built-in Dataiku steps.

 

import dataiku
import pandas as pd
from dataiku.scenario import Scenario

# Handle on the scenario that is currently running
current_scenario = Scenario()
client = dataiku.api_client()
p = client.get_default_project()

# Read the scenario variable that holds the start time of the run
DebScen = dataiku.get_custom_variables()['DebScen']

# Get the scenario and the current run with its details
scenario_id = current_scenario.scenario_trigger["scenarioId"]
run = p.get_scenario(scenario_id).get_current_run()
run_details = run.get_details()

errors = []

# Loop over each step of the run and its details
for step_run in run_details.steps:
    step_name = step_run["step"]["name"]
    error_msg = None

    # Read the log tail (last lines of the log) of the step result
    result_log_tail = step_run.get("result", {}).get("logTail", {}).get("lines")
    if result_log_tail:
        error_msg = "\n".join(result_log_tail)

    # Append the log tail of each additional report item, if there is one
    for report_item in step_run.get("additionalReportItems", []):
        report_log_tail = report_item.get("logTail", {}).get("lines")
        if report_log_tail:
            if error_msg:
                error_msg += "\n" + "\n".join(report_log_tail)
            else:
                error_msg = "\n".join(report_log_tail)

    # Record the scenario start time, the scenario ID, the name of the failing step and its error message
    if error_msg:
        errors.append({"scenario_name": scenario_id, "step_name": step_name, "error_message": error_msg, "deb_scenario": DebScen})

# Build the final dataset by appending the new errors to the existing history
dfErrors = pd.DataFrame(errors)
scenario_logs = dataiku.Dataset("scenario_logs")
existing_data = scenario_logs.get_dataframe()
updated_ds = pd.concat([existing_data, dfErrors], ignore_index=True)

# Output
scenario_logs.write_with_schema(updated_ds)
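
The code above produces one row per step. To get one row per failed dataset instead, a possible direction is to record each additional report item separately rather than merging their log tails into a single message. The sketch below works on plain dictionaries shaped like the step details used above. The keys "outcome", "target" and "datasetName" are assumptions that are not confirmed anywhere in this thread, so it is worth printing the report items of a failed run to check what they actually contain. The helper name collect_failed_items is hypothetical.

```python
from typing import Any, Dict, List


def collect_failed_items(steps: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return one row per report item whose outcome is not a success."""
    rows = []
    for step_run in steps:
        step_name = (step_run.get("step") or {}).get("name")
        for item in step_run.get("additionalReportItems") or []:
            # ASSUMPTION: each report item carries an "outcome" field and
            # anything other than "SUCCESS" counts as a failure.
            outcome = item.get("outcome")
            if outcome is None or outcome == "SUCCESS":
                continue
            # ASSUMPTION: items describing a dataset build expose the dataset
            # name under target.datasetName; it stays None when absent.
            dataset_name = (item.get("target") or {}).get("datasetName")
            log_lines = (item.get("logTail") or {}).get("lines") or []
            rows.append({
                "step_name": step_name,
                "item_type": item.get("type"),
                "dataset_name": dataset_name,
                "outcome": outcome,
                "error_message": "\n".join(log_lines),
            })
    return rows
```

If the assumed keys match the real run details, calling collect_failed_items(run_details.steps) in the scenario step and passing the result to pd.DataFrame would give a frame that can be appended to scenario_logs in the same way as dfErrors.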