
How to write job logs as txt file?

Azs
Level 1

Hello Community!

I am trying to export the job logs for a time interval as a text file and keep it in a managed folder in the current project. However, only one log gets written and the rest are not exported. How can I write all of the logs?

 

Code:

all_job_logs_str=''.join([str(item) for item in all_job_logs])

with logs_folder.get_writer("logs.txt") as writer:
    writer.write(bytes(all_job_logs_str, encoding='utf-8'))

 

all_job_logs_str holds all of the logs as a single string.

 

Thank you in advance.

 

3 Replies
CatalinaS
Dataiker

Hi @Azs,

 

Can you please share more code? How are you running this code? Please confirm if you are running it in DSS as a Python recipe or outside DSS as a Python script. 

Azs
Level 1
Author

Hi @CatalinaS ,

Thank you for your response. Yes, it is running in DSS as Python code in a notebook. I'm also using the following code with it.

 

client = dataiku.api_client()
project = client.get_default_project()
datasets = project.list_datasets(as_type='listitems')
jobs = [dataset.name for dataset in datasets if dataset.name.split('_')[1]=='jobs']

if jobs:
    jobs = jobs[0]
else:
    name = dataiku.default_project_key().split('_')[0]
    jobs = f"Jobs"

try:
    jobs_dataset = dataiku.Dataset(jobs)
except:
    print('dataset with jobs info not found in this project')

jobs_dataframe = jobs_dataset.get_dataframe()

all_job_logs=[]

line = 0
end_line = len(jobs_dataframe) + 1

while line < 3: ##len(jobs_dataframe)

    row = jobs_dataframe.iloc[line]
    one_job = DSSJob(client, row.job_project_key, row.job_id)

    try:
        job_log = one_job.get_log()
        #print(job_log)
    except Exception:
        pass

    all_job_logs.append(job_log)

    line+= 1

# save job log to filesystem
jobs_dataset = project.get_dataset(team_jobs)

# create new logs folder
foldername = 'job_logs_folder'
current_folders = [item.get('name') for item in project.list_managed_folders()]
if not foldername in current_folders:
    project.create_managed_folder(foldername, connection_name=team_filesystem)

logs_folder = dataiku.Folder(foldername)

# save job log as "log.txt" in the folder
all_job_logs_str=''.join([str(item) for item in all_job_logs])

# run this cell to save the one job log.
with logs_folder.get_writer("log.txt") as writer:
    writer.write(bytes(all_job_logs_str, encoding='utf-8'))

 

CatalinaS
Dataiker

Hi @Azs ,

Thanks for sharing the code.

I was able to export the logs of all jobs using the code below:

import dataiku
from dataikuapi.dss.job import DSSJob

client = dataiku.api_client()
project = client.get_default_project()
jobs = project.list_jobs()

# create the logs folder if it does not exist yet
foldername = 'job_logs_folder'
current_folders = [item.get('name') for item in project.list_managed_folders()]
if foldername not in current_folders:
    project.create_managed_folder(foldername)

logs_folder = dataiku.Folder(foldername)

# collect the log of every job first
all_job_logs = []
for job in jobs:
    print(job)
    job_hndl = DSSJob(client, job["def"]["projectKey"], job["def"]["id"])
    all_job_logs.append(job_hndl.get_log())

# open the writer once, after the loop, so that no log replaces another
all_job_logs_str = '\n'.join([str(item) for item in all_job_logs])
with logs_folder.get_writer("logs.txt") as writer:
    writer.write(bytes(all_job_logs_str, encoding='utf-8'))
print(all_job_logs_str)

 

This is a simplification of your code that does not use your naming convention.

I suspect the issue is that the file is being overwritten rather than appended to: each call to get_writer() on the same path replaces the file's contents, so calling it once per log inside a loop leaves only the last log in the file.
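To make the overwrite point concrete, here is a minimal sketch of the "collect first, write once" pattern. It assumes only that the folder object exposes get_writer(path) as a context manager, the way dataiku.Folder is used above. The helper names join_logs and write_logs_once, the separator line, and the InMemoryFolder class are all hypothetical: InMemoryFolder is just a stand-in so the snippet can be tried outside DSS, and it is not part of the Dataiku API.

```python
import io
from contextlib import contextmanager


def join_logs(logs, separator="\n" + "=" * 80 + "\n"):
    """Concatenate job logs into one string, skipping logs that are None."""
    return separator.join(str(log) for log in logs if log is not None)


def write_logs_once(folder, path, logs):
    """Open the writer a single time and write every log through it.

    `folder` is any object with get_writer(path) usable in a `with` block.
    Returns the number of bytes written.
    """
    payload = join_logs(logs).encode("utf-8")
    with folder.get_writer(path) as writer:
        writer.write(payload)
    return len(payload)


class InMemoryFolder:
    """Hypothetical stand-in for a managed folder (not a Dataiku class)."""

    def __init__(self):
        self.files = {}

    @contextmanager
    def get_writer(self, path):
        buffer = io.BytesIO()
        yield buffer
        # Assumption: a new writer replaces whatever was stored under the
        # same path, mirroring the overwrite behaviour suspected above.
        self.files[path] = buffer.getvalue()
```

Inside DSS the call would look like write_logs_once(logs_folder, "logs.txt", all_job_logs), with logs_folder and all_job_logs built as in the code above. If one file per job is preferred instead, the same helper can be called inside the loop with a different path for each job.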

I hope this helps.
