# -*- coding: utf-8 -*-
import dataiku
import pandas as pd, numpy as np
from dataiku import pandasutils as pdu
import os
import glob
# Read recipe inputs
ctdna_output = dataiku.Folder("aMqrRJSr")
ctdna_output_info = ctdna_output.get_info()
# Get the list of files matching the pattern
files = glob.glob(os.path.join(ctdna_output.get_path(), 'R2810_1624_*.csv'))  # <-- this is where I am having the problem
Please help with a solution.
Hi,
From the post's title, it seems that the code is running outside of DSS (probably in a pod). As a result, you don't have access to the DSS filesystem (the same applies to managed folders hosted in another location, such as S3, HDFS, Azure Blob, etc.), so you will need to use get_download_stream/upload_stream to read from and write to the managed folder. Please refer to the example below:
folder_handle = dataiku.Folder("jWoN2f4k")
paths = folder_handle.list_paths_in_partition()
for path in paths:
    with folder_handle.get_download_stream(path) as f:
        output_df = pd.read_csv(f)
        print(output_df.shape)
        # do something with the dataframe
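As a side note, if you only need the files matching your R2810_1624_*.csv pattern, you can filter the paths returned by list_paths_in_partition() with Python's fnmatch module instead of glob. Here is a minimal sketch, reusing the folder ID from your recipe:
import fnmatch
import os
import dataiku
import pandas as pd

# Same managed folder ID as in the original recipe
ctdna_output = dataiku.Folder("aMqrRJSr")

# list_paths_in_partition() works whether the recipe runs locally or in a container,
# unlike get_path(), which requires direct filesystem access to the folder contents
all_paths = ctdna_output.list_paths_in_partition()
csv_paths = [p for p in all_paths if fnmatch.fnmatch(os.path.basename(p), 'R2810_1624_*.csv')]

# Read each matching file through a download stream
dfs = []
for path in csv_paths:
    with ctdna_output.get_download_stream(path) as f:
        dfs.append(pd.read_csv(f))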
Best,
Vitaliy
Hi Vitaliy,
I do have access to DSS.
If I run that code with the local DSS computation, it works, but once I change the computation to Kubernetes, it gives errors.
Hi, without knowing what the error is, we can't say much. Could you add a job diag? If you can't attach it here, I would suggest opening a support ticket and providing the job diag of the failed job (https://doc.dataiku.com/dss/latest/troubleshooting/obtaining-support.html#guidelines-for-submitting-...).
Best,
Vitaliy