Moving folder to Kubernetes
Hi,
I have a code recipe currently running on DSS. I switched the computation from DSS to Kubernetes and started having issues.
How do I read the files from a managed folder in DSS when the recipe runs on Kubernetes?
Note: I want to be able to read all the data in the folder, not individual files.
Below is the Python code I have written so far:
# Read recipe inputs
import dataiku
import os

Clinical_data = dataiku.Folder("TEST_WITH_FANG.LeD8SADS")
Clinical_data_info = Clinical_data.get_info()

# pass a partition identifier if the folder is partitioned
#paths = Clinical_data.list_paths_in_partition()
path = Clinical_data.get_path()

# read folder
os.makedirs('test_data', exist_ok=True)
for file_path in paths:
    with Clinical_data.get_download_stream(file_path) as f:
        with open(os.path.join('test_data', file_path[1:]), 'wb') as local_fp:
            testing = local_fp.write(f.read())
Thanks
Answers
Hi,
As the code runs outside of DSS (in a pod), you don't have access to the DSS filesystem. You will need to use get_download_stream/upload_stream to read from/write to the managed folder.
I'm not sure why you need to create the folder and save the files directly in the filesystem, but when running in Kubernetes your code will save the files in the pod's local filesystem. Assuming this is the wanted behaviour, your code can look something like the following:
# Read recipe inputs
import dataiku
import os

Clinical_data = dataiku.Folder("TEST_WITH_FANG.LeD8SADS")
paths = Clinical_data.list_paths_in_partition()

# Local directory inside the pod's filesystem
folder = 'test_data'
os.makedirs(folder, exist_ok=True)
folder_path = os.path.join(os.getcwd(), folder)
print(paths, folder_path)

# Stream each file from the managed folder down to the pod's local disk
for path in paths:
    with Clinical_data.get_download_stream(path) as f:
        new_file_path = folder_path + path
        with open(new_file_path, 'wb') as local_fp:
            local_fp.write(f.read())
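If you also need to write results back to a managed folder from the pod, the same streaming approach works in reverse with upload_stream. Here is a minimal sketch, assuming a hypothetical output managed folder with the id "output_folder" declared as a recipe output:

# Minimal sketch of writing back to a managed folder from the pod.
# "output_folder" is a hypothetical folder id used for illustration only.
import dataiku
import os

output_folder = dataiku.Folder("output_folder")

# Upload every file previously saved under the local 'test_data' directory
for root, dirs, files in os.walk('test_data'):
    for name in files:
        local_path = os.path.join(root, name)
        # Path inside the managed folder, relative to the local directory
        target_path = '/' + os.path.relpath(local_path, 'test_data')
        with open(local_path, 'rb') as f:
            output_folder.upload_stream(target_path, f)

upload_stream accepts any file-like object, so the files go straight from the pod to the managed folder without ever touching the DSS server filesystem.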
I verified this with a test run in my lab.
Best,
Vitaliy