Moving folder to Kubernetes

Scobbyy2k3
Level 3
Hi,

I have a code recipe currently running on DSS. I switched the computation from DSS to Kubernetes and started having issues.

How do I move a folder from a managed folder in DSS to Kubernetes?

 

Note: I want to be able to access all the data in the folder, not individual files.

Below is the Python code I have written so far:

# Read recipe inputs
Clinical_data = dataiku.Folder("TEST_WITH_FANG.LeD8SADS")
Clinical_data_info = Clinical_data.get_info()

# pass a partition identifier if the folder is partitioned
#paths = Clinical_data.list_paths_in_partition()
path = Clinical_data.get_path()

#read folder

import os

os.makedirs('test_data', exist_ok=True)

for file_path in paths:
    with Clinical_data.get_download_stream(file_path) as f:
        with open(os.path.join('test_data', file_path[1:]), 'wb') as local_fp:
            testing = local_fp.write(f.read())

 

 

Thanks

 

1 Reply
VitaliyD
Dataiker

Hi,

As the code is running outside of DSS (in a pod), you don't have access to the DSS filesystem. You will need to use get_download_stream() to read from the managed folder and upload_stream() to write to it.
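If the goal is simply to read every file in the folder, a minimal sketch like the one below may be enough, with no local copy needed. The helper name read_all_files is hypothetical; it only assumes the folder object exposes list_paths_in_partition() and get_download_stream(), as dataiku.Folder does.

```python
def read_all_files(folder):
    """Return a dict mapping each path in the folder to its raw bytes.

    `folder` is expected to behave like a dataiku.Folder: it must provide
    list_paths_in_partition() and get_download_stream(path).
    """
    contents = {}
    for path in folder.list_paths_in_partition():
        # Stream the file through the DSS API instead of the local filesystem
        with folder.get_download_stream(path) as stream:
            contents[path] = stream.read()
    return contents


# Inside a DSS recipe this could be used as follows (requires DSS):
# import dataiku
# files = read_all_files(dataiku.Folder("TEST_WITH_FANG.LeD8SADS"))
```

This holds every file in memory at once, so it is probably only suitable when the folder is reasonably small.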

I'm not sure why you need to create the folder and save the files directly in the filesystem, but note that your code will save the files in the pod's filesystem when running in Kubernetes. Assuming this is the wanted behaviour, your code can look something like the following:

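import os
import dataiku

# Read recipe inputs
Clinical_data = dataiku.Folder("TEST_WITH_FANG.LeD8SADS")

# List every file in the managed folder (paths start with "/")
paths = Clinical_data.list_paths_in_partition()

# Local directory in the pod's filesystem
folder = 'test_data'
folder_path = os.path.join(os.getcwd(), folder)
os.makedirs(folder_path, exist_ok=True)
print(paths, folder_path)

for path in paths:
    # Strip the leading "/" so os.path.join keeps folder_path as the root
    new_file_path = os.path.join(folder_path, path.lstrip('/'))
    # Create subdirectories for files nested inside the managed folder
    os.makedirs(os.path.dirname(new_file_path), exist_ok=True)
    with Clinical_data.get_download_stream(path) as f:
        with open(new_file_path, 'wb') as local_fp:
            local_fp.write(f.read())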
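Files written to the pod's filesystem are generally not kept once the pod terminates, so anything the recipe produces locally would need to go back into a managed folder. A rough sketch of the write direction, using upload_stream(), could look like this. The helper name upload_directory and the folder id in the usage comment are hypothetical, and the leading "/" on target paths is an assumption that mirrors how list_paths_in_partition() reports paths.

```python
import os


def upload_directory(folder, local_dir):
    """Upload every file under local_dir and return the folder paths written.

    `folder` is expected to behave like a dataiku.Folder: it must provide
    upload_stream(path, file_object).
    """
    uploaded = []
    for root, _dirs, files in os.walk(local_dir):
        for name in files:
            local_path = os.path.join(root, name)
            # Build a "/"-separated path relative to local_dir
            rel = os.path.relpath(local_path, local_dir)
            target = "/" + rel.replace(os.sep, "/")
            with open(local_path, "rb") as fp:
                folder.upload_stream(target, fp)
            uploaded.append(target)
    return sorted(uploaded)


# Inside a DSS recipe this could be used as follows (requires DSS;
# "OUTPUT_FOLDER_ID" is a placeholder, not a real folder id):
# import dataiku
# upload_directory(dataiku.Folder("OUTPUT_FOLDER_ID"), "test_data")
```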

Below is the test run in my lab:

[Image: Screenshot 2022-11-05 at 12.30.36.png]

Best,

Vitaliy
