[Python API] Get path of the file inside an "UploadedFiles" dataset
Hi,
I am looking for the name of the input file uploaded into an "UploadedFiles" dataset.
For a managed folder, I am using the "list_contents" function to do so and it works perfectly.
My code is currently the following:

import dataiku

project = dataiku.api_client().get_default_project()
dataset = project.get_dataset("MY_UPLOADED_FILE_DATASET")
# CODE HERE TO RETRIEVE FILE NAME INSIDE MY_UPLOADED_FILE_DATASET
Looking forward to your answers,
Bests,
Julien
Best Answer
Turribeach
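Unfortunately, the dataikuapi.dss.dataset.DSSDataset class (which is what project.get_dataset() returns) doesn't have a method to do that, so you need to use the dataiku.Dataset class instead (see the Dataiku Python API documentation for the list of methods of these two classes). Here is a sample code snippet:

import dataiku

# Use the internal dataiku.Dataset class, not the dataikuapi one
dataset = dataiku.Dataset("some_dataset_name")
dataset_files = dataset.get_files_info()
# 'NP' is the key used for non-partitioned datasets
for file in dataset_files['pathsByPartition']['NP']:
    print(file)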
And sample output:
Keep in mind that partitioned and non-partitioned datasets will use different keys to store the file names (NP = non-partitioned). You can upload multiple files so you may get more than one file. Enjoy!
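If you don't want to hard-code the 'NP' key, here is a minimal sketch that walks every partition key instead. The helper name list_dataset_files and the sample dictionary are hypothetical, made up for illustration; the only thing taken from the snippet above is the 'pathsByPartition' key. I'm treating each file entry as opaque, since its exact fields may differ between DSS versions.

```python
def list_dataset_files(files_info):
    """Return (partition_key, file_entry) pairs from a get_files_info()-style dict.

    Works for both partitioned and non-partitioned datasets because it
    iterates over whatever partition keys are present ('NP' included).
    """
    pairs = []
    for partition, entries in files_info.get("pathsByPartition", {}).items():
        for entry in entries:
            pairs.append((partition, entry))
    return pairs


# Hypothetical sample shaped like the structure used in the snippet above;
# the entry contents are placeholders, not real DSS output.
sample_info = {
    "pathsByPartition": {
        "NP": [{"path": "/a.csv"}, {"path": "/b.csv"}],
    }
}

for partition, entry in list_dataset_files(sample_info):
    print(partition, entry)
```

Inside DSS you would pass it the real result, i.e. list_dataset_files(dataiku.Dataset("some_dataset_name").get_files_info()).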
PS: The above code will work fine inside DSS. If you want to connect from outside DSS and use the internal API, you will need additional code to set up the connection first.
Answers
Thanks a lot, exactly what I was looking for!