[Python API] Get path of the file inside an "UploadedFiles" dataset

JKO

Hi,

I am looking for the name of the input file uploaded into an "UploadedFiles" dataset.

For a managed folder, I am using the "list_contents" function to do so and it works perfectly.

My code is currently the following:

import dataiku
project = dataiku.api_client().get_default_project()
dataset = project.get_dataset("MY_UPLOADED_FILE_DATASET")
#CODE HERE TO RETRIEVE FILE NAME INSIDE MY_UPLOADED_FILE_DATASET

Looking forward to your answers,

Best,

Julien

Best Answer

  • Turribeach
    edited September 16 Answer ✓

    Unfortunately, the dataikuapi.dss.dataset class doesn't have a method for that, so you need to use the dataiku.Dataset class instead (see here for the list of methods of these two classes). Here is a sample code snippet:

    import dataiku

    dataset = dataiku.Dataset("some_dataset_name")
    dataset_files = dataset.get_files_info()
    # 'NP' is the key used for non-partitioned datasets
    for file in dataset_files['pathsByPartition']['NP']:
        print(file)
    

    And sample output:

    Keep in mind that partitioned and non-partitioned datasets use different keys to store the file names (NP = non-partitioned). Since you can upload multiple files, the list may contain more than one entry. Enjoy!
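
If you need the names for every partition key rather than only 'NP', something like the sketch below might help. It is only an illustration: the helper name list_uploaded_file_names and the sample dictionary are made up, and it assumes each entry under 'pathsByPartition' is either a plain path string or a dict with a "path" key, so check what get_files_info() actually returns on your instance first.

```python
import posixpath


def list_uploaded_file_names(files_info):
    """Return {partition_key: [file names]} from a get_files_info()-style dict."""
    names = {}
    for partition, entries in files_info.get("pathsByPartition", {}).items():
        partition_names = []
        for entry in entries:
            # Assumption: an entry is a path string or a dict holding a "path" key.
            path = entry["path"] if isinstance(entry, dict) else str(entry)
            # Keep only the last path component, i.e. the file name.
            partition_names.append(posixpath.basename(path))
        names[partition] = partition_names
    return names


# Hypothetical sample shaped like the structure used in the snippet above.
sample = {"pathsByPartition": {"NP": [{"path": "/my_file.csv"}, "/other_file.csv"]}}
print(list_uploaded_file_names(sample))
```

Inside DSS you would pass it the real result instead of the sample, e.g. list_uploaded_file_names(dataiku.Dataset("some_dataset_name").get_files_info()).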

    PS: The above code will work fine inside DSS. If you want to connect from outside DSS and use the internal API you will need further code.
