
Uploading a file to a folder using the Python API

Level 3

Hi, does anyone have a code snippet showing how to upload a file to a Dataiku managed folder using the external Python API? Thanks!

Dataiker

Hi @Turribeach ,

Here you go

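import dataiku

# Path of the file on the machine running this code (placeholder value)
local_file_path = "/path/to/local_file.csv"

# Get a handle on the managed folder by its ID, then upload the local file into it
folder = dataiku.Folder("FOLDER_ID")
folder.upload_file("/uploaded_file.csv", local_file_path)

Please also refer to the managed folder documentation for more information. There are other upload methods as well, such as upload_stream and upload_data.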
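For comparison, here is a rough sketch of the three upload methods mentioned above side by side. It assumes that `folder` is a dataiku.Folder handle and that upload_stream and upload_data take the target path first, like upload_file does; please check the managed folder documentation for the exact signatures. The function name and the target file names are made up for illustration.

```python
def upload_three_ways(folder, local_file_path):
    """Upload the same local file using each of the three methods.

    `folder` is assumed to be a dataiku.Folder handle, e.g. the result of
    dataiku.Folder("FOLDER_ID"). The target names below are placeholders.
    """
    # 1. Upload directly from a path on the local disk
    folder.upload_file("/from_file.csv", local_file_path)

    # 2. Upload from an open binary file-like object
    with open(local_file_path, "rb") as stream:
        folder.upload_stream("/from_stream.csv", stream)

    # 3. Upload from bytes that are already in memory
    with open(local_file_path, "rb") as stream:
        data = stream.read()
    folder.upload_data("/from_data.csv", data)
```

upload_file is the simplest choice when the file is already on disk; the stream and data variants are mostly useful when the content is produced in memory.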
Andrey Avtomonov
R&D Engineer @ Dataiku
Level 3
Author

I am getting this exception:

Exception: Default project key is not specified (no DKU_CURRENT_PROJECT_KEY in env)

Isn't the dataiku package supposed to be used only from inside DSS, with dataikuapi being the package to use from outside DSS?

Level 3
Author

Solved it by setting the project key in the environment before using the dataiku package:

import os
os.environ["DKU_CURRENT_PROJECT_KEY"] = "PROJECT_KEY"

Level 3
Author

After some Googling and help from support, I figured out how to do this using the Python API package (dataikuapi), which is the recommended package to use outside DSS. There seems to be an undocumented method (get_managed_folder) that is not shown on the API client definition page.

import dataikuapi

# Set Dataiku URL and API Key
host = "https://your_dss_URL"
apiKey = "paste your API key here"

# Create API client
client = dataikuapi.DSSClient(host, apiKey)

# Ignore SSL checks as these may fail without access to root CA certs
client._session.verify = False

# Get a handle to the Dataiku project, must use Project Key, take it from Project Home URL, must be all in uppercase
project = client.get_project("PROJECT_KEY")

# Get a handle to the managed folder you want to upload a file to, must use Folder ID, take it from URL when browsing the folder in Dataiku. Case sensitive!
managedfolder = project.get_managed_folder("folder_id")

# Upload a local file to the managed folder
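# Raw string so the backslashes are not read as escapes (e.g. "\f"); binary mode so the bytes are sent unchanged
with open(r"C:\full_path_to\same_file.csv", "rb") as file:
    managedfolder.put_file('same_file.csv', file)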
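If you need to do this more than once, the upload step can be wrapped in a small helper. This is only a sketch: it assumes `managed_folder` is the handle returned by project.get_managed_folder and that put_file takes a target name and an open file object, as in the snippet above. The function name upload_to_managed_folder and its parameters are my own, not part of dataikuapi.

```python
from pathlib import Path


def upload_to_managed_folder(managed_folder, local_path, remote_name=None):
    """Upload one local file to a managed folder handle.

    `managed_folder` is assumed to expose put_file(name, file_object),
    like the handle returned by project.get_managed_folder("folder_id").
    Returns the name used inside the managed folder.
    """
    local_path = Path(local_path)

    # Default to the local file name when no target name is given
    if remote_name is None:
        remote_name = local_path.name

    # Open in binary mode so the content is sent byte for byte
    with local_path.open("rb") as f:
        managed_folder.put_file(remote_name, f)

    return remote_name
```

With the client code above this would be called as upload_to_managed_folder(managedfolder, r"C:\full_path_to\same_file.csv"). Using pathlib together with a raw string avoids the problem of backslash sequences such as "\f" being interpreted as escape characters in a Windows path.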

 

Level 3
Author

OK, get_managed_folder is documented here. Dataiku has split the API classes across different pages, which I think makes it harder for someone new to understand the API. I thought the https://doc.dataiku.com/dss/latest/python-api/client.html page was the whole API documentation. It is also very confusing to have both APIs (dataiku and dataikuapi) on the same page. In addition, the two APIs implement things in different ways: with dataiku you use dataiku.Folder("FOLDER_ID") to get a folder handle, whereas with dataikuapi you first need to get a handle to a project and then call project.get_managed_folder('FOLDER_ID'). Finally, there are no examples of how to use get_managed_folder.
