Python code to read S3 files

Solved!
sj0071992
Neuron

Hi,

How can we read S3 files using a Python recipe in Dataiku?

Thanks in advance.

1 Solution
AlexT
Dataiker

Hi,

Perhaps you can provide a bit more context around what you want to achieve exactly. Creating a managed folder in most cases can be done in the UI and does not need to be automated.

If you do want to use the API to create the managed folder and specify its path, you can use the following:

import dataiku

client = dataiku.api_client()

# Assumes this runs in a notebook, recipe, or scenario inside the project
project = client.get_default_project()

managed_folder_name = "my_s3_managed_folder"
s3_connection_name = "s3-test"

# The S3 connection must already exist; create the managed folder only if it is missing
try:
    folder_id = dataiku.Folder(managed_folder_name).get_id()
except Exception:
    print("creating folder")
    new_folder = project.create_managed_folder(managed_folder_name, connection_name=s3_connection_name)
    # Keep the id of the new folder so the path update below can find it
    folder_id = new_folder.odb_id

# Change the folder path, relative to the root of the S3 connection.
# The default is /${projectKey}/${odbId}
fld = project.get_managed_folder(folder_id)
fld_def = fld.get_definition()
fld_def['path'] = '/${projectKey}/${odbId}/testing'
fld.set_definition(fld_def)
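
For the original question about reading the files: once a managed folder points at the S3 location, a Python recipe can read from it through a `dataiku.Folder` handle, using `list_paths_in_partition()` and `get_download_stream()`. Below is a minimal sketch, assuming the folder contains UTF-8 CSV files. `read_csv_rows` is a hypothetical helper name (not part of the Dataiku API), and the folder name is the example one used above.

```python
import csv
import io


def read_csv_rows(folder, suffix=".csv"):
    """Return {path: list of row dicts} for every file in the folder ending in `suffix`.

    `folder` is expected to behave like a dataiku.Folder handle, i.e. to offer
    list_paths_in_partition() and get_download_stream(path).
    """
    rows_by_path = {}
    for path in folder.list_paths_in_partition():
        if not path.endswith(suffix):
            continue
        # The download stream is file-like; read it fully and decode as UTF-8 (assumption)
        with folder.get_download_stream(path) as stream:
            text = stream.read().decode("utf-8")
        rows_by_path[path] = list(csv.DictReader(io.StringIO(text, newline="")))
    return rows_by_path


# Inside a DSS Python recipe or notebook (the folder name is an assumption):
#   import dataiku
#   rows = read_csv_rows(dataiku.Folder("my_s3_managed_folder"))
```

If you prefer DataFrames, passing the stream to `pandas.read_csv` instead of `csv.DictReader` should also work.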

 

21 Replies
Sajid_Khan
Level 3

.
