Hi,
How can we read S3 files using a Python recipe in Dataiku?
Thanks in advance.
Hi,
Could you provide a bit more context on what exactly you want to achieve? In most cases a managed folder can be created in the UI, so this does not need to be automated.
If you do wish to use the API to create the managed folder and specify its path, you can use this:
import dataiku
import pandas as pd, numpy as np
from dataiku import pandasutils as pdu
import dataikuapi

client = dataiku.api_client()
# Assumes this runs in a notebook or scenario within the project
project = client.get_default_project()
project_key = project.get_summary()['projectKey']
managed_folder_name = "my_s3_managed_folder"
s3_connection_name = "s3-test"

# The S3 connection should already exist; create the managed folder if it does not exist
try:
    folder_id = dataiku.Folder(managed_folder_name).get_id()
except Exception:
    print("creating folder")
    new_folder = project.create_managed_folder(managed_folder_name, connection_name=s3_connection_name)
    # Keep the id of the new folder so the code below can use it
    folder_id = new_folder.id

# Modify the path within the root path of the connection - the default is /${projectKey}/${odbId}
fld = dataikuapi.dss.managedfolder.DSSManagedFolder(client, project_key, folder_id)
fld_def = fld.get_definition()
# Replace the path, relative to the root of the S3 connection
fld_def['path'] = '/${projectKey}/${odbId}/testing'
fld.set_definition(fld_def)
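Once the managed folder points at your S3 location, reading the files from a Python recipe could look roughly like the sketch below. It is not an official snippet: `read_folder_files` is a hypothetical helper name, and it assumes the folder handle exposes `list_paths_in_partition()` and `get_download_stream(path)` the way `dataiku.Folder` does. The folder name is the one used above.

```python
def read_folder_files(folder, suffix=""):
    """Return {path: bytes} for every file in the folder whose path ends with suffix.

    `folder` is assumed to behave like a dataiku.Folder handle:
    list_paths_in_partition() lists file paths, and get_download_stream(path)
    returns a readable binary stream usable as a context manager.
    """
    contents = {}
    for path in folder.list_paths_in_partition():
        if not path.endswith(suffix):
            continue  # skip files that do not match the requested suffix
        with folder.get_download_stream(path) as stream:
            contents[path] = stream.read()
    return contents


# Inside a DSS Python recipe (only runs within DSS, so left commented out):
# import dataiku
# files = read_folder_files(dataiku.Folder("my_s3_managed_folder"), suffix=".csv")
```

Each value is the raw file content, so a CSV could then be loaded with, for example, `pd.read_csv(io.BytesIO(files[path]))`.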