How to change a file name while writing a file to S3 or exporting to a managed folder using Dataiku
Hi Team
Can you please let us know how to change the file name
- while writing to S3
- while exporting to a managed folder.
regards,
pritam
Best Answer
-
Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,226 Dataiker
Hi @pritam003,
Changing the file name when using the visual Export to Folder recipe is not possible.
You can use a Python recipe with the managed folder read/write APIs instead. Create a Python recipe with a managed folder (backed by S3) as the output:
import dataiku
import pandas as pd, numpy as np
from dataiku import pandasutils as pdu

managed_folder_id = "URKU7Oqb"

# Read the dataset into a DataFrame
my_dataset = dataiku.Dataset("customers_labeled_prepared")
df = my_dataset.get_dataframe()

# Write recipe outputs
output_folder = dataiku.Folder(managed_folder_id)
output_folder.upload_stream("some_name.csv", df.to_csv(index=False).encode("utf-8"))
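If the file name has to change per run (for example, to carry a date suffix), the same approach can be wrapped in a small helper. This is only a sketch: `timestamped_name` and `upload_dataframe_as` are hypothetical helper names, not part of the Dataiku API, and `folder` is assumed to be a `dataiku.Folder` (any object with an `upload_stream(path, file_like)` method will do).

```python
import io
from datetime import date


def timestamped_name(prefix, day, extension="csv"):
    """Build a name such as 'customers_20230224.csv' (hypothetical naming scheme)."""
    return "{}_{}.{}".format(prefix, day.strftime("%Y%m%d"), extension)


def upload_dataframe_as(folder, df, file_name):
    """Serialise df to CSV and store it in the folder under file_name."""
    payload = df.to_csv(index=False).encode("utf-8")
    # Wrap the bytes in a file-like object before handing them to upload_stream.
    folder.upload_stream(file_name, io.BytesIO(payload))
    return file_name


# Inside a DSS Python recipe (assumed usage, reusing the names from the answer above):
#   import dataiku
#   df = dataiku.Dataset("customers_labeled_prepared").get_dataframe()
#   folder = dataiku.Folder("URKU7Oqb")
#   upload_dataframe_as(folder, df, timestamped_name("customers", date.today()))
```

The file name is just the first argument of `upload_stream`, so any string built in the recipe can be used, including one with a sub-path inside the folder.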
Answers
-
Thanks @AlexT,
this helped a lot. Can you please let us know how we can create a managed folder using the Python API and get the managed folder ID through it?
TIA
pritam
-
Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,226 Dataiker
Please see solution provided here: https://community.dataiku.com/t5/Using-Dataiku/is-there-any-way-to-create-managed-folder-by-using-the-python/m-p/20122
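For readers who want a starting point before following the link, here is a minimal sketch of one way to do this with the public API client. It is not a copy of the linked solution. `get_or_create_folder_id` is a hypothetical helper name, `project` is assumed to be the object returned by `dataiku.api_client().get_default_project()`, and `"my-s3-connection"` is a placeholder. For a non-filesystem connection, `create_managed_folder` may also need its `folder_type` argument set; that is not covered here.

```python
def get_or_create_folder_id(project, folder_name, connection_name="filesystem_folders"):
    """Return the ID of the managed folder called folder_name, creating it if missing."""
    # list_managed_folders() is assumed to return one dict per folder,
    # each carrying at least "name" and "id".
    for existing in project.list_managed_folders():
        if existing.get("name") == folder_name:
            return existing["id"]
    created = project.create_managed_folder(folder_name, connection_name=connection_name)
    return created.id


# Inside DSS (assumed usage; "my-s3-connection" is a placeholder connection name):
#   import dataiku
#   project = dataiku.api_client().get_default_project()
#   folder_id = get_or_create_folder_id(project, "exports", "my-s3-connection")
#   output_folder = dataiku.Folder(folder_id)
```

Looking up the folder by name first avoids creating a second folder with the same name each time the code runs.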
Thanks!
-
Hi @AlexT,
I tried the same code for my Excel workbook stored in a folder of an S3 bucket, but I am getting this error:
Exception: Unable to fetch schema for Information_20230224.xlsx: b'Failed to read project permissions, caused by: NotFoundException: Project does not exist: Information_20230224'
This is the code I ran:
import dataiku
import pandas as pd, numpy as np
from dataiku import pandasutils as pdu

managed_folder_id = "Uw9PLtD4"

# Read dataset convert df
my_dataset = dataiku.Dataset("Information_20230224.xlsx")
df = my_dataset.get_dataframe()
df
I basically want to import my Excel workbook stored in a folder in an S3 bucket. I need the whole workbook, with all its sheets, for further processing.
Please help me with this.
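One likely cause, offered as a sketch rather than a confirmed fix: `dataiku.Dataset` expects a dataset name, and a dot in that name is read as `PROJECT_KEY.dataset_name`, which would explain the "Project does not exist: Information_20230224" message. A file inside a managed folder is usually read through `dataiku.Folder` instead. In the code below, `download_to_buffer` and `read_all_sheets` are hypothetical helper names, `folder` is assumed to be a `dataiku.Folder`, and `pandas.read_excel` needs an Excel engine such as openpyxl to be installed in the code environment.

```python
import io


def download_to_buffer(folder, path):
    """Copy one file of a managed folder into an in-memory, seekable buffer."""
    with folder.get_download_stream(path) as stream:
        return io.BytesIO(stream.read())


def read_all_sheets(folder, path):
    """Return a dict mapping each sheet name of the workbook to a DataFrame."""
    import pandas as pd  # imported here so download_to_buffer works without pandas

    # sheet_name=None asks pandas to load every sheet of the workbook.
    return pd.read_excel(download_to_buffer(folder, path), sheet_name=None)


# Inside a DSS Python recipe or notebook (assumed usage, names from the post above):
#   import dataiku
#   folder = dataiku.Folder("Uw9PLtD4")
#   sheets = read_all_sheets(folder, "Information_20230224.xlsx")
#   for sheet_name, sheet_df in sheets.items():
#       print(sheet_name, sheet_df.shape)
```

The path passed to `get_download_stream` is relative to the root of the managed folder, so a workbook stored in a sub-directory needs that sub-directory in the path.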