How to change a file name while writing file in S3 or exporting in managed folder using dataiku

Solved!
pritam003
Level 2

Hi Team,

Can you please let us know how to change the file name when:

  • writing to S3
  • exporting to a managed folder

Regards,

pritam

1 Solution
AlexT
Dataiker

Hi @pritam003,

Changing the file name when using the Visual Export to Folder recipe is not possible.

Instead, you can use a Python recipe and the managed folder read/write APIs. Create a Python recipe with a managed folder (backed by S3) as its output.

import dataiku

# ID of the output managed folder (backed by S3)
managed_folder_id = "URKU7Oqb"

# Read the input dataset into a pandas DataFrame
my_dataset = dataiku.Dataset("customers_labeled_prepared")
df = my_dataset.get_dataframe()

# Write the DataFrame to the folder as CSV, under the file name of your choice
output_folder = dataiku.Folder(managed_folder_id)
output_folder.upload_stream("some_name.csv", df.to_csv(index=False).encode("utf-8"))
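
If the file name has to change on every run (for example, to carry a date), the same approach can be wrapped in two small helpers. This is only a sketch: `build_file_name` and `write_dataframe_as` are hypothetical names, not part of the Dataiku API, and `folder` is assumed to be a `dataiku.Folder` handle like the one created in the recipe.

```python
import io
from datetime import date


def build_file_name(prefix, day, extension="csv"):
    """Return a name such as 'customers_20230224.csv' (hypothetical helper)."""
    return f"{prefix}_{day:%Y%m%d}.{extension}"


def write_dataframe_as(folder, file_name, df):
    """Serialize df to CSV and upload it to the folder under file_name.

    folder is assumed to be a dataiku.Folder handle and df a pandas DataFrame.
    Returns the number of bytes uploaded.
    """
    payload = df.to_csv(index=False).encode("utf-8")
    # Wrap the bytes in a file-like object before handing them to upload_stream
    folder.upload_stream(file_name, io.BytesIO(payload))
    return len(payload)
```

Inside the recipe, this could be called as `write_dataframe_as(output_folder, build_file_name("customers", date.today()), df)`.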

 

pritam003
Level 2
Author

Thanks @AlexT, this helped a lot.

Can you please let us know how to create a managed folder using the Python API and get the managed folder ID from it?

TIA

pritam
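
One possible way to do this, offered as a hedged sketch rather than an official answer: a project handle from the public API client can list existing managed folders and create new ones. The calls used below (`list_managed_folders`, `create_managed_folder`, and the `id` attribute of the returned handle) are written from memory of the `dataikuapi` package and should be checked against the documentation for your DSS version. `get_or_create_folder_id` is a hypothetical helper name.

```python
def get_or_create_folder_id(project, name, connection_name="filesystem_folders"):
    """Return the ID of the managed folder called name, creating it if needed.

    project is assumed to be a dataikuapi project handle, for example the
    result of dataiku.api_client().get_default_project() inside DSS.
    """
    # Assumption: list_managed_folders() returns dicts with "id" and "name" keys
    for folder in project.list_managed_folders():
        if folder["name"] == name:
            return folder["id"]
    # Assumption: pass the name of an S3 connection to store the folder on S3
    created = project.create_managed_folder(name, connection_name=connection_name)
    return created.id
```

The returned ID could then be passed to `dataiku.Folder(...)` in the same way as the hard-coded ID in the accepted solution.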

Nainish09
Level 2

Hi @AlexT,

I tried the same code for my Excel workbook, which is stored in a folder of an S3 bucket, but I am getting an error.

 

import dataiku
import pandas as pd, numpy as np
from dataiku import pandasutils as pdu
managed_folder_id = "Uw9PLtD4"

# Read dataset convert df
my_dataset = dataiku.Dataset("Information_20230224.xlsx")
df = my_dataset.get_dataframe()
df

 

Error:

Exception: Unable to fetch schema for Information_20230224.xlsx: b'Failed to read project permissions, caused by: NotFoundException: Project does not exist: Information_20230224'

I basically want to import my Excel workbook, which is stored in a folder in an S3 bucket. I need the whole workbook, with all its sheets, for further processing.

Please help me with this.
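
A likely cause of this error, stated as an assumption: `dataiku.Dataset` expects the name of a DSS dataset, and a dot in that name is read as `PROJECT_KEY.dataset_name`, which would explain why the message mentions a project called `Information_20230224`. A file that lives in a managed folder can instead be read through the folder handle. The sketch below assumes `folder` is `dataiku.Folder("Uw9PLtD4")` and that the workbook sits at the root of that folder. `fetch_file_bytes` and `read_all_sheets` are hypothetical helper names.

```python
import io


def fetch_file_bytes(folder, path):
    """Download one file from a managed folder and return its raw bytes."""
    with folder.get_download_stream(path) as stream:
        return stream.read()


def read_all_sheets(folder, path):
    """Return a dict mapping each sheet name to a pandas DataFrame."""
    # Imported here so that fetch_file_bytes stays usable without pandas
    import pandas as pd

    data = fetch_file_bytes(folder, path)
    # sheet_name=None asks pandas to load every sheet of the workbook
    return pd.read_excel(io.BytesIO(data), sheet_name=None)
```

In a recipe or notebook this might be used as `sheets = read_all_sheets(dataiku.Folder("Uw9PLtD4"), "Information_20230224.xlsx")`. Reading `.xlsx` files with pandas also needs an Excel engine such as `openpyxl` in the code environment.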
