Problem when using folder.upload_stream() to save files to Dataiku Folder

Blossom
Level 1

Hello community,

I'm developing a Dataiku recipe that saves a parquet file into a Dataiku folder.

First I call another service to get a dataframe, then convert the dataframe to parquet format. But after running the recipe, the parquet file is always 0 KB with nothing inside. I used folder.upload_stream(), provided by Dataiku, and I have verified that there is no problem with the dataframe:

import io
f = io.BytesIO()
df.to_parquet(f)

folder.upload_stream("name of file.parquet", f)

I don't know how to fix this problem. Has anyone run into the same issue?

Thank you for your help

3 Replies
Clément_Stenac

Hi,

The simplest would be to use the alternative form of to_parquet:

data = df.to_parquet()
folder.upload_data("name_of_file.parquet", data)
Blossom
Level 1
Author

Hi,

With this alternative, we need to specify a path, and the parquet file is generated at that path. The method itself returns None:

df.to_parquet(path="file_name.parquet")

I don't know if this will work with Dataiku. Even if it works, it will create an extra file somewhere.

Do you know of other ways to solve the problem?

Thanks a lot

Blossom
Level 1
Author

Sorry, I found the answer. It's a Python question rather than a Dataiku problem:

folder.upload_stream("name of file.parquet", f.getvalue())
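For anyone who lands here later, the likely cause of the 0 KB file is that writing into a BytesIO leaves its position at the end of the buffer, so anything that then reads from the stream gets nothing. Calling getvalue() sidesteps this because it returns the whole buffer regardless of position; rewinding with seek(0) before uploading should work too. Below is a minimal sketch of that behaviour using only the standard library. FakeFolder is a hypothetical stand-in I made up for illustration (it is not the Dataiku API), and the written bytes stand in for what df.to_parquet(buf) would produce.

```python
import io


class FakeFolder:
    """Hypothetical stand-in for a folder object; it only records what it reads."""

    def __init__(self):
        self.files = {}

    def upload_stream(self, path, f):
        # Reads from the stream's current position to the end,
        # which is an assumption about how a stream consumer behaves.
        self.files[path] = f.read()


folder = FakeFolder()
payload = b"PAR1 pretend parquet bytes"

buf = io.BytesIO()
buf.write(payload)  # stands in for df.to_parquet(buf); position is now at the end

folder.upload_stream("unrewound.parquet", buf)  # nothing left to read
buf.seek(0)                                     # rewind to the start of the buffer
folder.upload_stream("rewound.parquet", buf)    # the whole payload is read

empty_size = len(folder.files["unrewound.parquet"])
full_size = len(folder.files["rewound.parquet"])
print(empty_size, full_size)
```

In the original recipe this would amount to adding f.seek(0) between df.to_parquet(f) and the folder.upload_stream(...) call, assuming upload_stream reads from the current position as the stand-in does.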

Thank you for your time
