Hello community,
I'm developing a Dataiku recipe that saves a parquet file into a Dataiku folder.
First I call another service to get a dataframe, then I convert the dataframe to parquet format. But after running the recipe, the parquet file is always 0KB with nothing inside. I used folder.upload_stream(), provided by Dataiku, and I have verified that there is no problem with the dataframe itself.
import io
f = io.BytesIO()
df.to_parquet(f)
folder.upload_stream("name of file.parquet", f)
I don't know how to fix this problem. Has anyone run into the same issue?
Thank you for your help
Hi,
The simplest would be to use the alternative form of to_parquet, called without a path, which in recent pandas versions returns the file contents as bytes:
data = df.to_parquet()
folder.upload_data("name_of_file.parquet", data)
Hi,
For this alternative, it seems we need to specify a path, and the parquet file is then written to that path. The method itself returns None:
df.to_parquet(path="file_name.parquet")
I don't know if this will work with Dataiku. Even if it works, it will create an extra file somewhere.
Do you know of other ways to solve the problem?
Thanks a lot
Sorry, I found the answer. It's a Python question rather than a Dataiku problem:
folder.upload_stream("name of file.parquet", f.getvalue())
Thank you for your time
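For anyone landing here later, a likely explanation for the 0KB file: after df.to_parquet(f) writes into the BytesIO, the stream position sits at the end of the buffer, so anything that reads from the current position gets nothing. getvalue() returns the whole buffer regardless of position, and f.seek(0) before the upload should have the same effect. The sketch below uses only the standard library. FakeFolder is a hypothetical stand-in, not the Dataiku API, and it assumes the uploader reads from the stream's current position; payload is placeholder bytes standing in for what to_parquet would write.

```python
import io


class FakeFolder:
    """Hypothetical stand-in for a managed folder (NOT the Dataiku API)."""

    def __init__(self):
        self.files = {}

    def upload_stream(self, path, stream):
        # Assumption: the uploader reads from the stream's current position.
        self.files[path] = stream.read()


folder = FakeFolder()
payload = b"PAR1 placeholder bytes"  # stands in for the output of df.to_parquet(f)

f = io.BytesIO()
f.write(payload)  # the position is now at the end of the buffer

# Uploading without rewinding: read() starts at the end and returns nothing.
folder.upload_stream("not_rewound.parquet", f)

# Rewinding first makes the full content available to the reader.
f.seek(0)
folder.upload_stream("rewound.parquet", f)

empty_size = len(folder.files["not_rewound.parquet"])
full_size = len(folder.files["rewound.parquet"])

# getvalue() ignores the current position and returns the whole buffer.
same_as_getvalue = f.getvalue() == payload
```

In the original recipe, the equivalent change would be to call f.seek(0) between df.to_parquet(f) and folder.upload_stream(...), or to pass f.getvalue() as in the accepted answer.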