Save pandas dataframe to .xlsx in managed S3 folder
Dear all,
In a similar vein to the question Save pandas dataframe to .csv in managed S3 folder I would like to know how to write an excel file to the same type of managed S3 folder. I have successfully been able to write CSVs, as well as, images (as explained here ) to said folder, but when I attempt to save my pandas dataframe to an excel file using the same syntax as for the CSV:
with handler.get_writer("my_df.xlsx") as w:
df.to_excel(w)
I get the following error:
AttributeError: ManagedFolderWriter instance has no attribute 'tell'
Could anybody help me with this issue?
Regards,
Mario
Best Answer
-
Nevermind, the solution was right under my nose. In fact, the contents of the dataframe must first be written to a BytesIO object which then is uploaded to the managed folder:
buf = io.BytesIO()
df.to_excel(buf)handler.upload_stream('my_df.xlsx', buf.getvalue())
buf.close()So, much like saving an image to the same type of folder.
Answers
-
Hi @Mario_Burbano
. Thank you for your solution. Is there a way I can expand this to write multiple dataframes to different sheets within the excel workbook. So far I got:import io
buf = io.BytesIO()df1.to_excel(buf, sheet_name='Sheet1', encoding='utf-8', index = None, header = True)
test_S3_folder_demo.upload_stream('my_df.xlsx', buf.getvalue())
buf.close()The above works, now how do I add a 'Sheet2' with data from a data frame 'df2'?
Could you please help?
-
I suggest you take a look at the following link, which illustrates how to work with multiple sheets in a single document.
Regards
-
Just posted a solution if you want: https://community.dataiku.com/t5/Plugins-Extending-Dataiku/Multi-worksheet-xlsx-Exporter/td-p/23617