You now have until September 15th to submit your use case or success story to the 2022 Dataiku Frontrunner Awards!ENTER YOUR SUBMISSION

Save pandas dataframe to .xlsx in managed S3 folder

Solved!
Mario_Burbano
Level 2
Save pandas dataframe to .xlsx in managed S3 folder

Dear all, 

In a similar vein to the question Save pandas dataframe to .csv in managed S3 folder I would like to know how to write an excel file to the same type of managed S3 folder. I have successfully been able to write CSVs, as well as, images (as explained here ) to said folder, but when I attempt to save my pandas dataframe to an excel file using the same syntax as for the CSV:

with handler.get_writer("my_df.xlsx") as w:
    df.to_excel(w)

I get the following error:

AttributeError: ManagedFolderWriter instance has no attribute 'tell'

Could anybody help me with this issue?

Regards, 

Mario

1 Solution
Mario_Burbano
Level 2
Author

Nevermind, the solution was right under my nose. In fact, the contents of the dataframe must first be written to a BytesIO object which then is uploaded to the managed folder:

buf = io.BytesIO()
df.to_excel(buf)

handler.upload_stream('my_df.xlsx', buf.getvalue())
buf.close()

So, much like saving an image to the same type of folder. 

View solution in original post

4 Replies
Mario_Burbano
Level 2
Author

Nevermind, the solution was right under my nose. In fact, the contents of the dataframe must first be written to a BytesIO object which then is uploaded to the managed folder:

buf = io.BytesIO()
df.to_excel(buf)

handler.upload_stream('my_df.xlsx', buf.getvalue())
buf.close()

So, much like saving an image to the same type of folder. 

fahim2605
Level 1

Hi @Mario_Burbano . Thank you for your solution. Is there a way I can expand this to write multiple dataframes to different sheets within the excel workbook. So far I got:

import io
buf = io.BytesIO()

df1.to_excel(buf, sheet_name='Sheet1', encoding='utf-8', index = None, header = True)

test_S3_folder_demo.upload_stream('my_df.xlsx', buf.getvalue())
buf.close()

The above works, now how do I add a 'Sheet2' with data from a data frame 'df2'?

Could you please help?

Mario_Burbano
Level 2
Author

I suggest you take a look at the following link, which illustrates how to work with multiple sheets in a single document. 

Regards