Announcing the winners & finalists of the Dataiku Frontrunner Awards 2021! Read their inspiring stories

Save pandas dataframe to .xlsx in managed S3 folder

Solved!
Mario_Burbano
Level 2
Save pandas dataframe to .xlsx in managed S3 folder

Dear all, 

In a similar vein to the question Save pandas dataframe to .csv in managed S3 folder I would like to know how to write an excel file to the same type of managed S3 folder. I have successfully been able to write CSVs, as well as, images (as explained here ) to said folder, but when I attempt to save my pandas dataframe to an excel file using the same syntax as for the CSV:

with handler.get_writer("my_df.xlsx") as w:
    df.to_excel(w)

I get the following error:

AttributeError: ManagedFolderWriter instance has no attribute 'tell'

Could anybody help me with this issue?

Regards, 

Mario

1 Solution
Mario_Burbano
Level 2
Author

Nevermind, the solution was right under my nose. In fact, the contents of the dataframe must first be written to a BytesIO object which then is uploaded to the managed folder:

buf = io.BytesIO()
df.to_excel(buf)

handler.upload_stream('my_df.xlsx', buf.getvalue())
buf.close()

So, much like saving an image to the same type of folder. 

View solution in original post

3 Replies
Mario_Burbano
Level 2
Author

Nevermind, the solution was right under my nose. In fact, the contents of the dataframe must first be written to a BytesIO object which then is uploaded to the managed folder:

buf = io.BytesIO()
df.to_excel(buf)

handler.upload_stream('my_df.xlsx', buf.getvalue())
buf.close()

So, much like saving an image to the same type of folder. 

View solution in original post

fahim2605
Level 1

Hi @Mario_Burbano . Thank you for your solution. Is there a way I can expand this to write multiple dataframes to different sheets within the excel workbook. So far I got:

import io
buf = io.BytesIO()

df1.to_excel(buf, sheet_name='Sheet1', encoding='utf-8', index = None, header = True)

test_S3_folder_demo.upload_stream('my_df.xlsx', buf.getvalue())
buf.close()

The above works, now how do I add a 'Sheet2' with data from a data frame 'df2'?

Could you please help?

Mario_Burbano
Level 2
Author

I suggest you take a look at the following link, which illustrates how to work with multiple sheets in a single document. 

Regards

A banner prompting to get Dataiku DSS