Dataframe pivot table from python doesn't save properly as Dataiku dataset

Festa · September 2024

Hi everyone,

I am creating a pivot table using pandas data frame in python.

df_pivot = pd.pivot_table(df, values=['Units'], index =['Name', 'City', 'State', 'Zip'], columns= ['Date'],
aggfunc={'Units': 'sum'}, margins = True, fill_value=0)

However, when I save it using the following:

dataset_new = dataiku.Dataset("Dataset Output")
dataset_new.write_with_schema(df_pivot)

The index columns (the ones you use to convert to rows in the pivot table, ['Name', 'City', 'State', 'Zip'] ) do not print out in the final Dataiku dataset. That is, I end up with a dataset that contains only the columns defined in the pivot table (columns= ['Date']). Why does this happen? How can I fix this?

Some workarounds I have tried and have not worked are:

I tried to save the dataframe as Excel file, but I could not export it from the DSS Notebook to my computer.
I thought I could save the dataframe as Exce and then convert it to a Dataiku dataset. However, l I could not write the Dataiku dataset with an Excel file. Is there a function that can do this?

I sincerely appreciate any adivce!

Operating system used: MACos

Ignacio_Toledo · September 2024

https://community.dataiku.com/discussion/44271/dataframe-pivot-table-from-python-doesnt-save-properly-as-dataiku-dataset

Hello @Festa,

Have you tried doing this?

dataset_new.write_with_schema(df_pivot.reset_index())

I think the Pandas DataFrame indexes are not saved to the dataframe, so you have to set them back as columns.

Festa · September 2024

https://community.dataiku.com/discussion/comment/44528#Comment_44528

Hi @Ignacio_Toledo!

Thank you so much for your suggestion. It worked!

Dataframe pivot table from python doesn't save properly as Dataiku dataset

Answers

Categories

Setup Info

Tags