Dataframe pivot table from python doesn't save properly as Dataiku dataset

Festa
Festa Registered Posts: 5 ✭✭

Hi everyone,

I am creating a pivot table using pandas data frame in python.

df_pivot = pd.pivot_table(df, values=['Units'], index =['Name', 'City', 'State', 'Zip'], columns= ['Date'],
aggfunc={'Units': 'sum'}, margins = True, fill_value=0)

However, when I save it using the following:

dataset_new = dataiku.Dataset("Dataset Output")
dataset_new.write_with_schema(df_pivot)

The index columns (the ones you use to convert to rows in the pivot table, ['Name', 'City', 'State', 'Zip'] ) do not print out in the final Dataiku dataset. That is, I end up with a dataset that contains only the columns defined in the pivot table (columns= ['Date']). Why does this happen? How can I fix this?

Some workarounds I have tried and have not worked are:

  1. I tried to save the dataframe as Excel file, but I could not export it from the DSS Notebook to my computer.
  2. I thought I could save the dataframe as Exce and then convert it to a Dataiku dataset. However, l I could not write the Dataiku dataset with an Excel file. Is there a function that can do this?

I sincerely appreciate any adivce!

Operating system used: MACos

Operating system used: MACos

Operating system used: MACos

Answers

Setup Info
    Tags
      Help me…