
Code fails to generate the correct table column

Solved!
webzest
Level 2
Hello,

I am following the Python tutorial and have reached the step that creates a table. In the notebook, the code works and generates a three-column table that includes a customer_id column. When I switched to the code version and ran it, the resulting table did not have the customer_id column, which then causes an error in the next step of the tutorial. Below is the code that works correctly in the notebook but does not generate the customer_id column when run as code.

# -------------------------------------------------------------------------------- NOTEBOOK-CELL: CODE
# -*- coding: utf-8 -*-
import dataiku
import pandas as pd, numpy as np
from dataiku import pandasutils as pdu

# Read recipe inputs
orders = dataiku.Dataset("orders")
orders_df = orders.get_dataframe()


# Compute recipe outputs from inputs
# TODO: Replace this part by your actual code that computes the output, as a Pandas dataframe
# NB: DSS also supports other kinds of APIs for reading and writing data. Please see doc.

#orders_by_customer_df = orders_df # For this sample code, simply copy input to output

orders_by_customer_df = orders_df.assign(total=orders_df.tshirt_price*orders_df.tshirt_quantity
       ).groupby(by="customer_id"
                ).agg({"pages_visited":"mean",
                       "total":"sum"})


# Write recipe outputs
orders_by_customer = dataiku.Dataset("orders_by_customer")
orders_by_customer.write_with_schema(orders_by_customer_df)

# -------------------------------------------------------------------------------- NOTEBOOK-CELL: CODE
orders_by_customer = orders_df.assign(total=orders_df.tshirt_price*orders_df.tshirt_quantity
       ).groupby(by="customer_id"
                ).agg({"pages_visited":"mean",
                       "total":"sum"}).reset_index()

 

sergeyd
Dataiker

Hi,

Your code contains two group-by statements. The first one has no .reset_index(), and it is the one whose result is written to the output dataset. The second one, which correctly ends with .reset_index(), only does the grouping and is never written anywhere.

You just need to add .reset_index() to the first group-by and delete the second one.
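
To illustrate why .reset_index() matters here, the following is a minimal pandas-only sketch. The sample rows are made up for illustration (they are not the tutorial's data), and the Dataiku read/write calls are left out so the snippet runs anywhere. After a group-by, customer_id lives in the DataFrame index rather than among the regular columns, which would explain why it is missing from the written table; .reset_index() turns it back into a column.

```python
import pandas as pd

# Hypothetical sample rows standing in for the tutorial's "orders" dataset.
orders_df = pd.DataFrame({
    "customer_id": ["a", "a", "b"],
    "pages_visited": [2, 4, 6],
    "tshirt_price": [10.0, 20.0, 5.0],
    "tshirt_quantity": [1, 2, 3],
})

# Same aggregation as in the question, without .reset_index().
grouped = (
    orders_df
    .assign(total=orders_df.tshirt_price * orders_df.tshirt_quantity)
    .groupby(by="customer_id")
    .agg({"pages_visited": "mean", "total": "sum"})
)

# Here customer_id is the index, not a column.
without_reset = grouped

# .reset_index() moves customer_id back into a regular column.
with_reset = grouped.reset_index()

print(list(without_reset.columns))  # ['pages_visited', 'total']
print(list(with_reset.columns))     # ['customer_id', 'pages_visited', 'total']
```

In the recipe from the question, this would mean ending the expression assigned to orders_by_customer_df with .reset_index() before the write_with_schema call.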


webzest
Level 2
Author

Thank you for clearing this up...
