Group By - Sum is not what is expected

Luongp
Level 1
I imported a dataset from Excel with 190k rows, each row representing one shipment. Each shipment has an associated customer number, revenue, and cost. After importing, I made sure my aggregation columns are of integer type.

When I use the Group By recipe with customer number as the group key and sum as the aggregation for revenue and cost, the summed revenue and cost in the output dataset are far off from what I get when I calculate them manually in Excel. The row count per customer number matches my Excel dataset, but somewhere the cost and revenue values are being lost when the sums are aggregated.

AlexT
Dataiker

Hi @Luongp,
What engine are you using, what data type are your revenue columns, and what DSS version are you using?

This is unexpected. Could you submit a support ticket with the job diagnostics from the Group By recipe, along with screenshots showing the difference between the DSS output and your manual calculations?

https://doc.dataiku.com/dss/latest/troubleshooting/obtaining-support.html
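
In the meantime, a quick check outside DSS can show whether the data type alone could explain the gap. The sketch below is only an illustration under assumptions: the sample rows, the `group_sum` helper, and the idea that the original Excel values contain decimals are all hypothetical and not confirmed for this dataset. It sums revenue and cost per customer twice, once with the exact decimal values and once after truncating each value to an integer, so the two results can be compared.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical sample rows: (customer_number, revenue, cost) as text from the source file.
rows = [
    ("C001", "100.75", "40.50"),
    ("C001", "200.50", "80.25"),
    ("C002", "50.99", "20.10"),
]

def group_sum(rows, convert):
    """Sum revenue and cost per customer after applying `convert` to each value."""
    totals = defaultdict(lambda: [0, 0])
    for customer, revenue, cost in rows:
        totals[customer][0] += convert(revenue)
        totals[customer][1] += convert(cost)
    return {customer: tuple(sums) for customer, sums in totals.items()}

# Exact sums, keeping the decimal part of every value.
exact = group_sum(rows, Decimal)

# Sums after each value is truncated to an integer (assumed effect of an integer column).
truncated = group_sum(rows, lambda value: int(Decimal(value)))

for customer in exact:
    print(customer, "exact:", exact[customer], "truncated:", truncated[customer])
```

If the truncated sums resemble what the recipe produces while the exact sums match Excel, the integer column type is a likely suspect; if neither matches, the job diagnostics will be more informative.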

Thanks