Sign up to take part
Registered users can ask their own questions, contribute to discussions, and be part of the Community!
Registered users can ask their own questions, contribute to discussions, and be part of the Community!
Hello -
I'm trying to calculate the percentage of revenue of the total revenue for each region, which requires me knowing the total revenue.
After all my cleansing and recipes, I'm left with this table:
I now want to calculate each regions percentage contribution to total revenue (adding up all the Revenue_Sum column).
In Alteryx, I'd use the summarize node to calculate the total and then append it back onto the table so that I can then easily do a calculation on each row.
How do I calculate a column total?
I'm not sure why I'm so tripped up on a seemingly simple problem but I think it's because my mind is still stuck a little in the "Alteryx way of doing things"...so I appreciate any help and assistance being new to Dataiku.
Thanks
Chris
Hi,
Aggregations are not directly done in the Prepare recipe. Instead, they are done either in the "Group" recipe or the "Window" recipe.
Your use case seems to be a small variant of the second one: you want a dataset with as many rows as previously, and just add a column that is the sum of revenue for all sales areas. For that, you would use a Window recipe "window: unbounded" and "Aggregate: SUM of Total revenue". See how this time we did not "partition" it, so it will compute the global sum instead of a sum per "partition".
Grouping and Windowing are quite powerful and can do a lot of different things, so we recommend reading the linked articles. The big difference to remember is:
Hope this helps,
Hi,
Aggregations are not directly done in the Prepare recipe. Instead, they are done either in the "Group" recipe or the "Window" recipe.
Your use case seems to be a small variant of the second one: you want a dataset with as many rows as previously, and just add a column that is the sum of revenue for all sales areas. For that, you would use a Window recipe "window: unbounded" and "Aggregate: SUM of Total revenue". See how this time we did not "partition" it, so it will compute the global sum instead of a sum per "partition".
Grouping and Windowing are quite powerful and can do a lot of different things, so we recommend reading the linked articles. The big difference to remember is:
Hope this helps,
Hello - Wow, thank you for that detailed explanation and the links. It's opened my eyes to a bunch of new possibilities!
You've perfectly identified my use case with the suggestion of the window being unbounded so I can add a column that is the sum of revenue for all sales areas.
However, I seem to be only able to get the CUMULATIVE sum, and not the Total sum for all rows:
Here is my Window definition:
Here is my aggregation:
And with that, the output actually adds a cumulative sum:
So it's close...but I would need the total sum on each row.
Any pointers on what I'm doing wrong is greatly appreciated.
Thanks again
Update: Thanks @Clรฉment_Stenac I got it to work by toggling the Window Frame option. While I don't fully understand the explanation, this post by @fchataigner2 references the issue. I have some more reading to do to fully grasp this but it seems like it is working now.