Cumulative sum on rows by group
Hi Dataiku Folks,
Do you know if there is a visual way to go from table 1 to table 2? (All values are "1" to simplify the example.)
The output dataset should have two new columns: no aggregation, just a cumulative sum by two ids.
The Window recipe with custom aggregations does not seem able to create new columns, and Prepare recipes do not seem able to work across rows.
Is a code recipe the only way?
Operating system used: linux aws
Best Answer
-
Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,170 Neuron
See below. Also, your spreadsheet has a bug: the final val1_sum for z/v is 6, not 5 (you have two 2s).
Answers
-
You are right, I made a mistake. Thanks.
Thanks for your solution. We had misunderstood this passage of the official documentation:
Custom aggregations have specific limitations that need to be considered when using this feature. Firstly, custom aggregations must adhere to the structure of an aggregation. They cannot be simple calculations such as COLUMN_1 + 10 as this would be as defining a new column rather than an aggregation. Instead, they require aggregation operations to be applied as the result of your expression, using one of the available aggregations, such as count, countd, min, max, avg, or sum.
-
Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,170 Neuron
This has nothing to do with custom aggregations. You just had to set the partition and window settings properly, and the cumulative sum can then be calculated with the standard functions. You could also use custom aggregations, but these work on the window you define, so without those settings they would be useless. I recently replied to another thread where a custom aggregator was needed:
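Separately from that thread, for anyone who does end up in a code recipe, here is a minimal pandas sketch of the same idea: partition by the two ids and keep a running sum within each partition, with no aggregation of rows. The column names (id1, id2, val1, val2) and the sample rows are assumptions made up for illustration, not the original tables, and the rows are assumed to already be in the order you want the sum to accumulate.

```python
import pandas as pd

# Hypothetical sample data; names and values are illustrative assumptions.
df = pd.DataFrame({
    "id1": ["x", "x", "x", "z", "z", "z"],
    "id2": ["u", "u", "v", "v", "v", "v"],
    "val1": [1, 1, 1, 1, 1, 1],
    "val2": [1, 1, 1, 1, 1, 1],
})

# The two ids play the role of the partition columns.
keys = ["id1", "id2"]

# One new column per value column: a running total inside each (id1, id2)
# group, following the current row order. The row count is unchanged.
for col in ["val1", "val2"]:
    df[f"{col}_sum"] = df.groupby(keys)[col].cumsum()

print(df)
```

With this sample, val1_sum comes out as 1, 2 for x/u, 1 for x/v and 1, 2, 3 for z/v. If the order matters and the data is not already sorted, sort on an explicit order column first, much as the window needs an order to know what "previous rows" means.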