Running Total in a Dataset
How can I create a running total in a given dataset in Dataiku? By that I mean an additional column in which the values from another numeric column in the same dataset are accumulated, record by record, over a defined range of records. The Window recipe seemed like the way to do that, but it doesn't appear to have an option for accumulating. When I tried it, I could calculate a sum over the selected range of records, but that sum populated every record in the new column instead of accumulating a running total record by record. Would this require some code?
Best Answer
-
Ashley Dataiker, Alpha Tester, Dataiku DSS Core Designer, Registered, Product Ideas Manager Posts: 163 Dataiker
Hi @ptavares,
You were on the right track with the Window recipe! To get a running total, you will need to set up your window frame in the following way:
- use an order column
- toggle on the 'window frame' option, then select limit following rows and set it to zero.
This will give you a window frame that is unbounded on one side (the preceding rows) and bounded on the other (the following rows), and therefore a cumulative/running calculation. To make a moving calculation, like a moving average, you'd set a bound on both sides. To get a 'total' calculation, like you'd get from a group by, leave the window frame unbounded on both sides.
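To make the three frame shapes concrete, here is a minimal sketch in plain Python. It is not what the Window recipe runs internally; it only mimics the frame logic on a tiny made-up dataset. The field names (`day`, `amount`) and the moving-window size of 3 rows are assumptions chosen for illustration.

```python
from itertools import accumulate

# Hypothetical records as (day, amount); names and values are illustrative only.
rows = [(3, 30), (1, 10), (2, 20), (4, 40)]

# Order column: a frame is only meaningful once the rows are sorted.
rows.sort(key=lambda r: r[0])
amounts = [amount for _, amount in rows]

# Unbounded preceding rows, zero following rows: running total.
running_total = list(accumulate(amounts))

# Bounded on both sides (2 preceding rows, 0 following rows): moving average.
moving_avg = []
for i in range(len(amounts)):
    frame = amounts[max(0, i - 2): i + 1]
    moving_avg.append(sum(frame) / len(frame))

# Unbounded on both sides: the same overall total repeated on every row.
grand_total = [sum(amounts)] * len(amounts)

for (day, amount), rt, ma, gt in zip(rows, running_total, moving_avg, grand_total):
    print(day, amount, rt, ma, gt)
```

The last column reproduces the behaviour described in the question (one sum repeated on every record), which is what an unbounded frame gives; limiting the following rows to zero is what turns it into the running total.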
LMK if this helps!
Cheers,
Ashley
Answers
-
ptavares Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 3 ✭
Hi AshleyW,
It worked magically. Many thanks for the help.
Paulo
-
Sean Dataiker, Alpha Tester, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer Posts: 168 Dataiker
Hi @ptavares, to echo @AshleyW's answer, you may be interested in this tutorial on the Window recipe. It includes a cumulative sum example.