Sum for the last 12 months
Hello,
I want to get the sum of revenues for the last 12 months for an id.
The problem is for some month i dont have data (so empty row) so i can't use the windows frame=12.
Do you any idea to solve it ?
For your information i have already see this post:
Solved: Using Window recipe and moving average - Dataiku Community
But the problem in my example is i can have empty data for 1 month and in this case i just want the last 11 month sum (example id=2)
I have reproduced on Excel the output I want:
id | date | revenu | Expected compute |
1 | 202012 | 10 | 10 |
1 | 202011 | 20 | 30 |
1 | 202101 | 30 | 60 |
1 | 202102 | 40 | 100 |
1 | 202103 | 50 | 150 |
1 | 202104 | 60 | 210 |
1 | 202105 | 70 | 280 |
1 | 202106 | 80 | 360 |
1 | 202107 | 90 | 450 |
1 | 202108 | 100 | 550 |
1 | 202109 | 110 | 660 |
1 | 202110 | 120 | 780 |
1 | 202111 | 130 | 900 |
1 | 202112 | 140 | 1020 |
2 | 202012 | 5 | 5 |
2 | 202011 | 15 | 20 |
2 | 202101 | 25 | 45 |
2 | 202103 | 45 | 90 |
2 | 202104 | 55 | 135 |
2 | 202105 | 65 | 200 |
2 | 202106 | 75 | 275 |
2 | 202107 | 85 | 360 |
2 | 202108 | 95 | 455 |
2 | 202109 | 105 | 560 |
2 | 202110 | 115 | 675 |
2 | 202112 | 135 | 800 |
Thanks for your time
Ineedi2
Best Answer
-
Manuel Alpha Tester, Dataiker Alumni, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Dataiku DSS Adv Designer, Registered Posts: 193 ✭✭✭✭✭✭✭
Hi,
One suggestion is to try to define the Window based on value and not number of rows, with "limit window on a value range of the order column" option (see attached image), such as 12 months. You will need to parse your dates to use as ordering column. See also the example in this video, https://www.youtube.com/watch?v=mL4OxXK8-NY
Another suggestion is to use the time series plugin and the resampling recipe to interpolate the missing months as zeros (see attached image), which you can then follow with your window recipe.
I hope this helps.
Answers
-
Thank you for your answer. I have found a personal approach, however, I still tested your first method which did not give the expected results.
I think I have set up the Windows recipe wrong, I will try again this morning.
Thanks
-
-
hi,
my date column has monthly level data that to start of the month. when i perform resampling plugin my dates are converting to end of the month. please see the below conversion. resampling should occur on the start of the month only. could anyone help?
2022-06-01 -->2022-05-31
2022-05-01-->2022-04-30