New to Dataiku DSS? Try out our NEW Quick Start Programs today and get onboarded on the product in just one hour! Let's go

window sum around date

Solved!
DavidALI
Level 2
window sum around date

Hi, 

I have a table with columns Name , Birth_Date ( year) , count

Name         Birth_Date        count

Richard       1978                  3

Richard      1981                   2

Mary           1956                   2

I would like to calculate for each name the sum of count for the Birth_date and during a period of five years before and five years after the Birth_Date ( for example : how many Richard are born during the period 1973-1983 )

The results would be something like :

Name         Birth_Date        count      Birth_date_frame_time_10         count_10_years_frame_time

Richard       1978                  3                   1973-1983                                          39

Richard      1981                   5                   1976-1986                                           23

Mary           1956                   4                   1951-1961                                          34

 

I tried the window recipe, partition by name, order by Birth_date, window frame ( limit window on a value of range of the ordered column, use lower bound : - 5 , use upper bound : 5)

but it results in count_10_years_frame_time wich are sometimes lower of  count , like that : 

Name         Birth_Date        count      Birth_date_frame_time_10          count_10_years_frame_time

Richard       1978                  3                      1973-1983                                          2

Richard      1981                   5                      1976-1986                                          20

Mary           1956                   4                       1951-1961                                          3

Thank you for your precious help

David

 

 

 

 

0 Kudos
1 Solution
fchataigner2
Dataiker
Dataiker

Hi,

using lower bound = 5 and upper bound = 5 translates into "all rows where d - 5 <= Birth_Date <= d + 5 ,  with d being the value of Birth_Date for the row being processed", so it's a window of 11 years actually. Note that lower bound = -5 is also valid and translated into "all rows where d - (-5) <= Birth_Date <= d + 5" which is actually "all rows where Birth_Date = d + 5" 

View solution in original post

0 Kudos
2 Replies
fchataigner2
Dataiker
Dataiker

Hi,

using lower bound = 5 and upper bound = 5 translates into "all rows where d - 5 <= Birth_Date <= d + 5 ,  with d being the value of Birth_Date for the row being processed", so it's a window of 11 years actually. Note that lower bound = -5 is also valid and translated into "all rows where d - (-5) <= Birth_Date <= d + 5" which is actually "all rows where Birth_Date = d + 5" 

View solution in original post

0 Kudos
DavidALI
Level 2
Author

Thank you !

I would never had guessed alone that the minus is implicit in the lower bound parameter.

 

0 Kudos
A banner prompting to get Dataiku DSS