Public

# window sum around date

Options
Registered Posts: 18 ✭✭✭✭✭

Hi,

I have a table with columns Name , Birth_Date ( year) , count

Name Birth_Date count

Richard 1978 3

Richard 1981 2

Mary 1956 2

I would like to calculate for each name the sum of count for the Birth_date and during a period of five years before and five years after the Birth_Date ( for example : how many Richard are born during the period 1973-1983 )

The results would be something like :

Name Birth_Date count Birth_date_frame_time_10 count_10_years_frame_time

Richard 1978 3 1973-1983 39

Richard 1981 5 1976-1986 23

Mary 1956 4 1951-1961 34

I tried the window recipe, partition by name, order by Birth_date, window frame ( limit window on a value of range of the ordered column, use lower bound : - 5 , use upper bound : 5)

but it results in count_10_years_frame_time wich are sometimes lower of count , like that :

Name Birth_Date count Birth_date_frame_time_10 count_10_years_frame_time

Richard 1978 3 1973-1983 2

Richard 1981 5 1976-1986 20

Mary 1956 4 1951-1961 3

Thank you for your precious help

David

• Dataiker Posts: 355 Dataiker
Options

Hi,

using lower bound = 5 and upper bound = 5 translates into "all rows where d - 5 <= Birth_Date <= d + 5 , with d being the value of Birth_Date for the row being processed", so it's a window of 11 years actually. Note that lower bound = -5 is also valid and translated into "all rows where d - (-5) <= Birth_Date <= d + 5" which is actually "all rows where Birth_Date = d + 5"

• Registered Posts: 18 ✭✭✭✭✭
Options

Thank you !

I would never had guessed alone that the minus is implicit in the lower bound parameter.

• Dataiku DSS Core Designer, Registered Posts: 1 ✭✭✭
Options

Wish this was in the documentation! I wanted to get a moving average of records that were between 365 and 351 days in the past. Understanding how this works now, I set the Lower bound to 365 and upper bound to -351 and it works great!

Help me…