Survey banner
Switching to Dataiku - a new area to help users who are transitioning from other tools and diving into Dataiku! CHECK IT OUT

How to get 7 rolling days data from a custom date?

Kishan
Level 2
How to get 7 rolling days data from a custom date?

Hello Dataiku,

Can anyone advise, how I can get past 7 days rolling data in windows or group by recipe? My data does not have sequential dates, but I need a count for different column (say id) for seven previous sequential date? Suppose there are no records in 7 days frame, give me zero.

0 Kudos
4 Replies
Turribeach

Can you please post an Excel document with some dummy data showing what you want calculated? See this thread for how this helps people get help.

0 Kudos
Kishan
Level 2
Author

Hi, thanks for your reply:

So basically here is the requirement. 

Take end date and go back sequential 7 back grouped by user and pull the count of values in column 'Data'.
(for example in below table: for user-039X, going back from april 26, I need 7 days back which is till April 19, count of values of "DATA" column. So only in 26th April, there is one value of User 039X which is AA118. So the new column (say count) would be 1.

 

UserPART_DATEData
039X26-Apr-24AA118
039X15-Apr-24AA117
039X12-Apr-24AA116
039X10-Apr-24AA115
039X5-Apr-24AA114
039X3-Feb-24AA113
039X24-Jan-24AA112
039X23-Jan-24AA111
039X18-Jan-24AA110
039X15-Jan-24AA109
039X15-Nov-23AA107
039X15-Nov-23AA108
039X8-Nov-23AA106
039X7-Nov-23AA105

 

Answer is:

rolling part-dateUserCount of Data
26 April 2024 - 19 April 2024039X1
25 April 2024-18 April 2024039X0
24 April 2024-17 April 2024039X0
23 April 2024-16 April 2024039X0
22 April 2024-15 April 2024039X1
0 Kudos
Kishan
Level 2
Author

How I can get within prepare recipe or window recipe

0 Kudos

You can easily achieve this in a Window recipe but you need to meet some requirements first. First you need to make sure you are using actual dates data types so use a prepare recipe to convert your field to a native date. Since you want to produce an output which has dates your data doesn't have you will need to fill in the gaps, this is called data densification. There are many ways of doing that but in a nutshell if you get a distinct list of all your users and do a cartisian join with a table that has all the dates sequentially you will end up with a table that has all Users against all Dates. Then you join it back to your actual dataset joining by User and Date to remove the duplicates (as some Users have data on some Dates). I posted about this here and here.

You will end up with a dataset like this:

Screenshot 2024-05-06 at 18.53.24.png

Now it's easy to build a Window recipe as follows:

Screenshot 2024-05-06 at 19.11.00.png

Screenshot 2024-05-06 at 19.11.25.png

Screenshot 2024-05-06 at 19.12.28.png

And here is how the output looks like:

Screenshot 2024-05-06 at 19.13.01.png

0 Kudos