How to get 7 rolling days data from a custom date?
Hello Dataiku,
Can anyone advise, how I can get past 7 days rolling data in windows or group by recipe? My data does not have sequential dates, but I need a count for different column (say id) for seven previous sequential date? Suppose there are no records in 7 days frame, give me zero.
Answers
-
Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 1,982 Neuron
Can you please post an Excel document with some dummy data showing what you want calculated? See this thread for how this helps people get help.
-
Hi, thanks for your reply:
So basically here is the requirement.
Take end date and go back sequential 7 back grouped by user and pull the count of values in column 'Data'.
(for example in below table: for user-039X, going back from april 26, I need 7 days back which is till April 19, count of values of "DATA" column. So only in 26th April, there is one value of User 039X which is AA118. So the new column (say count) would be 1.User PART_DATE Data 039X 26-Apr-24 AA118 039X 15-Apr-24 AA117 039X 12-Apr-24 AA116 039X 10-Apr-24 AA115 039X 5-Apr-24 AA114 039X 3-Feb-24 AA113 039X 24-Jan-24 AA112 039X 23-Jan-24 AA111 039X 18-Jan-24 AA110 039X 15-Jan-24 AA109 039X 15-Nov-23 AA107 039X 15-Nov-23 AA108 039X 8-Nov-23 AA106 039X 7-Nov-23 AA105 Answer is:
rolling part-date User Count of Data 26 April 2024 - 19 April 2024 039X 1 25 April 2024-18 April 2024 039X 0 24 April 2024-17 April 2024 039X 0 23 April 2024-16 April 2024 039X 0 22 April 2024-15 April 2024 039X 1 -
How I can get within prepare recipe or window recipe
-
Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 1,982 Neuron
You can easily achieve this in a Window recipe but you need to meet some requirements first. First you need to make sure you are using actual dates data types so use a prepare recipe to convert your field to a native date. Since you want to produce an output which has dates your data doesn't have you will need to fill in the gaps, this is called data densification. There are many ways of doing that but in a nutshell if you get a distinct list of all your users and do a cartisian join with a table that has all the dates sequentially you will end up with a table that has all Users against all Dates. Then you join it back to your actual dataset joining by User and Date to remove the duplicates (as some Users have data on some Dates). I posted about this here and here.
You will end up with a dataset like this:
Now it's easy to build a Window recipe as follows:
And here is how the output looks like: