How to get 7 rolling days data from a custom date?

Options
Kishan
Kishan Registered Posts: 3 ✭✭✭

Hello Dataiku,

Can anyone advise, how I can get past 7 days rolling data in windows or group by recipe? My data does not have sequential dates, but I need a count for different column (say id) for seven previous sequential date? Suppose there are no records in 7 days frame, give me zero.

Answers

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 1,757 Neuron
    Options

    Can you please post an Excel document with some dummy data showing what you want calculated? See this thread for how this helps people get help.

  • Kishan
    Kishan Registered Posts: 3 ✭✭✭
    Options

    Hi, thanks for your reply:

    So basically here is the requirement.

    Take end date and go back sequential 7 back grouped by user and pull the count of values in column 'Data'.
    (for example in below table: for user-039X, going back from april 26, I need 7 days back which is till April 19, count of values of "DATA" column. So only in 26th April, there is one value of User 039X which is AA118. So the new column (say count) would be 1.

    UserPART_DATEData
    039X26-Apr-24AA118
    039X15-Apr-24AA117
    039X12-Apr-24AA116
    039X10-Apr-24AA115
    039X5-Apr-24AA114
    039X3-Feb-24AA113
    039X24-Jan-24AA112
    039X23-Jan-24AA111
    039X18-Jan-24AA110
    039X15-Jan-24AA109
    039X15-Nov-23AA107
    039X15-Nov-23AA108
    039X8-Nov-23AA106
    039X7-Nov-23AA105

    Answer is:

    rolling part-dateUserCount of Data
    26 April 2024 - 19 April 2024039X1
    25 April 2024-18 April 2024039X0
    24 April 2024-17 April 2024039X0
    23 April 2024-16 April 2024039X0
    22 April 2024-15 April 2024039X1
  • Kishan
    Kishan Registered Posts: 3 ✭✭✭
    Options

    How I can get within prepare recipe or window recipe

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 1,757 Neuron
    Options

    You can easily achieve this in a Window recipe but you need to meet some requirements first. First you need to make sure you are using actual dates data types so use a prepare recipe to convert your field to a native date. Since you want to produce an output which has dates your data doesn't have you will need to fill in the gaps, this is called data densification. There are many ways of doing that but in a nutshell if you get a distinct list of all your users and do a cartisian join with a table that has all the dates sequentially you will end up with a table that has all Users against all Dates. Then you join it back to your actual dataset joining by User and Date to remove the duplicates (as some Users have data on some Dates). I posted about this here and here.

    You will end up with a dataset like this:

    Screenshot 2024-05-06 at 18.53.24.png

    Now it's easy to build a Window recipe as follows:

    Screenshot 2024-05-06 at 19.11.00.png

    Screenshot 2024-05-06 at 19.11.25.png

    Screenshot 2024-05-06 at 19.12.28.png

    And here is how the output looks like:

    Screenshot 2024-05-06 at 19.13.01.png

Setup Info
    Tags
      Help me…