I have time-series data: essentially a history recording a series of three events per case. I want to collapse these three events into a single record using a lead or lag type function (or both) together with specific time conditions. Here is an example, where ID is the unique case number, EVENT# identifies which event was recorded, and DAYS is the number of days elapsed since an initial event (not recorded):
ID EVENT# DAYS
01 1 30
01 2 45
01 3 80
02 1 30
02 2 62
02 3 99
The condition I want to test in order to set a dummy variable goes something like this:
Within an ID, if (DAYS for Event 2 minus DAYS for Event 1 is between 1 and 30) AND (DAYS for Event 3 minus DAYS for Event 2 is > 30), then dummy = 1, else dummy = 0.
So in the two examples above, ID = 01 gets dummy = 1 because both conditions are met (difference 1 = 45 - 30 = 15 and difference 2 = 80 - 45 = 35), but ID = 02 gets dummy = 0 because the first condition is not met (difference 1 = 62 - 30 = 32, difference 2 = 99 - 62 = 37).
I can do this in SAS in one of two ways: by working from Event = 3, lagging once for condition 2 and using two separate lag functions for condition 1, or by evaluating the conditions separately with the lag function and then aggregating the results. However, I need to do this within Dataiku.
Hi JimCreech,
It is possible to achieve a similar result in Dataiku by going through the following steps:
You will find attached a sample DSS project with an example implementation of those steps, using the small example data from your post. If you work with time series data on a regular basis, you may find this tutorial on the Window recipe capabilities useful.
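For readers who prefer code, the lag logic from the question can also be sketched in pandas, for example inside a Python recipe. This is only an illustration under assumptions, not the contents of the attached project: the function name `event_dummy`, the output column names `GAP_1_2`, `GAP_2_3` and `DUMMY`, and the choice to treat "between (1,30)" as inclusive on both ends are all mine.

```python
import pandas as pd

# Example data copied from the question.
events = pd.DataFrame({
    "ID": ["01", "01", "01", "02", "02", "02"],
    "EVENT#": [1, 2, 3, 1, 2, 3],
    "DAYS": [30, 45, 80, 30, 62, 99],
})

def event_dummy(events: pd.DataFrame) -> pd.DataFrame:
    """Return one row per ID with the two gaps and the dummy (hypothetical helper)."""
    ordered = events.sort_values(["ID", "EVENT#"]).copy()
    # Lag DAYS by one row within each ID, then take the difference to the previous event.
    ordered["GAP"] = ordered["DAYS"] - ordered.groupby("ID")["DAYS"].shift(1)
    # The gap stored on the Event 2 row is Event 2 minus Event 1, and so on.
    gap_1_2 = ordered.loc[ordered["EVENT#"] == 2].set_index("ID")["GAP"]
    gap_2_3 = ordered.loc[ordered["EVENT#"] == 3].set_index("ID")["GAP"]
    result = pd.DataFrame({"GAP_1_2": gap_1_2, "GAP_2_3": gap_2_3})
    # Assumption: 1 <= gap <= 30 (inclusive). A missing event yields NaN, hence dummy 0.
    result["DUMMY"] = (
        result["GAP_1_2"].between(1, 30) & (result["GAP_2_3"] > 30)
    ).astype(int)
    return result.rename_axis("ID").reset_index()

print(event_dummy(events))
```

On the example data this gives gaps of 15 and 35 for ID 01 (dummy 1) and 32 and 37 for ID 02 (dummy 0), matching the values worked out in the question.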
Hope this helps!
Best,
Harizo