I have 3 columns -> USER_ID , PRODUCT_ID (BOTH NUMERIC) && ORDER_DATE
I want to generate a field that takes the difference b/w the max(ORDER_DATE) & 2nd max(ORDER_DATE) in DAYS for the same combination of USER_ID & PRODUCT_ID and stores the result in the new column.
A sample of the data below:
Any inputs on the same would be appreciated!
This is a great use case to use the window recipe: https://academy.dataiku.com/latest/tutorial/visual-recipes/window.html
You can use partition by USER_ID and PRODUCT_ID, order by ORDER_DATE, and use two lags (1,2) to retrieve the "max(ORDER_DATE) & 2nd max(ORDER_DATE)".
Hope it helps,