Want to Stop Rebuilding "Expensive" Parts of your Flow? Explicit Builds are the Answer!READ MORE

Create running number in Window recipe

luongnguyen
Level 1
Level 1
Create running number in Window recipe

Hi community,

I have a dataset that contains multiple line for each id like this:

idDate
123401/11/2014
123401/12/2014
123401/09/2015
123401/12/2015
123401/11/2018

 

I then used a window recipe to calculate the date difference by number of month between each line with a custom aggregation and then I have this:

idDateDifference
123401/11/2014 
123401/12/20141
123401/09/20159
123401/12/20153
123401/11/201835

 

Now, I want to create another column to define each run of the Id. The run is the same if the difference is not more than six months.

The result should look like this:

idDateDifferenceRun
123401/11/2014 1
123401/12/201411
123401/09/201592
123401/12/201532
123401/11/2018353

 

Can someone help me to achieve this? (Ideally with visual recipe and custom_aggregation in SQL and without Python).

Can I achieve this in the same Window Recipe that calculate the 'Difference' column?

Thanks a lot for your help.

Best regards,


Operating system used: Window

0 Kudos
2 Replies
fchataigner2
Dataiker
Dataiker

Hi,

given that you're using a special rule to compute whether an increment in the running number is needed, this won't be doable in a single recipe.

You'll need

- a first Window recipe like the one you have now, to compute the difference

- a Prepare recipe on the output of the first Window recipe, to convert difference into 0 or 1 depending on whether the difference is more than 6 months or not (something like a Formula step with `if(difference > 6, 1, 0)`

- a second Window recipe, where you partition and order like in the first recipe, and put the frame to "limit following -> 0" and no limit preceding, with a "Sum" aggregate on the 0/1 column

luongnguyen
Level 1
Level 1
Author

Hi,

Thanks a lot for your response.

I think that makes sense.

I thought about that solution also but it seems a bit heavy as manipulation.

I can't skip the extre Prepare Recipe by using the computed_columns directly in the second Window Recipe and it works.

It's a shame we can do it in the same recipe. I think I can do it in the same SQL recipe by using nested calculation but it will be more complicate after for the maintenance.

Thanks a lot,

0 Kudos