You now have until September 15th to submit your use case or success story to the 2022 Dataiku Frontrunner Awards!ENTER YOUR SUBMISSION

GROUP RECIPE - Custom aggregation

Solved!
lnguyen
Level 2
GROUP RECIPE - Custom aggregation

Hi,

I have a table with structure similar to this:

IDdate startdate endcontract_nbtag_subscription
12330/01/201530/02/20151first subscription
12330/01/202030/06/20202re-subscription
12330/01/202230/06/20223addition
12330/01/202230/06/20223re-subscription
45630/01/201530/02/20151first subscription
45630/01/202030/06/20202re-subscription
45630/01/202030/06/20202re-subscription
45630/01/202030/06/20202addition
45630/01/202230/06/20223re-subscription

 

So now I want to use a GROUP RECIPE to creat a dataset with one row for each ID:

I want to create a custom aggregation which, for each id, counts the number of distinct "contract_nb" where "tag_subscription" equal to "re-subscription" but I can't make it work.

Can someone help me to find the right custom aggregation code to achieve this?

As you can see I must count the "contract_nb" because it's unique for one action of subscription which can contain several tags like "addition" or "re-subsctiption" and like in the case of id 456, one action can have 2 "re-subcription" row.

The end result should look like this:

IDmin date startmax date endnumber of re-subscription
12330/01/201530/06/20222
45630/01/201530/06/20222

Thanks  a lot in advance


Operating system used: Window

0 Kudos
1 Solution
NN
Neuron
Neuron

Hi @lnguyen ,

In the group recipe have you been able to look at the Computed Columns and Custom Aggregations options?

In the computed columns section you can create a new column using a dss formula

if(tag_subscription=='re-subscription',contract_nb,null)

and in the next group step you can take a distinct of this new column that you created.

The second option you can try in you can add a custom aggregation using a simple case when expression in the custom aggregation step of the group recipe

count(distinct (case when "tag_subscription"='re-subscription' then "contract_nb" end))

Hope i have understood your question correct.

 

View solution in original post

2 Replies
NN
Neuron
Neuron

Hi @lnguyen ,

In the group recipe have you been able to look at the Computed Columns and Custom Aggregations options?

In the computed columns section you can create a new column using a dss formula

if(tag_subscription=='re-subscription',contract_nb,null)

and in the next group step you can take a distinct of this new column that you created.

The second option you can try in you can add a custom aggregation using a simple case when expression in the custom aggregation step of the group recipe

count(distinct (case when "tag_subscription"='re-subscription' then "contract_nb" end))

Hope i have understood your question correct.

 

lnguyen
Level 2
Author

Hi @NN ,

It's exactly what I was looking for.

I knew the code is simple but my SQL is rusty so your help is very much appreciated.

Thanks again,

Have a nice day,