Community Conundrum 25:Feature Visualization is now live! Read More

Window / Group recipe

Level 4
Window / Group recipe

Hi All

So I asked that already here : 
https://community.dataiku.com/t5/Using-Dataiku-DSS/Conditional-recode/m-p/6264#M3729 and @VinceDS gave me some idea on how to solve it, but when I was asking my dataset was much simpler. I attached new excel with a few more columns.

Basically I want to create column I created manually called "desired output", and I want to join Product + Group 2, I was following the instructions from previous topic, but it is counting total count of product, but I want to count it within date and study column.

 

Thanks for any help

emate

 

 

0 Kudos
5 Replies
Level 4
Author

attached file

0 Kudos
Dataiker
Dataiker

Hey @emate,

This is an interesting conundrum. I'm confused about how you'd like to generate the content of the 'desired output'. In the excel you've shared, it seems like it's "if the Product is A then concatenate what's in the Product and Group2 columns. otherwise, use the value in the Product column".

Your question mentions counting something within the date and study columns as well. Could you be more specific as to how this relates to the 'desired output' column?

Thanks!

Ashley

0 Kudos
Level 4
Author

Hi @AshleyW 

Sorry, maybe I wasn't clear enough:

So basically, in my 'desired output' I want to rename Product name in the way:

If there is a product that is falling into more than 1 group (looking at column called "Group 2", I want to create new name for each row by combining Product column + Group2 (thats why I have (AJ1 and AJ2), if product is falling into only 1 group like product B (it has only one, unique group name (B1)) - I would like to keep orginal value from Product column.

Thanks

emate

 

 

0 Kudos
Dataiker
Dataiker

Ah, ok I think I understand what you're aiming for!

What I might try is to 1) aggregate your table, 2) join the resulting table back onto your dataset, and 3) use a formula to recode your Product names.

1) open a Group recipe, and use the Product column as a group key (if you also want to look at products that appear in more than one group within the same day and study--like it appears you might be trying to do from your post--you'd add those two columns as group keys). Set your aggregation to give you DISTINCT of Group2 and run. You should get a table with a row for each unique combination of your group keys and a value for Group2_distinct (the number of different groups the product appears in)

2) Join your aggregated dataset back onto the original one; join on the same fields that you used as group keys in the previous step. You'll end up with a dataset that looks like the one pictured.

3) Create a new column using a formula in the Prepare recipe that looks something like if(Group2_distinct==1, Product, concat(Product, " ", Group2))

and voila

lmk if this works for you!

Level 4
Author

Thank you I will try it out, but by the looks of what you have attached it will work for sure 🙂