Window / Group recipe
Hi All
I already asked a similar question here:
https://community.dataiku.com/t5/Using-Dataiku-DSS/Conditional-recode/m-p/6264#M3729
and @VinceDS gave me some ideas on how to solve it, but my dataset was much simpler back then. I have attached a new Excel file with a few more columns.
Basically, I want to generate the column I created manually, called "desired output", by combining Product + Group 2. I followed the instructions from the previous topic, but that counts the product across the whole dataset, whereas I want the count within each date and study.
Thanks for any help
emate
Best Answer
-
Ashley Dataiker, Alpha Tester, Dataiku DSS Core Designer, Registered, Product Ideas Manager Posts: 163 Dataiker
Ah, ok I think I understand what you're aiming for!
What I might try is to 1) aggregate your table, 2) join the resulting table back onto your dataset, and 3) use a formula to recode your Product names.
1) Open a Group recipe and use the Product column as a group key. If you also want to look at products that appear in more than one group within the same day and study (which is what it looks like you're trying to do from your post), add those two columns as group keys too. Set your aggregation to give you DISTINCT of Group2 and run. You should get a table with one row for each unique combination of your group keys and a value for Group2_distinct (the number of different groups the product appears in).
2) Join your aggregated dataset back onto the original one; join on the same fields that you used as group keys in the previous step. You'll end up with a dataset that looks like the one pictured.
3) Create a new column using a formula in the Prepare recipe that looks something like if(Group2_distinct==1, Product, concat(Product, " ", Group2))
and voila
lmk if this works for you!
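For anyone who wants to check the logic outside the visual recipes, here is a minimal plain-Python sketch of the same three steps. The sample rows and the names KEYS and recode are made up for illustration (only the column names come from this thread), so treat it as a sketch of the idea rather than what DSS runs internally.

```python
from collections import defaultdict

# Hypothetical sample rows: column names follow the thread, values are invented.
rows = [
    {"Date": "2020-01-01", "Study": "S1", "Product": "A", "Group2": "J1"},
    {"Date": "2020-01-01", "Study": "S1", "Product": "A", "Group2": "J2"},
    {"Date": "2020-01-01", "Study": "S1", "Product": "B", "Group2": "B1"},
    {"Date": "2020-01-02", "Study": "S1", "Product": "A", "Group2": "J1"},
]

# Assumed group keys: count within each date and study, per product.
KEYS = ("Date", "Study", "Product")


def recode(rows, keys=KEYS):
    # Step 1 (Group recipe): collect the distinct Group2 values per key combination.
    groups = defaultdict(set)
    for row in rows:
        groups[tuple(row[k] for k in keys)].add(row["Group2"])

    out = []
    for row in rows:
        # Step 2 (Join recipe): look up the distinct count using the same keys.
        distinct = len(groups[tuple(row[k] for k in keys)])
        # Step 3 (Prepare formula): keep Product if it sits in one group,
        # otherwise combine Product and Group2.
        if distinct == 1:
            name = row["Product"]
        else:
            name = row["Product"] + " " + row["Group2"]
        out.append({**row, "Group2_distinct": distinct, "desired_output": name})
    return out


for r in recode(rows):
    print(r["Date"], r["Study"], r["Product"], r["Group2"], "->", r["desired_output"])
```

Because Date and Study are part of the keys, product A is only renamed on the day it appears in two groups. Dropping them from KEYS would count groups across the whole dataset instead.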
Answers
-
Mateusz Dataiku DSS Core Designer, Neuron 2020, Registered, Neuron 2021, Neuron 2022 Posts: 91 ✭✭✭✭✭✭
-
Ashley Dataiker, Alpha Tester, Dataiku DSS Core Designer, Registered, Product Ideas Manager Posts: 163 Dataiker
Hey @emate,
This is an interesting conundrum. I'm confused about how you'd like to generate the content of the 'desired output' column. In the Excel file you've shared, the rule seems to be: "if the Product is A, then concatenate what's in the Product and Group2 columns; otherwise, use the value in the Product column".
Your question mentions counting something within the date and study columns as well. Could you be more specific as to how this relates to the 'desired output' column?
Thanks!
Ashley
-
Mateusz Dataiku DSS Core Designer, Neuron 2020, Registered, Neuron 2021, Neuron 2022 Posts: 91 ✭✭✭✭✭✭
Hi @AshleyW
Sorry, maybe I wasn't clear enough:
So basically, in my 'desired output' I want to rename the Product in the following way:
If a product falls into more than one group (looking at the column called "Group 2"), I want to create a new name for each row by combining the Product column + Group2 (that's why I have AJ1 and AJ2). If a product falls into only one group, like product B (it has only one unique group name, B1), I would like to keep the original value from the Product column.
Thanks
emate
-
Mateusz Dataiku DSS Core Designer, Neuron 2020, Registered, Neuron 2021, Neuron 2022 Posts: 91 ✭✭✭✭✭✭
Thank you, I will try it out, but by the looks of what you have attached, it will work for sure.