This website uses cookies. By clicking OK, you consent to the use of cookies. Read our cookie policy.

Turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page

Solved!

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Window / Group recipe

Hi All

So I asked that already here :

https://community.dataiku.com/t5/Using-Dataiku-DSS/Conditional-recode/m-p/6264#M3729 and @VinceDS gave me some idea on how to solve it, but when I was asking my dataset was much simpler. I attached new excel with a few more columns.

Basically I want to create column I created manually called "desired output", and I want to join Product + Group 2, I was following the instructions from previous topic, but it is counting total count of product, but I want to count it within date and study column.

Thanks for any help

emate

1 Solution

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Ah, ok I think I understand what you're aiming for!

What I might try is to 1) aggregate your table, 2) join the resulting table back onto your dataset, and 3) use a formula to recode your Product names.

1) open a Group recipe, and use the Product column as a group key (if you also want to look at products that appear in more than one group within the same day and study--like it appears you might be trying to do from your post--you'd add those two columns as group keys). Set your aggregation to give you DISTINCT of Group2 and run. You should get a table with a row for each unique combination of your group keys and a value for Group2_distinct (the number of different groups the product appears in)

2) Join your aggregated dataset back onto the original one; join on the same fields that you used as group keys in the previous step. You'll end up with a dataset that looks like the one pictured.

3) Create a new column using a formula in the Prepare recipe that looks something like if(Group2_distinct==1, Product, concat(Product, " ", Group2))

and voila

lmk if this works for you!

Solutions shown first - Read whole discussion chronologically

5 Replies

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Hey @emate,

This is an interesting conundrum. I'm confused about how you'd like to generate the content of the 'desired output'. In the excel you've shared, it seems like it's "if the Product is A then concatenate what's in the Product and Group2 columns. otherwise, use the value in the Product column".

Your question mentions counting something within the date and study columns as well. Could you be more specific as to how this relates to the 'desired output' column?

Thanks!

Ashley

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Hi @AshleyW

Sorry, maybe I wasn't clear enough:

So basically, in my 'desired output' I want to rename Product name in the way:

If there is a product that is falling into more than 1 group (looking at column called "Group 2", I want to create new name for each row by combining Product column + Group2 (thats why I have (AJ1 and AJ2), if product is falling into only 1 group like product B (it has only one, unique group name (B1)) - I would like to keep orginal value from Product column.

Thanks

emate

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Ah, ok I think I understand what you're aiming for!

What I might try is to 1) aggregate your table, 2) join the resulting table back onto your dataset, and 3) use a formula to recode your Product names.

1) open a Group recipe, and use the Product column as a group key (if you also want to look at products that appear in more than one group within the same day and study--like it appears you might be trying to do from your post--you'd add those two columns as group keys). Set your aggregation to give you DISTINCT of Group2 and run. You should get a table with a row for each unique combination of your group keys and a value for Group2_distinct (the number of different groups the product appears in)

2) Join your aggregated dataset back onto the original one; join on the same fields that you used as group keys in the previous step. You'll end up with a dataset that looks like the one pictured.

3) Create a new column using a formula in the Prepare recipe that looks something like if(Group2_distinct==1, Product, concat(Product, " ", Group2))

and voila

lmk if this works for you!

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Thank you I will try it out, but by the looks of what you have attached it will work for sure 🙂