Window / Group recipe

Mateusz
Mateusz Dataiku DSS Core Designer, Neuron 2020, Registered, Neuron 2021, Neuron 2022 Posts: 91 ✭✭✭✭✭✭

Hi All

I already asked a version of this here:
https://community.dataiku.com/t5/Using-Dataiku-DSS/Conditional-recode/m-p/6264#M3729
and @VinceDS gave me an idea of how to solve it, but the dataset I asked about then was much simpler. I have attached a new Excel file with a few more columns.

Basically, I want to build the column I created manually, called "desired output", by joining Product + Group 2. I followed the instructions from the previous topic, but that counts the product across the whole dataset, whereas I want to count it within each date and study.

Thanks for any help

emate

Best Answer

  • Ashley
    Ashley Dataiker, Alpha Tester, Dataiku DSS Core Designer, Registered, Product Ideas Manager Posts: 163 Dataiker
    Answer ✓

    Ah, ok I think I understand what you're aiming for!

    What I might try is to 1) aggregate your table, 2) join the resulting table back onto your dataset, and 3) use a formula to recode your Product names.

    1) Open a Group recipe and use the Product column as a group key. (If you also want to look at products that appear in more than one group within the same day and study, as your post suggests, add those two columns as group keys too.) Set the aggregation to DISTINCT on Group2 and run it. You should get a table with one row for each unique combination of your group keys and a value for Group2_distinct (the number of different groups the product appears in).

    2) Join your aggregated dataset back onto the original one; join on the same fields that you used as group keys in the previous step. You'll end up with a dataset that looks like the one pictured.

    3) Create a new column using a formula in the Prepare recipe that looks something like if(Group2_distinct==1, Product, concat(Product, " ", Group2))

    and voila

    lmk if this works for you!
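To sanity-check the logic outside DSS, the three steps above can be sketched in plain Python. This is only an illustration under stated assumptions: the sample rows and the `Date`, `Study`, and `Product_recoded` names are made up for the example (the thread only names Product, Group2, and Group2_distinct), and `recode` is a hypothetical helper, not a DSS function.

```python
from collections import defaultdict

# Hypothetical sample rows; the values are invented for illustration.
rows = [
    {"Date": "2020-01-01", "Study": "S1", "Product": "A", "Group2": "J1"},
    {"Date": "2020-01-01", "Study": "S1", "Product": "A", "Group2": "J2"},
    {"Date": "2020-01-01", "Study": "S1", "Product": "B", "Group2": "B1"},
    {"Date": "2020-01-02", "Study": "S1", "Product": "A", "Group2": "J1"},
]

# Group keys: Product plus the date and study columns (assumed names).
GROUP_KEYS = ("Date", "Study", "Product")


def recode(rows, keys=GROUP_KEYS):
    # Step 1 (Group recipe): collect the distinct Group2 values per key combination.
    distinct_groups = defaultdict(set)
    for row in rows:
        distinct_groups[tuple(row[k] for k in keys)].add(row["Group2"])

    recoded = []
    for row in rows:
        # Step 2 (Join): look up the distinct count using the same keys.
        n_distinct = len(distinct_groups[tuple(row[k] for k in keys)])
        # Step 3 (formula): keep Product if it has one group, else append Group2.
        if n_distinct == 1:
            new_name = row["Product"]
        else:
            new_name = row["Product"] + " " + row["Group2"]
        recoded.append(
            {**row, "Group2_distinct": n_distinct, "Product_recoded": new_name}
        )
    return recoded


result = recode(rows)
for row in result:
    print(row["Date"], row["Study"], row["Product"], row["Group2"],
          row["Group2_distinct"], row["Product_recoded"])
```

With these sample rows, product A is renamed only on the day it appears in two groups; on the day it appears in a single group it keeps its original name, which is the effect of adding date and study to the group keys.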

Answers

  • Ashley
    Ashley Dataiker, Alpha Tester, Dataiku DSS Core Designer, Registered, Product Ideas Manager Posts: 163 Dataiker

    Hey @emate,

    This is an interesting conundrum. I'm confused about how you'd like to generate the content of the 'desired output'. In the Excel file you've shared, the rule seems to be: "if the Product is A, concatenate what's in the Product and Group2 columns; otherwise, use the value in the Product column".

    Your question mentions counting something within the date and study columns as well. Could you be more specific as to how this relates to the 'desired output' column?

    Thanks!

    Ashley

  • Mateusz
    Mateusz Dataiku DSS Core Designer, Neuron 2020, Registered, Neuron 2021, Neuron 2022 Posts: 91 ✭✭✭✭✭✭

    Hi @AshleyW

    Sorry, maybe I wasn't clear enough:

    So basically, in my 'desired output' I want to rename the Product as follows:

    If a product falls into more than one group (looking at the column called "Group 2"), I want to create a new name for each row by combining the Product column + Group2 (that's why I have AJ1 and AJ2). If a product falls into only one group, like product B (it has only one unique group name, B1), I would like to keep the original value from the Product column.

    Thanks

    emate

  • Mateusz
    Mateusz Dataiku DSS Core Designer, Neuron 2020, Registered, Neuron 2021, Neuron 2022 Posts: 91 ✭✭✭✭✭✭

    Thank you, I will try it out, but judging by what you have attached, it will work for sure.
