Conditional group / window or concat across rows to a single cell

sam2023
sam2023 Dataiku DSS Core Designer, Registered Posts: 2

Hi,

Was trying to look for similar example and unable to find any so far and hence posting for help.

Please see the dataset below for reference.

I would like to build an formula / Aggregation or recipe in such a way that for a given subject, if Advertised = Yes and for Media = Radio, then for that given subject all rows need to be updated as shown in the assessed value field, with the concat of the Advertised and Media information.

Could anyone of you please help with this small dataset ? I tried the Group & Window recipes and with Concat and could not get the desired results.

Thank you for your help. Much appreciated in advance.

Sam

SubjectProductTermAdvertisedMediaAssessed Value
S1P1E1NoTVPromotion Done for S1 as follows
Tv->No, Radio->Yes
S1P1E2YesRadioPromotion Done for S1 as follows
Tv->No, Radio->Yes
S2P2E3NoTVNo Promotions for S2 done
Tv & Radio->No
S2P2E4NoRadioNo Promotions for S2 done
Tv & Radio->No
S3P3E5NoTVPromotion Done for S3 as follows
Tv->No, Radio->Yes
S3P4E5YesRadioPromotion Done for S3 as follows
Tv->No, Radio->Yes
S4P5E6NoTVPromotion Done for S4 as follows
Tv->No, Radio->Yes
S4P6E6YesRadioPromotion Done for S4 as follows
Tv->No, Radio->Yes
S5P7E7NoTVNo Promotions for S5 done
Tv & Radio->No
S5P7E8NoRadioNo Promotions for S5 done
Tv & Radio->No
S5P8E7YesTVNo Promotions for S5 done
Tv & Radio->No

Answers

  • Konstantina
    Konstantina Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 27 ✭✭✭✭✭

    Hi @sam2023
    ,

    Here is my suggested solution:

    1. First, I used a Group recipe where I applied the following steps:
      • In the Computed columns, I created the column "Advertised_Media" using the formula

    concat(Advertised, "|", Media)

    step_1a.png

    • In the Group step, I chose the Concat option --> Count distinct of the "Advertised_Media" column


    2. Then, through a Join with... recipe, I (left) joined the original dataset with the output dataset of the Group
    recipe of step 1

    • In the Post-join computed columns, I created a column named "Advertised_Radio_Flag" using the formula
      if(contains(Advertised_Media_concat, "Yes|Radio"), 1, 0)

      step_2.png


      3. Finally, I used a Prepare recipe for the final dataset creation
      • I created a column named "Assessed_Value" using a Formula step where I used the formula

        if(Advertised_Radio_Flag == 1, concat(concat('Promotion Done for ', Subject), ' as follows \nTV->No, Radio->Yes'), concat(concat('No promotions for ', Subject), ' done \nTV & Radio->No'))

        step_3.png
      • Optional step: I deleted the columns I created in previous steps so that I end up with the schema you showed in the post

    I hope this helps!

    -Konstantina

  • sam2023
    sam2023 Dataiku DSS Core Designer, Registered Posts: 2

    Thank you Konstantina, I will give a try. Appreciated very much for your insightful help.

Setup Info
    Tags
      Help me…