Conditional group / window or concat across rows to a single cell

Hi,
Was trying to look for similar example and unable to find any so far and hence posting for help.
Please see the dataset below for reference.
I would like to build an formula / Aggregation or recipe in such a way that for a given subject, if Advertised = Yes and for Media = Radio, then for that given subject all rows need to be updated as shown in the assessed value field, with the concat of the Advertised and Media information.
Could anyone of you please help with this small dataset ? I tried the Group & Window recipes and with Concat and could not get the desired results.
Thank you for your help. Much appreciated in advance.
Sam
Subject | Product | Term | Advertised | Media | Assessed Value |
S1 | P1 | E1 | No | TV | Promotion Done for S1 as follows Tv->No, Radio->Yes |
S1 | P1 | E2 | Yes | Radio | Promotion Done for S1 as follows Tv->No, Radio->Yes |
S2 | P2 | E3 | No | TV | No Promotions for S2 done Tv & Radio->No |
S2 | P2 | E4 | No | Radio | No Promotions for S2 done Tv & Radio->No |
S3 | P3 | E5 | No | TV | Promotion Done for S3 as follows Tv->No, Radio->Yes |
S3 | P4 | E5 | Yes | Radio | Promotion Done for S3 as follows Tv->No, Radio->Yes |
S4 | P5 | E6 | No | TV | Promotion Done for S4 as follows Tv->No, Radio->Yes |
S4 | P6 | E6 | Yes | Radio | Promotion Done for S4 as follows Tv->No, Radio->Yes |
S5 | P7 | E7 | No | TV | No Promotions for S5 done Tv & Radio->No |
S5 | P7 | E8 | No | Radio | No Promotions for S5 done Tv & Radio->No |
S5 | P8 | E7 | Yes | TV | No Promotions for S5 done Tv & Radio->No |
Answers
-
Konstantina Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 27 ✭✭✭✭✭
Hi @sam2023
,Here is my suggested solution:
- First, I used a Group recipe where I applied the following steps:
- In the Computed columns, I created the column "Advertised_Media" using the formula
concat(Advertised, "|", Media)
- In the Group step, I chose the Concat option --> Count distinct of the "Advertised_Media" column
2. Then, through a Join with... recipe, I (left) joined the original dataset with the output dataset of the Group
recipe of step 1- In the Post-join computed columns, I created a column named "Advertised_Radio_Flag" using the formula
if(contains(Advertised_Media_concat, "Yes|Radio"), 1, 0)
3. Finally, I used a Prepare recipe for the final dataset creation- I created a column named "Assessed_Value" using a Formula step where I used the formula
if(Advertised_Radio_Flag == 1, concat(concat('Promotion Done for ', Subject), ' as follows \nTV->No, Radio->Yes'), concat(concat('No promotions for ', Subject), ' done \nTV & Radio->No')) - Optional step: I deleted the columns I created in previous steps so that I end up with the schema you showed in the post
- I created a column named "Assessed_Value" using a Formula step where I used the formula
I hope this helps!
-Konstantina
- First, I used a Group recipe where I applied the following steps:
-
Thank you Konstantina, I will give a try. Appreciated very much for your insightful help.