Compare two datasets column by column without using code recipe
How to create dataset 3 (key = 'ID', join method = 'left') to compare datasets 1 and 2 WITHOUT using code recipe, i.e low code no code only. Thanks.
Dataset 1:
ID | Col A | Col B | Col C |
1 | AA | 11 | Z1 |
2 | BB | 100 | X2 |
3 | CC | 33 | Y5 |
Dataset 2:
ID | Col A | Col B | Col C |
1 | AA | 11 | Z1 |
2 | BB | 98 | X2 |
3 | CC | 33 | YY555 |
4 | DD | 44 | T6 |
Dataset 3 (string columns return true/false, numeric columns return ratio):
ID | Col A | Col B | Col C |
1 | TRUE | 100% | TRUE |
2 | TRUE | 98% | TRUE |
3 | TRUE | 100% | FALSE |
Operating system used: win 11
Best Answer
-
Sarina Dataiker, Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 317 Dataiker
Hi @yesitsmeoffical
,
You can utilize the "pre-join computed columns" step within the join recipe to calculate your computed columns. From your description it sounds like you want to return TRUE if dataset 1 column A == dataset 2 column A and if dataset 1 column C == dataset 2 column C. And it looks like for column B you want to return dataset 2 col B / dataset 1 column B.
I created the following output dataset using your sample inputs and three computed columns:The string computed columns have the formulas:
if(val("d1__Col A") == val("d2__Col A"), 'TRUE', 'FALSE') if(val('d1__Col C') == val('d2__Col C'), 'TRUE', 'FALSE')
The numeric computed column has the formula:100 * (val('d2__Col B') / val('d1__Col B'))
I'm attaching an export of the project so that you can test it out by importing into your DSS instance.
Thanks!
Sarina