Compare two datasets column by column without using code recipe

Tags
Registered Posts: 15 ✭✭✭

How to create dataset 3 (key = 'ID', join method = 'left') to compare datasets 1 and 2 WITHOUT using code recipe, i.e low code no code only. Thanks.

Dataset 1:

IDCol ACol BCol C
1AA11Z1
2BB100X2
3CC33Y5

Dataset 2:

IDCol ACol BCol C
1AA11Z1
2BB98X2
3CC33YY555
4DD44T6

Dataset 3 (string columns return true/false, numeric columns return ratio):

IDCol ACol BCol C
1TRUE100%TRUE
2TRUE98%TRUE
3TRUE100%FALSE

Operating system used: win 11

Best Answer

  • Dataiker, Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 319 Dataiker
    edited July 2024 Answer ✓

    Hi @yesitsmeoffical
    ,

    You can utilize the "pre-join computed columns" step within the join recipe to calculate your computed columns. From your description it sounds like you want to return TRUE if dataset 1 column A == dataset 2 column A and if dataset 1 column C == dataset 2 column C. And it looks like for column B you want to return dataset 2 col B / dataset 1 column B.

    I created the following output dataset using your sample inputs and three computed columns:

    Screen Shot 2023-12-27 at 3.51.29 PM.png

    The string computed columns have the formulas:

    if(val("d1__Col A") == val("d2__Col A"), 'TRUE', 'FALSE')
    if(val('d1__Col C') == val('d2__Col C'), 'TRUE', 'FALSE')


    The numeric computed column has the formula:

    100 * (val('d2__Col B') / val('d1__Col B'))


    I'm attaching an export of the project so that you can test it out by importing into your DSS instance.

    Thanks!
    Sarina

Welcome!

It looks like you're new here. Sign in or register to get started.

Welcome!

It looks like you're new here. Sign in or register to get started.