To compare two datasets and find out the difference
Hi All,
I am trying to create a project flow where I am having two datasets let us say
Dataset 1:
Name id
xxxx user1
yyyy user2
Dataset 2:
Name id
aaaa user3
xxxx user1
yyyy user2
zzzz user4
In both the datasets the id column is unique, I need to compare both the datasets and my output dataset should have
output dataset:
Name id
zzzz user4
aaaa user3
Which means I need to compare the dataset1 with dataset2 and if any of the users in dataset1 is not available in dataset2 I need to take them and put it as an output dataset.
Similarly I want the vice versa of the scenario where I also want an output dataset which compare the dataset1 with dataset2 and if any of the users in dataset1 is available in dataset2 I need to take them and put it as an output dataset.
Output dataset :
Name id
xxxx user1
yyyy user2
Can you help me to achieve this?
Answers
-
Hi,
you can do a left outer join of dataset 2 (on the left) with dataset 1 (on the right), then have a post-filter in the Join recipe to only keep rows where right_id (or whatever name you give that column) is not defined.
If the id are unique, then an inner join will give only rows for which there is a match. You can have an inner join by changing the join type in the modal of the join