Comparing two data sets at Column level

bhakuniv · February 2020

Hi All

I am trying to compare two datasets that have the same columns and find the differences at a column level. The goal is to count, for each column, the number of rows (identified by a unique key) whose values match exactly in both datasets.

Please refer to the sample data below.

Dataset 1

ID	Name	Age	Country
1	ABC	21	USA
2	XYZ	23	UK
3	DEF	67	CHN

Dataset 2

ID	Name	Age	Country
1	ABC	22	USA
2	XYZ	23	UK
3	DEF	67	SWZ

Expected output

Column	Count of ID
Name	3
Age	2
Country	2

Thanks

ATsao · February 2020

Hi,

This would probably be best handled by writing your own code, in either Python or R. In that case, you should include both of these datasets as inputs to your code recipe and then store the result in an output dataset. More information about using Python and R recipes in DSS can be found in our documentation here:
https://doc.dataiku.com/dss/latest/code_recipes/python.html
https://doc.dataiku.com/dss/latest/code_recipes/r.html
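
As a rough illustration of the comparison logic itself, here is a minimal Python sketch. It assumes each dataset is available as a list of row dictionaries and that ID is the unique key. The function name count_column_matches and the inline sample rows are only for illustration; in a recipe you would load the rows from your two input datasets instead.

```python
def count_column_matches(rows_a, rows_b, key="ID"):
    """Count, per non-key column, the keys whose values match exactly in both datasets."""
    if not rows_a:
        return {}
    # Index the second dataset by key so each row can be looked up directly.
    b_by_key = {row[key]: row for row in rows_b}
    # Compare every column of the first dataset except the key itself.
    columns = [c for c in rows_a[0] if c != key]
    counts = {c: 0 for c in columns}
    for row in rows_a:
        other = b_by_key.get(row[key])
        if other is None:
            continue  # key absent from the second dataset: counts as no match
        for c in columns:
            if c in other and row[c] == other[c]:
                counts[c] += 1
    return counts


# Sample data copied from the question.
dataset_1 = [
    {"ID": 1, "Name": "ABC", "Age": 21, "Country": "USA"},
    {"ID": 2, "Name": "XYZ", "Age": 23, "Country": "UK"},
    {"ID": 3, "Name": "DEF", "Age": 67, "Country": "CHN"},
]
dataset_2 = [
    {"ID": 1, "Name": "ABC", "Age": 22, "Country": "USA"},
    {"ID": 2, "Name": "XYZ", "Age": 23, "Country": "UK"},
    {"ID": 3, "Name": "DEF", "Age": 67, "Country": "SWZ"},
]

result = count_column_matches(dataset_1, dataset_2)
for column, count in result.items():
    print(f"{column}\t{count}")
```

On the sample data this prints 3 for Name, 2 for Age and 2 for Country. Keys that exist in only one dataset are treated as non-matching here; whether that is the right rule depends on your data.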

Thanks,
Andrew
