
Python Merging with multiple keys


Hi all,

I'm trying to learn Python for Dataiku DSS, but I have a problem with one line of code. I want to merge two datasets with Python. I understand how to merge two datasets on one common column:

# -*- coding: utf-8 -*-
import dataiku
import pandas as pd, numpy as np
from dataiku import pandasutils as pdu

# Read recipe inputs
co2_and_Oil = dataiku.Dataset("CO2_and_Oil")
co2_and_Oil_df = co2_and_Oil.get_dataframe()
meat_and_Egg_Production = dataiku.Dataset("Meat_and_Egg_Production")
meat_and_Egg_Production_df = meat_and_Egg_Production.get_dataframe()

# Merge the two dataframes on the shared "Entity" column
merge_Python_df = co2_and_Oil_df.merge(meat_and_Egg_Production_df, left_on="Entity", right_on="Entity")

# Write recipe outputs
merge_Python = dataiku.Dataset("Merge_Python")
merge_Python.write_with_schema(merge_Python_df)

But how does it work when the datasets need to be merged on multiple columns in common? I tried this:

#dataframe definition
merge_Python_df = co2_and_Oil_df.merge (meat_and_Egg_Production_df, left_on = "Entity", "Code", "Year", right_on = "Entity", "Code", "Year")


It doesn't work. Could you help me?


1 Reply


When you use Python in DSS, it's just regular Python. DSS isn't doing anything special to your code, so this is really a general Python (pandas) question.

That said, you'll want to throw all your merge keys into lists, like below:


merge_Python_df = co2_and_Oil_df.merge(meat_and_Egg_Production_df, left_on=["Entity", "Code", "Year"], right_on=["Entity", "Code", "Year"])
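The original attempt fails before pandas even runs: Python raises a SyntaxError because bare values such as "Code" and "Year" are positional arguments placed after a keyword argument. Here is a minimal sketch you can run outside DSS to see the list form in action. The two small dataframes and their values are made up for illustration and only stand in for your real datasets. It also shows the shorter on= form, which should be equivalent whenever the key columns have the same names on both sides.

```python
import pandas as pd

# Hypothetical sample data standing in for the two DSS datasets.
co2_df = pd.DataFrame({
    "Entity": ["France", "France", "Spain"],
    "Code": ["FRA", "FRA", "ESP"],
    "Year": [2000, 2001, 2000],
    "co2": [1.0, 2.0, 3.0],
})
meat_df = pd.DataFrame({
    "Entity": ["France", "Spain", "Spain"],
    "Code": ["FRA", "ESP", "ESP"],
    "Year": [2000, 2000, 2001],
    "meat": [10, 20, 30],
})

# All merge keys go into a single list.
keys = ["Entity", "Code", "Year"]

# Explicit form, as in the line above: one list per side.
merged_explicit = co2_df.merge(meat_df, left_on=keys, right_on=keys)

# Shorter form for when the key columns share the same names on both sides.
merged_short = co2_df.merge(meat_df, on=keys)

print(merged_explicit)
print(merged_short)
```

By default merge does an inner join, so only rows whose Entity, Code and Year all match in both dataframes are kept. If you need to keep unmatched rows, the how parameter ("left", "right", "outer") controls that.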


There's also a helpful Stack Overflow thread on this:

Let me know if you have any additional questions.



