Core Designer Certificate: ending up with too many rows after data Join

Hey all,
I am trying to Left-Join data for the Core Designer Certificate project but I keep ending up with too many rows. I want to use both Country and Year to do the join, since the combination is unique in all the input tables. But after joining, I keep ending up with three rows per Country-Year combination.
I have tried changing all kinds of options in the recipe, and I have also tried Fuzzy Join, but I always get duplicates. I have specifically tried just joining two out of the three data sets first but that did not help me figure out what is happening.
Sean (Dataiker)
Hi @francisco, in my own copy of the project I have the same settings for the Join step, so you should be OK there. I don't have the post-filter step that you seem to have, but I do have a pre-filter step. That might be where to look next.
Hi Sean,
Thanks for the reply. It ended up being that something had gone wrong in how one of the three input data sets had been read in by DSS; when I inspected it directly I could see that there were three rows per Country-Year combination. I started over from scratch and with the data loaded in correctly I was able to complete the project.
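For anyone who hits the same symptom, here is a rough sketch of the check that would have caught it, written in pandas rather than in a DSS recipe. The data, the column values, and the `duplicated_keys` helper are all made up for illustration; only the Country and Year key columns come from the project. The idea is to count rows per key combination in each input before joining, since any key that appears three times on the right side yields three output rows in a left join.

```python
import pandas as pd

def duplicated_keys(df, keys):
    """Return each key combination that appears more than once, with its row count."""
    counts = df.groupby(keys).size().reset_index(name="n_rows")
    return counts[counts["n_rows"] > 1].reset_index(drop=True)

# Hypothetical inputs: the right-hand table was loaded with FR/2020 tripled.
left = pd.DataFrame({
    "Country": ["FR", "DE"],
    "Year": [2020, 2020],
    "gdp": [1.0, 2.0],
})
right = pd.DataFrame({
    "Country": ["FR", "FR", "FR", "DE"],
    "Year": [2020, 2020, 2020, 2020],
    "pop": [67, 67, 67, 83],
})

# Check the join keys on each input before joining.
dupes = duplicated_keys(right, ["Country", "Year"])
print(dupes)

# A left join keeps every left row once per matching right row,
# so the tripled key produces three output rows for FR/2020.
joined = left.merge(right, on=["Country", "Year"], how="left")
print(len(left), "rows in,", len(joined), "rows out")
```

If the duplicate check comes back empty for every input and the row count still grows, the join settings themselves are the next thing to look at. In pandas specifically, passing `validate="one_to_one"` to `merge` makes the join raise an error instead of silently multiplying rows.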