Weird behavior of (left) join recipe with post-join computed columns losing records
I'm experiencing something unexpected with the join recipe when using a left join and post-join computed columns.
I'm joining 2 datasets on a single column and then computing 4 additional columns after the join. I check the number of records before and after the join.
With 3 of my computed columns everything's fine and the number of records stays the same after the join (about 260k). But when I add the 4th column, which consists of a series of CASE WHEN statements on a single column, I end up with only 80k records in the output dataset.
None of the computed columns use the column used for the join.
Furthermore, if I use VIEW QUERY on the OUTPUT tab and run that query in my database, I do get the normal 260k records. So it's not a problem with the built query.
I'm using in-database SQL processing on a Teradata database.
Any ideas what could be causing this?
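For anyone who wants to reproduce the check outside DSS, here is a minimal sketch of the before/after row-count comparison described above. It uses Python's built-in sqlite3 as a stand-in for Teradata, and the table names (left_ds, right_ds), column names (join_key, amount, label) and CASE WHEN bands are all hypothetical, not taken from the actual recipe.

```python
import sqlite3

# In-memory SQLite database as a stand-in for Teradata (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE left_ds (join_key INTEGER, amount REAL);
    CREATE TABLE right_ds (join_key INTEGER, label TEXT);
    INSERT INTO left_ds VALUES (1, 5), (2, 50), (3, 500), (4, NULL);
    INSERT INTO right_ds VALUES (1, 'a'), (2, 'b');
    """
)

# Row count of the left dataset before the join.
left_count = conn.execute("SELECT COUNT(*) FROM left_ds").fetchone()[0]

# Row count after a left join that also computes a CASE WHEN column.
# The computed column lives in the SELECT list, not in a WHERE clause,
# and does not reference the join key.
joined_count = conn.execute(
    """
    SELECT COUNT(*) FROM (
        SELECT l.join_key, l.amount, r.label,
               CASE
                   WHEN l.amount < 10 THEN 'low'
                   WHEN l.amount < 100 THEN 'medium'
                   ELSE 'high'
               END AS amount_band
        FROM left_ds l
        LEFT JOIN right_ds r ON l.join_key = r.join_key
    ) AS joined
    """
).fetchone()[0]

print("before join:", left_count, "after join:", joined_count)
```

In plain SQL, a computed column in the SELECT list does not change the number of rows, which matches what I see when I run the VIEW QUERY output directly. Running the same two counts against the real input table and the real output dataset is a quick way to confirm where the rows go missing.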
Operating system used: Windows
Answers
Sarina Dataiker, Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 317 Dataiker
Hi @Antal,
In the future (or if this is still an issue!) please feel free to open a ticket with the full job diagnostic along with screenshots showing the unexpected output so that we can help troubleshoot.
Thanks,
Sarina