Sign up to take part
Registered users can ask their own questions, contribute to discussions, and be part of the Community!
Added on December 7, 2018 11:03AM
Likes: 0
Replies: 3
Hi,
What you describe is often described as the "ANTI JOIN" concept in SQL. I would recommend indeed to accomplish this using a SQL script recipe in Dataiku.
Unfortunately, there is not a common syntax for ANTI JOINs in SQL, as for regular LEFT/RIGHT/INNER/... JOINS. Depending on the flavour of SQL database you are using, you can look up the specific syntax you need. For instance, in PostgreSQL you can use a syntax like this:
SELECT *
FROM actor a
WHERE NOT EXISTS (
SELECT * FROM film_actor fa
WHERE a.actor_id = fa.actor_id
)
This is documented online in this article: https://blog.jooq.org/2017/01/12/a-probably-incomplete-comprehensive-guide-to-the-many-different-ways-to-join-tables-in-sql/. For other SQL flavours, similar articles certainly exist to help you write the right script.
Hope it helps,
Alex
Hi @acrow6
,
We've recently released a new version of Dataiku, 11.3, and you can now add additional output datasets to a Join recipe that contain all the unmatched rows from either a left, right, or inner join. This will let you easily create datasets with rows that don't match the join!
Anti-joins/exclusion joins are on our roadmap.
Cheers,
Ashley