Possible to make Datasets join with "contains" condition?

Thanh_Thanh
Hi all,

I'm pretty new to Dataiku, and I'm currently trying to join two datasets. I found the Join recipe.

However, my join condition is not "equals" but "contains". When I choose this "contains" condition in the Join recipe, I get this error: "DSS can only join with equality conditions"

Any idea how I can do this, please?

Thanks in advance,

Thanh Thanh
Alex_Combessie Dataiker
Re: Possible to make Datasets join with "contains" condition?
Hi,

To join on a "contains" condition, your input datasets need to be in an SQL connection such as PostgreSQL (see the full list of supported SQL connections at https://doc.dataiku.com/dss/latest/connecting/sql.html).
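
To illustrate what a "contains" join looks like once the data sits in an SQL database, here is a minimal sketch. It is not the query DSS generates: it uses Python's built-in sqlite3 module as a stand-in for a real SQL connection, and the table names (messages, keywords), column names (body, word), and the helper contains_join are all hypothetical. The point is only that the join condition is a substring test rather than an equality.

```python
import sqlite3


def contains_join(texts, keywords):
    """Return (text, keyword) pairs where the text contains the keyword.

    Hypothetical helper: sqlite3 stands in for an SQL connection, and the
    table and column names are made up for illustration.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE messages (body TEXT)")
        conn.execute("CREATE TABLE keywords (word TEXT)")
        conn.executemany("INSERT INTO messages VALUES (?)", [(t,) for t in texts])
        conn.executemany("INSERT INTO keywords VALUES (?)", [(k,) for k in keywords])
        # Non-equality join: instr() returns a position > 0 when
        # k.word occurs somewhere inside m.body (case-sensitive).
        rows = conn.execute(
            """
            SELECT m.body, k.word
            FROM messages AS m
            JOIN keywords AS k
              ON instr(m.body, k.word) > 0
            ORDER BY m.body, k.word
            """
        ).fetchall()
    finally:
        conn.close()
    return rows


pairs = contains_join(
    ["red apple pie", "green pear", "plain rice"],
    ["apple", "pear", "pie"],
)
print(pairs)
```

In PostgreSQL, the same condition could be written as POSITION(k.word IN m.body) > 0 or m.body LIKE '%' || k.word || '%'. Keep in mind that a join like this compares every row of one table against every row of the other, so it can get slow on large datasets.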

Alternatively, you can use the Spark engine, which requires some configuration (see https://doc.dataiku.com/dss/latest/spark/index.html). This engine is meant for HDFS connections but can also be run locally, although you then lose the benefits of parallelization on Big Data.

Hope it helps,

Alex