Is it possible to join datasets with a "contains" condition?
Thanh_Thanh
Registered Posts: 1 ✭✭✭✭
Hi all,
I'm pretty new to Dataiku, and I'm currently trying to join two datasets. I found the Join recipe.
However, my join condition is "contains" rather than "equals". When I choose the "contains" condition in the Join recipe, I get this error: "DSS can only join with equality conditions"
Any idea how I can do this, please?
Thanks in advance,
Thanh Thanh
Answers
Hi,
To join on a "contains" condition, your input datasets need to be stored in an SQL connection such as PostgreSQL (see all supported SQL connections at https://doc.dataiku.com/dss/latest/connecting/sql.html), so that the join can run in the database.
Alternatively, you can use the Spark engine (it requires configuration, see https://doc.dataiku.com/dss/latest/spark/index.html). This engine is meant for HDFS connections but can also be run locally, although you then lose the benefits of parallelization on big data.
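To give an idea of what a "contains" join amounts to once the data sits in a SQL database, here is a minimal sketch. It is not the SQL that DSS generates: it uses Python's built-in sqlite3 module as a stand-in for a real SQL connection, and the tables (orders, keywords) and their columns are made up for the example. In PostgreSQL the same condition would typically be written with strpos(...) > 0 or LIKE '%' || keyword || '%'.

```python
import sqlite3


def contains_join(conn):
    """Return (order_id, keyword) pairs where the description contains the keyword."""
    # Hypothetical tables: orders(order_id, description) and keywords(keyword).
    # instr() returns the 1-based position of the substring, or 0 if absent,
    # so "> 0" expresses the "contains" condition (case-insensitive via lower()).
    query = """
        SELECT o.order_id, k.keyword
        FROM orders AS o
        JOIN keywords AS k
          ON instr(lower(o.description), lower(k.keyword)) > 0
        ORDER BY o.order_id, k.keyword
    """
    return conn.execute(query).fetchall()


# In-memory database with a few illustrative rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER, description TEXT);
    CREATE TABLE keywords (keyword TEXT);
    INSERT INTO orders VALUES
        (1, 'Red apple pie'),
        (2, 'Green pear tart'),
        (3, 'Plain bread');
    INSERT INTO keywords VALUES ('apple'), ('pear');
""")

rows = contains_join(conn)
print(rows)  # [(1, 'apple'), (2, 'pear')]
```

Order 3 matches no keyword, so an inner join drops it; a left join would keep it with an empty keyword. Note that a non-equality condition like this generally makes the database compare every pair of rows, so it can be slow on large tables.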
Hope it helps,
Alex