Support joining to more than one dataset in join recipe


Right now, the Join recipe only allows each new dataset to be joined to a single other dataset. Several datasets can be joined to the same dataset, but each one can only be joined to one dataset. However, more complex joins are common. It would be great if the dataset could be changed for each join condition, so that any column can be joined to any column.

Here's an example:

 

A JOIN B ON A.C1 = B.C1 JOIN C ON A.C2 = C.C2 AND B.C3 = C.C3
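To make the request concrete, here is a minimal sketch that runs the join above against an in-memory SQLite database. The table contents, the `label` column, and the function name `run_multi_parent_join` are all made up for illustration; only the join conditions come from the example. The point is that C has to match on a key from A and a key from B at the same time, which a single-parent join cannot express.

```python
import sqlite3


def run_multi_parent_join():
    """Run the example join on toy data (all values are illustrative assumptions)."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    cur.executescript(
        """
        CREATE TABLE A (C1 INTEGER, C2 INTEGER);
        CREATE TABLE B (C1 INTEGER, C3 INTEGER);
        CREATE TABLE C (C2 INTEGER, C3 INTEGER, label TEXT);

        INSERT INTO A VALUES (1, 10), (2, 20);
        INSERT INTO B VALUES (1, 100), (2, 200);
        -- Only the first C row matches both A.C2 and B.C3 of the same joined row.
        INSERT INTO C VALUES
            (10, 100, 'match'),
            (20, 999, 'C3 differs'),
            (99, 200, 'C2 differs');
        """
    )
    rows = cur.execute(
        """
        SELECT A.C1, A.C2, B.C3, C.label
        FROM A
        JOIN B ON A.C1 = B.C1
        JOIN C ON A.C2 = C.C2 AND B.C3 = C.C3
        ORDER BY A.C1
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in run_multi_parent_join():
        print(row)
```

With this toy data, the rows in C that match only A or only B are dropped, because both conditions on C must hold together.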

 

This would reduce the need for SQL recipes and make complex joins much more manageable visually. It's not uncommon to have 20-30 datasets in a single join, with multiple datasets referencing keys from multiple other datasets. The Dataiku UI is nicer than trying to keep track of all that through code, but currently it only works where join paths are linear.

 
