As an analyst who is selectively joining data through the visual prepare recipe, I'd like to be able to add a constant as one of the join conditions, so that I can make more selective joins through the visual prepare recipes without having to write SQL from scratch.
Today you can join on a column value coming from either dataset. But you can not pick a constant. I have a use case where a join has to be done only if a value in a column is equal to a certain constant value and that value does not appear in the table being joined. For example I can not model the following SQL in the visual Join recipe (without writing a minimally documented advanced join).
Select *
From t_customer as tc
left join t_address as ta
on tc.customer_no = ta.customer_no
and ta.address_type = 'primary_address'
In a very real case like this for me, the constant is being used because the database has other address types and we want all customer records. We do not want to duplicate customer records by adding result records for different address types like 'home_address'. Finally even if we do not have a primary address defined. We do want all of the customer records. This kind of join can not be done with a where clause.
I do note that one can use a "custom SQL condition" in a visual prepare recipe. But it is hard for the "analyst" persona to figure out how to use this more advanced setup. For example to do the above join, one would use the following condition. Assuming that t_customer is $left and t_address is $right:
$left."customer_no" = $right."customer_no"
and $left."address_type" = 'primary_address'
There appears to be little or no documentation on the "Advanced join" other than this comment in the knowledge base.
Advanced join |
Provides custom options for row selection and deduplication for when none of the other options are suitable. |
And these some what cryptic notes on the actual dialog box for custom join conditions.
Write a custom SQL condition. It will be copied in the request as:
SELECT (...) FROM (...) JOIN (...) ON <your_code_here>
You should use $left and $right to refer to both datasets.
The $left and $right tokens will be replaced by the corresponding quoted
table names.
It is nice that we can do this. This is a request for at least better documentation of this advanced feature. But preferably and option to use a constant rather than a table for the right part of the join conditions.