CTE SQL Script Patterns

Taylor Registered Posts: 15 ✭✭✭✭

Hey all,

Does anyone have some template / boilerplate code that works well with CTEs?

I very much prefer the simplicity of SQL Query recipes over SQL Script recipes, but I know you can't use SQL Query recipes with CTEs in them because of how DSS re-writes your query dynamically.

Does anyone have any simple patterns for using CTEs in SQL Script recipes that can be re-used in code snippets?

If not, then can someone provide some clarity on what the docs say about using SQL Script recipes?

"You must manage CREATE / DROP statements yourself"

but then it also mentions:

"The schema of the output dataset(s) are automatically infered from the tables created by your script"

I'm confused at how we're on the hook for CREATE statements, but the schema is automatically inferred from the tables created.

Also, I'm on 6.0.5 if that matters.




  • Clément_Stenac
    Clément_Stenac Dataiker, Dataiku DSS Core Designer Posts: 753 Dataiker


    Having documented practices and/or workarounds to use in SQL query recipes around CTE is on our todo list, but is not currently available.

    It's useful to remember what a SQL query recipe does:

    • When you validate the query, DSS sends it to the database and asks for the metadata of the result in order to fill the schema of the output dataset based on what the query outputs
    • When you run the recipe, DSS first creates the output table
    • Then, DSS rewrites your query to transform the SELECT into a "INSERT FROM SELECT" in the previously-created table. This is where the situation is a bit muddy with CTEs, because most databases don't support CTEs in subqueries.
    • Then DSS executes the rewritten query

    WIth a SQL script recipe, DSS just splits your script in statements and executes them. Hence, you must create the output table, because DSS does not do it for you. The idea is that "forcing you" to do it also allows you to do it in a custom fashion, for example using a CTAS.

    However, even if DSS does not create the table, it still knows what table you're supposed to create (this info is in the output dataset). Thus, *after* running your code, DSS can submit a "SELECT *" statement about this table, and inspect the metadata in order to fill the schema of the output dataset.

Setup Info
      Help me…