Using PostgreSQL for computation
The page https://academy.dataiku.com/latest/concepts/where-compute-happens.html suggests that some recipes, as well as Machine Learning training and execution, can run in a SQL database. Is that also the case with PostgreSQL?
Best Answer
-
Hi,
To provide some additional context, DSS offers several execution engines for running jobs and recipes. By default, the DSS engine is selected, but you can switch to another engine if certain conditions are met. If the input and output datasets of a recipe use the same SQL database, you will be able to select the "In-database (SQL)" engine in most cases (unless a processor doesn't support it). DSS then automatically generates the SQL query that corresponds to the recipe's actions and pushes it to the underlying database, where it is executed directly.
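To make the idea of pushing a recipe down to the database concrete, here is a minimal sketch. It is not the SQL that DSS actually generates: the table names, column names and the statement itself are hypothetical, and Python's built-in sqlite3 stands in for PostgreSQL only so the example runs without a server. The point it illustrates is that when input and output live in the same database, a single statement can build the output without any rows passing through the client.

```python
import sqlite3

# sqlite3 stands in for PostgreSQL here so the sketch runs without a server.
conn = sqlite3.connect(":memory:")

# Hypothetical input dataset stored as a table in the database.
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("alice", 10.0), ("bob", 5.0), ("alice", 2.5)],
)

# Hypothetical statement of the kind a group-by recipe could translate to.
# Both the input and the output table live in the same database, so the
# aggregation runs entirely inside the database engine.
conn.execute(
    """
    CREATE TABLE orders_by_customer AS
    SELECT customer, SUM(amount) AS total_amount
    FROM orders
    GROUP BY customer
    """
)

# Read the result back only to inspect it.
rows = conn.execute(
    "SELECT customer, total_amount FROM orders_by_customer ORDER BY customer"
).fetchall()
print(rows)
```

With the default DSS engine, by contrast, the rows would be read out of the database, processed by DSS, and written back.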
Therefore, to answer your question: to run your recipe in PostgreSQL directly, you need to make sure that both the input and output datasets for that recipe use the same PostgreSQL database. You should then have the option to select the "In-database (SQL)" engine from the recipe page (in the lower left). More information can be found in our documentation here and here.
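The condition described above can be written down as a small check. This is only a toy model of the rule as stated in this answer, not the DSS API: the `Dataset` class, the connection names and the `can_run_in_database` function are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Dataset:
    name: str
    connection: str  # hypothetical connection name, e.g. "pg_analytics"
    is_sql: bool     # True if the dataset is stored in a SQL database


def can_run_in_database(
    inputs: Iterable[Dataset],
    outputs: Iterable[Dataset],
    steps_supported: bool = True,
) -> bool:
    """Toy version of the rule: every input and output must sit in the
    same SQL database, and every step must be translatable to SQL."""
    datasets = list(inputs) + list(outputs)
    if not datasets or not all(d.is_sql for d in datasets):
        return False
    same_connection = len({d.connection for d in datasets}) == 1
    return same_connection and steps_supported


# Hypothetical datasets for illustration.
orders = Dataset("orders", "pg_analytics", is_sql=True)
summary = Dataset("orders_summary", "pg_analytics", is_sql=True)
upload = Dataset("uploaded_csv", "filesystem_managed", is_sql=False)

print(can_run_in_database([orders], [summary]))  # same PostgreSQL connection
print(can_run_in_database([upload], [summary]))  # input is not in the database
```

In the second case the input would first need to be synced into PostgreSQL before the in-database engine could apply.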
I hope that this helps!
Best,
Andrew
Answers
-
Hi,
Yes indeed, PostgreSQL is one of the compatible data sources and can be used to read and write datasets and to execute most visual recipes in-database. ML models (Python-based by default) are not compatible with this backend.
Hope this helps
-
Thanks.
But how do we execute visual recipes in PostgreSQL? I am interested in running the recipe on PostgreSQL so that it doesn't run locally on my DSS instance.
-
Great, I understand the point about running recipes on PostgreSQL now.
What about running models on PostgreSQL using Vertica ML and XGBoost? They both are mentioned in the link https://academy.dataiku.com/latest/concepts/where-compute-happens.html for running on SQL.