Configuring dev servers at the project or user level
Hi,
My team at work uses Dataiku 13.5.5. We typically collaborate on projects with external partners who have more limited access to data connections and other Dataiku features than we do.
Just today our external partner told us that they cannot run test queries against endpoints they set up in the API Designer. The error message pointed to missing access to an SQL connection.
After some research the cause became clear: the dev server that Dataiku starts to run test queries was bundled with an SQL connection to which the external partners did not have access. We then found out that the configuration of the dev server (including which connections it is bundled with) can only be set at the instance level.
This, in my humble opinion, seems like a bit of a design flaw for collaboration, for the following reason: in larger organisations, different teams will likely have access to different data sources for compliance reasons, yet several of those teams may be developing endpoints that they want to test. Being able to configure the dev server's connections only globally, at the instance level, therefore seems flawed and error-prone.
So far I have not found an option to configure the data connections of the dev server at the project or user level, which would solve this problem (please correct me if I am wrong and have missed this information). I think this would be a very useful feature.
Thank you
Answers
Alexandru (Dataiker)
Hi Felix,
Thanks for your detailed post. Indeed, currently, the dev server can only use one bundled connection.
Admins define this connection under Administration - Deployer. This limitation does not apply to endpoints that are actually deployed on API nodes or in K8s API deployments; there you can use any compatible connection.
https://doc.dataiku.com/dss/latest/apinode/enrich-prediction-queries.html#api-node-configuration
Typical uses for SQL connections in API endpoints are fast lookup/enrichment endpoints, so you will want to use a low-latency transactional database with relatively small datasets.
When a SQL connection is available to a deployed endpoint, or even to the local API Designer (dev/lambda) server, the endpoint cannot inherit user/group/connection permissions, because API endpoints are decoupled from the Dataiku instance.
Thus, allowing users to pick any connection at the API endpoint or project level could potentially bypass other controls set by the admin.
I can understand how this can be limiting during development testing on the lambda/dev server. An admin can change this to a different connection at any time if needed.
Typically, you could have the admin set a single connection for this case, e.g., a test database into which users sync a partial test dataset during development. Later they can switch to the actual connection and deploy the endpoint if necessary.
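As a rough illustration of that workflow only (this is not Dataiku's API): the sketch below uses Python's built-in sqlite3 as a stand-in for both the restricted connection and the shared test database. The `customers` table and the `make_db`, `sync_sample`, and `lookup_segment` helpers are hypothetical names chosen for the example.

```python
import sqlite3


def make_db(rows):
    """Create an in-memory database holding a hypothetical `customers` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, segment TEXT)")
    conn.executemany("INSERT INTO customers VALUES (?, ?)", rows)
    conn.commit()
    return conn


def sync_sample(source, target, limit):
    """Copy the first `limit` rows (ordered by id) from source into target."""
    rows = source.execute(
        "SELECT id, segment FROM customers ORDER BY id LIMIT ?", (limit,)
    ).fetchall()
    target.executemany("INSERT INTO customers VALUES (?, ?)", rows)
    target.commit()
    return len(rows)


def lookup_segment(conn, customer_id):
    """Enrichment-style lookup: return the segment for one id, or None."""
    row = conn.execute(
        "SELECT segment FROM customers WHERE id = ?", (customer_id,)
    ).fetchone()
    return row[0] if row else None


# Stand-in for the restricted connection that partners cannot access.
prod = make_db([(1, "retail"), (2, "corporate"), (3, "retail"), (4, "public")])
# Stand-in for the shared test database bundled with the dev server.
test = make_db([])
copied = sync_sample(prod, test, limit=2)
```

Because `lookup_segment` takes the connection as a parameter, the same lookup logic can be exercised against the partial test data during development and pointed at the actual connection afterwards.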
Support for multiple connections can be submitted as a feature request for future consideration, either at https://community.dataiku.com/categories/product-ideas/p1 or via a support ticket.
Thanks
