Using SQLExecutor2 inside a shared library
Hi,
I would like to execute some raw SQL queries, for example to insert rows directly into an Oracle database.
Based on the various community discussions, I chose to use SQLExecutor2.
My code is as follows:
import dataiku
from dataiku import SQLExecutor2

def test():
    # get the needed data to prepare the query
    # for example, load from another table
    executor = SQLExecutor2(connection='test')
    res = executor.query_to_df("SELECT * FROM testable")
I tried to call this function from my Python API endpoint. However, it throws the error below:
Failed: Failed to run function : <class 'Exception'> : Default project key is not specified (no DKU_CURRENT_PROJECT_KEY in env)
Operating system used: RHEL
Best Answer
-
Turribeach (Neuron):
That post is still correct, but you are coming at it from a different angle since you are in the API node. Personally, I think you should move away from using shared libraries in the API node. If you want to re-use the code, you can put it in a custom Python package which you then install in the API node. User secrets are not available in the API node since there are no users there, just the Dataiku OS account.
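For example, a minimal sketch of such a package (the name "myaudit" is hypothetical, and so is the dependency list):

# setup.py at the root of the package repo; the name "myaudit" is hypothetical
from setuptools import setup, find_packages

setup(
    name="myaudit",
    version="0.1.0",
    packages=find_packages(),        # picks up the myaudit/ module directory
    install_requires=["cx_Oracle"],  # the package declares its own dependencies
)

You then pip install the package into the API node's code environment, and the endpoint does a plain import myaudit with no dataiku dependency at all.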
With regards to your SQL query needs, I would move away from using a library or writing custom Python code. Either use a query enrichment as previously advised, or expose a SQL query as an API endpoint which you can then call from your other API endpoint. This will use an existing connection on the Designer node, which you will then need to remap and define in the API node config file. This removes the need to hardcode connection details in your code, although if I remember correctly the API node configuration file does not encrypt the credentials.
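As a rough sketch, calling such a SQL query endpoint from your Python endpoint could look like the code below. The host, service ID, endpoint ID and parameter name are all illustrative; the API Designer shows the exact sample call for your deployment:

import requests

API_NODE_URL = "https://apinode.example.com"  # illustrative host
SERVICE_ID = "audit-service"                  # illustrative service id
ENDPOINT_ID = "business-rules-query"          # illustrative SQL query endpoint id

def run_rules_query(customer_id):
    # POST the named parameters the SQL query endpoint expects
    resp = requests.post(
        f"{API_NODE_URL}/public/api/v1/{SERVICE_ID}/{ENDPOINT_ID}/query",
        json={"parameters": {"customer_id": customer_id}},
        timeout=5,
    )
    resp.raise_for_status()
    return resp.json()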
And finally, with regards to logging responses to a database: this is not recommended in an API endpoint, as it adds a potential failure point and slows down the API. API endpoints should perform the minimum work needed to return their result as quickly as possible. The API node has built-in logging capabilities which you can enable as needed. Should you need to use the audit data, you can either load the API node audit log files in a separate Dataiku Designer project or send the logging events to a Kafka topic for a near-real-time audit solution.
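If you really must log from inside the endpoint, one way to keep the write off the request path is a background writer thread. A minimal sketch, where write_to_sink is a placeholder for whatever sink you choose (DB insert, Kafka produce, etc.), not a Dataiku feature:

import queue
import threading

audit_queue = queue.Queue()

def audit_writer():
    # runs in the background so the endpoint never waits on the sink
    while True:
        event = audit_queue.get()
        try:
            write_to_sink(event)  # placeholder for your actual sink
        except Exception:
            pass  # an audit failure must never break request handling
        finally:
            audit_queue.task_done()

threading.Thread(target=audit_writer, daemon=True).start()

def api_py_function(request_payload):
    result = {"decision": "approved"}  # illustrative business logic
    audit_queue.put({"request": request_payload, "response": result})
    return result  # returns immediately; the write happens in the background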
Answers
-
Turribeach (Neuron):
The API node has no concept of projects; it's purely an infrastructure layer that runs your API endpoints. In general the dataiku package is meant to be used from inside DSS, meaning from a Designer or Automation node, not from the API node. It is possible to use the internal client remotely from an API endpoint, but this is not a good design pattern: why would you want to make your API endpoints depend on a Designer or Automation node being up? It's not a good idea. If you want to interact with a database in your API endpoint, go ahead and write some Python code to do that, but don't use DSS for it. For read-only use cases like query enrichment, see here.
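For the write path, that could look like the sketch below: plain cx_Oracle with no dataiku import. The environment variable names and the audit table are placeholders:

import os
import cx_Oracle

def log_audit(request_msg, response_msg):
    # credentials come from the endpoint's environment, not from DSS
    conn = cx_Oracle.connect(
        user=os.environ["AUDIT_DB_USER"],
        password=os.environ["AUDIT_DB_PASSWORD"],
        dsn=os.environ["AUDIT_DB_DSN"],  # e.g. "dbhost:1521/servicename"
    )
    try:
        cur = conn.cursor()
        cur.execute(
            "INSERT INTO audit_log (request_msg, response_msg) VALUES (:req, :res)",
            req=request_msg,
            res=response_msg,
        )
        conn.commit()
    finally:
        conn.close()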
-
Thanks for your answer.
Basically, I am trying to execute a SQL query (not related to datasets) inside the Python API endpoint. This query decides a few business rules, and I also want to log the request and response messages to an audit table.
I tried creating a shared library and used plain Python connectors to connect to the database, after installing the required client libraries and Python packages into the code env. It worked.
Now I want to use user secrets (https://doc.dataiku.com/dss/latest/security/user-secrets.html) inside the shared library with a good design (i.e. without depending on the Designer/Automation nodes), but I am not sure how. Do you have any thoughts on this?
For instance, I want to use the cx_Oracle library and pass the user, password and service name from the user secrets instead of hardcoding them in the shared library code.
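For reference, on a Designer node I believe reading a secret looks roughly like this (the secret key name "oracle_password" is just an example):

import dataiku

client = dataiku.api_client()
auth_info = client.get_auth_info(with_secrets=True)

oracle_password = None
for secret in auth_info["secrets"]:
    if secret["key"] == "oracle_password":  # example key name
        oracle_password = secret["value"]
        break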
I saw one of your answers to a similar question: Hide API Keys in Project Library Editor. Does the challenge mentioned in that post still hold? The only issue with that answer is that it uses the dataiku library in the API node, which is an anti-pattern per your initial post.