Why are SQL queries in Dataiku slower than in my AWS Docker container (RDS Oracle)?

Hello,
I'm currently using Dataiku and SQLExecutor2 to run queries on my Oracle database hosted on AWS RDS, port 2484. When I execute the same query from a Docker container on AWS, the query takes about 15 ms. However, when I run it in Dataiku, it takes approximately 1 second, and the whole process, which takes about 8 hours in AWS, has extended to over 2 days and still hasn’t finished.
Could anyone explain why the queries take much longer in Dataiku? I see the following logs in my Python recipe:
2025-02-24 17:17:51,509 INFO Got initial SQL query response
2025-02-24 17:17:51,595 INFO Starting SQL query reader
2025-02-24 17:17:52,560 INFO Got initial SQL query response
2025-02-24 17:17:52,639 INFO Starting SQL query reader
2025-02-24 17:17:53,573 INFO Got initial SQL query response
2025-02-24 17:17:53,648 INFO Starting SQL query reader
Can you explain how queries are executed by Dataiku and SQLExecutor2, whether there are reasons this process might be slower, and whether there is anything I can optimize to improve query performance?
Thanks!
Operating system used: OS
Answers
Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,329 Neuron
We don’t have all the facts needed to compare. You say you are running the queries in Dataiku: what specific Oracle driver version are you using in Dataiku? What Oracle database version do you connect to? Where is your DSS node located? Where are your Docker containers located? What exactly do you mean by “I execute the same query from a Docker container on AWS”? You need to describe the whole technology stack there, including all the driver and software versions used.
Finally, how do you go from a query that takes 1 second to run to a process that takes 2 days? Are you running SQL queries in a for loop? That’s a really bad pattern. Explain exactly what your requirement is and how you are trying to achieve it.
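To illustrate why a query-per-iteration loop hurts, here is a minimal sketch that contrasts one query per key with one query per batch of keys. It is only an assumption about what the loop might look like, since the original code was not posted. It uses Python's built-in sqlite3 as a stand-in for Oracle so that it runs anywhere, and the `orders` table, its columns, and the `wanted_ids` list are hypothetical.

```python
import sqlite3

# Stand-in database: in-memory sqlite3, used only so the sketch is runnable.
# The table and its contents are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(i, i * 1.5) for i in range(1, 1001)],
)

wanted_ids = list(range(1, 1001, 2))  # hypothetical keys to look up


def fetch_one_by_one(conn, ids):
    """Anti-pattern: one query (one round trip) per key."""
    result = {}
    for i in ids:
        row = conn.execute(
            "SELECT amount FROM orders WHERE id = ?", (i,)
        ).fetchone()
        result[i] = row[0]
    return result


def fetch_in_batches(conn, ids, batch_size=500):
    """One query per batch of keys, so the fixed per-query cost
    is paid once per batch instead of once per key."""
    result = {}
    for start in range(0, len(ids), batch_size):
        chunk = ids[start:start + batch_size]
        placeholders = ",".join("?" * len(chunk))
        sql = f"SELECT id, amount FROM orders WHERE id IN ({placeholders})"
        result.update(dict(conn.execute(sql, chunk).fetchall()))
    return result


slow = fetch_one_by_one(conn, wanted_ids)   # 500 queries
fast = fetch_in_batches(conn, wanted_ids)   # 1 query
```

Both functions return the same data, but the second pays the fixed per-query overhead far fewer times. In a Dataiku recipe the same idea would presumably mean passing one larger query to SQLExecutor2 (for example through `query_to_df`) rather than calling it inside the loop. On Oracle, keep each IN list under 1,000 values, or join against a table holding the keys instead.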