Guidance Needed: Building a SQL Chatbot in Dataiku

Dear Dataiku Experts,
I’m working on a Dataiku project where I want to enable natural language interaction with a production SQL database — essentially, a chatbot I call the “Production Analyst.” The idea is that a user can type queries in natural language, and the chatbot will interpret them to perform production analysis, such as retrieving key metrics or generating diagnostic plots.
I’m familiar with using the LLM Prompt recipe to build chatbots for unstructured data (e.g., documents), but I’m unsure how to approach this for structured data like SQL tables.
Here’s what I’ve tried so far:
- Prompt recipe: While it supports natural language input, I couldn't find a way to tailor prompts specifically for SQL query generation and execution.
- AI-generated SQL: I’ve seen how Dataiku can generate SQL queries from natural language, which is great — but I want to go a step further by automatically fetching data and running diagnostic analyses within an application interface.
- Agent Tool – “Performs SQL queries in a set of SQL datasets”: I’ve set this up, but I’m unclear on what the next steps are after configuring the agent.
Outside of Dataiku, I’ve experimented with this concept in a Python Jupyter notebook using function/tool calling (e.g., with Gemini 1.5 or GPT APIs), and it works well. However, I’m unsure how to translate that into a working solution within Dataiku.
I've reviewed the documentation and browsed through community discussions but haven’t found a practical example or end-to-end guide on building an interactive SQL chatbot or app inside Dataiku.
Below is a snippet of the Python-based prototype I’ve been testing outside Dataiku:
from google import genai
from google.genai import types

# These are the Python functions defined above.
db_tools = [list_tables, describe_table, execute_query]

instruction = """You are a helpful chatbot that can interact with an SQL database
for a computer store. You will take the user's questions and turn them into SQL
queries using the tools available. Once you have the information you need, you will
answer the user's question using the data returned.

Use list_tables to see what tables are present, describe_table to understand the
schema, and execute_query to issue an SQL SELECT query."""

client = genai.Client(api_key=GOOGLE_API_KEY)

# Start a chat with automatic function calling enabled.
chat = client.chats.create(
    model="gemini-2.0-flash",
    config=types.GenerateContentConfig(
        system_instruction=instruction,
        tools=db_tools,
    ),
)
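The three tool functions named in the snippet (list_tables, describe_table, execute_query) are not shown in the post. As a rough sketch only, assuming a SQLite database, they could look like the following. The in-memory `products` table is a hypothetical stand-in for the real production database, and the function bodies are an illustration rather than the code from the original notebook.

```python
import sqlite3

# Hypothetical in-memory database standing in for the real SQL database.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE products (product_id INTEGER PRIMARY KEY, name TEXT, price REAL);
    INSERT INTO products (name, price) VALUES ('Laptop', 799.99), ('Mouse', 19.99);
    """
)


def list_tables() -> list[str]:
    """Return the names of all tables in the database."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [row[0] for row in rows]


def describe_table(table_name: str) -> list[tuple[str, str]]:
    """Return (column name, column type) pairs for one table."""
    # Reject unknown names so the table name is never interpolated blindly.
    if table_name not in list_tables():
        raise ValueError(f"Unknown table: {table_name}")
    rows = conn.execute(f'PRAGMA table_info("{table_name}")').fetchall()
    return [(row[1], row[2]) for row in rows]


def execute_query(sql: str) -> list[list]:
    """Run a SELECT statement and return the rows as lists."""
    # Simple guard: the model is only allowed to read data.
    if not sql.lstrip().lower().startswith("select"):
        raise ValueError("Only SELECT queries are allowed")
    return [list(row) for row in conn.execute(sql).fetchall()]
```

The docstrings matter here: with automatic function calling, the model typically relies on the function name, signature, and docstring to decide when to call each tool.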
Could you kindly guide me on how I can replicate or implement something similar inside Dataiku — perhaps via plugins, custom recipes, webapps, or Agent Tools? Any example projects or detailed steps would be greatly appreciated.
Kind regards,
Yawar
Answers
-
Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,467 Neuron
You should set up the SQL Assistant:
Here is how to use it:
-
Yawar Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 10 ✭✭✭
Thanks. As I mentioned in my question, I already know about the AI SQL Assistant (AI-generated SQL). But I want to build a chatbot for my database that can do more than write SQL queries: it should also analyze the data (for example, chart it) and call other functions. Can I use the AI SQL Assistant to build a webapp or chatbot, and then connect an LLM to extend its functionality?
Kind regards,
Yawar
-
Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,467 Neuron
Have a look at the SQL Question Answering Tool plugin:
But note that "This tool directly answers the question, it does not return the records for the Agent to synthesize the answer". This plugin is available in DSS 13.4 and later.
-
Yawar Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 10 ✭✭✭
Thanks, I will go through this in more detail.
-
Just FYI, I am also watching this space. I want to do the same thing, but so far I have not been able to find any links on this. If you find a solution, please post it here. Thanks.
-
Yawar Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 10 ✭✭✭
I Sze, thanks for your message. I think I have found something, but I need a bit more time to make sure it's working properly. I am happy to post my solution here.
Basically, there are 3 key steps:
- Go to Agent Tools and create a tool of type "Performs SQL queries in a set of SQL datasets". Attach your SQL datasets, select the SQL connection, and choose a suitable LLM. Give the LLM clear instructions and a description. Then go to Quick test and run a test query. Check that the Tool output contains the right answer, and adjust your instructions until it does.
- In the Dataiku Flow, click Other (next to Datasets and Recipes), then Generative AI, then Visual Agent. In the visual agent, click v1, select an LLM, and provide some instructions for the agent. Click Add Tool and select the tool you created in step 1. This step needs some testing too.
- Finally, look for "Dataiku Answers and Agent connect" in the Visual ML (Analysis) and add a "New agent connect". Here you need to provide a database connection for the prompt history. For the LLM, make sure to select the "Agent - name" entry for the agent you created in step 2. Configure the LLM and add some sample questions. Click View Webapp. You should now be able to chat with your SQL database.
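Once the agent from step 2 exists, it can probably also be called from Python (for example, from a notebook or a custom webapp backend) rather than only through the Agent Connect webapp. The sketch below assumes the LLM Mesh completion interface of the Dataiku Python API (`new_completion`, `with_message`, `execute`); the agent ID is a placeholder, and the helper name `ask_agent` is made up for illustration.

```python
def ask_agent(llm, question: str) -> str:
    """Send one user question to an LLM Mesh handle and return the text answer."""
    completion = llm.new_completion()
    completion.with_message(question)  # added as a user message
    response = completion.execute()
    if not response.success:
        raise RuntimeError("The agent did not return a successful response")
    return response.text


# Usage inside DSS (assumption: replace the placeholder with the real agent ID):
#
#   import dataiku
#   project = dataiku.api_client().get_default_project()
#   agent = project.get_llm("agent:REPLACE_WITH_AGENT_ID")
#   print(ask_agent(agent, "Which tables can you query?"))
```

Keeping the call in a small helper like this makes it easy to add post-processing, such as plotting the returned data, around the agent's answer.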
I hope this helps. While this is working for me, I am running into prompt engineering issues: the agent keeps forgetting the SQL table names and starts to hallucinate. If someone has a better solution, please share it. Otherwise, I would be glad to hear expert opinions on how to fix the issues I am facing.
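One common mitigation for forgotten table names, offered here as a suggestion and not as a confirmed fix, is to spell out the exact tables and columns in the instructions given to the tool or agent. A small helper can generate that text from a schema description. The function name and the table and column names below are hypothetical.

```python
def build_sql_instructions(schema: dict[str, list[str]]) -> str:
    """Build instruction text that lists every table and column explicitly."""
    lines = [
        "You answer questions by querying a SQL database.",
        "Only use the tables and columns listed below; never invent names.",
        "If a question cannot be answered from them, say so.",
        "",
        "Tables:",
    ]
    for table, columns in schema.items():
        lines.append(f"- {table}({', '.join(columns)})")
    return "\n".join(lines)


# Hypothetical schema for illustration only.
example_schema = {"production_daily": ["well_id", "date", "oil_rate"]}
instructions = build_sql_instructions(example_schema)
print(instructions)
```

The resulting text can be pasted into the instructions field of the SQL tool in step 1 or the visual agent in step 2.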