How to "send" a knowledge bank with a get_llm?

simon_aubert
simon_aubert Dataiku DSS Core Designer, Registered Posts: 21 ✭✭✭✭

Hello all,

Here is our context:
- A headless custom web app on Dataiku that receives a user question through an API and must answer based on internal procedures
- As of today, the procedures are sent in Markdown with the prompt, but it's really heavy and the response time is more than 20 seconds
- I have played with an embedded database (Chroma) and a retrieval LLM. It's not that easy to work with the embedding model when you have a lot of text, but in the end it worked.


However, now I would like to use my knowledge bank in my web app with get_llm. I'm not even sure it's possible; the documentation doesn't seem to address this case.

Best regards,

Simon


Operating system used: Windows 11

Best Answer

  • Alexandru
    Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,400 Dataiker
    Answer ✓

    Hi Simon,
    That's not really possible; sending the vector store directly so that the LLM performs the lookup itself is not supported.

    LLMs typically only accept text as input.
    You need to query the vector store to build the necessary context locally and then send text to the LLM.

    If you are not happy with the overall speed, you can use Trace Explorer to understand where the time is spent.

    Usually, using a lighter, faster model will improve speed, as you usually do not need reasoning models for such use cases.
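    The two-step flow described above can be sketched in a few lines. Everything here is a stand-in, not the Dataiku API: `TinyStore.search` uses keyword overlap in place of real embedding + vector search, and `call_llm` is a placeholder for the actual model call.

    ```python
    class TinyStore:
        """In-memory stand-in for a vector store such as Chroma."""

        def __init__(self, docs):
            self.docs = docs

        def search(self, query, k=2):
            # Real retrieval would embed the query and rank by vector
            # similarity; ranking by shared words is enough to show the flow.
            q_words = set(query.lower().split())
            scored = sorted(
                self.docs,
                key=lambda d: len(q_words & set(d.lower().split())),
                reverse=True,
            )
            return scored[:k]

    def call_llm(prompt):
        # Placeholder: a real app would send `prompt` to the LLM here.
        return f"<answer based on {len(prompt)} chars of prompt>"

    store = TinyStore([
        "Procedure A: refunds are processed within 5 days.",
        "Procedure B: escalate security incidents immediately.",
    ])

    question = "How long do refunds take"

    # Step 1: query the store locally to build the context (plain text).
    context_chunks = store.search(question, k=1)

    # Step 2: send only text -- the question plus the retrieved chunks -- to the LLM.
    prompt = "Context:\n" + "\n".join(context_chunks) + "\n\nQuestion: " + question
    answer = call_llm(prompt)
    ```

    The point of the sketch is that the LLM never sees the store itself, only the text assembled in step 2.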



Answers

  • Alexandru
    Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,400 Dataiker

    Hi,
    You should be able to use the Knowledge Bank from the LLM Mesh following the example here


    https://developer.dataiku.com/latest/tutorials/genai/nlp/llm-mesh-rag/index.html#running-an-enriched-llm-query

    Thanks

  • simon_aubert
    simon_aubert Dataiku DSS Core Designer, Registered Posts: 21 ✭✭✭✭

    Hello @Alexandru
    Thanks for your answer. However, from what I understand, this code works in two steps:
    1/ query the vector store and find the context (a string)
    2/ prompt the LLM with the question and the string found in step 1

    What I would like is to send the Chroma DB directly, rather than retrieving data at an intermediate step. My understanding is that sending the Chroma DB directly would improve the speed of the answer, since the LLM has less work to do. Am I wrong?

    Best regards,

    Simon

  • Alexandru
    Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,400 Dataiker

    Even if the LLM supports lookups in a managed vector database, it’s still fundamentally a two-step process.
    The service first performs vector search on the store, then injects the top-K retrieved chunks into the prompt before the LLM generates an answer. The two steps are just abstracted.
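    A sketch of why a "managed" lookup is still two steps under the hood: `enriched_query` below looks like a single call, but internally it runs a vector search (cosine similarity over precomputed embeddings) and then injects the top-K chunks into the prompt. All names and the toy 3-dimensional embeddings are illustrative, not any real service API.

    ```python
    import math

    # Pretend these vectors came from an embedding model at indexing time.
    INDEX = [
        ([1.0, 0.0, 0.0], "Chunk about refunds."),
        ([0.0, 1.0, 0.0], "Chunk about security incidents."),
        ([0.9, 0.1, 0.0], "Chunk about refund deadlines."),
    ]

    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return dot / (na * nb)

    def enriched_query(question_vec, question_text, k=2):
        # Step 1 (hidden from the caller): vector search for the top-k chunks.
        ranked = sorted(INDEX, key=lambda e: cosine(e[0], question_vec), reverse=True)
        top_chunks = [text for _, text in ranked[:k]]
        # Step 2 (hidden from the caller): inject the chunks into the prompt.
        prompt = "Context:\n" + "\n".join(top_chunks) + "\n\nQuestion: " + question_text
        return prompt  # a real service would now send this text to the LLM

    prompt = enriched_query([1.0, 0.0, 0.0], "How long do refunds take?")
    ```

    Whether you write these steps yourself or a service abstracts them, the model still only ever receives text, so skipping retrieval cannot save the LLM any work.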

  • simon_aubert
    simon_aubert Dataiku DSS Core Designer, Registered Posts: 21 ✭✭✭✭

    Ok thanks @Alexandru that's helpful (sad but helpful).
