How to "send" a knowledge bank with get_llm?
Hello all,
Here is our context:
- A headless custom web app on Dataiku that receives a user question through an API and must answer it based on internal procedures
- As of today, the procedures are sent as markdown along with the prompt, but this is really heavy and the response time is more than 20 seconds
- I have played with an embedded database (Chroma) and a retrieval LLM. It's not that easy to work with the embedding model when you have a lot of text, but in the end it worked.
However, I would now like to use my knowledge bank in my web app through get_llm. I'm not even sure this is possible; the documentation doesn't seem to address this case.
Best regards,
Simon
Operating system used: Windows 11
Best Answer
-
Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,400
Hi Simon,
That's not really possible; sending the vector store directly so that the LLM performs the lookup itself is not supported.
LLMs typically only accept text as input.
You need to query the vector store to build the necessary context locally and then send text to the LLM.
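That "query the store locally, then send plain text" flow can be sketched in plain Python. Everything below is a toy stand-in, not a Dataiku API: the bag-of-words "embedding" replaces a real embedding model, and the chunks are hypothetical procedure text.

```python
import math
from collections import Counter

# Toy corpus standing in for the internal procedures (hypothetical content).
CHUNKS = [
    "Procedure A: to reset a password, open the account portal and click Reset.",
    "Procedure B: expense reports must be filed before the 5th of each month.",
    "Procedure C: VPN access requests go through the IT service desk form.",
]

def embed(text):
    """Toy bag-of-words 'embedding'; a real setup uses an embedding model."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question, k=2):
    """Step 1: vector search -- rank chunks by similarity to the question."""
    q = embed(question)
    ranked = sorted(CHUNKS, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

def build_prompt(question):
    """Step 2: inject the retrieved context into the text sent to the LLM."""
    context = "\n".join(retrieve(question))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

print(build_prompt("How do I reset my password?"))
```

Only the string returned by `build_prompt` ever reaches the LLM; the vector store itself stays on your side.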
If you are not happy with the overall speed, you can use Trace Explorer to understand where the time is spent.
Using a lighter, faster model will usually improve speed, as you typically do not need reasoning models for such use cases.
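Retrieval is also what fixes the 20-second responses mentioned above: the retrieval step is cheap and local, while generation time grows with prompt length, so sending only the top-K chunks instead of every procedure shrinks the slow part. A toy comparison (hypothetical corpus sizes, no Dataiku APIs):

```python
# Hypothetical corpus: 200 markdown procedures of roughly 1,000 characters each.
procedures = [f"Procedure {i}: " + "x" * 1000 for i in range(200)]
question = "How do I request VPN access?"

# Old approach: ship every procedure with the prompt.
full_prompt = "\n".join(procedures) + "\n\nQuestion: " + question

# RAG approach: ship only the top-3 chunks a vector search would return.
top_k = procedures[:3]  # stand-in for real similarity-search results
rag_prompt = "\n".join(top_k) + "\n\nQuestion: " + question

print(f"full prompt: {len(full_prompt):,} chars")
print(f"RAG prompt:  {len(rag_prompt):,} chars")
```

The LLM does the same kind of work either way; it simply reads far less input per request.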
Answers
-
Alexandru (Dataiker)
Hi,
You should be able to use the Knowledge Bank from the LLM Mesh following the example here:
https://developer.dataiku.com/latest/tutorials/genai/nlp/llm-mesh-rag/index.html#running-an-enriched-llm-query
Thanks
-
Hello @Alexandru
Thanks for your answer. However, from what I understand, this code works in two steps:
1/ query the vector store and find the context (a string)
2/ prompt the LLM with the question and the string found in step 1
What I would like is to send the Chroma DB directly, without retrieving data in an intermediate step. My understanding is that sending the Chroma DB directly would improve response speed, since the LLM would have less work to do. Am I wrong?
Best regards,
Simon
-
Alexandru (Dataiker)
Even if the LLM supports lookups in a managed vector database, it’s still fundamentally a two-step process.
The service first performs vector search on the store, then injects the top-K retrieved chunks into the prompt before the LLM generates an answer. The two steps are just abstracted.
-
OK, thanks @Alexandru, that's helpful (sad but helpful).