How to "send" a knowledge bank with a get_llm?

simon_aubert
simon_aubert Dataiku DSS Core Designer, Registered Posts: 21 ✭✭✭✭

Hello all,

Here is our context:
- A headless custom web app on Dataiku that receives a user question through an API and must answer based on internal procedures
- As of today, the procedures are sent in Markdown with the prompt, but it's really heavy and the response time is more than 20 seconds
- I have played with an embedded database (Chroma) and a retrieval LLM. It's not that easy to work with the embedding model when you have a lot of text, but in the end it worked.


However, now I would like to use my knowledge bank in my web app with get_llm. I'm not even sure it's possible; the documentation doesn't seem to address this case.

Best regards,

Simon


Operating system used: Windows 11

Best Answer

  • Alexandru
    Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,400 Dataiker
    Answer ✓

    Hi Simon,
    That's not really possible; sending the vector store directly so that the LLM performs the lookup itself is not supported.

    LLMs typically only accept text as input.
    You need to query the vector store to build the necessary context locally and then send text to the LLM.

    If you are not happy with the overall speed, you can use Trace Explorer to understand where the time is spent.

    Usually, using a lighter, faster model will improve speed, as you usually do not need reasoning models for such use cases.
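    The two-step flow described above can be sketched in a few lines. Everything here is a stand-in, not the Dataiku API: `TinyStore.search` uses keyword overlap in place of real embedding + vector search, and `call_llm` is a placeholder for the actual model call.

    ```python
    class TinyStore:
        """In-memory stand-in for a vector store such as Chroma."""

        def __init__(self, docs):
            self.docs = docs

        def search(self, query, k=2):
            # Real retrieval would embed the query and rank by vector
            # similarity; ranking by shared words is enough to show the flow.
            q_words = set(query.lower().split())
            scored = sorted(
                self.docs,
                key=lambda d: len(q_words & set(d.lower().split())),
                reverse=True,
            )
            return scored[:k]

    def call_llm(prompt):
        # Placeholder: a real app would send `prompt` to the LLM here.
        return f"<answer based on {len(prompt)} chars of prompt>"

    store = TinyStore([
        "Procedure A: refunds are processed within 5 days.",
        "Procedure B: escalate security incidents immediately.",
    ])

    question = "How long do refunds take"

    # Step 1: query the store locally to build the context (plain text).
    context_chunks = store.search(question, k=1)

    # Step 2: send only text -- the question plus the retrieved chunks -- to the LLM.
    prompt = "Context:\n" + "\n".join(context_chunks) + "\n\nQuestion: " + question
    answer = call_llm(prompt)
    ```

    The point of the sketch is that the LLM never sees the store itself, only the text assembled in step 2.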



Answers

  • Alexandru
    Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,400 Dataiker

    Hi,
    You should be able to use the Knowledge Bank from the LLM Mesh following the example here


    https://developer.dataiku.com/latest/tutorials/genai/nlp/llm-mesh-rag/index.html#running-an-enriched-llm-query

    Thanks

  • simon_aubert
    simon_aubert Dataiku DSS Core Designer, Registered Posts: 21 ✭✭✭✭

    Hello @Alexandru
    Thanks for your answer. However, from what I understand, this code works in two steps:
    1/ query the vector store and find the context (a string)
    2/ prompt the LLM with the question and the string found in step 1

    What I would like is to send the Chroma DB directly, rather than retrieving data at an intermediate step. My understanding is that sending the Chroma DB directly would improve the speed of the answer, since the LLM has less work to do. Am I wrong?

    Best regards,

    Simon

  • Alexandru
    Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,400 Dataiker

    Even if the LLM supports lookups in a managed vector database, it’s still fundamentally a two-step process.
    The service first performs vector search on the store, then injects the top-K retrieved chunks into the prompt before the LLM generates an answer. The two steps are just abstracted.
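    A sketch of why a "managed" lookup is still two steps under the hood: `enriched_query` below looks like a single call, but internally it runs a vector search (cosine similarity over precomputed embeddings) and then injects the top-K chunks into the prompt. All names and the toy 3-dimensional embeddings are illustrative, not any real service API.

    ```python
    import math

    # Pretend these vectors came from an embedding model at indexing time.
    INDEX = [
        ([1.0, 0.0, 0.0], "Chunk about refunds."),
        ([0.0, 1.0, 0.0], "Chunk about security incidents."),
        ([0.9, 0.1, 0.0], "Chunk about refund deadlines."),
    ]

    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return dot / (na * nb)

    def enriched_query(question_vec, question_text, k=2):
        # Step 1 (hidden from the caller): vector search for the top-k chunks.
        ranked = sorted(INDEX, key=lambda e: cosine(e[0], question_vec), reverse=True)
        top_chunks = [text for _, text in ranked[:k]]
        # Step 2 (hidden from the caller): inject the chunks into the prompt.
        prompt = "Context:\n" + "\n".join(top_chunks) + "\n\nQuestion: " + question_text
        return prompt  # a real service would now send this text to the LLM

    prompt = enriched_query([1.0, 0.0, 0.0], "How long do refunds take?")
    ```

    Whether you write these steps yourself or a service abstracts them, the model still only ever receives text, so skipping retrieval cannot save the LLM any work.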

  • simon_aubert
    simon_aubert Dataiku DSS Core Designer, Registered Posts: 21 ✭✭✭✭

    Ok thanks @Alexandru that's helpful (sad but helpful).
