Confused about how to use RAG (Retrieval Augmented Generation)
I'm playing with the new LLM recipes and getting a bit confused with the RAG functionality.
I can use an Embed recipe to create an Embedding dataset / Vector Store.
Then I can set up an LLM to query the resulting object in its settings.
But how do I go from there? How can I send a question / query to the Embedding object? Clicking on it only offers a Python recipe, and there's nothing like a visual webapp either.
Operating system used: AWS Linux
Answers
-
Hi,
Once you have defined an augmented LLM in the settings of the Knowledge Bank (the output of the Embedding recipe), you can directly use this augmented LLM in Prompt Studios and Prompt Recipes.
In other words, you can use the retrieval-augmented LLM just like you would use a "bare" LLM. Under the hood, Dataiku queries the vector store to retrieve relevant documents, passes them along with your prompt to the underlying LLM, and returns the context-aware response.
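For example, querying the augmented LLM from Python looks roughly like this (a minimal sketch using the LLM Mesh completion API; "rag-llm-id" is a placeholder for your augmented LLM's id, which you can copy from the project's LLM listing):

```python
import dataiku

# NOTE: "rag-llm-id" is a placeholder; use the real id of your augmented LLM
# as shown in the project's LLM Mesh settings.
client = dataiku.api_client()
project = client.get_default_project()
llm = project.get_llm("rag-llm-id")

# Query the augmented LLM exactly like a "bare" one; retrieval happens server-side
completion = llm.new_completion()
completion.with_message("What does the onboarding guide say about VPN access?")
resp = completion.execute()

if resp.success:
    print(resp.text)  # context-aware answer built from the retrieved documents
```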
There are also APIs for programmatically interacting with the vector store. Documentation for these APIs will be available very soon. The whole LLM Mesh is still in Preview, so things are still evolving quickly.
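In the meantime, direct retrieval will likely go through a LangChain-style retriever. The sketch below is only an assumption of what that could look like: the `dataiku.KnowledgeBank` handle, `as_langchain_retriever()`, and the `"my_kb_id"` id are provisional names, not yet documented API.

```python
import dataiku

# NOTE: "my_kb_id" is a placeholder for the Knowledge Bank id (the output of
# the Embedding recipe); the handle and method names below are assumptions.
kb = dataiku.KnowledgeBank(id="my_kb_id")
retriever = kb.as_langchain_retriever()

# Fetch the documents most similar to a free-text question
docs = retriever.get_relevant_documents("How do I configure SSO?")
for doc in docs:
    print(doc.page_content)
```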
We are also working on adding a more conversation-oriented UI (which will initially take the form of a visual webapp), so stay tuned.
-
Antal
Yes, I can see how this would work.
At the moment I'm getting 500 errors (socket closed) when running a Prompt Studio query with an augmented LLM.
The same prompt works fine with a "regular" LLM connection (the same connection that the Embedding recipe augments).
Could this be related to the fix for embedding recipes using Azure OpenAI connections mentioned in the 12.3.1 release notes?