
In a previous post about Building a PM Helper with AI, I showed how a fun AI project I'd built as my personal Product Management tool searched across multiple sources before synthesizing answers. Unfortunately, I made both a strategic and a tactical error in that 1.0 version. The solution? Retrieval Augmented Generation (RAG) backed by a vector database. In this post I'll offer some super-fast, high-level definitions as I walk through the problem space, and maybe in future posts go deeper into the value of RAG and vector databases.
tl;dr:
- If you're a product manager working with AI at any level, you'll likely need to understand Retrieval Augmented Generation (RAG) to some degree. What follows is a small, practical use case that shows the value in action.
- For the most part, when using LLMs for your own custom work, you're stuck with the foundation model as-is: you can't change what it already knows.
- Fine-tuning changes the model's weights to varying degrees, depending on how deep you want to go. Those weights are the numbers that control how a model transforms input into output, and there can be billions of them (see the toy sketch after this list). The deeper the impact you want, the higher the cost. (You're not likely fine-tuning for personal projects, though. And if you are, it will almost certainly be with open-source foundation models.)
- Retrieval Augmented Generation (RAG) doesn't change weights at all. RAG just passes more information into the prompt (which is a fancy name for an information query, unless you add fuller instructions), but it's limited by something called the context window: basically, how much info you can pass in at once. It's like saying, "Here, read this before answering." So in theory RAG reduces the chances of hallucinations and offers more "truthy" answers, assuming the data you feed it is good. (A minimal sketch of the pattern follows below.)
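
To make the "weights" idea concrete, here's a toy sketch in plain NumPy. Nothing here is LLM-specific, and the "fine-tuning" step is just a small random nudge for illustration: the point is that weights are the numbers that turn input into output, and fine-tuning changes those numbers.

```python
import numpy as np

# Toy illustration only, not a real LLM layer: "weights" are just numbers
# that determine how input gets transformed into output.
rng = np.random.default_rng(0)
weights = rng.normal(size=(4, 3))   # a real model has billions of these
x = np.array([1.0, 0.5, -0.2])      # some input representation

before = weights @ x                # output with the original weights
# "Fine-tuning" = nudging the weights; here it's a tiny random update.
weights += 0.01 * rng.normal(size=weights.shape)
after = weights @ x                 # same input, different output

print("before:", before)
print("after: ", after)
```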
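And here's a minimal sketch of the RAG pattern itself. Everything named here is hypothetical: `search_vector_db` and `ask_llm` stand in for whatever vector database and model call you'd actually use, and the character budget is a crude stand-in for a real tokenizer-based context window. The shape is what matters: retrieve, pack into the prompt until you hit the context limit, then ask.

```python
MAX_CONTEXT_CHARS = 8_000  # crude stand-in for the model's context window


def build_rag_prompt(question: str, retrieved_chunks: list[str]) -> str:
    """Pack retrieved text into the prompt until the context budget runs out."""
    context, used = [], 0
    for chunk in retrieved_chunks:
        if used + len(chunk) > MAX_CONTEXT_CHARS:
            break  # the context window is why RAG can't just pass in everything
        context.append(chunk)
        used += len(chunk)
    return (
        "Answer using ONLY the context below. If the answer isn't there, say so.\n\n"
        "Context:\n" + "\n---\n".join(context)
        + "\n\nQuestion: " + question
    )


if __name__ == "__main__":
    # Toy demo with hardcoded "retrieved" chunks instead of a real vector DB.
    # In the real flow you'd do something like:
    #   chunks = search_vector_db(question, top_k=5)          # hypothetical retrieval
    #   answer = ask_llm(build_rag_prompt(question, chunks))  # hypothetical model call
    chunks = [
        "Roadmap note: mobile onboarding redesign is the Q3 priority.",
        "Churn analysis: 40% of cancellations cite missing integrations.",
    ]
    print(build_rag_prompt("What should we prioritize next quarter?", chunks))
```

That "answer using ONLY the context below" instruction is the "here, read this before answering" move in practice: it's what nudges the model toward truthy answers grounded in your data instead of confident guesses.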