GLOSSARY

Retrieval-Augmented Generation

DEFINITION

A technique that enhances LLM outputs by retrieving relevant information from an external knowledge base before generating a response, combining generative power with accurate, up-to-date data.

RAG solves one of the core limitations of LLMs: their knowledge is frozen at training time. By connecting a model to a dynamic knowledge base, you get responses that are both fluent and factually grounded.

RAG pipelines have two phases: (1) Retrieval — convert the query to an embedding, find the most similar chunks from your knowledge base using vector similarity search; (2) Generation — pass the query plus retrieved context to the LLM, which generates a grounded response.
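
The two phases above can be sketched with a toy, self-contained example. The knowledge base, its embeddings, and the query embedding below are made-up stand-ins; a real pipeline would call an embedding model and an LLM API instead.

```python
import math

# Toy knowledge base: (chunk_text, embedding) pairs. In a real pipeline the
# embeddings would come from an embedding model; these vectors are invented.
KNOWLEDGE_BASE = [
    ("RAG retrieves context before generation.", [0.9, 0.1, 0.0]),
    ("LLM knowledge is frozen at training time.", [0.1, 0.9, 0.1]),
    ("Vector search finds semantically similar chunks.", [0.2, 0.2, 0.9]),
]

def cosine_similarity(a, b):
    """Standard cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def retrieve(query_embedding, k=2):
    """Phase 1: rank chunks by vector similarity and return the top k."""
    ranked = sorted(
        KNOWLEDGE_BASE,
        key=lambda item: cosine_similarity(query_embedding, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]

def build_prompt(query, context_chunks):
    """Phase 2: combine retrieved context with the query for the LLM."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# Pretend this is embed("What is RAG?") from a real embedding model.
query_embedding = [0.85, 0.15, 0.05]
chunks = retrieve(query_embedding)
prompt = build_prompt("What is RAG?", chunks)
```

The final `prompt` string is what gets sent to the LLM, so the response is grounded in the retrieved chunks rather than the model's frozen training data.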

The knowledge base is stored as vector embeddings in a vector database such as Supabase (via the pgvector extension), Pinecone, or Weaviate. When a query arrives, it is embedded and compared against the stored embeddings to find the most semantically similar chunks.
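
The ingestion side can be sketched in the same toy style: split documents into chunks, embed each chunk, and store the result. Here a plain Python list stands in for the vector database, and `embed()` is a crude stand-in assumption for a real embedding model.

```python
def chunk_text(text, max_words=8):
    """Split a document into fixed-size word chunks (a simple strategy;
    production pipelines often use overlapping or semantic chunking)."""
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]

def embed(chunk):
    """Stand-in embedding using crude text features; a real pipeline
    would call an embedding model here."""
    return [float(len(chunk.split())),
            float(len(chunk)),
            chunk.count("e") / max(len(chunk), 1)]

vector_store = []  # simulates the vector database table/index

def ingest(document):
    """Chunk, embed, and store a document for later similarity search."""
    for chunk in chunk_text(document):
        vector_store.append({"text": chunk, "embedding": embed(chunk)})

ingest("RAG connects a language model to an external knowledge base "
       "so answers stay grounded in current data.")
```

With a real vector database, the `vector_store.append(...)` line becomes an insert into an embedding column or an upsert call, and retrieval runs as an indexed similarity query instead of a full scan.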

Tools That Use Retrieval-Augmented Generation

Supabase — 9.1/10
Open-source Firebase alternative with pgvector for AI apps
Pricing: Free / $25/mo Pro

Pinecone — 8.7/10
Managed vector database built for production AI search
Pricing: Free / Serverless from $0.04/1M reads

n8n — 9.2/10
Open-source workflow automation built for AI pipelines
Pricing: Free (self-hosted)

Learn More About Retrieval-Augmented Generation