A technique that enhances LLM outputs by retrieving relevant information from an external knowledge base before generating a response, combining generative power with accurate, up-to-date data.
RAG solves one of the core limitations of LLMs: their knowledge is frozen at training time. By connecting a model to a dynamic knowledge base, you get responses that are both fluent and factually grounded.
RAG pipelines have two phases: (1) Retrieval — convert the query to an embedding, find the most similar chunks from your knowledge base using vector similarity search; (2) Generation — pass the query plus retrieved context to the LLM, which generates a grounded response.
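The two phases above can be sketched end-to-end. This is a minimal illustration, not a production pipeline: the bag-of-words `embed` function is a stand-in for a real embedding model, and the final LLM call is left as a placeholder prompt.

```python
import math

# Toy embedding: word counts over a tiny fixed vocabulary. A real system
# would call an embedding model here; this placeholder keeps the sketch
# self-contained.
VOCAB = ["rag", "retrieval", "llm", "vector", "prompt", "index"]

def embed(text: str) -> list[float]:
    words = text.lower().split()
    return [float(words.count(w)) for w in VOCAB]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Phase 1: Retrieval. Embed the query and rank stored chunks by
# vector similarity.
def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

# Phase 2: Generation. Pass query plus retrieved context to the LLM.
# Here we only build the prompt; a real pipeline would send it to a model.
def answer(query: str, chunks: list[str]) -> str:
    context = "\n".join(retrieve(query, chunks))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

chunks = [
    "rag combines retrieval with an llm",
    "a vector index stores embeddings",
    "unrelated note about something else",
]
print(answer("how does rag use an llm", chunks))
```

The key design point is the separation of concerns: retrieval narrows the knowledge base down to a few relevant chunks, and generation only ever sees that narrowed context inside the prompt.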
The knowledge base is stored as vector embeddings in a vector store such as Supabase pgvector, Pinecone, or Weaviate. When a query arrives, it is embedded and compared against the stored embeddings to find the most semantically similar chunks.
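The similarity comparison a vector store performs can be sketched in plain Python. The three-dimensional vectors and chunk names below are made up for illustration; real embeddings have hundreds or thousands of dimensions, and the store would index them for fast approximate search rather than scanning linearly.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Hypothetical stored chunks with illustrative 3-dimensional embeddings.
store = {
    "refund policy chunk":  [0.9, 0.1, 0.0],
    "shipping times chunk": [0.1, 0.9, 0.2],
    "privacy policy chunk": [0.0, 0.2, 0.9],
}

# Rank every stored embedding by similarity to the query embedding
# and return the k closest chunks.
def top_k(query_embedding: list[float], k: int = 1) -> list[str]:
    ranked = sorted(store.items(),
                    key=lambda kv: cosine_similarity(query_embedding, kv[1]),
                    reverse=True)
    return [chunk for chunk, _ in ranked[:k]]

print(top_k([0.8, 0.2, 0.1]))  # query vector closest to the refund chunk
```

Dedicated vector databases exist because this brute-force scan does not scale: they replace the linear pass with approximate nearest-neighbor indexes so lookups stay fast across millions of embeddings.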