The maximum amount of text (in tokens) that a language model can process in one prompt-response cycle. Both input (prompt) and output (completion) count against the limit.
The context window is the working memory of an LLM. Early models had 4K-token windows; modern models range from 16K tokens (GPT-3.5-turbo) to 2M tokens (Gemini 1.5 Pro), with Claude supporting 200K tokens.
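Because the prompt and the completion share one budget, it helps to check how much of the window a prompt consumes before sending it. The sketch below does this with the tiktoken tokenizer; the 16K window size and the 1,000-token output reserve are illustrative assumptions, not limits of any particular model.

```python
# Minimal sketch: budgeting prompt vs. completion tokens in one shared window.
import tiktoken

CONTEXT_WINDOW = 16_000      # assumed model limit (tokens)
MAX_OUTPUT_TOKENS = 1_000    # tokens reserved for the completion

enc = tiktoken.encoding_for_model("gpt-3.5-turbo")

prompt = "Summarize the following report: ..."
prompt_tokens = len(enc.encode(prompt))

# Input and output count against the same limit, so the headroom
# is whatever the prompt and the reserved output leave behind.
remaining = CONTEXT_WINDOW - prompt_tokens - MAX_OUTPUT_TOKENS
if remaining < 0:
    print(f"Prompt is {-remaining} tokens over budget; shorten or summarize it.")
else:
    print(f"{prompt_tokens} prompt tokens; {remaining} tokens of headroom.")
```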
Context window size matters most for: analyzing entire codebases, processing long documents, extended multi-turn conversations, and complex multi-step reasoning chains where earlier context informs later decisions.
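For extended multi-turn conversations, a common pattern is to trim the oldest turns once the history no longer fits. This is a minimal sketch of that idea, assuming a simple role/content message format and reusing tiktoken for token counts; the window size and output reserve are again placeholders.

```python
# Minimal sketch: drop the oldest conversation turns so the history
# plus a reserved output budget fits inside the context window.
import tiktoken

enc = tiktoken.encoding_for_model("gpt-3.5-turbo")

def trim_history(messages, context_window=16_000, reserve_for_output=1_000):
    """Keep the most recent messages that fit within the token budget."""
    budget = context_window - reserve_for_output
    kept, used = [], 0
    # Walk backwards so the newest turns are preserved first.
    for msg in reversed(messages):
        tokens = len(enc.encode(msg["content"]))
        if used + tokens > budget:
            break
        kept.append(msg)
        used += tokens
    return list(reversed(kept))

history = [
    {"role": "user", "content": "First question..."},
    {"role": "assistant", "content": "First answer..."},
    {"role": "user", "content": "Follow-up question..."},
]
print(trim_history(history))
```

Smarter strategies summarize or embed older turns instead of discarding them outright, but the budget arithmetic stays the same.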