A token corresponds to roughly 0.75 English words. GPT-4o supports a 128k-token context window; Claude 3.5 supports 200k. Larger context windows let agents process full codebases, long documents, and extended conversation histories without losing track of earlier content. In complex agentic workflows that must process multiple large documents simultaneously, context length is often the binding constraint.
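As a rough sanity check before sending documents to a model, the ~0.75 words-per-token rule can be turned into a quick estimate. The sketch below is a heuristic only (`estimate_tokens` and `fits_in_context` are hypothetical helpers, not a library API); exact counts require the model's actual tokenizer, such as OpenAI's tiktoken.

```python
def estimate_tokens(text: str) -> int:
    """Estimate token count from word count using the ~0.75 words/token rule."""
    words = len(text.split())
    return int(words / 0.75)

def fits_in_context(text: str, context_window: int, reserve: int = 1024) -> bool:
    """Check whether text fits in the window, reserving room for the response."""
    return estimate_tokens(text) + reserve <= context_window

# A ~100k-word document is ~133k estimated tokens:
doc = "word " * 100_000
print(fits_in_context(doc, 128_000))  # False: over GPT-4o's 128k window
print(fits_in_context(doc, 200_000))  # True: fits in Claude 3.5's 200k window
```

Because tokenization varies with language, punctuation, and code density, a real pipeline should replace the word-count heuristic with the target model's tokenizer before trusting the result near a window boundary.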