Context window

The maximum number of tokens a model can process in a single request, typically counting both the prompt and the generated output.

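Context windows commonly range from 128K tokens (many frontier models) to 1M tokens (Gemini 2.5 Pro). Larger windows reduce the need for chunking and can cut total token usage 30–50% on long-document workloads, because a chunked pipeline resends its instructions and some overlapping text with every request.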
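As a rough illustration of why chunking costs extra tokens, the sketch below counts the tokens sent and received when a document is split to fit a given window. The function name, its parameters, and the sample numbers are hypothetical, and the model is simplified: it assumes every request repeats the same instructions, reserves the same output budget, and re-sends a fixed overlap from the previous chunk.

```python
import math


def chunked_total_tokens(doc_tokens, window, instruction_tokens,
                         overlap_tokens, output_tokens):
    """Estimate total tokens used to process a document within `window`.

    Hypothetical cost model: each request carries the instructions, a slice
    of the document, and a fixed output budget. Every chunk after the first
    re-sends `overlap_tokens` of the previous chunk for continuity.
    """
    # Room left for document text in a single request.
    budget = window - instruction_tokens - output_tokens
    if budget <= overlap_tokens:
        raise ValueError("window too small for instructions, output, and overlap")

    # The whole document fits: one request, nothing repeated.
    if doc_tokens <= budget:
        return instruction_tokens + doc_tokens + output_tokens

    # The first chunk covers `budget` tokens; each later chunk advances by
    # `budget - overlap_tokens` new tokens.
    stride = budget - overlap_tokens
    n_chunks = 1 + math.ceil((doc_tokens - budget) / stride)

    # Document text sent overall: the original plus the re-sent overlaps.
    doc_sent = doc_tokens + (n_chunks - 1) * overlap_tokens
    return n_chunks * (instruction_tokens + output_tokens) + doc_sent


# Illustrative numbers only (assumed, not measured).
small = chunked_total_tokens(100_000, 8_000, 500, 500, 500)
large = chunked_total_tokens(100_000, 128_000, 500, 500, 500)
print(f"8K window: {small} tokens, 128K window: {large} tokens")
```

The size of the saving depends heavily on how long the repeated instructions and overlap are relative to each chunk, so real workloads may land well above or below this toy estimate.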

Related terms