The context window is the maximum length of input that a large language model (LLM) can consider at once. Expanding the context window has been a major goal in the development and maturation of LLM technology. [1] [2] The length of a context window is measured in tokens. As of 2025, the Gemini LLM had the largest context window, at two million tokens. [3]
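As an illustration of counting input length in tokens, the sketch below uses the open-source tiktoken tokenizer; the choice of tokenizer and encoding is an assumption for the example, since each model family ships its own tokenizer and its own limit, so counts for the same text vary by model.

```python
# Illustrative sketch: measuring input length in tokens.
# Assumptions: the "tiktoken" library and its "cl100k_base" encoding;
# the two-million-token limit is the figure cited above, used here
# only as an example constant.
import tiktoken

CONTEXT_WINDOW = 2_000_000  # example limit, in tokens

enc = tiktoken.get_encoding("cl100k_base")
text = "The context window is measured in tokens, not characters."
tokens = enc.encode(text)

print(f"{len(tokens)} tokens")
print("fits in context window:", len(tokens) <= CONTEXT_WINDOW)
```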
In some models, the context length is limited by the size of the inputs seen during training runs. [4] However, attention mechanisms can be adapted to let LLMs process sequences much longer than those observed at training time, as sketched below. [5]
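One published technique with this property is ALiBi (Attention with Linear Biases), which replaces fixed position embeddings with a distance-proportional penalty on the attention scores; because the penalty depends only on relative distance, the same computation applies unchanged to sequences longer than any seen in training. The following minimal single-head sketch illustrates the idea; the slope value and dimensions are illustrative, not taken from any particular model, and the source's cited mechanism [5] may differ.

```python
# Minimal single-head sketch of ALiBi-style attention: a linear,
# distance-proportional bias is added to the attention scores in
# place of learned position embeddings, so no maximum sequence
# length is baked into the mechanism.
import numpy as np

def alibi_attention(q, k, v, slope=0.0625):
    """q, k, v: (seq_len, d) arrays for a single attention head."""
    seq_len, d = q.shape
    scores = q @ k.T / np.sqrt(d)  # scaled dot-product scores
    # Linear bias: penalize each query-key pair by its distance,
    # encoding relative position without a fixed maximum length.
    pos = np.arange(seq_len)
    bias = -slope * np.abs(pos[:, None] - pos[None, :])
    # Causal mask: each position attends only to itself and the past.
    mask = np.triu(np.full((seq_len, seq_len), -np.inf), k=1)
    scores = scores + bias + mask
    # Numerically stable softmax over each row of scores.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

# Because the bias depends only on relative distance, this function
# runs unchanged on sequences longer than any training input.
q = k = v = np.random.default_rng(0).normal(size=(8, 16))
out = alibi_attention(q, k, v)
print(out.shape)  # (8, 16)
```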