Context window

The context window is the maximum length of input a large language model (LLM) can consider at once. Expanding the context window has been a major goal in the development and maturation of LLM technology. [1] [2] The length of a context window is measured in tokens. As of 2025, the Gemini family of LLMs had the largest context window, at two million tokens. [3]
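Because the limit is counted in tokens rather than words or characters, the same text can consume different fractions of the context window depending on the tokenizer. A minimal sketch, assuming the Hugging Face transformers library and the GPT-2 tokenizer (chosen purely for illustration), shows how the token count of a string is obtained and compared with a model's nominal maximum input length:

```python
# pip install transformers
from transformers import AutoTokenizer

# GPT-2's tokenizer is used here only as an example; other models tokenize differently.
tokenizer = AutoTokenizer.from_pretrained("gpt2")

text = "The context window is measured in tokens, not characters or words."
token_ids = tokenizer(text)["input_ids"]

print(len(text.split()), "words")                 # word count of the input
print(len(token_ids), "tokens")                   # token count seen by the model
print("nominal maximum input length:", tokenizer.model_max_length)  # 1024 for GPT-2
```

Inputs whose token count exceeds the model's maximum must be truncated, split, or otherwise compressed before the model can process them.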

In some models, the context length is limited by the length of the input sequences used during training. [4] However, the attention mechanism can be adapted to allow LLMs to process sequences much longer than those seen at training time. [5]
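One such approach, described in reference [5], replaces positional embeddings with a penalty on each attention score that grows linearly with the distance between the query and the key, which allows extrapolation to longer inputs. The following is a minimal single-head sketch of that idea in NumPy; the slope value and toy dimensions are illustrative assumptions, and real models use a different slope per attention head.

```python
import numpy as np

def linear_bias_attention(q, k, v, slope=0.25):
    """Single-head causal attention with a linear distance bias (ALiBi-style).

    A penalty proportional to how far a key lies behind the query is
    subtracted from each attention score, instead of adding positional
    embeddings to the inputs.
    """
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)                # raw (seq, seq) attention scores
    pos = np.arange(q.shape[0])
    distance = pos[:, None] - pos[None, :]       # how far each key is behind each query
    scores = scores - slope * np.maximum(distance, 0)  # linear penalty grows with distance
    scores = np.where(distance < 0, -np.inf, scores)   # causal mask: no attention to the future
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

# Toy usage: a sequence of 6 tokens with an 8-dimensional head.
rng = np.random.default_rng(0)
x = rng.normal(size=(6, 8))
print(linear_bias_attention(x, x, x).shape)  # (6, 8)
```

Because the bias depends only on relative distance, nothing in the computation is tied to a fixed maximum sequence length, which is what permits evaluation on inputs longer than those used for training.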

References

  1. Ratner, Nir; Levine, Yoav; Belinkov, Yonatan; Ram, Ori; Magar, Inbal; Abend, Omri; Karpas, Ehud; Shashua, Amnon; Leyton-Brown, Kevin; Shoham, Yoav (2023). "Parallel Context Windows for Large Language Models". Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers): 6383–6402. doi:10.18653/v1/2023.acl-long.352.
  2. Dong, Zican; Li, Junyi; Men, Xin; Zhao, Wayne Xin; Wang, Bingning; Tian, Zhen; Chen, Weipeng; Wen, Ji-Rong (10 December 2024). "Exploring context window of large language models via decomposed positional vectors". Proceedings of the 38th International Conference on Neural Information Processing Systems. 37. Curran Associates Inc.: 10320–10347.
  3. Yeung, Ken (2024-05-14). "Google announces Gemini 1.5 Flash, a rapid multimodal model with a 1M context window". VentureBeat. Retrieved 2025-08-26.
  4. Wu, S. (2023). "BloombergGPT: A Large Language Model for Finance". arXiv:2303.17564 [LG].
  5. Press, Ofir (2021). "Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation". arXiv:2108.12409 [LG].