Type of site | Artificial intelligence |
---|---|
Founded | March 2025 |
Country of origin | United States |
Founder(s) | |
URL | lmarena |
Registration | None |
Launched | May 3, 2023 |
LMArena (formerly Chatbot Arena) is a public, web-based platform that evaluates large language models (LLMs) through anonymous, crowd-sourced pairwise comparisons. A user enters a prompt, two anonymous models respond, and the user votes for the better response, after which the models' identities are revealed. Users can also choose specific models to test themselves. [1] [2]
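The pairwise voting described above can be illustrated with a minimal sketch that tallies votes into per-model win rates. This is an illustration only, not LMArena's actual ranking method: the function name `win_rates`, the model names, the vote format, and the tie option (counted as half a win for each side) are all assumptions made for the example.

```python
from collections import defaultdict


def win_rates(votes):
    """Aggregate pairwise votes into a win rate per model.

    `votes` is an iterable of (model_a, model_b, winner) tuples, where
    winner is "a", "b", or "tie". This format is hypothetical.
    """
    wins = defaultdict(float)
    games = defaultdict(int)
    for model_a, model_b, winner in votes:
        games[model_a] += 1
        games[model_b] += 1
        if winner == "a":
            wins[model_a] += 1.0
        elif winner == "b":
            wins[model_b] += 1.0
        else:
            # Assumption: a tie counts as half a win for each model.
            wins[model_a] += 0.5
            wins[model_b] += 0.5
    return {model: wins[model] / games[model] for model in games}


# Hypothetical votes between three placeholder models.
votes = [
    ("model-x", "model-y", "a"),
    ("model-y", "model-z", "tie"),
    ("model-x", "model-z", "b"),
    ("model-x", "model-y", "a"),
]
rates = win_rates(votes)
for model, rate in sorted(rates.items(), key=lambda item: -item[1]):
    print(f"{model}: {rate:.2f}")
```

A raw win rate ignores the strength of the opponents each model happened to face, which is why pairwise leaderboards generally fit a statistical rating model to the votes rather than simply counting wins.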
LMArena is popular within the artificial intelligence industry: major companies submit their large language models, such as GPT-4o, o1, Gemini, [3] and Claude, [4] and use the resulting rankings to promote them. The website has also been used for preview releases of upcoming models.
Notably, the Chinese company DeepSeek tested its prototype models on LMArena months before its R1 model gained attention in Western media. [5] Other notable pre-release models include variants of GPT-5, which OpenAI tested under the codename "summit", and Gemini-2.5-Flash-Image, which Google DeepMind tested under the codename "nano-banana". [6] [7]
However, academic analyses of LMArena's evaluation methodology have identified specific limitations and suggested areas for improvement. The platform has since implemented methodological changes, announced through its policy updates, in coordination with ongoing research. [8] [9]