Key Takeaways
- GPT-4o achieved an Elo rating of 1312 on the LM Arena leaderboard as of October 2024
- Claude 3.5 Sonnet holds the top position with a Quality Index of 87/100 on lmarena.ai
- Llama 3.1 405B scored 84 in Quality Index, trailing Claude by 3 points
- Claude 3 Opus has an 85% win rate against GPT-4 in pairwise battles
- GPT-4o wins 62% of battles vs Llama 3.1 405B
- Llama 3.1 405B beats Claude 3 Opus in 55% of matchups
- GPT-4o scores 88.7% on the MMLU benchmark, per the lmarena evaluation
- Claude 3.5 Sonnet scores 87.2% on MMLU
- Llama 3.1 405B scores 86.5% on MMLU (5-shot)
- LM Arena has collected over 2.5 million user votes since launch
- Average daily battles on lmarena.ai exceed 50,000 as of Q3 2024
- 1.2 million unique users have participated in LM Arena voting
- Claude 3.5 Sonnet has a context window of 200K tokens
- GPT-4o supports a 128K-token input context
- Llama 3.1 405B has a 128K-token context length
LM Arena statistics cover model rankings, head-to-head battles, benchmark scores, user data, and context lengths.
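To relate the Elo rating and the pairwise win rates quoted above, here is a minimal sketch that uses the conventional Elo expected-score formula (base 10, scale 400). Those constants and the helper names `expected_win_rate` and `implied_elo_gap` are assumptions made for illustration; the leaderboard's own rating fit may differ, so treat the results as rough intuition rather than official figures.

```python
import math


def expected_win_rate(rating_a: float, rating_b: float) -> float:
    """Conventional Elo expected score for A against B (a tie counts as half a win)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def implied_elo_gap(win_rate: float) -> float:
    """Rating gap (A minus B) that the Elo formula associates with a given win rate."""
    if not 0.0 < win_rate < 1.0:
        raise ValueError("win_rate must be strictly between 0 and 1")
    return 400.0 * math.log10(win_rate / (1.0 - win_rate))


# Pairwise win rates quoted in the takeaways (values copied from the list).
quoted_win_rates = {
    "Claude 3 Opus vs GPT-4": 0.85,
    "GPT-4o vs Llama 3.1 405B": 0.62,
    "Llama 3.1 405B vs Claude 3 Opus": 0.55,
}

for matchup, rate in quoted_win_rates.items():
    gap = implied_elo_gap(rate)
    print(f"{matchup}: {rate:.0%} win rate ~ {gap:+.0f} Elo points")
```

Under these assumptions, a 62% win rate corresponds to a gap of roughly 85 rating points, and two models with equal ratings are expected to split their battles evenly.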
Benchmark Scores
Benchmark Scores Interpretation
Model Rankings
Model Rankings Interpretation
Technical Specs
Technical Specs Interpretation
User Interactions
User Interactions Interpretation
Win Rates
Win Rates Interpretation
Sources & References
- Reference 1: lmarena.ai
- Reference 2: leaderboard.lmsys.org
- Reference 3: arena.lmsys.org
- Reference 4: huggingface.co
- Reference 5: chat.lmsys.org
- Reference 6: blog.lmarena.ai
- Reference 7: blog.lmsys.org
- Reference 8: platform.lmsys.org
- Reference 9: status.lmsys.org
- Reference 10: discord.lmsys.org
- Reference 11: platform.openai.com
- Reference 12: ai.meta.com
- Reference 13: deepmind.google
- Reference 14: mistral.ai
- Reference 15: qwenlm.github.io
- Reference 16: cohere.com
- Reference 17: deepseek.com
- Reference 18: openai.com
- Reference 19: azure.microsoft.com
- Reference 20: developer.nvidia.com
- Reference 21: x.ai
- Reference 22: platform.01.ai
- Reference 23: ai.google






