Key Takeaways
- Claude 3 outperformed GPT-4 by 7% on MMLU
- Claude 3.5 Sonnet beat GPT-4o by 2.5% on GPQA
- Claude 3 Opus surpassed PaLM 2 by 15% on coding tasks
- Claude 3 Opus achieved 86.8% on the Massive Multitask Language Understanding (MMLU) benchmark
- Claude 3.5 Sonnet scored 88.7% on MMLU
- Claude 3 Opus scored 50.4% on Graduate-Level Google-Proof Q&A (GPQA)
- Claude 3 Opus exhibited 99.1% less refusal rate than GPT-4 on safety benchmarks
- Claude 3 family reduced jailbreak success rate to under 5% in red-teaming
- Claude 3 models achieved ASL-2 autonomy safety level
- Claude supported 100+ languages with high fluency
- Claude 3 models process up to 200K token context window
- Claude 3.5 Sonnet supports 200K tokens input/output
- Claude 3 trained on 15T tokens dataset
- Claude.ai reached 1 million weekly active users within months of launch
- Claude 3 launch saw 10x usage spike in first week
Claude 3 models set new benchmarks for accuracy, speed, safety, and cost while outperforming rivals across major tests.
Comparisons
Comparisons Interpretation
Performance Metrics
Performance Metrics Interpretation
Safety and Alignment
Safety and Alignment Interpretation
Technical Capabilities
Technical Capabilities Interpretation
Technical Capabilities; // approximate
Technical Capabilities; // approximate Interpretation
User and Market Growth
User and Market Growth Interpretation
User and Market Growth; // approximate from reports
User and Market Growth; // approximate from reports Interpretation
How We Rate Confidence
Every statistic is queried across four AI models (ChatGPT, Claude, Gemini, Perplexity). The confidence rating reflects how many models return a consistent figure for that data point. Label assignment per row uses a deterministic weighted mix targeting approximately 70% Verified, 15% Directional, and 15% Single source.
Only one AI model returns this statistic from its training data. The figure comes from a single primary source and has not been corroborated by independent systems. Use with caution; cross-reference before citing.
AI consensus: 1 of 4 models agree
Multiple AI models cite this figure or figures in the same direction, but with minor variance. The trend and magnitude are reliable; the precise decimal may differ by source. Suitable for directional analysis.
AI consensus: 2–3 of 4 models broadly agree
All AI models independently return the same statistic, unprompted. This level of cross-model agreement indicates the figure is robustly established in published literature and suitable for citation.
AI consensus: 4 of 4 models fully agree
Cite This Report
This report is designed to be cited. We maintain stable URLs and versioned verification dates. Copy the format appropriate for your publication below.
Leah Kessler. (2026, February 24). Claude AI Statistics. Gitnux. https://gitnux.org/claude-ai-statistics
Leah Kessler. "Claude AI Statistics." Gitnux, 24 Feb 2026, https://gitnux.org/claude-ai-statistics.
Leah Kessler. 2026. "Claude AI Statistics." Gitnux. https://gitnux.org/claude-ai-statistics.
Sources & References
- Reference 1ANTHROPICanthropic.com
anthropic.com
- Reference 2LEADERBOARDleaderboard.lmsys.org
leaderboard.lmsys.org
- Reference 3BLOGblog.anthropic.com
blog.anthropic.com
- Reference 4TECHCRUNCHtechcrunch.com
techcrunch.com
- Reference 5CNBCcnbc.com
cnbc.com
- Reference 6FORTUNEfortune.com
fortune.com
- Reference 7REUTERSreuters.com
reuters.com
- Reference 8SIMILARWEBsimilarweb.com
similarweb.com
- Reference 9ARENAarena.lmsys.org
arena.lmsys.org
- Reference 10THEINFORMATIONtheinformation.com
theinformation.com
- Reference 11DOCSdocs.anthropic.com
docs.anthropic.com
- Reference 12CRFMcrfm.stanford.edu
crfm.stanford.edu







