Key Takeaways
- Llama 2 7B has 7 billion parameters
- Llama 3 8B features 8 billion parameters with 32 layers
- Llama 2 70B uses 80 layers and 8192 hidden size
- Llama 3 70B outperforms GPT-3.5 on MT-Bench by 10%
- Llama 2 70B beats PaLM 540B on 7/9 benchmarks
- Llama 3 8B surpasses Mistral 7B on MMLU by 5 points
- Llama 3 70B achieves 50 tokens/sec on single A100 GPU inference
- Llama 3 8B quantized to 4-bit runs at 100+ tokens/sec on consumer GPU
- Llama 2 70B requires about 140 GB of VRAM in FP16 for its weights alone (70 billion parameters × 2 bytes each)
- Llama 2 7B scores 45.3% on the MMLU benchmark
- Llama 2 13B scores 54.8% on MMLU
- Llama 2 70B reaches 68.9% on MMLU
- Llama 2 7B was trained on 2 trillion tokens of data
- Llama 3 models trained on over 15 trillion tokens
- Llama 3.1 405B required about 30.84 million GPU hours on H100s
The Llama 3 family combines long context windows and strong benchmark results with efficient attention and fast, quantized deployment.
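The FP16 and 4-bit memory figures in the takeaways follow from simple arithmetic: parameter count times bits per parameter. The sketch below is a rough illustration, not a sizing tool. The function name `weight_memory_gb` is hypothetical, the parameter counts are the nominal round figures (70 billion and 8 billion), and the result covers weights only, ignoring the KV cache, activations, and quantization overhead.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory needed for model weights alone, in decimal GB.

    Assumption: every parameter is stored at the same precision and
    nothing else (KV cache, activations, scales) is counted.
    """
    total_bytes = num_params * bits_per_param / 8
    return total_bytes / 1e9


# Nominal parameter counts; real checkpoints differ slightly.
llama2_70b_fp16 = weight_memory_gb(70e9, 16)  # FP16 = 16 bits per parameter
llama3_8b_int4 = weight_memory_gb(8e9, 4)     # 4-bit quantization

print(f"Llama 2 70B, FP16 weights: {llama2_70b_fp16:.0f} GB")
print(f"Llama 3 8B, 4-bit weights: {llama3_8b_int4:.0f} GB")
```

Under these assumptions the 70B model needs roughly 140 GB for weights in FP16, which is why it does not fit on a single 80 GB accelerator without quantization or sharding, while a 4-bit 8B model needs roughly 4 GB and fits on a consumer GPU.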
01 · Architecture and Parameters (24 stats)
02 · Comparisons with Other Models (23 stats)
03 · Inference and Deployment (21 stats)
04 · Performance on Benchmarks (25 stats)
05 · Training Data and Compute (21 stats)
06 · Usage and Adoption Metrics (22 stats)
Cite This Report
This report is designed to be cited. We maintain stable URLs and versioned verification dates. Copy the format appropriate for your publication below.
APA: Aisha Okonkwo. (2026, February 24). LLaMA Statistics. Gitnux. https://gitnux.org/llama-statistics
MLA: Aisha Okonkwo. "LLaMA Statistics." Gitnux, 24 Feb 2026, https://gitnux.org/llama-statistics.
Chicago: Aisha Okonkwo. 2026. "LLaMA Statistics." Gitnux. https://gitnux.org/llama-statistics.

