Key Takeaways
- GPT-3 pre-training compute: 3.14 × 10^23 FLOP.
- PaLM 540B pre-training compute: approximately 2.5 × 10^24 FLOP.
- LLaMA 65B pre-training compute: 1.2 × 10^24 FLOP.
- GPT-3 dataset size: approximately 300 billion tokens.
- PaLM 540B dataset size: 780 billion tokens.
- LLaMA 65B dataset size: 1.4 trillion tokens.
- GPT-3 training energy: 1,287 MWh.
- PaLM 540B training energy: ~10,000 MWh estimate.
- LLaMA 65B training energy: 784 MWh.
- GPT-3 parameter count: 175 billion.
- PaLM parameter count: 540 billion.
- LLaMA parameter count: 65 billion.
- GPT-3 training cost estimate: $4.6 million.
- PaLM 540B training cost: approximately $8 million.
- LLaMA 65B training cost: estimated at several million dollars at public cloud GPU rates (~1 million A100 GPU-hours).
Across these models, training compute and energy surged while dataset scale climbed into the trillions of tokens; the compute figures can be sanity-checked against the standard C ≈ 6ND approximation, as the sketch below shows.
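The following minimal Python sketch cross-checks the reported compute against the widely used C ≈ 6ND rule of thumb (roughly six FLOPs per parameter per training token). Every input is a figure quoted in the Key Takeaways above; the 6ND rule itself is the only outside assumption.

```python
# Cross-check reported pre-training compute against C ~= 6 * N * D,
# i.e. about 6 FLOPs per parameter per training token.
# All inputs are the figures quoted in Key Takeaways.
models = {
    # name:        (parameters N, training tokens D, reported FLOP)
    "GPT-3 175B": (175e9, 300e9, 3.14e23),
    "PaLM 540B":  (540e9, 780e9, 2.5e24),
    "LLaMA 65B":  (65e9, 1.4e12, 1.2e24),
}

for name, (n, d, reported) in models.items():
    estimate = 6 * n * d  # FLOP
    print(f"{name}: 6ND = {estimate:.2e} FLOP, "
          f"reported = {reported:.2e} (ratio {reported / estimate:.2f})")
```

GPT-3 and PaLM land within a few percent of the approximation; the LLaMA figure comes out roughly twice the 6ND estimate, which may reflect a hardware-FLOP rather than model-FLOP accounting in its source.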
Report Contents
- Compute Resources (data and interpretation)
- Dataset Sizes (data and interpretation)
- Energy Consumption (data and interpretation)
- Parameter Counts (data and interpretation)
- Training Costs (data and interpretation)
How We Rate Confidence
Every statistic is queried across four AI models (ChatGPT, Claude, Gemini, Perplexity). The confidence rating reflects how many of those models return a consistent figure for a given data point.
Single Source (AI consensus: 1 of 4 models)
Only one AI model returns this statistic from its training data. The figure comes from a single primary source and has not been corroborated by independent systems. Use with caution and cross-reference before citing.

Directional (AI consensus: 2–3 of 4 models broadly agree)
Multiple AI models cite this figure, or figures pointing in the same direction, with minor variance. The trend and magnitude are reliable, though the precise value may differ by source. Suitable for directional analysis.

Verified (AI consensus: 4 of 4 models fully agree)
All four AI models independently return the same statistic, unprompted. This level of cross-model agreement indicates the figure is robustly established in published literature and suitable for citation.
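To make the rule concrete, here is a minimal Python sketch of how a consensus count could be mapped to these labels. The function name, the 5% agreement tolerance, and the median-midpoint comparison are illustrative assumptions of ours, not Gitnux's actual pipeline.

```python
def consensus_label(figures, rel_tol=0.05):
    """Map four model-reported figures to a confidence label.

    A figure counts as "agreeing" when it falls within rel_tol of the
    midpoint of the two median answers. The tolerance is an assumption.
    """
    assert len(figures) == 4, "the methodology queries exactly four AI models"
    middle_two = sorted(figures)[1:3]
    center = sum(middle_two) / 2
    agreeing = sum(1 for f in figures if abs(f - center) <= rel_tol * abs(center))
    if agreeing == 4:
        return "Verified"      # 4 of 4 models fully agree
    if agreeing >= 2:
        return "Directional"   # 2-3 of 4 models broadly agree
    return "Single Source"     # only one model returns the figure

# Example: all four models put GPT-3 training energy near 1,287 MWh.
print(consensus_label([1287, 1287, 1290, 1300]))  # -> Verified
```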
Cite This Report
This report is designed to be cited. We maintain stable URLs and versioned verification dates. Copy the format appropriate for your publication below.
APA: Vasquez, E. (2026, February 24). AI training statistics. Gitnux. https://gitnux.org/ai-training-statistics
MLA: Vasquez, Elena. "AI Training Statistics." Gitnux, 24 Feb. 2026, gitnux.org/ai-training-statistics.
Chicago: Vasquez, Elena. 2026. "AI Training Statistics." Gitnux. https://gitnux.org/ai-training-statistics.
Sources & References
1. arXiv (arxiv.org)
2. Hugging Face (huggingface.co)
3. OpenAI (openai.com)
4. xAI (x.ai)
5. Inflection AI (inflection.ai)
6. Mistral AI (mistral.ai)
7. Databricks (databricks.com)
8. Qwen (qwenlm.github.io)
9. SemiAnalysis (semianalysis.com)
10. LifeArchitect.ai (lifearchitect.ai)
11. BigScience (bigscience.huggingface.co)
12. Epoch AI (epoch.ai)