Key Takeaways
- GPT-4 achieves 86.4% accuracy on the MMLU benchmark
- Llama 2 70B scores 68.9% on MMLU
- Claude 2 scores 75.0% on MMLU
- ResNet-50 achieves 76.1% top-1 accuracy on ImageNet
- EfficientNet-B7 scores 84.3% top-1 on ImageNet
- ViT-Huge/14 reaches 88.55% top-1 on ImageNet (with large-scale pretraining)
- GPT-4V achieves 85.5% accuracy on RealWorldQA
- LLaVA-1.5 13B scores 78.5% on ScienceQA
- Kosmos-2 scores 68.8% on OK-VQA
- Claude 3.5 Sonnet reaches 84.9% on HumanEval
- GPT-4o scores 90.2% on HumanEval pass@1
- OpenAI o1 achieves 74.4% on AIME 2024 (pass@1)
- H100 SXM5 GPU delivers 1,979 TFLOPS FP16 Tensor Core performance (with sparsity)
- A100 80GB achieves 624 TFLOPS FP16 Tensor Core (with sparsity)
- Grok-1 (314B parameters) runs inference about 1.5x faster on xAI's custom serving stack
This post surveys AI benchmark results across model accuracy and inference-speed statistics.
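The HumanEval figures above are reported as pass@1. As background, a minimal sketch of the unbiased pass@k estimator commonly used for this metric (the `(n, c)` sample counts below are hypothetical, not from the post):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem: the probability that
    at least one of k samples drawn without replacement from n
    generations passes, given c of the n generations pass the tests."""
    if n - c < k:
        return 1.0  # every size-k draw must contain a passing sample
    return 1.0 - comb(n - c, k) / comb(n, k)

# Average over problems to get a benchmark-level score.
# Hypothetical (n, c) pairs: 10 generations each, with 9, 2, 0 passing.
problems = [(10, 9), (10, 2), (10, 0)]
score = sum(pass_at_k(n, c, 1) for n, c in problems) / len(problems)
```

With k=1 this reduces to the per-problem pass rate c/n, which is why pass@1 is often described simply as single-sample accuracy.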
Contents
- Computer Vision
- Efficiency and Inference
- Multimodal Models
- Natural Language Processing
- Reasoning and Mathematics
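The ImageNet and MMLU results above are top-1 accuracies: the fraction of examples whose highest-scoring predicted class matches the label. A minimal sketch of the computation (the logits and labels here are illustrative):

```python
def top1_accuracy(logits: list[list[float]], labels: list[int]) -> float:
    """Fraction of examples where the argmax of the logits equals the label."""
    correct = sum(
        max(range(len(row)), key=row.__getitem__) == y
        for row, y in zip(logits, labels)
    )
    return correct / len(labels)

# Three examples over four classes; two predictions match their labels.
logits = [[0.1, 2.0, 0.3, 0.0],
          [1.5, 0.2, 0.1, 0.4],
          [0.0, 0.1, 0.2, 3.0]]
labels = [1, 0, 2]
acc = top1_accuracy(logits, labels)  # 2/3
```

Top-5 accuracy, also common on ImageNet, instead counts an example as correct if the label appears anywhere in the five highest-scoring classes.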