Key Takeaways
- The ENIAC computer, completed in 1945, had a peak performance of approximately 0.0000001 gigaFLOPS (about 100 FLOPS)
- The Manchester Mark 1, operational in 1949, performed about 1.2 kiloFLOPS in floating-point operations
- The UNIVAC I, delivered in 1951, achieved around 0.000001 gigaFLOPS (1 kiloFLOPS) peak performance
- Frontier supercomputer holds the current TOP500 #1 at 1.194 exaFLOPS Rmax as of June 2023
- Aurora supercomputer ranks #2 at 1.012 exaFLOPS Rmax in June 2023 TOP500 list
- Eagle supercomputer at 561.2 petaFLOPS Rmax, #3 on June 2023 TOP500
- AMD Ryzen Threadripper PRO 5995WX scores 100 GFLOPS peak single CPU FP64
- Intel Core i9-13900K achieves roughly 1.7 TFLOPS FP32 peak with AVX2
- NVIDIA H100 SXM GPU delivers 67 TFLOPS FP64 Tensor Core performance
- Frontier supercomputer efficiency is 52.72 gigaFLOPS/W Green500 #1 June 2023
- Aurora at 49.03 gigaFLOPS/W #2 on Green500 June 2023
- Eagle achieves 46.18 gigaFLOPS/W efficiency #3 Green500 June 2023
- Moore's Law predicts a doubling of transistor count roughly every 2 years, historically translating to ~1.86x computing power per generation
- Exascale computing was achieved in 2022; zettascale (10^21 FLOPS) is targeted for around 2030
- Google's 53-qubit Sycamore processor demonstrated quantum supremacy, completing in 200 seconds a task estimated to take a classical supercomputer 10,000 years
Computing power has increased exponentially, from the first machines' hundreds of FLOPS to today's exaFLOPS supercomputers.
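That climb can be sketched with the endpoints quoted in this document, taking ENIAC at roughly 100 FLOPS (0.0000001 gigaFLOPS, 1945) and Frontier at 1.102 exaFLOPS (2022); the implied average doubling time is about a year and a half:

```python
import math

# Endpoints taken from this document (illustrative, not precise):
# ENIAC (1945) at ~100 FLOPS; Frontier (2022) at 1.102 exaFLOPS.
eniac_flops = 1e2
frontier_flops = 1.102e18
years = 2022 - 1945

doublings = math.log2(frontier_flops / eniac_flops)
doubling_time = years / doublings

print(f"{doublings:.1f} doublings over {years} years")
print(f"average doubling time: {doubling_time:.2f} years")
```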
CPU and GPU Performance
- AMD Ryzen Threadripper PRO 5995WX scores 100 GFLOPS peak single CPU FP64
- Intel Core i9-13900K achieves roughly 1.7 TFLOPS FP32 peak with AVX2
- NVIDIA H100 SXM GPU delivers 67 TFLOPS FP64 Tensor Core performance
- AMD Instinct MI300X GPU reaches 163.4 TFLOPS FP64 matrix peak
- Apple M2 Ultra chip peaks at 31.6 TFLOPS FP32 GPU performance
- Intel Xeon Platinum 8592+ offers 2.9 TFLOPS FP64 per socket peak
- NVIDIA A100 80GB GPU achieves 19.5 TFLOPS FP64 with Tensor Cores
- AMD EPYC 9754 (Bergamo) peaks at 3.2 TFLOPS FP64 dual socket
- Qualcomm Snapdragon 8 Gen 2 GPU at 3.2 TFLOPS FP32 peak for mobile
- IBM Power10 processor delivers 5 TFLOPS FP64 per chip
- NVIDIA RTX 4090 GPU reaches 82.6 TFLOPS FP32 peak shader performance
- AMD Radeon RX 7900 XTX at 61 TFLOPS FP32 peak performance
- Intel Arc A770 GPU delivers 17.2 TFLOPS FP16 peak
- ARM Neoverse V2 core in AWS Graviton3 peaks at 0.4 TFLOPS FP32 per core
- Google TPU v4 achieves 275 TFLOPS BF16 per chip, roughly 1.1 exaFLOPS per 4,096-chip pod
- Cerebras Wafer-Scale Engine 2 (WSE-2) delivers 20 petaFLOPS FP16 AI performance
- Graphcore IPU Colossus MK2 GC200 at 350 TOPS INT8 per chip
- SambaNova SN40L chip reaches 2 petaFLOPS FP16 per card
- Tenstorrent Grayskull at 114 TOPS INT8 peak performance
- SiPearl Rhea CPU for HPC peaks at 1.7 TFLOPS FP64 per socket
- Frontier's HPE Slingshot-11 interconnect provides 200 Gb/s network links between nodes
- NVIDIA DGX H100 system with 8 H100 GPUs reaches 32 petaFLOPS FP8 AI
- AMD MI250X dual-GPU die at 47.9 TFLOPS FP64 peak
- Intel Ponte Vecchio (Max 1550) GPU at 56 TFLOPS FP64 Tensor
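Peak figures like those above generally come from a simple product: cores × clock × FLOPs per cycle. A minimal sketch, using a hypothetical chip whose parameters are illustrative assumptions, not any vendor's spec:

```python
def peak_fp64_tflops(cores: int, clock_ghz: float, flops_per_cycle: int) -> float:
    """Theoretical peak FP64 throughput in TFLOPS: cores x GHz x FLOPs/cycle."""
    return cores * clock_ghz * flops_per_cycle / 1e3

# Hypothetical 64-core CPU with two 512-bit FMA units per core:
# 2 units x 8 FP64 lanes x 2 ops (fused multiply-add) = 32 FLOPs/cycle.
print(peak_fp64_tflops(cores=64, clock_ghz=2.0, flops_per_cycle=32))  # 4.096 TFLOPS
```

Real sustained throughput (e.g. HPL Rmax) falls well below this theoretical peak, which is why the TOP500 reports both Rmax and Rpeak.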
Current Supercomputers
- Frontier supercomputer holds the current TOP500 #1 at 1.194 exaFLOPS Rmax as of June 2023
- Aurora supercomputer ranks #2 at 1.012 exaFLOPS Rmax in June 2023 TOP500 list
- Eagle supercomputer at 561.2 petaFLOPS Rmax, #3 on June 2023 TOP500
- Fugaku at 442.0 petaFLOPS Rmax, #4 in June 2023
- LUMI at 381.0 petaFLOPS Rmax, #5 June 2023 TOP500
- Frontier's Rpeak is 1.707 exaFLOPS in June 2023
- El Capitan is projected to exceed 2 exaFLOPS; Leonardo currently ranks #6 at 233.3 petaFLOPS (June 2023)
- Alps supercomputer at 204.8 petaFLOPS #7 June 2023 TOP500
- MareNostrum 5 at 175.5 petaFLOPS #9 June 2023
- Frontier uses 37,888 AMD Instinct MI250X GPUs contributing to its exaFLOPS performance
- Aurora employs Intel Xeon Max CPUs and Data Center GPU Max for 1.012 exaFLOPS
- Summit at Oak Ridge, with 27,648 NVIDIA V100 GPUs, delivers 148.6 petaFLOPS Rmax, currently ranked #13
- Perlmutter at NERSC delivers 64.6 petaFLOPS Rmax with AMD GPUs, rank #20 June 2023
- Frontier consumes 20.99 MW power for 1.194 exaFLOPS, efficiency 56.9 gigaFLOPS/W
- Japan's ABCI-Q at 95.2 petaFLOPS Rmax for quantum simulation, rank #26 June 2023
- China's OceanLite reportedly delivers 1.3 exaFLOPS of AI performance, though its HPL result is 125.4 petaFLOPS (#27)
- Microsoft Azure's Eagle achieved 561.2 petaFLOPS HPL (Rmax), ranking #3
- Nvidia-powered Isambard-AI at 132.0 petaFLOPS #18 June 2023 TOP500
- HPC6 at 110.4 petaFLOPS Rmax #24 June 2023
- AMD EPYC 7763 CPUs in Frontier's nodes contribute to its overall compute performance
- HPE Cray EX architecture in Frontier enables 9.2 million cores
- Japan's Fugaku with A64FX processors at 442 petaFLOPS sustained
- European LUMI uses AMD MI250X GPUs for 381 petaFLOPS
- Selene at 63.5 petaFLOPS Rmax with NVIDIA A100 GPUs #33 June 2023
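The Frontier efficiency figure quoted above can be cross-checked directly from the Rmax and power numbers in this section:

```python
# Frontier, per the bullets above: 1.194 exaFLOPS Rmax at 20.99 MW.
rmax_gflops = 1.194e18 / 1e9  # exaFLOPS -> gigaFLOPS
power_watts = 20.99e6         # 20.99 MW

efficiency = rmax_gflops / power_watts
print(f"{efficiency:.1f} GFLOPS/W")  # ~56.9, matching the figure above
```

This TOP500-derived value differs from the Green500 figure quoted elsewhere in this document; the Green500 uses its own power-measurement methodology.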
Energy Efficiency and Power Consumption
- Frontier supercomputer efficiency is 52.72 gigaFLOPS/W Green500 #1 June 2023
- Aurora at 49.03 gigaFLOPS/W #2 on Green500 June 2023
- Eagle achieves 46.18 gigaFLOPS/W efficiency #3 Green500 June 2023
- Alps at 40.60 gigaFLOPS/W, #4 Green500 June 2023
- LUMI supercomputer at 38.99 gigaFLOPS/W #5 Green500 June 2023
- NVIDIA H100 GPU efficiency up to 1.98 TFLOPS/W FP64 Tensor
- AMD MI300X claims roughly 1.7 TFLOPS/W FP16 efficiency (about 1,307 TFLOPS peak at 750 W)
- Google TPU v5e claims ~2.5x better efficiency than v4, with 394 TOPS INT8 peak per chip
- Cerebras CS-3 wafer-scale system claims 125 petaFLOPS FP16 peak with sparsity
- Graphcore Bow IPU delivers 350 TFLOPS of FP16 AI compute per processor
- Frontier total power draw 21 MW for 1.194 exaFLOPS
- Fugaku consumes 29.9 MW at 442 petaFLOPS, 14.8 gigaFLOPS/W
- Summit power 10.1 MW for 148.6 petaFLOPS, 14.7 gigaFLOPS/W
- Sunway TaihuLight used 15.37 MW for 93 petaFLOPS, 6.05 gigaFLOPS/W historical
- NVIDIA A100 SXM4 400W TDP for 19.5 TFLOPS FP64, ~48.75 GFLOPS/W
- AMD EPYC 9754 400W TDP dual socket ~8 GFLOPS/W FP64
- Intel Xeon 8592+ 350W TDP ~8.3 GFLOPS/W FP64 per socket
- Apple M1 Max 60W for 10.4 TFLOPS FP32, 173 GFLOPS/W GPU
- Qualcomm Snapdragon 8 Gen 2, built on a 4 nm process, reaches ~640 GFLOPS/W in its mobile GPU
- IBM Power10 at 20.6 gigaFLOPS/W in TOP500 systems
- SiPearl's next-generation Rhea targets 50+ GFLOPS/W FP64
- El Capitan projected 2 exaFLOPS at under 30 MW, ~66 gigaFLOPS/W target
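Several of the per-chip figures above are simply peak TFLOPS divided by TDP. A small helper makes the conversion explicit; note that TDP-based efficiency is an upper bound, since real workloads rarely sustain peak FLOPS at TDP:

```python
def gflops_per_watt(peak_tflops: float, tdp_watts: float) -> float:
    """Peak efficiency: convert TFLOPS to GFLOPS, divide by TDP in watts."""
    return peak_tflops * 1e3 / tdp_watts

# NVIDIA A100 SXM4, per the bullet above: 19.5 TFLOPS FP64 at 400 W.
print(gflops_per_watt(19.5, 400))  # 48.75, matching the quoted ~48.75 GFLOPS/W
```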
Future Projections and Theoretical Limits
- Moore's Law predicts a doubling of transistor count roughly every 2 years, historically translating to ~1.86x computing power per generation
- Exascale computing was achieved in 2022; zettascale (10^21 FLOPS) is targeted for around 2030
- Google's 53-qubit Sycamore processor demonstrated quantum supremacy, completing in 200 seconds a task estimated to take a classical supercomputer 10,000 years
- IBM's roadmap targets 100,000+ qubits by 2033 for fault-tolerant quantum computing
- Landauer limit: the theoretical minimum energy per bit erasure is kT ln 2, about 2.87 zJ per operation at room temperature
- Dennard scaling ended around 2006, but 3D stacking is expected to continue power-efficiency gains
- Optical computing could reach 10^15 FLOPS/W vs electronic 10^12
- Neuromorphic computing like Intel Loihi 2 at 10^12 ops/W synaptic
- Frontier to El Capitan 2x performance at same power by 2025
- AMD roadmap MI400 series 5x AI performance over MI300 by 2026
- NVIDIA's Rubin platform (R100 GPU) is projected to deliver up to 30x inference performance over Hopper by 2026
- Intel 18A process 1.8nm for Xeon 6 by 2025, 20% perf/W gain
- TSMC's A16 (1.6 nm) node promises ~10% speed gains and 15-20% power reduction in 2026
- Quantum annealers such as D-Wave Advantage (5,000+ qubits) claim up to 10^6x speedups on specific optimization problems
- Photonic chips Lightmatter Passage 36 petaFLOPS FP16 at 10 kW
- Global supercomputing capacity to hit 10 exaFLOPS aggregate by 2025
- AI training compute has been doubling roughly every 6 months (about 16x every 2 years), per OpenAI's analysis
- Bekenstein bound limits information density to ~10^69 bits per square meter of black-hole horizon area, a theoretical ceiling on computation
- Reversible computing could approach Landauer limit, 10^42 ops/J theoretical
- Bremermann's limit caps computation at ~1.36×10^50 bits per second per kilogram of matter
- Margolus-Levitin theorem limits computation to ~6×10^33 operations per second per joule of energy
- ExaEnergy project targets 60 gigaFLOPS/W sustainable by 2030
- Post-Moore photonics-neuromorphic hybrid 1000x efficiency by 2040
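The Landauer figure above follows directly from kT ln 2 at room temperature, as a quick check shows:

```python
import math

k_B = 1.380649e-23  # Boltzmann constant, J/K (exact SI value)
T = 300.0           # room temperature, K

e_min = k_B * T * math.log(2)  # minimum energy per bit erasure, joules
print(f"{e_min * 1e21:.2f} zJ per bit erasure")          # ~2.87 zJ
print(f"ceiling of {1 / e_min:.2e} erasures per joule")  # ~3.5e20
```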
Historical Milestones
- The ENIAC computer, completed in 1945, had a peak performance of approximately 0.0000001 gigaFLOPS (about 100 FLOPS)
- The Manchester Mark 1, operational in 1949, performed about 1.2 kiloFLOPS in floating-point operations
- The UNIVAC I, delivered in 1951, achieved around 0.000001 gigaFLOPS (1 kiloFLOPS) peak performance
- The IBM 701, introduced in 1953, delivered approximately 0.000016 gigaFLOPS (16 kiloFLOPS)
- The CDC 6600, launched in 1964, reached 3 megaFLOPS peak performance
- The Cray-1 supercomputer, released in 1976, had a peak speed of 160 megaFLOPS
- The Cray X-MP, introduced in 1982, achieved up to 940 megaFLOPS in multi-processor configuration
- The Connection Machine CM-5, deployed in 1991, scaled to 1.056 teraFLOPS with 1024 processors
- ASCI Red, completed in 1997, became the first teraFLOPS supercomputer at 1.338 teraFLOPS
- ASCI White, operational in 2000, peaked at 7.226 teraFLOPS
- Earth Simulator, launched in 2002, achieved 35.86 teraFLOPS on TOP500 list
- Blue Gene/L reached 280.6 teraFLOPS in 2006
- Roadrunner supercomputer hit 1.026 petaFLOPS in 2008
- Tianhe-1A achieved 2.566 petaFLOPS in 2010
- Fujitsu K computer reached 10.51 petaFLOPS in 2011
- Titan supercomputer delivered 17.59 petaFLOPS in 2012
- Tianhe-2 peaked at 33.86 petaFLOPS in 2013
- Sunway TaihuLight achieved 93.01 petaFLOPS in 2016
- Summit supercomputer reached 122.3 petaFLOPS in 2018
- IBM Power9-based Sierra hit 94.64 petaFLOPS in 2018
- Frontier became the first exaFLOPS machine at 1.102 exaFLOPS in 2022
- The first TOP500 list in June 1993 was topped by TMC CM-5/1024 at 59.7 gigaFLOPS
- Intel Paragon XP/S 140 at 143.4 gigaFLOPS topped the June 1994 list
- Numerical Wind Tunnel at 170.0 gigaFLOPS in November 1994
- Intel Paragon at 281.0 gigaFLOPS in November 1996
- ASCI Red at 1.068 teraFLOPS in June 1997
- ASCI Red sustained 1.338 teraFLOPS in November 1997
- ASCI Red at 2.382 teraFLOPS in June 1999
- ASCI Q was projected at 4.944 teraFLOPS in November 2001
- Earth Simulator at 35.860 teraFLOPS in June 2002
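The #1 milestones above compound into an annual growth rate of nearly 80%. A quick check using the two endpoints quoted in this section:

```python
# Compound annual growth rate of the TOP500 #1 system, from the first
# list (TMC CM-5/1024, 59.7 gigaFLOPS, June 1993) to Frontier
# (1.102 exaFLOPS, 2022), per the milestones above.
start_flops = 59.7e9
end_flops = 1.102e18
years = 2022 - 1993

cagr = (end_flops / start_flops) ** (1 / years) - 1
print(f"TOP500 #1 grew ~{cagr:.0%} per year")
```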