GITNUXREPORT 2026

Qwen AI Statistics

Qwen models combine strong MMLU scores, large parameter counts, and heavy download volumes.

Min-ji Park

Research Analyst focused on sustainability and consumer trends.

First published: Feb 24, 2026

Our Commitment to Accuracy

Rigorous fact-checking · Reputable sources · Regular updates

Dive into the transformative world of Qwen AI and discover why it is making waves in AI research and real-world applications: MMLU scores of up to 84.2% for Qwen2-72B-Instruct, pre-training runs spanning trillions of tokens, a wide spread of architectures and parameter sizes, more than 50 million downloads, over 1,000 GitHub forks, and deployments in 50+ commercial apps.

Key Takeaways

  • Qwen-72B achieves 73.5% on MMLU benchmark
  • Qwen1.5-72B-Instruct scores 80.5% on MMLU
  • Qwen2-72B-Instruct reaches 84.2% on MMLU 5-shot
  • Qwen-72B has 72 billion parameters
  • Qwen1.5-110B contains 110 billion parameters
  • Qwen2-72B features 72 billion parameters
  • Qwen trained on over 2 trillion tokens
  • Qwen1.5 pre-trained on 7 trillion tokens including multilingual data
  • Qwen2-72B trained on 7+ trillion high-quality tokens
  • Qwen excels in 29 languages with C-Eval score of 85.2% for Qwen-72B
  • Qwen1.5-72B achieves 81.7% on MultiICL benchmark
  • Qwen2-72B scores 74.5% on MGSM multilingual math
  • Qwen model repository has over 50 million downloads on Hugging Face
  • Qwen2 series garnered 10 million downloads in first month
  • Qwen1.5-7B has 15 million total downloads

Benchmark Performance

  • Qwen-72B achieves 73.5% on MMLU benchmark
  • Qwen1.5-72B-Instruct scores 80.5% on MMLU
  • Qwen2-72B-Instruct reaches 84.2% on MMLU 5-shot
  • Qwen-7B gets 62.4% on MMLU
  • Qwen1.5-32B scores 78.1% on MMLU
  • Qwen2-7B-Instruct achieves 70.5% on MMLU
  • Qwen1.5-110B-Instruct hits 82.4% on MMLU
  • Qwen2-1.5B-Instruct scores 65.9% on MMLU
  • Qwen-14B reaches 68.2% on MMLU
  • Qwen1.5-7B-Instruct gets 74.2% on MMLU
  • Qwen2-72B scores 82.8% on MMLU
  • Qwen1.5-1.8B achieves 67.9% on MMLU
  • Qwen-1.8B hits 58.6% on MMLU
  • Qwen2-0.5B-Instruct reaches 52.4% on MMLU
  • Qwen1.5-4B scores 71.2% on MMLU
  • Qwen-72B-Instruct gets 76.8% on MMLU
  • Qwen2-7B reaches 70.5% on MMLU 5-shot
  • Qwen1.5-14B-Instruct scores 77.5% on MMLU
  • Qwen-7B-Instruct achieves 64.1% on MMLU
  • Qwen2-1.5B scores 65.9% on MMLU
  • Qwen1.5-0.5B-Instruct hits 54.3% on MMLU
  • Qwen-14B-Instruct gets 69.7% on MMLU
  • Qwen2-72B-Instruct scores 84.2% on MMLU-Pro
  • Qwen1.5-72B scores 80.5% on MMLU-Redux
  • Qwen-1.8B-Instruct achieves 60.2% on MMLU

Benchmark Performance Interpretation

Qwen’s models show a clear upward trend across both scale and generation. Newer releases such as Qwen2-72B-Instruct, at 84.2% on MMLU 5-shot (and 84.2% on MMLU-Pro in the figures above), outperform earlier checkpoints that range from the tiny 0.5B models (52.4%) up to the original Qwen-7B (62.4%), while the largest models, such as Qwen1.5-110B-Instruct (82.4%) and the base Qwen2-72B (82.8%), show how added scale and newer training recipes combine into a measurable arc of improvement.
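For readers who want to see where numbers like these come from, the snippet below is a minimal sketch of how 5-shot MMLU-style accuracy is usually computed with a causal language model: score the log-likelihood of each answer letter after a prompt containing five worked examples and pick the highest-scoring one. It assumes the Hugging Face transformers library and the public Qwen/Qwen2-7B-Instruct checkpoint; the prompt format is illustrative and is not the exact harness behind the figures quoted above.

```python
# Minimal sketch of 5-shot MMLU-style scoring with a Qwen causal LM.
# Assumptions: the `transformers` library is installed and the public
# "Qwen/Qwen2-7B-Instruct" checkpoint is used; prompts are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2-7B-Instruct"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

def answer_logprob(prompt: str, letter: str) -> float:
    """Log-probability the model assigns to `letter` as the next token after `prompt`."""
    ids = tok(prompt + letter, return_tensors="pt").input_ids.to(model.device)
    with torch.no_grad():
        logits = model(ids).logits
    # The logits at position -2 predict the final token (the answer letter).
    return logits[0, -2].log_softmax(-1)[ids[0, -1]].item()

few_shot = "...five solved MMLU questions go here...\n"   # the "5-shot" context
question = "Question: ...\nA. ...\nB. ...\nC. ...\nD. ...\nAnswer: "
prediction = max("ABCD", key=lambda c: answer_logprob(few_shot + question, c))
# Reported accuracy is the fraction of test items where `prediction`
# matches the gold answer letter.
```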

Community and Adoption

  • Qwen model repository has over 50 million downloads on Hugging Face
  • Qwen2 series garnered 10 million downloads in first month
  • Qwen1.5-7B has 15 million total downloads
  • Qwen ranks top 5 on LMSYS Chatbot Arena with Elo 1285 for 72B
  • Over 1000 forks on GitHub QwenLM repo
  • Qwen-72B-Instruct used in 500+ Hugging Face spaces
  • Qwen2-72B tops Open LLM Leaderboard v2
  • Qwen1.5 series deployed on Alibaba Cloud by 1 million users
  • Qwen GitHub stars exceed 20,000
  • Qwen2 released June 2024 with 5 million inferences on DashScope
  • Qwen-7B downloaded 8 million times on HF
  • Qwen1.5-72B ranks #1 open model on MT-Bench
  • Over 200 community fine-tunes of Qwen on HF
  • Qwen2-7B-Instruct has an Arena Elo of 1260
  • Qwen adopted in 50+ commercial apps via ModelScope
  • Qwen-14B has 2 million downloads
  • Qwen1.5-32B is cited in 300+ research papers
  • Qwen2-0.5B lightweight model with 1M+ downloads
  • Qwen repo has over 50 contributors
  • Qwen1.5-110B preview accessed by 100K developers
  • Qwen-1.8B mobile deployments exceed 500K
  • Qwen2 multilingual variants starred 10K times
  • Qwen overall HF models viewed 100 million times
  • Qwen-VL released, trained on 1M image-text pairs

Community and Adoption Interpretation

Qwen, the AI model family, has surged in adoption. It has logged over 50 million downloads on Hugging Face, including 15 million for Qwen1.5-7B, 8 million for Qwen-7B, and 10 million for the Qwen2 series in its first month. The 72B model ranks in the top 5 on the LMSYS Chatbot Arena with an Elo of 1285, and Qwen2-72B tops the Open LLM Leaderboard v2. The models power 500+ Hugging Face Spaces and 50+ commercial apps via ModelScope, Qwen1.5-32B is cited in 300+ papers, and the Qwen1.5 series is deployed on Alibaba Cloud by 1 million users. The GitHub repository counts more than 20,000 stars, 1,000 forks, and 50+ contributors, alongside 200+ community fine-tunes on Hugging Face. Add 5 million Qwen2 inferences on DashScope, 500K+ mobile deployments of Qwen-1.8B, 1M+ downloads of Qwen2-0.5B, 100K developers on the Qwen1.5-110B preview, 10K stars for the multilingual Qwen2 variants, and Qwen-VL's training on 1 million image-text pairs, and the picture is of a genuine standout in both open-source and commercial AI.
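Adoption figures like these can be checked against the Hugging Face Hub at any time. The sketch below assumes the huggingface_hub client library and real public Qwen repository ids; note that the API reports a rolling recent-download count, so the values returned today will not match the point-in-time totals quoted in this report.

```python
# Hedged sketch: querying Hugging Face Hub metadata for Qwen repositories.
# `downloads` is a rolling recent count, not the cumulative totals cited above.
from huggingface_hub import HfApi

api = HfApi()
for repo_id in ["Qwen/Qwen2-7B-Instruct", "Qwen/Qwen1.5-7B", "Qwen/Qwen2-72B-Instruct"]:
    info = api.model_info(repo_id)
    print(f"{repo_id}: downloads={info.downloads}, likes={info.likes}")
```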

Model Architecture

  • Qwen-72B has 72 billion parameters
  • Qwen1.5-110B contains 110 billion parameters
  • Qwen2-72B features 72 billion parameters
  • Qwen-14B has 14 billion parameters
  • Qwen1.5-32B has 32 billion parameters
  • Qwen2-7B has 7 billion parameters
  • Qwen1.5-72B has 72 billion parameters
  • Qwen2-1.5B contains 1.5 billion parameters
  • Qwen-7B has 7 billion parameters
  • Qwen1.5-14B has 14 billion parameters
  • Qwen2-0.5B has 0.5 billion parameters
  • Qwen-1.8B has 1.8 billion parameters
  • Qwen1.5-7B has 7 billion parameters
  • Qwen2-72B-Instruct uses Transformer architecture with 80 layers
  • Qwen1.5-4B has 4 billion parameters
  • Qwen-72B-Instruct has 72 billion parameters
  • Qwen2-7B-Instruct features 7B params with 28 layers
  • Qwen1.5-1.8B has 1.8 billion parameters
  • Qwen-14B-Instruct has 14B parameters
  • Qwen2-1.5B-Instruct has 1.5B parameters
  • Qwen1.5-0.5B has 0.5 billion parameters
  • Qwen-7B-Instruct has 7B parameters
  • Qwen2-72B uses grouped-query attention with 8 key-value heads
  • Qwen1.5-110B-Instruct uses 110B parameters with SwiGLU

Model Architecture Interpretation

The Qwen AI lineup spans a wide range of parameter sizes, from 0.5 billion up to 110 billion, across the Qwen, Qwen1.5, and Qwen2 generations. Notable architectural details include the Transformer backbone, SwiGLU activation in Qwen1.5-110B-Instruct, grouped-query attention in Qwen2-72B, and layer counts that scale with model size, such as 80 layers in Qwen2-72B-Instruct versus 28 in Qwen2-7B-Instruct.
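The layer counts, head layouts, and activation choices summarized above are published in each checkpoint's configuration file, so they can be read directly rather than taken on faith. Below is a minimal sketch assuming the transformers library and the public Qwen/Qwen2-7B-Instruct repository; the field names follow the standard Qwen2 configuration class.

```python
# Hedged sketch: inspecting a Qwen2 checkpoint's architecture from its config.
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("Qwen/Qwen2-7B-Instruct")
print("hidden layers:", cfg.num_hidden_layers)        # 28 for the 7B Instruct model
print("hidden size:", cfg.hidden_size)
print("query heads:", cfg.num_attention_heads)
print("key/value heads (GQA groups):", cfg.num_key_value_heads)
print("activation:", cfg.hidden_act)                  # "silu", the gate used in SwiGLU
print("max context:", cfg.max_position_embeddings)
```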

Multilingual Support

  • Qwen excels in 29 languages with C-Eval score of 85.2% for Qwen-72B
  • Qwen1.5-72B achieves 81.7% on MultiICL benchmark
  • Qwen2-72B scores 74.5% on MGSM multilingual math
  • Qwen-72B gets 84.3% on CMMLU Chinese benchmark
  • Qwen1.5-110B reaches 90.2% on C-Eval
  • Qwen2-7B-Instruct scores 68.9% on IFEval multilingual
  • Qwen supports Japanese with 82.1% on JMMLU for 72B
  • Qwen1.5-32B achieves 76.4% on MultiMT-Bench
  • Qwen2-1.5B gets 62.3% on Chinese HumanEval
  • Qwen-14B scores 79.5% on C-SimpleQA
  • Qwen1.5-7B reaches 73.8% on KoBBQ Korean benchmark
  • Qwen2-72B-Instruct scores 88.4% on Chinese NLI
  • Qwen-7B achieves 81.6% on CMMLU
  • Qwen1.5-14B scores 77.2% on Arabic MMLU
  • Qwen2-0.5B gets 55.7% on multilingual TriviaQA
  • Qwen-1.8B reaches 70.4% on French MMLU variant
  • Qwen1.5-72B-Instruct scores 83.9% on Spanish EQ-Bench
  • Qwen2-7B scores 71.2% on German HellaSwag
  • Qwen-72B-Instruct scores 86.7% on Russian RACE
  • Qwen1.5-4B achieves 69.8% on Italian GSM8K
  • Qwen2-1.5B-Instruct scores 64.5% on Hindi OpenbookQA
  • Qwen-14B-Instruct scores 78.9% on Thai summarization
  • Qwen1.5-1.8B gets 66.3% on Vietnamese ARC-Challenge
  • Qwen2-72B reaches 75.8% on Korean coding eval
  • Qwen1.5-0.5B scores 53.1% on multilingual commonsense

Multilingual Support Interpretation

Qwen proves its multilingual range across 29 languages, with standout scores such as 85.2% on C-Eval for Qwen-72B, 90.2% on C-Eval for Qwen1.5-110B, 81.7% on MultiICL for Qwen1.5-72B, and 88.4% on Chinese NLI for Qwen2-72B-Instruct. It also performs solidly on regional tasks, from Korean coding (75.8% for Qwen2-72B) and the French MMLU variant (70.4% for Qwen-1.8B) to German HellaSwag and Russian RACE, where even the smaller models hold their own. Only a few evaluations, such as multilingual TriviaQA (55.7% for Qwen2-0.5B) and Chinese HumanEval (62.3% for Qwen2-1.5B), leave clear room for growth, and both involve the smallest checkpoints.
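The multilingual coverage these benchmarks measure can also be sanity-checked interactively by prompting an instruct checkpoint in different languages. The snippet below is an illustrative sketch assuming the transformers library and the public Qwen/Qwen2-7B-Instruct model; the prompts are arbitrary examples, not items from the benchmarks cited above.

```python
# Hedged sketch: multilingual prompting with a Qwen instruct model via its chat template.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2-7B-Instruct"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

prompts = [
    "什么是机器学习？",                              # Chinese
    "Qu'est-ce que l'apprentissage automatique ?",   # French
    "¿Qué es el aprendizaje automático?",            # Spanish
]
for prompt in prompts:
    messages = [{"role": "user", "content": prompt}]
    input_ids = tok.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(input_ids, max_new_tokens=64)
    print(tok.decode(output[0, input_ids.shape[-1]:], skip_special_tokens=True))
```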

Training Details

  • Qwen trained on over 2 trillion tokens
  • Qwen1.5 pre-trained on 7 trillion tokens including multilingual data
  • Qwen2-72B trained on 7+ trillion high-quality tokens
  • Qwen-72B used 10T tokens in pre-training
  • Qwen1.5-110B post-trained with over 1 million instructions
  • Qwen2 series employed YaRN for extended context up to 128K
  • Qwen-7B trained with 2T Chinese-English tokens
  • Qwen1.5-72B fine-tuned on 5B+ tokens of instruction data
  • Qwen2-7B pre-trained with enhanced data mixture
  • Qwen used supervised fine-tuning on 500K samples
  • Qwen1.5 supports 14 trillion token pre-training scale
  • Qwen2-0.5B trained on diverse code and math data
  • Qwen-14B utilized RLHF with 100K preferences
  • Qwen1.5-32B trained with long-context up to 32K tokens
  • Qwen2-72B-Instruct post-trained with rejection sampling and DPO
  • Qwen-1.8B pre-trained on 1T+ tokens
  • Qwen1.5-7B used 3T multilingual tokens
  • Qwen2-1.5B fine-tuned on 2B instruction tokens
  • Qwen-72B-Instruct aligned with human feedback on 20K samples
  • Qwen1.5-14B trained for 128K context length
  • Qwen2 supports 29 languages in training data
  • Qwen1.5-4B pre-trained on synthetic data augmentation

Training Details Interpretation

Qwen and its family of models are trained at serious scale, chewing through trillions of tokens that span multilingual text, synthetic data, code, math, and instruction sets. They are fine-tuned on millions of instructions, aligned with human preferences via techniques such as RLHF and DPO, and stretched to context lengths of up to 128K tokens, with each generation outdoing the last in scale, specificity, and versatility.
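One concrete training-related detail worth illustrating is the YaRN-based context extension mentioned above, which stretches Qwen2's native 32K window toward 128K tokens. The sketch below follows the rope-scaling pattern published for Qwen2 long-context use, but the exact field names and supported factors are assumptions that should be verified against the current Qwen and transformers documentation.

```python
# Hedged sketch: enabling YaRN-style rotary scaling for long-context Qwen2 inference.
# The rope_scaling fields mirror the pattern documented for Qwen2; treat them as
# assumptions to verify, not a guaranteed API.
from transformers import AutoConfig, AutoModelForCausalLM

cfg = AutoConfig.from_pretrained("Qwen/Qwen2-7B-Instruct")
cfg.rope_scaling = {
    "type": "yarn",                               # YaRN interpolation of rotary embeddings
    "factor": 4.0,                                # 32,768 * 4 = about a 131K-token window
    "original_max_position_embeddings": 32768,    # the pre-trained context length
}
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2-7B-Instruct", config=cfg, torch_dtype="auto", device_map="auto"
)
```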