Key Takeaways
- Genome-wide association studies link 7,000 SNPs to disease risk
- Pharmacogenomics identifies 300 actionable variants for 100+ drugs
- Prenatal whole-genome sequencing detects 13% more pathogenic variants than microarrays
- The human genome contains an estimated 20,000-25,000 protein-coding genes
- Non-coding RNAs number over 20,000 in the human genome including lncRNAs and miRNAs
- Pseudogenes in humans total around 14,000, mostly processed pseudogenes
- The common single nucleotide polymorphisms (SNPs) number over 10 million in the human genome with minor allele frequency >1%
- Structural variants (SVs) affect 20-50 kb per individual, totaling 1-2% of genome difference
- Copy number variations (CNVs) cover 12% of the human genome across populations
- The human genome contains approximately 3.2 billion base pairs of DNA sequence
- The haploid human genome size is measured at 3,054,815,472 base pairs in the GRCh38.p14 assembly
- Eukaryotic genomes like humans have linear chromosomes, with 22 autosomes and 2 sex chromosomes totaling 24 unique chromosomes
- Human Genome Project officially completed in 2003 with 99% coverage at 1x depth
- The first human genome sequence cost $2.7 billion and took 13 years
- Illumina HiSeq platform enabled 100x coverage human genomes for under $1,000 by 2015
From thousands of SNP links to full genome sequencing progress, statistics are rapidly turning data into action.
Related reading
Applications and Impacts
Applications and Impacts Interpretation
Gene Content
Gene Content Interpretation
Genetic Variation
Genetic Variation Interpretation
More related reading
Genome Size and Structure
Genome Size and Structure Interpretation
Sequencing Projects
Sequencing Projects Interpretation
How We Rate Confidence
Every statistic is queried across four AI models (ChatGPT, Claude, Gemini, Perplexity). The confidence rating reflects how many models return a consistent figure for that data point. Label assignment per row uses a deterministic weighted mix targeting approximately 70% Verified, 15% Directional, and 15% Single source.
Only one AI model returns this statistic from its training data. The figure comes from a single primary source and has not been corroborated by independent systems. Use with caution; cross-reference before citing.
AI consensus: 1 of 4 models agree
Multiple AI models cite this figure or figures in the same direction, but with minor variance. The trend and magnitude are reliable; the precise decimal may differ by source. Suitable for directional analysis.
AI consensus: 2–3 of 4 models broadly agree
All AI models independently return the same statistic, unprompted. This level of cross-model agreement indicates the figure is robustly established in published literature and suitable for citation.
AI consensus: 4 of 4 models fully agree
Cite This Report
This report is designed to be cited. We maintain stable URLs and versioned verification dates. Copy the format appropriate for your publication below.
Emilia Santos. (2026, February 13). Genome Statistics. Gitnux. https://gitnux.org/genome-statistics
Emilia Santos. "Genome Statistics." Gitnux, 13 Feb 2026, https://gitnux.org/genome-statistics.
Emilia Santos. 2026. "Genome Statistics." Gitnux. https://gitnux.org/genome-statistics.
Sources & References
- Reference 1GENOMEgenome.gov
genome.gov
- Reference 2NCBIncbi.nlm.nih.gov
ncbi.nlm.nih.gov
- Reference 3ENen.wikipedia.org
en.wikipedia.org
- Reference 4MEDLINEPLUSmedlineplus.gov
medlineplus.gov
- Reference 5GHRghr.nlm.nih.gov
ghr.nlm.nih.gov
- Reference 6NATUREnature.com
nature.com
- Reference 7GENOMEgenome.ucsc.edu
genome.ucsc.edu
- Reference 8ENSEMBLensembl.org
ensembl.org
- Reference 9GATKgatk.broadinstitute.org
gatk.broadinstitute.org
- Reference 10FLYBASEflybase.org
flybase.org
- Reference 11ARABIDOPSISarabidopsis.org
arabidopsis.org
- Reference 12YEASTGENOMEyeastgenome.org
yeastgenome.org
- Reference 13WHEATGENOMEwheatgenome.org
wheatgenome.org
- Reference 14MAIZEGDBmaizegdb.org
maizegdb.org
- Reference 15WORMBASEwormbase.org
wormbase.org
- Reference 16PLASMODBplasmodb.org
plasmodb.org
- Reference 17GENCODEGENESgencodegenes.org
gencodegenes.org
- Reference 18IMGTimgt.org
imgt.org
- Reference 19GUIDETOPHARMACOLOGYguidetopharmacology.org
guidetopharmacology.org
- Reference 20KINASEkinase.com
kinase.com
- Reference 21DRNELSONdrnelson.uthsc.edu
drnelson.uthsc.edu
- Reference 22ECOCYCecocyc.org
ecocyc.org
- Reference 23RICErice.uga.edu
rice.uga.edu
- Reference 24ILLUMINAillumina.com
illumina.com
- Reference 25INTERNATIONALGENOMEinternationalgenome.org
internationalgenome.org
- Reference 26UKBIOBANKukbiobank.ac.uk
ukbiobank.ac.uk
- Reference 27CANCERcancer.gov
cancer.gov
- Reference 28EARTHBIOGENOMEearthbiogenome.org
earthbiogenome.org
- Reference 29SCIENCEscience.org
science.org
- Reference 30ENCODEPROJECTencodeproject.org
encodeproject.org
- Reference 31NANOPORETECHnanoporetech.com
nanoporetech.com
- Reference 32ALLOFUSallofus.nih.gov
allofus.nih.gov
- Reference 33GTEXPORTALgtexportal.org
gtexportal.org
- Reference 341000GENOMES1000genomes.org
1000genomes.org
- Reference 35EBIebi.ac.uk
ebi.ac.uk
- Reference 36CBIOPORTALcbioportal.org
cbioportal.org
- Reference 37GNOMADgnomad.broadinstitute.org
gnomad.broadinstitute.org
- Reference 38PHARMGKBpharmgkb.org
pharmgkb.org
- Reference 39NEJMnejm.org
nejm.org
- Reference 40ACOGacog.org
acog.org
- Reference 41ISAAAisaaa.org
isaaa.org
- Reference 42ANCESTRYancestry.com
ancestry.com
- Reference 43ANNALSOFONCOLOGYannalsofoncology.org
annalsofoncology.org
- Reference 44GENOMEWEBgenomeweb.com
genomeweb.com







