Key Takeaways
- Global big data and business analytics market size was valued at $68.09 billion in 2019 and is expected to reach $274.3 billion by 2022
- The IDC 2020 outlook expects worldwide big data and analytics spending to total $274.3 billion in 2022
- IDC projects worldwide big data and analytics spending to reach $348.7 billion in 2024
- According to NIST, big data is often characterized by the 5 Vs (volume, velocity, variety, veracity, value)
- NIST defines “big data” as data sets with sizes beyond the ability of typical tools to capture, store, manage, and analyze
- NIST notes that big data can be analyzed to uncover patterns, correlations, and other insights
- HDFS uses block replication: each block is replicated 3 times by default
- HDFS default replication factor is 3
- Apache Spark is designed to run in-memory computations for speed; Spark uses directed acyclic graphs (DAGs) for execution
- The amount of data in the world in 2018 was estimated at 33 zettabytes
- The Seagate 2018 report estimated 175 zettabytes of data will be created by 2025
- Seagate estimated 79 zettabytes of data will be created by 2019
Big data market grows fast, drives cloud analytics and data-driven profits.
Market & Adoption
Market & Adoption Interpretation
Definitions & Characteristics
Definitions & Characteristics Interpretation
Infrastructure & Performance
Infrastructure & Performance Interpretation
Data Volume & Growth
Data Volume & Growth Interpretation
How We Rate Confidence
Every statistic is queried across four AI models (ChatGPT, Claude, Gemini, Perplexity). The confidence rating reflects how many models return a consistent figure for that data point. Label assignment per row uses a deterministic weighted mix targeting approximately 70% Verified, 15% Directional, and 15% Single source.
Only one AI model returns this statistic from its training data. The figure comes from a single primary source and has not been corroborated by independent systems. Use with caution; cross-reference before citing.
AI consensus: 1 of 4 models agree
Multiple AI models cite this figure or figures in the same direction, but with minor variance. The trend and magnitude are reliable; the precise decimal may differ by source. Suitable for directional analysis.
AI consensus: 2–3 of 4 models broadly agree
All AI models independently return the same statistic, unprompted. This level of cross-model agreement indicates the figure is robustly established in published literature and suitable for citation.
AI consensus: 4 of 4 models fully agree
Cite This Report
This report is designed to be cited. We maintain stable URLs and versioned verification dates. Copy the format appropriate for your publication below.
Lukas Bauer. (2026, February 13). Big Data Statistics. Gitnux. https://gitnux.org/big-data-statistics
Lukas Bauer. "Big Data Statistics." Gitnux, 13 Feb 2026, https://gitnux.org/big-data-statistics.
Lukas Bauer. 2026. "Big Data Statistics." Gitnux. https://gitnux.org/big-data-statistics.
References
- 1databricks.com/resources/whitepapers/the-enterprise-big-data-market-is-growing-rapidly
- 28databricks.com/resources/report/state-of-ai
- 2idc.com/getdoc.jsp?containerId=prUS46394220
- 3idc.com/getdoc.jsp?containerId=prUS46394320
- 4idc.com/getdoc.jsp?containerId=prUS46590221
- 5idc.com/getdoc.jsp?containerId=prUS46604421
- 35idc.com/promo/word-definitions/big-data
- 63idc.com/getdoc.jsp?containerId=prUS44410418
- 6gartner.com/en/newsroom/press-releases/2020-08-06-gartner-says-by-2025-75-percent-of-enterprises-will
- 7gartner.com/en/newsroom/press-releases/2019-11-18-gartner-says-by-2023-50-percent
- 8gartner.com/en/newsroom/press-releases/2021-07-07-gartner-says-by-2024-75-percent
- 23gartner.com/en/newsroom/press-releases/2016-03-30-gartner-says-advanced-analytics
- 9sas.com/en_us/insights/articles/analytics/big-data-analytics-statistics.html
- 10experian.com/blogs/business-strategy/data/big-data-statistics/
- 11delltechnologies.com/en-us/perspectives/big-data-statistics.htm
- 12ibm.com/blogs/business-analytics/2011/08/big-data-the-next-frontier/
- 13ibm.com/thought-leadership/institute-business-value/report/digital-data-world
- 34ibm.com/cloud/learn/big-data
- 67ibm.com/blogs/systems/2013/05/what-is-big-data/
- 68ibm.com/blogs/think/2014/02/big-data-and-the-end-to-end-data-pipeline/
- 14worldbank.org/en/programs/ic4d/brief/digital-growth
- 15seagate.com/gb/en/our-story/news/press-releases/seagate-and-the-institute-of-data-and-statistical-studies/
- 62seagate.com/www-content/about-us/newsroom/press-releases/files/Seagate-IOD-2018-Data-Created.pdf
- 16splunk.com/en_us/resources/reports/state-of-big-data-and-security.html
- 17alteryx.com/company/resources/resource-library/data-analyst-survey
- 18home.kpmg/us/en/home/insights/2017/10/data-and-analytics-survey.html
- 19pwc.com/gx/en/issues/analytics/assets/pwc-ceo-analytics-survey.pdf
- 20newvantage.com/blog/2014/02/analytics-programs-statistics/
- 21nucleusresearch.com/research/big-data-analytics-is-paying-off-for-companies/
- 22mckinsey.com/featured-insights/mckinsey-analytics/how-businesses-are-using-data-to-improve-performance
- 24dellemc.com/en-us/leadership/thought-leadership/industry-insights/index.htm
- 25cloudera.com/resources/whitepapers/enterprise-data-cloud.html
- 26thoughtspot.com/resources/data-silos-survey
- 27forbes.com/sites/forbestechcouncil/2019/05/07/how-data-analytics-creates-a-competitive-advantage/
- 29oreilly.com/library/view/data-science-for/9781491952965/ch01.html
- 30nist.gov/system/files/documents/2017/01/02/big-data.pdf
- 31nist.gov/publications/final-report-big-data-interoperability
- 32nist.gov/news-events/news/2015/10/nist-publishes-report-big-data
- 44nist.gov/publications/big-data-and-privacy-protection
- 45nist.gov/itl/smallbusinesscybersecurity
- 46nist.gov/system/files/documents/2017/01/02/big-data-interoperability.pdf
- 33cs.berkeley.edu/~brewer/5thv.pdf
- 51cs.berkeley.edu/~matei/papers/2010/spark.pdf
- 36hadoop.apache.org/docs/stable1/hadoop-project-dist/hadoop-common/History.html
- 47hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html#Replication
- 48hadoop.apache.org/docs/stable3/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html
- 37research.google/pubs/pub62/
- 43research.google/pubs/pub56/
- 53research.google/pubs/spanner/
- 38static1.squarespace.com/static/5472b4d2e4b0f54f9d9b62e2/t/58a0d9a5a5451f12b9bdab4b/1487975622660/Lambda+Architecture.pdf
- 39arxiv.org/abs/1402.2773
- 40cs.brown.edu/~mph/undergrad/2011/papers/brewer-cap.pdf
- 41research.cs.umbc.edu/~mhamdi/papers/pacelc.pdf
- 42ieeexplore.ieee.org/document/6129373
- 49spark.apache.org/docs/latest/cluster-overview.html
- 50spark.apache.org/docs/latest/sql-programming-guide.html
- 52dl.acm.org/doi/10.1145/2071389.2071394
- 54static.googleusercontent.com/media/research.google.com/en//pubs/archive/36962.pdf
- 55cassandra.apache.org/_/index.html
- 56kafka.apache.org/documentation/#configuration
- 57kafka.apache.org/documentation/#brokerconfigs
- 58elastic.co/guide/en/elasticsearch/reference/current/index-modules.html#index-number-of-shards
- 59hbase.apache.org/book.html#arch.overview
- 60mongodb.com/docs/manual/sharding/
- 61redis.io/docs/latest/operate/oss_and_stack/management/scaling/
- 64cisco.com/c/en/us/solutions/collateral/service-provider/vni-forecast-highlights/white-paper-c11-741490.html
- 66cisco.com/c/en/us/solutions/collateral/service-provider/vni-forecast-highlights/white-paper-c11-520862.html
- 65newsroom.cisco.com/c/r/newsroom/en/us/a/i/vni.html
- 69emc.com/collateral/analyst-reports/idc/digital-universe-2014.pdf
- 70ericsson.com/en/reports/mobility-report
- 71ericsson.com/en/mobility-report/reports/june-2021
- 75ericsson.com/en/reports-and-papers/mobility-report/connected-devices
- 72snowflake.com/press-room/snowflake-2022-data-cloud-state-of-the-union/
- 73home.cern/news/news/knowledge-sharing/what-does-it-take-make-big-lhc-run
- 74home.cern/news/news/experiments/production-and-processing






