The size of datasets used to train language models doubles approximately every eight months

Across all domains of ML, models are being trained on ever more data. In language modeling, training datasets grow at roughly 3x per year. The largest models currently use datasets containing tens of trillions of words, while the largest public datasets are roughly ten times larger; Common Crawl, for example, contains hundreds of trillions of words before filtering.
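As a quick sanity check on the headline figure, a 3x-per-year growth rate can be converted into a doubling time; the sketch below does this arithmetic in Python (the 3x rate is from the text, and the eight-month doubling time follows from it):

```python
import math

# An annual growth factor g implies log2(g) doublings per year,
# so the doubling time in months is 12 / log2(g).
annual_growth_factor = 3.0
doublings_per_year = math.log(annual_growth_factor, 2)
doubling_time_months = 12 / doublings_per_year
print(round(doubling_time_months, 1))  # ~7.6 months, i.e. roughly eight months
```

This confirms that "3x per year" and "doubling approximately every eight months" describe the same growth rate.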

Published: June 19, 2024

Last updated: March 25, 2025