2024 Impact Report
In 2024, Epoch published influential research, launched FrontierMath, expanded its AI data hub, engaged with policy and industry leaders, raised $7M, and more.
2024 has proven to be yet another impactful year for AI. The release of OpenAI’s o1 established inference-time scaling as a crucial driver of progress, later culminating in the announcement of OpenAI’s o3, for which OpenAI claims large advances in math, reasoning and coding benchmarks. We also saw many other major LLM releases, including highly capable models from OpenAI competitors, such as Google’s Gemini 1.5 and 2.0 and Anthropic’s Claude 3.5; a proliferation of GPT-4 scale models such as Mistral Large (2), GLM-4, Doubao Pro, Nemotron-4 340B and Grok 2; and Llama 3.1 405B and DeepSeek v3, the first downloadable-weight models comparable to GPT-4 in performance. Lastly, we have seen large advances in video generation, including the release of models such as Sora, Movie Gen and Veo 2, as well as some early results in computer interaction through GPT-4o and Claude 3.5 Sonnet.
Amidst all these developments, Epoch AI’s mission of informing the public about ongoing developments remains as critical as ever. Throughout the year, we have updated our data page with new hubs and data insights tracking the most important trends in the field. We have launched an AI capabilities tracking effort, including the release of FrontierMath, an advanced math benchmark, as well as our benchmarking hub. And we have released many research reports, including an analysis of the most important bottlenecks to AI scaling over the rest of this decade.
As companies continue to scale AI training and inference in the coming years, we anticipate comparable advances in the reasoning and generality of AI models. To continue bringing clarity to AI, over 2025 we plan to do substantial work in measuring AI capabilities and modelling AI’s potential economic impact, and to continue expanding our coverage of AI models, chips and clusters. To that end, we are fundraising up to $10M over the next two years to support and expand our programs. Please consider donating through our website, or reach out to donate@epoch.ai to discuss a large donation.
We are immensely excited about the future of AI and of our organization. Thank you for joining us on our journey!
Highlights from 2024
Some of our most notable outputs from 2024 include:
FrontierMath
A novel, private benchmark of over a hundred challenging math problems made in collaboration with over 70 mathematicians. We interviewed three Fields Medalists and an International Mathematical Olympiad expert, who consider the hardest subset of the benchmark exceptionally challenging and high quality, and expect it will take several years for AI to solve such problems. The benchmark sponsor, OpenAI, featured FrontierMath in the announcement of their new o3 model, which they claim solves 25% of the benchmark. This project had close to 1M views on Twitter and was covered in Science, Ars Technica and TIME.
Why this matters. Existing benchmarks to measure AI’s reasoning capabilities are close to saturation and often struggle with data contamination issues. FrontierMath tackles both challenges, and we aim for it to become the primary means of tracking progress in math capabilities in the coming years.
Can AI Scaling Continue Through 2030?
A long-form report investigating four key bottlenecks in scaling up compute for AI pre-training: power infrastructure, GPU production, data availability and latency. We conclude that under fairly conservative assumptions it is quite likely that the current trend of compute scaling can continue until 2030 despite these bottlenecks—which would lead to a 10,000x increase in training compute, similar to the gap between GPT-2 and GPT-4. This project had over 1.5M views on Twitter, and was discussed in a report by the US Congressional Budget Office.
Why this matters. Compute scaling during pre-training has been the most important driver of AI advances in the last decade. This report identifies and contributes novel data and analysis on the most important bottlenecks we could see stopping scaling. It is already helping governments plan for the infrastructure needed for AI development and contextualizes previous research and discussion.
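As a rough illustration of the 10,000x figure above, the sketch below converts a total increase in training compute into the constant annual growth factor it would imply. The six-year window (2024 through 2030) and the function names are assumptions made for this illustration; they are not figures or methods taken from the report itself.

```python
import math


def implied_annual_growth(total_factor: float, years: float) -> float:
    """Constant yearly multiplier that compounds to `total_factor` over `years`."""
    return total_factor ** (1.0 / years)


def orders_of_magnitude(total_factor: float) -> float:
    """Express a multiplicative increase as a power of ten."""
    return math.log10(total_factor)


TOTAL_FACTOR = 10_000  # increase in training compute quoted in the report
YEARS = 6              # assumption: growth is counted from 2024 through 2030

growth = implied_annual_growth(TOTAL_FACTOR, YEARS)
print(f"{orders_of_magnitude(TOTAL_FACTOR):.0f} orders of magnitude")
print(f"about {growth:.2f}x per year over {YEARS} years")  # about 4.64x per year
```

Under this assumed window, a 10,000x increase corresponds to training compute growing by a little under 5x each year. A different start year would change the implied rate.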
Epoch AI’s Data Hub
An updated collection of data and insights about AI. It comprises databases of Notable AI models, Large-Scale AI models over 1e23 FLOP, AI Hardware and notable AI Benchmarking evaluations. Our data has been cited by technology leaders Satya Nadella and Sundar Pichai, and the US Bureau of Industry and Security, among others. It has received over 8M views on Twitter and has been widely discussed in media.
Why this matters. Accurate, up-to-date, and easy-to-cite information on the key trends and numbers in AI helps ground discussions on solid evidence. Our database is a resource that enables sophisticated discussion and analysis of trends in government, academia, and industry.
Alongside this work, we have produced many other notable outputs, including our ICML 2024 paper Limits of LLM Scaling Based on Human-Generated Data, and our NeurIPS 2024 paper Algorithmic Progress in Language Models. We have also contributed to external efforts, including the International Scientific Report on the Safety of Advanced AI and consultations for the UK Department for Science, Innovation and Technology and the European Joint Research Centre.
Press and citations
Our work has been extensively covered by the media, reflecting widespread interest in our research and its significance for the future of AI. Below is a selection of our most notable mentions, illustrating the attention and recognition our efforts have drawn from science, technology, business, and government audiences.
What people are saying about Epoch AI
Epoch AI’s research, data, and publications have garnered praise from experts across academia, industry, and government. Below, you’ll find a selection of testimonials highlighting the trust and credibility we’ve built—and the significant impact our work continues to have.
“Epoch AI’s research brings both clarity and depth to the rapidly evolving landscape of artificial intelligence. Their work is a relevant resource for anyone who wants to understand the trajectory of AI and its societal implications.”
“Epoch does the most thoughtful and best-researched survey work in the industry. Several times I have thought I found errors in their results, only to discover when going through their notebooks that they had it right. They are my go-to resource for field-wide trends.”
“Epoch AI’s research and analysis is indispensable. As AI systems grow more powerful, it becomes increasingly crucial to have high-quality, independent information on AI progress.”
2024 in numbers
[Infographic: key figures for 2024, grouped under Outputs, Data collection, Engagement, Social media, and Company]
Our plans for 2025
Through 2025, we plan to focus our work on three key efforts:
Curating Data on AI
In addition to continuing our successful work tracking AI models and AI hardware, we plan to expand our data coverage to include AI clusters and companies. Alongside these efforts, we plan to release weekly data insights, curating important graphs relevant to AI.
Why this is important: Accurate, up-to-date, and easy-to-cite information on the key trends and numbers in AI helps ground discussions on solid evidence. Our database is a resource that enables sophisticated discussion and analysis of trends in government, academia, and industry.
Measuring AI capabilities
We plan to significantly expand our AI Benchmarking Hub, aiming for it to become the best resource online for tracking trends in model capabilities. It will feature independent evaluations in leading AI benchmarks—including FrontierMath, GPQA, SWE-bench, RE-bench and others. We will also develop new benchmarks to fill important gaps in AI capability measurement, and a sophisticated methodology to track AI advances—including capability scores and model cards.
Why this is important: The trend of capabilities in AI benchmarks is the most important leading indicator of AI impact. Our benchmarking effort will provide accountability for the results achieved by major labs, and clearly illustrate how fast the field is achieving new breakthroughs, as well as how novel capabilities relate to the scaling of training and inference compute.
Modeling the impact of AI
We are developing an economic model of AI automation and compute scaling. This model will provide insights into AI investment, training and inference computation, global economic output, and labor displacement. Combining growth theory with AI scaling laws, we aim for the model to become a standard tool of analysis that governments and academics can use to inform their thinking and decision-making, much as the DICE model has been for climate policy.
Why this is important: We perceive a gap at the intersection of AI and economics, where major economists haven’t yet engaged in depth with the consequences of advanced AI that could automate many current tasks. We think this model could substantially improve the discourse on the impact and stakes of AI by grounding research in a concrete model that can be used to explore disagreements and the effects of key modelling decisions.
While these are not the only outputs we will produce, they represent our current top priorities. The state of AI is in flux, and so these are subject to change as we adapt to the landscape. Besides this work, we plan to develop important organizational aspects.
Communications
We plan to improve the polish and frequency of our communications, continuing our weekly newsletter, and experimenting with weekly data insights and videos, among other formats. We also intend to significantly improve our website, making our content easier to find and engage with.
Hiring
To achieve our goals, we plan to gradually expand our capacity with top-end talent. This includes hiring an AI Data Lead, a research engineer and researchers for our benchmarking work, and leads for specific projects, as well as expanding our software, design, and operations team. Contingent on successful fundraising, we expect to have a core team of 27 full-time employees by the end of the year.
Feedback
We’re excited about improving how we engage our audiences. Initiatives we plan to experiment with include beta reader groups, better interfaces for receiving feedback on our website, and more extensive audience surveys to learn what we are doing well and how we can improve.
Partnerships
We plan to engage industry, government, and private foundations to fund us to continue our mission. We are particularly interested in partners with strong AI or computing expertise who are invested in our mission to inform the public, and new funding relationships that can help us diversify our support base and garner more diverse feedback. You can read more about our existing partners and support on our donation page.
Support our work
As our impact in 2024 has demonstrated—from FrontierMath’s adoption by leading labs to our data being cited by top tech leaders and policymakers—we’ve developed a proven framework for bringing clarity to AI’s rapid progress. Yet there is room to scale this impact: we aim to broaden our independent model evaluations, develop advanced new benchmarks, and expand our coverage beyond models and hardware to include AI clusters and companies. We also plan to move from occasional data releases to weekly insights, supported by larger, specialized teams boosting the quality and frequency of our research.
To realize these ambitions, we are raising $10M over the next two years. We’re particularly eager to form partnerships with organizations that can offer more than just funding—be they AI labs with deep technical expertise, computing companies experienced in large-scale infrastructure, or foundations aligned with our mission of public understanding. Our projects typically range from $100K to $1M, reflecting both smaller, focused efforts and major initiatives like new benchmark development. With increased support, we can meaningfully grow the breadth, depth, and impact of our work, ensuring the AI community has the independent analysis and reliable data it needs to navigate the field’s rapid evolution.
Whether you’re an individual donor or an institutional partner, we invite you to join us in advancing a shared vision of responsible and transparent AI. Your financial support—and your feedback—will help us align our work with the real needs of both the AI community and the broader public. To learn more about how you can contribute, please email donate@epoch.ai.