Overview

The Growth and AI Transition Endogenous (GATE) model is an integrated assessment model of the impact of AI development on the economy.

To accompany our paper describing the GATE model in detail, we have developed a playground that allows interested readers to modify parameter settings and observe the model’s behavior in a wide range of scenarios.

In this documentation, we describe the following:

  • How the GATE model is structured, implemented, and solved.
  • How to use the GATE playground and interpret its predictions.
  • The reasoning behind the playground’s default parameter settings.

If you would like to ask any questions or provide feedback about the GATE playground, you may contact us at info@epoch.ai. You can also read our accompanying blogpost for an overview of the key results suggested by the model.


Model structure

The core dynamic in GATE is an automation feedback loop: investments drive increases in the computation used to train and deploy increasingly capable AI systems, which leads to the gradual automation of tasks currently performed by humans. This in turn increases output, making additional resources available for further investment in AI development. The model consists of three modules, as described in the figure below:


How to interpret the GATE model’s predictions

To use the GATE playground most effectively, it’s important to understand how its predictions are meant to be interpreted, and what the model’s limitations are. Notably, the model’s predictions are not meant to represent Epoch AI’s forecasts of future AI developments and economic impacts. As with any economic model, the GATE model’s predictions are instead conditional forecasts, depending on a range of assumptions both in terms of specifications and in terms of parameter values.

In particular, GATE is most useful for analyzing the high-level qualitative dynamics of AI automation, assuming that AI capabilities improvements are solely driven by increases in physical computation and better algorithms. Thus, GATE can be used for deriving stylized facts about the economic impacts of AI automation – in contrast, its quantitative predictions are substantially more uncertain and unreliable.

Note that GATE’s predictions may be subject to optimization errors. GATE is a complex economic model with a large number of parameters, and its predictions can become unreliable in certain parameter ranges due to optimization issues. It is especially important to verify whether unintuitive results are simply artifacts of optimization problems. One way to check this is to slightly perturb the parameter settings and see whether the results change substantially. If you identify any such bugs, please email info@epoch.ai.
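The perturbation check described above can be sketched as follows. Since the playground itself is interactive, `run_model` here is a hypothetical stand-in for a single model run that returns some scalar output of interest (e.g., GWP at a given year); the toy model at the bottom is purely illustrative:

```python
import random

def perturbation_check(run_model, params, rel_eps=0.01, n_trials=5, seed=0):
    """Rerun the model with each parameter jittered by up to +/- rel_eps,
    and return the largest relative change in the output across trials."""
    rng = random.Random(seed)
    base = run_model(params)
    worst = 0.0
    for _ in range(n_trials):
        jittered = {k: v * (1 + rng.uniform(-rel_eps, rel_eps))
                    for k, v in params.items()}
        worst = max(worst, abs(run_model(jittered) - base) / abs(base))
    return worst

# Toy stand-in model: a smooth function of two "parameters".
# A well-behaved result should move roughly in proportion to the jitter;
# a much larger jump would suggest an optimization artifact.
change = perturbation_check(lambda p: p["a"] ** 2 + p["b"],
                            {"a": 2.0, "b": 1.0})
```

If a 1% jitter in the inputs produces a far larger swing in the output, the original result is more likely an optimization artifact than a genuine prediction of the model.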


FAQ

What is a FLOP?

“FLOP” stands for “FLoating-point OPeration”, a unit of computation commonly used when discussing the training or running of AI systems. Note that a FLOP count is not the same as FLOP/s (often written as FLOPS), which refers to the number of FLOP performed per second.
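As a concrete illustration of the distinction (the hardware figure is hypothetical):

```python
# FLOP counts total computation performed; FLOP/s (FLOPS) is a rate.
flop_per_second = 1e15          # hypothetical accelerator sustaining 1e15 FLOP/s
seconds_per_day = 24 * 60 * 60  # 86,400 seconds
total_flop = flop_per_second * seconds_per_day
# Sustained for one day, this rate amounts to 8.64e19 FLOP of total computation.
```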

What is effective compute?

“Effective compute” is a measure of the amount of computation available for training or running AI systems, adjusted to account for algorithmic progress. Due to algorithmic improvements, it becomes possible to do more with the same amount of computation, which “in effect” has the same impact as increasing the number of computations performed. We measure effective compute in units of effective FLOP (eFLOP), relative to the start of the simulation.
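A minimal sketch of the idea, with illustrative numbers (the multiplier here is hypothetical, not a model output):

```python
def effective_flop(physical_flop, algorithmic_multiplier):
    """Convert physical FLOP into effective FLOP (eFLOP).

    algorithmic_multiplier is how much more can be achieved with a fixed
    amount of compute than at the start of the simulation (1.0 at the start).
    """
    return physical_flop * algorithmic_multiplier

# With algorithms twice as efficient as at the simulation start,
# 1e24 physical FLOP does the work of 2e24 eFLOP.
doubled = effective_flop(1e24, 2.0)
```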

What is effective labor?

Effective labor is the aggregation of the labor contributions of AI systems and human workers. It roughly corresponds to the number of human workers that would be needed to produce the same output without AI. It is described in depth in the Production section.
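As a purely illustrative sketch of the human-equivalent interpretation (GATE’s actual aggregator is the one described in the Production section, not this simple sum):

```python
def effective_labor(human_workers, ai_human_equivalents):
    """Total labor in human-equivalent units: human workers plus the
    number of humans the deployed AI could replace at equal output."""
    return human_workers + ai_human_equivalents

# 100 human workers plus AI doing the work of 50 humans
total = effective_labor(100, 50)
```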

What if some tasks remain unautomated?

By default, we assume that all tasks are automatable given enough effective compute, although in principle some tasks may never be automated. The beliefs add-on can be used to specify scenarios where full automation is not achievable (e.g., by placing 100% probability on a maximum task automation fraction of 0.8 instead of 1). As long as the fraction of tasks that can be automated is sufficiently large, the model generally exhibits similar dynamics up to the point of full automation.

What are the most important parameters of the model?

In our experience with the model, the parameters that matter most are the AGI training requirements and the FLOP gap fraction, which jointly determine the mapping between effective compute and tasks automated, and the hardware and software returns to R&D, which control how costly it is to scale effective compute. In the future we plan to release a sensitivity analysis that further clarifies which parameters most influence the model’s outcomes.

How did you estimate the parameters for the default parameter preset?

Parameters commonly found in economic models (e.g., relative risk aversion) are estimated from the economics literature. AI-related parameters are estimated based on previous work, such as our work on estimating training compute, algorithmic progress, returns to software R&D, etc. The reasoning for the parameter estimates is provided in the accompanying technical paper.

What do the aggressive and conservative presets correspond to?

They correspond to example scenarios in which we have adjusted some of the model’s most important parameters to more extreme but still plausible values. In particular, we change the values of the AGI training requirements and the hardware and software R&D returns; the rest of the parameters keep their default values. These presets are meant to showcase the range of scenarios that GATE can illustrate.

What happens after full automation?

The main focus of GATE is the dynamics in the lead-up to full automation, and it is likely to make poor predictions about what happens close to and after full automation. For example, in the model the primary value of training compute lies in increasing the fraction of automated tasks, so once full automation is reached, the compute dedicated to training falls to zero. However, in reality there may be economically valuable tasks beyond those that humans are able to perform, for which training compute may continue to be useful.

Why don’t the model predictions for GWP growth, capital stock, or compute investment match the values of today?

The GATE model uses gradient descent to solve for the optimal allocation of investments and compute, in order to maximize a utility function. The predicted investments near the start of simulations can therefore be quite different from those observed today.

Does GATE take a stance on which tasks will be automated first?

In the model we consider an abstract task space comprising all economically useful tasks, both cognitive and physical. However, GATE does not take a stance on which tasks are likely to be automated first. Instead, it assumes only that there is a fixed spectrum of tasks requiring progressively more effective compute for training and inference to be automated. We do not consider bottlenecks inherent to specific tasks, such as physical tasks that require robotics.

Why is the initial fraction of automated tasks nonzero?

We do this so that the initial value of runtime compute is nonzero. Specifically, we calibrate the initial fraction based on the current share of GWP allocated to runtime compute, which results in approximately 10% of tasks being initially automated.

Acknowledgements

Roles and Contributions

Ege Erdil initiated the project, developed the early prototype, and played a central role in advancing the key theoretical and modeling ideas. Andrei Potlogea contributed significantly to the technical exposition and introduced refinements to the economic model. Tamay Besiroglu coordinated the project, contributed to the writing, and ensured alignment across modeling, engineering, and writing efforts. Anson Ho provided ongoing support throughout the project, including calibration of parameter values, general model refinement, and coordinating external feedback. Jaime Sevilla contributed to both the engineering and writing, ensuring coherence between the model’s implementation and its conceptual framework. Matthew Barnett contributed to the writing and parameter settings.

Engineering and Sandbox Development

Edu Roldan provided technical support on the development of the model. Matej Vrzala contributed to the design of and implemented the interactive sandbox. Andrew Souza supported the implementation of the interactive sandbox. Robert Sandler provided design support, contributing to the usability and presentation of the sandbox interface.

We are grateful to Tyler Cowen, Chad Jones, Ben Golub, Ryan Greenblatt, Kevin Kuruc, Caroline Falkman Olsson, Anton Korinek, Daniel Kokotajlo, Lev McKinney, Daan Jujin, Zachary Brown and Dan Valentine, as well as seminar attendees at the 15th Oxford workshop on Global Priorities Research for their insights and feedback.