What is a FLOP?
“FLOP” stands for “FLoating-point OPeration”, a unit of computation commonly used when talking about training or running AI systems. Note that this is not the same as FLOP/s (often written as FLOPS), which refers to the number of FLOP *per second*.
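The distinction is just a unit conversion: multiplying a sustained rate in FLOP/s by a duration in seconds gives a total in FLOP. A minimal sketch (the rate here is an arbitrary round number, not a specific accelerator):

```python
# Convert a sustained computation rate (FLOP/s) into a total amount of computation (FLOP).
rate_flop_per_s = 1e15        # an assumed sustained rate of 1 petaFLOP/s
seconds_per_day = 86_400      # seconds in one day
total_flop = rate_flop_per_s * seconds_per_day
print(f"{total_flop:.2e} FLOP per day")  # 8.64e+19 FLOP per day
```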
What is effective compute?
“Effective compute” is a measure of the amount of computation available for training or running AI systems, adjusted to account for algorithmic progress. In particular, due to algorithmic improvements, it becomes possible to do more with the same amount of computation, which “in effect” has the same impact as increasing the number of computations performed. We measure effective compute in units of effective FLOP (eFLOP), relative to the start of the simulation.
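As an arithmetic sketch (the numbers are invented illustrations, not GATE's parameter values): if algorithmic progress has made training 4x more efficient since the start of the simulation, a given amount of physical compute corresponds to 4x as much effective compute.

```python
# Illustrative numbers only; these are not GATE's estimates.
physical_flop = 1e25              # raw compute spent on training
algorithmic_multiplier = 4.0      # assumed efficiency gain from algorithmic progress
effective_flop = physical_flop * algorithmic_multiplier
print(f"{effective_flop:.1e} eFLOP")  # 4.0e+25 eFLOP
```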
What is effective labor?
Effective labor is the aggregation of the labor contributions of AI and human workers. It roughly corresponds to the number of human workers that would be needed to produce the same output without AI. It is described in depth in the Production section.
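One crude way to picture the idea, assuming AI and human labor are perfect substitutes (the aggregation GATE actually uses is the one described in the Production section, and the numbers below are invented):

```python
# Hypothetical illustration: express AI output in human-worker equivalents
# and aggregate, assuming perfect substitutability.
human_workers = 1_000_000
ai_human_equivalents = 250_000   # the number of human workers the AI output would replace
effective_labor = human_workers + ai_human_equivalents
print(effective_labor)  # 1250000 human-worker equivalents
```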
What if some tasks remain unautomated?
By default, we assume that all tasks are automatable given enough effective compute. The beliefs add-on can be used to specify scenarios where full automation is not achievable (e.g., by placing 100% probability on a maximum task automation fraction of 0.8, instead of a max fraction of 1). As long as the fraction of tasks that can be automated is sufficiently large, the model dynamics up to the point of full automation remain broadly similar.
What are the most important parameters of the model?
In our experience with the model, the parameters that matter most are the AGI training requirements and the FLOP gap fraction (which jointly determine the mapping between effective compute and the fraction of tasks automated), and the hardware and software returns to R&D (which control how costly it is to scale effective compute). In the future we plan to release a sensitivity analysis that further clarifies which parameters most influence the model outcomes.
How did you estimate the parameters for the default parameter preset?
Parameters commonly found in economic models (e.g., relative risk aversion) are estimated from the economics literature. AI-related parameters are estimated based on previous work, such as our work on estimating [training compute](https://epoch.ai/blog/training-compute-of-frontier-ai-models-grows-by-4-5x-per-year), [algorithmic progress](https://epoch.ai/blog/algorithmic-progress-in-language-models), [returns to software R&D](https://epoch.ai/blog/do-the-returns-to-software-rnd-point-towards-a-singularity), etc. The reasoning for the parameter estimates is provided in the [accompanying technical paper](https://arxiv.org/abs/2503.04941).
What do the aggressive and conservative presets correspond to?
They correspond to example scenarios in which we have adjusted some of the model's most important parameters to more extreme but still plausible values. In particular, we change the values of the AGI training requirements and the hardware and software R&D returns; the rest of the parameters keep their default values. These presets are meant to showcase the range of scenarios that GATE can illustrate.
What happens after full automation?
The main focus of GATE is on the dynamics in the lead-up to full automation, and it is likely to make poor predictions about what happens close to and after full automation. For example, in the model the primary value of training compute lies in increasing the fraction of automated tasks, so once full automation is reached the compute dedicated to training falls to zero. However, in reality there may be economically valuable tasks that go beyond those that humans are able to perform, and for which training compute may continue to be useful.
Why don’t the model predictions for GWP growth, capital stock, or compute investment match the values of today?
The GATE model uses gradient descent to solve for the optimal allocation of investments and compute, in order to maximize a utility function. The predicted investments near the start of simulations can therefore be quite different from those observed today.
Does GATE take a stance on which tasks will be automated first?
In the model we consider an abstract task space comprising all economically useful tasks, both cognitive and physical. However, GATE does not take a stance on which tasks are likely to be automated first. Instead, it only assumes that there is a fixed spectrum of tasks that require progressively more effective compute at training and inference to be automated. We do not model bottlenecks inherent to specific tasks, such as physical tasks that require robotics.
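The fixed-spectrum assumption can be sketched as follows (the thresholds below are an invented geometric sequence, not GATE's calibration): each task has an effective-compute threshold, and the automated fraction is simply the share of tasks whose threshold has been crossed.

```python
# Hypothetical sketch: 100 tasks whose automation thresholds (in eFLOP)
# are spread geometrically from 1e24 to 1e36. All numbers are invented.
num_tasks = 100
thresholds = [10 ** (24 + 12 * i / (num_tasks - 1)) for i in range(num_tasks)]

def automated_fraction(effective_compute: float) -> float:
    """Share of tasks whose automation threshold the available effective compute meets."""
    return sum(t <= effective_compute for t in thresholds) / num_tasks

print(automated_fraction(1e30))  # 0.5
```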
Why is the initial fraction of automated tasks nonzero?
A nonzero initial fraction ensures that the initial value of runtime compute is nonzero. Rather than picking this fraction arbitrarily, we calibrate it based on the current share of GWP that is allocated to runtime compute, which results in approximately 10% of tasks being initially automated.