Stanford economist Phil Trammell has been thinking rigorously about the intersection of economic theory and AI (including AGI) for over five years, long before the recent surge of interest in large language models.

In this episode of Epoch After Hours, Phil Trammell and Epoch AI researcher Anson Ho discuss what economic theory really has to say about the development and impacts of AGI: what current economic models get wrong, the odds of explosive economic growth, what “GDP” actually measures, and much more!

– Timestamps –
00:00 – Problems with existing work on the economics of AI
10:18 – Declining returns to R&D
18:28 – What real GDP misses
26:57 – Task-based models & AI automation
49:32 – The limits of economic theory
01:09:11 – How to detect an economic singularity
01:23:32 – Increasing returns to scale

Key takeaways

The economics of AGI literature is too heavily anchored on one specific growth model

Phil thinks there are three big problems with existing work on the economics of AGI and explosive growth.

First, not enough people have been thinking about it at all – this is the biggest issue.

Second, prior work has anchored too heavily on the family of “semi-endogenous growth” models. Different models make drastically different predictions about AGI’s impact on economic growth, and it’s worth looking at these more.

For example, the most famous semi-endogenous model is the Jones model, and it says that the primary constraint on growth is the number of researchers. So if you could automate R&D and dramatically scale up the number of AI researchers, this model predicts that the rate of technological progress would also increase dramatically.
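
For reference, the Jones law of motion for technology is standardly written as follows, where A is the technology level, L is the number of researchers, λ is a penalty on parallelizing research effort, and φ < 1 captures ideas getting harder to find:

```latex
\dot{A} = \delta\, L^{\lambda} A^{\phi}, \qquad 0 < \lambda \le 1, \quad \phi < 1
```

Because the right-hand side rises without bound in L, anything that scales up the effective number of researchers – such as automating R&D – scales up the rate of technological progress with it.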

In contrast, a Schumpeterian model says that the main constraint on growth is that innovators need a temporary monopoly on what they’re producing. Specifically, new innovations need to come sparsely enough for it to be worth the costs of developing an innovation. In many such models, automating R&D doesn’t do much at all!

Taking such models literally is of course going too far, but on the margin, Phil thinks it’s worth considering other models:

“I think the Jones story is closer to right if you have to pick one. But there’s been so much work anchored on the Jones model that I think taking a look at other approaches would be informative.”

Third, semi-endogenous growth models might mislead us about AGI impacts. For example, Phil points out that the Jones model doesn’t include a hard limit on how much research can be parallelized at a given time. So if you scale up the number of researchers really fast, the model predicts that the rate of technological progress goes to infinity practically immediately. But he argues that this is intuitively wrong:

“If you think about an assembly line, bringing in loads of engineers and management scientists to speed up its production, […] as the number of engineers goes to infinity, the rate of progress that they can make in five minutes just hits a ceiling.”

When it comes to AI progress, this matters because the Jones model might make bad predictions if AI R&D is automated, and the amount of research input increases substantially.
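
To see the worry concretely, here is a toy numeric sketch (our illustration, not a model from the episode): the Jones-style progress rate grows without bound in the number of researchers, while a hypothetical variant with a hard parallelization ceiling saturates.

```python
# Toy comparison (illustrative only): Jones-style research output vs. a
# hypothetical variant with a hard ceiling on how much progress can be made
# per unit time, no matter how many researchers you add.
DELTA, LAM = 1.0, 0.5  # hypothetical productivity constant and parallelization penalty
CEILING = 100.0        # hypothetical maximum progress rate per unit time

def jones_rate(researchers: float) -> float:
    """Instantaneous progress rate in the Jones model: unbounded in researchers."""
    return DELTA * researchers ** LAM

def capped_rate(researchers: float) -> float:
    """Tracks the Jones rate at small scale but saturates at CEILING."""
    r = jones_rate(researchers)
    return CEILING * r / (CEILING + r)

for n in [1e2, 1e4, 1e6, 1e8]:
    print(f"{n:>12,.0f} researchers: Jones rate {jones_rate(n):>9,.1f}, "
          f"capped rate {capped_rate(n):>5.1f}")
```

In the capped variant, the first few orders of magnitude of scale-up still help a lot; past that, extra researchers buy almost nothing per unit of time.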

Economic theory can help point out where intuitions about AGI impacts are wrong, but theory alone isn’t enough

What does economic theory have to say about the impacts of AGI? After studying this for several years, Phil has come to the following conclusion:

“This is a pessimistic view, but a view I’ve somewhat come around to is that economic theory is primarily a destructive project. […] if it’s good for anything, it’s most useful for pointing out that intuitions that you might have had don’t actually hold in general.”

For example, a common intuition is that as goods become more expensive, people consume them less. But economic theory demonstrates that this intuition isn’t necessarily right, in a way that we observe in practice – there are “Giffen goods” that are consumed more as their prices rise.

The hope is that economic theory can play a similar role when it comes to the impacts of AGI. For instance, what does economic theory imply about the chances of >30% per year GDP growth, or what happens to human wages if AGI is developed? Often the conclusions are ambiguous or contradict our intuitions. So perhaps we should be a lot more uncertain about AI’s economic implications.

Phil also emphasized that theory alone isn’t enough. In his words:

“To understand the economic implications of AGI we need more than pure theory – we also want data to track progress and ground economic models.”

An example of how this could work is in so-called “task-based models” of AI automation, which view the economy as a set of tasks that may or may not be automated. As better AI systems are developed, the fraction of automated tasks increases, and “AGI” is achieved when all the tasks are automated.

Under this framing, it’s crucial to know how the fraction of automated tasks evolves over time. But how might we know this? Some models like GATE determine this by hypothesizing a map between AI training compute and the task automation fraction. But this depends on highly speculative estimates of the training compute needed to get to AGI (and hence a task automation fraction of 1).
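
As a minimal sketch of the kind of compute-to-automation map being described (the log-linear form and every number below are placeholder assumptions for illustration, not GATE’s actual calibration):

```python
import math

# Placeholder anchors (speculative assumptions): training compute at ~0% task
# automation and at full automation ("AGI").
C_LOW = 1e25  # roughly frontier-scale training compute today (assumption)
C_AGI = 1e29  # hypothetical compute needed to automate every task

def automation_fraction(compute: float) -> float:
    """Log-linear interpolation between the two anchors, clipped to [0, 1]."""
    f = (math.log10(compute) - math.log10(C_LOW)) / (math.log10(C_AGI) - math.log10(C_LOW))
    return min(max(f, 0.0), 1.0)

for c in [1e25, 1e26, 1e27, 1e28, 1e29]:
    print(f"{c:.0e} FLOP -> {automation_fraction(c):.0%} of tasks automated")
```

The whole mapping hinges on the C_AGI anchor, which is exactly the highly speculative estimate at issue.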

Phil thinks there are better approaches. While there’s limited data today, as AI progress continues we’ll have more data to work with, across different dimensions. We could infer the task automation fraction from labor statistics databases. We could also observe the trend in the length of tasks that AIs can do – AIs might perform tasks that take humans longer and longer, until they can do any task that a human can. Or we could look at equivalents of human ability, like when AI systems reach “pre-school” level, then “smart high-schooler” level, eventually matching the smartest humans.

Extrapolating these trends helps us anticipate how the task automation fraction might progress over time. Whether or not you agree with any particular approach, it’s still useful to consider what each says about the task automation fraction – it’d certainly be a step up compared to what we have today.

The economic indicators don’t point to an approaching economic singularity (yet)

Are we approaching an economic singularity?

Several years ago, Nobel-Prize-winning economist William Nordhaus attempted to answer this question. His conclusion was a resounding no – if we were approaching a singularity, we should observe it in the macroeconomic statistics. But we don’t!

However, while Phil thinks that Nordhaus’ work was a great theoretical insight, there are two issues with it. The first is that Nordhaus’ most updated numbers were from 2021, which in AI terms is ancient history – ChatGPT wasn’t even a thing yet! So Phil updated Nordhaus’ old analysis with new data, and found essentially no change.

The second issue is perhaps more worrisome. The macroeconomic variables that Nordhaus analyzed are lagging indicators of the singularity – they only tell you when an economic singularity is already underway. Phil likens this to “being a weather forecaster and just looking out the window and saying whether it’s raining.”

What we really want are indicators that give us advance notice of the singularity, and to this end, Phil proposes looking at the “network-adjusted capital share”.

“The network-adjusted capital share asks: for every dollar of revenue spent on that good, if you trace it all the way back down the supply chain, how much of it is paid out in exchange for value added by capital, as opposed to value added by labor or taxes?”

For instance, suppose you paid $5 for a coffee at Starbucks. Some of that money goes to workers, some to Starbucks owners, and some to suppliers. But those suppliers also pay their own workers and owners, who pay their suppliers, and so on down the chain. Then the “network-adjusted capital share” tells you what fraction of the $5 ended up as payment to capital owners if you trace things down the supply chain.
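
A minimal sketch of the computation on a toy three-industry economy (every number below is hypothetical; real estimates would be built from the BEA and OECD input-output tables Phil mentions). Tracing each dollar of spending all the way back down the supply chain amounts to applying the Leontief inverse of the input-output matrix:

```python
import numpy as np

# Toy input-output economy with 3 industries (all numbers hypothetical).
# A[i, j] = dollars of industry i's output used per dollar of industry j's output.
A = np.array([
    [0.10, 0.30, 0.05],
    [0.05, 0.10, 0.40],
    [0.10, 0.20, 0.05],
])

# Direct value-added shares per dollar of each industry's output.
capital_share = np.array([0.30, 0.25, 0.20])
labor_share   = np.array([0.40, 0.10, 0.25])
tax_share     = 1.0 - capital_share - labor_share - A.sum(axis=0)

# Tracing a dollar back through all rounds of the supply chain is the
# Leontief inverse: total requirements = (I - A)^-1.
leontief = np.linalg.inv(np.eye(3) - A)

network_capital_share = capital_share @ leontief
network_labor_share   = labor_share @ leontief
network_tax_share     = tax_share @ leontief

print(network_capital_share)  # capital's share of each dollar, all the way down
# Sanity check: traced back far enough, every dollar is capital, labor, or taxes.
print(network_capital_share + network_labor_share + network_tax_share)  # ~[1 1 1]
```

The sanity check at the end reflects the accounting identity in Phil’s definition: followed down the whole chain, every dollar of revenue ends up paid out for value added by capital, labor, or taxes.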

This becomes a leading indicator if we home in on a good that can later drive the growth of everything else. For example, one might expect fully automated semiconductor production to drive the intelligence explosion, and then the automation of other tasks. If so, a rise in the network-adjusted capital share for semiconductors serves as a leading indicator of the full intelligence explosion.

So Phil used data from the US Bureau of Economic Analysis and the OECD, and looked at the network-adjusted capital share for different goods over time. Here’s what he found:

“It turns out they’re all basically as flat as a board. For semiconductors, […] it’s 50-50. It’s sort of always been 50-50. So there you go. So you got one more little data point that maybe the singularity is not near, at least if it’s going to be an industrially intense singularity early on.”

Increasing returns to scale will probably hold in the long run – with huge implications

A key question for the plausibility of explosive economic growth is whether we’ll see increasing returns to scale in production. This means that if you double all your inputs (e.g. workers, factories, technology), you get more than double the output (e.g. world GDP).
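
A textbook way to formalize this (a standard Cobb-Douglas illustration, not a calculation from the episode): write output as a function of capital K and labor L, with exponents summing to more than one.

```latex
Y = A K^{\alpha} L^{\beta}, \qquad \alpha + \beta > 1
\;\Longrightarrow\; Y(2K, 2L) = 2^{\alpha + \beta}\, Y(K, L) > 2\, Y(K, L)
```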

Right now, the human workforce isn’t very scalable because people reproduce slowly, and this limits growth. But AIs can be scaled much more quickly, by simply running more AIs on more chips. AIs can also scale along different dimensions, such as by sharing knowledge between different systems, and by developing much smarter AIs trained on far more data and parameters (“Jupiter brains”). In a world with increasing returns to scale, this scalability would make explosive growth substantially more likely.

So are there increasing returns to scale? One view is that this only exists to a mild extent at best. For example, one piece of evidence that suggests there are increasing returns to scale is international trade, but Phil points out that the gains here are pretty small:

“It’s sort of funny—economists are associated with the idea that tariffs are really horrible and that there are all these gains from trade. Implicitly that’s an argument about increasing returns to scale. Because if there were constant returns to scale, then why bother trading between countries? Each country can just chop the land in half and produce half as much on its own.

That is in fact what is typically estimated. The gains from trade between the US and Canada, it’s maybe a few percent of GDP. Which is a lot in the grand scheme, but it’s not that much. […] But in the long run, standard models and estimates predict that the effect should be small.”

But Phil thinks that this misses the full picture. In particular, if we look at smaller physical scales, ranging from individual self-sufficiency to agglomeration in modern small cities, we see pretty clear increasing returns to scale. And he thinks that it’s suspicious to think that the returns just happen to level off at the international scale.

Instead, he thinks the small gains currently observed in international trade are due to a different reason – gains from trade are really due to the benefits of specialization, but society hasn’t yet fully made use of its capacity for specialization at the global scale. As populations grow and more advanced technologies are developed, it takes time for workers to specialize in new tasks, fully exploiting the gains from trade that make increasing returns possible. Without enough time, the gains can look smaller than they actually are.

“It seems a little suspicious to me that in the whole space of ways to arrange matter and energy to produce things of value, you get strong benefits from specialization up to the size of a small modern city in 2025 and then no gains at all after that.

I would guess that with enough time, if we just stagnated at current population sizes, we would develop more and more specializations and we would continue to urbanize globally—not just increase the fraction of people in cities but the fraction of people in the biggest cities. In time we would extract ever more benefits of specialization. But what has happened over the last century is that populations have grown really quickly and urbanization has grown really quickly. That has created this big overhang, if you like, of potential specialization to be exploited.”

And since AI systems can be scaled much faster than human populations, they could exploit this overhang of unrealized specialization. So in the long run, increasing returns to scale seem more likely than existing estimates might suggest – which in turn makes explosive growth more plausible than it would otherwise be.

In this podcast

Anson Ho is a researcher at Epoch AI. He is interested in helping develop a more rigorous understanding of future developments in AI and its societal impacts.
Phil Trammell is a postdoc at Stanford University's Digital Economy Lab, working with Erik Brynjolfsson and Chad Jones on questions related to economic growth and AI.

Transcript

Introduction [00:00:00]

Phil

Here’s a fun fact. This is crazy. So real GDP is not a quantity. As new goods get introduced, they can cause the relationship between our wealth and our willingness to sacrifice consumption for safety to go the other way, making us willing to sacrifice less consumption for safety because the consumption tastes so damn good. This is a pessimistic view, but economic theory is primarily a destructive project.

Anson

How do we detect the economic singularity?

Phil

Yeah, that’s a big one.

Anson

Hello, my name is Anson. I am a researcher at Epoch AI, and today I have the pleasure of speaking with Phil Trammell. Phil is an economist at Stanford University, and he has thought a lot about scenarios involving economic theory and AGI, and knows a lot about different kinds of thought experiments relating to this topic. I thought it’d be really great to have you on. Welcome, Phil.

Phil

Thank you, Anson.

Issues with Current Economic Growth Models [00:01:01]

Anson

Let’s start with explosive growth. Most discussions about AI and explosive growth so far have focused on one particular family of economic growth models, which is the semi-endogenous family. What do you think about that? What are some issues with how most people have been thinking about AI and explosive growth from an economic perspective?

Phil

I think the biggest issue is that not enough people are thinking about it at all. There’s been a lot of anchoring on a particular semi-endogenous growth model, the Jones model. In the Jones model, the constraint on growth, particularly on technology growth, is just a lack of research inputs. You have researchers doing R&D. They develop better technology and that speeds growth. The only reason why we don’t have really fast R&D already is because we only have so many people doing it. So if we automated it, if we created lots of AIs, lots of robots doing R&D, we would have really fast growth.

One issue is that might not be the whole story. It could be that a Schumpeterian model is closer to correct in some contexts. This is a model in which the main thing constraining growth is that every innovator needs a patent or a trade secret. They need a temporary monopoly on what they’re producing in order to justify the cost of developing the innovation. So on those views, growth has to proceed slowly enough. The sequence of new innovations has to come sparsely enough for it to be worth doing any innovation at all. I think we see this potential worry in how the LLM market has unfolded, where because each one has so little moat, they keep pouring money into making better models, but very quickly someone makes an even better one and steals the lead and everyone quickly switches over to the competitor. So there have been worries about how sustainable this is.

Phil

In these models, R&D is, for simplicity, typically presented as being just a function of capital expenditure. So if you ask, taking the model literally, what happens if you automate R&D? Well, nothing because it’s already all about the lab equipment. That’s going too extreme. I think the Jones story is closer to right if you have to pick one. But there’s been so much work anchored on the Jones model that I think taking a look at other approaches would be informative.

Given the Jones model, or a model in that class, I think we are led astray by a few things. Let’s say a bit about two in particular. One is that in the Jones model, there’s not a hard limit on how much research can be parallelized at a given time. So if you scale up the number of researchers really fast, the rate of technological progress goes to infinity with the research inputs. Maybe not one for one. Maybe the rate of progress is like the square root of the inputs, but still a sudden really big surge in the inputs means that you make really fast technological progress. I think that just can’t be right. If you think about an assembly line, bringing in loads of engineers and management scientists to speed up its production, the more you stuff in, the faster progress they’ll make. But as the number of engineers goes to infinity, the rate of progress that they can make in five minutes just hits a ceiling.

Limits of Parallelization in Research [00:05:31]

Anson

Another example here would be that no matter how many instances of GPT-3 OpenAI is running, they can’t just arbitrarily push up the rate of algorithmic progress by running a trillion instances of GPT-3.

Phil

That’s right. And importantly, no LLM can single-handedly carry on AI research at all. With GPT-3 alone, they can’t really make any progress. You need people and GPT-3, or even just people. So I think it’s a bit of a different case, but it’s another example of gross complementarity.

Anson

So it would be that even if we had GPT-3 plus lots of humans, if we scaled that up arbitrarily, we still wouldn’t be able to just get arbitrary algorithmic progress.

Phil

Yeah, that would be the tighter analogy.

Anson

Should we just totally scrap this Jones law of motion then? Is this law of motion fundamentally going to be way too aggressive?

Phil

No. It’s very simple, and for many purposes I think it makes reasonable predictions, so we shouldn’t always scrap it. And to the extent that it leads us astray, it could lead us astray in either direction. I think this is a common pattern. When you think of a bottleneck that models haven’t been incorporating before, on the one hand, the lesson could be “Oh, actually, growth won’t be as explosive as we thought, because once we relieve some other bottleneck, we’ll just slam into this new one we’d forgotten about.” But on the other hand, to the extent that we’ve been constrained all along by the bottleneck in question without realizing it, it’s a new margin on which we could get more growth than we had been anticipating, because the bottleneck could start getting relieved more quickly than before.

So in this case, the reason why scaling up the number of people and computers working on algorithmic progress wouldn’t make progress go to infinity is the high latency that different human brains have between each other. It’s hard to efficiently divide a big project among lots of different people who only have faint understandings of what part they are playing in the big orchestration.

Phil

We’ve been getting better over time at collaborating. The internet allowed for collaboration between people with complementary skills in very different parts of the world. If you go back far enough, the invention of language and writing obviously lowered latency in some sense. And that’s been furthering growth all along. But the rate at which that bottleneck is being relieved could rise pretty radically in the event of neural nets, virtual brains, that are themselves getting bigger and bigger. So instead of having one, two, three, four employees, you have fully integrated minds that are two, three, four times bigger. In that case, assuming a constant penalty for parallelization (a constant exponent on the research inputs in the Jones model) would be leading us astray in the pessimistic direction.

If you want to think through the problem carefully, we probably shouldn’t be using the Jones model because it doesn’t even work well as a bound. It could be leading us astray in one direction or the other. But if you want simplicity and if you’re not going to be extrapolating too far out of sample, then it’s still simple and insightful.

Are Ideas Getting Harder to Find? [00:10:07]

Anson

The tricky thing for us is that we’re thinking about AI, and that almost necessarily entails extrapolating a lot of things pretty far outside the regimes that have been tested. So one question that is pertinent to models like Tom Davidson’s takeoff model and the GATE model is that we might get declining returns to R&D over time as we make further technological progress. I know that you’ve been thinking a little bit about this. How fast has this decline been? What do we know about this question?

Phil

I think you’re referring to this point that ideas get harder to find. And the rate at which ideas are getting harder to find might itself change over time. That’s something else that the Jones model doesn’t feature. I think that’s a much smaller weakness than this parallelizability point, but it’s still a strong assumption and probably an incorrect assumption.

Something I used to think was that in the grand scheme of things, the rate at which ideas were getting harder to find was itself growing, and that you could see that by comparing the function from growth in research inputs to growth in productivity in the distant past versus the more recent past. There’s this famous Kremer paper from 1993 estimating that if you look over the million years of the Malthusian past, basically ideas weren’t getting harder to find at all. Every time you double the population, you double the rate at which people come up with proportional productivity improvements. The data is really sketchy, obviously, but I think even after accounting for that, it seems pretty robust that this “ideas getting harder to find” effect does not appear very strongly in the distant past, if anything like Kremer’s data are to be accepted.

By contrast, this well-known paper, “Are Ideas Getting Harder to Find?” from 2020 finds that over the last century you have to be growing the research inputs three times faster than your target rate of growth in productivity.
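
In Jones-model notation, that finding says that to hold productivity growth steady, research inputs must grow roughly three times as fast:

```latex
g_A \approx \tfrac{1}{3}\, g_{L}
```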

Phil

So I thought “Okay, well, so that’s a big change.” And if you project it forward, then the rate at which ideas get harder to find will keep on rising until eventually we just hit the ceiling of technological maturity, or stop even before that, if there is some technology level that is feasible but that we never really get to because we slow down before we’re anywhere close. Thinking about it more, I think that’s probably not right.

The big difference between that analysis of the distant past and the analysis in the “Are Ideas Getting Harder to Find?” paper is that in the past, they’re looking at how quickly we can create more copies of the same object, namely the human body. Modern growth is mainly not about that, but about creating new kinds of goods and higher quality goods. Another difference is that modern growth is often driven by profit-seeking R&D, which can be constrained by that creative destruction issue I was getting at before.

Phil

The issue here would basically be if a few hundred years ago, we come up with the idea of patents and we have an entrepreneurial culture take hold where people are trying to invent new products, new production processes and capitalize on that. That’s going to really boost growth at the time, because it’s something people could have been doing all along that they weren’t, and now they’re doing it. But that process is capped by something different from the old Malthusian, purely resource-constrained thing driving growth. So you get this burst of extra growth that starts picking up in the Industrial Revolution. That adds a lot. But if it itself starts slowing down for whatever reason, that’s unrelated to the rate at which ideas are getting harder to find. Then in a model which only has room for ideas getting harder to find, you’ll pick that up as ideas getting harder to find.

There was a paper actually by James Bessen and a co-author recently which makes this claim that ideas haven’t even been getting harder to find over the last century, properly understood, but that R&D has been contributing ever less per unit of R&D expenditures or R&D effort. It’s been contributing ever less to productivity growth because for whatever reason, it’s unfolded in a direction where new R&D efforts have ever more been cannibalizing past ones, which both discourages investment to begin with and itself means that you’re adding less to output.

Phil

If the big new thing is that you come up with Amazon, and that makes you a multi-billionaire, so it warrants billions of dollars of investment in getting all the logistics right, but actually without you everyone would just be shopping at Walmart and the consumer surplus would be just a penny less, then you’re not actually increasing output very much. You’re just transferring it by winning the race a second faster. But both parties still run almost the entire race. That’s been happening all along. But that kind of thing, according to Bessen, is happening more now. So if you’re just looking at a time series of R&D inputs to productivity growth over time, you’ll think “Oh, ideas are getting harder to find,” but that’s not really what’s going on.

Now, I don’t know if that’s entirely right. But it’s another example of how I think you’re really comparing apples and oranges when you look at modern growth, where it’s all about the intensive margin. More and different kinds of things per person than the old Malthusian regime. So I don’t think you can infer that the rate at which ideas are getting harder to find is itself rising.

The Problem with GDP and New Products [00:17:56]

Anson

So the takeaway is we would like to be able to answer this question, but the data is not really comparable between different regimes, both on the R&D input and the R&D output side. There is also this omitted variable bias kind of issue that makes it not directly comparable. That’s unfortunate because we would like to know what these bottlenecks are for hardware efficiency and software efficiency. Changing tacks a little bit. You’ve written a paper about how new products and new varieties of goods are an important consideration when we’re thinking about the real impacts of economic growth. What exactly is missing from the current way that we think about economic growth?

Phil

The problem is that we try to flatten growth in the goods and services we enjoy into a single variable, and put a number to how much that variable is growing. I’m not making a standard critique of GDP here. I mean, it’s somewhat standard, but I’m not making the particular standard critique that it leaves out all kinds of things, sources of value other than consumption, like leisure or friendship. That’s true. But what I’m saying is that even when it comes to consumption, the part that’s just the knickknacks that we consume in a very concrete way and enjoy, you just can’t flatten things into this single dimension.

I think a really simple way to see that is to just say, okay, let’s say we only had one consumption good: horse. So we’re in the Golden Horde, one of the empires that formed when Genghis Khan’s empire split up. They got a lot of use out of their horses. They could ride them, but they could also use their skin and their milk. Let’s say they could only consume horses. The quality of the consumption basket that they got to enjoy rises as the number of horses rises, like you’d rather have five horses than one, but it only rises toward a ceiling, and that ceiling is below what a typical modern middle-class American gets to enjoy.

Phil

So whatever number you put to consumption or GDP in the horse economy: if your index is going to be homogeneous of degree one, meaning it has the natural feature that if you double all the stuff in it, you double the number you assign to consumption, then going from five horses to ten horses doubles consumption. Whatever number you put on how much consumption five horses count for, as horses go to infinity, consumption is going to rise to infinity. But the quality of the consumption basket is still going to be below what’s presumably the finite consumption basket that the middle-class American gets.
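
To put the horse example in symbols (a sketch of the argument, with h the number of horses): degree-one homogeneity forces measured consumption to scale linearly in h, even though the quality of the basket is bounded.

```latex
C(h) = h \cdot C(1) \xrightarrow[\;h \to \infty\;]{} \infty,
\qquad
u(h) \uparrow \bar{u} < u_{\text{modern}}
```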

Anson

The issue is that no matter how many loaves of bread I have in 1800, it doesn’t help, because I would swap no number of loaves of bread for my modern-day laptop.

Phil

Yeah, I mean, at least it’s totally possible to have that preference. I think that’s a common preference.

Implications for Existential Risk and Growth [00:21:42]

Anson

There was this previous paper that I think you were involved in together with Leopold Aschenbrenner on whether or not we could try to speed through the time of perils, relating existential risk and growth. What does this imply for that?

Phil

Yeah, this paper and some recent papers by Chad Jones are all centered around this calculation of how much consumption we might be willing to sacrifice for safety over time as we get richer. The assumption is that as we get richer, our marginal utility in consumption falls because we’ve got these concave utility functions. We have more to lose if we die because we’re doing better and better. So for both reasons, we’ll be willing to sacrifice more consumption for safety over time. If that effect is strong enough, it could mean that we’re willing to tolerate really rapid AI development, which could get out of hand and be disruptive or even existentially catastrophic. Right now we’re willing to take that risk, but in the future, we won’t be, at least if we survive to a future where we’re rich enough to want to make the tradeoff the other way.

I think that argument probably has a fair bit of truth to it. It’s not just a theoretical point that as we get richer, we get willing to sacrifice a lot of consumption for safety. We see it in medicine being a luxury good. As people get richer and as societies get richer, we spend not just more on medicine, but a larger fraction of our incomes on medicine. On the other hand, these trends can reverse. Healthcare spending as a fraction of GDP actually has fallen a bit in the US in recent years and in other advanced countries. It’s always very easy to predict a trend correctly right before it turns around for one reason or another.

But the deeper issue with putting too much stock in this argument is that, as we were just saying, in principle, as new goods get introduced they can cause the relationship between our wealth broadly construed and our willingness to sacrifice consumption for safety to go the other way. New products can start making us willing to sacrifice less consumption for safety because the consumption tastes so damn good. Now again, on balance historically, I think this effect has tended not to win out. But if AI nudges the direction of technological development so that we’re getting more copies of the old things and so we’re satiating in those, but we’re inventing these new things that are really desirable, because they’re like immortality pills or they’re wires to the head that just give you mind-blowing joy, then it’s totally consistent with a reasonable-looking utility function to be like the rat pushing the heroin button and getting more expected utility out of a path that involves more short-term pleasure, but a lot of it, and a higher chance of disaster than the safer path. At least for a non-negligible discount rate.

I hope that doesn’t happen. I hope that growth is more in this satiating, safety-promoting direction in the near future because I care about the distant future and getting to it. But it’s at least a logical possibility that it goes the other way. I think we can be led astray, and I certainly have been led astray by doubling down too hard on a simple one-dimensional model of things in which basically all the automation or AI can do is speed us up or slow us down on some trajectory that’s already set in stone. There’s lots of different dimensions of technological development and paths we could go down. And by speeding up one a bit more than the others relative to how fast they’ve been proceeding relative to each other in the past, a lot of our intuitions might break.

The GATE Model and Task Automation [00:26:57]

Anson

Moving on to the GATE model. A core part of this model is the amount of production that you get is determined strongly by the fraction of tasks that are automated. So there’s this question about whether or not this framing even makes sense. I’m curious what you think about that. Does the fraction of tasks as the framing for these growth models hold much water or doesn’t it really make that much sense?

Phil

Yeah, I think qualitatively they can be pretty insightful. I don’t want to say the whole approach is worthless or anything like that, but I don’t currently think they’re as useful as I used to think. For starters, it’s telling that the very first task-based model in a form anything like we know them today was Zeira’s in 1998, and it came out right when the O*NET task database of US occupations was first put out in 1997 by the Bureau of Labor Statistics. People collected all this data thinking “Might as well, it could be useful for something.” And people came up with a model that makes use of this data. If we collected other data, we probably would have come up with some other model. I don’t think there’s a deep fact that work subdivides into these nuggets that we call tasks and that we can think about plowing through one by one and seeing what happens to growth.

The Email and Zoom Call Example [00:28:52]

Phil

A simple example of how that can go wrong was explained to me by Pamela Mishkin at OpenAI, who was one of the co-authors on the paper that makes heavy use of a task-based model: “GPTs are GPTs”. Daron Acemoglu wrote a paper making use of their estimates of automatability by LLMs, by task. It seems like this really rich resource. I and some other people at Stanford were trying to follow up on that. I talked to Pamela and she was like “Oh, this whole framework doesn’t really make much sense anyway.” I didn’t get it at first, so we had a Zoom call and this was her example.

Let’s say I’ve got two tasks. Task A: write an email to Phil explaining the limitations of the task-based model. Task B: have a Zoom call with Phil going through the limitations of the task-based model in more detail. If all she had to do was task A, maybe an LLM could have done it better and faster. But given that task B is coming, she felt that if the LLM can’t do task B, have the Zoom call, it would take her more time to read and internalize and remember the particular choices of terminology that the LLM would produce in the email, so that we can discuss it on common terms over the Zoom call.

Instead of having to keep going back and saying, “Wait, what’s Phil getting at? Oh yeah, the LLM used this term for something.” That would take more time. So given that task B is coming up, the most efficient workflow is for her to be the one that writes the email.

A task-by-task evaluation of whether an LLM can do each task would predict that if someone’s job consists of task A and task B, and we introduce an LLM that was found able to do task A, their productivity would double: they could just do task B all day and get twice as much of it done. Maybe half of them would be fired; it depends on the effect. But actually it would have no effect at all in this case. Everyone would just keep on doing task A themselves.

Three Biases in Growth Projections [00:32:07]

Phil

If these sorts of effects are significant enough, I think that means that our projections from task-based models could be wrong in at least three big ways. One is that the growth impacts of automating technologies, including LLMs or AI in general, could be delayed relative to what you would have thought. Because it’s got to automate A and B before you see any of the effects. Second bias is that when they come, they’ll be more sudden than you would have thought.

The third, more subtly, gets back to this latency thing. This is what I’m thinking about all the time now. The message of Pamela’s example is that there’s a kind of increasing returns to scale and that it’s more productive to have the same mind doing tasks A and B. That might outweigh the fact that if you tried to outsource task A in isolation, this alternative system could way more efficiently or quickly perform task A. Despite that, there’s a positive spillover where doing task A makes the factor of production, in this case her body and brain, more productive at task B, and that positive spillover is so big that it totally displaces the use of the LLM at task A.

We can currently take advantage of these positive spillovers or these cases of learning by doing from concrete task to concrete task. We can take advantage of them up to the size of a given human’s capacity for work. Some kinds of work maybe can be subdivided. Is there something you can just take out of your day and ask someone else to do? But not that much.

What that’s telling us is that we have these spillovers where it’s really helpful to have the same mind doing all the different things. We can take advantage of the increasing returns that this allows for, but only up to the size of what a single person can do in a day or in a year or in a life. With bigger systems, really big neural nets that can do a lot of things simultaneously, have really big memories, and can use the learning they got from writing that email to Joe six years ago on the other side of the world to inform the Zoom call they have with Phil about task-based models tomorrow, you would expect to see even more growth than you would predict from a naive task-based model, where the best you can do is just automate all the tasks one by one and get a world in which every mind was still the same size.

Level of Abstraction in Task Categorization [00:35:27]

Anson

Is this not a question of whether the tasks in O*NET happen to be at the right level of abstraction? Couldn’t I make an argument that, for example, if we look at which occupations existed before the Industrial Revolution and compare them to the tasks of today, describing a large fraction of those jobs using the precursor to O*NET wouldn’t have been so bad as to be totally useless? You’re not saying that this is totally useless, but is this not just a question of O*NET being at the wrong level of abstraction?

Phil

I see what you’re saying. If we’re talking about GATE in particular, one thing it can do in a best-case scenario is to be improved on over time as data comes in. We could see where it looks like this is the function from model size to fraction of tasks automatable. If we’re looking at the occupation level and there’s only three, then it’s just not going to be useful like that. You’re going to have to turn to other ways of indexing our progress, like SWE-bench or the task time length growing. But you’re not going to be able to use O*NET because the right level of granularity is just unreasonably large.

On the other hand, I don’t think there’s one objective right level of granularity. Because for some purposes, like with A and B, it could be that some people have questions that need follow-up emails and that don’t require follow-up Zoom calls. You want to be able to capture the fact that Pamela is going to be able to just ask the LLM to handle those. My lesson is more like you’ve got to be careful when thinking about your application of a task-based model rather than that there is a right task-based model, but it’s just at a coarser level of granularity or something.

Anson

I think that’s very much the kind of thing that I was trying to get at. If I’m trying to do this coding task, the description at the O*NET level would look more like “doing coding” or “doing data visualization”. But really the thing we actually care about is more like: it’s able to write this particular block of code in this Python notebook, or it’s able to write this Python script. What I meant by level of abstraction was that maybe we need to step down to a different level that’s more amenable to capturing that kind of automation. So it’s less a problem with the framing of a fraction of tasks, and maybe more that the particular data we have, the things we’re trying to use to measure this, are quickly becoming out of date. Maybe in the same way that the precursor to O*NET was out of date and they had to come up with a new thing.

Adapting Task Models for AI Automation [00:38:40]

Phil

When you say “it’s quickly becoming out of date,” do you mean just because the nature of work is changing fast, or do you mean that it’s becoming out of date because day to day, the list of tasks that you’re doing is different? Because “code this particular thing” is different from “code that particular thing.”

Anson

I think it’s becoming out of date in the sense that we’re getting a better sense of what kinds of tasks AI is automating over time. Maybe in the past we just didn’t really have this understanding. Now we’re realizing more and more how these task descriptions compare to the actual task automation progression.

Phil

I see. So we should cut things up in a way that corresponds more closely to how we think things will actually unfold? So that things will be cleaner: where it can fully do A, B and C before it can even do a little bit of D?

Anson

Exactly. I think that would be the hope. But then I think it’s also kind of hard to do that. I’m not sure how I would cut things up.

Phil

Yeah. That’s a good thought. I haven’t thought about how we might categorize tasks in a way that’s more amenable to tracking AI automatability.

Uncertainty in the GATE Model [00:39:59]

Anson

Speaking of the GATE model, would you say that the biggest problem with the GATE model or the greatest source of uncertainty is coming from the AI R&D module? Or do you think it’s actually not just that?

Phil

I can only speak for myself. I think the biggest source of uncertainty is the assumption that there’s some amount of compute where we can be confident that a model trained with that much compute would be able to fully automate anything in particular. And secondly, that we can interpolate between here and there, such that with 20% of the compute needed to reach that finish line, we could automate whatever percent of the tasks. This isn’t an issue with GATE in particular. It’s inherited from Tom’s takeoff speeds model. It would be really nice to have a threshold like that where we could say, “Okay, we know we’ll get to AGI, to a system that can automate AI R&D or that can do something else you’re interested in.” But the interpolation just seems totally made up. I’ve never thought that whole methodology was very well grounded.

Moving from Interpolation to Extrapolation [00:41:24]

Anson

I think the interpolation indeed is pretty made up. I guess the issue is just what do we do instead?

Phil

We can’t do it now, but I think it will start to be the case that full-on O*NET tasks will be automated, and full-on sub-classifications of work time, certain types of work, eventually whole occupations. But before that, categories from time-use surveys or from timesheets will start to be eliminated. Then we’ll have the data to extrapolate rather than interpolate. Instead of assuming a finish line, putting a point in the middle, and fitting a curve, we’ll be able to say that every time we double the model size, we tend to cut the remaining O*NET tasks in half, or whatever the relationship turns out to look like.
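
A sketch of what extrapolation rather than interpolation could look like, with entirely made-up observations (the fitted form, a constant proportional cut in remaining tasks per compute doubling, is an assumption for illustration):

```python
import numpy as np

# Hypothetical observations: training compute (FLOP) vs. the fraction of some
# fixed task list that is still unautomated. Every number here is made up.
compute   = np.array([1e24, 1e25, 1e26, 1e27])
remaining = np.array([0.95, 0.80, 0.55, 0.30])

# Fit log(remaining fraction) as linear in log2(compute), then extrapolate.
slope, intercept = np.polyfit(np.log2(compute), np.log(remaining), 1)
factor_per_doubling = np.exp(slope)  # multiplier on remaining tasks per 2x compute
print(f"each compute doubling multiplies remaining tasks by ~{factor_per_doubling:.2f}")

# Extrapolate: compute at which, say, under 5% of tasks remain unautomated.
target = 0.05
log2_c = (np.log(target) - intercept) / slope
print(f"~{2**log2_c:.1e} FLOP to reach {target:.0%} remaining")
```

Unlike the interpolation approach, nothing here presupposes a compute threshold for AGI; the finish line is whatever the fitted trend implies, and it gets revised as new observations come in.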

At the moment, there’s that “GPTs are GPTs” paper. They just ask an LLM for its guesses about what it could do. Then they have people try to judge it, but they adjust the prompt that they give the LLM to match the people. Then they extrapolate. Even if you fully trust those estimates, that’s a single data point. That’s like what GPT-4 could automate. How partially it could automate each O*NET task. People haven’t done a follow-up one for GPT-5, or for a reasoning model. They haven’t recapitulated the whole methodology of that paper.

The Anthropic Economic Index and other data on LLM usage that’s starting to come out from the other providers might let us continuously start to do this sort of extrapolation. They’re not just anchoring on O*NET tasks. Anthropic also has their own categorizations that might cluster more naturally. They have this endogenous clustering thing where they apply some ML to try to figure out what people are using the models for. That’s maybe responding to your point about how we can create on the fly a way of carving up work into tasks that better corresponds to automatability. That approach will have its flaws as well. But I think from at least a certain perspective, it’ll be more grounded than the sort of set end point and then interpolate approach that we’ve been having to make use of so far. But it’s still early stages for that.

In a really fast takeoff scenario then there won’t be time, because we’ll move so quickly from being able to automate almost nothing to being able to automate everything at once. So maybe it’s just intrinsically hard to predict. But I think probably not. I think probably we will be able to start doing extrapolations that are a bit more empirical as time goes on and as we start developing a track record of automating tasks completely.

Empirical Approaches to Tracking Progress [00:45:09]

Phil

You can track progress and start to extrapolate in this more empirical way. Not just by looking at the dimension of tasks, but by looking at task time length. METR has their famous study of this. I think that’s just great. That’s amazing because it’s the first real example of starting to do what we would have liked to be able to do all along. Track progress on something that, in principle, would eventually get you to full automation of at least some class of tasks. We’re far enough along already that we can draw a line through the data that we have and make an extrapolation. It’s not like we just have one or two data points right at the beginning, like maybe with O*NET.
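
Schematically, the task-length extrapolation fits an exponential of the following form to the observed horizons, where L50(t) is the length (in human time) of tasks that AIs complete with 50% reliability at date t, and the doubling time is a parameter fitted to the data:

```latex
L_{50}(t) = L_0 \cdot 2^{(t - t_0)/T_{\text{double}}}
```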

You might be able to carve up the space of work or the space of R&D or the space of AI R&D in other ways as well. I think Situational Awareness has been criticized by some for being as hand-wavy as it is about its kind of extrapolation by equivalence to human age, basically: as smart as a high schooler, or as smart as a college student. I think that framing is not justified in that essay. But it could be, or something like it could be. More experienced workers are paid a lot more than more junior workers.

If you know one thing about a person to predict their wages, age explains so much, even more than education or race or gender. What that’s telling us is that there’s a lot of tacit knowledge that people accrue through life. Maybe the way things will unfold is that AIs first get good at learning the sorts of things you can learn from books, which they have access to no less than a college student; in fact, they have better access in some sense. So the people right out of college with no tacit experience will be automated first. Then, as the AIs get more sample efficient and are deployed for longer, accumulating data in those early-stage roles, they can displace the people who have only three years of real-world experience. Then you could see that creeping up.

Maybe. I’m not saying that’s what’s happening. You hear anecdotes about young software engineers finding it harder to get jobs. But it could be that a story along the lines sketched out in Situational Awareness comes to have an empirical grounding, in which case we’ll have this other way to extrapolate. There might be others as well that I haven’t thought of.

Anson

I guess it reminds me of AI 2027, where they’re using this METR data for forecasting the timeline to a superhuman coder. Instead of just doing this for a superhuman coder, which makes the most sense given what kinds of tasks were included in the METR study, perhaps we could try to generalize this somehow by getting more data or just generalizing the study to a broader fraction of tasks in the economy.

Phil

Yeah, but still looking at task length. That would be expanding the scope of that particular dimension, the task length dimension.

Economic Theory as a Destructive Project [00:49:32]

Anson

How much should we update on economic theory anyway?

Phil

Yeah. That’s a profound question. Something I’ve had to wrestle with having specialized in it. Definitely less than I used to think.

Phil

This is a pessimistic view, but a view I’ve somewhat come around to is that economic theory is primarily a destructive project.

Anson

It doesn’t sound promising.

Phil

It’s not so bad. What I mean is, if it’s good for anything, it’s most useful for pointing out that intuitions that you might have had don’t actually hold in general. Coming up with counterexamples or curious models in which something you thought was inevitable turns out not to hold. Every now and then on further investigation, hopefully empirical investigation, you’ll find that this theoretical curiosity is actually relevant. So it can open your mind. But it’s a much more limited role than being able to deduce the truth from first principles.

A classic example is minimum wages. You’ll see people saying: as an economist, I know that as the price of something goes up, people want less of it, so a minimum wage is going to cause unemployment. That’s not something you know as an economist. That’s something you know as a shopper or something. Everyone kind of has the intuition that typically when things get more expensive, they’ll be less demanded. You didn’t need a PhD to learn that. If you learned anything in a PhD, it was either the tools to empirically test whether that’s true in some context, the fact about whether it is true in some context because you did actually apply the tools, or the theoretical insight that it’s not true by necessity.

So it turns out that there are such things as Giffen goods, or you can very commonly have backward-sloping supply curves for labor. Because if someone gets paid more, they might just work fewer hours, buy back their own leisure with some of the money.

Labor markets can look all sorts of ways from first principles. So this theory can shake you out of this kind of intuition that you might have had walking into econ 101 or walking out of econ 101. I think actually the empirical case on minimum wages is just messy. In some cases they do probably cause some unemployment, in other cases not. It depends a lot on magnitudes and all the rest of it. There’s some people who triumphantly say “Economics is wrong. Minimum wages don’t cause unemployment.” I’m not saying that. I’m just saying I think if economic theory has anything to add there, it’s destructive of the common-sense intuition we all have about prices and quantities that we know from everyday life.

Generally, when thinking about AI and growth, there are some constructive points. The Jones model can, at least to some extent, if you squint, explain the past, and so it can serve as a nice basis where you can swap out the Ls with Ks and see what full automation would do to growth. But beyond that, I think it doesn’t have that much to add, even though I’m trying to add a little bit. If it does have something to add, it’s by identifying ways in which the future could be surprising, which we’ll then have to look into with whatever empirical tools we have.

AGI’s Impact on Growth and Wages [00:54:03]

Anson

Some of the ways in which you’ve applied this is to say things like what is the impact of AGI on the probability of explosive growth and on wages. My understanding is that you thought that it’s kind of ambiguous just from a purely theoretical perspective. I’m curious if that’s right? And also whether you think you have a different perspective if you take all of the empirical evidence into account? What’s your overall view and what’s your purely theory, let’s-try-to-find-the-counter-examples kind of view?

Phil

It’s all a little bit mixed together. But one thing that definitely is in this “aha, here’s a theoretical curiosity” category is that real GDP is such a bizarre chimera of a variable that you could have full automation and really explosive growth in every intuitive sense of the term, and yet real GDP growth could go down. An example of why it might at least not go up that much (I think it probably won’t all work out this way, but I don’t think it’s crazy) is this common pattern where new goods, just as they’re introduced, have a really small GDP share, because they have zero GDP share before they’re introduced. At first they’re really expensive; we’re not very productive at making them. As we get more productive, the price falls, but the quantity rises faster. The elasticity of demand is greater than one: every time the price falls a little bit, the quantity rises a lot, so the dollar value of the good rises, so the share is rising. After a while it goes the other way, once the goods are really abundant, at least relative to everything else.

The Good Lifecycle Hump and Baumol’s Cost Disease [00:56:38]

Phil

Every time the price falls further, the quantity only rises a little bit because we’re basically satiated in it. So you get this hump: new goods - small share; goods that have been around for a medium length of time that we’re mediumly productive at - high share, they dominate GDP; old goods like food - small share. So we’re continually going through this hump.

Everyone’s familiar with Baumol’s cost disease. But the way it’s usually presented is that AI might have less of an effect on growth than you might have thought, because we’ll be bottlenecked by the few things that have not yet been automated that you still need people for. And actually, you can have Baumol after full automation. Because, remember the hump, right? Real GDP growth at a given time is the weighted average of the growth rates of all the goods where the weightings are the GDP shares. The GDP shares will be dominated by the goods that we’re intermediately productive at in this view.
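
In symbols, the chain-weighting approximation Phil is describing:

```latex
g_Y(t) \approx \sum_i s_i(t)\, g_i(t)
```

where s_i(t) is good i’s share of nominal GDP and g_i(t) is its real output growth rate. A good we are about to become infinitely productive at only gets weight while its share is passing through the middle of the hump.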

So let’s say for every good you have its own specific technology growth rate. Like how quickly it can be produced is some arbitrary function of its current technology level. It can be hyperbolic. You can have A dot equals A squared or something. So for every good, there is some finite date by which we’ll be able to produce infinite quantities of it in finite time.
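
Solving the hyperbolic example Phil mentions shows the finite-time blowup explicitly:

```latex
\dot{A} = A^2 \;\Longrightarrow\; A(t) = \frac{A_0}{1 - A_0 t}
```

which diverges at the finite date t = 1/A0, so the good becomes arbitrarily abundant in finite time.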

So it’ll be free, so its GDP share will be zero. And we just go through these ever-higher-index goods, ever more complex goods, over time. And at any given time, all of GDP is in the goods that have a productivity level of five, or whatever happens to be in the middle as far as GDP shares go. So some effect like that can produce something like a Baumol effect even after full automation.

I think it would be pretty weird if that kept the absolute number low. Like anything as low as the current number indefinitely. But the idea that maybe it causes measured real GDP growth to not be that high for a while when the world is starting to look remarkably different doesn’t seem crazy to me. And maybe it’s worth knowing and having as a scenario in your back pocket in case things start looking weird and anyone says “What are you talking about? I don’t see the numbers.” I’m trying to be cautious, but that’s an example of destructive economic theory.

Anson

Do we have any quantitative sense of what the hump looks like?

Phil

That’s a good question. There’s that Bessen paper, and you could just do a bunch of case studies by good. I should look into that more quantitatively.

Probability of Explosive Growth [01:00:15]

Anson

To go back to the thing of what’s your overall view then?

Phil

Oh, I didn’t get to the wages either. I should say something about wages, but yeah, my overall view on what?

Anson

The probability of explosive growth over the next, say, five decades.

Phil

If by explosive you mean an order of magnitude higher than now, I think that’s more likely than not. I mean, with 25 years of 30% growth, the world’s very different. In fact, I was just punching the numbers into my phone on the way here. 25 years of 30% growth is so different that even the intuition that 1.3 to the 25 and e to the 0.3 times 25 will be about the same has broken down. 1.3 to the 25 is about 700: the economy is 700 times bigger, in some sense. E to the 0.3 times 25 is about 1,800. So just the additional compounding from continuous growth at an annualized 30% a year more than doubles the number.
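Checking the arithmetic Phil is punching into his phone:

```python
import math

print(1.3 ** 25)           # ~705.6: 30% growth compounded annually for 25 years
print(math.exp(0.3 * 25))  # ~1808.0: the same annualized rate compounded continuously
```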

Anyway, in a world that different, you could get some funky effect where we’re bottlenecked by these new goods, which will be really advanced, some weird nanobot-type thing that takes a lot of serial steps to create. And we need it; we can’t make up for a lack of it with large quantities of all the goods that we’ve been able to produce in the meantime. So we’re bottlenecked by the scarcity of this new thing.

Some weird effect like that, or some natural resource constraint, or some regulatory imposition. Maybe we really do satiate, and we start wanting to make the world very safe and stable even if it means throttling growth to only 30% a year or whatever. At the moment, all of that roughly cancels out for me against the counterarguments, which are also valid: that due to international competition, people will continue to race with each other for military purposes, even if they would each prefer to be safer. So that’s about where I land. I’m at even odds once you push much past an order of magnitude.

The Slippery Nature of Real GDP [01:03:24]

Anson

And how much does your answer change if we don’t consider the economic theory aspect of this, if we set aside the hump of productivities and GDP shares? Would your answer change much?

Phil

I should have said before: 50 years is long enough that I do think we’ll be able to develop robots and AI that can do what we can do. But some of the uncertainty will also just be that that doesn’t work out. And it’s a little hard to separate it out. I mean, digging into the theory of what chain-weighting is has made me pretty viscerally feel like real GDP is a much slipperier concept than I ever used to think.

Here’s a fun fact. This is crazy. Real GDP, and lots of “real,” inflation-adjusted variables like real capital, but let’s say real GDP, is not a quantity. What do I mean? Here’s what I mean. Imagine a timeline of some economy, say the US from 1950 to 2025, 75 years. And imagine an alternative timeline, with an alternative economy living it out, that’s exactly the same as the US in 1950 at the beginning, in its own 1950, and exactly like the US in 2025 at the end, in year 75. But in the middle, things happened in a different order. The microwave was invented in 2006, and the iPhone came out in 1971. The distribution of wealth evolved in a different way. But at the end, it’s exactly the same: everyone’s got the same preferences and exchanges the same goods and services for the same dollar bills, atom for atom. Everything is exactly the same in 2025, and in 1950, on both timelines. Timeline A, timeline B.

Unless people have homothetic preferences, meaning that the fraction of their income they spend on each good is constant no matter how rich they are (so no luxuries or inferior goods, which is completely wrong: you don’t spend the same fraction on food when you’re starving as when you’re richer), and unless those preferences are exactly the same across the population and totally stable over time; unless all three of those conditions are met, there is a timeline B on which real GDP growth, chain-weighted across the years with perfect measurement, can be any number.

Real GDP as a Path-Dependent Measure [01:06:47]

Anson

Okay.

Phil

Isn’t that crazy? Even the fact that there could be any variation means that, to my mind, real GDP is not a quantity, because it’s baking in the history. You see what I’m saying? With a yardstick, the order in which you measure things shouldn’t matter; it should order things the same way regardless. But here, the order in which things happen can change what share of GDP a given good had while it was growing quickly.

So let’s say there are two of us, and one of us is going to be rich one year and the other rich the next. I’ve got a lot of clones who share my preferences, and you’ve got a lot of clones who share yours. Whichever group is rich bids up the prices of the things it likes. Now suppose things happen so that the goods my clones and I like are growing quickly, in absolute units, while we happen to have the money, so that our preferences are mostly determining the GDP weights, and the goods you like are growing quickly when you and your clones have the money. Measured real GDP growth across the two years is going to be higher than if it’s the other way around, where the things I like grow quickly while I’m poor, and vice versa.

And it’s that kind of effect that lets you scramble things up so that, as long as people depart from perfect homotheticity, constant preferences, and uniformity across the population, real GDP growth can come out to any number. So maybe I’ve overinternalized this. But given that I have, I feel like I can’t really separate the theory from my overall opinion.
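To make the path dependence concrete, here is a minimal two-good sketch with made-up prices and quantities, using a chained Laspeyres quantity index for simplicity (the chained Fisher index the BEA actually uses is also path-dependent in general). The two timelines agree exactly in the first and last years and differ only in the middle year, the kind of difference the clone story above would generate, yet their measured chain-weighted growth differs:

```python
# Chain-weighted real GDP depends on the path, not just the endpoints.
# Two goods, three years. Timelines A and B are identical in years 0 and 2
# but differ in year 1. All numbers are invented.

def laspeyres_step(p0, q0, q1):
    """One-step quantity index: next year's quantities valued at this year's prices."""
    return (sum(p * q_new for p, q_new in zip(p0, q1)) /
            sum(p * q_old for p, q_old in zip(p0, q0)))

def chained_index(path):
    """Cumulative chained quantity index over a list of (prices, quantities)."""
    index = 1.0
    for (p0, q0), (_, q1) in zip(path, path[1:]):
        index *= laspeyres_step(p0, q0, q1)
    return index

# Each entry is (prices, quantities) for one year.
timeline_a = [((1, 1), (1, 1)), ((1, 1), (4, 1)), ((1, 1), (4, 4))]
timeline_b = [((1, 1), (1, 1)), ((2, 1), (4, 1)), ((1, 1), (4, 4))]  # only year-1 prices differ

print(chained_index(timeline_a))  # 4.00
print(chained_index(timeline_b))  # ~3.33: same endpoints, different measured growth
```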

Anson

I guess it’s a funny way of framing it, though. I think I would still call it a quantity; it just depends on the history. In physics we’d say the work done is not a function of state: it isn’t determined only by the state you end up in, it also depends on the path taken to get there. But there are also quantities that depend only on the state, like the internal energy.

Phil

Yeah.

Anson

I think that’s sort of like a word choice, though. I don’t think that really makes a difference to your claim.

Detecting the Economic Singularity [01:09:11]

Anson

So, new topic. How do we detect the economic singularity?

Phil

That’s a big one. I don’t have anything super original. But I have something a little bit original. William Nordhaus, this Nobel Prize-winning economist, has a paper from 2021 asking “Are we approaching an economic singularity?” And his answer was no, because he observed that if we were, we would see a bunch of macroeconomic variables moving in sort of predictable directions. So we would see the capital share rising, for instance. I think that’s the most robust one.

If we’re approaching a world of full automation, where we’ve got these robots and computers that can do everything people can do, and there are just more and more of them, then the share of income paid to their owners has to be higher than the share paid to labor, and it’s got to grow ever higher as their relative quantity increases. And there are other indicators like that. So he plots these things and finds that they’re all as flat as they ever were. So he says the singularity is not near.

And methodologically, I think that’s a great insight. The issue is that it can be a really lagging indicator: basically, it tells us whether a singularity is already underway.

It’s sort of like being a weather forecaster who just looks out the window and says whether it’s raining. But I mean, if you start to see all of these variables moving in the same direction a little bit, that should be an update. And it’ll only be super lagging in the sort of scenario where a mostly software intelligence explosion comes first, and it only starts to impact the real economy discretely later. Which I think is a possible scenario. But it’s good to know that in other scenarios you can extract some real information from the macroeconomic variables well in advance.

So Nordhaus did that. One very small thing I’ve done is just get the analyses up to date and expand on them a little bit, add a few similar variables to the list and find that they’re all just as flat as they were when Nordhaus measured them.

Anson

I guess that’s not that surprising.

Phil

Which is not surprising, but I can report that on Nordhaus’s methodology, we’re not seeing any more evidence than we did in 2021, or in 2015, when he actually first wrote the paper.

Network-Adjusted Capital Share [01:12:27]

Phil

A subtler thing is that you can look at what’s called a network-adjusted capital share. The ordinary capital share is a feature of an economy: what fraction of all income is received in exchange for the use of capital. The network-adjusted capital share is a feature of a good. It asks: for every dollar of revenue spent on that good, if you trace it all the way back down the supply chain, how much of it is ultimately paid out in exchange for value added by capital, as opposed to value added by labor, or taxes?

Anson

What was the thing that we were tracing?

Phil

A dollar of revenue.

Anson

Okay.

Phil

For example, Starbucks sells a cup of coffee. You give them $5. Some of it is paid to the people working at Starbucks. So maybe $1. Some of it is received as profits by the people who own the physical infrastructure of Starbucks and the brand, the shareholders of Starbucks. So that’s another dollar. And let’s put aside taxes and $3 are spent on intermediate inputs. I’m making up these numbers. So on the coffee beans and the cups and the electricity to keep the lights on and all of that.

Now, as far as the Starbucks balance sheet is concerned, those are capital expenses. But in reality they’re not all capital, because we haven’t traced them down the chain. You’ve got to ask, of every dollar the coffee-bean supplier gets in revenue: how much goes to the owners of the firm, just for owning its capital? How much goes to the people working at the firm, for their labor? And how much is spent on intermediate inputs that that firm uses, the tractor or something, if it’s a farm? And then, for every dollar they give John Deere, how much is spent on capital, labor, and intermediate inputs at John Deere? So of that original $5 you spent, you can in principle ask how much is ultimately going to capital and how much to labor, all the way down the supply chain.

I mean in some sense, it never ends. You’re cutting up the pennies ever smaller. If you’re going to try to do this manually, you’d have to stop after a while. But you can say, what’s the capital contribution in some sense to this cup of coffee as opposed to the labor?

One neat thing is that this sounds totally intractable to compute. There’s this A.J. Jacobs book in which he tried to thank everyone in the world for his cup of coffee. At the beginning, he’s thanking the barista; a few chapters in, he’s thanking the people who made the paint for the truck that carried the whatever. I haven’t read the book, but that’s my understanding. So it sounds intractable.

It turns out that the US Bureau of Economic Analysis, and similarly the OECD, creates these big input-output tables every year, where they actually try to figure out, relatively coarsely, but still at a granularity of something like 87 industries, I think it is, how much of every dollar of revenue in each industry goes to its immediate labor and capital, and then to intermediate inputs from all the other industries.

So you get a big matrix, and it turns out that you can work out the network-adjusted capital share for any industry in the matrix with some matrix inversion. That does the infinite tracing down the chain for you, and then you can plot what it is over time for any given good.
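As a minimal sketch of that computation, with an invented three-industry table (the real BEA and OECD tables are far larger): writing A[i][j] for the dollars of industry i’s output used per dollar of industry j’s output, and k[j] for industry j’s direct capital share of revenue, the network-adjusted shares c satisfy c = k + Aᵀc, so c = (I − Aᵀ)⁻¹k.

```python
import numpy as np

# Toy network-adjusted capital shares via a Leontief-style inverse.
# A[i, j]: dollars of industry i's output used per dollar of industry j's output.
# k[j]: fraction of industry j's revenue paid directly to capital.
# (The remainder of each dollar goes to labor; taxes are ignored. All made up.)
A = np.array([
    [0.0, 0.3, 0.1],
    [0.2, 0.0, 0.4],
    [0.1, 0.2, 0.0],
])
k = np.array([0.30, 0.25, 0.40])

# c[j] = k[j] + sum_i A[i, j] * c[i]  =>  (I - A.T) c = k
c = np.linalg.solve(np.eye(3) - A.T, k)
print(c)  # network-adjusted capital share of each industry's output
```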

Self-Replicating Systems and Automation [01:17:26]

Phil

You might want to know whether a good is approaching a network-adjusted capital share of one for a few reasons. One is that it might be the kind of good that can later drive the growth of everything else. If it’s semiconductors or whatever, and we’ve fully automated semiconductor production, we might think that will then drive the intelligence explosion, which drives the automation of other things. This will actually be more of a leading indicator. You’ll actually be able to do some weather forecasting instead of just looking out the window.

But secondly, you might just be interested in it because if a good has a network-adjusted capital share of one, everyone on earth could die and it would still keep getting cranked out. And that’s the kind of thing that could have a big effect on the world. Having drones that are self-replicating in this grand supply-chain sense would be really important for military purposes. Or robocops or something: if you no longer need the support of your population to produce the equipment that suppresses them, maybe dictatorship becomes more stable. Or some omnicidal maniac might want to create little killer machines that can self-replicate. People could intervene to shut all of that down, but you’ve at least crossed some threshold where it’s more worrying: you have a closed loop in which no one needs to actively intervene to keep the system going, and which then leaves its mark on the world.

So I just thought it would be interesting to compute that: invert the matrix and look at the network-adjusted capital share for different goods over time. And it turns out they’re all basically as flat as a board. For semiconductors, the share isn’t computed at the industry level annually; it’s only every five years, unfortunately, so it’s always really out of date. But it’s 50-50, and it’s sort of always been 50-50. So there you go. One more little data point that maybe the singularity is not near, at least if it’s going to be an industrially intense singularity early on.

Detecting the Industrial Revolution [01:20:22]

Anson

I’m curious whether, if we went back in time and tried to apply a similar style of thinking, we could have detected the Industrial Revolution happening. Suppose we went to France in 1700 and tried to detect the Industrial Revolution coming in advance with some kind of leading indicator. What would people have done? How well could you have done?

Phil

In practice, I think the people at the time would not have done well.

Anson

Yeah.

Phil

Even if they’d known to keep an eye out for something like that. I want to share a bit about the economists of 1700s France, the Physiocrats. They created the first of those input-output tables, as far as we know, of the kind the BEA now studiously compiles every year or five. In some ways they had free-market ideas, which Adam Smith incorporated into the Wealth of Nations. But they also had this crazy idea, at least a lot of them did, that real wealth came from the earth. It came from the land, and everything else that people do is just icing on the cake. Rearranging the wood into a chair: yes, you’ve got to do that down the line, but to really get more wealth, you’ve got to speed up the beginning of the pipeline. So all those moves to the cities, all those first glimmers of industrialization, were moving the wrong way. Everyone had to get back to the countryside for economic growth. The worst possible advice you can imagine! Maybe an early example of economic theory gone awry.

The thing to do would have been, first, to look around at what’s going on in other countries. Sort of an obvious point, but don’t assume that what’s happening in France now is the best guide to what will be happening in France in a few decades or a century. Likewise, and I’m realizing my own folly as I’m speaking, I was looking at the US for all these network-adjusted capital shares, but there are other countries, like Japan, where there’s a fair bit more automation, at least in a lot of sectors. That demonstrates that something is feasible, and even if for whatever reason it hasn’t been implemented here, it might come very quickly, as industrialization came across the English Channel a bit later.

Tacit Knowledge vs. Population Growth [01:23:32]

Anson

One question I have about explosive growth is that a lot of the framing is usually around increasing the number of AI researchers. But you also mentioned these other returns to scale. So the framing I want to take here is: what exactly are the returns to tacit knowledge? In particular, suppose you could either 10x the human population on the one hand, or, on the other, allow all humans to share all of their tacit knowledge with each other. Which do you think would have a bigger effect on economic growth, and what would that look like? Is it a growth effect in the long run, or, if we maintained the sharing, some kind of level effect? How do you expect this to work out?

Phil

I expect that multiplying the population would have the bigger effect on growth. The increasing returns to scale that specialization allows for, or just better coordination through one person’s knowledge being immediately accessible to another, is something we’ve been improving on over time. Not that quickly. But as I mentioned, there’s the invention of the internet and modern telecommunications, and all the academic infrastructure we have for sharing papers; even on the tacit knowledge front, I think we’ve been getting better at communicating over time. If that’s had an effect, it’s been small enough that we’ve been able to talk about the economy in really broad brushstrokes without being completely crazy. Even the proposal that it’s a significant source of increasing returns to scale, such that with these big-brained AIs, one 10x-ed AI is going to be a lot more productive than ten smaller ones: I think that’s true. But I doubt the effect would be so large that it outweighs the raw replication effect, whereby with ten times more people you’d have, at baseline, a ten times larger economy. Here the replication argument does work: ten times more people with the same tastes are going to make ten times as much stuff. I’d be very surprised if this mind-melding effect were big enough to outweigh the full elasticity of one in the “double the population, double output” replication point. But I don’t know. Obviously we’d be pushing the boundary by developing AIs that could immediately share all their tacit knowledge. But there is another way you can shed some light on this, which is by looking at the returns to working longer hours.

Returns to Scale and Working Hours [01:27:18]

Phil

Something I’ve been interested in is why exactly, in some professions, you make so much more by working really long hours. You’ll have a lawyer or an investment banker working 80-hour weeks, which is really miserable. And you think, “Well, why can’t you just get a normal job where you work 40 hours a week and get paid half as much?” But you get paid less than half as much, or the job is not available at all.

There are some genuine workaholics who want to work that much, but I think there are people for whom it feels suboptimal. The reason that’s what we end up doing has got to be this increasing returns to scale thing, where it’s more productive to have the whole breadth of tasks associated with a project located in a single brain. If you can’t get it all into a single person, you make do with as few people as you can, so that the disparate parts of the project stay integrated. Working double hours is like sharing tacit knowledge with one other person: yourself, for another workweek, crammed into the same week. The constant returns to scale view would say that working 80 hours gets you twice the salary, or less if you’re overworked. The increasing returns to scale view says it gets you more than twice the salary. And it does. But it doesn’t get you more than four times the salary. So the primary effect is still just the fact that you’re working more hours.

Anson

Is that the case, though? I wonder how comparable it is. If we’re really sharing tacit knowledge across the entire economy, across all the agents in the economy, it feels potentially a lot bigger than just sharing with one person and stuffing it into the same week.

Phil

It’s a good point. I would wonder how much everyone needs everyone else’s tacit knowledge, right? Because to a large extent, we’re doing jobs that require independent knowledge bases. But I can’t rule it out. Maybe someone knows something that rules it out, but I definitely take the point that the economy-wide effect might be a bit bigger than the investment-banker returns to scale. Strictly, that’s a lower bound; intuitively, for me, it’s also something close to an upper bound. But yeah, I don’t have more to say about that.

Jupiter Brains and Agglomeration Effects [01:30:26]

Anson

Which is going to be more valuable, to take the thought experiment: two half-Jupiter brains, or one whole Jupiter brain?

Phil

We all know that having twice as many people leads to more than twice as much output through agglomeration. People cluster in cities nowadays almost entirely because of each other, not because of some natural resource or port that the city happens to be next to. Everyone gets to go to their own favorite kind of barber and eat at their own favorite kind of restaurant. Whereas if we were all just living in towns of 100 people, we’d all have to make do with something generic. On current margins, economists tend to think that this effect is pretty small—that you get benefits from specialization up to maybe the size of a small city or small country. But beyond that, it’s basically constant returns to scale. One argument for this is that the gains from trade between pretty similar countries with similar natural resource endowments are estimated to be pretty small.

It’s sort of funny: economists are associated with the idea that tariffs are really horrible and that there are all these gains from trade. Implicitly, that’s an argument about increasing returns to scale, because if there were constant returns to scale, why bother trading between countries? Each country could just, in effect, chop the land in half and produce half as much of everything on its own. And that is in fact what is typically estimated: the gains from trade between the US and Canada are maybe a few percent of GDP. Which is a lot in the grand scheme of things, but it’s not that much. As for all the recent fretting about tariffs: there can be large short-run losses if you’ve built up a trade network and then suddenly sever it, because the supply chain has to be reorganized. But in the long run, standard models and estimates predict that the effect should be small. On the other hand, I think in the long run the effects could be pretty large after all.

Anson

Interesting.

Phil

My pet theory would be that a large part of what’s happening, as we’ve developed more advanced technology, is that we make better use of our latent capacity for specialization. As academic fields have gotten more specialized, they’ve been making use of the fact that there were more academics to fill all these specializations. One thing we could have done is just keep calling everyone a natural philosopher, like basically all the academics were back in the Middle Ages, or a theologian or a lawyer. They had like four kinds of academics, and lots of people largely duplicating each other’s work. If there’d been this big influx of academics and people hadn’t yet come up with molecular biology and all the rest of it, then as they did so you’d be getting all of these gains. On some level you could attribute those to the development of molecular biology, but on some level you’ve got to attribute them to the fact that there are more people now to take up this further specialization. Likewise, if you suddenly moved a bunch of people from small towns into a big city and you just had a bunch of middle-of-the-road barbers and diners that hadn’t yet specialized, there would be gains up to the degree of specialization that the size of the city allowed for. But it wouldn’t all be realized at once.

If a view like that is right, I think it helps to explain what would otherwise be a bit suspicious. Moving from full individual-level autarky, where everyone’s self-sufficient, everyone has their own little house on the prairie, and there are no gains from trade at all, to a modern small city or small country, you get all these gains from trade, and then they just suddenly level off. (Assuming the country doesn’t have oil; that’s a big case where there are gains from trade, I should have said, but that’s clearly not about specialization, that’s just about a necessary natural resource.) And that’s presented as just a fact of nature. It seems a little suspicious to me that, in the whole space of ways to arrange matter and energy to produce things of value, you get strong benefits from specialization up to the size of a small modern city in 2025, and then no gains at all after that. I would guess that with enough time, even if we just stagnated at current population sizes, we would develop more and more specializations, and we would continue to urbanize globally: not just increasing the fraction of people in cities, but the fraction of people in the biggest cities. In time we would extract ever more of the benefits of specialization. But what has happened over the last century is that populations have grown really quickly and urbanization has grown really quickly. That has created this big overhang, if you like, of potential specialization to be exploited.

Brain Size and Welfare Capacity [01:37:32]

Phil

Here’s another thought experiment. It seems to me that the welfare capacity of a being, of a brain, probably tends to grow superlinearly in the size of the brain. In the EA [Effective Altruism] community, and in the animal welfare community, there’s a pretty strong inclination that it goes the other way, and a lot of research seems to support that conclusion. People at Rethink Priorities, when coming up with their animal welfare weights, say there are two things that matter: how intensely a given creature can feel pleasure or pain, from a utilitarian perspective, and its list of capacities for valenced experience. Can it feel depression? Can it feel anxiety? Can it feel elation? They find that small creatures can check off a surprisingly large fraction of the boxes that large ones, including ourselves, can. And the intensity, on some level, doesn’t seem to be that much smaller. Little creatures can squeal like they’re feeling pain or pleasure just as intensely as we can. So they think, “Okay, all this extra gray matter that we’ve got isn’t adding much to our welfare capacity.” I think this is neglecting a dimension of welfare capacity that is analogous to the size of a population, which I call the “size” of the experience.

My thought experiment is to imagine a split-brain case; split-brain patients on some level seem to have two separate streams of experience. Say someone’s submerged in an ice bath, experiencing some pain all over their body. Then you cut their corpus callosum. Now suddenly you have two streams of experience, each of which is only feeling cold in half of a body. On the Rethink Priorities-type view, you’ve basically doubled the amount of pain in the world, because you now have two beings that check off the same list and have the same intensity.

But I would say that’s crazy: that by one snip, you’ve doubled the amount of pain in this bathtub. No. You’ve probably left it about the same, or if anything diminished it a little, to the extent that there are psychologically sophisticated kinds of suffering that can only arise when the hemispheres are communicating. Likewise, I just think it’s a bit absurd to think there’s this intense non-monotonicity where, if you disaggregated my neurons into lots of small mouse brains and then even smaller fly brains, the sum of the welfare capacities across the whole set of creatures would rise and rise and rise until suddenly it was dust and down to zero. I think it’s probably that more integration and more complexity mean more capacity for value itself.

I’ve got my economics thought and I’ve got this brain thought, and there’s the parallelizability thought about how, at least eventually, on some margin, latency could be a really strong bottleneck: you’d get way more out of one big chip than two small chips, or out of one automated researcher with, in some sense, a double brain than out of two researchers with normal brains side by side. What does this all add up to? It seems to add up to the possibility that in a radical future where we’re turning Jupiter into some giant computer, two half-Jupiter brains could be a lot less valuable than one whole Jupiter brain. I don’t think this extends all the way up. Apparently the galaxy is 100,000 light-years across. So unless we’re engaged in some very long, slow dance in which we all play little parts that can’t communicate with each other for a very long time, you probably at some point get plain old constant returns to scale, because you just can’t communicate efficiently over far enough distances. But you’d have to get pretty big for the speed of light to start being an issue here.

Implications: Peace and Risk Tolerance [01:42:59]

If you get increasing returns to scale for a long way, I think that has some potential implications. One of them is that we should perhaps expect a more peaceful future than we might have anticipated, if you believe the liberal peace hypothesis that countries that benefit from being able to trade with each other are less likely to go to war. The gains from trade are just going to rise over time as we make use of our capacity for specialization, at least if we don’t grow into space more quickly than we can absorb that capacity. So if the increasing returns to scale thing actually starts showing up more, then maybe that’s good news on the peace front.

Another implication, maybe, is that if you’re thinking of trying to invest for a really long time to turn matter and energy into hedonium or whatever, then you should be more risk-tolerant than you otherwise would have been, because it’s better to own one whole Jupiter brain than two half-Jupiter brains. I guess this is true prudentially as well, if you’re just a selfishly hedonistic futurist and you want to turn yourself into a Jupiter brain: you care about getting the whole one enough that you’d rather have a 50-50 chance at the whole thing than a sure half. I don’t know all the places this could go. It’s obviously opening a massive can of worms, but I think it’s fun to think about.

State Size and Coordination Over Time [01:44:41]

Anson

On the peace thing, do you think this is a core explanation for why we’ve seen fewer and fewer individual nation-states over time? If we compare, say, 1800 to today, is the reason actually just returns to scale operating through this channel?

Phil

I’ve wondered a little bit about that. I don’t think it necessarily tells us how big the returns to scale are, because as long as we all benefit to some extent from being able to coordinate better, there’s going to be some pressure toward having larger states. With enough time, even small advantages win out, just as over the course of evolution traits with very small fitness advantages can eventually become dominant. So I don’t think it tells us anything quantitative. But states are effective mechanisms for coordinating groups of people, and they’ve been getting bigger over time. This is an example of something I was saying before: this latency bottleneck, or coordination bottleneck, has been around all along. It doesn’t feature in most of our growth models, certainly not the Jones model. And it’s been getting relieved over time, so to some extent we’ve got to attribute the growth that we’ve had so far to the ongoing relief of this bottleneck. Which means that in the event of full automation and literal or almost-literal galaxy brains, that constraint would get relieved more quickly. That would be a new dimension along which growth could proceed faster than before.

Anson

So this is not necessarily Phil’s theory for why world government is going to exist if you just go with the long-run equilibrium?

Phil

Well, it’s an argument for thinking that eventually a world government is more likely than you would have thought otherwise. But whatever form that takes—it could be that countries just collaborate more over time or trade more with each other over time, and the regulations sync up and it ends up functionally looking like a world government. This prediction would have seemed a lot more sensible a few decades ago, maybe 1990. Since then, I think for the first time in history basically, we’ve seen an increase in the number of countries in the world. Not a huge increase, but Czechoslovakia split up, various countries have split up and they haven’t merged.

Anson

Like all of history?

Phil

Maybe that’s overstating it, but I was reading something a while back saying that these past 35 years have basically been unique on record. We have no idea about the average size of tribes, but it’s been anomalous as far as the records go. And there are more wars now than there were ten years ago. So you never know what the future holds. But this is just one more argument on the pile for the thought that in the long run, if the more efficient arrangement wins out, it’ll probably involve more integration and fewer states. Because the degree of increasing returns to scale will itself rise over time as we make better use of our latent capacity.

Anson

Okay. I think this is a good place to end. Thank you for joining us on the podcast, Phil.

Phil

Thank you, Anson.

Anson

Thank you all for tuning in to Epoch AI’s podcast. We look forward to having you join us for future episodes.
