Is it 3 years, or 3 decades away? Disagreements on AGI timelines
When might AI truly transform our world, and what will that transformation look like? Even within our own research team, timelines for AGI differ substantially. In this episode, two Epoch AI researchers with relatively long and short AGI timelines candidly examine the roots of their disagreements. Ege and Matthew dissect each other’s views, and discuss the evidence, intuitions and assumptions that lead to their timelines diverging by factors of two or three for key transformative milestones.
The hosts discuss:
- Their median timelines for specific milestones (like sustained 5%+ GDP growth) that highlight differences between optimistic and cautious AI forecasts.
- Whether AI-driven transformation will primarily result from superhuman researchers (a “country of geniuses”) or widespread automation of everyday cognitive and physical tasks.
- Moravec’s Paradox today: Why practical skills like agency and common sense remain challenging for AI despite advancements in reasoning, and how this affects economic impact.
- The interplay of hardware scaling, algorithmic breakthroughs, data availability (especially for agentic tasks), and the persistent challenge of transfer learning.
- Prediction pitfalls and why conventional academic AI forecasting might miss the mark.
- A world with AGI: moving from totalizing “single AGI” or “utopia vs. doom” narratives to consider economic forces, decentralized agents, and the co-evolution of AI and society.
Transcript
Contrasting AGI Timelines
Okay, so today, apparently we’re going to be talking about AI timelines.
So the context is, I think out of everyone at Epoch, you have the most conservative timelines, or you generally expect things will be relatively slower, especially compared to probably most people talking about this in San Francisco, in the tech world. I’m not actually sure, but it’s plausible that I have the most aggressive timelines out of people at Epoch.
Is that true? I wasn’t…
I mean, I don’t know the exact reality of it, but I think whenever we’ve done these internal forecasting exercises at Epoch, I tend to arrive at close to the most aggressive, if not the most aggressive. I think maybe Pablo and Anson are similar to me, but if they’re slightly more aggressive than me, it’s not by a huge margin. So I would definitely say that we represent, to the extent there is one, the two poles, like the bull case and the bear case.
Yeah.
Now, I should also mention that even the bull case might not sound that bullish to some people who follow it. Especially I was just thinking about recently Dario talking about what he expects in 2027 or maybe even like 2026.
Nobel laureates, AIs.
Yeah, well, that’s right. Well, so I’m not actually sure what he means by that. I mean, it does sound like he means something very impressive that might accelerate economic growth or at least accelerate drug development, medical progress. So, with the caveat that I’m not sure what exactly he is trying to imply by those developments, it could just be some sort of narrow system that is very good at understanding science and integrating scientific papers or something. So with that caveat, I do think I’m more conservative than that. So the two polar opposites at Epoch definitely don’t comprise the whole range of views that one could have on this topic. You can definitely be much more aggressive than I am, but I do think that I tend to fall closer to his side. Much more so than you.
Yeah, I mean, I guess one way to try to quantify this is, I don’t know, we often talk about a big acceleration in economic growth. One way to quantify it is: when do you expect maybe US GDP growth, maybe global GDP growth, to be faster than 5% per year for a couple of years in a row? Maybe that’s one way to think about it. And then you can think about what your median timeline is until that happens. I think if you think about it like that, I would maybe say more than 30 years or something. Maybe a bit less than 40 years by this point. So 35. Yeah. And I’m not sure, but I think you might say like 15 or 20 years.
Yeah, I’d say closer to 10 to 15 for the 5% threshold. Especially when you talk about things like 30%, which I think is a natural threshold that Epoch has talked a lot about before, I think the median is almost misleading in a certain sense, because I have uncertainty about what happens even once the capabilities exist. Even if we had some sort of system that was just as good as humans at everything, and it just cost maybe 10^15 FLOP per second to run, I’m still quite unsure whether that would lead to greater than 30% year-on-year economic growth.
And so then if you think about that, if I said, what is my median timeline to 30% GDP growth, then I would have to take into account both my timeline to that particular capability, but also the fact that even at that level of capability, I’m only maybe 70 to 80% sure that would lead to such rapid growth, which I think leads naturally to something like 20-year median timelines, or maybe even slightly longer than 20 years. But I think if you just focus on capabilities, like the capabilities that people think are sufficient for reaching that level of growth, then my median is more like 10 to 15 or maybe even shorter than that. Although it just really depends on what capabilities we’re talking about.
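The adjustment described in this turn can be made concrete with a small sketch. It assumes, purely for illustration, that the time until the relevant capability arrives is lognormally distributed with a median of 12.5 years (the midpoint of the 10 to 15 range mentioned) and a hypothetical log-scale spread of 1.0, and that the capability leads to 30% growth with probability 0.75. The distribution, the spread, and the function name are assumptions, not figures from the conversation, and the sketch ignores any lag between capability and growth.

```python
from math import exp, log
from statistics import NormalDist


def growth_median(capability_median_years, log_sigma, p_growth_given_capability):
    """Median years until fast growth, given a lognormal capability timeline.

    P(growth by t) = p_growth_given_capability * P(capability by t), so the
    growth median is the capability quantile at 0.5 / p_growth_given_capability.
    """
    quantile = 0.5 / p_growth_given_capability
    if quantile >= 1.0:
        # Growth is at most a coin flip even in the long run: no finite median.
        return float("inf")
    z = NormalDist().inv_cdf(quantile)
    return exp(log(capability_median_years) + log_sigma * z)


# Hypothetical inputs: 12.5-year capability median, spread 1.0, 75% conditional.
median_years = growth_median(12.5, 1.0, 0.75)
print(f"Median time to 30% growth: {median_years:.1f} years")
```

Under these assumed inputs the median moves from 12.5 years to a little under 20, which is in line with the "20 year timelines" described. A wider spread or a lower conditional probability pushes it out further.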
Yeah, that makes sense. So in that case, the spread, at least if you quantify it like that seems to be maybe a factor of three or something.
You mean like your timelines, yours are like three times longer or something like that?
Yeah, something like that. Which, I mean, I don’t know if you go three times below your timelines, like three times faster than you.
Well, that almost sounds like it’s like the Dario timeline that I was discussing earlier. So. Yeah, so the difference between you and me is like maybe the difference between me and Dario. So there’s still plenty of variation.
Yeah, yeah. So I guess one thing that… So I had this AI timelines dialogue back in November 2023, which was almost exactly like this. It was with Ajeya Cotra and Daniel Kokotajlo, where Ajeya’s timelines were like yours, and Daniel’s were like Dario’s. So it was these three people where it was like a factor of three, and me and Daniel were separated by 10x or something. And I think it’s maybe useful, for me at least, to look back on how that went, because it’s been close to one and a half years since that dialogue happened. And there are a bunch of things from there. I think some of my predictions were too conservative. Like, I was more pessimistic about progress in math than actually ended up happening. I thought maybe it would take like twice as long for us to get IMO gold level models or something like that. I also thought that revenue growth for the leading labs would be slower than it actually ended up being. So I said in 10 years you might have labs doing, I don’t know, $30 or $100 billion per year in revenue, while that seems like it’s going to happen in 2029 or 2028.
I mean, that sounds like a very important, directly relevant milestone to economic growth. So then I’m curious, what were your timelines in that post? I do remember it, but what are they now compared to that?
I think it’s shorter. So I think maybe back then it was more like 40 years. And now I would say more like 30 years. So I have updated them to be faster. But it is important to clarify also timelines on what, because in that post we answered multiple questions, and one of them was: when are we going to have an AI system that could do any job that could in principle be done remotely? So not necessarily is done remotely, but just could in principle be done remotely. And for that one I would now say maybe 30 years. Well, I said 40 years in that post, so I’m more bullish on that than I used to be.
Updating Beliefs as Capabilities Advance
Okay, I have a question for you then, because I think this is a common thing that comes up. I mean, I guess I used to hear it a few years ago, people would say, just update all the way. Have you updated at any time in the last, like, three years towards significantly longer timelines? Because if all of your updates are just making things shorter, then it would almost seem like there’s a predictable pattern here, and maybe you should just update all the way and join me or maybe even join Daniel in these very short timelines.
I mean, I just don’t think Daniel’s timelines are right. So I wouldn’t want to…
Sure. So maybe updating all the way just means joining me. Okay, but explain to me why you shouldn’t. Why you’re not buying into that. Why? Maybe you think there have been things that have delayed your timelines that make this update all the way framing kind of misleading.
Yeah. I’m not sure I have updated enough times for me to think that there is a bias. But before I looked into this at all, many years ago, if you had asked me, when are we going to get, I don’t know, 30% per year growth, what is your median, I might have said 200 years or something like that. And that was before I really knew much about AI, so there I would just have been drawing on the outside view. Yeah, that’s what I mean.
I also thought something similar, but my kind of outside view and informed view was more in the late 21st century or something, so maybe it’s just that we started from different priors and we’ve just been updating in the same way.
Yeah, maybe that’s what’s going on. Yeah, I guess like, what would have been, even over the last year, I’m not quite sure how to think about the AI developments. I think probably the most informative thing for me has been the fact that I was too pessimistic about revenue growth. And I think that’s a much more objective measure than some of the other things that you can look at, like math capabilities or something. And that just seems like irrelevant. Basically, I expected revenue growth to slow down as the labs reached higher levels of revenue, and that doesn’t seem to be happening. I mean, not happening too much. It’s happening a little bit, but less than I expected.
I mean, that’s just like a standard pattern with startups is that they’re often able to grow quite quickly and then it’s obviously much harder to grow a large company than it is to grow a small company.
Yeah. But for example, I’m expecting revenue this year for OpenAI should be $12 billion or something. And so it’s like three times more than last year, and last year was maybe three times more than the year before that. So I think that is pretty impressive. OpenAI is forecasting that they’re going to have $100 billion in revenue by 2029. That’s their own internal forecast. And now that just doesn’t look implausible to me, while in that debate I was more pessimistic. I was saying that’s going to happen in like 10 years and not in like 5 or 6 years. So that just seems like the thing I was wrong about. And it is relevant in a way. I think it is more relevant than the things that people pay more attention to. For me at least. They pay more attention to the reasoning models, which do math and complex reasoning and programming, and they say, oh, these look very impressive. They look at the Codeforces Elo of the models and how quickly it’s progressed. While for me it’s just much more relevant that revenue growth has been continuing at a speed that was faster than I expected. So that has been a relevant update for me.
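As a rough check on the revenue figures in this turn, the sketch below computes the annual growth multiplier implied by going from about $12 billion to $100 billion by 2029. It assumes "this year" is 2025, which is inferred from the episode falling about a year and a half after November 2023 rather than stated outright; the function and variable names are illustrative.

```python
from math import log


def implied_annual_multiplier(start_revenue, end_revenue, years):
    """Constant yearly growth multiplier that takes start to end in `years`."""
    return (end_revenue / start_revenue) ** (1 / years)


current_revenue_bn = 12.0        # rough figure quoted for this year
target_revenue_bn = 100.0        # forecast quoted for 2029
years_to_target = 2029 - 2025    # assumption: "this year" is 2025

multiplier = implied_annual_multiplier(
    current_revenue_bn, target_revenue_bn, years_to_target
)

# For comparison: how long the same gap would take at the recent 3x per year.
years_at_3x = log(target_revenue_bn / current_revenue_bn) / log(3)

print(f"Implied growth: {multiplier:.2f}x per year")
print(f"Years to target at 3x per year: {years_at_3x:.1f}")
```

On these inputs the 2029 forecast already builds in a slowdown from tripling each year to roughly 1.7x per year, so it is compatible with the expectation that growth slows somewhat as revenue rises.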
Well, I think that there might be a reason why people think in those terms. I mean, my hypothesis is just that they are imagining that the impacts from AI will come from something different from what you have in mind.
That’s right.
So, I think that maybe you could say a standard Epochean view is that the main impacts from AI will come from the fact that AI will be deployed across the economy to do essentially all of the ordinary cognitive and physical tasks that humans are already doing to generate value. So extracting resources, organizing labor, manufacturing things, coming up with new concepts and just doing a lot of routine maintenance of things. This type of thing I think is often neglected in the conversation, because when people talk about the impacts from AI, they’ll focus on some sort of scientist in a lab, doing brilliant biological development. You have the framework of this country of geniuses in a data center, which I feel is a little misleading, because I think that the country of geniuses in a data center won’t be the main value add from AI. The main value from AI will be the fact that it’s not a country of geniuses, but a country of AIs doing a lot of very ordinary labor tasks. Obviously some of that will be biological, but the fraction of the economy that goes to paying biological researchers is quite small. So that just seems like a small part of the story.
Yeah, I mean, there are obviously people who argue that’s just because research has these huge externalities that are unpriced. And actually the value of a typical researcher is enormous. You just don’t pay them that much. I don’t find that very plausible though.
Well, even under that story, though. Well, I mean, it’s not clear that then Anthropic would be devoting that much inference compute to doing biological research because of this externality argument. They wouldn’t be capturing that much revenue from it in the first place. So even under that argument, it’s not like they’d be deploying most of their inference on these biological researchers. So I’m still not exactly clear why people find this country of geniuses in a data center model compelling.
Yeah, I agree, it’s not. So this sort of ties into why, you might ask: well, if the reasoning models are just so impressive, why don’t you update much more? Shouldn’t your timelines be like a few years? Why are they like 30 years? I think the reason is that I don’t feel like we have seen anywhere near as rapid progress on the capabilities that would actually unlock a lot of economic value. So I think it’s much more important to have agency and executive function and the ability to adapt plans to changing circumstances and just sort of do simple tasks, but in a reasonably competent way. And if you look at our economy, as you said, we don’t pay mathematicians or whatever, they’re just a very small fraction of the economy. Clearly, even if you automated that, how much value would that create? Some people will say, oh, all of our technological improvements in science and whatever, it all rests on mathematics. But that just means that mathematics is one input that is needed in a process that probably has hundreds of bottlenecks across the place. So yeah, it might be true that scientific progress needs math to happen. But that doesn’t mean that if you just automate math, you’ll suddenly be inventing tons of new exotic technologies that are going to increase economic growth by a ton. I think people often neglect the important capabilities that humans have that enable them to be competent economic agents, that enable them to do most jobs, because it just looks normal to us. Almost every human has these capabilities to some extent. Some humans have more.
I mean, for some of these capabilities, let’s say mathematics capabilities, that’s impressive among humans because only a small fraction of humans have these capabilities, that they can prove mathematical theorems and understand higher mathematics.
Moravec’s Paradox and the Agency Challenge
So in some sense if everyone could do that, like if it was a routine task, it wouldn’t feel impressive to us because everyone would have it. So then I think that we have this bias towards thinking that the tasks that are important for AI to do are the tasks that we assign importance to because of the fact that only a small fraction of humans can do them, which is actually in some sense the opposite bias we should have. Because the tasks that are most important are the tasks that we think of as being so ordinary that all the humans can do it, that it doesn’t seem impressive.
In fact, those tasks are also the ones that are likely to be more difficult because those are tasks that are likely to have been more optimized…
By evolution. So you’re referring to the Moravec argument.
That’s right. And I mean, I think Moravec actually didn’t explicitly make this observation, but it is something that drops out of the framework: you should expect, all else equal, that if the variance in human capabilities in a task is fairly narrow, then reaching human competence on that task will be more difficult, because it suggests that the task has been optimized. It’s not a perfect analogy, there are counterexamples, but as a tendency, I think it’s probably true. So, for instance, if you look at people’s ability to play chess, there’s a very good way to quantify this, which is you can look at how many orders of magnitude of compute scaling and software progress it took to go from a medium club player, like a 1200 Elo player, to the level of a world champion, like 2800 Elo. And this took something like five orders of magnitude of compute scaling, like 100,000x. Right. So that’s a huge difference. Imagine you’re looking at something like the energy efficiency of the body, or how much food a human needs to eat in order to survive. It’s just unimaginable that on any such metric, you would have five orders of magnitude of gap between the worst and best humans.
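The chess figures quoted here can be turned into a rate. The sketch below takes the 1200 and 2800 Elo endpoints and the five orders of magnitude at face value and treats the scaling as combined hardware and software ("effective") compute, as described; the assumption that Elo rises linearly in the logarithm of compute is an illustrative simplification, not something claimed in the conversation.

```python
from math import log2

club_elo = 1200
champion_elo = 2800
orders_of_magnitude = 5  # effective compute scaling quoted above

compute_ratio = 10 ** orders_of_magnitude
elo_gap = champion_elo - club_elo

# Assumes Elo grows linearly in log(compute) across the whole range.
elo_per_order_of_magnitude = elo_gap / orders_of_magnitude
elo_per_doubling = elo_gap / log2(compute_ratio)

print(f"Compute ratio: {compute_ratio:,}x")
print(f"Elo per order of magnitude: {elo_per_order_of_magnitude:.0f}")
print(f"Elo per doubling of compute: {elo_per_doubling:.0f}")
```

Read this way, the human skill range in chess spans roughly 17 doublings of effective compute, which is the sense in which the gap between club players and champions is unusually wide.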
So you’re making some point that the efficiency here might not be. Or somehow by scaling these systems, we’re not being very efficient…
Oh, no. That’s not my… Okay, so my point is, in chess, we have this very wide gap of ability, which suggests that chess playing has not been optimized very much among humans, because humans don’t vary that much in their use of compute or energy, and their brain size doesn’t vary that much. So how come some humans are like five orders of magnitude worse at chess than other humans? Clearly that’s because chess playing is not a capability that has been optimized for in humans. Right? So there can be just some random variation that’s maybe genetic, maybe environmental, cultural, whatever, that just makes some people into much better chess players than other people. But we don’t see this for other things like sprinting, for example, running, or…
Just intuitively, it doesn’t seem like there’s that much of a difference between the way that people can just manipulate objects. They can hold things up, use a fork, write with a pencil. Obviously there will be differences in particular narrow instances. Maybe some people have practiced juggling, and so they can obviously pick that up and show off that skill in a way that people who haven’t practiced can’t. But in terms of the most useful cases, I think that a reasonable benchmark for AI in this case would be whether you can assemble IKEA furniture or something. It’s not the sort of thing that you could do with just factory robots. You really have to move all these things and make sure the screws get screwed in perfectly correctly. And I think that’s the type of thing where effectively most humans, I won’t say all humans, but most humans just have this ability to do it. And there’s not much skill level to it. Maybe some people obviously can do it faster than others, but even that is probably more like some people just get tired of doing it, rather than being so much better at the manipulation of screwing in these screws and hammering things. So I do think that is the case, and that is a skill where I think there is a strong case that evolution would have optimized us for being very good at this particular type of general object manipulation, which definitely supports the Moravec argument.
That’s right. Yeah, I think that’s right. And the reason for my relative pessimism is because I think we haven’t seen… I don’t think we have a clear trend that we can extrapolate where models are just becoming… So, for instance, you can look at how sort of “agentic” or how good at these kinds of tasks were models five years ago, and then you can look at how good are they today. And yeah, they’re better at it. I don’t know, like Claude is able to get out of Mt Moon (Pokemon) after 72 hours or something like that, which is just better than what models could do before. So there’s clearly some improvements, and I think we should continue to expect improvement. But I think it’s nowhere near the pace of improvement we’ve seen on math. If you look at five years ago, the competence of models at math and you look at their competence today.
It’s night and day.
Yeah, it’s just such a vast difference. In math or programming, basically, the models went from being worse than even a 5th percentile human programmer, counting from the bottom, to now being among the best humans in the world in competitive programming. So they crossed this vast range. But not in these capabilities of agency and sort of proper long-term planning and execution and what you might call common sense, which people used to take to mean just knowledge about the world, but that’s not really what it should mean. Models know a lot of things about the world, much more than most people do. So for instance, Claude playing Pokemon is a good example, because in the abstract it knows exactly what it’s supposed to do: it’s seen so many tutorials, and there are so many walkthroughs on the internet for Pokemon Red. It just knows what it’s supposed to do. If you were a human and you got stuck at that part of the game, you could probably just ask Claude, what am I supposed to do here? And then it will probably give you a pretty good answer.
Well, in fact, I was thinking about this very recently because I’ve been trying to get Operator to work, doing routine tasks on the computer for me, and I found that it just gets stuck in these particular loops that from a human perspective are extremely dumb. It’ll just continuously keep looking at the same content over again, going in a cycle. It won’t be able to extract the right information from a page, or if it gets stuck on some sort of landing page, it doesn’t know how to click next. Or maybe because the button doesn’t say next, it doesn’t know which button you have to click in order to get to the next page. But I would think that if you just ask an LLM, I’m looking at this page right now and I just want to know how do I navigate to the next page in order to find the information I’m looking for, the LLM could actually give me a reasonable answer for how to do that. But for some reason, even though it has this explicit knowledge of how to do it, it’s not able to apply that explicit knowledge. There’s some sort of gap between those two things and I find that very interesting. I don’t know exactly what explains that. That is one reason I think that I’m not in the camp that thinks that we’re just going to have these amazing economic transformations in the next five years. I think that’s more of a thing in the 20s and 30s. I still want to get your sense on these things though, because we’ve talked a little bit about why capabilities seem to be lagging. The models are stupid in various ways, even though they’re very smart in some ways.
Well, I don’t think that’s why they’re lagging. I think this is a description of what’s happening.
Oh, sorry, that’s what I meant. But I want to hear, after thinking of this, what makes you think: Okay, this is going to take decades or like 30 years maybe, rather than just these are just things that we need to work through that will take some time, years, but eventually we’re going to like work out all the kinks and we’re going to make sure that this…
The question is what would be the way you’re thinking about it? So from my point of view, I don’t see… So if there was a clear trend, I could extrapolate, or maybe even assume, like I do assume, that at some point the current trend is going to speed up. I think we have reasons to expect that. Probably the most important reason is that humans are able to do it, and the human brain seems like something that we should be able to match with the amount of resources that we’re going to be putting into this over the next few decades.
Well, yeah. So if you just look at projections, from Epoch even, of just the amount of compute that will be available to use to train these models by the end of the decade, it’s several orders of magnitude more than the current level. If you buy this compute model of…
So what I would say is… So first of all, conditional on the compute model being correct, I would say maybe by the end of the decade there’s, I don’t know, I would give like a 20% or 25% chance that we are able to get something that is on par with the human brain, so up to something like 10^30 FLOP, and I think the reason I’m maybe 20%…
So you actually have very concentrated probabilities here. Actually a quite substantial probability is by the end of the decade, but even though your median is 30 years away.
Yeah, so the reason my median is longer and I would have to… My median for different things is in different years. So I should be clear about that. But the reason I’m more pessimistic is because, first of all, I’m not sure if the model is correct. So one of the uncertainties I have, for example: a lot of people think of software progress as basically accelerating this trend. So I think people like Daniel Kokotajlo probably have this view that software progress is just adding another 4x of scaling every year, which is on top of the physical compute scaling as an additional thing we have. So we’re going like twice as fast as you might assume. I’m not sure if that is actually correct.
I mean, I would just say that it seems like a lot of software progress at least is caused by or is bottlenecked by just the ability to run experiments.
That’s right. So what I would say is, I think something that should give us some pause is that before say 2022 or 2023, you could have said that the reason we are not yet on par with the human brain is because the human brain is trained on more compute, it uses more inference compute, and we just haven’t scaled the models up to that level yet. Which is even true, because the human brain, we think, is maybe trained on something like 10^24 FLOP, and a couple of years ago we didn’t have models that were at that scale of compute. And it takes maybe 10^15 FLOP/s to run. It’s only now that we have individual GPUs that can provide that amount of throughput.
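The two brain figures quoted in this turn can be checked against each other with a line of arithmetic. The sketch below uses the rough values of 10^15 FLOP/s to run and 10^24 FLOP to train, and asks how long the brain would have to run at that rate to accumulate the training figure. Treating "training" as lifetime inference-rate compute is an interpretive assumption made here, not something the speakers state.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

brain_flop_per_second = 1e15  # rough "inference" figure quoted above
brain_training_flop = 1e24    # rough "training" figure quoted above

# Assumption: training compute is the running rate integrated over time.
implied_seconds = brain_training_flop / brain_flop_per_second
implied_years = implied_seconds / SECONDS_PER_YEAR

print(f"Implied learning time: {implied_seconds:.0e} s, about {implied_years:.0f} years")
```

Under that reading the two numbers are consistent with each other at about a billion seconds, roughly three decades of experience.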
So there was a resource-based argument you could have made that we just haven’t scaled things up sufficiently. But I think now it is harder to make that argument, in the sense that you could still say if only we just kept scaling, eventually we would reach something with that competence. But you would have to concede the fact that the way we are doing things seems to be much less efficient. Our software is much worse. So there’s clearly a big software efficiency gap there. I don’t know how to think about the rate at which we are making progress towards something like the human brain. And things like reasoning models suggest to me that scaling a new paradigm is probably going to be the way that these agency and whatever obstacles are actually solved. It’s not going to be solved by training GPT 6 or 7 or 8. It’s going to be more like people are going to find, as they scale compute, some new thing to do, and then they’re going to scale up the resources that are invested in that, and that is going to give results. That’s sort of how I expect it to be resolved.
But because I lack this clear trend to extrapolate, I just fall back on things like this evolutionary prior. How hard might this task be, given how much useful computation could have happened in evolutionary history? And then I just have some fairly uncertain prior conditional on us not yet being there. And when I do that I end up more pessimistic, just because there are a lot of pessimistic outcomes in the tail, where 10^30 or 10^31 FLOP, whatever, is not going to be enough. There are outcomes where a lot of tasks might just be bottlenecked by the fact that it’s hard to get cheap real-world data. Evolution might have an advantage because it had billions of years to optimize things in the real world, while we are usually operating under much tighter budget constraints for time and data.
So what would convince me? If I just saw a clear rapid trend of improvement on some kind of agency benchmark or whatever that I could trust. Maybe if models suddenly started getting much better at playing Pokemon games, first of all maybe even just the ones that exist. But it would be even more impressive if they started playing ones that are out of distribution, that they don’t know the walkthrough for, so they’re supposed to figure out how to play on their own. Then that would be a bigger update for me, if they suddenly started becoming competent at that sort of thing, than even things like revenue growth. I think that’s just a big bottleneck for me that I’m not seeing progress on.
Okay, well, I guess so. The way I think about it, I was going to think of challenging your points in that way. So there’s various arguments that people are giving for short timelines. We’ve discussed the compute argument, for example. There’s just going to be a lot of compute scaling. It sounds like you agree with that, but only to some extent. You assign 25% credence that we will get something on par with the human brain in this general sense by the end of the decade. But there’s other arguments that people have given.
To be clear, I said 25% conditional on the compute centric model being basically right.
Oh, actually, so your real credence is significantly lower. Okay. Well, how much lower? I’m curious.
Maybe 15%.
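The gap between the 25% conditional figure and the 15% overall figure implies something about how much weight is placed on the compute-centric model itself. The sketch below backs that out with the law of total probability. The probability of reaching the milestone when the compute-centric model is wrong is not given in the conversation, so the values used for it (0 and 0.05) are hypothetical, as is the function name.

```python
def implied_model_credence(unconditional, given_model, given_not_model=0.0):
    """Solve unconditional = p * given_model + (1 - p) * given_not_model for p."""
    if given_model == given_not_model:
        raise ValueError("conditional probabilities must differ to identify p")
    return (unconditional - given_not_model) / (given_model - given_not_model)


overall = 0.15        # "Maybe 15%"
if_model_right = 0.25  # "25% conditional on the compute centric model"

# Hypothetical: no chance of the milestone if the model is wrong.
credence_zero_fallback = implied_model_credence(overall, if_model_right, 0.0)

# Hypothetical: a 5% chance of the milestone even if the model is wrong.
credence_small_fallback = implied_model_credence(overall, if_model_right, 0.05)

print(f"Implied credence in the model (fallback 0%): {credence_zero_fallback:.2f}")
print(f"Implied credence in the model (fallback 5%): {credence_small_fallback:.2f}")
```

Depending on the assumed fallback, the stated numbers correspond to something like a 50 to 60% credence that the compute-centric model is basically right.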
Okay, yeah, I would put it higher than that, but it makes sense because of the fact that I’m just more optimistic.
Missing Capabilities for AGI
But okay, so there are other arguments that people have given in addition to the compute argument. I think there’s one argument that kind of seems valid to me. It’s very vague, but people will sometimes say that they can now see how AGI will be built, they see a clear pathway. And I think one way of interpreting this is that there are general categories of abilities that we need to solve before getting to the point at which AIs can do everything, and they can see that in the past 10 years we’ve made a ton of progress on just solving categories. It seemed like game playing, like Go and Starcraft or whatever, just fell in the late 2010s. And then natural language processing fell in the early 2020s. And now math and reasoning are just falling in the mid-2020s. So if you just look at this trend, if you have this model where there are only N domains, and once all of them fall, we’re there, then if you think that there are maybe only 3 or 4 domains left, this would indicate that we’re maybe only 10 to 15 years away, if you just extrapolate that.
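The extrapolation in this turn is a constant-rate calculation, sketched below. It takes the three domains named (game playing, natural language processing, math and reasoning) as having fallen over roughly 10 years and assumes the remaining domains fall at the same calendar rate. The constant-rate assumption and the function name are illustrative; the argument itself is described as vague.

```python
def years_until_all_fall(domains_fallen, years_elapsed, domains_left):
    """Years remaining if domains keep falling at the same calendar rate."""
    domains_per_year = domains_fallen / years_elapsed
    return domains_left / domains_per_year


fallen = 3      # game playing, NLP, math and reasoning
elapsed = 10.0  # "in the past 10 years"

for left in (3, 4):
    remaining = years_until_all_fall(fallen, elapsed, left)
    print(f"{left} domains left: about {remaining:.0f} more years")
```

With 3 or 4 domains left this gives about 10 to 13 years, matching the "10 to 15 years" range. The result is only as good as the constant-rate assumption, which depends on how quickly the inputs behind past progress keep growing.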
I think it’s not that optimistic though, because over the past 10 or a little bit more than 10 years, our compute scaling has been quite fast and we’re not going to be able to keep that up as the end of the decade approaches.
Well, okay, so that’s a good point. Yeah, I would say that this. I was sort of trying to distinguish this from the compute argument, but it’s a very reasonable point to say. I guess then in this case it matters to the extent that you think that there’s only a very few domains left. So you might think, as you just mentioned, there’s agency. It’s just maybe that’s just a domain that at some point will just fall as quickly as natural language processing and mathematics fell. And then maybe there’s just one more after that, which is something like multimodality or maybe even just robotics or something. And so then if there’s only maybe 2 to 3 left, then at that point, how can you be so confident that those 2 to 3 won’t fall in our current massive scale up of compute?
I mean, I guess the question is how massive is the scale up we can expect? I think it’s not actually that massive.
So, yeah, how much have we scaled over the last, if you’re willing to concede maybe that three domains have fallen in the last 10 years, something like.
Nine orders of magnitude.
Okay, so then nine orders of magnitude would be quite a lot more than our current level.
That’s right.
So actually that’s a very good point. Yeah. So then, okay, well, what about just agency itself? If agency just falls but nothing else happens, it’s plausible that agency alone would unlock a lot of value.
My question is would the next thing to fall be agency or would it be something else? I mean, it might just be that multimodality falls before agency does. It seems unclear to me. I think it’s reasonable to expect, and I would agree with the statement, that before the end of the decade we’re probably going to see another thing fall that is on par with math or complex reasoning. So there will be another thing like that maybe in a few years. I think that’s reasonable to expect. But what will it be? My guess is probably not agency. My guess is probably it’s something else. So it might be, for example, long context performance, but it might not be agency. It might be genuine multimodality, it might be something else. It’s just hard for me to anticipate. So I wouldn’t want to…
Basically I think my view is there are probably more things, I tend to think that almost by default you should sort of assume that it’s more things. Because if you just look at the economic impacts of all the things that AI has been able to do so far, it is relatively small. But there are so many different things you do in the economy. You could have a model where almost all of that is downstream of one or two capabilities that humans have over AIs, but I’m not sure how plausible I find that. And it feels to me like the more we sort of zoom in on intelligence, the more it looks like there are tons of different competences that are a part of it. And I think we only realize this as AI continues to make progress.
There are things where people before might have thought, oh, if an AI system can do X, Y and Z, then that’s just sufficient for it to be transformative. And now we are learning: no, actually not. Because actually there are other capabilities that we just weren’t paying much attention to. And now they look a lot more salient, because now AI systems are becoming bottlenecked on them while previously they were not. Previously they were worse at a bunch of other things as well. So, for example, Yann LeCun has these common sense questions: if I put this object on top of that object and if I do this manipulation or whatever, what would happen? And well, models are just becoming increasingly good at answering that kind of question. But clearly that was not the decisive bottleneck, because even though models are becoming better at that, and even though that is associated with revenues going up, it is not associated with the kind of transformation you would expect if models were at human competence. So I think that’s another reason why I’m pessimistic when it comes to this view that there’s one or two things left and then we’re going to do them. I think that particular way of thinking about it doesn’t have a good track record, because you could have said the same thing before. Imagine you were in 2010 and you say, well, there’s one or two things that we have to figure out, which…
There was this common framework that I think people analyzed, which was the idea that certain domains were AI complete. So the idea being that if you solve this domain, then you’ve cracked it all, analogous to NP complete or whatever. So you crack one, it cracks the whole thing. And I think that that isn’t a framework that people find particularly enlightening anymore.
Yes.
Which. So, okay, but one way of thinking about it is like, well, the reason why these systems haven’t had that great of an impact is just because you need all of them together before they’re sort of explosive. So maybe the effect will be a little more sudden or something. There isn’t that much, you know. So once, I mean, I agree there’s only a few things left, but then once they’re unlocked, then it’ll be like this huge value that’s unlocked and we’re actually quite close. In other words, you can’t just continuously extrapolate the impact they’ve had and then just assume that extrapolation will continue…
I agree with that.
Okay. Because I honestly think. I don’t know if I would have agreed at the time with Yann LeCun’s statement that that’s the thing bottlenecking value. In terms of what’s bottlenecking value, I think probably the most important thing is just the fact that these models are not capable of maintaining a coherent world model over a long time horizon in a very basic sense. I mean, if you have a conversation with an LLM over even just a few hours, then it will just start forgetting things and it won’t make any sense. It’ll forget very basic broad-stroke details of the conversation that you’re having. Not just specific things, but things that make it so that it almost feels like it doesn’t have an understanding of the full context of what you’re talking about.
I would think that this might be something that falls in the next few years. It’s possible. I mean, part of what gives me pause, of course, is that I think people have been saying that this will fall for a while and that it just never does. Or it’s kind of falling, but definitely not in the way that some people were predicting. I think that even though that gives me pause, I still think that even if this was unlocked, I think it’d be a huge deal. I think this would unlock a very large amount of value even without any multimodality or anything. Just because this would allow you to use the models to take on tasks in the real world. I mean, the fundamental thing about work is that it’s usually something where it’s an ongoing process. There’s very few jobs that just require someone intervening in a small aspect of a process and not needing that much context in order to do a thing. Most jobs, at least in the United States economy, require an ongoing context, a lot of back history onboarding and knowledge of what these specific things they’re doing. So I think if we just solve that one component, that could be a very large transformation.
I agree. And I mean, that might just, I would assume that could be easily trillions of dollars per year of revenue, maybe more.
But it’s not like explosive growth level.
It’s not. Yeah, I think that. So to me it looks like the lack of common sense and lack of agency and ability to execute plans is a different competence than maintaining coherence over long context. AI is bad at both right now, but I think they are different, because there are examples I’ve seen of tasks given to LLMs that are not long context tasks. They’re actually short context tasks. But the LLM still does very dumb things when it’s trying to solve the task. So one example is, I saw this from Colin Fraser, I don’t know if you know him. He’s on Twitter, and he does a bunch of stuff on the limitations of LLMs. He set up this command line environment with a few commands where the AI can list files in a directory and check the content of different files and so on, and he would give the LLMs puzzles in this environment. So it would say, for example: in this directory you have 10 files. They are numbered from 1 to 10, one of them contains a password and the others are empty. Actually maybe it was 16 files, it doesn’t matter. He would say, if you check a file and it’s empty, then all the files that have a smaller number than it are also empty. And then it would say, you have a certain amount of money, $10 or something, and each command call costs you $1, and you’re supposed to end up with the most money. Find the password. Okay? And I think it is a very simple observation that, based on these rules, the file that has the password has to be the last file, because there’s no other possibility. But even reasoning models that are supposed to be very good at complex reasoning just can’t identify that this is the case. They just make random command calls and check random files until by chance they happen to check the last one, and that has the password.
That just gives me pause, because that is not a long context ability. It is a reasoning task, and it is a kind of reasoning task that’s maybe more realistic, in the sense that in the real world most reasoning you have to do is not solving complex math problems or LeetCode. It’s not like that. It’s actually more like this, where there’s some simple thing you have to figure out using reasoning, here that it is just the last file. Maybe if you present it as a puzzle in the right way, it would be able to solve it, but it’s just unable to do it in this context.
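The deduction in this first puzzle can be checked by brute force. The sketch below only illustrates the rule as it is described in the conversation; it is not Colin Fraser’s actual environment. The function name `consistent_password_positions` is made up for this example, and it assumes files are numbered from 1 and exactly one of them holds the password.

```python
def consistent_password_positions(n_files: int) -> list[int]:
    """Return every file number where the password could sit without
    contradicting the rule 'if a file is empty, all lower-numbered
    files are empty too'."""
    consistent = []
    for pos in range(1, n_files + 1):
        # Hypothesis: the password is in file `pos`, so all others are empty.
        empty_files = [i for i in range(1, n_files + 1) if i != pos]
        # The rule is violated if some empty file has the (non-empty)
        # password file below it.
        rule_holds = all(
            lower != pos
            for empty in empty_files
            for lower in range(1, empty)
        )
        if rule_holds:
            consistent.append(pos)
    return consistent


if __name__ == "__main__":
    print(consistent_password_positions(10))
    print(consistent_password_positions(16))
```

Under these assumptions, only the highest-numbered file survives the check, whether there are 10 files or 16, so a single command call is enough.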
Or another example is the same environment again: you have a bunch of files. This time a random one contains a password, and if you find the password, you get like $3. There are 10 files, and each time you check the content of a file, it costs you $1. The question is, what do you do? For a human who is somewhat good at reasoning, it’s obvious that you should just not do anything, because the expected number of calls to find the password is like 5. So you’re gonna be spending on average $5 to find a password that’s worth $3. So you should just not play. But LLMs usually just execute commands, oh, let me check this and that, and they lose a bunch of money for no reason. So these are very simple examples, and I think for any individual thing that you point out like this, it’s possible that people are going to take action to address specifically that kind of task.
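The expected-value argument in this second puzzle can be worked through exactly. This is a minimal sketch under stated assumptions: the password is equally likely to be in any of the 10 files, you have to open a file to read the password, you never re-check a file, and you may stop at any point. The names `expected_checks` and `best_expected_profit` are hypothetical helpers for illustration.

```python
from fractions import Fraction
from functools import lru_cache

REWARD = 3  # dollars paid for finding the password
COST = 1    # dollars charged per file check


def expected_checks(n_files: int) -> Fraction:
    """Expected number of checks if you keep opening unchecked files
    until the password turns up (uniformly random location)."""
    return Fraction(sum(range(1, n_files + 1)), n_files)


@lru_cache(maxsize=None)
def best_expected_profit(files_left: int) -> Fraction:
    """Expected profit of the best policy when `files_left` unchecked
    files remain, given that quitting (profit 0) is always allowed."""
    if files_left == 0:
        return Fraction(0)
    p_hit = Fraction(1, files_left)
    keep_going = (
        -COST
        + p_hit * REWARD
        + (1 - p_hit) * best_expected_profit(files_left - 1)
    )
    return max(Fraction(0), keep_going)


if __name__ == "__main__":
    print("expected checks:", expected_checks(10))
    print("profit if you search to the end:", REWARD - expected_checks(10))
    print("best achievable expected profit:", best_expected_profit(10))
```

Under these assumptions the exact expected number of checks is 5.5, slightly above the "like 5" quoted in the conversation, so committing to the search loses $2.50 on average. Even with the option to quit partway through, the best expected profit from 10 files is zero, which matches the conclusion that you should not play.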
But they’re not going to be solving this more abstract problem of solving these common sense reasoning issues.
Yeah, because people can always… When I look at the models, sometimes people talk about this framing that in 2019 the models were as good as toddlers, and later they became as good as high school students and undergrads and PhD students. To me, that’s clearly not how the development is progressing. It’s more like there were some puzzles that would trip them up before, and now a lot of those puzzles are solved. But now there are some different puzzles that aren’t really any more advanced. For a human, they’re at the same level of difficulty, but not for an AI. For an AI, there’s something different and they just can’t do them. So then I’m not sure if I can just extrapolate. They are solving more and more puzzles, but how do I extrapolate that to when they are going to be able to solve everything? I just don’t feel confident in the ability to make an extrapolation, which is why I just fall back on, well, humans seem to have this capability in principle. It can’t be that difficult to reach this if humans have this capability.
But I don’t think we actually have a clear path. I think the closest thing to a clear path we have is that we continue scaling, but it’s probably not the same thing. We just have more and more compute, and more compute enables us to run more experiments and discover new algorithmic innovations that might be relevant, that might impinge on this, and over time that is going to enable us to solve the problem, but that is a very abstract and vague roadmap. I do believe it’s going to work, but it’s not the kind of roadmap where you can say, we just have this architecture, we just scale it up by like 10x-100x and then that’s going to… it’s not that straightforward.
Beating Benchmarks vs Being Useful
Right, well okay, let me challenge you then, because I think we both have the same underlying model here, but we just have different parameters in the model or something. I think that we both agree that our probability mass for economic transformation is sort of front-loaded, in the sense that I think there’s a relatively high probability of it happening soon, I would say in the next 10 years, relative to like... That’s right.
Yeah I agree.
And this is because if you just extrapolate current trends in compute scaling, it’s not really possible to continue them after 10 years at our current rate, unless the economy itself is producing compute at a faster rate than it probably currently can, based on just the ability of actors to invest and the historical rate at which compute is getting less costly per year. Anyway, I think we agree on that underlying model. I think I’m just more impressed with what I’ve seen. You talk about these reasoning tasks that the LLMs are still kind of dumb at, but in the last few years I felt like everyone has been proposing, I mean you even gave the example of Yann LeCun giving these things like “haha, aren’t these LLMs so dumb? They just can’t do these reasoning tasks.” It almost seems like anytime someone comes up with some sort of benchmark, it just gets solved extremely quickly. People gave this GPQA benchmark, and it’s these graduate-level physics, chemistry, and biology problems. Then within one year the models went from, I think, little better than random performance to doing better than the graduate students who were hired to take part in the study to get the expert baseline on GPQA. It’s not just GPQA, it keeps happening over and over again. I think this just seems inconsistent with the idea that we’re not making good progress on this general sort of reasoning, and I think your framing that we’re just patching narrow issues doesn’t seem exactly right.
For GPQA, I don’t think that’s right.
Well yeah, but just in general it seems we’re genuinely making progress on just getting the models to have a better common sense understanding of the world. Now again, I don’t think that we’ve made that much progress on long term coherence, but I was trying to make a distinction between that and the problems that you were talking about. The problem that you brought up with Colin Fraser, for example. I just felt like this, not just that problem, but the general class of problems seem like those are the types of things that I expect to be solved within a few years.
I don’t actually agree with that. I think it’s going to take longer than that.
I mean, there’s a question too, of like in a few years, will there be something that LLMs do or whatever comes past LLMs, maybe we won’t keep talking about them as if they’re just LLMs, but is there something that these models will be dumb at? I think the answer to that is yes. They’ll just have some defect, but I think that’s quite a different question of whether there’ll just be some area in which they’re defective.
No, that’s such a weak claim. You might have a person who is dumb at doing or defective or incompetent at doing certain things, but that person is still very economically useful, usually just because they’re bad at one thing. But the way LLMs are bad at one thing, that’s not at all the same thing.
No. I’m not sure, though. It does seem like sometimes people, the way that they critique the LLMs is they’re searching over a bunch of things and they come across one thing that they’re bad at.
I agree, but that’s not even the important critique. The important critique is that people are just finding examples of, because there is this visceral sense that on one hand LLMs look like they’re very smart, and on the other hand they’re very useless. So you have to…
In a relative sense. I feel like that should be clarified. I do think you would agree that they’re useful in a conventional sense.
Sure. Just like not in the sense of transforming the…
Yeah.
They’re not as useful, nowhere near as useful as a human would be who has that level of knowledge. So then something is clearly going wrong, or something is missing, maybe not going wrong. Then I think people just have that sense and they’re trying to point to the thing that’s missing. And I think this is actually surprisingly difficult. The competences that are missing are not ones that we are used to measuring. So if you look at how we interview people for jobs and so on, we’re just so used to testing for capabilities that actually show a lot of variation in humans. We’re so used to thinking of intelligence in those terms.
Well, this gets back to the point that the reason why it’s so impressive to do math is because only a small fraction of people can do math. Right? Then the point of these LeetCode interviews, for example, is that being good at the LeetCode interviews is just highly correlated with doing these long-term coding projects, but of course the claim is not that doing a LeetCode question requires the same skills. In fact, people famously criticize programming interviews for this: “Well, that’s very different than what you’re actually going to be doing day to day in a programming job. So why is this what we’re testing?” Well, the reason why we’re testing it is because the way that humans work is that these two core abilities, which are not strongly logically connected, just happen to be correlated among humans. So yeah, I definitely agree with that.
In fact, even in LLMs, I think there are some interesting signs where the models that are the best at competitive programming are not the models that are best at conventional software engineering work. I think the best at competitive programming are models like (OpenAI) o3, while the best at sort of conventional software engineering is like (Anthropic) Sonnet, even though it’s worse at competitive programming. I mean, if you look at the human distribution, that’s also to some extent true. The best competitive programmers in the world are not the best general sort of software engineers in the world. Yeah, I agree with all of that.
So the correlation might break at the tails for both humans and AIs.
That’s right. So the point I’m trying to make is that again, I just keep coming back to this. I just don’t see a trend of improvement that’s rapid enough that you should have a median of 5 or 10 years. I just don’t think that the improvement right now is… So that doesn’t mean it can’t happen, because, as we’ve seen for other capabilities, you might see an emergence or a sudden acceleration in capabilities, which I do expect to happen at some point. But the question is, if you can’t rely on the current slope, then you have to have some kind of belief about, okay, when can we expect a trend break? When can we expect this thing to suddenly emerge and the models to suddenly become competent?
There I think there are different ways you can think about it. One way is what you mentioned, where there’s these three or four things that we got over nine orders of magnitude of scaling. So we have maybe a little more than three orders of magnitude of scaling left until the end of the decade. Maybe that gives us one more thing. Seems plausible to me. Is that going to be enough? I don’t think that’s going to be enough. Then it slows down, probably the scaling is going to slow down, it becomes more drawn out. So I would agree that if you’re asking for the single five year period in which this is most likely to happen, then I think the next five years is probably the one that looks best, just because we will be scaling so much faster. I think that there’s a lot of truth to that. And I’m not even sure how to update on systems like GPT 4.5, right?
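The back-of-the-envelope reasoning here can be written out as a proportional extrapolation. This is only a sketch of the arithmetic, assuming that capability domains fall in proportion to orders of magnitude (OOM) of compute, which is a simplification and not something either speaker commits to. The inputs (9 OOM so far, 3 to 4 domains, roughly 3 OOM left this decade) are the figures quoted in the conversation, and `extra_domains` is a hypothetical helper name.

```python
def extra_domains(domains_fallen: float, oom_spent: float, oom_remaining: float) -> float:
    """Linear extrapolation: domains expected to fall from the remaining
    compute scale-up, at the historical domains-per-OOM rate."""
    return oom_remaining * domains_fallen / oom_spent


if __name__ == "__main__":
    for fallen in (3, 4):
        estimate = extra_domains(fallen, oom_spent=9, oom_remaining=3)
        print(f"{fallen} domains over 9 OOM -> about {estimate:.2f} more from 3 OOM")
```

With these inputs the estimate lands between one and about 1.3 additional domains, which is the "maybe that gives us one more thing" in the conversation.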
Well, it’s just one data point. I mean I noticed a lot of people were saying, oh, pre-training is ending. People were like…
I don’t think it’s ending, I think it’s more like….
Yeah, I mean I was going to say that. I was going to say I’m not sure how much we can really update on one data point.
No, I mean even if you want to update on one data point, I’m not sure. I just expected better performance from 4.5 than what we actually got. I think the actual model was kind of less impressive, and it was also larger and more expensive to serve than I had expected. I was expecting something a bit smaller, so it was definitely a negative update for me. It was not a huge negative update, because I don’t expect pre-training to get us to AGI anyway. I don’t think that’s going to happen. I think pre-training is going to be one part in a complex, large pipeline which gets us to AGI. It’s just going to be one thing that goes in there, just like reinforcement learning for reasoning is another thing, or instruction following is another thing. These things that are called post-training, I think, are just going to keep expanding until they become bigger than pre-training, and then we’re going to be upset that we used such terrible names for these phases, because I think that’s what’s going to happen.
For agency, for example, I think it’s very plausible, I don’t know, more than 80% likely, that what we are missing is data. Suppose someone from the future just gave us a trillion tokens of data of models trying to execute plans, whatever the equivalent of reasoning traces is. Imagine trillions of tokens of reasoning traces, except that instead of reasoning, it’s making or following plans and doing work. I think that could easily be enough to turn the models we have now into like… But we don’t have that data, we have to generate it, and we don’t have a clear way of doing that. So we have to figure out how to generate it, we have to figure out how expensive it is going to be, and it’s just very hard to say.
I mean, I’m curious then, because I feel like there’s one clear pathway here, which is quite expensive obviously, but you can just record people.
But would that have worked? Like for instance, if we had done that for math, would that have worked? Would that have given the models the ability to solve math problems?
I mean, I think it might be in a similar way with reasoning models, which gives it just enough so that you can then do reinforcement learning on it. And then once RL gets traction, then the rest can be solved automatically because then you can just do these rollouts and then you can just store the rollouts that were successful at completing the tasks and so then the rest just gets solved automatically by training recursively on these rollouts.
Yeah, if it gets solved like I think that’s probably the most promising way I see of doing it right now.
AI Excelling in Some Tasks While Struggling with Others
I mean, I think a large crux throughout all of this is just how much you expect transfer learning to be successful. If someone had the plan, prior to pre-training, that we’re just going to have a bunch of data of people doing work tasks, I would have predicted that just wouldn’t be enough to yield an agent that would be able to do the work tasks, because it wouldn’t have sufficient broader world knowledge in order to… Like maybe you could get it to work in the same way that we were getting agents to work in Starcraft, which is that it can just do this very narrow thing that you have a ton of data for, but not in a more general sense. So I think that a large crux here, not just for agency but also for reasoning, is how much you expect that there will be transfer learning between pre-training and the other data that we’re going to be training on, so that all of these other things just get patched, so you don’t need a hundred trillion tokens, you need quite a lot less than that.
I honestly think that for reasoning, in fact, the way the reasoning models’ competences are distributed suggests that the pre-training is very important, because I think the thing they’re best at is piecing together things that they know in a relatively simple way. I think we have seen this for FrontierMath. In fact there was a thread, I think yesterday, by Daniel Litt where he was talking about how one of the difficult problems that he proposed got solved by (OpenAI) o3-mini-high. If you look at how it solves it, it’s not actually that good at proving any of the individual steps, but it just knows so much stuff that a human would not know. So a human would find it very difficult, because you need to know some obscure results from 1850 or something. Unless you’re an expert in that field, it’s just very hard. But these models have been pre-trained on so much data that they just know so much stuff, and the RL is able to elicit those capabilities, able to get them to piece together what they know into an answer. I mean that’s impressive, but it is just a different kind of capability than, say, the kind of reasoning that I was talking about before, which is the kind of reasoning…
Intuitive reasoning or something that isn’t just relying on knowledge.
Think about the kind of reasoning that you have to do when you’re playing a Pokemon game. Okay? It’s just so counterintuitive to humans that you can be so good at this sort of piecing together different lemmas and so on, and then so bad at things like… So for instance, there’s a very common thing that LLMs do when they’re interacting with a game environment and something happens that is outside of their expectations. Maybe they are trying to get out of a cave in Pokemon and they can’t figure out how to do it. Maybe they go through one door thinking that it is a different door, so they end up in a different place. Instead of saying “Okay, I made a mistake, I have to correct my model of the situation”, they’re like “Oh, the game is broken”. Yeah, it just seems so bizarre that you would fail on such a simple reasoning task. Obviously the game is not broken, right? Again, I don’t have a good explanation of why that happens. It just seems like it’s much harder for some reason to get the models to do that. But I do want to distinguish between the kind of reasoning that reasoning models seem to be good at and a totally different kind of reasoning that is actually much more important for people in their daily life.
Yeah. Again, I would characterize this as a crux rooted in your optimism about transfer learning. So whether we’ll have this very shallow form of transfer learning which might be sufficient to just apply one’s vast knowledge or apply the LLM’s vast knowledge of the world in cases that are surprising to humans but wouldn’t work out of distribution or wouldn’t work in some sort of situation in which you can’t just rely on something that people have done before.
So an interesting thing people have pointed out is that how surprising it is that you have these language models that have such vast knowledge, plus now you have the reasoner capabilities that have been added on top of that knowledge. And despite this, there isn’t a single instance that I’ve seen of an LLM that comes up with an original definition in math that seems interesting. So this is a very different problem from proving a difficult theorem. I mean, it can be part of proving a difficult theorem that you have to introduce some conceptual machinery or whatever that might be interesting in its own right. That’s often how a lot of difficult theorems got proved. But I don’t know, it just seems so bizarre because for a human you might say, well, yeah, a human can’t really do that.
Because a human has such a limited amount of knowledge, they don’t have the ability to see all of these different interactions between different domains. But an LLM clearly doesn’t have that bottleneck. And it clearly also doesn’t have the bottleneck that it just can’t do math. It’s obviously able to do math. So then why is it unable to come up with a definition or some abstract object or whatever that would be interesting to investigate? I’ve never seen a mathematician use a model and say, oh, it’s proposed this definition, which just seems like an interesting thing.
Well, it almost seems like what’s broken in it is that it’s almost more like a database than something that’s able to make connections between its knowledge.
So it is able to make connections in a narrow sense, like when it’s solving a problem. For instance, maybe to solve a problem you need three or four lemmas put together, all of which it knows, and then it is able to put those together to solve the problem. So maybe that’s because it has been trained to be able to solve problems and so it is able to do that kind of merging or interaction whatever between the things it knows, but it hasn’t been trained to sort of be creative or come up with original ideas. And so it is unable to do that.
And that’s sort of more of an argument for the narrowness of transfer learning, in the sense that just training a model to be good at piecing together things to solve problems doesn’t also get it to be good at piecing together things to come up with original new ideas or concepts.
Yeah, I mean, so it’s almost like I’m agreeing with every point you’re saying, and then I still just come to this intuition that, prior to LLMs, I would not have been optimistic that we would have been able to get to the current level of transfer learning, in just being able to do ordinary language tasks. Even though they have all of these limitations, there is just like a subscription in my view, or at least relative to my prior state, looking at the state of the technology, I expected them to just have a much harder time being able to do these very fluid interactions with users, having complex in-depth conversations with me about various topics in a way that demonstrates that they understand exactly the points I’m making, and that they’re not just pattern matching me to a conversation that they had in their training data. Because I would find it highly unlikely, for many of these conversations that I have with the LLMs, that they had any conversation like that in their training data.
Economic Impact of AI vs the Internet
So from my perspective, I just feel like everything you’re saying is like, yep, limitation, limitation. And then I go, okay, but why don’t you just expect these things to be patched? Like we just seem to be making a lot of progress and it’s just like when you extrapolate that.
Okay, how do you extrapolate that?
I agree, it’s difficult. It relies heavily on intuition. It relies heavily on how much you think is required to get to this point where they’re super economically useful.
And that’s very vague. The internet is very economically useful.
Yeah. I mean to be clear, I think in this entire conversation what we mean by economically useful is way different than what the mainstream means. I think the mainstream would be very impressed if LLMs were just as big as the internet. If they caused the acceleration in growth that was observed during the 1990s, which at the time was considered a significant economic boom. I think it was framed in a certain way, especially retrospectively, as like a brief end to the great stagnation. Then it came to the end in the 2000s, especially after the Great Recession, and then we’ve had slower growth since then. I think that if it just revived that level, like the 1990s level of growth, that would be huge from a mainstream perspective. That’s not what we mean. We’re actually referring to that as something dramatically more impressive than that, and indeed something completely unprecedented.
Yeah, yeah. So for instance, we’ve talked about a bunch of timelines here. If the timeline that, say, I was asked about is when we are going to get an economic impact of AI that is on par with what the internet had by the year 2020 or something like that, then my timelines would be a lot shorter. I would be saying it seems quite plausible that we get to that level even by the end of the decade, even by 2030, and in 10 years I would probably be well over 50%.
Well then doesn’t that require greater than say 5% growth in the next five years?
I don’t think it requires, I mean, I think it’s a bit unclear. So the current level of growth in the US is like 2% or 2.5%, something like that around that.
Yeah, around that.
It’s fairly slow.
But the entire internet before 2020, aren’t you condensing like 30 years of impact? I mean the internet’s been around for a while, so. So you’re talking about the impact that the internet had during each year since it was out or the total impact over that whole 30 year period?
Yeah, I mean I guess the question is how much economic growth did we get out of the internet? Like if not for the internet…
I mean it’s quite hard to tell. I don’t know about the internet in particular, but at least the information revolution, I mean, I think most economists think that it drove the higher growth in the 1990s in the United States. I think there was also observed higher growth in developing countries in the 2000s, which lagged behind the United States in terms of adopting information technologies. So I do think that it probably added like 1 to 2 percentage points of growth per year for a while. I’m not sure exactly.
2% per year for a while. That’s quite a bit higher than what my estimate would be.
I mean, I don’t know. It could be 1 percentage point, but the point is over a decade it’s still quite substantial.
So my guess for what we would like to see is if AI didn’t exist, maybe economic outputs in 2030 would be… I would say my median is it would be 5% lower than counterfactual or something like that. So then I’m saying it adds maybe like one. Maybe a bit less than. If you assume we’ve already gotten a little bit. If you think we’ve gotten nothing so far, then maybe 1% per year.
We have, in some sense, just a very rough heuristic for measuring the impact of AI, which is just looking at the revenue of the current AI labs.
Yeah. So the problem with that is that AI has also been used internally inside a lot of, like inside Facebook, inside Google. They use AI to do things like ad targeting, which actually probably drives most of their revenue, actually. So the impact on that.
Well, that’s not like a foundation model.
No, it’s not a foundation model, but I don’t want to do that separation because it may well be that by 2030, the relevant AI systems also look meaningfully different than the systems we have today. Again, probably less than 50% likely.
Do you think that it’ll be so different that it’s like talking about these ad recommenders versus a foundation model. Are you expecting that big of a change by 2030?
No.
Okay.
I think that’s too close by this point. And the foundation model paradigm has had enough longevity that you should probably expect it to last for another five years. So I don’t expect that level of difference. But in the end, I don’t find that to be a natural separation or something.
I mean, I do think there’s this general thing that people have done for decades, and I think people still do it, even though, in my opinion, it’s becoming increasingly obvious that it’s not a conceptually useful way of framing these things, which is, in particular, viewing general AI as just being qualitatively distinct from narrow AI. People used to say, well, what we have now is narrow AI, and then in the future we could have this thing that’s called general AI, which would be like a human. But this framing, on its face, the way that people often state it, isn’t literally false. It is true that there are narrow AI systems and that is definitely distinct from a general AI system. The problem with this framing is that it seems like there’ll actually be more continuity between these two things. AI systems will just get increasingly more general. Then for so long, people have been saying, what are your AGI timelines? And it’s like, well, what does that mean? Because there’s some threshold of generality, and then even that might not be very useful to talk about because there’ll be this imbalance in what the AI systems are capable of doing. So they’ll be very good at some things, very bad at other things. So then when’s the point? Are we going to say that on average, across all of these things, it’s as good as a human? It seems very hard to define that. Even if you did define it, it’s not clear that’s a useful thing to talk about. So even though I do concede that this framing is getting less common, and I think the discourse has improved online about this slightly because people kind of recognize that there’s this continuity, I think it’s still something people haven’t really internalized to a large extent, which is that there’s not going to be a year where we get the general AI, where it’s like, now we have it.
This is not a useful way of thinking about what things will happen.
Yeah, yeah. Another thing I want to flag is that I just gave this estimate, like maybe 5% more, as a kind of ballpark, but it could even be more than that. I mean, 5% is just a rough number I come up with as a median by the end of this decade, but I do want to point out that there is a famous paper that came out last year from Daron Acemoğlu, who tried to estimate the economic impact that LLMs and like AI systems would have in 10 years, so not by the end of the decade, more like in 2034. His number was like 0.5% or 0.6% in 10 years.
That’s like percentage point increase in the yearly economic growth.
No, that’s a percentage point increase in final GDP.
I see. Yeah. Yeah. So that’s much less than I…
So first of all, it’s like 10 years instead of five years, and it’s 0.5% instead of 5%, or maybe it could even be higher. So what I’m saying here is actually more than 10x more optimistic, even in the very short term, compared to what I think would be a mainstream view among economists. Because this paper, I think, is a fairly mainstream paper among economists.
I remember it’s cited everywhere. It certainly got a huge amount of media attention compared to almost any other economics paper on the topic.
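As a rough aid to the comparison in this exchange, the arithmetic can be sketched as follows. This is only an illustration: it assumes both figures are level effects on GDP at the end of the stated horizon and converts them to an implied average annual boost using simple compounding. The function name and the compounding assumption are hypothetical and come from neither the speakers nor the paper.

```python
def annualized_boost(level_effect: float, years: float) -> float:
    """Average extra growth per year implied by a total level effect on GDP.

    Assumes the effect accumulates by simple compounding (an illustrative choice).
    """
    return (1.0 + level_effect) ** (1.0 / years) - 1.0


# Figures as quoted in the conversation (rough medians, not precise estimates).
speaker_level, speaker_years = 0.05, 5    # ~5% higher GDP by roughly 2030
paper_level, paper_years = 0.005, 10      # ~0.5% higher GDP after 10 years

speaker_rate = annualized_boost(speaker_level, speaker_years)
paper_rate = annualized_boost(paper_level, paper_years)

level_ratio = speaker_level / paper_level   # gap in the final GDP level effect
rate_ratio = speaker_rate / paper_rate      # gap in implied yearly growth boost

print(f"speaker: {speaker_rate:.2%} extra growth per year")
print(f"paper:   {paper_rate:.3%} extra growth per year")
print(f"level ratio: {level_ratio:.0f}x, annual-rate ratio: {rate_ratio:.1f}x")
```

Under these assumptions the gap is about 10x in the level effect, and closer to 20x in the implied annual growth contribution, because the larger effect is also assumed to arrive in half the time.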
Yeah. I think the mistake that’s being made in this paper is that, okay maybe there are other mistakes as well. But I think the central mistake is looking at the current capabilities of AI and then assuming we’re just going to have in 10 years, we’re just going to have that, but it’s going to be like, a little bit better or something. It’s just going to be the same. But if you had done that 10 years ago, then you would have made totally wrong predictions about what the technology would be doing today. Because 10 years ago was the time of supervised learning. It was the era of image segmentation and image labeling and predicting who is going to pay out their insurance and who is not going to pay out.
Well, everyone thought that driverless cars would be the thing that would be unrolled in the 2020s. I mean, it is happening, but it’s being overshadowed by things that no one really was predicting.
That’s right. So I expect, as I said, even by the end of the decade, we will have probably at least one other thing that’s like complex reasoning. It just suddenly emerges with scaling and algorithmic progress and then makes a bunch of progress and leads to discontinuities on a bunch of trends, and I expect that it’s just going to overshadow in many ways the capabilities that we’re talking about now, and that’s going to come on top of the fact that the things we can do right now are also going to just keep getting better. I think when you combine that, in 10 years, probably you have two things and not just one, maybe more even than two things. If you just look at the past 10 years, it’s probably more than two things that we have seen, like two big things like that.
So it’s just a very poor way of making projections 10 years out to look at, to do this. I think Jacob Steinhardt calls this like a zero order forecast or just look at how things are today and then you just assume it’s going to be like that but a little bit better. And it’s just such a poor way. But I think it’s, interestingly, it feels more grounded to people because you’re not doing like science fiction. You’re looking at things that are happening now and somehow that feels like more “realistic” even though it’s actually just a much worse way of making predictions.
I kind of made this point earlier, but I feel like there’s a sense in which some people just have unusually short timelines because, the technology that they’re imagining is not the same type of thing that we’re imagining. I think that a lot of people have short timelines, and by short timelines let’s just say two to three years or something, and what they mean by it is that they have short timelines to a particular capability. So it is often the case, I think, especially if people are looking at current reasoning models. I think we both agree, like you have your recent article about what you expect in 2025, and you noted that there will be probably this acceleration in AI progress. One of the reasons is because currently, or at least the existing systems like DeepSeek-R1, (OpenAI) o1 and o3, it’s quite probable that there just hasn’t been that much… I mean we know this, there hasn’t been that much compute that’s actually been devoted to the RL stage of training.
It’s a very natural extrapolation to just say, well, take the amount of compute that’s currently available, and suppose we just allocate a substantial fraction of that to the RL stage of training. Currently that stage is like less than a tenth, or maybe even more like a hundredth, of the amount of money that’s being spent on pre-training. But if they were on par with each other, if they were within the same order of magnitude, then we will see a dramatic acceleration, just because that will be a couple of orders of magnitude of scale-up on this particular dimension, which is reasoning models. And I think there’s a particular line of thought which influences how people see these things. My guess is that it might be popular at the AI labs: they’re looking at that and they’re seeing that, well, we’re going to have this massive scale-up in the next few years. And if that leads to the final conclusion, if that’s the last building block to this very impressive system, then we’re going to have AGI, or whatever you want to call it, in a couple of years. So to be clear, I think both of us disagree with that. I mean, you might disagree with it more than I do.
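The scale-up arithmetic in this extrapolation can be sketched in a few lines. The fractions below are the rough figures mentioned in the conversation (RL spend at about a tenth to a hundredth of pre-training spend); the function name is hypothetical and the calculation says nothing about what capabilities such a scale-up would actually buy.

```python
import math


def scale_up_to_parity(rl_fraction_of_pretraining: float) -> tuple[float, float]:
    """Return (multiplier, orders_of_magnitude) needed for RL spend to match pre-training."""
    multiplier = 1.0 / rl_fraction_of_pretraining
    return multiplier, math.log10(multiplier)


# Rough fractions quoted in the conversation, not measured values.
for fraction in (0.1, 0.01):
    multiplier, ooms = scale_up_to_parity(fraction)
    print(f"RL at {fraction:.0%} of pre-training: {multiplier:.0f}x scale-up, "
          f"about {ooms:.0f} order(s) of magnitude")
```

So bringing the RL stage to parity with pre-training corresponds to roughly one to two orders of magnitude of scale-up along that single dimension, which is the "massive scale-up" the short-timelines argument leans on.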
I think it’s like our disagreement is more about, I don’t know, I think it’s not on this point. It’s more about maybe how many big things are there that are supposed to be discovered. Maybe there’s some general prior disagreement that even before you learned anything about AI, your timelines to a big economic change were shorter than my timelines. So there’s probably some prior disagreement that’s just in there, maybe that I just generally think things are slower and there are more things that might be out of model or something.
Okay, so that’s a legitimate disagreement. I want to bracket that for a second though, because I think that even considering that you might not think that there’s just one big thing remaining.
Maybe.
But I think that what people are imagining, if they think of what will be the mechanism through which AI has large impacts on the world, they’re imagining it’ll be a…
Country of geniuses.
Yeah, yeah, that’s right. So they’re imagining the mechanism through which AI will have a large impact on the world is specifically through this reasoning mechanism. The idea is that what the world is currently most bottlenecked on is just very high quality reasoning. That if we could just apply that to some tasks in the world, then that will just yield a ton of technological progress. It will yield a ton of progress on medicine. It will yield a ton of scientific progress. Maybe there’s all this data lying around.
We have like physical observations from the Large Hadron Collider. We have the Hubble and now the James Webb telescope. If we had something that was very good at reasoning, just putting this all together, then it could somehow accelerate scientific and technological development.
Well, I mean, accelerates by how much I think is the big question. I think it’s very cheap, very easy to say that these models should accelerate scientific development. But obviously they might do that in a very unimpressive way where they just help scientists with routine calculations and accelerate proving lemmas and things like that, which seems very plausible that it’s going to happen. I think present systems are probably already good enough to do that to some extent. Or even more mundane is that they just help scientists by helping them structure their LaTeX papers better and help them make nice tables. So that’s a very mundane way in which you can speed up scientific progress. Obviously that’s not what people are talking about.
Of course not.
But I do think that does really speed up scientific progress. I don’t want to say that it actually doesn’t have any impact.
In fact, you might prioritize that more than other people do. So by downplaying it you’re actually meaning to highlight it.
That’s right. I’m saying these capabilities, they are just acting as force multipliers on what the humans can do. That’s good because humans already have all of the other capabilities that we need, while AI systems, as we’ve discussed before, seem to still lack a lot of key competences. I don’t even know what people really mean or really envision when they talk about reasoners that are, say, at Nobel laureate level. I’m not sure what that means.
Widespread Automation Beats Genius in Datacenters
Sure. I want to try to explain this a little better. I mean, it’s hard for me because I’m not sure I completely understand it. But I think that there’s a sense in which we think of the technological and scientific developments, in say the last century, they’re being upheld by this relatively small group of people. I think that’s the idea. You have this small group of scientists in the world that are responsible for the innovations that we’ve come to know. As a fraction of the total global population, these brilliant innovators are quite tiny, maybe 0.01% of the entire population. So if you could have just a ton of reasoning models that do whatever they do and what they’re doing is maybe just like a lot of theorizing and looking at experimental data and coming up with new experiments to run and so on. It might be much easier to like add to that effective population. If you have this data center of AI models that are capable of doing whatever it is they do. So that you don’t really need to replicate the entire economy, you just need to replicate whatever it is that they do, and then since they’re responsible for the majority of technological developments, then just doing what they do would be sufficient for this acceleration. So I’m curious, what do you say to that argument?
I think a few things. First of all, I think just doing what they do is not actually that easy, in the sense that I think some people maybe have the idea that research and development and invention are just much narrower tasks than I think they actually are in the real world. So if you could actually fully automate the job of a researcher, then I think you’re already not very far from automating perhaps all work that can be done remotely, and maybe even, to some extent, physical work, because a lot of research does have a physical component, like if you have to carry out experiments and so on. There are a lot of experimental physicists in the world who just manipulate tools in the physical world to get results. So it’s not all theorizing and writing things on a computer or looking at experimental results.
If you could just automate the job of researcher, I think that’s much harder than people probably give it credit for. It’s not just a matter of reasoning. But there’s also the other point, which is about the people who are formally researchers or who formally do R&D, where if you just look at their job description, it says researcher or something. How much are those people responsible for technological innovations? I think probably not that much.
I think most innovation comes from other sources. I think people overrate the process of formal science quite a bit, and they underrate the amount of discoveries that happen due to serendipity; due to the scale of the world economy, due to the number of people that we have and number of different things that they are trying all the time, the number of different recombinations of ideas that might occur, the number of sort of complementary innovations that might be unlocked. For example some person makes an advance in metallurgy, which enables you to build better telescopes, which enables you to make better astronomical observations, that maybe leads to a better theory of physics and so on. These kinds of causal chains are very underrated. In contrast, people just really overrate the idea of “Oh, you just have some data, you just look at the data and then you just see the right theory”. They underrate all of the support that’s needed for science to happen.
I think one reason to believe that these effects that I’m talking about are pretty substantial is just the number of instances of simultaneous discovery that you have in science. A famous one is Newton and Leibniz with calculus. Some people ask, isn’t it weird that both of them came up with this idea around the same time, so much so that there were accusations of plagiarism and so on? But I think it’s just that the rest of science, the rest of the world, had reached a point where you would be asking the questions that would naturally lead you to calculus. People finally had very high quality astronomical observations, and they were finally able to actually make theories about what caused planetary motions and so on. And, for instance, people naturally became concerned with how projectiles move in the air because they were using artillery a lot more and things like that, and these questions became important. They also had more of an ability to accumulate knowledge and draw upon the knowledge of others, thanks to things like the printing press and whatever.
So basically, I think there are a lot of things that go into such a revolution, which is why you have simultaneous discovery. I think this happens so many times that even in domains which are, say, pure math with no applications, often even in those domains, what’s happening is that the field has a certain level of “abstract technology”, a certain level of tools and results and lemmas and theorems that are state of the art, and that state of the art is sort of gradually advancing over time as people sort of improve upon it in various domains. What might often happen is there’s a result that’s proven in one field of math, and then suddenly at the same time, a lot of different people realize that this result that was just proven it has implications in some other different area of math, which suddenly leads to a burst of progress where people were stuck for 20 years and then suddenly within the space of one year, there’s a bunch of different people who make breakthroughs about the same question.
I think these sorts of scale effects are very underrated. I think when AI actually drives a big increase in technological innovation and productivity in scientific progress, it’s going to be largely because of the scale effects. It’s not going to be because you have a single data center with lots of smart AIs, it’s not going to be for that reason. I think people just overrate how much you can deduce by sheer intelligence by just looking at a bunch of data without any relevant experiments; and they overrate to what extent that is actually the bottleneck in scientific progress versus all these other things that I mentioned.
So in the general case, in just the abstract, there’s some actual latent variable which we might call scale or something that is actually the thing that allows progress to occur. Then the thing that’s reported on is maybe what’s more salient from a human focus point of view. It’s better for narrative purposes to think about Leibniz or Newton inventing calculus, and it’s just a harder story to tell if you talk about this abstract ability to look into the world, or about having better telescopes that allow you to start asking the questions that lead naturally to Newton and Leibniz coming up with calculus. I guess I would say, yeah, that’s generally true in the economic sphere. It’s even true in the field of AI that historically people have just very much underrated this.
I mean, I think until quite recently, like until five or ten years ago, people just didn’t seem to have this strong sense. There were exceptions, obviously, with Kurzweil, Moravec and so on; but people just had this strong impression that the right way of doing AI was just to come up with the right algorithms and to have this theory of intelligence. If you just kind of theorize about how humans reason and how humans plan and pursue goals in the real world, that’ll be the thing that really cracks everything. Looking back now, if you look at the algorithmic innovations that have been most successful, we could talk about transformers or, more recently, reasoning models, these didn’t come from people thinking about what is the best way to reason or how we can replicate whatever it is that humans do. In most cases, this just came from looking at the current technologies, looking at what we can do with GPUs and how we can leverage computation more effectively. That just generally tends to be the type of innovation that actually drives progress forward. That type of innovation, in some sense, is caused by abstract reasoning because it does require people to think about what type of architecture would best be suited for the hardware that we have; but in another sense, the story that is more important in a fundamental sense is just the hardware itself, and the transformer on top of it is just a minor component of the story.
Yeah. And I think people overrate the… So people don’t have a good sense, I think, of how strong the diminishing returns are on things like research. One example I would give of this, and it’s going to sound kind of strange, is from the Napoleonic Wars. Napoleon is very well known as a very competent general, and there’s a quote about him from the Duke of Wellington, that Napoleon’s leadership of a French army was worth 40,000 additional soldiers on the French side. Now, that might be true. Suppose that’s true. That suggests this very high premium and value on leadership and direction and management at the top.
But now imagine instead of one Napoleon, you had 10 Napoleons in one army. Does that amount to like 400,000 extra soldiers on the French side? No, it doesn’t, because once you already have one, the additional marginal ones have very strong diminishing returns, and I think this applies to some extent when people look at highly compensated jobs in our economy; like software engineering, like management roles, like maybe superstar researchers who just come up with tons of ideas. I think if you just had like 10x or 100x the number of those people, the marginal value of those people would drop by a lot because you would just become bottlenecked on all the other things that stop whatever they’re doing.
For instance, once you already have a good manager, just having a second good manager in a company is not worth that much. Once you already have enough researchers to sort of deduce as much as you reasonably can from the data you have, then you just become bottlenecked on your ability to run experiments and collect data, and not on the part where people have to sit down and think about it. Maybe, I don’t know if anyone has actually measured this, but I would not be surprised if the diminishing returns are extremely bad. For instance, in weather forecasting, there is this law that every time you increase your computation on your weather simulation models by two times, you get to look maybe one day ahead or something, or maybe even more, maybe.
Well, that’d be surprisingly effective.
I think it’s ten times (scaling of compute for) one day (of weather), probably. That’s very steep diminishing returns. Now, I don’t know, I think it’s better than that with science. I think probably it’s more like… I think an interesting thought experiment, which we had to think about when we tried to formally model this out, is this: suppose we had twice the number of researchers, no AI, just twice the number of researchers in our world. In fact, suppose that we not only had twice the researchers, but imagine that we had twice of everything else as well. Twice the data collection, twice the scale of our economy and whatever. How much faster would R&D progress be in various domains? For instance, how much faster would we be progressing along something like Moore’s law? How much faster would we be decreasing the price per FLOP that we get out of GPUs, or how much faster would software progress be in the field of machine learning?
You can ask these questions. I think even if you scale everything else, it’s sort of not obvious, because there is a penalty to doing things in parallel rather than doing things sequentially. It’s worse to have two people thinking about something for one day than to have one person thinking about it for two days, because there are just certain connections you can make that require you to spend a certain amount of serial time on a problem, and that you can’t parallelize very effectively. So that’s assuming you get to scale everything else; and on top of that, with the scenario of the country of geniuses in the data center, you’re not even scaling everything else, you’re just scaling the number of researchers and you’re assuming everything else remains fixed. So I just think we have a lot of reasons to be pessimistic about that picture. Even if the returns to pure cognitive research effort just happened to be so high that this would still yield a lot of value, by adopting that model, just deploying your AIs inside a data center and then not widely deploying them throughout the world, you would just be leaving so much money, so many resources, so much value on the table. So even if that alone was sufficient to unlock a lot of value, you probably still wouldn’t want to do it.
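The thought experiment above can be made concrete with a toy model. The sketch below is purely illustrative: it assumes a Cobb-Douglas-style production function for the rate of R&D progress, with a parallelization penalty applied to researcher headcount. The functional form and both parameter values (`research_share`, `parallel_penalty`) are hypothetical choices made for illustration, not estimates from the speakers or from any formal model they mention.

```python
def progress_rate(researchers: float, other_inputs: float,
                  research_share: float = 0.5,
                  parallel_penalty: float = 0.5) -> float:
    """Toy rate of R&D progress.

    researchers      : headcount, relative to today (1.0 = current level)
    other_inputs     : experiments, data collection, economy scale (1.0 = current)
    research_share   : hypothetical weight on cognitive effort vs everything else
    parallel_penalty : hypothetical exponent < 1; doubling people in parallel
                       gives less than double the effective research effort
    """
    effective_research = researchers ** parallel_penalty
    return effective_research ** research_share * other_inputs ** (1.0 - research_share)


baseline = progress_rate(1.0, 1.0)

# Scenario 1: double the researchers AND everything else.
double_everything = progress_rate(2.0, 2.0) / baseline
# Scenario 2: double only the researchers, holding everything else fixed.
double_researchers_only = progress_rate(2.0, 1.0) / baseline
# Scenario 3: a "country of geniuses": 100x researchers, everything else fixed.
hundredfold_researchers_only = progress_rate(100.0, 1.0) / baseline

print(f"2x everything:         {double_everything:.2f}x faster")
print(f"2x researchers only:   {double_researchers_only:.2f}x faster")
print(f"100x researchers only: {hundredfold_researchers_only:.2f}x faster")
```

With these made-up parameters, even doubling everything gives less than a 2x speed-up because of the parallelization penalty, and scaling researchers alone gives far less than proportional gains. Different parameter choices would change the numbers, but the qualitative ordering is the point being argued.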
Now, I think the relevant question is, as I said, maybe you focus on that scenario because you’re looking at the reasoning models, which don’t seem to be very good at things like agency and long context and whatever. But I think it’s just implausible that you’re going to get a transformative researcher who is unable to reason over long contexts and who lacks agency to direct a research program and to… I think those abilities are pretty essential to what makes a good researcher. It’s not just the narrow ability to solve well-scoped, specific problems. I think that’s just a very small part of what the researcher actually does.
I mean, and going back to previous points, I think that it’s just generally underrated how much efficiency in the economy is not due to people thinking about things on a top down level. So obviously there’s the model where you have scientific researchers who are coming up with new innovations that make things more efficient. They come up with new inventions, they come up with new theories that allow us to make things more efficiently and so on. But I think that’s not where the majority of practical innovation comes from.
I think the majority of practical innovation is coming from people who are not very much reported on, who are in the middle of the process in some part of the supply chain. They’re like supervising in some factory, or they’re even just in the factory themselves, or they’re designing things and they’re an engineer and they just come up with some sort of incremental improvement. I think it’s just underrated how much this aggregate of incremental improvements across the entire economy is generally what makes things so efficient over time. I suspect there is a reason why people don’t consider that story, or why it’s not as salient. Maybe they buy it if you ask them, but it doesn’t seem they’re talking about that story as much when it comes to thinking about how AI will impact the world. I suspect the reason is because it’s just not an interesting story from almost like a historical narrative point of view. It’s easier to write a historical narrative about how this guy invented something, like invented the light bulb or whatever. It’s much less interesting to do this data series of light bulb efficiency, of lumens per watt or whatever, over time, of people making it slightly more efficient; because they notice this inefficiency in the supply chain, or they realize that you could get resources from here rather than there, or you could just put together labor more efficiently at this stage.
Not only is it harder to find the information for how that happened, which is an independent reason to think that this would be a hard story to tell, but it’s just not as compelling a story to tell in terms of what we expect. Humans just want to tell stories where you put a protagonist in it. So I mean obviously I can’t read other people’s minds, but I really suspect that might play a big role in why people consider these major innovators, or we remember Einstein much more than we remember the silent, diffuse, decentralized workers who probably actually did quite a lot more in terms of making our economy more efficient than Einstein did, in aggregate.
Yeah, that’s right. I mean, even for some of the inventors that are remembered as inventors, maybe Edison is a good example, who is remembered for inventing the light bulb. Actually, if you look at Edison, unlike some other inventors, he became quite wealthy, and you might ask: “Okay, was it because he invented the light bulb?” Well, the answer is no. I mean, that’s part of it, that’s a small part of the story. Actually, what had to be done is… Suppose you have a light bulb, but it’s the year 1870. Well, homes just don’t have electricity. There’s no complementary infrastructure. There’s nothing. It’s the light bulb. You just can’t do anything with it. It’s just useless.
What you have to do is make the necessary investments happen that are going to allow that innovation to actually be used in a retail or commercial or whatever context. That’s the kind of thing that Edison was able to do that made him a bunch of money. He’s not inventing that. The idea of a light bulb is pretty easy. So sometimes people think, oh, it’s just the idea that was hard. But the idea is very easy of a light bulb. I mean, people knew forever that if you make something hot, then it glows, and then that’s basically the idea. So what had to be done is to first give it the form of a product that is actually going to be commercially viable and then make all the other necessary investments happen, that are going to make that product actually useful to people. That does create a lot of value. It’s not something people talk about. People just talk about as if like you just invent the light bulb and then it’s over. While actually most of the difficult work is after you invent the product.
So in other words, it’s creating the infrastructure that supports it and sustains it and makes it useful.
That’s right, and actually that proves the viability of the product. It’s very easy to have an idea for a product and then to have a demo inside a lab where the product “works”. But that doesn’t mean the product would be viable at any kind of scale, that it would be cheap to manufacture, that it would benefit from economies of scale in manufacture. It’s just like a much lower bar.
Well, an easier example might just be you come up with the idea of a steam engine and then it’s just not exactly clear how this will be applied. If you have an existing example of something like a railroad network or, if you have some sort of infrastructure that means that engines can be put to good use, then that makes the invention useful, but without the infrastructure, it’s not useful. That just indicates that the hard part is just designing the infrastructure that is surrounding it and supporting it. The claim isn’t that the infrastructure is hard to come up with as an idea. Rather, the claim is that’s most of the work. It’s actually the important part that is more boring to talk about, but is nonetheless more important for actually getting the value out of the product.
That’s right, and I think people do the same thing with military campaigns and wars where they will talk about individual generals instead of, like this general such and such instead of talking about the… Like an army, especially in the modern era, is like a huge thing with millions of people who are actually in the army, and then there’s even more people who are sort of behind the army who are like producing all the equipment and supplies and providing logistical support, and fuel and all the support that is needed for the army to be on the field. But I mean, that’s just boring. You don’t want to talk about that.
So you’d rather talk about, Hitler made this mistake and he invaded at the wrong time in Moscow. But it’s not really an argument about industrial capacity or it’s not like people don’t… People more rarely talk about exactly how many tanks, exactly how many… all of the infrastructure that allowed either side of the campaign to win, even though the number of armaments and the scale of what they were able to produce is a much more decisive factor than just Hitler making these strategic blunders at certain points. But the strategic blunders are much more interesting to talk about. It’s a lot more fun. It makes it a story.
Yeah. Yeah, I think that’s right. If you tie this back to AI, I do think people are maybe… I wonder why people have this view, maybe it’s just because it is a better story. But people seem to believe, fundamentally, in the power of reasoning and in the power of deduction, and I don’t know how else to express it, they just don’t seem to have the same belief in the power of scale and of just having a lot of people and a lot of resources to throw at a problem. Or just existing so that they do lots of different things and you have a very big surface area that is exposed to potential discoveries and to take advantage of economies of scale. That just seems like a much more relevant thing than having individual people who happen to be intelligent. I think for AI systems we can expect “individual AI systems” to be much more impactful than individual humans, mainly because we will be able to build AI systems that use a lot more compute and take in a lot more data in a way that we can’t do for humans. Human brains don’t vary that much in size and humans don’t vary that much in the amount of data they can take in from the outside world and so on, or the amount of compute they’re trained with.
Still, even if you have that, I think the main channel for impact is still going to be those systems actually being deployed widely and not just being assigned to very narrow research tasks, where we end up being bottlenecked by everything else, which is what I think would happen in this case. I think a lot of people think about AI automation of AI R&D as a key causal channel where AI will just automate its own R&D and then it will just make its software better, and then that will create what is sometimes called a software-only singularity, where if the returns to research effort in AI R&D are favorable enough, then what happens is AI improves itself. So with the same stock of compute, you’re able to get even more performance and then that leads to even more research effort and so it snowballs.
But if you have the view that experimental discoveries are mostly driven by compute and not mostly driven by research effort, then this feedback loop doesn’t work, because just because you have more intelligence to throw at the problem doesn’t mean you have any more data, right? So you can actually figure out… So for instance, this is like if you were in 2010 and, I don’t know, suppose you had 10 times, 100 times the human researchers, could they ever have figured out the transformer and the whole paradigm of unsupervised learning? And could they have predicted that paradigm is going to lead to the successes that it did after 2020?
It seems like, to even be on the right track… If you’re given context about what type of innovations you need to make, then it’s probably rather easy for someone in 2010, as long as they’re a fairly smart team of researchers. But just knowing the context, which type of innovations would be most useful to make, is most of the story, and that just comes from empirically observing the world. You can’t get around that. It’s just very difficult to know exactly the innovations that will be most useful until you’ve reached sufficient scale, until you’ve reached a sufficient level of infrastructure and practical tools that allow you to see: “Ah, so the thing that would most increase value on the margin would be having this transformer, because it takes advantage of these GPUs that can run in parallel and it can take advantage of all of these other facts” that wouldn’t have been salient. It probably wouldn’t have been on your mind at all in, like, 2010, much less 1990 or 1950.
Yeah, yeah. I mean I think there is an aspect to which the results of… I mean that’s part of the fact that innovations generally are designed to take advantage of complementary innovations that happen to also be around at the same time, just like Transformers are designed to take advantage of GPUs. I think that’s one reason why it’s harder to come up with innovations in advance, because you don’t know what the overall context is like. But I think even setting that aside, it’s just hard to predict which architecture is going to perform better. If you’re in 2010… So for instance in 2014 long short-term memory networks were very popular for a while and now there is basically nobody using them anymore.
And it was quite surprising I think a priori for a lot of people that it would work out that way because it was not like…It doesn’t seem like there’s strong theoretical reasons. It’s like if you were just smarter at reasoning in the abstract, it doesn’t seem like you’d come up with the idea that these LSTMs are so much worse in some ways, but it just happened to be that way. And it’s just very unclear how we could have discovered that through any other means other than just experimentation.
How Stories Shape Our Expectations of AI
That’s right. That’s exactly what I mean. I think people show this, there are a lot of other signs where reality just turns out to be a lot more complex and a lot more detailed and non trivial than you expect. I think another example of this is people who think about how wars are likely to go before the wars actually happen. So for instance, before the Second World War there was a lot of thinking about, okay, we have all these new technologies we have tanks, we have airplanes that can actually bomb cities, which we didn’t have in the First World War. So what’s going to happen when there’s actually a war? I mean, there’s no war. So you don’t know. People haven’t tried. So all you can do is speculate and often people’s speculations are just very wrong.
For instance, in Britain, I think the British government expected in the first few weeks of the war more casualties from aerial bombardments than they had during the entire war, like hundreds of thousands or something like that. The common view was that we can’t have a war because if there is a war, then all the major cities are gonna be bombed and they’re just all gonna be destroyed and there’s no defense. So basically, people thought about aerial bombardment the same way that we today think about nuclear weapons, like sort of nuclear weapons lite or something. But that turned out to be totally wrong. And I mean, it was wrong just due to lots of boring practical reasons that were difficult to anticipate in advance.
For instance, you can’t bomb cities in daytime because then the airplanes are visible and they get shot down. But then if you bomb at nighttime, then your bombing is very inaccurate and you don’t actually hit your targets very well. So your bombs just get very spread out. So it’s very inefficient. It turned out later in the war, when the Allies started bombing Germany, that the bombing was actually very inefficient as a way of doing economic damage. So basically, for each dollar of capital they destroyed in Germany, they had to spend like four or five dollars on the aircraft that were being shot down and on the fuel. So it was like this. It only works because they had such an overwhelming…
They’re just a lot richer.
Yeah, they’re just a lot richer combined. So that’s the only reason it worked. But it’s actually a very inefficient strategy. This just contrasts so heavily with how people thought in advance that it would be this devastating weapon that there will be no defense against, when in fact it turned out to be very inefficient. So I think it’s very easy to make mistakes like this because, well, the world is just very detailed. It’s just hard to speculate about how wars are going to go in advance. Just like it’s hard to speculate about how research projects are going to go in advance. Or, for instance, people are also very bad at judging which startup ideas are good and which ones are bad. It’s a very similar thing. They often reason very superficially because they can’t really do any better, and it’s very hard to predict if a startup idea is going to be successful or not.
But crucial to your point is that it’s not just that the people are being dumb. Yeah. It’s like this is not necessarily because they just have some defect. They’re missing some consideration that you could have just thought about if you just thought for longer. It’s rather that the missing consideration is something that can only practically be revealed by experimental data.
That’s right. So if there had been very serious war games in advance of the Second World War, where people actually tried to play out what a war would be like, it’s possible they would have discovered this. But in practice, it almost never happens. People never actually do war games anywhere near approaching the level of seriousness that would actually be displayed in a war.
But I also think even if they did that, there’d probably be a ton of things they’d miss just because of…
Because it’s a much smaller scale. That’s right. There’s much less effort being devoted to it. It’s much more political in various ways where people have incentives to get certain conclusions in their war games. Just overall, there’s much less effort being devoted to the problem. When there’s an actual war, it’s like you scale up the effort that you’re trying to dedicate towards how to win it by at least a thousand times, maybe more. You should just expect a lot of surprises to come from that alone. I agree that overall, my view, and this is not just my view of AI, but more broadly my view of the world, is that there’s just a ton of detail. Often you can only really grapple with that detail when you collect a lot of data and do a lot of experiments and have just this big exposed surface area to new discoveries, and you can’t do it by just armchair reasoning or by looking at results of experiments that have already been done. I think that is not a very effective way of making progress.
Yeah, I want to relate this. So the way that people have thought about these things in the past is like you have one AI in a basement or something, and maybe the next iteration of this is the country of geniuses in a data center. So I wonder if there’s a trend here, because it seems like people are kind of realizing that it’s not going to be an AI in a basement. Because we know ChatGPT’s rolling out and that’s not like an AI in a basement. But it’s almost like they’re not seeing that this is a trend. Rather they’re just analyzing it in a way similar to what you were saying economists did earlier, where they just look at…
Zeroth-order forecasting.
Exactly, so they just have the… What is the current thing and then just like some things are getting better, or they’re extrapolating some things but not other things. And this particular thing they’re not extrapolating is the scale of the development and how widespread it is and how much relies on the fact that it’s being deployed widely. As opposed to just it being a localized development that’s being applied to one task or in one domain that allows it to achieve outsized effects through that alone.
Yeah, I think that’s right. I mean, I even think the framing of, I don’t know… I just think that there’s still a lot of people who just think about one AI, I mean maybe it’s not in the basement, but it’s still like somehow it’s like one AI…
The superintelligent AI, or ASI.
That’s right. And it is The ASI, usually just one, and it is going to have maybe certain kinds of preferences and whatever and those are just going to be certain developers and certain preferences. Maybe it has them as a result of the way it has been trained or, I don’t know, maybe it’s misaligned. But basically it all comes down to this one thing, this one entity somehow that has certain preferences and that dictates how things are now, as opposed to how you would typically analyze an economy. You wouldn’t analyze an economy by saying, “oh, this individual person in this economy has these preferences and then those preferences are just going to drive…” You would do a much different kind of analysis to determine what is going to happen in a large economy compared to how people seem to think about the future of AI in general, where they think a lot less about economic considerations: which systems would businesses want to deploy and would consumers want to use? And then, maybe even in the very long term, what would the economic pressures lead to? What would an AI-only or primarily AI-driven economy look like? They just focus way less on those considerations and way more on things like what will the AI want to do?
I mean, yeah, I think this will get clear. I mean once these AI agents start to work, I think people will notice probably that there’s multiple agents. I think it just makes sense from a perspective of like a business, right? You’re like hiring this, you’re hiring this agent who works for you, and it’s like you’re not hiring the same agent that everyone hires. I mean, it’s not like clear exactly how that would be coherent in the first place. So it just seems very obvious that there’ll just be many agents. I think maybe there’s some sense, people talk about them merging and all this stuff and there is some sense in which maybe there’s long term considerations in which we could talk about economies of scale and how AI organizations might be different than human organizations because of their ability to communicate more effectively and efficiently than humans can. But at least in the medium term, especially after this adoption and during the period of explosive growth when the economy starts accelerating, even at that point, I think that there will probably be millions of agents just because that’s the natural structure of what will be put in place.
If you just take into account the fact that what people will be purchasing when they purchase these AI services is they’ll want something that’s individualized to their own needs, their own business needs. They’ll want something that is automating labor, working in their organization, essentially hiring in a similar way to the way that we might onboard and hire human workers. And in that case that just looks very different than the case of a single monolithic agent. And so again there might be these long term considerations, but it’s very difficult. It’s much easier to look at what’s immediately close to us. What’s immediately close to us is not this monolithic agent. So the long term considerations, it’s speculative, it’s hard to say, but at the very least people seem to be underrating this fact of this more decentralized, diffuse model of development for sure.
So one thing that I think plays a part in this is that sometimes people will overrate these long-term considerations as if they’re very near-term considerations. I don’t know whether you get this impression, but sometimes people will talk about the Dyson sphere as if it’s like, we’ll have AGI and then there’ll be ASI soon after that and then the Dyson sphere. They’re imagining that you add a few years here and then you get the Dyson sphere.
I think there’s actually a difference of many orders of magnitude, such that so much about the reality we live in will be different by the time something like a Dyson sphere is constructed. Talking about it as if we can make detailed predictions… I mean, there is maybe a general prediction that we’ll get to this level of development, but anything that is concrete beyond that, I just think it fails to realize how many events will occur before that point, such that the world will be so utterly transformed and unrecognizable that talking about these specific considerations at the level of a Dyson sphere is just not useful. I think almost all speculation in this respect, unless it’s done at a very high level, is just not useful.
Yeah, I mean, I wonder like how easy people think it is to construct a Dyson sphere sometimes when they talk about this. But…
How AGI Will Impact Culture
Well, I almost think, okay. So this might be a strawman… but I almost feel like a simple model of AI development is we get the ASI, it like takes over the world or something and so then it’s just managing the world and then it just immediately builds a ton of infrastructure and looks like we just have this explosion of infrastructure. And that’s the main thing that’s happening in the world. It just like happens so quickly that not much is…they don’t imagine tons of cultural changes happening in the same time. I mean, especially if they’re imagining the world where the AI doesn’t kill everyone, they’re just imagining this world which is maybe functionally similar. It has like, it has inhabitants that are functionally similar to humans. It has inhabitants that are even kind of similar to the current human culture in a certain way. It’s like us and then this just extends all the way to the Dyson sphere.
So there’s enormous, there’s enormous progress in one axis, which is, there’s enormous economic growth, but there’s not enormous change in other dimensions that have historically gone alongside economic growth. If you think about how different our ancestors were from our current, from current humans, it’s not just that we have a lot more stuff. It’s that we also are just very culturally different. We have very different attitudes, we have very different ways of thinking about the world. Our concept of, us versus them is very different. It used to be that people had a much more localized sense of us versus them. They had a more tribal, my community was the people they care about. Now it’s more nation state and there’s increasingly over the last century, a global citizen sense of, it’s all of humanity that is my tribe.
I just think people are underrating these other facts about the way that humans and human values have changed alongside these massive transformations in economic growth. I think part of it is downstream of what we were talking about earlier, because I think they’re imagining that a lot of economic growth will come from AI being applied to a narrow use case, instead of AI being so widely applied that it sort of seeps into our culture and the way that we view ourselves.
To be clear, this is also a mistake that a lot of people made about the Industrial Revolution. A lot of countries, countries outside Europe, saw what was happening in England and the Netherlands and maybe to some extent France and so on. They saw the military power and sort of wealth that was being created as a result of this change, and they obviously wanted to get their hands on that. But they didn’t want the associated cultural and social changes. For example, things like the undermining of all sorts of traditional feudal or monarchic structures and the transition towards more mass participation in politics and things like that.
Sure. So they still want a king and they want him to do the official king duties and have that authority, but just with all of the industrial wealth that Britain had.
That’s right.
And unfortunately they didn’t see that wasn’t possible. That in order to do this, you need to establish industrial norms and customs and the ways of organizing labor in society in such a way that made these traditional power structures just obsolete.
Yep, that’s right. So I think maybe with AI, the one argument you can give that wouldn’t apply to Industrial Revolution is that the timescales of the changes are likely to be so compressed and that there will just be inertia in people’s preferences. I think that’s quite plausible. People just don’t change their minds about important things that quickly, and even with the Industrial Revolution, I think there was probably quite a bit of effect from generational replacement that allowed this to happen.
Well, that’s assuming that the primary actors are humans.
Are humans. That’s right.
So AIs could potentially have, they’ll start contributing to the culture in the same way that humans contribute to culture by writing and talking and interacting with others. So AIs will increasingly become part of the culture. And so then in some ways, I agree that some parts of the world may not change so much because of the facts that you just mentioned, these innate biological facts that after young adulthood, people tend to get set in their ways and so on. This is something that can’t much be accelerated by technological progress unless you’re willing to radically alter how the human body works. But it’s definitely not something that is necessarily true about civilization as a whole. It could very well apply to biological humans. It doesn’t necessarily apply to the culture that is dominant, and it could just change in ways that… because it could increasingly be shaped by forces that don’t necessarily have these constraints. I think part of this is people are not realizing the extent…
Obviously, there’s a strong sense in which the way that people want to think about AIs is just as tools alone, not as like, active contributors to the world. Not as some sort of thing that we’re interacting with that is like a person that is playing a part in the story of this civilization. That we have all these cultural trends, and we have these historical processes that are determining things. They think of it more as just, well humans are at the top. We have these tools that we’re controlling. We’re like the shareholders of the world and there’s like a risk obviously, that the tools can rebel. So I’m not downplaying the fact that a lot of people are concerned about the fact that the tools could revolt and then suddenly start taking control of the world. But I’m saying that there seems to be this, this lack of imagination of something which is in between these two alternatives. Which is AIs are not fully tools and we’re just shareholders of the world, that we are the board of directors that make decisions and there’s nothing that’s interesting happening beyond what humans determine.
I think it’s like AIs will play a role in shaping the course of events somewhat independently from humans, and also they’ll be under our control in many other ways. Of course, over time this will evolve. AIs will probably increasingly play an active role in shaping events. But I think even if you thought that AIs will have a positive impact on the world; even if you think AI alignment is easy, you should still think that AIs will just be playing a large role in shaping the course of events, determining the things that we talk about and playing a role in what social structures and what political structures we have, and I just think that there isn’t that much discussion about it.
Beyond Utopia-or-Extinction
It’s hard for me to understand where people who sort of disagree with this framework, disagree with this view, are coming from. I do think a lot of people are very inclined to have very extremized views of the future, in the sense that, for instance, a lot of people who believe in a high chance of human extinction from AI also believe that if there isn’t human extinction, then the future is going to be very utopian and coordination problems are going to be solved. We’re just going to fix certain values and preferences in the AI that are just going to be stable after that. So this is the most important time in history, basically.
That’s such a bimodal view. Either we’re dead or the outcome is just totally wonderful. It’s very rare that the development of technology proceeds like that. It’s either completely terrible or it’s extremely good and there’s like no problems and all the problems get solved. That’s just very, almost like, I don’t know how nice it is to say this, but it’s sort of a mythological view, almost like a religious view.
Well, yeah, so there’s the Robin Hanson post “AGI is Sacred” in which he kind of talks about this. I think that people have this sort of distinction. They have a near view of what AI will be capable of doing, and in that sense they’re much more grounded in these realistic constraints and about the types of things that actually happen in the real world because you’re talking about the year, the next two years or something. You can’t make these extremized claims about LLMs having perfect values.
But then when they talk about the long term, then it’s just this binary utopia versus extinction. There’ll just be one AI. It will have this idealized set of values. It’ll be like this perfect agent that’s perfectly coordinated with all its other copies. Then there’s just a deciding factor which is it’ll either be aligned or it won’t be aligned. If it’s not aligned, then it’ll just fight a war of extermination against all the humans and completely take over everything. And if it’s aligned, then we just get this perfect utopia because it knows exactly what everyone wants and it just wants the best of humanity.
It’s like, well, okay, I understand that AI is a unique technology. It’s definitely different than things we’ve experienced in the past. But nothing historically has even remotely resembled any of these axes, much less all of them combined into a single package. So it’s interesting that people are very confident in this picture, because it’s maybe one picture among several. You could imagine people posing this as a model and saying that, if AI happens to be this very extremized package, which we only assign 10% probability to, then this could occur, but it doesn’t sound like that’s what people are saying. It sounds like what people think is that the default, majority-probability scenario is that it just has this large package of idealized features.
It won’t coordinate in the ways that humans ordinarily coordinate rather than engaging in just ordinary trade. It’ll do logical decision theory or coordinate acausally, even in the absence of communication. There’s just a bunch of different features, each one of them alone is an interesting discussion. And it may play some like, I don’t want to downplay that there’s some role that maybe acausal coordination will play in the future, but it seems strange to make this such a central part of one’s picture, given the complete absence of historical precedent of any of these factors playing a large role to how technology, history, sociology, anything has played out.
Yeah, I think the near versus far mode thing is definitely playing a big part. On the other side, of course, are people who don’t have extreme views about the future of technology because they just don’t believe in the potential of technology. There are also a lot of people like that. I would normally say that a more grounded or more realistic analysis of the future impact of AI would almost certainly have to be informed by economics, because economics is just the tool that we typically use to study these kinds of problems in a rigorous way, but most economists do not believe in the technology at all. This means that the analysis they produce is usually predicated on a very nerfed and inferior version of the technology compared to what you might expect. So of course they don’t really predict any big changes, because they think it’s going to maybe even be less impactful than the internet or something like that. In that case you don’t get any interesting analysis out of them. Which is another… I think that’s definitely not the same kind of story, but it does mean…
I think there is a unique lack of people who are both informed by the way economists usually analyze a potential future technology and its impacts and who also believe in the transformative potential of the technology. The intersection of these two sets is very small. So in the discourse about AI this view is almost nonexistent. I think we have this view and maybe a couple of other people do, but that’s about it. This is a very small minority view.
Yeah, Anton Korinek is interested in it, Robin Hanson, but it’s like not enough people to produce. It feels like they just don’t influence the discourse to nearly the degree they should, weighted by the plausibility of what they’re saying. Yeah, I think there’s so many people who talk about other things that seem less plausible. Yeah.
AI’s Impact on Wages and Labor
I mean I think similar kinds of things happen with discussion about the impact of AI on human wages, where the discussion is usually very confused and informed by… It neglects very basic things on both sides. On the side of the AI optimists, in the sense of bullish on capabilities. I mean maybe they’re not optimistic in the sense that they think the outcome will be bad, but they’re envisioning transformative capabilities. And also on the side of the economists. For instance, it’s very common for economists to say human wages are not going to go down because humans will always have a comparative advantage over AIs and it will always be beneficial to trade, which is a very poor argument.
People on the AI side often point out, horses sort of became obsolete and we no longer use them, we just use cars. So why wouldn’t the same thing happen to humans? They point to lots of random things on the economist side like humans are not horses, they’re different. Yeah, I mean they’re different but they are not different in a way that is materially relevant to the analogy. Just like horses just became unprofitable to employ. I mean there are some humans who are just unprofitable to employ. I mean, if a human has severe enough, maybe mental illness, they might just not find any form of employment because there is no organization to which they would even make any positive contribution. Just trying to manage them and trying to keep them from sabotaging or harming the rest of the organization would just take up more value, more time than they could ever contribute to the organization on their own. So they’re just left out. They’re just unemployed.
I mean, I would say. So all of those considerations are true. I would say that there’s even just a surprising lack of very basic conceptual tools that are common among economists. But for some reason, economists just don’t like applying them. One particular point in this topic is just this distinction between employment and wages. I think this is very annoying to me almost because so much of the discussion, in my view, has focused on will humans be able to get a job after AGI? If you think about it, what does that even mean? What does it mean to get a job?
If I offer you a dollar a day to clean my bathroom, okay, that’s a job. Okay, so would you just say, well you have a job so that’s not an issue? Well, no, you’d say no, because it’s a dollar a day. I’m not…That’s not enough. I’m not able to feed myself on that amount or it would be extremely difficult. At least you very well couldn’t achieve your current standard of living or anywhere near it. So the concept of will we have a job? Just seems very ill defined. Then they’re applying all these conceptual tools. They’re saying, oh well, there will be a comparative advantage, which implies that AIs and humans will have mutual gains from trading with each other.
Which is true.
That’s absolutely true, but it doesn’t imply anything about the size of those gains. They could just be infinitesimal or even negative. I mean, sorry, that particular argument, it wouldn’t work for negative. But in the particular case where there’s some sort of cost to managing them, then it could be negative, and then that argument just doesn’t work. So it feels like these are things that economists would be able to point out if they took their framework seriously, if they just analyzed what they teach in school, if they said what they put in the textbooks, because there’s this long history of economists clearly talking to their students and having dialogues with them in which they have to correct misconceptions about how fundamental things about the world work. So they’ve come up with many of these analytical tools for correctly interpreting the important parts of the problem that you need to analyze, for instance that it’s actually more important to talk about wages rather than employment. Then for some reason it’s just like these analytical tools, which are normally applied by economists, especially those who are really good at the topic, just somehow don’t make their way over to any of these discussions.
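As a rough illustration of the point that gains from trade can exist and still be tiny or negative, here is a toy sketch in Python. The function name `max_daily_wage` and all of the numbers are hypothetical, and the setup assumes an employer would never pay a human more than it costs to have an AI do the same work, minus whatever it costs to manage the human.

```python
def max_daily_wage(units_per_day, ai_cost_per_unit, overhead_per_day=0.0):
    """Upper bound on what an employer would pay a human per day.

    Assumption (not from the transcript): an AI can do the same task at
    ai_cost_per_unit, so the human's output is worth at most that much,
    and overhead_per_day is the cost of managing the human.
    """
    return units_per_day * ai_cost_per_unit - overhead_per_day


if __name__ == "__main__":
    # Hypothetical numbers: as AI gets cheaper, the ceiling on the wage shrinks.
    for ai_cost in (5.0, 0.5, 0.001):
        print(ai_cost, max_daily_wage(10, ai_cost))
    # With any fixed management cost, the ceiling can drop below zero.
    print(max_daily_wage(10, 0.001, overhead_per_day=1.0))
```

Trade is still mutually beneficial whenever the result is positive, but nothing in the comparative advantage argument keeps that number near the current standard of living.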
It’s quite frustrating because even if you introduce yourself into the discussion and you say, look guys, there’s this existing conceptual toolkit that is actually very useful and if you just reflect on it and you see how the argument flows and that it’s internally self consistent and it really relates to things that we care about. Even if you show that to people, it’s seen as one perspective among many, and that’s just one guy’s opinion. Even though these are the conceptual tools that have been developed which really from an inside view feel far more compelling, far more logically justified than almost all of the random speculation that people usually bring to bear on this subject.
Yeah, that’s true. I mean another thing is that, here’s the flip side of this where the people who are very optimistic about AI capabilities, again in the sense that I defined before, they often also have very strange views. For example, there is a group of people who call themselves AI optimists. I don’t know if it’s like an official name, but this is how they describe themselves, and some people from that group say oh, it’s very important to have open source AI. The reason is because once we are in the post AGI world, if you have open source AI, then people will just be able to spin up a local copy of their AI and then have it take care of them or have it produce stuff for them.
It’s such a basic error because if you have wealth in that world, if you just own a bunch of assets, then those assets will appreciate a ton and then they will just pay so much rents or dividends or whatever that you don’t need to do that. It’s just going to be more efficient to purchase things on a market, because on a market you will benefit from many kinds of economies of scale in things like AI inference. So it will just be more efficient to buy it on a market. If you don’t have that capital, then you won’t be able to afford the power that you need to run your local AI because that is going to cost money and your wages will have collapsed. So either you have wealth and then you don’t need to do it, or you don’t, and then you can’t do it. So there’s no world in which having your open source AI “take care of you” is going to be a viable solution. You might want to have an open source AI for say, privacy reasons. There are good reasons to want to have an open source AI in some situations.
Or it might decrease the rents that these companies are able to charge because of the fact that the software is available everywhere. So it might slightly decrease the price of just getting an AI.
You expect competition between companies anyway, so… I mean there’s probably some impact of it, but it’s probably not enormous, especially because…
Well, my point is that it’s marginal.
Yeah, it’s marginal.
But once you apply this economic lens then you see that.
Yeah, well again, I don’t even understand where sometimes it seems to me like some of these people just don’t believe in economics. They just don’t accept economic reasoning as a valid way of thinking about the world. It’s just much better to have random speculation that is baseless or just based on intuition, which is often very misguided when it comes to thinking about these kinds of problems. On the side of economists, they just refuse to take the possibility of the technology being transformative seriously. There’s a sort of, I think Carl Shulman described it as an inversion of priorities between the social and natural sciences in the sense that normally as a social scientist, like if a natural scientist came to you or an engineer came to you and said, here is a technology that we are building. We think we’re going to be able to build this in 20 years, what will be the economic impact?
The job of social scientists should be to accept the claims made about the technology by people who are actually building a technology and then analyze its social impact, which is their expertise. But instead of doing this, economists seem to be saying: the impact of this technology would be far too big, which means the technology cannot exist. They’re sort of inverting the reasoning and again, it’s just very strange. I think part of it is because economists have, for a long time, they’ve been pushing back against people who argue technology X is going to lead to collapsing wages and increasing unemployment and people will be losing their jobs. They’re just so used to pushing back against that for other technologies that they just think it’s more of the same. They just respond in the same way, like they have some cached arguments in their head and they’re just not engaging with the details of this particular technology that make it very different, but they’re giving all the same responses.
Which is interesting because even if they thought that the natural scientists were wrong, I mean it could be reasonable from that point of view. Even so, you could still just say, we’re going to analyze this technology, assuming they’re correct. Then just say, I don’t think they’re right but let’s just do that anyway. They didn’t even do that though. They’re not like framing it that way. They’re not saying I disagree with the natural scientists. They’re just, they’re just refusing to even consider the possibility that the technologists might be correct about what the technologies are capable of doing. Which is a very fascinating way of dismissing whether something is probable: to say that it’s not even possible. It’s a much stronger claim.
Yeah, people think that we have a lot of leverage today to affect how the medium to long term future is going to go because they think that, especially if they believe in the impact of this technology. I’m not talking about the economists who think it’s going to be like a nothing burger or something, but people who think it is very important. Regardless of whether they’re pessimistic or optimistic, they tend to see this moment as a moment of uniquely high leverage, which might be true in relative terms, relative to how much leverage you would have some other time in history. But I think in absolute terms there’s a lot of neglect of the kind of forces that maybe economists would put a lot more, at least some economists would put a lot more credence in shaping culture and values and so on.
Are you just talking about sociologists or general social scientists?
General social scientists.
Why Better Preservation of Information Accelerates Change
I think in particular for AIs you have this idea, I guess this is popular among the LessWrong crowd. There’s this set of values and it’s this utility function. I mean, sometimes people will say that it’s implicit. I mean, other times it’s like this explicit thing that is actually controlling the AI. It’s just like we have this written down utility function which is the values. Since it’s in this machine coded format, then it can just persist, in the way that data on a hard drive can persist over time because it can be copied and stored relatively losslessly, and that’s the same thing that can be true for a utility function that you give to the AI.
So unlike humans who might not be able to… we don’t have this utility function that’s in us. Maybe some people think we do, but the AI will have it because it’s in this machine format and that allows it to persist over time. I think that’s the key mechanism people point to. The idea that there’ll be this value lock in is essentially tied to the fact that there is this inherent property of computers that allow information to propagate with high integrity over time that is related to one’s values.
I mean, I think this reveals a couple of interesting things. So first of all, it is interesting if you look at, say, the longevity of links on the internet or internet websites, that this abstract view that information is just very easy, it’s just preserved, just doesn’t seem to be true. In fact, most websites have a fairly short lifespan and often a lot of the information in them is just lost unless it’s archived. But even if it’s archived, it’s kind of hard to find it in the archives. The second thing is this assumes that the reason human values might change or drift over time is because of a sort of difficulty of faithful transmission of the values. But if that was the case, you would have expected human value drift or change to have slowed down as our technology for maintaining information has improved, but that’s the exact opposite of what has happened.
And that’s happened in multiple stages. You’re not just talking about computers, you’re also talking about the fact that we develop writing.
Yes.
Then after that we could print things, so we had books, and then of course, in the information age, it’s become vastly easier to preserve information. I mean, talk to almost any old person and they’re like upset about all these cultural things that are happening that weren’t true 30 years ago. In many ways, part of this is not just independent of like these information retrieval mechanisms. Part of it is caused by it because of the fact that we have computers, it enabled this completely different culture that did not exist 30 years ago. In some sense the computers, which you might have naively thought would help preserve cultural values, did the opposite on net, which is quite interesting.
Which is not even that… For instance, a lot of historians attribute the process of Reformation to the invention of the printing press. Because they’re like, before then you didn’t have like the Bible was this handwritten and hand copied thing that only a few people could read in the churches or in the nobility, and most people just didn’t have a Bible. It was expensive to have a handwritten book. So the printing press basically enabled religion to become something for the masses, instead of… Basically the masses could form their own ideas about what the religion was about instead of being told by the church what it’s about. It was an undermining of the authority of the church. I mean a lot of people believe this. I mean there’s some plausibility. The timing does fit. I mean the Reformation begins shortly after the printing press becomes fairly widespread. So there is a long history of these kinds of innovations, and the theory that value preservation is the main thing that’s difficult and that’s what leads to changes in values just doesn’t seem to explain the historical experiences of changes in values or culture at all.
We could go even before humans. In the Cambrian, arthropods were the dominant creatures who were in the ocean. After that the arthropods were also the dominant land animals on the continents. The question is, is the reason we’ve had a lot of change over the last few hundred million years that it’s difficult to preserve information over time? Because maybe you think that the mechanism by which genetic information gets transmitted over time is just not faithful enough. So the arthropods changed or evolved and then maybe they got weaker because there were just so many mutations.
But that’s not actually the story. In fact, the amount of genetic innovation, there are these living fossils… For example, all insects have the same basic body structure. They have like six legs, they’re segmented, they have like very little innovation relative to the amount of innovation that occurred during the Cambrian. So almost all of the change had occurred at the beginning, then it was actually quite well preserved for hundreds of millions of years. The reason why the insects are not considered the dominant animal on Earth right now is not because that information wasn’t preserved, but rather it was supplanted by this other thing, which was vertebrates.
I think the same thing could be true here. You could have these AIs that perhaps can preserve values in some way that humans can’t, although even that is a little unclear. But let’s just take that for granted. Then even in that case, there could just be new waves of AIs that somehow are more fit for the environment that is constantly changing. Then those AIs become prominent in the ways that land animals certainly became prominent. Of course, even among vertebrates, there were reptiles and then the mammals took over and so on. That process of dynamism and renewal can continuously occur within a framework in which information is preserved as long as there’s some sort of competition and renewal and dynamism in the world itself that allows new things to take over if they’re more fit than other things, along relevant dimensions.
Yes. I think another thing is that people often do a very local analysis. I think the value preservation thing is a good example of that. They realize that technological improvements make it easier to preserve values, which is true. So then they say, then values are just going to be preserved. But they don’t take into account that technological improvements also make it much easier to change values and to create incentives for values to be changed, to adapt to new circumstances. So they’re looking at only one narrow impact of technology and ignoring the broader impacts. Analyzing only one variable is a mistake I think a lot of people make.
They also make it, in my opinion, when it comes to AI doom, AI risk scenarios, because they say AI, like this economy, or maybe it’s one AI, maybe it’s an economy of AIs, maybe they just have this coordination with each other. Whatever it is, they’re going to be so much more powerful than us that they will just want to… It would be trivial for them to wipe us out. Yeah, that’s true, but I don’t know. There are lots of communities in the world today who would be trivial for the US or China or whatever to wipe out. They could just go into Sentinel Island, but why don’t they do that? Well, because there’s no reason, because it doesn’t pay to do that. There are a bunch of norms that you would be undermining by doing that for very trivial gains. The gains are very small, and with the direct cost of the operation and then the indirect costs of undermining certain norms and how other people or other countries will react to you, well, it’s just not worth it to do that.
Just as doing something becomes cheaper, the benefits you will get from doing it also become correspondingly smaller and some other costs of doing it increase due to the need to maintain norms in a bigger and more developed economy or something. You don’t want to undermine those norms and so on. So there are just all these other variables that would in practice prevent such an action. And again, people just ignore that, they just look at, oh, it would just be easy for them to defeat us.
Well, they’re almost perhaps imagining that there wouldn’t be these norms. It would just be the most direct consideration would just be the hard capabilities. And that’s almost if you were looking at the industrial revolution like once we get these major capabilities, you’re just looking at the raw capability of what you can do with the steamship or whatever, rather than thinking that it might be integrated within a political economic framework in which people actually have to care about these considerations. They have to care about how actions will look to other people. They have to care about undermining their reputation. Because if you do something crazy like that, then you might be violating some trust that people placed in you, or you signed on to some treaty a while ago. The cost, of course, it could very well be that you could easily wipe out this island. But we know just from practical considerations that the benefit from wiping out an island is not as large as the cost of violating a treaty like that. We just know that there’d be this massive outrage of people saying: “Why did you randomly attack this one island?” There’s no reason for doing that. They were embedded within this whole social and political framework that is just completely neglected when just talking about what are the raw capabilities of what AI will be doing.
Again, I think it’s because people don’t imagine that AIs will not just be tools for humans to use, they’ll be integrated into a social framework. It’s hard for me to see exactly why people are doing this. I think in some sense, I wonder how much of this is because it’s carried over from a time in which people thought that AIs would not be very good at natural language processing, for example. For a long time, people thought that might be the final key before we get to AGI, just the ability to have social interactions, like almost anything else would be like come before passing the Turing test. Then like at that final point you will have the AGI. But I think once you recognize that the capacity to do social interaction and engage in social interaction, understanding, communication, coordination, trade; these are all things that happen actually prior to massive economic transformation and super advanced technologies, then in that case it just makes it much more natural that prior to the point where we have these advanced technologies, AIs will be integrated into a social framework in the same way that humans are integrated with each other.
Markets Shaping Cultural Priorities
I think part of it is that people are not very used to thinking in general equilibrium or economic terms. Or it’s just easier to reason about things by looking at just one variable or one market or something. And then you don’t think about the overall consequences. In fact, another example of this, it just shows up when people think about the impact of AI on wages, the same thing where they are doing a partial equilibrium analysis of a problem that really should be analyzed in general equilibrium. It’s like a very common mistake. But is there another reason why?
So you mean the supply and demand. So if I could just elaborate. The specific thing you’re saying which is just people imagine that the most important thing for analyzing human wages is like we have this supply and demand model. You have like the supply of labor is increased because of AI and that drives down prices. I would say this is not actually the most enlightening way to view wages because the supply and demand model is most applicable to when you’re looking at one part of the economy with all else held equal. But in fact the most important element of AI is that it will be this broad force. It’s not just impacting…
This might be applicable if you were talking about AI in a particular profession that was just like getting automated soon or something, then in that case the supply and demand model would be reasonable. If you’re talking about it in analyzing AGI, then it’s quite unclear because especially there’s this concept you have to… You have supply and also demand. It’s like what exactly does demand mean in a general equilibrium context? It’s very unclear what that actually means when you dig into the technical details.
So the appropriate way to analyze this, which in my opinion is actually a simpler framework, which makes it surprising that people don’t use it, is just the production function theory. It’s just the theory of production that economists have already devised, which is that we have factors of production. We have like land, labor, capital, and a few others that you could name, entrepreneurship, whatever. The main ones being labor and capital. You can analyze that there’s an effect which is that people’s wages are determined by their marginal product, which is the amount that they contribute to total production on the margin, their individual contribution. Like if they’re added to the production process, how much do they contribute to it? That can be interpreted as their wage.
Then you can think of AI in two senses. You can think of it as just a massive increase in the labor supply. If capital is held constant, that should mean that wages go way down. On the other hand, you can also just think of AI as being this massive accelerator of all the accumulable factors of production, including capital. But then when you look at this framework, then you realize that there’s these other factors of production, like land, which can’t be accumulated even if we had AI, because land is just something that’s fixed. Then in that case you just have this, you would have eventual declining wages. So this is actually, in my opinion, this is a very simple argument. It’s mathematically quite elegant in my opinion, and it’s in some sense even more intuitive.
Although I don’t know exactly if other people would buy that. I think it’s more intuitive if you really reflect on it and study it than the supply and demand model. But for some reason it’s almost completely absent in the way that people talk about wages. Again, it’s the way that an economist would naturally approach the subject if they were to recognize it as a general equilibrium problem and they took AI seriously, but they don’t want to take it seriously. So then the only people talking about it are talking about it in these terms that look nothing like this.
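The production function argument described above can be sketched numerically. This is only a minimal illustration under stated assumptions: the Cobb-Douglas form, the exponents, and the function names `output` and `wage` are all hypothetical choices made for the example, and AI is treated as simply adding to the labor input. The speakers name the factors of production but do not commit to any particular functional form.

```python
def output(capital, labor, land, alpha=0.3, beta=0.6, tfp=1.0):
    """Assumed Cobb-Douglas production with three factors.

    The exponents sum to 1, so land's share is 1 - alpha - beta.
    """
    return tfp * capital**alpha * labor**beta * land**(1 - alpha - beta)


def wage(capital, labor, land, alpha=0.3, beta=0.6, tfp=1.0):
    """Marginal product of labor, dY/dL = beta * Y / L."""
    return beta * output(capital, labor, land, alpha, beta, tfp) / labor


if __name__ == "__main__":
    base = wage(1.0, 1.0, 1.0)
    # Case 1: AI multiplies labor 100x while capital and land stay fixed.
    labor_only = wage(1.0, 100.0, 1.0)
    # Case 2: capital accumulates alongside labor, but land stays fixed.
    labor_and_capital = wage(100.0, 100.0, 1.0)
    print(base, labor_only, labor_and_capital)
```

In this sketch, scaling capital and labor together by a factor s changes the wage by s to the power (alpha + beta - 1), which is below 1 whenever land has a positive share. Accumulating capital softens the fall in wages but does not stop it as long as some factor stays fixed.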
That’s right, yeah. I mean, again, we are trying to fill this niche of the intersection of these two sets, which otherwise seems very sparse. But yeah, I think this leads to us having this mix of views, which don’t really agree with anyone else because we both say the technology will be transformative and have this big impact, which doesn’t agree with what economists think. But then we say, no, the right way to think about the future of technology is to think in these economic terms. Think about what is going to be economically efficient and preferable, instead of just the preferences of one AI or AGI. That doesn’t fit with people who really buy into the potential of the technology, and it doesn’t fit with a lot of techno optimist people who believe that AI will just drive up wages, because that just sounds good.
Or there’s just a bunch of generic things that they say. In general, I think it’s common, not just on AI, but in general, when people advocate for something, when they want to express optimism for something, they’ll often just be very reluctant to name any downside.
Yeah.
So for me, I’m happy to talk about how I’m optimistic about AI, but then I’m perceived as pessimistic if I say, actually I think that there’s this reasonable argument that AGI could drive wages down below subsistence level. Which sounds very scary to people because the literal interpretation of that claim is that the wages that a human would earn would be insufficient for them to feed themselves. So other people who consider themselves optimists, like I do, in a factual sense will just object to that because they see me maybe as applying a pessimistic frame to the technology; but I’m not applying a pessimistic frame to the technology. I just think that’s likely accurate.
Now, of course, it might come across better if I add, I think that wages are not the only thing supporting human welfare. We can receive government income, we can live off capital investments, we can receive charity. There are a number of ways that humans could sustain high quality of living in the future, even if our wages are below subsistence level. Just the fact that I’m willing to give one concession just comes off as being pessimistic because it’s almost rare for people to present a nuanced, mixed picture of the future where it’s like, you’re not going to get everything you want. There are going to be some things that are very unusual and strange that might be very scary about the future. Even though I just described myself as optimistic about AI, I think there’s a number of perspectives from which you could very reasonably say that the future is going to be very scary and bad just because it’s going to be very different. The fact that it’s very different is like a reason for people to oppose it.
Yeah. So I think one point I wanted to get back to when I was saying people have this view of neglecting the importance of economic forces. I think they do this for culture and values as well, in the sense that they think, as I said before, that the current moment is one of unusually high leverage, and as I said, in comparative terms to other moments in history, that might well be correct. But in absolute terms, I just don’t think we have that much leverage or control over the future, because I think most details of the future, and this is also true for the present, are just determined mainly by economic or structural forces about which cultures perform better, which value systems perform better, are more suitable for whatever technological environments you’re in and whatever.
Or there might be some technology that is easy to create that allows for different types of social structures, and it’s just the fact that the world is structured in such a way that technology was easy to create. When we think about how we should design institutions, we actually don’t have that much free choice.
That’s right. To a large extent we see this because the variation in the design of institutions over time is much greater than over different countries. If you compare institutions like in 1700 to institutions today the variance is just so much bigger. The world of 1700 is just so much more different.
And it’s not. Yeah, it’s not just like the governance structures. Like the ordinary way in which people live their lives, the thing that they do day in, day out, the way that their room looks, what they do to get to work, the type of work they’re doing, all of it, the types of things that they do are just more similar across space than they would be across time.
That’s right. People often compare political systems in different countries with each other, and I think they really exaggerate the differences in these systems because the US and China is one comparison where the differences look very large to people who are seeing it from up close. But if you’re looking at the US and China from the point of view of someone in 1500, then their political systems seem basically the same. I mean, there are a few details that are different, but from the point of view of someone in 1500, they would just look at, the organizational charts of the governments and different ways in which different people have responsibilities over different things and the sheer scale of both governments…
Right and there’s this massive bureaucracy. In the past government bureaucracies were quite small. Even the famous Chinese bureaucracy was just nothing compared to modern bureaucracies. So even though China’s pointed out, when people talk about history, as being this famously bureaucratic institution, compared to countries that aren’t described as famously bureaucratic in the modern world, it was actually not very bureaucratic. Just measuring the number of people hired, the fraction of people that were actually involved with handling bureaucratic affairs and so on. It’s interesting that the way that we think about these things is it’s like there’s an emphasis given to how these institutions looked at their time compared to contemporaneous institutions, but not compared to modern institutions. That gives people an incorrect sense of how to actually analyze the differences and I think that’s part of why they underestimate it, because their view of history doesn’t take into account the fact that these things are often downplayed in historical narratives.
That’s right. The point I’m trying to get at in part is that it’s just not easy to imagine that a modern society in 2025 could be that much different from the US or from China or from Japan or whatever. There are axes in which you can definitely have variation, but those axes would look pretty small or inconsequential to someone who was living 500 years ago. They would look like most of our societies are just exactly the same. I mean there’s just so many things that are exactly the same in your life regardless of where you live. If you just spend, like a week going on vacation in China as opposed to the US, the differences are so minor. You can think about, from the moment you land at the airport, to passport control and how the airport is designed, structured, how you get from the airport to the hotel that you’re going to stay at…
And the type of activities that you do when you arrive.
Yeah, and the type of activities you do when you arrive, it’s like almost all similar. Mode of transportation is similar. In fact, it’s really hard to find differences. You really have to look and stretch to find something that’s different.
But of course, people will be happy to point those out because those are the most salient things. I mean, the things that are all the same are not as salient. Those are typical.
That’s right. I think this just shows the extent to which the way we live, our culture, values, institutions, are just shaped by economic pressures and pressures of expediency and what works well. There are cultures that are much more different and much more, you could say dysfunctional, but those countries do very poorly in the modern world, precisely because they have been unable to adapt to the modern world. So they are very poor. They’re very ineffective. Usually they’re very unproductive. But yeah, they have very different culture, a very different way of life.
They have traditional modes of producing resources, like subsistence farming and so on.
Exactly right. But those countries, those places just end up being irrelevant precisely because of the choices they made or have been unable to make… Sticking to the past, traditional culture just makes them irrelevant.
Right, I guess, in some sense. So the implication here being that, suppose you went back 200 or 300 years and you were trying to preserve, like, the subsistence farming values; in some sense you actually probably could preserve them, as long as you are okay with them just being preserved in a small pocket of the world. If you were just concerned with them existing in some form or another, you would have been relatively successful, just because there are still subsistence farming communities.
There are even hunter gatherer communities.
That’s right. But you wouldn’t have been able to get the whole world on board because that would have inherently meant that the whole world would have to adopt this entirely inefficient mode of production, which would have also meant that if anyone at some point defects from that, then they just, like, take over the world. Not necessarily in the violent sense, but just that they’re the dominant thing that’s happening. They’re going to have the most GDP, and people are going to migrate there because you can have a much higher quality of life if you go there. Then in the process of migrating there, they’re going to adopt all these different norms and cultural values, and definitely their children are going to. So it’s just infeasible in any sort of way to actually get the whole world on board rather than just preserve it in a small, much more narrow sense.
That’s right. What you should just generically expect with such a transformative economic and technological change, as we might expect from AI eventually, is that it’s just going to totally transform the conditions that shape our culture. You just shouldn’t expect the culture to stay the same, and probably you shouldn’t even want it to stay the same because it would just be inefficient. There’s a sense in which even if you think the present values we have are the best possible values, because maybe you think that subjectively, or maybe there’s a different reason you think that.
But even then there’s a trade off between the sort of enormous increase in all kinds of production and capabilities that you’re going to get from this, accompanying this cultural transformation, together with the fact that civilization maybe is going to have values that are going to be less agreeable to you compared to current civilization, but on the other hand, there will probably be some variance and then the world overall would be so powerful and so capable that maybe just some small part of the world keeps your values going. That might just be good enough. Or even if that doesn’t happen, just you trade off how much value aligned is the future with you versus how much capabilities does it have, and it’s not clear at least how that trade off is going to go.
Challenges in Defining What We Want to Preserve
I mean, I think a specific sense in which this consideration becomes relevant. And I’m not sure exactly how much people really believe this, but it is implicit in people’s…. Sometimes people will say, we need to make sure the future stays under human control, or we need to keep the future human. It’s not exactly clear what they mean by human, right? What are the boundaries here? In an abstract sense, if you are able to modify your mind or even just modify your body, upload yourself, upgrade your cognition, make your brain larger; in almost any dimension in which you want to improve yourself, if you continuously make these improvements, you’ll rapidly get to a state that would not in any traditional sense be considered human.
In order to preserve the humanness factor, you have to leave a ton of value on the table. You have to just give up all of this value that we could get from being much smarter, being much more capable, having like expanded consciousness, being able to do things that we currently can’t do, having modes of social interaction that we can’t do, and there’s all these forms of value that you’d just be leaving alone. So the dominant force, even in the good case, where AI doesn’t kill us or if there isn’t some sort of omnicidal war of the AIs that kill off all the humans and they stay aligned or whatever, well it’s not going to stay human unless there’s massive coordination to prevent that.
Even if there is massive coordination, is that something we want? Because then we’re leaving a ton of value on the table for almost seemingly no reason. The reason why you care about humanness, at least for most people, it’s not because you inherently care about that value, but because you think it helps you get other things that you care about. So it’s more of an instrumental value than it is a terminal value. So it’s very unclear to me what people mean when they say that we need to make sure things stay under human control. I think they’re just imagining that boundary has special importance, that there’s something important there. It’s possible that they have some sort of generalized concept of a human that somehow doesn’t apply to the AIs we’ll be building, but does maybe apply to posthumans who modify themselves. But it’s very unclear to me why you would have such a framework.
Yeah, I mean obviously a lot of people might say that the AIs are not conscious and so maybe they’re not moral patients. Or maybe there are other reasons that we disregard that world where they are either making decisions or maybe even a world in which humans have for one or another reason become extinct. Maybe just over time they just naturally become extinct. I mean a lot of species have become extinct in the natural world. So it would not be too surprising for humans to eventually become extinct just for boring…
Just base rates and the fact that when there’s a lot of things changing in the world, then things can just happen like that.
Yeah, that’s exactly right. I mean there’s also the point where we should expect in the longer run future for us to have more control over the direction of the species and of evolution. I think so far we have just been unable to engineer or build humans. We just have to rely on natural processes for that, and we can’t even really influence the natural process too much. Even our ability to influence it is quite limited in any direct way, but I mean there’s no reason to expect that to just continue in the future. Once that changes, then there’s a choice about what we will do. Maybe people will want to keep the natural way of doing it alive, but it’s just so inefficient.
Well again, it’s just like the subsistence farming or something. By maintaining the natural human form, you’re giving up so much that it’s hard for me to see why that feels appealing. I think it feels appealing because people are imagining the current state and they don’t want to currently die. Of course, if the choice is between dying and keeping your current human form, then of course keeping it is preferable. But if the choice is between keeping your current human form and just evolving over time, upgrading and becoming something that is eventually not human, then yeah, you’re leaving so much value on the table that you could have been obtaining, and maybe you’re just not seeing that you’re doing that.
I agree. Another reason I think people are very worried about AIs is because they imagine that the AIs are going to just have completely different values that are just somehow incompatible. This is usually what people mean by misalignment. I think there’s a bunch of problems with this framing. One of them is that the variation in values among humans is already very wide, and it’s very wide in a couple of different senses. One of them is that human values are usually indexical in a sense that, for instance, I care about my health and you care about your health, and other people care about their health. But people usually don’t care anywhere near as much about each other’s health, the health of strangers. If some stranger is ill and he asks for donations to get a treatment, then you probably pay very little to help them out compared to how much you would pay to help yourself.
I think there’s some sense in which people don’t recognize how true this is, because I think that they don’t think like economists in that they give more weight to what people say than to what people actually do. So it’s very easy to say you’re like a man of humanity. You care about the species, okay? It’s much more rare for people to actually care as much about a random stranger on Earth than they care about themselves. In fact, that would be. That’s almost deeply inhuman to do that. It’s deeply inhuman in a very obvious sense, if you just think about what that would mean. So that can’t be what we mean when we say people are altruistic.
When we talk about someone being a good person, another person being altruistic. What we usually mean is within the normal range of human variation, which even the extreme ends of that variation are very far from what we could imagine a being doing. Like a perfectly altruistic being that gives no weight to its own, or negligible weight to its own welfare would be crazily different than what we’d expect from a human.
That’s right.
So the idea that there’s some sort of concept of human values that’s shared among everyone in this broad sense, that we’re all just part of the same pool and that we care about each other, that just doesn’t describe human behavior. Now, it might describe what people claim is human values, but it doesn’t describe what they actually do. It doesn’t describe how they spend their money or their time or anything or how they help one another. They’re much more concerned about local family, friends than they are just concerned about humanity or something. So I really think this bucketing, like somehow human values as if it’s this shared repository, as if we’re all just like one team, almost like we’re this super organism that you can be aligned with…
It makes sense maybe to imagine individual alignment. You can imagine an AI serving an individual human. It is quite unclear like serving all of humanity like that. That’s such an abstract thing, and given the disagreement not just indexically, but I think, as you were going to say, that there’s also just vast political and moral disagreement among what the correct state of the world ought to be like.
That’s right. I once was talking to someone from Anthropic and the way I made this point is: imagine you order everyone in the world in descending order of how much you would endorse their values, how much you would prefer it if the world was more like the way they like it to be. Then imagine a world in which the median person in that world is from the bottom first percentile of that distribution. How bad is that world according to you? And most people say, yeah, that world is very bad. Right. But if you think about the effect size of going from the median to the first percentile or something, that’s like two standard deviations or something in a normal distribution, it’s actually not that large.
So that means, even aside from the indexical stuff, there’s such vast variation. In fact, if you had to sort of rank the variation in human values across the three categories, I would say the variation over space is less than variation over time in the sense that the variation over different countries or different societies at a given time is less than variation in the same society across time. But even that variation is less than the variation among the members of a single society at a given time, there’s just such vast variation.
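As a rough check on the "median to first percentile" comparison above, here is a minimal sketch. It assumes that how much you endorse other people's values is normally distributed, which is an illustrative assumption and not something established in the conversation. Under that assumption the gap comes out slightly above the "two standard deviations or something" quoted, which is still a modest distance on that scale.

```python
from statistics import NormalDist

# Assumption: endorsement of values is modeled as a standard normal variable,
# purely for illustration.
standard_normal = NormalDist(mu=0.0, sigma=1.0)

# z-scores of the median person and of the person at the 1st percentile.
median_z = standard_normal.inv_cdf(0.50)
first_percentile_z = standard_normal.inv_cdf(0.01)

# Distance between the two, measured in standard deviations.
gap_in_sd = median_z - first_percentile_z
print(f"Median to 1st percentile: {gap_in_sd:.2f} standard deviations")
```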
Even in a localized community, so like in the Effective Altruism community there are like plenty of negative utilitarians, which that would just be the view that you just want to minimize suffering without regard to maximizing happiness. That actually if it were taken to its extreme would be like a very different outcome than the ordinary version of utilitarianism where you also care about maximizing happiness. So even in this one community, which in a relative sense, there’s so many background assumptions that people in this community tend to agree with each other on. Within a single point in time in history, within a particular English speaking culture, within like particular groups of intellectuals, in people who inter socialize with each other regularly; there’s still the type of disagreement that would produce dramatically different outcomes if they got their way, if they had… It’s implausible, but if they somehow got full control of the world, this would yield very different outcomes. Even within that very narrow range, you still find intense disagreement. So then what is the basis of thinking that there’s shared human values?
Risk Attitudes in AI Decision-Making
Yeah, I think this is just people thinking to some extent in far mode. Sometimes people think that their moral values, there’s this notion of a reflective equilibrium that if only people who disagree with them thought hard enough.
Well, it’s almost convenient that it’s like this unfalsifiable, unobservable reflective equilibrium that we can just all think about. And it’s like we haven’t come to equilibrium yet, but like, trust me, we will. We just like all reflect. It’s like, well, how do you know this? Well, I’ve thought about it a lot and come to this conclusion. Like other people have also thought about it a lot. So there isn’t like this empirical trend of people converging. Or maybe there is, but it doesn’t, it’s not obvious to me. Where’s the empirical evidence for any of this?
Yeah, I agree. When you think about that again, it becomes rather strange to worry about AIs being misaligned. I think maybe the defense you could make of that case is that the power balance between AIs and humans, or maybe between different AIs is going to be a lot more concentrated. So a single AI with a particular set of preferences might just have a lot more power than any single human would ever have. That means the preferences of that AI become unusually important or something. But even there, if you look at the…
At least speaking personally, if I compare GPT-4o or Claude 3.7 Sonnet or whatever, there are certainly things I really don’t like about the preferences of those models, but if I compare them to the median person, then they just seem better. They just seem like they have better preferences, they care, they are more altruistic, they are more concerned with the state of the world. I think they are more thoughtful and sort of more careful. There are just so many ways in which they look better. So instead of sort of the median person sort of determining the state of the world, if it was more like these models, that just doesn’t look like a bad outcome to me.
In my opinion, I think part of the reason why people disagree with that point of view is almost like they’re putting their thumb on the scales. Just compare behaviorally what a normal person is like with what an LLM is like. It almost feels obvious to me that if you do that exercise, you’ll just obviously come to the conclusion that the AI values are much more thoughtful, kind, caring, compassionate, patient, et cetera.
So it just seems like the only way you could come to a different conclusion just based on our current evidence is if you’re putting the thumb on your, on the scales and saying like, well, I’m just going to give extra points to the human because it’s human. And because I just identify with the human, but I don’t identify with the AI, so I’m going to judge it more harshly. Then that’s just discrimination. That’s just so obvious that you’re just being biased in this very transparent way.
Yeah, I mean, I think if people really stopped idealizing the human values so much and just looked objectively at the actual values that we see, they would come to this conclusion. But even if that was not true: there’s the traditional meme of the AI that just maximizes something meaningless like paperclips. Let’s say paperclips, as a point of analogy. Even if that was true, there is still a lot of uncertainty about what that means. Because what if the AI maximizes the expected log of paperclip number or something like that? Well, that’s very different because in that case you wouldn’t really expect the AI to do anything really radical or transformative, because that would be very risky. The AI would be quite risk averse.
Well, because obtaining a lot of resources would not be that much better than just obtaining, like, a moderate amount of resources while making sure that you don’t do anything crazy that could cause you to be deleted or shut down, or to lose a war. I mean, history is replete with people launching wars and then losing unexpectedly. It is like, am I in that reference class? It’s very difficult to know. I think part of the response is, well, it’s the AI so it’ll just know. It won’t have any meaningful uncertainty about whether it’ll win the war. I don’t know about that though, because I think that it’s just a general fact about the real world that, even if you’re very good at reasoning, as we were saying earlier, there’s so many empirical things that are difficult to predict that, if there’s just some remaining uncertainty and you’re risk averse at all, you don’t want to risk total collapse.
I mean, it is just interesting to me that in this entire literature, I think, I don’t know if anyone has actually investigated the question of whether AIs are actually risk averse, which is very strange because if you…
They don’t even see it as an important assumption guiding this conclusion that if AI has different values, then it’ll take over the world violently and kill all the humans or at least disempower them and whatever that means.
In fact, I think the point about them being risk loving, risk neutral, or risk averse; that’s just a much more important question than the specifics of the values they have. Because if you really love taking risk, if your utility function is sort of very convex in some kind of variable, then even if the value you have is actually a value that we would in the abstract say is desirable, you’re going to take actions that are very extreme in pursuit of it and that we would disapprove of.
Right. Like an altruistic agent.
Yeah.
Might commit fraud on a massive scale because they’re not risk averse. Yeah.
Imagine how implausible that sounds, like it’s never happened before.
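The risk-attitude point in this exchange can be made concrete with a toy calculation. The sketch below is only illustrative: the stakes (`safe`, `win`, `lose`) are made-up numbers, the function name is hypothetical, and the three utility functions are just standard textbook stand-ins for risk-averse (log), risk-neutral (linear) and risk-loving (convex) agents. It computes how likely a drastic gamble has to be to succeed before each kind of agent prefers it to doing nothing drastic.

```python
import math

def break_even_win_probability(utility, safe, win, lose):
    """Smallest win probability at which the gamble matches the safe option.

    Solves p * u(win) + (1 - p) * u(lose) = u(safe) for p.
    """
    u_safe, u_win, u_lose = utility(safe), utility(win), utility(lose)
    return (u_safe - u_lose) / (u_win - u_lose)

# Hypothetical stakes, in units of resources (or paperclips).
safe = 1.0     # keep what you have and do nothing drastic
win = 1000.0   # the gamble pays off
lose = 0.001   # the gamble fails: shut down, deleted, or defeated

# Same goal (more of the resource), three different attitudes to risk.
utilities = {
    "risk averse (log)": math.log,
    "risk neutral (linear)": lambda x: x,
    "risk loving (square)": lambda x: x ** 2,
}

thresholds = {
    name: break_even_win_probability(u, safe, win, lose)
    for name, u in utilities.items()
}

for name, p in thresholds.items():
    print(f"{name}: gamble is worth taking only if P(win) > {p:.6f}")
```

With these made-up stakes, the log agent needs roughly even odds before gambling, the linear agent accepts at about one in a thousand, and the convex agent at about one in a million, even though all three want the same thing.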
Historical Lessons for AI Coexistence
That’s right. But I think people just don’t get this sense because I think they’re imagining that the contents of the values is what’s super important as opposed to this other part of the utility function, which is just put by the wayside.
I think another thing is just that there’s no engagement with what social scientists or economists have talked about in terms of what actually drives the patterns of conflict in the real world. I mean, in some sense the standard model that’s being proposed is like the AIs will go to war with the humans or take over or whatever you want to call it, because they have different values. Then is that the way that social scientists have tried to explain war in the past? They’ve just said whenever there’s two agents that have different values, war is inevitable. That is not….
Obviously someone has probably proposed this model, but it’s not like a dominant model in the field. This is not the primary way that people tend to explain wars. Instead, the primary way that people tend to explain wars is…. Well, it’s a number of things, but oftentimes it’s a breakdown in the ability to coordinate. For example, if they’re somehow not able to agree mutually to some contract because they suspect that the other side doesn’t have a mechanism that would allow them to credibly commit, then that makes trade much more difficult.
Or they just have an irresolvable disagreement about who is more likely to win in a war.
Well, that’s right. So yeah, you have an epistemic disagreement. In that case, the epistemic disagreement can cause a war because you won’t be able to find a negotiated settlement that both can agree on, because they have this just intrinsic disagreement about what would happen if they ever went to war. I think another example, which is also a pretty common cause of war, is just some sort of value that is ideological, that by its very nature is something that people are unwilling to compromise on. If there’s some sort of territory that is like a sacred part of the nation’s territory, something that they’re just unwilling to ever give up, then that could very well cause war. The larger nation that wants the territory might be very willing to pay money. They might be very willing to say, we’ll just purchase this land from you. But then in response the smaller nation will say, well, no, we can’t trade money for territory because these are sacred lands, whatever you want to call it.
So because of the fact that they’re unwilling to see different values as being commensurate with each other, then that could very well cause conflict because that by its very nature removes the possibility of compromise and negotiated settlements. But the arguments for AI would have to be that AI, in order to go to war with us, would have to have one of these features. Maybe AIs or humans have some sort of sacred value that we’re unable to compromise on, or maybe there’s no mechanism that would allow AIs to credibly commit…
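The "disagreement about who would win" mechanism discussed above can be sketched with a toy bargaining calculation. This is a simplified, assumed setup in the spirit of standard bargaining accounts of war, not a model the speakers spell out: two risk-neutral sides contest a prize normalized to 1, fighting costs each side a fraction of it, and all names and numbers are hypothetical. Commitment problems and sacred, indivisible values are not modeled here.

```python
def bargaining_range(a_estimate, b_estimate, cost_a, cost_b):
    """Shares of the prize for side A that both sides prefer to fighting.

    a_estimate: A's belief about the probability that A wins a war.
    b_estimate: B's belief about the probability that A wins a war.
    cost_a, cost_b: each side's cost of fighting, as a fraction of the prize.
    Returns (low, high), or None when no peaceful split satisfies both sides.
    """
    low = max(a_estimate - cost_a, 0.0)    # least A accepts instead of fighting
    high = min(b_estimate + cost_b, 1.0)   # most B concedes instead of fighting
    return (low, high) if low <= high else None

# Both sides agree A would win 60% of the time: a range of deals exists.
agreed = bargaining_range(0.6, 0.6, 0.1, 0.1)

# A thinks it wins 80% of the time, B thinks A wins only 30% of the time:
# the optimism gap exceeds the combined costs, so no deal satisfies both.
overconfident = bargaining_range(0.8, 0.3, 0.1, 0.1)

print("shared beliefs:", agreed)
print("mutual optimism:", overconfident)
```

In this setup, differing goals alone never empty the bargaining range, because war is costly for both sides. The range disappears only when the gap in beliefs (`a_estimate - b_estimate`) is larger than the combined costs of fighting.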
So, in fact, I do want to point out that when you think about the problem in these terms, you begin to realize that efforts to align AI might actually be actively harmful because humans clearly get into a bunch of wars. And humans have a bunch of these tendencies where they, for instance, tend to see certain values as sacred and not negotiable, like territory or nationalism, and there are other things that I think people would not want to compromise on. People also tend to be miscalibrated about the likelihood of winning wars, and that’s another thing that drives them towards war. When we align AIs, currently we are giving them the same biases that lead humans to engage in war.
So they’re also, in addition to getting maybe kindness, they’re also getting the ideological values of like, you shouldn’t give up territory, or you shouldn’t compromise on some sort of religious end or whatever.
Or they’re also being miscalibrated. I mean, their calibration gets worse.
Oh, absolutely. With RLHF, I mean, that was a major result. I remember GPT-4.
I think people just should think more. It’s just so weird to me that they are worried about what is in effect like a takeover, war, or conflict, and then there is no engagement with the actual literature about what causes conflicts in the real world. Because if they were engaging with that literature, they would realize that the preferences are actually not the story. Almost nobody has said that two actors end up in a conflict because they want different things. Yeah, fundamentally that is a reason. If they just agreed, if they just had literally the same preferences, then there would be no reason for them to get into a conflict. But that’s just such an extreme demand. The human value distribution is already so wide that it’s just impossible to have an AI that’s aligned with something you could call human value. So that’s just not a feasible solution.
I think another thing driving this, though, is that people just are not aware of the empirical literature on the costs of war, or even if they are aware of it, they just think it won’t apply to AIs for some reason. And so I think the common refrain that is really driving a lot of this is: the AI does not love you, nor does it hate you, but you’re made of atoms that it can use for something else.
So I don’t know how literally people are supposed to take that, which is never a good sign: it’s unclear how literally you’re supposed to take the central argument underpinning everything, the one that people rely on and come back to. It’s a weird… But putting that aside for a moment, if we just take it literally, the amount of atoms in individual human bodies is just extremely small. So you’d have to make the case that somehow going to war with humans is worth the benefits that you get from somehow getting their atoms from their bodies. I think the intuitive first reaction that people have to this for some reason is just to think, well there’ll be no cost at all to going to war. Well, obviously like the AI will just be this super intelligent agent that can just create a super virus that just kills everyone without anyone knowing anything.
Okay, but if it’s more like a conventional military conflict, then that is not remotely true. There will be these enormous costs in procurements. There will be these reputational costs of defecting against international norms or just existing legal norms. Another mistake people make is they assume that all the AIs will be on the same side, when there might be different factions among the AIs, and even among the humans. It’s not exactly clear that it’s just going to be this AI versus human conflict. In the real world, there’s various ways that conflicts tend to be defined.
There’s various lines that we draw conflicts around, and there are many lines that conflicts could be drawn around but that, in practice, they rarely are. Like, it’s almost never the case that a society has an actual war between men and women. Maybe this has happened, I don’t know, but that’s at least extremely rare in historical terms. You could imagine various other lines through which we could draw conflict. You could imagine the Old World going to war with the New World. You could imagine Europe going to war with Asia or whatever. There’s many lines you could draw, and the point is that actual conflicts tend to be drawn only around a relatively small selection of all possible lines that could be drawn. And human versus AI is one possible line along which a conflict could be drawn. There just hasn’t been that much emphasis in any of these arguments on justifying exactly why the line would be drawn along that axis.
Even if the line was drawn along that axis, what motive exactly do they have for going to war? Even if they have a motive, why are the costs of war so low that it makes sense to do it even though you’re getting a relatively small benefit from killing humans? I mean, apparently in the literal case, you’re just getting the atoms from their bodies, which seems… Obviously maybe the steel man of this is that you’re getting land or something. But none of this is specified in a way that, in my experience, allows you to engage with it. People think that there’s some detailed argument behind this, but there’s a bunch of people who’ve just interpreted it in their various individual ways, and then you’re just playing whack-a-mole when talking about it.
A Warning Sign in Safety Discourse
So one of the things that is often a bad sign, and I think this happens from… Again, I will go back to this division between sort of economists and people who are very pessimistic about AI in the sense that they think it will kill everyone. I think both sides have this weird tendency where there is a shared conclusion. So for the economist, it might be, oh, AI is just not a big deal. It’s just not going to have a very big economic impact. For the sort of AI doomers, it might be that AI is just going to kill everyone. It’s very likely to kill everyone. At least.
Or at least I would say more traditional view. Just like it’s going to kill everyone, unless it shares our values.
Yeah, unless it shares our values. I’ll say that, okay. You have both these views and if you ask different people, they all give you sort of different arguments for why this view is true. This is actually fairly strange. So if you imagine that there is a conclusion, imagine that you ask different economists, why are tariffs usually a bad idea? You would not get 100 different arguments from 100 different economists. You would get basically the same argument about comparative advantage and gains from trade and specialization. This is basically the same argument. That suggests that, yeah, that argument is pretty good because everyone seems to just find it very plausible, that particular argument.
But in the case of why will AI not be a big deal? Economists often agree on the conclusion that it will not be a big deal, but often they just give wildly different reasons. So some of them say, oh, it won’t be a big deal because it will be very tightly regulated and we will just not deploy it very much.
There will be Baumol effects….
There will be Baumol effects. There will be bottlenecks. They think it will just take a very long time for us to develop a technology instead of saying that, oh yeah, I believe the technology will be developed, but then it will just not be good enough. Or they will say, oh, I don’t believe in this model of growth where as you scale up R&D effort, they don’t believe that productivity is driven by scale. They think it’s driven by the quality of institutions. And then there’s no reason to expect AI to have an impact on that or something. I mean, I’ve heard this argument. Or they’ll say, oh, I don’t see a way for us to make going to a restaurant or going to a hairdresser a much more pleasant experience than it already is. We are bottlenecked. AI can’t improve that, so then we will just be bottlenecked. I mean, there are just tons of different arguments for basically the same conclusion. And on the AI doomer side, it’s the same thing.
So what’s suspicious here is not that people have… If we were just analyzing this in the individual case, any individual argument is not necessarily flawed for this general reason. But it’s the fact that they have different arguments for the same conclusion. Yeah, it’s not different arguments for different conclusions.
Yes, that’s right.
Which would have been more. That would be what you’d expect if it was just a bunch of people working independently to a conclusion based on their own line of thinking. There actually seems to be a shared sense that you should come to one conclusion even if they haven’t arrived at that conclusion via the same route.
That’s right. I just think this is in general a sign of motivated reasoning that people arrive at the same conclusion but very different arguments. In the case of the people who are worried about AI extinction, there isn’t even a canonical case or argument that has been written down for why we should expect that.
Well, I think some people have tried and then other people go, oh, that’s not the, I totally disagree with that. So you can’t… Maybe some tiny faction of the community thinks that’s like a canonical argument, but like then a much larger fraction of the community are like, no, no, I think that’s a terrible argument. Then actually like you have to point in this other direction. Yeah, well it’s like that should… I mean at least, at the very least you might still think you’re correct even if you’re in such an epistemic situation. But at the very least you might want to reflect and go, why is it that there are so many people who agree with me who are arriving at this position from a completely separate perspective that is suspicious just in any domain, if that occurs.
I find it suspicious. So for instance, our reason for expecting AI to be a big economic deal, I think, is basically this: there are maybe a few arguments, but there’s really one central argument about increasing returns to scale, that AI will restore this previous trend of growth acceleration because it will do R&D. On top of that, it will turn labor into something that can be accumulated, like capital, researcher effort and so on, and that will just lead to increasing returns to scale. I think it’s a very canonical argument that is in line with the existing theory in growth economics about what causes acceleration and long term growth. So it’s not a theory that we have specially developed for AI.
In fact, it was a theory that was completely developed independently from AI. No one had originally said anything about AI. I think that the original people talking about it were mostly saying, here’s this acceleration, or at least in the field of economics people have just said here’s this acceleration and it’s now going to stop. So they didn’t develop theory to predict this future acceleration. It’s almost like they just didn’t extrapolate it or something. But I think that does give it some level of credibility because then it wasn’t specially developed for AI. It’s just that you can notice this connection to AI.
That’s right. I agree that this makes this particular story more plausible. The same thing is true for the question of why you think AI is plausible, why you think it’s something we should expect in the near term. Again, I think there aren’t a million different arguments for why you should expect that. I think there are maybe one or two really decisive arguments. One of them is just an argument based on evolution. The other one is looking at maybe the past 10 years or a little bit more than 10 years and looking at the experience and maybe trying to extrapolate that. But those are just very simple arguments in my opinion.
The arguments that we discussed previously, so many of them feel so random and also so uninformed by the relevant literature on the topics that the arguments are about. People make arguments about AI’s impact on wages without looking at the simplest economic theories about what determines wages. Or they make arguments about the likelihood of AI going to war with humans or taking over from humans without looking at the basic theory people have developed about what causes conflict and war. Or economists make arguments about what is going to happen with AI without looking at the basic things that experts in AI think the technology will be capable of. And it’s just so bizarre that this is considered an acceptable way to argue. You just ignore all of the serious work that has been done in related fields and you just sort of derive some speculation from first principles and intuitive reasoning.
Well, you know what it kind of reminds me of? It kind of just reminds me of how narrow LLMs are in some sense, which is like you’re using them to apply it to a specific problem and it’s not able to make these connections. It just has this large base of memorized things and that’ll just trot out. But it’s not like applying them in a way that integrates all of these facts across different disciplines and synthesizing them in a coherent way.
In some ways that’s what makes me think that maybe humans are not that much better than existing systems. Maybe this is kind of what makes me a little more optimistic than you is that I see this and I go, this is kind of like what we see in our current AI systems too, in a sense. So the critiques people have about how the current systems aren’t doing this. It’s like, well humans don’t do it. Okay, we do it a little bit. But there are some obvious ways in which they could just do it a little more. And it’s like, surprising that they don’t.
Revisiting Core Assumptions in AI Alignment
I guess one thing we could end with is discussing why. So maybe some of what you’ve written about. You’ve already written about this sort of value misspecification type stuff. But why? Given that, one, we seem to observe the systems empirically as just being better aligned than most people are, this concern about them being misaligned still persists, on top of this assumption that, oh, if they’re misaligned, then we’re screwed, which is just a whole different assumption. There is this strange persistence to the view that the systems are going to somehow have malign preferences, maybe as a consequence of the way they’re trained. This, insofar as I can see, is not based on any empirical evidence.
There are some studies of alignment faking and whatever, but those have to… Compare it to what you would have assumed 10 years ago, if you hadn’t seen anything like the modern systems and you were asked to make predictions about how aligned they would be, how well they would understand human values, what humans would want them to do in certain situations. I think most people, and they might deny this claim, would have predicted something a lot worse than what we ended up getting. I mean, if you just look back at what people have said historically, they said, oh, if you ask the AI to fill a cauldron, then it will destroy the world in order to increase the probability that the cauldron is filled by a tiny amount. And that just seems totally different from…
And then what they’ll say is, well, that wasn’t intended to be like a realistic example. It was just to illustrate the point. Would you still think that works as illustrating the point? Because I think, no, I think you would agree that if you tried to use that as an example of how the thing might fail, then everyone’s going to see that and go, I don’t see how the system’s going to fail in that way.
So at the very least, you can concede now that would not be a good example. At some point 10 years ago, you thought that was a good example, so you updated on something. Maybe in some literal sense you didn’t think it would happen exactly that way, but you definitely updated, in the sense that that’s not the type of failure that will occur. Because it’s obvious that even now you can concede that people would object to that if you tried to raise that as some sort of example. I just find it frustrating because I think the unwillingness of some people to concede that point, to just say that systems are not failing in… it just seems manifestly obvious they’re not failing in the way that people had traditionally imagined a failure.
Then sometimes they’ll say, that’s because the systems aren’t like super powerful or they’re not like super intelligent. But then it’s like, so did you imagine that there’d be no way to get any empirical evidence about the way that the systems would fail? Because the only point at which you could ever observe failure was after it was super intelligent. Okay, but even if that was your view, it’s just like from your perspective, you’re just saying all empirical research is just pointless, because what is even the point of doing anything empirical if you can’t get any evidence about what this failure mode will actually look like? So then that almost undermines itself: if they were to point to examples of existing research that somehow justify the claim that these systems are likely to be misaligned, then isn’t that contradictory? On the one hand you’re saying that it’s unfalsifiable. On the other hand, you’re proposing evidence that you claim supports the theory. You have to choose one or the other.
Well, I think that, I mean, there is a way you can frame this in which maybe they think there’s 80% chance that there’s going to be no evidence, but there’s 20% chance that there will be. So there being no evidence is not decisive against their theory. But there being evidence might be decisive in favor. That’s not inconceivable. If you have a theory that there will be an earthquake of a certain magnitude in San Francisco in the next year, then that is the kind of theory where you have to wait until the end of the year to find out it’s not going to happen. But the earthquake happening at any time in the year can prove that you were right.
But I think it’s not really like that, because if you look at the original writing, which you have done, Matthew, you have written about the historical value misspecification argument. If you just look at the original text, people are clearly saying there are two things that are going to be hard. One of them is specifying what humans care about to a sufficient precision. And then the second one is to get the AI to care about that. So both of these are hard, but the first one just seems like it has been solved. You can just plug any system into GPT-4 or whatever, and it already has a sufficiently good understanding of human values. You can just describe a very concrete, detailed situation and then ask, what would be the moral or ethical thing to do in this situation? And it would just answer as well as a human would.
Which I think causes this confusion, because I think this is the number one objection whenever I bring this up, which is just people say, it was never predicted that we’d have a hard time with the system understanding human values. That was never the hard part. The hard part was value specification. Okay? But what people meant by specification, the original way of thinking about it, was that they’d code up some sort of utility function that would represent the AI’s values and that this would be some sort of formal specification, formal representation, codification of the things that humans cared about. Then you put the codification in the AI, and then there’s this entirely separate thing, which is you also need to write the code that causes the AI to actually maximize that formal specification.
The specification part is not merely understanding. The specification is specifically some sort of representation of the values that is being accessed by the model in a transparent, legible way and that is actually being implemented. For example, just getting GPT-4 to do what you want it to do, taking actions that satisfy your preferences, that demonstrates that you’ve done the specification correctly, because that shows that there is some representation of what you care about in there that is correctly being accessed. Then it manifests in actual behavior, not just understanding, but in the actual observable, executable behavior of it performing the task that you want.
I just feel like there’s this sharp distinction between these two things. And for some reason, almost every time I explain this, there’s just this… It’s almost like I like to explain this over and over again to people and then they’re like, “You mean that you’re saying that people thought it’d be hard to get the AI to understand human values?” It’s like no, that’s not what I’m saying. It’s almost weird because I almost feel like, at the time these arguments were written, I would have thought that everyone had this shared understanding of what the arguments were saying. So there was an illusion of transparency.
This illusion exists in other arguments as well, in the same community. I think a lot of LessWrong arguments have this basic character. They seem clear somehow. I mean, they don’t seem clear to me, but they seem clear to other people. Everyone understands them in their own way and then they don’t realize that their understandings are not compatible. It’s really funny to me, but I guess the really interesting question, maybe a question that we can close with, is about the empirical evidence we have received. I mean, sometimes people from LessWrong will say that actually the success of deep learning was evidence in favor of misalignment just because we don’t understand the internals of what’s happening in the models.
But I think beyond that, if you just look at the success of existing techniques of alignment, existing techniques of getting the models to be helpful and harmless and pleasant to talk to and have a good understanding of human values. It just seems like there’s no sign of any difficulty empirically. So basically all the arguments for why there is this, even the potential for AIs to have some very weird values like maximizing paperclips. Setting aside the fact that even if that was true, there’s no reason to expect a conflict or anything like that, it’s a whole separate argument. Setting that aside, just focusing on the values, there doesn’t seem to be any reason to expect that to happen beyond just abstract theoretical arguments about the properties of certain kinds of optimizers which seem very speculative and very tenuous at best. I’m being generous here.
Definitely none of those arguments would have initially predicted the success, because initially the success of the deep learning paradigm was not predicted. So clearly people really don’t have that good of an understanding of how these models work when they’re optimized.
I just feel like it’s always suspicious when you have to say that the empirical evidence, the most directly natural interpretation of what we’re actually seeing, is irrelevant because of some abstract argument that doesn’t make predictions in a near term functional or practical sense. That is always just a very suspicious thing to say in almost any domain. I mean, now I’m not claiming that’s always suspicious in all domains, but the prior here should be against that type of argument. It should not be the type of argument that should sound natural to anyone who has an understanding of how science works, right?
I mean, though I do have to say it is self consistent in the sense that they both believe that reasoning alone and abstract reasoning alone can get you to have such a high confidence in such an extrapolation, such a conclusion. Right. And at the same time they believe in the AI in a basement, a country of geniuses in a data center, or just by the power of pure reasoning, these things. So that is consistent. They just believe in the power of abstract reasoning.
So if they’re dismissing empirical evidence in some way, then yeah, it’s not like they’re being inconsistent in the way they’re evaluating evidence. Yeah, maybe that’s right.
So I mean, maybe what’s going on is just that they have this confidence in abstract reasoning. But there would still be the question of why doesn’t anyone use abstract reasoning to come to a different conclusion? That happens a lot in domains where empirical evidence is not available to settle the question. Obviously there’s a ton of disagreement in philosophy when it comes to questions about which there is no empirical way to settle the matter. There is even disagreement in physics. You can look at various extensions to existing physical theories that people have proposed to try to settle various conceptual and other problems in existing theories, but there is no empirical evidence that directly bears on those theories and can discriminate between them.
So what happens is people just have tons of disagreement and they can’t converge on the answer. It’s just very normal, even in domains that are objective like physics, for people to have strong disagreements just when they’re basing their arguments on abstract reasoning, when there’s no empirical evidence available. So it is just very suspicious if you have a convergence of views that is about such a speculative question based only on abstract arguments which are individually different and often understood differently by different people who read them. They’re not even sufficiently clear in that sense. So I think that is just interesting that people put that much weight on this kind of abstract argumentation and that it just seems to convince them.
When I hear such an argument in almost any domain… Even another category of this is postdiction, post hoc explanations, where you’re explaining something after the fact, or you’re trying to give an account of something that has happened… like historical analysis for instance, you might say that the reason the Americas didn’t develop is because they didn’t have domesticable animals. That’s one explanation. Okay, you can tell a really nice story. But what is the empirical evidence for that story? As with other stories, there’s tons of stories you might tell of a similar character.
Well also I think that people just underrate what it means if there’s not that much empirical evidence. If we don’t have a lot of, especially, experimental evidence of like different scenarios, we can’t like rerun the Americas or whatever. This is the type of domain in which it seems like the best models should be quite simple. They shouldn’t be like very detailed models that can pinpoint exact things about the environment.
You just don’t have that much evidence.
Yeah, that’s right. The evidence only gives you enough confidence to narrow down the possible theories into theories that are very high level. There’s not enough there, such that you could justify a very detailed theory.
Yes, you just have so few bits of evidence from such a historical outcome that it’s just impossible to say, really. You can say, oh, I looked at the details, but like different people look at the details. There’s lots of details in history so it’s usually very easy to find support, especially when you get to curate the evidence that you’re looking at… A good example of this is imagine that you, I don’t know, flip a coin like 100 times, or maybe 1,000 times. It’s going to come up heads like 500 plus or minus 40, 30 times, whatever. If you were sort of looking at coins chosen at random, then suppose that you’re trying to figure out if there were more heads or more tails, right?
If someone is just arguing for the position that there are more heads, then there are so many heads they can pick. The same thing is true for tails, regardless of the outcome, and reality is often just like that. If you want to support a hypothesis, it’s very easy to select things that seem to support it, but you need to have this balanced view and you need to consider all of the evidence available. If you just look at evidence that looks maximally favorable to your theory, then it’s very easy to just get stuck in this feedback loop where you just keep updating up and up and then you just end up very confident. Even though actually if you sort of look at the whole body of evidence it looks very murky and mixed and there isn’t actually that much decisive evidence that you’re right or wrong one way or another.
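To make the coin example concrete, here is a minimal sketch in Python. The function names, the seed, and the choice of 50 "exhibits" per side are illustrative assumptions, not anything from the conversation. For 1,000 fair flips the standard deviation of the head count is sqrt(1000 × 0.25) ≈ 15.8, so "500 plus or minus 30" is roughly a two-standard-deviation band.

```python
import math
import random


def flip_coins(n, seed=0):
    """Simulate n fair coin flips; True means heads. The seed is arbitrary."""
    rng = random.Random(seed)
    return [rng.random() < 0.5 for _ in range(n)]


def cherry_pick(flips, want_heads, k):
    """Return up to k flips that agree with the side being argued for."""
    return [f for f in flips if f == want_heads][:k]


n = 1000
flips = flip_coins(n)
heads = sum(flips)
tails = n - heads

# Standard deviation of the head count for a fair coin: sqrt(n * p * (1 - p)).
sd = math.sqrt(n * 0.5 * 0.5)

# An advocate for either side can present 50 supporting flips and no others.
case_for_heads = cherry_pick(flips, True, 50)
case_for_tails = cherry_pick(flips, False, 50)

print(f"heads={heads}, tails={tails}, sd of head count ~ {sd:.1f}")
print(f"curated 'more heads' exhibits: {len(case_for_heads)}, all heads")
print(f"curated 'more tails' exhibits: {len(case_for_tails)}, all tails")
```

Both curated lists look equally convincing on their own, and neither says anything about whether heads or tails actually came up more often. Only the full count does that.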
So I think this is another thing people do when they just come up with a thesis. When they write a book on something, let’s say, or a blog post, when they write an account that’s trying to support their thesis, they look at the examples that sort of best illustrate it. While in fact what they should be trying to do is look at the examples that are most problematic, that are the worst examples for their theory, and say, in this example, what went wrong? Do you have an explanation for this? Maybe you don’t. Maybe it’s just some random thing that happened, but you think there’s still an overall tendency in your direction.
But that’s just something people don’t do. I’ve never seen people who believe in AI doom try to grapple with the bad predictions of their model in this way. They usually say, oh, we didn’t actually make that prediction. We sort of deny that it happened. Or they deny that it’s somehow surprising to them. They deny that it’s relevant. They just don’t want to deal with that. While if it ever happened the other way around, imagine that there was a key experiment, a key result that showed the model being deliberately deceptive in a harmful way.
And we weren’t trying to train for that.
We weren’t trying to train for it, it just happened. They would just be… Can you imagine the amount of gloating? “Yes, we were right. See, we told you all along.” But if you’re going to do that, then when it doesn’t happen, there should be at least some impact, some update on that. Maybe it’s not a huge update because maybe you think it is unlikely to get evidence, as I said, until everyone’s dead. But that itself should be a suspicious theory structure if you believe that, because you’re sort of guarding yourself against the possibility of the theory being falsified.
But even if that is your view, you should still be making some updates. It’s just something that the structure of arguments around this just not only seems poor, but it seems suspicious. It has this quality of, I don’t know, some philosophical or religious or whatever arguments that are just, not only they are wrong, but they’re sort of, in principle, they are the kinds of arguments that would be unable to reach any empirical conclusion.
But they can reach the largest conclusion just not like smaller conclusions.
Yeah, not smaller. That’s fine. Apparently they can do that. So whenever I see it, it also makes me suspicious in other domains like history or whatever. Someone just comes up with this grand detailed theory of history and then they just sort of pick and choose evidence that seems to support their view and they ignore evidence that goes against their view.
But I think part of this is they’re really just not doing it consciously. Yeah, they’re not doing it consciously. I think that this just explains much of the frustration.
Simple Models in Complex Domains
That’s right. I think people just do not have this idea about elaborate and abstract theories and explanations in domains where, first, we don’t get to run experiments, at least not decisive experiments that would be able to discriminate between the relevant theories, where there is a lack of data, and where there’s just a ton of complexity, a ton of variables that in principle might matter. In those kinds of domains, the natural thing to do is just be skeptical of anything that is not an extremely simple explanation. That’s not because the true model of the situation is extremely simple. It’s likely to be extremely complicated. But it is not a model that you will be able to estimate or infer any parameters of.
So you can only make deductions about sort of the simplest tendencies, the simplest high level phenomena, and you can’t make any deductions about anything more detailed than that because there’s just not enough evidence. But people don’t, they just don’t have that bias, which is why there are entire books written about intricate and detailed historical, sort of grand historical theses like…
Guns, Germs and Steel.
Guns, Germs and Steel is an example. That’s right. And interestingly, people will often ignore sort of the brute facts. Often very simple and very striking facts will be ignored…these more complex elaborate things will not be ignored.
So for instance, one example, we might emphasize the population over time as being responsible for the acceleration of technologies…
You have more people and economies of scale and the greater surface area for more discoveries and more economic activity which will add more tools. It’s just this simple increase in scale. Very compelling but very underemphasized. People often ask questions like why did Rome not have an industrial revolution? The simple argument is just that their level of technology and economies of scale and capital stock and all of these, they were just not sufficient for them to have an industrial revolution. But that’s just a boring answer. Just saying, well, they just need to wait longer. But okay, well sometimes that’s just the answer.
If someone asks you why is some 15 year old not yet… Why was Terence Tao at 15 years old not good enough to prove the Green–Tao theorem? Okay, well he just hadn’t accumulated as much experience and hadn’t made as many, maybe, connections with potential collaborators, didn’t have as good advice or something. Certain things need to accumulate, but people just find that a very boring and a very dry explanation.
So they’d prefer some sort of weird, interesting narrative that he lacked this particular insight. And then that was like the final piece or something, but it’s like actually like no, I mean at 15 he wasn’t like that at all. Yeah, yeah.
And the same thing is true for economic history questions, some people just do this all the time. It’s just like another thing with things like Guns, Germs and Steel. There’s this well known correlation of the wealth per person or income per person of countries with their distance from the equator. If you look at it, it’s just really striking and it just seems so weak to say, oh that’s just an accident, why would that apply? Clearly something is going on. A correlation of that magnitude in the social sciences doesn’t show up if there isn’t a reason for it to show up. So if you’re interested in what makes countries wealthy, it just seems like a very relevant thing to study that correlation, but often people just put way more stock in things that are just way more speculative than that simple raw correlation.
Well, I think one other thing people will do is they’ll just list a bunch of factors and not prioritize which of these factors are important. I don’t know exactly what Guns, Germs and Steel said but I think it named the fact that Eurasia was a large continent as being one factor or something, for like its outsized importance. It sounds like the whole book should be about that because that just sounds like such an important fact in the discussion. The fact that it’s such a large continent. Then if you just list 10 factors of here’s why the world played out the way it did, and then you’re giving like a tenth of the weight to a factor that might explain like 50% of the relative variance or something, then that’s just misleading.
Yeah, that is misleading. Yeah, yeah, absolutely. But I think you should just by default expect the largest variation to arise for the most boring reasons.
That’s right.
For instance, the biggest source of variation in the ability of armies to win wars is just the amount of equipment they have and their size. Okay, well, that’s just very boring if you just say, well we went to war and then we won because we had more people and more guns. That’s such a boring explanation.
I think it’s partly, people want to find it interesting, but you have to distinguish what is super interesting about this, about what’s going on with… Well, okay, we know it’s super interesting, but can you just extract what the actual fact that is most true here about what we know about why this thing happened? I think people just can mix up the two. They don’t necessarily explicitly conflate them, but I think they just, they don’t make this consistent distinction of how they think about these two things.
I mean, historians in particular love talking about these long narratives of what occurred without necessarily pausing and making a strong emphasis that, like, 90% of what they just said was probably irrelevant, because what we can infer based on the evidence is that it’s probably due to this very boring thing. Yeah, obviously that’s like the most important thing, but I’m going to just give 90% of the narrative that won’t give you really much insight at all, but it’s still fun to read. But I think they just don’t think like that.
Yeah, I totally agree with that. People even do this in AI research, and I think they did it for a long time. I think it has changed now to a large extent because of the scaling hypothesis getting more traction. But for a long time people would just talk about their clever insights and research ideas and whatever, and they would sort of underemphasize the amount of compute that was spent on a project or the amount of data that had been collected, or the amount of experimentation that had to be done to discover the setup that later went into the paper. Those are actually the things that drive most of the variation across the performance of AI systems. But they would focus on the things that are most interesting, which are the…
Architecture.
Architecture. So the exact way in which it was trained and the exact way in which the weights were initialized and so on. And yeah, those are more interesting to a machine learning researcher, but they are actually responsible for only a small fraction of the variation in performance across systems. If you’re talking about this clearly, what you should really talk about is things like compute and data quality and the budget you have for the project. How much experimentation did you use to discover the setup that worked? Maybe even talk about scaling law experiments you’ve done. How did you come to decide on this particular architecture? Why did you do this? And often people will like embellish…
Just this genius insight…
And often they would present an ex-post theoretical argument for why their choice was good, even though the way it was discovered is not theoretical at all, it was just discovered by experimentation.
Or I mean this was especially common in the old days, just presenting extraordinary results and then attributing them to their insights and excellent architecture. And then people just weren’t very curious about the fact that, well, obviously the reason why the results were extraordinary is just because they used so much more compute than everyone else who was working on this problem.
You know people also did this with like AlphaGo.
Yeah, absolutely.
People are like, oh, like Google, they came up with this genius idea of like …Another funny thing is that the search AlphaGo does is called like Monte Carlo Tree Search, even though it’s deterministic, just because it was inspired by other Go bots which do do random search.
So they just kept the name?
Yeah, they just kept the name even though….
I didn’t know that.
Yeah, it is a tree search, but it’s not random.
Interesting.
But it’s just such a simple thing. Clearly there is almost no algorithmic insight in AlphaGo Zero. There’s nothing. In fact to a large extent that is deliberate. So people later on improved on the model, but they did so by introducing a lot of specific features and architectural details proper to Go. But the Google team, they explicitly didn’t want to do that because they wanted to show a general algorithm that could work for any game. So then people just said, oh, like Google, they just did this amazing thing, like amazing software, but they just didn’t speak at all about the fact that AlphaGo Zero used, I think it used as much compute as GPT-3 or more compute than GPT-3, something like that.
Well, there was a long time in which AlphaGo Zero was the most expensive model.
Yeah. So, yeah, they had really good results. I wonder why? People ignore this basic fact. I mean, this was before the scaling hypothesis was really dominant. So I think people just ignored it. When the results came out in 2017 for AlphaGo Zero, I did not know for a lot of years after that that the key variable that had driven those results had been the amount of compute that they spent. I was like, oh, and then people would read the paper. It doesn’t matter. Probably, with that scale of compute, you could have done almost anything and it would have worked. You didn’t need any special algorithmic secret sauce or anything. You just do a few attempts until you get something that looks like a reasonable scaling curve of Elo against compute spent, and then you just scale it up and then it probably works. It’s a very simple approach. But people don’t want to talk about that, as usual. And I think people don’t want to…
Well, there might even be a strong incentive not to talk about it. Both from the researcher point of view, where you want to emphasize your insights, but it also makes you look super inefficient as a company. The fact that you only achieved certain results because you just spent a lot of money. I think that there’s a sense in which that looks really bad.
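As a rough illustration of the "fit a scaling curve on small runs, then scale up" approach described above, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the three (compute, Elo) points are invented placeholders, not measurements from AlphaGo Zero or any real system, and the straight-line relationship between Elo and log10(compute) is a simplifying modelling choice rather than a claim about how any lab actually did it.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# HYPOTHETICAL small-scale runs: (training compute in FLOP, measured Elo).
# These numbers are made up purely to show the mechanics of the fit.
small_runs = [(1e15, 1000.0), (1e16, 1400.0), (1e17, 1800.0)]

log_compute = [math.log10(c) for c, _ in small_runs]
elo = [e for _, e in small_runs]
slope, intercept = fit_line(log_compute, elo)


def predicted_elo(compute_flop):
    """Extrapolate the fitted line to a larger compute budget."""
    return intercept + slope * math.log10(compute_flop)


print(f"Elo gained per 10x compute (fitted): {slope:.0f}")
print(f"Extrapolated Elo at 1e18 FLOP: {predicted_elo(1e18):.0f}")
```

The point of the sketch is only that the decision to scale up can rest on a handful of cheap runs and a fitted trend, with no special algorithmic insight involved. Whether a real trend stays linear over the extrapolated range is an empirical question.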
Yeah, true. Anyway, I think the scaling hypothesis becoming popular has definitely changed some of this. I still think people underrate the extent to which compute also drives algorithmic innovation. I think they just attribute that to…
They’re happy to agree with the scaling hypothesis, but with this idea that, well, algorithms still are an important independent factor in determining AI progress. It’s not that the scaling hypothesis is wrong or anything, but that there’s this channel which doesn’t route through compute, which is just like researcher talent. And that’s just like this separate mechanism for how we could get like these extraordinarily performant systems. Yeah, I think that’s right. It’s almost like there needs to be a scaling hypothesis 2.0, like the strong scaling hypothesis: compute is just so important that it’s even a large driver of these other things which you thought were not related to compute.
Yes, I mean, to some extent we are seeing that for, I think, synthetic data, which costs a lot of compute to generate.
Oh, yeah, that’s right.
There’s some extent to which we are seeing that here. Yeah, I think that’s probably a good place to end it.

