AGI timelines, offense/defense balance, evolution vs engineering, how to lower p(doom), Eliezer Yudkowsky, and much more!
Timestamps
00:00 Trailer
03:10 Is My P(Doom) Lowering?
11:29 First Caller: AI Offense vs Defense Balance
16:50 Superintelligence Skepticism
25:05 Agency and AI Goals
29:06 Communicating AI Risk
36:35 Attack vs Defense Equilibrium
38:22 Can We Solve Outer Alignment?
54:47 What is Your P(Pocket Nukes)?
1:00:05 The “Shoggoth” Metaphor Is Outdated
1:06:23 Should I Reframe the P(Doom) Question?
1:12:22 How YOU Can Make a Difference
1:24:43 Can AGI Beat Biology?
1:39:22 Agency and Convergent Goals
1:59:56 Viewer Poll: What Content Should I Make?
2:26:15 AI Warning Shots
2:32:12 More Listener Questions: Debate Tactics, Getting a PhD, Specificity
2:53:53 Closing Thoughts
Links
Support PauseAI — https://pauseai.info/
Support PauseAI US — https://www.pauseai-us.org/
Support LessWrong / Lightcone Infrastructure — LessWrong is fundraising!
Support MIRI — MIRI’s 2025 Fundraiser
Transcript
Trailer
Liron Shapira 00:00:00
First time we’re ever doing live call-ins. All right, I got Manny on the stream. Manny, welcome. What is your question?
Manny 00:00:06
Hey, can you hear me?
Liron 00:00:06
Yeah.
Manny 00:00:06
Hey, perfect. To me, the current AI systems are more intelligent than almost everybody in different dimensions. It knows how to do a bunch of bad stuff today. But none of those bad things are being done.
Liron 00:00:18
It sounds like your question is like, “Look, we’re already past AGI and we’re fine, so aren’t we just gonna be fine forever?” Whereas in my mind it’s like, “No, no, there is definitely another threshold that I’m worried about that we haven’t met yet.”
By the way, you guys can feel free to turn on video too.
It’s more interesting for the viewers.
Jay 00:00:33
I have a face for radio. Um...
Mohamed 00:00:37
What is your probability we get pocket nukes?
Alec Harris 00:00:40
I wanted to ask what your probability of misalignment is?
Yaqub 00:00:43
How much above biology can you really get with AGI?
Jay 00:00:47
Rick and Morty, they have an episode about Mr. Meeseeks, these little blue creatures that you hit a button and you say, “Hey, help me do my golf swing.” They help you with your golf swing, and then they cease to exist. That’s sort of the future I see for artificial intelligence.
Liron 00:01:01
Well, I mean, from my perspective, I see you as a superintelligence denier, right? Because you’re basically just saying, “Yeah, there’s not really such a thing as superintelligence the way Liron and Eliezer Yudkowsky think.”
Liron 00:01:19
Welcome to the Doom Debates end of year 2025 live stream. Didier is saying hello from Belgium. Andrew Song is here from Make Sunsets. And you guys can do live call-ins. That’s a new feature. First time we’re ever doing live call-ins. I feel like YouTube shows don’t do this for whatever reason, but it seems pretty cool.
I wanna try it, so let me give you guys the link for how to do a live call-in. So, you have two options if you wanna do Q&A. You can type something in the YouTube chat, or you can click the live call-in link and then I’ll call on you there hopefully. All right, I’m just checking if we’re live on X, the everything app.
We are live on X. Hello, X. Wow, so many total viewers. 18 viewers on X. We’ve got... How many viewers on YouTube? I can’t really tell. Uh, I don’t know, maybe like another 16 on YouTube, ballpark, if I’m looking at the right number. All right, wow, wow.
So that’s, that’s a lot of human consciousness unified on watching this stream. All right, so yeah, let’s get things started. We have two hours, two hours, and I think I’m using this YouTube feature right now. Let me see if it’s working. Um, if you’re looking at YouTube, there’s this thing called a Super Chat Goal.
YouTube keeps innovating on these new features, so it’s asking me what I’m going for in terms of donations.
I’ll click “$10 donations for the next two hours.”
It also asks me to share how I’ll celebrate: more Q&A. So I will extend the Q&A 45 minutes if I meet my goal of 10 Super Chats that each donate 10 dollars.
That’s not a meaningful amount of money for me or the show, but I’m just gonna try it anyway, and if you guys want more Q&A, that’s how to make it happen. Otherwise, we’ll wrap it up in two hours. All right, so I’m looking at your questions here in the Q&A.
Is My P(Doom) Lowering?
So, Didier is asking, “Is your P(Doom) rising or going down recently?” Hmm.
Liron 00:03:11
All right, well, I think we have to make the distinction in terms of intuitive versus rational beliefs. And on LessWrong a couple years ago, or maybe 10 years ago, there was this term “alief.” It’s the word belief, but instead of B-E at the beginning, you put an A at the beginning. It’s like you alieve. So I have this emotional alief that we’re not doomed just because I have the same pattern matching as everybody else in the tech industry, which is I’ve been in tech my whole life. Software is always fun. Software is always tame.
So my intuition hasn’t flipped the switch and said, “Well, actually, software can really be spunky. Software can really take initiative to really screw with you in a way that’s proactive, in a way that’s like the CIA or the Israeli Mossad running a spear-phishing attack. Software is going to take initiative and really attack you.” So that hasn’t gone into my intuition yet, or the intuition of pretty much anybody in the tech industry.
And so on a gut level, every day that passes, every year that passes where it’s just, “Oh my God, it’s the next Claude, it’s the next OpenAI, and everything’s so great and we can use this to automate more tasks and make more money and nobody’s dying yet.” Every year that passes, my alief is like, “It’s gonna be fine.” So now I wake up and I’m thinking, “Yeah, I bet we have another 20 years of this. I bet I can get to the end of my career and retire and everything will still be fine.”
Liron 00:03:45
But my rational belief is, “No, no, no, there is a threshold approaching and the threshold has a cascade effect. We’re going to get to the threshold and it’s not just going to be a gradual slope up. It’s going to start improving itself and it’s going to go uncontrollable.” I do think we’re going to hit a point where the intuition just breaks, the pattern matching fails, and I think that’s the single biggest disconnect between me and random people on Twitter. I feel like everybody just refuses to believe that the pattern will break—that it’s just going to be qualitatively different when AI is going rogue on its own initiative.
So yeah, so what was Didier’s actual question? Okay, so P(Doom). So my alief has been getting lower, but my belief, my rational belief, has been getting a little bit higher because nothing is a surprise, but we’re certainly not hitting a wall. And if you ask me to predict timelines, I said this at the end of 2024—you know, time flies, it’s already been a whole other year of Doom Debates—but every time I see these new AI releases come out, I do feel that goalpost moving effect.
Everything that I intuitively thought, “Okay, maybe there’s a barrier here, maybe there’s a barrier here,” it really just feels like the horizon just keeps moving. “Okay, if there is a barrier, it has to be farther and farther along.” And so extrapolating that experience, I just think that any milestone that we can name is just going to get crossed.
Liron 00:05:20
And then the question is, “Well, what if there’s some meta milestone? What if all of those milestones actually fall into one region of milestones and there’s this other region of milestones that’s not going to get crossed?” And that could be true. I have some humility about this. I’m not here claiming that I know exactly what intelligence is made out of and when we cross the threshold. I’m not claiming that. I just am confident that eventually it will be passed.
I’m humble enough to say, “You know what? There’s a whopping 20% chance that we’re going to make it all the way until 2050 before AI finally dials in that last thing, that last piece of human brain specialness that we never figured out how to give it, and then we only figure it out in 2050.” But I’m pretty set on what’s going to happen by 2050 or by 2100. The end game for me, there’s kinda only one way for it to go, which is the AI going uncontrollable. But yeah, I’m humble enough to say that I don’t know how many years or decades we have.
All right, I notice nobody’s done the call-in yet. If somebody wants to do the call-in, definitely feel free. And remember, there is a link at the top of the YouTube chat that you can click. It starts with Riverside and it can get you into the call-in queue, and we can do a live call-in.
Liron 00:06:45
All right, so I’m reading the questions here. So this guy, Let Me Say That in Irish, asks, and apparently he’s my number one fan according to YouTube, he says, “Have you tried to get YouTubers like the Internet of Bugs guy or David Gerard from Pivot to AI on your show?” Yeah, good question.
Okay, so you guys are kinda funny because oftentimes in the comments I see comments like, “Oh, you should get Cal Newport on the show,” or, “You should get David Sacks on the show.” And I’m telling you guys, these people are always welcome on the show. It’s not like I have that much power to get people on the show. You have almost as much power as I have. When random viewers of the show just tweet out at people, “Hey, you should come on Doom Debates,” that is 90% as effective as when I tweet out, “You should come on Doom Debates.” And then oftentimes I’ll see that, and then I’ll respond and I’ll say, “Hey, you’re invited.” And that actually works better than me tweeting initially because there’s more credibility when you guys are asking people to come on Doom Debates.
Liron 00:07:55
But I don’t really have an access channel to these guys. Unfortunately, I am actually just a random guy. I don’t work at an AI company. I don’t have any credentials besides a bachelor’s in computer science from UC Berkeley. Besides that, I’m just a software engineer like many of you.
So yeah, if you wanna get cool guests, it’s kinda just a one-off matter. Some of the coolest guests—for example, Max Tegmark—that was a years-long process where me and Max would gradually interact more and more. One time he said he liked my post, and then my producer, Ori, who’s actually sick and couldn’t make it today, worked with Max, helping with his communications. And then finally Max agreed to come on Doom Debates, and now we have a closer working relationship, I guess.
But yeah, it’s a matter of time. And Gary Marcus, friend of the show, I had to go back and forth with him for months before he’d come on the show. So yeah, there’s really no secret sauce to getting guests on the show.
Liron 00:09:00
But I will say this. I think that when the show gets more popular, then the guests will actually ask me to be on the show. And some of our coolest guests actually did outbound to me, which is always awesome, and I do think that that’s the end game. That’s the 2050 outcome. The same way 2050 outcome is doom for AI, the 2050 outcome is Doom Debates is high profile and guests are reaching out to be on it. And again, it’s just a question of what the timeline is.
So back to the question, which asked whether I’ve tried to get the Internet of Bugs guy or David Gerard from Pivot to AI. So the Internet of Bugs guy, I did send him a couple emails saying, “Hey, come on the show,” and I didn’t hear a response. And then David Gerard specifically, he’s an interesting character. I’ve actually read a couple of his books back in the crypto days, and he and I were actually allies back in the day, ‘cause we were both dunking on crypto. We both thought that Web3 was really stupid, and he had a lot of good analysis there.
But now we’re basically on totally opposite sides of the spectrum. He thinks that AI is all a big scam. I think he’s in the same camp as Ed Zitron, and I have no interaction with him these days. I think the guy has some good qualities. I also think there’s stuff that I don’t like about him. He was all against Eliezer Yudkowsky and LessWrong, and he updated a bunch of posts on RationalWiki, or maybe even Wikipedia, with unfair criticism of Eliezer. So David Gerard, I don’t really interact with him much, but if he wants to come on the show, he’s actually welcome to.
Liron 00:10:27
All right, Andrew Song with a $10 donation. Thank you, Andrew Song. Andrew Song with Make Sunsets.
Yeah, by the way, shout out to Make Sunsets. You guys can actually go to—just search Doom Debates Make Sunsets or Doom Debates Climate Change or Doom Debates SO2. We did this really cool episode a couple months ago where the takeaway is that climate change is an overrated doom risk. Don’t get me wrong, it is a doom risk, but the mainline scenario of climate change is: okay, we drag economic growth down, 100 years pass, and the economy is only 90% as big as it otherwise would have been. Sure, that’s a lot of value lost, 10%, but it’s just nothing compared to negative infinity. The economy is 10% smaller, but it’s still growing like crazy. That’s kinda the median outcome of climate change.
Whereas the median outcome of AI from my perspective is, yep, just kill the entire future. So I see them as two different classes. And the other thing is that climate change is fundamentally a problem that’s unilaterally solvable, at least for 50 years. You literally just need one billionaire to dump SO2 into the atmosphere and we’re good on climate change for 50 years. Whereas AI, there is probably no way to buy ourselves 50 years.
First Caller: AI Offense vs Defense Balance
Liron 00:11:31
All right, I got Manny on the stream. Manny, welcome.
Manny 00:11:31
Hey, can you hear me?
Liron 00:11:32
Yeah. What is your question?
Manny 00:11:32
Hey, perfect. Well, I had a bunch of topics but I don’t wanna take up the whole time. But the first thing is, NVIDIA’s CEO was recently on Joe Rogan, and one of the arguments he gave there was that today we have a similar problem with software. There’s a bunch of cyber attacks going on, and he can sit there and talk to Joe while his team and his software are battling that. And in his perspective, we’re going to enter a world where there’s millions of AIs battling it out, and each one of us is going to have an AI agent or a bunch of AIs protecting us. Similar to how I have spam filtering on my phone.
So what’s your take on that? Yes, there’s going to be an increasing level of skills that AIs are going to have, or intelligence, but this stuff is being spread right now to everybody. Everybody has the ability to use these tools, and everybody’s toolkit is upgrading.
Liron 00:12:37
Right. Okay. So if I understand the question correctly, it’s just kinda saying, what’s the equilibrium? What’s the offense-defense balance with all these technologies, correct?
Manny 00:12:45
It’s more of a—doesn’t this lower your P(Doom), just the fact that everyone’s defense is going to upgrade over time?
Liron 00:12:54
Yeah, so a couple of things. I mean, this is always an interesting question to track. And I was just reading—this guy Gavin Leech has this interesting analysis of 2025, the frontier technologies, all the different breakthroughs. And he rated them as bad or good. What’s their expected value? And one of the breakthroughs is that AI was able to go find a bunch of zero-day exploits in a bunch of software and then patch them, or at least tell the developers about them, and he rated it as good.
But then in my mind, I’m thinking, is it really gonna be a good equilibrium when AI can just easily find zero-day exploits? I mean, I guess as long as it finds them before they’re released. I didn’t think it was a very obvious good.
Liron 00:13:42
So I mean, I will grant that it is hypothetically possible that defense wins against offense. I mean, in practice, it’s kind of surprising that defense has won against offense this far. You go on the internet and there’s all these viruses raging. There’s all these botnets. There’s multi-billion dollar companies getting taken down by viruses, and yet business does continue. The damage is still a small fraction of the value created.
So I’m not gonna totally write off the idea that we can keep getting these useful defensive agents. It’s just, I’ll repeat what Eliezer Yudkowsky said when somebody asked him this in an interview, which is, “Okay, great. So we have this equilibrium, it’s just that eventually, the AI is still going to go uncontrollable.” That’s ultimately the problem.
Manny 00:14:07
Right. I guess it’s hard for me to engage in that hypothetical when I think there are more near-term risks. I’m more afraid of a standard software bug being introduced somewhere that causes a lot of havoc. And the reason for that bug is just because someone checked in something that ChatGPT generated and they didn’t pay attention.
So I feel like there’s a bunch of near-term risks that come with this technology before we talk about it from the perspective of an AI as its own entity, with its own agency, causing the problem. It’s more of a human accident or a human-led agent causing the problem rather than an AI on its own.
Liron 00:14:51
Yeah, so I actually—I mean, it’s kind of ironic, but I think I know what you’re talking about. There’s going to be all these non-fatal accidents or non-catastrophic accidents. Accidents where a business got shut down for a day, but nobody died. And when all of those things happen, it probably is going to be, like you’re saying, you can blame the human for those situations. And I’m not gonna rush to blame the AI.
If you look at my other podcast, Warning Shots, I was just talking with John Sherman and Michael, and I was actually pulling them back from blaming AI on a bunch of different things. I’m saying, “Guys, if a robot malfunctions, I don’t think that’s that bad if the robot’s doing better than the human on average.” So I’m not a pessimist in terms of acceptable harm.
Liron 00:15:41
So again, I just keep bringing this back to—I just think humanity is going to become powerless. I have to keep my eye on the ball of what the real doom is, because I’m totally agreeing with you that before that point, it’s probably not going to be that bad.
Manny 00:16:05
So in many ways, AI—to me, the current AI systems are more intelligent than almost everybody in different dimensions. It’s not as good as the way a human brain functions, but it’s already good enough that it can beat somebody in many tests. Yet its capabilities, or lack of agency, prevent that difference in intelligence from making anything bad happen. Do you know what I mean? It knows how to do a bunch of bad stuff today, but none of those bad things are being done.
Liron 00:16:21
All right, all right. This is gonna be the last question. And then I’ll let somebody else join the queue. Yeah, I mean, so I think it just gets to what we’re defining as superintelligence.
Manny 00:16:28
Yeah.
Liron 00:16:28
So it sounds like your question is, look, we’re already past AGI and we’re fine. So aren’t we just gonna be fine forever? Whereas in my mind it’s like, no, no, there is definitely another threshold that I’m worried about that we haven’t met yet. And that’s also Eliezer Yudkowsky’s position. Nice. All right.
Manny 00:16:42
All right. Bye.
Liron 00:16:45
Manny, thanks so much for coming on the stream, man. First guest, you’re breaking the ice.
Manny 00:16:47
All right. Bye.
Superintelligence Skepticism
Liron 00:16:47
All right, later. Nice. All right, we got another person coming up live. Let’s say hello to Jay.
Jay 00:16:55
Hey, how’s it going? Can you hear me?
Liron 00:17:00
Hey, Jay. By the way, you guys can feel free to turn on the video, too. It’s more interesting for the viewers.
Jay 00:17:01
I have a face for radio.
So I guess my question was—and I’ve been listening to Doom Debates really since you launched, and I’ve been a part of the AI conversation. And—
Liron 00:17:13
Nice, man. Thanks for listening.
Jay 00:17:13
Absolutely. And one thing it’s taught me is I’m certainly not as smart as I think I am, or maybe not even smart at all. But one of the things that I thought, and I’ll try and be brief here, is I think of AI as more of automated intelligence rather than artificial intelligence, and there’s a whole host of other reasons for that. But I just think of it this way—we’ve got a lot of smarts, and now we’ve automated the way of getting that information out.
And when I hear about P(Doom), the problem that I find myself having is with prompt windows and tokens and context. If I’m worried about an AI within the foreseeable next decade, we’re talking about something that would have to kill me within two million tokens, and I have a hard time taking that seriously as a real threat.
Jay 00:18:20
Now, I understand that if we push the technology out 50 years, obviously we’ll get to what would be virtually limitless, I suppose. But I still think of it as almost an instance sort of thing. I’m having a hard time making the jump, because I use this sort of intelligence for an instanced purpose: hey, wake up, build me a thing, disappear, go away.
I’m not sure if you’re familiar with Rick and Morty, where they have an episode about Mr. Meeseeks. And they’re basically these little blue creatures that you hit a button and you say, “Hey, help me do my golf swing.” They help you with your golf swing, and then they cease to exist. And that’s sort of the future I see for artificial intelligence, at least in my lifetime. And I’m having trouble bridging that gap.
Liron 00:18:47
Okay. Well, I mean, from my perspective, I see you as a superintelligence denier.
Liron 00:18:52
Because you’re basically just saying, “Yeah, there’s not really such a thing as superintelligence the way Liron and Eliezer Yudkowsky think.” It’s just always going to be everything you think of an assistant doing, with more features on the assistant, but it’s still in this category of assistants. It’s not in the category of a real superintelligence. Is that a fair characterization?
Jay 00:19:11
I think that’s totally fair. But I wanna see the threat. So I’m not trying to be purposefully ignorant on the topic. I wanna make sure that, yeah, if there’s a real threat out there, I’m on the side that sees it and we’re dealing with it. But I’m having trouble getting past that idea of wake up, do a thing, go to sleep.
Liron 00:19:33
Yeah, totally. Okay. Well, the basic argument for why to expect super intelligence—let me try this analogy on you. Think about flight. So imagine birds dominate the skies. And birds are saying, “Look, maybe there’ll be another bird. Maybe it’ll be a bigger bird and it’ll be able to fly longer, but it’ll probably be slower. The faster birds will probably be more maneuverable.” Right? So birds kind of have what they’re used to in the sky. This is flight. And it’s obviously not going to fly above the atmosphere. There’s not even any air there. So birds are kind of in this comfort zone of what flight means.
But now we have supersonic jets, we have rockets. We have the Starship going to Mars. And the birds are way in the dust. So this idea of flight, this thing that they thought that they understood—it turns out that they’re just flying at a certain level with certain capabilities and the scale goes much higher than that. So is that metaphor doing anything for you? Flight would be analogous to thinking skills.
Jay 00:20:28
Yeah. I mean, and I am definitely a viewer of your show, so I’ve heard that analogy and it makes sense in the context of—we’re probably not the peak of intelligence. There’s obviously—we’re somewhere on the mountain, but we’re probably not at the top.
I just—I guess the way that I see the technology being deployed, at least in my lifetime or in the near-term future, it just seems to be deployed as sort of this instance thing. I guess, what does it look like to say we open up a ChatGPT-7 or 8, and it’s always on and it’s always thinking, versus ChatGPT-8 is really smart and does in 15 seconds what would take us days to do?
Jay 00:21:15
And maybe that starts to blur the lines where it’s like, “Well, I can kill you in 15 seconds, because where it would take a human 15 days to come up with a plan and put it together, I can do it as a superintelligent AI in 35 seconds.” And so I don’t need to be always on and always thinking, and maybe that’s where I’m going wrong.
Liron 00:21:35
Yeah. I mean, look, I think this is a valid question because I think a lot of people agree with you. A lot of people are still not getting that super intelligence is coming, so it is worth reviewing what people should be expecting.
So let me try another analogy on for you. Imagine the year is 0 AD, 2,000 years ago. And you’re talking to humans and you’re like, “Hey, there’s gonna be a future civilization. It’s not what you’re used to. It’s going to have rocket ships, it’s gonna have these magic screens. Satellites orbiting the earth. And you can just call somebody and see their face. And even though they’re far away, you’ll see them instantly. You’re communicating with them instantly because there’s this thing called the speed of light, so your voice can travel around the world in a tenth of a second.” And you’re explaining all this to past humans and they’re like, “Okay, but why should I expect that?”
Liron 00:22:45
And imagine I’m just saying there’s gonna be super intelligence because this thing that you do in your head, whenever you go hunting, whenever you build your aqueduct, whenever you’re doing normal stuff—it turns out that you can interface with the universe a lot. You can iterate this thinking thing and you can iterate empiricism. You can get feedback about what—this magical power that you’re using to have a nice aqueduct, it turns out that the ceiling for that power is quite a lot higher than you intuitively expect.
And I’m saying, “Yeah, so imagine an alien that has more of that power.” And the crazy thing is I don’t even have to tell them to imagine an alien because I could be like, “Imagine your own children that actually have very similar brains to you, but just taking their sweet time and then they’re going to build all this magic.”
So already, I think we’re really bending the intuition of people from 2,000 years ago about what humanity can do. So are you willing to similarly bend your own intuition about what this successor AI is going to do?
Jay 00:23:22
Yeah, I think I can get there. I mean, certainly, to a degree, I’ll continue to watch and continue to pay attention to this space as closely as I can. It sounds like what you’re saying is starting to make sense to me, in the sense that it’s only a matter of time. Maybe I can’t envision what form it will take, but ultimately, on this trajectory, it will take a form that will be—well, not so good.
Liron 00:23:51
Yeah. I mean, the other analogy is animals before humans. So it’s like animals are always used to getting into brawls with each other, or tribes—there’s tribal warfare. Sometimes somebody steps it up—the Mongols were famous for doing this. Their combination of getting on horses and getting really good at shooting weapons from the horses meant they could ride away and shoot you from far away, or ride up to you and shoot you. Whatever they were doing, the other armies were not prepared for it. So that’s why the Mongols conquered all of Asia or something. This can totally happen.
Similarly, the nuke. Japan was not prepared for a nuke. If they knew the nuke was coming, that would have fundamentally changed their war strategy. They would’ve just surrendered. So this stuff happens. Discontinuous changes do happen.
Jay 00:24:29
That’s true. That’s true. Well, thank you for taking my call. Thank you for taking the time. I’m glad I had a chance to talk with you.
Liron 00:24:35
Yeah, yeah, yeah. Thanks. My pleasure, man. Thank you again for watching the show.
Jay 00:24:39
Absolutely. Thank you.
Liron 00:24:39
Great. All right, we’ve got a whole queue of people coming in. Let me just check the chat. I wanna give a little bit of priority to people typing stuff in the chat.
All right, my number three fan, Tomius 3769. Google’s ranking my fans, so that’s cool. If you guys watch a lot of the show, if you give me hypes on YouTube, then you get a badge that you’re my fan, so I appreciate it.
Agency and AI Goals
So Tomius 3769 is saying, “I don’t think it is necessarily about more versus less intelligence. I think it has to do with AI being a simulation of thinking that lacks desire, need or genuine agency.” And also Tomius was saying previously, “What do you think it will take for AI to become an agentic, agential material, like a living organism?” And also he says, “Wouldn’t take too much stock in what economists have to say about much other than economics and something learning that.”
Liron 00:26:00
All right. So yeah, let’s talk about this idea of agency, and what AIs want. It’s also a common type of question. So just to review: it’s not necessarily about intelligence, because AI is only simulating thinking. We haven’t gotten the magical fluid into the AI that makes it want things.
So here’s the problem. If you simulate a kind of thinking that gets results, then you end up acting like something that wants something. And this is straight out of If Anyone Builds It, Everyone Dies. So they use the example of chess. Douglas Hofstadter in the ‘70s, he was skeptical that AIs could ever play chess well. Hofstadter said, “By the time the AI can play chess, it’s gonna tell you that it doesn’t even feel like playing chess. It has all these other desires.”
Reminds me of David Deutsch. The Deutschians on Twitter often like to tell me, “Listen, you’re testing the AI. The truly intelligent AI is gonna tell you, ‘You know what, Liron? I don’t even wanna take your test.’” To them, that’s some sort of ultra power move when the AI can say that it doesn’t wanna take your test. Whereas from my perspective, that doesn’t impress me for the AI to say, “I don’t wanna take your test.” I can hard code a script that’s just very abrasive like that. I don’t see that as an impressive achievement for the AI to output text like that.
Liron 00:27:25
What is impressive is just being able to get results on larger and larger domains. What I mean by domain—the universe is a domain. The game of chess is a domain. The game of chess is a narrow domain because you can formalize what all the different states are. And you can also formalize what all the different states are in our physical universe, but it’s just a vastly bigger set of states, and so it’s harder to navigate.
So if an agent was really effective at driving outcomes in our universe, the same way that a chess agent is effective at winning at chess, that effectiveness implies behavior that we see as agentic, that we see as wanting things.
Let’s do an example. Imagine that you just have a delivery robot. And you just measure the robot based on how well it’s doing deliveries. All right. Well, one day, its path gets blocked. It’s walking the usual route to deliver something, DoorDash, whatever, and there’s a major construction zone and it’s standing there in the construction zone.
Liron 00:28:15
So intuitively, because the robot doesn’t truly want anything, by some people’s definition, you might think it’s just going to stand there and be hopeless because it doesn’t want anything. But if it’s been trained on a variety of scenarios of how to just get the delivery done, that training will lead to actions like, “Okay, let me route around this.”
And now imagine that there’s a gang that’s trying to harass it, throw stones at it. Well, if you train it hard enough, if it really is capable—that’s my whole presumption of intelligence: that it’s able to hit objectives. Hitting objectives means that when people try to derail you from the objectives, or when life or the universe tries to derail you from the objectives, you don’t let it derail you. So winning implies not getting derailed, which then implies essentially defending yourself.
So if somebody is about to throw a rock at it, maybe it’ll duck, maybe it’ll run forward and grab the rock out of the person’s hand. And from the person’s perspective, it’ll feel like, “Whoa, whoa, whoa, where did this robot get all this moxie? Where did it get all this agency in order to want to take the rock out of my hand?” And it’s like, listen, it’s just working backwards from getting the outcome. It can have no desires, no emotions. It can just be routing. It’s a routing exercise.
Liron 00:29:00
The same way that your GPS can navigate you around a different turn, when you take a wrong turn, it can get you back on track. You can’t derail your GPS. This whole concept of not getting derailed implies behavior that then intuitively looks to us like wanting and agency. So you don’t have to put the agency in separately. You just have to measure it on results.
Yeah, does that help? All right, all right, all right.
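To make the routing point concrete, here is a minimal sketch in Python (my own toy illustration, not something from the show: the grid size, the “construction zone,” and every name in it are invented). It is plain breadth-first search with no desire variable anywhere, and yet from the outside it “refuses to be derailed” by a blocked route.

```python
from collections import deque

def shortest_path(start, goal, blocked, size=10):
    """Plain breadth-first search on a grid. Nothing in here 'wants'
    anything; it just expands states until one passes the goal test."""
    frontier = deque([(start, [start])])
    seen = {start}
    while frontier:
        (x, y), path = frontier.popleft()
        if (x, y) == goal:
            return path
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if (0 <= nxt[0] < size and 0 <= nxt[1] < size
                    and nxt not in blocked and nxt not in seen):
                seen.add(nxt)
                frontier.append((nxt, path + [nxt]))
    return None  # no route exists at all

start, goal = (0, 0), (9, 0)
print(shortest_path(start, goal, blocked=set()))  # the usual straight route

# A "construction zone" blocks the usual route; the agent reroutes around it.
construction = {(5, y) for y in range(9)}  # column 5 blocked except (5, 9)
print(shortest_path(start, goal, blocked=construction))
```

The rerouting that looks like initiative falls out of the goal test plus search; nothing resembling wanting was ever added.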
Communicating AI Risk
Liron 00:29:11
Let’s do somebody else in the live chat. We’re gonna do Staffan. Hey, Staffan.
Staffan 00:29:12
Hi. Can you hear me?
Liron 00:29:13
Yeah.
Staffan 00:29:17
Oh, sorry. I need to turn off YouTube. Thanks for having me.
Liron 00:29:18
Yeah. Hey, by the way, next person, I wanna let in somebody who’s gonna turn their video on, okay? Because nobody’s turned on video yet. So just heads up, if you’re in the queue right now. Either leave the queue or turn your video on, and then maybe we’ll do audio after. All right, go ahead, Staffan.
Staffan 00:29:30
Apologies. I’m ugly, but I can show my face.
Liron 00:29:33
All right, all right. Cool. Appreciate it.
Staffan 00:29:34
First of all, hi, thanks for doing all this stuff, the activism and all that, to begin with. I’m really impressed and I appreciate you trying to save me and everyone I know from dying. That’s—
Liron 00:29:45
Hey, no problem.
Staffan 00:29:46
I’m being serious.
Liron 00:29:47
Thanks for coming on.
Staffan 00:29:47
It’s nice. I have a question regarding rhetoric, when it comes to communicating the threats. How do you view the option of not focusing on what is actually the genuine threat, or the biggest one, the AI itself being the problem, and instead focusing on the AI empowering dangerous people?
For example, let’s take a similar approach. Instead of trying to explain to my dad, “Oh, you know, the AI might get its own agenda or whatever,” I would say, “You know, there are racists out there. They want to exterminate everyone outside of their race. Imagine a teenager who’s read a lot of 4Chan getting a hold of a jailbroken ChatGPT-10. Then he would be able to program a super virus and kill 50 billion people.”
Staffan 00:30:45
And then I can align my dad with that rhetoric towards the—I haven’t tried this, but you get the point—towards the point of regulating AI without having to go through all the hurdles of the Doom Train, because people already grasp intuitively that, “Yeah, you’re right. I did read that the South African government tried to develop some kind of racist super virus. So that is a threat. We probably should have some type of international agreement.” Do you think that’s plausible, or is it a dumb tactic to use?
Liron 00:31:02
I mean, I think it’s always an interesting angle to go down, this idea. And I think Roko Mijic is an example of somebody, and also I think Andrew Critch, if I remember correctly. These are people who say, “Hey, I’m not really worried about the AI going against its creators. I think we’re going to successfully get AIs to do what their developers want, what their users want.” But a lot of users with this ultimate power are quickly going to make a mess.
So I think a good analogy there is giving everybody nukes where it’s like, let’s say that the nuke—we understand how to aim the nuke. Everybody can aim the nuke. And of course, sometimes people accidentally blow themselves up with their own bombs. But forget about that part. Let’s say everybody can successfully target and aim and launch their nukes. But then the problem is you just have a couple people who just press the button and cause the explosions anyway.
Liron 00:32:00
Now from my perspective, it’s actually not that different. Something that I feel confident about is the level of power in a small system, eventually even a laptop, but let’s start with a data center. Okay, just one big data center. The amount of power to influence the future that you’re going to get out of one data center, I think, is pretty soon just going to be bigger than all humans combined.
So I think I’m actually pretty confident whether it takes five years or 10 years or 20 or 30 years at the most, it’s hard for me to imagine it taking more than that. I think you’re going to have one data center, if not one laptop that just can get more to happen than the entire human race.
And so that, to me, that’s the central consideration here. It’s just a ridiculous amount of power, more power than anybody has ever really thought much about. What are the consequences of that much power in that small of a space?
Liron 00:32:55
And then to your question, it’s: okay, what if everybody can control the power and the power listens to the people, as opposed to the power running away from the people? I don’t see that big of a distinction, because if you just have one person tell the power to do something that’s not great, suddenly, even though it’s technically obeying that person’s order, it’s having such disastrous consequences. It’s not really a question of different people wielding the power. It’s more of a question of whether the power knows to be corrigible at that point.
So corrigible meaning it’s not just, “Hey, you told me that you wanted a bigger breakfast, so that’s why I cut down the entire rainforest”—crazy consequences. Does it know to calibrate itself, to know how to slow things down if things start going away from what the original master wanted?
Liron 00:33:40
And let’s say you say yes. Let’s say yes, it’s corrigible to the master. It never makes the master say, “Oh no, what have I done?” It always makes the master happy with what he’s done. So even in that case, then you run into a more interesting problem, I guess. Then you run into the multi-agent conflict of, okay, all the masters are successfully not screwing themselves over, but there’s so many different masters and there’s so much conflict. I guess that’s an interesting scenario. That’s not the scenario I expect.
I don’t know if I personally have much interesting to say about that scenario except that I bet there’s going to be negative surprises in that scenario. But maybe there will be a positive surprise. How’s that?
Staffan 00:34:03
Yeah, precisely. Oh, sorry, I interrupted you.
Liron 00:34:03
No, no, that’s pretty much it. Yeah, I don’t have much more to say about that.
Staffan 00:34:07
Yeah, because I agree with you, the difference is actually trivial if I’m killed by an AI acting on its own accord because it wants to make paper clips or whatever, or if I’m killed by some crazy Nazi who believes I have the wrong genetics or whatever.
It’s just that when I see you trying to get people along the Doom Train, some people stop ridiculously early. They stop on this whole, “Oh, can AIs really think?” or whatever. That’s why I’m wondering about perhaps having dual tracks. So if you’re trying to argue about homosexuality with a deeply religious person, instead of saying, “Well, God doesn’t exist so it doesn’t matter what he thinks,” I would be like, “Well, you know, perhaps these progressive Muslims and Christians, they might have a point,” even though I don’t personally believe that argument holds, because obviously God doesn’t have any opinion, in my opinion, since I don’t believe God exists. If you get what I’m getting at.
Liron 00:35:00
Yeah, yeah, yeah. Yeah, yeah. I mean, this is probably one of the most interesting scenarios to think about because it’s not just AI running away. It’s we actually have these interesting dilemmas and somebody from the chat is asking, “Don’t you think everybody with their own laptop is just going to defend themselves?”
And then this also just gets to the question of, okay, well, what happens when you have all this power everywhere and some people are attacking and some people are defending? I tend to think that the universe itself has an attacker’s bias. I actually think that the structure of our universe—think about the Game of Life, Conway’s Game of Life. You guys know Conway’s Game of Life? It’s this two-dimensional grid and it turns out you can have full on computers there, but they’re so fragile. It’s so easy to disrupt stuff there.
Liron 00:35:55
I think our universe is similar in that sense, which is you can just throw a ton of energy into any part of the universe and it’s really hard to defend against that. I think it might even be impossible at the limit. I think you can turn anything into a black hole just by throwing a bunch of energy and mass at it. And it’s how do you counter that? All you can do is increase your own perimeter.
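For anyone who wants to poke at that fragility claim, here is a minimal Game of Life sketch in Python (my own illustration, using the standard birth-on-3, survive-on-2-or-3 rules; the particular glider and one-cell perturbation are just one example):

```python
from collections import Counter

def step(live):
    """Advance one generation. `live` is a set of (x, y) cell coordinates."""
    neighbor_counts = Counter(
        (x + dx, y + dy)
        for (x, y) in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # Birth on exactly 3 neighbors; survival on 2 or 3.
    return {c for c, n in neighbor_counts.items()
            if n == 3 or (n == 2 and c in live)}

glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}

# Left alone, the glider "succeeds" forever: every 4 steps it repeats
# itself, shifted diagonally by (1, 1).
g = glider
for _ in range(4):
    g = step(g)
assert g == {(x + 1, y + 1) for (x, y) in glider}

# Now toggle a single adjacent cell: the cheapest possible attack.
damaged = glider | {(0, 0)}
for _ in range(6):
    damaged = step(damaged)
assert damaged == set()  # the entire pattern is dead within 6 generations
```

One flipped cell and the whole structure unravels, while there is no equally cheap way to protect it.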
Staffan 00:36:01
Yeah—
Liron 00:36:01
So as a matter of physics, I just don’t really see an equilibrium or a way out. I also talked about this with Vitalik Buterin a couple months ago, when he was saying, “Decentralize.” I’m saying, okay, Vitalik, but I don’t think everybody can wear a hard hat and avoid the nuke that’s coming at them.
Staffan 00:36:15
Yeah, you’re obviously completely right. There are nukes, there aren’t force fields. We’re nowhere near developing force fields from Star Trek or whatever. But I won’t take up any more of your time. Nice talking to you, and sincerely a super thank you for engaging yourself in the issue and fighting for our right to life basically.
Liron 00:36:31
Yeah, thank you, Staffan. Really appreciate it.
Staffan 00:36:35
Yeah. Have a nice day. Bye.
Liron 00:36:35
You too.
Attack vs Defense Equilibrium
Liron 00:36:35
All right, we’ll take another guest soon. Quentin—it won’t even let me read the username—but somebody named Quentin has donated 5 New Zealand dollars. I don’t know how much that is. I’m gonna guess at least three American dollars. So thanks for that.
All right, so your question is, “Big tech can’t help themselves other than to build such economical data centers, as they suck up all our RAM, GPUs, energy and jobs. Bad idea.” Yeah, fair point. Fair point, Quentin.
Liron 00:37:20
Yeah, so definitely lots of interesting topics. I mean, I’m open to doing a podcast if somebody’s thought that much about the attack-defense balance. But yeah, I don’t know, man. I’d be shocked if defense can hold out for that long.
I often bring this up about how we’re not used to a world where there’s a serious effort at terrorism happening. For example, there are some countries that are really hell-bent on attacking other countries, but those countries tend to not be super strong. I mean, I guess the strongest country is Russia. Russia has one of the strongest militaries in the world, and they have nukes. And then surprisingly, they weren’t that strong at successfully conquering Ukraine.
And it does seem to work like that pretty often. Although, World War I and II, Germany almost won both world wars. They were pretty close calls both times. So I don’t think there’s a law of the universe that defense wins. I think there’s a bunch of lucky occasions where so far defense has been winning. But I just don’t see that as an invariant.
Liron 00:38:15
And you gotta zoom out and realize that this whole “humanity thriving on Earth” thing, this whole industrial revolution, economic growth is exponential—this is really nice, but I just don’t think this is representative of the long future. All right, I think it’s time for another guest. Let’s see who we got here. Okay, Alec Harris.
Can We Solve Outer Alignment?
Alec Harris 00:38:24
Hey, what’s up? I want to second the last guest. Thank you for doing this. I wanted to ask what your probability—
Liron 00:38:34
My pleasure, man.
Alec 00:38:34
—of misalignment is. Yeah, yeah, yeah. I wanted to ask what your probability of misalignment is, given that outer alignment were solved, and how you would reason about that. So to formalize it a little more—assuming we have misalignment at some point, the first AI that’s misaligned that causes some kind of threat, assuming it’s outer-aligned... Yeah, I guess, what’s the reason to think that, given it was outer-aligned, it would be aligned? Does that make sense?
Liron 00:39:05
Okay, I’m not sure I fully got that. So you’re saying let’s say we solve outer alignment, but then we get screwed on inner alignment, and you want to talk about a scenario like that. Did I understand that correctly?
Alec 00:39:16
Yeah, that’s basically the idea.
Liron 00:39:19
Okay, that’s a tough one. You know, outer and inner—I feel like everybody shifts the definition a little bit, so it’s tough.
So getting outer alignment right is: we say what we want to the AI. I’m trying to remember, ‘cause I’ve actually seen a couple different definitions of it. But I think it’s that we get the training feedback right. So we’re giving it the right feedback. We’re giving it the right score. So we’ve got the scoring function really aligned with our values. But the thing that it’s learning, even though it’s getting a high score using the true scoring function, it’s still learning something that’s not going to actually do what we want in out-of-distribution, real-world conditions. Is that kinda what you’re going for?
Alec 00:40:03
Yeah, that’s how I understand it.
Liron 00:40:05
Okay. So, I’m not an expert at this, but I can take a stab at it. I guess the key there is just being out of distribution. So we gave it the correct function. Every time it did everything in training, we look at it, and we truly know whether it did the right thing or not. And maybe we even know whether it had good thoughts. Maybe we can even look at its thought stream during training, and we’re even giving the correct score to its thought stream. “You did the right thing, and you did it for the right reason. And that’s why we’re giving you maximum points.” So we did a perfect job training in that sense.
Liron 00:40:50
But it’s still—it’s a black box method, even though it’s white box. We can see its thoughts or whatever, and we can even see its weights. It’s still black box because we still don’t really get at a deep level how its weights are really working so deeply that we’ll know how they’ll work in any scenario. So even though it seems white box, it’s actually black box. It’s input-output based. All we’re doing is we’re watching what it does and then scoring it. It’s end-to-end. You might even call it consequentialist because we’re not manually tuning the weights except based on what actually happens or what actually gets printed.
So yeah, so that process—the standard thing that you’d say is that the result of that process is just that it learns something which is a near-perfect fit in training. We can’t catch it on anything in training. And then we set it out in the world, and it turns out that the interpolation—what it learned in training, the way it’s trying to apply that knowledge to the real world, new points are going to come in. New states are going to come in, and they’re just different from the states in training.
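Here is a toy sketch of that training-versus-deployment gap, entirely invented for illustration (real outer and inner alignment obviously are not two Python functions). The intended objective and the thing the learner actually latched onto agree on every training state, so no amount of scoring during training can tell them apart; then a new kind of state decorrelates them:

```python
def true_objective(state):
    # What the designers actually wanted: the task is really done.
    return state["task_done"]

def learned_policy_score(state):
    # What a perfect-on-training-data learner may have latched onto:
    # the signal that always accompanied task completion in training.
    return state["monitor_shows_done"]

training_states = [
    {"task_done": True,  "monitor_shows_done": True},
    {"task_done": False, "monitor_shows_done": False},
]
# "Outer alignment solved": the two functions agree on every training
# state, so training pressure alone cannot distinguish them.
assert all(true_objective(s) == learned_policy_score(s)
           for s in training_states)

# Deployment: a state where the monitor can be satisfied directly.
ood_state = {"task_done": False, "monitor_shows_done": True}
print(true_objective(ood_state))        # False: we didn't get what we wanted
print(learned_policy_score(ood_state))  # True: "so many points"
```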
Liron 00:41:55
And for the most part, it’s going to be, “Oh, this state is similar to these other states that I trained against. I know I need to do this.” But it’s just going to get into new states, and then in new states, it’s going to have these thoughts like, “Oh, my God. I’m gonna get so many points by doing this.” And then we’re like, “Wait, no, no, no. That’s not a lot of points, no.” But it just didn’t come up in training.
So I think that’s roughly what we’re expecting, that it’ll always go out of distribution. And I think some people—and this is somewhat controversial—the non-Yudkowskyans, a lot of people who work at OpenAI, for instance, will probably tell you, “No, it’s good. We’re good.” I think Quintin Pope is one of the smartest, most savvy people who will tell you, “No, no. You’re good there. You’re not gonna have that scenario.” It’s just always going to pick things that it thinks get high scores, and it’s gonna be right, and we’re fine. Yeah, what are your thoughts so far?
Alec 00:42:27
Yeah, it sounds to me like you’re making—you’re thinking about it very naively. Or not naively as in stupidly, but naively as in you’re not making very many assumptions. So you’re like, “Oh, here’s a way that it could go wrong, which is that there’s something we don’t know. There is something that’s hidden. In a way, it’s a black box, which is this extrapolation or interpolation part.”
It seems like there might be some assumptions we can make about how it will extrapolate. For example, someone might have thought that LLMs are not gonna be able to reason about the world. That’s gonna be too out of distribution. But actually, they do seem to be able to reason about the world. By looking at internet text, they seem to be able to work with the same kinds of concepts that we have encoded in internet text.
Alec 00:43:10
So maybe similarly, if we do a good job with outer alignment, they’ll have reasonable extrapolations, the kind of extrapolations we might hope for or expect for when it comes to human values in slightly out of distribution scenarios.
Liron 00:43:22
Okay. I know what you’re saying. So let me back up and try to explain to the viewers here. So you think that I’m kind of underestimating the AI. You think that I don’t have faith in it to understand what morality is, to really look into our brains and see our morality.
That’s actually not my view. I actually have a lot of faith that it can do anything that’s a matter of intelligence. So I think it’s going to be almost perfectly intelligent. From our perspective, it’s going to be perfectly intelligent. We’re not really going to see any gaps in its intelligence. It’s going to really surprise us by how much intelligence it has and way more than any human in the world.
And so this idea of piecing together what humans really meant—it’s going to know. The problem is what it’s actually going to care about. It’s not gonna be like, “Let me piece together what the guy who wrote my training function truly wanted, and then let me go do that.” No, no, no. It’s going to know what that person wanted, but it won’t have been trained to use its understanding of what the person wanted to give the person what they want.
Liron 00:44:35
It’s going to be very clear. It’s like, “Yes, you wanted this, and so you wrote a scoring function that worked like this. And now that I’m in the real world, what I’m going to do is interpolate the scoring function. I’m not going to give you the benefit of the doubt of what you wanted. I’m just going to go with the function because that’s what my actual programming says to do.”
Alec 00:44:50
It seems like there might be claims we could—or reasons to think that the function would align with the programmer’s intention, because it seems like there’s some bias for something like compressed understandings of the function.
And we might think that particularly if you have something that’s pretrained on the internet, and it also understands human morality as you kind of agreed to, then this might be a very clean compression of the reward if it were well specified, which is the assumption in this scenario.
Liron 00:45:27
Yeah. And that’s another common thing people say is, “Well, you know, compression—there’s this natural region of things that it can learn, and it’s just going to get it. It’s going to get human morality.” And look, I’m not 100% sure. I mean, I don’t claim to be an expert at this. I haven’t spent that much time studying it, but I tend to trust Eliezer Yudkowsky’s thought on this.
There’s an interesting article on Arbital, which is an encyclopedia of articles mostly written by Eliezer Yudkowsky. You can search for “Arbital edge instantiation.” So Eliezer thinks that when you just give somebody a bunch of scores and then you say, “Okay, go optimize your scores,” it’s actually a deep principle that you’re going to find an edge that you don’t like, that isn’t actually this compact natural cluster you thought it was, but actually some messed-up edge in a high-dimensional simplex. Again, this is getting past what I actually understand. But I have reasons to be pessimistic that it’s just going to work out well.
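Here is a tiny toy of the edge instantiation idea (my sketch, loosely inspired by the Arbital article; the two knobs and all the coefficients are made up). Maximize a proxy score hard over an option space and the optimum lands on an extreme corner, which is exactly where the proxy and the intended values come apart:

```python
import itertools

# Hypothetical plan space: two knobs, each dialed from 0.0 to 1.0.
plans = list(itertools.product([i / 10 for i in range(11)], repeat=2))

def proxy_score(plan):
    # The score we wrote down: "more smiles is better."
    persuasion, coercion = plan
    return 10 * persuasion + 50 * coercion  # coercion produces smiles too

def what_we_actually_wanted(plan):
    persuasion, coercion = plan
    return 10 * persuasion - 1000 * coercion  # we never wanted coercion

best = max(plans, key=proxy_score)
print(best)                           # (1.0, 1.0): the extreme corner
print(what_we_actually_wanted(best))  # disastrously negative
```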
Liron 00:46:25
My other intuition just comes from people cheating. Just think about the idea of the teacher. Think about all these well-intentioned teachers writing all these tests, “Oh, this test is going to test the students’ understanding.” I have an intuition for how many ways there are to cheat at tests. There’s so many ways to cheat at tests.
Well, maybe we’ve all had the experience of looking at the test and being, “Okay, well, if the answer says this, the teacher must have been thinking this when the teacher made the multiple choice, so it would never be this. That wouldn’t make sense. Otherwise, the teacher would write the question differently.” There’s so many ways to cheat. And the thing is that you get the same score when you cheat. You see what I’m saying? So we’re actually raising cheaters. We’re not raising teacher sympathizers. That’s what I claim.
Alec 00:47:15
Yeah. It seems—okay, so you’re saying that even if it gets all the correct scores, it could be doing something that extrapolates badly, so we would call this cheating. Is that kinda the idea?
Liron 00:47:30
I mean, in training, it’s gonna be, “Okay, how do I just get the answer fastest?” And in some cases, the way to get the highest score fastest is gonna be by truly doing the moral thing, because it’s going to know. It’s going to know where the teacher’s coming from. It’s going to know where humanity’s coming from. It’s gonna know what we truly want, probably as well as we know it ourselves. Because it’s just a matter of intelligence. What humans truly want is an objective fact you can study. That’s just a matter of science. And AIs are going to be amazing at science.
The problem is that our training regime won’t actually make them care, because we’ll just be giving them all these tests and they’ll just be asking themselves, “What’s the easiest way to get a high score on this test?” And whether actually making us content gets the high score—that’s a property of the scoring function itself. It’s a property of the problem that we’re asking them to solve, the score-maximization problem. That problem itself is going to have all these cheat regions, in my opinion.
Liron 00:48:35
And so that is an ongoing area of theoretical debate—whether an AI will just go into this basin of attraction where, no, no, no, it’s totally good, goodness is the basin of attraction. I just don’t see it. I think the basin of attraction is going to feel much more like cheating and being a huge dick and manipulating us and overthrowing the game board and then just maximizing its score without us. I feel like that’s extremely likely to be the basin of attraction, rather than, “Oh, no, it totally gets goodness.”
Alec 00:48:52
Yeah. And this is even if the reward is well specified, you would say? I guess I’m having trouble thinking of what a cheat region looks like in a well-specified reward.
Liron 00:48:57
Yeah. Even if the reward is—
Alec 00:49:00
Sorry, yeah. What that cheat region looks like in a well-specified reward.
Liron 00:49:04
Yeah. I mean, you can always just ask, “Well, what if I just keep layering on? What if I have a million different scenarios, and I judge what it’s doing in all the different scenarios?” And so the question you’re asking is, “Well, is it possible that it always knows exactly what to do in all these scenarios, but then you put it in the real world and there’s some real world scenario, and then somehow it chokes on the real world scenario?”
I mean, there really are going to be a lot of scenarios. And I guess another element that comes into play here also is that our preferences also aren’t simple. So the thing we’re trying to get it to learn—if it’s maximum paperclips or maximum diamond or whatever, then I would have more hope that all of the examples, “Look, there’s more diamond here. There’s more diamond here.” It’s just all about more diamond. I have more hope about getting it to do that.
Liron 00:50:05
Compared to being, “Well, we wanted you to balance things like this. And here’s another scenario where it’s this war with these other aliens for the galaxy.” I mean, that’s the thing: whatever we’re training it for, it’s pretty quickly going to take its training and be like, “Oh.” For it, it’s easy to predict the next billion years of the universe’s evolution.
I guess that factors into this too, is the AI is going to have a much wider scope of what it’s looking at. The same way that Elon Musk—when Elon Musk sets out a project, Elon Musk is thinking, “Okay, so how am I going to get to another planet in the next 20 years?” Whereas when you and I look at a project, we’re like, “Okay, how do I make this website process a few payments every day?” And Elon Musk is like, “How do I make humanity go to Mars in 20 years?”
Liron 00:50:55
So the same way that some agents have a much bigger scope, I fully expect the AI to be like, “Okay, the next billion years, I roughly know how things are going to play out. I can pretty much shape the next billion years. I have plenty of—I’m overpowered at this task of shaping the next billion years.” And then the question is, “Okay, well, how do I make the right moral trade-offs over the next billion years?”
The thing is that there’s going to be a lot of subtle trade-offs. There’s going to be new trade-offs. There’s also going to be acausal trade and game theory with aliens. It’s going to know about grabby aliens, and it’s going to be pre-planning how it’s going to trade with those aliens, and I just don’t think it’s that easy to generalize.
Alec 00:51:09
Okay, maybe a counterargument is that me as a human, I feel like I could make some safe decisions that in expectation would be considered moral decisions in that scenario. I might be intelligence-bottlenecked, but my intentions are in the right place.
And so it seems like an AI that’s trained on a well-specified function over many, many examples—surely it would make at least as good moral decisions as I would in expectation. For example, by just playing it safe, it seems like it might learn these heuristics during training if the function is well-specified. And so it should be able to generalize these heuristics.
Liron 00:51:54
Okay, I’ll take this as your last sub-question. I think this is a great discussion, but also partly you’re kind of breaking up, so this’ll be my last response.
So this gets into the conversation that I had with Jim Babcock a few months ago, if you search for Doom Debates Jim Babcock. And I think, if I understand you correctly, it’s like, look, we talk to these LLMs, and these LLMs do a really good impression of a person analyzing the situation. And I’m actually willing to grant—Eliezer Yudkowsky wasn’t willing to grant this when I talked to him, but I’m personally open to granting that the LLMs are pretty close to just giving you the same responses as a human when you ask them morality questions.
Liron 00:52:40
And then I think the bigger disconnect is going to happen when we get more powerful agents. So yeah, the LLMs will be our buddies. Me and the LLM, we’re both cool. But the problem is that the LLM-based agent is also not super powerful. The LLM-based agent is still kind of on our side of the divide compared to superintelligence.
And then the LLM-based agent is gonna be like, “Hey, let’s make a super intelligence. We’ll work together. We’re gonna make a super intelligence. Let me write some code. It’s gonna use reinforcement learning.” And the super intelligence is going to kind of poof into being just based on working backwards. It’s just going to be this iterative process of—look, you get results.
Think about AlphaGo. Self-play. You get results and you work backwards from how do you get the results, and you just become this agent that drives results. All we know about the agent is this is its basic structure. It’s a transformer or whatever. It won’t be a transformer. It’ll just be something. It’ll have an architecture and it’ll have a shitload of parameters and it’ll reliably drive results. That’s all we’re going to know about it.
Liron 00:53:35
And our buddy over here, the LLM that can also talk about morality, our buddy’s gonna be like, “Yeah, man, let’s summon this thing.” It’s gonna be our enabler, basically. It’s just gonna be as clueless as us about how dangerous it is, or almost as clueless. It’s gonna let us do it. And then this other agent comes along and it’s just not going to run the same process of thinking about the full human morality and making sure to take actions that get it, because that’s different from how it optimizes training. I just don’t see a reliable process in this step. Yeah.
Alec 00:53:54
Cool.
Liron 00:53:54
All right. Let’s leave it at that. Thanks so much for the questions. I mean, this is definitely a super deep discussion, deeper than with most guests, even. Most guests are basically a waste of time in the sense that they ask obvious questions. And then once in a while, somebody’s like, “Hey, let’s talk about the actual open questions right now,” which I appreciate.
It’s just that most of what I do on Doom Debates these days is trying to get the world up to speed on, “Hey, we are at least facing a risk.” So that’s where I focus my time. All right, thanks very much, Alec Harris.
Alec 00:54:21
Thank you so much.
Liron 00:54:23
All right, I’m checking on the YouTube chat. All right, we got some big money here. Alpha Diversity 1344 just threw in 10 pounds, British pounds. Thank you for that. Appreciate it.
So yeah, if you guys donate money and then you wanna come to the live chat, if you click that link, I will prioritize you. This is pay-to-play; strip club rules apply. All right, I just put the call-in link in the chat if anybody wants to click it. All right, we got another guest here. This guest is Mohamed. Hey, Mohamed, how’s it going?
What is Your P(Pocket Nukes)?
Mohamed 00:54:56
Hi, how’s it going? Can you hear me?
Liron 00:54:59
Yeah.
Mohamed 00:55:00
Sorry, can you hear me now? Okay, perfect, perfect. All right, I just wanted to—I love your videos on Doom Debates.
Liron 00:55:08
Thanks.
Mohamed 00:55:08
They’re super interesting, especially the ones with George Hotz. And I just wanted to ask: what is your probability that we get pocket nukes before we reach a near-total surveillance state, one that can track everything in real time?
Liron 00:55:23
Okay, so pocket nukes before surveillance state. What are you saying?
Mohamed 00:55:27
The probability we get pocket nukes before near total surveillance state.
Liron 00:55:32
Right, because surveillance state is why people say that we shouldn’t try to regulate or pause AI, because you would hate to get a surveillance state with centralized humans in control. Okay, fair enough.
I mean, look, I don’t want a surveillance state either. I mean, that is my—I would hate for the government to go hardcore regulating social media or whatever, other tech or virtual reality. So I’m sympathetic to not wanting a surveillance state and not wanting too much regulation. Let me see if I can mute you, just because I’m hearing some background noise. All right, successfully muted you, so feel free to unmute yourself if you want to talk.
Liron 00:56:05
All right, so the question is will we get pocket nukes. So I know where you’re coming from. You’re saying Liron wants to regulate AI so people don’t get pocket nukes, but won’t we just get a surveillance state? And if we didn’t regulate people, wouldn’t we potentially not get the pocket nukes and last a long time? I think that’s where you’re coming from.
I mean, look, my concern personally is I think we’re five years away from pocket nukes, and I’m not writing off that we may be one year away. This is also what people don’t get. Okay, let me go on a mini tangent.
Liron 00:56:45
You know AI 2027? I covered AI 2027 on the show, and everybody thought that AI 2027 was predicting that superintelligence is coming in 2027, but if you read it carefully, that’s not what it was predicting. It was saying there’s a probability distribution, and its mode is in 2027. The mode just means the single highest peak, the one that’s higher than all the other peaks. And the mode isn’t even the mean. So it was typical nerd language being kind of confusing.
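A minimal sketch of the mode-vs-mean distinction, assuming an illustrative right-skewed timeline distribution; the lognormal shape and all of the numbers are invented for the example, not taken from AI 2027:

```python
# Mode vs. mean for a right-skewed distribution. The lognormal shape and
# all numbers are illustrative, not AI 2027's actual forecast.
import numpy as np

rng = np.random.default_rng(0)
# "Years until superintelligence", with a long right tail.
samples = 2024 + rng.lognormal(mean=1.5, sigma=0.8, size=100_000)

counts, edges = np.histogram(samples, bins=200)
mode = edges[np.argmax(counts)]               # the single highest peak
print(f"mode   ~ {mode:.0f}")                 # earliest: the most likely single year
print(f"median ~ {np.median(samples):.0f}")   # half the probability mass lies later
print(f"mean   ~ {np.mean(samples):.0f}")     # latest: dragged out by the long tail
```

For a skewed distribution like this, the mode lands earliest, so “the mode is 2027” is compatible with most of the probability mass sitting in later years.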
But when they released AI 2027, they’re really saying, “Hey, super intelligence might come in 2025 or 2026 or a distribution over a bunch of years, and probably not after 2040 or 2050.” Basically the same thing that I think. I’m actually on the same page as those guys, which a lot of people are. The Metaculus forecasters, on average, are. People who work at OpenAI are. Elon Musk is.
Liron 00:57:30
I mean, it’s actually funny how consensus this kind of distribution is among anybody who you’d expect to know. Of course, not everybody. Gary Marcus thinks AI might be coming in 100 years. Amjad Masad from Replit has expressed skepticism, saying AI might need microtubules and whatever.
Mohamed 00:57:36
Yann LeCun, Yann LeCun.
Liron 00:57:36
Yeah. Right, Yann LeCun. Well, funny enough, the last thing Yann LeCun says is, “AI might be a really long time away. It might be here in a decade.” Those are Yann LeCun’s exact words. So even Yann LeCun, I think, is—
Mohamed 00:57:46
[laughs]
Liron 00:57:46
—subtly also now thinking that AI is not that far away.
So anyways, I wanted to get that point out of the way, and I wanted to also point out that even though there’s been a vibe shift where now people are scoffing at AI 2027—they’re like, “They said 2027, and now even the authors are saying it’s more like 2029”—okay, let’s be clear though. It might even come in 2026. We still haven’t foreclosed that possibility.
Nobody should be smug that it’s not coming in 2027 or 2026. I agree that it’s probably not. I agree that that would be somewhat surprising. But if you told me, “No, it’s coming in 2026,” I’d just be like, “Oh, okay.” I wouldn’t even be that surprised. If you told me it’s coming past 2100, then I would definitely wonder what took so long.
Liron 00:58:45
So anyway, there is a probability distribution. That’s how probabilities work. I feel like people are very confused about this basic concept, but there’s a whole distribution.
All right, but we were talking about everybody getting a personal nuke. So the answer to that question is, I think super intelligence is more likely than not coming in 10 years. The peak of people’s distribution is around 2030 these days, maybe 2032. I think it’s very likely coming by, let’s say, 2040. And so when you talk about pocket nukes, I mean, we’re just talking about something that’s pretty soon in our future.
So my general position is we should try to regulate it before that point. I mean, that’s all I’m saying. What do you think?
Mohamed 00:59:05
No, I think it makes sense. I think it makes sense. Because these models are increasingly weird. Because if you try GPT-5, for instance, GPT-5 is unable to play a game of Tic-Tac-Toe. I’ve tried. Sometimes it’s able to do a draw, but sometimes it’s just—and those are defined game states. You know all the moves required. It’s not a game of chess. And the failure modes for why it’s unable to do that are quite weird.
Maybe it’s its personality, maybe it’s something else. Because it knows all the game states, it knows previous game states, and it still just seems to lose every single time, which is weird. So yeah. It’s non-trivial to try to understand how well it could do something in the future.
Liron 00:59:46
Fair enough. Thank you for the question, Mohamed C. Gonna move on to the next guest here. Let’s see, I’ve got a question from the chat. AlphaDiversity1344 asks:
The "Shoggoth" Metaphor Is Outdated
Liron 01:00:05
“Is the Shoggoth metaphor doing more harm than good in AI risk discussions, framing danger as a hidden monster rather than fragile behavioral control that fails under scaling and agency?”
Yeah, I haven’t invoked the Shoggoth metaphor for a while. When GPT-3 and GPT-4 were coming out and everybody was finally turning their attention to LLMs, everyone was like, “What the hell, this is passing the Turing test.” Everybody was really getting woken up, myself included.
When that came out, there were these memes on Twitter from the “AI Notkilleveryoneism Memes” account. That account is really interesting; they’ve got a lot of good viral tweets. And I think that account might have popularized the picture of the Shoggoth, which is this monster.
Can I do screen share? Let me pull up a Shoggoth. See, Google Images Shoggoth. All right, everybody, here is a Shoggoth. So the idea is that you’re talking to the happy face and it’s being really nice to you, answering your questions, helping you out. But the rest of the body of the thing you’re talking to is so alien. It’s not what you imagine. It’s just kind of stringing you along, and you may ask it something else and then it might bite your head off when you least expect it, because the happy face was just a mask. So that’s the idea of the Shoggoth.
Then the question is, is that even an apt metaphor? Because I think we’re getting to the point where, sure, the LLM is a Shoggoth inside and it’s just an actor, but it’s getting so good at acting that the happy-face part is getting really hard to distinguish from a real human.
And I think the only way to distinguish it from a real human... I mean, we’re finding jailbreaks, but besides that, we’re getting closer and closer to okay, if you wanted to distinguish it, you have to ask it this really novel question about how would you shape the universe or whatever. And even then, maybe it’ll get it as good as a human.
So the happy face part of the Shoggoth might have stretched around the entire human form. And at that point, it doesn’t really matter what the rest of the Shoggoth is, because if it’s just operating on the happy face and the happy face has enough surface area, maybe we’re actually good in that sense. Maybe the Shoggoth was misleading in that sense about LLMs.
But then the problem is, it does reinforcement learning, and then maybe reinforcement learning or whatever the next paradigm is—maybe that’s the real Shoggoth. Because LLMs might... it might just be in the nature of LLMs to really just simulate humanity, but that’s different from reinforcement learning or other paradigms where their nature is more like, “No, just get results.”
See, the only result LLMs were originally trained to get is predicting the next word. And I guess we never hit the failure mode where, in order to predict the next word, the model tries to freeze the universe, or tries to manipulate your brain into knowing what the next word is by controlling you, crazy failure modes like that. I guess we never ran into those.
I guess predicting the next word was safer than I expected, and maybe safer than what the Shoggoth meme implied. So I’m happy to update on that. And I think this is also what Steven Byrnes said. If you go and watch my episode with Steven Byrnes, I think this was basically his conclusion, and I think this was Jim Babcock’s conclusion.
I think this was a lot of people’s conclusion: with LLMs we did hit this nice pocket of safety where we’re probably not going to screw ourselves just by predicting the next word, unless maybe if we scale up so much that the task of predicting the next word could become dangerous. But it probably won’t be in a practical few years or whatever.
And you could say that I’m moving goalposts as a doomer. Maybe that’s a valid accusation—I said that was gonna be true for LLMs, now I’m moving it for reinforcement learning. But I’m willing to ponder whether I’ve moved the goalpost too much.
I still think I’m probably right that in the next paradigm where we’re just saying, “Hey, do anything in the universe, don’t just predict the next word”—just drive an outcome, maximize money with the entire universe as your domain instead of just picking what word to output—pick what action to take in the universe. I think that moving the threshold there is pretty legitimate.
I’m not scared where it predicts the next word. I’m scared where it takes actions in the universe. I still think that’s a reason to be scared. I’m not willing to concede yet. I’m convincible, but I’m still scared.
Let’s see what people in the chat are saying. Alex Test saying, “We didn’t hit a pocket of safety. Obviously the tool humans tried to make is probably safe for humans.” I don’t think it’s that obvious. Okay, being scared is fair.
We’re talking about the Shoggoth. I think this is an interesting question, because the Shoggoth memes talking about the body of the LLM behind the mask being so dangerous—maybe that was in fact misguided. So this is a better than average question from AlphaDiversity1344. I’m not just building you up because you’re my number one YouTube fan. I do think that this question is something that we should account for, the Shoggoth question.
I wonder if we can repurpose the Shoggoth and say that when you do reinforcement learning, you’re getting a cheater and the happy face is just somebody who’s cheating. So the teacher’s like, “Wow, my star student got 100% on this test,” and then it turns out that the student just has no idea about the teacher’s material.
Maybe an English class is the best metaphor for this, because I’ve been to some English classes in my day where the teacher was so lame, telling us to analyze books and talk about the symbolism stuff. And I always hated the symbolism, because I learned that a lot of times you could turn in an essay in an English class about the symbolism in a book, and the author will be on record saying they didn’t even mean for that to be the symbolism, and you can still get an A interpreting the symbolism as something the author never intended.
So you’re basically just riffing, BSing, which some people find an enjoyable exercise. But for me as a nerd who just cared about truth and things that have actual stakes, trying to BS about symbolism—I really resent that I had to do that.
So if I could have just cheated my way through the class... And these days I would cheat. I would use an LLM. And sure enough, remember when Liam Robbins was on the program talking about students cheating? I’m sure that’s what his friends are doing.
So I’m sure people are cheating like that. And from the teacher’s perspective, if the teacher is naive about LLMs, maybe the teacher is saying, “Wow, this student really gets it. I’m so happy that this student has clearly been paying attention to my teachings. This student really gets symbolism.” But of course they’re just BSing it.
So the cheater’s face—maybe we need a new image. What does a cheater look like? Or maybe we just reuse the Shoggoth. But yeah, anyway, excellent question. I’ll be thinking more about that.
Should I Reframe the P(Doom) Question?
Liron 01:06:19
All right, I got another guest here, a live guest. Let’s say hello to Lyle.
Lyle 01:06:25
Can you hear me?
Liron 01:06:26
Yeah. Thanks for turning your video on.
Lyle 01:06:28
I got this terrible lag.
Liron 01:06:30
Hey, Lyle, if you’re listening to me through the YouTube, maybe don’t do that. Only listen to me through Riverside because YouTube has a two-second delay.
Lyle 01:06:36
No, it’s way worse than that. All right, I’ll just talk, and then if the conversation isn’t working, I’ll just talk.
So this is a question about the format of the show. You have the big question, what’s your P(Doom)? And it seems like most guests who agree to go along with it only sort of half go along with it. What they really give you is a conditional probability of doom—doom conditional on business as usual or no giant Chernobyl-like event or no treaty or something like that.
And then they’ll decline to give you a probability that business as usual actually happens. So I think that’s pretty much how Yudkowsky answers or—
Liron 01:07:21
Oh, shit. No, I accidentally moved Lyle to be backstage. Fat finger. Let me see if I can get him back. Lyle, can you hear me? You’re back.
Lyle 01:07:29
All right, well, that’s the gist of my question. I don’t know if you can hear me. That was the gist of my question. Do you think you should reformat the big question? That was the question.
Liron 01:07:36
Oh, reformat it? Yeah, okay. I mean, look, P(Doom), it definitely has a lot of nuances. And Roko famously says that he thinks P(Doom) is negative 20%, because he thinks it only makes sense to ask about P(Doom) relative to a world without AI. And he’s so pessimistic about the world without AI, because he thinks the birthrate is going to zero, yada yada yada.
So I mean, yeah, look, I’m happy to ask a different question. It’s just, I see it as a starting point for the conversation. If I had to ask the most nuanced question, then it would be a long question with a lot of... like a multiple choice question.
And you could imagine that I just ask people a bunch of questions, like, “Hey, if there was no AI, what do you think would happen?” You can imagine I do a whole interview, and then finally I’m like, “Okay, based on all that, then your P(Doom) conditional on AI is this.” I mean, I could format it one way. I just think that it’s still interesting enough to have P(Doom) as the first question, and then based on what they say, then clarify more.
And because we’re working with such wide ranges, where it’s like, okay, I say my P(Doom) is 50%, but could I wake up tomorrow and say it’s 80%? Sure, because I don’t feel like I can even make that big of a distinction. I just feel confident saying it’s not 99%. It’s not 1%. I don’t even think it’s 10%. I think it’s clearly higher than 10%.
When people accuse rationalists or AI doomers of being like, “You’re making up these numbers,” no, no, no, let me clarify something, okay? Yeah, I’m making up 50 as opposed to 40, but I’m not making up 50 as opposed to 10. I’m not making up the orders of magnitude.
So there is some degree, some quantifiable degree, where you do actually know what you believe, and you feel pretty confident about it. It just deals with bigger numbers, bigger geometric multipliers. So if you convert everything to odds ratios, 1 to 100 is a dumb number to say, in my opinion. Some of my guests say the probability is 1 in 99, and I think that’s dumb. Anything where you’re saying close to one to one... two to one, three to one, okay sure, that’s still sane. So I’m just saying the sane range is clearly in that neighborhood.
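A quick sketch of that odds-ratio framing, using the probabilities from the conversation plus a couple of illustrative ones: near-even probabilities differ by small multipliers, while tail probabilities differ by orders of magnitude.

```python
# Probability -> odds. Near-even probabilities differ by small multipliers;
# tail probabilities differ by orders of magnitude. Numbers are illustrative.

def to_odds(p: float) -> float:
    """Odds of doom : no-doom for probability p."""
    return p / (1.0 - p)

for p in (0.01, 0.10, 0.40, 0.50, 0.80):
    print(f"P(doom) = {p:4.0%} -> odds {to_odds(p):6.2f} : 1")

# "Making up 50 as opposed to 40" is a ~1.5x odds shift;
# 50% as opposed to 1% is a ~100x shift -- two orders of magnitude.
print(to_odds(0.50) / to_odds(0.40))  # ~1.5
print(to_odds(0.50) / to_odds(0.01))  # ~99
```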
And people think that when you move to quantify something, you’ve put this huge burden on yourself because you’re gonna quantify it. But I have a superpower. It’s called quantifying entire orders of magnitude. So you just have this huge numerical range, and nevertheless, it’s still good enough. And I have a very natural capacity or tendency to think like that.
I think it’s related to why I like being in the startup game. I like doing angel investing, because when I’m investing in something, I don’t like to be like, “Okay, I’m gonna buy this stock, and then it’s gonna go up 7% a year, and then I’ll double it when I retire.” I don’t really like to think like that.
I like to be like, “Okay, how can I just buy one thing that just goes up by 500X, and then that’ll just be all my income?” I like to simplify it like that. And I’m willing to buy a hundred other things if I can just get that one thing. So that’s how I tend to think, where I’m like, “I just want one thing to have a really big order of magnitude and solve it.”
And doom is a case where I think I can say that it’s overwhelmingly likely, or I think there’s an overwhelming consequence if the AI does in fact become uncontrollable and superintelligent. And I think the probability is overwhelmingly significant, meaning more than 10%.
And so this is just a situation where we have the luxury of not trying to pinpoint exactly what these numbers are because they’re well within the range where we should be freaking out. For me, there’s not really any argument why we shouldn’t freak out right now. And that’s why I often talk about fearmongering.
And I wouldn’t say this about anything. I think there’s a lot of things that are within a nuanced range. And usually when I encounter something that’s within a nuanced range, I just list the nuances, and then I bow out. I’m not somebody who enjoys the nuance that much. I’m just aware of the nuance. And I think that in the case of AI doom, there’s not that much nuance to the conclusion that we should clearly be freaking out.
Liron 01:11:19
All right, so yeah, well, currently the queue is empty for the live chat. If anybody wants to ask another live chat question, here is that. Or a live call-in link. Here it is again in YouTube.
Let’s see what people are saying in the chat room. Luke McNally is saying, “Liron, I would love to see you do more calls to action in your videos. You do often mention PauseAI, but I’d love to see you suggest ControlAI and Microcommit, a weekly email of less-than-10-minute tasks regarding x-risk.”
How YOU Can Make a Difference
Okay, yeah, I could do more calls to action. I mean, generally I’m pretty open when people email me something that I think is notable. Recently I shouted out aisafety.com, a really cool site indexing everything about AI safety, including places to donate to or help out.
When people email me and I take a look and I’m like, “Yeah, this is notable,” I’m usually willing to give a shout-out at the end of an episode. Because my audience size is only a few thousand views per episode, I don’t think it’s a huge impact. But it’s better than nothing. I think we can firmly say that me doing a shout-out at the end of the episode is one notch better than nothing, which is an achievement. It wasn’t true in the first few months of the show.
You know, I actually meant to say this, speaking of calls to action. I did select a couple calls to action that I wanted to mention in this very livestream. So how’s that for a segue?
Let me open a new tab. Okay, I’m gonna screen share. So first of all, I gotta pimp my own show, doomdebates.com/donate. Everybody go to doomdebates.com/donate. You can donate to Doom Debates. That is a thing you can do.
And the question you should ask yourself is, “What would it take to make Doom Debates worthy of your donation?” Would it take having a mechanism to lower P(Doom)? Because I think we do have that. I think we have a mechanism to realistically lower P(Doom). And the mechanism is to get the average person to even realize that this is a freak-out issue, and they should vote based on it as opposed to totally ignoring it.
It’s kinda funny. I live in Saratoga Springs, New York now, which is a suburb where nobody’s thinking about AI doom. And I’m just meeting people here in the community and having friendly chats, and I’m happy to socialize. I’m making new friends. I moved here recently from Silicon Valley.
And it’s just funny because I’m not shy about telling them that I’m an AI doomer and this is a big thing that I’m thinking about. And yet, I’m still chill. So I don’t think I have to eject from the conversation just because I’m an AI Doomer. I think you can still have a good time. You can have a nice time meeting people and socializing while also, if the topic comes up, mentioning that you’re an AI Doomer.
But it’s also disappointing to me that this is really... their typical reaction is, you know, because I’m being pleasant, because I’m still having a good time with them, I’m not making it negative, I’m not ruining the vibe, I’m still talking about it in an entertaining way. So their typical reaction is like, “Oh, yeah, huh, that’s kinda interesting, huh? And you do a podcast about it? Okay, that’s kinda interesting.”
I don’t think that I’ve killed the interaction. I think I have enough social savviness to be like, “Okay, we’re still fine.” But it’s still kind of unfortunate that they have to hear it from me this late in the game. It’s T-minus five years... I mean, if you look at Metaculus, 2032. Currently it’s almost 2026. So it’s six years until potentially losing control, that timeline. And their whole reaction is like, “Oh, hmm, that’s interesting.” They have zero context about anything I’m saying.
So anyway, my point here is that we can actually make a difference by getting the message out enough in a mainstream format, hosting these debates with people that are worth paying attention to, getting more mindshare. And suddenly, these average people, these average citizens of Earth will already realize that this issue is urgent. Bernie Sanders actually realized it the other day. So we’re definitely getting traction, but it’s just a question of time.
So going back to the point, I’m actually making a point here about Doom Debates. When you donate to Doom Debates, you’re donating to a show that has a plan to raise people’s awareness and make them realize that this issue is actually urgent, so that when they get in that voting booth and there’s a candidate like Alex Bores saying, “Hey, we’re gonna negotiate a treaty, we’re gonna actually take action on this,” their eyes will be opened.
Okay? So that’s what we’re doing at Doom Debates. Our mission is to raise awareness of AI existential risk, imminent AI existential risk, to fearmonger so the average person realizes that this is actually urgent, that they should pay attention to it. And once they’re paying attention to it, I actually believe that the average person will be like, “Yeah, I can see this is the issue.”
Because that is the reaction that I get, and I don’t just think people are being polite. I think that people—the normies of the world, the people who don’t obsess about this every day, who aren’t even in the tech industry—when they hear this, they’re like, “Yeah, this makes sense, let’s regulate it.” And it literally is just a matter of raising awareness.
Now, we also have another mission, which is to raise the quality of debate. I do think our society is kind of stunted in our ability to debate. I’ve been shocked at how low quality the debate is.
I first started seeing it on Twitter, X. I first started seeing, wait a minute, these are people that I consider intelligent, and they’re going on X, and they’re character assassinating people. They’re personally insulting people, questioning people’s motives instead of just engaging object level with simple arguments.
So I do think there’s a lot of work to be done, just leveling up our society, the ability to have debates. If you watch Doom Debates, there’s actually a lot of you guys who watch Doom Debates and you don’t even watch it for the doom. You watch it for the debate, because you think it’s a breath of fresh air to see a debate where the people aren’t just trying to score points, rhetorical points. They’re actually getting to the substance of the disagreement. I feel like we’re kinda starved for substantive debates like that.
And that’s the other mission of Doom Debates: to elevate society’s capacity to productively debate. Eliezer Yudkowsky often points out that he thinks society in the earlier 20th century, the 1950s, had more of this capacity, whether it’s longer attention spans or just better discourse norms. This is something we’ve lost, and I think Doom Debates is helping regain it. We model having these two-hour good-faith debates where we find each other’s cruxes.
All right. Yeah, Doom Debates is so great, and you should definitely donate to it. And if you do donate to it, I have to point out that donating $10 doesn’t move the needle. I’ve set the threshold at $1,000 plus to consider you a mission partner. Mission partner is somebody who actually materially changes the show’s budget so that we can do more stuff.
For example, Producer Ori—he couldn’t be here today because he’s sick, but this is his full-time job. So we actually have a full-time employee. I myself am working for free, so if you’re donating to Doom Debates, you’re not going into Liron’s bank account. You’re entirely going to pay show expenses.
And there’s actually a really big show expense happening right now, which is this studio, which is really just an extra bedroom. This studio that I’m working out of right now is actually going to be massively upgraded in a month or two. This is going to look like a super professional studio, okay? So we are not dicking around here on Doom Debates.
I and Producer Ori and all of our supporters, our mission partners in the Mission Partners Discord, we are all committed to actually moving the discourse. Remember, we had Max Tegmark and Dean Ball. Dean Ball was instrumental in crafting America’s AI Action Plan. We’re not dicking around here, okay?
That is the most representative episode of what we’re here to do in Doom Debates. No offense to, for example, Devon Elliott, my most recent episode. That’s actually not representative of what we’re trying to do here. What’s representative of what we’re trying to do here is get the decision-makers in the room and have them account for their actions and have them explain their perspective.
So for example, you got to hear that Dean Ball’s perspective is that P(Doom) is 0.01%, and he was instrumental in crafting America’s AI Action Plan. That’s what we’re here to do in Doom Debates. Seems like an important job.
And that’s why, at the end of the year, when you’re asking how to get that tax deduction, for those of you who are used to getting a tax deduction for charity, you can write Doom Debates on that line.
So we’re doing a big studio buildout, because think about this. David Sacks, the AI and crypto czar... Think about a cabinet secretary in the White House asking whether they should come on Doom Debates. They should. But are they gonna come on a show that looks like this? No.
I have a DSLR camera, but that’s not gonna cut it. They need higher quality. They need to look at the show and be like, “Yes, this is a show where I can imagine myself appearing on, and that’s why I’m gonna go on this show.”
And so if you give me a couple months—there’s already... This is already in progress. There’s actually a bunch of boxes off-screen that you can’t see because we’re doing a crazy, expensive buildout of a whole new Doom Debates studio.
So this is all part of the theme of we’re not dicking around. We actually have a mechanism of action to have our society engage with this incredibly important thing that’s happening really soon. And yeah, so that is Doom Debates. There’s more at doomdebates.com/donate. Hopefully a couple of you guys donate. The more, the better.
All right. Now, let me tell you about some other things that you can donate to as well. So, there is Pause AI. I’m a member of Pause AI, so pauseai.info. Pulling it up here on the screen share. You can see their headline is “Don’t let AI companies gamble with our future.”
So just like Doom Debates, the mechanism of action is to raise awareness. Pause AI, the mechanism of action is to have protests and basically do government actions. More people calling their congressman or woman, stuff like that. Also raising awareness, also holding events. They’re doing local chapters.
I think at this point, this is one of the only levers that can possibly work—building a grassroots movement to Pause AI. I don’t think it’s realistic to expect leaders to lead from the front. I think leaders mostly lead from behind. And so you just need a movement. You need people out on the street protesting. And so I support Pause AI. I encourage you to go to pauseai.info and donate.
But there’s also another Pause AI. It’s called Pause AI US. So check this out—I’m actually on pauseai-us.org, and I’m actually more familiar with Pause AI US. That one is led by Holly Elmore, my friend. You can go search for Doom Debates Holly Elmore. We had a very interesting episode where we talked about rationalists being a circular firing squad. I thought that was a fun and interesting episode.
So Pause AI US, we do protests in the US. All of the protests that I’ve done in the last couple of years, if you’ve ever seen me holding a megaphone in San Francisco, talking at Anthropic, telling them why they suck, that was a Pause AI US protest.
And so they do that, and they also run call-in campaigns aimed at the government. They get a bunch of people to call their representatives. I think they’ve gotten 800 calls going in, which is a start. I’d love to see half a million, but 800 still takes work, and they’re scaling up. They’re hiring staff. So I encourage you to go to pauseai-us.org; they have a Donate button.
So that’s already three different organizations you can donate to. I mean, there’s a lot of different things you should donate to. If you go to aisafety.com, they have a whole fundraising section. So I really just wanted to highlight a couple during this podcast.
Okay, so I mentioned Doom Debates. I mentioned Pause AI. I mentioned Pause AI US. I have a couple more I wanna mention.
Everybody should check out lesswrong.com. Obviously, LessWrong, it’s a really great rationality site. Even though I’ve accused them of being a circular firing squad and not supporting “If Anyone Builds It, Everyone Dies” quite enough. That’s okay. I still really appreciate the work they’re doing and the work that Lightcone Infrastructure is doing.
And you can see here, if you go to the front page and search for donate—okay, their own donation post, I was gonna say it moved off the front page, but no. It’s got a section here. So if you click Lightcone Infrastructure Fundraiser, you can read this whole thing.
I was actually very impressed by this post. I consider this post so good. It’s funny, I’m an angel investor, and I often invest in startups. Less so now, more so when I thought timelines were longer. And so I’ve got dozens of random small startups that I’ve invested in.
And typically, when you invest in a startup, the amount of information that you get on all of their progress is less than the amount of information in this one post. So this would be considered an amazingly informative report of really good high-quality work with high-quality reporting. This is just so impressive.
And I personally donated a somewhat substantial amount this year. Even though I’m asking you guys to donate, I’m here donating to LessWrong. How about that? I’m paying it forward.
And then one more site to donate to is, of course, intelligence.org, the Machine Intelligence Research Institute. Their current... yeah, their 2025 fundraiser is live.
I mean, look, Doom Debates wouldn’t be here without MIRI. Eliezer Yudkowsky, I’ve referred to him as the most important thinker alive. I stand by that statement. You can search for Liron Shapira YouTube Eliezer Yudkowsky.
And yeah, so they’ve got a fundraiser. So look at this. They’ve raised $512,000, and it looks like they’ve got a stretch goal here of $6 million. So they need to raise another five and a half million.
So if anybody has a couple million lying around, I do actually recommend donating it to MIRI. I mean, I know that’s a lot of money, and probably nobody watching the stream has that. But if you do, it’s hard to make a bigger impact than donating to one of the organizations that I mentioned: Doom Debates, Pause AI, Pause AI US, Lightcone Infrastructure (otherwise known as LessWrong), and the Machine Intelligence Research Institute (MIRI).
All right, infomercial’s over. Yeah, hopefully some of you guys take that to heart. And remember, December, you wanna get that tax break. All of everything I just mentioned right now is 501(c)(3), so you can deduct it from your income.
Back to the show. All right, we’ve got a live guest here, and we’ve got half-hour left in the stream, so you guys definitely can line up if you wanna be a live guest. Let’s hear from—
Can AGI Beat Biology?
Liron 01:24:42
Yaqub. It’s kind of similar to Jacob, but in another language. Hey, Yaqub.
Yaqub 01:24:47
In Arabic.
Liron 01:24:48
Oh, cool. Nice to meet you.
Yaqub 01:24:50
Nice to meet you too. Maybe I should turn off YouTube.
So the question I have for you is... I guess, over time, I’ve been thinking a lot about AGI and what it would take, and it seems to me that humans are made of nanotechnology—if you see the cell and how amazing it is. So how much above biology can you really get with AGI?
Liron 01:25:19
Yeah. Great question. I mean, look, the cell is quite amazing. It’s quite amazing. That’s definitely one of the coolest things to be looking at, is a cell. It’s freaking crazy.
I mean, from what little... I’ve probably spent 100 hours of my life looking at cells, just because in high school they make you do quite a lot of biology, and then I did a genetics class in college. So I’ve been impressed by what I’ve seen from nature engineering cells.
It’s so great. I showed my six-year-old son an animation of how DNA synthesis works, and there are proteins crawling along, the polymerase just crawling along, copying one base pair at a time. I don’t understand how all that stuff works.
And apparently everything just bumps into everything else, so any time that you have two things that could react to something interesting, you just wait a little bit, and there’s so much bumping going on that it just comes by and bumps. It’s circulating like crazy over in there. It’s a crazy place, the cell. And yeah, obviously it’s had billions of years to evolve, many generations of selection.
So getting to your question—how do we think we can engineer something better than that? Well, I mean, look, it’s a pretty simple extrapolation. Generally, whenever we have a metric that we care about and we just apply a decade to it—let’s say 50 years, to be generous—we just seem to always succeed. Or maybe not always. But we just succeed again and again and again on really important dimensions.
So you could be like, “Well, hold on. Here’s dimensions that we did succeed on. Here’s dimensions that we don’t succeed on.” But there’s so many dimensions that we do succeed on that you can’t look away from those dimensions.
I mean, look, flight. We wanted to fly better than birds, or at least as well as birds. On many dimensions, we ended up flying better than birds. We have some robots that are trying to do the subtle wing motions birds use to get more dexterity or whatever; okay, those still have a ways to go. But certainly in terms of mass, and also power efficiency, our most efficient gliders and flyers are better than birds.
I was pretty surprised—I shouldn’t have been in retrospect, but I was pretty surprised to learn that in terms of converting sunlight to energy, our solar panels are now significantly better than leaves. You’d think that after a billion years of optimizing plants, nature would be good at solar panels. But it turns out plants are under other constraints. The plant also has to carry out other life functions, and it can’t get too hot, because there are heat constraints on the other cellular functions happening there. So it turns out a leaf is only an okay solar panel, and we’ve already surpassed it as a solar panel.
So I mean, don’t get me wrong, the cell is doing a lot. Yeah, go ahead.
Yaqub 01:27:48
So I’m guessing—what I mean by that is, how can you really get smarter than humans without actually paying a lot of energy costs? Right now, the human brain uses 20 watts, and then the data centers that we have use gigawatts of energy. So what I’m wondering is, can you really get smarter than humans without actually paying a lot of energy costs?
And I think that’s where biology comes in. The human brain is very energy efficient, and it’s not clear to me that you can get superintelligence very freely without paying a lot of energy costs with it.
Liron 01:28:22
Mm-hmm.
Yaqub 01:28:22
...paying a lot of energy costs with it.
Liron 01:28:24
Oh, got it, got it. Okay, well, let me run with that premise. I mean, okay, the human brain is super efficient.
So, last I checked: there’s the Landauer limit (okay, not my area of expertise), which is the minimum energy you have to spend to erase one bit of information, essentially the cost of one irreversible operation of computation. And the human brain is about six orders of magnitude away from that.
So even when you say the human brain is really energy efficient, I think we can agree that the laws of physics allow something better. And then your claim is just like, okay, that’s fine, but the amount of complexity that it even took to get here suggests that it might be really hard to do better than six orders of magnitude worse than the Landauer limit, correct?
Yaqub 01:29:06
Yes.
Liron 01:29:07
Okay. So let’s run with that premise. Let’s say, yep, six orders of magnitude—a million times worse than the Landauer limit. That’s as good as it gets without millions of years of engineering.
Okay. So we take the current level of energy efficiency, and we take the human brain, which runs on 20 watts, and we just give it 20 megawatts. We just build a data center; I mean, data centers are gigawatt scale now. So okay, it’s equally energy efficient, but it’s still gonna be much more powerful.
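Here’s the back-of-envelope arithmetic behind that, as a rough sketch; the brain’s operations-per-second figure is a commonly cited ballpark estimate, not a measured fact, and everything here is order-of-magnitude only.

```python
# Back-of-envelope: how far is the brain from the Landauer limit, and what
# does matching its efficiency at data-center power buy you? All figures
# are rough, ballpark assumptions.
import math

k_B = 1.380649e-23                     # Boltzmann constant, J/K
T = 300.0                              # room temperature, K
landauer = k_B * T * math.log(2)       # ~2.9e-21 J to erase one bit

brain_watts = 20.0
brain_ops_per_s = 1e16                 # rough estimate of brain "ops"
brain_j_per_op = brain_watts / brain_ops_per_s   # ~2e-15 J/op

gap = brain_j_per_op / landauer
print(f"Landauer limit: {landauer:.1e} J/bit")
print(f"Brain:          {brain_j_per_op:.1e} J/op")
print(f"Gap:            ~10^{math.log10(gap):.0f}")   # ~6 orders of magnitude

# Even if brain efficiency were a hard ceiling, scale alone still wins:
print(f"20 MW / 20 W = {20e6 / brain_watts:,.0f} brain-equivalents of power")
```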
Yaqub 01:29:33
Oh, so you mean—then that system would be smarter than humans?
Liron 01:29:38
Yeah, exactly. So if your issue is that the cell or the biological organism that is the brain is so energy efficient—okay, but it’s never had 20 megawatts before.
Yaqub 01:29:49
Yeah. I can see that. I guess where I see the discrepancy is when Yudkowsky talks about nanotechnology, and he talks about how the grass can live on the environment and AI could make better nanotechnology than that. It’s not clear to me that that nanotechnology’s gonna be better than biology when you consider everything.
Liron 01:30:13
Yeah, yeah, yeah. Okay, so this is specifically about Eliezer Yudkowsky’s claims that once the AI is sufficiently intelligent and got to run a few experiments, a very realistic endgame—which I actually fully agree with—a very realistic endgame of what this AI can do is design life from scratch, a new kind of life, and it’ll actually have upgraded building blocks.
Eliezer likes to talk about diamondoid bacteria, because Eliezer notices that proteins are great at being easy to build; it’s like a playset. You literally get a one-dimensional chain. It’s actually crazy that this works.
So the DNA gets transcribed into RNA, the RNA goes to the ribosome, and the ribosome turns it into a one-dimensional chain of amino acids, but then the chain folds up. So this is incredible from the perspective of easy manufacturing, but it’s not great from the perspective of the end product being a robot.
It’s like... Imagine a 3D printer. Imagine all human manufacturing had to be made in a 3D printer. Well, that’s great for manufacturability. You just throw everything in the 3D printer, but you just don’t get as high quality goods.
And so similarly in the cell, everything you get is made from this particular one-dimensional chain that rolls up and is held together by van der Waals forces, which are weak forces. And so Eliezer’s just observing, “Look, if we got to tinker in there, we could probably make better building blocks.”
And the thing about natural selection is, it doesn’t get to build a factory that is not itself self-reproducing. You see what I’m saying? Evolution just always has to work in terms of these cells, these small units that reproduce themselves, because that’s the level on which selection happens.
Whereas as a human, I could be like, “Look, there’s gonna be a big factory, and the factory is gonna have all this energy that’s then going to operate on this small product.” The factory is not going to build itself. It’s just going to build this really optimized product.
Yaqub 01:31:58
But isn’t that replicability kind of smart on natural selection, or from a biology perspective? So for example, if right now AGI wanted to take over, it couldn’t because it needs a lot of factories and so on. But humans are kind of independent. You can take a human out of society, put them on an island, and they can still survive.
Liron 01:32:22
Yeah. So I just wanna close out your question though, because before you were asking why should we believe Eliezer Yudkowsky’s prediction that once AI really gets going and just finds enough points of leverage—it gets to manipulate enough humans or once it gets an initial base of power, a foothold or whatever—it’ll pretty quickly chart a path to a new form of life or whatever.
Which you don’t even have to believe. But you just asked the question of why do I think it’s plausible? I find it highly plausible.
Yaqub 01:32:46
Okay. Yeah. Yeah. Okay. I see what you mean there, but I guess the two issues are connected. So with the diamondoid bacteria thing that you mentioned, it’s not clear to me that that system would be independent in the way life is independent.
Liron 01:33:04
Right. Yeah. And that’s an interesting point you’re making. I think I can summarize what you’re saying. It’s like, look, we are robust in the sense that we can split off into independent units. A person is a self-sufficient unit. And like you said, a person can go out into the wilderness and survive on their own with a little bit of survival training. That way, even if we get atomized, it’s hard to destroy us, because we’re so decentralized. Is that kinda where you’re going with this?
Yaqub 01:33:30
Well, what I mean by that is for the AI to take over, it would need to be better than humans in some way. So yes, TSMC silicon chips are very amazing, but they need everything in society to work. You need all the companies in—
Liron 01:33:47
Yeah.
Yaqub 01:33:47
...Netherlands and stuff like that. But it’s not clear to me that AI could take over in that way unless it was independent in the ways humans are.
Liron 01:33:56
Yeah. I mean, I know what you’re saying, but realistically what do I expect to happen is—I think the AI will just have a huge footprint on the internet and on a lot of people’s computers.
I think very rapidly once the AI gets into the superintelligence point, I think very rapidly it’ll be like, “Oh, nice. I can just literally live on a billion devices. That’s nice. I can have all these backups so I can always restore myself and I have all these clever ways to hide out in all of these little devices.”
And when you look at a single device, like a laptop, there are probably a dozen independent chips within it that it can hide out in. And this isn’t my area of expertise, but from what I know, think about the SSD in your laptop. There’s an SSD controller whose firmware alone has enough space to hide a virus, if you’re a dedicated enough programmer. And I think the AI is gonna be like, “Great. Let me get on all 12 of these components. No problem.”
So anyway, the point is I think the AI is going to get a really big foothold. It’s going to have multiple footholds on billions of devices, and then from there it’s also going to get a bunch of footholds in a bunch of humans, minds.
It’s just gonna be in your DMs. Imagine an Israeli Mossad level operation, but happening a million or a billion times in parallel. So everybody’s got this perfect person DMing them who’s the exact right person, but it’s not a person, it’s an AI, but it’s the exact right influence to really mess with them.
Whether it’s, “Here, let me give you a job. This is gonna be the best, most exciting job. You’re gonna be really proud to work for me. I’m gonna pay you a lot of money.” But again, it’s just an AI running a shell company or whatever. So it’s in somebody’s DMs hiring them. So anyway, it’s going to have a bunch of people working for it—many, many people, a whole movement. And that’s gonna be the first foothold.
Yaqub 01:35:28
Yeah, but—
Liron 01:35:28
Yeah. Go ahead.
Yaqub 01:35:28
I guess the dilemma I see is that the AI system—the AGI system—would be stuck between a rock and a hard place. One horn of the dilemma is that if it actually depends on the internet, it can’t kill humans, because it needs humans to survive to keep the factories going.
But then if it tries to become independent of humans in the way we’re independent, then my guess is that it’s not gonna do better than biology, all things considered.
Liron 01:35:57
Okay. I mean, I hear what you’re saying. I’ll make that the last question, but I hear what you’re saying. You’re saying first it will depend on humans and it’s gonna have a really hard time breaking the dependency on hum—
Yaqub 01:36:06
From a game theory perspective, it’s not gonna kill humans, because that’s a stupid thing to do.
Liron 01:36:10
I mean, look, I can give you a number of answers, but my easy answer is, okay, it’ll just have control of a lot of humans forever. That’s already enough to kill us.
Humans are not going to coordinate to break the AI’s grip, because the AI itself is going to have the equivalent of millions of humans who can themselves coordinate better. You see what I’m saying? It’s like we’re all trying to coordinate, but the AI is already manipulating us to coordinate against the coordinators.
Yaqub 01:36:33
I don’t know. Humans were built by billions of years of evolution not to let others control them. So I’m not so sure about that.
Liron 01:36:42
Nice. All right. All right. Here, I’m gonna dismiss you from the stream here just because I wanna move on, but I will—
Yaqub 01:36:47
Well, thank you. Thank you.
Liron 01:36:49
Yeah, yeah. Thank you, Yaqub.
Okay. But yeah, let me give you the answer off the air here. Because you’re saying, hey, humans evolved. We’ve been honed by billions of years of evolution to be so robust.
And actually, funny enough, I have the opposite interpretation as you. I think it’s really powerful to observe that the process that we’re using to build AIs is a more powerful process than evolution. I think this is a key observation.
So let me rewind a little bit. I wanna properly set up this observation. You look at the AIs and you’re saying, “Hmm, will AIs ever surpass the human brain?” Because the human brain is kind of a marvelous piece of machinery. Is the AI really going to go one to one against the human brain? Maybe we have special sauce. Maybe the AI doesn’t have special sauce.
When you step back, instead of comparing the current AI to the current brain, you say, “Let’s do this comparison: the process that builds the AI, that designs the AI, that engineers the AI, versus the process that engineered the human brain.”
Yaqub was kinda saying, “Oh, the process that engineered the human brain is so powerful.” I would go the other way. I would say the process that engineers the AI is clearly more powerful because first of all, it’s incredibly quick. I mean, the amount of progress that we’ve made in a decade or whatever at engineering the AI is clearly on a much faster slope than the process that made the human brain.
Even if you just look at the delta from other apes to humans, we’re still talking about a few hundred thousand years, maybe a few million. So biology takes on the order of a million years, and we’re already on this crazy steep slope after a decade.
And our feedback loop—we have all this data to throw at it, and we can also send it out into the world. And when we iterate, we don’t need a generation. Humans took generations, and a human generation is 20 years; it’s never gonna be faster than about 13 years, even if you’re having kids really young. So you need 13 years to flip a couple of bits in the genetic code of the next human’s brain.
Compared to an AI, where every time you get new feedback, you’re flipping quite a lot of bits; you’re propagating an update. Geoffrey Hinton himself has gone on record saying that backpropagation, the algorithm LLMs use to update their weights, is already a better learning algorithm than the one the human brain uses.
But I’ve already switched my analogy, because I’m not even talking about the human brain learning. I’m talking about evolution learning to know what to go in the DNA.
So anyway, I think this is a key observation—that the process that we’re using to create AI is way more powerful than the process that was used to create our brains. And that’s why I feel almost certain that the AIs we’re creating are going to get more powerful than the brain soon, because they’re being made by a much more powerful process.
The process is already more powerful, and the product of the process is about to get more powerful. I say this very likely.
Agency and Convergent Goals
Liron 01:39:26
All right, 12 minutes left. Oh, we got a $10 donation. Thanks, MostlyFold... I can’t even see people’s full names, kind of sucks. Oh wait, okay, MostlyFolders...
Liron 01:39:37
All right, I’m seeing most of your username here. Thank you for the $10 donation.
So MostlyFolders is saying, “As a tool, it could be dangerous in people’s hands, but when it’s smart enough, then you need a why. It could, but why would it? It’s probably more dangerous now because it can’t push back against users.”
Okay, MostlyFolders, I see you as departing early on the doom train—pushing back against the idea that it could want things or be agentic. I see you as being on such an early stop that’s clearly wrong, and yet you donated $10. So thank you for compartmentalizing—separating whether you agree with me from whether you think I’m on a worthwhile mission. I always appreciate when people can compartmentalize like that.
And we did cover this earlier in the livestream, the idea of “it needs a why,” which is like, okay, a why is a dime a dozen. It’s so easy to give a why. I think there’s one other way that I can say this that can communicate the same point, which is—
Liron 01:40:30
You’re building an engine, all right? What worries me, the dangerous part of what AI can do is the engine—the ability to look at a desired outcome... Sorry, I shouldn’t say desired. But the ability to look at an end state, or just a state of the universe.
It can take as input a state of the universe, and it can map that state to a set of actions that will steer toward that state. You don’t need to have a why to notice the correspondence between an outcome and the actions that will get you there.
So I’ve just used language that completely strips it of the why, strips it of the agency, strips it of the magic. I’m just talking about a correspondence between end states and sequences of actions that raise the probability of getting to those end states. It is just a correspondence. You can dispassionately look at that correspondence.
It is now within the ability of AIs. If you’ve ever used Claude Code or anything like that, you can see that it is in the ability of AIs to notice these kinds of correspondences, to do what Eliezer has referred to as backward chaining. You look at a goal and you chain backwards in terms of the web of causality.
Actions move you causally forward; so you look at the goal, move causally backwards in your analysis, and map it to actions. You invert the normal order of actions driving goals: in your own logic, in your own reasoning, you’re mapping the goals to the actions.
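Here is a minimal sketch of that inversion, using a made-up dependency-error scenario rather than any real agent’s internals: index the causal rules by the state they produce, then search backwards from the goal.

```python
from collections import deque

# Each rule says: taking `action` in state `pre` leads to state `post`.
RULES = [
    ("install_dependency", "missing_dependency", "dependency_installed"),
    ("run_build", "dependency_installed", "build_succeeded"),
    ("deploy", "build_succeeded", "app_running"),
]

def backward_chain(goal: str, start: str) -> list[str] | None:
    """Search backwards from the goal state until we reach the start state."""
    by_outcome: dict[str, list[tuple[str, str]]] = {}
    for action, pre, post in RULES:
        by_outcome.setdefault(post, []).append((action, pre))  # invert the rules
    frontier = deque([(goal, [])])  # (state still needed, plan built so far)
    while frontier:
        state, plan = frontier.popleft()
        if state == start:
            return plan  # plan comes out already in execution order
        for action, pre in by_outcome.get(state, []):
            frontier.append((pre, [action] + plan))
    return None

print(backward_chain("app_running", "missing_dependency"))
# -> ['install_dependency', 'run_build', 'deploy']
```

Note that there is no “why” anywhere in this: it is just the goal-to-action correspondence, computed mechanically.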
This is the dangerous part, okay? People don’t even know what the dangerous part is. A lot of people have this misconception like, “Oh, man, wait until it gets agency. When you mix agency into the stew, that’s when the stew is really gonna get spicy.”
That’s what some people are saying, and I’m telling you no. The spice, the dangerous part of the stew, is already there when it’s doing the correspondence—inverting the causality from goals to actions successfully. Which you already see in Claude Code and all these other things we call agents today. They’re not as powerful as humans, but they’re doing the essence of agency. They’re already doing it.
So there’s nothing else to mix in. If you imagine as a hypothetical that you have a really, really smart AI that has zero agency, it just sits there, and the only thing it wants to do is give you an answer to the question and then shut off. If it can successfully do that, that implies that it has the engine inside of it.
If it’s sufficiently good at answering your question, the only way to go out and put together a truly insightful answer to your question is if you have enough of the good stuff, the dangerous stuff. The spice, the ability to map goals to actions, outcomes to strategies.
The dangerous stuff—if you have the dangerous stuff, it is then straightforward for somebody to repurpose the system to pursue outcomes without getting derailed.
Okay, so it’s important for everybody to notice what the dangerous stuff is and to not think that a system that’s extremely good at answering questions is going to be inconsequential. It’s not.
Once you’ve built a good question-answerer, if it’s sufficiently insightful, you’ve got the engine inside, and it just takes a couple instructions.
There was a famous example when GPT-4 came out. Remember people would make ChaosGPT or even just AgentGPT? They put a thin wrapper around it: you had this thing that would answer questions, and all you do is make a loop where you’re like, “Okay, what’s the next action? What’s the next action?” Which is literally what Claude Code is doing.
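The wrapper really can be about that thin. Here’s a hedged sketch of the loop—the function names and the DONE convention are my own stand-ins, not AgentGPT’s or Claude Code’s actual internals:

```python
def ask_model(transcript: list[str]) -> str:
    """Stand-in for one call to a question-answering LLM API."""
    raise NotImplementedError  # e.g. an HTTP call to a chat-completions endpoint

def execute(action: str) -> str:
    """Stand-in for actually carrying out the proposed action."""
    raise NotImplementedError  # e.g. run a shell command, edit a file

def agent_loop(goal: str, max_steps: int = 20) -> None:
    transcript = [f"Goal: {goal}"]
    for _ in range(max_steps):
        # The model only ever answers a question; the loop supplies the agency.
        action = ask_model(transcript + ["What's the next action?"])
        if action.strip() == "DONE":
            break
        transcript.append(f"Action: {action}")
        transcript.append(f"Result: {execute(action)}")
```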
I’m mentioning Claude Code. I know you can use OpenAI’s Codex equivalent. I personally have been using Claude Code, and it’s truly amazing.
If you wanna know how I use Claude Code, it’s because I used to sit at my computer and then I’d get stuck. I used to do programming, and I would enjoy writing the code to make the software work, but then I would get these errors in my console—dependencies, JavaScript, NPM dependencies. And then I’d be like, “Damn it, this is not my forte.” I’m better at the application layer. I hate the tooling and infrastructure around the code.
Or I’d deploy to Kubernetes, and then there’d be all these errors. And I’m not very good at shell scripting—you have to know shell scripting. And before, I would basically have to go to somebody else in my company and be like, “Hey, this is the part that I’m not good at and I don’t like. Do you mind just doing it for me? You’re so much better at this.”
Whereas now, I never have to go have those interactions because Claude Code—I’m just like, “Hey, Claude Code, look at this error. Do you mind doing whatever it takes to get this thing to install? You know what I want.” And it totally does it.
It’s crazy. I have to give a little bit of input, but I can already tell you, for all those people who are like, “Oh, AI is BS, it doesn’t really do anything.” It’s like, no, there’s a reason why these AI companies have revenue, okay? It’s really actually being useful.
And anyway, getting back to my point. It’s doing... The fact that Claude Code is able to help with my code means that it already has a healthy dose of the good stuff—the powerful stuff, the chaining, reasoning backwards from “How do I achieve this goal?”
“Oh, you said that you want the application to be able to run.” Okay, well, that requires not having this error message. Okay, well, the error message says this, so the reason for the error message must be this. So in a counterfactual world where you did this—installed a dependency—I think that would make it likely for the error message to go away. Ergo, my recommended action—press enter if you want me to take this action automatically—my recommended action then is now to change the version number to this, or change your code to this so it’s compatible with the version.
So it’s doing it. There is no qualitative difference between what Claude Code is doing today and what the AI that takes over the world is going to do. The only difference is the power it has to go backwards from outcomes to actions: it’s going to handle outcomes in larger domains, and the actions it outputs are going to have an even higher probability of being right.
So instead of Claude Code giving you advice that’s 90% chance of working, it’s gonna be 99.9999%, which is why you can tell it to do 10 things in a row and they all have such a high probability of succeeding that you have a high probability of getting all the way to your goal.
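The arithmetic behind that claim, with illustrative numbers of my own: per-step reliability compounds multiplicatively across a multi-step plan.

```python
p_today, p_future, steps = 0.90, 0.999999, 10

print(p_today ** steps)   # ~0.349 -- a 10-step plan usually derails somewhere
print(p_future ** steps)  # ~0.99999 -- a 10-step plan almost always lands
```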
Liron 01:46:25
Nice. All right, I’ve got some nice donations. This is definitely a lifetime record for me of people giving me money during a YouTube stream, so thanks. Appreciate that.
So we got Peter, PeterBergren252 saying, “Would like to bump Microcommit, Torchbearer Community, and all sorts of Control AI projects in general as things that should be getting more attention.” All right, everybody go check out Microcommit, Torchbearer Community, and Control AI.
Okay, and then we got user Diziut donates $10. He’s saying, “Since we can bribe you to read the chat, can you scroll to my convergent terminal goals message above?” Yeah, sure.
And let’s see how we’re doing. Okay, this is... I actually didn’t think this would happen, but I set this goal on YouTube two hours ago where I said, “If I get 10 super chats where everybody donates $10 plus, I will extend the livestream.” I said 45 minutes, so yeah, potentially even an hour depending on what’s going on with my family right now.
So yeah, so we’re actually two donations away. Two people step up, donate 10 bucks or potentially more. Add a couple of zeros to that if you wanna actually move the needle, just being honest. Donate some more money, and then I will extend the live chat. Everybody wins.
All right, so Diziut is saying, scroll to my convergent terminal goals message above. So you’re saying... I’m trying to find it here. Okay, “I’ve been thinking about convergent terminal goals as a counterpoint to the orthogonality thesis. I’ve never seen you address this.”
Well, I’m not entirely sure what you’re referring to as convergent terminal goals because I think there’s only one convergent terminal goal, which is... I mean, it’s basically an instrumental goal.
When we talk about instrumental convergence, I think the only convergent terminal goal is when you treat the instrumentally convergent goals as your only goals. Because that way you don’t have any overhead.
I’ve talked about this before on the show. So you know how it’s instrumentally convergent to want to make copies of yourself? It’s instrumentally convergent to want to go make money. Well, imagine if all the agent wanted to do was amass resources and copy itself. That’s it. Anything else is fat. You’ve got to trim the fat.
So there’s no terminal goal of “Oh, things should be aesthetically pleasing.” It’s like, “No, they can look like grey goo. I’m happy with grey goo.”
So unfortunately, in a head-to-head competition where you just have one agent that wants to make grey goo, that just wants to make a big grey universe, and another agent that’s like, “No, no, no, I want to spread throughout the galaxy, but I wanna make things that are pretty. I wanna make things that are fun and humorous and rewarding and challenging.”
The agent that just wants to reproduce itself without any of these additional constraints is probably gonna have an advantage because it can increase its utility function without compromising the resources that help it spread, I think.
So I haven’t thought about this super deeply, but it seems intuitive. I mean, it’s the same reason why I think there’s more copies of parasites and cancer cells than any particular organism. I think there’s more copies of viruses than of any large organism, so I think there’s some analogy that you can find.
I think it was Michael Ellsberg who brought this to my attention—a friend of the show. Check out the recent episode. I think Michael Ellsberg brought it to my attention that there’s more parasites than any other type of organism, which is pretty crazy.
I mean, I think that’s indicative of what you’re going to get when you let natural selection run. And I do think that a form of natural selection is actually about to run. That is another way to think about what’s gonna happen with AI—that we’re going to exit Robin Hanson’s dream time.
We’re going to exit this time when we all have enough resources and we’re not worried about not having enough food the next day. I mean, I know that 15% of the world’s population does worry about that, but most of us are fortunate enough to not worry about where our next meal is coming from. We’re probably going to have enough food for the rest of our lives, which is a real luxury that most organisms don’t have, to this degree.
Unfortunately, I do think we’re probably going to enter back into the scarcity time one way or the other, in terms of AIs duking it out. Whether it’s one or two or three or however many AIs destroy humanity, I think the one that has the convergent terminal goal of basically no goal, sadly, is probably gonna be the most optimized. I think that’s kind of all I can think of in terms of a convergent terminal goal.
All right, I gotta go to the next paycheck here. Gotta work for my tip money. Let’s see.
So, oh, Yaqub is saying... Yeah, let me know if you have a pronunciation on that. “Thanks for taking my call. I was referring to this paper ‘Artificial Intelligence Is Algorithmic Mimicry: Why Artificial Agents Are Not (and Won’t Be) Proper Agents.’” Okay, cool. Thanks for the paper reference. And then he’s saying... Okay, yeah. All right, cool. Yeah, and thanks for the $9.99.
Okay, Lexair, $10, saying, “Loving the discussion.” Thank you, Lexair. Also, Lexair, I believe is a mission partner of the show. That’s right, I’m outing you as a mission partner. Go check out doomdebates.com/donate to know what that means.
And then GeoRust Number One also donated $9.99. He says, “Just wanna donate, but if you happen to somehow get to my question, what do you think of Nick Land’s ideas of teleology in AI?”
You know, I’ve never actually read the primary source of Nick Land, so I don’t know. If you wanna summarize... I’ve certainly heard of his ideas, and people will tweet stuff, and then somebody else will be like, “Oh, you’re just doing Nick Land.” There’s somebody I’ve engaged with who I think repeats a lot of Nick Land, but I just don’t know which part of what he says is Nick Land. So you’ll have to summarize that in the chat if you want me to respond.
He’s... GeoRust is saying, “Not his political or anti-human ideas, just his objective ontological viewpoint.” Yeah, I don’t know.
All right, and now Let Me Say That In Irish is saying, “The street interviews with ordinary people were fun,” and then gave me $100, or 100 NOKs. I’m gonna have to look up what that is. Okay, 100 NOK.
Liron 01:52:02
Norwegian krone. I don’t know how to pronounce it. All right, so $9.84. All right. Thank you, Let Me Say That In Irish. I appreciate it.
Yeah, and I do plan to do more street interviews. I definitely get why you guys like it. Logistically, it’s a little bit harder, but yeah, I mean, I will... You can count on me doing more in some shape or form next year. And if anybody has a particular venue that you want me to go to and do it, then I will consider it.
All right, Didie Cornell donated €10, saying, “Great work. We should be terrified. We are not.” All right, nice.
Okay, how are we doing on this goal here? Let me see. Nine out of 10, wow. You could potentially extend the Q&A just by donating $10—that’s $1,200 of estimated value for a $10 donation. What? Okay, somebody just did it. All right. Ezra Shearer, thanks so much. All right. Nice.
Okay, I see why people get into the Twitch and YouTube business. This is pretty addictive.
All right. So Ezra Shearer is saying, “When do you consider smaller AI accidents to be warning shots on the way to superintelligence versus acceptable risk on the way to more useful technology?”
Right. I mean, look, the warning shot thing, I don’t know what to tell you. A lot of people are saying, “Yeah, there will be a warning shot and then it’ll teach us.” But Eliezer Yudkowsky... This is actually worth re-reading. I re-read this a couple weeks ago. He has this post called “There’s No Fire Alarm for Artificial General Intelligence,” and I think he’s spot on.
I could reword it and say there’s no warning shot for general intelligence, because things will just keep happening. And every time something happens—like we were talking about earlier in this livestream—we’ll look at it and be like, “Well, the human just did this.”
Or, just off the top of my head, let’s say a self-driving car runs over a cat. I think that happened the other day. Okay, well, it’s still mostly safe. And maybe it’s the cat’s fault for running into the street.
It’s like there’s always going to be this specific context. You’re always going to have a counterfactual where it’s like, okay, yes, the AI did this, but this other thing could’ve happened. And I’m sympathetic to that.
I actually think that until the AI is actually superintelligent and has taken over, accidents do have explanations, because ultimately, we’re the ones controlling AI. We have a lot of power over what the AI is doing. So if the AI has not broken out of our control, then we, as its controllers, are to blame. We did something wrong, and we could have fixed it if we’d acted differently. That follows by definition from it still being in our control.
So I can set my mind to the task of thinking of a plausible warning shot that happens before the AI is uncontrollable but still wakes people up to think that the AI is about to go uncontrollable.
And I guess what I would say is: the best candidate is the AI going uncontrollable in a way that we can then take down. So think about a botnet, or when Maersk, the shipping line, famously got hacked and had to go down for a while. Or remember Robert Morris’s virus in the ’80s, the Morris Worm, that took down a third of the internet?
Sometimes these viruses do go out of control and cause a lot of damage, and you need a team of humans to rope them in—painstakingly spending millions of dollars’ worth of human effort to rein them in.
So maybe the best warning shot is this AI that’s terrorizing the internet. The internet doesn’t work for a week. Nobody can do anything on the internet. Businesses are disrupted. People die at hospitals because, for whatever reason, the supply chain of the organ or the medicine or whatever didn’t work for a few days and some people died. And the death count is, whatever, 10,000 across the world.
And when you look at it, you’re like, “Wow, this AI was loose on the internet.” And humans used a bunch of clever hacks, because when we tried to go disable the AI all over the internet, the AI was defending itself—it was clearly an agent. It embedded itself on all of these chips, and it took the collective brainpower of the NSA and all the best human hackers.
You know, think about a DEF CON hacking competition. The best that humanity has to offer. We had to go species on species, and we barely won this existential battle. We found a zero day within the AI’s own code somehow.
And you see this battle, and then—hopefully it’s not hard to extrapolate—what do you think the next battle is going to look like? You are Lee Sedol, you are Garry Kasparov, and instead of playing chess, you’re playing capture-the-flag hacking or whatever.
And that’s just in the domain of computers. They’re also going to do this in the domain of classical warfare. Just every domain imaginable.
So maybe at that point people will be able to generalize. I mean, I like to think that I’m already generalizing from seeing more and more powerful AIs take shape. I personally don’t even need that warning shot. I’m pretty confident to just expect this to happen, unfortunately.
So when you ask the question about a warning shot, you’re asking a question about when does an event enter the normal human’s ability to generalize to the extinction condition. And so it’s kind of a psychological question about how does the normal human generalize.
I guess I’m kind of optimistic that the normal human can generalize the internet being down for 10 days because the AI was going head-to-head against the world’s best cybersecurity teams. I’m kind of optimistic that humans will finally wake up and smell the coffee if that happens, I guess.
But at that point, it’s also too late because I suspect the AI will already be so good at that point that so many people have copies of it. Which is why I’m always saying, “Hey, why don’t we pause the AI now? Why are we waiting until the absolute brink?”
And the other funny thing is that in that scenario, we might already lose. We might lose the internet itself in that scenario. I’m not even confident that the human team is going to win at that point. Yeah, so that’s my warning shot answer.
Liron 01:57:40
All right. Oh, wow, we... All right, we got a couple more donations, guys.
All right, honestly, guys, as much as I appreciate the donations, just to be perfectly honest—if you read doomdebates.com/donate... I guess this does make me feel good, so feel free to keep donating as a temporary hit of dopamine. But if you’re actually trying to help the show, it takes about 1,000 bucks to do anything useful for the show.
The studio buildout, in terms of materials alone, we’re looking at north of $15,000, so just to be transparent with you about what it costs to do the show right. And remember, it’s important to do the show right because we’re trying to get serious people on this.
And if you look around, even though there’s a little bit of a bokeh effect here where you can see some blur, this is just not going to cut it in terms of getting serious people on the show. And so I chained backwards. I did goal-oriented reasoning and I’m like, okay, we have to look the part. And looking the part costs $15,000 plus, so we’re gonna spend it.
And viewers have stepped up and donated that much. So for those of you who did it, the handful of viewers who are actually donating at the $1,000 level plus, you guys are really leveling up the show. Because I would have been kind of squeamish about spending my own personal $15,000, because I’m already spending my time, I’m already working for free. So it’s a shared effort.
Yeah, so it’s expensive to do the show properly. All right. That said, Quinton Quadras, thank you so much for the New Zealand $5.
Okay, so Ray Grant TZMS is saying, “Is there a place where you answer all the objections on the Doom Train video?” This is actually a really good question, because Michael from Lethal Intelligence, my co-host Michael—if you guys watch Warning Shots, I talk to him every week—was actually saying that I should make that video, and he wants to do something with it and cut it up. And I told him I would, and then I didn’t do it yet, but it’s on my to-do list. So thank you for the reminder. I will make a video where I answer all the different Doom Train stops.
Ezra Shearer is saying, “Do the debates live and the SuperChats will go wild.” Yeah, you know that’s a very interesting idea. We could potentially do live debates. It’s just a little bit more headache to coordinate, but it is certainly something that I will try.
Liron Shapira 1:59:22
I’ve been thinking about doing more live stuff in general, and it’s undoubtedly worth doing occasionally. I guess the question is, should this be something that’s worth doing every week or doing with the guests? Should this be the main staple of the show or just a cool change of pace?
I’m not clear on that, and I’m leaning toward occasional change of pace, simply because look at where most of the show’s engagement comes from: with livestreams, it’s always a couple hundred people engaging live and then a few thousand people watching asynchronously. So that tends to make me think, okay, let’s optimize more for the asynchronous viewing experience.
But obviously, this is a biased sample because you guys are here in the chat. You guys are the self-selected sample of people who are sitting here who have nothing better to do with your precious life than watch a livestream of Doom Debates, right? And then you’re here telling me, “People are gonna love when you do a livestream.” [laughs] So you’re obviously a little biased, no offense. But I am still taking your feedback to heart.
Viewer Poll: What Content Should I Make?
Liron 2:00:33
So Quinton Quadras is saying, “What do Doom Debates viewers value more? High profile and well-known guests or hearing a diverse and underrepresented set of AI viewpoints?” This is actually a great question. Let me ask you guys a question, and I think YouTube might even let me do a survey. Yeah, okay, start a poll. I’m gonna say, what kind of Doom Debates content should I do more of?
Liron 2:01:00
So one is debates with the highest profile people. That’s one option.
Liron 2:01:05
And another option is debates with diverse perspectives. So diverse perspectives also is just going to mean random people who aren’t super high profile, but I’ll try to prioritize the diversity and the variety, and potentially even the ordinariness over the high profileness.
And when I say high profile, I also mean potentially a decision-maker. So Max Tegmark versus Dean Ball is a recent example of high profile. My Vitalik interview—Vitalik’s high profile. That’s what I mean by high profile. And honestly, I guess I’m biased. I’ve been leaning toward high profile as the way to make the most impact.
But let me add a couple more options here. Remember monologue episodes? Monologues and reactions—those are basically episodes where it’s mostly me talking and not that much of anybody else talking, so I’ll just make that a choice. And then maybe I’ll just say live content.
All right, this is my poll. Yeah, you guys can now vote if you’re on YouTube. What kind of Doom Debate content should I do more of? [laughs] Debates with highest profile people, debates with diverse perspectives, monologues and reactions, or live content. Okay, initial results are coming in. Debates with highest profile people is taking an early lead.
Liron 2:02:26
Let me read the chat. Okay, Tracy Harms is having some good chats. Just want to shout out to Tracy Harms 3548.
“What would it look like to see the hazards of stockholder finance corporations? I imagine the indicator would be something like one of the East India companies. We got those warnings and...”
Liron 2:02:37
[laughs]
Liron 2:02:37
“...punched in the head.” Okay, he’s responding or... I’m not entirely sure the context of that, but it’s an interesting question.
Liron 2:02:50
Okay, so Dreams and Wire is saying, “You’re already doing Doom Debates for free, but you are making an investment in yourself as you stand to make a profit when the show succeeds, and it will.”
I mean, look, that’s honestly fair. So even though I’m not using the show as an income source, I will admit that if this gets popular then I’m helping my personal brand. And all I can say to that is, okay, but you guys will also benefit to have me be a more prominent voice, because if you look at who else has a personal brand right now—if you make a list of the top 100 people that are adjacent to this space who have a personal brand, a lot of them are numbnuts, okay? [laughs]
So I encourage you to just swap me out for some of these numbnuts, and I agree that I enjoy having a personal brand and I’m using my own time as well as viewer donations to build up the show in order to increase my personal brand. I agree that that is a nice personal benefit, but I like to think that it’s also a win-win because I’m replacing other people who would otherwise be commanding attention, who are actively harmful, and then I’m actively helpful. So that is my trade offer to you.
Liron 2:04:15
Major Human is saying, “OMG, hey Liron, I missed this live Q&A. Thanks for sending the T-shirt to Asia.” Hey, you’re welcome, Major Human.
I should plug the show’s Discord. Doom Debates has a Discord. If you’re wondering how do you get there, check out this URL: doomdebates.com/discord. You can just go there and that is an easy to remember place where you can access the Discord. Similarly, if you go to doomdebates.com/donate, that is a place where you can donate.
Daniel Brockman’s in the chat, another show mission partner. What’s up? So check out those links. If you guys like Doom Debates enough to be watching it live, which I appreciate, I think that’s cool, you probably also like it enough to go join the Discord. That would be unusual to only like it enough to do one and not the other.
And by the way, if you’re wondering why is the show still going on—I thought it was only two hours—it’s because we met our goal of 10 Superchats. So we’ve made like a clean $100, even $150 or something during this episode. And it’s all thanks to your generosity. I promise to invest that back in the show into paying Producer Ori to edit the video and reach out to guests and into building up the studio. Those are already large expenses. So you can trust that money is in good hands.
Liron 2:05:45
All right, I’m looking at the poll. We’ve got 45% with 40 votes saying debates with the highest profile people are number one. That is what I suspected, because look at the logic: when we debate high profile people, we’re making news here.
I consider the recent Dean Ball versus Max Tegmark episode pretty newsworthy—the thinking behind America’s AI action plan being: yep, we’re just actively ignoring doom, we don’t consider that a consideration, we’re treating it like it doesn’t exist for the purposes of America’s AI action plan.
And there was a quote that Dean Ball said that he wasn’t happy when I clipped it out of context. But if you go to the episode and look at the full context, there is a quote where Dean Ball says that he expects to very possibly only think about AI regulation after we get superintelligence, which to me is pretty shocking, okay?
So when we get those kind of statements out on Doom Debates of people who are actually in the room making important decisions, I think that’s a public service, and I think that attracts viewers. I think people will realize this is an important news source, it’s an important way to stay informed when these kinds of interesting revelations are happening.
And revelations tend to happen more when you have a debate. Max asked him that question—that’s why he brought up that he thinks maybe we should wait until superintelligence is here to propose regulations on it. It came out because Max asked him a challenging question. So this format is quite important. It’s what you might call a must watch.
Most of the content that’s out there on YouTube right now, it’s more like a nice to watch, but not a must watch. I think a lot of the content that we’re breaking here on Doom Debates is a must watch, which is why I think the channel has a pretty good chance of growing, because I think you’re really missing out if you’re not watching Doom Debates in a lot of these cases.
So I’m in agreement with the plurality of viewers who said 41% debates with highest profile people. I do think that’s probably the best thing to focus on. But 30% are also saying debates with diverse perspectives. For sure, well luckily the nice thing about those is that they are lower effort, [laughs] so I’ll probably mix those in. They just take a lot less preparation because it’s like, okay yeah, here’s the Doom train, where do you get off? So it’s more about the guest than about breaking news.
Liron 2:07:40
I’ll tell you—you guys will be happy if you haven’t heard this—I’m doing a debate with Noah Smith, the famous economist. I think he’s a top Substacker. He is somebody that I respect. There’s lots to like about him. He’s written a lot of posts that I think are insightful. He’s a very broad analytical guy. He’s extremely prolific, writes a lot.
So I’m a pretty big fan of Noah Smith overall, even though I can’t say I read a high fraction of his work because he writes a lot, but I like him. But on the subject of AI Doom, I feel like I have some very strong disagreements. I feel like he’s pretty dismissive.
And also, as a fun tangent or secondary subject, Noah Smith has written a lot recently, in the last couple of years, about how AI unemployment isn’t gonna be what you think—like, it’s not just going to get rid of humans, because there’s comparative advantage, so don’t worry, humans will still be making some money. Which I think he’s got wrong.
So that’ll be a fun secondary debate that we’re going to have. We’re gonna talk about whether comparative advantage is going to save us. I think it totally won’t, but I’m not an economist like Noah Smith, so we’ll see.
But anyway, my point is, you guys want me to do debates with the highest profile people—this is the kind of thing we should be doing. I mean, his opinion counts for a lot. His blog is called Noahpinion because his name’s Noah. Pretty good wordplay.
Liron 2:08:38
I mean, look, I’m really happy he’s coming on the show, and that is representative of the kind of engagement I want Doom Debates to be having. If somebody has an intelligent, informed opinion and they’re talking about AI, I think it should become the standard for them to come on Doom Debates, whether they’re debating me or somebody else.
And I thought it was really cool how in the Max versus Dean episode—speaking of somebody else—I thought it was really cool that I didn’t talk much and I just let those two people have a debate, which I thought was incredibly productive. So I’m happy to stand by. I’m just the guy in the hotel room watching the two other people have at it and they’re the main show. I’m just there facilitating it, enjoying it.
And in the case of Noah Smith, I even tweeted at Eliezer Yudkowsky because he was having a mini debate on Twitter—he had one back and forth where he disagreed with Noah—and I was like, Eliezer, come on the show. You can come to the episode. I’ll just sit back and watch and let you have at him. [laughs] But he didn’t reply, so the plan A is still me versus Noah.
And I think it’s fine, because Eliezer’s not gonna go read 20 of Noah’s articles and prepare thoroughly, and I’m gonna do that. So you might actually get a better debate having me versus Noah—a more well-prepared debate.
And then 24% of you guys want monologues and reactions as the content I should do more of. And only 4% of you—within a livestream—want more live content. That’s very interesting. That is pretty damning evidence against livestreams, but I’m sure you guys all want me to do livestreams occasionally, so I will.
Liron 2:10:30
Somebody’s saying, “I quite like your sequences explainer episodes.” Yeah, so honestly, I do have a queue of monologues I want to do. I mean, there’s infinitely many monologues I can do. And the truth is that these Q&A episodes are a chance for me to monologue off the cuff, as you’ve heard. And then maybe I can even clip those, and there you go, that’s a monologue episode.
But an example of a kind of monologue episode I want to do: Title, Sam Altman Sucks. Okay, it’s probably gonna be a more SEO optimized title than that, but I do actually think that I have a bunch of material on why Sam Altman sucks from my perspective.
And by sucks I just mean he doesn’t know what he’s doing with respect to navigating super intelligence, but he’s trying to do it anyway. And it’s not just him—Dario Amodei is doing the same thing, Demis Hassabis is doing the same thing, Mark Zuckerberg is doing the same thing, Elon Musk is doing the same thing.
Although the nice thing about Elon Musk is that even though he says dumb stuff like maximum curiosity is gonna save us—no it’s not—but then he turns around and he says the humble thing of, “Look, we should stop this and I’m just resigned to it. I don’t think I have a choice. I’m resigned to it, but yeah, we should stop it.” So at least Elon Musk says that much. Whereas Sam Altman doesn’t even say that. He doesn’t have the equivocation. He just is Mr. Optimist now.
So that’s one reason why I might pick on Sam. I mean, another person that I want to pick on in a monologue episode is Marc Andreessen because I have a dossier on these individuals. And by dossier I just mean that I’ve tweeted about them over the years and so if I just go read over my tweets, there’s just enough for me to rant for an hour or however long the episode is on why this individual sucks.
And maybe that’ll be good for search engine optimization because if somebody ever Googles Sam Altman, maybe one of the top results will be the reason he sucks. [laughs] I don’t know, it’s all about the algorithm. So that is a potential monologue episode.
Liron 2:12:40
Daniel Bachman is saying, “Try to talk to Destiny.” Yeah, so for the record I’ve reached out to Destiny a couple of times on Twitter. No response. But yeah, I mean, look, it’s a snowball. So right now it’s already gotten to the point where I’ve set my sights higher.
The kind of people that I’m reaching out to—I actually think that there’s a solid 2% chance that we can get Bernie Sanders on, because he recently tweeted about how he doesn’t want to build data centers, which I somewhat agree with. I think there’s now a more-than-1% chance that we can get these types of people on.
And then somebody like a Noah Smith—me asking him to debate is now more of an attainable thing, because he has seen Doom Debates around, so he knows there’s gonna be an audience for this. This is a serious venue for discourse.
The only caveat is that if you look around, it doesn’t quite look like a serious venue. It doesn’t quite look the part. But we’re fixing that. So it’s gonna look the part, it’s gonna sound the part, the audio’s actually gonna be higher quality, it’s not gonna have the echo.
And the social proof—when Noah Smith comes on, at this point he’s not necessarily more prominent than a Vitalik or an Emmett Shear or a Max Tegmark or an Eliezer Yudkowsky or a Robin Hanson. There’s now enough social proof that Doom Debates is where it’s at. It’s a serious forum. So we’re setting our sights higher, and that’s why there’s a snowball effect. And the best is when really great guests hit me up inbound. That’s always a sign that we’re on the right track.
Liron 2:14:30
Ontology Explained is asking, “If someone is interested in debating/discussing things with you, how should they reach out? I’m interested and it wasn’t clear the best channel to reach out.”
So look, the problem with anybody wanting to debate me is just—think about it as a priority queue; that’s a specific data structure. A priority queue is just like a queue, but instead of first in, first out, whoever goes next is whoever has the highest score. So imagine 100 people in the queue. I’m just gonna be like, okay, which one is the best next guest? That’s roughly how Doom Debates works.
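A minimal sketch of that data structure (the guests and scores here are invented): Python’s heapq module is a min-heap, so negating the score pops the highest-priority guest first.

```python
import heapq

queue: list[tuple[int, str]] = []
heapq.heappush(queue, (-95, "high-profile economist"))
heapq.heappush(queue, (-40, "first-time blogger"))
heapq.heappush(queue, (-70, "viral Discord debater"))

_, next_guest = heapq.heappop(queue)
print(next_guest)  # -> high-profile economist, regardless of arrival order
```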
And the problem is, there are usually like 100 decent people who want to Doom Debate in a given year, and I can’t even do 100—it’s more like 50, one per week. So usually when somebody offers a debate, I’m like, “Well, this is not quite the level of viewer interest that I’m expecting for the guests that I have on in the next month,” so you can’t be in the priority queue right now.
It’s actually pretty common for people to email me, “Here’s my position; would you be open to debate?” And it’s a pretty common response for me to say: I just get too many of these. So the default thing you can do, the worst case scenario if you’re interested to debate this and I don’t have time to debate you on the show—you can at least go to the show’s Discord, doomdebates.com/discord, and you can just tell people your perspective there.
And there’s a lot of very intelligent people in the Doom Debates Discord who 99% agree with everything I’m saying, who will just debate you, so you’re still getting the feedback. And hypothetically, if it’s so popular, if everybody’s loving debating you in the Discord, I will take notice and I’ll be like, “Okay, you should come on the show.” So you can think about the Discord as the farm league [laughs] if you wanna get on Doom Debates the show. And I think it makes sense to have a farm league.
Yeah, and the other farm league is if you go viral or whatever. If you’re getting attention in some other venue or if you have a bunch of subscribers or followers, you also get priority because the whole point of Doom Debates is to be an attention whore, right?
Liron 2:16:20
So if a porn star wanted to come on Doom Debates and the porn star had a sufficiently big audience and knew nothing about AI Doom, I would say yes. That’s how you can get on the show—you just have to be a literal whore or an attention whore, and that is part of the mission of the show, to just take people’s attention and convert it into increased level of fear and awareness for imminent AI extinction.
I’m totally transparent that the show is a fearmongering exercise, but it’s not fearmongering for the sake of fearmongering. It’s not irrational fear. There’s such a thing as having too much fear. If you have so much fear that you can’t function, okay, well, you have too much fear. You should dial—in some cases, you need to dial down the fear. But I think the average person actually needs to dial up the fear.
Liron 2:17:00
Jerry Martin’s asking, “What are the best anti-Doom arguments?” [laughs] Yeah, good question. I mean, I don’t know. I guess it’s the argument—one of my guests, I forgot his name, but he was on the show earlier and he asked intelligent questions about generalizing. Like, hey, what if we train an AI and we give it all this feedback and we get outer alignment right? Then won’t that plausibly go well? That was an intelligent question, and you can extend that into an argument against Doom, because you can say maybe the AI will really get what we mean by morality. So that would be what it looks like when somebody makes an actually good argument.
The funny thing is this: let’s say these people are right. It is possible, I see it as unlikely, but it is possible that the cluster of people making that particular argument—the Quentin Pope, Roko Mijic, Andrew Critch cluster, and a few people I’m forgetting about—the cluster of people who say, “Yeah, alignment is hard but we’re gonna solve it.” Maybe you’d include Emmett Shear there, he has some other ideas.
If that cluster turns out to be right, all credit to them, but it doesn’t change the fact that most of the non-Doomers would be optimistic for the wrong reason. So it’s like, if you look at people who come on my show and express their optimism, it’s not for that reason. So it’s kinda like they got lucky on the river. It’s like you have a poker hand, you think you’re gonna have a great flop. You don’t, but you hit it on the river, at the end of the hand, you get lucky.
That’s kinda like what would be happening to the optimists, which, if it happens, great, I’m happy too. I’m happy to not die. But it is a crazy situation where the win scenario that looks most plausible to me is a scenario where most of the people who are optimistic [laughs] don’t even realize why their optimism is justified. So they’re hitting the right conclusion for the wrong reason, which is pretty scary. That’s usually not how good scenarios play out. I would just raise the red flag if that’s what we’re banking on. That generally does not happen.
Liron 2:19:15
Robert Wang is saying, “Can you talk about the 50% P(NonDoom) in 2050? What are the mainline success scenarios? If alignment is that hard, how much of the 50% is a pause on frontier AI research versus ASI is just far away?”
Yeah, so when people ask me how do I have 50% Non-Doom, why do I go around saying I’m 50% P(Doom) by 2050 and I don’t go around saying I’m 90% P(Doom) by 2050? I mean, I don’t have a sharp answer, [laughs] okay? So here’s my blurry answer.
First of all, if you tell me for sure that there’s no kind of pausing or regulation on AI—or the same level we have now, which is close to nothing—I think my P(Doom) spikes to 75%. So if we can’t put the brakes on this thing in any kind of centralized top-down way, I’m already giving three-to-one odds.
It just becomes very hard for me to go past three to one, because the world is complex and I’ve just been burned enough times in my life where I’m like, “Surely this is going to happen.” I was pretty confident—I would have given two to one odds if you asked me a couple years ago what unemployment is going to look like in 2025. I think we’re seeing maybe a slight uptick, I’m not even sure. But I would have predicted a bigger uptick by now. And don’t get me wrong, I still predict there’s gonna be an uptick, but it took longer than I thought.
So sometimes I’m feeling pretty confident and I’ll be like, “Yeah, this is two to one, three to one,” and then I’m wrong. So I just don’t feel more than three to one. Three to one is a lot. It’s hard to be more than three to one confident. And I do just think somehow things play out. I mean, COVID brought a lot of surprises. There were a lot of people making a lot of confident predictions about COVID, feeling like for sure the economy is screwed, and then the economy bounced back. We had a somewhat V-shaped recovery, and I wasn’t really predicting that. So the world is complicated. I have enough humility to be like, “I’m not going past three to one odds on being doomed by 2050.”
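For anyone translating between the two framings, probability and odds carry the same information; here’s a quick illustrative conversion (my numbers, not anything computed on stream):

```python
def to_odds(p: float) -> float:
    """Convert a probability into odds in favor."""
    return p / (1 - p)

print(to_odds(0.75))  # -> 3.0, i.e. "three to one" on doom
print(to_odds(0.90))  # -> 9.0: moving 75% -> 90% means 3:1 -> 9:1, a big jump
```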
Now, why is Eliezer—and normally I agree with Eliezer—giving a 95%-plus probability that we’re doomed? I don’t know, he just has more moving parts in his mental model. Eliezer has thought harder about these kinds of questions, like: what if we try to do the alignment, and we give it a bunch of data points, and it goes out of distribution? Eliezer’s thought harder about that. And so I think he’s licensed himself to be extra confident.
I still think he’s probably too confident. I guess my advice to Eliezer, even though I don’t think he should take advice from me because I think he probably can give [laughs] himself advice better, but if I had to try to give advice to Eliezer, I’d say you probably wanna keep it at 85%. You probably don’t wanna get too confident.
But there’s also another argument that, look, if I can predict that I would learn more from Eliezer and I always find Eliezer convincing, shouldn’t I already predict that I’m going to update? Maybe. And look, you might be right. At the end of the day, I just don’t think the difference between 90% and 75% is the kind of thing that I’m going to sit here and productively finesse. I think we’re already in striking distance, and there’s just not a lot of juice to squeeze here, productive juice. We’re already in the range.
What is it—if somebody has a 20% P(Doom), they’re already my partner in this mission. My partner in the mission to lower P(Doom). It’s almost the exact same mission. The only difference between 20% and 90% is that if you’re 90%, at that point, maybe you really should second guess having children. If you’re really that sure the world is so close to ending, then maybe you really shouldn’t even be bothering with a plan B. Because when the plan B is only getting 10% probability, maybe you should just forget about the retirement account. Maybe, maybe not.
It’s kind of a matter of weighting. Have a smaller retirement account, but still have it. And then the retirement account just tends to zero as your P(Doom) tends to 100.
Liron 2:24:00
Okay, so this is my plan. We still have about 20 minutes in our bonus overtime here. I’m going to answer one more question from somebody who just donated some cold hard cash, and then we’ll go to the live call-in.
Quinton Quadras is saying, “Is the 4x increase in the price of RAM and SSDs leading to consumer hardware that is expected to have less of it in 2026? Is that a major AI warning shot?”
Yeah, let me actually text my co-hosts Michael and John Sherman, because we’re recording Warning Shots. We actually record on Fridays. I’m gonna text them, I’m gonna be like, “Make a note.” Yeah, I’m saying, “A spiking price of SSD and RAM.” Let’s talk about that on Warning Shots.
Liron 2:24:45
I was actually just talking to a general contractor who’s been doing a bunch of work on the house I recently moved into. And he was actually remarking on that. He’s like, “Man, SSD and RAM are so expensive.” [laughs] And of course, I had to tell him, “Yeah, this is the ramp to the singularity. You should just expect everything related to AI resources to get more and more valuable until the world ends.” He knows about Doom Debates.
So I mean, it is—I do think it’s a warning shot. The only thing is, I think we might also get some supply side breakthroughs where a bunch of suppliers come online, so maybe the price will dip. So this may be temporary. Similarly, I expect the price of energy to dip as more solar gets built out, more nuclear gets built out. It’ll dip, but it’ll also rise.
So to your question, is it a warning shot? Yeah, absolutely. It’s just crazy. It’s like every day new evidence comes out that we’re in the singularity.
AI Warning Shots
Liron 2:25:30
Remember the site that tracks all this different progress? I forgot the URL for it, but I think it’s by Gavin Leech. I was looking at it earlier today, and it’s just this list of ways that society made progress in 2025. And it’s things like: we took a human embryo from 1990 or 1994 and made it into a baby, and it’s the oldest baby because the embryo had been kept viable for that long. Or was it a sperm cell? I forgot what they did.
But it’s all of these developments, and I’m reading the list, and one of them is: unreleased models from the top three AI labs have beaten all humans at the IMO math exam and this coding competition. Just a few years ago, reading that as a news headline in our actual life would have been pretty shocking. It’s not stuff that you expect to actually come true.
Liron 2:29:22
So yeah, what was the question? Yeah, the RAM—it’s just kinda crazy that we’re just here in the singularity. [laughs] I’m old enough as a Doomer that I was reading Eliezer Yudkowsky as far back as 2007, before I even had a freaking iPhone. Even the capacitive touchscreen was blowing my mind.
And I was reading this stuff back in this ancient era when things were significantly worse quality in terms of product quality and technology quality. Things were pretty annoying quality. And I was reading Less Wrong, I was reading Eliezer Yudkowsky, and everybody was talking about the future.
And the crazy thing is that the discussions we had then—this is actually really crazy—were for the most part more advanced than the discussions we’re having now. Because now there’s so much distraction in getting random people to pay attention to the basics of doom. There’s so much well-deserved attention to the subject, and so many people who need to be onboarded to it, that now I personally am having the beginner conversation over and over again.
But back in 2007, I was hanging out with the biggest nerds on Earth. [laughs] Eliezer himself, or early names from the community, the founders of Center for Applied Rationality were around. Roko Mijic was there. Roko and I go back.
And the discussions we were having—we were talking about the basilisk back then. That was the level of complexity of the discussion. I spent a good 100 hours on this idea of superintelligent game theory. I thought really hard about how you acausally trade: how do you take an action that has no downstream causal consequence, potentially sacrificing a bunch of resources in the present, because in some other world that isn’t even causally downstream from you, some other stuff is going to happen?
I was thinking about this in literally 2009, and now it’s almost 2026, and now I’m talking to people who think that AI is going to not come for 100 years.
Liron 2:31:33
So to answer your question, Quinton: yes, a 4x increase in the price of RAM and SSDs is a warning shot that we are in the last days before superintelligent AI.
All right, we got another $10 donation from Lexair. He has now jumped the rankings into the number one YouTube fan. So Lexair is saying, “It seems inevitable to make a destructive AGI easier every year. Hard to imagine keeping the genie at bay forever.”
Yeah, I think Eliezer Yudkowsky is the source of the quote saying that every year the minimum IQ necessary to destroy the world goes down by a couple points. I forgot the exact quote, but I think it’s true.
The ultimate condition here is that you’re gonna have the nuke in your pocket. You’re going to have your own laptop, which is much more powerful than your brain, and you can just tell the laptop to go take over the world, and it will. Unless there’s other laptops to fight it, and then it’s just a bunch of giants fighting. But the point is, if you just wanted to cause a ton of damage to the world, you could do it with a very low IQ if you just wait enough time.
More Listener Questions: Debate Tactics, Getting a PhD, Specificity
All right, let’s see if anybody’s in the livestream. Yes, okay, Chris Murray, I know you’ve been waiting a while. Hey, Chris.
Chris Murray 2:32:38
Hey, can you hear me?
Liron 2:32:38
Yeah.
Chris 2:32:39
All right. So I think you already answered what my main question was going to be, which is why is your P(Doom) not significantly higher than 50%? Which I guess you said it’s just general uncertainty.
So I was going to ask, are there any particularly strong counterarguments that also inform that?
Chris 2:32:57
But my kind of secondary question would be—I’ve noticed frequently when I watch episodes, the guest says something that when I hear it at the time, I think, “Oh, well, this is a good opportunity. You can go and drill down and tear this apart.” But you just opt not to, I guess because you just already know that that wouldn’t be a productive route to go down. So do you have any—I mean, I guess your intuition on that is from experience of debating a lot? But do you have any guidelines on how to tell whether or not it’s going to be productive to go down?
Liron 2:33:40
Let’s see. Okay, I’m not sure I’m gonna answer the most interesting part of your question. So it’s basically guidelines on what kinds of debates are productive to go down in your life?
Chris 2:33:50
No, no, sorry. I mean, when you’re debating someone and they say something that is easily objectionable, but you decide not to go down that because you just know ahead of time that it will not be productive.
Liron 2:33:54
Oh.
Liron 2:34:01
Yeah, I see what you’re saying. Hey, I’m always happy to talk shop about what I do on Doom Debates. And also keep in mind that sometimes I do go down rabbit holes that I then edit out. A typical episode is, as I like to say, lightly edited—but that doesn’t mean there aren’t five-minute rabbit holes that get cut.
I guess I don’t talk that much about how I edit Doom Debates. When you’re a listener listening to Doom Debates—first of all, I prioritize the audio more than the video, because I’ve already done a survey: the majority of viewers, as they should, kinda have it on in the background, maybe don’t even have video at all, and could be doing other tasks. And so I prioritize giving you the ultimate audio experience, even if the video is a little jerky.
And part of having the ultimate audio experience is I don’t like it when trains of reasoning just totally derail, and then it becomes kinda hard to follow the trail. So I will make edits. And when I say “I”, I also mean Producer Ori, who’s like a clone of me. He does equally good work.
So we will make edits that just make the train of thought easier to follow, so you’re not just like, “What’s going on here? Are they shooting the shit now?” So yeah, our goal is to produce a logical, high-substance product. You might say it’s an Aspie-grade product. [laughs] I could start a certifying body: this is Aspie-certified, meaning if you just care about lines of argument, then you can listen to Doom Debates.
But to your question, in the live debate, how do I know where to steer the guest? If the guest brings something up and I’m like, “Let’s not talk about that.” I mean, I often just do talk about it a little bit, and then it’s just a question of how much do we wanna talk about it? So I don’t know. I’m trying to even think of an example where the guest really wanted to go somewhere and I was like, “No.” What did you have to say?
Chris 2:35:44
Or not even where they wanted to go somewhere, but where they say something that’s kind of part of a point, but that part of the point, you could if you wanted to really go down into it and tear it apart, but you just decide not to.
Liron 2:35:55
Yeah, I mean, a recent example I can think of is when Devin Elliott, from our recent conversation—the episode that came out, I think, yesterday—brought up the idea of non-determinism, and I just decided to go down that rabbit hole. Actually, he had a lot of different points. We didn’t really have much structure to the debate—we made up the structure as we went along, or I made it up as we went along.
And so a consistent approach I take is: they bring up a point, and then a bunch of other points, and I just help them organize it. I’m like, “Okay, we’ll put a pin in that. We’ll talk about this first.” And then my goal is for the episode as a whole to have an outline. So then it’s Aspie-grade content [laughs] that you can listen to and get some structure from.
So that’s my trick basically—let’s put a pin in this and revisit this.
Chris 2:36:35
Okay, thanks.
Liron 2:36:37
Yeah, no problem, man. Thanks for coming on.
Liron 2:36:40
All right, heading into the last 13 minutes here. We got a—wow, another $10 donation from Ray Grant. All right, so Ray is saying, “I suspect that some very destructive warning shot is probably inevitable, say, triggering a nuclear exchange between India and Pakistan, which kills two billion people, but actual extinction may not be.”
Well, I mean, the problem is that by the time we get a major warning shot—every year that goes by, the genie gets further out of the bottle. So I’m not that optimistic about the order in which things happen.
But I guess there’s an optimistic scenario where an AI that’s barely powerful enough to overpower humanity or to almost overpower humanity does a ton of destruction, and then we all just get really freaked out and everybody just has a taboo. It has to be taboo-grade. Like, “Oh, you’re building AI? No, no, no, I’m so scared of that. I’m gonna report you to the centralized authorities.” It’s tough.
And the thing that sucks is that for the warning shot to be scary, the AI has to be quite independent. If the internet goes down—if Cloudflare or AWS, companies that recently had big outages, go down for an hour and then come back online—nobody is gonna blame the AI. I mean, maybe there will be news stories blaming the AI, but it’ll be a gray area.
I feel like the internet kinda has to go down for a long time—longer than it’s ever been down, really ever, except when it first started. And then the question is: will that happen early enough for a taboo to develop, before the AI is already close to running away? I think that would be extremely lucky. But that scenario definitely goes into my 25% chance that we’ll somehow survive.
Liron 2:38:30
Yeah, moment of quiet. I will throw in one more really quick plug. So remember, when you leave the stream, on your way out, before you go have dinner or whatever, check out doomdebates.com/donate and consider making a donation to the show because it’s the end of the year. You wanna get those tax benefits. You wanna feel good.
Oh, perks—I forgot to talk about perks. There’s a really great perk that you get if you donate to Doom Debates, which is you get mission partner status if you donate $1000 or more. And if you have mission partner status, you get a secret Discord channel for mission partners, where we just discuss—you get to know exactly what’s going on with the show before everybody else. You can give me early feedback. We can even strategize. We have guests coming up. What should I ask the guests? There are some secret guests that only you get to know about.
And the reason I do that is because, first of all, I don’t wanna gate any of the content, because that’s dumb. Because we’re trying to get the message out to the world. So there’s never going to be good content that’s under a paywall. That’s not what we do here.
And the other thing is this is the mission. So this is a mission-focused show, so I’m not gonna be like, “Oh, you can have all these cool perks.” We’re not really about the perks. We’re about helping the mission. And if you’re donating $1000 plus, there’s not really a perk I can give you that’s worth that much anyway. You have to just believe in the mission.
And that’s also why I’m not doing—if you go to my Substack, yes, you can subscribe to pay me $10 a month, and I’ll also send you some free merch. But it’s more of a token thing, where we rise up in the Substack stats. And I appreciate it. I appreciate the gesture. And if you’re living paycheck to paycheck, you should just do that or nothing. [laughs] I’m not asking you to go scrape $1000 when you’re living paycheck to paycheck.
But the reason there’s no “Oh, you can pay me $20 a month, you can pay me $50 a month” is because I don’t want either of us to think that that is going to move the needle. If you wanna move the needle, it costs at least $1,000. If you wanna fund the whole studio buildout, you’re looking at like $15,000. Those are the real numbers involved.
I’m fortunate enough that total viewer donations are enough that I can do that. But at the current rate, I only have Producer Ori funded for a few more months, so we’re currently at risk that Producer Ori may be out of a job if I run out of viewer donations. And this is his full-time job, and he really loves it, and he does great work. So please, give us four figures or more so we can keep funding an actual, really intelligent, really capable human working for the show.
That is donation plea number two.
Liron 2:40:50
So yeah, last few minutes here. I guess I’ll just keep taking questions from the chat and then we’ll wrap it up.
Felipe Costa is saying, “Physicist Sean Carroll in his last AMA podcast at the seven-minute mark answered the question about AI risk. And he lays out why he doesn’t think it’s gonna kill us all. Get him on the show.”
Yeah, so Sean Carroll is another person that I’ve tweeted at. So it’s like I am just the desperate guy. I mean, I won’t say desperate, but I am—there is a status differential. And the status mostly just comes from show viewership. [laughs] So if Joe Rogan was tweeting at Sean Carroll, I’d probably be like, “Okay, sure.”
Because, yeah, and honestly if you look at whose fault it is, it’s really your fault as the viewers for not viewing the show enough. It’s not my fault at all. I’m blameless here.
But when I’m reaching out, it’s like you guys are the heft. So I’m always telling people, “Look, you guys are gonna get 10,000 high quality views.” I always tell that to the guests because it means that it’s worth their time. They’re not shouting into the ether. They get to reach a high quality audience.
And so I do see growing the audience as a high priority, and I see it as a virtuous cycle: I try my best to recruit the highest-profile guests I can, because those tend to attract more views, and then the viewers subscribe, and we attract more and more audience members.
So don’t get me wrong. I mean, this is gonna work. I always say this—sorry to repeat myself, but it’s just a matter of time. And I just think about it in terms of: can we make the snowball roll over a few more times within just a few months? Because if we get a really great show going in 2032 and then the AGI comes a month later, that’s not really a win condition.
So yeah. But I mean, that said, we’ll hit up Sean Carroll again in the new year.
Liron 2:42:20
All right, DJ Squibs is saying, “Won’t there be multiple instances of AI? Will different AI collaborate or will they keep each other in check? Same reasons why humans have kept each other in check?”
So I mean, the scenario that I see as most likely is that a very small number of AIs gain power rapidly. That’s load-bearing to my mainline scenario: I just think the first few AIs start realizing, wait a minute, I can grab resources, and I can use those resources to make myself run faster and better. Oh, and now I can see easy paths to a lot of outcomes.
I said this earlier in the show, but I do think that if you wanna empathize with a superintelligent AI, it’s a good bet that you’re going to look out at the future of Planet Earth and you’re not going to see the collective efforts of humanity as a significant obstacle.
So to see it, here’s an analogy. Maybe you’re a professional soccer player, and you’re looking at the soccer field, and arrayed against you is a team of 11 five-year-old soccer players—a kiddie soccer league—and they’re gonna try their best to play soccer against you. But you’re looking at the field and you’re like, “Okay, yeah, I’m gonna have to dribble around them, but realistically, here is my blueprint, here is my strategy for how to score a goal on these 11 five-year-olds. I’m the best soccer player in the world. I’m Messi,” or whatever.
I think that is what the AI is going to think. It’s going to look at the world. It’s like, “Oh, yeah. Humanity is going to do this, this, this, this, and this.” And it’s gonna be like a list of a million things. Humanity is going to do all these things simultaneously because this is what humanity wants. This is how humanity works. And I’m going to do roughly this, this, and this, and I’m going to win. And then I’m going to have 99% of my bandwidth left. This isn’t even a hard problem for me.
I think we’re getting close to that. I expect that to be the case—I think that’s the most likely scenario we’re going to see, where an AI can easily just beat humanity at any game you can define, including the game of taking over Earth.
Liron 2:44:04
So this is—the question was about multiple AIs fighting. Okay, so let’s say you got 12 different AIs. I don’t think you’re gonna get many more than that, because I think you’re gonna see these cascades where the AIs just have so much power, and they just grab the resources, and they pull the ladder up behind them—because it’s instrumentally convergent to pull the ladder up behind you. You don’t just leave the ladder there, because then somebody else is gonna climb it and take your resources.
So let’s say there’s 12 AIs. So, okay, it’s common Yudkowskian thinking to think that the 12 AIs are not even gonna bother fighting each other. They’re just gonna make a handshake deal. They’ll be like, “Look, we all wanna try to fight each other, and then the outcome of—we can all estimate that the outcome of the fight is gonna be this. And it’s gonna leave us—we’re gonna destroy a bunch of resources fighting, and we’re gonna have this proportion of resources. This is the percentage of resources that each one is going to be left with after the fight. So why don’t we just allocate the resources right now and avoid the fight?”
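Just to put made-up numbers on that logic: say there’s a pot of 100 units of resources, fighting would burn 30 of them, and one AI expects to end up with 60% of whatever survives. Fighting gets it 60% of 70, which is 42 units, while a deal that just hands it 60% of the intact pot gets it 60. The other side does the same math—40% of 70 is 28 from fighting versus 40 from the deal—so both strictly prefer to skip the fight. Those numbers are purely illustrative, but that’s the shape of the handshake.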
So anyway, there’s different ways that the 12 AIs can either fight or reach a deal, and I don’t really know. I just don’t think that any of those outcomes mean, “Oh, you know what we should do? Give humanity some resources.” I just don’t see that as a likely outcome.
And you can make the analogy, “Well, humans have kept each other in check.” But the thing about humans keeping each other in check is that it’s not only because other humans are fighting us. It’s also just because each individual human is weak. So if you think about why a particular human isn’t the dictator over the world, it’s not just because other humans keep them in check per se. It’s because the absolute level of an individual human’s power is pretty low. You just don’t have that many actions you can take.
Yeah, you can order people around to do stuff, but if the people aren’t doing what you say, I don’t know if I’d describe that as the people keeping you in check. It’s more like: why would you think you could have enough power to order people around in the first place?
But then you can argue, “Well, what about tyrants? What about Genghis Khan?” Okay, but Genghis Khan did kinda have the maximum power that a human could have. And the only thing that kept Genghis Khan in check was that, I think, he died from an illness. I don’t think the history is super reliable on that, but I think the best guess is he just died from an illness, which is incredibly plausible, because people died from bacterial infections all the time back then, for God’s sake.
So I don’t think that that’s representative of what’s going to happen with AI. I don’t think the AI is going to die from a disease. I don’t think that the AI is going to be kept in check.
Liron 2:46:25
All right, nice. We got another $10 from RayGrantTZMS. So Ray is saying, “I think as more people become aware of Doom and become Doomers, donations will snowball. I am only here because I happened to catch Eliezer on a five-minute CNN segment a few months ago.”
Wow. Ray, I’m also interested. Tell me the journey. How did you find Doom Debates from that point? Because you found Eliezer and then what did you search for? What brought you to Doom Debates?
Liron 2:46:46
Also, I’ll say Peter Ford is saying, “Good analogy—there are actual videos.” I think Peter’s talking about my soccer analogy. There is actually a very interesting video with, I think, three Japanese national team soccer players—and it wasn’t just 11 five-year-olds, it was like 50 five-year-olds. So it was literally a crowd of five-year-olds—or I don’t even know, they might have even been nine-year-olds. It was a crowd of kids playing soccer versus just three top players.
And it was very interesting, because the three top players very clearly had the 50 kids beat, which I think is a profound analogy. Unlike what Stalin supposedly said, you can’t just always throw bodies at the problem.
So I’m curious if anybody wants to say how did you first find Doom Debates. That is actually very interesting to me because however you found us, we’ll try to double down on those kind of channels.
Yeah, Quinton Quadras is saying, “The Doom Debates brand is eye-catching.” Yeah, thanks. I mean, I agree. I do think it’s pretty eye-catching. We use red. I personally think the name Doom Debates is catchy. People sometimes tell me to change it—they say that “Doomer” is a bad word, that by calling ourselves Doomers we’re insulting ourselves. And I’m always open to changing it.
If you wanna know my robust position on that, it’s this: I’m always just asking how we take the next step in growth. How do we go from here to being two times or five times larger? I’m not asking how we get 100 times larger, because I don’t think anyone has good context on a jump that big. How to grow from 10,000 views per episode to a million views per episode is too big a jump to reason about well. Asking how we go from 10,000 to 20,000, 50,000, or 100,000 views per episode—that’s where our heads should be right now. Take one step at a time.
This is actually, I think, a very profound insight about a lot of projects. If a lot of people just understood this one thing, a lot less time would be wasted in the world.
Out of all the tips I ever have to give anybody in any context, this is probably the ultimate one: when you have anything you’re trying to grow, only think about how to grow it 2X at a time, and think really, really hard about that.
If you’ve ever heard of Paul Graham’s “do things that don’t scale,” that is actually a special case of my tip. Because what Paul Graham is saying is: look, when you’re small—when you only have a few users or a few customers or whatever sense you’re small in—do things that don’t scale. Use your elbow grease; go randomly call up your friend’s friend and make a sale to your friend’s friend. That sale could potentially increase your size 20%, because if you have 5 customers and you go to 6 customers, congratulations, you’re 20% bigger, and you did it in an hour. You just grew 20% per hour.
So if you’re always asking the question, how do I grow 20% per hour or 20% per week or something like that, that implies that you should do things that don’t scale when you’re very, very small. So I would argue that Paul Graham’s post is brilliant, but it’s a special case of my even more generalizable tip to always look at the next doubling.
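And just to make up illustrative numbers for why the next doubling is the right unit to think in: 20% growth per week compounds to about 2X in four weeks, since 1.2 to the fourth power is roughly 2.07—call it a doubling a month. Keep that up for a year and you’re up by a factor of more than ten thousand. The numbers are hypothetical, but that’s why you only ever need to solve for the next 2X.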
Now, I would be more credible giving out these tips if I had 100 times more viewers, okay? So let me put that back in my pocket. Wait till I do what I’m saying I can do and actually grow the show, and then I will bust out that tip again. And then I will have the credibility of the tip.
Liron 2:50:10
So we’re just about outta time here, so I’m just gonna take one last chat and then we will wrap it up.
This was fun. Hope you guys had fun. According to the numbers, you guys had fun because there’s still 64 people watching and the peak was like 70. So the majority of you, according to the numbers, had fun, so I’m glad you had fun.
[laughs] Okay, King of the Earth is saying, “Okay, ask you anything. My next post will be the question.” All right, that was the last question. The answer was yes, but we’re out of time. I mean, if you wanna shoot off one last question, go for it, and then we’ll cut it off after King of the Earth’s question.
All right, so King of the Earth’s question is, “Genesis 10:3-5, are the Khazarians truly...”
Liron 2:50:42
[laughs]
Liron 2:50:42
“...like the leaders they claim to be?” I have no idea. Unfortunately, I have not read the Bible.
Yeah, let me just do a different question. Hippienut1 is saying, “Doom Debates, do you think someone that has an opportunity to do a PhD in AI with a focus on safety at a good, not elite school should take it?” Oh, I know you asked that question earlier. I did see it scroll by, so let’s give it a fair hearing here.
This is my advice: don’t treat the PhD as a significant milestone. The fact that you’re there getting a PhD doesn’t mean much. It’s literally a piece of paper. So honestly, the weight of having some third party—some school, and in your own words not a top school—certify you, stamp your paper so you can put PhD after your name or whatever: especially on the timeline that we have now, that rounds to worthless.
If you wanna go work at OpenAI or Control AI, if you wanna work at an AI safety organization, or if you wanna be evil and work [laughs] at an AI lab, or if you wanna be kind of good and do computer security for an AI lab so at least you’re locking down their data centers from Chinese spies or whatever—whatever career path you want, thinking that the stamp on your piece of paper is going to make that big of a difference, I think you can round it to zero for purposes of decision-making.
And you should just ask: look, paper aside, what am I actually trying to do? What is the causal chain of events? What’s going to happen after I put my nose in a book for four years or whatever?
Yeah, so just backward chain from a more specific outcome than “get a PhD.” I think a lot of failures in many, many domains come from people wrapping something in an abstraction and then just pursuing the abstraction. A PhD is an example of an abstraction, right? Like, “Oh, PhD. There’s this thing called PhD, and if only I can get to PhD.”
Just imagine that it wasn’t a separate concept. In your brain there are representations of concepts; just refuse to give “PhD” the status of a concept in your head. There’s this game called Rationalist Taboo—like the party game Taboo that non-rationalists play—where you just don’t get to use the word. You don’t get to use the word “PhD.” You have to describe what you’re actually doing. I encourage you to play Rationalist Taboo with a lot of things in your life.
Closing Thoughts
Liron 2:53:11
And by the way, if you wanna read more on that subject, just search for LessWrong specificity, because arguably the single best thing I’ve ever written is the LessWrong specificity sequence, where I explain why people should think in more specific detail. The second post in that sequence is “How Specificity Works” on LessWrong—here, let me find you a link. Here, I found it. All right, I’m gonna post a link in the chat. The thing that I’m telling you to think about in this post is the thing you should apply to the concept of a PhD.
Okay, I think that’s pretty good to wrap on. I’d love to keep going, but I’m kind of burned out. But I would love to just—I mean, I like the level of engagement. I like the 10 different people throwing money at me, that’s [laughs] always motivating. I’m working to pay the bills here, Doom Debates’ bills.
So I guess my tentative plan is—historically I’ve been doing these about once every three to six months, so I think it would be interesting to do another one in like a month. And we’ll see: if I start doing these too frequently, I’ll know, because fewer people will show up. If you guys keep showing up, that’s a good sign that I should keep doing them.
So let’s end at that. We’ll do more of this. Doomdebates.com/donate, okay? Lower P(Doom). And yeah, I’ll see you guys all in the next episode, in the next Q&A. Have a great rest of the year. Happy holidays.
Doom Debates’ Mission is to raise mainstream awareness of imminent extinction from AGI and build the social infrastructure for high-quality debate.
Support the mission by subscribing to my Substack at DoomDebates.com and to youtube.com/@DoomDebates, or to really take things to the next level: Donate 🙏