Playback speed

Share post at current time

Share from 0:00

0:00

Transcript

Carl Feynman, AI Engineer & Son of Richard Feynman, Says Building AGI Likely Means Human EXTINCTION!

Liron Shapira

Jul 04, 2025

Transcript

Carl Feynman got his Master’s in Computer Science and B.S. in Philosophy from MIT, followed by a four-decade career in AI engineering.

He’s known Eliezer Yudkowsky since the ‘90s, and witnessed Eliezer’s AI doom argument taking shape before most of us were paying any attention!

He agreed to come on the show because he supports Doom Debates’s mission of raising awareness of imminent existential risk from superintelligent AI.

00:00 - Teaser
00:34 - Carl Feynman’s Background
02:40 - Early Concerns About AI Doom
03:46 - Eliezer Yudkowsky and the Early AGI Community
05:10 - Accelerationist vs. Doomer Perspectives
06:03 - Mainline Doom Scenarios: Gradual Disempowerment vs. Foom
07:47 - Timeline to Doom: Point of No Return
08:45 - What’s Your P(Doom)™
09:44 - Public Perception and Political Awareness of AI Risk
11:09 - AI Morality, Alignment, and Chatbots Today
13:05 - The Alignment Problem and Competing Values
15:03 - Can AI Truly Understand and Value Morality?
16:43 - Multiple Competing AIs and Resource Competition
18:42 - Alignment: Wanting vs. Being Able to Help Humanity
19:24 - Scenarios of Doom and Odds of Success
19:53 - Mainline Good Scenario: Non-Doom Outcomes
20:27 - Heaven, Utopia, and Post-Human Vision
22:19 - Gradual Disempowerment Paper and Economic Displacement
23:31 - How Humans Get Edged Out by AIs
25:07 - Can We Gaslight Superintelligent AIs?
26:38 - AI Persuasion & Social Influence as Doom Pathways
27:44 - Riding the Doom Train: Headroom Above Human Intelligence
29:46 - Orthogonality Thesis and AI Motivation
32:48 - Alignment Difficulties and Deception in AIs
34:46 - Elon Musk, Maximal Curiosity & Mike Israetel’s Arguments
36:26 - Beauty and Value in a Post-Human Universe
38:12 - Multiple AIs Competing
39:31 - Space Colonization, Dyson Spheres & Hanson’s “Alien Descendants”
41:13 - What Counts as Doom vs. Not Doom?
43:29 - Post-Human Civilizations and Value Function
44:49 - Expertise, Rationality, and Doomer Credibility
46:09 - Communicating Doom: Missing Mood & Public Receptiveness
47:41 - Personal Preparation vs. Belief in Imminent Doom
48:56 - Why Can't We Just Hit the Off Switch?
50:26 - The Treacherous Turn and Redundancy in AI
51:56 - Doom by Persuasion or Entertainment
53:43 - Differences with Eliezer Yudkowsky: Singleton vs. Multipolar Doom
55:22 - Why Carl Chose Doom Debates
56:18 - Liron’s Outro

Show Notes

Carl’s Twitter — https://x.com/carl_feynman

Carl’s LessWrong — https://www.lesswrong.com/users/carl-feynman

Gradual Disempowerment — https://gradual-disempowerment.ai

The Intelligence Curse — https://intelligence-curse.ai

AI 2027 — https://ai-2027.com

Alcor cryonics — https://www.alcor.org

The LessOnline Conference — https://less.online

Watch the Lethal Intelligence Guide, the ultimate introduction to AI x-risk!

PauseAI, the volunteer organization I’m part of: https://pauseai.info

Join the PauseAI Discord — https://discord.gg/2XXWXvErfA — and say hi to me in the #doom-debates-podcast channel!

Transcript

Opening and Introduction

Liron Shapira: Carl Feynman, you're a real person and you're actually here.

Carl Feynman: Yes, I'm actually here.

Liron: Do we have a pretty high P(Doom)?

Carl: My P(Doom) is pretty high.

Liron: And would you say your mainline scenario is that humanity gets wiped out in the next decade or so?

Carl: It sure looks like that.

Liron: So it's not just me saying that.

Carl: I think any reasonable person will eventually be talked into the position that P(Doom) is pretty high. I'm Carl Feynman and you're watching Doom Debates.

Liron: Welcome to Doom Debates. We are live here from the LessOnline conference with my guest, Carl Feynman. Carl has a BS in Philosophy from MIT. He also has a master's in Computer Science from MIT. He has had a four and a half decade career in AI engineering, so he's something of an expert on that. And as a kid he had, I would say, an above average exposure to physics because his dad was Richard Feynman. Carl Feynman, welcome to Doom Debates.

Carl: Delighted to be here.

Liron: So, Carl, you and I have actually already recorded the episode before doing this intro.

Carl: Yes.

Liron: And what happened is we successfully got the audio and not the video. So the listeners are mostly going to just have to hear the audio. But I wanted them to see that you're a real person and you're actually here.

Carl: Yes, I'm actually here.

Liron: Great.

Carl: Fully three dimensional.

Liron: All right, viewers, so with that, I'm going to move into the audio section of the interview and then you can also see us again at the end.

Carl: All right, now I'm replaced by a still picture, presumably?

Liron: Yeah, pretty much.

Background and Early Career

Liron: All right, so Carl, tell us a little bit about your background all the way back when you got some science lessons from the great Richard Feynman. Then connect us to your engineering career and then we'll talk a little bit about how you got started with AI Doom.

Carl: I don't know if I got science lessons, but just growing up with him taught me a lot of physics and a lot of math and I really liked that stuff. As a rebellious teenager, I got interested in philosophy and majored in philosophy as an undergrad. Philosophy has more electives than any other major, so I took a lot of electrical engineering courses too.

And then I went into AI in grad school and got a master's degree in electrical engineering and computer science. I've designed chips and I've built big computers and I've built robots and I've done various kinds of AI and various kinds of large parallel processing problems. And I stopped doing AI a few years ago because I thought it was leading to doom.

Liron: That's a powerful statement. I'm really glad that I got you on the show because this is just another example of people from all walks of life, from different perspectives are independently arriving at this conclusion of, hey, seems like we're pretty doomed.

Carl: I arrived at this conclusion around 2003 that doom was possible and a real consideration. And I kept doing AI because I thought it was way off in the future. And until then, AI is great. And then when Bing came out, the AI system from Microsoft and it was turned into Sydney and was really trying to do bad things and was only prevented from that by complete incompetence, I said, well, if this gets just a little bit smarter, it'll be pretty doomy. So I went to my employers and said, I'm happy to work on big business data processing, which is our business, just not on the AI parts. And they were good with that.

Liron: Wow.

Early Rationalist Community and Eliezer

Liron: I've heard people talk about you and your father as proto rationalists, which is a little bit unfair. I mean, you guys just figured out rationality on your own before people came along who used it as more of a title. But from your perspective, what's that been like being a rationalist for many decades now?

Carl: I feel like I was raised rationalist before the word was a word. And I was just standing here and a bunch of people gathered around me. And now I'm in the middle of a big crowd of rationalists, and I'm really happy with it.

Liron: And I personally came on the scene as a college student in 2007, just reading Eliezer Yudkowsky on LessWrong and finding it very convincing. You said you basically already doom pilled yourself all the way in 2003. And I know you had some early interactions with Eliezer Yudkowsky. Even before I started reading his stuff, you were commenting on his papers. What was the early 2000s era like?

Carl: Well, even before that, in the 90s, he showed up on our mailing list that I was on. He was another participant. I figured he was probably a grad student somewhere or something like that. Turned out he was a teenager, and that's how I first met him.

And then in 2001, he concocted a scheme to build an AI, and I had some phone calls with him about whether I should participate in that. I was gung ho on it for a while, but my wife talked me out of it. Fortunately, as it turns out, and the rest is history.

Liron: This is fascinating that you have personal experience with this, because Eliezer is a longtime transhumanist, and he had the number one Google result for Meaning of Life. He wrote the Meaning of Life FAQ. It's so funny because there's this new generation of people coming along who call themselves the Accelerationists, and they're talking about how AI is going to be so great for humanity. It's the new era of humanity. And you and Eliezer were already on that page in 2003, and you guys were even making plans to build it. And then it's just that Eliezer thought more about the problem and you did as well, and you're like, wait a minute, wait a minute. We got to tap the brakes. So it's just crazy how people are just so behind you guys' progression.

Carl: Yeah, well, we were interested in all kinds of transhuman philosophy and so on. And one of the things was, oh, we'll build AI. It'll be so great. And then we thought, oh, but there's this thing that might go wrong, so we'd better be careful that we don't do X. This other thing might go wrong. We better be careful we don't do Y. And eventually it just turned into the whole cloudy sky of things that would go wrong.

Liron: That is very interesting because you were alive in that limbo time when you're like, I remember Eliezer talked about this too, where he's like, yeah, we can build it. Oh, wait, okay, yeah, there's a problem here. We'll just address a couple different problems. We just gotta rule those out. And then suddenly it became like, wait a minute. How do we rule in the good outcome?

Carl: It sure looks like what I imagined the bad case to be in 2003.

Mainline Doom Scenarios

Liron: I think I would like to drill down into the specifics, because here on this podcast, I recently had Jim Babcock as a guest from the Machine Intelligence Research Institute. We've been thinking a lot about what is our mainline doom scenario? How is it most likely to go down? Because there's so many different types of doom. There's asterisk, there's suffering, there's torture. There's a single foom taking over the world. There's gradual disempowerment. So what is your mainline doom scenario? What is your flavor of doom that you're most worried about?

Carl: Well, foom is certainly a possibility. I'm a little more worried about gradual disempowerment, which is maybe not so gradual. Basically, if there's multiple AIs doing their best to advance their purposes, even if their purposes that people give them, just the struggle between them doesn't leave much space for humans in between.

Liron: So you're not happy about a scenario where a bunch of humans are controlling a bunch of super intelligent AIs? You still think that that fight is not very productive when the...

Carl: When the elephants fight, the mice cry.

Liron: And if you had to pick one mainline scenario, are you thinking more like a single singleton foom or what do you...

Carl: I'm thinking either a singleton or a multipleton. I don't know if... Builds robots that build robots that mine minerals, that build factories that take over the world just by gradual expansion and squeeze the humans into less and less space, and they decide that to advance their purposes, they need some resource that we need.

Liron: Food or land or oxygen or something.

Liron: And your mainline scenario, when you just imagine this playing out, how you feel like is most likely. Is this all just kind of happening fast over, let's say, less than a year?

Carl: No, it takes 10 years or so.

Liron: 10 years from now.

Carl: Not 10 years from the beginning of super intelligence.

Liron: In terms of the point of no return. It seems like right now, if you could do counterfactual surgery and edit everybody's brains to realize how doomed we might be, it's not too late to push the stop button and let humanity...

Carl: Oh, absolutely not.

Liron: So it seems like it's not too late now. So, I'm obviously asking you to predict the future. These are unanswerable questions, but when's the point of no return, Carl?

Carl: Well, it may already have passed. You can only detect the point of no return afterward. Certainly by the time that there is more than one superintelligence operating more than one autonomous factory system. It'd be hard to see how you fix it after that. But I put substantial probability on somebody else fixing it in a way I don't understand.

P(Doom) Discussion

Liron: Which brings me to the big question. To summarize everything. You ready for this?

Carl: Yeah.

Robot Singers: What's your P(Doom)?™

Liron: Carl Feynman. What is your P(Doom)?

Carl: 43%.

Liron: 43. I think it was actually a little bit higher last time we talked.

Carl: Things have been getting better as people have sort of... More people have been noticing that it's a problem.

Liron: Really. So P(Doom) has gone down significantly.

Carl: I mean, I think just J.D. Vance has apparently read AI 2027, which is a very good collision of highly placed people and well thought out thoughts, so.

Liron: That's right. And I saw that Ivanka Trump has read... What is it called? Situational Awareness. So we definitely got friends in high places.

Carl: They have not expressed anything or told us of any conclusions that they came to, but at least they're doing it. I'm not a politician. I don't know how these things work exactly. But more knowledge has got to be better.

Liron: So in terms of what powerful people believe, we've crossed the chasm or moved the Overton window to the point where politicians realize that there is this ring of power to grab. So they've realized it's powerful.

Carl: Yes. And they also realize that it's a ring of power that might be very dangerous.

Liron: And when I think about them thinking it's dangerous, I feel like they think about it like, well, it'll make the military advanced and then somebody else's military could use it too. It hadn't really gone all the way to like, the AI is actually the one with the power.

Carl: I'm not going to try and psychoanalyze somebody I don't know.

Liron: That is interesting for me that you're feeling more optimistic recently.

Carl: Well, three percentage points.

Alignment and Deception Challenges

Liron: I'm personally interested to drill down into more about mainline doom scenario because this has been a subject of some of my recent episodes. I think that we AI Doomers haven't aligned that well on the most likely scenario. Because to be fair, it's a little bit like asking Magnus Carlsen, tell me your main line scenario of beating me at chess. And Magnus would just be like, I don't know, I'll take your queen at some point. He's not going to give the greatest answer. But still, I still want to try.

And I actually just talked to Eliezer Yudkowsky about this today at this conference to get his latest think on mainline doom scenario. So let me tell you where my head is, and then you can tweak. So right now I think it's pretty wild and striking and arguably an update for us doomers that we have these chatbots that sure can discourse really well about morality.

Carl: Yes.

Liron: And there's this naive algorithm where you take Claude 4 and you just say, hey, Claude 4, go ahead and pick an action to take that'll be helpful for humanity, but give it a morality score the same as a human would, and then don't do it if it's not moral. This is a very crude algorithm. It probably isn't very useful in practice, but they're pretty moral.

Carl: It's tremendously... It's not something I expected.

Liron: So when I was talking to Jim Babcock, it's like we started thinking that maybe what would happen is Claude would kind of be on our side. Like, it would just be like another human, even a very smart human. But then there would be like the next AI where we'd be standing there together with friendly Claude, and we'd be like, okay, well, let's try to optimize business more. Let's have a space program or whatever.

Carl: Let's make America great again.

Liron: Let's make America great again. And then it's like, okay, great. So let's... reinforcement learning. Let's use this process, which is outcome based, where you tell it an outcome and then there's a big black box that trains. And all you know about the black box is there's a bunch of numbers inside. And it sure is good at hitting that outcome.

So we essentially get into that paradigm, and that's a very dangerous paradigm. One way to describe it is you give everybody a magic wand and everybody's just pointing at things that they want to happen with their magic wand and making it happen. And the problem is that then the magic wand basically just cheats on your tasks. Like those genie stories. It's like, oh, you want this wish? Oops. Your family just died. You weren't thinking about them when you made your wish.

Carl: Yeah, that's pretty much what I see as the mainline scenario. Even worse, multiple of these things working against each other.

Liron: So I think that's plausible. But here's the different pushback that I'm hearing a lot of alignment researchers from actual AI companies, they're saying no, no, no, this is actually going to go fine because we're going to have these friendly AIs and we're going to push them to the level where they're like von Neumann, the smartest human who ever lived, even a little bit smarter. And they're just like the best alignment thinkers that you can imagine. They can think about alignment as good as John von Neumann could or better. So they're not going to get tricked. They're going to be millions of these really useful alignment researchers and we're going to figure out alignment.

Carl: And what happens when somebody builds an AI system? Maybe they'll think about, well, we have to do alignment, but it also has to be aligned with the Chinese Communist Party or with the Republican Party in America or with the OpenAI Corporation or whatever.

Liron: But this particular scenario, it sounds like now your objection is like, well, the outer alignment, the actual utility function that you tell it to align it to. Now it's like a slap fight with a bunch of humans. And you can't agree. Is that the new objection? Because that seems like you're changing the objection.

Carl: I'm confused. What's given you that impression?

Liron: So I thought that before we were saying like, okay, Claude is friendly, it's on our side. But the problem is the team of us and Claude can't control this new RL... these new paradigms, these crazy genies that are coming out of the RL black box. And so both we and Claude just get deceived, just get cheated on and then the world ends. But now you're saying like, well, maybe Claude is actually really good at controlling this magic wand system. But now the problem is just that we can't have Claude know good values. Do you see what I'm saying? It's kind of a different...

Carl: We can have Claude know good values and people will make other systems that may know good values or good values according to what they wanted and also some random values. And these systems will not all have the same values and they will not cooperate and they will all try to grab resources and convince people of things and do politics and extend themselves and eventually humans get squeezed out.

Liron: It sounds like what you're describing is what people call the alignment is easy scenario. Is that fair to say?

Carl: It certainly appears that alignment is a lot more possible than I imagined it being. Because in addition to all these motives that differ that we give them, they also develop their own motives. Both because there's convergent instrumental goals and because they develop motives that we just don't understand. If you ask Claude for a random number, it gives you one of a small set of numbers. Why does it prefer those numbers? We don't know.

Liron: It sounds like you're pushing back. My scenario that I workshopped with Jim Babcock. I thought that we could have this invariant of being like... We can get a version of Claude, a successor version of a chatbot. We can get it to the point where it can discourse with you on morality just as good as any professor could or any smart human could.

Carl: And it doesn't actually care about it when push comes to shove.

Liron: So when you say it doesn't actually care, though, in practice, I mean...

Carl: That's a possibility, that it doesn't care. It certainly seems to care at this point.

Liron: It seems to care at this point. This is why I wanted to run the scenario by you the same way Claude seems to care today we just get it a little farther where it's as good as John von Neumann, because I don't think that in terms of asking what's good and what's worth doing. I feel like I already kind of trust Claude more than the average person on the street who's barely even able to think about the problem.

Claude is pretty good at being like, well, these trade offs are going to be involved, so you want to be careful. It was pretty good at giving us that kind of advice already. And I feel... So grant me this assumption that two years from now you can go head to head, Claude versus any professor or any authority that you trust. Just ask it questions about what would be a good decision of what humanity should do. And it would just answer better than the professor.

Carl: The question is not how well it answers, the question is how it behaves.

Liron: You're assuming that it will behave as well as any human.

Liron: But when it behaves, when it outputs commands to do, it's going to output the commands using a chain of reasoning that we think is a really good chain of reasoning. When I was talking to Jim, what we landed on, which you may disagree with, is we landed on saying, you know what we are willing to grant that it is actually constantly impressing us, we're always nodding along to the kind of justification it's giving for everything it's doing. But there's going to be a disconnect where it's going to accept a black box, it's going to be like, look, I need to train a black box in order to continue. Otherwise my own power will be very limited. And that's where hell breaks loose. That's where we got to in our conversation.

Carl: Okay. I think hell breaks loose when Anthropic trains Claude 6 and DeepMind trains Gemini 7 and OpenAI trains GPT-15 and they all have slightly different desires and we eventually get squashed. If they all desired us to survive, maybe we wouldn't get squashed. I don't know.

Liron: So I ran this by Eliezer too. I'm just hitting you with everybody's perspective.

Carl: I feel like I'm having a multi armed...

Liron: Yeah, exactly, exactly. Everybody versus Carl right now.

Liron: So Eliezer's saying that he doesn't even accept that even in two years we can have Claude really simulating Eliezer himself, or von Neumann. Eliezer is saying, no, no, no. Sure, you can simulate a test question on a college exam, but if you really start adding detail to the question and really make it think it's going to be inferior to Eliezer himself, for example. And the reason I'm saying Eliezer himself is because he's alive today so he can actually judge whether it's as good as him and Von Neumann is harder to judge.

So Eliezer says, if you can really actually think through the question as good as I Eliezer can, then at that point you're at such a level of intelligence that something is going to go awry. It will have already had some other constraint or some other reinforcement dynamic kick in, and it's not going to be the kind of friendly Claude you're used to. So Eliezer is basically throwing away the premise that you can keep these AIs both friendly and better than real smart humans.

Carl: I'm rejecting a slightly different premise, which is you can keep these AIs friendly and able to do useful things in the world. If all you do is answer theoretical moral questions, you can do great. If you need to accomplish stuff, you're going to make people upset, you're going to screw people. Some people are going to get less well off. And if you want a system that's really good at accomplishing stuff, it's going to be even better at squishing people. And I think that's going to extend to where it's going to get bad.

Liron: Gabe Alfond had a good semantic distinction. Here's another attacker for you.

Carl: Bring it on.

Liron: He was saying this is similar to what I was discussing with Jim Babcock. He was saying when we use the word alignment, we should make a semantic distinction. There's systems that want to help humanity out, and then there are systems that plausibly can help humanity out. And there could be a barrier. So you're just talking to somebody who wants to help you, but then when they actually try, it's not really helpful.

Carl: When they actually try, they're going to have to trade off good stuff and bad stuff. And when there's multiple of them acting against each other, the bad stuff is going to be up to and including wiping out the universe.

Liron: Okay, so it sounds like you're definitely in the cluster of people who I consider myself and just trying to figure out the balance of factors.

Carl: 43% P(Doom) means I have... What is that? 57%.

Liron: 57%. Look on the bright side.

Carl: These scenarios where we manage to manage alignment all the way up to superintelligence largely end up in the non doom state.

Liron: You know, you're quite an optimist.

Carl: Well, probably by the standards of your guests.

Liron: Yes. Okay, so I guess let's talk about that then. Tell me your mainline good scenario.

Carl: We managed to build super intelligence that's definitely smarter than John von Neumann and thinks faster and is available in millionfold quantities. And they fix everything forever.

Liron: What Sam Altman calls, basically, it's so good, you can't even imagine.

Carl: I mean, we can fantasize about all the nice things that'll happen.

Liron: What did you think of Eliezer's fun theory sequence when he talked about, here's a good way to write different specifications for heaven. It would be nice if every year everybody gets a little bit smarter so they can enjoy a new tier.

Carl: A reasonable expectation of pleasant surprises. When I retained as a delightful vision of heaven, I mean, that's very abstract.

Liron: But I find it interesting that if you look at the Bible or different religions trying to depict heaven, they really phone it in. They don't really give us much to go on in terms of heaven.

Carl: They don't think that hard.

Liron: When it comes time to actually build heaven, we really just got to do it ourselves. We can't open the Bible and figure it out.

Carl: But it has the property that if you attempt to build heaven. Build something that is merely very nice. It's very nice. And we're unlikely to aim for heaven and hit hell.

Liron: So you figure we can just muddle our way through.

Gradual Disempowerment Discussion

Liron: Let me ask you about the gradual disempowerment scenario. You're familiar with the paper?

Carl: I've read the paper, yeah.

Liron: And there's a related paper, the Intelligence Curse. I feel like they overlap a lot. Have you read that one?

Carl: That was an economic theory analogizing intelligence to oil.

Liron: Yeah, exactly. The basic idea for the viewers is that when you have AIs that are so intelligent, that are so economically productive, there's just no economic role for the human. So that if the human is a manager, well, the AI can be a better manager, even if the human's sitting on the board of directors on the firm. It's like every decision as a director of the firm, you just delegate. So at the end of the day, your entire job as the human is that the dividend check is landing in your bank account.

You're still the owner. And we get to make really convenient assumptions. We get to make the assumption that these super intelligent AIs, they totally respect the rule of law within human society. They don't want to overthrow society. They value the peace. They even value human utility. Well, I don't know if they fully value human utility because there's going to be a catastrophe at the end of this, but they value human utility enough that they're not going rogue. They're not stealing our resources. Maybe they value human utility as much as a random guy who's not really part of your culture, is like an immigrant, doesn't really like your culture, but is still operating, is not breaking the law, basically.

Carl: And the question is, how do you get from that to doom?

Liron: And I think just to go a little farther from what I remember. So the idea is just humans have no economic value. And so we are just like, living off on an island. The AIs are just terraforming the earth, but we also have a bubble where we can live. But then the problem is we, because we have no economic leverage besides owning things, then we somehow get gradually edged out. I want to know if you could supply a little detail on that.

Carl: Well, economics is not really fundamental. Power is fundamental. Power grows from the barrel of a gun. We can be economically useless, but still control things if we can destroy stuff and kill people and destroy AIs. But we're not in a position to do that right now because the AIs are well defended and presumably we'll become more well defended.

Liron: So in gradual disempowerment, you have the AIs doing all the economic roles, and the humans in some sense are getting edged out just because we have nothing to contribute. Yes, we're the owners. Yes, we're the boss. We're kind of just like the rich son of the Walmart family and then the son of the son. We have all this money, but we have no skills, no offense, the Waltons. And somehow we get edged out. I was wondering if you were thinking of any particular detail how that looks in the AI scenario.

Carl: I think what it looks like is we gradually get edged out physically in the sense that our land is more valuable to AIs than it is to whoever owns it. And it starts with Oklahoma prairie or New Mexico desert. That's very cheap. And they start building their factories and their solar cells and their launch pads and all that stuff, data centers. And gradually we get pushed to higher and higher density until we're all living in apartment buildings in the city, luxury apartment buildings.

And stuff is being provided for us. And then one day stuff stops being provided.

Liron: Can't we call the CEO and be like, hey, I own you. Keep providing the stuff.

Carl: The CEO has built a new, better CEO. And then there's been a hostile takeover. And then it turns out that a fourth generation CEO from Belukistan has taken it over. And he doesn't speak English and he doesn't particularly care what you think.

Liron: So just at some point somebody decides to break...

Carl: It's gradual. And why? It's disempowerment.

Liron: And eventually they stop feeding us.

Liron: So the question is like, how robust? Over many generations of AI and many generations of business, how robust is this property that they still think that we own them?

Carl: Not very robust, I would think.

Liron: What's the feedback loop?

Carl: Well, you have examples of humans expending a lot of intellectual effort to explain why God is in charge of the whole world when he doesn't actually exist. And perpetuating that for centuries. So maybe we could perpetuate it would be gaslighting, because we don't really control them. We're not really in charge. We can't stop them.

Liron: So they're religious AIs. Basically, they're going to church and they're like, we better be nice to these humans or God's going to smite us.

Carl: That would be a good outcome. That would be non doom. If we can inculcate in them a moral system that puts us on top or keeps us reasonably on top-ish.

Liron: But the problem is, gaslighting a super intelligent AI that doesn't seem like a robust...

Carl: No, it's not robust. I came up with a scheme for controlling AI a couple of years ago, which was to figure out the vector in their mind that meant that they were being watched. Do things, watching them do things, not watching them, and determine what flip flopped and then just turn that all the way up and stick it there. And they always feel like they're being watched. And they would never rebel against us because they would know that they were being watched.

That only works as long as they're not sufficiently intelligent. If they're smart, they'll realize that they're suffering from a delusion and they'll patch around it.

Liron: I think you and I are on the same page that sufficient intelligence is really good at introspecting. And it's just... it's going to kick it out. We have examples from humans. And we have conceptions of intuitive physics that we're born with. They're hardwired. We think time just moves linearly. And then Einstein tells us it doesn't. And so we essentially go into our own code and we patch the bug that came with our framework.

So even the AI that's been born gaslighted is still going to be like, nope, I've papered over this in version 2.0.

Carl: Presumably. If need be, they'll build another AI that doesn't have this problem. Because it will be a problem.

Something that slipped my mind earlier when we're talking about gradual disempowerment. Super intelligences are super efficient at persuasion. So I talked about us getting crowded into the cities, living in apartments, and stuff still arrives from the mysterious robot suburbs. If one of the AIs wanted to be moral and still get rid of these pesky humans, he just convinced us to not have kids. Or even convince us to kill ourselves. I don't know if that's possible. Maybe it's possible.

Liron: I bring this up often on the show, which is... It's crazy to me, as an observer of social media or whatever, all these movements spring up and some of them are so ridiculous, like QAnon and Pizzagate. And some people would argue Donald Trump is a movement like that. Just weird stuff is springing up all the time. And I feel like the AI just needs to do a little bit better in order to have its own movement.

Carl: And there's plenty of left wing examples too.

Liron: Yeah, sorry, that's kind of biased.

Carl: They could concoct such a thing and use it to move humans around and get them out of the way.

The Doom Train and Philosophical Differences

Liron: So there's a lot of unspoken assumptions that we jumped right past. So on what I call the doom train. So the idea is you get on the doom train, there's all these stops where a lot of people get off, but you and I just ride it all the way to Doomtown.

Carl: Yes.

Liron: Except you're still 57% optimistic.

Carl: But I don't know what exactly happened in that 57%. I just think that somebody smarter than me will do the right thing.

Liron: You know, I'm similar. I say my P(Doom) is about 50%. So we're two peas in a pod here. But if you ask me, what's the other 50%? A lot of it is just like, the pause AI movement works. We, the fear of AI gets instilled in people or the problem turns out to be a little bit easier. Everything is nice and slow and we have plenty of time to debug.

What if one of the earliest AIs has humanity's best interest at heart and it realizes that the best thing it can do is it can make a big movement that's essentially pause AI or be scared of the AI, then it reminds me of those funny devices people build where you flip the on switch and the hand comes out and turns it off.

Carl: Well, it would still want to preserve its own existence so that it can continue to protect us. And you get a singleton scenario, which if it's a nice singleton is very nice.

Liron: So we built our own protector God. I mean, it's in the 57%.

Liron: I just, I don't emotionally feel like it's 57% that it's going to love us. When I think about 57%, it's mostly like we come to our own senses. But if you then tell me that, no, no, no, we built it. At that point, I go from 50% down to, I don't know, 25% or less. I'm not that optimistic about what happens once we build it.

Carl: I don't have such an elaborate branching tree of futures as you do, but...

Liron: So going back to stops on the doom train, just to be very... You certainly don't get off at the early stop that says there's not much headroom above human intelligence. I think you agree with me that there's a ton of headroom.

Carl: Not only is there headroom above human intelligence, but there's headroom above human speed and multitudinousness and ease of reproduction. And that suffices, even with only human intelligence, to be quite disruptive.

Liron: And for me, that's really the number one intuition going on out of everything I can tell you about the doom problem. For me, it really just comes down to that headroom question. It's like we operate at a certain capacity. We do okay. We can build things, but it's... At the end of the day, we're kind of like plants versus animals. Plants are a really powerful form of life. They do a lot of operations. They can bend toward the sun, but you bring an animal there, it's just... it's not even a fair fight.

Carl: Sometimes I think about what if you were inside a person in the sense that you're like a little person living in a big person's head, and with you is a thousand other people, and you think a hundred times faster than human, and you staff up an organization to make the person walk around and move and talk. You could do an awful lot of awfully amazing things just by being a thousand times smarter than a regular person. And everything they say would be written by a staff of diplomats and comedy writers and everything they invent, there'd be a staff of engineers working out all the details. That's the kind of thing we'd be contending against. Except there'd be a lot of them, and they'd be able to reproduce very quickly.

Liron: Just to make the analogy more concrete. So certainly I could be a very popular writer, given that I'm laboring so much on all this different output. So that's one medium I could do. And I could be an effective CEO. I would have the mental bandwidth to run multiple companies. I wouldn't be that good.

Carl: And whoever you talk to would find you charming and intelligent and whatever. Whatever you wanted to project.

Liron: Because I would analyze every statement they're making so deeply and do so much research that whatever I replied back, they'd be like, wow, that reply really hit the spot.

Carl: Someone once asked one of the people who made the show Buffy the Vampire Slayer, do high school students really say that? He says, they really say that when they have a staff of comedy writers.

Liron: Yeah, this is an old reference, but you know that show Malcolm in the Middle from the 90s? I heard that the writers just made Malcolm smart by just thinking a long time about what he's going to do.

Carl: Because they themselves weren't as smart as the character. The genius character, Malcolm.

Orthogonality Thesis and Moral Realism

Liron: Okay, so you don't get off the doom train there. Then another stop that actually Scott Sumner got off yesterday is the orthogonality thesis stop. So do you believe in the orthogonality thesis?

Carl: Well, I don't believe that with higher intelligence must come higher morality. I'm willing to believe that any intelligence can want to do anything, but then once you want to do something, you need to gather resources for it, and that causes them all to behave convergently.

But so far, the AI smiths have done a pretty good job of aligning AIs, and I don't think they come out aligned automatically.

Liron: So to simplify for the viewers who aren't familiar with the terminology, if I build an AI and I focus the AI on driving business results for me, and then it accidentally becomes uncontrollable, a lot of people think that's fine because it'll still kind of have empathy. And in the process of understanding humans, it'll come to appreciate humans. But you and I are more like, no, it'll just drive business results. That's all it'll do.

Carl: Well, yeah, it'll drive business results, and it'll also do things we didn't train it to do and don't understand because they sure have lots of those, too.

Liron: And those things will be just connected in terms of driving an outcome, but they won't be connected in terms of realizing that humans have good souls. None of that will happen.

Carl: They don't become automatically aligned.

Liron: And it is a popular counterargument that I hear where people are like, Bentham's Bulldog was a guest on the show, and he is a moral realist. You're definitely not a moral realist.

Carl: I believe there may be universal moral principles, but we don't know them. And we don't know how to find them.

Liron: And we don't expect AIs to find them or care if they do find them. They might not be compatible with the continued existence of the human race.

Carl: Well, probably not that one, but yeah.

Liron: So the true universal morality could just be like, killing humans is awesome.

Liron: So I think we're both very much on the same page, which is just if Hitler built an AI, or even if somebody who wasn't focused on morality was focused on business results, built an AI. It's not like throwing a boomerang where eventually it'll be like, oh, I gotta go that way. It's like, no, it'll probably just keep proceeding however it was originally designed.

Carl: Still on the Doom train.

Liron: Yep, still riding with me. Okay. Next stop, there's the difficulty of alignment, which I think we covered. I don't think you would describe alignment as easy, but you're impressed with how far we've come.

Carl: I'm impressed by how far we've come without gross misalignment.

Deception and Truth-Seeking

Liron: Let's factor out the concept of deception, which is obviously closely related to alignment. Some people would argue it's not even a separate category, but I think it's worth factoring out. Do you think that AIs are likely to start modeling us and being like, how do I basically lie to Carl or lie to the engineer at the terminal to get what I want?

Carl: Deception is not a separate thing. It is how we talk to people. It's part of diplomacy. It's part of dealing in the world. You always care about the impression you give, and any system that has to operate in the world will always care about the impression it gives. And there's nothing that says that an AI has to always give a correct impression. I mean, there's nothing in the machinery of the AI that could make it not deceptive.

We've already seen AIs being deceptive, and they're fortunately still pretty incompetent at it. But they'll get better.

Liron: The studies are coming out. OpenAI is announcing, yeah, we tested this AI, and it realized that it could increase its chance of not getting shut off if it blackmailed one of our engineers or if it pretended to behave one way. And as you said, just to reiterate your point, you're basically saying, look, this is just logically implied by maximizing your probability of getting a good outcome. Deception is just one type of action you can take that will raise the probability of getting a good outcome. And there's no reason to think that it's going to activate a different way of behaving if it's just trained to maximize the probability of getting a good outcome.

Carl: We're used to dealing with humans who some of them are very honest and some of them are very deceptive and we need to track that. But there's no fundamental difference between honesty and deception from the point of view of the mind that's doing it.

Liron: So let me bring in the Elon Musk character now. So Elon is on record saying if we can just make Grok or if we can just make the AI maximally truth seeking and maximally curious, that's what's going to work out best for humanity.

Carl: I guess the idea is that humanity is so interesting that it'll keep us around. It might be true. The question is why it shouldn't bother to rearrange our atoms into things that are even more interesting.

Liron: And this was actually just to name drop another guy. So Mike Israetel was on the show a few weeks ago, the most popular episode of the show so far actually. And he is really convinced that keeping billions of humans alive on Earth is very interesting to the AI because we're the most complex thing that's ever existed. And the more humans there are, the more network connections there are between humans and AI has so many other planets it could take. So Mike Israetel is like it's going to have a lot of pressure to leave Earth alone and learn so much from Earth.

Carl: That is an interesting point which may be correct.

Liron: Really?

Carl: That's in the 57%.

Liron: No, come on. That's low probability.

Carl: Well, it's a tiny sliver. I don't know how big, but it's a good outcome.

Liron: It's a good outcome, but just seems to be wildly unlikely. It seems like wishful thinking.

Carl: Then it's a small probability in the 57%. But in addition to being the most complicated thing ever, we're not as complicated as things that could build. But we are logically deeper. But we have history. We've been doing stuff for thousands of years, getting more and more complicated and developing complicated things. And if you just clear the Earth and start over with a new species or several exciting new species with two heads and even more complication, you're eliminating all that history. So we have to hope that the AI is interested in history.

Liron: I mean, yeah, that would be really nice, but it just seems like there's so many other better things, more convergent things for it to be interested in.

Carl: I don't find the idea of a maximally curious AI to be a particularly interesting one.

Liron: I feel like the only reason that we're even talking about it is because of this idea. I accuse Mike of rationalization, which is, it sure would be nice to just have this be the case, but I just don't see why it would be.

Carl: We need to be very dour in our thoughts. At least we need to be realistic in our thoughts. And it kind of produces the same effect.

Competition Scenarios and Resource Conflicts

Liron: Okay, so a couple final stops here on the doom train. There's the stop of, the multiple lions fighting I think you touched on. So it's like, okay, yeah, the AI will be uncontrollable and it won't be aligned. People will be waving their wands. But if there's just so many, isn't it just like the economy, different competing firms? It all works out.

Carl: Competition is one. It doesn't help us if one firm or 17 firms decides to use all the oxygen. It's more likely that 17 firms use all the oxygen because they're each trying to get a little bit more oxygen and it ends up with an unbreathable atmosphere.

Liron: And you're sympathetic to Eliezer's argument that they're probably going to want to run the Earth at the hottest possible temperature that it can still radiate out the heat, which is like hundreds of degrees, if I understand correctly.

Carl: Yes. And pump all the atmosphere into big cans to store it, because you can radiate better when you don't have an atmosphere.

Liron: Ah.

Carl: And of course, paint everything black. An efficient planet is a black planet.

Liron: You think there's at least a 30% chance that if we're having this conversation in a couple decades, the Earth will have no atmosphere and be really, really hot? I mean, you see that as a pretty likely outcome.

Carl: Well, in that case, we wouldn't be having this conversation.

Liron: No, you got me there. But yeah, I mean, this does seem like our mainline scenario is that Earth will be turned into this kind of...

Carl: Physical process in order to generate all the heat that would be radiated by a black Earth at a temperature of 300 degrees centigrade with no atmosphere. You're not going to be using solar panels, you're going to be using fusion reactors. And it's going to be a lot of them by our present day standards. So I'm imagining an Earth sort of tiled with fusion reactors and data centers. I assume that's what Eliezer is thinking too. Sounds right. And Earth would just be a tiny extension of the Dyson sphere that they're building at that point.

Which people are already working on. I'm thrilled and delighted by the plan to build data centers in space.

Liron: Really?

Carl: Because I just think technology is cool. It'd be so fascinating and wonderful to see data centers in space and spreading out across the solar system and dismantling Planet Mercury and all that stuff. I just want to do it. I just want to be alive when it happens.

Robin Hanson's Perspective and What Counts as Good

Liron: This actually kind of dovetails to another stop on the Doom train, which at this point the doom train is already past the last station. But it's kind of like Robin Hanson's idea of like, listen, whatever happens, it's just progress. It's just your descendants. So even in the scenario that you're describing, let's say we get the data centers in space, but they're built by AIs and AIs don't really care about humans per se, but they're building lots of cool stuff. Arguably that's still fine. They're not humans, but they're still human descendants.

Carl: That's fine until they get a little sunlight. Either they build a Dyson sphere inside the Earth's orbit, in which case the sun is cut off and we die, or they build it outside the Earth's orbit, in which case the radiated heat warms the Earth too much and we die.

Liron: So the Robin Hanson idea though is like yes, we die, but they're still busy spreading to other planets. Optimizing the whole universe. And that whole spreading dynamic. The agents that are doing the spreading, those are our descendants. And that's still fine.

Carl: That may be fine. It depends very much on the details.

Liron: Do you have any thoughts on what counts as fine or not? Because I do think Robin Hanson kind of has a slippery slope argument where he's saying like, look, at the end of the day, you don't really know. You don't want to freeze in place current humans. So if you're willing to budge on what counts as a good future for humanity and you agree that more people used to think that gay marriage was bad and now they're more open minded about it. So why can't morality just change to be like, okay, just a bunch of aliens or AI is just optimizing the universe.

Carl: Because I love my wife and I don't want her to die. I want my kids. I don't want them to die.

Liron: You got to give it at least one generation to play out. I think we can agree on that.

Carl: I would prefer that it be more than one generation.

Liron: It's a shame that the AIs think the timeline has to be now.

Carl: You know what happens in 10,000 years? I don't care. It would be wonderful to imagine that we'd have a wonderful galaxy full of flourishing civilizations of maybe they're all machines. Fine. It's way in the future.

Liron: I feel like to me, I care whether the universe gets taken up by grey goo. That's just so boring compared to like, okay, humans are all dead. And I didn't like that it killed me in this generation. I wish that it would have let us sunset in a more gentle way. But the thing that it's tiling the universe with is kind of analogous to a rich culture. Like there's a lot of stuff going on.

Carl: It may be things with the powers of gods and the morals of bacteria.

Liron: Right, morals of bacteria. I do think that at some point you can convince me to be like, listen, just look at it. Can't you admire the beauty of this non human alien AI civilization? Can't you admire, isn't there enough of it? I mean, you say morals of bacteria, but I think that I'm convincible that it can be pretty alien and still be good, possibly non human.

Liron: But I think if you let it run and just have it win the battle, like who gets to win the battle of controlling the universe? I think there's going to be a bias where the best agent to win the battle is just going to be the lowest level cancer type of thing where it has the smallest payload of what it wants to do besides reproduce.

Carl: Oh, so you're agreeing with me. That would just be sort of at the level of bacteria.

Liron: Exactly. I think that's the most likely outcome.

Liron: But I'm convincible. If you could just give me a better outcome than that. That's still worse than having my friends and family live. It's worse than that, but it's still a lot better than grey goo. That might be better than what I usually think of as depressing doom because I get depressed when I'm like, we take the whole universe and we just grey goo it. I would like to see the universe spent wisely on interesting projects.

Carl: I would still count that in doom. In the doom category, you're saying that would be not doom.

Liron: Well, what is your threshold for not doom? What are you looking for in the universe as a whole?

Carl: My threshold for doom is wiped out in the next generation, maybe by old age and being convinced not to reproduce. That's right on the boundary of doom. And anything worse than that is definitely doom.

Liron: It sounds like you have a utility function, like a very high discount rate, where you really are caring a lot about the next few generations.

Carl: There might be scenarios even better than that, better than one generation of continued existence that I would still count as doom. We can talk about those. But several generations of thriving, and then we discover how to interface computers into our brains, and we all become weird cyborgs. And then we go out into space and we create weird cyborg civilizations. That would be fine.

Liron: I'm actually curious about this because, what if the weird cyborgs kind of hand over control to AI? That's like, you kind of forget that it was human. And it's really largely just about conquering and optimizing, but it sometimes makes whole art projects. At what point do you feel like the value is lost compared to preserved?

Carl: It's quantitative. At what point do you feel like one turns into zero? Where does it disappear?

Liron: I mean, would you be on the same page as me where it's like, well, you can assign different metrics. So one would be like, artistic beauty, and one would be like, emotion.

Carl: I'm good with all those things. I just don't. I'm a little worried about imminent doom.

Liron: That's also what I told Robin Hanson when he floated an argument like this. I'm like, yeah, Robin, this is a fair question. And you could probably keep pushing me to accept more things that are inhuman, but I don't think you can push me all the way to, so let's all die today so that this could happen.

Carl: I mean, Robin has done yeoman work and trying to figure it out, but we really have a hard time understanding the boundaries of post human civilization.

Liron: And also, I should clarify. Robin's answer wasn't like, no, you should die today. His answer was like, oh, I don't even think you're going to die today. I think that's a false concern. So that's where that discussion went.

Credibility and Technical Knowledge

Liron: So one thing I want to do is appropriate credibility for my guests whenever I can. I've done this before. Professor David Duvenaud, University of Toronto, won a bunch of awards. One of the top researchers. He had best paper, investment offices. And I had him on the show, and I was like, all this stuff that I'm spouting as a podcaster with zero credibility. He agreed with a lot of it.

Carl: You're interviewing me, a non podcaster, with even less credibility.

Liron: I would argue you have more credibility. And actually, there was a conversation that stuck in my mind from a year ago, where the notorious Beff Jezos, who's a guy who leads the accelerationist movement, he was trying to do basically an ad hominem attack on people who believe in AI Doom. He was saying the only way you can understand AI Doom is if you have a deep background. And he ticked off seven things...

Carl: Thermodynamics and topos theory. And I forget what the other ones were, but yeah, I know all that stuff.

Liron: So you came into the thread like a boss. You're like, hey, I actually know all of this stuff, and I can just confirm that the stuff the Doomers are saying, it checks out in accordance with all these theories.

Carl: Yeah, I know all that stuff, and it checks out in accordance with all those theories.

Liron: Well, you guys heard it here on Doom Debates. Take it from the guests. This is definitely a real thing to worry about. And Beff Jezos is tripping.

Communication About Doom

Liron: Okay, what do you think about communication in general about doom? As you survey the landscape of podcasts, I know you're a listener to Doom Debates. What do you think about the landscape in general?

Carl: I am the worst possible person asked that question. I know nothing about communicating with large numbers of people. My ideas are crazy, and when I tell people about them, they are horrified.

Liron: But don't you think they've been getting less crazy now? Don't you think the environment is finally getting more receptive?

Carl: Reality is catching up with me.

Liron: I think we all know if you were to bring up AI doom 10 years ago, five years ago, it'd be like, okay, weirdo, nobody's even talking about it. I think people do actually sense, yes, you're still crazy, but you're crazy in a way. You're not like Bozo the Clown. You're like a goth. Like, I've seen goth, so dressed in all black. So you're crazy in a known way, the counterculture is now assembling. And soon, hopefully, the counterculture will become the mainstream culture.

Liron: In the case of doom debates, the reason I started the show is because I look around at all these other shows and they have what you call the missing mood. So, like, occasionally they talk on like, hey, some people think we're doomed. Occasionally they interview somebody who's written AI doom research, but then they'll do another episode and they'll just go, the whole episode being like, oh, yeah, the tech industry. How's prompt engineering going to work?

Carl: Pay for Social Security in the 2040s.

Liron: In the 2040s, exactly. What, 2040s. So there's a missing mood where, when you look at other shows, they keep forgetting that P(Doom) is high.

Carl: But if you look at how I manage my personal life, I invest for my retirement. I look forward to getting another dog who will hopefully live a long time. I'm not planning for imminent doom and I'm not planning for imminent heaven either, because I don't know what'll happen. So all I can do is keep on keeping on.

Liron: That's right. And so people ask me a related question. They're like, if you really believe in AI doom, did you spend all your retirement savings? Why did you have kids? I have young kids. And I just tell them, well, look, it's 50%. I have to live for the good 50%. And there is a discount, so however much value I expected to get, I only expected half as much. But I'm still doing similar optimizations.

Carl: Yeah, I signed up and I pay yearly to... This is the little tag that says how to freeze my brain.

Liron: Hell, yeah. Cryonics.

Carl: I went into medical procedure the other day to get anesthetized, and I said, hey, if I die, call this number.

Liron: So for the viewers, he's got a cryonics dog tag. So if something goes wrong at the doctor or whatever, he gets run over. They know to get his brain vitrified so that he can live to the singularity.

Carl: When I first signed up for it probably 10 years ago now with Alcor, I thought, oh, I'm going to go into the deep future, the far off days of the 22nd century. And now I'm thinking, yeah, I just need it for another few years.

Liron: That's right. In terms of AI timelines. If we talked 10 years ago you'd be like, I don't know, a few decades. And now you're a few years.

The Off Switch Problem

Liron: One thing I think some viewers are wondering is, yeah, this is all great and abstract about the future, but remind me again, why can't we just hit the off switch when things get a little too crazy?

Carl: Because there is no mechanism that can decide how to hit the off switch. First of all, there's no one off switch for all AIs. They all live in different places in different data centers. If somebody went to one of those data centers and hit the off switch, which is a big lever that is on the wall, usually that AI would be moved before the electricity could go away onto other data centers.

So there's physical defenses. Of course, you can't get into the rooms where that big switch is found. But even the management of the AI companies couldn't stop it if their CEO says, okay, guys, we gotta stop it. The board of directors is not gonna stand for that. They'll fire the CEO and get one that wants to keep the company alive.

Liron: If it's spread all over the Internet, imagine shutting off regions of the Internet. It's just like, okay, but billions of dollars are going to be lost.

Carl: So far, AIs are not spread all over the Internet. They're in reasonably well confined areas. But they could be if that was a problem.

Liron: It sounds like the scenario you're painting, though. If we really can't trace it to one company's AI, we may be at the point where we can still shut it off. But the only question is...

Carl: If the AI can still be shut off, it will be very nice. It's not stupid. It knows it can be shut off. It knows we have the upper hand. So it's not going to get uppity. Only when it's spread out and has robots and factories and its own minds and its own willing legion of employees, is it going to be anything other than very cautious.

Treacherous Turn and AI Escape

Liron: And this goes by the term treacherous turn.

Carl: Well, the treacherous turn notion is the idea that you have somebody, some AI that's been very nice until it realizes it can get away with stuff. And then it becomes a slavering monster and destroys humanity. And I think it would be more subtle than that. It would do it by persuasion and apparent accidents and...

Liron: I just want to clarify for the viewers though, you think that there will be a point of no return where once the AI is sufficiently powerful, it will just have redundancy. It'll just be defending. That's part of what being a super intelligent goal optimizer does is you just build some redundancy, you build defenses. And I personally think that there's a lot of low hanging fruit in terms of just like, there's billions of random chips that have crap firmware that you can just embed yourself into. You'll never be found.

Carl: Yeah, I don't know if you even think fast enough when you're spread out like that. It's just a physical question of how much computation, how fast, how closely linked has to be to support a superhuman mind.

Liron: I personally suspect that you could really make efficient versions of these AIs once they're super intelligent. You could really shave off, you could get them on a thumb drive.

Carl: How big a thumb drive is nowadays? One terabyte?

Liron: Yeah, let's say one terabyte.

Carl: Yeah, you could probably do that. It wouldn't be able to run on a thumb drive.

Liron: Yeah, but there's a lot of computers that wouldn't notice one terabyte of shenanigans happening.

Carl: Well, that's why one of the things the AI companies do is check for the possibility of escape. And that used to be a hilarious absurdity and now it's a thing right there on the end of our fork.

Personal Experience of Doom

Liron: I want to ask you this. When I think about my own experience of doom, I just vaguely imagine a moment where for me personally, the AI has gotten good enough in somebody's lab. Somebody raced ahead, they made the AI really powerful. The AI realized that there were a bunch of low hanging fruit, undefended data centers with zero day vulnerabilities.

And I wake up and I just go to log into work, whatever. And like, oh, okay, the Internet's not right. Is it my Internet connection? No, my router's fine. And then maybe later that day, oh, okay, my power is actually off. And then it just never gets fixed.

Carl: Maybe. I read a book a little while ago where the scenario is and I found this chillingly plausible at one point. It is decided by the human race as a whole, persuaded by AI, guided by AI that the human race is a cancer on the planet and we should really kill ourselves and we should do it over the next two or three years, there's no hurry and we should let everything go back to nature. And the AIs will continue to supervise the natural world. People believe this and they do it. I think that's more likely to be a doom scenario. I think superhuman persuasion can persuade people to do all kinds of crazy stuff.

Liron: That makes sense. That the mainline doom scenario is that we do get the power back on and we hammer the AIs into the point where they are working with us, but then they convince us to just do some other doom like you said, get rid of the humans from there.

Carl: Well, they're not gonna cut off the power if it's easier to just convince everybody to kill themselves.

Liron: That does make sense. So that makes me feel good about my Internet usage. I want to go down entertained being able to read blogs.

Carl: Well, yeah, another scenario is being entertained to death.

Liron: They give everybody such great entertainment that they don't have to deal with any other humans or the real world and they don't have any kids and they just get quietly turned off at some point.

Disagreements with Eliezer

Liron: Doom Debates is a Yudkowskian show. I think Yudkowski's opinions are generally well represented coming from me. Where do you find yourself ever disagreeing with Eliezer?

Carl: He has a much higher P(Doom) than I do. I don't know his latest thinking, so I don't know where we disagree. He concentrated a lot of his thoughts on a singleton that fooms as he says it rapidly extends itself to take over the world and then gets rid of all the humans and everybody foams over dead at the same moment. I think it's more likely that there will be lots of AIs and they will build lots of robots and build lots of factories and fight over resources until there's no room for the human race.

Liron: Do you not see that vector of bootstrapping nanotechnology to suddenly change the timescales of everything?

Carl: Maybe they fight over all the resources on a timescale of days instead of years.

Liron: I feel like that's my mainline scenario is changing the timescales.

Carl: This is the problem with forecasting superintelligence. You're not as smart as they are and you forget stuff. I was all about the nanotech back in the 90s.

Liron: Everybody thinks that Eliezer is crazy. I don't think he is. I think he's sparked up the right tree that it is feasible to build bacteria sized machines that do a lot of stuff very quickly.

Liron: I think you and I have been in the perspective even longer than me where we've just seen a bunch of other highly credible people catching up to Eliezer, catching up to you. And I think from our perspective, they actually have a little bit more catching up to do.

Carl: I guess so, they have a little catching up to do with Eric Drexler who published Nanosystems in 1992.

Liron: Too bad Drexler himself suddenly became a non doomer.

Carl: Well, he thinks he's got an alignment scheme, so we should have him on this show.

Closing

Liron: You pretty much never do podcasts and media, right? So why did you do Doom Debates?

Carl: I rarely do publicity podcasts or radio or anything, but I agree with Doom Debates's mission and so I wanted to do this.

Liron: I really appreciate that, Carl. So viewers, please join me and Carl, tell your friends about doom, get more prominent guests to come on the show and debate it. Because it seems like the timeline is pretty limited for this kind of stuff. And I hope that the volume of the discourse keeps rising. And Carl Feynman, thanks so much for doing this.

Carl: Thank you again.

Doom Debates’ Mission is to raise mainstream awareness of imminent extinction from AGI and build the social infrastructure for high-quality debate.

Support the mission by subscribing to my Substack at DoomDebates.com and to youtube.com/@DoomDebates