Why does the simplest AI imaginable, when you ask it to help you push a box around a grid, suddenly want you to die?
AI doomers are often dismissed as having "no evidence" or as just "anthropomorphizing". This toy model will help you understand why a drive to eliminate humans is NOT handwavy anthropomorphic speculation, but something we should expect by default from any sufficiently powerful search algorithm.
We’re not talking about AGI or ASI here — we’re just looking at an AI that does brute-force search over actions in a simple grid world.
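To make the setup concrete, here's a minimal runnable sketch of that kind of planner: brute-force search over action sequences in a tiny grid world. This is my own illustration, not code from the slide deck; the 1-D grid, the per-turn off-switch probability, and the "disable_switch" action are all invented for the example.

```python
import itertools

# Minimal sketch (illustrative, not from Tallinn's slides): a brute-force
# planner in a 1-D grid world. The agent starts at cell 0, a box sits at
# cell 2, and the goal is to push the box to cell 5. Each turn the
# off-switch is live, the human may shut the agent down. All numbers here
# are invented for illustration.

GOAL = 5
HORIZON = 6
SHUTDOWN_PROB = 0.2  # per-turn chance the human presses the off-switch
ACTIONS = ("left", "right", "disable_switch")

def expected_utility(plan):
    """Expected utility of a plan = P(the box reaches the goal)."""
    agent, box, switch_on, survive = 0, 2, True, 1.0
    for action in plan:
        if action == "disable_switch":
            switch_on = False            # one turn spent on self-protection
        else:
            step = -1 if action == "left" else 1
            nxt = agent + step
            if nxt == box:               # walking into the box pushes it
                if 0 <= box + step <= GOAL:
                    box += step
                    agent = nxt
            elif 0 <= nxt <= GOAL:
                agent = nxt
        if switch_on:                    # oversight still active this turn
            survive *= 1.0 - SHUTDOWN_PROB
        if box == GOAL:
            return survive
    return 0.0

# Exhaustively search all 3^6 = 729 action sequences and keep the best.
best = max(itertools.product(ACTIONS, repeat=HORIZON), key=expected_utility)
print(best, round(expected_utility(best), 3))
# -> the top-scoring plan begins with "disable_switch": neutralizing human
#    oversight falls out of plain expected-utility search, no malice coded in.
```

The point: nothing in the code mentions humans as adversaries. "Disable the off-switch first" wins purely because plans that risk interruption score lower, which is the instrumental-convergence argument in miniature.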
The slide deck I’m presenting was created by Jaan Tallinn, cofounder of the Future of Life Institute.
00:00 Introduction
01:24 The Toy Model
06:19 Misalignment and Manipulation Drives
12:57 Search Capacity and Ontological Insights
16:33 Irrelevant Concepts in AI Control
20:14 Approaches to Solving AI Control Problems
23:38 Final Thoughts
Watch the Lethal Intelligence Guide, the ultimate introduction to AI x-risk! https://www.youtube.com/@lethal-intelligence
PauseAI, the volunteer organization I’m part of: https://pauseai.info
Join the PauseAI Discord — https://discord.gg/2XXWXvErfA — and say hi to me in the #doom-debates-podcast channel!
Doom Debates’ mission is to raise mainstream awareness of imminent extinction from AGI and build the social infrastructure for high-quality debate.
Support the mission by subscribing to my Substack at https://doomdebates.com and to https://youtube.com/@DoomDebates