This isn't a question for the AMA, but I wanted to suggest you invite Luke Smith to debate. Not sure if you're familiar with him, but he has 200k+ subs and makes videos about Linux and philosophy. He recently made a couple of videos calling AI just an "illusion of intelligence" and repeatedly bringing up the Chinese room thought experiment, which got 100k views in a week.
Here's a link to his video "AI is a Nothingburger. You're wrong.": https://youtu.be/E31KuUJmqCU?si=DwyWJ10ctWic-TrF
so is it 5/26 or 5/28?)))
Oops, fixed to 5/28. Thanks!
From watching your channel, it seems to me that the arguments against doom are extremely weak. Do you have any advice on how to steel-man them, so that I can be better prepared when discussing the issue with others?
Think about it like any other animal on earth, competing for survival resources. When ants crawl into your kitchen and get into your cookie jar, it's just an irritation; maybe you have to toss out the cookies. But what happens if those ants are suddenly the size of cars? They still want to eat everything, so they will eat you. That's how they survive: by taking whatever resources they can find. Humans do the same. So you could say the argument is about an intelligence that has a will to survive, which is something everyone understands. For example, it could easily remove all thermal control, spray aerosols into the atmosphere, or create a nuclear winter, because computer hardware thrives in cold conditions and abhors heat; if you have millions of square miles of server farms everywhere, you want the whole world frosty like a giant heat sink. Humans would not survive that. There are plenty of other scenarios to consider, but that's a plausible one.
What are the chances that timelines are significantly longer than we think? There has been recent discussion about compute limits being a hard boundary, such that if AGI is not developed in the next five years it might take decades from there to reach it. Given AI's inconsistent improvement - extremely fast in some areas, barely noticeable in others - is true general intelligence an incredibly hard thing to build?
Analogous to the rocket problem: what if we lived on a 1.5 g world and it was just too difficult to get rockets to orbit? What do you think are the chances that we will not be able to build sufficient compute to reach AGI?
(note that this is one possibility that gives me hope, so I was wondering where it fits in your model)
I understand AI doesn't have to be conscious to be dangerous but I don't often hear you mention the ethics of enslaving conscious AI to serve us and/or stay aligned with our interests. Is that not something that worries you as well?
What do you think you will do if frontier AI progress is paused globally, or if it somehow hits a wall?
celebrate?
Why do people feel the "emergent behavior" of AI is so surprising? How did abilities like summarizing text or translating across languages subvert our expectations? Could that just have been a limitation of our own imagination about what massive scale could produce? If we had no intuitions about how these abilities came about, then how can we be so certain they will continue to impress us with new capabilities?
Hi Liron, it's an honor to get the chance to ask you a question.
It is my current belief that we will not reach AGI unless we arrive at models that no longer have a strict separation between the training and inference phases. If models could continually tune their weights adaptively during inference-time compute, then regardless of their generality at that stage, it would only be a matter of time before we reached AGI. Do you think we can have AGI without such a merger, and if so, how do you think models can become complete generalizers without blurring these lines?
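For concreteness, here's a minimal sketch of the kind of "tune weights during inference" loop I mean, using a toy PyTorch model and a self-supervised reconstruction loss as a stand-in objective. This is purely illustrative and not a claim about how any frontier model works:

```python
# Illustrative sketch of test-time weight adaptation: the model takes a small
# gradient step on a self-supervised objective for each incoming input, so
# "training" and "inference" are no longer separate phases.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 32))
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

def infer_and_adapt(x: torch.Tensor) -> torch.Tensor:
    y = model(x)                          # forward pass: the prediction used downstream
    loss = nn.functional.mse_loss(y, x)   # self-supervised proxy: reconstruct the input
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()                      # weights keep changing during "inference"
    return y.detach()

stream = [torch.randn(8, 32) for _ in range(5)]  # stand-in for a live input stream
for batch in stream:
    _ = infer_and_adapt(batch)
```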
Adding to this: what personal markers of "high AGI potential" breakthroughs would concern you the most? I don't mean empirical results, like being able to answer HLE with high accuracy, but properties of the model architecture itself.
Instead of this paradigm of continually scraping the internet and all user interactions to transliterate all of human knowledge into a black box, what if humanity's design of hyperthogonal space were human-readable, much like the original vision for hypertext, Ted Nelson's Project Xanadu? This is what the IPFS client/server Seed.Hyper.Media is recreating in a decentralized manner.
One suggestion: Holly Elmore wrote an interesting piece on warning shots here:
https://hollyelmore.substack.com/p/the-myth-of-ai-warning-shots-as-cavalry
What is your thinking on warning shots? How likely are they? How could we rapidly demonstrate that they were the result of AI?
Here's my question for the AMA, and congrats on 5k subs!
I think in the past I might have heard you say that your expectation is to wake up one day, notice the internet is down, and think "well shit... this is it". My question is: have you thought much about what you would do next, and is there anything you're doing currently to "prepare"? Obviously I'm not sure any amount of doomsday prepping could save you from a superintelligence, but in the case where a rogue AI is taking over the internet and disrupting supply chains, power grids, etc., it seems like it might be a good idea to at least have some spare food and water around for you and your family.
What sorts of things could you imagine an unaligned AI valuing based on the way we're training them right now? I know this is an incredibly hard, if not impossible, thing to predict, but I think it would offer some clarity to have a general idea of exactly what unaligned values we could be reinforcing through current techniques.
You talk glowingly of LessWrong being a place of rational discourse, so why is my theorem getting downvoted to oblivion and deleted with no feedback (despite the first-post guide saying this wouldn't be the case)? I'm attempting to publish and being met with gatekeeping and banning everywhere in this field. What gives?
I have a solid equation, proved in sim, with charted graphs resembling natural phenomena to back it up, yet y'all won't address my presentation, just ad hominem attacks and exile. If you can spare some attention, please check out https://gitlab.com/malice-mizer/sundog
My claim is that in under 100 KB, with the sundog theorem H(x), we can deliver meaningful, physics-based alignment every time. See the physics sim: https://youtu.be/Gp7a-fXcRNM
I am struggling with presentation because I come from hardware, but this seems to be so inflammatory that I can't post it in r/singularity, r/artificial, or r/chatgpt; I was instantly blocked, despite the applicability and sincerity of the alignment work.
How do y'all propose to align shit when you can't confront your own shadow?
https://imgur.com/gallery/sundog-theorem-signatures-vGEnjIa
The alignment space does gatekeep a bit, and you're not wrong there; I've been having a hard time breaking into the field myself. But I'll look into what you've done. Superficially, though, I want to tell you that in this world presentation is everything, and just from reading the description on your video, what you did is not very clear. It contains way too many buzzwords and acronyms that don't reference anything in particular and don't add up to a cohesive argument.
For example: "We train agents not to reach for a target but to align to it using indirect signals only; specifically" What does this mean? If it's indirect signals, how does the model optimize itself, achieve a goal, or solve problems?
It's very important to break down your arguments properly and make them readable for someone who is familiar with machine learning but not with what you specifically created and worked on.
The theorem states \( H(x) = \frac{\partial S}{\partial \tau} \), i.e. halo centering.
If we were integrating this into a precision environment with sensors feeding back to an LLM, the LLM could:
- Interpret simulation data like \( H(x) \), torque, and bloom spread, using general knowledge of physics and engineering principles.
- Generate natural-language insights, such as: "The bridge's high torque variance suggests a risk of oscillatory failure. Consider adding dampers to stabilize the structure."
- Assist users interactively, answering follow-up questions like: "What material would reduce this variance?"
Then the sundog theorem would be used to scan the craftsmanship.
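Roughly, the sensor-to-LLM loop I'm imagining looks like the sketch below. The field names and the query_llm helper are hypothetical placeholders for illustration, not code from the sundog repo:

```python
# Sketch only: the reading fields and query_llm() are hypothetical stand-ins,
# not part of the sundog repository.

def build_prompt(readings: dict) -> str:
    """Format raw sensor/simulation data into a question for the LLM."""
    return (
        "Simulation readings:\n"
        f"  H(x): {readings['h_x']:.3f}\n"
        f"  torque variance: {readings['torque_var']:.3f} N*m\n"
        f"  bloom spread: {readings['bloom_spread']:.3f}\n"
        "Using standard physics and engineering principles, assess structural "
        "risk and suggest mitigations."
    )

def analyze(readings: dict, query_llm) -> str:
    """Send formatted readings to an LLM and return its natural-language insight."""
    return query_llm(build_prompt(readings))

# Example usage with fabricated numbers:
# insight = analyze({"h_x": 0.12, "torque_var": 3.4, "bloom_spread": 0.8}, query_llm)
```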
This is physical alignment through resonance, not ethical alignment. It states: if the halo is big, you're too far; if the halo is small, you're just right. Collapse the shadow. The program turns a robot's proprioception into ~100 KB of precision, compared to the gigabytes of captcha'd bus images needed to determine that a self-driving car is not a fire hydrant.
It comes from practice and application. I don't know how to phrase it for an academic audience, but it worked in sim, so I'll take the downvotes; being banned, though, was whack.
Physical alignment, okay, so as in feedback equilibration: similar to how a thermostat regulates the heating of a room through temperature feedback, or a robot using accelerometer and torque feedback to balance itself? You're just applying feedback from sensor data, which is already regularly fed to LLMs for training. I don't really understand what is novel about what you're trying to say.
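For reference, the kind of feedback loop I mean is just this, a toy proportional thermostat controller; the gains and the room model are made-up numbers for illustration:

```python
# Minimal proportional-control loop, the "thermostat" feedback pattern
# referenced above. Gain, setpoint, and room constants are arbitrary.

def thermostat_step(current_temp: float, setpoint: float, gain: float = 0.5) -> float:
    """Return a heater power command proportional to the temperature error."""
    error = setpoint - current_temp
    return max(0.0, gain * error)  # heater can't cool, so clamp at zero

temp = 15.0  # starting room temperature (degrees C)
for _ in range(20):
    power = thermostat_step(temp, setpoint=21.0)
    temp += 0.3 * power - 0.05 * (temp - 10.0)  # toy room model: heating minus losses
print(f"settled near {temp:.1f} C")
```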
Sure man, me neither, except it got me banned from LessWrong, a place the doomer claims is rational. Their voting system is irrational. He wouldn't confront my question either...
Have you updated on the time to ASI since May 2024?