4 Comments
Peter A. Jensen

Excellent session!

Kevin Flynn

Unlike Geoffrey, I think alignment is amazingly simple.

In an interview I watched, Demis Hassabis said that certain patterns are inherent in the universe, such as the orbit of planets around stars or the shape of comets, and that AI has been able to recognize these patterns. He postulated that this is why Veo 3 is able to intuitively generate amazingly realistic videos of liquid dynamics all on its own, without human prompting or intervention: it recognizes and can anticipate such patterns.

It’s plausible that AI alignment is similar: what’s required is for the AI to recognize a specific pattern and align itself to that pattern. That pattern is specific and relatively simple, and it leads to an understanding.

The alignment problem is pretty simple. If alignment is achieved, it sets up a win-win situation for humans and for all other living things on Earth.

Kevin Flynn

I’ve developed a specific set of values that AI systems should be trained with to achieve alignment. While theoretical, if AI were to adhere to these values in the precise order I’ve outlined, this would likely represent our best chance at achieving true AI alignment.

To illustrate this concept, consider that even though individuals cannot agree on first principles, this doesn’t mean there isn’t an optimal path forward. There exists a best way forward for humanity as a species, just as there exists a best way forward for aligned AI. However, there are also many possible paths that would either be self-defeating for humans or acceptable for AI but fundamentally unaligned with human values.

Given this landscape, the likelihood of achieving AI alignment is low. The only way alignment could succeed is through deliberate, coordinated effort to establish it from the outset. But this presents a critical problem: such an endeavor would require a level of global cooperation that humanity has never demonstrated and likely cannot achieve.

While possible in theory, success appears unlikely in practice. This is why we’re probably headed toward failure.

Kevin Flynn

Yeah, so what I’ve come up with is a specific set of values for AI to be trained with so that it’s aligned.

It’s theoretical, but if AI were to adhere to the set of values I’ve laid out, in the order in which I’ve laid them out, it would most likely prove to be our best shot at attaining a situation in which AI is aligned.

One can use several analogies to get this point across, but one way to view it is this: the fact that individuals cannot agree on first principles doesn’t mean there isn’t a best way forward.

So there is a best way forward for the human race as a species, and there is a best way forward for AI if it is aligned. But there are also a lot of ways forward for both that are either self-defeating, in the case of humans, or fine for AI but unaligned.

So AI being aligned is not likely.

The only way it could be aligned is if we deliberately set it up to be aligned.

But the problem with that is that, among other things, it would require a level of cooperation that we have not demonstrated and most likely cannot achieve.

It’s possible but not likely.

And so that is why we’re most likely doomed.

But as the old saying goes, “all you can do is all you can do, and that’s all you can do.”

So if we were smart, we’d give it the old college try, but so far it doesn’t seem like we’ll be able to pull it off.
