
(Part 1/2) Is power-seeking AI an existential risk? by Joseph Carlsmith

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio.
This is part one of “Is power-seeking AI an existential risk?”, published by Joseph Carlsmith.
1. Introduction
Some worry that the development of advanced artificial intelligence will result in existential catastrophe -- that is, the destruction of humanity’s long-term potential. Here I examine the following version of this worry (it’s not the only version):
By 2070:
1. It will become possible and financially feasible to build AI systems with the following properties:
Advanced capability: they outperform the best humans on some set of tasks which, when performed at advanced levels, grant significant power in today’s world (tasks like scientific research, business/military/political strategy, engineering, and persuasion/manipulation).
Agentic planning: they make and execute plans, in pursuit of objectives, on the basis of models of the world.
Strategic awareness: the models they use in making plans represent with reasonable accuracy the causal upshot of gaining and maintaining power over humans and the real-world environment.
(Call these “APS” -- Advanced, Planning, Strategically aware -- systems.)
2. There will be strong incentives to build and deploy APS systems | (1).
3. It will be much harder to build APS systems that would not seek to gain and maintain power in unintended ways (because of problems with their objectives) on any of the inputs they’d encounter if deployed, than to build APS systems that would do this (even if decision-makers don’t know it), but which are at least superficially attractive to deploy anyway | (1)-(2).
4. Some deployed APS systems will be exposed to inputs where they seek power in unintended and high-impact ways (say, collectively causing more than $1 trillion of damage), because of problems with their objectives | (1)-(3).
5. Some of this power-seeking will scale (in aggregate) to the point of permanently disempowering ~all of humanity | (1)-(4).
6. This disempowerment will constitute an existential catastrophe | (1)-(5).
(Here “| (1)” can be read as “conditional on (1)”: each claim is assessed on the assumption that the preceding ones hold.)
These claims are extremely important if true. My aim is to investigate them. I assume for the sake of argument that (1) is true (I currently assign this >40% probability). I then examine (2)-(5), and say a few words about (6).
My current view is that there is a small but substantive chance that a scenario along these lines occurs, and that many people alive today -- including myself -- live to see humanity permanently disempowered by artificial systems. In the final section, I take an initial stab at quantifying this risk, by assigning rough probabilities to (1)-(6). My current, highly unstable, subjective estimate is that there is a ~5% chance of existential catastrophe by 2070 from scenarios in which (1)-(6) are true. My main hope, though, is not to push for a specific number, but rather to lay out the arguments in a way that can facilitate productive debate.
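To make the arithmetic of this kind of estimate concrete, here is a minimal sketch in Python of the chain-rule decomposition involved: the overall probability is the product of the probability of (1) and the probability of each subsequent claim conditional on its predecessors. The specific numbers below are illustrative placeholders chosen to land near ~5%, not Carlsmith’s actual assignments.

```python
# A minimal sketch (not Carlsmith's own calculation) of how an overall
# estimate decomposes: multiply the probability of (1) by the probability
# of each later claim conditional on all the earlier ones.
# All numbers below are illustrative placeholders only.
conditional_probs = {
    "(1) APS systems possible and financially feasible by 2070": 0.65,
    "(2) strong incentives to build and deploy | (1)":           0.80,
    "(3) misaligned systems much easier to build | (1)-(2)":     0.40,
    "(4) high-impact unintended power-seeking | (1)-(3)":        0.65,
    "(5) permanent disempowerment of ~all humanity | (1)-(4)":   0.40,
    "(6) disempowerment is an existential catastrophe | (1)-(5)": 0.95,
}

risk = 1.0
for claim, p in conditional_probs.items():
    risk *= p  # chain rule: running product of conditional probabilities
    print(f"{p:.0%}  {claim}  ->  cumulative {risk:.1%}")

print(f"Overall: ~{risk:.0%} chance of existential catastrophe by 2070")
```

With these placeholder values the running product comes out to roughly 5%, which illustrates why each individual conditional probability can be well above one half while the conjunction remains small.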
Acknowledgments: Thanks to Asya Bergal, Alexander Berger, Paul Christiano, Ajeya Cotra, Tom Davidson, Daniel Dewey, Owain Evans, Ben Garfinkel, Katja Grace, Jacob Hilton, Evan Hubinger, Jared Kaplan, Holden Karnofsky, Sam McCandlish, Luke Muehlhauser, Richard Ngo, David Roodman, Rohin Shah, Carl Shulman, Nate Soares, Jacob Steinhardt, and Eliezer Yudkowsky for input on earlier stages of this project; and thanks to Nick Beckstead for guidance and support throughout the investigation. The views expressed here are my own.
1.1 Preliminaries
Some preliminaries and caveats (those eager for the main content can skip):
I’m focused, here, on a very specific type of worry. There are lots of other ways to be worried about AI -- and even about existential catastrophes resulting from AI. And there are lots of ways to be excited about AI, too.
My emphasis and approach differ from those of others in the literature in various ways. In particular: I’m less focused than some on the possibility of an extre...