This report examines what I see as the core argument for concern about existential risk from misaligned artificial intelligence. I proceed in two stages. First, I lay out a backdrop picture that informs such concern. On this picture, intelligent agency is an extremely powerful force, and creating agents much more intelligent than us is playing with fire -- especially given that if their objectives are problematic, such agents would plausibly have instrumental incentives to seek power over humans. Second, I formulate and evaluate a more specific six-premise argument that creating agents of this kind will lead to existential catastrophe by 2070. On this argument, by 2070: (1) it will become possible and financially feasible to build relevantly powerful and agentic AI systems; (2) there will be strong incentives to do so; (3) it will be much harder to build aligned (and relevantly powerful/agentic) AI systems than to build misaligned (and relevantly powerful/agentic) AI systems that are still superficially attractive to deploy; (4) some such misaligned systems will seek power over humans in high-impact ways; (5) this problem will scale to the full disempowerment of humanity; and (6) such disempowerment will constitute an existential catastrophe. I assign rough subjective credences to the premises in this argument, and I end up with an overall estimate of ~5% that an existential catastrophe of this kind will occur by 2070. (May 2022 update: since making this report public in April 2021, my estimate here has gone up, and is now at >10%.)
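To make the structure of the estimate concrete, here is a minimal sketch of how six conditional premise credences combine multiplicatively into an overall probability. The numbers below are illustrative placeholders roughly in the spirit of the report's 2021 figures, not exact values quoted from the text.

```python
# Illustrative sketch: combining conditional premise credences into an overall estimate.
# Credences are placeholders, roughly in the spirit of the report's 2021 figures.
premises = {
    "1. powerful, agentic AI is possible and financially feasible by 2070": 0.65,
    "2. strong incentives to build such systems":                            0.80,
    "3. alignment is much harder than deploying attractive misaligned AI":   0.40,
    "4. some misaligned systems seek power in high-impact ways":             0.65,
    "5. the problem scales to full human disempowerment":                    0.40,
    "6. such disempowerment constitutes an existential catastrophe":         0.95,
}

p_catastrophe = 1.0
for premise, credence in premises.items():
    p_catastrophe *= credence  # each credence is conditional on the previous premises holding

print(f"overall credence: {p_catastrophe:.1%}")  # ~5%
```

Because the premises are chained, a single low credence caps the overall estimate, which is why modest revisions to individual premises can move the bottom line noticeably.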
Original article:
https://arxiv.org/abs/2206.13353
Narrated for Joseph Carlsmith by TYPE III AUDIO.
This is a linkpost for https://epochai.org/blog/literature-review-of-transformative-artificial-intelligence-timelines
We summarize and compare several models and forecasts predicting when transformative AI will be developed.
Highlights
- The review includes quantitative models, covering both outside-view and inside-view approaches, as well as judgment-based forecasts by (teams of) experts.
- While we do not necessarily endorse their conclusions, the inside-view model the Epoch team found most compelling was Ajeya Cotra’s “Forecasting TAI with biological anchors”, the best-rated outside-view model was Tom Davidson’s “Semi-informative priors over AI timelines”, and the best-rated judgment-based forecast was Samotsvety’s AGI Timelines Forecast.
- The inside-view models we reviewed predict shorter timelines (e.g. bio-anchors has a median of 2052), while the outside-view models predict longer timelines (e.g. semi-informative priors has a median beyond 2100). The judgment-based forecasts skew towards agreement with the inside-view models and are often more aggressive (e.g. Samotsvety assigned a median of 2043).
Original article:
https://epochai.org/blog/literature-review-of-transformative-artificial-intelligence-timelines
Narrated for the Effective Altruism Forum by TYPE III AUDIO.
In this post, we point out that short AI timelines would cause real interest rates to be high, and would do so under expectations of either unaligned or aligned AI. However, 30- to 50-year real interest rates are low. We argue that this suggests one of two possibilities:
- Long(er) timelines. Financial markets are often highly effective information aggregators (the “efficient market hypothesis”), and therefore real interest rates accurately reflect that transformative AI is unlikely to be developed in the next 30-50 years.
- Market inefficiency. Markets are radically underestimating how soon advanced AI technology will be developed, and real interest rates are therefore too low. There is thus an opportunity for philanthropists to borrow while real rates are low to cheaply do good today; and/or an opportunity for anyone to earn excess returns by betting that real rates will rise.
In the rest of this post we flesh out this argument.
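As a rough illustration of the mechanism, here is a minimal sketch using the standard Ramsey rule, r ≈ δ + γ·g, with hypothetical parameter values chosen only for illustration (the post itself develops the argument in much more detail).

```python
# Minimal Ramsey-rule sketch: higher expected consumption growth implies higher real rates.
# All parameter values are hypothetical and chosen only for illustration.
delta = 0.01      # pure rate of time preference
gamma = 1.5       # coefficient of relative risk aversion
g_normal = 0.02   # expected real growth, business as usual (assumed)
g_tai = 0.30      # expected real growth if transformative AI arrives soon (assumed)

r_normal = delta + gamma * g_normal
r_tai = delta + gamma * g_tai

# An unaligned-AI scenario would push rates up through a different channel:
# expected catastrophe reduces the value of saving, acting like an extra discount term.
print(f"real rate under business-as-usual expectations: {r_normal:.0%}")   # ~4%
print(f"real rate under short-timelines expectations:   {r_tai:.0%}")      # ~46%
```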
Original article:
https://forum.effectivealtruism.org/posts/8c7LycgtkypkgYjZx/agi-and-the-emh-markets-are-not-expecting-aligned-or
Narrated for the Effective Altruism Forum by TYPE III AUDIO.
This post is inspired by What 2026 looks like and an AI vignette workshop guided by Tamay Besiroglu. I think of this post as “what would I expect the world to look like if these timelines (median compute for transformative AI ~2036) were true” or “what short-to-medium timelines feel like” since I find it hard to translate a statement like “median TAI year is 20XX” into a coherent imaginable world.
I expect some readers to think that the post sounds wild and crazy, but that doesn't mean its content couldn't be true. If you had told someone in 1990 or 2000 that there would be more smartphones and computers than humans in 2020, that probably would have sounded wild to them. The same could be true for AIs, i.e. that in 2050 there are more human-level AIs than humans. The fact that this sounds as ridiculous as ubiquitous smartphones sounded to the 1990/2000 person might just mean that we are bad at predicting exponential growth and disruptive technology.
Original article:
https://www.lesswrong.com/posts/qRtD4WqKRYEtT5pi3/the-next-decades-might-be-wild
Narrated for LessWrong by TYPE III AUDIO.
In this note I’ll summarize the bio-anchors report, describe my initial reactions to it, and take a closer look at two disagreements that I have with background assumptions used by (readers of) the report.
This report attempts to forecast the year in which the compute required to train a transformative AI (TAI) model will first become available: the year when a forecast of the compute required to train TAI intersects a forecast of the compute available for a single project's training run.
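A minimal sketch of that intersection logic, using made-up illustrative growth curves rather than the report's actual forecasts:

```python
# Toy version of the bio-anchors intersection: find the first year in which
# affordable training compute meets the (declining) compute required for TAI.
# Both curves below are illustrative placeholders, not the report's estimates.

def compute_required(year: int) -> float:
    """FLOP needed to train TAI; assume requirements halve every ~2.5 years from a 2025 anchor."""
    return 1e36 * 0.5 ** ((year - 2025) / 2.5)

def compute_available(year: int) -> float:
    """FLOP affordable for the largest single training run; assume doubling every ~2 years."""
    return 1e27 * 2 ** ((year - 2025) / 2.0)

tai_year = next(y for y in range(2025, 2101) if compute_available(y) >= compute_required(y))
print(tai_year)  # forecasted TAI year under these toy assumptions
```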
Original article:
https://docs.google.com/document/d/1_GqOrCo29qKly1z48-mR86IV7TUDfzaEXxD3lGFQ8Wk/edit#
Narrated for the Effective Altruism Forum by TYPE III AUDIO.
https://www.lesswrong.com/posts/6Xgy6CAf2jqHhynHL/what-2026-looks-like#2022
This was written for the Vignettes Workshop.[1] The goal is to write out a detailed future history (“trajectory”) that is as realistic (to me) as I can currently manage, i.e. I’m not aware of any alternative trajectory that is similarly detailed and clearly more plausible to me. The methodology is roughly: Write a future history of 2022. Condition on it, and write a future history of 2023. Repeat for 2024, 2025, etc. (I'm posting 2022-2026 now so I can get feedback that will help me write 2027+. I intend to keep writing until the story reaches singularity/extinction/utopia/etc.)
What’s the point of doing this? Well, there are a couple of reasons:
- Sometimes attempting to write down a concrete example causes you to learn things, e.g. that a possibility is more or less plausible than you thought.
- Most serious conversation about the future takes place at a high level of abstraction, talking about e.g. GDP acceleration, timelines until TAI is affordable, multipolar vs. unipolar takeoff… vignettes are a neglected complementary approach worth exploring.
- Most stories are written backwards. The author begins with some idea of how it will end, and arranges the story to achieve that ending. Reality, by contrast, proceeds from past to future. It isn’t trying to entertain anyone or prove a point in an argument.
- Anecdotally, various people seem to have found Paul Christiano’s “tales of doom” stories helpful, and relative to typical discussions those stories are quite close to what we want. (I still think a bit more detail would be good — e.g. Paul’s stories don’t give dates, or durations, or any numbers at all really.)[2]
- “I want someone to ... write a trajectory for how AI goes down, that is really specific about what the world GDP is in every one of the years from now until insane intelligence explosion. And just write down what the world is like in each of those years because I don't know how to write an internally consistent, plausible trajectory. I don't know how to write even one of those for anything except a ridiculously fast takeoff.” --Buck Shlegeris
https://www.lesswrong.com/posts/K4urTDkBbtNuLivJx/why-i-think-strong-general-ai-is-coming-soon
I think there is little time left before someone builds AGI (median ~2030). Once upon a time, I didn't think this.
This post attempts to walk through some of the observations and insights that collapsed my estimates.
The core ideas are as follows:
- We've already captured way too much of intelligence with way too little effort.
- Everything points towards us capturing way more of intelligence with very little additional effort.
- Trying to create a self-consistent worldview that handles all available evidence seems to force very weird conclusions.
Some notes up front:
- I wrote this post in response to the Future Fund's AI Worldview Prize. Financial incentives work, apparently! I wrote it with a slightly wider audience in mind and supply some background for people who aren't quite as familiar with the standard arguments.
- I make a few predictions in this post. Unless otherwise noted, the predictions and their associated probabilities should be assumed to be conditioned on "the world remains at least remotely normal for the term of the prediction; the gameboard remains unflipped."
- For the purposes of this post, when I use the term AGI, I mean the kind of AI with sufficient capability to make it a genuine threat to humanity's future or survival if it is misused or misaligned. This is slightly more strict than the definition in the Future Fund post, but I expect the difference between the two definitions to be small chronologically.
- For the purposes of this post, when I refer to "intelligence," I mean stuff like complex problem solving that's useful for achieving goals. Consciousness, emotions, and qualia are not required for me to call a system "intelligent" here; I am defining it only in terms of capability.