June 2023: Welcome to the alpha release of TYPE III AUDIO.
Expect very rough edges and very broken stuff—and daily improvements. Please share your thoughts.
AGI Safety Fundamentals: Alignment
Readings from the AI Safety Fundamentals: Alignment course.
(Partially in response to AGI Ruin: A list of Lethalities. Written in the same rambling style. Not exhaustive.)
Agreements Powerful AI systems have a good chance of deliberately and irreversibly disempowering humanity. This is a much easier failure mode than killing everyone with destructive physical technologies. Catastrophically risky AI systems could plausibly exist soon, and there likely won’t be a strong consensus about this fact until such systems pose a meaningful existential risk per year. There is not necessarily any “fire alarm.” Even if there were consensus about a risk from powerful AI systems, there is a good chance that the world would respond in a totally unproductive way. It’s wishful thinking to look at possible stories of doom and say “we wouldn’t let that happen;” humanity is fully capable of messing up even very basic challenges, especially if they are novel.