
[Week 2] “Learning from human preferences” (Blog Post) by Dario Amodei, Paul Christiano & Alex Ray

AGI Safety Fundamentals: Alignment

Readings from the AI Safety Fundamentals: Alignment course.



Apple Podcasts · Spotify · Google Podcasts · RSS

One step towards building safe AI systems is to remove the need for humans to write goal functions, since using a simple proxy for a complex goal, or getting the complex goal slightly wrong, can lead to undesirable and even dangerous behavior. In collaboration with DeepMind’s safety team, we’ve developed an algorithm that can infer what humans want by being told which of two proposed behaviors is better.
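The core idea can be sketched as preference-based reward learning: fit a reward function so that trajectory segments a human prefers receive higher total predicted reward, using a Bradley–Terry style comparison loss. The sketch below is illustrative, not the authors' implementation; the linear reward model, feature dimensions, and simulated human labeler are all assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def segment_return(w, segment):
    """Total predicted reward of a segment (steps x features) under linear weights w."""
    return (segment @ w).sum()

def preference_grad(w, seg_a, seg_b, a_preferred):
    """Gradient of cross-entropy loss on P(a preferred) = sigmoid(R(a) - R(b))."""
    diff = segment_return(w, seg_a) - segment_return(w, seg_b)
    p_a = 1.0 / (1.0 + np.exp(-diff))
    label = 1.0 if a_preferred else 0.0
    return (p_a - label) * (seg_a.sum(axis=0) - seg_b.sum(axis=0))

# Hypothetical "true" reward weights the simulated human implicitly uses.
true_w = np.array([1.0, -0.5, 0.25])
w = np.zeros(3)

for _ in range(2000):
    seg_a = rng.normal(size=(10, 3))   # two random 10-step segments
    seg_b = rng.normal(size=(10, 3))
    # Simulated human judgment: prefer the segment with higher true reward.
    a_pref = segment_return(true_w, seg_a) > segment_return(true_w, seg_b)
    w -= 0.01 * preference_grad(w, seg_a, seg_b, a_pref)

# Direction of the learned weights should roughly match true_w
# (the scale is not identified by comparisons alone).
cos = w @ true_w / (np.linalg.norm(w) * np.linalg.norm(true_w))
print(round(cos, 2))
```

Only the comparison labels are used for training, which is the point of the approach: the human never writes down a numeric reward, yet the learned weights align with the preferences those comparisons reveal.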

Original article: “Learning from human preferences” by Dario Amodei, Paul Christiano & Alex Ray.

This article is featured on the AGI Safety Fundamentals: Alignment course curriculum.

Narrated by TYPE III AUDIO on behalf of BlueDot Impact.
