Welcome to the alpha release of TYPE III AUDIO.
Expect very rough edges and very broken stuff—and regular improvements. Please share your thoughts.
Readings from the AI Safety Fundamentals: Governance course.
I’ve previously argued that machine learning systems often exhibit emergent capabilities, and that these capabilities could lead to unintended negative consequences. But how can we reason concretely about these consequences? There are two principles I find useful for reasoning about future emergent capabilities.
Using these principles, I’ll describe two specific emergent capabilities that I’m particularly worried about: deception (fooling human supervisors rather than doing the intended task), and optimization (choosing from a diverse space of actions based on their long-term consequences).
Source:
https://bounded-regret.ghost.io/emergent-deception-optimization/
Narrated for AGI Safety Fundamentals by Perrin Walker of TYPE III AUDIO.
---