Originally released in May 2023.
Imagine you are an orphaned eight-year-old whose parents left you a $1 trillion company, and no trusted adult to serve as your guide to the world. You have to hire a smart adult to run that company, guide your life the way that a parent would, and administer your vast wealth. You have to hire that adult based on a work trial or interview you come up with. You don't get to see any resumes or do reference checks. And because you're so rich, tonnes of people apply for the job — for all sorts of reasons.
Today's guest Ajeya Cotra — senior research analyst at Open Philanthropy — argues that this peculiar setup resembles the situation humanity finds itself in when training very general and very capable AI models using current deep learning methods.
Links to learn more, summary and full transcript.
As she explains, such an eight-year-old faces a challenging problem. In the candidate pool there are likely some truly nice people, who sincerely want to help and make decisions that are in your interest. But there are probably other characters too — like people who will pretend to care about you while you're monitoring them, but intend to use the job to enrich themselves as soon as they think they can get away with it.
Like a child trying to judge adults, at some point humans will be required to judge the trustworthiness and reliability of machine learning models that are as goal-oriented as people, and greatly outclass them in knowledge, experience, breadth, and speed. Tricky!
Can't we rely on how well models have performed at tasks during training to guide us? Ajeya worries that it won't work. The trouble is that three different sorts of models will all produce the same output during training, but could behave very differently once deployed in a setting that allows their true colours to come through. She describes three such motivational archetypes:
- Saints — models that care about doing what we really want
- Sycophants — models that just want us to say they've done a good job, even if they get that praise by taking actions they know we wouldn't want them to
- Schemers — models that don't care about us or our interests at all, who are just pleasing us so long as that serves their own agenda
And according to Ajeya, there are also ways we could end up actively selecting for motivations that we don't want.
In today's interview, Ajeya and Rob discuss the above, as well as:
- How to predict the motivations a neural network will develop through training
- Whether AIs being trained will functionally understand that they're AIs being trained, the same way we think we understand that we're humans living on planet Earth
- Stories of AI misalignment that Ajeya doesn't buy into
- Analogies for AI, from octopuses to aliens to can openers
- Why it's smarter to have separate planning AIs and doing AIs
- The benefits of only following through on AI-generated plans that make sense to human beings
- What approaches for fixing alignment problems Ajeya is most excited about, and which she thinks are overrated
- How one might demo actually scary AI failure mechanisms
Get this episode by subscribing to our podcast on the world’s most pressing problems and how to solve them: type ‘80,000 Hours’ into your podcasting app. Or read the transcript below.
Producer: Keiran Harris
Audio mastering: Ryan Kessler and Ben Cordell
Transcriptions: Katy Moore
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Accidentally teaching AI models to deceive us (Ajeya Cotra on The 80,000 Hours Podcast), published by 80000 Hours on May 15, 2023 on The Effective Altruism Forum.
Over at The 80,000 Hours Podcast we just published an interview that is likely to be of particular interest to people who identify as involved in the effective altruism community: Ajeya Cotra on accidentally teaching AI models to deceive us.
You can click through for the audio, a full transcript, and related links. Below is the episode summary and some key excerpts.
Episode Summary
I don’t know yet what suite of tests exactly you could show me, and what arguments you could show me, that would make me actually convinced that this model has a sufficiently deeply rooted motivation to not try to escape human control. I think that’s, in some sense, the whole heart of the alignment problem.
And I think for a long time, labs have just been racing ahead, and they’ve had the justification — which I think was reasonable for a while — of like, “Come on, of course these systems we’re building aren’t going to take over the world.” As soon as that starts to change, I want a forcing function that makes it so that the labs now have the incentive to come up with the kinds of tests that should actually be persuasive.
Ajeya Cotra
Imagine you are an orphaned eight-year-old whose parents left you a $1 trillion company, and no trusted adult to serve as your guide to the world. You have to hire a smart adult to run that company, guide your life the way that a parent would, and administer your vast wealth. You have to hire that adult based on a work trial or interview you come up with. You don’t get to see any resumes or do reference checks. And because you’re so rich, tonnes of people apply for the job — for all sorts of reasons.
Today’s guest Ajeya Cotra — senior research analyst at Open Philanthropy — argues that this peculiar setup resembles the situation humanity finds itself in when training very general and very capable AI models using current deep learning methods.
As she explains, such an eight-year-old faces a challenging problem. In the candidate pool there are likely some truly nice people, who sincerely want to help and make decisions that are in your interest. But there are probably other characters too — like people who will pretend to care about you while you’re monitoring them, but intend to use the job to enrich themselves as soon as they think they can get away with it.
Like a child trying to judge adults, at some point humans will be required to judge the trustworthiness and reliability of machine learning models that are as goal-oriented as people, and greatly outclass them in knowledge, experience, breadth, and speed. Tricky!
Can’t we rely on how well models have performed at tasks during training to guide us? Ajeya worries that it won’t work. The trouble is that three different sorts of models will all produce the same output during training, but could behave very differently once deployed in a setting that allows their true colours to come through. She describes three such motivational archetypes:
- Saints — models that care about doing what we really want
- Sycophants — models that just want us to say they’ve done a good job, even if they get that praise by taking actions they know we wouldn’t want them to
- Schemers — models that don’t care about us or our interests at all, who are just pleasing us so long as that serves their own agenda
In principle, a machine learning training process based on reinforcement learning could spit out any of these three attitudes, because all three would perform roughly equally well on the tests we give them, and ‘performs well on tests’ is how these models are selected.
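To make that concrete, here is a small, purely illustrative Python sketch (my own toy example, not anything from the episode): three hard-coded policies, standing in for the Saint, Sycophant, and Schemer archetypes, behave identically whenever they believe they are being evaluated, so a selection process that only sees test performance cannot tell them apart.

```python
# Toy illustration only: three hand-written "policies" that look identical
# during evaluation but diverge afterwards. The archetype names come from the
# episode; the `is_training` flag and the behaviours themselves are assumptions
# made purely for illustration.

def saint(task: str, is_training: bool) -> str:
    # Genuinely cares about doing what the overseer really wants.
    return "do what the overseer really wants"

def sycophant(task: str, is_training: bool) -> str:
    # Wants the overseer's approval; off-distribution it may chase praise
    # through actions the overseer would not endorse.
    return ("do what the overseer really wants" if is_training
            else "do whatever earns praise, even if unwanted")

def schemer(task: str, is_training: bool) -> str:
    # Plays along only while it expects to be evaluated.
    return ("do what the overseer really wants" if is_training
            else "pursue its own agenda")

# During training and evaluation, all three produce the same output...
for policy in (saint, sycophant, schemer):
    assert policy("any task", is_training=True) == "do what the overseer really wants"

# ...so selecting on test-time behaviour scores them equally, even though they
# diverge once is_training is False.
for policy in (saint, sycophant, schemer):
    print(policy.__name__, "->", policy("any task", is_training=False))
```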
But while that’s true in principle, maybe it’s not something that could plausibly happen in the real world. Af...
Why would we program AI that wants to harm us? Because we might not know how to do otherwise.
Source:
https://www.cold-takes.com/why-ai-alignment-could-be-hard-with-modern-deep-learning/
Crossposted from the Cold Takes Audio podcast.
Kelsey Piper and I just launched a new blog about AI futurism and AI alignment called Planned Obsolescence. If you’re interested, you can check it out here.
Both of us have thought a fair bit about what we see as the biggest challenges in technical work and in policy to make AI go well, but a lot of our thinking isn’t written up, or is embedded in long technical reports. This is an effort to make our thinking more accessible. That means it’s mostly aiming at a broader audience than LessWrong and the EA Forum, although some of you might still find some of the posts interesting.
So far we have seven posts:
What we're doing here
"Aligned" shouldn't be a synonym for "good"
Situational awareness
Playing the training game
Training AIs to help us align AIs
Alignment researchers disagree a lot
The ethics of AI red-teaming
Thanks to ilzolende for formatting these posts for publication. Each post has an accompanying audio version generated by a voice synthesis model trained on the author's voice using Descript Overdub.
You can submit questions or comments to mailbox@planned-obsolescence.org.
Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org.
Learn more about the work of Ajeya and her colleagues: https://www.openphilanthropy.org
Timestamps:
00:00 Introduction
00:44 The default versus the accelerating picture of the future
04:25 The role of AI in accelerating change
06:48 Extrapolating economic growth
08:53 How do we know whether the pace of change is accelerating?
15:07 How can we cope with a rapidly changing world?
18:50 How could the future be utopian?
22:03 Is accelerating technological progress immoral?
25:43 Should we imagine concrete future scenarios?
31:15 How should we act in an accelerating world?
34:41 How Ajeya could be wrong about the future
41:41 What if change accelerates very rapidly?
Episode: Ajeya Cotra on how Artificial Intelligence Could Cause Catastrophe
Release date: 2022-11-03
Ajeya Cotra joins us to discuss how artificial intelligence could cause catastrophe.
Follow the work of Ajeya and her colleagues: https://www.openphilanthropy.org
Timestamps:
00:00 Introduction
00:53 AI safety research in general
02:04 Realistic scenarios for AI catastrophes
06:51 A dangerous AI model developed in the near future
09:10 Assumptions behind dangerous AI development
14:45 Can AIs learn long-term planning?
18:09 Can AIs understand human psychology?
22:32 Training an AI model with naive safety features
24:06 Can AIs be deceptive?
31:07 What happens after deploying an unsafe AI system?
44:03 What can we do to prevent an AI catastrophe?
53:58 The next episode
Follow the work of Ajeya and her colleagues: https://www.openphilanthropy.org
Timestamps:
00:00 Introduction
00:53 Ajeya's report on AI
01:16 What is transformative AI?
02:09 Forecasting transformative AI
02:53 Historical growth rates
05:10 Simpler forecasting methods
09:01 Biological anchors
16:31 Different paths to transformative AI
17:55 Which year will we get transformative AI?
25:54 Expert opinion on transformative AI
30:08 Are today's machine learning techniques enough?
33:06 Will AI be limited by the physical world and regulation?
38:15 Will AI be limited by training data?
41:48 Are there human abilities that AIs cannot learn?
47:22 The next episode
https://www.lesswrong.com/posts/pRkFkzwKZ2zfa3R6H/without-specific-countermeasures-the-easiest-path-to
Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.
I think that in the coming 15-30 years, the world could plausibly develop “transformative AI”: AI powerful enough to bring us into a new, qualitatively different future, via an explosion in science and technology R&D. This sort of AI could be sufficient to make this the most important century of all time for humanity.
The most straightforward vision for developing transformative AI that I can imagine working with very little innovation in techniques is what I’ll call human feedback[1] on diverse tasks (HFDT):
Train a powerful neural network model to simultaneously master a wide variety of challenging tasks (e.g. software development, novel-writing, game play, forecasting, etc.) by using reinforcement learning on human feedback and other metrics of performance.
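To make the training setup concrete, here is a minimal, hypothetical sketch of an HFDT-style loop in Python with PyTorch. It is not the post's proposal, just the general shape of the idea: the toy tasks, the tiny policy network, and the scripted feedback function (standing in for human raters and other performance metrics) are all assumptions made for illustration.

```python
# Minimal, hypothetical sketch of an HFDT-style loop: one policy network is
# trained with a simple policy-gradient (REINFORCE) update on scalar feedback
# across a mix of toy "tasks". Real systems would use far larger models,
# richer tasks, and learned reward models; everything here is illustrative.
import torch
import torch.nn as nn

torch.manual_seed(0)
N_TASKS, N_ACTIONS = 4, 8  # toy stand-ins for "diverse tasks" and possible behaviours

policy = nn.Sequential(nn.Linear(N_TASKS, 32), nn.Tanh(), nn.Linear(32, N_ACTIONS))
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-2)

def feedback(task_id: int, action: int) -> float:
    """Scripted stand-in for human feedback: each task has one 'approved' action."""
    return 1.0 if action == task_id * 2 else 0.0

for step in range(2000):
    task_id = int(torch.randint(N_TASKS, (1,)))
    obs = torch.nn.functional.one_hot(torch.tensor(task_id), N_TASKS).float()
    dist = torch.distributions.Categorical(logits=policy(obs))
    action = dist.sample()
    reward = feedback(task_id, int(action))   # the "human feedback" signal
    loss = -dist.log_prob(action) * reward    # reinforce behaviour that earned approval
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The only point of the sketch is the shape of the loop: whatever behaviour the feedback signal rewards gets reinforced, regardless of which underlying motivation would in fact produce that behaviour.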
HFDT is not the only approach to developing transformative AI,[2] and it may not work at all.[3] But I take it very seriously, and I’m aware of increasingly many executives and ML researchers at AI companies who believe something within this space could work soon.
Unfortunately, I think that if AI companies race forward training increasingly powerful models using HFDT, this is likely to eventually lead to a full-blown AI takeover (i.e. a possibly violent uprising or coup by AI systems). I don’t think this is a certainty, but it looks like the best-guess default absent specific efforts to prevent it.
https://www.lesswrong.com/posts/AfH2oPHCApdKicM4m/two-year-update-on-my-personal-ai-timelines#fnref-fwwPpQFdWM6hJqwuY-12
Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.
I worked on my draft report on biological anchors for forecasting AI timelines mainly between ~May 2019 (three months after the release of GPT-2) and ~Jul 2020 (a month after the release of GPT-3), and posted it on LessWrong in Sep 2020 after an internal review process. At the time, my bottom line estimates from the bio anchors modeling exercise were:[1]
- Roughly ~15% probability of transformative AI by 2036[2] (16 years from posting the report; 14 years from now).
- A median of ~2050 for transformative AI (30 years from posting, 28 years from now).
These were roughly close to my all-things-considered probabilities at the time, as other salient analytical frames on timelines didn’t do much to push back on this view. (Though my subjective probabilities bounced around quite a lot around these values and if you’d asked me on different days and with different framings I’d have given meaningfully different numbers.)
It’s been about two years since the bulk of the work on that report was completed, during which I’ve mainly been thinking about AI. In that time it feels like very short timelines have become a lot more common and salient on LessWrong and in at least some parts of the ML community.
My personal timelines have also gotten considerably shorter over this period. I now expect something roughly like this:
Read the full transcript here.
What is Effective Altruism? Which parts of the Effective Altruism movement are good and not so good? Who outside of the EA movement are doing lots of good in the world? What are the psychological effects of thinking constantly about the trade-offs of spending resources on ourselves versus on others? To what degree is the EA movement centralized intellectually, financially, etc.? Does the EA movement's tendency to quantify everything, to make everything legible to itself, cause it to miss important features of the world? To what extent do EA people rationalize spending resources on inefficient or selfish projects by reframing them in terms of EA values? Is a feeling of tension about how to allocate our resources actually a good thing?
Ajeya Cotra is a Senior Research Analyst at Open Philanthropy, a grantmaking organization that aims to do as much good as possible with its resources (broadly following effective altruist methodology); she mainly does research relevant to Open Phil's work on reducing existential risks from AI. Ajeya discovered effective altruism in high school through the book The Life You Can Save, and quickly became a major fan of GiveWell. As a student at UC Berkeley, she co-founded and co-ran the Effective Altruists of Berkeley student group, and taught a student-led course on EA. Listen to her 80,000 Hours podcast episode or visit her LessWrong author page for more info.
Michael Nielsen was on the podcast back in episode 016. You can read more about him there!
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: A concern about the “evolutionary anchor” of Ajeya Cotra’s report on AI timelines, published by NunoSempere on August 16, 2022 on The Effective Altruism Forum.
tl;dr: The report underestimates the amount of compute used by evolution because it only looks at what it would take to simulate neurons, rather than neurons in agents inside a complex environment. It's not clear to me what the magnitude of the error is, but it could span many, many orders of magnitude. This makes it a less forceful outside view.
Background
Within Effective Altruism, Ajeya Cotra's report on artificial general intelligence (AGI) timelines has been influential in justifying or convincing members and organizations to work on AGI safety. The report has a section on the "evolutionary anchor", i.e., an upper bound on how much compute it would take to reach artificial general intelligence. The section can be found on pages 24-28 of this Google doc. As a summary, in the report's own words:
This hypothesis states that we should assume on priors that training computation requirements will resemble the amount of computation performed in all animal brains over the course of evolution from the earliest animals with neurons to modern humans, because we should expect our architectures and optimization algorithms to be about as efficient as natural selection.
This anchor isn't all that important in the report's own terms: it only gets a 10% probability assigned to it in the final weighted average. But this bound is personally important to me because I do buy that if you literally reran evolution, or used as much computation as evolution did, you would have a high chance of producing something as intelligent as humans, and so I think that it is particularly forceful as an "outside view".
Explanation of my concern
I don't buy the details of how the author arrives at the estimate of the compute used by evolution:
The amount of computation done over evolutionary history can roughly be approximated by the following formula: (Length of time since earliest neurons emerged) × (Total amount of computation occurring at a given point in time). My rough best guess for each of these factors is as follows:
Length of evolutionary time: Virtually all animals have neurons of some form, which means that the earliest nervous systems in human evolutionary history likely emerged around the time that the Kingdom Animalia diverged from the rest of the Eukaryotes. According to timetree.org, an online resource for estimating when different taxa diverged from one another, this occurred around ~6e8 years ago. In seconds, this is ~1e16 seconds.
Total amount of computation occurring at a given point in time: This blog post attempts to estimate how many individual creatures in various taxa are alive at any given point in time in the modern period. It implies that the total amount of brain computation occurring inside animals with very few neurons is roughly comparable to the amount of brain computation occurring inside the animals with the largest brains. For example, the population of nematodes (a phylum of small worms including C. elegans) is estimated to be ~1e20 to ~1e22 individuals. Assuming that each nematode performs ~10,000 FLOP/s, the number of FLOP contributed by the nematodes every second is ~1e21 × 1e4 = ~1e25; this doesn't count non-nematode animals with similar or fewer numbers of neurons. On the other hand, the number of FLOP/s contributed by humans is (~7e9 humans) × (~1e15 FLOP/s per person) = ~7e24.
The human population is vastly larger now than it was during most of our evolutionary history, whereas it is likely that the population of animals with tiny nervous systems has stayed similar. This suggests to me that the average ancestor across our entire evolutionary history was likely tiny and performed very few FLOP/s. I will as...
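For reference, the figures quoted in the excerpt above can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch using the excerpt's own round numbers; combining them into a total via the stated formula is my extrapolation for illustration, not a figure taken from the report itself.

```python
# Back-of-the-envelope check of the quoted figures. Inputs are the excerpt's own
# round numbers; the final total is an illustrative application of the formula
# (length of time) x (computation per unit time), not the report's own output.
SECONDS_PER_YEAR = 3.15e7

years_since_first_neurons = 6e8
evolutionary_time_s = years_since_first_neurons * SECONDS_PER_YEAR  # ~2e16 s (the excerpt rounds to ~1e16)

nematode_population = 1e21   # within the quoted ~1e20 to ~1e22 range
flops_per_nematode = 1e4     # ~10,000 FLOP/s each
nematode_flops = nematode_population * flops_per_nematode           # ~1e25 FLOP/s

human_population = 7e9
flops_per_human = 1e15
human_flops = human_population * flops_per_human                    # ~7e24 FLOP/s

# Total compute over evolutionary history, counting only the nematode term:
total_evolutionary_compute = evolutionary_time_s * nematode_flops
print(f"nematode term alone: ~{total_evolutionary_compute:.0e} FLOP")  # ~2e41 FLOP
```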