“The shard theory of human values” by Quintin Pope & TurnTrout

LessWrong (Curated)

Audio version of the posts shared in the LessWrong Curated newsletter.

TL;DR: We propose a theory of human value formation. According to this theory, the reward system shapes human values in a relatively straightforward manner. Human values are not, for example, an incredibly complicated, genetically hard-coded set of drives. Rather, they are sets of contextually activated heuristics that were shaped by, and bootstrapped from, crude, genetically hard-coded reward circuitry.