Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: What can we learn from Lex Fridman’s interview with Sam Altman?, published by Karl von Wendt on March 27, 2023 on LessWrong.

These are my personal thoughts about this interview.

Epistemic status: I consider myself neither a machine-learning expert nor an alignment expert. My focus is on outreach: explaining AI safety to the general public and professionals outside of the AI safety community. So an interview like this one is important material for me to both understand the situation myself and explain it to others.

After watching it, I’m somewhat confused. There were bits in this talk that I liked and others that disturbed me. There seems to be a mix of humbleness and hubris, of openly acknowledging AI risks and downplaying some elements of them. I am unsure how open and honest Sam Altman really was. I don’t mean to criticize. I want to understand what OpenAI’s and Sam Altman’s stance towards AI safety really is.

Below I list transcriptions of the parts that seemed most relevant for AI safety and my thoughts/questions about them. Maybe you can help me better understand this by commenting.

[23:55] Altman: “Our degree of alignment increases faster than our rate of capability progress, and I think that will become more and more important over time.”

I don’t really understand what this is supposed to mean. What’s a “degree of alignment”? How can you meaningfully compare it with “rate of capability progress”? To me, this sounds a lot like marketing: “We know we are dealing with dangerous stuff, so we are extra careful.” Then again, it’s probably hard to explain this in concrete terms in an interview.

[24:40] Altman: “I do not think we have yet discovered a way to align a super powerful system. We have something that works for our current scale: RLHF.”

I find this very open and honest.
Obviously, he not only knows about the alignment problem, but openly admits that RLHF is not the solution to aligning an AGI. Good!

[25:10] Altman: “It’s easy to talk about alignment and capability as orthogonal vectors, they’re very close: better alignment techniques lead to better capabilities, and vice versa. There are cases that are different, important cases, but on the whole I think things that you could say like RLHF or interpretability that sound like alignment issues also help you make much more capable models and the division is just much fuzzier than people think.”

This, I think, contains two messages: “Capabilities research and alignment research are intertwined” and “criticizing us for advancing capabilities so much is misguided, because we need to do that in order to align AI”. I understand the first one, but I don’t subscribe to the second one; see the discussion below.

[47:53] Fridman: “Do you think it’s possible that LLMs really is the way we build AGI?”

Altman: “I think it’s part of the way. I think we need other super important things ... For me, a system that cannot significantly add to the sum total of scientific knowledge we have access to – kind of discover, invent, whatever you want to call it – new, fundamental science, is not a superintelligence. ... To do that really well, I think we need to expand on the GPT paradigm in pretty important ways that we’re still missing ideas for. I don’t know what those ideas are. We’re trying to find them.”

This is pretty vague, which is understandable. However, it seems to indicate to me that the current, relatively safe, mostly myopic GPT approach will be augmented with elements that may make their approach much more dangerous, like maybe long-term memory and dynamic learning. This is highly speculative, of course.

[49:50] Altman: “The thing that I’m so excited about is not that it’s a system that kind of goes off and does its own thing but that it’s this tool that humans are using in this feedback loop ...
I’m excited about a world ...