MIRI researcher Evan Hubinger reviews “11 different proposals for building safe advanced AI under the current machine learning paradigm”, comparing them on outer alignment, inner alignment, training competitiveness, and performance competitiveness.
Other updates
- We continue to be amazed by new shows of support: following our last two announcements, MIRI has received a donation in euros worth ~$265,000 from another anonymous donor, facilitated by Effective Giving UK and the Effective Altruism Foundation. Massive thanks to the donor for their generosity, and to both organizations for their stellar support of MIRI and other longtermist organizations!
- Hacker News discusses Eliezer Yudkowsky's There's No Fire Alarm for AGI.
- MIRI researcher Buck Shlegeris talks about deference and inside-view models on the EA Forum.
- OpenAI unveils GPT-3, a massive 175-billion-parameter language model that can figure out how to solve a variety of problems without task-specific training or fine-tuning. Gwern Branwen's pithy summary:
> GPT-3 is terrifying because it's a tiny model compared to what's possible, trained in the dumbest way possible on a single impoverished modality on tiny data, yet the first version already manifests crazy runtime meta-learning—and the scaling curves still are not bending!
See further discussion by Branwen and by Rohin Shah.
- Stuart Russell gives this year's Turing Lecture online, discussing “provably beneficial AI”.