The following is a fictional dialogue building off of AI Alignment: Why It’s Hard, and Where to Start. (AMBER, a philanthropist interested in a more reliable Internet, and CORAL, a computer security professional, are at a conference hotel together discussing what Coral insists is a difficult and important issue: the difficulty of building “secure”…
AlphaGo Zero and the Foom Debate
AlphaGo Zero uses 4 TPUs, is built entirely out of neural nets with no handcrafted features, doesn’t pretrain against expert games or anything else human, reaches a superhuman level after 3 days of self-play, and is the strongest version of AlphaGo yet. The architecture has been simplified. Previous AlphaGo had a policy net that predicted…
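The architectural change the excerpt points at is that earlier AlphaGo versions used separate policy and value networks (plus handcrafted features), while AlphaGo Zero uses a single network with two output heads trained only on self-play data. The sketch below is purely illustrative and assumes PyTorch; the layer sizes, input encoding, and names (`DualHeadNet`, `BOARD_SIZE`) are placeholders, not DeepMind’s actual architecture or code.

```python
# Minimal sketch of a two-headed policy/value network of the kind AlphaGo Zero
# uses: one shared trunk, a policy head over moves, and a scalar value head.
# Sizes and the input encoding are simplified assumptions for illustration.
import torch
import torch.nn as nn

BOARD_SIZE = 19  # standard Go board; an assumption for this sketch


class DualHeadNet(nn.Module):
    """One shared convolutional trunk feeding two heads."""

    def __init__(self, channels: int = 64):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Conv2d(1, channels, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        # Policy head: one logit per board point plus one for "pass".
        self.policy_head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(channels * BOARD_SIZE * BOARD_SIZE, BOARD_SIZE * BOARD_SIZE + 1),
        )
        # Value head: a single scalar in [-1, 1] estimating the game outcome.
        self.value_head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(channels * BOARD_SIZE * BOARD_SIZE, 1),
            nn.Tanh(),
        )

    def forward(self, board: torch.Tensor):
        features = self.trunk(board)
        return self.policy_head(features), self.value_head(features)


net = DualHeadNet()
board = torch.zeros(1, 1, BOARD_SIZE, BOARD_SIZE)  # an empty position
policy_logits, value = net(board)
print(policy_logits.shape, value.shape)  # torch.Size([1, 362]) torch.Size([1, 1])
```

In the actual system both heads are trained jointly from self-play alone: the policy head is fit to the search-improved move distribution and the value head to the eventual game result, with no human games or handcrafted features involved.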
There’s No Fire Alarm for Artificial General Intelligence
What is the function of a fire alarm? One might think that the function of a fire alarm is to provide you with important evidence about a fire existing, allowing you to change your policy accordingly and exit the building. In the classic experiment by Latane and Darley in 1968, eight groups of…
AI Alignment: Why It’s Hard, and Where to Start
Back in May, I gave a talk at Stanford University for the Symbolic Systems Distinguished Speaker series, titled “The AI Alignment Problem: Why It’s Hard, And Where To Start.” The video for this talk is now available on YouTube. We have an approximately complete transcript of the talk and Q&A session here, slides…
Five theses, two lemmas, and a couple of strategic implications
MIRI’s primary concern about self-improving AI isn’t so much that it might be created by ‘bad’ actors rather than ‘good’ actors in the global sphere; rather, most of our concern is in remedying the situation in which no one knows at all how to create a self-modifying AI with known, stable preferences. (This is why…
Three Major Singularity Schools
I’ve noticed that Singularity discussions seem to be splitting up into three major schools of thought: Accelerating Change, the Event Horizon, and the Intelligence Explosion.