August 2020 Newsletter

August 13, 2020
Rob Bensinger

MIRI updates

Three questions from MIRI's Abram Demski: What does it mean to apply decision theory?, How “honest” is GPT-3?, and How should AI debate be judged?
A transcript from MIRI researcher Scott Garrabrant: What Would I Do? Self-Prediction in Simple Algorithms.
MIRI researcher Buck Shlegeris reviews the debate on what the history of nuclear weapons implies about humanity's ability to coordinate.
From MIRI's Evan Hubinger: “Learning the Prior and Generalization” and “Alignment Proposals and Complexity Classes”.
Rafael Harth's Inner Alignment: Explain Like I'm 12 Edition summarizes the concepts and takeaways from “Risks from Learned Optimization”.
Issa Rice reviews discussion to date on MIRI's research focus, “To what extent is it possible to have a precise theory of rationality?”, and the relationship between deconfusion research and safety outcomes. (Plus a short reply.)
“Pitfalls of Learning a Reward Function Online” (IJCAI paper, LW summary): FHI researcher and MIRI research associate Stuart Armstrong, with DeepMind's Jan Leike, Laurent Orseau, and Shane Legg, explores ways to discourage agents from manipulating their reward signal to be easier to optimize.

News and links

From Paul Christiano: “Learning the Prior” and “Better Priors as a Safety Problem”.
From Victoria Krakovna: Tradeoff Between Desirable Properties for Baseline Choices in Impact Measures.
Ben Pace summarizes Christiano's “What Failure Looks Like” post and the resultant discussion.
Kaj Sotala collects recent examples of experiences from people working with GPT-3.
