September 2019 Newsletter

September 30, 2019
Rob Bensinger

Updates

We ran a very successful MIRI Summer Fellows Program, which included a day where participants publicly wrote up their thoughts on various AI safety topics. See Ben Pace’s first post in a series of roundups.
A few highlights from the writing day: Adele Lopez's Optimization Provenance; Daniel Kokotajlo's Soft Takeoff Can Still Lead to Decisive Strategic Advantage and The "Commitment Races" Problem; Evan Hubinger's Towards a Mechanistic Understanding of Corrigibility; and John Wentworth's Markets are Universal for Logical Induction and Embedded Agency via Abstraction.
New posts from MIRI staff and interns: Abram Demski's Troll Bridge; Matthew Graves' View on Factored Cognition; Daniel Filan's Verification and Transparency; and Scott Garrabrant's Intentional Bucket Errors and Does Agent-like Behavior Imply Agent-like Architecture?
See also a forum discussion on "proof-level guarantees" in AI safety.

News and links

From Ben Cottier and Rohin Shah: Clarifying Some Key Hypotheses in AI Alignment.
Classifying Specification Problems as Variants of Goodhart's Law: Victoria Krakovna and Ramana Kumar relate DeepMind's SRA taxonomy to mesa-optimizers, selection and control, and Scott Garrabrant's Goodhart taxonomy. Also new from DeepMind: Ramana, Tom Everitt, and Marcus Hutter's Designing Agent Incentives to Avoid Reward Tampering.
From OpenAI: Testing Robustness Against Unforeseen Adversaries. 80,000 Hours also recently interviewed OpenAI's Paul Christiano, with some additional material on decision theory.
From AI Impacts: Evidence Against Current Methods Leading to Human-Level AI and Ernie Davis on the Landscape of AI Risks.
From Wei Dai: Problems in AI Alignment That Philosophers Could Potentially Contribute To.
Richard Möhn has put together a calendar of upcoming AI alignment events.
The Berkeley Existential Risk Initiative is seeking an Operations Manager.

Machine Intelligence Research Institute

Berkeley, California
