Research Workshops

While we do not currently have any upcoming workshops scheduled, there are other ways to get involved.

July 20–22, 2018 – Berkeley, California

2nd Workshop on Approaches in AI Alignment

CHAI Participants
Jordan Alexander
Lawrence Chan
James Drain
Aaron Tucker
Alex Turner

Unaffiliated Participants
Alex Gunning

MIRI Participants
Alex Appel
Daniel Demski
Evan Hubinger
Linda Linsefors
Alex Mennen
David Simmons
Alex Zhu

This weekend workshop brought together research interns from MIRI and UC Berkeley’s Center for Human-Compatible AI (CHAI) to discuss conceptual foundations and open problems in AI safety research.

November 18–19, 2017 – Berkeley, California

1st Workshop on Approaches in AI Alignment

Tsvi Benson-Tilsen (MIRI)
Paul Christiano (OpenAI)
Andrew Critch (UC Berkeley)
Wei Dai (independent)
Abram Demski (MIRI)

Sam Eisenstat (MIRI)
Scott Garrabrant (MIRI)
Richard Mallah (FLI, Cambridge Semantics)
Andreas Stuhlmüller (Stanford)
Jessica Taylor (independent)

This weekend workshop brought together researchers interested in understanding and exploring the intersection between MIRI’s Agent Foundations research agenda and Paul Christiano’s research.

April 1–2, 2017 – Berkeley, California

4th Workshop on Machine Learning and AI Safety

Tsvi Benson-Tilsen (MIRI)
Paul Christiano (OpenAI)
Andrew Critch (UC Berkeley)
Wei Dai (independent)
Abram Demski (MIRI)

Sam Eisenstat (MIRI)
Scott Garrabrant (MIRI)
Richard Mallah (FLI, Cambridge Semantics)
Andreas Stuhlmüller (Stanford)
Jessica Taylor (independent)

This workshop brought together researchers with machine learning backgrounds to work on long-term AI safety problems that can be modeled in current machine learning systems and frameworks, for instance those described in “Concrete Problems in AI Safety” and “Alignment for Advanced Machine Learning Systems”.

This workshop was funded in part by a grant from the Artificial Intelligence Journal.

March 25–26, 2017 – Berkeley, California

Workshop on Agent Foundations and AI Safety

Alexander Appel (University of Nevada Reno)
Michael Dennis (UC Berkeley)
Sam Eisenstat (Google)
Matt Frank
Scott Garrabrant (MIRI)

Juan David Gil (MIT)
Patrick LaVictoire (MIRI)
Eliana Lorch (Thiel Fellow)
Eli Sennesh
Harry Slatyer (Google)
Alex Zhu

This two-day weekend workshop brought together researchers with interests in long-term theoretical AI safety research. The workshop covered the context and content of current AI safety research agendas and projects (with a focus on MIRI’s Agent Foundations technical agenda). It was geared for researchers who have technical backgrounds and who have not previously worked extensively with MIRI.

December 1–3, 2016 – Berkeley, California

3rd Workshop on Machine Learning and AI Safety

Ryan Carey (MIRI)
Cameron Freer (Gamalon and Borelian)
Scott Garrabrant (MIRI)
Marcello Herreshoff (Google)
Patrick LaVictoire (MIRI)

Moshe Looks (Google)
Jeremy Nixon (Spark)
Anand Srinivasan (AlphaSheets)
Jessica Taylor (MIRI)
Eliezer Yudkowsky (MIRI)

This small three-day workshop brought together researchers with machine learning backgrounds to work on long-term AI safety problems that can be modeled in current machine learning systems and frameworks, for instance those described in “Concrete Problems in AI Safety” and “Alignment for Advanced Machine Learning Systems”.

Topics included zero-shot learning using shared embeddings, the differences between quantilization and regularization, generative adversarial networks and Goodhart’s Law, and mathematical formalizations of conservative concept learning.

November 11–13, 2016 – Berkeley, California

9th Workshop on Logic, Probability, and Reflection

Tsvi Benson-Tilsen (UC Berkeley)
Ryan Carey (MIRI)
Andrew Critch (MIRI)
Abram Demski (USC)
Sam Eisenstat (UC Berkeley)
Benya Fallenstein (MIRI)

Jack Gallagher
Scott Garrabrant (MIRI)
Marcello Herreshoff (Google)
Patrick LaVictoire (MIRI)
Nisan Stiennon (Google)
Jessica Taylor (MIRI)
Alex Zhu (MIT)

Participants at this three-day workshop — most of them veterans of past workshops — worked on a variety of problems related to MIRI’s Agent Foundations technical agenda.

Topics included safe exploration in rich domains, the difference between predicting a human and predicting HCH, and decision theories resulting from other decision theories self-modifying.

October 21–23, 2016 – Berkeley, California

2nd Workshop on Machine Learning and AI Safety

Ryan Carey (MIRI)
Sarah Constantin
Scott Garrabrant (MIRI)
Marcello Herreshoff (Google)

Patrick LaVictoire (MIRI)
William Saunders (Google)
Jessica Taylor (MIRI)
Eliezer Yudkowsky (MIRI)

Topics included concept learning with different ontologies, problems for Task AGI, censored representations, and conservative concepts.

August 26–28, 2016 – Berkeley, California

1st Workshop on Machine Learning and AI Safety

Paul Christiano (UC Berkeley)
Daniel Filan (UC Berkeley)
Cameron Freer (Gamalon and Borelian)
Dylan Hadfield-Menell (UC Berkeley)
Victoria Krakovna (Harvard)

Janos Kramar (University of Montreal)
Patrick LaVictoire (MIRI)
Jelena Luketina (University of Montreal)
Richard Mallah (FLI, Cambridge Semantics)
Jessica Taylor (MIRI)
Eliezer Yudkowsky (MIRI)

This three-day workshop brought together researchers with machine learning backgrounds to work on long-term AI safety problems that can be modeled in current machine learning systems and frameworks, for instance those described in “Concrete Problems in AI Safety” and “Alignment for Advanced Machine Learning Systems”.

Topics included learning human-interpretable and causal models of the environment; engineering cost functions based on impact measures to disincentivize side effects; designing robust metrics for the quality of a purported explanation of a plan; and developing a formal model of Goodhart’s Law which yields mild optimization.

June 17, 2016 – Berkeley, California

CSRBAI Workshop on Agent Models and Multi-Agent Dilemmas

Twenty participants attended from institutions including:

USC Institute for Creative Technologies
Carleton University
Future of Humanity Institute
Carnegie Mellon University
Harvard
Oxford University

University College London
Australian National University
UC Berkeley
UT Austin
Princeton University
Columbia University

The Colloquium Series on Robust and Beneficial AI included a series of workshops to facilitate conversations and collaborations between people interested in a number of different approaches to the technical challenges associated with AI robustness and reliability.

The fourth workshop of CSRBAI focused on the topic of designing agents that behave well in their environments, without ignoring the effects of the agent’s own actions on the environment or on other agents within the environment.

June 11–12, 2016 – Berkeley, California

CSRBAI Workshop on Preference Specification

Twenty participants attended from institutions including:

Australian National University
University College London
Center for the Study of Existential Risk
University of Oxford
Future of Humanity Institute
Carnegie Mellon University

The Swiss AI Lab IDSIA
Australian National University
UC Berkeley
Brown University
University of Montreal
USC Institute for Creative Technologies

The third workshop of CSRBAI focused on the topic of preference specification for highly capable AI systems, in which the perennial problem of wanting code to “do what I mean, not what I said” becomes increasingly challenging.

June 4–5, 2016 – Berkeley, California

CSRBAI Workshop on Robustness and Error-Tolerance

Fourteen participants attended from institutions including:

University College London
Center for the Study of Existential Risk
Google
Future of Humanity Institute
Carnegie Mellon University

Australian National University
UC Berkeley
The Swiss AI Lab IDSIA
Cornell University
USC Institute for Creative Technologies

The second workshop of CSRBAI focused on the topic of robustness and error-tolerance in AI systems, and how to ensure that when AI systems fail, they fail gracefully and detectably.

May 28–29, 2016 – Berkeley, California

CSRBAI Workshop on Transparency

Twenty participants attended from institutions including:

Oregon State University
Australian National University
Future of Humanity Institute
Carnegie Mellon University
IBM Research
Montreal Institute for Learning Algorithms

Google Research
Stanford University
Google
UC Berkeley
University College London
Harvard
Future of Life Institute

The first workshop of CSRBAI focused on the topic of transparency in AI systems, and how we can increase transparency while maintaining capabilities.

April 1–3, 2016 – Berkeley, California

Self-Reference, Type Theory, and Formal Verification

Miëtek Bak (Least Fixed)
Benya Fallenstein (MIRI)
Jack Gallagher (Gallabytes)
Jason Gross (MIT)

Ramana Kumar (Cambridge)
Patrick LaVictoire (MIRI)
Daniel Selsam (Stanford)
Nathaniel Thomas (Stanford)

Participants worked on questions of self-reference in type theory and automated theorem provers, with the goal of studying systems that model themselves.

August 28–30, 2015 – Berkeley, California

3rd Introductory Workshop on Logical Decision Theory

Holger Dell (Saarland University)
Owain Evans (MIT)
Benya Fallenstein (MIRI)
Benjamin Fox (Israel Defense Forces)
Patrick LaVictoire (MIRI)

Jonathan Lee (Cambridge)
Ben Levinstein (Oxford)
Jelena Luketina (Aalto)
David Steinberg (U Maryland)
Nate Soares (MIRI)
Eliezer Yudkowsky (MIRI)

This was the sixth in a series of introductory workshops, where MIRI brought together researchers with different backgrounds, discussed open problems in one of the technical agenda topics, and began projects and collaborations in that area.

The topic of this workshop was decision theory, and projects begun at the workshop are discussed in the following post: Proof Length and Logical Counterfactuals Revisited

August 7–9, 2015 – Berkeley, California

2nd Introductory Workshop on Logical Uncertainty

Pedro Carvalho (Instituto Superior Técnico)
Adele Dewey-Lopez (SEED Platform Inc.)
Benya Fallenstein (MIRI)
John Fox (Oxford)
Robert Krzyzanowski (UIC)

This was the fifth in a series of introductory workshops, where MIRI brought together researchers with different backgrounds, discussed open problems in one of the technical agenda topics, and began projects and collaborations in that area.

The topic of this workshop was logical uncertainty, and projects begun at the workshop are discussed in the following post: What’s logical coherence for anyway?

June 26–28, 2015 – Berkeley, California

1st Introductory Workshop on Vingean Reflection

Siddharth Bhaskar (UCLA)
Justin Brody (Goucher College)
Abram Demski (USC)
Benya Fallenstein (MIRI)
Roko Jelavić (Ericsson)
Seth Kurtenbach (U Missouri)

Patrick LaVictoire (MIRI)
Kenneth Presting (Renaissance Computing Institute)
Jess Riedel (Perimeter Institute)
Nate Soares (MIRI)
Eliezer Yudkowsky (MIRI)

This was the fourth in a series of introductory workshops, where MIRI brought together researchers with different backgrounds, discussed open problems in one of the technical agenda topics, and began projects and collaborations in that area.

The topic of this workshop was Vingean reflection, and projects begun at the workshop are discussed in the following posts:

June 12–14, 2015 – Berkeley, California

2nd Introductory Workshop on Logical Decision Theory

Manav Bhushan (Oxford)
Paul Crowley (Google)
Benya Fallenstein (MIRI)
Preston Greene (NTU)
Jason Gross (MIT)
Nick Hay (UC Berkeley)

Victoria Krakovna (Harvard)
Patrick LaVictoire (MIRI)
Jan Leike (Australian National University)
Nate Soares (MIRI)
Eliezer Yudkowsky (MIRI)

This was the third in a series of introductory workshops, where MIRI brought together researchers with different backgrounds, discussed open problems in one of the technical agenda topics, and began projects and collaborations in that area.

The topic of this workshop was decision theory, and projects begun at the workshop are discussed in the following post: Fixed point theorem in the finite and infinite case

May 29–31, 2015 – Berkeley, California

1st Introductory Workshop on Logical Uncertainty

Sarah Constantin (Yale)
Benya Fallenstein (MIRI)
Jacob Hilton (University of Leeds)
Vanessa Kosoy (Metaqube)
Janos Kramar (Independent)
Patrick LaVictoire (MIRI)

Shivaram Lingamneni (UC Berkeley)
Quinn Maurmann (Quidsi)
Nate Soares (MIRI)
Charlie Steiner (Independent)
Eliezer Yudkowsky (MIRI)

This was the second in a series of introductory workshops, where MIRI brought together researchers with different backgrounds, discussed open problems in one of the technical agenda topics, and began projects and collaborations in that area.

The topic of this workshop was logical uncertainty, and projects begun at the workshop are discussed in the following posts:

May 4–6, 2015 – Berkeley, California

1st Introductory Workshop on Logical Decision Theory

Sam Eisenstat (Twitter)
Benya Fallenstein (MIRI)
Scott Garrabrant (UCLA)
George Hotz (Vicarious)
Patrick LaVictoire (MIRI)

Evan Lloyd (UCLA)
Nate Soares (MIRI)
Eliezer Yudkowsky (MIRI)
Sebastien Zany (Independent)

This was the first in a series of introductory workshops, where MIRI brought together researchers with different backgrounds, discussed open problems in one of the technical agenda topics, and began projects and collaborations in that area.

The topic of this workshop was decision theory, and projects begun at the workshop are discussed in the following posts:

May 3–11, 2014 – Berkeley, CA

7th Workshop on Logic, Probability, and Reflection

Mihály Bárász (Google)
Paul Christiano (UC Berkeley)
Benya Fallenstein (Bristol U)
Marcello Herreshoff (Google)
Patrick LaVictoire (Quixey)

Nate Soares (Google)
Nisan Stiennon (Stanford)
Qiaochu Yuan (UC Berkeley)
Eliezer Yudkowsky (MIRI)

Participants at this workshop — all of them veterans of past workshops — worked on a variety of problems related to Friendly AI. The first tech report from this workshop is available here.

December 14–20, 2013 – Berkeley, CA

6th Workshop on Logic, Probability, and Reflection

Nate Ackerman (Harvard)
John Baez (UC Riverside)
Paul Christiano (UC Berkeley)
Benya Fallenstein (Bristol U)
Cameron Freer (MIT)
Jeremy Hahn (Harvard)
Wojtek Moczydlowski (Google)

Michele Reilly (independent)
Will Sawin (Princeton)
Nate Soares (Google)
Nisan Stiennon (Stanford)
Gregory Wheeler (LMU Munich)
Eliezer Yudkowsky (MIRI)

Participants at this workshop focused on the Löbian obstacle, probabilistic logic, and the intersection of logic and probability more generally. The results of this workshop are described here. See photos from the workshop here.

November 23–29, 2013 – Oxford, UK

5th Workshop on Logic, Probability, and Reflection

Stuart Armstrong (Oxford)
Mihály Bárász (Google)
Catrin Campbell-Moore (LMU Munich)
Daniel Dewey (Oxford)
Benya Fallenstein (Bristol U)

Jacob Hilton (Oxford)
Ramana Kumar (Cambridge)
Jan Leike (U Freiburg)
Bas Steunebrink (IDSIA)
Gregory Wheeler (LMU Munich)
Eliezer Yudkowsky (MIRI)

Participants at this workshop investigated problems related to reflective agents, probabilistic logic, and priors over logical statements / the logical omniscience problem. Some results from this workshop were developed further at the December 2013 workshop and described here.

September 7–13, 2013 – Berkeley, CA

4th Workshop on Logic, Probability, and Reflection

Paul Christiano (UC Berkeley)
Wei Dai (independent)
Gary Drescher (independent)
Kenny Easwaran (USC)
Cameron Freer (MIT)

Patrick LaVictoire (Quixey)
Ilya Shpitser (U Southampton)
Vladimir Slepnev (Google)
Nisan Stiennon (Stanford)
Andreas Stuhlmüller (MIT & Stanford)
Eliezer Yudkowsky (MIRI)

This workshop focused on a variety of open problems related to normative decision theory. Participants brainstormed “well-posed problems” in the area, built on LaVictoire et al.’s Löbian cooperation work, made some progress on formalizing updateless decision theory, and formulated additional toy problems such as the Ultimate Newcomb’s Problem.

These results are still being written up in various forms.

July 8–14, 2013 – Berkeley, CA

3rd Workshop on Logic, Probability, and Reflection

Andrew Critch (PhD, UC Berkeley)
Abram Demski (USC)
Benya Fallenstein (Bristol U)
Marcello Herreshoff (Google)

Jonathan Lee (Cambridge)
Will Sawin (Princeton)
Qiaochu Yuan (UC Berkeley)
Eliezer Yudkowsky (MIRI)

This workshop focused on a variety of issues related to the Löbian obstacle for self-modifying systems, and to Demski’s earlier work on logical prior probability. The primary result was a proof that any attempt to construct a probability distribution that performs scientific induction on Π₁ statements, converging to probability 1 for the true statements, can result in a limiting probability of zero being assigned to true Π₂ statements. This result is still being written up, but it has been discussed briefly in a blog post by Demski. Other bits of progress were developed at further workshops and described here.

April 3–24, 2013 – Berkeley, CA

2nd Workshop on Logic, Probability, and Reflection

Andrew Critch (PhD, UC Berkeley)
Abram Demski (USC)
Benya Fallenstein (Bristol U)
Marcello Herreshoff (Google)

Jonathan Lee (Cambridge)
Will Sawin (Princeton)
Qiaochu Yuan (UC Berkeley)
Eliezer Yudkowsky (MIRI)

This three-week workshop addressed multiple open research problems simultaneously. First, participants found an improved version of the reflection principle discovered in the previous workshop, though this progress is still being written up. Second, participants improved upon earlier work by LaVictoire, resulting in the paper “Robust Cooperation in the Prisoner’s Dilemma: Program Equilibrium via Provability Logic.” Third, participants improved upon Benya Fallenstein’s parametric polymorphism approach to tackling the Löbian obstacle for self-modifying systems.