Research Workshops

While we do not currently have any upcoming workshops scheduled, there are other ways to get involved.


July 20–22, 2018 – Berkeley, California

2nd Workshop on Approaches in AI Alignment

CHAI Participants

Jordan Alexander
Lawrence Chan
James Drain
Aaron Tucker
Alex Turner

Unaffiliated Participants

Alex Gunning

MIRI Participants

Alex Appel
Daniel Demski
Evan Hubinger
Linda Linsefors
Alex Mennen
David Simmons
Alex Zhu

This weekend workshop brought together research interns from MIRI and UC Berkeley’s Center for Human-Compatible AI (CHAI) to discuss conceptual foundations and open problems in AI safety research.

November 18–19, 2017 – Berkeley, California

1st Workshop on Approaches in AI Alignment

Tsvi Benson-Tilsen (MIRI)
Paul Christiano (OpenAI)
Andrew Critch (UC Berkeley)
Wei Dai (independent)
Abram Demski (MIRI)
Sam Eisenstat (MIRI)
Scott Garrabrant (MIRI)
Richard Mallah (FLI, Cambridge Semantics)
Andreas Stuhlmüller (Stanford)
Jessica Taylor (independent)

This weekend workshop brought together researchers interested in understanding and exploring the intersection between MIRI’s Agent Foundations research agenda and Paul Christiano’s research.

April 1–2, 2017 – Berkeley, California

4th Workshop on Machine Learning and AI Safety

Ryan Carey (MIRI)
Lawrence Chan (University of Pennsylvania)
Michael Cohen (Noodle.ai)
Monica Gates (UC Berkeley)
Luke Grecki (Shopkeep)
Daniel Hendrycks
Jenna Kainic (NYU)
Jonathan Krause
Robert Krzyzanowski (University of Illinois)
Eric Langlois
Patrick LaVictoire (MIRI)
Holden Lee
Long Ouyang
Ethan Perez (Rice University)
Anthony Rose (Uber, Texas A&M)
Anand Srinivasan (MIT, AlphaSheets)
Jessica Taylor (MIRI)

This workshop brought together researchers with machine learning backgrounds to work on long-term AI safety problems that can be modeled in current machine learning systems and frameworks, for instance those described in “Concrete Problems in AI Safety” and “Alignment for Advanced Machine Learning Systems”.

This workshop was funded in part by a grant from the Artificial Intelligence Journal.

March 25–26, 2017 – Berkeley, California

Workshop on Agent Foundations and AI Safety

Alexander Appel (University of Nevada Reno)
Michael Dennis (UC Berkeley)
Sam Eisenstat (Google)
Matt Frank
Scott Garrabrant (MIRI)
Juan David Gil (MIT)
Patrick LaVictoire (MIRI)
Eliana Lorch (Thiel Fellow)
Eli Sennesh
Harry Slatyer (Google)
Alex Zhu

This two-day weekend workshop brought together researchers with interests in long-term theoretical AI safety research. The workshop covered the context and content of current AI safety research agendas and projects (with a focus on MIRI’s Agent Foundations technical agenda). It was geared for researchers who have technical backgrounds and who have not previously worked extensively with MIRI.

December 1–3, 2016 – Berkeley, California

3rd Workshop on Machine Learning and AI Safety

Ryan Carey (MIRI)
Cameron Freer (Gamalon and Borelian)
Scott Garrabrant (MIRI)
Marcello Herreshoff (Google)
Patrick LaVictoire (MIRI)
Moshe Looks (Google)
Jeremy Nixon (Spark)
Anand Srinivasan (AlphaSheets)
Jessica Taylor (MIRI)
Eliezer Yudkowsky (MIRI)

This small three-day workshop brought together researchers with machine learning backgrounds to work on long-term AI safety problems that can be modeled in current machine learning systems and frameworks, for instance those described in “Concrete Problems in AI Safety” and “Alignment for Advanced Machine Learning Systems”.

Topics included zero-shot learning using shared embeddings, the differences between quantilization and regularization, generative adversarial networks and Goodhart’s Law, and mathematical formalizations of conservative concept learning.

November 11–13, 2016 – Berkeley, California

9th Workshop on Logic, Probability, and Reflection

Tsvi Benson-Tilsen (UC Berkeley)
Ryan Carey (MIRI)
Andrew Critch (MIRI)
Abram Demski (USC)
Sam Eisenstat (UC Berkeley)
Benya Fallenstein (MIRI)
Jack Gallagher
Scott Garrabrant (MIRI)
Marcello Herreshoff (Google)
Patrick LaVictoire (MIRI)
Nisan Stiennon (Google)
Jessica Taylor (MIRI)
Alex Zhu (MIT)

Participants at this three-day workshop — most of them veterans of past workshops — worked on a variety of problems related to MIRI’s Agent Foundations technical agenda.

Topics included safe exploration in rich domains, the difference between predicting a human and predicting HCH, and decision theories resulting from other decision theories self-modifying.

October 21–23, 2016 – Berkeley, California

2nd Workshop on Machine Learning and AI Safety

Ryan Carey (MIRI)
Sarah Constantin
Scott Garrabrant (MIRI)
Marcello Herreshoff (Google)
Patrick LaVictoire (MIRI)
William Saunders (Google)
Jessica Taylor (MIRI)
Eliezer Yudkowsky (MIRI)

Topics included concept learning with different ontologies, problems for Task AGI, censored representations, and conservative concepts.

August 26–28, 2016 – Berkeley, California

1st Workshop on Machine Learning and AI Safety

Paul Christiano (UC Berkeley)
Daniel Filan (UC Berkeley)
Cameron Freer (Gamalon and Borelian)
Dylan Hadfield-Menell (UC Berkeley)
Victoria Krakovna (Harvard)
Janos Kramar (University of Montreal)
Patrick LaVictoire (MIRI)
Jelena Luketina (University of Montreal)
Richard Mallah (FLI, Cambridge Semantics)
Jessica Taylor (MIRI)
Eliezer Yudkowsky (MIRI)

This three-day workshop brought together researchers with machine learning backgrounds to work on long-term AI safety problems that can be modeled in current machine learning systems and frameworks, for instance those described in “Concrete Problems in AI Safety” and “Alignment for Advanced Machine Learning Systems”.

Topics included learning human-interpretable and causal models of the environment; engineering cost functions based on impact measures to disincentivize side effects; designing robust metrics for the quality of a purported explanation of a plan; and developing a formal model of Goodhart’s Law which yields mild optimization.

August 12–14, 2016 – Berkeley, California

8th Workshop on Logic, Probability, and Reflection

Tsvi Benson-Tilsen (UC Berkeley)
Andrew Critch (MIRI)
Sam Eisenstat (UC Berkeley)
Benya Fallenstein (MIRI)
Scott Garrabrant (MIRI)
Patrick LaVictoire (MIRI)
Nate Soares (MIRI)
Jessica Taylor (MIRI)
Eliezer Yudkowsky (MIRI)

Participants at this workshop — all of them veterans of past workshops — worked on a variety of problems related to MIRI’s Agent Foundations technical agenda, with a focus on decision theory and the formal construction of logical counterfactuals.

June 17, 2016 – Berkeley, California

CSRBAI Workshop on Agent Models and Multi-Agent Dilemmas

Twenty participants attended from institutions including:

USC Institute for Creative Technologies
Carleton University
Future of Humanity Institute
Carnegie Mellon University
Harvard
Oxford University
University College London
Australian National University
UC Berkeley
UT Austin
Princeton University
Columbia University

The Colloquium Series on Robust and Beneficial AI included a series of workshops to facilitate conversations and collaborations between people interested in a number of different approaches to the technical challenges associated with AI robustness and reliability.

The fourth workshop of CSRBAI focused on the topic of designing agents that behave well in their environments, without ignoring the effects of the agent’s own actions on the environment or on other agents within the environment.

June 11–12, 2016 – Berkeley, California

CSRBAI Workshop on Preference Specification

Twenty participants attended from institutions including:

Australian National University
University College London
Center for the Study of Existential Risk
University of Oxford
Future of Humanity Institute
Carnegie Mellon University
The Swiss AI Lab IDSIA
UC Berkeley
Brown University
University of Montreal
USC Institute for Creative Technologies

The third workshop of CSRBAI focused on the topic of preference specification for highly capable AI systems, in which the perennial problem of wanting code to “do what I mean, not what I said” becomes increasingly challenging.

June 4–5, 2016 – Berkeley, California

CSRBAI Workshop on Robustness and Error-Tolerance

Fourteen participants attended from institutions including:

University College London
Center for the Study of Existential Risk
Google
Future of Humanity Institute
Carnegie Mellon University
Australian National University
UC Berkeley
The Swiss AI Lab IDSIA
Cornell University
USC Institute for Creative Technologies

The second workshop of CSRBAI focused on the topic of robustness and error-tolerance in AI systems, and how to ensure that when AI systems fail, they fail gracefully and detectably.

May 28–29, 2016 – Berkeley, California

CSRBAI Workshop on Transparency

Twenty participants attended from institutions including:

Oregon State University
Australian National University
Future of Humanity Institute
Carnegie Mellon University
IBM Research
Montreal Institute for Learning Algorithms
Google Research
Stanford University
Google
UC Berkeley
University College London
Harvard
Future of Life Institute

The first workshop of CSRBAI focused on the topic of transparency in AI systems, and how we can increase transparency while maintaining capabilities.

April 1–3, 2016 – Berkeley, California

Self-Reference, Type Theory, and Formal Verification

Miëtek Bak (Least Fixed)
Benya Fallenstein (MIRI)
Jack Gallagher (Gallabytes)
Jason Gross (MIT)
Ramana Kumar (Cambridge)
Patrick LaVictoire (MIRI)
Daniel Selsam (Stanford)
Nathaniel Thomas (Stanford)

Participants worked on questions of self-reference in type theory and automated theorem provers, with the goal of studying systems that model themselves.

August 28–30, 2015 – Berkeley, California

3rd Introductory Workshop on Logical Decision Theory

Holger Dell (Saarland University)
Owain Evans (MIT)
Benya Fallenstein (MIRI)
Benjamin Fox (Israel Defense Forces)
Patrick LaVictoire (MIRI)
Jonathan Lee (Cambridge)
Ben Levinstein (Oxford)
Jelena Luketina (Aalto)
David Steinberg (U Maryland)
Nate Soares (MIRI)
Eliezer Yudkowsky (MIRI)

This was the sixth in a series of introductory workshops, where MIRI brought together researchers with different backgrounds, discussed open problems in one of the technical agenda topics, and began projects and collaborations in that area.

The topic of this workshop was decision theory, and projects begun at the workshop are discussed in the following post: Proof Length and Logical Counterfactuals Revisited

August 7–9, 2015 – Berkeley, California

2nd Introductory Workshop on Logical Uncertainty

Pedro Carvalho (Instituto Superior Técnico)
Adele Dewey-Lopez (SEED Platform Inc.)
Benya Fallenstein (MIRI)
John Fox (Oxford)
Robert Krzyzanowski (UIC)
Patrick LaVictoire (MIRI)
Michele Reilly (Turing Inc.)
Nate Soares (MIRI)
Nathaniel Thomas (Stanford)
Michael Westmoreland (Denison)
Eliezer Yudkowsky (MIRI)

This was the fifth in a series of introductory workshops, where MIRI brought together researchers with different backgrounds, discussed open problems in one of the technical agenda topics, and began projects and collaborations in that area.

The topic of this workshop was logical uncertainty, and projects begun at the workshop are discussed in the following post: What’s logical coherence for anyway?

June 26–28, 2015 – Berkeley, California

1st Introductory Workshop on Vingean Reflection

Siddharth Bhaskar (UCLA)
Justin Brody (Goucher College)
Abram Demski (USC)
Benya Fallenstein (MIRI)
Roko Jelavić (Ericsson)
Seth Kurtenbach (U Missouri)
Patrick LaVictoire (MIRI)
Kenneth Presting (Renaissance Computing Institute)
Jess Riedel (Perimeter Institute)
Nate Soares (MIRI)
Eliezer Yudkowsky (MIRI)

This was the fourth in a series of introductory workshops, where MIRI brought together researchers with different backgrounds, discussed open problems in one of the technical agenda topics, and began projects and collaborations in that area.

The topic of this workshop was Vingean reflection, and projects begun at the workshop are discussed in the following posts:

June 12–14, 2015 – Berkeley, California

2nd Introductory Workshop on Logical Decision Theory

Manav Bhushan (Oxford)
Paul Crowley (Google)
Benya Fallenstein (MIRI)
Preston Greene (NTU)
Jason Gross (MIT)
Nick Hay (UC Berkeley)
Victoria Krakovna (Harvard)
Patrick LaVictoire (MIRI)
Jan Leike (Australian National University)
Nate Soares (MIRI)
Eliezer Yudkowsky (MIRI)

This was the third in a series of introductory workshops, where MIRI brought together researchers with different backgrounds, discussed open problems in one of the technical agenda topics, and began projects and collaborations in that area.

The topic of this workshop was decision theory, and projects begun at the workshop are discussed in the following post: Fixed point theorem in the finite and infinite case

May 29–31, 2015 – Berkeley, California

1st Introductory Workshop on Logical Uncertainty

Sarah Constantin (Yale)
Benya Fallenstein (MIRI)
Jacob Hilton (University of Leeds)
Vanessa Kosoy (Metaqube)
Janos Kramar (Independent)
Patrick LaVictoire (MIRI)
Shivaram Lingamneni (UC Berkeley)
Quinn Maurmann (Quidsi)
Nate Soares (MIRI)
Charlie Steiner (Independent)
Eliezer Yudkowsky (MIRI)

This was the second in a series of introductory workshops, where MIRI brought together researchers with different backgrounds, discussed open problems in one of the technical agenda topics, and began projects and collaborations in that area.

The topic of this workshop was logical uncertainty, and projects begun at the workshop are discussed in the following posts:

May 4–6, 2015 – Berkeley, California

1st Introductory Workshop on Logical Decision Theory

Sam Eisenstat (Twitter)
Benya Fallenstein (MIRI)
Scott Garrabrant (UCLA)
George Hotz (Vicarious)
Patrick LaVictoire (MIRI)
Evan Lloyd (UCLA)
Nate Soares (MIRI)
Eliezer Yudkowsky (MIRI)
Sebastien Zany (Independent)

This was the first in a series of introductory workshops, where MIRI brought together researchers with different backgrounds, discussed open problems in one of the technical agenda topics, and began projects and collaborations in that area.

The topic of this workshop was decision theory, and projects begun at the workshop are discussed in the following posts:

May 3–11, 2014 – Berkeley, CA

7th Workshop on Logic, Probability, and Reflection

Mihály Bárász (Google)
Paul Christiano (UC Berkeley)
Benya Fallenstein (Bristol U)
Marcello Herreshoff (Google)
Patrick LaVictoire (Quixey)
Nate Soares (Google)
Nisan Stiennon (Stanford)
Qiaochu Yuan (UC Berkeley)
Eliezer Yudkowsky (MIRI)

Participants at this workshop — all of them veterans of past workshops — worked on a variety of problems related to Friendly AI. The first tech report from this workshop is available here.

December 14–20, 2013 – Berkeley, CA

6th Workshop on Logic, Probability, and Reflection

Nate Ackerman (Harvard)
John Baez (UC Riverside)
Paul Christiano (UC Berkeley)
Benya Fallenstein (Bristol U)
Cameron Freer (MIT)
Jeremy Hahn (Harvard)
Wojtek Moczydlowski (Google)
Michele Reilly (independent)
Will Sawin (Princeton)
Nate Soares (Google)
Nisan Stiennon (Stanford)
Gregory Wheeler (LMU Munich)
Eliezer Yudkowsky (MIRI)

Participants at this workshop focused on the Löbian obstacle, probabilistic logic, and the intersection of logic and probability more generally. The results of this workshop are described here. See photos from the workshop here.

November 23–29, 2013 – Oxford, UK

5th Workshop on Logic, Probability, and Reflection

Stuart Armstrong (Oxford)
Mihály Bárász (Google)
Catrin Campbell-Moore (LMU Munich)
Daniel Dewey (Oxford)
Benya Fallenstein (Bristol U)
Jacob Hilton (Oxford)
Ramana Kumar (Cambridge)
Jan Leike (U Freiburg)
Bas Steunebrink (IDSIA)
Gregory Wheeler (LMU Munich)
Eliezer Yudkowsky (MIRI)

Participants at this workshop investigated problems related to reflective agents, probabilistic logic, and priors over logical statements / the logical omniscience problem. Some results from this workshop were developed further at the December 2013 workshop and described here.

September 7–13, 2013 – Berkeley, CA

4th Workshop on Logic, Probability, and Reflection

Paul Christiano (UC Berkeley)
Wei Dai (independent)
Gary Drescher (independent)
Kenny Easwaran (USC)
Cameron Freer (MIT)
Patrick LaVictoire (Quixey)
Ilya Shpitser (U Southampton)
Vladimir Slepnev (Google)
Nisan Stiennon (Stanford)
Andreas Stuhlmüller (MIT & Stanford)
Eliezer Yudkowsky (MIRI)

This workshop focused on a variety of open problems related to normative decision theory. Participants brainstormed “well-posed problems” in the area, built on LaVictoire et al.’s Löbian cooperation work, made some progress on formalizing updateless decision theory, and formulated additional toy problems such as the Ultimate Newcomb’s Problem.

These results are still being written up in various forms.

July 8–14, 2013 – Berkeley, CA

3rd Workshop on Logic, Probability, and Reflection

Andrew Critch (PhD, UC Berkeley)
Abram Demski (USC)
Benya Fallenstein (Bristol U)
Marcello Herreshoff (Google)
Jonathan Lee (Cambridge)
Will Sawin (Princeton)
Qiaochu Yuan (UC Berkeley)
Eliezer Yudkowsky (MIRI)

This workshop focused on a variety of issues related to the Löbian obstacle for self-modifying systems, and to Demski’s earlier work on logical prior probability. The primary result was a proof that attempting to create a probability distribution which performs scientific induction on Π₁ statements, converging to probability 1 for the true versions of such statements, can create zero limiting probabilities assigned to true Π₂ statements. This result is still being written up, but it has been discussed briefly in a blog post by Demski. Other bits of progress were developed at further workshops and described here.

April 3–24, 2013 – Berkeley, CA

2nd Workshop on Logic, Probability, and Reflection

Stuart Armstrong (Oxford)
Mihály Bárász (Google)
Paul Christiano (UC Berkeley)
Andrew Critch (PhD, UC Berkeley)
Daniel Dewey (Oxford)
Benya Fallenstein (Bristol U)
Marcello Herreshoff (Google)
Patrick LaVictoire (U Wisconsin)
Jacob Steinhardt (Stanford)
Jessica Taylor (Stanford)
Qiaochu Yuan (UC Berkeley)
Eliezer Yudkowsky (MIRI)

This three-week workshop addressed multiple open research problems simultaneously. First, participants found an improved version of the reflection principle discovered in the previous workshop, though this progress is still being written up. Second, participants improved upon earlier work by LaVictoire, resulting in the paper “Robust Cooperation in the Prisoner’s Dilemma: Program Equilibrium via Provability Logic.” Third, participants improved upon Benya Fallenstein’s parametric polymorphism approach to tackling the Löbian obstacle for self-modifying systems.

November 11–18, 2012 – Berkeley, CA

1st Workshop on Logic, Probability, and Reflection

Mihály Bárász (Google)
Paul Christiano (UC Berkeley)
Marcello Herreshoff (Google)
Eliezer Yudkowsky (MIRI)

This workshop pursued one line of attack on the Löbian obstacle for self-modifying systems. The primary result of this workshop was a non-constructive “loophole” in Tarski’s undefinability of truth (via a fixed point theorem), which was later written up in draft form as “Definability of Truth in Probabilistic Logic” (see discussions here, here, and here).