CSRBAI talks on agent models and multi-agent dilemmas

We’ve uploaded the final set of videos from our recent Colloquium Series on Robust and Beneficial AI (CSRBAI) at the MIRI office, co-hosted with the Future of Humanity Institute. A full list of CSRBAI talks with public video or slides:

For a recap of talks from the earlier weeks at CSRBAI, see my previous blog posts on transparency, robustness and error tolerance, and preference specification. The last set of talks was part of the week focused on Agent Models and Multi-Agent Dilemmas:



Michael Wellman, Professor of Computer Science and Engineering at the University of Michigan, spoke about the implications and risks of autonomous agents in the financial markets (slides). Abstract:

Design for robust and beneficial AI is a topic for the future, but also of more immediate concern for the leading edge of autonomous agents emerging in many domains today. One area where AI is already ubiquitous is on financial markets, where a large fraction of trading is routinely initiated and conducted by algorithms. Models and observational studies have given us some insight on the implications of AI traders for market performance and stability. Design and regulation of market environments given the presence of AIs may also yield lessons for dealing with autonomous agents more generally.



Stefano Albrecht, a Postdoctoral Fellow in the Department of Computer Science at the University of Texas at Austin, spoke about “learning to distinguish between belief and truth” (slides). Abstract:

Intelligent agents routinely build models of other agents to facilitate the planning of their own actions. Sophisticated agents may also maintain beliefs over a set of alternative models. Unfortunately, these methods usually do not check the validity of their models during the interaction. Hence, an agent may learn and use incorrect models without ever realising it. In this talk, I will argue that robust agents should have both abilities: to construct models of other agents and contemplate the correctness of their models. I will present a method for behavioural hypothesis testing along with some experimental results. The talk will conclude with open problems and a possible research agenda.



Stuart Armstrong, from the Future of Humanity Institute in Oxford, spoke about “reduced impact AI” (slides). Abstract:

This talk will look at some of the ideas developed to create safe AI without solving the problem of friendliness. It will focus first on “reduced impact AI”, AIs designed to have little effect on the world – but from whom high impact can nevertheless be extracted. It will then delve into the new idea of AIs designed to have preferences over their own virtual worlds only, and look at the advantages – and limitations – of using indifference as a tool of AI control.



Lastly, Andrew Critch, a MIRI research fellow, spoke about robust cooperation in bounded agents. This talk is based on the paper “Parametric Bounded Löb’s Theorem and Robust Cooperation of Bounded Agents.” Talk abstract:

The first interaction between a pair of agents who might destroy each other can resemble a one-shot prisoner’s dilemma. Consider such a game where each player is an algorithm with read-access to its opponent’s source code. Tennenholtz (2004) introduced an agent which cooperates iff its opponent’s source code is identical to its own, thus sometimes achieving mutual cooperation while remaining unexploitable in general. However, precise equality of programs is a fragile cooperative criterion. Here, I will exhibit a new and more robust cooperative criterion, inspired by ideas of LaVictoire, Barasz and others (2014), using a new theorem in provability logic for bounded reasoners.
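
To make the source-code-equality criterion from the abstract concrete, here is a minimal Python sketch. It is not code from the talk or the paper. It assumes each agent is given as a source string that defines a function `act(my_source, opponent_source)`, and every name in it (`MIRROR_SOURCE`, `DEFECTOR_SOURCE`, `MIRROR_VARIANT_SOURCE`, `run`, `play`) is hypothetical and chosen only for illustration. The agent cooperates only with an exact textual copy of itself.

```python
# Illustrative sketch only: all names here are hypothetical, not from the paper.
# Each agent is a source string defining act(my_source, opponent_source),
# which returns "C" (cooperate) or "D" (defect).

MIRROR_SOURCE = '''
def act(my_source, opponent_source):
    # Cooperate only with an exact textual copy of this program.
    return "C" if opponent_source == my_source else "D"
'''

DEFECTOR_SOURCE = '''
def act(my_source, opponent_source):
    # Defect unconditionally.
    return "D"
'''

# Same decision rule as MIRROR_SOURCE, but the text differs by one comment.
MIRROR_VARIANT_SOURCE = MIRROR_SOURCE + "# a harmless trailing comment\n"


def run(source, opponent_source):
    """Load an agent from its source text and ask for its move."""
    namespace = {}
    exec(source, namespace)  # acceptable here: the sources are the trusted strings above
    return namespace["act"](source, opponent_source)


def play(source_a, source_b):
    """Play one round of the one-shot game; return (move of a, move of b)."""
    return run(source_a, source_b), run(source_b, source_a)


if __name__ == "__main__":
    print(play(MIRROR_SOURCE, MIRROR_SOURCE))          # exact copies
    print(play(MIRROR_SOURCE, DEFECTOR_SOURCE))        # cannot be exploited
    print(play(MIRROR_SOURCE, MIRROR_VARIANT_SOURCE))  # cooperation breaks
```

Two exact copies cooperate, and the unconditional defector gains nothing against the mirror agent. The third pairing shows the fragility the abstract mentions: the two programs implement the same decision rule, yet a single added comment makes the equality test fail, so both defect.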