We’ve uploaded a second set of videos from our recent Colloquium Series on Robust and Beneficial AI (CSRBAI) at the MIRI office, co-hosted with the Future of Humanity Institute. These talks were part of the week focused on robustness and error-tolerance in AI systems, and on how to ensure that when AI systems fail, they fail gracefully and detectably. All released videos are available on the CSRBAI web page.
Bart Selman, professor of computer science at Cornell University, spoke about machine reasoning and planning (slides). Excerpt:
I’d like to look at what I call “non-human intelligence.” It does get less attention, but the advances also have been very interesting, and they’re in reasoning and planning. It’s actually partly not getting as much attention in the AI world because it’s more used in software verification, program synthesis, and automating science and mathematical discoveries – other areas related to AI but not a central part of AI that are using these reasoning technologies. Especially the software verification world – Microsoft, Intel, IBM – push these reasoning programs very hard, and that’s why there’s so much progress, and I think it will start feeding back into AI in the near future.
Jessica Taylor presented on MIRI’s recently released second technical agenda, “Alignment for Advanced Machine Learning Systems”. Abstract:
If artificial general intelligence is developed using algorithms qualitatively similar to those of modern machine learning, how might we target the resulting system to safely accomplish useful goals in the world? I present a technical agenda for a new MIRI project focused on this question.
Stefano Ermon, assistant professor of computer science at Stanford, gave a talk on probabilistic inference and accuracy guarantees (slides). Abstract:
Statistical inference in high-dimensional probabilistic models is one of the central problems in AI. To date, only a handful of distinct methods have been developed, most notably (MCMC) sampling and variational methods. While often effective in practice, these techniques do not typically provide guarantees on the accuracy of the results. In this talk, I will present alternative approaches based on ideas from the theoretical computer science community. These approaches can leverage recent advances in combinatorial optimization and provide provable guarantees on the accuracy.
Paul Christiano, PhD student at UC Berkeley, gave a talk about training aligned reinforcement learning agents. Excerpt:
That’s the goal of the reinforcement learning problem. We as the designers of an AI system have some other goal in mind, which maybe we don’t have a simple formalization of. I’m just going to say, “We want the agent to do the right thing.” We don’t really care about what reward the agent sees; we just care that it’s doing the right thing.
So, intuitively, we can imagine that there’s some unobserved utility function U which acts on a transcript and just evaluates the consequences of the agent behaving in that way. So it has to average over all the places in the universe this transcript might occur, and it says, “What would I want the agent to do, on average, when it encounters this transcript?”
Jim Babcock discussed the AGI containment problem (slides). Abstract:
Ensuring that powerful AGIs are safe will involve testing and experimenting on them, but a misbehaving AGI might try to tamper with its test environment to gain access to the internet or modify the results of tests. I will discuss the challenges of securing environments to test AGIs in.
For a summary of how the event as a whole went, and videos of the opening talks by Stuart Russell, Alan Fern, and Francesca Rossi, see my last blog post.