On October 15th from 4:30-5:30pm, MIRI workshop participant Paul Christiano will give a technical talk at the Harvard University Science Center, room 507, as part of the Logic at Harvard seminar and colloquium.
Christiano’s title and abstract are:
Probabilistic metamathematics and the definability of truth
No model M of a sufficiently expressive theory can contain a truth predicate T such that for all S, M |= T(“S”) if and only if M |= S. I’ll consider the setting of probabilistic logic, and show that there are probability distributions over models which contain an “objective probability function” P such that M |= a < P(“S”) < b almost surely whenever a < P(M |= S) < b. This demonstrates that a probabilistic analog of a truth predicate is possible as long as we allow infinitesimal imprecision. I’ll argue that this result significantly undercuts the philosophical significance of Tarski’s undefinability theorem, and show how the techniques involved might be applied more broadly to resolve obstructions due to self-reference.
Yudkowsky’s talk will be somewhat more accessible than Christiano’s, and will take place in MIT’s Ray and Maria Stata Center, in room 32-123 (aka Kirsch Auditorium, with 318 seats). There will be light refreshments 15 minutes before the talk. Yudkowsky’s title and abstract are:
Recursion in rational agents: Foundations for self-modifying AI
Reflective reasoning is a familiar but formally elusive aspect of human cognition. This issue comes to the forefront when we consider building AIs which model other sophisticated reasoners, or who might design other AIs which are as sophisticated as themselves. Mathematical logic, the best-developed contender for a formal language capable of reflecting on itself, is beset by impossibility results. Similarly, standard decision theories begin to produce counterintuitive or incoherent results when applied to agents with detailed self-knowledge. In this talk I will present some early results from workshops held by the Machine Intelligence Research Institute to confront these challenges.
The first is a formalization and significant refinement of Hofstadter’s “superrationality,” the (informal) idea that ideal rational agents can achieve mutual cooperation on games like the prisoner’s dilemma by exploiting the logical connection between their actions and their opponent’s actions. We show how to implement an agent which reliably outperforms classical game theory given mutual knowledge of source code, and which achieves mutual cooperation in the one-shot prisoner’s dilemma using a general procedure. Using a fast algorithm for finding fixed points, we are able to write implementations of agents that perform the logical interactions necessary for our formalization, and we describe empirical results.
Second, it has been claimed that Gödel’s second incompleteness theorem presents a serious obstruction to any AI understanding why its own reasoning works or even trusting that it does work. We exhibit a simple model for this situation and show that straightforward solutions to this problem are indeed unsatisfactory, resulting in agents who are willing to trust weaker peers but not their own reasoning. We show how to circumvent this difficulty without compromising logical expressiveness.
Time permitting, we also describe a more general agenda for averting self-referential difficulties by replacing logical deduction with a suitable form of probabilistic inference. The goal of this program is to convert logical unprovability or undefinability into very small probabilistic errors which can be safely ignored (and may even be philosophically justified).
Also, on October 18th at 7pm there will be a Less Wrong / Methods of Rationality meetup and party on the MIT campus in Building 6, room 120. There will be snacks and refreshments, and Yudkowsky will be in attendance.