MIRI Executive Director Nate Soares and Rutgers/UIUC decision theorist Ben Levinstein have a new paper out introducing functional decision theory (FDT), MIRI’s proposal for a general-purpose decision theory.
The paper, titled “Cheating Death in Damascus,” considers a wide range of decision problems. In every case, Soares and Levinstein show that FDT outperforms all earlier theories in utility gained. The abstract reads:
Evidential and Causal Decision Theory are the leading contenders as theories of rational action, but both face fatal counterexamples. We present some new counterexamples, including one in which the optimal action is causally dominated. We also present a novel decision theory, Functional Decision Theory (FDT), which simultaneously solves both sets of counterexamples.
Instead of considering which physical action of theirs would give rise to the best outcomes, FDT agents consider which output of their decision function would give rise to the best outcome. This theory relies on a notion of subjunctive dependence, where multiple implementations of the same mathematical function are considered (even counterfactually) to have identical results for logical rather than causal reasons. Taking these subjunctive dependencies into account allows FDT agents to outperform CDT and EDT agents in, e.g., the presence of accurate predictors. While not necessary for considering classic decision theory problems, we note that a full specification of FDT will require a non-trivial theory of logical counterfactuals and algorithmic similarity.
“Death in Damascus” is a standard decision-theoretic dilemma. In it, a trustworthy predictor (Death) promises to find you and bring your demise tomorrow, whether you stay in Damascus or flee to Aleppo. Fleeing to Aleppo is costly and provides no benefit, since Death, having predicted your future location, will then simply come for you in Aleppo instead of Damascus.
In spite of this, causal decision theory often recommends fleeing to Aleppo — for much the same reason it recommends defecting in the one-shot twin prisoner’s dilemma and two-boxing in Newcomb’s problem. CDT agents reason that Death has already made its prediction, and that switching cities therefore can’t cause Death to learn your new location. Even though the CDT agent recognizes that Death is inescapable, the CDT agent’s decision rule forbids taking this fact into account in reaching decisions. As a consequence, the CDT agent will happily give up arbitrary amounts of utility in a pointless flight from Death.
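The reasoning above can be sketched as a small calculation. This is only an illustrative toy, not code from the paper: the utility numbers `LIFE` and `TRAVEL_COST` and all function names are assumptions made up for this example. It shows a CDT-style expected utility that holds Death's location fixed, next to the outcome the agent actually gets when Death's prediction is perfect.

```python
LIFE = 1000.0        # hypothetical utility of surviving (assumption)
TRAVEL_COST = 1.0    # hypothetical cost of fleeing to Aleppo (assumption)

CITIES = ["Damascus", "Aleppo"]


def utility(action: str, death_city: str) -> float:
    """Utility of ending up in `action` when Death waits in `death_city`."""
    survival = 0.0 if action == death_city else LIFE
    cost = TRAVEL_COST if action == "Aleppo" else 0.0
    return survival - cost


def cdt_expected_utility(action: str, p_death_in_damascus: float) -> float:
    """CDT-style evaluation: Death's location is held fixed across actions."""
    p = p_death_in_damascus
    return p * utility(action, "Damascus") + (1 - p) * utility(action, "Aleppo")


def cdt_choice(p_death_in_damascus: float) -> str:
    """Pick the action with the highest CDT-style expected utility."""
    return max(CITIES, key=lambda a: cdt_expected_utility(a, p_death_in_damascus))


def actual_outcome(action: str) -> float:
    """With a perfect predictor, Death is waiting wherever the agent goes."""
    return utility(action, death_city=action)


# An agent who currently expects to stay believes Death is in Damascus.
choice = cdt_choice(1.0)
print(choice, actual_outcome(choice), actual_outcome("Damascus"))
```

Under these assumed numbers, an agent who is certain Death is waiting in Damascus computes a higher expected utility for fleeing, yet the realized outcome of fleeing is strictly worse than staying, because the agent dies either way and also pays the travel cost.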
Causal decision theory fails in Death in Damascus, Newcomb’s problem, and the twin prisoner’s dilemma — and also in the “random coin,” “Death on Olympus,” “asteroids,” and “murder lesion” dilemmas described in the paper — because its counterfactuals only track its actions’ causal impact on the world, and not the rest of the world’s causal (and logical, etc.) structure.
While evidential decision theory succeeds in these dilemmas, it fails in a new decision problem, “XOR blackmail.”[1] FDT consistently outperforms both of these theories, providing an elegant account of normative action for the full gamut of known decision problems.
The underlying idea of FDT is that an agent’s decision procedure can be thought of as a mathematical function. The function takes the state of the world described in the decision problem as an input, and outputs an action.
In the Death in Damascus problem, the FDT agent recognizes that their action cannot cause Death’s prediction to change. However, Death and the FDT agent are in a sense computing the same function: their actions are correlated, in much the same way that if the FDT agent were answering a math problem, Death could predict the FDT agent’s answer by computing the same mathematical function.
This simple notion of “what variables depend on my action?” avoids the spurious dependencies that EDT falls prey to. Treating decision procedures as multiply realizable functions does not require us to conflate correlation with causation. At the same time, FDT tracks real-world dependencies that CDT ignores, allowing it to respond effectively in a much more diverse set of decision problems than CDT.
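The idea that Death and the agent are "computing the same function" can be made concrete with a toy model. The sketch below is an assumption-laden illustration, not the paper's formalism: the policy functions, `play`, and the utility constants are all hypothetical names. Death's prediction and the agent's action are two separate evaluations of one decision function, so they agree for logical rather than causal reasons.

```python
from typing import Callable, Tuple

LIFE = 1000.0        # hypothetical utility of surviving (assumption)
TRAVEL_COST = 1.0    # hypothetical cost of fleeing to Aleppo (assumption)


def always_stay(world_state: str) -> str:
    """A decision function that outputs 'Damascus' on every input."""
    return "Damascus"


def always_flee(world_state: str) -> str:
    """A decision function that outputs 'Aleppo' on every input."""
    return "Aleppo"


def play(decision_function: Callable[[str], str],
         world_state: str = "Death is coming") -> Tuple[str, str, float]:
    """Run the dilemma for an agent implementing `decision_function`."""
    # Death predicts by evaluating the agent's decision function...
    death_city = decision_function(world_state)
    # ...and the agent later acts by evaluating the same function.
    agent_city = decision_function(world_state)
    survival = 0.0 if agent_city == death_city else LIFE
    cost = TRAVEL_COST if agent_city == "Aleppo" else 0.0
    return agent_city, death_city, survival - cost


for policy in (always_stay, always_flee):
    print(policy.__name__, play(policy))
```

Whichever output the function has, both call sites return it, so no deterministic policy escapes Death in this model; the only thing the agent's choice controls is whether the travel cost is paid.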
The main wrinkle in this decision theory is that FDT’s notion of dependence requires some account of “counterlogical” or “counterpossible” reasoning.
The prescription of FDT is that agents treat their decision procedure as a deterministic function, consider various outputs this function could have, and select the output associated with the highest-expected-utility outcome. What does it mean, however, to say that there are different outputs a deterministic function “could have”? Though one may be uncertain about the output of a certain function, there is in reality only one possible output of a function on a given input. Trying to reason about “how the world would look” on different assumptions about a function’s output on some input is like trying to reason about “how the world would look” on different assumptions about which is the largest integer in the set {1, 2, 3}.
In garden-variety counterfactual reasoning, one simply imagines a different (internally consistent) world, exhibiting different physical facts but the same logical laws. For counterpossible reasoning of the sort needed to say “if I stay in Damascus, Death will find me here” as well as “if I go to Aleppo, Death will find me there” (even though only one of these events is logically possible, under a full specification of one’s decision procedure and circumstances), one would need to imagine worlds where different logical truths hold. Mathematicians presumably do this in some heuristic fashion, since they must weigh the evidence for or against different conjectures; but it isn’t clear how to formalize this kind of reasoning in a practical way.[2]
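For classic dilemmas, the prescription (consider each output the decision function could have, then select the output with the best expected outcome) can be sketched if the counterfactual outcomes are simply stipulated by hand. The example below applies it to Newcomb's problem with the standard payoffs; the function names and the `accuracy` parameter are assumptions for illustration. It deliberately sidesteps the open problem: the hard part, deriving these counterfactuals from a theory of counterpossible reasoning, is exactly what is hard-coded here.

```python
from typing import Callable, Iterable

BIG = 1_000_000   # contents of the opaque box if one-boxing was predicted
SMALL = 1_000     # contents of the transparent box


def fdt_choose(outputs: Iterable[str],
               expected_utility: Callable[[str], float]) -> str:
    """Select the decision-function output with the highest expected utility."""
    return max(outputs, key=expected_utility)


def newcomb_expected_utility(output: str, accuracy: float = 0.99) -> float:
    """Expected payoff if the agent's decision function outputs `output`.

    Stipulated dependence (an assumption of this toy model): the predictor
    evaluated the same function, so its prediction matches `output` with
    probability `accuracy`.
    """
    if output == "one-box":
        # Prediction correct: opaque box is full. Incorrect: it is empty.
        return accuracy * BIG + (1 - accuracy) * 0.0
    # Two-boxing. Correct prediction: only SMALL. Incorrect: BIG + SMALL.
    return accuracy * SMALL + (1 - accuracy) * (BIG + SMALL)


outputs = ["one-box", "two-box"]
print(fdt_choose(outputs, newcomb_expected_utility))
```

With an accurate predictor the rule selects one-boxing. If `accuracy` is set to 0.5, so that the prediction carries no information about the function's output, the same rule selects two-boxing, which shows that the recommendation is driven entirely by the stipulated dependence between the output and the prediction.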
Functional decision theory is a successor to timeless decision theory (first discussed in 2009), a theory by MIRI senior researcher Eliezer Yudkowsky that made the mistake of conditioning on observations. FDT is a generalization of Wei Dai’s updateless decision theory.[3]
We’ll be presenting “Cheating Death in Damascus” at the Formal Epistemology Workshop, an interdisciplinary conference showcasing results in epistemology, philosophy of science, decision theory, foundations of statistics, and other fields.[4]
Update April 7: Nate goes into more detail on the interpretive questions raised by functional decision theory in a follow-up conversation, “Decisions are for making bad outcomes inconsistent.”
Update November 25, 2019: A revised version of this paper has been accepted to The Journal of Philosophy. The JPhil version is here, while the 2017 FEW version is available here.
[1] Just as the variants on Death in Damascus in Soares and Levinstein’s paper help clarify CDT’s particular point of failure, XOR blackmail drills down more exactly on EDT’s failure point than past decision problems have. In particular, EDT cannot be modified to avoid XOR blackmail in the ways it can be modified to smoke in the smoking lesion problem.
[2] Logical induction is an example of a method for assigning reasonable probabilities to mathematical conjectures; but it isn’t clear from this how to define a decision theory that can calculate expected utilities for inconsistent scenarios. Thus the problem of reasoning under logical uncertainty is distinct from the problem of defining counterlogical reasoning.
[3] The name “UDT” has come to be used to pick out a multitude of different ideas, including “UDT 1.0” (Dai’s original proposal), “UDT 1.1”, and various proof-based approaches to decision theory (which make useful toy models, but not decision theories that anyone advocates adhering to).
FDT captures a lot (but not all) of the common ground between these ideas, and is intended to serve as a more general umbrella category that makes fewer philosophical commitments than UDT and which is easier to explain and communicate. Researchers at MIRI do tend to hold additional philosophical commitments that are inferentially further from the decision theory mainstream (which concern updatelessness and logical prior probability), for which certain variants of UDT are perhaps our best concrete theories, but no particular model of decision theory is yet entirely satisfactory.
[4] Thanks to Matthew Graves and Nate Soares for helping draft and edit this post.