MIRI senior researcher Eliezer Yudkowsky and executive director Nate Soares have a new introductory paper out on decision theory: “**Functional decision theory: A new theory of instrumental rationality**.”

Abstract:

This paper describes and motivates a new decision theory known as functional decision theory (FDT), as distinct from causal decision theory and evidential decision theory. Functional decision theorists hold that the normative principle for action is to treat one’s decision as the output of a fixed mathematical function that answers the question, “Which output of this very function would yield the best outcome?” Adhering to this principle delivers a number of benefits, including the ability to maximize wealth in an array of traditional decision-theoretic and game-theoretic problems where CDT and EDT perform poorly. Using one simple and coherent decision rule, functional decision theorists (for example) achieve more utility than CDT on Newcomb’s problem, more utility than EDT on the smoking lesion problem, and more utility than both in Parfit’s hitchhiker problem.

In this paper, we define FDT, explore its prescriptions in a number of different decision problems, compare it to CDT and EDT, and give philosophical justifications for FDT as a normative theory of decision-making.

Our previous introductory paper on FDT, “Cheating Death in Damascus,” focused on comparing FDT’s performance to that of CDT and EDT in fairly high-level terms. Yudkowsky and Soares’ new paper puts a much larger focus on FDT’s mechanics and motivations, making “Functional Decision Theory” the most complete stand-alone introduction to the theory.^{1}

Contents:

1. **Overview.**

2. **Newcomb’s Problem and the Smoking Lesion Problem.** In terms of utility gained, conventional EDT outperforms CDT in Newcomb’s problem, while underperforming CDT in the smoking lesion problem. Both CDT and EDT have therefore appeared unsatisfactory as expected utility theories, and the debate between the two has remained at an impasse. FDT, however, offers an elegant criterion for matching EDT’s performance in the former class of dilemmas, while also matching CDT’s performance in the latter class of dilemmas.

3. **Subjunctive Dependence.** FDT can be thought of as a modification of CDT that relies, not on causal dependencies, but on a wider class of *subjunctive* dependencies that includes causal dependencies as a special case.

4. **Parfit’s Hitchhiker.** FDT’s novel properties can be more readily seen in Parfit’s hitchhiker problem, where both CDT and EDT underperform FDT. Yudkowsky and Soares note three considerations favoring FDT over traditional theories: an argument from precommitment, an argument from information value, and an argument from utility.

5. **Formalizing EDT, CDT, and FDT.** To lend precision to the claim that a given decision theory prescribes a given action, Yudkowsky and Soares define algorithms implementing each theory.

6. **Comparing the Three Decision Algorithms’ Behavior.** Yudkowsky and Soares then revisit Newcomb’s problem, the smoking lesion problem, and Parfit’s hitchhiker problem, algorithms in hand.

7. **Diagnosing EDT: Conditionals as Counterfactuals.** The core problem with EDT and CDT is that the hypothetical scenarios that they consider are malformed. EDT works by conditioning on joint probability distributions, which causes problems when correlations are spurious.

8. **Diagnosing CDT: Impossible Interventions.** CDT, meanwhile, works by considering strictly causal counterfactuals, which causes problems when it wrongly treats unavoidable correlations as though they can be broken.

9. **The Global Perspective.** FDT’s form of counterpossible reasoning allows agents to respect a broader set of real-world dependencies than CDT can, while excluding EDT’s spurious dependencies. We can understand FDT as reflecting a “global perspective” on which decision-theoretic agents should seek to have the most desirable decision type, as opposed to the most desirable decision token.

10. **Conclusion.**

We use the term “*functional* decision theory” because FDT invokes the idea that decision-theoretic agents can be thought of as implementing deterministic functions from goals and observation histories to actions.^{2} We can see this feature clearly in **Newcomb’s problem**, where an FDT agent—let’s call her Fiona, as in the paper—will reason as follows:

Omega knows the decision I will reach: they are somehow computing the same decision function I am on the same inputs, and using that function’s output to decide how many boxes to fill. Suppose, then, that the decision function I’m implementing outputs “one-box.” The same decision function, implemented in Omega, must then also output “one-box.” In that case, Omega will fill the opaque box, and I’ll get its contents. (+$1,000,000.)

Or suppose that instead I take both boxes. In that case, my decision function outputs “two-box,” Omega will leave the opaque box empty, and I’ll get the contents of both boxes. (+$1,000.)

The first scenario has higher expected utility; therefore my decision function hereby outputs “one-box.”

Unlike a CDT agent that restricts itself to purely causal dependencies, Fiona’s decision-making is able to take into account the dependencies between Omega’s actions and her reasoning process itself. As a consequence, Fiona will tend to come away with far more money than CDT agents.
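The contrast between the two styles of reasoning in Newcomb’s problem can be made concrete with a small sketch. This is only an illustration, not the formalization given in the paper: it assumes a perfectly accurate Omega, and the function names (`payoff`, `fdt_value`, `cdt_value`) and the credence parameter `p_full` are hypothetical names introduced here.

```python
# Illustrative sketch only; not the algorithms defined in the paper.
# Payoffs follow the amounts quoted in Fiona's reasoning above.
OPAQUE_PRIZE = 1_000_000      # placed in the opaque box iff Omega predicts "one-box"
TRANSPARENT_PRIZE = 1_000     # always available in the transparent box

ACTIONS = ["one-box", "two-box"]


def payoff(action: str, prediction: str) -> int:
    """Money received for an action, given Omega's prediction."""
    opaque = OPAQUE_PRIZE if prediction == "one-box" else 0
    if action == "one-box":
        return opaque
    return opaque + TRANSPARENT_PRIZE


def fdt_value(output: str) -> int:
    """Value of the decision function having this output.

    Assumption: Omega computes the same function, so its prediction
    always equals the agent's output (a perfect predictor).
    """
    return payoff(action=output, prediction=output)


def cdt_value(action: str, p_full: float) -> float:
    """Value of an action with the box contents held fixed.

    p_full is a hypothetical credence that the opaque box is already
    full; the action is treated as unable to affect it.
    """
    return (p_full * payoff(action, "one-box")
            + (1 - p_full) * payoff(action, "two-box"))


fdt_choice = max(ACTIONS, key=fdt_value)
cdt_choice = max(ACTIONS, key=lambda a: cdt_value(a, p_full=0.5))

print("FDT-style choice:", fdt_choice)
print("CDT-style choice:", cdt_choice)
```

Under these assumptions, the fixed-contents calculation favors two-boxing by exactly $1,000 for every value of `p_full`, while the function-level calculation compares $1,000,000 against $1,000 and favors one-boxing.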

At the same time, FDT avoids the standard pitfalls EDT runs into, e.g., in the smoking lesion problem. The smoking lesion problem has a few peculiarities, such as the potential for agents to appeal to the “tickle defense” of Ellery Eells; but we can more clearly illustrate EDT’s limitations with the **XOR blackmail problem**, where tickle defenses are of no help to EDT.

In the XOR blackmail problem, an agent hears a rumor that their house has been infested by termites, at a repair cost of $1,000,000. The next day, the agent receives a letter from the trustworthy predictor Omega stating:

I know whether or not you have termites, and I have sent you this letter iff exactly one of the following is true: (i) the rumor is false, and you are going to pay me $1,000 upon receiving this letter; or (ii) the rumor is true, and you will not pay me upon receiving this letter.

In this dilemma, EDT agents pay up, reasoning that it would be bad news to learn that they have termites—in spite of the fact that their termite-riddenness does not depend, either causally or otherwise, on whether they pay.

In contrast, Fiona the FDT agent reasons in a similar fashion to how she does in Newcomb’s problem:

Since Omega’s decision to send the letter is based on a reliable prediction of whether I’ll pay, Omega and I must both be computing the same decision function. Suppose, then, that my decision function outputs “don’t pay” on input “letter.” In the cases where I have termites, Omega will then send me this letter and I won’t pay (−$1,000,000); while if I don’t have termites, Omega won’t send the letter (−$0).

On the other hand, suppose that my decision function outputs “pay” on input “letter.” Then, in the case where I have termites, Omega doesn’t send the letter (−$1,000,000), and in the case where I don’t have termites, Omega sends the letter and I pay (−$1,000).

My decision function determines whether I conditionally pay and whether Omega conditionally sends a letter. But the termites aren’t predicting me, aren’t computing my decision function at all. So if my decision function’s output is “pay,” that doesn’t change the termites’ behavior and doesn’t benefit me at all; so I don’t pay.

Unlike the EDT agent, Fiona correctly takes into account that paying won’t increase her utility in the XOR blackmail dilemma; and unlike the CDT agent, Fiona takes into account that one-boxing *will* increase her utility in Newcomb’s problem.
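The arithmetic behind the XOR blackmail comparison can be sketched in a few lines. This is a rough illustration under stated assumptions, not the paper’s formal treatment: Omega is taken to be perfectly accurate, the prior probability of termites `p_termites` is a hypothetical parameter (the problem statement gives no number), and all function names are introduced here for illustration.

```python
# Illustrative sketch only; not the algorithms defined in the paper.
REPAIR_COST = 1_000_000   # cost of termite damage, from the problem statement
PAYMENT = 1_000           # amount Omega asks for


def letter_sent(termites: bool, pays_on_letter: bool) -> bool:
    """Omega sends the letter iff exactly one of (i) or (ii) holds."""
    case_i = (not termites) and pays_on_letter
    case_ii = termites and (not pays_on_letter)
    return case_i != case_ii


def outcome(termites: bool, pays_on_letter: bool) -> int:
    """Utility in one world, given the policy 'pay (or not) on seeing a letter'."""
    utility = -REPAIR_COST if termites else 0
    if letter_sent(termites, pays_on_letter) and pays_on_letter:
        utility -= PAYMENT
    return utility


def policy_value(pays_on_letter: bool, p_termites: float) -> float:
    """Expected utility of a policy, averaged over the termite prior.

    p_termites is a hypothetical prior, not given in the source.
    """
    return (p_termites * outcome(True, pays_on_letter)
            + (1 - p_termites) * outcome(False, pays_on_letter))


def value_given_letter(pays: bool) -> int:
    """EDT-style value: condition on both the letter and the action."""
    # With a perfect Omega, exactly one termite state is consistent
    # with having received the letter and taking this action.
    (termites,) = [t for t in (True, False) if letter_sent(t, pays)]
    return outcome(termites, pays)


print("Conditioning on the letter:",
      {"pay": value_given_letter(True), "refuse": value_given_letter(False)})
print("Policy-level, p_termites=0.5:",
      {"pay": policy_value(True, 0.5), "refuse": policy_value(False, 0.5)})
```

Conditioning on the letter makes paying look $999,000 better, which is the reasoning that leads the EDT agent to pay. At the policy level, the termite term is identical under both policies, so refusing comes out ahead by `(1 - p_termites) * 1000` for any prior.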

FDT, then, provides an elegant alternative to both traditional theories, simultaneously offering us a simpler and more general rule for expected utility maximization in practice, and a more satisfying philosophical account of rational decision-making in principle.

For additional discussion of FDT, I recommend “Decisions Are For Making Bad Outcomes Inconsistent,” a conversation exploring the counter-intuitive fact that in order to decide what action to output, a decision-theoretic agent must be able to consider hypothetical scenarios in which their deterministic decision function outputs something other than what it outputs in fact.^{3}

1. “Functional Decision Theory” was originally drafted prior to “Cheating Death in Damascus,” and was significantly longer before we received various rounds of feedback from the philosophical community. “Cheating Death in Damascus” was produced from material that was cut from early drafts; other cut material included a discussion of proof-based decision theory, and some Death in Damascus variants left on the cutting room floor for being needlessly cruel to CDT.
2. To cover mixed strategies in this context, we can assume that one of the sensory inputs to the agent is a random number.
3. Many of the hypotheticals an agent must consider are internally inconsistent: a deterministic function only has one possible output on a given input, but agents must base their decisions on the expected utility of many different “possible” actions in order to choose the best action. E.g., in Newcomb’s problem, FDT and EDT agents must evaluate the expected utility of two-boxing in order to weigh their options and arrive at their final decision at all, even though it would be inconsistent for such an agent to two-box; and likewise, CDT must evaluate the expected utility of the impossible hypothetical where a CDT agent one-boxes.

   Although poorly understood theoretically, this kind of counterpossible reasoning seems to be entirely feasible in practice. Even though a false conjecture classically implies all propositions, mathematicians routinely reason in a meaningful and nontrivial way with hypothetical scenarios in which a conjecture has different truth-values. The problem of how to best represent counterpossible reasoning in a formal setting, however, remains unsolved.