Will MacAskill on normative uncertainty


Will MacAskill recently completed his DPhil at Oxford University and, as of October 2014, will be a Research Fellow at Emmanuel College, Cambridge.

He is the cofounder of Giving What We Can and 80,000 Hours. He’s currently writing a book, Effective Altruism, to be published by Gotham (Penguin USA) in summer 2015.

Luke Muehlhauser: In MacAskill (2014) you tackle the question of normative uncertainty:

Very often, we are unsure about what we ought to do… Sometimes, this uncertainty arises out of empirical uncertainty: we might not know to what extent non-human animals feel pain, or how much we are really able to improve the lives of distant strangers compared to our family members. But this uncertainty can also arise out of fundamental normative uncertainty: out of not knowing, for example, what moral weight the wellbeing of distant strangers has compared to the wellbeing of our family; or whether non-human animals are worthy of moral concern even given knowledge of all the facts about their biology and psychology.

…one might have expected philosophers to have devoted considerable research time to the question of how one ought to take one’s normative uncertainty into account in one’s decisions. But the issue has been largely neglected. This thesis attempts to begin to fill this gap.

In the first part of your thesis you argue that when the moral theories to which an agent assigns some credence are cardinally measurable (as opposed to ordinal-scale) and they are intertheoretically comparable, the agent should choose an action which “maximizes expected choice-worthiness” (MEC), which is akin to maximizing expected value across multiple uncertain theories of what is desirable.

I suspect that result will be intuitive to many, so let’s jump forward to where things get more interesting. You write:

Sometimes, [value] theories are merely ordinal, and, sometimes, even when theories are cardinal, choice-worthiness is not comparable between them. In either of these situations, MEC cannot be applied. In light of this problem, I propose that the correct metanormative theory is sensitive to the different sorts of information that different theories provide. In chapter 2, I consider how to take normative uncertainty into account in conditions where all theories provide merely ordinal choice-worthiness, and where choice-worthiness is noncomparable between theories, arguing in favour of the Borda Rule.

What is the Borda Rule, and why do you think it’s the best action rule under these conditions?

Will MacAskill: Re: “I suspect that result will be intuitive to many.” Maybe in your circles that’s true! Many, or even most, philosophers get off the boat way before this point. They say that there’s no sense of ‘ought’ according to which what one ought to do takes normative uncertainty into account. I’m glad that I don’t have to defend that for you, though, as I think it’s perfectly obvious that the ‘no ought’ position is silly.

As for the Borda Rule: the Borda Rule is a type of voting system, which works as follows. For each theory, an option’s Borda Score is equal to the number of options that rank lower in the theory’s choice-worthiness ordering than that option. An option’s Credence-Weighted Borda Score is equal to the sum, across all theories, of the decision-maker’s credence in the theory multiplied by the Borda Score of the option, on that theory.

So, for example, suppose I have 80% credence in Kantianism and 20% credence in Contractualism. (Suppose I’ve had some very misleading evidence….) Kantianism says that option A is the best option, then option B, then option C. Contractualism says that option C is the best option, then option B, then option A.

The Borda scores, on Kantianism, are:
A = 2
B = 1
C = 0

The Borda scores, on Contractualism, are:
A = 0
B = 1
C = 2

Each option’s Credence-Weighted Borda Score is:
A = 0.8*2 + 0.2*0 = 1.6
B = 0.8*1 + 0.2*1 = 1
C = 0.8*0 + 0.2*2 = 0.4

So, in this case, the Borda Rule would say that A is the most appropriate option, followed by B, and then C.
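The arithmetic above can be reproduced with a few lines of Python. This is only a minimal sketch: the function names are illustrative rather than taken from the thesis, and it assumes each theory gives a strict ranking with no ties.

```python
def borda_scores(ranking):
    """Borda Score of each option: the number of options ranked below it.

    `ranking` lists options from most to least choice-worthy (no ties assumed).
    """
    n = len(ranking)
    return {option: n - 1 - position for position, option in enumerate(ranking)}


def credence_weighted_borda(theories):
    """Sum, across theories, of credence times Borda Score.

    `theories` is a list of (credence, ranking) pairs.
    """
    totals = {}
    for credence, ranking in theories:
        for option, score in borda_scores(ranking).items():
            totals[option] = totals.get(option, 0.0) + credence * score
    return totals


theories = [
    (0.8, ["A", "B", "C"]),  # Kantianism
    (0.2, ["C", "B", "A"]),  # Contractualism
]
scores = credence_weighted_borda(theories)
ordering = sorted(scores, key=scores.get, reverse=True)
print(scores)
print(ordering)
```

Sorting by score gives the appropriateness ordering A, then B, then C, matching the worked example.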

The reason we need to use some sort of voting system is that I’m considering, at this point, only ordinal theories: theories that tell you that it’s better to choose A over B (alt: that “A is more choice-worthy than B”), but won’t tell you how much more choice-worthy A is than B. So, in these conditions, we have to have a theory of how to take normative uncertainty into account that’s sensitive only to each theory’s choice-worthiness ordering (as well as the degree of credence in each theory), because the theories I’m considering don’t give you anything more than an ordering.

The key reason why I think the Borda Rule is better than any other voting system is that it satisfies a condition I call Updating Consistency. The idea is that increasing your credence in some particular theory T1 shouldn’t make the appropriateness ordering (that is, the ordering of options in terms of what-you-ought-to-do-under-normative-uncertainty) worse by the lights of T1.

This condition seems to me to be very plausible indeed. But, surprisingly, very few voting systems satisfy that property, and those others that do have other problems.

Luke: One problem for the Borda Rule is that it is, as you say, “extremely sensitive to how one individuates options” — a problem analogous to the problem of clone-dependence in voting theory. You tackle this problem by modifying the Borda Rule to include a measure over the set of all possible options. Could you explain how that works? Also, is this modification to the Borda Rule novel to your thesis?

Will: A measure is a way of giving sense to the size of a space. It allows us to say that some options represent a larger part of possibility space than others. This is an intuitive idea: ‘drinking tea tomorrow’ represents a larger portion of possibility space than ‘drinking tea with my left hand tomorrow at 3pm’. With the idea of a measure on board, we can rewrite our definition of the Borda Rule, as follows (I’ll ignore the possibility of equally choice-worthy options, as that makes the definition a little more complicated):

For each theory, an option’s Borda Score is equal to the sum total of the measure of the options that rank lower in the theory’s choice-worthiness ordering than that option. An option’s Credence-Weighted Borda Score is equal to the sum, across all theories, of the decision-maker’s credence in the theory multiplied by the Borda Score of the option, on that theory.

So, suppose that, according to some theory Ti, A>B. On the old definition of the Borda Rule, A gets a Borda Score of 1. But if option B is split into options B’ and B”, such that A>B’=B”, then A gets a Borda score of 2. The fact that A’s score has changed just because of how you’ve individuated options is problematic.

But let’s use the new definition, which incorporates a measure, and suppose that the measure of A is 0.5 and the measure of B is 0.5. If so, then when the decision-maker faces options A and B, A gets a Borda Score of 0.5 on Ti. But when option B is split into options B’ and B”, the measure is split, too. Suppose that B’ and B” are equally large. If so, then B’ would have a measure of 0.25 and B” would have a measure of 0.25. So A’s Borda Score, on Ti, would be 0.5, just as before.
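A small Python sketch of the measure-based definition shows the same invariance. The names here are illustrative. Because the definition given above sets ties aside, the sketch simply assumes that tied options each receive the total measure of the options strictly below them; only A's score matters for this example.

```python
def measure_borda_scores(tiers, measure):
    """Measure-based Borda Score on a single theory.

    `tiers` lists groups of options from most to least choice-worthy;
    options in the same group are treated as equally choice-worthy.
    `measure` maps each option to its share of possibility space.
    Assumption: an option's score is the total measure of strictly lower options.
    """
    scores = {}
    for i, tier in enumerate(tiers):
        lower = sum(measure[o] for lower_tier in tiers[i + 1:] for o in lower_tier)
        for option in tier:
            scores[option] = lower
    return scores


# Ti says A > B, with each option taking up half of the space.
before = measure_borda_scores([["A"], ["B"]], {"A": 0.5, "B": 0.5})

# B is split into two equally large options, B1 and B2 (B' and B'' in the text).
after = measure_borda_scores(
    [["A"], ["B1", "B2"]], {"A": 0.5, "B1": 0.25, "B2": 0.25}
)

print(before["A"], after["A"])
```

A's score is 0.5 in both cases, whereas a count-based score would move from 1 to 2.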

This modification to the Borda Rule is novel, though the idea was given to me by Owen Cotton-Barratt, so I can’t take credit! I guess the reason it hasn’t been suggested in the voting theory literature is because it might seem obvious that every candidate gets the same measure. But perhaps you could think of the ‘space’ of possible political positions (which would be easy if it were really a left-right spectrum), and assign candidates a measure based on how much of this space they take up. That could possibly allow for the Borda Rule to avoid problems to do with clone-dependence. But I think that for actual voting systems, the Schulze method is better than the Borda Rule. It’s clone-independent even without a measure, and is much less vulnerable to strategic voting than the Borda Rule is.

Luke: What is the relevance of Arrow’s impossibility theorem to your suggested use of a modified Borda Rule for handling normative uncertainty?

Will: I suggest that, in conditions of ordinal theories, we should exploit the analogy with voting. But that analogy with voting means that we’ll run into an analogue of Arrow’s Impossibility Theorem: the result that no voting system can satisfy all of a number of highly desirable properties.

There are a few ways to formulate the impossibility result. The strongest, in my view, is to say that any voting system that satisfies other more essential properties must violate Contraction Consistency, where Contraction Consistency is defined as follows:
Let A be the option set, and M be the set of maximally appropriate options within A. Let S be a subset of A that contains all members of M. The condition is: an option is maximally appropriate, given option set S, iff it is a member of M.

It’s a condition that you’ve got to be careful how to formulate. I don’t go into that in my thesis. But some violations of it are intuitively clearly irrational. E.g. imagine you’ve got the options of blueberry ice cream or strawberry ice cream. You currently prefer blueberry. But then you discover that the restaurant also serves chocolate ice cream, and so you switch your preference from blueberry to strawberry, even though your assessment of the relative values of blueberry and strawberry hasn’t changed. That seems irrational – e.g. it would suggest that you should spend resources trying to find out if you have available to you an option that you know you won’t take.

I think that contraction consistency is a problem for the Borda Rule. But it’s a problem that affects all voting system analogues. So it’s something that we’ve got to live with – it’s just unfortunate that we have (or ought to have) credence in merely ordinal theories.
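To make the problem concrete, here is a hypothetical case, sketched in Python, in which the count-based Borda Rule violates Contraction Consistency. The credences and rankings are invented for illustration and do not come from the thesis.

```python
def credence_weighted_borda(theories, option_set):
    """Credence-Weighted Borda Scores computed within `option_set` only."""
    totals = {option: 0.0 for option in option_set}
    for credence, ranking in theories:
        # Keep the theory's ordering, restricted to the available options.
        restricted = [o for o in ranking if o in option_set]
        for position, option in enumerate(restricted):
            totals[option] += credence * (len(restricted) - 1 - position)
    return totals


def most_appropriate(theories, option_set):
    """Set of options with the highest Credence-Weighted Borda Score."""
    totals = credence_weighted_borda(theories, option_set)
    best = max(totals.values())
    return {o for o, s in totals.items() if s == best}


# Hypothetical credences and rankings (best option first).
theories = [
    (0.6, ["A", "B", "C"]),
    (0.4, ["B", "C", "A"]),
]

full = most_appropriate(theories, {"A", "B", "C"})
contracted = most_appropriate(theories, {"A", "B"})
print(full, contracted)
```

With all three options available, B scores highest. Removing C, which was never the most appropriate option, makes A score highest instead, which is exactly the kind of switch the condition rules out.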

There is a second response as well, which is to distinguish the Narrow and Broad versions of the Borda Rule. The Narrow version assigns Borda Scores within an option-set. The Broad version assigns Borda Scores across all possible options. It’s only the Narrow version that violates Contraction Consistency. But the Broad version has its own weirdnesses. Suppose that you’ve got a situation:

99%: T1: A>B
1%: T2: B>A

Where T1 and T2 are merely ordinal theories. It might seem obvious that you should pick A in that situation. But you can’t infer that from that case, according to the Broad Borda Rule. Instead, you’ve got to look at how A and B rank in T1 and T2’s ordering of all possible options. If A and B are very close on T1 but very far apart on T2, then B might be the most appropriate option. So the Broad Borda is very difficult to use. And it gives results that seem wrong to me – as if you’re ‘faking’ cardinality where there is none.
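The following Python sketch illustrates how this can happen. It assumes a made-up universe of 200 further options (X0 to X199) that T1 ranks below both A and B, and that T2 ranks between B and A; nothing about this particular universe comes from the thesis.

```python
def credence_weighted_borda(theories, option_set):
    """Credence-Weighted Borda Scores computed within `option_set` only."""
    totals = {option: 0.0 for option in option_set}
    for credence, ranking in theories:
        restricted = [o for o in ranking if o in option_set]
        for position, option in enumerate(restricted):
            totals[option] += credence * (len(restricted) - 1 - position)
    return totals


# Hypothetical "all possible options": A, B, and 200 others.
others = [f"X{i}" for i in range(200)]
t1 = ["A", "B"] + others    # A and B adjacent at the top of T1
t2 = ["B"] + others + ["A"]  # B at the top and A at the bottom of T2
theories = [(0.99, t1), (0.01, t2)]

narrow = credence_weighted_borda(theories, {"A", "B"})  # scores within {A, B}
broad = credence_weighted_borda(theories, set(t1))      # scores across everything

print(narrow["A"] > narrow["B"])
print(broad["B"] > broad["A"])
```

On the Narrow version A comes out ahead of B, but on the Broad version the 1% credence in T2 is enough to put B ahead, because T2 places so many options between them.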

So my general view on this is that any account you have will have deep problems. Endorsing a particular view involves carefully weighing up different strengths and weaknesses; there’s no obviously correct position. (This becomes a theme when you start working on normative uncertainty. To an extent, this should be expected: we’re dealing with messy nonideal agents, who don’t have perfect access to their own values or to the normative truth).

Luke: Your thesis covers many other interesting topics; we won’t try to cover them all here. How would you summarize the other major “takeaways” you’d most want people to know about from your thesis?

Will: The “Maximise Expected Choice-Worthiness” approach to moral uncertainty is the best approach. It is able to respond to a number of objections that have been levelled against it.

If you think that you can’t compare choice-worthiness across theories, then you should normalise different theories at their variance. But I think that the arguments for intertheoretic incomparability don’t work. Instead, you should feel comfortable about using your intuitions about how different theories compare.
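One way to read "normalising at the variance" is sketched below in Python. This is an illustration under stated assumptions, not the thesis's own formulation: it assumes a finite option set, gives every option equal weight when computing the mean and variance, and uses invented choice-worthiness numbers and credences. Each theory is rescaled to mean 0 and variance 1 across the options, and expected choice-worthiness is then maximised.

```python
from statistics import mean, pstdev


def normalise_at_variance(values):
    """Rescale one theory's choice-worthiness to mean 0 and variance 1.

    Assumption: every option gets equal weight in the mean and variance.
    """
    vals = list(values.values())
    m, sd = mean(vals), pstdev(vals)
    if sd == 0:
        raise ValueError("theory is indifferent between all options")
    return {option: (v - m) / sd for option, v in values.items()}


def expected_choice_worthiness(theories):
    """Credence-weighted sum of variance-normalised choice-worthiness."""
    totals = {}
    for credence, values in theories:
        for option, v in normalise_at_variance(values).items():
            totals[option] = totals.get(option, 0.0) + credence * v
    return totals


# Hypothetical cardinal theories on very different raw scales.
theories = [
    (0.6, {"A": 10.0, "B": 0.0, "C": 5.0}),
    (0.4, {"A": 0.0, "B": 1.0, "C": 0.5}),
]
totals = expected_choice_worthiness(theories)
best = max(totals, key=totals.get)
print(totals, best)
```

After normalisation the two theories disagree with equal strength about A and B, so the higher credence in the first theory decides the matter, regardless of the raw scales.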

We can make sense of the idea of two theories T1 and T2 having exactly the same value-ordering over options, but T1 thinking that everything is twice as important as T2 does. So ‘utilitarianism’ really refers to a class of theories, each of different levels of amplification.

Most of our intuitions about different Newcomb and related problems can be captured by maximising expected choice-worthiness over uncertainty about whether evidential or causal decision theory is true (with much higher credence in causal decision theory than evidential decision theory). Taking decision-theoretic uncertainty into account puts causal decision theory on pretty strong grounds: you can respond to the intuitive and “Why Ain’cha Rich?” arguments in favour of evidential decision theory.

Moral philosophy provides a bargain in terms of gaining new information: doing just a bit of philosophical study or research can radically alter the value of one’s options. So individuals, philanthropists, and governments should all spend a lot more resources on researching and studying ethics than they currently do.

Even if you think that continued human survival is net bad, you should still work to prevent near-term human extinction, in case the human race gets evidence to the contrary over the next few centuries. (Well, this is true given a few fairly controversial premises.)

Luke: Thanks, Will!