Why Consequentialism is True

The thesis I will defend is that non-consequentialist moral theories are so flawed that they shouldn’t even be in the realm of consideration. Not only are they false, they are obviously false. I will show that to the extent they diverge from consequentialism, they fail independently plausible criteria for a good ethical system.

First, a word on the question of realism. This is an essay in normative and applied ethics, not metaethics. I am writing from a broadly quasi-realist or non-naturalist perspective, in which moral facts are heavily sequestered from non-moral facts. The arguments I make are a priori and concern abstract and formal requirements for moral truth; they are not a posteriori arguments that extrapolate from existing human desires or inquire into the intrinsic goodness/badness of affective states like happiness and suffering. Think Parfit and Moore, not Aristotle and Mill.

What is consequentialism? What is non-consequentialism? I will carve the line at the distinction between agent-neutral and agent-relative theories. First, some background:

An agent is a machine (person, etc.) equipped with a probability function and value function over states of affairs, and a decision procedure for selecting actions with the highest expected payoff (in value). We stipulate that all agents share the same set of states of affairs.
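The definition of an agent can be made concrete with a minimal sketch. It assumes that the probability function is conditional on the action taken (the text only says it ranges over states of affairs), and the names Agent, expected_value, and choose, along with the toy states and numbers, are hypothetical illustrations rather than part of the argument.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = str
Action = str


@dataclass
class Agent:
    # Assumption: probability(action) returns a distribution over states.
    probability: Callable[[Action], Dict[State, float]]
    # value(state) returns how good the agent takes that state to be.
    value: Callable[[State], float]

    def expected_value(self, action: Action) -> float:
        # Probability-weighted sum of the values of the possible states.
        return sum(p * self.value(s) for s, p in self.probability(action).items())

    def choose(self, actions: List[Action]) -> Action:
        # Decision procedure: pick the action with the highest expected value.
        return max(actions, key=self.expected_value)


# Toy example with made-up numbers.
outcomes = {
    "safe": {"ok": 1.0},
    "gamble": {"good": 0.5, "bad": 0.5},
}
values = {"good": 10.0, "ok": 4.0, "bad": 0.0}

agent = Agent(probability=lambda a: outcomes[a], value=lambda s: values[s])
best = agent.choose(["safe", "gamble"])
print(best)  # gamble (expected value 5.0 versus 4.0)
```

On this picture, two agents with the same probability function can still choose differently if their value functions differ, which is what the definitions of agent-neutrality and agent-relativity turn on.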

Agent-Neutrality: all morally perfect agents have the same value function

Agent-Relativity: morally perfect agents can have different value functions

It’s not obvious what these definitions have to do with consequentialism and non-consequentialism. Let’s take two ethical theories and test them using this framework:

Utilitarianism: morally perfect agents assign value to a state of affairs proportional to the total happiness in the state of affairs

Kantianism: morally perfect agents assign infinite disvalue to states of affairs in which they violate the categorical imperative

There are two major differences between the two theories. First, Kantianism is an absolutist theory (it readily assigns infinite value/disvalue). It is important to note that there are numerous moderate neo-Kantian views that are not absolutist. Our issue is not with absolutism per se.

The second difference is that Kantianism, unlike Utilitarianism, fails to uniquely determine a single value function for all morally perfect agents. The indexicality in the definition of Kantianism (each agent must ensure that it itself obeys the categorical imperative) leads to a drastically different value function for each morally perfect agent.
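A rough sketch of this difference, under heavy simplification: a state of affairs is reduced to a total happiness figure plus the set of agents who violate the categorical imperative in it. The State class, the violators field, and the is_agent_neutral helper are hypothetical names introduced only for illustration.

```python
import math
from dataclasses import dataclass
from typing import Callable, FrozenSet, List


@dataclass(frozen=True)
class State:
    total_happiness: float
    violators: FrozenSet[str]  # agents who violate the categorical imperative


def utilitarian_value(agent: str) -> Callable[[State], float]:
    # The agent's identity is ignored: every agent gets the same function.
    return lambda state: state.total_happiness


def kantian_value(agent: str) -> Callable[[State], float]:
    # Indexical: only the agent's OWN violations carry infinite disvalue.
    return lambda state: -math.inf if agent in state.violators else 0.0


def is_agent_neutral(theory, agents: List[str], states: List[State]) -> bool:
    # Agent-neutral (on these states) if all agents value every state alike.
    profiles = {tuple(theory(a)(s) for s in states) for a in agents}
    return len(profiles) == 1


agents = ["Benjy", "Michael Stevens"]
states = [
    State(10.0, frozenset({"Benjy"})),
    State(10.0, frozenset({"Michael Stevens"})),
]

print(is_agent_neutral(utilitarian_value, agents, states))  # True
print(is_agent_neutral(kantian_value, agents, states))      # False
```

The Kantian functions disagree because each one is sensitive only to whether its own agent appears among the violators; the utilitarian function never looks at who the agent is.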

I think this distinction is at the core of the consequentialist/non-consequentialist division. It is not just bad that a murder occurs, it is also bad to perform a murder oneself (says the non-consequentialist). The consequentialist promotes value whereas the non-consequentialist respects value.

Consequentialism is sometimes accused of leading to an esoteric morality (and some of its proponents gleefully admit that it does!): one put into practice in secrecy by an elite minority, while the common folk follow a manufactured false morality that leads to good consequences. Non-consequentialism faces its own problem of esotericism: why should a non-consequentialist promote the true moral theory? They do not care (as much as a consequentialist) about whether someone else does something wrong, so why bother spreading the word? Now, a fleshed-out moral theory could certainly fill this pothole, but a consequentialist could do the same with his problem of esotericism.

Non-consequentialist esotericism is far more problematic. It violates the intuitive principle:

Self-Promotion: agents following the correct moral theory tend to promote the influence of agents (that share similar probability functions) following the correct moral theory

To see why the non-consequentialist falls afoul of this principle, it might be better to get rid of the idea of a non-consequentialist theory altogether. Instead of Kantianism, we have Kantianism[Benjy], Kantianism[Michael Stevens], and so on. What does Kantianism[Benjy] have to do with Kantianism[Michael Stevens]? Why should I care whether Michael Stevens obeys the categorical imperative? I don’t think the categorical imperative itself gives us an answer.

Perhaps I am wrong about the categorical imperative, and perhaps there are plausible extensions of the categorical imperative that solve this difficulty. But if a Kantian cared as much about whether a fellow agent followed the categorical imperative as she cared about whether she herself followed the categorical imperative, she would cease to be a Kantian proper and would be more like a categorical-imperative-obedience maximizer.

The categorical-imperative-obedience-maximizer lies to prevent others from lying and murders to prevent others from murdering. She is really a very strange sort of consequentialist.

What have we shown so far? That consequentialists necessarily care more about their moral theories than non-consequentialists care about their moral theories. From this, we can derive that consequentialists are more likely to form into groups (how are collective decisions even evaluated in a non-consequentialist framework?), more likely to preserve their moral theories for future generations and more likely to evangelize extraterrestrial civilizations. In short: the more consequentialist a theory is, the more likely it is to survive and reproduce. Non-consequentialism doesn’t make a good meme.

I think that most ethical theories propounded throughout history have been mostly consequentialist (in absolute terms). It may not look as though consequentialism has been adopted throughout the world, but that is because most theories have been consequentialist enough. Notice that radically non-consequentialist theories, like unadulterated Kantianism (even more so, egoism), are extremely rare.

Just because a theory is good at spreading doesn’t mean we should accept it. The fact that natalists create more natalists than anti-natalists create anti-natalists isn’t a very good reason to believe natalism. But consequentialist anti-natalists will at least do their best at making anti-natalism a powerful force, even if it means having kids.

We should take this all as a sign that non-consequentialism doesn’t really care about itself. What I am proposing is even weaker than Robin Hanson’s assertion that Morality Should Exist. I am saying that The Universe Should Be Left in the Hands of the Moral.

What is my vision of morality? Not a set of independent solipsistic value functions, but a transparent and universal blueprint for a good world. Moral agents should act in complete harmony with one another, completely dedicated to the singular task of perfecting nature. What is good is what is right to bring about and vice versa. Morality is not a lonely enterprise; we should take every action as if we are gears in an infinite machine.

Appendix: Supposed Benefits of Non-consequentialism

Consequentialists are sometimes accused of ignoring “the separateness of persons” when aggregating the wellbeing of populations. Aggregation is unavoidable for any serious moral theory. There is simply no better way of weighing the moral claims of distinct individuals. I challenge the non-aggregationist to come up with some other method for resolving difficult cases. Weighted lotteries seem to be the most promising route.

Consequentialist thinking is sometimes blamed for various catastrophes in human history. This argument is so frustratingly awful that I can’t even bear to address it in detail. Two notes: many disasters have arisen from not thinking like a consequentialist, and many false results have been accepted because of the use of probability theory. We should not give up probability theory.

Consequentialism is sometimes criticized for making morality too uncertain. But why should we suppose that the right action isn’t uncertain? Sensitivity to epistemological considerations is a feature, not a bug.

Five Reasons Not to Be a Non-Transitivist

1) It offers absolutely no benefits over paraconsistent approaches except for nominally validating modus ponens

2) Higher-order vagueness is way less intuitive with three-valued logic than with modal operators. There’s a lot less flexibility in what rules you can make it follow

3) The principle of tolerance fails given the existence of a sorites sequence (it also succeeds, but still)

4) It probably counts itself as borderline true (and therefore tolerantly false)

5) It prevents a unified treatment of the nonfactual. Metaethics can’t be solved by using a nonclassical logic, but there are ways of extending classical theories to accommodate morality



My Ancient Defense of a Non-Transitivist Approach to Vagueness

I wrote this months ago and have basically completely disavowed it. But I put a lot of work into it, so it would be a shame if it never saw the light of day. If someone wants it, I can dig up the bibliography.


Vagueness is a troublesome and mysterious feature of natural languages. It is troublesome because it can be exploited to nefarious ends in an argument called “the sorites paradox.” It is mysterious because it is not altogether clear why language is vague at all, especially to the degree that it is. In using the term “vague,” I mean to distinguish it from other phenomena such as ambiguity, generality, context-dependence and uninformativeness. Ambiguity is the result of a confusion between multiple linguistic tokens, like “bat” (mammal of order Chiroptera) and “bat” (a variety of baseball equipment.) A predicate is general when it doesn’t make distinctions that more specific predicates might. An example of a general predicate is “child” (which does not specify gender.) Context-dependent predicates have different truth-conditions under different background conditions, like who is speaking and what the reference class is. “Tall” denotes different ranges of heights in humans and buildings. An uninformative predicate is one that supplies less information than is appropriate in a context, like “child” in this hypothetical exchange:

Speaker 1: Does Shmuel have any daughters?
Speaker 2: Shmuel has one child.

I will follow Mark Sainsbury in saying that vague concepts are “concepts without boundaries” (Sainsbury 1996.) That is, vague predicates classify objects without sorting them into well-defined sets. This is not a theory-neutral definition of vagueness: it rules out many of the most popular competitors. However, I think that the absence of boundaries is such a basic feature of vague concepts that we can safely reject any theory that posits them.

The Sorites Paradox

The sorites is a very simple paradox that most people are roughly familiar with without knowing that it has a name or that it is even a paradox. It is an ancient puzzle, but there is something mysterious about it that has made most people (even some philosophers) dismiss it without even trying to put forth a solution. There are many variations, but here is a simple one suited for our purposes:

P1: 1,000,000,000 grains of sand in a pile constitute a heap
P2: If 1,000,000,000 grains of sand in a pile constitute a heap, then 999,999,999 grains of sand in a pile constitute a heap
P3: If 999,999,999 grains of sand in a pile constitute a heap, then 999,999,998 grains of sand in a pile constitute a heap

P1000000001: If 2 grains of sand in a pile constitute a heap, then 1 grain of sand in a pile constitutes a heap
C: Therefore, one grain of sand in a pile constitutes a heap

This is a paradox because while each of the premises seems indisputable, together they seem to entail an unacceptable conclusion. Any solution to the paradox must hold that in every context, one of the premises isn’t true or the argument is invalid. (Or that the conclusion is true!)

Solving the Paradox

The philosopher’s job is to figure out how this is so. Approaches that deny one of P2-1000000001 can be called sharpist, using Bryan Frances’s terminology (Frances 1.) Most sharpists will hold that vague predicates do have boundaries, but that these boundaries are unknowable (epistemicism,) indeterminate (supervaluationism,) extremely variable (contextualism,) or in possession of some other status that supposedly makes them less problematic. I am mentioning these theories only for the purpose of completeness: I consider them misguided for reasons I indicated in the introduction.

An approach that rejects P1 could be called nihilist, and an approach that accepts the conclusion could be called trivialist. I do not know of any defenses of the trivialist view in the literature, so I will say no more about it except that it seems to share important similarities with the nihilist option. The nihilist view is also rare, but not unheard of. The clearest case of a true nihilist is Peter Unger, who takes the sorites paradox to be evidence for the nonexistence of ordinary material objects like heaps, chairs, and mountains. (Unger 1979.) Other nihilists take their views to be harmless and void of any serious metaphysical consequences. (Braun, Sider 2007) establishes a novel theory of ignoring vagueness within a nihilist (they call themselves “semantic nihilists”) framework. They are in agreement with the broad supervaluationist picture that vagueness is to be seen as the existence of multiple acceptable precisifications of a predicate, but they disagree with the identification of truth-under-every-acceptable-precisification (what supervaluationists call supertruth and Braun and Sider call approximate truth) with truth simpliciter. Braun and Sider hold that “There is typically a cloud of propositions in the neighborhood of a sentence uttered by a vague speaker. Vagueness prevents the speaker from singling out one of these propositions uniquely, but does not banish the cloud” (Braun, Sider 4.) Braun and Sider see themselves as vindicating “[the] old and attractive view [that] vagueness is to be eliminated before semantic notions (truth, implication, and so on) may be applied” (Braun, Sider 1.)

Another approach is the tolerance approach. Tolerantists accept, in one form or another, all instances of this principle:

P(x)∧(x~P y)→P(y)

where P is a vague predicate like “a number of grains of sand that suffices to make a heap of sand” and ~P is a relation of similarity in aspects relevant to whether something is P. Because a soritical chain of objects can be constructed for most vague predicates (a series of objects in which the first element falls under the extension of the predicate, the last element does not fall under the extension of the predicate, and any two adjacent members are similar in all aspects relevant to whether they fall under the extension of the predicate,) a revision of classical logic appears to be required. (Although, see (Pagin 2010.)) This revision usually takes the form of the denial of the transitivity of entailment: the principle that from A⊢B and B⊢C one can derive A⊢C. The easiest way to generate a non-transitive logic is to demand different standards for premises (the symbols on the left-hand side of ⊢) and conclusions (the symbols on the right-hand side of ⊢.) For example, one could say that a valid argument is one where if the premise is assigned the value 1, then the conclusion must be assigned the value 1 or the value ½.
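The failure of transitivity under this mixed standard can be checked by brute force. The following is a minimal sketch, not a full semantics for any published logic: it assumes three adjacent items in a soritical chain, and it models tolerance with the added assumption that the values assigned to neighbouring items differ by at most ½. The names admissible_models and mixed_valid are hypothetical.

```python
from itertools import product

VALUES = (0.0, 0.5, 1.0)


def admissible_models(n):
    # Assumption: tolerance is modelled by requiring that adjacent items
    # receive values that differ by at most 1/2.
    return [
        v for v in product(VALUES, repeat=n)
        if all(abs(v[i] - v[i + 1]) <= 0.5 for i in range(n - 1))
    ]


def mixed_valid(premise, conclusion, models):
    # Valid iff every model giving the premise value 1 gives the
    # conclusion value 1 or 1/2.
    return all(v[conclusion] >= 0.5 for v in models if v[premise] == 1.0)


models = admissible_models(3)
first_step = mixed_valid(0, 1, models)   # "item 0 is P" to "item 1 is P"
second_step = mixed_valid(1, 2, models)  # "item 1 is P" to "item 2 is P"
chained = mixed_valid(0, 2, models)      # "item 0 is P" to "item 2 is P"
print(first_step, second_step, chained)  # True True False
```

The countermodel to the chained inference assigns the three items the values 1, ½, and 0: each single step is acceptable, but the two steps cannot be composed.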
My own perspective falls between the nihilist and tolerantist camps. I agree with the nihilists that all predicates lack extensions, but I blame this not on vagueness, but on the possibility of verbal disputes surrounding those predicates. I will discuss this issue in a separate paper. I follow tolerantists in advocating for the adoption of a non-transitive logic to deal with the sorites paradox.

Dissolving the Paradox

Most people are not philosophers, of course, and so most people do not care about solving the sorites paradox. This is not troubling at all. What is troubling is what I would call dismissivism, the idea that there isn’t a solution to be found at all. Dismissivists substantiate this by blaming the paradox on what they see as a mistaken conception of the relation between reality and language. For example, some may reject the standard package of the correspondence theory of truth (the thesis that truth consists of a special relation between a statement and the world) and truth-conditional semantics (the thesis that the meaning of a statement consists of the conditions under which it is true.) One example of this approach can be found in (Correia 2013,) which puts forth a very interesting and radical analysis of the sorites paradox in terms of so-called “signalling games,” but falls short of providing a solution, even suggesting that any theory that avoided the paradox would be a misrepresentation of natural language (Correia 15.) It is obvious to me that natural language does avoid the paradox. And so, a formalization of this intuition would be appreciated.

Resolving the Paradox?

I agree with those who seek to sweep vagueness’s paradoxes under the rug in that I hold that no substantive account of vague truth can be supplied. However, we certainly use vague predicates in our day-to-day lives coherently, and it would be nice for logic to respect that. Logic on the picture I am advocating is not a matter of determining what might be the case but a matter of determining what inferences are acceptable. The sorites paradox is paradoxical not because it threatens our ideas about heaps but because it threatens our ideas of what can be derived from intuitively acceptable premises.

There are a great many non-transitive logics to choose from. The current favorite among tolerantists appears to be the logic ST (strict/tolerant), which can be characterized similarly to the “gappy” K3 and the “glutty” LP. If a statement is assigned the value ½ in a model of K3, it is treated as neither true nor false in that model (a gap), and if a statement is assigned the value ½ in a model of LP, it is treated as both true and false (a glut). ST splits the difference: premises are evaluated as in K3 and conclusions are evaluated as in LP. Another logic, “pb” (super/sub), with analogues to supervaluationist logic and subvaluationist logic, is more at home in a classical worldview. Although each instance of P(x)∧(x~P y)→P(y) follows from the claim that we have a Sorites series, the negation of the universal claim does as well (Not for all x and y, P(x)∧(x~P y)→P(y)), which may disqualify it in the eyes of very picky philosophers. ST arguably also suffers the same problem, although to a lesser extent: both the universal form of the tolerance principle and its negation follow from the existence of a Sorites series. I have a hunch that this could be avoided by considering non-transitive analogues of weakenings of LP that don’t prove

~A ⊢LP*~(A∧B)
as in (Beall 2004).
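For readers who want to see the gap/glut contrast in action, here is a rough propositional sketch. It assumes the usual three-valued tables (negation as 1 minus the value, conjunction as minimum, disjunction as maximum) and treats the three logics as differing only in the thresholds that premises and conclusions must meet. The weakened logic LP* is not modelled, and the names LOGICS and valid are hypothetical.

```python
from itertools import product

VALUES = (0.0, 0.5, 1.0)


def neg(a):
    return 1.0 - a


def conj(a, b):
    return min(a, b)


def disj(a, b):
    return max(a, b)


# (minimum value a premise must have, minimum value the conclusion must have)
LOGICS = {"K3": (1.0, 1.0), "LP": (0.5, 0.5), "ST": (1.0, 0.5)}


def valid(logic, premises, conclusion, n_atoms=2):
    # Search every three-valued assignment for a counterexample: all
    # premises meet their standard but the conclusion misses its own.
    premise_min, conclusion_min = LOGICS[logic]
    for v in product(VALUES, repeat=n_atoms):
        if all(p(*v) >= premise_min for p in premises) and conclusion(*v) < conclusion_min:
            return False
    return True


# Each inference is a pair: (list of premises, conclusion) over atoms a, b.
explosion = ([lambda a, b: a, lambda a, b: neg(a)], lambda a, b: b)
excluded_middle = ([], lambda a, b: disj(a, neg(a)))
displayed = ([lambda a, b: neg(a)], lambda a, b: neg(conj(a, b)))

for name in ("K3", "LP", "ST"):
    print(name, valid(name, *explosion), valid(name, *excluded_middle))
# K3 True False
# LP False True
# ST True True

print(valid("LP", *displayed))  # True: ~A entails ~(A∧B) in plain LP
```

In this propositional fragment ST accepts both explosion and excluded middle, so its departure from classical logic only shows up once something like the tolerance principle is added.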

1) This conception of vagueness can be fitted to a pluralist theory of vagueness just as easily as it can be to a nihilist theory. Under the nihilist picture, when we make vague statements we aren’t saying anything at all. Under a pluralist picture, when we make vague statements we are saying many things. Braun and Sider’s theory of ignoring vagueness could easily be extended as well, perhaps treating the ignoring of the plurality of what we say when we speak vaguely as an instance of conflation. A tolerant theory of vagueness based on conflation is described in (Ripley 2016). It is not clear exactly how the pluralist would resolve the sorites paradox without resorting to a tolerant framework. The literature contains some discussion of plurivaluationism, which seems to be what I am referring to as pluralism, so there may be some clues there.

2) If entailment is a relation between sets of premises and sets of conclusions, then a valid argument is one where if all the premises are true, then some of the conclusions are true.