Oxford Studies in Philosophy of Language. Volume 1 0198836562, 9780198836568


English Pages 294 Year 2019


OUP CORRECTED PROOF – FINAL, 24/1/2019, SPi


Oxford Studies in Philosophy of Language
Volume 1

Edited by Ernie Lepore and David Sosa


Great Clarendon Street, Oxford, OX2 6DP, United Kingdom

Oxford University Press is a department of the University of Oxford. It furthers the University’s objective of excellence in research, scholarship, and education by publishing worldwide. Oxford is a registered trade mark of Oxford University Press in the UK and in certain other countries.

© the several contributors 2019

The moral rights of the authors have been asserted.

First Edition published in 2019
Impression: 1

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Oxford University Press, or as expressly permitted by law, by licence or under terms agreed with the appropriate reprographics rights organization. Enquiries concerning reproduction outside the scope of the above should be sent to the Rights Department, Oxford University Press, at the address above.

You must not circulate this work in any other form and you must impose this same condition on any acquirer.

Published in the United States of America by Oxford University Press, 198 Madison Avenue, New York, NY 10016, United States of America

British Library Cataloguing in Publication Data: Data available

Library of Congress Control Number: 2018955391

ISBN 978–0–19–883656–8

Printed and bound in Great Britain by Clays Ltd, Elcograf S.p.A.

Links to third party websites are provided by Oxford in good faith and for information only. Oxford disclaims any responsibility for the materials contained in any third party website referenced in this work.


Contents

Preface
List of Contributors

1. The Subtle Lives of Descriptive Names
   Imogen Dickie
2. Sources of Context-Dependence: The Case of Knowledge Ascriptions
   Michael Glanzberg
3. Words by Convention
   Gail Leckie and J. R. G. Williams
4. Conditional Acceptance
   Ofra Magidor
5. Frege’s Begriffsschrift Theory of Identity Vindicated
   Ulrich Pardey and Kai F. Wehmeier
6. Truth
   Ian Rumfitt
7. Subordinating Speech and Speaking Up
   Gillian Russell
8. Context-Free Semantics
   Paolo Santorio
9. Semantic Explanations
   Zoltán Gendler Szabó

Index


Preface

With this inaugural issue, Oxford Studies in Philosophy of Language joins the distinguished family of Oxford Studies series, as a regular showcase for leading research in its area. Philosophy of language has been a main focus of philosophical research since at least Frege’s seminal contributions at the turn of the twentieth century. Since that “linguistic turn,” important work in philosophy has often been related in some significant way to philosophy of language. This series hopes to offer a regular snapshot of state-of-the-art contributions in this important field.

To be published biennially, and intended to be a forum for papers by some of the best scholars from around the world, both senior and junior, each issue will include an assortment of outstanding papers in philosophy of language, broadly construed.

This first issue of our series is a good instance of the form: it includes nine new papers by a distinguished range of philosophers. Together, the papers provide a perspective on the state of the sub-discipline.

Two of the papers investigate basic notions in the area, truth and reference: Imogen Dickie’s “The Subtle Lives of Descriptive Names” and Ian Rumfitt’s “Truth.” Dickie’s treatment of reference derives from a reconsideration of descriptive names and a rejection of the idea that the referent of such a name is the satisfier of the associated description. Rumfitt seeks to recapture a conception of truth due to P. F. Strawson on which the key insight is that “one who makes a statement or assertion makes a true statement if and only if things are as, in making the statement, he states them to be.”

“Words by Convention,” by Gail Leckie and Robbie Williams, and “Semantic Explanations,” by Zoltán Szabó, can both be seen as investigations in metasemantics. The former is concerned with the priority, relative to reductive projects in metasemantics, of our categorization of words into types; the latter with the question of whether semantic theories are merely “descriptive” or whether such theories can offer more substantive explanations.

Two other papers take up the phenomenon of context-sensitivity, considering the role of context in semantics generally and in the case of knowledge-ascriptions in particular: Paolo Santorio’s “Context-Free Semantics,” and Michael Glanzberg’s “Sources of Context-Dependence: The Case of Knowledge Ascriptions.” Santorio rejects any distinctive semantic role for context; Glanzberg defends a form of context-dependence for knowledge ascriptions and explores the varieties of context dependence found in natural language.

Ofra Magidor’s “Conditional Acceptance” finds that three prominent theories of conditionals cannot provide an adequate treatment of a case she devises. The paper by Gillian Russell, “Subordinating Speech and Speaking Up,” explores a broadly socio-political question in philosophy of language: how can “speaking up” work against the phenomenon of subordination? Frege’s Begriffsschrift is the focus of the paper by Ulrich Pardey and Kai Wehmeier: their “Frege’s Begriffsschrift Theory of Identity Vindicated” is concerned to rehabilitate Frege’s view in the face of two main objections leveled against it.

As you read the entries, you will see how in almost every case, the specific topics taken up are closely connected to other topics of deep and abiding interest in philosophy: collective action (like that involved in establishing conventions or practices), explanation, identity, individual action (like that involved in making a speech act), knowledge, reasoning, and subordination. Together, this broad-ranging set of papers reveals the breadth and depth of work in philosophy of language today. We expect future issues to provide an equally diverse, rich, and valuable collection of contributions to our discipline.

Our thanks to Peter Momtchiloff for his support, and for the addition of this series to the Oxford Studies family.


List of Contributors

Imogen Dickie, Department of Philosophy, St. Andrews University
Michael Glanzberg, Department of Philosophy, Northwestern University
Gail Leckie, Department of Philosophy, University of Leeds
Ofra Magidor, Faculty of Philosophy, Oxford University
Ulrich Pardey, Institut für Philosophie, Ruhr Universität Bochum
Ian Rumfitt, All Souls College, Oxford
Gillian Russell, Department of Philosophy, University of North Carolina, Chapel Hill
Paolo Santorio, Department of Philosophy, University of California, San Diego
Zoltán Gendler Szabó, Department of Philosophy, Yale University
Kai F. Wehmeier, Department of Logic & Philosophy of Science, University of California, Irvine
J. R. G. Williams, Department of Philosophy, University of Leeds


1 The Subtle Lives of Descriptive Names Imogen Dickie

Consider the following example:

Case 1: ‘Tremulous Hand’
‘Tremulous Hand’ is used to refer to the otherwise unidentified author of around 50,000 thirteenth-century glosses in manuscripts. Palaeographical analysis provides strong evidence that these glosses are the work of a single person with distinctive (tremulous and left-leaning) handwriting. All that is known about Tremulous Hand is what can be deduced from the glosses themselves.

‘Tremulous Hand’ is a ‘descriptive name’: a name associated with a stipulation of form ⌜Let α refer to the Ψ⌝.¹ The extant discussion² of such expressions is characterized by a standard claim and a controversy:

Standard claim (satisfactionality): A descriptive name’s referent, if it has one, is the satisfier of the associated description (if α refers, it refers to the satisfier of ⌜the Ψ⌝).

Central question of the controversy (singularity?): Is the thought expressed by a sentence containing a descriptive name a singular thought about the name’s bearer? (Part of what is at issue in this controversy is what counts as a genuinely ‘singular’ thought.)

¹ ‘α’ and ‘Ψ’ are schematic letters ranging over object-language singular terms and predicates respectively.
² See, for example, Evans 1982; Campbell 1999, 2002; Jeshion 2004, 2010; Reimer 2004; Recanati 2012; Goodman 2016.


This paper argues that the standard claim is false, and suggests a new solution to the controversy. Here are two more examples which will enable a gesture towards what I am going to propose:

Case 2: ‘Geraint the Blue Bard’
‘Geraint the Blue Bard’ was used for over a hundred years as a name for the otherwise unidentified author of a series of songs in medieval Welsh, dealing with medieval themes, and employing medieval metres. Efforts to find out more about Geraint’s life, taking off from cues in the texts, supposed that he flourished in the ninth century, and was either an apothecary, a minor aristocrat, or a priest. Rival factions collected large bodies of evidence to support each of these hypotheses. But in 1956 the ‘Blue Bard’ songs were shown to be the work of notorious nineteenth-century forger Edward Williams.

Case 3: ‘Gizmo’
X, the now aged head of a manufacturing company, likes to boast to his underlings about ‘the gizmo that started it all’, with strong suggestions that he was himself this thing’s inventor. The underlings introduce a descriptive name ‘Gizmo’ with aboutness-fixing description ⟨X’s most remunerative early invention⟩, and use X’s utterances (‘Ah, that was the year that the gizmo that started it all really took off’ etc.) and the company’s financial history to try to work out which of the firm’s early patents Gizmo was. In fact, there was an early patent that enabled the firm to get on its feet: the first version of the firm’s famous self-setting rat trap. But X was not its inventor. The firm’s early patents were all bought for almost nothing from an unworldly individual who died an impoverished emeritus professor in a university town.

I take it that there are reasonably clear intuitive verdicts³ about these cases. In Case 2, intuition cries out that there was no Geraint—‘Geraint the Blue Bard’ as used by the unfortunate scholars did not refer. In Case 3, we can imagine filling in the details in such a way that the intuitive verdict is that ‘Gizmo’ does refer—to the rat trap: one underling says to another ‘Well, here’s Gizmo, but you realize that X didn’t invent it after all . . . ’ If we take them at face value, these intuitive verdicts reverse what we should expect to find if the most flat-footed version of the standard claim is true. ‘Geraint’ and ‘Gizmo’ are descriptive names, associated with stipulations ‘Let “Geraint” refer to the author of these songs’ and ‘Let “Gizmo” refer to X’s most remunerative early invention’. The description that figures in the ‘Geraint’ stipulation is satisfied (by Edward Williams). The one in the ‘Gizmo’ stipulation is not. If the standard

³ I clarify the extent to which I think ‘intuitive’ verdicts like this carry evidential weight at pp. 19–22 of Dickie 2015.


claim as I have stated it is true, ‘Geraint’ refers to Edward Williams, and ‘Gizmo’ is an empty name: diagnoses repugnant to intuition. By the end of the paper, I shall have argued for a position that I think best explains these observations. The standard claim is false. And it is not false for the unexciting reason that the accompanying explicit stipulation might not capture the ‘real’ reference-fixing description associated with a name like ‘Tremulous Hand’, ‘Geraint’, or ‘Gizmo’. The standard claim is false because the mechanism of reference-fixing for these expressions is not satisfactional at all. The paper is structured as follows. §1 develops a general framework for accounts of aboutness-fixing for our thoughts about ordinary things—a framework which will provide the basis for accounts of reference-fixing for the singular terms we standardly use to express these thoughts. §2 uses this framework to overturn the standard claim and motivate an alternative, non-satisfactional, account of how descriptively mediated aboutness-fixing and reference-fixing work. §3 develops the response to the singularity controversy that I want to propose. §4 considers the consequences of the §§1–3 discussion for a right account of what speaker and hearer commit themselves to when the speaker makes, and the hearer accepts a ⌜Let α refer to the Ψ⌝ stipulation. I should add that, fascinating as descriptive names are in their own right, I take much of the interest of the topic to derive from how it fits into the wider picture of our thought and speech about ordinary particular things. I have allowed editorial decisions about which details to develop and which to elide to be guided by this view.

1.1 Aboutness and justification

This section introduces a framework for accounts of aboutness-fixing for our thoughts about ordinary things: things like tables, dogs, trees, and people.⁴ To get the framework in place, I shall concentrate on what have traditionally been taken to be the central instances of such thoughts: the perceptual demonstrative and proper-name-based cases, illustrated by Cases 4 and 5 respectively:

⁴ This section presents an alternative version of the argument of Dickie 2015: ch. 2. I leave open the extent to which the same picture applies to thoughts about non-ordinary things, for example, bosons, numbers, or systems of government.


Case 4: ‘That’
You are looking at a grapefruit on a table in front of you. The viewing conditions are good, and the situation devoid of causal and cognitive perversities: you are having an ordinary perceptual experience, caused by the grapefruit in an ordinary way. You form a body of beliefs you would express by saying things like ‘That is round’, ‘That is rolling’, ‘That is orange’.

Case 5: ‘Aneurin Bevan’
You have not heard the name ‘Aneurin Bevan’ before. Somebody begins to explain who Bevan was: ‘Aneurin Bevan was a British Labour Party politician. He was a long-standing member of parliament, and a cabinet minister in the 1940s and 50s. He was instrumental in the foundation of Britain’s National Health Service.’ Nothing about the situation leads you to doubt your informant’s reliability. You take the utterances at face value, forming a body of beliefs you would express using ‘Aneurin Bevan’.

In each of these cases, I take it that it is obvious which individual your beliefs are about. In Case 4 they are about the grapefruit you are looking at; in Case 5 they are about the politician Aneurin Bevan. But to say that your beliefs are about these individuals is as yet to say nothing about what makes it the case that these are the individuals they are about. This section develops a new answer to this ‘What makes it the case?’ question. The new answer is built around a principle derived from two further principles which I take to be basic, one connecting aboutness and truth, the other truth and justification:

Principle connecting aboutness and truth: If an ⟨α is Φ⟩ belief is about object o, it is true iff o is Φ.⁵ (If my belief that Jack has fleas is about my dog, it is true iff he has fleas.)

Principle connecting truth and justification: Justification is truth-conducive; in general and allowing exceptions, if your belief is justified, you will be unlucky if it is not true and not merely lucky if it is.

Given these principles, it will be surprising and disappointing if we cannot cut the intermediate term and obtain a third principle connecting aboutness and justification: a principle capturing the significance, for accounts of aboutness-fixing and therefore for the theory of reference, of the fact that justification is truth-conducive. The rest of this section argues for such a principle as applicable to the perceptual demonstrative

⁵ ‘An ⟨α is Φ⟩ belief’ should be read as an abbreviation for ‘A belief standardly expressed by a sentence of form ⌜α is Φ⌝’. ‘⟨Φ⟩’ and ‘Φ’ are braced together: ⟨Φ⟩ expresses conceptual representation of property Φ.


and proper-name-based cases. The next section extends the discussion to the case of descriptive names.

As a first step towards the aboutness and justification principle that I want to propose, note two features that Cases 4 and 5 have in common. In each case, you are maintaining a body of beliefs which you treat as about a single thing. And in each the body of beliefs is associated with what I shall call a ‘proprietary’ means of justification: a means of justification which you treat as trumping other means.

The fact that in each case you are treating the resulting body of beliefs as about a single thing shows itself in the ways you are prepared to allow it to develop. For example, when you believe ⟨That is round⟩ and ⟨That is rolling⟩ in Case 4, you are automatically prepared to move to ⟨That is round and rolling⟩, without looking for evidence that the round thing and the rolling thing are the same. And as you maintain your growing body of beliefs, you automatically guard against overt contradictions, revising your beliefs or reinterpreting or rejecting incoming testimony to avoid ⟨α is Φ⟩ and ⟨α is not Φ⟩ combinations.⁶

In Case 4, the proprietary means of justification is uptake from your attentional perceptual link with the grapefruit. In Case 5 it is careful uptake from the stream of ‘Aneurin Bevan’ testimony. A body of beliefs united by the treated by the subject as about the same thing relation may come to include beliefs not justified by the associated proprietary means. But the proprietary means is marked out by its ‘trumping’ status: ‘Actually it’s made of glass and will shatter if it falls’ I tell you, as we watch the grapefruit to which we are jointly attending roll along. You have no reason to doubt what I say, and form a belief, justified by uptake from my testimony. But when you see the grapefruit fall from a height onto the hardwood floor and roll away, perception trumps testimony and the belief is discarded.⁷

⁶ A body of beliefs treated by the subject as about a single thing is what some philosophers call a ‘mental file’—see Recanati 2012 for a recent and thorough discussion. I explain my own abstinence from use of this term in the appendix to Dickie (forthcoming). ⁷ I provide a more detailed discussion of the notion of proprietary justification at Dickie 2015: 50–2. There are various options to explore in deciding how to extend the treatment of this notion to allow for ‘mixed’ cases where a single body of beliefs is associated with different proprietary means of justification at different times, or with two means of justification that carry equal weight. On the question of what counts as a ‘means’ or ‘method’ of justification, see note 9.


The aboutness and justification principle that I am going to propose connects the aboutness of a body of beliefs treated by the subject as about a single thing with what I shall call ‘justificatory convergence’ for the associated proprietary means of justification: Principle connecting aboutness and justification (initial approximate version)—A body of beliefs treated by the subject as about some single thing is about object o iff its proprietary means of justification converges on o, making o the unique object whose properties the subject will be unlucky to get wrong and not merely lucky to get right in justifying beliefs in this way. Here is a parallel case to consolidate what the aboutness and justification principle says. Suppose that an astronomer, hereafter ‘A’, is compiling a report from the data delivered by a telescope focused on object o in the night sky. A has verified that the telescope is both focused and working as it should. The telescope delivers a stream of data: detection of motion; detection of fluctuating temperature; and so on. A compiles her report: ‘It’s moving. Its temperature is fluctuating between such-and-such values . . . ’. The fact that the telescope is focused on o does not entail that the report will get o’s properties right. But it does entail that the report will get o’s properties right unless some unlucky spoiler—a dirty mirror; deviant behaviour on o’s part—intervenes. The aboutness and justification principle treats the aboutness of our ordinary beliefs about ordinary things as what I shall call ‘cognitive focus’: the fact that a body of justified beliefs is about an object does not entail that all or any of them will match the object. It does entail that if a belief about an object is justified yet does not match what the object is like, some unlucky spoiler has got in the way. 
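Read schematically, the initial approximate version has the shape of a biconditional with a uniqueness condition on its right-hand side. The LaTeX sketch below is offered only as a reading aid. Its notation is an assumption introduced here, not the paper's own: $\mathcal{B}$ stands for a body of beliefs treated by the subject as about a single thing, $J_{\mathcal{B}}$ for its proprietary means of justification, $\mathrm{Conv}$ for justificatory convergence, and $\mathrm{Tracks}(J_{\mathcal{B}}, x)$ for the condition that, in justifying beliefs by $J_{\mathcal{B}}$, the subject will be unlucky to get $x$'s properties wrong and not merely lucky to get them right.

```latex
% Hypothetical notation, introduced for illustration only.
% The principle itself: aboutness iff justificatory convergence.
\[
  \mathrm{About}(\mathcal{B}, o)
  \;\Longleftrightarrow\;
  \mathrm{Conv}(J_{\mathcal{B}}, o)
\]
% Convergence on o: o is the unique object that the proprietary
% means of justification tracks in the sense glossed above.
\[
  \mathrm{Conv}(J_{\mathcal{B}}, o)
  \;:\Longleftrightarrow\;
  \forall x \,\bigl( \mathrm{Tracks}(J_{\mathcal{B}}, x) \leftrightarrow x = o \bigr)
\]
```

On this reading, in the telescope case the focused telescope plays the role of $J_{\mathcal{B}}$ and the astronomer's report that of $\mathcal{B}$. The notions of being ‘unlucky’ and ‘not merely lucky’ are left unanalysed in the sketch, as they are in the approximate version of the principle.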
To reach an official statement of the aboutness and justification principle, we must say something more precise about the notions of being ‘unlucky’ to get an object’s properties wrong, and ‘not merely lucky’ to get them right. This in turn requires taking a stand on how to precisify the underlying principle connecting truth and justification. I take it that some version of this principle is inescapable: it is part of the concept of theoretical justification—justification for belief—that forming justified beliefs is, in general and allowing exceptions, a way to form true beliefs: if Philosopher A shows that Philosopher B’s account of what it is for a belief to be justified entails that nothing has gone wrong in cases where a


justified belief is not true, A wins and B must go back to the drawing board. But the (inescapable) claim that we must accept some version of the truth and justification principle leaves completely open exactly which version is to be preferred. It is obviously not possible to do justice to the intricacies in which this question is embrangled in a section of a paper on something else. So rather than attempting to argue for a specific version of the principle, I shall rest with stating the version that I am going to employ. (Perhaps there is no one version of this principle which is to be preferred for all explanatory purposes. In any case, though I am not confident as to whether there is a definitive precisification of the connection between truth and justification, I am confident that the argument I am about to develop could be reconstructed, with suitable adjustments, around the various alternatives. The resulting aboutness and justification principle might itself look a little different from the principle that I shall propose. These differences will not matter for the purposes of this paper.) The version of the truth and justification principle that I shall suppose takes its rise from the observation that the cognitive capacities at our disposal for the purposes of forming justified beliefs are limited relative to the complexity of our environment, and that there are, therefore, many more ways a belief might fail to be true than we have the resources to rule out as we go about our belief-forming business. For example, consider my current belief, formed by uptake from perception, that people are riding bicycles past the window. My path to this belief is inconsistent with many ways it might fail to be true. 
If we set aside possibilities in which I am being taken in by some devious or unusual feature of the situation, my path to the belief rules out the possibility that what is outside is a six-lane highway devoid of bicycle traffic; the possibility that I am in fact staring at a blank wall rather than a three-dimensional bicycle-containing street scene; and many more besides. But in gesturing towards the ‘belief not true’ scenarios that my path to the belief does rule out, we have already conceded that there are others upon which it is silent. These are ‘devious’ or ‘unusual’ scenarios of the kind that were set aside preliminary to the gesture: the possibility that the things passing a few feet away are cars disguised to look like bicycles to avoid the city’s congestion charge; the possibility that rather than looking through a window I am looking at the last in a series of disguised and perfectly aligned mirrors, and the people on bicycles from whom my perceptual experience derives are in fact behind me and


several blocks away. Though there is, on the face of things, nothing in my path to belief that rules out these devious or unusual scenarios, I would not, in ordinary life, be regarded as under a requirement to hold back from forming my belief until I had gathered evidence to exclude them. In situations like the one described, it is bad doxastic practice to hold out for evidence that excludes arcane and unusual, as well as humdrum and commonplace belief-not-true circumstances. A subject with ordinary human information-processing capacities who insists on ruling out even the most arcane not-p possibilities before believing that p will be too sluggish a cognitive operator to flourish in our rapidly changing world. The elements of the precise truth and justification principle that I shall suppose can be abstracted from the discussion of this example. I shall suppose that a belief is justified only if formed by a route that eliminates some reasonable range of circumstances in which the belief is not true, where ‘elimination’ is defined as follows: Definition—a route to the formation of a belief ‘eliminates’ a circumstance iff the fact that the belief is formed by this route is incompatible with the circumstance (so that the fact that the belief is formed by this route entails that the circumstance is not actual).⁸ I shall annex the term ‘rational’ to describe beliefs like the one in the example, justified by a route that eliminates a sufficient range and proportion of the ways the belief might fail to be true that it would have been bad practice to hold out for further justification before forming the belief:

⁸ The definition presupposes some way of individuating routes to belief formation. Philosophers with reductionist agendas who wish to explain traditional epistemic notions (like ‘justification’ and ‘knowledge’) in terms of notions like ‘route to belief formation’ and ‘reliability’ taken as prior face notorious difficulties in saying how routes to belief formation are to be individuated without using the epistemic notions that are the target of the reductionist explanation. This is the ‘problem of individuation of methods’ for reductive reliabilism (sometimes called the ‘generality problem’). For in-depth discussion and a pessimistic survey of solutions available to a reductive reliabilist see Conee and Feldman 1998. This author has no reductionist agenda, and takes the notion of the ‘route’ by which a belief is formed to be explicable partly in terms of the aspects of the causal story behind the belief ’s formation that contribute to its having the kind of justification it does. A reader who does have a reductionist reliabilist agenda is invited to plug his or her own preferred solution to the problem of the individuation of methods into the definition.


Definition—a belief is ‘rational’ iff it is formed by a careful enough justification-conferring route. And I shall introduce a notion of ‘rational relevance’ defined as follows: Definition—Consider belief B formed by subject S. A B-not-true circumstance is ‘rationally irrelevant’ to S’s formation of B iff it need not be eliminated by S’s justification for B in order for this justification to secure B’s rationality. A B-true circumstance is ‘rationally irrelevant’ to S’s formation of B iff it is one in which rationalitysecuring-justification for the belief would fail to secure the belief ’s status as not-merely-luckily true. A circumstance is ‘rationally relevant’ to S’s formation of B iff it is not rationally irrelevant. (For example, the circumstance in which the things I am looking at are cars disguised as bicycles is a rationally irrelevant belief-not-true circumstance. A circumstance in which I am (though I do not realize it) looking at the reflections of distant cyclists, but there are also cyclists, unseen by me, going past behind the mirror just a few feet away is a rationally irrelevant circumstance where my belief happens to be true.) Given these elements, the version of the truth and justification principle that I am going to suppose can be stated as follows (capitalization signals official status):   —Justification that secures the rationality of a belief eliminates every rationally relevant circumstance where the belief is not true. I take the notions of ‘rationality’ and ‘rational relevance’ that I have introduced to be correlative to that of knowledge: a true belief formed by rationality-securing means counts as knowledge iff the circumstance in which it is formed is rationally relevant. I also take the notion of ‘rational relevance’ to be correlative to the ‘virtue reliabilist’ notion of a ‘manifestation’ of true-belief-forming competence. 
An exercise of true-belief-forming competence ‘manifests’ the competence iff it generates a true belief, and does so in virtue of being an exercise of the competence, rather than in some way that leaves the belief ’s truth a mere matter of luck.⁹ In these ⁹ The notion of ‘manifestation’ is a primitive of Sosa’s virtue reliabilist framework. (See Sosa 2015: ch. 2 for a recent and careful development.) The suggestion is that a performance

OUP CORRECTED PROOF – FINAL, 24/1/2019, SPi



 

terms, a rationally irrelevant circumstance is one where a belief formed by the exercise of a true-belief-forming competence nevertheless fails to manifest the competence (leaving it a matter of luck whether the belief is true). Combining these elements, we get the precise version of the aboutness and justification principle for which I am about to argue:   —A body of beliefs treated by subject S as about a single thing is about o iff its proprietary means of justification converges on o so that, for all , if S has proprietary rationalitysecuring justification for the belief that , this justification eliminates every rationally relevant circumstance where o is not Φ.¹⁰    is a biconditional connecting aboutness and a precisified notion of justificatory convergence: Aboutness

Justificatory convergence ,

S’s beliefs are about o

For all , if S has proprietary rationality-securing justification for believing , this justification eliminates every rationally relevant circumstance where o is not Φ.

To prove the biconditional, we shall establish each direction (left-toright; right-to-left) in turn. Here is an argument for the left-to-right direction (from aboutness to justificatory convergence): Suppose 1 S’s belief that is about o. Add the aboutness and truth principle: 2 If S’s belief that is about an object, the belief is true iff that object is Φ.

manifests a competence iff it is causally derived from the competence in a way that involves no deviant causal chains, where the right-hand side of this biconditional is not to be regarded as explanatorily prior to the left: causal derivation of performance from competence without a deviant causal chain is just what there is in cases of manifestation.

¹⁰ The quantifier over property representations (‘for all <Φ>’) ranges over the <Φ> such that the proprietary means of justification might deliver an ‘<α is Φ>’ or ‘<α is not Φ>’ verdict. I explain this in more detail at Dickie 2015: 59 and 199–211.


1 and 2 entail
3 S’s belief that <α is Φ> is true iff o is Φ.
Add TRUTH AND JUSTIFICATION:
4 Justification that secures a belief’s rationality eliminates every rationally relevant circumstance where the belief is not true.
3 and 4 entail
5 Justification that secures the rationality of the belief that <α is Φ> eliminates every rationally relevant circumstance where o is not Φ.
So we have the left-to-right direction of the ABOUTNESS AND JUSTIFICATION biconditional:
6 If S’s belief is about o, justification that secures the rationality of the belief eliminates every rationally relevant circumstance where o is not Φ.

The argument for the other direction of the biconditional (where there is justificatory convergence there is aboutness) is a proof by reductio:

Suppose
1 It is not sufficient, for S’s beliefs to be about o, that their proprietary means of justification converge on o.

Given 1, the following combination is coherent. S has proprietary rationality-securing justification for believing <α is Φ>. There is nothing devious interfering with the ‘detection of Φ-instantiation’ aspect of S’s path to the belief: in forming the belief, S manifests competence at detection of the presence of some Φ-instantiating object. o is the object upon which the proprietary means of justification for S’s beliefs converges, so S’s manifestation of Φ-detecting competence is picking up on whether o is Φ. But, because of the failure of some extra condition on aboutness—some condition above and beyond justificatory convergence—S’s belief is not about o.

2 In the scenario just described, S’s circumstance is either rationally relevant to her formation of the belief, or it is rationally irrelevant.

But the elements already in place generate an argument for 3:

3 The circumstance is not rationally relevant to S’s formation of the belief.

Suppose that 3 is false—the circumstance is rationally relevant. 1 specifies that o is the object upon which S’s justification for the belief converges.
So the left-to-right direction of the biconditional, just established, entails


that if S’s beliefs are about anything, they are about o. 1 also specifies that S’s beliefs are not about o. They are, therefore, about nothing, in which case they are not true. In addition, the definition of ‘elimination’ entails that a subject’s justification for a belief never eliminates the actual circumstance—the circumstance in which the belief is formed. So if we suppose that the actual circumstance is rationally relevant, we are supposing that S has rationality-securing justification for the belief that <α is Φ> which leaves uneliminated a rationally relevant circumstance in which the belief is not true. But TRUTH AND JUSTIFICATION says that rationality-securing justification for a belief eliminates every rationally relevant circumstance where the belief is not true. Contradiction.¹¹

And the elements already in place also generate an argument for 4:

4 The circumstance is not rationally irrelevant to S’s formation of the belief.

To see the argument for 4, note first that the circumstance is not rationally irrelevant to S’s formation of the corresponding belief that <Something is Φ>. For in the circumstance as described, there is nothing devious interfering with S’s detection of Φ-instantiation: in forming a belief on the basis of the means of Φ-detection that underpins proprietary justification for her belief, S would be manifesting true-belief-forming competence, and a circumstance in which formation of a belief by rationality-securing means manifests true-belief-forming competence just is a circumstance rationally relevant to the belief’s formation. Given that the circumstance is rationally relevant to formation of the belief that <Something is Φ>, to deny 4 is to endorse the possibility of the following combination: A circumstance rationally irrelevant to formation of the belief that <α is Φ> may be rationally relevant to formation of the belief that <Something is Φ>.

¹¹ There is in fact a loophole in this argument. The envisaged incoherent case is a case where there is a unique object upon which justification converges, and yet aboutness fails. So the argument is silent about cases where justification converges on more than one object. I close this loophole at Dickie 2015: 52–3 (down and dirty version) and 65–72 (full version, including connection to Strawson’s (1959) puzzle about ‘massive reduplication’).


And to endorse this possibility is to suppose that the conditions for the rationality of a <Something is Φ> belief might be more demanding than those for the rationality of the corresponding <α is Φ> belief. For example, it is to suppose that it might be rational to believe <That is square> by uptake from a perceptual link, but irrational to believe <Something is square> on the same justification (because the rationality of the <Something is square> belief requires the elimination of extra ‘nothing square there’ circumstances—circumstances that must be guarded against if it is to be rational to move to <Something is square> on the basis of perception, but may be ignored in moving to <That is square>). And this just gets things the wrong way around. Across the target range of cases—cases like Cases 4 and 5 from the start of this section—a subject rationally entitled to believe <α is Φ> is automatically rationally entitled to believe <Something is Φ> too. (There are cases where some philosophers would deny the parallel claim. For example, some people deny that beliefs ‘about’ fictional characters are existentially committing, maintaining that <α is Φ> does not entail <Something is Φ>, and that a subject might be justified in believing the first but not the second. But these and other instances where the validity of the inference from <α is Φ> to <Something is Φ> is up for negotiation lie outside the target range.)

Having established 3 and 4, we have eliminated both disjuncts of 2. But the choice at 2 is generated by a situation whose coherence is entailed by 1, so 1 must be rejected, giving us 5:

5 If the proprietary means of justification for S’s beliefs converges on o, these beliefs are about o.

With ABOUTNESS AND JUSTIFICATION in place, we have a blueprint for answering the ‘What makes it the case?’ questions about Cases 4 and 5—the questions of what makes it the case that your beliefs are about the grapefruit and the politician respectively.
In each case, the account of how aboutness-fixing works will be an account of how the resulting beliefs are justified, combined with an account of the conditions under which this means of justification—the means of justification proprietary to the body of beliefs—converges on a particular thing.¹²
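
The left-to-right direction of the argument given earlier in this section can be set out schematically. What follows is only a sketch. The notation is introduced here for illustration and is not the notation of the text: b is S’s belief, j is S’s rationality-securing justification for b, Rel(b) is the set of circumstances rationally relevant to the formation of b, and Elim(j, c) says that j eliminates circumstance c. Reading the aboutness and truth principle as holding circumstance by circumstance is likewise an assumption of the sketch. (The display assumes the amsmath package.)

```latex
\begin{align*}
&\text{(1)}\quad \mathrm{About}(b, o) && \text{supposition}\\
&\text{(2)}\quad \mathrm{About}(b, o) \rightarrow \forall c\,\bigl(\mathrm{True}_c(b) \leftrightarrow \Phi_c(o)\bigr) && \text{aboutness and truth}\\
&\text{(3)}\quad \forall c\,\bigl(\mathrm{True}_c(b) \leftrightarrow \Phi_c(o)\bigr) && \text{from (1), (2)}\\
&\text{(4)}\quad \forall c \in \mathrm{Rel}(b)\,\bigl(\neg\mathrm{True}_c(b) \rightarrow \mathrm{Elim}(j, c)\bigr) && \text{truth and justification}\\
&\text{(5)}\quad \forall c \in \mathrm{Rel}(b)\,\bigl(\neg\Phi_c(o) \rightarrow \mathrm{Elim}(j, c)\bigr) && \text{from (3), (4)}
\end{align*}
```

Discharging the supposition at (1) gives the conditional at step 6 of the argument: if the belief is about o, its rationality-securing justification eliminates every rationally relevant circumstance where o is not Φ.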

¹² I develop the blueprint for the cases of perceptual demonstrative and proper-name-based thought in Dickie 2015, chapters 4 and 5 respectively.


There are many questions of detail about exactly how this blueprint is to be filled in. And a raft of further questions concern how the resulting accounts of aboutness-fixing for our perceptual demonstrative and proper-name-based thoughts will dovetail with accounts of linguistic competence to deliver accounts of reference-fixing for demonstratives and proper names. But rather than pursue these questions here, I want now to turn to the main topic of this paper—descriptive names like ‘Tremulous Hand’, and the thoughts we use them to express.

1.2 Descriptive names in the aboutness and justification framework

The previous section used the cases of perceptual demonstrative and proper-name-based thought to motivate a framework for accounts of aboutness-fixing for our thoughts about ordinary things. This section extends the discussion to cases involving descriptive names. The first steps towards this extension can be read off the structural parallels between the perceptual demonstrative and proper-name-based cases, illustrated by Cases 4 and 5, and cases like Case 1 ‘Tremulous Hand’. Like those in Cases 4 and 5, subjects in Case 1 seem to be in the business of using a proprietary means of justification to develop bodies of belief that they treat as about a single particular thing. The proprietary means of justification in this case involves deployment of the description associated with the name. The core group¹³ of speakers use this description to harvest information from the vandalized manuscripts, looking for evidence for beliefs, and gathering the resulting < . . . is Φ> claims into bodies of beliefs which they would affirm, if asked, to be ‘about’ Tremulous Hand. Given these structural parallels, we can see how the ABOUTNESS AND JUSTIFICATION framework developed in the previous section would apply to cases like Case 1. The suggestion would be that grasp of a description makes available a means of justification for a body of beliefs: use the description to harvest information which you then bundle together as about a particular thing. The resulting body of beliefs—standardly

¹³ Obviously there might also be ‘deferential’ users, who are ignorant of the association between the name and the description.


expressed using a descriptive name—is about object o iff o is the object upon which this means of justification converges: the object whose properties the subject will be unlucky to get wrong and not merely lucky to get right in forming a body of beliefs justified in this way.

But why think that aboutness-fixing for the beliefs we express using descriptive names in fact does work in something like this way? One reason is that the argument for ABOUTNESS AND JUSTIFICATION as applicable in the perceptual demonstrative and proper-name-based cases applies, with a few wrinkles,¹⁴ to the case of descriptive names too. Another is that the resulting view generates improvements on both extant discussions of whether there can be, as I shall say, ‘descriptively mediated singular thoughts’, and accounts of how ⌜Let α refer to the Ψ⌝ stipulations work in conversational contexts. (I develop these points in §3 and §4 respectively.) A third reason is that the ABOUTNESS AND JUSTIFICATION-based account explains the intuitive verdicts surrounding the problem cases from the start of the paper—cases which seem to show that a descriptive name may refer to an object that does not satisfy the associated description, and fail to refer even though the associated description is satisfied. This is the line of thought I shall develop in this section. (I should stress that it is only in combination with the other two reasons that I think the story I am about to tell counts as the best explanation for the phenomena.)

Recall Cases 2 and 3 from the start of the paper.

Case 2: ‘Geraint the Blue Bard’ ‘Geraint the Blue Bard’ was used for over a hundred years as a name for the otherwise unidentified author of a series of songs in medieval Welsh, dealing with medieval themes, and employing medieval metres . . .
Case 3: ‘Gizmo’ X, the now aged head of a manufacturing company, likes to boast to his underlings about ‘the gizmo that started it all’, with strong suggestions that he was himself this thing’s inventor. The underlings introduce ‘Gizmo’ with the stipulation ‘Let “Gizmo” refer to X’s most remunerative early invention’, and set about trying to find out which thing it was . . .

The intuitive verdict in Case 2 was that the scholars’ beliefs were about nobody, even though the description associated with ‘Geraint’ is satisfied—in particular, the beliefs were not about Edward Williams,

¹⁴ The wrinkles concern the uniqueness claim discussed in note 11.


even though he was the description’s satisfier. The verdict in Case 3 was that Gizmo—the thing the underlings are trying to find out about—does not, after all, satisfy the ‘Gizmo’ description. I shall consider how the ABOUTNESS AND JUSTIFICATION framework predicts each of these results in turn.

Consider Case 2 ‘Geraint’, and consider how scholars working before the discovery of the forgery justify their beliefs. We can imagine Scholar A arguing as follows for the claim that Geraint had seen a manuscript of the Life of St Cuthbert: ‘There is strong evidence in the songs that Geraint has read the Life of St Cuthbert. In the ninth century, the only copies of the Life of St Cuthbert in existence were manuscript copies. So Geraint had seen a manuscript copy.’ Now, by the nineteenth century, there were many, many more print copies of the Life of St Cuthbert than manuscript copies. But suppose that Edward Williams, the satisfier of the ‘Geraint’ description, in fact did see one of the rare manuscript copies. Does anything in Scholar A’s path to his belief tend to rule out situations in which Edward Williams did not see a manuscript copy (making the match between Scholar A’s belief and a property had by Edward Williams more than just a matter of luck)? The answer to this question is ‘No’: Scholar A’s justification for this belief secures the belief’s rationality, but leaves it a matter of luck whether Edward Williams was Φ. And, given the associated proprietary means of justification, this conclusion applies to the scholars’ beliefs in general: it will be a matter of spectacular chance if a body of beliefs justified by the method the scholars are using matches what Edward Williams was like. So, given ABOUTNESS AND JUSTIFICATION, the suggestion that the scholars’ beliefs are about Edward Williams is wrong.

Now consider Case 3 ‘Gizmo’. In the situation as envisaged, the story develops something like this.
The name is introduced using the stipulation ‘Let “Gizmo” refer to X’s most remunerative early invention’. The underlings set about their investigation, combing the financial records from the firm’s early days; studying X’s old sketchbooks in the attempt to date various inventions; and so on. As the investigation unfolds, financial-record-combing proves a much more fruitful line of inquiry than X’s-sketchbook-trawling, so that the sketchbook-trawling is left behind as a way of arriving at beliefs. In this way, the underlings end up with bodies of belief whose means of justification converges


on an object—the rat trap upon which their investigations are homing in—which does not satisfy the initial aboutness-mediating description.

So the ABOUTNESS AND JUSTIFICATION framework explains the Case 2 and Case 3 intuitions, and does so in terms of a principle for which there is an independent, from-first-principles argument. But there is an obvious objection to moving from here to the conclusion that the mechanism for aboutness-fixing that underpins our uses of descriptive names is not satisfactional. The objector maintains that the reference-fixing mechanism at work in the cases I have considered is satisfactional—it is just that the respective ⌜Let α refer to the Ψ⌝ stipulations do not capture the ‘real’ aboutness-fixing descriptions. For example, the suggestion might be that in the ‘Geraint’ case the ‘real’ aboutness-fixing description is ‘the ninth-century author of these ballads’ (a description that Edward Williams does not satisfy), and that in the ‘Gizmo’ case the ‘real’ description is one that the rat trap does satisfy: ‘the firm’s most remunerative early patent’. I shall give the reply to this objection which I take to be most helpful from the point of view of adding detail to the alternative, non-satisfactional, view of descriptively mediated aboutness-fixing that I want to propose. Consider the following case:

Case 6: What will save the queen? (from a Hans Christian Andersen story) The queen, beloved of her people, is sick and in danger of death. A sage advises that the queen will be saved if she is shown the loveliest rose in the world. The people embark on a collective search. At first they are looking for the rose bloom that is the most aesthetically pleasing. However, the results of the search for such a bloom lead them to realize that they need not the rose ‘loveliest’ in the narrow aesthetic sense, but the rose that shows forth the most love.
So they consider roses that (in the world of the story) have grown spontaneously from the graves of lovers or soldiers who have given their lives for their countries. What they uncover in this phase of the search leads them to decide that what they are looking for is not a literal rose. At first they think it is a ‘flowering’ of human creativity, and look for the human creation which shows forth the most love on the part of its creator. But the search in that direction leads them back to more everyday possibilities: the rose ‘seen on the blooming cheeks’ of a young child, or the ‘white rose of grief ’ in the face of somebody worried about somebody beloved. Finally their search leads them to what they have been looking for all along: Christ (in the world of the story, visible to the faithful, when in a suitable state of enlightenment, as an apparition springing rose-like from the pages of the Bible).

Case 6 illustrates a feature of our operations with descriptive names that is also present, in less extreme form, in Case 3 ‘Gizmo’: the description


around which the proprietary means of justification for a descriptive name is built is not a static parameter which must stay fixed throughout the course of development of an associated body of beliefs. Rather, it is what I shall call an ‘outcome sensitive’ parameter. The proprietary means of justification associated with the body of beliefs standardly expressible using a descriptive name is to use a description to harvest information, looking for evidence for precursor beliefs, and bundling the resulting < . . . is Φ> information into a body of beliefs you treat as about a single thing. In structurally simple cases like Case 1 ‘Tremulous Hand’ and Case 2 ‘Geraint’, the description playing the information-harvesting role remains stable through the period of the use of the name. But in more complex cases like Case 3 ‘Gizmo’ and Case 6 ‘ . . . the queen’, the descriptive condition used to harvest information shifts as the activity of maintaining the body of beliefs unfolds. An element of the descriptive condition that is front and centre at the beginning of the investigation fails to bear fruits in the form of resulting < . . . is Φ> beliefs, and is left behind: this is what happens to the < . . . was invented by X> element of the initial descriptive condition in Case 3. Subjects’ understanding of key elements of the ⌜the Ψ⌝ description shifts so that, though there is continuity in their unfolding investigation, each stage making sense in the light of what has been uncovered at earlier ones, there is no single descriptive condition which can really be said to underpin the whole course of the investigation. This is what happens in Case 6. And it is easy to imagine further dimensions of fluidity as subjects adjust their investigative tactics to maintain the productivity of the investigation and the coherence of the body of beliefs it generates. 
(For example, it might be that the ‘Tremulous Hand’ investigation ends up discarding some subset of the initial set of glosses as apocryphal; or that the investigation comes to take for granted the claim that Tremulous Hand was also the author of one of the major texts in which the marginalia appear; or . . . ) One option that might suggest itself to someone attracted by the ‘find the real aboutness-fixing description’ strategy is to claim that the ‘real’ description whose satisfaction by an object fixes the aboutness of the body of beliefs expressed using a descriptive name can change over time. But this will entail that many cases that we want to say involve thinking about the same thing all along in fact involve flipping between aboutness and aboutness failure, and from thought about o to thought about o*, as the ‘real’ aboutness-fixing description changes.


Another option that might suggest itself is to raise the level of cognitive sophistication of the supposed aboutness-fixing descriptive condition. For example, the suggestion might be that the aboutness-fixing descriptive condition in any given case is something like . Given the proposal of the last two sections, the object the beliefs are about will be the satisfier of this description. But it is a familiar observation that to formulate a description that is satisfied in a case of aboutness is one thing, and to show that the description plays an aboutness-fixing role quite another.¹⁵ And in this case the suggestion that the proposed description is playing an aboutness-fixing role is open to an obvious response from redundancy. The suggestion that this description is playing an aboutness-fixing role owes whatever plausibility it has to the argument of §1. But given this argument, we already have an account of what makes an object the object the body of beliefs expressed using a descriptive name is about: it is about the object on which the associated means of justification converges. There is simply no aboutness-fixing work left for the meta-level description to do. So I suggest that there is a good case for the conclusion that the mechanism of aboutness-fixing for the thoughts we standardly express using descriptive names is, though descriptively mediated, not satisfactional. This proposal can be put as a distinction between truth-conditions for what I shall call ‘description based’ thoughts on the one hand, and ‘mere descriptive’ thoughts on the other: Mere descriptive thought—A mere descriptive thought that is true iff whatever satisfies is Φ. Description-based thought—A description-based thought that , with aboutness fixing description , is true iff (i) there is some o upon which the associated description-centred route to justification converges, and (ii) this o is Φ. 
(I shall return to the claim that the thoughts we ‘standardly’ express using descriptive names are description-based thoughts in §4.)

¹⁵ Compare Kripke 1980: 88.


1.3 Descriptively mediated singular thoughts?

The previous two sections argued for a framework for accounts of the aboutness of our ordinary thoughts, and used this framework to overturn the standard view of aboutness-fixing for the thoughts we ordinarily express using descriptive names. This section considers the implications of this proposal for the controversy over whether the thoughts standardly expressed using these expressions are genuinely singular.

The contemporary discussion of singular thought traces its modern history to Russell’s distinction¹⁶ between thoughts which characterize particular things in the world (for example, the thought I express when I look at my dog and say ‘He is dirty’) and thoughts which characterize the world’s pattern of property instantiation (for example, the thought that there are dirty dogs, which is, in effect, the thought that the property of being a dog is co-instantiated with that of being dirty). Given this distinction, we can recognize two kinds of case where a thought’s truth depends on what some particular object is like. On the one hand, there are cases of ‘singular’ thought about the object—cases where the thought is true iff o is Φ, and the reason o has this special status is that the thought characterizes the object. On the other, there are ‘general’ cases, where the condition for the truth of the thought is really a condition on the pattern of property instantiation, but if we hold steady o’s place relative to this pattern, the condition can be restated as a condition on o. For example, if we accept the Theory of Descriptions, and if ⌜the Ψ⌝ has a satisfier, the thought expressed by an utterance of form ⌜The Ψ is Φ⌝ is a general thought whose truth or falsity depends on what a particular object is like: it is the thought that the property of being Ψ is both uniquely instantiated and co-instantiated with the property of being Φ; if o is the unique Ψ thing, the thought expressed by the utterance is true iff o is Φ.
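
The general truth-condition just described can be written out in first-order notation. This is a minimal sketch in the standard Russellian form; the predicate letters follow the Ψ and Φ of the text, but the symbolism itself is supplied here for illustration and is not the author’s own. (The display assumes the amsmath package.)

```latex
\begin{align*}
&\text{General truth-condition:} && \exists x\,\bigl(\Psi x \land \forall y\,(\Psi y \rightarrow y = x) \land \Phi x\bigr)\\
&\text{Given that } o \text{ is the unique } \Psi\text{:} && \Psi o \land \forall y\,(\Psi y \rightarrow y = o)\\
&\text{Restated as a condition on } o\text{:} && \Phi o
\end{align*}
```

The first line is a condition on the pattern of property instantiation and mentions no particular object. Only when the second line is held fixed does it become equivalent to the third.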
Though it is easy enough to state this initial distinction, it has proven much harder to say exactly what having a thought which ‘characterizes an object’ requires and, therefore, to say anything definitive about which thoughts are singular. My own view is that the traditional notion of singular thought runs together what are in fact distinct criteria; that a thought which counts as ‘singular’ relative to one or more of these criteria may fail to do so relative to others; and that rather than

¹⁶ Compare Russell 1956a: 234 and 247–8; 1956b: 51.


squabbling about whether thoughts that meet some criteria but not others get to count as genuinely singular, we should be exploring the roles that thoughts meeting the various criteria play in our overall cognitive economies.

In what follows I shall make two moves towards delivery upon this wider agenda. First, I shall show how the proposal of the previous two sections entails that description-based thoughts about objects meet one of the central traditional criteria for singularity: description-based thoughts are object-dependent; having such a thought involves standing in a relation to an object, so that if there is no object your thought is about, there is a sense in which there is no thought to be had. Secondly, I shall introduce a distinction between kinds of cognitive focus which I shall suggest should serve as the starting point for an exploration into the roles played by different classes of object-dependent thoughts in our cognitive economies.

The easiest way to bring out how the proposal of §§1–2 entails that description-based thoughts are object-dependent is to see how it overturns what I take to be the standard argument for the claim that object-dependence and descriptively mediated aboutness-fixing are incompatible:

1 The mechanism of descriptively mediated aboutness-fixing is satisfactional. (That is, if you count as thinking about an object in virtue of (a) your relation to a descriptive condition, and (b) a relation between this descriptive condition and an object, the relation at (b) is satisfaction: you count as thinking about the object in virtue of the fact that it is the description’s satisfier.)
2 No thought whose aboutness is fixed satisfactionally is object-dependent.
3 There are no object-dependent thoughts whose aboutness is descriptively mediated.

The argument is valid. But the proposal of §§1–2 entails that 1 is false.
According to this proposal, an <α is Φ> thought is about o iff o is the object upon which the associated proprietary means of justification converges. In cases of description-based thought, the proprietary means of justification is to use the description to harvest information and combine the resulting < . . . is Φ> beliefs into a body treated as about a single thing. In some cases, this means of justification converges on the


satisfier of the description. In others, it does not—the body of beliefs might fail to be about the description’s satisfier, as in Case 2 ‘Geraint’, or be about an object that does not satisfy the description, as in Case 3 ‘Gizmo’. But even where the body of beliefs is about the object that satisfies the description, the mechanism of aboutness-fixing, though descriptively mediated, is not satisfactional. In such a case, the descriptively mediated means of justification converges on the description’s satisfier. And it is justificatory convergence, not satisfaction, that is doing the aboutness-fixing work. Now, to overturn the 1–3 argument is not yet to establish that description-based thoughts are object dependent. But the reader can perhaps already see how the case for this conclusion will go. The proposal I have made treats having a descriptively mediated thought about a thing as standing in a relation to an object—a relation the same in kind as the relation to an object in which you stand if you have a perceptual demonstrative or proper-name-based thought about it; a relation of cognitive focus. In this sense, all three kinds of thought involve having an object before the mind—focusing on an object, and thinking, with respect to the object you are focused on, that it is Φ. And, according to the proposal I have made, cases of descriptively mediated aboutness failure (illustrated in this paper by Case 2 ‘Geraint’) emerge as involving the same kind of dysfunction found in empty cases of perceptual demonstrative and proper-name-based thought: in each kind of case, you are essaying a token of a type of thought which, if it characterizes the world, does so in virtue of characterizing a particular object in the world—the object upon which you have cognitive focus. 
So in each kind of case, if there is no such object you do not succeed in characterizing the world at all.¹⁷ I shall take this ‘object dependence’ result to be enough to warrant bestowing the traditional honorific ‘singular’ upon description-based thoughts. So I am suggesting that description-based thoughts are singular because having such a thought, unlike having a general thought, involves standing in a relation to the object the thought is about: no object; no relation; no thought there to be had.

¹⁷ I develop this point in more detail at Dickie 2015: 254–63.


To fill out this proposal a little, let us put it to work in solving a widely discussed puzzle. Consider the following descriptions:

(i) the longest-lived survivor of the Battle of Kadesh (fought in 1274 BCE)
(ii) the heaviest sea turtle currently alive in the wild
(iii) the first person to be born in the twenty-second century
(iv) the nearest pebble to my left big toe; the next nearest; the next nearest (repeat a thousand times)

Most philosophers defending the possibility of descriptively mediated singular thought have denied that it can be mediated by descriptions like these. The consensus has been that to allow that these descriptions can mediate singular thought is to treat descriptively mediated singular thought as too easily attained: easier than it in fact is (the suggestion is that there are intuitive barriers to introduction of descriptive names on the basis of (i)–(iv) type descriptions); and easier than we should expect it to be if there really is a difference in kind between a descriptively mediated thought and its precursor. But if there are some cases where grasp of a description does enable singular thoughts about an appropriately related object and others where it does not, there should be an explanation of why the boundary lies where it does.¹⁸

The proposal of this paper generates a new such explanation. I have suggested that subjects are in the business of thinking description-based thoughts about o—descriptively mediated singular thoughts; the thoughts standardly expressed using descriptive names—when they are in the business of deploying a description to harvest information into a body of beliefs whose proprietary means of justification converges on o. Against this background, taking the step from grasp of a precursor description to description-based singular thought requires intending to engage in this kind of information-harvesting activity.
But it is a standard element of philosophical accounts of intention that we cannot intend actions we do not believe that we will perform.¹⁹ So we cannot intend actions we believe to be impossible. Nor can we intend actions which we believe to be beyond the range of things we are going to get around to attempting. ¹⁸ For recent discussions see Jeshion 2010; Recanati 2012. ¹⁹ For philosophers of action making this general point see, for example, Velleman 2000: 202–4; Searle 1983: 408–9; Bratman 1987: 4, 15–18.


This explains why I cannot just move to description-based singular thoughts given my grasp of the descriptions at (iv). Even as I contemplate these descriptions, I know that I am not going to bother deploying them in singular-thought-sustaining information-harvesting activity.

A nearby line of thought explains why we cannot think description-based thoughts about objects on the basis of descriptions like (i)–(iii). Consider description (i), ‘the longest-lived survivor of the Battle of Kadesh’. From our current perspective, the informational environment is radically impoverished with respect to anything that might count as evidence as to the properties of individual rank-and-file participants in events this historically remote. So any attempt to justify beliefs will be thrown back very quickly on beliefs in the two classes represented by the columns in the following table:

Left-hand column:
The longest-lived survivor of the Battle of Kadesh survived the Battle of Kadesh.
The longest-lived survivor of the Battle of Kadesh lived longer than any other survivor of the Battle of Kadesh.

Right-hand column:
The longest-lived survivor of the Battle of Kadesh was either an Egyptian or a Hittite.
The longest-lived survivor of the Battle of Kadesh was born (and died) before 1100 BC.

Beliefs in the left-hand column are arrived at by unpacking the content of the description, rather than by using the description to harvest information to build up a body of beliefs treated as about a particular thing. And beliefs in the right-hand column, though arrived at by using the description to harvest information, are beliefs ascribing properties which will, unless some rationally irrelevant factor intervenes, be possessed by every survivor of the battle (in intuitive terms, the right-hand results are arrived at by using the informational environment to draw conclusions about survivors of the battle in general, not one survivor in particular). Given our current informational environment, using the description to harvest information is not a means of justification which converges on some particular thing. So, again, a subject who grasps the description and knows which kinds of things we can find out about the ancient past cannot form the intention to use it to move to descriptively mediated singular thought.




So far in this section I have argued that description-based thoughts (as opposed to merely descriptive thoughts) fill one of the criteria associated with the traditional notion of singular thought: description-based thoughts are relational and, therefore, object-dependent. In this respect, the proposal of this paper treats description-based thought as much closer to the perceptual demonstrative case than is generally supposed. But I now want to turn to a respect in which description-based thoughts are unlike perceptual demonstrative thoughts: the two kinds of thought involve different kinds of cognitive focus. To see the difference between kinds of cognitive focus in intuitive terms, consider the contrast between an optical telescope and a radio telescope. In the optical case, focus is a relation to an object which contributes to shaping the information signal the telescope delivers: it is a relation that secures the result that this signal will match the object unless some unlucky spoiler intervenes. For a radio telescope, in contrast, focus is a relation to an object that is generated by post-signal processing. The radio telescope is angled to collect all the information coming from some portion of its potential receptive field. The resulting signal then serves as input to further processing which, if the astronomers do their job properly, and if no unlucky spoiler intervenes, matches what some object in the receptive field is like. So if the astronomers do their job properly, there is a relation of focus between the report and some object in the telescope’s receptive field. But the signal is not itself shaped by an underlying focus relation: the focus relation is generated by post-signal processing. 
I shall give the two kinds of cognitive focus the working appellations ‘A-focus’ and ‘B-focus’:

A-focus—A body of beliefs is A-focused on object o iff
(a) beliefs formed by the means of justification proprietary to the body of beliefs will match o unless some rationally irrelevant factor intervenes, and
(b) the condition at (a) holds in virtue of an underlying relation between the body of beliefs and o which shapes the information-processing that generates the proprietary path to belief formation.

B-focus—A body of beliefs is B-focused on object o iff
(a) justification of beliefs by the proprietary means generates beliefs that match what the object is like unless some rationally irrelevant factor intervenes, where
(b) the condition at (a) does not hold in virtue of any underlying relation which shapes the information processing that generates the proprietary path to belief formation.

I suggest that the cognitive focus involved in perceptual demonstrative thought is A-focus and that involved in descriptively mediated thought is B-focus. In a case of perceptual demonstrative thought, the subject stands in a relation to an object—the object at the end of the attentional perceptual channel—which shapes the subsequent information-processing, yielding a specific version of the A-focus template: if the information processing routine shaped by the attentional link runs its course, the resulting beliefs will match what the attended object is like unless some rationally irrelevant factor intervenes. In contrast, in a case of descriptively mediated thought, the subject is dealing with a wealth of information (compare—the radio telescope picks up all the signal coming from some region of the night sky) and uses the description as the basis of post-signal processing to generate a body of beliefs. In cases of descriptively mediated aboutness, the subject’s post-signal processing generates a B-focus relation: it is a means of justification for beliefs such that the subject will be unlucky if the beliefs justified in this way do not match what the object is like, and not merely lucky if they do.

To consolidate both the contrast between A-focus and B-focus, and the suggestion that perceptual demonstrative thought involves the one and descriptively mediated thought the other, I shall pause to note a difference between A-focus-involving perceptual demonstrative cases and B-focus-involving description-based cases with respect to the relation between having cognitive focus on an object, and actually engaging in the business of forming beliefs. In A-focus cases, cognitive focus is causally upstream from belief formation: having the kind of cognitive lock on an object that enables beliefs about it does not require actually having any beliefs of the enabled kind.
This diagnosis accords with what I take to be the intuitive verdicts about cases like these:

Case 7 You are looking at an object visible to you only as a speck on the horizon. You are attending to the object. But, because your attentional perceptual link is not delivering any information of the kind that ordinarily generates beliefs, you have not formed any. you think.

Case 8 You are attending to an object plainly visible just in front of you, and receiving rich perceptual information of the kind that ordinarily generates our beliefs. But somebody whose word you have no reason to doubt has just told you (falsely, as it happens) that you had better be very careful, because a nefarious third party has given you a drug which will distort your perceptions of everything you encounter. , you think, your attention fixed on the object.

In each of these cases, it seems that you are thinking about the attended object, even though you are not forming beliefs—let alone justified beliefs—as to what it is like. And the observation that the kind of cognitive focus involved in perceptual demonstrative thought is A-focus lets us explain how this can be. In each case, though you are not engaged in the activity of maintaining a body of beliefs whose proprietary means of justification converges on an object, you stand to an object in a relation which enables this activity: a relation which is such that, if you were to start forming beliefs justified by the route it makes available—if the object in Case 7 moved closer so that your attentional link started delivering property information of the kind to which we respond by forming beliefs; if you were disabused of the misapprehension in Case 8—these beliefs would tend to match what the attended object is like. The opposite point about B-focus adds a refinement to the account of the boundaries of description-based aboutness-fixing from a few pages ago. It is not enough, to count as having descriptively mediated singular thoughts, just to intend to deploy a description to secure cognitive focus on an object. A subject might have this intention, but, through indolence or mischance, never get around to engaging in the intended activity. If just forming the intention were enough, we would still have a version of the objection that allowing descriptively mediated singular thought is treating singular thought as too easily and cheaply attained. But since the kind of cognitive focus involved in descriptively mediated singular thought is B-focus, this objection falls away. Since descriptively mediated singular thought is B-focus, there is singular thought only when the activity of using the description to harvest information is up and running. 
In cases where there is merely an as yet unfulfilled intention to engage in the activity, there is no activity, so no singular thought. (This point generates subtleties concerning ⌜Let α refer to the Ψ⌝ stipulations which I am about to consider.)

So I suggest that there are in fact two kinds of cognitive focus—A-focus, of which the perceptual demonstrative case is the paradigm; B-focus, which underpins our uses of descriptive names and which (though I have not shown this) is also involved in many cases of aboutness-fixing intermediate in sophistication between these two extremes. From a distance, the difference between kinds of cognitive focus is a difference in structure between aboutness-fixing relations. It would be surprising if this kind of difference did not generate quite far-reaching consequences concerning the division of labour in the ‘thought about material objects’ part of our conceptual scheme. And in fact I think there are important consequences of the A-focus/B-focus distinction: consequences which must be explored to bring out what remains of the case against descriptively mediated singular thought once the old assumption that the mechanism of descriptively mediated aboutness-fixing is satisfactional has been left behind. But I shall not try to explore these consequences here.

1.4 ⌜Let α refer to the Ψ⌝

I shall close by considering the implications of the view developed over the previous three sections for the question of the commitments that are generated when a speaker makes, and a hearer accepts, a ⌜Let α refer to the Ψ⌝ stipulation. I shall first sketch what I take to be the standard account of this matter, then explain the very different picture the proposal of this paper entails.

The central claim of the standard account is generated by the standard claim²⁰ about reference-fixing for descriptive names. According to the standard claim, expression α introduced into use by a ⌜Let α refer to the Ψ⌝ stipulation refers to the satisfier of ⌜the Ψ⌝. In a situation where such a stipulation has been made and accepted, participants have adopted the convention of using ⌜α is Φ⌝ to express thoughts which are true iff the satisfier of ⌜the Ψ⌝ is Φ. One standard add-on to this claim is that a descriptive-name-introducing stipulation is to be understood as forcing a wide-scope reading of the description with respect to modal operators, so that ⌜α might have been Φ⌝ is true iff there is a possible world in which the individual who satisfies ⌜the Ψ⌝ in the actual world is Φ. Philosophers who are in agreement on the standard claim, and on these initial steps concerning ⌜Let α refer to the Ψ⌝ stipulations, then disagree about what to say about cases where such a stipulation is made but the description has no satisfier, with the debate at this point interacting with the standard (standard-claim-assuming) dispute as to whether descriptively mediated thoughts are genuinely singular.

²⁰ See p. 1 above.

Here is a final case which will help motivate the details of the alternative treatment of ⌜Let α refer to the Ψ⌝ stipulations that I want to propose:

Case 9 ‘Tal’ X and Y are researching the Battle of Kadesh, and trying to build up a picture of how it impacted the lives of its rank-and-file survivors. In writing up their findings for a popular audience, they introduce the name ‘Tal’, and use it as follows: ‘Let’s pick one survivor of the battle—say, the longest-lived one. We don’t know his name, but we’ll call him “Tal”. Tal lived a long time ago, but he might have descendants who are alive today . . . ’

The proposal of this paper entails that the ⌜Tal was Φ⌝ utterances that occur in X and Y’s narrative do not express descriptively mediated singular thoughts: there is descriptively mediated singular thought where a description is deployed to achieve and maintain cognitive focus; the impoverishment of our informational environment with respect to participants in the Battle of Kadesh precludes this possibility. But it seems that X and Y are using ‘Tal’ to make claims whose truth or falsity depends on what some particular individual—the longest-lived survivor of the battle—is like. Whether Tal has descendants who are alive today depends on how things stand with respect to one particular individual. If X says ‘Tal was a Hittite’, when in fact—though this is undiscoverable from our current perspective—the longest-lived survivor was an Egyptian, X has said something false.

I suggest that this case points to a fact of the matter about our ⌜Let α refer to the Ψ⌝ stipulations. Let us say that an expression filling the subject place in an ⌜ . . . is Φ⌝ utterance ‘Frege-refers’ to object o iff, in the absence of special stage-setting, the utterance may be regarded as expressing a thought which is true iff o is Φ. Then, reaching back to the initial distinction between singular and general thought drawn in §3, there are two kinds of case of Frege-reference: singular cases, where the utterance expresses a singular thought about o; general cases, where the utterance expresses the thought that , and (as it happens) o is the Ψ. I suggest that it is a fact of the matter about our ⌜Let α refer to the Ψ⌝ stipulations that such a stipulation on its own commits a speaker who makes the utterance or a hearer who accepts it only to using the expression in a way that Frege-refers. The commitment may be fulfilled either by using the expression to express descriptively mediated singular thoughts, or by using it to abbreviate a (rigidified) description. In some cases, features of the context entail that the speaker is from the start committing herself to one disjunct rather than the other. In others, a commitment to one disjunct rather than the other crystallizes as the situation unfolds.

To develop this suggestion and bring out its plausibility, it will be useful to work within a slightly more formal framework for accounts of what conversational transactions involve. In what follows I shall suppose a framework in which a speaker making an assertion is proposing to restrict the range of possibilities which are counted as ‘live’ by participants in the conversation, and a hearer accepting an assertion is undertaking to make this restriction.²¹ (For example, when I say, in a situation where each of us knows that the other understands all the terms involved, ‘There is a direct train from London to Marseilles’, I am proposing that my hearer join me in ruling out all possibilities where there is no such direct train, committing us both to the claim that the actual world—whatever else may be true of it—is a world in which trains run London–Marseilles direct.) I shall also suppose that a stipulation as to what an expression is to stand for or how it is to be used may be treated, without too much distortion, as a case of assertion: a speaker making such a stipulation is, in effect, proposing to rule out all possibilities where the expression is used in ways that do not accord with it.

Now let us model the proposal I have sketched about ⌜Let α refer to the Ψ⌝ stipulations within this framework. Suppose that speaker S directs such a stipulation to hearer H.
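For readers who find a toy model helpful, the assertion framework just described can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions that are not in the text: a possible world is represented crudely as the set of atomic facts holding in it, and the names `Conversation`, `all_worlds`, `DIRECT`, and `ACCORDS` are hypothetical labels introduced for the example. Accepting an assertion, or a stipulation treated as an assertion, simply discards the live possibilities in which what was proposed fails.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Callable, FrozenSet

# Assumption: a possible world is modelled crudely as the set of atomic
# facts that hold in it. These labels are hypothetical, not the author's.
World = FrozenSet[str]
Proposition = Callable[[World], bool]

DIRECT = "there is a direct London-Marseilles train"
ACCORDS = "alpha is used in accordance with the stipulation"


def all_worlds(atoms):
    """Every combination of the atomic facts counts as one possible world."""
    return frozenset(
        frozenset(combo)
        for size in range(len(atoms) + 1)
        for combo in combinations(atoms, size)
    )


@dataclass(frozen=True)
class Conversation:
    live: FrozenSet[World]  # possibilities the participants still count as live

    def accept(self, proposal: Proposition) -> "Conversation":
        """The hearer accepts: drop every live world where the proposal fails."""
        return Conversation(frozenset(w for w in self.live if proposal(w)))


start = Conversation(all_worlds((DIRECT, ACCORDS)))

# An ordinary assertion rules out the worlds with no direct train.
after_assertion = start.accept(lambda w: DIRECT in w)

# A stipulation, treated as an assertion, rules out non-conforming uses.
after_stipulation = after_assertion.accept(lambda w: ACCORDS in w)

print(len(start.live), len(after_assertion.live), len(after_stipulation.live))
# prints: 4 2 1
```

The sketch is silent on what fixes the aboutness of the expression; it models only the narrowing of live possibilities that both an assertion and an accepted stipulation are supposed to bring about.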
²¹ This is the central claim of Stalnaker’s account of assertion in, for example, Stalnaker 1984. I am departing radically from Stalnaker’s view of what might count as an aboutness-fixing relation.

I have proposed that on its own S’s stipulation is a proposal to begin using α as a Frege-referring expression, where the mechanism for determining an object’s status as the object upon whose Φ-ness or not the truth of an ⌜α is Φ⌝ utterance depends is mediated by description ⌜the Ψ⌝. If H accepts this proposal, S and H both set aside, for the purposes of the future development of the conversation, uses of α which fall outside this characterization. Now suppose that H does accept the stipulation, and consider a use of α occurring within the S/H conversation, and within the scope of the stipulation (that is, an occurrence of α occurring at a time when S and H take the permissible uses of α to be governed by the stipulation). We can recognize three main cases (I suppress finer details in the interests of getting the basic proposal on the table):

(a) The singular thought case—By the time of the use of α we are considering, there is a practice of using either or a successor descriptive condition to harvest information into a body of beliefs treated as expressible using α.
(b) The ‘options still open’ case—By the time of the use of α that we are considering, there is not yet a practice of the kind required to underpin description-based singular thought. Nor are there features of either the context in which the stipulation was made, the context of the subsequent conversation, or moves made within the conversation which rule out (as not live in the conversation) the future development of such a practice.
(c) The general thought case—By the time of the use of α we are considering, either features of the context in which the stipulation was made, the context of the subsequent conversation, or moves made within the conversation have ruled out (as not live in the conversation) the future development of a practice associated with α of the kind required to underpin description-based singular thought. In this case α is being used as what I shall call an ‘abbreviated rigidified description’.

For example, consider Case 9 ‘Tal’. Features of the context in which the ‘Let’s call the longest-lived survivor “Tal” ’ stipulation is made rule out—and are known by the parties to the stipulation to rule out—the possibility of using the descriptive condition to achieve cognitive focus on a single individual.
So the context in which the stipulation is made closes off the possibility of using ‘Tal’ as a genuine descriptive name, forcing an ‘abbreviated rigidified description’ account of what speaker and hearer are committing themselves to when they commit to using the name in a way governed by the stipulation. This same effect—forcing the ‘rigidified description’ account of how to conform to the stipulation—might be generated in other ways. For example, it might be that, in the context in which the stipulation is made, it is clear to both speaker and hearer that nobody is going to bother engaging in cognitive-focus-generating information-harvesting activity, even though this kind of activity in fact is possible in the informational environment. Or it might be that someone closes off the singular thought possibility explicitly (‘Let’s treat α as a representative of all objects in this class; we won’t bother finding out what α was like’).

However, though it would be an exaggeration to say that situations where participants in a conversation conform to a ⌜Let α refer to the Ψ⌝ stipulation by using α as an abbreviated rigidified description are rarities, it is a matter of fact that by far the more usual cases are those I have spent most of this paper discussing: α is annexed to a stream of singular-thought-sustaining activity. In this kind of case, the participants in the conversation conform to the stipulation by using α to express beliefs generated by (or other attitudes associated with) this activity. And once α has been annexed to such an activity, it is the ‘abbreviated description’ way of conforming to the stipulation that falls away. ‘Well, here’s the author of the ballads, but there was no Geraint’, says scholar X to scholar Y, breaking the news of the discovery that Edward Williams wrote the ballads. ‘You realize that X didn’t invent Gizmo after all’, says one underling to another, as their investigation uncovers what really went on.

Let me close by stepping back from the array of cases introduced in this paper, and returning to Evans’s famous hypothetical stipulation ‘Let “Julius” refer to the inventor of the zip’.²² To keep the case simple, suppose that the stipulation is made, as it were, from cold: there is no established ‘What was the inventor of the zip like?’ investigation to which the stipulation is annexing the name.
Here are what I take to be two observations about this case:

(A) The answer to the question ‘Might Julius turn out not to have invented zips (but to have found them, occurring as a natural phenomenon)?’, asked immediately after the stipulation, is ‘No’.²³
(B) The answer to this question, asked once an investigation into what Julius is like is up and running, is ‘Yes’.

²² Evans 1982: 31. Evans’s formulation is actually ‘Let us call whoever invented the zip “Julius” ’.
²³ Note that the precise form of the sentence is important. Things will be different for ‘Might Julius not have invented the zip?’.




Evans and other proponents of the standard view of descriptive names have an explanation for (A) but not (B). According to the standard view, once the initial stipulation has been made and accepted in a conversation, the referent of ‘Julius’ within the conversation is the satisfier of the description: the only way for it to turn out that Julius did not invent zips is for it to turn out that there was no Julius. In contrast, the proposal I have made predicts the (A)/(B) pattern. Once the investigation is up and running within the conversational context, uses of ‘Julius’ are annexed to the singular-thought-sustaining information-harvesting activity. ‘Julius’ refers—if it refers—to the individual upon whom this means of justification converges, an individual who may or may not satisfy the description that is being used to do the information-harvesting work. But the perspective from which it makes sense to say that Julius might turn out to have found zips growing as naturally occurring crystalline entities, or to have been given them by a beneficent alien has to be earned: it is made available by engagement in (or awareness of) the information-harvesting activity required to use the description to achieve cognitive focus on an object of singular thought.²⁴

²⁴ This paper advances the discussion of the topic from Dickie 2015 in response to extremely helpful comments from Karen Lewis, Calvin Normore, and James Shaw. Thanks to them, to participants in the author meets critics session on the book at the American Philosophical Association Pacific Division Meetings in 2017, and to participants at the Meaning and Representation Workshop at the University of Turin a few months later. I apologize for the vulgar profusion of references to my own book, rendered unavoidable by the paper’s status as the next stage in the development of one of its central proposals.

References

Bratman, M. (1987). Intentions, Plans, and Practical Reason. Stanford, CA: CSLI.
Campbell, J. (1999). ‘Immunity to Error Through Misidentification and the Meaning of a Referring Term’, Philosophical Topics 26 (1&2).
Campbell, J. (2002). Reference and Consciousness. Oxford: Oxford University Press.
Conee, E. and R. Feldman (1998). ‘The Generality Problem for Reliabilism’, Philosophical Studies 89 (1).
Dickie, I. (2015). Fixing Reference. Oxford: Oxford University Press.
Dickie, I. (forthcoming). ‘Cognitive Focus’, in J. Genone, R. Goodman, and N. Kroll (eds.), Mental Files and Singular Thought. Oxford: Oxford University Press.
Evans, G. (1982). The Varieties of Reference, edited by John McDowell. Oxford: Oxford University Press.
Evans, G. (1985). ‘Understanding Demonstratives’, in Collected Papers. Oxford: Oxford University Press.
Goodman, R. (2016). ‘Cognitivism, Significance, and Singular Thought’, Philosophical Quarterly 66 (263).
Jeshion, R. (2004). ‘Descriptive Descriptive Names’, in M. Reimer and A. Bezuidenhout (eds.), Descriptions and Beyond. Oxford: Oxford University Press.
Jeshion, R. (2010). ‘Singular Thought: Acquaintance, Instrumentalism, and Cognitivism’, in R. Jeshion (ed.), New Essays on Singular Thought. Oxford: Oxford University Press.
Kripke, S. (1980). Naming and Necessity. Cambridge, MA: Harvard University Press.
Recanati, F. (2012). Mental Files. Oxford: Oxford University Press.
Reimer, M. (2004). ‘Descriptively Introduced Names’, in M. Reimer and A. Bezuidenhout (eds.), Descriptions and Beyond. Oxford: Oxford University Press.
Russell, B. (1956a). ‘The Philosophy of Logical Atomism’, in Logic and Knowledge: Essays 1901–1950, edited by R. Marsh. London: Allen and Unwin.
Russell, B. (1956b). ‘On Denoting’, in Logic and Knowledge: Essays 1901–1950, edited by R. Marsh. London: Allen and Unwin.
Searle, J. (1983). Intentionality. Cambridge: Cambridge University Press.
Sosa, E. (2015). Judgment and Agency. Oxford: Oxford University Press.
Stalnaker, R. (1984). Inquiry. Cambridge, MA: MIT Press.
Strawson, P. (1959). Individuals. London: Methuen.
Velleman, J. (2000). The Possibility of Practical Reason. Oxford: Oxford University Press.


2 Sources of Context-Dependence: The Case of Knowledge Ascriptions

Michael Glanzberg

This paper has two goals. The first is to defend a form of context-dependence for knowledge ascriptions. In particular, I shall develop and defend a version of question-sensitivity for knowledge, building on work of Schaffer, Szabó, and Knobe. I shall explore in depth some of the evidence in favor of question-sensitivity, and offer an account of the semantics and pragmatics of knowledge ascriptions that captures it. I shall thus, to an extent, defend the context-dependence of knowledge ascriptions.

The second goal of this paper is to explore the different sources of context-dependence that natural language provides, and the variety of forms of context-dependence that these sources produce. In particular, using knowledge ascriptions as an illustration, I shall argue that there are two very different sorts of sources of context-dependence in language. One is highly specific, typically lexical context-dependence. We discover that some specific word or specific class of words is context-dependent. The other is general. As I shall illustrate below, highly general features of extremely broad categories of expressions, and other general apparatus at work in language, can create context-dependence that is only minimally associated with any one expression or lexically determined class of expressions. The case of knowledge ascriptions provides an example of this kind of context-dependence. It shows how general features of the semantics of operators, and general features of how questions influence discourse, create context-dependence.

Both general and specific forms of context-dependence are well-documented. The case of knowledge ascriptions is useful for exploring them, as it highlights the fundamental difference between the two sorts. Both are of great linguistic interest. Like much we discover in the study of language, the difference between the two illustrates how lexical and other kinds of grammatical factors can divide linguistic labor. I claim that we see that as much with context-dependence as with syntax, argument structure, or any other linguistic phenomenon. When it comes to philosophical concerns about contextualism, the difference points to something not always fully noted. In many cases, as I shall discuss more below, specific instances of context-dependence can reveal something important about the specific concept a given word expresses. General context-dependence does this in at best highly limited ways.

My defense of the context-dependence of knowledge ascriptions will show it to exhibit only general context-dependence. Thus, it is a very limited defense of contextualism. I shall argue that the source of context-dependence in question-sensitivity is not the lexical meaning of know, but rather the general mechanisms related to the semantics and pragmatics of questions and focus. Thus, we have only general context-dependence. I shall speculate that in the end, this is too weak to support substantial epistemological conclusions.

The discussion of varieties of context-dependence will come at the end of this paper. First, we will explore the sources of context-dependence for question-sensitivity in detail. The structure of the paper is as follows. We will introduce question-sensitivity in section 2.1. We will need some substantial linguistic background, which will be provided in section 2.2. With that, we will examine question-sensitivity closely in section 2.3. We will build up a semantics and pragmatics for question-sensitivity in section 2.4. With all that in place, we will see how a form of context-dependence is present in knowledge ascriptions in section 2.5. But we will also see there that the kind of context-dependence we have uncovered is general, and not specific. We will discuss how different sources create these different kinds of context-dependence, and what the difference between the two kinds shows, in section 2.6.




2.1 The phenomenon of question-sensitivity

Before turning to semantics, I shall review the case for a special kind of context-dependence for knowledge, drawing on work of Schaffer (2004, 2005, 2007), Schaffer & Knobe (2012), and most importantly, Schaffer & Szabó (2014). This offers a distinctive kind of context-dependence for knowledge ascriptions, relating to questions and answers to them.

The traditional form of context-dependence for knowledge ascriptions is sometimes called stakes-sensitivity. Work of Cohen (1999) and DeRose (1992), and work of Lewis (1996), argued that we see context-dependence in knowledge ascriptions relating to how high the stakes for a knowledge claim are. So, in DeRose’s (1992) bank case, it is argued that we have:

(1) Low Stakes Context: We are driving home on Friday afternoon, and planning to stop at the bank to deposit a check. We pass the bank and see a long line. It is not especially important whether the check is deposited immediately. Our dialog goes:
a. Maybe the bank won’t be open tomorrow.
b. No, I know it will be open. I was just there two weeks ago on Saturday.

(2) High Stakes Context: Same as above, except, we have a large check outstanding. It will bounce if we do not make the deposit. Our dialog goes:
a. Do you know the bank will be open tomorrow? Banks can change their hours.
b. Well, no. I don’t know it will be open.

Many judge that both of the (b) answers sound true. As they differ in whether the speaker knows the bank will be open, this sort of example appears to support the context-dependence of knowledge ascriptions. It appears that it is the stakes of the context that is important to changing the truth value of a knowledge ascription. Despite its obvious appeal, stakes-sensitivity has been subject to extensive criticism. The intuitions surrounding it are quite delicate, and attempts to confirm them empirically have not been very successful. (See the extensive survey in Schaffer & Knobe 2012.) Furthermore, as a semantic proposal, it lacks any explanation of the mechanism of context-dependence (cf. Stanley 2005). Finally, as a substantial epistemological proposal (as opposed to a semantic one), stakes-sensitivity has met a number of objections (e.g. Hawthorne 2004; Reed 2010). For the current discussion, I shall simply follow these trends, and assume that there is no context-dependence of knowledge ascriptions from stakes-sensitivity. But I do think there is some context-dependence. The kind I shall focus on is question-sensitivity, following Schaffer & Szabó (2014), as well as Schaffer (2004, 2005, 2007). This work focuses on examples like the following:

(3) Context: Claire has stolen the diamonds. Ann and Ben are wondering who stole the diamonds, and Ann finds Claire’s fingerprints all over the safe. So Ann says to Ben:
    a. I know that Claire stole the diamonds.

(4) Context: Claire has stolen the diamonds. Ann and Ben are wondering what Claire stole, and Ann finds Claire’s fingerprints all over the safe. So Ann says to Ben:
    a. I know that Claire stole the diamonds.

Here we do get fairly firm judgments that (3a) is true, while (4a) is false. This is backed up by some experimental work (Schaffer & Knobe 2012). I think it is fair to assume there is a genuine phenomenon here. It does indicate context-dependence, as across two different contexts we get different truth values. But it also seems quite different from stakes-sensitivity. Call the phenomenon we see here question-sensitivity (following Schaffer and Szabó). I shall assume question-sensitivity is a fact. But that still leaves open many issues. We have yet to see for certain that question-sensitivity indicates a form of contextualism about knowledge ascriptions. If it does, how does know become context-dependent? I shall explore this by first examining the phenomenon in more depth, and then by pursuing a contextualist account of it. I shall then ask what kind of results this account gets, and if they are right. But before that, some background for our discussion is needed.

2.2 Some background

Our first task is to try to understand better what is at work in the question-sensitivity scenarios. But to do this, we will need some apparatus. The first piece of apparatus we will need is focus, and its role in regulating questions and answers. Questions and answers are sensitive to focus, which is usually marked by intonational prominence.¹ This is illustrated by:

(5) Does Max want coffee or tea?
    a. Max wants COFFEE.
    b. # MAX wants coffee.

The different intonational prominences marked by capital letters mark different foci.² Focus triggers a felicity condition called question-answer congruence, as we see with:

(6) Does Max want coffee or tea?
    a. Max wants COFFEE.
    b. # MAX wants coffee.

(7) Who wants coffee?
    a. # Max wants COFFEE.
    b. MAX wants coffee.

A sentence is only felicitous if the focus marks an appropriate answer to a question. The focused material itself is usually understood as providing the new information that makes the answer informative. The intonational prominence marking focus is sometimes called ‘stress’. As we will not be going into the phonology, that is fine, but I should pause to note that most phonologists do not identify it as stress, but rather a distinct kind of intonational contour. This is often labeled a ‘pitch accent’ (or just ‘accent’).³ One further fact about focus, that will become important as we go forward, is that different placements of pitch accent make different sentences. It is a standard view that the intonational prominence

¹ See the surveys of Beaver & Clark (2008), Büring (2016b), and Kadmon (2001), as well as seminal work of Jackendoff (1972) and Rooth (1985, 1992). ² The marking ‘#’ indicates judgments of infelicity, a form of degraded acceptability. ³ Technically, pitch accents are distinctive parts of an intonational phrase marking specific combinations of local maxima or minima in the pitch contour. There is some dispute about whether the pitch accents or larger intonational phrases make up the realization of focus. For some surveys of relevant aspects of phonology, see Büring (2016b), Kadmon (2001), Ladd (1996), and Pierrehumbert & Hirschberg (1990).


marking focus realizes a syntactic feature, which we can write as F. So a fuller representation of the examples above is:

(8) Does Max want coffee or tea?
    a. Max wants [COFFEE]F.
    b. # [MAX]F wants coffee.

(9) Who wants coffee?
    a. # Max wants [COFFEE]F.
    b. [MAX]F wants coffee.
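The congruence pattern in (8) and (9) can be illustrated with a deliberately crude sketch. It assumes, purely for illustration, that a sentence is a sequence of constituents each flagged for F-marking, and that a question simply names the slot it asks about. The names `focus_positions`, `congruent`, and the slot constants are hypothetical, and nothing here is meant as a theory of focus interpretation.

```python
# Toy model of question-answer congruence (an illustrative assumption,
# not an implementation of any particular theory of focus).
# A sentence is a tuple of (constituent, is_F_marked) pairs.

SUBJECT, VERB, OBJECT = 0, 1, 2  # hypothetical slot labels


def focus_positions(sentence):
    """Return the set of slot indices that carry the F feature."""
    return {i for i, (_, f_marked) in enumerate(sentence) if f_marked}


def congruent(sentence, questioned_slot):
    """Felicitous iff focus falls exactly on the slot the question asks about."""
    return focus_positions(sentence) == {questioned_slot}


# Max wants [COFFEE]F.
max_wants_COFFEE = (("Max", False), ("wants", False), ("coffee", True))
# [MAX]F wants coffee.
MAX_wants_coffee = (("Max", True), ("wants", False), ("coffee", False))

# "Does Max want coffee or tea?" asks about the object slot, as in (8).
print(congruent(max_wants_COFFEE, OBJECT), congruent(MAX_wants_coffee, OBJECT))
# "Who wants coffee?" asks about the subject slot, as in (9).
print(congruent(max_wants_COFFEE, SUBJECT), congruent(MAX_wants_coffee, SUBJECT))
```

The sketch treats the two focus placements as two distinct objects, which mirrors the point that different placements of F yield different sentences rather than different pronunciations of one sentence.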

The different focus placements thus make different sentences, not just different ways of pronouncing the same sentence. Let me mention a few of the many reasons this is the standard assumption. One is that there are clear relations between accent placement and syntax. As Selkirk (1995) observed, there is a preference for a phrase to be marked by an accent on its internal argument, and not its head.⁴ There is also a much-discussed phenomenon of ‘second occurrence focus’, where semantically a focus is present, but no pitch accent is recognized (Beaver & Clark 2008; Beaver et al. 2007; Partee 1991). Also, it is an old observation that focus affects grammaticality (Jackendoff 1972). More recently, important connections between focus and ellipsis have been explored (Merchant 2001; Rooth 1992). A number of authors have noted the role of focus in the syntax of copular clauses (e.g. Heycock & Kroch 2002). The persistent connections between syntax, focus, and accent placement make a general case that there is a syntactic feature realized by accent in focus. Finally, there are big-picture reasons to see accent as realizing a syntactic focus feature. Many models of how syntax relates to semantics and phonology hold that semantics and phonology cannot see each other, and so there must be features in the syntax before phonology and semantics split that can affect both. All together, these pieces of evidence, and others, have led to the standard assumption that focus is marked in syntax and realized in some languages by accent. The relation of focus to questions is made clearer if we think of each assertion as trying to answer a question, called the question under

⁴ This is the phenomenon usually called focus projection. Theories have changed since Selkirk’s seminal work, due to the influence of Schwarzschild (1999). For overviews, see Beaver & Clark (2008), Büring (2016b), or Kadmon (2001).


discussion (QUD) (Roberts 1996). The QUD is the immediate question we are trying to answer with an assertion. This is equivalent to a notion of discourse topic.⁵ A QUD can be overt, set by a question. Typically it is part of a larger structure of questions and answers describing an inquiry (Büring 2003; Roberts 1996). When not overt, it can be set implicitly by context. Either way, the QUD is part of context, set either by discourse or other features of the context. The examples above show that typically focus requires congruence with the immediate QUD (Roberts 1996). This is a felicity condition, relating a sentence to a particular feature of context. But, there is a complication we should mention, if just to set aside. Our main focus here will be on sentences embedded under know. With attitude reports in general, and with know in particular, the conditions under which a focus in an embedded sentence must match an immediate QUD get rather complicated. Attitudes come in flavors: ‘emotive’ (glad, etc.) and ‘cognitive’ (know, etc.). Many attitude verbs, including emotives, are highly restricted in question-answer congruence (Hooper 1975; Rooryck 2001a, b; Simons 2007). Know actually has a very delicate distribution:

(10) Who stole the diamonds?
    a. I think that [Claire]F stole the diamonds.
    b. ? I know that [Claire]F stole the diamonds.
    c. # I’m glad that [Claire]F stole the diamonds.

This shows that it is a delicate matter when an embedded clause can be taken as a felicitous answer to a QUD. As we will look at embedded clauses in knowledge ascriptions, a few observations are in order. Following Simons, we can speculate that attitudes that serve an evidential function allow an embedded clause to answer a QUD. In these cases, the embedded clause rather than the matrix attitude verb clause provides the ‘main point’ of the utterance, and the attitude signals evidential status for the embedded clause. Emotives do not do this, and hence their limited distribution. Know only appears to do this in fairly special circumstances,

⁵ For some overview of the many ways of thinking about discourse topics, see the surveys of Büring (2016a) and Roberts (2011), and the extended discussions in Büring (2016b) and Kadmon (2001).


when the strong epistemic commitment to knowledge indicated by know is non-redundant. Fortunately, it seems our above ‘theft’ contexts (3) and (4) build this condition in, and so allow the embedded clause to answer the immediate QUD. Thus, for our purposes here, we can safely assume that a focus needs to be congruent to some reasonably nearby question recoverable from the context, and not worry about whether it is the immediate QUD. Even if less than fully accurate, this assumption is safe enough.⁶ We will also need some background about questions, focus, and presupposition as we go forward. It is a vexed question whether questions carry existential presuppositions. We often see a strong intuition that they do:

(11) a. Who stole the diamonds?
     b. Presupposes that someone did?

But this can be made to disappear:

(12) a. Who stole the diamonds?
     b. No one did. They are on loan.

It is controversial whether there is some kind of cancelation effect here, or if there was no presupposition at all.⁷ Our concern is how much this sort of presupposition might play a role in question-sensitivity. But, we can control for it, simply by using overt presupposition suspenders (Horn 1972):

(13) Who, if anyone, stole the diamonds?

With the addition of the suspender if anyone, the presupposition appears clearly to be gone. It has also been a vexed issue whether focus carries an existential presupposition. I shall suppose it does not. I think this is the dominant view (e.g. Jackendoff 1972; Rooth 1999). But it remains controversial (e.g. Herburger 2000). I think the easiest way to see why many think

⁶ There are also issues about where accents should fall in embedded clauses, which I shall ignore. ⁷ Classic positions on this matter come from Groenendijk & Stokhof (1984) and Karttunen & Peters (1979), but the literature is rather large.


focus does not carry an existential presupposition is to contrast it with clefts, which do carry a real existential presupposition:

(14) Did Sam kiss anyone?
    a. Sam kissed [Kim]F.
    b. # It is [Kim]F who Sam kissed.

(15) Did anyone win the football pool this week?
    a. Probably not, because it is unlikely that [Mary]F won it, and she is the only person who ever wins.
    b. # Probably not, because it is unlikely that it was [Mary]F who won it, and she is the only person who ever wins.

(Example (15) is from Rooth 1999.) In light of these observations, we will assume question-answer congruence does not automatically require existential presupposition, and try to control any suggestion otherwise explicitly with suspenders. We now have some tools we can use to investigate question-sensitivity more carefully. It is to that task we now turn.

2.3 The ingredients of question-sensitivity

Let us return to our example of question-sensitivity. First, recall the two contexts involved:

(16) a. Context 1: Claire has stolen the diamonds. Ann and Ben are wondering who stole the diamonds, and Ann finds Claire’s fingerprints all over the safe.
     b. Context 2: Claire has stolen the diamonds. Ann and Ben are wondering what Claire stole, and Ann finds Claire’s fingerprints all over the safe.

We can now confirm that these are really two different contexts, as they set up different QUDs. But we also have a problem. We also have two distinct target sentences, because of question-answer congruence:

(17) Context 1. QUD: Who stole the diamonds?
    a. I know that [CLAIRE]F stole the diamonds.
    b. # I know that Claire stole [the DIAMONDS]F.

(18) Context 2. QUD: What did Claire steal?
    a. # I know that [CLAIRE]F stole the diamonds.
    b. I know that Claire stole [the DIAMONDS]F.

The felicitous sentences for these contexts require different foci, and so are distinct sentences. There is a further problem we might worry about: Are we smuggling in existential presuppositions for the QUDs? Are we thus smuggling in facts about what is known, which affect truth values but are not relevant to context-dependence? To try to address these, we will look at the examples again, but be more careful about the ingredients of the two contexts, including QUDs, presuppositions of questions, and assumptions about knowledge. And, we will also be more careful about how we divide up context and truth-supporting circumstances. First, let us try to enumerate the facts about the world that are common across contexts (1) and (2). To attempt this, we can first try to take the descriptions given in Schaffer & Knobe (2012) and Schaffer & Szabó (2014), and fill in everything we might infer that is common across those contexts. This exercise produces a list that looks something like:

1. Claire stole the diamonds.
2. The diamonds are stored in the safe.
3. Other things are also stored in the safe.
4. The diamonds are not now in the safe.
5. Claire’s fingerprints are on the safe.
6. No one else’s fingerprints are on the safe.

Looking at how we judge truth values for knowledge claims in these contexts, it appears we also take fingerprints to provide sufficient evidence in some cases. This is not simple to put in an epistemologically neutral way, but let us add:

7. Fingerprints are sufficiently reliable evidence.

Call these features F. F constitutes a very rough list of the initial facts about the world as described in contexts (1) and (2), excluding facts about who believes or knows what. I should pause to stress that it is not clear if F constitutes a complete enumeration of the relevant facts, and item (7) in particular will generate some questions about how we reach truth value judgments as we go forward. F is merely a rough enumeration of what seems to be at work in our judgments and is common across contexts (1) and (2). It will give us a starting point for understanding how truth values can turn out the way we judge, once we have a clear understanding of the contents of the target assertions.

So, F gives us some idea what the facts might be. We also need a clearer representation of the two contexts. To do this, it will be helpful to identify what is common ground in the conversation. This provides a good representation of a context, though it does not tell us what features of it are semantically relevant or how. At the beginning of the discourse, we can take all of (2–7) to be common ground for Ann and Ben. But (1) cannot be common ground for them. All that it looks like Ann and Ben presuppose is that some stealing took place. Adding this to (2–7) gives us the initial common ground. Call this B. B is a set of Base features which will be common across the contexts we will consider. B is the common ground, and so a substantial portion of the context as it stands at the beginning of the discourses that go into both (1) and (2).

Of course, in both cases, we have assertions, which add to common ground, and make facts explicit. We can assume that for both contexts, it becomes common ground that Ann believes Claire stole the diamonds. Call this BA (for Base with Addition). We might also note that when Ann’s assertions are accepted, then the fact that Claire stole the diamonds gets accommodated. But we will not actually need this given how we model things. When it comes to assessing the assertions in our two contexts, we can assume they are both made in contexts including BA, and that we assess them for truth against facts including F plus the fact that Ann believes Claire stole the diamonds. Call this FA. BA and FA give us initial indications of the needed facts and contextual inputs. They reflect what we assume as we reason about contexts (1) and (2).

BA is not yet a full context, as it does not specify a QUD, what is presupposed about the QUD, and what is taken as known about it. In looking at how to complete our contexts, we will focus on two sorts of ways of doing so: weak and strong. Weak and strong will characterize information that is active in a context over and above BA. Contexts can be weak or strong epistemically, or in terms of what their QUDs presuppose.
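The bookkeeping behind F, B, BA, and FA can be made explicit with a small sketch. It assumes, only for illustration, that facts and common-ground items can be represented as plain strings collected into sets; the variable names are hypothetical labels for the items described above.

```python
# Sketch of the sets F, B, BA, and FA, with facts modeled as strings
# (a simplifying assumption made only for illustration).

CLAIRE_STOLE = "Claire stole the diamonds"                      # item 1
SOME_STEALING = "Some stealing took place"
ANN_BELIEVES = "Ann believes Claire stole the diamonds"

# F: the rough list of facts about the world, items 1 to 7.
F = {
    CLAIRE_STOLE,
    "The diamonds are stored in the safe",
    "Other things are also stored in the safe",
    "The diamonds are not now in the safe",
    "Claire's fingerprints are on the safe",
    "No one else's fingerprints are on the safe",
    "Fingerprints are sufficiently reliable evidence",
}

# B: initial common ground, items 2 to 7 plus the presupposition
# that some stealing took place. Item 1 is not common ground.
B = (F - {CLAIRE_STOLE}) | {SOME_STEALING}

# BA: Base with Addition, after Ann's belief becomes common ground.
BA = B | {ANN_BELIEVES}

# FA: the facts against which the assertions are assessed for truth.
FA = F | {ANN_BELIEVES}

print(CLAIRE_STOLE in BA, CLAIRE_STOLE in FA)
```

On this representation the asymmetry is easy to see: that Claire stole the diamonds belongs to FA but not to BA, so it is part of what the assertions are assessed against without being presupposed by Ann and Ben.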


Let us start by looking at who-contexts, involving QUDs related to Who stole the diamonds? And, let us begin with weak forms. Our contexts can be epistemically weak, in that Ann and Ben do not know for sure that the diamonds were stolen. They can also have presuppositionally weak QUDs, in that their QUDs are not taken to carry an existential presupposition. Call a context that is both epistemically and presuppositionally weak BAEQ (E for epistemically weak, Q for a QUD without an existential presupposition). In this case, we have:

(19) Context for BAEQwho: Ann and Ben suspect that the diamonds were stolen. Ben asks Who, if anyone, stole the diamonds? Ann finds Claire’s fingerprints all over the safe.

This context is built from the common ground BA, and the QUD Who stole the diamonds, but explicitly canceling any existential presupposition, and not assuming specific knowledge that the diamonds were stolen. The judgments we get related to question-sensitivity for this context are:

(20) Context BAEQwho.
    a. T/G [Claire]F stole the diamonds.
    b. T I believe that [Claire]F stole the diamonds.
    c. F I know that [Claire]F stole the diamonds.

A few comments. These judgments are supposed to be of what we think is expressed in BAEQwho, assessed against FA. (a), marked T/G, sounds fine to me, and seems true. But there may be some issues about whether Ann violates a Gricean maxim because she asserts more than she has good reason to, or maybe she violates a norm of assertion (if you like knowledge norms). The marking T/G indicates the possible Gricean effect. (c) is false. Ann’s evidence is some fingerprints, which does not suffice to rule out that no one stole the diamonds. Maybe Claire took out some rubies to display, while Tim, wearing gloves, stole some sapphires. We can also look at strong contexts for the who-QUD. Strong contexts have a strong QUD with an existential presupposition, and the agents know that the presupposition is satisfied. This makes a strong context BAE+Q+. In the strong who-case we have:

(21) Context for BAE+Q+who: Ann and Ben know that someone stole the diamonds. (They are being ransomed by a third party.) Ben says, OK, we know the diamonds were stolen, but who stole them? Ann finds Claire’s fingerprints all over the safe.


This context builds on the same base BA. It adds the same QUD, but with no overt cancelation of any presupposition. And, it adds as a presupposition that it is known the diamonds were stolen. The judgments we get here are:

(22) Context BAE+Q+who.
    a. T [Claire]F stole the diamonds.
    b. T I believe that [Claire]F stole the diamonds.
    c. T I know that [Claire]F stole the diamonds.

Importantly, (c) changes status, as the evidence directly implicates Claire. I take these judgments to be fairly clear, and supported by experimental work (cf. Schaffer & Knobe 2012). We should note, and will return to later, the question of what truth-supporting circumstances are used for these judgments. It is a combination of FA, and, in the strong case, added information about what is known. There are some intermediate cases: BAE+Q and BAEQ+. BAE+Q is odd: we explicitly try to cancel something we already take as known. I will ignore this case. BAEQ+ turns out very messy. Take, for instance, its who-version:

(23) ? Context for BAEQ+who: Let’s suppose, though we don’t know for sure, that someone stole the diamonds. Who stole them?
    a. T [Claire]F stole the diamonds.
    b. ?? I believe that [Claire]F stole the diamonds.
    c. #/?? I know that [Claire]F stole the diamonds.

I think the latter two are bad, though the (c) sentence seems worse to me. But the judgments seem to me unclear, as the question in the set-up context seems bad. Especially for (c), I think we see the effect Simons (2007) observed about when our answers with know are acceptable. It requires emphasis on evidential standards and a high standard. But then, any evidential claim made against a presupposition understood to be mere supposition is not adequate. We also need to look at contexts with a what-QUD: What did Claire steal? Let us start with a weak what-context:

(24) Context for BAEQwhat: Ann and Ben suspect that Claire stole something. Ben asks What, if anything, did Claire steal? Ann finds Claire’s fingerprints all over the safe.
    a. G/T Claire stole [the diamonds]F.
    b. T I believe that Claire stole [the diamonds]F.
    c. F I know that Claire stole [the diamonds]F.

The Gricean effect for (a) seems stronger than in the who-context. More importantly, we still get a clear false judgment for (c). Now let us look at the strong context:

(25) Context for BAE+Q+what: Ann and Ben know that Claire stole something. (She confessed.) Ben says OK, we know Claire stole something, but what did she steal? Ann finds Claire’s fingerprints all over the safe.
    a. G/T Claire stole [the diamonds]F.
    b. T I believe that Claire stole [the diamonds]F.
    c. F I know that Claire stole [the diamonds]F.

Crucially, the truth value of (c) is still false, in contrast to the behavior in who-environments. Our key data is this. In strong contexts:

(26) a. Context BAE+Q+who.
     b. T I know that [Claire]F stole the diamonds.

(27) a. Context BAE+Q+what.
     b. F I know that Claire stole [the diamonds]F.

We have different truth values for the two target sentences. We also have what I call the flip:

(28) a. Context BAEQwho.
     b. F I know that [Claire]F stole the diamonds.

(29) a. Context BAE+Q+who.
     b. T I know that [Claire]F stole the diamonds.

It is the flip that generates the contrast we see in strong contexts. It shows that while in the weak who-context our target sentence is false, in the strong context it comes out true. I take these data points to be the core phenomenon of question-sensitivity. Question-sensitivity is not simple. The contexts that show it can differ in many ways, including QUDs, presuppositions, and what is known in the contexts. We also have two different sentences in our original example of question-sensitivity.


Our examination of the phenomenon of question-sensitivity allows us to make a prima facie case for the context-dependence of knowledge ascriptions, but it is one we will have to revisit and reconsider as we proceed. Here is a highly sketchy version of a case for context-dependence. The main difference between our two contexts is a difference in QUD. This is indeed a feature of context. It leads to a shift in truth values. Hence, we have context-dependence. We already know this is not quite right, and we need to make the prima facie argument with more care. Our two different contexts involve two different sentences, differing in focus placement. So, we cannot pretend we have one sentence changing truth value in different contexts. But we can still make a prima facie case, assuming that FA is an accurate representation of the truth-supporting circumstance, and BA is an accurate representation of the elements common across the contexts. Assuming this, we recall that our judgments led us to conclude that: (30)

\[
[\![\text{I know that } [\text{Claire}]_F \text{ stole the diamonds}]\!]^{\text{BAE+Q+who}} \;\neq\; [\![\text{I know that Claire stole } [\text{the diamonds}]_F]\!]^{\text{BAE+Q+what}}
\]

After all, our judgment for the first is true, and the second false, so the two must differ in truth-conditions. Now, we can ask where that difference in truth-conditions can come from. The difference between the contexts involves only QUDs, and the difference between the sentences only focus. So, the difference in truth-conditions must be generated by the semantics of focus, set by QUD. So, know must be sensitive to these differences in a way that generates different truth-conditions. Thus, we conclude, know is context-dependent, with context-dependence mediated by focus. This is less than a direct argument: it is more a proposal about how best to explain the phenomenon. But, there is an alternative available: deny FA is sufficient to fix truth values. After all, in the strong E+ contexts, we add claims about what we know. And they are different in the different contexts: one is knowledge that diamonds were stolen, while the other is knowledge that Claire stole something. We supposed, roughly, that fingerprints are reliable evidence (number 7 of F), but how we evaluate that evidence, and what we can conclude from it, will at best depend on much of the background of what else we know. So, we


might simply conclude that it is not FA against which we assess truth, but a much richer epistemic range of facts. If so, then we do not need context-dependence to explain our results. It is simply a change of facts leading to a change of truth values. What of the appearance of context-dependence? We might explain this away as what is called a weak effect of focus: a discourse effect not leading to truth-conditional differences. The existence of a non-truth-conditional discourse effect from focus, and change of facts hidden under a QUD could create an appearance of context-dependence, even if there is none.⁸ So, we have two different views. One offers at least a prima facie case for context-dependence, the other does not. Which of these arguments is right? I shall claim they are each half-right. There is some genuine context-dependence at work, and I shall argue for it indirectly, by laying out a good semantics that shows how it works. But, there is still an issue about what goes into the truth-supporting circumstances, that indicates weaker effects of context than the prima facie argument might suggest. Looking at the phenomenon, the key issue is what explains the flip. Is it driven by context-dependence, or by facts about truth-supporting circumstances? I shall argue that the context-dependence we will uncover does not fully explain the flip. Thus, though there is a case to be made for context-dependence, it is not the full story about question-sensitivity. I shall argue this as follows. I shall sketch a semantics which makes room for context-dependence. With that, we can look at how it captures the key data of question-sensitivity, and see where other explanations are needed.

⁸ This comes closer to the original way Schaffer formulated the idea of contrastive knowledge.

2.4 Semantics and context-dependence

To carry this out, we first need to work with the semantics of know, and then look at how it can be context-dependent. I shall address these in that order.

2.4.1 Semantics

Our first goal is to build a not-too-terrible semantics for know. Schaffer & Szabó (2014) build an interesting semantics, relying on an analogy with adverbs of quantification. I shall build a variant, which gets similar results, but is more in keeping with the semantics of other attitude verbs. I shall briefly ask whether we can tell which variant is better as we proceed. Let us start with a fairly standard semantics for attitude verbs. The usual starting place is work of Hintikka (1969). Hintikka’s idea is that an attitude verb is interpreted as the set of worlds compatible with the subject’s attitudinal state: (31)

a. $S_A(x)(w) = \{w' : w' \text{ is compatible with } x\text{’s } A\text{-attitude in } w\}$
b. $[\![\text{Att-V}]\!]^{c} = \lambda P \lambda x \lambda w.\; S_A(x)(w) \subseteq P$

Of course, this is simplified, ignoring de se effects, and so on, and the long tradition of belief puzzles. But it offers a useful starting-point for the semantics of attitudes. Specifically for belief, the relevant attitudinal state is the speaker’s belief state, labeled DOX (for doxastic state). We thus have: (32)

$[\![\text{believe}]\!]^{c} = \lambda P \lambda x \lambda w.\; \mathrm{DOX}(x)(w) \subseteq P$
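The Hintikka-style entries in (31) and (32) can be sketched in code, reading the relation between the attitude state and P as set inclusion. The sketch assumes a toy model in which worlds are strings and propositions are sets of worlds. The world names, the individuals, and the `DOX` table are all hypothetical, chosen only to make the definitions runnable.

```python
# Minimal sketch of (31b) and (32): an attitude verb holds of P, x, w
# iff every world compatible with x's attitude in w is a P-world.
# The model below (worlds, individuals, DOX table) is a made-up example.

def attitude_verb(state):
    """(31b): lambda P. lambda x. lambda w. S_A(x)(w) is a subset of P."""
    return lambda P: lambda x: lambda w: state(x)(w) <= P


# Hypothetical doxastic states: the worlds compatible with what x believes in w.
_DOX_TABLE = {
    ("ann", "w1"): frozenset({"w1", "w2"}),
    ("ben", "w1"): frozenset({"w1", "w3"}),
}


def DOX(x):
    """DOX(x)(w): x's belief worlds in w, looked up in the toy table."""
    return lambda w: _DOX_TABLE[(x, w)]


# (32): believe is the attitude verb whose state is DOX.
believe = attitude_verb(DOX)

# A proposition is a set of worlds: here, the worlds where Claire stole the diamonds.
claire_stole = frozenset({"w1", "w2"})

print(believe(claire_stole)("ann")("w1"))  # all of Ann's belief worlds are in P
print(believe(claire_stole)("ben")("w1"))  # w3 is a belief world of Ben's outside P
```

Nothing in `believe` consults a context parameter: its value depends only on the proposition, the individual, and the world.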

This is a useful initial semantics for belief, though of course, much has been done since Hintikka’s work. One important observation we can make already with this semantics is that it shows no context-dependence for believe. DOX is fully determined by the speaker’s state, and there is thus no room for context-dependence. There might be some weak question-answer effects from focus, but they cannot be truth-conditional, if this semantics is on the right track. But other attitudes show more context-dependence, and we can model it with the same basic approach. Consider glad, which shares some properties with know. Glad is an ‘emotive factive’. It has a few salient features: (33)

a. Presupposes its complement (like know is often assumed to). b. Attitude is emotive (different from know, which is evidential in some way).

We can follow von Fintel (1999) and Heim (1992) to provide a Hintikka-style semantics for emotives. This builds on the standard Kratzer (1977) semantics for modals. As is well known, the Kratzer semantics provides two contextual parameters:

OUP CORRECTED PROOF – FINAL, 24/1/2019, SPi



  (34)

a. Modal base. A function f(x, w) from individuals and worlds to sets of worlds. (The worlds accessible from w for x.) b. Ordering source. A function g(x, w) to sets of propositions (sets of sets of worlds).

The ordering source allows us to define a partial ordering on worlds. Given any set of propositions X and worlds x, y, we set x  X y iff for all p 2 X, if x 2 p, then y 2 p. If we make the so-called limit assumption (e.g. Lewis 1973), this allows us to define a set of best worlds in a given set W:⁹ (35)

maxg ðWÞ is the set of g -best worlds.

Attitudes can be similar to modals, in quantifying over the right set of worlds. They can thus pick up the same context-dependence through these two parameters as modals can. But as we will see, there is much more lexical specification of the values of the parameters for attitudes.

As a warm-up to glad, let us start with want (von Fintel 1999; Heim 1992; Stalnaker 1984). For want, we need to pick an appropriate ordering source for preferences. Let g(x, w) be a set of propositions that characterizes what x prefers in w. We then have:

(36) $\llbracket \text{want} \rrbracket^{c} = \lambda P \lambda x \lambda w.\ \max_{g(x,w)}(f(x,w)) \subseteq P$

You want P if all the worlds you prefer most are P worlds. There is more to say about this proposal, but it will suffice for now. What is the right modal base for want? It appears it should be:

(37) $f(x,w) = \mathrm{DOX}(x)(w)$

This distinguishes want from wish. What I want in w is compatible with what I believe to be the case in w. Want does not require all the most desirable worlds to be P worlds, only the most desirable among those you believe are open. For instance, I might want to teach Tuesdays and Thursdays, even though the absolute best worlds are those where I am not teaching. To include this, we can add:

(38) $\llbracket \text{want} \rrbracket^{c}$:
     a. Defined only if $f(x,w) = \mathrm{DOX}(x)(w)$
     b. If defined, $= \lambda P \lambda x \lambda w.\ \max_{g(x,w)}(f(x,w)) \subseteq P$
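The teaching example can be replayed in a small Python sketch of (38). This is only a toy model under stated assumptions: worlds are labeled strings, propositions are sets of worlds, an undefined (presupposition-failing) case is represented by `None`, and the three worlds and the preference propositions are invented for illustration.

```python
def best_worlds(worlds, ordering):
    """Worlds that no other world in `worlds` strictly outranks under the ordering source."""
    def geq(y, x):  # y is at least as good as x
        return all(y in p for p in ordering if x in p)

    return {x for x in worlds if not any(geq(y, x) and not geq(x, y) for y in worlds)}


def want(P, dox, f, g):
    """(38): undefined (None) unless f = DOX; otherwise true iff max_g(f) is a subset of P."""
    if f != dox:
        return None
    return best_worlds(f, g) <= P


# Hypothetical worlds: no teaching, teaching Tue/Thu, teaching Mon/Wed/Fri.
ALL = {"none", "tue_thu", "mon_wed_fri"}
DOX = {"tue_thu", "mon_wed_fri"}                            # I believe I will be teaching
g = [frozenset({"none"}), frozenset({"none", "tue_thu"})]   # assumed preferences
P = {"tue_thu"}                                             # "I teach Tuesdays and Thursdays"

print(want(P, DOX, DOX, g))        # best belief-compatible worlds are all P worlds
print(best_worlds(ALL, g) <= P)    # the absolutely best world is the no-teaching one
```

Restricting the modal base to DOX makes the want ascription come out true, while quantifying over all worlds would not, which is the contrast between want and wish that the definedness condition is meant to capture.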

⁹ For a fuller presentation, and many references, see Portner (2009).


Again, this gives us a workable example of a semantics. Now we can do glad. Glad adds factivity. There is a long debate about how to handle factivity in semantics. If a proposition P is the factive content, then the usual way is to require that it be presupposed that $\mathrm{DOX}(x)(w) \subseteq P$. This might be weaker than most epistemologists would assume, as it does not require the truth of P. It is standard in semantics, as most standard approaches to presuppositions cannot easily implement the stronger requirement. There are also some reasons in support of it. For instance, we have examples like:

(39) John mistakenly thought it was Sunday and was glad he could sleep in.

I will simply follow the tradition and work with the weaker form. We need more than a factive presupposition for glad. We also need to presuppose that $\mathrm{DOX}(x)(w) \subseteq f(x,w)$. To make this vivid, observe that if the modal base contained only worlds incompatible with what I believe, then selecting the best ones would not reflect my attitude. I’m glad when things as I see them work out OK. But crucially, we cannot simply set $\mathrm{DOX}(x)(w) = f(x,w)$. Suppose I got bitten by a mosquito and got Chikungunya. Then the best worlds in DOX are those where I get debilitating muscular problems, but live (the disease can cause severe muscular problems or death). But I am not glad I get debilitating muscular problems. There are nearby ways things could have unfolded where I don’t get infected from the bite. Those are better. (NB if we combine the condition that $f(x,w) \cap P \neq \emptyset$ and $f(x,w) \setminus P \neq \emptyset$ with the weak factivity presupposition that $\mathrm{DOX}(x)(w) \subseteq P$, we also see that we cannot have $f(x,w) = \mathrm{DOX}$.) So (simplifying again), our semantics for glad is:

(40) $\llbracket \text{glad} \rrbracket^{c}$:
     a. Defined only if:
        i. $\mathrm{DOX}(x)(w) \subseteq P$
        ii. $\mathrm{DOX}(x)(w) \subseteq f(x,w)$
     b. If defined, $= \lambda P \lambda x \lambda w.\ \max_{g(x,w)}(f(x,w)) \subseteq P$

This semantics makes the difference with want and believe clear. For glad, unlike want and believe, we have genuine context-dependence. For want and believe, lexical constraints fix the value of f(x, w) fully, as they set $f(x,w) = \mathrm{DOX}(x)(w)$. But for glad, we only have $\mathrm{DOX}(x)(w) \subseteq f(x,w)$.


Context must contribute something more to fix the value of f(x, w). Lexical meaning does constrain the value of this parameter, but it leaves it partly open for glad.

With all this background in place, we can build a semantics for know, following the model of other factive attitudes. We will wind up with a variant of the Schaffer & Szabó (2014) semantics, but by a very different route. And once we have it, we can apply it to the case of question-sensitivity. We have already seen that factive attitudes can open up space for context-dependence, and we will explore how that works with know.

When moving from emotive factives to knowledge, we need to think about evidence. Linguistically, our earlier observations about evidentials and questions and answers indicate this. We saw that evidentially oriented material can allow embedded main point questions. We get this most easily for know when a high degree of confidence is a relevant factor in discourse. Linguistics aside, it is of course an obvious epistemological idea that evidence is relevant to knowledge.

We will put the evidential component of know into the ordering source, where the preference ordering showed up for glad. (This makes the attitudes of different flavors, one emotive, the other cognitive.) To do this, we replace the set of propositions characterizing preferences with those that are evidence for agent x in w. Call this E(x, w). In parallel with glad, the main idea is to rule out any world not compatible with all the evidence. Hence, we will require $\max_{E(x,w)}(f(x,w)) \subseteq P$.

Know is factive, and we need to presuppose factivity. As I mentioned earlier, I shall follow the linguistic tradition and encode this as $\mathrm{DOX}(x)(w) \subseteq P$. This requires (more or less) that every world in the common ground is a P world, so it is not a bad approximation, even if your epistemology might call for something stronger. As far as discourse goes, this often at least looks correct. If we all presuppose it is raining, and I say I know I need an umbrella, things look fine within the discourse, even if in fact we are wrong about it raining.

As with other attitudes we have looked at that are ‘not too counterfactual’, we need the modal base to include our belief worlds. So, we should have $\mathrm{DOX}(x)(w) \subseteq f(x,w)$, just as with glad. As before, we allow more in the modal base, to allow some counterfactual reasoning. The requirement is most clear in cases of counterfactual reasoning coupled with realis assertions:

(41) Let’s suppose one of us would teach aesthetics. That includes me, Jonathan, Zoltan, and Cody.
     a. I know Jonathan would teach aesthetics.
     b. # I know Jonathan is teaching aesthetics.

We can reason about knowledge with worlds outside what we in fact believe. We need all our belief worlds to be present, but can go beyond them for know. Knowledge entails belief. As before, our current approach gets this as a bonus from factivity, as we have $\mathrm{DOX}(x)(w) \subseteq P$.¹⁰ With all this, our semantics for know is:

(42) $\llbracket \text{know} \rrbracket^{c}$:
     a. Defined only if:
        i. $\mathrm{DOX}(x)(w) \subseteq P$
        ii. $\mathrm{DOX}(x)(w) \subseteq f(x,w)$
     b. If defined, $= \lambda P \lambda x \lambda w.\ \max_{E(x,w)}(f(x,w)) \subseteq P$
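To make the shape of (42) concrete, the following Python sketch evaluates a know ascription over a toy model. It is an illustration under assumptions, not a definitive implementation: worlds are strings, propositions are sets of worlds, presupposition failure is represented by returning `None`, and the worlds `w1` to `w4` and the evidence sets are hypothetical.

```python
def best_worlds(worlds, ordering):
    """Worlds that no other world in `worlds` strictly outranks under the ordering source."""
    def geq(y, x):  # y is at least as good as x
        return all(y in p for p in ordering if x in p)

    return {x for x in worlds if not any(geq(y, x) and not geq(x, y) for y in worlds)}


def know(P, dox, f, evidence):
    """(42): None when a presupposition fails; otherwise true iff max_E(f) is a subset of P."""
    if not dox <= P:   # (42a-i): weak factivity
        return None
    if not dox <= f:   # (42a-ii): the belief worlds are in the modal base
        return None
    return best_worlds(f, evidence) <= P


P = {"w1", "w2"}
f = {"w1", "w2", "w3"}   # modal base that goes beyond DOX

print(know(P, {"w1"}, f, [frozenset({"w1", "w2", "w4"})]))  # evidence excludes w3
print(know(P, {"w1"}, f, [frozenset({"w1", "w3"})]))        # evidence leaves w3 open
print(know(P, {"w3"}, f, [frozenset({"w1", "w2"})]))        # DOX is not inside P
```

The three calls give a true ascription, a false one, and an undefined one respectively. Swapping the evidential ordering source for a preference ordering gives the corresponding sketch for glad in (40), since the two entries differ only in that parameter.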

This is like glad in many ways. It is a factive attitude. It has substantial context-dependence, via f(x, w), just like glad does, as the modal base can go beyond DOX. The main difference is that we have switched from an emotive ordering source to an evidential one, E.

My main goal here is to provide a linguistically plausible semantics for know, that illustrates its context-dependence, and shows it to be similar to other attitudes. But there is a little bit of real epistemology here too. Schaffer and Szabó observe that a similar semantics can be seen as an implementation of a relevant alternatives theory (e.g. Dretske 1981; Goldman 1976; Lewis 1996; Schaffer 2005; Stine 1976). The idea is roughly that to know is to be able to rule out competing hypotheses. But not all possibilities are relevant, e.g. distinguishing zebras from cows is relevant, but not distinguishing them from carefully painted horses. The semantics just sketched implements a version of this. f(x, w) provides a contextually fixed domain of ‘relevant’ alternatives. E rules them out.

Insofar as the semantics works, this might be an indirect reason to prefer a relevant alternatives theory. But, it is a very weak reason. The abstract semantics does not say what in particular E is, beyond it being evidential. This allows that it could be something very different from the mechanisms relevant alternatives theories have in mind. The same goes for max. As we explore how context sets the value of f(x, w), we will see standard mechanisms from semantics and pragmatics. It is not obvious if these provide the relevance assumed by relevant alternatives theories. So, we will think of the semantics in a relevant alternatives way, but not put much weight on it.

With our semantics, which makes some room for context-dependence, in hand, we can now look at how context can affect it.

¹⁰ Schaffer and Szabó encode belief directly. This is one virtue of the approach to factivity I have taken here.

2.4.2 Context-dependence

We will see that, as question-sensitivity highlights, the main mechanism of context-dependence at work with attitudes like know is focus. In particular, it is a case of what is called association with focus (Rooth 1985), where focus has a truth-conditional effect. We will explore how our factive attitudes allow this. We will see that we have a case of what Beaver and Clark (2008) call free association. In cases like this, focus constrains the setting of a contextual parameter.¹¹

The basic idea behind free association with focus is that when we have a generalization over a domain that is partly set by context, focus plays a crucial role in determining what is in that domain. In our cases, we have a contextual parameter f(x, w) partly set by context. Its value is constrained, as we have $\mathrm{DOX}(x)(w) \subseteq f(x,w)$. But the value is not fully fixed by lexical meaning, so it must be partly set by context. Our semantics is one of universal quantification over worlds in f(x, w), so f(x, w) functions as a restrictor. In such cases, we have the general phenomenon of free association with focus: focus links to questions that affect restrictors of operators.

The general phenomenon here is well-established. A wide range of operators with contextually restricted domains associate with focus, including adverbs of quantification, determiners, counterfactuals, etc. (e.g. Beaver & Clark 2008; Partee 1991). There have been a number of explanations offered over the years as to why and how this happens. Some early views simply say it is a lexically encoded feature of operators that associate with focus (Rooth 1985). Given how widespread the phenomenon is, it is more common these days to see it as derived from semantics-pragmatics interactions. So, for instance, Rooth (1992) and von Fintel (1994) argue that focus sets up anaphora with a focus value, which has the result of restricting domain parameters. Or, along the lines of Roberts (1996), it has been argued that association with focus is QUD driven. Question-answer congruence triggers some kind of pragmatic inference, triggering local accommodation of material into a restrictor.

I shall talk more about some of these theoretical options in a moment. First, let us see the end result. A QUD, appropriately restricted, provides a salient set of propositions C. This gives us a salient set of worlds $\bigcup C$. This restricts our set of available worlds f(x, w), as we will need to have $f(x,w) \subseteq \bigcup C$. Pragmatics typically strengthens this to $f(x,w) = \bigcup C$.

To see where focus fits into this process, we need a little more detail on its semantics. Simplifying a lot, but following Rooth (1985, 1992), assign an alternative set to a sentence by varying the focused element, to give a focus semantic value. Our focus semantic value is marked by a superscript f, e.g.:

(43) $\llbracket [\text{Ede}]_{F} \text{ wants coffee} \rrbracket^{f} = \{\llbracket x \text{ wants coffee} \rrbracket : x \in D_{e}\}$

¹¹ Focus in attitudes has been studied by a number of authors, including Asher (1987), Beaver & Clark (2008), Dretske (1972), von Fintel (1999), Heim (1992), Jackendoff (1972), Kadmon & Landman (1993), and Simons (2007).

We also need a quick and dirty semantics for questions (borrowing from Groenendijk & Stokhof 1984 and Hamblin 1973) that makes the semantic value of a question the set of possible answers to it, varying the argument position of the wh-expression. Hence:

(44) $\llbracket \text{Who wants coffee?} \rrbracket = \{\llbracket x \text{ wants coffee} \rrbracket : \mathrm{person}(x)\}$
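The two alternative sets in (43) and (44) can be built side by side in a short Python sketch. Everything concrete here is an assumption made for illustration: the three-member domain (including a non-person, `Rex`), the identification of a world with the set of individuals who want coffee in it, and the function names.

```python
from itertools import combinations

DOMAIN = ["Ede", "Ann", "Rex"]   # hypothetical D_e
PERSONS = {"Ede", "Ann"}         # hypothetical extension of person(x); Rex is a dog

# A world is identified with the set of individuals who want coffee in it.
WORLDS = [frozenset(c) for n in range(len(DOMAIN) + 1) for c in combinations(DOMAIN, n)]


def wants_coffee(x):
    """The proposition [[x wants coffee]]: the set of worlds where x wants coffee."""
    return frozenset(w for w in WORLDS if x in w)


def focus_value_subject():
    """(43): vary the focused subject over all of D_e."""
    return {wants_coffee(x) for x in DOMAIN}


def question_value_who():
    """(44): vary the wh-position over individuals satisfying person(x)."""
    return {wants_coffee(x) for x in DOMAIN if x in PERSONS}


ordinary = wants_coffee("Ede")
print(ordinary in question_value_who())               # the answer is among the alternatives
print(question_value_who() <= focus_value_subject())  # the question value sits inside the focus value
```

In this toy domain the question value is a proper subset of the focus semantic value, because the sortal restriction to persons removes the alternative about Rex.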

A fuller semantics, among other things, allows for sortal restrictions. But the quick and dirty version will suffice for now. It will be important as we go forward to allow a null element ∅ to be among the values for questions and focus. This may be more natural in a setting where our variables range over generalized quantifier values. But it reflects the considerations we raised in section 2.2 about existential presuppositions, and reflects other substantial issues in the semantics of questions. Some of these could complicate the semantics, but adding ∅ will give us a useful way to work with a simple semantics.

Now, we can ask more about how focus works. As Rooth (1992) observed, focus is essentially anaphoric. When S contains a focus, it requires a salient set C in the context such that $C \subseteq \llbracket S \rrbracket^{f}$. C provides a set of salient alternatives for the focus. We also build in some non-triviality requirements: $\llbracket S \rrbracket \in C$, and there is at least one other element of C. What provides C? Question-answer congruence. Question semantic values provide values for focus anaphora. The end result is then that in typical cases, focus semantic value = semantic value of the question under discussion.

There are a few modifications we might note. For embedded clauses, we might have the focus semantic value of the clause containing the focus = semantic value of the immediate QUD. And of course, this is an oversimplification. We have not allowed for multiple foci, nor have we taken into account the possible complexity in the structure of questions and subquestions and how intonational prominence relates to them. But, we can work with the simple idea that focus is anaphoric on the QUD. There is usually further domain restriction. Focus semantic values are big. We normally get anaphora on a contextually restricted part of the value, which in turn will map to a contextually restricted part of a question value. But this still amounts to finding a contextually salient C.

There is a theoretical issue I shall mention, only to put aside. There are two ways of approaching a more detailed analysis of this effect. One is to take the anaphoric behavior of focus as basic, and derive question-answer congruence effects (e.g. Rooth 1992). The other is the reverse, which takes question-answer congruence to be a felicity condition and derives the anaphoric properties of focus (e.g. Roberts 1996). These are theoretically distinct options, but fortunately, we do not need to decide between them now.

There are a few other theoretical issues I shall mention briefly, and also set aside. Why, for instance, are restrictors specifically associated with focus? We see that when a restrictor is context-dependent, it needs to find a salient value. Focus affects how the value is set, and makes it anaphoric on the restricted QUD (e.g.
Partee 1991). Slightly more specifically, the restrictor contains a variable, which is constrained to be anaphoric on whatever licenses focus in the nuclear scope. This is the QUD (e.g. von Fintel 1994; Rooth 1992). But why does this happen? Luckily, we need not really decide. But here are a few ideas. As I mentioned, it may be simply part of the semantics of operators (Rooth 1985). It is more common these days to argue it is a pragmatic effect, and defeasible. It may just be that there are two context-dependent elements, looking for similar values, and they typically wind up being set the same way. Or, maybe we have local accommodation. Maybe general constraints on discourse force the restrictor, which is the local domain, to satisfy the focus requirement (Beaver & Clark 2008; Roberts 1996).

I have sketched some of the semantics and pragmatics of focus and QUD, to illustrate how they work. The main result, which will be important for us, is that when we have a contextually determined domain restrictor, it gets constrained by the QUD. Where C is the appropriately restricted value of the QUD, we typically have our restrictor constrained by $\bigcup C$. Applying this to our semantics, in most cases, we will have $f(x,w) = \bigcup C$. This is the main effect of focus, and is the underlying reason for question-sensitivity. We also have a lexically set lower bound: $\mathrm{DOX}(x)(w) \subseteq f(x,w)$. But f(x, w) remains context-dependent, and mechanisms of context-dependence have f(x, w) resolve to $\bigcup C$ (which can often get us the lexical constraint for free). Focus and QUD are the main mechanisms that make this happen. The relation of QUD to focus creates question-sensitivity.

We now have at least a sketch of a semantics for know that allows for a context-dependent parameter, and a sketch of how that parameter gets set in context. I have provided a little extra information, to flesh out what the mechanism setting the parameter might be. But the important point is that we have a strong generalization about what value it takes in context, which we can use to look once more at question-sensitivity. That will be our task in the next section.

2.4.3 The Schaffer-Szabó semantics

Before doing that, I shall pause to compare the semantics I sketched here to the one developed by Schaffer and Szabó. The same line of reasoning I have followed here led them to a slightly different proposal. Their proposal treats know as an adverbial. Adverbials are known to associate with focus, and are in important ways similar in structure to modals. A version of their semantics is something like:

(45) $\llbracket \text{know} \rrbracket^{c} = \lambda P \lambda x \lambda w.\ \left(\bigcap E(x,w) \cap \bigcup C\right) \subseteq P\ \wedge\ x$ truly believes P on the basis of E(x, w)

This is not quite their form. Semantics for adverbs is usually done in terms of situations (Berman 1987; von Fintel 1994; Kratzer 1989). Schaffer and Szabó follow suit. Also following the tradition on adverbs, they assume a distinct semantic argument which can be filled by an overt restrictor (an if or when-phrase). I doubt these matter at the level of detail we are working at. The significant difference as I see it is that they add the separate clause: x truly believes P on the basis of her evidence in w.

Are the two semantics equivalent? They are very close. The emotive factive model put in a kind of factive presupposition as a definedness condition, but Schaffer and Szabó could do this as well. My semantics gets belief for free. But it only does so because of the particular way we implemented the factive presupposition. So, it is not clear if this is a benefit or not. Actually, there is another option for deriving belief. Assume a version of ‘you believe your evidence’, i.e. $\mathrm{DOX} \subseteq \bigcap E$. Then $\mathrm{DOX}(x)(w) \subseteq f(x,w)$ and $\mathrm{DOX}(x)(w) \subseteq \bigcap E(x,w)$, so $\mathrm{DOX}(x)(w) \subseteq \max_{E(x,w)}(f(x,w)) \subseteq P$. (We might worry that this assumes more closure than we want.) I thus take my semantics to be largely a variant of that in Schaffer & Szabó (2014). There are small differences, which are worth further exploration, but substantial similarity.

2.5 Predicting and explaining question-sensitivity

Finally, with all this background in place, we can return to question-sensitivity. We will apply our semantics to see if we can explain the features of question-sensitivity we uncovered above. Association with focus will be a key component, but only a part of the story. Recall, we had two features to derive for question-sensitivity:

1. BAE+ Q+ what: False.
2. The flip.
   • BAE Q who: False.
   • BAE+ Q+ who: True.

We can now see if and how these results come out of our semantics, by doing a few computations. Here are the components of our computations. First, recall that our semantics tells us that know is true iff $\max_{E(x,w)}(f(x,w)) \subseteq P$. We will fix a few features for the target question-sensitivity cases. We will fix that Ann (= a) is the speaker and the agent of the attitude, and that we are in the world described by FA. Hence, we will usually not mention our world and agent parameters ‘x’ and ‘w’. Also, assume from BA that the presuppositions on f and P are satisfied.


We already know that f(x, w) is context-dependent, and that association with focus will have it fixed to be $\bigcup C$ for a salient C provided by the QUD. (Technically, it is only constrained by $\bigcup C$, but, as I noted above, we usually pragmatically strengthen this effect to have $f(x,w) = \bigcup C$.) This allows us to assume that know is true iff $\max_{E}(\bigcup C) \subseteq P$. It will simplify matters to assume that E is coherent, so that $\bigcap E$ is non-empty. It will also simplify matters to assume that not all members of $\bigcup C$ are ruled out by the total available evidence. This is satisfied if DOX and E are veridical. If it fails, all worlds believed or considered are ruled out, which is a degenerate case we can safely ignore. Putting all these together, we get:

• $\max_{E}(\bigcup C) = (\bigcup C) \cap (\bigcap E)$.
• know is true iff $((\bigcup C) \cap (\bigcap E)) \subseteq P$.
• $P = \llbracket \text{Claire stole the diamonds} \rrbracket$.
• So Ann knows that Claire stole the diamonds is true iff every alternative world compatible with all the evidence is a world where Claire stole the diamonds.
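The first identity in the list can be checked by brute force on a small finite model. The Python sketch below is only a sanity check under assumptions: four numbered worlds, evidence consisting of two propositions, and the ordering defined as in (35). It skips exactly the degenerate case set aside in the text, where no world in the domain is compatible with all the evidence.

```python
from itertools import chain, combinations


def best_worlds(worlds, ordering):
    """Worlds that no other world in `worlds` strictly outranks under the ordering source."""
    def geq(y, x):  # y is at least as good as x
        return all(y in p for p in ordering if x in p)

    return {x for x in worlds if not any(geq(y, x) and not geq(x, y) for y in worlds)}


def subsets(universe):
    """All subsets of a finite universe, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for c in
            chain.from_iterable(combinations(items, n) for n in range(len(items) + 1))]


UNIVERSE = {0, 1, 2, 3}   # hypothetical worlds
checked = 0
for union_c in subsets(UNIVERSE):
    for e1 in subsets(UNIVERSE):
        for e2 in subsets(UNIVERSE):
            total_evidence = e1 & e2
            if not (union_c & total_evidence):
                continue  # degenerate case: the evidence rules out the whole domain
            # max_E(union of C) coincides with (union of C) intersected with (intersection of E)
            assert best_worlds(union_c, [e1, e2]) == union_c & total_evidence
            checked += 1

print("non-degenerate cases checked:", checked)
```

In the degenerate case the identity can fail: with evidence {0} and {1} over the domain {0, 1}, both worlds are best under the ordering, although no world is compatible with all the evidence.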

We can now compute truth values in our contexts. Let us start with the weak BAE Q who context. In this, we have $C = \mathrm{QUD} = \{\llbracket x \text{ stole the diamonds} \rrbracket : x = \text{Claire}, \text{Ann}, \text{Ben}, \emptyset, \ldots\}$. As this is a Q context, we include ∅ (i.e. no one stole the diamonds). The BAE Q who computation is then:

• Is every world in $\bigcup C$ compatible with the total evidence $\bigcap E$ one where Claire stole the diamonds?
• No. The total evidence, i.e. fingerprints, does not rule out the null option, that no one stole the diamonds and Claire took out some rubies to display.
• So, predict False.
• This is the correct result.

Now for the strong BAE+ Q+ who context. Here $C = \mathrm{QUD} = \{\llbracket x \text{ stole the diamonds} \rrbracket : x = \text{Claire}, \text{Ann}, \text{Ben}, \ldots\}$, but as we have a Q+ context, we do not include ∅. The BAE+ Q+ who computation is then:

• Is every world in $\bigcup C$ compatible with the total evidence $\bigcap E$ one where Claire stole the diamonds?
• C no longer includes the null option.
• So we conclude yes. We know someone stole the diamonds, and our evidence is enough to rule out anyone other than Claire.
• So, predict True.
• This is the correct result.

Thus, we get the flip. Now we can do our what-context. In the strong BAE+ Q+ what context, we have $C = \mathrm{QUD} = \{\llbracket \text{Claire stole } x \rrbracket : x = \text{the diamonds}, \text{the rubies}, \text{the trade secrets}, \ldots\}$. As this is a Q+ context, we do not include ∅. The BAE+ Q+ what computation is then:

• Is every world in $\bigcup C$ compatible with the total evidence $\bigcap E$ one where Claire stole the diamonds?
• We conclude no. The evidence of fingerprints on the safe tells us nothing to rule out any option in C.
• So, predict False.
• This is the correct result.

So, we have generated all the right results for question-sensitivity. We thus have an account of the semantics and pragmatics of knowledge ascriptions that gets the right results on the question-sensitivity cases. But what really made that happen? We have seen the computations in detail, but should ask what is important in them, and what carries the explanatory weight.

First, though, we should observe again that our semantics for know is context-dependent. It is a very standard semantics for attitudes, that has a standard form of context-dependence, derived from the parameter f(x, w). It shows strong association with focus effects (specifically, in the terminology of Beaver and Clark (2008), free association with focus). Again, that is entirely standard for this sort of case. Because of this, I agree with Schaffer and Szabó that in spite of resistance to other forms of contextualism for knowledge, this form is well-motivated and well-supported. A fully standard semantics and pragmatics predicts contextualism. Thus, to an extent, we have vindicated contextualism.

The main place where we see the role of context-dependence is in comparing our strong who- and what-contexts. There, context hands us entirely different domains: $\{\llbracket x \text{ stole the diamonds} \rrbracket\}$ versus $\{\llbracket \text{Claire stole } x \rrbracket\}$. It is then easy to see that E does not help rule out any what-worlds.
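The three computations above can be replayed in a toy Python model. This is a sketch under loudly stated assumptions, not the author's implementation: each world is reduced to a single (thief, item) pair, the null option is one extra world, and the total evidence is simply stipulated to leave open only worlds where Claire is the thief plus the null world. That stipulation hard-codes the judgment that the fingerprints exclude other thieves, which is itself a substantive assumption about how the evidence works.

```python
# Worlds are (thief, item) pairs; NULL is the world where nothing was stolen.
NULL = ("no one", "nothing")
PEOPLE = ["Claire", "Ann", "Ben"]
ITEMS = ["diamonds", "rubies", "trade secrets"]


def stole(thief, item):
    """The proposition that `thief` stole `item`, as a set of worlds."""
    return frozenset({(thief, item)})


NOTHING_STOLEN = frozenset({NULL})

# STIPULATED total evidence (the intersection of E): Claire's fingerprints leave open
# only worlds where Claire is the thief, plus the null world. This is an assumption.
TOTAL_EVIDENCE = frozenset({("Claire", i) for i in ITEMS} | {NULL})


def knows(P, C):
    """know is true iff (union of C) intersected with the total evidence is a subset of P."""
    union_c = frozenset().union(*C)
    return (union_c & TOTAL_EVIDENCE) <= P


P = stole("Claire", "diamonds")
who_strong = [stole(x, "diamonds") for x in PEOPLE]   # Q+ who: no null option
who_weak = who_strong + [NOTHING_STOLEN]              # Q who: null option included
what_strong = [stole("Claire", i) for i in ITEMS]     # Q+ what: no null option

for label, C in [("weak who", who_weak), ("strong who", who_strong), ("strong what", what_strong)]:
    print(label, knows(P, C))
```

Under these stipulations the weak who-context and the strong what-context come out false and the strong who-context comes out true, matching the predictions in the text. Only the domain C varies across the three calls, while the evidence is held fixed by hand.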


But, when we look more closely at the flip, we see something more complicated. It may be tempting to view the flip as also a matter of context-dependence. After all, context does provide different domains. In the weak case, C includes ∅. In the strong case, it does not. The evidence is not sufficient to rule out ∅-worlds. So, we have different truth values, as a result of context-dependence. But, I think here we can see the importance of non-contextual factors as well.

Our formal apparatus supposes a set of evidence propositions E, and computes $(\bigcup C) \cap (\bigcap E)$. This is important, as it helps characterize what we need to check to determine truth values. It tells us we need to rule out worlds in $\bigcup C$ according to the evidence. But our descriptions of the contexts, and our attempt to enumerate the facts as FA, do not directly provide us with E. Rather, we try to work out what worlds in $\bigcup C$ would be ruled out, according to our understanding of the information we have about the evidence. The main piece of evidence, as we make clear in FA, is fingerprints. But, we need to also combine that with whatever else we know in the context to work out truth values. We did this above by considering, informally, which of the various propositions specified in C would be affected by the presence of fingerprints. But that always goes along with some inferences that make use of other factors.

With this in mind, let us look at our who-contexts again. Semantically, the difference between the strong and weak who-contexts is just one of whether ∅ is among the options we have to rule out with E. That was important to how we explained the flip. But let us look again at how a normal option is ruled out, say the one where someone else, Fred, stole the diamonds. Our evidence is Claire’s fingerprints, and it is not so simple how that might tell us anything about whether Fred stole the diamonds. In the weak, BAE Q who context, it does not have to.
We will not rule out ∅, so it will not change our truth value whether or not we rule out Fred stealing the diamonds. In the strong context BAE+ Q+ who, to get the truth value of True, we do have to rule out every option except Claire stealing the diamonds. So we have to rule out the case of Fred stealing the diamonds. But notice, formally, the situation is no different. We are asking if the Fred world is in $\bigcap E$.

We are going beyond our formal description in reaching the value True for the strong who-context. The key, I believe, is that we also have an epistemically strong context: one where we know that someone stole the diamonds. This is more than just not having ∅ in $\bigcup C$. It affects how we assess the evidence of fingerprints. Our judgment in the strong context reflects ruling out the Fred world. But how do we do it? Does the evidence of Claire’s fingerprints tell us anything about this world? Not directly. It tells us nothing about Fred. But in the strong context, we know that someone stole the diamonds. That appears to affect what our evidence can rule out: in the presence of that knowledge, our evidence does seem to rule out Fred stealing them. The presence of Claire’s fingerprints (and only Claire’s) makes her the only suspect. Someone stole the diamonds, so if Claire is the only suspect, it must have been her. So we rule out the Fred world.

This is a just-so story, and like all just-so stories, this one is pretty sketchy, and I am not sure about the details. (And, if you stare at it, you might start doubting your own judgments on the main case, maybe in stakes-like fashion.) But nonetheless, it indicates that something other than just context-dependence is at work. What we assume about the situation affects not just context-dependent semantic values, but how evidence works, and what evidence can tell us. That is not context-dependence, according to the semantics and pragmatics we just developed.

In contrast, in the weak BAE Q who context, in addition to not being able to rule out ∅, it seems we cannot rule out the Fred world. If we do not know that someone stole the diamonds, it is hard to see how we can tell any story that makes a connection between the evidence of Claire’s fingerprints and Fred’s activity. Even if the just-so story is just-so, the contrast seems fairly clear. And it is about evidence, not context-dependent domains.

We thus see that to get the flip, we need not only the domain C the context provides, but also background knowledge that affects how our evidence rules out alternatives. The latter is not an aspect of context-dependence, either according to our semantics or intuitions.
Both are needed to generate the flip.¹²

¹² Incidentally, if you got OK judgments on the intermediate BAE Q+ who context, then this likely explains them. In this case, C does not contain ∅. But the judgment is false, presumably because the evidence does not rule out the Fred case, so lack of ∅ is not sufficient.

So what is the role of context in question-sensitivity? It appears to be relatively weak. It helps to fix alternatives to rule out, but does not specify anything about how they are ruled out. This does play a genuine role. Just setting up the domains of alternatives is sufficient for explaining the what-cases. In either the strong or weak what-context, our alternatives to rule out, provided by context, are things like Claire stole the rubies, Claire stole the pencils, and so on. Evidence of Claire’s fingerprints on the safe tells us nothing about those, so it is easy to see that we get truth values of False in both of those cases. But to explain the flip, we need more. And without the flip, we do not get the truth value variation. So, context is not the whole story.

Actually, this suggests there might be an alternative explanation of question-sensitivity. Recall, we assumed that FA is sufficient to determine truth values. One way of seeing the just-so stories is that they effectively weaken this assumption. They do so in two ways. One is that they reveal ways that FA may be incomplete, and how we complete it might be influenced by other factors that we tried to put into context. We see that we strengthened FA in our reasoning, to include no one else stealing anything, making Claire the only suspect in one case. We may have also smuggled in an assumption that nothing else was stolen. This shows that FA, though a fairly good transcription of what we explicitly supposed the facts to be, may not be all there is to the facts we work with when reasoning about evidence.

There is another way our assumption that FA is sufficient to determine truth values might be weakened, that I think is more important. In addition to the extra facts to which we might have implicitly appealed, listed a moment ago, facts about what else is known are involved in determining truth values. We have seen that this is crucial for the flip. Together, these two observations invite the idea that we could put all the important effects of question-sensitivity into the truth-supporting circumstances. Can we then skip context-dependence altogether?
The proposal here would be that instead of context-dependence, we see different truth-supporting circumstances at work in the main question-sensitivity cases. Focusing on the strong cases, and ignoring the flip, we would have at least two different truth-supporting circumstances:

(46) a. FAE+ who: FA + known that someone stole the diamonds (plus more).
     b. FAE+ what: FA + known that Claire stole something (plus more).

Can this explain everything, without appeal to context-dependence? I think the answer is no, for several reasons (though the most committed invariantists in epistemology will disagree). First, the work of the previous sections shows that we really do have context-dependence. We have it on good linguistic grounds. The context-dependence involved is exactly the same as other well-documented sorts. It offers a standard explanation of how questions affect truth values. So, we can use it without reservations. It is there, and we should not ignore it. And we do make substantial, if partial, use of the effects of context. They guarantee that the domains of alternatives for the evidence to rule out differ in important ways. This is crucial for getting the behavior of the what-case. Can we explain this without appeal to context? We might imagine that roughly, in a given circumstance, you generate a range of options based on what you know, and see how evidence rules them out. But context-dependence explains and clarifies this much more successfully. It allows a kind of contextual pre-structuring of domains. That tells us what goes into the domains, using independently supported linguistic principles. We then easily get the right explanation in the what-case. The context-based approach offers a clear explanation, at no additional cost. So, we have established three things. First, there is context-dependence in the attitude verb know. Second, this is important to explaining the phenomenon of question-sensitivity. But third, it is only part of the explanation. Non-contextual factors, about how evidence works, are also crucial to the full explanation. This vindicates contextualism. To a point. Know is context-dependent, and that context-dependence matters. But, given its limited role, we should ask how important the context-dependence is to our understanding of the semantics of know, or to our understanding of knowledge itself. That is our last task, to which we now turn.
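The division of labor just summarized can be put as a toy model. The sketch below is only an illustration under stated assumptions, not the semantics defended in the chapter: `knows` treats a knowledge ascription as universal quantification over a domain of alternatives supplied by context, while what the evidence rules out is simply stipulated in a separate table. The names `WHO_DOMAIN`, `WHAT_DOMAIN`, and the suspects Bill and Dana are hypothetical.

```python
from typing import Callable, FrozenSet

Proposition = str


def knows(domain: FrozenSet[Proposition],
          rules_out: Callable[[Proposition], bool]) -> bool:
    """Toy ascription: true iff the evidence rules out every alternative
    in the contextually supplied domain."""
    return all(rules_out(alt) for alt in domain)


# Contextual contribution (assumed): the question under discussion fixes
# which alternatives have to be ruled out.
WHO_DOMAIN = frozenset({"Bill stole the diamonds",
                        "Dana stole the diamonds"})
WHAT_DOMAIN = frozenset({"Claire stole the rubies",
                         "Claire stole the pencils"})

# Non-contextual contribution (stipulated here, not derived): what the
# fingerprint evidence rules out, given the background knowledge of the
# strong who-case.
RULED_OUT_BY_FINGERPRINTS = frozenset({"Bill stole the diamonds",
                                       "Dana stole the diamonds"})


def fingerprints_rule_out(alt: Proposition) -> bool:
    return alt in RULED_OUT_BY_FINGERPRINTS


who_verdict = knows(WHO_DOMAIN, fingerprints_rule_out)
what_verdict = knows(WHAT_DOMAIN, fingerprints_rule_out)
```

The sketch keeps the two contributions apart on purpose. Changing the domain is the contextual step, and it alone settles the what-case. The table of what the evidence rules out stands in for the reasoning about evidence and background knowledge that context does not supply.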

2.6 Varieties of context-dependence

One striking feature of the linguistic facts pertaining to the context-dependence of know is how general they are. We have an operator that universally quantifies over a context-dependent domain. The domain is partly set by lexical factors, but partly by context. This situation appears across attitudes (as we saw), but also adverbials, D-quantifiers, and so on.

It is widespread. In such cases, association with focus plays a significant role in setting the domain in context. This effect of association with focus is also very widespread, and may well be an entirely general feature of this sort of environment (cf. Beaver & Clark 2008). What appears to be a specific feature of the verb know providing question-sensitivity turns out to be the result of several highly general linguistic mechanisms. Of course, these general mechanisms have specific effects in the setting of knowledge ascriptions. This is most clear in the what-case BAE+ Q+ what. There, context sets up a range of alternatives, for which it is clear that the available evidence fails to rule out any of them. It is much less clear in the who-cases and the flip. The key factor in getting the flip was ruling out alternatives. Context still sets up a range of alternatives, but why they are ruled out has more to do with inferences from evidence than with the effects of context. So, a very general effect of context-dependence sets up one easy case, but that does not appear to be all of the underlying phenomenon. It appears to be a somewhat superficial effect. Input from context makes certain cases clear and easy. Without it, it is not clear if we could get the right results. But it does not explain all that goes into our judgments, and it is not clear if it is doing much in the most difficult aspects of the phenomenon in question. I think this is a reflection of the very general nature of the context-dependence involved. In many cases where we discover context-dependence, we at the same time discover something important about the meaning of a particular expression (or a highly specific class of expressions), and presumably thereby discover something important about the underlying property that term expresses (or class of properties). For instance, suppose you happened not to notice that gradable adjectives like rich are context-dependent.
You could discover that it is context-dependent, and develop a semantics based on, for instance, comparison classes or standards for richness. These are context-dependent parameters that feed the meaning of rich. And, along with that, you might conclude that the property of being rich reflects this. Thus, it is (in a special way) relational. To be rich, our semantics informs us, is to be rich compared to some group of people or some standard amount of money. In cases like this, we learn something specific and important about the word rich, and about the property it expresses (and likewise for the whole category of gradable adjectives).

Many instances of context-dependence are like this. They are deep results about both specific words or classes of words, and the properties they express. The context-dependence we have been working with here is different. It is a general effect of domain restriction and association with focus for operators. As we saw a moment ago, it occurs very widely, across a wide range of types of expressions. This is incredibly important linguistically, and provides a substantial generalization about how context-dependence works in language. It is a collection of deep results about some basic mechanisms our languages use. But unlike the case of rich we just looked at, it uncovers no special features of the specific properties specific words or classes of words express. Let us see how this plays out for know. We have an analysis of know as one among many kinds of universal quantifiers in language. This gives it a domain of quantification, which like almost all such domains shows some context-dependence, and in particular, association with focus. No special feature of the underlying property of knowledge is identified by noting it is context-dependent (beyond its being a universal quantifier over a partly context-dependent domain). The fact that it is context-dependent can set up some easy truth value judgments. But, it does not suffice for any judgment which requires subtle understanding of how this particular property works. Call this sort of context-dependence general. It is context-dependence of how broad genera of linguistic expressions, such as operators, interact with other highly general mechanisms such as association with focus, question-answer congruence, and so on. We have concluded that the context-dependence of know is general. In contrast, substantial lexical context-dependence, of the kind we illustrated with rich, is of course specific to that lexical item or a specific class of items. Thus, it is an example of the contrasting sort of specific context-dependence.
General context-dependence is linguistically very important, and it has been well-documented. But it is different from what we typically think of as ‘contextualism’ in philosophy. That tends to be a matter of specific context-dependence, often of a contentious sort. We can see this in the case of knowledge ascriptions. Has the discussion here vindicated contextualism about knowledge ascriptions? In a sense, of course, yes. I have argued, following others, that know is in fact context-dependent. So contextualism is to an extent vindicated. But the form of context-dependence is general. This still matters, as it is an important part of our explanation of the phenomenon of question-sensitivity. But, as I argued, the effects of general context-dependence are only part of the explanation of question-sensitivity. They take care of the easy part, about structuring domains of alternatives. The hard part is deciding what makes evidence rule out what, and that is not a matter of context-dependence. Rather, it is a matter of how background knowledge affects evidence. So, we have contextualism about knowledge ascriptions, but of a limited sort. I close by speculating that for philosophical purposes, general context-dependence, like we see with know, is really very weak. It shows how the background structure of language can affect uses of expressions in context. But we have not seen reason to think the fundamental features of knowledge ascriptions make use of this in any special way, and we have certainly not seen why we might conclude something basic about the property of knowledge. So, it seems we may have a very weak form of contextualism, which may interest the linguist, but it does not resolve many important questions in epistemology. We thus have two kinds of context-dependence. One is general, and has sources in general mechanisms crossing wide ranges of expressions and other apparatus encoded in language. Precisely because of its generality, it is of great interest to those of us studying language, to whom generalizations are of great value. It has many sources. We have seen that features of operators, focus, and question-answer congruence can be sources. This is almost certainly not an exhaustive list. The other kind of context-dependence is specific, resulting from the features of specific lexical items or classes of items. This is, of course, also of linguistic interest. It may not provide such sweeping generalizations, but it can provide deep insights into lexical meaning.
When it comes to learning about concepts expressed by words, it appears that it is specific, and not general context-dependence that we need. To support what philosophers call contextualism, we seem to need specific context-dependence.¹³

¹³ First and foremost, thanks to Jonathan Schaffer and Zoltán Gendler Szabó for the work that inspired this paper, for many discussions of the material, and for comments on earlier drafts of this paper. Thanks also to Sam Carter, Peter van Elswyk, Itamar Francez, Simon Goldstein, Gilbert Harman, John Hawthorne, Chris Kennedy, Jeff King, Max Köbel, Ernie Lepore, Morgan Moyer, Carlota Pavese, and Adam Sennet for many more discussions of the material in this paper and more comments on earlier drafts. Earlier versions of this paper were presented at the Workshop in Linguistics and Philosophy, University of Chicago, April 2014; the Davis Extravaganza Philosophy Conference, University of California, Davis, April 2014; the Facultat de Filosofia, Universitat de Barcelona, December 2016; the Rutgers Semantics Reading Group, Rutgers University, March 2017; and the Philosophy of Language in Lima workshop, Pontificia Universidad Católica del Perú, June 2017. Thanks to all the participants at those events. This paper changed a great deal as I presented it and discussed it with people, and I am grateful for all the input it received.

References

Asher, N. (1987). ‘A typology for attitude verbs and their anaphoric properties’, Linguistics and Philosophy 10: 125–97. Beaver, D. I. and B. Z. Clark (2008). Sense and Sensitivity: How Focus Determines Meaning. West Sussex: Wiley-Blackwell. Beaver, D. I., B. Z. Clark, E. Flemming, T. F. Jaeger, and M. Wolters (2007). ‘When semantics meets phonetics: acoustical studies of second-occurrence focus’, Language 83: 245–76. Berman, S. (1987). ‘Situation-based semantics for adverbs of quantification’, University of Massachusetts Occasional Papers in Linguistics. Büring, D. (2003). ‘On D-trees, beans, and B-accents’, Linguistics and Philosophy 26: 511–45. Büring, D. (2016a). ‘(Contrastive) topic’, in C. Féry and S. Ishihara (eds.), Oxford Handbook of Information Structure, 64–85. Oxford: Oxford University Press. Büring, D. (2016b). Intonation and Meaning. Oxford: Oxford University Press. Cohen, S. (1999). ‘Contextualism, skepticism, and the structure of reasons’, Philosophical Perspectives 13: 57–89. DeRose, K. (1992). ‘Contextualism and knowledge attributions’, Philosophy and Phenomenological Research 52: 913–29. Dretske, F. (1972). ‘Contrastive statements’, Philosophical Review 81: 411–37. Dretske, F. (1981). ‘The pragmatic dimension of knowledge’, Philosophical Studies 40: 363–78. Goldman, A. (1976). ‘Discrimination and perceptual knowledge’, Journal of Philosophy 73: 771–91. Groenendijk, J. and M. Stokhof (1984). Studies in the Semantics of Questions and the Pragmatics of Answers. Ph.D. dissertation, University of Amsterdam. Hamblin, C. L. (1973). ‘Questions in Montague English’, Foundations of Language 10: 41–53. Hawthorne, J. (2004). Knowledge and Lotteries. Oxford: Oxford University Press. Heim, I. (1992). ‘Presupposition projection and the semantics of attitude verbs’, Journal of Semantics 9: 183–221. Herburger, E. (2000). What Counts: Focus and Quantification. Cambridge, MA: MIT Press.

Heycock, C. and A. Kroch (2002). ‘Topic, focus, and syntactic representation’, Proceedings of the West Coast Conference on Formal Linguistics 21: 141–65. Hintikka, J. (1969). ‘Semantics for propositional attitudes’, in J. W. Davis, D. Hockney, and W. K. Wilson (eds.), Philosophical Logic, 21–45. Dordrecht: Reidel. Hooper, J. B. (1975). ‘On assertive predicates’, in J. P. Kimball (ed.), Syntax and Semantics, Syntax and Semantics, Vol. 4, pp. 91–124. New York: Academic Press. Horn, L. R. (1972). ‘A presuppositional analysis of only and even’, Papers from the Chicago Linguistics Society 5: 97–108. Jackendoff, R. S. (1972). Semantic Interpretation in Generative Grammar. Cambridge, MA: MIT Press. Kadmon, N. (2001). Formal Pragmatics. Oxford: Blackwell. Kadmon, N. and F. Landman (1993). ‘Any’, Linguistics and Philosophy 16: 353–422. Karttunen, L. and S. Peters (1979). ‘Conventional implicature’, in C.-K. Oh and D. A. Dinneen (eds.), Presupposition, Syntax and Semantics, Vol. 11, pp. 1–56. New York: Academic Press. Kratzer, A. (1977). ‘What must and can must and can mean’, Linguistics and Philosophy 1: 337–56. Kratzer, A. (1989). ‘An investigation into the lumps of thought’, Linguistics and Philosophy 12: 607–53. Ladd, D. R. (1996). Intonational Phonology. Cambridge: Cambridge University Press. Lewis, D. (1973). Counterfactuals. Cambridge, MA: Harvard University Press. Lewis, D. (1996). ‘Elusive knowledge’, Australasian Journal of Philosophy 74: 549–67. Reprinted in Lewis (1999). Lewis, D. (1999). Papers in Metaphysics and Epistemology. Cambridge: Cambridge University Press. Merchant, J. (2001). The Syntax of Silence: Sluicing, Islands, and the Theory of Ellipsis. Oxford: Oxford University Press. Partee, B. H. (1991). ‘Topic, focus and quantification’, Proceedings of Semantics and Linguistic Theory 1: 159–87. Pierrehumbert, J. and J. Hirschberg (1990). ‘The meaning of intonational contours in the interpretation of discourse’, in P. R. Cohen, J. Morgan, and M. E. 
Pollack (eds.), Intentions in Communication, 271–311. Cambridge, MA: MIT Press. Portner, P. (2009). Modality. Oxford: Oxford University Press. Reed, B. (2010). ‘A defense of stable invariantism’, Noûs 44: 224–44. Roberts, C. (1996). ‘Information structure in discourse: Towards an integrated formal theory of pragmatics’, Ohio State University Working Papers in Linguistics 49: 91–136.

Roberts, C. (2011). ‘Topics’, in C. Maienborn, K. von Heusinger, and P. Portner (eds.), Semantics: An International Handbook of Natural Language Meaning, Vol. 2, pp. 1908–33. Berlin: de Gruyter Mouton. Rooryck, J. (2001a). ‘Evidentiality, Part I’, GLOT International 5: 125–33. Rooryck, J. (2001b). ‘Evidentiality, Part II’, GLOT International 5: 161–8. Rooth, M. (1985). Association with Focus. Ph.D. dissertation, University of Massachusetts at Amherst. Rooth, M. (1992). ‘A theory of focus interpretation’, Natural Language Semantics 1: 75–116. Rooth, M. (1999). ‘Association with focus or association with presupposition?’, in P. Bosch and R. van der Sandt (eds.), Focus: Linguistic, Cognitive, and Computational Perspectives, 232–44. Cambridge: Cambridge University Press. Schaffer, J. (2004). ‘From contextualism to contrastivism’, Philosophical Studies 119: 73–103. Schaffer, J. (2005). ‘Contrastive knowledge’, Oxford Studies in Epistemology 1: 235–71. Schaffer, J. (2007). ‘Knowing the answer’, Philosophy and Phenomenological Research 75: 383–403. Schaffer, J. and J. Knobe (2012). ‘Contrastive knowledge surveyed’, Noûs 46: 675–708. Schaffer, J. and Z. G. Szabó (2014). ‘Epistemic comparativism: A contextualist semantics for knowledge ascriptions’, Philosophical Studies 168: 491–543. Schwarzschild, R. (1999). ‘Givenness, avoidF and other constraints on the placement of accent’, Natural Language Semantics 7: 141–77. Selkirk, E. (1995). ‘Sentence prosody: Intonation, stress, and phrasing’, in J. A. Goldsmith (ed.), Handbook of Phonological Theory, 550–69. Oxford: Blackwell. Simons, M. (2007). ‘Observations on embedding verbs, evidentiality, and presupposition’, Lingua 117: 1034–56. Stalnaker, R. (1984). Inquiry. Cambridge, MA: MIT Press. Stanley, J. (2005). Knowledge and Practical Interests. Oxford: Oxford University Press. Stine, G. (1976). ‘Skepticism, relevant alternatives, and deductive closure’, Philosophical Studies 29: 249–61. von Fintel, K. (1994). 
Restrictions on Quantifier Domains. Ph.D. dissertation, University of Massachusetts at Amherst. von Fintel, K. (1999). ‘NPI licensing, Strawson entailment, and context dependency’, Journal of Semantics 16: 97–148.

3 Words by Convention Gail Leckie and J. R. G. Williams

A long-established project in the philosophy of language is the search for a reductive naturalistic metasemantics. Reductive metasemantics ground target semantic facts (such as the fact that the word type “cat” refers to cats) in a non-semantic base.¹ Existing reductive projects presuppose that word- (or sentence-) types like “cat” are part of the non-semantic base. They presuppose the availability of an exogenous theory of word types, that is, one that is prior to and independent of the metasemantics.² This paper argues that an exogenous account of word types is unlikely to succeed. We propose a new strategy: an endogenous account of word types, that is, one where word types are fixed as part of the metasemantics. In particular, we show how a metasemantic account on the lines of Lewis’s account in terms of conventions of truthfulness and trust can provide an endogenous account of words suited to a naturalistic metasemantics. We say that it is the conventions of truthfulness and trust that ground not only the meaning of the words (meaning by convention) but also what the word type is of each particular token utterance (words by convention). We begin by explaining why existing exogenous theories of word types ill serve the metasemantic project (section 3.1). We then detail how a Lewisian metasemantics in terms of conventions of trust and truthfulness would use an exogenous theory of words (section 3.2), before adapting that metasemantics to give an endogenous theory of words in terms of conventions (section 3.3). In section 3.4, we show that our endogenous account deals well with the problem cases we earlier identified for exogenous accounts of words. Finally we raise the potential problem of overgeneration of words for our account (section 3.5) and show that Lewis’s account of convention already has the resources to prevent such overgeneration (section 3.6).

¹ We use quote marks variously to pick out word and sentence types, orthographic types and particular token utterances. We hope it is clear which use is in play on a particular occasion.
² For example, Lewis 1983, Evans 1973, Davidson 1973, Horwich 1998. By contrast, Millikan 1984 and Richard 1990 may intend something more like what we call an endogenous theory of word types.

3.1 As an example of a metasemantic theory that requires an exogenous account of words, consider a simple version of a dominant causal source account, modelled on Evans 1973. Where u is a token utterance, x is an object or property, y is a word type,

(Toy Evans) u refers to x iff u is of type y and x is the dominant causal source of tokenings of the type y.

Notice that (Toy Evans), like most metasemantic theories, specifies how features of the word type fix a word token’s semantic features.³ To say what the meaning is of a specific token, u, we need to know its type. For example, Peta’s utterance on 1/1/1850 “I’ve visited Madagascar” refers to whatever is the dominant causal source of tokens of that type “Madagascar” which Peta tokened. Evans says that the dominant causal source of the word-type “Madagascar” is the island and so Peta’s token refers to that island, not the mainland. How are we to use a metasemantic theory, such as (Toy Evans), in our naturalistic project? We must establish that all the terms on the right-hand side of the biconditional are legitimate parts of the non-semantic naturalistic base. It is fair to assume that the island is a legitimate part of the base, as is the noise which is Peta’s utterance. What about word types? Are these a legitimate part of the non-semantic base? To show that they are, an advocate of (Toy Evans) would need a naturalistic account of word types; and likewise for other metasemantic theories that give the semantics of tokens in terms of their word type. We consider two naturalistic accounts of word types: phonographemic and Kaplanian.⁴ We show that both generate counterintuitive predictions about what utterances are of the same type. This matters because those counterintuitive typings look likely to deliver undesirable semantics once plugged into any particular metasemantics. The default picture of words in the philosophical literature is based on phonetic and orthographic similarity.⁵ Where u and v are token utterances,

(Phonographemic) u and v are tokens of the same word iff they are spelt or pronounced the same.

Words are maximal equivalence classes of tokens in the same-word relation.⁶ So the word-type “Madagascar” might be the equivalence class of all tokens that are spelt M-A-D-A-G-A-S-C-A-R etc., or all tokens sounded /ˌmadəˈɡaskə/. Let’s call this the phonographemic conception.⁷ The phonographemic account types words counterintuitively. Sometimes, it is too fine-grained, typing separately what are intuitively tokens of the same word type. At other times, it is too coarse-grained, typing together what are intuitively tokens of different words. In itself, counterintuitiveness is not strong evidence against the phonographemic account. It would be if we were performing descriptive conceptual analysis on the folk concept word. But there is no reason to take folk intuitions about the extension of “word” as revealing the extension of the natural kind that is of significance to semantics and metasemantics. Rather, counterintuitive typings are problematic in so far as, once plugged into particular metasemantic accounts, they deliver crazy semantic assignments.

³ We lack space to consider theories which provide metasemantic accounts for individual tokens directly without appealing to word types.
⁴ The metasemantic theories we are considering require a metaphysics of public words since their target is shared languages, not idiolects. For this reason we do not consider lexemes, construed as the individual’s typing on expressions.
⁵ For example, Stebbing 1935; Davidson 1979: 90; Haack 1978: 75. Cappelen 1999 and Cappelen and Dever 2001 propose a more sophisticated version of the phonographemic approach.
⁶ That is, tokens of a word are all same-word related and there are no ways to merge these equivalence classes to produce larger equivalence classes whose members are all same-word related.
⁷ Hawthorne and LePore (2011) call this the form-theoretic conception. It is part of the view that Kaplan (1990) critiques under the name ‘the orthographic conception’.

Where an exogenous theory of words is intuitively too fine-grained, it provides an impoverished input for the metasemantic theory. It misses relevant tokens from the word type of an utterance u. Accordingly, the metasemantic theory the exogenous theory feeds into misses facts relevant to the meaning of utterances of u. If the input is sufficiently impoverished, the metasemantic theory has too little to go on and delivers no determinate meaning for u. If the input is selectively impoverished, the metasemantics may assign u a meaning but one that is incompatible with any plausible semantic theory. On the other hand, exogenous theories that are too coarse-grained contaminate the input to the metasemantic theory. When the metasemantics determines the meaning of u, it contaminates facts that do bear on the meaning of u—those about other tokens of intuitively the same type as u—with facts that don’t—those about tokens of an intuitively separate word type(s). This may render the meaning of u indeterminate or skew its meaning towards what is intuitively the meaning of the other ‘word type’. Consider some examples. There are two ways that the phonographemic conception cuts too fine and impoverishes the input to the metasemantic theory. (Phonographemic) does not group any distinct sounds and spelling types together. Yet, sometimes an intuitively unitary word type has instances with various spellings (“realize”/“realise”) or various pronunciations (“tomato”). The phonographemic account also counterintuitively counts instances of M-A-D-A-G-A-S-C-A-R and of /ˌmadəˈɡaskə/ as two distinct ‘words’. This opens up the risk that the various phonemes and graphemes are assigned different meanings by the metasemantics and the risk that the input to the metasemantic theory for any grapheme or phoneme is so impoverished as to fail to generate the correct meaning for tokens of it. 
Suppose, for example, that what we would ordinarily think of as the word “okapi” is uttered once and only once in the distinctive North Welsh accent. The class of tokens phonetically exactly similar to that utterance may be a singleton. Further suppose that the single North Welsh token was uttered in front of a muddy zebra’s backside, not an actual okapi. Plugging the phonographemic account of names into (Toy Evans) now delivers crazy results. (Toy Evans) directs us to the dominant causal source of the word type and since the type is a singleton, it directs to the causal source of that particular token—the zebra.
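The okapi case can be made concrete with a small sketch. It is an illustration under simplifying assumptions and not part of either theory as stated: each token carries a single exact form, so the "spelt or pronounced the same" relation reduces to identity of form, and the dominant causal source of a type is modelled crudely as the most frequent source among its tokens. The class `Token`, the function names, and the form labels are hypothetical.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Token:
    form: str    # exact spelling or pronunciation (simplifying assumption)
    source: str  # causal source of this particular tokening


def phonographemic_types(tokens: List[Token]) -> Dict[str, List[Token]]:
    """Group tokens into classes of exactly similar form."""
    types: Dict[str, List[Token]] = defaultdict(list)
    for token in tokens:
        types[token.form].append(token)
    return dict(types)


def toy_evans_referent(token: Token, tokens: List[Token]) -> str:
    """Return the most frequent causal source among tokens of the same
    type, a crude stand-in for the dominant causal source."""
    same_type = phonographemic_types(tokens)[token.form]
    counts = Counter(t.source for t in same_type)
    return counts.most_common(1)[0][0]


# Fifty ordinary tokenings caused by okapis, and one North Welsh
# tokening produced in front of a muddy zebra.
standard = [Token("okapi, standard pronunciation", "okapi")
            for _ in range(50)]
welsh = Token("okapi, North Welsh pronunciation", "muddy zebra")
tokens = standard + [welsh]

welsh_referent = toy_evans_referent(welsh, tokens)
standard_referent = toy_evans_referent(standard[0], tokens)
```

Because the North Welsh token is alone in its class, the fifty okapi-caused tokenings never enter the calculation for it. This is the impoverished input described above.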

(Phonographemic) also cuts more coarsely than the typing that gives the desired input into the metasemantic theory. For example, far-flung tokens of the grapheme C-O-W, traced out by aliens on Mars, ought not to determine the semantics of Earth tokens of that grapheme C-O-W. That would contaminate the input to the metasemantics. Even within a language, tokens of a single grapheme don’t always have the same meanings—consider lexical ambiguity. For example, tokens spelt B-A-N-K sometimes bear information about financial institutions and sometimes about the edges of rivers. The watery tokens ought not to determine the semantics of the financial tokens. Once more, (Toy Evans) will struggle to deliver the right semantic results for such ‘words’ since the counterintuitive typing skews what is the dominant causal source of the type. This typing might leave it indeterminate whether a token of B-A-N-K is the financial kind or the watery kind. In other cases, the causal source of the additional tokens will outweigh the dominant causal source of local tokens of that phonographeme. Going disjunctive might mitigate this concern. Perhaps we could say that a disjunctive property can be the dominant causal source of a word. We might try to say that B-A-N-K has a disjunctive property as its extension: is-a-river-bank-or-a-financial-bank. However, these referential assignments are likely to cause trouble for the semantics of whole sentences once combined with a compositional semantics. We do not want “There are three banks” to come out as true in cases where there are at most two river banks and one financial bank.⁸ It is tempting to resolve the problem of coarse-grainedness by distinguishing two words with the same spelling and pronunciation but which are different words in virtue of their different meanings.⁹ (Plus Semantic) u and v are tokens of the same word iff they have the same meaning and the same phonographemic features.

⁸ Going disjunctive brings other problems for the Evansian. Some tokens of the intuitive word “cow” have their source in horses, and yet we do not want any such token to mean horse. To avoid the horse/cow problem, it is tempting to build in a condition that disadvantages disjunctions as dominant causal sources.
⁹ Many textbooks, such as Larson and Segal 1995, treat ambiguity as a syntactic phenomenon so that “bank” corresponds to two words. Plausibly, this move requires words to be individuated partially in semantic terms.

Again, words are maximal classes of tokens in the same-word relation. (Plus Semantic) rules that there are two homonyms spelt B-A-N-K and so we might hope to get a different dominant causal source of each by (Toy Evans). However, (Toy Evans) can’t be combined with (Plus Semantic) on pain of circularity. The account of word types can’t appeal to notions which the metasemantic theory is supposed to ground naturalistically.

The phonographemic account of words is not the only one on offer. Kaplan offers a different exogenous account of word type. He takes two word tokens to be of the same type iff they are in an appropriate causal relation to one another. The paradigm appropriate causal relation is intended repetition. Not all tokens of B-A-N-K need be tokens of the same word, provided they fail to be related by chains of intended repetition. Conversely, other pairs of tokens that are spelt or pronounced differently (for example, “realize” and “realise”) are in the right sort of relation. Since Kaplan’s account has never been spelt out in detail, it’s difficult to assess. However, it, too, types words counterintuitively in ways that are likely to cause trouble for any metasemantic theories reliant on it. Kaplan’s account cuts too coarsely in cases like the following. Suppose Xena produces a token /treɪn/ in attempting to convey that she is a train-engineer, which Zara mishears and then repeats as /dreɪn/, forming and passing on the belief that Xena is a drain-engineer. Xena and Zara’s tokens /treɪn/ and /dreɪn/ would count by Kaplan’s lights as tokens of the same word since Zara intentionally repeats Xena’s utterance. Zara’s friends and relations go on to repeat her utterance in the context of discussions of plumbing. Such tokens count as the same word as Zara’s /dreɪn/ tokening, but also by transitivity, as the same word as Xena’s /treɪn/ tokening. The danger is that we lose the obvious fact that there are two word types in play here. Inter alia, this plays havoc in identifying a single dominant causal source, as required by (Toy Evans).¹⁰ As with the phonographemic account, it is tempting to get the intuitively right word typing by adding a semantic element to the account. We want to count only those intentional-repetition-links that preserve meaning, in order to identify a tradition of usages focused on conveying information about a single subject-matter. But a reductive metasemantics can’t, on pain of circularity, make use of an exogenous account of words that contains semantic elements.

What have we shown so far? Phonographemic and Kaplanian views attempt to offer an account of word types that is separate from, and independent of, the main metasemantic account. We have established problems with combining these views of the metaphysics of words with (Toy Evans). At this point, one project is to retain the architecture: metasemantics underpinned by an exogenous theory of word types, and to show how the difficulties identified above can be resolved by more sophisticated development of the component theories. However, we think that it is a mistake to assume that the overall theory will have this architecture. What follows illustrates an alternative, on which word-typing and semantic facts are settled simultaneously and the word types are endogenous to the metasemantics. Our illustration of an endogenous theory will use a Lewisian metasemantics, though the endogenous strategy could potentially be deployed in other frameworks.¹¹

¹⁰ Another type of case where Kaplan’s account cuts more coarsely than an intuitive typing of words is the case of substantial but gradual phonetic, graphemic, and semantic change, including cases with fission structure, but we lack space to discuss such cases.

3.2 In this section, we describe the metasemantic theory (Lewis’s) that we will be adapting into an endogenous account (in section 3). We start by recapping Lewis’s own metasemantic story in terms of conventions of truthfulness and trust. Lewis himself seems to be working with an exogenous phonographemic picture of word types. He says language assigns meaning to “certain strings of types of sounds or marks” (Lewis 1983: 163). The exogenous theories already canvassed turn out to cause problems for Lewis similar to the ones afflicting (Toy Evans).

In “Language and Languages” Lewis says that languages, in the sense of abstract semantic theories, are functions from sentences to meanings.¹² Grammars are, inter alia, functions from the public lexicon (a generalization of the category word types) to meanings which recursively generate the functions from sentences to meanings. The job of a metasemantic account is to lay down conditions, in an illuminating, non-circular way, as to which of these abstract grammars and languages is in use by a population. Lewis’s metasemantic theory says that a certain language L is correct for a population P if the conventions of truthfulness and trust for L prevail in P. A convention of truthfulness for a sentence type s, connecting it to the proposition p, is a regularity in usage (that members of the population utter tokens of s only if they believe that p), where this regularity is entrenched in the beliefs and desires of the community in a distinctive convention-forming way. A convention of trust, likewise, is a conventional regularity of having or forming a belief that p upon hearing someone else utter s. A grammar is correct if it is the best axiomatic theory among those which generate that correct language.

¹¹ One author takes the fact that we can develop a successful endogenous version of Lewisian metasemantics as evidence for Lewisian metasemantics. However, it may be possible to develop endogenous versions of interpretationist and inferentialist metasemantics to rival the endogenous Lewisian account.
¹² Lewis 1983: 163–88.

In more detail, a regularity R is a convention in a population P iff within P, the following hold, with at most a few exceptions:

(1) Everyone in P conforms to R.
(2) Everyone in P believes that everyone in P conforms to R.
(3) This belief gives everyone in P a good reason to conform to R himself.
(4) There is a general preference in P for general conformity to R rather than slightly-less-than-general conformity to R.
(5) There is an alternative possible regularity R’ such that if it met (1) and (2), it would also meet (3) and (4).
(6) All of (1–5) are common knowledge.

To repeat, the specific conventional regularities that make an abstract language L used in population P are as follows, where s ranges over sentence types and p over propositions:

(Truthfulness) Members of P utter s only if they believe p, where L(s)=p.
(Trust) If a member of P hears another member of P utter s, she tends to come to believe p, where L(s)=p.

(Lewis) summarizes the metasemantic theory:

(Lewis) Given an exogenously fixed specification of population P₁ and typing of sentences, T₁, L is the language of P₁ for T₁ iff there are conventions of (Truthfulness) and (Trust) in L in P₁ for T₁.

This highlights that if (Lewis) is to provide a naturalistic metasemantic theory, it must be supplemented by exogenous theories of populations and word types.¹³ However, we’ve seen reason to doubt that the available exogenous accounts of word types will serve. As before, an exogenous account with semantic elements is circular. As before, combining a metasemantic account with an exogenous account that cuts too finely, such as (phonographemic), causes havoc in the semantics. For example, in the “okapi” case, either the convention covering those tokens is one that ties them to zebras or there are too few tokens of the ‘word’ for there to be any convention.

We will use the remainder of this section to show why overly coarse-grained accounts are problematic in combination with (Lewis). Because of the complexities and flexibility of Lewis’s account, this will take some time. Fortunately, the work here will also be useful in defending our own account in section 3.6.

Overly coarse-grained accounts of words obscure the differences between word tokens with intuitively different semantic values and thereby conflate what should be separate streams of input to the metasemantic theory. For example, tokens that are intuitively about financial institutions/trains are bundled in with those intuitively about waterway edges/drains. We focus on the case of “bank” but similar reasoning could be applied to the “train”/“drain” case. Call the phonographemic string “There is a bank nearby” sentence type s₁. Which language and grammar can capture the conventional regularities for s₁ for English speakers? The problem is that sometimes a speaker produces s₁ when she believes there is a financial institution nearby; but other times she does so when she believes there is a river edge nearby. Sometimes hearing s₁ inclines a hearer to adopt the one belief and sometimes the other.
Which grammars and which languages match those regularities while treating all instances of the phonographemic string as, counterintuitively, a single sentence type? A grammar that associates “bank” with only financial banks will not suffice, nor will one that associates it with only river banks. There are three ways that Lewis could address this problem: the disjunctive method, the sequence method, and the indexical method. We’ll show none of them renders the phonographemic account suitable as an exogenous plug-in for Lewis’s metasemantics.

¹³ It must also be supplemented by the contents of the beliefs of members of that population. Lewis’s theory is, then, ‘headfirst’: the intensional content of mental states is presupposed in the characterization of semantic content. The authors of this paper differ in their enthusiasm for granting this headfirst presupposition.

We start with the disjunctive method. Let the grammar Gd pair “bank” with a function from the union of the set of financial banks and the set of river banks to true. So, each sentence containing “bank” is paired with a belief that has disjunctive content. Suppose further that Gd includes the usual rules of composition. Gd pairs s₁ with a proposition of the form [there is either a financial institution or the edge of a waterway near to x] where x is a place provided by context. This might capture the regularities of English for s₁. However, such a theory will struggle to get the right regularities for phonographemic strings of the form “There are three banks”. This is uttered when there are either three financial institutions or three river banks; not when there are three things that are either financial institutions or river banks, as the conventions of Gd require. In such cases, the disjunctive method delivers the wrong semantics. An analogous problem afflicted (Toy Evans).

Lewis’s own approach to ambiguity is to pair each sentence s with a sequence of propositions ⟨p, q, . . .⟩, instead of pairing it with a single proposition. The truthfulness convention is a conventional regularity of uttering s only if one believes p, or believes q, or . . . The trust convention is a conventional regularity of believing p or believing q or . . . if one hears another member of the population utter s. According to this proposal, “the bank is nearby” is paired with the sequence: ⟨the financial bank is nearby, the river bank is nearby⟩. Call this the gruesome sequence regularity. “There are three banks” is paired with the sequence ⟨there are three financial banks, there are three river banks⟩. Lewis does not describe a grammar that generates such sequences.
Perhaps he has in mind a grammar Gs that associates words with sequences of referents, with each sentence s then mapped to sequences of all those propositions that are determined compositionally from some selection of referents from the list associated by Gs with the constituent words of s. For example, “bank” will be associated with the list ⟨financial banks, river banks⟩. Call this the sequence method. If Gs does pair “there are three banks” with the extensionally appropriate sequences, it avoids the problems of the disjunctive method.

The sequence method also fails. The problem is that the gruesome sequence regularity is not a convention. To be sure, there is a regularity
of uttering “the bank is nearby” either when one believes that the financial bank is nearby or believes the river bank is nearby, and a regularity of coming to believe one of those propositions when one hears the sentence uttered. But for the gruesome sequence regularity to be a convention, Lewis’s condition (3) requires that our awareness of that regularity gives us reason to conform to it. It does not. Suppose you hear someone utter “the bank is nearby”. Awareness of the gruesome sequence regularity gives you reason to believe that either the speaker believes a river bank to be nearby, or believes a financial bank to be nearby. If you assume the speaker is reliable, this gives you reason to believe the disjunction: either a river bank or financial bank is nearby. But believing the disjunction is not to conform to the gruesome sequence regularity of trust. To do that, one would have to believe one or other of the disjuncts. Turn now to the indexical method. Here is one way of fitting indexicals into a Lewisian account.¹⁴ Pair sentences with characters, functions from context to ‘horizontal’ content. While there’s no horizontal proposition p such that every speaker will utter “I am standing” only if they believe p, all speakers satisfy the following: they will try to only utter “I am standing” (in c) when they believe whatever proposition which is the value of that sentence’s character C at c; likewise, they will all try to respond to utterances of that sentence type (in c) by forming the belief in the same proposition. An example of such a conventional regularity is: utter “I am standing” only if A is the speaker, and one believes that A is standing. 
Notice that a speaker can use awareness of this regularity to give them reason to believe an appropriate formulation of trust: for they combine their awareness of the general regularity with the publicly available information about which context they’re in (who the speaker is) and, if they take the speaker to be reliable, they’ll have reason to believe that that individual is standing, and so to conform to the regularity.

¹⁴ A different interpretation of Lewis uses the (later) distinction between horizontal and diagonal content. (When A is the speaker, the horizontal content of “I am sitting” is that A is sitting; the diagonal content is that the speaker of the context is sitting.) Diagonal contents of sentences are a candidate for the narrow content of beliefs ascribed using those sentences. This might well serve for true indexicals. But the analogue of diagonal content for ambiguous sentences would be highly recherché, and it’s implausible that ordinary competent use of “bank” requires one to form beliefs with that content.

One could try co-opting that treatment of indexicals to the case of “bank”. On the indexical method, the convention would associate “bank” not with a sequence of propositions, but with a function from contexts to propositions. The putative regularity would be that a speaker would only utter “the bank is nearby” when financial banks is the salient disambiguation of “bank” and they believe that the financial bank is nearby or river banks are the salient disambiguation of “bank” and they believe the river bank is nearby. Belief in these regularities combines with knowledge of context to provide reasons to conform to the analogous regularity of trust just as it does for standard indexicals. Call the relevant regularities the gruesome indexical regularities. One problem for the indexical method is how to keep the base free of semantic facts in line with our reductive ambitions. The gruesome indexical regularities build in the “salient disambiguation of words” as part of context, where the account of context is an exogenous account plugged into the reductive metasemantic account. But what makes a disambiguation salient depends in large part on the linguistic context in which the word appears (e.g. whether mortgages or boating was mentioned most recently). “Salient disambiguation” is a smokescreen for sneaking in illicit appeal to the semantic. Upon examination, this particular indexical account is no less circular than helping oneself to semantically-individuated words at the outset.¹⁵ A problem for all three methods of dealing with overly coarse-grained words—disjunctive, sequence and indexical—is that, even if they identify regularities of truthfulness and trust for a population, these regularities are not conventions because they do not meet condition (3)—they do not reflect our reasons for continued conformity. We postpone explaining this problem until section 6. The exogenous version of Lewis’s theory—Lewis’s own approach—is no more promising than (Toy Evans). 
We will now adjust the Lewisian metasemantics to provide a positive, endogenous metaphysics of word types.¹⁶

¹⁵ This is not to deny that one can specify other features of context non-semantically, e.g. the speaker, time, place of the utterance.
¹⁶ Of course, Lewis’s approach may need other refinements. For example, Lewis considers objections to his approach relating to liars and non-literal utterances in his 1983: 163–88. But in these cases, any responses open to Lewis are also open to our endogenous account. The aim of our paper is only to explore an example of a theory of words endogenous to some metasemantics and show it fares better than the exogenous version of that metasemantics.

3.3 Let a language be a triple ⟨P, T, L⟩ of a population P, a typing relation T and a function L from sentence types to propositions. The population is a set of time-slices of people.¹⁷ The typing relation is a set of sets s₁ . . . sₙ where each sᵢ is a set of actual and possible token utterances. T should impose a typing on each of the utterances of members of P. L is a function from those sentence types s₁ . . . sₙ and only those sentence types to propositions. If languages are triples of this sort, how do the conventional regularities of truthfulness and trust fix which of those languages is the one in use for a particular utterance u? We replace (Lewis) with (Endogenous).

(Endogenous) Given an utterance u, ⟨P, T, L⟩ is a language in use in utterance u iff P is a population and T a typing relation relative to which there are conventions of (Truthfulness) and (Trust) in L, and the speaker/hearer of u is a member of the population P; and u is a member of some equivalence class of the typing relation T.

Instead of determining L after fixing a particular population and typing relation, (Endogenous) treats the population and typing relation as variables whose values are fixed however is necessary to produce conventions of (Truthfulness) and (Trust).¹⁸

¹⁷ Populations must be sets of time-slices of people in order to allow division of the utterances of people who are bilingual.
¹⁸ While population and word-typing are fixed endogenously by our new metasemantic account, we still presuppose that there are exogenous naturalistic accounts available for other entities, in particular, the sounds and marks that comprise the utterance tokens.

If P is to count as a language-using population, the members of P have to be able to think about population P, and think about the sentence types T induces, in order to have the belief that the relevant regularities prevail among P. This is not a problem for (Endogenous), as the way that speakers think about P and about T is fairly unconstrained. For example, suppose that each member has an unstructured name-like concept (“Us”) and each believes that every one of Us follows the relevant regularity. Suppose also they have an unstructured concept “sentence-of” whose referent is a function from utterances to types, which features in the specification of the relevant regularities. What the account requires, in order for ⟨P, T, L⟩ to satisfy (Endogenous), is that these unstructured concepts “Us” and “sentence-of” refer to the right things (P, and the function induced by T, respectively). Why be pessimistic about this issue? One might think it problematic if one assumed that members of P think about P and T descriptively, via some implicit exogenous theory of populations or sentence types. But why restrict speakers to descriptive concepts of P and T?
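
For readers who find a concrete model helpful, the triple of population, typing relation and meaning function can be sketched as a small data structure. The sketch below is only a toy under stated assumptions: the class and function names (`Utterance`, `Language`, `conformity`), the use of strings for time-slices and propositions, and the four sample tokens are all hypothetical illustrations, not part of the account. It also checks only whether the regularities of (Truthfulness) and (Trust) hold of the recorded tokens; it does not model the further attitudinal conditions that make a regularity a convention.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class Utterance:
    uid: int                          # identifies the token utterance
    speaker: str                      # a time-slice of a person (toy: a name)
    hearer: str
    form: str                         # the sounds or marks produced
    speaker_beliefs: FrozenSet[str]   # propositions the speaker believes
    hearer_beliefs: FrozenSet[str]    # propositions the hearer believes afterwards


@dataclass
class Language:
    population: FrozenSet[str]   # P
    typing: Dict[int, str]       # T, given extensionally: token id -> sentence type
    meaning: Dict[str, str]      # L: sentence type -> proposition


def covered(lang: Language, u: Utterance) -> bool:
    """True if speaker and hearer are in P and T gives u a type that L interprets."""
    return (u.speaker in lang.population
            and u.hearer in lang.population
            and lang.typing.get(u.uid) in lang.meaning)


def conformity(lang: Language, utterances: List[Utterance]) -> float:
    """Share of covered tokens that fit both the truthfulness and trust regularities."""
    relevant = [u for u in utterances if covered(lang, u)]
    if not relevant:
        return 0.0
    fits = 0
    for u in relevant:
        p = lang.meaning[lang.typing[u.uid]]
        if p in u.speaker_beliefs and p in u.hearer_beliefs:
            fits += 1
    return fits / len(relevant)


# Hypothetical sample data: four tokens of the same phonographemic string.
FIN, RIV = "a financial bank is nearby", "a river bank is nearby"
US = frozenset({"xena", "zara"})


def utter(uid: int, p: str) -> Utterance:
    return Utterance(uid, "xena", "zara", "there is a bank nearby",
                     frozenset({p}), frozenset({p}))


tokens = [utter(1, FIN), utter(2, RIV), utter(3, FIN), utter(4, RIV)]

# One sentence type for every token, interpreted as about financial banks.
one_type = Language(US, {1: "s1", 2: "s1", 3: "s1", 4: "s1"}, {"s1": FIN})
# Two sentence types, one per intuitive word.
two_types = Language(US, {1: "bank2", 2: "bank1", 3: "bank2", 4: "bank1"},
                     {"bank1": RIV, "bank2": FIN})

print(conformity(one_type, tokens))   # 0.5
print(conformity(two_types, tokens))  # 1.0
```

On this toy data, the single-type candidate fits only half the tokens, while the two-type candidate fits all of them. Treating the typing as a variable simply means comparing candidates like these rather than fixing the typing in advance.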

3.4 Notice that (Endogenous) makes space for languages where words are typed so as to have the extension specified by the phonographemic or Kaplanian accounts, just as long as those types do in fact feature in conventions of the right sort. The considerations earlier are reasons to think they won’t feature in suitable conventions for languages such as English. But nothing in the form of the endogenous account rules them out. The endogenous account is highly flexible, making room for words to be typed so as to avoid the overly coarse or overly fine graining we saw in extant exogenous accounts. (Endogenous) makes room for word types to have the extensions which (Plus Semantic) and the indexical method were reaching for (again, provided those groupings of tokens feature in some conventions of the right sort). Nothing prevents us from giving an informative specification of the extension of word types that appeals to semantic facts, so long as this is not construed as part of a reductive metaphysics. In particular, (Endogenous) makes room for a satisfactory treatment of cases that phonographemic typed too finely. On (Lewis), the grapheme “realise” needs a different convention from the grapheme “realize” since words are typed phonographemically. By contrast, (Endogenous) permits a typing that groups these together as tokens of one word. (Endogenous) also addresses the problem of word types with too few instances to support conventional regularities. The single pronunciation of “okapi” in a North Welsh accent can be lumped together with pronunciations in other accents.

What about cases that the phonographemic account typed too coarsely, such as the “bank” case? (Endogenous) allows the language to feature a typing relation T that types some utterances such as B-A-N-K as of the word-type “bank₁” and others of the type “bank₂”. With two words, the language can have a grammar featuring a non-disjunctive regularity, pairing “bank₁” with the function from river banks to True and a separate non-disjunctive regularity pairing “bank₂” with the function from financial banks to True. In a case where there are at most two river banks and one high street bank, “there are three banks” will come out false as desired, avoiding the problem that faced the disjunctive method. (Endogenous) also avoids the problem that faced the sequence method. There are regularities of truthfulness and trust governing respectively the types “there is a bank₁ nearby” and “there is a bank₂ nearby”, speakers utter the first only when they believe there’s a river bank nearby, and the second when they believe there’s a financial bank nearby, and form the appropriate beliefs when they hear the respective utterances. A speaker’s reason to obey (Truthfulness) for the sentence featuring bank₂ is her knowledge that her interlocutors conform to (Trust) for that sentence. So unlike the sequence method, the regularities in (Endogenous) meet condition (3) for conventions. Finally, although the typing reflects semantic features of the token sentences, this does not result in circularity. The indexical method was embroiled in circularity only because it was part of an exogenous theory. Since we do not attempt to characterize T prior to the semantics, there is no circularity. (Endogenous) also makes populations endogenous to the metasemantic theory. This answers an objection White (MS) raises against Lewis’s metasemantics. White gives reason to doubt that there is a principled non-semantic exogenous account of a population. 
The speakers of a language are not exactly those within certain geographical boundaries in certain time periods or those of certain nationalities. Nor can we identify the population as those who speak a certain language if the population is to be given prior to the semantic theory. Our endogenous approach gives a principled line on what will count as a language-using population: a language speaker population is any group for which there are conventional regularities of truthfulness and trust. So our approach rescues Lewis from White’s objection.
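
The contrast drawn in this section between a single disjunctive “bank” and two separately typed words can be made concrete with a toy model of the “there are three banks” case. The sketch is illustrative only: the scene, the item names, and the helper functions are hypothetical, and “there are three” is read, as a simplifying assumption, as “there are at least three”.

```python
# Toy scene matching the case in the text: two river banks, one high street bank.
scene = [
    {"name": "north shore", "kind": "river bank"},
    {"name": "south shore", "kind": "river bank"},
    {"name": "high street branch", "kind": "financial bank"},
]


def bank_disjunctive(thing):
    """Extension under the disjunctive method: river banks or financial banks."""
    return thing["kind"] in {"river bank", "financial bank"}


def bank1(thing):
    """Extension of the word type bank1: river banks only."""
    return thing["kind"] == "river bank"


def bank2(thing):
    """Extension of the word type bank2: financial banks only."""
    return thing["kind"] == "financial bank"


def there_are_three(extension):
    """Truth value of 'there are three Fs' in the scene (at-least-three reading)."""
    return sum(1 for thing in scene if extension(thing)) >= 3


print(there_are_three(bank_disjunctive))  # True
print(there_are_three(bank1))             # False
print(there_are_three(bank2))             # False
```

The disjunctive extension counts the sentence true in this scene, which is the unwanted verdict, whereas each of the two separately typed words yields the intuitively correct verdict that it is false.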

3.5 Although we have made room for new typings of utterance tokens, this will be no good if it comes at the cost of removing too many constraints, leaving us with an account that lets in crazy typings along with the desirable ones, thereby letting in crazy semantics along with the desired semantics. In this section, we suggest prima facie ways to gerrymander such crazy languages. Fortunately, in section 6, we can rule out that these gerrymanders are genuine languages—although they may characterize genuine regularities of truthfulness and trust, those regularities are not conventions. We will consider three ways our opponents might try to gerrymander spurious languages from intuitively correct semantic theories—by restriction, by merging, by tailoring. Begin with restriction. If a regularity prevails among the whole population, it also prevails among subpopulations. One can gerrymander spurious languages (it seems) by taking the intuitively correct semantic theory and restricting the population. The restricted subpopulation could be arbitrary, or it could be based on some recognizable feature. For an example of the latter, take English but restrict the population to the brown-eyed subset of the original population (counting those with other colours of eyes as a separate linguistic population). Relatedly, there are ways to subdivide the typing relation while preserving regularities of truth and trust. For example, take English but double the number of types of word, counting whispered and non-whispered tokens always as of different words. A second method to produce gerrymandered types and languages is merging. One trick is to merge what are intuitively distinct but synonymous word types into a single gerrymandered word type. For example, type tokens of the graphemes “cell” and “mobile” together as if they were different graphemic realizations of the same word. Another gerrymander merges tokens of the graphemes “purchase” and “buy” together. 
Merged gerrymanders keep the structure of regularities in place, simply collapsing two regularities from intuitive language(s) into a single regularity in the merged gerrymander. For that reason, merged gerrymanders would produce the same meaning assignments as the intuitively correct semantic theory that they are based on. Another trick merges tokens which are intuitively of distinct non-synonymous word types. An example groups all tokenings of “there are oranges in the fruit bowl” together with one person’s tokenings of “there
are fish in the sea”, counting these as instances of a single type. A more natural version would count tokenings of the same phonographemic type as of the same word type across geographical or historical semantic drift. For example, “pants” picks out an external leg-covering among speakers from one geographic region, an undergarment among those from another. How should non-synonymous mergers specify the candidate regularity of truthfulness/trust to cover disparate usage of the two nonsynonyms? To get the right semantics, one would have to treat the merged ‘word type’ as ambiguous. We discussed three methods of dealing with ambiguity in section 3—the disjunctive, sequence, or indexical method. (In essence, treating homophones like “bank” as a single type is also a merge relative to the two non-synonymous but phonographemically identical types.) We showed in section 3 that the disjunctive and sequence methods are unsuitable. However, we could treat merged words as indexical provided the relevant aspects of context are not semantic as we argued they were for homophones like “bank”. For the orange/fish merge the relevant feature of context is whether the token was uttered by a particular person and this is not problematically semantic. For “pants”, that contextual feature might be the accent of the speaker. Restriction and merging (under the indexical treatment) in general do not generate unwanted assignments of truth-conditions to any token utterances. By contrast, our final kind of spurious typings—tailored typings—don’t only produce spurious types; they also threaten to overgenerate meaning assignments to tokens. Tailoring involves tweaking the boundaries of standard types, so as to alter the semantic assignments to particular utterances. Arbitrary tweaking is not guaranteed to produce regularities. 
But here’s a prescription for gerrymandering genuine regularities: find utterances which are exceptions to the regularities of what is intuitively the correct semantics, then adjust the population or sentence typing so that they are no longer exceptions to the rule that covers them.

Here is a first example of tailoring: the red tailor. Consider a biased selection P* of the population who are apt to call more orangey things “red” than is the norm. Suppose P* are scattered among the rest of the intuitive population of English speakers. Now consider a gerrymandered typing of words, “red₁” and “red₂”, the first tokened exclusively by members of P* when interacting with other members of P*, the other tokened on the remainder of the occasions. By construction, there will be regularities of truthfulness and trust among P* linking “that is red₁” to a belief that the item in question is red*, where red* is a colour that includes orangey non-red shades. Insofar as we concentrate on the whole population and the standard typing of “red”, the “red” tokens uttered by members of P* will mean red; but insofar as we concentrate on the gerrymander P*/“red₁”, those very tokens will also mean red* (in a distinct language).

For a second example, consider utterances where the speaker intended to produce the phoneme /treɪn/ but the phonetic intention failed and she produced a different phoneme, /dreɪn/. Perhaps many such utterances are misheard as /treɪn/ by the audience, a common mistake, given expectations set up by the conversational context. Call these the fluffed tokens. Intuitively the fluffed tokens are instances of the word “drain”, not “train”. Accordingly, the standard semantic theory treats sentences involving the fluffed tokens as exceptions to the truthfulness and trust regularities linking “drain” and beliefs about drains. (Lewis) and (Endogenous) permit this: regularities need not be perfect. But our opponents can retype the fluffed tokens (which sound /dreɪn/) with tokens that sound /treɪn/. Call this word type “train+”. There are regularities linking “train+” to beliefs about trains, which both the tokens sounding /treɪn/ and the fluffed tokens confirm.

Malaprops too are susceptible to tailoring. Suppose Mrs Malaprop says, “Aviators are dangerous reptiles found in the marshes in Florida.” On the intuitive semantic interpretation, Mrs Malaprop says something false about people who fly planes, even if her audience can work out what truth she intended to convey. Our opponents tailor a gerrymander (the malaprop tailor) out of Malaprop’s exception to the intuitive regularities. Retype the first token word in Malaprop’s utterance with tokens spelt and sounded like the intuitive word-type “alligator” to produce a word-type “alligator+”.
One can now subsume her utterance under a truthfulness regularity linking alligators and “alligator+”. Perhaps Lydia hears Mrs Malaprop and takes her to believe that there are thousands of alligators. If so, one can also subsume Mrs Malaprop’s utterance under a trust regularity.

3.6 We have presented seven ways of gerrymandering word types. There were two forms of restriction: either arbitrary (subtypes that do not correspond to features to which speaker-hearers are sensitive) or recognizable (whispered vs. non-whispered, uttered by someone with brown vs. blue eyes); two forms of merging (of synonyms, e.g. “cell”/“mobile”, “buy”/“purchase”, or of non-synonyms under the indexical treatment); and three ways of tailoring the type (the red-tailor, favouring orange-biased usage; the fluff-tailor, reclassifying drain-utterances as train-utterances; and the malaprop-tailor, treating Malaprop’s aviator-utterance as an alligator-utterance). These all articulate genuine regularities of truthfulness and trust connecting our utterances and attitudes. Fortunately, Lewis’s account of convention already contains within it countervailing pressures to rule out these gerrymanders. To be a convention, a regularity needs to do more than meet clause (1) of Lewis’s definition. Condition (3) requires that the population’s beliefs that the regularity obtains must give them good reasons for conforming to it in future. We will argue that this means that types must be (i) identifiable and (ii) psychologically present.

              Restriction       Merge               Tailor
              arb.    recog.    syn.    non-syn.    red    fluff    malaprop
Identifiable  N                                     N
Presence      N       N         N       N           N      N        N

The table summarizes what these constraints rule out. Although psychological presence would suffice alone to rule out all gerrymanders from counting as actual languages, identifiability tells us something about what is a possible human language. For this reason it can explain why the phonographemic account of words seemed appealing. Identifiability. To feature in a conventional regularity, sentence types and populations must be identifiable by speakers, in the sense that the speaker must have a generic capacity to tell that utterance u is of type s, or person a is in population P. This falls out of (3)’s requirement that speakers employ beliefs about the type in reasoning. (3) requires that the speaker-hearers’ awareness of the existence of the regularities of truthfulness and trust prevailing in their population gives them reason (either epistemic or practical) to conform to those regularities. Here, for illustration, is the schematic form of the epistemic reason to conform to a suitable instance of the trust regularity:

a. There’s a regularity among P to utter type s only if they believe p.
b. The speaker has uttered something of type s.
c. The speaker is a member of P.
So,
d. The speaker believes p.

Combined with standing presumptions of cooperation and expertise, this supports a conclusion of the form:

e. p.

(Similar reasoning applies on the practical side.) For this sort of reasoning to be available to a hearer, she needs, in steps b and c, to identify which type a token she encounters falls under, and which population a person stage she encounters belongs to. If she can never do this, her belief in the truthfulness regularity never provides her with reason to conform to the trust regularity and (3) fails utterly. To be able to identify types and populations, she need not do so successfully in all cases. For example, Zara misidentified Xena’s utterance of “train” but she still has the general ability to identify utterances of “train”. However, given that in order for a regularity to be a convention, (3) must be met with only a few exceptions, she must be able to identify types and populations with only a few exceptions. The identifiability point is general to all conventions. Take the convention of driving on the left, when in Australia. There is therefore a regularity of driving on the left when in region R of Australia (and one of driving on the left when in the Australian complement of R). But if I can’t tell when I’m in region R, and when not, these subregularities can’t feature in my practical reasoning and if so, these regularities aren’t conventions. The identifiability constraint rules out some of the gerrymanders from section 5. It immediately implies that arbitrary subpopulations (P₁/P₂) or arbitrary finer-grained typings of words (s₁/s₂) won’t feature in conventions, since speakers will not in general be able to identify when an utterance falls under s₁ rather than s₂, or an individual is in P₁ rather than P₂. The constraint also eliminates some examples of tailoring: speakers can’t tell who is biased towards orange in their “red” grapheme/phoneme utterances if P* is unidentifiable. As the table records, many gerrymanders are not ruled out by this constraint. Recognizable restrictions,


the malaprop and the fluff tailor and the various forms of merging, all pass the constraint. The identifiability constraint does admit the intuitive sentence types and populations, as desired. The population of English speakers make distinctive noises and inscriptions that enable others in the group (with only occasional exceptions) to identify them as such. Tokens of the intuitive word-type “cat” are also identifiable as such, with the primary clue being their phonographemic features. Background knowledge plays an important mediating role in this identification: phonological features are combined with experience of the idiosyncratic variations of pronunciation across different accents or speech impediments to produce our ability to identify types. Other resources available to us in the identification task include non-semantic conventions¹⁹ such as the explicit letter transformation conventions of braille and transliteration conventions, both of which apply regardless of semantics to any sentence type. This explains the initial appeal of an exogenous phonographemic account. Phonographemic features are indeed important but they feature in the epistemology of type recognition, rather than the metaphysics of those types. In the epistemology, they need only be defeasible evidence of word type. This comfortably accommodates the holism and idiosyncrasy that blocked a metaphysical phonographemic account. Psychological presence. We take it that to be a convention, a regularity must be the content of the psychological reasons that members of a relevant population possess for acts of conformity to R.²⁰ Given this, a genuine regularity R* can fail to meet (3) if it does not feature as the content of speakers’ reasons. Again, there are illuminating precedents in driving conventions. First consider merging: from the base conventions, such as driving-left-in-Australia, we can construct merged regularities such as driving-left-when-in-Australia-and-driving-right-when-in-the-US. These merged regularities do not feature in agents’ psychological reasons for instances of left-driving in Australia. As evidence, notice which information agents

¹⁹ See Cappelen 1999 for discussion of such conventions.
²⁰ An alternative reading of (3), which we reject, is that the regularity should be a consideration that objectively counts in favour of future conformity to it, irrespective of whether it is psychologically present in the population. The motivation for the objective reading is to avoid over-intellectualizing speakers by attributing to them explicit beliefs about the regularities. However, the psychological view need only attribute implicit beliefs.


take as relevant when deciding which side of the road to drive on. The reason for and decision to drive on the left in Australia are resilient under various hypotheses about what happens elsewhere. I don’t recheck my decision to drive-left-in-Australia if I discover that the US has altered its driving laws. This shows that the Australia specific regularity is operative in my reasons, not a merged regularity for both Australia and the US. Exactly the same goes for mergers of types for conventions of truthfulness and trust. We psychologically encode information about truthfulness and trust for tokens of “buy” and “purchase” separately. As evidence, notice that were I to learn that the meaning of “purchase” had shifted so as to be associated with the belief an item had been stolen, I would not regard my practice with “buy” as undermined. A fortiori, arbitrary mergers of non-synonymous words, such as “orange” and “fish”, do not figure among my reasons for forming beliefs or uttering sentences. Equally, in a conversation between Americans, it doesn’t matter whether or not they know the British usage of “pants”. That shows that the US specific regularity rather than the merge of US and UK tokens of P-A-N-T-S is psychologically present. We now have an additional reason to reject Lewis’s treatment of ambiguity from section 2. It is the two separate bank₁ and bank₂ regularities that feature in our psychological reasons, not the disjunctive, indexical, or sequence regularities featuring the phonographemic type B-A-N-K. Evidence is that the former but not the latter regularities are resilient. If I get evidence that the regularity linking B-A-N-K to river banks is breaking down, it affects only my watery uses of B-A-N-K, not my financial ones. In sum, presence knocks out the identifiable but gerrymandered mergers. Application of the presence constraint also rules out recognizable restrictions. Again, consider the driving convention analogue. 
From a base convention of driving-left-when-in-Australia, we get restricted regularities of driving-left-when-in-a-big-car-in-Australia and driving-left-when-in-a-small-car-in-Australia. Here, the relevant psychological fact is that we are inclined to recheck our practical reasoning when in a big car, conditional on learning that behaviour with small cars is not as we thought it was. If our psychological reasons for drive-left (in circumstances where we happen to be in a big car) were the big-car regularity, this should not happen. The whispered/non-whispered restricted regularities are the analogues of this in the case of truthfulness and trust. If I learn that whispered


tokens of “blue” are applied to slightly greener things than I had thought, I’m inclined to assume that this generalizes to non-whispered tokens. We’re disposed to take it that whispered and non-whispered tokens pattern alike—being more confident in this than we are in the exact patterns connected to either. So our actual psychological reasons for the linguistic acts (and beliefs formed in communicative situations) are not sensitive to the whisper/non-whisper distinction.²¹ It remains to consider the fluff and malaprop tailors. The underlying challenge here is to pinpoint why paradigm conventions are based on simple but exception-prone regularities, when there are complex regularities with fewer exceptions that we are in a position to work out. The Australian regularity driving on the left has exceptions, predictable from our general knowledge of the world and each other: we violate the regularity when swerving to avoid a kangaroo, when moving into a parking space on the right, etc. One could tailor a “regularity” to which there is more perfect conformity by including the exceptions into the specification: “drive on the left except when one swerves to avoid a kangaroo . . . ”. The latter gerrymander doesn’t seem to be a true convention. Why not? We can once more appeal to presence. The simple rule remains a reason for us to drive left, conditional on varying assumptions about the character of exceptions. We might learn that drivers do not react fast enough to swerve to avoid a kangaroo. That is a useful piece of information, but not something that prompts me to recheck my ordinary practical reasoning about how to drive in the absence of kangaroos. When kangaroos are around, the complex regularity does no better, for it is not a psychological reason for swerving. When I do swerve the reason to do so is that if one doesn’t one will have a collision, not a desire to conform to some rule. In the case of truthfulness and trust, malaprops and fluffs pattern similarly. 
Regularities featuring the type “alligator+” have one fewer

²¹ This is not to deny that there can be nested conventions, cases where the inner conventions are restrictions of an outer, more general convention, perhaps for a larger population. For example, there seem to be specific conventions for speakers with a Leeds accent nested within more general ones for speakers with a British accent, nested inside more general conventions still for speakers with a wider range of accents. These nested conventions might feature different types—consider the case of “colour”/“color”.


exception than the ‘standard’ regularities featuring the intuitive word types. However, that gerrymandered typing is not anybody’s reason for uttering what they do. It’s not Mrs Malaprop’s, since she is not aware of her idiosyncrasy. It’s not our reason when interpreting each other, outside malaprop contexts. We don’t recheck our way of interpreting each other conditional on propositions about that utterance of Mrs Malaprop’s. You might think that knowledge of the malaprop regularities would be an interpreter’s reason for attributing to Mrs Malaprop an alligator-belief upon hearing her utter, “Aviators are dangerous reptiles found in the marshes in Florida.” But anyone who tried to appeal to that rule would make a mistake about Mrs Malaprop’s psychology. They would be representing her utterance as the rational outcome of true beliefs about regularities in usage of a (tailored) word type, when part of the basic data about this case is that she’s either subject to performance-error (so that act-type isn’t rationalized by her beliefs) or has false linguistic beliefs that lead to her utterance. Similar remarks cover the case of fluffs. So, the psychological presence constraint disqualifies all the remaining gerrymanders from being languages. The threat of overgeneration is allayed. There’s a difference between regularities that are not conventions due to violations of identifiability, and regularities that are not conventions due to violations of presence. The former regularities could not become conventions, unless our powers of recognition were enhanced. In the case of presence-violators, it is a contingent fact of psychology that we reason one way rather than another. We could base our driving decisions on international disjunctive regularities rather than national ones. We could treat “buy”/“purchase” as a semantic unit, a variant spelling like “recognize”/“recognise”. 
One shouldn’t expect the abstract metaphysics of words to be the place for an illuminating explanation of why our psychology of language codes things one way rather than another— historical linguistics is a much better bet for such insight. But our point has been simply to show that the endogenous story set out in section 3 was not immediately refuted by crazily overgenerating word types, and this has been achieved.²²

²² One author would also use condition (4) of Lewis’s characterization of convention to demonstrate the non-conventionality of some of the gerrymandered word-type regularities.


3.7 We have set out and defended a new account of sentence types. Rather than receiving some individuative story prior to and independent of our metasemantics, on our theory the same factors that fix meaning, fix word-typing. In so doing, we avoid the problems that beset previous accounts of the metaphysics of words, including the widely but unreflectively endorsed phonographemic view. Our story explains why the phonographemic view seemed so natural: phonographemic factors play a crucial epistemological role in the identifiability of our sentence types. The most important worry for our view is that it might overgenerate spurious word types. We have explained in detail why, although there are many regularities of truthfulness and trust involving spurious typings, there are very few conventional regularities. We have given an account of sentence types; it is a further question what individuates the lexical items, the basic morphemes out of which sentences are built. A natural extension of our account addresses this question. Recall that Lewis held that once we had a grip on which language (function from sentence types to propositions) was in use in a population, we would then use this to get a grip on which grammar (syntactical parsing and semantic analysis) was in use therein. The grammar in use was the simplest, strongest theory that generated the sentence-proposition pairings of a language-in-use. Included in the grammar, in our view, will be an assignment of lexical-types to parts of utterances, as well as an assignment of syntactical-tree types to whole utterances; a constraint will be that the leaves of the tree be lexical types that (if non-null) are instantiated by parts of utterances. The requirement for simplicity and strength will ensure that a sensible typing of sentence types will require a sensible typing of constituent word types, if the grammar is to be used. 
Although our focus has been on the metaphysics of word types, we have en passant given an account of language-using populations. Thus we are able to offer an answer to White’s (MS) challenge: that the Lewisian metasemantics is circular because the relevant populations cannot be non-circularly specified. White presupposes that the account of populations must be exogenous. The endogenous theory of word types allows the naturalistic theory of meaning to proceed without the worry that the metasemantic base contains tacit circularities. But the flipside is that there is no guarantee


that endogenous theories are available for arbitrary metasemantics. Those wishing to deploy a sophisticated variant of the Evans-style dominant-causal-role metasemantics with which we began still owe us all a (presumably exogenous) account of the metaphysics of word types. Worse for them, the convention-based competitor has discharged their debts in this respect—it can no longer be dismissed as everyone’s problem.

References
Cappelen, H. (1999). ‘Intentions in Words’, Noûs 33 (1): 92–102.
Cappelen, H. and J. Dever (2001). ‘Believing in Words’, Synthese 127 (3): 279–301.
Davidson, D. (1973). ‘Radical Interpretation’, Dialectica 27: 314–28.
Davidson, D. (1979). ‘Quotation’, in Inquiries into Truth and Interpretation. New York: Oxford University Press, 1979.
Evans, G. (1973). ‘The Causal Theory of Names’, Proceedings of the Aristotelian Society, Supplementary Volume 47: 187–208.
Haack, S. (1978). Philosophy of Logic. New York: Cambridge University Press, 1978.
Hawthorne, J. and E. LePore (2011). ‘On Words’, Journal of Philosophy cviii (9): 447–85.
Horwich, P. (1998). Meaning. Oxford: Oxford University Press, 1998.
Kaplan, D. (1990). ‘Words’, Proceedings of the Aristotelian Society, Supplementary Volume 64: 93–119.
Larson, R. K. and G. Segal (1995). Knowledge of Meaning: An Introduction to Semantic Theory. Cambridge, MA: MIT Press, 1995.
Lewis, D. (1983). ‘Language and Languages’, in Philosophical Papers, Volume I, 163–88. Oxford: Oxford University Press, 1983.
Millikan, R. G. (1984). Language, Thought and Other Biological Categories. Cambridge, MA: MIT Press, 1984.
Richard, M. (1990). Propositional Attitudes: An Essay On Thoughts and How We Ascribe Them. Cambridge, MA: Cambridge University Press, 1990.
Stebbing, L. S. (1935). ‘Sounds, Shapes, and Words’, Proceedings of the Aristotelian Society, Supplementary Volume 14 (1): 1–21.
White, R. (MS). ‘David Lewis, Meaning and Convention.’

4 Conditional Acceptance
Ofra Magidor

4.0 Introduction
Providing an adequate semantics for indicative conditionals is a notoriously difficult task. In this paper I present a new challenge to the task of providing semantics for indicative conditionals, and the related epistemic question of when one ought to accept a conditional statement. The challenge takes the following form. In 4.1, I present a case involving an utterance of a certain indicative conditional C. In 4.2, I argue that at least three prominent theories of conditionals each predict that, in this case, you should assign a credence of at least 0.9 to C. In 4.3, I argue that this prediction is wrong: given the case, it is entirely permissible for you to assign a credence which is lower than 0.9 to C. In 4.4, I discuss what conclusions we can draw from the argument, both for the semantics of conditionals and for epistemology more generally. The appendix contains the more technical details of the probabilistic analysis of the case.

4.1 The case
You are visiting an exotic island, and you are certain of (that is, you have credence 1 in) the following set-up:

1. The Island: the island contains exactly two kinds of inhabitants: Randomers and Reliabilists. You do not have any way to distinguish between the two kinds of inhabitants other than by relying on the features explained below.

2. Randomers: Randomers are odd people: they have their own beliefs but they keep them entirely to themselves. When they speak, they pick some subject matter P and flip a fair coin in order to decide whether to utter P or to utter ¬P. Which sentence they utter is neither a reflection of their credences nor of the facts in the world.¹

3. Reliabilists: Reliabilists, on the other hand, utter (declarative) sentences only when they are highly confident in them. Moreover, while Reliabilists are not omniscient and sometimes do utter falsehoods, they are highly rational and extremely well-informed. You take them to be experts, and when you are certain someone is a Reliabilist, you attempt to defer to their beliefs, including partial beliefs.²

For a start, let us set up the case so that, subject to the usual caveats, your credence in P conditional on X being a Reliabilist who utters P is precisely 0.9 (call this ‘the Simple Case’).³ How might we defend such conditional priors? For a start note that on the standard Bayesian picture your conditional prior must be some precise number or other: there seems to be no reason we cannot set this number to 0.9 in particular. Moreover, such conditional priors might be motivated by the following picture: suppose you think that Reliabilists only utter P if their credence in P is at least 0.9 (and have no additional information about their credences). One way of trying to defer to their expertise is to adopt the (somewhat cautious) strategy of always adopting a conditional credence of 0.9 in such cases. Some, however, seem to find this constraint on your conditional credences too strict. I will therefore also present another version of the case (call it ‘the Generalized Case’), where your credence in P conditional

¹ Some might find such cases of insincere utterances inappropriate for semantic theorizing. In that case, here is a variant which will do just as well for my purposes: Randomers only utter sentences they are highly confident in, but they form their beliefs entirely randomly. (On this variant, you should either maintain that the beliefs of Randomers are not probabilistically coherent, or alternatively, that the entire probabilistic state of a Randomer changes frequently by mechanisms other than conditionalization.)
² To make the case a bit more realistic, we can suppose that there are some subject-matters on which you don’t take Reliabilists to be experts. Thus, for example, you might not defer to Reliabilists on the question of whether you have a toothache. But I will assume that the conditional which is at the centre of this case is one of those many subject-matters on which you do defer to Reliabilists.
³ The caveats are that since $\Pr(A \mid B) = \frac{\Pr(A \wedge B)}{\Pr(B)}$, the conditional probability cannot be equal to 0.9 if $\Pr(A) = 1$, or if $\Pr(A \wedge B) = 0$ (and consequently if either $\Pr(A) = 0$ or $\Pr(B) = 0$).


on X being a Reliabilist who uttered P is at least 0.9.⁴ (As we shall see, the shift from the Simple to the Generalized Case makes no difference to my argument in 4.2, and while it does somewhat complicate the argument in 4.3, the argument ultimately goes through equally well.) There is a final complication here: the utterance which will be at the centre of my case consists of an indicative conditional. This fact makes no difference to the set-up given any theory of conditionals which maintains that they express propositions. But one of the prominent theories I will discuss below (The Suppositional Theory) maintains that conditionals do not express propositions. Moreover, while proponents of the theory often avail themselves of talk of ‘the probability of a conditional’, there are good reasons to think that such “probabilities” cannot be fully represented using the standard Bayesian picture as above.⁵ Indeed, on one way of interpreting the Suppositional Theory, conditionals do not strictly speaking have probabilities at all. A conditional statement ‘If A then B’ can be more or less “accepted”, and it is accepted to precisely the degree of one’s conditional credence in B given A. Relatedly, on this interpretation a conditional is assertible in so far as the speaker’s conditional credence in B given A is high, and the role of the assertion is to get the hearer to also adopt a similarly high conditional credence. While this picture does not allow us to (strictly speaking) accept my assumption that conditional on X being a Reliabilist who utters a conditional you should have a credence of (exactly or at least) 0.9 in the conditional, it can allow for a very close analogue of this assumption: suppose you are certain that Reliabilists only assert conditionals if they have a high conditional credence in B given A. 
Thus when you are certain that X is a Reliabilist who uttered a conditional ‘If A then B’, you should (subject to the usual caveats as above) also adopt a high conditional credence in B given A. Relatedly, if you revise your current credences by conditionalizing on the claim that X is a Reliabilist who uttered P, the resulting credence function would be one on which your degree of acceptance of the conditional is 0.9 (the Simple Case), or at least 0.9 (the Generalized Case).

⁴ This time, we’ll only need the caveat that your prior credence in ‘P and X is a Reliabilist who uttered P’ is not null.
⁵ For example, on the Suppositional Theory, the “probability” of a conditional, conditional on the negation of its antecedent, is undefined, even if the latter receives positive probability.

4. The distribution of islanders: The intuitions in the case become most vivid, I think, if you suppose that the vast majority of the islanders are Randomers, and a small minority are Reliabilists. As it turns out, however, for the purposes of the Simple Case, any non-trivial distribution is sufficient for my argument, and for the purposes of the Generalized Case, a distribution of for example 50% Reliabilist/50% Randomers will do. (See §3 and the Appendix for discussion.)

5. The utterance: Walking around the island you meet a stranger S. The stranger utters the following conditional:

(C) If I am a Reliabilist then I have high blood pressure.

This completes the set-up of the case. The question is now whether, upon hearing S’s utterance, you ought to adopt a high credence in C. In what follows, I argue that each of three highly prominent theories on the semantics of conditionals predicts that you should adopt a credence of at least 0.9 in C, but that this prediction is incorrect.⁶
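The way conditional credences work in the set-up can be illustrated with a toy model. The sketch below is only an illustration under stated assumptions: the 10% share of Reliabilists, the credence of 1/2 in P for Randomer worlds, and the names `joint` and `conditional` are hypothetical and are not part of the case as described. Every world in the model is one in which the speaker has already uttered P.

```python
from fractions import Fraction


def conditional(joint, event, given):
    """Return Pr(event | given) by the ratio formula, or None if Pr(given) = 0."""
    pr_given = sum(p for world, p in joint.items() if given(world))
    if pr_given == 0:
        return None  # the conditional credence is undefined for a null condition
    pr_both = sum(p for world, p in joint.items() if given(world) and event(world))
    return pr_both / pr_given


# Worlds are (kind of speaker, whether P is true) pairs.
# Hypothetical numbers: 10% Reliabilists; P is true in 9/10 of the Reliabilist
# worlds and in 1/2 of the Randomer worlds (a coin flip carries no information).
joint = {
    ("Reliabilist", True): Fraction(1, 10) * Fraction(9, 10),
    ("Reliabilist", False): Fraction(1, 10) * Fraction(1, 10),
    ("Randomer", True): Fraction(9, 10) * Fraction(1, 2),
    ("Randomer", False): Fraction(9, 10) * Fraction(1, 2),
}


def p_is_true(world):
    return world[1]


given_reliabilist = conditional(joint, p_is_true, lambda w: w[0] == "Reliabilist")
given_randomer = conditional(joint, p_is_true, lambda w: w[0] == "Randomer")
undefined = conditional(joint, p_is_true, lambda w: w[0] == "Omniscient")
```

In this toy model the utterance bears on P only on the supposition that the speaker is a Reliabilist, and the conditional credence is left undefined when the condition has probability zero, which mirrors the caveats attached to the Simple Case.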

4.2 Prediction of the prominent theories
There are three general positions concerning the semantics of indicative conditionals. The first maintains that the conditional is truth-functional (the truth-value of a conditional depends only on the truth-values of the antecedent and consequent) and it is easy to show that the only

⁶ Some readers might have the suspicion that this case is related to the semantic paradoxes, perhaps in particular Curry’s paradox. I do not think this objection is on the right track. In so far as this is merely raised as a suspicion rather than a concrete argument, it is hard to defuse—but let me make a few comments that might help alleviate these worries. First, I see no reason at all to suppose the conditional in question is paradoxical. Second, there needn’t be anything in the content of C that (either directly or indirectly) refers to C or its truth-value. (It might be helpful to realize that the label ‘Reliabilist’ is merely a helpful notation—the group need not be defined by its reliability properties—it might refer to members of a particular gender or religion. Furthermore, the case only depends on the fact that you have credence one in the claim that Reliabilists have the reliability properties described above, nothing in the case requires them to actually have them.) Finally, there is no reason to suppose that C is in some way ungrounded. Of course, the question of what exactly grounds C depends on which semantics for conditionals is adopted, but the same kind of facts that make standard indicative conditionals such as ‘If John is British, he has high blood pressure’ true or acceptable, should serve equally for the case of C.


candidate truth-function here is that of the material conditional.⁷ According to this view, ‘If A then B’ is true just in case ‘¬A∨B’ is true.⁸ The second maintains that the conditional is not truth-functional, but does have truth-conditions. Probably the most influential theory of this form, at least among philosophers, is that of Robert Stalnaker.⁹ On Stalnaker’s view, ‘If A then B’ is true if and only if B is true in the selected A-world. Generally, the relevant A-world is selected to be the “closest” one to the actual world.¹⁰ However, a crucial constraint on the ordering function in the case of indicative conditionals is that if there are any A-worlds in the context-set (roughly: the set of worlds that are considered open possibilities in the conversation), then the selected A-world must be in the context-set. Thus if A is not ruled out in the conversation, ‘If A then B’ is true just in case B is true in the closest A-world in the context-set. The third position maintains that indicative conditionals do not even have truth-conditions (or at least, they have at most partial truth-conditions). On this view, although conditionals might have systematic semantic values, these do not take the form of standard propositions. The most prominent variant of this view is the Suppositional Theory.¹¹ According to the Suppositional Theory, the central principle governing indicative conditionals is that the credence one should assign to the conditional ‘If A then B’ should be the credence one assigns to the consequent B conditional on the antecedent A (this principle is often summarized as ‘the probability of the conditional is the conditional probability’ and is also referred to as ‘Stalnaker’s thesis’ or ‘Adams’s thesis’).¹² Adams’s thesis is connected to the view that conditionals are non-propositional in two ways: first, because the principle places a direct

⁷ See Edgington 1995: 242 for a defence of this claim.
⁸ Notable defenders of the truth-functional view include Grice (1989) and Jackson (1987). See Edgington 2009 for a summary of the main arguments in favour of truth-functionality.
⁹ See Stalnaker 1968 and Stalnaker 1975.
¹⁰ I write ‘closest’ in quotes because the ordering function might be highly context sensitive, and in particular depend on the conditional in question.
¹¹ For defences of this theory see Adams 1975, Edgington 1995, and Bennett 2003.
¹² Though, as I noted in the introduction, on some interpretations of the view, we shouldn’t think of agents as strictly speaking having credences in conditionals. If you prefer this interpretation, replace ‘credence’ with ‘degree of acceptance’ in the discussions of the view.


constraint on one’s credence in the conditional, rather than offering truth-conditions coupled with the general principle that one’s credence in a proposition is the credence that it is true. This means that the Suppositional Theory is at least compatible with a non-propositional approach. Second, and more importantly, a range of triviality results show that (subject to some plausible background assumptions) one cannot satisfy Adams’s thesis while also maintaining that conditionals express propositions.¹³ What does each of the three views predict about your credence in C following the utterance in the case?¹⁴ Start with the material conditional view. According to this view, C is true just in case either S is not a Reliabilist or S has high blood pressure. Let ‘R’ denote the proposition that S is a Reliabilist, and let ‘P’ denote your credence function after updating on S’s utterance (i.e. your posterior credence function). For a start, note that by the Law of Total Probability $P(C) = P(C \mid R) \cdot P(R) + P(C \mid \neg R) \cdot P(\neg R)$. Since $P(R) + P(\neg R) = 1$, if both $P(C \mid R) \geq 0.9$ and $P(C \mid \neg R) \geq 0.9$, it follows that $P(C) \geq 0.9$. And this is indeed the case on the material conditional view: conditional on S being a non-Reliabilist, the conditional is trivially true and should receive credence 1. Conditional on S being a Reliabilist, since S uttered C, you should (in light of the set-up) give C a credence of at least 0.9. So your overall credence in the conditional should be at least 0.9.¹⁵ Next, consider the Suppositional Theory (I leave the discussion of Stalnaker’s view to the end, because it’s the trickiest case). According to the Suppositional Theory, the credence (or whatever stands proxy for

¹³ See Lewis 1976 for the original triviality results, and Bennett 2003: §5 for a helpful summary of various extensions.
¹⁴ I will discuss the Simple and the Generalized Cases in tandem here, as the difference between them does not matter for the purposes of the current discussion.
¹⁵ It is worth noting that in addition to the truth-conditions, Jackson and Grice each offer some pragmatic constraints on utterances of a conditional. One might suggest that we should take these into account when considering your credence in C, at least for the case where you suppose S is a Reliabilist. But in the case of Jackson, the constraint is simply that the speaker’s credence obey Adams’s thesis, which would still give us a credence of 0.9 (see the discussion of the Suppositional Theory below). And in the case of Grice, one can easily cancel the relevant conversational implicatures (e.g. by adding some special pragmatic explanation for why S uttered C rather than one of its disjuncts. One simple explanation is that you simply asked S specifically about C).

credence) that you should assign to C is the conditional credence you have in the consequent (‘S has high blood pressure’; abbreviate this with ‘HBP’) conditional on the antecedent (namely, R). Let $P_R$ be the credence function you obtain from P by conditionalizing on R. On the assumption that S is a Reliabilist, you ought, given the set-up, to have a credence (or credence-proxy) of at least 0.9 in C, or in other words, you should have a conditional credence of at least 0.9 in the consequent conditional on the antecedent. This entails that $P_R(\mathrm{HBP} \mid R) \geq 0.9$. On the other hand, by the definition of $P_R$, $P_R(\mathrm{HBP} \mid R) = P_R(\mathrm{HBP}) = P(\mathrm{HBP} \mid R)$. Thus, $P(\mathrm{HBP} \mid R) \geq 0.9$, as required.

Finally, consider Stalnaker’s view. According to this view, the conditional is true just in case in the closest context-set world where S is a Reliabilist (assuming there is one), S has high blood pressure. First, note that there certainly are context-set worlds where S is a Reliabilist. (You have no reason to rule out that S is a Reliabilist prior to hearing S’s utterance, and you also have no reason to rule it out after the utterance: after all you have no reason to assume S’s utterance is false in this case, and even if you did, that would not give you grounds for ruling it out because Reliabilists sometimes utter falsehoods.) The second thing to note is that following S’s utterance, it is part of the common ground in the conversation that S uttered C.¹⁶ It follows that all worlds in the context-set are ones where S utters C. Now first let us consider the set of context-set worlds in which S is a Reliabilist (that is, let us assess C conditional on S being a Reliabilist). As we’ve established, each of these is a world where S utters C. Given the set-up, you have 0.9 credence (or higher, in the Generalized Case) that a world where S is a Reliabilist and S utters C is a world where C is true.
Moreover, on Stalnaker’s semantics, any world where S is a Reliabilist is a world where C is true just in case the consequent of C is true. Thus the worlds in which S is a Reliabilist are distributed so that 0.9 (or more) of the region they occupy is a HBP region.

Next, consider the probability of C conditional on S being a Randomer. Here things become somewhat trickier. As Lewis pointed out, on Stalnaker’s semantics the probability of a conditional is not always equal to the conditional probability, precisely because the probability of the

¹⁶ Note that Stalnaker is explicit about endorsing the claim that one (typically) adds to the context-set the fact that the particular utterance has been made (see e.g. Stalnaker 1999: 86).

conditional over worlds where the antecedent is false might differ from its distribution over worlds where it is true.¹⁷ Here is an example which illustrates how the two could come apart: you are certain that a person in the other room was holding a cup and dropped it (but you cannot see the cup). Now suppose you think there are two possibilities: either the cup is made of glass or it is made of pewter. Conditional on its being made of glass, you have a credence of 0.95 that it shattered. But suppose you also know that if the cup is made of pewter it must have a particular shape (say an oval shape), and moreover you think that glass cups which have an oval shape are much less fragile than usual (they have only 0.5 chance of shattering). Now consider the conditional: ‘If the cup is made of glass it shattered’. On the Suppositional Theory, your credence in this conditional ought to be 0.95. But Stalnaker’s view predicts otherwise: assuming that the closest world to each pewter-world is one where the cup’s shape is held fixed, then conditional on the cup being made of pewter, the conditional ‘If glass then shattered’ will receive a credence of 0.5, and the overall probability of the conditional will be lower than the conditional probability (namely 0.95). Return to our case. What I would like to argue is that, as long as we add an unproblematic background assumption, this case is importantly different from the pewter/glass case just described. All we need to add is that any feature which is held fixed in closest worlds is one that is probabilistically independent for you of the question of whether S has high blood pressure. For example, you might think that for each Randomer world w, the closest Reliabilist world is one where S’s gender is the same as it is in w. 
But that would make no difference if you thought the probability of S having high blood pressure conditional on S being a Reliabilist is identical to the probability of S having high blood pressure conditional on S being a Reliabilist with a particular gender. Assuming such independence, then, for each world in the context-set where S is a Randomer, your probability that the closest context-set Reliabilist world is a high-blood-pressure world should just be the same as in the case where S is a Reliabilist (0.9 or higher). With the independence assumption in place, this case is thus analogous to a much simpler version of the cup case: suppose that you think that glass cups are 0.95 likely to shatter

¹⁷ See Lewis’s discussion of ‘imaging’ in Lewis 1976: 308–12.

when dropped, and otherwise think that the likelihood of shattering is only dependent on the material the cup is made of. In that case, your probability for the conditional ‘If the cup is made of glass it shattered’ would be 0.95, and the probability of the conditional would be identical to the conditional probability. While Stalnaker’s view does not vindicate Adams’s thesis across the board, it does validate it in a restricted range of cases, and with the proper set-up it will validate it in our particular case.

A final worry might arise: there is one feature that is certainly not probabilistically independent for you of whether S has high blood pressure: S’s blood-pressure levels. Suppose we require that the closest world to each world w matches w in S’s blood-pressure levels? (After all, wouldn’t a world which matches w in this way count as closer to w than a world which doesn’t match it? . . . ) If we had such a matching constraint then your probabilities on S’s blood pressure should affect your credence that the closest w-world is a high-blood-pressure world. Suppose for example that, on the supposition that S is a Randomer, you are 50-50 on whether S has high blood pressure. Given the matching constraint, it follows that your credence that in the closest Reliabilist world S has high blood pressure should be 0.5, and your overall credence in C might be significantly lower than 0.9.

I have two responses to this worry. First response: I think it is highly implausible that we ever interpret ‘closeness’ in a way which requires a matching constraint on the truth of the consequent. Consider another variant of the cup example: you have a cup which is in fact made of pewter. You drop it and (you are certain that) it didn’t shatter. You also know that cups made of glass which are dropped in a similar fashion are 0.95 likely to shatter.
Now it’s not obvious how we should interpret closeness in these cases, but I think at least a good indication is to look at the corresponding counterfactual judgements (we certainly shouldn’t simply assess closeness tout court given how context sensitive the notion is, but since it’s far less controversial that the semantics of counterfactuals involves something like Stalnaker’s closest-world analysis this seems like a good place to look . . . ). So consider the counterfactual ‘If the cup had been made of glass it would have remained intact’. If closeness worked so as to make worlds that matched the actual world in facts about the consequent (i.e. shattering facts), then we should assign this counterfactual a credence of 1, but we certainly do not assess the counterfactual in this way. Similarly, suppose I toss a fair coin and it

in fact lands on heads. Now consider the counterfactual ‘If the coin had been 0.9 biased towards tails, it would have landed heads’. If closeness favoured matching the actual world in facts about the consequent, we should assign this counterfactual a credence of 1, but clearly we do no such thing.¹⁸ It seems, then, that matching facts about the consequent of a conditional simply cannot be a requirement on how closeness is interpreted.

Second response to the worry: if, despite my remarks above, one still insists that closeness can involve matching facts about blood-pressure levels, then we can simply add to the set-up of the case the claim that you are certain that all Randomers have high blood pressure (though you do not think there is any interesting connection between being a Reliabilist and blood-pressure levels). This will not make a difference to the rest of my argument,¹⁹ but now if we expect the closest worlds to match up in blood-pressure facts, this would only increase the probability that one should assign to C. (In this case, Stalnaker’s theory would predict that conditional on S being a Randomer, your credence in C is 1, which is over 0.9, and thus the overall probability of C would be at least 0.9.)

I conclude, then, that all three views predict that you should assign a credence of at least 0.9 to C.²⁰
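Two of the calculations in this section can be checked numerically: the material conditional view's lower bound on your credence in C, and the pewter/glass case in which Stalnaker's view departs from the conditional probability. The following Python sketch is only an illustration. The function names, and the sample values 0.3 (your credence that S is a Reliabilist) and 0.4 (your credence that the cup is glass), are assumptions introduced here and are not part of the case.

```python
def total_probability(p_given_h, p_given_not_h, p_h):
    """Law of Total Probability over the two-cell partition {H, not-H}."""
    return p_given_h * p_h + p_given_not_h * (1.0 - p_h)


def material_conditional_credence(p_reliabilist, p_c_given_reliabilist=0.9):
    """Posterior credence in C on the material conditional view.

    Conditional on S not being a Reliabilist, the conditional is trivially
    true, so it receives credence 1.
    """
    return total_probability(p_c_given_reliabilist, 1.0, p_reliabilist)


def stalnaker_cup_credence(p_glass, p_shatter_glass=0.95, p_shatter_oval_glass=0.5):
    """Credence in 'If the cup is made of glass it shattered' on Stalnaker's
    view, assuming the closest glass-world to each pewter-world holds the
    cup's oval shape fixed (so the conditional gets 0.5 on pewter-worlds).
    """
    return total_probability(p_shatter_glass, p_shatter_oval_glass, p_glass)


# Hypothetical sample values, chosen only for illustration.
p_reliabilist = 0.3
p_glass = 0.4

print(material_conditional_credence(p_reliabilist))  # never below 0.9
print(stalnaker_cup_credence(p_glass))               # below 0.95
```

Whatever value is chosen for the credence that S is a Reliabilist, the first function returns at least 0.9, since it averages 0.9 with 1. The second returns less than 0.95 whenever the credence that the cup is glass is below 1.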

¹⁸ There is of course a tricky question of which credence we assign to ‘If the coin had been 0.9 biased towards tails it would have landed tails’, but whatever you think of that question, it’s clear that we do not assign a very high credence to the counterfactual with ‘lands heads’ in its consequent. ¹⁹ It makes no difference at all to the predictions of the material conditional and Suppositional Theory, and will make no difference to my argument that these predictions are wrong in the next section. ²⁰ What about other views concerning the semantics of conditionals? Some notable alternatives that take conditionals to have truth-conditions are ones that take ‘If A then B’ to be true just in case the corresponding material conditional is true in all worlds representing some contextually determined epistemic state (see Kratzer 1986 and Rothschild 2011 for views that take this form). These views are phrased in terms that are too general to predict what they would say about this specific case, but these theories should, I think, have the resources to avoid the problematic predictions here (cf. my discussion in §4). Another kind of view which might be classified as non-propositional is a trivalent account of the conditional (see Belnap 1970 and Rothschild (forthcoming)); again, without further details it is hard to determine what such accounts will predict about this case, but in so far as these views wish to respect Adams’s thesis (as Rothschild (forthcoming) proposes) they will deliver the same prediction as the Suppositional Theory.

4.3 Why the prediction is wrong

In this section I argue that given the case, it is entirely permissible for you to have a credence of less than 0.9 in C. In 4.3.1 and 4.3.2 I argue for this conclusion assuming the Simple Case, and in 4.3.3 I explain how to extend my argument to the Generalized Case. A preliminary about the dialectical force of my argument is in order: my argument in this section does not rely on any particular theory for the semantics of conditionals (after all, what are the correct semantics is precisely what is at stake here). Nor do I maintain that the considerations I rely on are fully consistent with the theories I reject (after all, the conclusion I reach is inconsistent with the prediction of these theories, so we should expect the assumptions in the argument to be inconsistent as well . . . ). Rather, I wish to tease out some pre-theoretic intuitions we have about conditionals which show that the prediction of the three theories is incorrect. I will do so by arguing in turn for two claims:

Claim One: Prior to S’s utterance, it is permissible for you to have a credence which is lower than 0.9 in C.

Claim Two: If your credence prior to the utterance is lower than 0.9, then your credence after the utterance should also be lower than 0.9.

Together, these claims entail that it is permissible for your posterior credence to be lower than 0.9.

4.3.1 Defending Claim One

Claim One is, I take it, highly intuitive. Think of the sorts of situations where one would give a high credence to C: one such situation is where you think that all Reliabilists have high blood pressure. Another, perhaps, is where you think that merely a large majority of Reliabilists have high blood pressure. Other cases involve having high credence in S having some particular feature that is positively correlated, amongst Reliabilists, with high blood pressure (for example, you might be certain that all tall Reliabilists have high blood pressure, and be highly confident that S is tall). Perhaps you also give C high credence in the case where you merely have a very high credence that the conjunction of the antecedent and consequent is true (even if you don’t think there is any interesting connection between the two conjuncts).

The crucial point, however, is that one can easily set up the case so that none of the above apply. Suppose that prior to the utterance you do not have a particularly high credence in either the antecedent or the consequent. Furthermore, you don’t think Reliabilists are particularly prone to high blood pressure and there is no other feature (at least no other feature you have high credence in S possessing) which you think of as positively increasing the likelihood of S’s having high blood pressure (if they are Reliabilists). Perhaps, in the absence of any such factors you would have credence of 0.5 in C, or perhaps you should adopt a much lower credence.²¹ But either way it would be very odd to require that you must have a prior credence of over 0.9 in C.²²

4.3.2 Defending Claim Two

Here is why we should accept Claim Two: assume your prior credence in C is lower than 0.9. After hearing S’s utterance, there are two cases to consider: first, suppose that S is a Randomer. On that hypothesis, the fact that S uttered C gives you no information concerning C, so you should stick to your prior credence in C (which, by assumption, is lower than 0.9). Second, suppose that S is a Reliabilist. On that hypothesis, upon hearing the utterance, your credence in C should be 0.9.²³ But since you are less than certain that S is a Reliabilist, your overall posterior probability should be less than 0.9.

Let me put this line of reasoning in slightly more precise probabilistic terms: let ‘R’ represent the proposition that S is a Reliabilist, and ‘UC’ the proposition that S uttered C. Let ‘Pr’ denote your prior credence function (i.e. your credence function prior to S’s utterance).²⁴ If you update your beliefs using standard conditionalization, your credence in C after the utterance should be $\Pr(C \mid UC)$. But by the Law of Total Probability, $\Pr(C \mid UC) = \Pr(C \mid UC \wedge R) \cdot \Pr(R \mid UC) + \Pr(C \mid UC \wedge \neg R) \cdot \Pr(\neg R \mid UC)$.

²¹ Those who accept conditional excluded middle might be more tempted towards the view that the credence should be 0.5. ²² It is worth noting that each of the theories discussed in §2 can also accept Claim One (though in the case of the material conditional view, one would need to assume that your prior in S being a Randomer isn’t extremely high). ²³ Recall that for now I am focusing on the Simple Case. ²⁴ Note that this is in contrast with the credence functions discussed in §2, which represented your posterior credences.

But first, due to the set-up, $\Pr(C \mid UC \wedge R) = 0.9$. Second, I claim that $\Pr(C \mid \neg R) = \Pr(C)$. Why is that? First, recall that we are discussing your prior credence function, thus S’s uttering C should not make any difference here. One might try to argue that as R appears in the antecedent of C, learning R (and equally, learning $\neg R$) should affect the probability of C. But that is not very plausible: barring very special circumstances, the credence you should assign to a conditional should not depend on the credence you assign to its antecedent.²⁵ For example, consider the conditional ‘If John presses the button, the bomb will explode’. The credence you should assign to this conditional typically depends on factors such as whether you think the button is wired to the bomb, the bomb is live, and so forth. But how likely you think it is that John will press the button should not impact your credence in the conditional (indeed, one of the chief criticisms of the material conditional view is that it blatantly violates this independence). Third, I maintain that $\Pr(C \mid \neg R \wedge UC) = \Pr(C \mid \neg R)$. The reason is that, supposing S is a Randomer, the fact that they uttered C should have no effect on your credence in C (after all, S’s utterance of C is akin to reporting the result of a coin flip). Thus $\Pr(C \mid \neg R \wedge UC) = \Pr(C \mid \neg R) = \Pr(C)$, which by assumption is less than 0.9. Thus as long as $\Pr(R \mid UC) < 1$, we get that $\Pr(C \mid UC) < 0.9$ (see the Appendix for the calculations of the precise probabilities C receives in the Generalized Case).²⁶

Having given my argument for Claim Two, it is instructive to see where the three theories discussed in the previous section get things wrong. Both the Material Conditional view and Stalnaker’s theory get things wrong by assuming that C gets a high probability not only conditional on $UC \wedge R$ but also conditional on $UC \wedge \neg R$.
The Material Conditional view gets this result simply because it assumes that, on your prior credence function, C is not independent of its antecedent: $\Pr(C \mid \neg R) = 1$

²⁵ Cf. Rothschild 2011 and the discussion in 4.4 below. (As I show there, your credence function after conditionalizing on UC constitutes one such special circumstance.) ²⁶ Certain variants of epistemic externalism might maintain that you shouldn’t update your credences by conditionalization in this case: if S is a Reliabilist, you should have a high credence in their utterance (whether or not you are certain that S is a Reliabilist). But even if we accept such views, one can simply stipulate that in the case (unbeknownst to you) S is a Randomer, and thus you ought not to have a high credence in C.

and hence that $\Pr(C \mid UC \wedge \neg R) = 1$. Stalnaker’s view gets things wrong because, given the relevant background assumptions, it entails that $\Pr(C \mid UC \wedge \neg R) = \Pr(C \mid UC \wedge R)$.²⁷ Finally, the Suppositional Theory gets this wrong by assuming that $\Pr(C \mid UC \wedge \neg R)$ is entirely irrelevant to the posterior probability of the conditional (the view takes this probability to be undefined as it involves assessing a conditional whose antecedent is certain to be false), and takes the probability of the conditional to be directly equal to $\Pr(C \mid R)$.²⁸
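The argument for Claim Two in the Simple Case can be illustrated with a short Python sketch. It takes the three steps of the argument as given: $\Pr(C \mid UC \wedge R) = 0.9$, and $\Pr(C \mid UC \wedge \neg R) = \Pr(C \mid \neg R) = \Pr(C)$. The function name and the grid of sample values are hypothetical and serve only as an illustration.

```python
def posterior_credence_simple_case(prior_c, p_reliabilist_given_utterance, k=0.9):
    """Pr(C | UC) as a weighted average of two conditional credences.

    Uses Pr(C | UC and R) = k and Pr(C | UC and not-R) = Pr(C), as argued
    in the text, weighted by Pr(R | UC) and Pr(not-R | UC).
    """
    w = p_reliabilist_given_utterance
    return k * w + prior_c * (1.0 - w)


# Hypothetical grid: priors in C below 0.9, and Pr(R | UC) below 1.
for prior_c in (0.1, 0.5, 0.89):
    for w in (0.0, 0.5, 0.99):
        posterior = posterior_credence_simple_case(prior_c, w)
        # A weighted average of 0.9 and something smaller stays below 0.9.
        assert posterior < 0.9
        print(prior_c, w, round(posterior, 4))
```

The sketch leaves open how $\Pr(R \mid UC)$ is obtained; any value strictly below 1 yields a posterior below 0.9 as long as the prior in C is below 0.9.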

²⁷ Here is another way to highlight why views such as Stalnaker’s, which predict you should give C a high posterior because you give C a high posterior conditional on S being a Reliabilist, are highly problematic. Consider a slightly modified version of the case, where instead of Randomers the island contains ‘Yay-Sayers’, who, whenever asked about a claim p, assent to it. Now suppose you are planning to ask S about C. In this case, $\Pr(C \mid UC \wedge \neg R) = \Pr(C \mid \neg R)$, because the claim that S is not a Reliabilist entails that S is a Yay-Sayer and thus will utter C. But then, if your prior in S being a non-Reliabilist is sufficiently high (e.g. because you think the island contains a lot more Yay-Sayers than Reliabilists), then your prior in C (i.e. your credence in C even before it was uttered) would be extremely high, which seems highly implausible. ²⁸ The fact that, on the Suppositional Theory, conditionals are undefined when conditionalizing on the negation of the antecedent follows from the more general feature that, according to the theory, conditionals are undefined when the antecedent receives probability zero. Here is the typical defence one gets from proponents of the theory for this general feature (from Bennett 2003: 56): “conditionals are devices for intellectually managing states of partial information, and for preparing for the advent of beliefs that one does not currently have. For an A that you regard as utterly ruled out, so that for you $P(A) = 0$, you have no disciplined way of making such preparations, no way of conducting the Ramsey test; you cannot say what the upshot is of adding to your belief system something you actually regard as having no chance of being true”. The problem with this defence is that it is irrelevant to cases (such as the one I am considering in this paper) where you are not genuinely giving the antecedent null credence, but rather you are assessing the conditional under the supposition that the antecedent is false.
In that case you do not (unconditionally) regard the antecedent as having no chance of being true, and there is no reason why you cannot use your (unconditional) epistemic state to assess the plausibility of the conditional. Perhaps a helpful way to see the point is by analogy to the case of epistemic modals. For example, if I am genuinely certain that not-A, then ‘might A’ would trivially get credence zero, but the same would not be obviously true if I am merely supposing not-A. (Consider a case where I know that the murderer is either Jack or Jill but I don’t know which. The following seems to have an acceptable reading: “Suppose Jack is not the murderer but Jill framed him by leaving lots of misleading evidence that he is. So each of Jack and Jill might be the murderer” . . . Relatedly, if I am genuinely certain that Jack is not the murderer, then plausibly we cannot define my conditional probability of Jack going to jail conditional on the claim that he might be the murderer. But if I have a non-zero credence that Jack is the murderer, that arguably does not bar me from having some conditional probability on Jack going to jail, conditional on the supposition that he is not the murderer but might be the murderer . . . )

4.3.3 The Generalized Case

So far, I have discussed the Simple Case. As we have seen, the posterior probability of C is the weighted average of $\Pr(C \mid UC \wedge R)$ and $\Pr(C \mid UC \wedge \neg R)$. In the Simple Case, we know that the former is exactly 0.9 and the latter is lower than 0.9, and thus neither the precise values of these two nor the precise way they are weighted mattered to the defence of Claim Two. The same is not true for the Generalized Case: since in the Generalized Case $\Pr(C \mid UC \wedge R)$ might be higher than 0.9, the weighted average might end up being higher than 0.9. However, even in the Generalized Case, the overall posterior probability of the conditional will be lower than 0.9 provided that either your prior in C was sufficiently low (and thus $\Pr(C \mid UC \wedge \neg R)$ is equally low) or your prior in S being a Randomer was sufficiently high (and thus $\Pr(\neg R \mid UC)$ is sufficiently high).

In the Appendix, I give a full probabilistic analysis of the case. One can substitute various values, but here are a couple of sample values that are of interest. (See the Appendix for a full explanation of these results.) Let us suppose that $\Pr(C \mid UC \wedge R) = 1$ (this is the “worst case scenario” for the argument in the Generalized Case). If your prior in the conditional is 0.5 (or lower) then any distribution of the islanders in which more than 20% of them are Randomers would suffice for the posterior probability of C to be lower than 0.9. Alternatively, if your prior distribution has 95% of the islanders be Randomers, then as long as your prior in the conditional is lower than 0.890626 the posterior would still be lower than 0.9. Since, on the one hand, I can set the case to involve whatever (non-trivial) distribution of islanders we wish, and on the other hand, the same considerations in 4.3.1 suggest that our prior in C should not be higher than 0.5 (and certainly not as high as 0.890626!), the argument in the Generalized Case goes through as well.²⁹
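The two sample values quoted for the worst-case scenario can be reproduced with a short Python sketch. The likelihoods used below are assumptions made for this illustration and are not taken from the Appendix: a Randomer is assumed to utter C with probability 0.5 whether or not C is true, and a Reliabilist in the worst case is assumed to utter C exactly when C is true, so that $\Pr(UC \mid R) = \Pr(C)$. The function names are likewise hypothetical. Under these assumptions the sketch agrees with the figures in the text up to rounding.

```python
def posterior_worst_case(c, r, p_randomer_utters=0.5):
    """Pr(C | UC) when Pr(C | UC and R) = 1.

    c is the prior credence in C and r the prior credence that S is a
    Reliabilist. Assumed likelihoods (not from the text):
      Pr(UC | R)     = c    (a Reliabilist utters C exactly when C is true)
      Pr(UC | not-R) = 0.5  (a Randomer's utterance is like a coin flip)
    """
    p_uc_and_r = r * c
    p_uc_and_not_r = (1.0 - r) * p_randomer_utters
    p_r_given_uc = p_uc_and_r / (p_uc_and_r + p_uc_and_not_r)  # Bayes' theorem
    # Weighted average of Pr(C | UC and R) = 1 and Pr(C | UC and not-R) = c.
    return 1.0 * p_r_given_uc + c * (1.0 - p_r_given_uc)


def prior_threshold(r):
    """Prior c at which the posterior equals 0.9 for a given r.

    Obtained by solving posterior_worst_case(c, r) = 0.9 for c with the
    default coin-flip likelihood of 0.5.
    """
    return 0.45 * (1.0 - r) / (0.5 - 0.4 * r)


print(posterior_worst_case(0.5, 0.79))  # just over 20% Randomers: below 0.9
print(posterior_worst_case(0.5, 0.80))  # exactly 20% Randomers: the boundary
print(prior_threshold(0.05))            # 95% Randomers: threshold on the prior
```

With other assumptions about how Reliabilists and Randomers come to utter C, the thresholds would differ, so the sketch should be read as one consistent reconstruction and not as the Appendix's own derivation.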

²⁹ Moreover, consider the conditional C*: ‘If S is a Reliabilist, then S does not have high blood pressure’. If one accepts Conditional Excluded Middle, then at least one of C and C* should have a probability less than or equal to 0.5, and one can easily run the argument with S uttering C* instead of C.

4.4 Lessons and conclusions

4.4.1 The semantics of conditionals

The main lesson of this paper concerns the semantics of indicative conditionals: each of the three prominent theories has been shown to generate the wrong prediction in this case, and any adequate theory needs to provide the correct prediction. Moreover, my discussion provides a novel kind of counterexample to Adams’s thesis.³⁰ This is important, first, because Adams’s thesis (or at least a restricted form of Adams’s thesis) is a consequence of a range of theories. But second, rejecting Adams’s thesis doesn’t only provide a general objection to views such as the Suppositional Theory. Rather it undermines the central motivation for adopting such non-propositional views (the idea was that in light of the triviality results, one can only maintain Adams’s thesis by rejecting the claim that conditionals have truth-conditions, but if Adams’s thesis should anyhow be rejected, this motivation for going non-propositional no longer holds any sway).

This point also connects to a more specific issue concerning the semantics of conditionals. Rothschild recently argued that Lewis’s triviality proof crucially relies on the following assumption: an indicative conditional is probabilistically independent of its antecedent, i.e. $\Pr(\text{If } A \text{ then } B \mid A) = \Pr(\text{If } A \text{ then } B)$.³¹ Moreover, Rothschild maintains that while the assumption that the conditional is probabilistically independent of its antecedent is in most cases correct, it is not always correct: there are some counterexamples to this assumption and these counterexamples make for cases where Adams’s thesis fails. Rothschild’s discussion is very interesting, but I think the specific counterexample to independence he proposes is not entirely convincing (indeed, he himself expresses some doubts about it³²).
Here is his example: suppose you know the following: a certain type of car either has a defect in the airbag system, in which case whenever the car crashes the airbag fails to inflate; or it lacks the defect, in which case whenever

³⁰ See the discussion in n. 31 below on how my case differs from those of Kaufmann (2004) and Rothschild (2011). ³¹ That is, Lewis’s proof relies on Adams’s thesis, but the thesis that is used in Lewis’s proof amounts to accepting the weaker independence claim (see Rothschild 2011). ³² See his n. 31.

the car crashes the airbag inflates. You are confronted with a car of the relevant type, but you do not know whether the car has the defect or not, or whether or not the car will crash. It seems that your credence in the conditional ‘If this car crashes, the airbag will not inflate’ is entirely independent of the credence you give to the antecedent (whether or not you assume the car in fact crashes, your credence in the conditional will simply be your credence in the car having the defect). But now suppose that you also know that cars which have the defect are also somewhat more likely to crash. Rothschild maintains that in that case the probability of the conditional is no longer independent of its antecedent. His reasoning is roughly the following: the probability you assign to the conditional is simply the probability you assign to the car having a defect. But if we conditionalize on the car crashing, then your credence in the car having the defect (and hence in the conditional) increases.

The problem with this reasoning, however, is that it is not obvious that in the revised case (where you know cars with the defect are more likely to crash) the probability that you assign to the conditional (before conditionalizing on the antecedent) is still equal to the probability that you assign to the car having the defect: it is hard to know exactly how to assess such conditionals (and, as Rothschild suggests, they may well have different readings in different contexts), but it’s not implausible that (on at least one reading) your probability in C already takes into account the thought that situations in which the car crashes are situations where it is more likely to have the defect. Putting things otherwise: we know that $\Pr(C) = \Pr(C \mid \text{crashing}) \cdot \Pr(\text{crashing}) + \Pr(C \mid \text{not crashing}) \cdot \Pr(\text{not crashing})$, so the crucial question is whether $\Pr(C \mid \text{not crashing}) < \Pr(C \mid \text{crashing})$, and intuitions might go either way in this case.
Interestingly, though, the case I propose in this paper provides a cleaner counterexample to independence: since your posterior probability in C is (assuming a set-up as in 4.3.1) lower than 0.9, but your probability in C conditional on its antecedent is at least 0.9, independence fails. And this example seems to be clearer, because here we do have a specific argument for why the probability of C conditional on the negation of its antecedent should be low.³³

³³ Rothschild’s case also seems to be an instance of the challenges to Adams’s thesis which are raised in Kaufmann 2004. But the case presented in this paper seems crucially different from Kaufmann’s cases. On the Kaufmann-style cases, the probability of the

4.4.2 Conditional acceptance in philosophical arguments

I want to conclude with a somewhat more tentative note on some implications that this case might have for issues that go beyond the semantics of conditionals. Conditionals play a crucial role in philosophical arguments: often, philosophers find it difficult to defend outright philosophical views, and are much more confident about conditional claims of the form ‘If this assumption is correct, then these consequences follow’. In particular, philosophers might be interested in conditionals where the antecedent recommends some logical theory, method of reasoning, or procedure for justification, and the consequent draws some conclusion from adopting this theory or method. But while many such conditionals are surely acceptable, one should note that these types of conditionals are precisely at risk of having a similar structure to the one presented in this paper: if the reasoning which leads us from the antecedent to the consequent relies on employing the very method recommended by the antecedent, then our acceptance of the conditional will not be independent of our acceptance of the antecedent, and adopting such conditionals would not be neutral between all parties to the debate. (These will precisely be cases where one’s credence in the conditional, conditional on the negation of the antecedent, might be low, and those parties who have a very low prior in

conditional is basically calculated as the conditional probability, except that one effectively assumes that some variable of the case is held fixed to the way it actually is (and thus that the probability of that variable is calculated via actual probabilities, rather than probabilities conditional on the antecedent). To take Rothschild’s example above, the relevant variable is the car having or lacking a defect. Thus assume for example that you have a credence of 0.5 in the car having the defect unconditionally, and of 0.9 in its having the defect conditional on its crashing (and you are certain that when the car crashes, the airbag inflates if and only if it lacks the defect). Your conditional probability of the consequent on the antecedent is calculated as: $\Pr(\text{not inflate} \mid \text{crashes} \wedge \text{defect}) \cdot \Pr(\text{defect} \mid \text{crashes}) + \Pr(\text{not inflate} \mid \text{crashes} \wedge \text{no defect}) \cdot \Pr(\text{no defect} \mid \text{crashes}) = 1 \cdot 0.9 + 0 \cdot 0.1 = 0.9$. But if we’re holding fixed the question of whether or not the car has a defect, then the probability of the conditional would be calculated as $\Pr(\text{not inflate} \mid \text{crashes} \wedge \text{defect}) \cdot \Pr(\text{defect}) + \Pr(\text{not inflate} \mid \text{crashes} \wedge \text{no defect}) \cdot \Pr(\text{no defect}) = 1 \cdot 0.5 + 0 \cdot 0.5 = 0.5$. The crucial point is that the failure of Adams’s thesis that is presented in this paper cannot be subsumed under the same analysis. The only relevant variable that one might try to “hold fixed” is the question of whether or not S has high blood pressure. But, as with the discussion of Stalnaker’s view in 4.2, we simply assume that your credence in Randomers having high blood pressure is extremely high (if you want, 1), in which case holding fixed S’s blood pressure will not help explain the particular failure of Adams’s thesis argued for in 4.3.
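The contrast between the two calculations in this note can be made concrete with a small Python sketch. Both readings use the same likelihoods for the airbag failing to inflate; they differ only in whether the defect variable is weighted by its probability conditional on the crash or by its unconditional probability. The names below are hypothetical and introduced only for illustration.

```python
# Probability that the airbag fails to inflate in a crash, given each
# hypothesis about the defect (values as stipulated in the example).
P_NOT_INFLATE = {"defect": 1.0, "no defect": 0.0}


def airbag_conditional(p_defect_weight):
    """Probability of 'If this car crashes, the airbag will not inflate',
    averaging over the defect variable with the supplied weight on 'defect'.
    """
    return (P_NOT_INFLATE["defect"] * p_defect_weight
            + P_NOT_INFLATE["no defect"] * (1.0 - p_defect_weight))


# Conditional-probability reading: weight by Pr(defect | crashes) = 0.9.
conditional_reading = airbag_conditional(0.9)

# Held-fixed reading: weight by the unconditional Pr(defect) = 0.5.
held_fixed_reading = airbag_conditional(0.5)

print(conditional_reading, held_fixed_reading)
```

The gap between the two values is exactly the gap between the conditional and unconditional credence in the defect, which is why holding a variable fixed explains the Kaufmann-style cases.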

OUP CORRECTED PROOF – FINAL, 24/1/2019, SPi

 



the antecedent might thus have a low overall credence in the conditional as a whole.)³⁴ Thus while the example presented in this paper might initially appear somewhat recherché, it has profound implications both for the semantics of conditionals as well as for philosophical arguments more generally.³⁵

³⁴ One particular case of this sort involves a variant of an argument proposed by Dummett against strict finitism (see Dummett 1975, Magidor 2007, and Magidor 2012). Strict finitism is an extreme form of constructivism in the philosophy of mathematics, which takes as its key notion constructibility in practice. One upshot of the view is that proof-skeletons that would be too long to fill out in practice (e.g. ones that would take 2¹⁰⁰ steps to fill out) do not count as legitimate proofs by the finitist’s lights. Call a proof-skeleton ‘short enough’ (SE) if it is short enough to be filled out in practice. I cannot go into the details here, but it turns out that one can provide, for each n, an allegedly finitistically acceptable argument for the conditional claim ‘If SE(2ⁿ) then SE(2ⁿ⁺¹)’, and that, moreover, accepting these conditionals leads to contradiction, even by the finitist’s own lights. The interesting fact, though, is the form that each of these arguments takes: it starts with the supposition that SE(2ⁿ), then uses 2ⁿ steps, and reaches the conclusion that SE(2ⁿ⁺¹). The idea is that since each argument uses 2ⁿ steps, then under the supposition of the antecedent (namely that 2ⁿ is short enough), the argument should be finitistically acceptable and thus the finitist should accept the consequent. My contention, however, is that the finitist should not accept the conditional on this basis, for precisely the same reason that you are not required to accept C in the current case: since for some values of n they don’t in fact accept that SE(2ⁿ), they shouldn’t accept the relevant conditional.

³⁵ Thanks to audiences in Carnegie Mellon, Krakow, London, NYU, Rutgers, and York as well as to Cian Dorr, John Hawthorne, Nicholas Jones, Sarah Moss, Daniel Rothschild, James Studd, and Timothy Williamson for helpful discussion of this paper. Thank you also to the Leverhulme Trust for their financial support.

4.5 Appendix: A probabilistic analysis of the case

In this Appendix I show how to calculate the precise probability of C in this case. Let ‘Pr’ represent your prior probability function. As above, ‘R’ represents the proposition that S is a Reliabilist and ‘UC’ the proposition that S uttered C. Let r be what you assume is the proportion of Reliabilists on the island (i.e. $r = \Pr(R)$), and c your prior credence in C (i.e. $c = \Pr(C)$). Finally, let k be your credence in C assuming that it was uttered by a Reliabilist, i.e. $k = \Pr(C \mid UC \wedge R)$. In the Simple Case we know that $k = 0.9$, but in the Generalized Case we merely know that $k \geq 0.9$.³⁶

As discussed in 4.3 above, your posterior credence in C should be $\Pr(C \mid UC) = \Pr(C \wedge UC) / \Pr(UC)$. For simplicity of the discussion, assume that you are certain in advance that S will utter either C or ¬C.³⁷ Now let us partition the space where UC is true into four regions: UC ∧ C ∧ R; UC ∧ C ∧ ¬R; UC ∧ ¬C ∧ R; UC ∧ ¬C ∧ ¬R. Note that $\Pr(C \wedge UC)$ is the sum of the probabilities of the first two regions, and that $\Pr(UC)$ is the sum of the probabilities of all four regions. Now let’s calculate the probabilities of each of these regions in turn.

Step 1: calculating UC ∧ C ∧ R

We start by noting that $\Pr(UC \wedge C \wedge R) = \Pr(UC \wedge C \mid R) \cdot \Pr(R)$. Now, let $\Pr_R$ be the probability function we get from Pr after conditionalizing on R. So $\Pr(UC \wedge C \mid R) = \Pr_R(UC \wedge C) = \Pr_R(C \mid UC) \cdot \Pr_R(UC)$. But we know from the set-up of the case that $\Pr_R(C \mid UC) = k$, so $\Pr_R(UC \wedge C) = k \cdot \Pr_R(UC)$. We also know that $\Pr_R(C \mid \neg UC) = 1 - k$ (because if S does not utter C, S utters ¬C, and then ¬C has a probability of k; see n. 36). Finally, we know that $\Pr_R(C \wedge \neg UC) = \Pr_R(C) - \Pr_R(C \wedge UC)$. So we have:

$$1 - k = \Pr_R(C \mid \neg UC) = \frac{\Pr_R(C \wedge \neg UC)}{\Pr_R(\neg UC)} = \frac{\Pr_R(C) - \Pr_R(C \wedge UC)}{1 - \Pr_R(UC)} = \frac{\Pr_R(C) - k \cdot \Pr_R(UC)}{1 - \Pr_R(UC)}$$

We thus get that $\Pr_R(UC) = (\Pr_R(C) + k - 1)/(2k - 1)$. But, as argued in §3.2, on your prior credence function the conditional is independent of the antecedent, so $\Pr_R(C) = \Pr(C \mid R) = \Pr(C) = c$.

³⁶ I will not assume that your credence in a proposition conditional on its being uttered by a Reliabilist is always uniform, but to simplify the calculations I will assume that the probability of ¬C conditional on a Reliabilist uttering ¬C is also k.

³⁷ This assumption makes no difference at all to the calculations. Let ‘DC’ be the proposition that S discussed C (i.e. uttered either C or ¬C). Then since UC entails DC, $\Pr(UC) = \Pr(UC \mid DC) \cdot \Pr(DC)$ and $\Pr(UC \wedge C) = \Pr(UC \wedge C \mid DC) \cdot \Pr(DC)$, so

$$\Pr(C \mid UC) = \frac{\Pr(UC \wedge C \mid DC) \cdot \Pr(DC)}{\Pr(UC \mid DC) \cdot \Pr(DC)} = \frac{\Pr(UC \wedge C \mid DC)}{\Pr(UC \mid DC)},$$

so we can assume that we have conditionalized on DC.
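The solution for $\Pr_R(UC)$ in Step 1 can be checked numerically. The following Python sketch is an illustration only and not part of the original analysis: the function name `prob_uc_given_reliabilist` and the sample values c = 3/5 and k = 9/10 are assumptions chosen for the example. It computes $\Pr_R(UC)$ from the closed form and confirms that the result satisfies the Step 1 equation.

```python
from fractions import Fraction


def prob_uc_given_reliabilist(c_r, k):
    """Return Pr_R(UC), i.e. the solution u of (1 - k) = (c_r - k*u) / (1 - u).

    c_r stands for Pr_R(C) and k for Pr_R(C | UC); both names are
    hypothetical labels introduced for this sketch.
    """
    if 2 * k == 1:
        # With k = 1/2 the equation no longer determines u.
        raise ValueError("k = 1/2 leaves Pr_R(UC) undetermined")
    return (c_r + k - 1) / (2 * k - 1)


# Assumed sample values, using exact rational arithmetic.
c_r = Fraction(3, 5)
k = Fraction(9, 10)
u = prob_uc_given_reliabilist(c_r, k)

# Substitute back into the Step 1 equation: Pr_R(C | not-UC) should be 1 - k.
assert (c_r - k * u) / (1 - u) == 1 - k
print(u)  # prints 5/8
```

For k above 0.5 the expression yields a value between 0 and 1 only when 1 − k ≤ Pr_R(C) ≤ k, so other sample inputs should be chosen within that range.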

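Thus $\Pr_R(UC) = (c + k - 1)/(2k - 1)$. And to conclude we have:

$$\Pr(UC \wedge C \wedge R) = \Pr(UC \wedge C \mid R) \cdot \Pr(R) = k \cdot \Pr_R(UC) \cdot r = k \cdot \frac{c + k - 1}{2k - 1} \cdot r$$

Step 2: calculating UC ∧ C ∧ ¬R

We start by noting that $\Pr(C \wedge UC \wedge \neg R) = \Pr(UC \mid C \wedge \neg R) \cdot \Pr(C \wedge \neg R)$. Now, on the supposition that S is not a Reliabilist, S is a Randomer and thus flips a fair coin in order to decide whether to utter C or ¬C. So $\Pr(UC \mid C \wedge \neg R) = 0.5$. Moreover, since (as argued above) on the prior credence function C and R are independent, $\Pr(C \wedge \neg R) = \Pr(C) \cdot \Pr(\neg R)$. Finally, we know that $\Pr(\neg R) = 1 - r$. So we conclude that:

$$\Pr(C \wedge UC \wedge \neg R) = 0.5 \cdot c \cdot (1 - r)$$

Step 3: calculating UC ∧ ¬C ∧ R

$\Pr(UC \wedge \neg C \wedge R) = \Pr(UC \wedge \neg C \mid R) \cdot \Pr(R)$. But using our calculations in Step 1, we have $\Pr(UC \wedge \neg C \mid R) = \Pr_R(UC \wedge \neg C) = \Pr_R(\neg C \mid UC) \cdot \Pr_R(UC) = (1 - k) \cdot (c + k - 1)/(2k - 1)$. So, since $\Pr(R) = r$, we conclude that:

$$\Pr(UC \wedge \neg C \wedge R) = (1 - k) \cdot \frac{c + k - 1}{2k - 1} \cdot r$$

Step 4: calculating UC ∧ ¬C ∧ ¬R

Using similar considerations to those in Step 2 we get:

$$\Pr(UC \wedge \neg C \wedge \neg R) = \Pr(UC \mid \neg C \wedge \neg R) \cdot \Pr(\neg C \wedge \neg R) = \Pr(UC \mid \neg C \wedge \neg R) \cdot \Pr(\neg C) \cdot \Pr(\neg R) = 0.5 \cdot (1 - c) \cdot (1 - r)$$

Step 5: calculating Pr(C | UC)

Putting everything together we get:

$$\Pr(C \mid UC) = \frac{k \cdot \frac{c + k - 1}{2k - 1} \cdot r + 0.5 \cdot c \cdot (1 - r)}{k \cdot \frac{c + k - 1}{2k - 1} \cdot r + 0.5 \cdot c \cdot (1 - r) + (1 - k) \cdot \frac{c + k - 1}{2k - 1} \cdot r + 0.5 \cdot (1 - c) \cdot (1 - r)}$$

This ratio is somewhat complicated, but can be simplified once particular values are substituted for c, k, and r. The ratio simplifies particularly nicely if we assume that $k = 1$ (as in the most difficult version of the Generalized Case). In that case we get:

$$\Pr(C \mid UC) = \frac{c \cdot r + 0.5 \cdot c \cdot (1 - r)}{c \cdot r + 0.5 \cdot c \cdot (1 - r) + 0.5 \cdot (1 - c) \cdot (1 - r)} = \frac{c \cdot r + c}{2c \cdot r + 1 - r}$$

It is now easy to see why (as noted at the end of §3.3), if $c = 0.5$ then $\Pr(C \mid UC) < 0.9$ just in case $r < 0.8$ (the ratio reduces to $(r + 1)/2$), and that if $r = 0.05$ then $\Pr(C \mid UC) < 0.9$ just in case $c < 0.890625$ (the ratio reduces to $1.05c/(0.1c + 0.95)$, which is below 0.9 just in case $0.96c < 0.855$).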
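The calculation in Steps 1 to 5 can also be sketched in code as a rough check on the thresholds just stated. This Python sketch is illustrative and not part of the original analysis: the function names `posterior_c_given_uc` and `posterior_k_one` are hypothetical, and exact rational inputs are assumed so that the boundary cases come out exactly. The third region is weighted by Pr(R) = r, following the first line of Step 3.

```python
from fractions import Fraction

HALF = Fraction(1, 2)  # a Randomer's fair coin


def posterior_c_given_uc(c, k, r):
    """Pr(C | UC), assembled from the four regions of Steps 1-4."""
    u = (c + k - 1) / (2 * k - 1)            # Pr_R(UC), from Step 1
    uc_c_r = k * u * r                       # Step 1: UC & C & R
    uc_c_not_r = HALF * c * (1 - r)          # Step 2: UC & C & not-R
    uc_not_c_r = (1 - k) * u * r             # Step 3: UC & not-C & R
    uc_not_c_not_r = HALF * (1 - c) * (1 - r)  # Step 4: UC & not-C & not-R
    pr_c_and_uc = uc_c_r + uc_c_not_r
    pr_uc = pr_c_and_uc + uc_not_c_r + uc_not_c_not_r
    return pr_c_and_uc / pr_uc


def posterior_k_one(c, r):
    """The simplified ratio for k = 1."""
    return (c * r + c) / (2 * c * r + 1 - r)


one = Fraction(1)
# Boundary case c = 0.5, r = 0.8.
print(posterior_c_given_uc(Fraction(1, 2), one, Fraction(4, 5)))    # prints 9/10
# Boundary case r = 0.05, c = 0.890625 (= 57/64).
print(posterior_c_given_uc(Fraction(57, 64), one, Fraction(1, 20)))  # prints 9/10
```

Both boundary cases land exactly on 0.9, so any smaller r (with c = 0.5) or smaller c (with r = 0.05) gives a posterior below 0.9, in line with the text.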

References

Adams, E. (1975). The Logic of Conditionals: An Application of Probability to Deductive Logic. Dordrecht: Springer.
Belnap, N. (1970). ‘Conditional assertion and restricted quantification’, Noûs 4: 1–12.
Bennett, J. (2003). A Philosophical Guide to Conditionals. Oxford: Oxford University Press.
Dummett, M. (1975). ‘Wang’s Paradox’, Synthese 30: 301–23.
Edgington, D. (1995). ‘On conditionals’, Mind 104: 239–73.
Edgington, D. (2009). ‘Conditionals’, in Edward Zalta (ed.), The Stanford Encyclopedia of Philosophy, Spring 2009 edition.
Grice, P. (1989). ‘Logic and conversation’, in his Studies in the Way of Words. Cambridge, MA: Harvard University Press.
Jackson, S. F. (1987). Conditionals. Oxford: Basil Blackwell.
Kaufmann, S. (2004). ‘Conditioning against the grain: abduction and indicative conditionals’, Journal of Philosophical Logic 33: 583–606.
Kratzer, A. (1986). ‘Conditionals’, Chicago Linguistics Society 22: 1–15.
Lewis, D. (1976). ‘Probabilities of conditionals and conditional probabilities’, Philosophical Review 85: 297–315.
Magidor, O. (2007). ‘Strict finitism refuted?’, Proceedings of the Aristotelian Society 107: 403–11.
Magidor, O. (2012). ‘Strict finitism and the happy sorites’, Journal of Philosophical Logic 41: 471–91.
Putnam, H. (1981). Reason, Truth and History. Cambridge: Cambridge University Press.
Rothschild, D. (2011). ‘Do indicative conditionals express propositions?’, Noûs 47: 49–68.
Rothschild, D. (forthcoming). ‘Capturing the connection between conditionals and conditional probability in trivalent semantics’, forthcoming in Journal of Applied Non-Classical Logics.
Stalnaker, R. (1968). ‘A theory of conditionals’, in Nicholas Rescher (ed.), Studies in Logical Theory, 98–112. Oxford: Blackwell.
Stalnaker, R. (1975). ‘Indicative conditionals’, Philosophia 5: 269–86.
Stalnaker, R. (1999). Context and Content. Oxford: Oxford University Press.


5 Frege’s Begriffsschrift Theory of Identity Vindicated Ulrich Pardey and Kai F. Wehmeier

5.1 Introduction

The publication of Begriffsschrift in 1879 marks the birth of modern quantificational logic. This achievement of Frege’s is, of course, widely admired. One doctrine put forth in that work, however, has met with universal rejection by logicians and philosophers of language alike, to wit, the theory of identity that is the subject of §8. Here is Frege’s own formulation:

Equality of content differs from conditionality and negation by relating to names, not to contents. Whereas elsewhere signs are merely representatives of their content, so that every combination into which they enter expresses only a relation between their contents, they suddenly display their own selves as soon as they are combined by the sign for equality of content; for this signifies the circumstance that two names have the same content. Hence the introduction of a sign for identity of content necessarily produces a bifurcation in the meaning of all signs: they stand at times for their content, at times for themselves. [ . . . ] Now let ⊢ (A ≡ B) mean that the sign A and the sign B have the same conceptual content [ . . . ] (Begriffsschrift §8; emphasis in the original)

Thus in an identity statement of the form ⌜a ≡ b⌝, the names a and b do not designate the objects they ordinarily represent, but rather themselves.¹ The sentence ‘Hesperus is Phosphorus’ is, accordingly, in the first place about the names ‘Hesperus’ and ‘Phosphorus,’ of which it predicates co-reference. In other words, ‘Hesperus is Phosphorus’ says that some object is designated both by ‘Hesperus’ and by ‘Phosphorus.’ The sentence ‘Hesperus has greater mass than the Earth,’ by contrast, is about the planet Venus, that is, the object bearing the name ‘Hesperus,’ and attributes a certain property to that object. So depending on its context of occurrence, ‘Hesperus’ sometimes refers to itself (namely, in identity contexts) and sometimes to Venus (namely, in ordinary contexts).²

It has long been thought that such a view of identity is incompatible with the role that Frege accords the identity sign in the technical development of his logic. As we shall see presently, there are two distinguishable, if closely related, versions of this objection, one alleging a confusion of use and mention, the other an incompatibility with quantification. As far as we are aware, Alonzo Church is the first author to diagnose a serious problem with Frege’s Begriffsschrift theory of identity when he writes that

[if ] use and mention are not to be confused, the idea of identity as a relation between names renders a formal treatment of the logic of identity all but impossible. (Church 1951: 3)

Roger White provides more detail:

In the prose introduction to the Begriffsschrift, [Frege] proposes a theory for identity, whereby, to use his later terminology, proper names have a dual rôle: to stand for objects and to stand for themselves; in an identity proposition they are alleged to have this second rôle. But the actual body of the work belies this prose introduction. If we consider the crucial axioms for identity (52) and (53) [ . . . ], then if the identity sign is regarded as a two-place predicate flanked by proper names standing for their normal references, everything here proceeds smoothly and straightforwardly. But if attempt [sic] to parse the formulae according to the prose introduction, everything becomes opaque. If that is we consider such formulae as (52) and (55), then (52) becomes immediately surrounded by obscurities [ . . . ]. If we adapt Frege’s notation, (52) runs (c = d) ⊃ (f(c) ⊃ f(d)). Now in this the ‘c’ of the antecedent is alleged to have a different reference from the ‘c’ of the consequent, and hence it would appear that [ . . . ] we have a formula employing a sign ambiguously and incapable of being read. The shift from mentioning a sign in the antecedent to using it in the consequent seems to make a uniform construal of the formula impossible [ . . . ]. So the moment we try to read into his actual use of the identity sign within his formulae his official account of identity, we run into a whole thicket of difficulties about ambiguity, about use and mention of signs. (White 1977–8: 157–8)

¹ We are assuming here that the Fregean content of a name is the object the name ordinarily designates. This reading of Begriffsschrift seems to be almost universally shared by Frege scholars. For a lone dissenting voice, see May 2012. In Begriffsschrift, though not in his later works, Frege uses the triple bar, ‘≡’, as the identity sign. As is evident from the quotations below, most of the secondary literature on the Begriffsschrift view of identity departs from Frege’s usage and employs the usual double bar instead. Except in quotations, we will use the triple bar for the identity predicate of the object language under discussion and reserve the double bar for the identity predicate of the metalanguage.

² A note on our use of devices of quotation: an expression that begins with a single opening quote and ends with a single closing quote designates the expression that occurs between those two quotes. Quinean quasi-quotation (corner quotes) is used when we wish to speak of the result of inserting unspecified expressions into blanks of a specified context; e.g. for any names a and b, ⌜a ≡ b⌝ is the result of first writing the name a, then the identity sign ‘≡’, and then the name b, whereas ‘a ≡ b’ would simply be the expression consisting of the three symbols ‘a’, ‘≡’, and ‘b’, in that order. See Quine (1940: §6).

Christopher Williams quotes White approvingly and adds:

[The assumption that identity is a relation between signs,] although it is explicitly made in the introductory ‘Explanation of the Symbols’, is not taken account of in the formal development of the Begriffsschrift itself. Indeed it is actually incompatible with certain of the axioms and theorems of the system. The schema (x = y) → (φx ↔ φy) requires for its validity, and indeed for its intelligibility, that substitutions for x and y have the same meaning in the consequent as they have in the antecedent. They will not have this if Frege’s remarks about equality being a relation between signs are taken to apply to this schema: substitutions for x and y will be names which in the antecedent are their own meanings, but in the consequent have their customary meanings, the things they normally name. (Williams 1989: 22–3)

While Church, White, and Williams focus on a purported violation of the use–mention distinction, a second line of attack against Frege’s Begriffsschrift view is predicated on an alleged problem with quantifying into contexts that have the bound variable occurring in both identity and ordinary contexts. We find this objection formulated in nuce by Montgomery Furth, according to whom Frege’s view renders it practically impossible to integrate the theory of identity into the formalized object-language itself; e.g. to state generally such a law as that if F(a) and a = b then F(b). (Furth 1964: xix)

Furth does not explain what precisely the problem is supposed to be. We take his reference to the difficulty of providing a general statement of Leibniz’s law to indicate that it pertains to the use of bound variables in identity contexts, a point explicitly made by Michael Dummett:

In Begriffsschrift Frege held that identity was a relation between names and not between things. [ . . . ] but [this view] makes nonsense of the use of bound variables on either side of the sign of identity. (Dummett 1973: 544)

But how exactly is the view supposed to make bound variables flanking the identity sign nonsensical? Richard Heck explains this as follows:

[Frege] says in §8 that ‘all signs . . . stand at times for their content, at times for themselves,’ but he seems unaware how problematic that statement is. He never makes any effort to explain what precisely it might mean or, crucially, how it might be understood when these signs are not names but variables. Indeed, if I had to guess, I would say that Frege’s dissatisfaction with his old theory of what identity is emerged from reflection on just this sort of problem. As a theory of the semantics of identity, the Begriffsschrift view is completely inadequate. The identity-sign does not occur only in construction with names, but also in construction with variables. So how is a sentence such as ‘(∀x)(x = 0 → Fx)’ to be understood, on the Begriffsschrift view? What precisely is the variable supposed to range over? Its two occurrences—once as argument of the identity-sign, once as argument of ‘Fξ’—place incompatible demands on its range if identity is a relation between names. (Heck 2003: 86–7)

We have, then, essentially two (as we shall see, related) objections to Frege’s view:

1. Church, White, Williams: the Begriffsschrift conception of identity leads to confusion of use and mention.
2. Furth, Dummett, Heck: the Begriffsschrift conception of identity is incompatible with quantification.

One main result of our discussion will be that neither of these objections has any force. This result also has consequences for an exegetical issue touched upon by Heck in the above quote, namely the proper interpretation of the first paragraph in Frege’s ‘On Sense and Reference’ (Frege 1892). Thau and Caplan (2001) have argued that, contrary to received wisdom, Frege does not there abandon his Begriffsschrift view of identity in favor of the adoption of an objectual identity relation. Heck (2003) puts forth several arguments against their reading, one of which we have just seen. To recall:

Indeed, if I had to guess, I would say that Frege’s dissatisfaction with his old theory of what identity is emerged from reflection on just this sort of problem. (Heck 2003: 86–7)

Given that there is no such ‘sort of problem,’ this line of attack against Thau and Caplan’s reading would seem to be invalidated. But in addition to these exegetical issues, the viability of Frege’s Begriffsschrift conception also has systematic consequences that are relevant to historical and contemporary debates. Current philosophical orthodoxy has it that, just like ‘