
Studies in Computational Intelligence 860

Roussanka Loukanova Editor

Logic and Algorithms in Computational Linguistics 2018 (LACompLing2018)

Studies in Computational Intelligence Volume 860

Series Editor Janusz Kacprzyk, Polish Academy of Sciences, Warsaw, Poland

The series “Studies in Computational Intelligence” (SCI) publishes new developments and advances in the various areas of computational intelligence—quickly and with high quality. The intent is to cover the theory, applications, and design methods of computational intelligence, as embedded in the fields of engineering, computer science, physics and life sciences, as well as the methodologies behind them. The series contains monographs, lecture notes and edited volumes in computational intelligence spanning the areas of neural networks, connectionist systems, genetic algorithms, evolutionary computation, artificial intelligence, cellular automata, self-organizing systems, soft computing, fuzzy systems, and hybrid intelligent systems. Of particular value to both the contributors and the readership are the short publication timeframe and the world-wide distribution, which enable both wide and rapid dissemination of research output. The books of this series are submitted for indexing to Web of Science, EI-Compendex, DBLP, SCOPUS, Google Scholar and SpringerLink.

More information about this series at http://www.springer.com/series/7092

Roussanka Loukanova Editor

Logic and Algorithms in Computational Linguistics 2018 (LACompLing2018)


Editor Roussanka Loukanova Department of Mathematics Stockholm University Stockholm, Sweden

ISSN 1860-949X ISSN 1860-9503 (electronic) Studies in Computational Intelligence ISBN 978-3-030-30076-0 ISBN 978-3-030-30077-7 (eBook) https://doi.org/10.1007/978-3-030-30077-7 © Springer Nature Switzerland AG 2020 This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Preface

The chapters in this book evolved from the symposium on Logic and Algorithms in Computational Linguistics 2018 (LACompLing2018) and its predecessor, the workshop on Logic and Algorithms in Computational Linguistics 2017 (LACompLing2017). The symposium LACompLing2018 took place at the Department of Mathematics, Stockholm University, Stockholm, Sweden, August 28–31, 2018: http://staff.math.su.se/rloukanova/LACompLing2018-web/ The workshop LACompLing2017 was held at the same venue, August 16–19, 2017: http://staff.math.su.se/rloukanova/LACompLing17.html

Like the workshop before it, the symposium LACompLing2018 was a distinguished forum at which researchers from different subfields of computational linguistics presented and discussed their work. The symposium carried forward the initiative of its first edition. Its initial idea, an assessment of the place of computer science, mathematical logic, and other subjects of mathematics in computational linguistics, has proved a fruitful source of research.

This book covers mainly logical approaches to computational linguistics, but also integrations with other approaches. The work presented ranges from classic to newly emerging theories and applications. While computational linguistics is a relatively young discipline, decades of research on theoretical work and practical applications have demonstrated that it is a distinctively interdisciplinary area. There is also a sense that computational approaches to linguistics can benefit from research on the nature of human language, from the perspective of its evolution. The chapters of this book cover topics on computational theories of human language, across grammar, syntax, and semantics. The common threads of the research in the chapters are the roles of computer science, mathematical logic, and other subjects of mathematics in computational linguistics and natural language processing.


Proof Irrelevance in Type-Theoretical Semantics, Zhaohui Luo

Type theories have been used as foundations for formal semantics. Under the propositions-as-types principle, most modern type systems have explicit proof objects, which, however, cause problems in obtaining correct identity criteria in semantics. Voevodsky proposed h-logic, studied in the HoTT project, where proof irrelevance is built into the notion of logical proposition. This chapter proposes that Martin-Löf’s type theory (MLTT) should be extended with h-logic to MLTTh, a predicative type system that can be adequately employed for formal semantics.

Saving Hamlet Ellipsis, Kristina Liefke

The chapter presents a propositionalist account of depiction reports (e.g., “Mary paints a unicorn”) that interprets the direct objects in these reports as existential propositions (i.e., as “there being a unicorn”). The proposed account solves a number of problems of Parsons’ (1997) “Hamlet ellipsis” account of depiction reports, including the prediction of unattested readings, the difficulty of capturing the entailments of depiction reports, and the inability to interpret multiply embedded depiction reports.

Temporal Representations with and without Points, Tim Fernando

Intervals and events are analyzed in terms of strings that represent points as symbols occurring uniquely. Allen interval relations, Dowty’s aspect hypothesis, and inertia are analyzed relative to strings, compressed into canonical forms that are describable in Monadic Second-Order logic. Stative predicates are replaced by their borders, represented in the S-words of Schwer and Durand. Borders point to non-stative predicates.

From Tree Adjoining Grammars to Higher Order Representations of Abstract Meaning Representations via Abstract Categorial Grammars, Rasmus Blanck and Aleksandre Maskharashvili

The chapter presents the construction of an Abstract Categorial Grammar (ACG) that interrelates Tree Adjoining Grammar (TAG) and Higher Order Logic (HOL) formulas encoding Abstract Meaning Representations (AMRs). TAG derivations are interpreted as HOL formulas representing Montague semantics. This improves the complexity of natural language generation and parsing with TAGs and HOL.

Measuring Linguistic Complexity: Introducing a New Categorial Metric, Mehdi Mirzapour, Jean-Philippe Prost, and Christian Retoré

The chapter provides a computable quantitative measure which accounts for the difficulty humans have in processing natural language. The new metric uses Categorial Proof Nets to correctly model Gibson’s account in his Dependency Locality Theory. The proposal is close to modern computational psycholinguistic theories. It promises the inclusion of semantic complexity via the syntax–semantics interface in categorial grammars.

On Categorial Grammatical Inference and Logical Information Systems, Annie Foret and Denis Béchet

In this chapter, learning is viewed as a symbolic issue in an unsupervised setting, from raw or from structured data, for some variants of Lambek grammars and of categorial dependency grammars. For these frameworks, the authors present different type connectives and structures, some limitations, and some algorithms. On the experimental side, categorial grammars show potential as a particular case of Logical Information Systems.

A Scope-Taking System with Dependent Types and Continuations, Justyna Grudzińska and Marek Zawadowski

The chapter proposes a new scope-taking system with dependent types and continuations. Its key elements are (i) a richly typed system, (ii) contexts for determining the relative scoping of quantifiers, and (iii) a recursive procedure by which the interpretation is computed and the dependently typed context is built along the surface structure tree. The core idea behind the proposal is that certain lexical elements are responsible for inverting scope: relational nouns and locative prepositions.


On the Coevolution of Language and Cognition—Gricean Intentions Meet Lewisian Conventions, Nikola Kompa

How might human language have emerged, and is it essentially based on convention or intention? The chapter is an attempt at outlining an answer to the first question by trying to answer the second. The starting point is the idea that linguistic and (other) cognitive capacities must have coevolved and provided mutual scaffolding for one another. It is argued that combining Grice’s intention-based model of meaning and communication with Lewis’ game-theoretic account of conventions helps explain how language might have unfolded.

Goal and Intended Readers of this Collection of Works

The goal of this collection of works is to promote intelligent approaches to computational linguistics and Natural Language Processing (NLP). The intended readers of the book are researchers in computational linguistics, in the theory and applications of Artificial Intelligence (AI) and Natural Language Processing (NLP), and in related subjects.

Uppsala, Sweden
July 2019

Roussanka Loukanova

Contents

Proof Irrelevance in Type-Theoretical Semantics (Zhaohui Luo) 1
Saving Hamlet Ellipsis (Kristina Liefke) 17
Temporal Representations with and without Points (Tim Fernando) 45
From Tree Adjoining Grammars to Higher Order Representations of Abstract Meaning Representations via Abstract Categorial Grammars (Rasmus Blanck and Aleksandre Maskharashvili) 67
Measuring Linguistic Complexity: Introducing a New Categorial Metric (Mehdi Mirzapour, Jean-Philippe Prost and Christian Retoré) 95
On Categorial Grammatical Inference and Logical Information Systems (Annie Foret and Denis Béchet) 125
A Scope-Taking System with Dependent Types and Continuations (Justyna Grudzińska and Marek Zawadowski) 155
On the Coevolution of Language and Cognition—Gricean Intentions Meet Lewisian Conventions (Nikola Anna Kompa) 177

Proof Irrelevance in Type-Theoretical Semantics

Zhaohui Luo

Abstract Type theories have been used as foundational languages for formal semantics. Under the propositions-as-types principle, most modern type systems have explicit proof objects, which, however, cause problems in obtaining correct identity criteria in semantics. Therefore, it has been proposed that some principle of proof irrelevance should be enforced in order for a type theory to be an adequate semantic language. This paper investigates how proof irrelevance can be enforced, particularly in predicative type systems. In an impredicative type theory such as UTT, proof irrelevance can be imposed directly, since the type Prop in such a type theory represents the totality of logical propositions and helps to distinguish propositions from other types. In a predicative type theory, however, such a simple approach does not work; for example, in Martin-Löf’s type theory (MLTT), propositions and types are identified and, hence, proof irrelevance would imply the collapse of all types. We propose that Martin-Löf’s type theory should be extended with h-logic, as proposed by Voevodsky and studied in the HoTT project, where proof irrelevance is built into the notion of logical proposition. This amounts to MLTTh, a predicative type system that can be adequately employed for formal semantics.

1 Introduction

Formal semantics in modern type theories (MTT-semantics for short) [8, 25] is a framework for natural language semantics, in the tradition of Montague’s semantics [32]. The development of MTT-semantics is a part of a wider research endeavour by many researchers who have recognised the potential advantages of rich type structures in constructing formal semantics [3, 5, 10, 12, 17, 27, 34–36]. While Montague’s semantics is based on Church’s simple type theory [9, 14] (and its models in set theory), MTT-semantics is based on dependent type theories, which we call modern type theories (MTTs), to distinguish them from the simple type theory.


One of the key differences between MTTs and simple type theory is that MTTs have rich type structures with many types, much richer than those in simple type theory. Because of this, common nouns can be interpreted as types rather than predicates (see, for example, Ranta’s proposal on this in Martin-Löf’s type theory [34]). We can call this the CNs-as-types paradigm: it differs from Montague’s semantics, which adopts the CNs-as-predicates paradigm. For instance, consider the CN ‘book’: in Montague’s semantics it is interpreted as a predicate of type e → t, while in MTT-semantics it is interpreted as a type Book. It has been argued that interpreting CNs as types rather than predicates has several advantages including, for example, a better treatment of selectional restriction by typing, compatible use of subtyping in semantic interpretations, and satisfactory treatment of some advanced linguistic features such as copredication (see, for example, [23, 25], for further details).1

Another feature of MTTs is that their embedded logics are all based on the principle of propositions as types [11, 20] and, in particular, there are proof terms whose types are logical propositions: a logical formula P is true if, and only if, there exists a proof term p of type P. Such formulations with proof terms are rather natural for constructive proof systems: type theories such as Martin-Löf’s type theory [29, 30] are examples that were originally designed for describing constructive mathematics such as that described by Bishop in [4]. However, MTTs can also be used in applications other than constructive mathematics: their use in MTT-semantics for natural language is such an example. In particular, MTTs provide a semantic framework which is both model-theoretic and proof-theoretic [26] and have many advantages as foundational languages for formal semantics, compared with the model-theoretic framework of Montague’s semantics. MTT-semantics also provides useful mechanisms for successful treatments of various linguistic features, such as copredication with subtyping [23, 25], that have been found difficult to deal with in the traditional setting.

However, proof terms that inhabit propositions are not completely innocent: when we employ type theories for formal semantics, they cause problems. In particular, their presence makes it difficult, if not impossible, to obtain correct identity criteria for CNs. For example, in a dependent type theory, one may use dependent types of pairs2 to represent CNs modified by intersective adjectives [34]: for instance, a handsome man is a pair of a man and a proof that the man is handsome. Then, for such representations, one can ask: what is the identity criterion for handsome man? An obvious answer should be that it is the same as that for man: two handsome men are the same if, and only if, they are the same man. But this is not what the formal interpretation gives us, since it also requires that the proofs of the man being handsome be the same. Obviously, this would not be a correct identity criterion for the modified CN handsome man.

How to solve this problem? It has been proposed that some principle of proof irrelevance should be adopted [24]: i.e., any two proofs of the same logical proposition should be the same. In the above example, it implies that any two proof terms of a man being handsome should be the same. In Sect. 2, we shall introduce the issue of identity criteria for CNs and illustrate that proof irrelevance provides a nice solution to this problem.

A type theory can be either predicative or impredicative. Examples of the former include Martin-Löf’s type theory [33] as implemented in the proof assistant Agda [1]; examples of the latter include the type theory UTT [22] as implemented in Lego/Plastic [6, 28] and pCIC as implemented in Coq [39]. A notable difference is that, in impredicative type theories, there is usually a type Prop of all logical propositions, while in a predicative type theory we usually do not have such a type. This difference is significant when we consider a proof irrelevance principle. For impredicative type theories, imposing proof irrelevance is pretty straightforward: we simply stipulate that, for any proposition P of type Prop, any two proof terms of P are the same [24] (more details can be found in Sect. 2.3). However, how to consider proof irrelevance in a predicative type theory is not a simple matter anymore: for example, in Martin-Löf’s type theory, people usually follow Martin-Löf in identifying types with propositions. In this case, it would be quite absurd to impose proof irrelevance for all propositions, because that would mean identifying the objects of all types! It is unclear whether and how proof irrelevance can be enforced in a predicative type theory, a point of view expressed in the following quotation, taken from [24]:

It is also worth noting that, although proof irrelevance can be considered for impredicative type theories directly as above, it is unclear how this can be done for predicative type theories. For instance, in Martin-Löf’s type theory [29, 30], propositions are identified with types. Because of such an identification, one cannot use the above rule to identify proofs, for it would identify the objects of a type as well. Put in another way, proof irrelevance is incompatible with the identification of propositions and types. In order to introduce proof irrelevance, one has to distinguish logical propositions and types (see, for example, [22]).

1 It may be interesting to remark that, recognising that interpreting CNs as types has several advantages, some researchers have suggested that both paradigms need to be considered, including, for instance, Retoré’s idea in this respect [35]. A related issue in type-theoretical semantics is how to turn judgemental interpretations into corresponding propositional forms, as studied in [41], which proposes an axiomatic solution for such transformations that can be justified by means of heterogeneous equality in type theory.
2 Technically, Σ-types are used—see Sect. 2.2 and Footnote 6 for a further description of Σ-types.

The above-mentioned identification of types and propositions gives rise to a logic based on the principle of propositions as types—the usual logic in Martin-Löf’s type theory—let’s call it the PaT logic.3 In other words, one cannot enforce a principle of proof irrelevance in Martin-Löf’s type theory (MLTT) with PaT logic. Therefore, using MLTT with PaT logic for formal semantics suffers from the problem we have described above: one cannot (or at least can only with great difficulty) obtain correct identity criteria in semantic interpretations of CNs, among other things.4

The main contribution of the current paper is to solve this problem of how to consider proof irrelevance for Martin-Löf’s type theory. Recently, based on Martin-Löf’s type theory, researchers have developed Homotopy Type Theory (HoTT) [40] for the study of the foundations and formalisation of mathematics. One of the developments in the HoTT project is its logic (sometimes called h-logic), developed by Voevodsky, based on the idea that a logical proposition, called a mere proposition, is a type that is either a singleton or empty. In other words, proof irrelevance is built into h-logic and this, among other things, has given rise to a logic with a type of all (small) propositions. Our proposed solution is that MLTT should be extended with h-logic; the resulting language, MLTTh, can then be used adequately as a foundational language for MTT-semantics. MLTTh is a proper extension of Martin-Löf’s type theory, although it is a subsystem of the type system for HoTT as described in [40]. In Sect. 3, we shall first discuss the above problem briefly and then describe h-logic and MLTTh and illustrate how to use MLTTh in formal semantics.

3 PaT stands for ‘propositions as types’.
4 See the second paragraph of the Conclusion section for another potential issue in this respect, but this is out of the scope of the current paper.

2 Identity Criteria and Proof Irrelevance

2.1 Identity Criteria of Common Nouns

As first observed by Geach [15] and later discussed by many others, common nouns are associated with their criteria of identity. Intuitively, a CN represents a concept that does not only have a criterion of application, employed to determine whether the concept applies to an object, but also a criterion of identity, employed to determine whether two objects of the concept are the same. It has been argued that CNs are distinctive in this respect, since other lexical terms like verbs and adjectives do not have such criteria of identity (cf. Baker’s arguments in [2]). The notion of criteria of identity can be traced back to Frege [13], who considered abstract mathematical objects such as numbers or lines. Geach noticed that such criteria of identity exist for every common noun and are the basis for counting [2, 15, 18].5 The idea can be illustrated by the following examples (1) and (2), where the CNs ‘passenger’ and ‘person’ have different criteria of identity, from which it is easy to see that (1) does not imply (2).

(1) EasyJet has transported 1 million passengers in 2010.
(2) EasyJet has transported 1 million persons in 2010.

Several formalisations of the CN ‘passenger’ are discussed in [24], some of which give the intended (correct) identity criteria while the others do not. In particular, when we use logical propositions in formalisations, some principle of proof irrelevance should be enforced.

5 In general, one may say that an interpretation of a CN should actually be a setoid—a type together with an identity criterion, although in most cases the situation is more straightforward and can be simplified in that one does not have to mention the identity criterion anymore [7].
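To make the setoid idea concrete, here is a minimal Lean sketch (illustrative only; Person, Trip and the pair type Passenger are hypothetical stand-ins, not from the paper):

```lean
-- A CN as a setoid: a type together with an identity criterion.
axiom Person : Type
axiom Trip : Type

structure Passenger where
  who  : Person
  trip : Trip

-- "Same passenger" means same person on the same trip, so one person
-- making two trips counts as two passengers: (1) does not imply (2).
instance : Setoid Passenger where
  r a b := a.who = b.who ∧ a.trip = b.trip
  iseqv := ⟨fun _ => ⟨rfl, rfl⟩,
            fun ⟨h₁, h₂⟩ => ⟨h₁.symm, h₂.symm⟩,
            fun ⟨h₁, h₂⟩ ⟨h₃, h₄⟩ => ⟨h₁.trans h₃, h₂.trans h₄⟩⟩
```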


2.2 Proof Terms and Proof Irrelevance

Based on the Curry-Howard principle of propositions as types [11, 20], modern type theories contain proof terms of propositions. Let’s consider a simple example: the logical proposition A in (3) can be proved by the term p in (4); in other words, (5) is a correct judgement stating that p is a proof term of A (formally, one says that the judgement (5) can be derived):

(3) A = ∀P: Nat → Prop ∀x: Nat. P(x) ⇒ P(x)
(4) p = λP: Nat → Prop λx: Nat λy: P(x). y
(5) p: A

A logical formula is true if, and only if, there exists a proof of the formula: in the above example, A is true since there is p, which is a proof of A. Note that a logical proposition may have more than one proof; put another way, proof terms are not necessarily unique: for some propositions, there may be proofs of the same proposition that are not the same.

Proof terms cause problems when we consider identity criteria of CNs. Their presence makes it difficult, if not impossible, to obtain correct identity criteria for CNs. For example, in a dependent type theory, one may use Σ-types of pairs6 to represent CNs modified by intersective adjectives [34], as the following example shows:

(6) [[handsome man]] = Σ(Man, handsome)

where Man is the type of men and handsome is a predicate over domain Man. If we ask what the identity criterion for handsome man is, we have a problem, since the above interpretation (6) does not give us the intended identity criterion [24]. Obviously, the correct (or intended) identity criterion for handsome man should be the same as that for man: two handsome men are the same if, and only if, they are the same man. But this is not what the formal interpretation (6) gives us: an object of type Σ(Man, handsome) is a pair (m, h), where m is of type Man and h of type handsome(m), and, for two handsome men (m, h) and (m′, h′) to be the same, we require not only that m and m′ are the same, but also that h and h′ are the same! In other words, two handsome men being the same would have required that the proof terms that prove that they are handsome are the same as well. If there is more than one proof, say h and h′, that a man m is handsome, then m as handsome man with proof h would be different from m with proof h′. Obviously, this would not be a correct identity criterion for handsome man.

Some examples to illustrate the above problem are given by Tanaka in a talk at Ohio State University [38], in the context of studying Sundholm’s approach to constructive generalised quantifiers [37]. Consider the following sentence:

(7) Most persons who travelled by plane are happy.

where the semantics of most requires correct counting, while the Σ-type interpretation of the phrase ‘person who travelled by plane’ does not give correct counting since it involves proof terms. To explain, we have:

(8) [[person who travelled by plane]] = Σx: Person. Σy: Plane. travel(x, y),

where travel(x, y) is a proposition expressing that x travelled by plane y. One of the reasons that (8) may give incorrect counting results is that there could be more than one proof of type travel(x, y) and, as a consequence, an interpretation of (7) based on (8) would not be correct.

The interpretation (8) has another problem: its use of (the second) Σ as existential quantifier is also problematic for counting, as pointed out by Tanaka in her talk [38]. A more adequate interpretation would be (9), where the usual existential quantifier ∃ is used:

(9) [[person who travelled by plane]] = Σx: Person. ∃y: Plane. travel(x, y).

Note that, while Σ(A, P) is a type, ∃(A, P) is a logical proposition. Although the usual existential quantifier ∃ exists in impredicative type theories such as UTT, it does not exist in Martin-Löf’s type theory.

In order to solve the above problem, it has been proposed that some principle of proof irrelevance should be adopted [24]: i.e., any two proofs of the same logical proposition should be the same. In the above examples, it implies that, for any man m, any two proof terms of handsome(m) should be the same (proof irrelevance for the proposition handsome(m)) and that, for any person x and any plane y, any two proof terms of travel(x, y) should be the same (proof irrelevance for the proposition travel(x, y)). The interpretations (6) and (9) give correct counting results with proof irrelevance.

Note that, in type theory, proof irrelevance for all logical propositions requires that there be a clear distinction between the types standing for logical propositions (such as that in (3)) and the other types (such as a type of humans and a type of numbers). In an impredicative type theory, such a distinction can easily be made, while in a predicative type theory, it is not clear how to do it. In the following Sect. 2.3, we discuss how this can be done for impredicative type theories and, in Sect. 3, we shall investigate how it could be done for predicative type theories.

6 A Σ-type is an inductive type of dependent pairs. Here is an informal description of the basic laws governing Σ-types (see, for example, [30] for the formal rules and further explanations).
• If A is a type and B is an A-indexed family of types, then Σ(A, B), sometimes written as Σx: A. B(x), is a type.
• Σ(A, B) consists of pairs (a, b) such that a is of type A and b is of type B(a).
• Σ-types are associated with projection operations π1 and π2 such that π1(a, b) = a and π2(a, b) = b, for every (a, b) of type Σ(A, B).
When B(x) is a constant type (i.e., always the same type no matter what x is), the Σ-type degenerates into the product type A × B of non-dependent pairs.
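To make the problem concrete, the following is a small Lean sketch of the Σ-type interpretation (6) (illustrative; Man and handsome are hypothetical stand-ins, with handsome made Type-valued to mimic a proof-relevant, PaT-style proposition):

```lean
-- Hypothetical stand-ins for the running example.
axiom Man : Type
axiom handsome : Man → Type   -- proof-relevant, as in PaT logic

-- (6) as a Σ-type: a handsome man is a pair ⟨m, h⟩ with h : handsome m.
def HandsomeMan := (m : Man) × handsome m

-- Equality of such pairs requires equality of BOTH components, so one
-- man with two distinct proofs yields two distinct "handsome men":
-- exactly the unintended identity criterion described above.
example (m : Man) (h : handsome m) : HandsomeMan := ⟨m, h⟩
```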


2.3 Proof Irrelevance in Impredicative Type Theories

In impredicative type theories such as UTT [22] and pCIC [39], there is a type Prop of all logical propositions. As a type, Prop represents the totality of all logical propositions. For instance, a formula of the form ∃X: Prop. ... says that ‘there exists a logical proposition X such that ...’. As another example, the proposition in (3) quantifies over all predicates P with domain Nat. With such a distinctive totality Prop of logical propositions, it becomes straightforward to express a principle of proof irrelevance stating that any two proofs of the same logical proposition are the same. As proposed in [24], in the impredicative type theory UTT, this can be captured by the following rule:

(∗)   Γ ⊢ P : Prop    Γ ⊢ p : P    Γ ⊢ q : P
      ───────────────────────────────────────
                  Γ ⊢ p = q : P

Intuitively, it says that, if P is a logical proposition (i.e., P is of type Prop) and if p and q are two proof terms of P, then p and q are the same.7

Consider the semantic interpretation of handsome man in (6). With proof irrelevance, two handsome men are the same if, and only if, they are the same man because, for any man m, any two proofs of the proposition handsome(m) are the same, according to rule (∗) that expresses proof irrelevance.

As noted above, in order to state the principle of proof irrelevance, there must be a clear distinction between logical propositions and other types, so that proof irrelevance can be imposed for the former (and not for the latter). This is the case for impredicative type theories, as the (∗) rule illustrates. However, a rule like (∗) would not be available for predicative type theories such as Martin-Löf’s type theory with PaT logic, as is to be explained in the following section.8
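For comparison, here is a minimal sketch in Lean 4, whose impredicative universe Prop has definitional proof irrelevance built in, so the analogue of rule (∗) holds by reflexivity alone:

```lean
-- Lean 4's impredicative universe Prop is analogous to UTT's Prop, and
-- proof irrelevance is definitional: any two proofs of the same
-- proposition are equal, so `rfl` establishes the analogue of (∗).
example (P : Prop) (p q : P) : p = q := rfl
```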

7 Here, the equality = is the definitional equality in type theory. In Sect. 3, when we consider h-logic, the equality will be the propositional equality. We shall not emphasise the difference between these two equalities in the current paper, partly because it does not affect our understanding of the main issues.
8 A reviewer has asked whether ‘one should use impredicative type theories rather than predicative ones for studying logics of natural language’ (and maybe others would ask similar questions). Some people would have drawn this conclusion and, in particular, those (including the author himself) who do not think that impredicativity is problematic would have agreed. However, some others may think otherwise, believing that impredicativity is problematic in the foundations of mathematics (or even in general); if so, predicative type theories would then have their merits and should be considered.

2.4 Most: Counting and Anaphoric Reference

Anaphora representation was an early application of dependent type theory, first studied by Sundholm [36], as an alternative to dynamic semantics [16, 19, 21].


Consider the simple donkey sentence (10), whose interpretation in Martin-Löf’s type theory would be (11), where Σ acts as the existential quantifier and F and D are the types that interpret farmer and donkey, respectively. Note that the use of Σ as existential quantifier is the key to the solution here: if the second Σ in (11) were changed into a traditional (weak) existential quantifier, the interpreting sentence would not be well-formed because the term π1(π2(z)) would be ill-typed.

(10) If a farmer owns a donkey, he beats it.
(11) Πz: (Σx: F. Σy: D. own(x, y)). beat(π1(z), π1(π2(z)))

However, as explained above in Sect. 2.2 with (7) and (8), using Σ as existential quantifier in this simple way causes problems in counting. This can be made clear by the following example (12)9 and its interpretation (13) in Martin-Löf’s type theory, where most is interpreted by means of the quantifier Most defined in [37].

(12) Most farmers who own a donkey beat it.
(13) Most z: (Σx: F. Σy: D. own(x, y)). beat(π1(z), π1(π2(z)))

The semantic interpretation (13) fails to respect correct counting, which becomes important for the truth of (12): because of the second Σ, the proofs of own(x, y) contribute to counting in a wrong and unintended way. This was already realised by Sundholm himself, who proposed some ad hoc method to deal with it (see, for example, [38]).

Unlike MLTT, some type theories contain both the strong Σ and a weak existential quantifier. For example, in UTT, we have both Σ and ∃, the latter being the traditional existential quantifier of its embedded higher-order logic. This opens a new possibility of using both in a semantic interpretation. For instance, (12) can be interpreted in UTT as (14), where, with proof irrelevance, counting is correctly dealt with, and so is anaphoric reference.

(14) Most z: [Σx: F. ∃y: D. own(x, y)]. ∀y′: [Σy: D. own(π1(z), y)]. beat(π1(z), π1(y′))

The above interpretation (14) is a strong one: most donkey-owning farmers beat all the donkeys they own. A weaker interpretation would mean that most donkey-owning farmers beat some donkeys they own; this would be interpreted as (15), obtained from (14) by changing ∀ into ∃, which deals with counting correctly as well.

(15) Most z: [Σx: F. ∃y: D. own(x, y)]. ∃y′: [Σy: D. own(π1(z), y)]. beat(π1(z), π1(y′))

Remark 1 Here, we have considered how to use both strong and weak sum types in dealing with counting and anaphora in type theories such as UTT. In this respect, instead of doing this, one might consider extending a type theory to become a ‘dynamic type theory’, in the same way as extending FOL to become dynamic predicate logic [16]. The author believes, however, that this is too high a price to pay: like dynamic predicate logic, such a dynamic type theory completely loses its standard logical properties and would become a rather strange logical system. For instance, dynamic predicate logic is very much a non-standard logical system: among other things, it is non-monotonic and the notion of dynamic entailment fails to be reflexive or transitive. The author does not think that an underlying logic for NL semantics should be too far from a usual system that is well understood.

9 Thanks to Justyna Grudzińska for a discussion about this example.
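The counting contrast between (13) and (14) can be sketched in Lean as follows (illustrative only; Farmer, Donkey, own and owns are hypothetical stand-ins):

```lean
-- Illustrative stand-ins for the donkey example.
axiom Farmer : Type
axiom Donkey : Type

-- Proof-relevant, PaT-style ownership: the Σ-interpretation (13) counts
-- the triples ⟨f, d, proof⟩, so a farmer owning a donkey in two "ways"
-- is counted twice -- the counting problem.
axiom own : Farmer → Donkey → Type
def DonkeyOwnerPaT := (f : Farmer) × (d : Donkey) × own f d

-- With a weak (proof-irrelevant) existential, as in (14), an element is
-- determined by the farmer alone, so counting comes out right.
axiom owns : Farmer → Donkey → Prop
def DonkeyOwner := { f : Farmer // ∃ d : Donkey, owns f d }
```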

3 Formal Semantics in Martin-Löf’s Type Theory with H-Logic

In this section, we first discuss the problem of imposing proof irrelevance in Martin-Löf’s type theory (MLTT) with PaT logic and then detail our proposal of extending MLTT with h-logic, as studied and developed by Voevodsky in the HoTT project [40], and employing the extension MLTTh as a foundational language for formal semantics.

Homotopy Type Theory (HoTT) is a new research field in the study of the foundations of mathematics, among other related things. It was first initiated by Vladimir Voevodsky who, with others, organised a special year on this topic at the Institute for Advanced Study in Princeton, which resulted in the HoTT book [40], among other things. HoTT extends MLTT with two things: the univalence axiom and higher inductive types. In particular, as a part of a larger development, Voevodsky has studied a notion of proposition as a type whose objects are all the same, later coined in the HoTT project as mere propositions. What we shall propose and study is a subsystem of the HoTT type theory: the system will be called MLTTh; it only extends MLTT with the logic of mere propositions, called h-logic. We shall briefly show, by giving simple examples, that MLTTh may adequately be used for MTT-semantics.

3.1 MLTT with PaT Logic: a Problem

Although the basic idea of MLTT with PaT logic is based on the propositions-as-types principle, Martin-Löf has gone a step further: not only is every proposition a type but, vice versa, every type is also a proposition. In other words, with the standard PaT logic of MLTT, propositions and types are identified. As explained in the introduction, this identification causes a problem in incorporating proof irrelevance. If proof irrelevance were imposed for every proposition in MLTT, then unfortunately every type, which is also a proposition (in Martin-Löf’s sense), would collapse as well: if a and b are two objects of type A, we would have that a and b are the same, because type A is a proposition. This is obviously absurd and unacceptable.


As we saw in Sect. 2.3, in an impredicative type theory such as UTT, the distinction between propositions and types is clear: in such a type theory, there is the type Prop of all logical propositions, and a type is a logical proposition if, and only if, it is an object of type Prop. Therefore, a principle of proof irrelevance can be stated and imposed in a straightforward way by a rule like (∗) in Sect. 2.3. Such a rule cannot be formulated, and is hence unavailable, in MLTT with PaT logic. As pointed out in [24], Martin-Löf’s type theory with PaT logic is inadequate for MTT-semantics, because it is impossible to impose a principle of proof irrelevance and, as explained, proof irrelevance is needed to obtain correct identity criteria for CNs.

3.2 H-Logic in HoTT

In HoTT, a logical proposition is a type whose objects are all propositionally equal to each other. To distinguish them from other types, which in MLTT are also called propositions, they are sometimes called mere propositions.

Definition 1 (mere proposition [40]) A type A is a mere proposition if for all x, y: A we have that x and y are equal. Formally, let U be the smallest universe in MLTT and A: U. Then A is a mere proposition in h-logic if the following is true:

isProp(A) = Πx, y: A. Id_A(x, y),

where Id is the propositional equality (the Id-type) in MLTT. We can define the type of mere propositions in U to be the following Σ-type:

Prop_U = ΣX: U. isProp(X).

In the following, we shall omit U and write Prop for Prop_U. Note that this Prop is different from the type Prop of an impredicative type theory like UTT: the latter contains all logical propositions of the type theory, while the former does not—it only contains the mere propositions in the predicative universe U; sometimes, we say that Prop is the type of small mere propositions. Another thing to note is that an object of Prop is not just a type—it is a pair (A, p) such that A is a type in U and p is a proof of isProp(A) (i.e., of A being a mere proposition).

The traditional logical operators can be defined, and some of these definitions (e.g., those for disjunction and the existential quantifier) use the following truncation operation that turns a type into a mere proposition.

• Propositional Truncation. Let A be a type. Then, there is a higher inductive type ‖A‖ specified by the following rules, where the elimination operator κ_A satisfies the definitional equality κ_A(f, |a|) = f(a):

Γ valid                      Γ ⊢ a : A
──────────────────────       ─────────────
Γ ⊢ isProp(‖A‖) true         Γ ⊢ |a| : ‖A‖

Γ ⊢ isProp(B) true    Γ ⊢ f : A → B
────────────────────────────────────
Γ ⊢ κ_A(f) : ‖A‖ → B

Note that ‖A‖ is a higher inductive type and, in particular, in turning a type A into a mere proposition ‖A‖, one imposes that there is a proof of isProp(‖A‖), i.e., that ‖A‖ is a mere proposition – in other words, every two proofs of ‖A‖ are equal (propositionally).10

The traditional logical operators can now be defined for h-logic as follows, where we denote them by means of an extra dot-sign: for example, ∧̇ for the conjunction connective in h-logic.

• true = 1 (the unit type).
• false = ∅ (the empty type).
• P ∧̇ Q = P × Q.
• P ⇒̇ Q = P → Q.
• ¬̇P = P → ∅.
• ∀̇x: A. P(x) = Πx: A. P(x).
• P ∨̇ Q = ‖P + Q‖.
• ∃̇x: A. P(x) = ‖Σx: A. P(x)‖.

Please note that the typing operators on the right-hand side are those used in MLTT to define the corresponding logical operators. The reader may have noticed that the truncation operation is only used in the last two cases (disjunction and existential quantification), but not for defining the other logical connectives: for example, the logical conjunction P ∧̇ Q is directly defined as the product type P × Q, rather than ‖P × Q‖. The reason is that ×, the product typing operator, preserves the property of being a mere proposition: if P and Q are mere propositions, so is P × Q. This preservation property also holds for operators such as implication and universal quantification. However, it does not hold for disjunction or existential quantification: for example, even when P and Q are mere propositions, P + Q is not a mere proposition and, therefore, the truncation operator has to be used to turn P + Q into the mere proposition ‖P + Q‖.

10 For people who are familiar with type theory, this implies that canonicity fails to hold for the resulting type theory.
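As an illustration of truncation, here is a hedged Lean sketch: Lean’s Prop-valued Nonempty plays a role analogous to ‖·‖, and its eliminator, like κ_A, is only available when the target is itself a proposition:

```lean
-- `Nonempty A` behaves like the truncation ‖A‖: a Prop recording only
-- that A is inhabited. |a| : ‖A‖ corresponds to ⟨a⟩ : Nonempty A.
example (A : Type) (a : A) : Nonempty A := ⟨a⟩

-- The eliminator κ_A only targets mere propositions; likewise
-- Nonempty.elim requires its motive to live in Prop.
example (A : Type) (B : Prop) (f : A → B) : Nonempty A → B :=
  fun h => h.elim f

-- Disjunction and the existential are the truncated, Prop-valued
-- connectives, unlike the type-level Sum and Sigma.
example (P Q : Prop) (p : P) : P ∨ Q := Or.inl p
example (A : Type) (P : A → Prop) (a : A) (h : P a) : ∃ x, P x := ⟨a, h⟩
```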

3.3 MLTTh and Its Adequacy for Formal Semantics

MLTTh extends Martin-Löf’s type theory (MLTT) (see Part III of [33] for its formal description) with the h-logic of HoTT [40], as described above. MLTTh does not include the other extensions of MLTT in the HoTT project: in particular, it has neither the univalence axiom nor any higher inductive types except those in h-logic. Since it is a subsystem of the HoTT type theory, MLTTh has nice meta-theoretic properties, including logical consistency.


We claim that MTT-semantics can be done adequately in MLTTh. Since in MLTTh there is the totality Prop of (small) mere propositions, we can approximate the notion of predicate as follows: a predicate over a type A is a function of type A → Prop. Therefore, we can interpret linguistic entities such as verb phrases, modifications by intersective adjectives, etc., as we have done before based on UTT. For example, the modified CN (16) can be interpreted as (17), where hs: Man → Prop is a predicate in MLTTh expressing the property of being handsome:

(16) handsome man
(17) Σm: Man. π1(hs(m))

More precisely, for any man m: Man, hs(m) is a pair (A, p), where A is a type representing that m is handsome and p is a proof term showing that A is a mere proposition. That is why we have to use the first projection operator π1 to extract the first component of hs(m) when forming the Σ-type.

Proof irrelevance is built into the notion of mere proposition. In h-logic as described above, every two proofs of a mere proposition are equal (by definition, for the propositional equality Id). In particular, this is imposed for ‖A‖ when a type A, which may not be a mere proposition, is turned into the mere proposition ‖A‖. For instance, considering the semantic interpretation (17) of (16), we have that two handsome men are the same if, and only if, they are the same man, because any two proof terms of the mere proposition π1(hs(m)) are the same. Therefore, the problem described in Sects. 2.2 and 3.1 is solved satisfactorily in MLTTh.11
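Here is a minimal Lean sketch of the intended identity criterion behind (16)/(17), with Lean’s Prop standing in for the type of (small) mere propositions (Man and hs are hypothetical stand-ins):

```lean
axiom Man : Type
axiom hs : Man → Prop   -- "handsome" as a proof-irrelevant predicate

-- A handsome man is a man together with a proof, but the proof lives
-- in Prop, so it cannot split identities.
def HandsomeMan := { m : Man // hs m }

-- The intended identity criterion: two handsome men are equal as soon
-- as they are the same man.
example (a b : HandsomeMan) (h : a.val = b.val) : a = b := Subtype.eq h
```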

3.4 Most in MLTTh

MLTTh also contains a weak version of the existential quantifier—the operator ∃̇ as defined above. Therefore, the sentence (12), repeated below as (18), can be given the semantic interpretation (19) in MLTTh, where ∃̇ is used as the weak existential quantifier.

(18) Most farmers who own a donkey beat it.
(19) Most z: [Σx: F. ∃̇y: D. own(x, y)]. ∀y′: [Σy: D. own(π1(z), y)]. beat(π1(z), π1(y′))

Note that the above interpretation (19) in MLTTh is similar to (14) in UTT, with the only difference that the ∃ in (14) is changed into ∃̇ in (19).

11 Of course, we recognise that MLTTh is a proper extension of MLTT, although this seems to be the best one could do (but further research may be needed to see whether it is possible to do otherwise).


4 Conclusion

In this paper, we have argued that, in order to obtain adequate identity criteria for CNs, a principle of proof irrelevance should be adopted in a type-theoretical semantics. In particular, we showed that, unlike for impredicative type theories like UTT, this presents a problem for predicative type theories such as MLTT. The paper then proceeded to show how one may extend MLTT by means of h-logic, the logic of mere propositions studied in the HoTT project, to obtain MLTTh, which is then claimed to be an adequate foundational language for MTT-semantics.

Usually, we have included Martin-Löf’s type theory as one of the MTTs employable for MTT-semantics (just like impredicative type theories such as UTT). This paper may be regarded as a justification for this practice. Of course, for this, MLTT needs to be extended with h-logic, rather than using its original standard PaT logic.12

It should be emphasised that further study is needed to justify our claim that MLTTh can adequately deal with all the semantic matters as studied based on UTT, although intuitively we do not see any serious problems. To mention a potential issue: in a predicative type theory, formally there is no totality of all propositions (and hence no totality of predicates)—one can only have relative totalities of propositions or predicates using predicative universes (cf. Prop in Sect. 3.2). This is not ideal, but it remains to be seen whether it causes any serious problems.

The existence of both strong and weak sums (Σ and ∃) in type theories like UTT and MLTTh has shed new light on semantic treatments of sentences involving unbound anaphora [31], as some examples about most have illustrated in this paper. However, a general study in this respect is still needed to see how far one can go.

Acknowledgements Partially supported by EU COST Action EUTYPES (CA15123, Research Network on Types).

12 Although the current work has not been published until now, its idea, i.e., using HoTT’s h-logic instead of the PaT logic, has been in the author’s mind for a long time. This has partly contributed to the decision of including Martin-Löf’s type theory as one of the MTTs for MTT-semantics.

References

1. Agda: The Agda proof assistant (developed at Chalmers, Sweden) (2008). http://appserv.cs.chalmers.se/users/ulfn/wiki/agda.php
2. Baker, M.: Lexical Categories: Verbs, Nouns and Adjectives, vol. 102. Cambridge University Press, Cambridge (2003)
3. Bekki, D.: Representing anaphora with dependent types. LACL 2014, LNCS 8535 (2014)
4. Bishop, E.: Foundations of Constructive Analysis. McGraw-Hill (1967)
5. Boldini, P.: Formalizing context in intuitionistic type theory. Fundam. Inform. 42(2), 1–23 (2000)
6. Callaghan, P., Luo, Z.: An implementation of LF with coercive subtyping and universes. J. Autom. Reason. 27(1), 3–27 (2001)
7. Chatzikyriakidis, S., Luo, Z.: Identity criteria of common nouns and dot-types for copredication. Oslo Stud. Lang. 10(2) (2018)


8. Chatzikyriakidis, S., Luo, Z.: Formal Semantics in Modern Type Theories. Wiley & ISTE Science Publishing Ltd. (2019) (to appear)
9. Church, A.: A formulation of the simple theory of types. J. Symb. Log. 5(1) (1940)
10. Cooper, R.: Records and record types in semantic theory. J. Log. Comput. 15(2) (2005)
11. Curry, H.B., Feys, R.: Combinatory Logic, vol. 1. North Holland Publishing Company (1958)
12. Dapoigny, R., Barlatier, P.: Modeling contexts with dependent types. Fundam. Inform. 21 (2009)
13. Frege, G.: Grundlagen der Arithmetik. Basil Blackwell (Translation by J. Austin in 1950: The Foundations of Arithmetic) (1884)
14. Gallin, D.: Intensional and Higher-Order Modal Logic: With Applications to Montague Semantics (1975)
15. Geach, P.: Reference and Generality. Cornell University Press (1962)
16. Groenendijk, J., Stokhof, M.: Dynamic predicate logic. Linguist. Philos. 14, 1 (1991)
17. Grudzińska, J., Zawadowski, M.: Generalized quantifiers on dependent types: a system for anaphora. In: Chatzikyriakidis, S., Luo, Z. (eds.) Modern Perspectives in Type-Theoretical Semantics (2017)
18. Gupta, A.: The Logic of Common Nouns. Yale University Press (1980)
19. Heim, I.: File change semantics and the familiarity theory of definiteness. In: Bäuerle, R., et al. (eds.) Meaning, Use and Interpretation of Language (1983)
20. Howard, W.A.: The formulae-as-types notion of construction. In: Hindley, J., Seldin, J. (eds.) To H. B. Curry: Essays on Combinatory Logic. Academic Press (1980)
21. Kamp, H.: A theory of truth and semantic representation. In: Groenendijk, J., et al. (eds.) Formal Methods in the Study of Language (1981)
22. Luo, Z.: Computation and Reasoning: A Type Theory for Computer Science. Oxford University Press (1994)
23. Luo, Z.: Type-theoretical semantics with coercive subtyping. Semantics and Linguistic Theory 20 (SALT20), Vancouver (2010)
24. Luo, Z.: Common nouns as types. In: Bechet, D., Dikovsky, A. (eds.) Logical Aspects of Computational Linguistics (LACL’2012). LNCS 7351 (2012)
25. Luo, Z.: Formal semantics in modern type theories with coercive subtyping. Linguist. Philos. 35(6), 491–513 (2012)
26. Luo, Z.: Formal semantics in modern type theories: is it model-theoretic, proof-theoretic, or both? In: Invited talk at Logical Aspects of Computational Linguistics 2014 (LACL 2014), LNCS 8535, pp. 177–188, Toulouse (2014)
27. Luo, Z., Callaghan, P.: Coercive subtyping and lexical semantics (extended abstract). LACL’98 (extended abstracts), available on request to the first author or as http://www.cs.rhul.ac.uk/home/zhaohui/LACL98.abstract.ps (1998)
28. Luo, Z., Pollack, R.: LEGO proof development system: user’s manual. LFCS Report ECS-LFCS-92-211, Dept of Computer Science, Univ of Edinburgh (1992)
29. Martin-Löf, P.: An intuitionistic theory of types: predicative part. In: Rose, H., Shepherdson, J.C. (eds.) Logic Colloquium ’73 (1975)
30. Martin-Löf, P.: Intuitionistic Type Theory. Bibliopolis (1984)
31. Moltmann, F.: Unbound anaphoric pronouns: E-type, dynamic and structured-propositions approaches. Synthese 153 (1983)
32. Montague, R.: Formal Philosophy. Collected papers, edited by R. Thomason. Yale University Press (1974)
33. Nordström, B., Petersson, K., Smith, J.: Programming in Martin-Löf’s Type Theory: An Introduction. Oxford University Press (1990)
34. Ranta, A.: Type-Theoretical Grammar. Oxford University Press (1994)
35. Retoré, C.: The Montagovian generative lexicon λTyn: a type theoretical framework for natural language semantics. In: Matthes, R., Schubert, A. (eds.) Proceedings of TYPES 2013 (2013)
36. Sundholm, G.: Proof theory and meaning. In: Gabbay, D., Guenthner, F. (eds.) Handbook of Philosophical Logic, vol. III (1986)
37. Sundholm, G.: Constructive generalized quantifiers. Synthese 79(1), 1–12 (1989)


38. Tanaka, R.: Generalized quantifiers in dependent type semantics. Talk given at Ohio State University (2015)
39. The Coq Development Team: The Coq Proof Assistant Reference Manual (Version 8.0). INRIA (2004)
40. The Univalent Foundations Program: Homotopy Type Theory: Univalent Foundations of Mathematics. Technical report, Institute for Advanced Study (2013)
41. Xue, T., Luo, Z., Chatzikyriakidis, S.: Propositional forms of judgemental interpretations. In: Proceedings of the Workshop on Natural Language and Computer Science, Oxford (2018)

Saving Hamlet Ellipsis

Kristina Liefke

Abstract Hamlet ellipsis (see Parsons 1997) is a propositionalist account of depiction reports (e.g. Mary imagines/paints a unicorn) that analyzes the object DPs in these reports as the result of eliding the infinitive to be (there) from a CP. Hamlet ellipsis has been praised for its uniformity and systematicity, and for its ability to explain the learnability of the meaning of depiction verbs (e.g. imagine, paint). These merits notwithstanding, recent work on ‘objectual’ attitude reports (esp. Forbes 2006; Zimmermann 2016) has identified a number of challenges for Hamlet ellipsis. These include the material inadequacy of this account, its prediction of unattested readings of reports with temporal modifiers, and its prediction of counterintuitive entailments. This paper presents a semantic save for Hamlet ellipsis, called Hamlet semantics, that answers the above challenges. Hamlet semantics denies the elliptical nature of the complement in depiction reports (s.t. object DPs are interpreted in the classical type of DPs, i.e. as intensional generalized quantifiers). The propositional interpretation of the object DPs in these reports is enabled by the particular interpretation of depiction verbs. This interpretation converts intensional quantifiers into ‘existential’ propositions during semantic composition.

1 Introduction

Hamlet ellipsis (see [36, pp. 375–376]; cf. [10, pp. 63–64], [51, pp. 432–435]) is a propositionalist account of depiction reports1,2 (e.g. (1a), (3a)) that analyzes the object DPs in these reports (e.g. the DP a unicorn in (1a)) as the result of eliding the infinitive to be (or to be there) from a clausal complement (for (1a): from the CP a unicorn to be3; see (1b/c)). On this account, ‘objectual’ depiction reports like (1a) have a similar form to ‘propositional’ depiction reports (see (2)).

(1) a. Mary imagined [dp a unicorn].
 ≡ b. Mary imagined [cp [dp a unicorn] to BE (there)/ to EXIST].
 ≡ c. Mary imagined [cp there to BE [dp a unicorn]].
(2) Mary imagined [cp that there was [dp a unicorn]].
(3) a. Penny painted [dp a penguin].
 ≡ b. Penny painted [cp [dp a penguin] to BE (there)].
 ≡ c. Penny painted [cp there to BE [dp a penguin]].

Hamlet ellipsis differs from other propositionalist accounts of objectual attitude reports (e.g. [24, 33, 40]) by requiring neither the lexical decomposition of the matrix verb (e.g. (4)–(6); see [40, pp. 177–178, 183–185], [33, pp. 166–168]) nor the introduction of a contextually supplied relation between the matrix subject and the object DP (e.g. (7)–(8); see [44, pp. 271–274]).

(4) a. Sally seeks [dp a unicorn].
 ≡ b. Sally tries (/strives/endeavors) [cp to FIND [dp a unicorn]].
 ≡ c. Sally tries (/strives/endeavors) [cp that she FINDS [dp a unicorn]].
(5) a. Olli owes Harry [dp a horse].
 ≡ b. Olli is obliged [cp to GIVE / c. that he GIVES Harry [dp a horse]].
(6) a. Willard wants [dp a sloop].
 ≡ b. Willard wants (/wishes) [cp to HAVE / c. that he HAS [dp a sloop]].
(7) a. John needs [dp a coffee].
 ≡ b. John needs [cp to DRINK [dp a coffee]].
(8) a. John needs [dp a marathon].
 ≡ b. John needs [cp to RUN [dp a marathon]].

Since it uses the same verb (i.e. be) in the infinitival clause-analysis of depiction reports with different matrix verbs (see (1), (3)), Hamlet ellipsis provides a systematic, uniform account of depiction reports and explains the easy learnability of the meaning of the matrix verbs in these reports (see [36, pp. 374–375]).

1 These are reports that contain depiction verbs like paint, imagine, conceive, visualize, portray, sculpt, write (about), and draw (see [10, pp. 37, 130–150], [24, p. 232], [32, p. 242], [51, p. 427]).
2 In this paper, we focus on de dicto-readings of depiction reports. These are readings on which the object DPs in these reports describe the content of pictures, mental images, etc. (see [51]; cf. [2]).
3 Note that, while depiction reports like (1b) may be “bad English” (to use Quine’s term, see [40, p. 152]), they are still grammatical. To see this, consider the similarly structured report (†):
(†) Mary imagined [cp [dp a unicorn] to be basking in the sun].
(‡) Mary imagined [cp there being [dp a unicorn]].
The above notwithstanding, we find that reports with gerundive small clause complements like (‡) are more natural and intuitively provide better paraphrases of reports like (1a). We will present a likely reason for this judgement in Sect. 3.1.


2 Challenges for Hamlet Ellipsis

Its advantages notwithstanding, Hamlet ellipsis has been argued to face a number of challenges. These challenges are identified below:

2.1 Challenge 1: Unattested Readings

Forbes [10, p. 63] has observed that Hamlet ellipsis wrongly predicts the ambiguity of (9) between (10a) and (10b). This prediction is based on the ability of temporal adverbials like yesterday to modify either the matrix verb in a sentence (in (10): the verb imagine; see (10a)) or the implicit predicate in the verb's complement (in (10): the predicate be; see (10b)) (cf. [19, 24, 30, 41]).

(9) Mary imagined [dp a unicorn] yesterday.

(10) Mary imagined [cp [dp a unicorn] to be] yesterday.
   a. Mary's imagining of a unicorn occurred yesterday (see [10, p. 63])
   b. Mary imagined the existence yesterday of a unicorn (my paraphrase)
    ≡ Mary imagined yesterday's existence of a unicorn

In virtue of the above, depiction reports like (9) are predicted to behave analogously to control constructions with temporal modifiers (e.g. (11)). Such constructions are ambiguous between high-scope readings (here: (11a)) and low-scope readings of the modifier (here: (11b)).

(11) Bill will need [dp a laptop] tomorrow.
  ≡ Bill will need [cp FOR [tp PRO to HAVE [dp a laptop]]] tomorrow.
   a. Tomorrow is the time of Bill's need / when Bill's need will arise
   b. Tomorrow is the time when Bill needs to have the laptop

However, in contrast to control constructions, the modifier low-scope reading of (10), i.e. (10b), is unattested and intuitively unavailable for (9).

Beyond the above, Forbes [10, p. 63] has noted that Hamlet ellipsis fails to predict the (seeming) ambiguity of (9) between (10a) and (12):

(12) Mary imagined a unicorn as one would have been yesterday.

According to Forbes, the above is a reading on which the temporal adverbial yesterday modifies the DP a unicorn, rather than the elided predicate that is introduced by this DP. We will discuss potential support for this reading in Sect. 5.5.


2.2 Challenge 2: Material Inadequacy

Forbes [10, pp. 62–63] (see [51, p. 434]) has further observed that Hamlet ellipsis gives counterintuitive truth-conditions for depiction reports. On Forbes' reconstruction of Parsons' account, this account interprets the elided infinitive in (1a) at the depicting situation, i (in which the imagining takes place), such that (1a) is analyzed as (13a). However, it is obviously possible to imagine non-existent (incl. physically and logically impossible) objects without thereby suggesting—or bringing about—the objects' existence in i. As a result, (1a) is intuitively not equivalent to (13a), but to (13b):4

(13) a. Mary imagined-in-i [cp [dp a unicorn] to be-in-i].
   b. Mary imagined-in-i [cp [dp a unicorn] to be-in-the situation that she imagines in i / to be-in-her mental image in i].

Forbes' reconstruction of (1b) as (13a) also has problematic consequences for the interpretation of negation (see [10, pp. 62–63]): Hamlet ellipsis predicts that the negation of (1a), i.e. (14a), is equivalent to the negation, i.e. (14b), of (1b). However, if we interpret the clausal complement of imagine with respect to i (see (14b–i); cf. (13a)), (1a) (taken as an objectual attitude) is still compatible with (14b). This is evidenced by Forbes' [10, p. 63] observation that "there could be a drawing entitled 'Mary imagining a unicorn' in which it is clear that the depicted unicorn is supposed to be a figment of Mary's imagination."

(14) a. Mary did not imagine [dp a unicorn].
  ≡ b. Mary did not imagine [cp [dp a unicorn] to be (there)].
    i. Mary did not imagine-in-i [cp [dp a unicorn] to be-in-i].
    ii. Mary did not imagine-in-i [cp [dp a unicorn] to be-in-the situation that she imagines in i].

2.3 Challenge 3: Concealed Iterated Attitudes

One could try to defend Hamlet ellipsis against the above challenge by interpreting the infinitive to be with respect to the (depicted or imagined) situation that is the target of the subject's attitude in the situation i (see (13b)). However, this interpretation still provides an inadequate semantics for concealed iterated attitude reports like (15a). The object DP in this report describes the content of Ferdinand Bol's painting Jacob's Dream (1642) (see [51, p. 434]; cf. [10, p. 63]). This painting shows Jacob sleeping with a large angel towering above him.

(15) a. Ferdinand painted [dp an angel].
  ≡ b. Ferdinand painted [cp [dp an angel] to be in the depicted situation].
   c. Ferdinand painted [cp [dp an angel] to be in Jacob's dream-world in the depicted situation].

In virtue of the above, (15a) is true on its de dicto-reading. However, since the angel in Bol's picture is intended to be a figment of Jacob's imagination—rather than an inhabitant of Bol's depicted situation in which Jacob is dreaming (see [51, p. 434])—(15a) cannot be interpreted as (15b). Instead, it is intuitively interpreted as (15c).

4. Notably, (13a) violates Percus' Generalization X (see [37, p. 201]). This rule demands that the world/situation variable that a verb selects for must be coindexed with the nearest lambda abstractor above it (in Forbes' example: with the lambda abstractor that is associated with imagine). On Percus' account, the de dicto-reading of (1b) is analyzed as (∗), where w0 and w1 range over situations:
  (∗) λw0 [Mary imagined-in-w0 [λw1 [a unicorn-in-w1 exists-in-w1]]].

2.4 Challenge 4: Disabled Inferences

Our discussion of the truth-conditions for (15a) suggests that this report is entailed by (16a) (s.t. the inference in (16) is valid). However, the classical logical translations of these reports (in (17b) resp. (17a); see [51, p. 434], cf. [34]) block this inference:5

(16) a. Ferdinand painted [cp Jacob dreaming of [dp an angel]].
 ⇒ b. Ferdinand painted [dp an angel].

(17) a. paint_i(ferdinand, λj^s. dream_j(jacob, λk^s(∃x^e)[angel_k(x)]))
   b. paint_i(ferdinand, λj^s(∃y^e)[angel_j(y)])

One could try to capture the above inference by taking the existential quantifier in (17b) to range over non-existent objects (incl. dream-angels) and by adopting an exportation principle (e.g. (18)) that allows dream-objects to inherit some of the properties (here: the property of being an angel) from the contents of the dreams that they occupy (see [51, pp. 434–435]):

(18) (∀j^s)[dream_j(jacob, λk(∃x)[angel_k(x)]) → (∃y)[angel_j(y)]]

However, it is not clear whether this strategy can be used in a systematic way (see [51, p. 435]; cf. [9, p. 103 ff.]).

5. Below, types are given in superscript. Following standard convention, we let s be the type for indices (i.e. world/time-pairs) or for situations. e and t are the types for individuals and truth-values or truth-combinations, respectively. Types of the form (αβ) (for short: αβ) are associated with functions from objects of type α to objects of type β.


2.5 Challenge 5: Unwarranted Inferences

Beyond the above, Zimmermann [51, pp. 435–436] has shown that Hamlet ellipsis makes wrong predictions about the admissible entailments of depiction reports: arguably, the existence of live unicorns necessitates the existence of pumping unicorn hearts (s.t. all worlds that are inhabited by live unicorns contain (pumping) unicorn hearts; see (19b)). However, (19c) does not intuitively follow from (19a):

(19) a. Mary imagined [a live unicorn (to be)].
   b. (∀j)[(∃x)[unicorn_j(x)] → (∃y)[unicorn-heart_j(y)]]
 ⇏ c. Mary imagined [a unicorn heart (to be)].

In particular, it is possible for Mary to imagine a live unicorn (e.g. a unicorn that is cantering through a meadow; s.t. (19a) is true) without imagining a unicorn heart (s.t. (19c) is false).

2.6 Objective

We propose to answer the above challenges by adopting a uniform variant of Quine's lexical decomposition-account of objectual attitude verbs (see (4)–(6)), dubbed Hamlet semantics. This semantics interprets all DP-taking occurrences of depiction verbs V as 'V [cp there being [dp ] (in the subject's V'ed situation)]' (see (20b)) or, equivalently, as 'V [cp [dp ] to exist . . .]' (see (20c)). In virtue of this interpretation, depiction verbs uniformly denote relations to an existential proposition:

(20) a. Mary imagined [dp a unicorn].
  ≡ b. Mary imagined [there BEING [dp a unicorn] (in her imagined situation)].
  ≡ c. Mary imagined [cp [dp a unicorn] to EXIST (in her imagined situation)].

In contrast to Parsons' account, Hamlet semantics denies the elliptical nature of the complement in depiction reports. In particular, this semantics analyzes the object DPs in these reports as DPs (as opposed to CPs, as on Parsons' account) and interprets them in the classical type of DPs (i.e. as intensional generalized quantifiers; type s((s(et))t)). The propositional interpretation of these DPs is enabled by the particular interpretation of depiction verbs. This interpretation converts intensional quantifiers into propositions during semantic composition.6

The possibility of coding intensional quantifiers as propositions (see above) is enabled by the fact that the object DPs in de dicto-readings of depiction reports all denote existential quantifiers7 (i.e. quantifiers of the form λj^s λP^s(et) (∃x)[B_j(x) ∧ P_j(x)], where B is a non-logical constant of type s(et); see [49, pp. 160–164]) and by the correspondence between existential quantifiers and existential propositions (i.e. propositions of the form λj^s (∃x^e)[B_j(x)]). This correspondence is witnessed by the function λQ^s((s(et))t) λj^s (∃x^e)[Q_j(λk^s λy^e. x = y)] (see Sect. 4.2).

To answer Challenges 2–4, we take the propositional relata of depiction verbs to be a particular kind of existential propositions, viz. existential situated propositions (see [27, pp. 660–662]). The latter are existential propositions that are essentially 'linked' to a particular situation or set of situations (in a sense specified in Sect. 4.1). Our use of existential situated propositions is motivated by the observation that depiction reports typically have a 'vivid' (or experiential) reading on which their complements describe a situation, event, or state (in (1a): Mary's imagined situation; see [16, 47]).

Below, we first give the motivation and background of Hamlet semantics (in Sect. 3). Following the detailed presentation of this semantics (in Sect. 4), we then show that this semantics answers all of the above challenges (in Sect. 5).

6. In virtue of this interpretation, Hamlet semantics differs from Moltmann's [31] intensional quantifier analysis of the complements of intensional transitive verbs.
7. We will see in Sect. 4.2 that—given certain constraints—this move still allows depiction verbs to take non-existential quantified DPs in object position.
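For concreteness, the witnessing function can be prototyped executably. The following Haskell sketch is a toy model of our own (the finite domain, the integer coding of situations, and the predicate unicorn are illustrative assumptions, not part of the paper's formal apparatus); it shows how an existential intensional quantifier is coded as an existential proposition:

    -- Toy types: situations (type s) as integers, individuals (type e) as strings.
    type S = Int
    type E = String

    -- Type s(et): situation-relative properties.
    type IProp = S -> E -> Bool

    -- Type s((s(et))t): intensional generalized quantifiers.
    type IQuant = S -> (IProp -> Bool)

    -- A finite domain keeps quantification computable.
    domain :: [E]
    domain = ["u1", "u2", "h1"]

    -- A non-logical constant B of type s(et): 'unicorn'.
    unicorn :: IProp
    unicorn j x = j == 1 && x `elem` ["u1", "u2"]

    -- The existential quantifier denoted by 'a unicorn':
    -- lambda j lambda P. exists x [unicorn_j(x) & P_j(x)].
    aUnicorn :: IQuant
    aUnicorn j p = any (\x -> unicorn j x && p j x) domain

    -- The witnessing function lambda Q lambda j. exists x [Q_j(lambda k lambda y. x = y)]:
    -- it codes an existential quantifier as an existential proposition (type st).
    toProp :: IQuant -> (S -> Bool)
    toProp q j = any (\x -> q j (\_ y -> y == x)) domain

    main :: IO ()
    main = print (map (toProp aUnicorn) [0, 1, 2])  -- [False,True,False]

On this toy model, the resulting proposition is true exactly at the situations that contain a unicorn, as the correspondence requires.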

3 Motivation and Background

We begin our presentation of the motivation and background of Hamlet semantics by identifying empirical support for the vivid interpretation of the object DPs in depiction reports:

3.1 Vivid Attitude Reports

The vivid reading of (1a)/(20b) (on which the complement (there being) a unicorn describes a situation) is supported by the possibility of modifying the matrix verb in this report through an 'experiential' modifier like vividly8 or in vivid/lifelike detail (see (21); cf. [47, p. 156]), of combining the complement in this report with a situation-related locative or temporal modifier (see (22); cf. [28, pp. 280, 287–294]),9 and of rephrasing an equivalent of the complement of this report as an embedded eventive how-complement (see (23); cf. [7, 42, 48]). The latter possibility is corroborated by the observation that physical or mental images typically do not represent isolated items of information (e.g. Kratzer-style facts), but informationally richer objects (see [51, p. 433]).

(21) a. Mary vividly imagined [dp a unicorn].
   b. Mary imagined [dp a unicorn] in vivid/lifelike detail.

(22) a. Mary imagined [(there being) a unicorn in the meadow].
   b. Mary imagined [(there being) a unicorn yesterday]. (cf. (10b))

(23) a. Mary imagined [[dp a unicorn] cantering through a meadow].
   b. Mary imagined [how [dp a unicorn] was cantering through a meadow].

Note that (1a) is only equivalent to (1b/c) and (20b), but not to (2). Note further that, in contrast to reports with infinitival complements (e.g. (1b/c)), reports with gerundive small clause complements (e.g. (20b)) only allow for a vivid reading (see [47, p. 149]). We take this fact to explain why (20b) is intuitively judged to be a better paraphrase of (1a) than (1b/c) (see fn. 3). We will discuss the relation between (non-vivid) finite and vivid non-finite (infinitival or gerundive) clauses in Sect. 4.2.

8. Arguably, modification with vividly works better for imagine than for paint (see (21a) vis-à-vis (i-a)). In particular, the most natural interpretation of (i-a) (in (i-b)) treats vividly as a resultative predicative. I thank an anonymous reviewer for SCI-LACompLing2018 for directing my attention to this issue.
  (i) a. Penny vividly painted [dp a penguin].
    b. Penny painted [dp a penguin] such that it comes across as vivid.
9. Maienborn [28] uses these tests as a tool for detecting eventuality arguments. Given the similarities between situations and events (see [23, Sect. 9]), it is not surprising that they double as a diagnostic for situation arguments.

Note that (1a) is only equivalent to (1b/c) and (20b), but not to (2). Note further that, in contrast to reports with infinitival complements (e.g. (1b/c)), reports with gerundive small clause complements (e.g. (20b)) only allow for a vivid reading (see [47, p. 149]). We take this fact to explain why (20b) is intuitively judged to be a better paraphrase of (1a) than (1b/c) (see fn. 3). We will discuss the relation between (non-vivid) finite and vivid non-finite (infinitival or gerundive) clauses in Sect. 4.2.

3.2 Situations and Vivid Interpretation

We have mentioned above that the complements of vivid readings of depiction reports describe a situation.10 Notably—because of the physical constraints on perception, documentation, and memory—situations of this kind need not be informationally total objects (see [6, p. 692]). This is illustrated by the situation that is denoted by the object DP in (24). This DP describes (a part of) the content of Yekaterina Zernova's painting Collective Farmers Greeting a Tank (1937). This painting shows a group of farmers and their families waving and raising bouquets of flowers to greet an approaching tank.11

(24) Yekaterina painted [dp a tank].

While the visual scene that is depicted in Zernova's painting is likely located in the actual world (Zernova was a realist painter), some contextually or perceptually non-salient facts (e.g. the fact that the tank has a (working) engine, or that the farmers all have blood group zero) are not part of this situation. Specifically, even if the tank has an engine in the depicted part of the real world, (25) will still be judged to be an incorrect description of the content of Zernova's painting.

10. For the purposes of this paper, we treat situations as non-decomposable primitives (following [20, 21]). However, nothing hinges on this treatment.
11. See https://thewire.in/history/russian-revolution-catalysed-array-experiments-art (accessed April 27, 2019).


(25) Yekaterina painted [dp the engine of a tank].

To adequately interpret reports like (24), we identify situations with informationally incomplete world-parts, analogous to Liefke and Werning's [27] contextually specified situations (see also [20, 21]). Such situations are obtained by identifying the contextually or perceptually salient information about the relevant spatio-temporal world-part (for (24): information about the particular time and location of the real world, depicted by Zernova, in which farmers and their families are greeting a tank; cf. [27, pp. 658–659]). This information includes information about the 'inventory' of the specified world-part (for Zernova's painting: six people, one tank, five flower bouquets, . . .) and information about the properties of the 'items' in this inventory (e.g. two people are younger males greeting a/the tank, one is an elderly male holding a flower bouquet).

Because of the informational nature of situations, the above-mentioned properties need not be closed under classical modal entailment: for example, the co-existence (in all standard contexts) of approaching tanks and tanks with a (working) engine does not exclude the existence of situations that contain an approaching tank, but in which this tank does not have an engine, or in which the existence of an engine is left open. We will see in Sect. 5.4 that the informational incompleteness of situations enables an easy answer to the challenge from unwarranted inferences (see Sect. 2.5).

However, even informationally incomplete world-parts are not (yet) quite suitable for the interpretation of vivid depiction reports: in contrast to vision reports (whose target situations are typically spatio-temporally located), the target situations of depiction reports are often not anchored in a particular world or time (see [38, 39]).12 In particular, given the non-existence of unicorns in the actual world, it is likely that the situation that is described by the object DP in (1a) is not part of (i.e. is not anchored in) a particular world at a particular time. For example, in the described imagining situation, Mary may only have imagined a certain set of properties of some arbitrary unicorn, rather than a situation in a particular world/time in which a certain unicorn exhibits said properties.

The non-anchoredness of situations is also illustrated by contexts in which the cognitive agent is unable to identify a particular world or time that is characterized by the content of the attitude complement. Such contexts are exemplified by a context for (3a) in which Penny paints the Phillip Island penguin Pebbles standing at the southernmost tip of Summerland Beach wearing an orange knitted sweater,13 but in which she does not have in mind—or does not remember—a specific time at which Pebbles was wearing the sweater at that location. (Assume that Penny has watched Pebbles and his sweater at Summerland Beach on multiple occasions.)

12. In this respect, our situations are distinct from Kratzer-style situations [20, 21] and Davidsonian events [5], which are both "unrepeatable entities with a location in space and time" (see [25, p. 151]). For Kratzer-style situations, this is reflected in the stipulation that "every possible situation s is related to a unique maximal element, which is the world of s" (see [21, p. 660, Condition 5]).
13. See https://www.nzherald.co.nz/lifestyle/news/article.cfm?c_id=6&objectid=11400740 (accessed April 27, 2019).


To obtain non-anchored situations of the above kind, we represent situations by sets of informationally incomplete world-parts (type st). Anchored situations are then represented by singletons that contain the relevant informationally incomplete world-part. Non-anchored situations are represented by sets of parts of different worlds that share the relevant information (i.e. by sets of isomorphic (= qualitatively identical) situations; see [21, p. 667]; cf. [8, p. 136]). For the above-described interpretation of (3a), these are situations with different temporal anchors in which Pebbles is standing at the southernmost tip of (real-world) Summerland Beach and is wearing an orange knitted sweater. For a given interpretation of (1a), these are situations with different worldly anchors that are inhabited by a/some unicorn that exemplifies a certain particular set of properties (e.g. being white, having a long spiraling horn, and cantering through a meadow).

In what follows, we will refer to the (possibly non-anchored) situations that are described by the complements of depiction reports as internal situations (or as imagined, or depicted, situations).14 The situation of evaluation, i, will be called the external situation.

We close this section with a comparison between our non-anchored situations and Kratzer-style propositional facts (see [21, pp. 667–669]). The latter are sets of extensions, p := {j | f ≤ j}, of an actual fact f that are closed under persistence and maximal similarity.15 Our characterization of non-anchored situations suggests that these situations are close in spirit to propositional facts. Non-anchored situations differ from propositional facts w.r.t. their informational richness and their cardinality. In particular, since agents may stand in depiction relations to informationally rich (i.e. non-minimal) situations, we allow that the members of non-anchored situations share more information than the information that is encoded in a single fact. Because of the particular nature of our application, we further do not require that our non-anchored situations be closed under persistence.

Granted the extension of Kratzer's framework with non-minimal, non-persistence-closed variants of propositional facts, the present proposal could alternatively be rephrased in a thus-extended Kratzerian framework. However, since the required objects are already available in Liefke and Werning's ontology (see [27])—and since their account identifies the origins of informational partiality more carefully than Kratzer's account (see [21])—we here adopt Liefke and Werning's ontology.

14. Arguably, the term internal situation is ambiguous in iterated attitude reports like (55a) (i.e. Ferdinand painted Jacob dreaming of an angel). In (55a), the situation that is denoted by the most deeply embedded complement (i.e. the situation denoted by an angel) is internal both w.r.t. the inner verb (i.e. dream) and the outer verb (i.e. paint). The relativization of this situation to the attitude verb (here: the replacement of internal situation by depicted situation) avoids this ambiguity.
15. It is closure under maximal similarity that effects the non-anchoredness of propositional facts.
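The set-based representation of situations can also be made concrete. The following Haskell sketch is, again, a toy model of our own: world-parts are modeled as finite lists of atomic facts of the form 'a Fs in w at t', and the information ordering ≤ as fact-inclusion; the particular facts are illustrative assumptions only.

    -- A world-part is a finite set of facts; a situation (type st) is a set
    -- of world-parts.
    type Fact = (String, String, String, Int)  -- (individual, property, world, time)
    type WorldPart = [Fact]
    type Situation = [WorldPart]

    -- The information ordering: k <= j iff j contains all information in k.
    leq :: WorldPart -> WorldPart -> Bool
    leq k j = all (`elem` j) k

    -- An anchored situation: a singleton set of world-parts.
    anchored :: Situation
    anchored = [[("pebbles", "wears-sweater", "w0", 3)]]

    -- A non-anchored situation: qualitatively identical parts of different
    -- worlds (here: a unicorn cantering, realized in w1 and in w2).
    nonAnchored :: Situation
    nonAnchored = [ [("u", "unicorn", w, 0), ("u", "canters", w, 0)]
                  | w <- ["w1", "w2"] ]

    main :: IO ()
    main = do
      print (length anchored, length nonAnchored)                 -- (1,2)
      print (leq [("u", "unicorn", "w1", 0)] (head nonAnchored))  -- True

Nothing in the paper's argument depends on this particular encoding; it merely fixes one way in which isomorphic world-parts can share information without sharing a worldly anchor.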


4 Hamlet Semantics

With the above background in place, we are now ready to present our new semantics for depiction reports. To facilitate the presentation of this semantics, we first give a sentence-level semantics for such reports (in Sect. 4.1). To extend this semantics into a fully compositional theory, we then specify the semantic contribution of depiction verbs (in Sect. 4.2) and compare the contribution of finite and non-finite clausal complements of these verbs (in Sect. 4.3).

4.1 Sentence-Level Hamlet Semantics

Our proposed interpretation of (1a) is given in (26).16 There, f_c(λj^s ∃x^e. unicorn_j(x) ∧ E_j(x)) is the set of isomorphic situations whose facts Mary imagines in i (see Sect. 3.2; cf. [27, pp. 664–665]). This set is identified by a parametrized choice function, f (see [15, pp. 367–369]). This function selects a subset of a given set of situations λj^s[. . .] in dependence on a parameter, c, for the described depiction event (here, c := (ιe)[imagine_i(e) ∧ agent_i(e) = mary], where e is a variable over events (type v) and imagine is a constant of type s(vt)).17 The dependence of f on c is evidenced by the observation that different imagining events of Mary's (at different times)—and different imagining events of different agents at the same time—typically have different contents. Below, E is a situation-relative existence predicate: 'E_j(x)' asserts that the individual/object x exists in the situation j. For the behavior of E, the reader is referred to [26, p. 117 ff.].

(26) ⟦Mary imagines [dp a unicorn]⟧^i = imagine_i(mary, f_c(λj^s ∃x^e. unicorn_j(x) ∧ E_j(x)))

The above describes the obtaining of the imagining relation between Mary and the (underlined) set of anchored situations that represent Mary's imagined situation in i, in which there is a unicorn. We will hereafter refer to sets of situations of the above form as existential situated propositions. This name is motivated by the fact that the members of such sets are restricted to members of the representation, f_c(λj. . . .), of the internal situation (s.t. the proposition is situated w.r.t. this situation) and in which there is a unicorn (s.t. the proposition is existential w.r.t. unicorns).

Note that, for the interpretation of depiction reports with definite and indefinite DPs, we may waive the need for the conjunct E_j(x) in (26) (i.e. (26) is equivalent to (27)). This is due to the fact that the restrictor in the logical translation of such DPs (in (26): the constant unicorn) is relativized to (the members of the set that represents) the internal situation (in (26): to each j in f_c(λj. . . .)). As a result of this restriction, existential quantification over unicorns entails the existence-in-each-j of at least one unicorn.

(27) imagine_i(mary, f_c(λj ∃x. unicorn_j(x)))

Things are different for depiction reports with embedded proper names: arguably, depiction reports with names (e.g. (28)) can receive an analogous interpretation to depiction reports with definite or indefinite DPs (see (29a)):

(28) Mary imagines [dp Sparkle].

(29) a. imagine_i(mary, f_c(λj ∃x. E_j(x) ∧ x = sparkle))
  ≡ b. imagine_i(mary, f_c(λj. E_j(sparkle)))

(30) imagine_i(mary, f_c(λj ∃x. x = sparkle))

However, if we were to drop the conjunct E_j(x) from (29a) (see (30)), the name's relativization to the internal situation would be lost: the non-relative variant of (29a), i.e. (30), asserts the obtaining of the imagining relation between Mary and the proposition—situated w.r.t. Mary's imagined situation in i—that assumes Sparkle's existence in any situation (including non-members of the set that codes Mary's imagined situation). But this does not capture the intuitive meaning of (28).

16. For reasons of simplicity, we hereafter neglect tense in the logical translation of our examples.
17. As a result of this event-dependence, the interpretation of (1a) uses two translations of imagine, viz. as a situation-relative predicate of pairs of situations and individuals (i.e. imagine; type s(s(et))) and as a situation-relative predicate of events (i.e. imagine; type s(vt)). One could avoid this 'dual translation' by adopting instead the fully-fledged event-interpretation of (1a) (see the Neo-Davidsonian version in (•a) [3, 35] and the original Davidsonian version in (•b) [5]). (I thank an anonymous referee for SCI-LACompLing2018 for suggesting the interpretation in (•a).)
  (•) a. ∃e[imagine_i(e) ∧ agent_i(e) = mary ∧ theme_i(e) = f_e(λj ∃x. unicorn_j(x) ∧ E_j(x))]
    b. ∃e[imagine_i(e, mary, f_e(λj ∃x. unicorn_j(x) ∧ E_j(x)))]
Since the interpretation in (26) is closer in spirit to established semantics for depiction reports (e.g. [49, 50]), we here adopt this interpretation. Readers who prefer the above event-interpretation are free to adopt this interpretation instead. For a compositional implementation of Neo-Davidsonian event semantics, these readers are referred to [4].
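The sentence-level interpretation in (26)/(27) can be rendered executably as follows. This Haskell sketch is only a model of the compositional shape: the concrete choice function and the imagining relation, which the paper leaves as parameters, are stipulated stand-ins of our own.

    type S = Int
    type E = String

    unicorn :: S -> E -> Bool
    unicorn j x = j `elem` [10, 11] && x == "u1"

    -- Stand-in for the parametrized choice function f_c: given the described
    -- depiction event c, select a subset of a set of situations (type st).
    type Event = String

    entertained :: Event -> [S]
    entertained "maryImagining" = [10, 11]  -- hypothetical content of Mary's event
    entertained _               = []

    f :: Event -> (S -> Bool) -> (S -> Bool)
    f c p j = j `elem` entertained c && p j

    -- Stand-in for the relation imagine_i between an agent and a set of
    -- situations: it holds iff the set is non-empty on the agent's situations.
    imagineRel :: E -> (S -> Bool) -> Bool
    imagineRel "mary" p = any p (entertained "maryImagining")
    imagineRel _      _ = False

    -- (27): imagine_i(mary, f_c(lambda j. exists x [unicorn_j(x)]))
    sentence26 :: Bool
    sentence26 = imagineRel "mary" (f "maryImagining" (\j -> any (unicorn j) ["u1"]))

    main :: IO ()
    main = print sentence26  -- True in this toy model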

4.2 Compositional Hamlet Semantics

In Sect. 2.6, we have described Hamlet semantics as an 'existential' variant of Quinean lexical decomposition-accounts that interprets DP-taking occurrences of depiction verbs V as 'V [cp [dp ] to exist (in the subject's V'ed situation)]' (see (20c)). Our semantics for DP-taking occurrences of imagine captures this interpretation:

(31) ⟦imagine⟧^i = λQ^s((s(et))t) λz^e [imagine_i(z, f_c(λj. Q_j(λk^s λy^e. E_k(y))))]

This semantics interprets imagine as a function that 'converts' its quantifier-argument (here: Q) into the situated proposition that, in all members of the set that represents the subject's imagined situation in i, Q is instantiated by the property of situation-relative existence. The compositional interpretation of (1a) is given in (32) (see (26)):


(32) ⟦Mary imagines [dp a unicorn]⟧^i
  ≡ ⟦imagine⟧^i(⟦Mary⟧, ⟦a unicorn⟧)
  = λQλz[imagine_i(z, f_c(λj. Q_j(λkλy. E_k(y))))](mary, λl^s λP^s(et) ∃x. unicorn_l(x) ∧ P_l(x))
  ≡ imagine_i(mary, f_c(λj. [λP ∃x. unicorn_j(x) ∧ P_j(x)](λkλy. E_k(y))))
  ≡ imagine_i(mary, f_c(λj ∃x. unicorn_j(x) ∧ E_j(x)))
  ≡ imagine_i(mary, f_c(λj ∃x. unicorn_j(x)))

With a minor tweak, the interpretation of imagine from (31) also enables the interpretation of CP complements. This tweak uses the possibility of representing (type-st) propositions p by intensional quantifiers λj^s λP^s(et)[p_j]. CP-taking occurrences of imagine are then interpreted as follows:

(33) ⟦imagine [cp p]⟧^i
  = λQλz[imagine_i(z, f_c(λj. Q_j(λkλy. E_k(y))))](λl^s λP^s(et). p_l)
  ≡ λz[imagine_i(z, f_c(λj. [λP^s(et). p_j](λkλy. E_k(y))))]
  ≡ λz[imagine_i(z, f_c(λj. p_j))]

Granted the interpretation of be (or exist) as situation-relative existence (see (34)), (33) enables the compositional interpretation of (20b) (and of (1b/c)) as (1a) (see (36)). This interpretation uses the intermediate step in (35):

(34) ⟦(to) be there⟧ ≡ ⟦(to) exist⟧ = λQλj[Q_j(λkλy. E_k(y))]

(35) ⟦there being [dp a unicorn]⟧ ≡ ⟦be there⟧(⟦a unicorn⟧)
  = λQλj[Q_j(λkλy. E_k(y))](λl^s λP^s(et) ∃x. unicorn_l(x) ∧ P_l(x))
  ≡ λj[(λl^s λP^s(et) ∃x. unicorn_l(x) ∧ P_l(x))(j)(λkλy. E_k(y))]
  ≡ λj ∃x. unicorn_j(x) ∧ (λkλy. E_k(y))(j)(x)
  ≡ λj ∃x. unicorn_j(x) ∧ E_j(x)
  ≡ λj ∃x. unicorn_j(x)

(36) ⟦Mary imagines [cp there being [dp a unicorn]]⟧^i
  ≡ ⟦imagine⟧^i(⟦Mary⟧, ⟦be there⟧(⟦a unicorn⟧))
  = λQλz[imagine_i(z, f_c(λj. Q_j(λkλy. E_k(y))))](mary, λl ∃x. unicorn_l(x))
  ≡ λQλz[imagine_i(z, f_c(λj. Q_j(λkλy. E_k(y))))](mary, λlλP ∃x. unicorn_l(x))
  ≡ imagine_i(mary, f_c(λj. [λP ∃x. unicorn_j(x)](λkλy. E_k(y))))
  ≡ imagine_i(mary, f_c(λj ∃x. unicorn_j(x)))
  = ⟦imagine⟧^i(⟦Mary⟧, ⟦a unicorn⟧)
  ≡ ⟦Mary imagines [dp a unicorn]⟧^i
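The reduction in (32) can be traced mechanically. The Haskell sketch below is a hedged rendering of the lexical entry (31) on a toy model (the domain, the existence predicate, the choice function fc, and the relation imagineRel are all stand-ins of our own):

    type S = Int
    type E = String
    type IProp  = S -> E -> Bool            -- type s(et)
    type IQuant = S -> (IProp -> Bool)      -- type s((s(et))t)

    domain :: [E]
    domain = ["u1", "p1"]

    unicorn :: IProp
    unicorn j x = j == 1 && x == "u1"

    -- Situation-relative existence E_k(y); toy: the whole domain exists everywhere.
    existsIn :: IProp
    existsIn _ _ = True

    aUnicorn :: IQuant
    aUnicorn j p = any (\x -> unicorn j x && p j x) domain

    -- Stand-ins: Mary's imagined situation in i is represented by {1}.
    fc :: (S -> Bool) -> (S -> Bool)
    fc p j = j == 1 && p j

    imagineRel :: E -> (S -> Bool) -> Bool
    imagineRel _z p = any p [0 .. 5]

    -- Entry (31): the verb feeds its quantifier argument the property of
    -- situation-relative existence, then situates the result via fc.
    imagineV :: IQuant -> E -> Bool
    imagineV q z = imagineRel z (fc (\j -> q j existsIn))

    main :: IO ()
    main = print (imagineV aUnicorn "mary")  -- True, matching the last line of (32)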

Note that (33) (cf. (31)) interprets imagine as a kind of context shifter (see [17, 45]; cf. [18, Sect. VII]). The latter is a displacing device on contexts that alters the content of its embedded sentence. In our interpretation of imagine from (33), this displacement shifts the external situation i (identified, for simplicity, with the utterance context) to the set, f_c(λj. p_j), that codes the internal situation. We will show the use of the interpretation of depiction verbs as context shifters in Sect. 5.2.

We close this section with a note on the available readings of depiction reports: above, we have focused on de dicto-readings of depiction reports, on which the object DP in these reports receives a non-specific interpretation. In Hamlet semantics, specific (de re-)readings of the object DPs in depiction reports can still be obtained in the standard way, i.e. by letting the DP raise out of the scope of the depiction verb. The de re-interpretation of (1a) is given in (37):

(37) ⟦[dp a unicorn] [λ1 [Mary imagines t1]]⟧^i
  = λP(∃x)[unicorn_i(x) ∧ P_i(x)](λjλy. imagine_j(mary, f_c(λk. E_k(y))))
  ≡ (∃x)[unicorn_i(x) ∧ imagine_i(mary, f_c(λj. E_j(x)))]

Zimmermann [49, pp. 160–161] has observed that, for depiction reports with a strong quantificational object DP (e.g. (38)), de re-readings are, in fact, the only available readings.

(38) Penny painted [dp every penguin].

To capture Zimmermann's observation, one could replace the interpretation of imagine from (31) by the interpretation in (39). In the new interpretation, the quantifier '∃x' restricts the DPs that are acceptable as non-specific objects of depiction verbs to those that denote existential quantifiers (see [49, pp. 162–167]).

(39) ⟦imagine⟧^i (alternative interpretation)
  = λQλz[imagine_i(z, f_c(λj ∃x. Q_j(λkλy. x = y ∧ E_k(y))))]

(39) excludes the de dicto-reading of (38), as desired (see the contradiction in (40), where c := (ιe)[paint_i(e) ∧ agent_i(e) = penny]). At the same time, it still allows for the de re-interpretation of (38) (which is obtained by quantifying-in, see (37)) and for the de re/de dicto-ambiguity of (1a).

(40) ⟦Penny paints [dp every penguin]⟧^i
  = λQλz[paint_i(z, f_c(λj ∃u^e. Q_j(λkλy. u = y ∧ E_k(y))))](penny, λl^s λP^s(et) ∀x. penguin_l(x) → P_l(x))
  ≡ paint_i(penny, f_c(λj ∃z. [λP^s(et) ∀x. penguin_j(x) → P_j(x)](λkλy. z = y ∧ E_k(y))))
  ≡ paint_i(penny, f_c(λj. ⊥)) ≡ ⊥

However, since Hamlet ellipsis focuses on depiction reports with indefinite object DPs, we here use the simpler interpretation of depiction verbs from (31). The reader will see that all results from the subsequent sections can be equally achieved by using the semantics for depiction verbs from (39).
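The collapse behind (40) can be checked on a small model. The following Haskell sketch (our own toy illustration; the collapse to falsity arises here whenever the situation contains more than one penguin) shows how the singleton property used in (39) lets an indefinite survive while a strong quantifier fails:

    type S = Int
    type E = String
    type IProp  = S -> E -> Bool
    type IQuant = S -> (IProp -> Bool)

    domain :: [E]
    domain = ["pg1", "pg2"]

    penguin :: IProp
    penguin j x = j == 1 && x `elem` ["pg1", "pg2"]

    aPenguin, everyPenguin :: IQuant
    aPenguin     j p = any (\x -> penguin j x && p j x) domain
    everyPenguin j p = all (\x -> not (penguin j x) || p j x) domain

    -- The proposition built by entry (39):
    -- lambda j. exists x [Q_j(lambda k lambda y. x = y)].
    prop39 :: IQuant -> (S -> Bool)
    prop39 q j = any (\x -> q j (\_ y -> y == x)) domain

    main :: IO ()
    main = do
      print (prop39 aPenguin 1)      -- True: existential quantifiers survive
      print (prop39 everyPenguin 1)  -- False: the de dicto-reading of (38) collapses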


4.3 Vivid Versus Non-Vivid Imagining

We have suggested in Sect. 3.1 that (1a) is only equivalent to (20b) (and to (1b–c)), but not to (2). To account for the intuitive difference between (1a) and (2), we assume that the complements of non-vivid depiction reports have a different interpretation from the complements of vivid depiction reports. In particular, since finitely complemented depiction reports fail most of the diagnostics for vividness from Sect. 3.1 (see (41), (42)),18 we assume with [47] that they assert the obtaining of relations to isolated facts, rather than to full (i.e. informationally rich) situations.

(41) a. ??Mary vividly imagined [cp that there was [dp a unicorn]].
   b. ??Mary imagined [cp that there was [dp a unicorn]] in vivid/lifelike detail.

(42) ??Mary imagined [how there was [dp a unicorn]].

We assume that the non-vivid interpretation of reports with finite clausal complements is achieved through the interpretation of the complementizer that (see [27, pp. 669, 676–678]; cf. [22, pp. 5–6]). This interpretation applies to sets of situations to identify the informationally minimal members of these sets (see (43)). In (43), ≤ is a partial ordering on the set of situations that is induced by the informational incompleteness of situations (see Sect. 3.2). Formulas of the form 'k ≤ j' assert that the situation j contains all information of the form 'a Fs in w at t' that is contained in k, where a and F are an individual and a property or activity, respectively (see [27, pp. 658–659]).

(43) ⟦that⟧ = λpλj[p_j ∧ (∀k. (p_k ∧ k ≤ j) → k = j)]

The adoption of (43) enables a uniform interpretation (as (31)) of the occurrences of imagine in (1a)/(20b) and (2). The interpretation of (2) is given in (45). This interpretation uses the interpretation of the CP that there is a unicorn from (44):

(44) ⟦that there is [dp a unicorn]⟧
  = λpλj[p_j ∧ (∀k. (p_k ∧ k ≤ j) → k = j)](λl ∃x. unicorn_l(x))
  ≡ λj ∃x. unicorn_j(x) ∧ (∀k. ((∃y. unicorn_k(y)) ∧ k ≤ j) → k = j)

(45) ⟦Mary imagines [cp that there is [dp a unicorn]]⟧^i
  = λQλz[imagine_i(z, f_c(λj. Q_j(λkλy. E_k(y))))](mary, λlλP ∃x. unicorn_l(x) ∧ (∀k′. ((∃y′. unicorn_k′(y′)) ∧ k′ ≤ l) → k′ = l))
  ≡ imagine_i(mary, f_c(λj. [λP ∃x. unicorn_j(x) ∧ (∀l. ((∃y′. unicorn_l(y′)) ∧ l ≤ j) → l = j)](λkλy. E_k(y))))
  ≡ imagine_i(mary, f_c(λj ∃x. unicorn_j(x) ∧ (∀k. ((∃y. unicorn_k(y)) ∧ k ≤ j) → k = j)))

18. The semantic deviance of (42) is due to the fact that states (incl. those introduced by be) do not allow for manner modification (see [28]). I thank Sebastian Bücking for providing this reference.
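The complementizer semantics in (43) can be prototyped directly. In the hedged Haskell sketch below (a toy model of our own, with world-parts as finite fact-sets and ≤ as fact-inclusion, as in Sect. 3.2), the operator keeps only the informationally minimal members of a set of situations:

    type Fact = String
    type WorldPart = [Fact]

    leq :: WorldPart -> WorldPart -> Bool
    leq k j = all (`elem` j) k

    -- The candidate situations of the toy model.
    sits :: [WorldPart]
    sits = [ ["unicorn u"]
           , ["unicorn u", "canters u"]
           , ["unicorn u", "canters u", "meadow m"] ]

    -- (43): that(p) = lambda j. p j & forall k. (p k & k <= j) -> k = j.
    thatC :: (WorldPart -> Bool) -> (WorldPart -> Bool)
    thatC p j = p j && all (\k -> not (p k && leq k j) || k == j) sits

    -- 'there is a unicorn': true in any situation containing that fact.
    aUnicornProp :: WorldPart -> Bool
    aUnicornProp j = "unicorn u" `elem` j

    main :: IO ()
    main = print (filter (thatC aUnicornProp) sits)
    -- [["unicorn u"]]: only the minimal unicorn-situation survives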


(45) interprets imagine as a relation to a set of minimal situations in which there is a unicorn. Our interpretations of (20b) (see (36); copied in (46a)) and (2) (see (45); copied in (46d)) capture the intuitive validity of the inference from (20b) (and, hence, from (1a)) to (2) (see (46)). This validity is ensured by the principles in (46b) and (46c). These principles capture the semantic inclusion of the complement of (2) in the complement of (20b) (see (46b)) and the upward-monotonicity of the complement of imagine (see (46c)):

(46) a. ⟦Mary imagines [cp there being [dp a unicorn]]⟧^i = imagine_i(mary, f_c(λj ∃x. unicorn_j(x)))
   b. (∀p)(∀c)[f_c(λj[p_j ∧ (∀k. (p_k ∧ k ≤ j) → k = j)]) ≤ f_c(p)]
   c. (∀j)(∀z)[imagine_i(z, j) → (∀k. k ≤ j → imagine_i(z, k))]
 ⇒ d. ⟦Mary imagines [cp that there is [dp a unicorn]]⟧^i = imagine_i(mary, f_c(λj ∃x. unicorn_j(x) ∧ (∀k. ((∃y. unicorn_k(y)) ∧ k ≤ j) → k = j)))

The inference in the other direction (i.e. from (46d) to (46a)) is formally and intuitively invalid.

The attentive reader may have noted that, in Sect. 1, we have only given an example of an imagination report, but not of a painting report with a finite that-clause complement. In fact, in contrast to same-structure imagination reports (see (2), copied below), finite clausal complements in painting reports are grammatically odd (see (47)):

(2) ✓Mary imagines [cp that there is [dp a unicorn]].

(47) ∗Penny paints [cp that there is [dp a unicorn]].

exception to this is Larson [24, p. 232], who labels the discussed class verbs of depiction and imagination. 20 We interpret the difficulty of this endeavor as support for the vivid interpretation of imagination reports.

Saving Hamlet Ellipsis

33

To capture the lower degree of abstractness of the content of pictures (vis-à-vis the content of mental images), we assume that the function that selects a set of situations in dependence on a particular painting event is undefined for sets of informationally minimal situations (incl. the denotations of that-clauses). This undefinedness then underlies the ungrammaticality of (47). We leave a more detailed description and explanation of the selectional behavior of different subclasses of depiction verbs as a topic for future work.

5 Solving the Challenges With our semantics for depiction reports in place, we are now ready to answer the challenges from Sect. 2. We start by presenting those answers (i.e. our answers to Challenges 2–4) that crucially use the situatedness of depiction complements in Hamlet semantics.

5.1 Solving Challenge 2 (Material Inadequacy) The interpretation of (1a) as (26) (see (48), below) enables a straightforward answer to Forbes’ challenge from material inadequacy (see Sect. 2.2). In particular, in virtue of this interpretation, Hamlet semantics does not demand that the complements of depiction reports be interpreted with respect to the external situation i (i.e. (1a) need not be interpreted as (13a), copied below). This is due to the non-factive nature of imagine—in particular, to the observation that the world(s) that are associated with (the situation represented by) f c (λj ∃x. unicorn j (x)) may be different from the world that is associated with the external situation i. As a result, acknowledging the existence of unicorns (or other fictional objects) in f c (λj ∃x. unicorn j (x)) does not force a commitment to the existence of unicorns in i. (13a) Mary imagined-in-i [cp [dp a unicorn] to be-in-i]. (48) Mary imagines [dp a unicorn]i ≡ Mary imagines [cp [dp a unicorn] to exist in her imagined situation]i ≡ Mary imagines-in-i [cp [dp a unicorn] to be-in-the situation that Mary imagines in i ] (see (13b)) = imaginei (mary, f c (λj ∃x. unicorn j (x))) Analogous observations hold for most depiction verbs, including paint and visualize, but (typically) excluding portray (see [51, pp. 427–428]; cf. [12]).21 21 The

factivity of portray is supported by the intuitive semantic deviance of reports (e.g. ( ), ()) that report the portrayal-in-i of an individual that does not exist in i:

34

K. Liefke

Unsurprisingly, the interpretation of depiction complements with respect to a contextually chosen internal situation also solves Forbes’ challenge from negation: since Hamlet semantics interprets (1a) as (13b) (see (48), above), it makes (1a) incompatible with (14b) (see (49a)): i a. Mary does [cp [dp a unicorn] to be/exist]    not imagine ≡ not imaginei Mary, a unicorn to exist = ¬imaginei (mary, f c (λj ∃x. unicorn j (x)))    ≡ b. not imaginei Mary, a unicorn

(49)

In particular, since it denies the existence of a Mary- and i-dependent imagined situation in which there is a unicorn (see (49a)), the negation of (1a), i.e. (14a), is equivalent to the negation, i.e. (14b), of (1b) (see (49b)). This is exactly the prediction of Parsons’ account.

5.2 Solving Challenge 3 (Concealed Iterated Attitudes) To answer Zimmermann’s challenge from concealed iterated attitudes ( Sect. 2.3), we use a pragmatic analysis of (15a) as (50a) (or, equivalently, as (50b))22 : (50) a. Ferdinand painted [dp a/the angel [pp FROM JACOB’S DREAM]]. ≡ b. Ferdinand painted [dp JACOB’S DREAM-angel]. This analysis is triggered by contextual cues and incongruities in Bol’s painting— specifically, by the beam of light that surrounds the angel and by the fact that Jacob and the depicted angel do not co-exist (at any time) in the same world. In particular, the light beam seemingly ‘displaces’ the angel from the primary depicted situation (in which Jacob is dreaming) into another situation (presumably, Jacob’s dream-world). The light beam thus has a similar role to thought bubbles in a comic strip (see [29, p. 7 ff.]). The interpretation of the noun or adjective dream as a context shifter (in (51); see Sect. 4.2) enables the interpretation of (50a) as (53). Below, c := (ιe)[paint @ (e) ∧ agent@ (e) = ferdinand] and c := (ιe)[dream @ (e) ∧ agent@ (e) = jacob], where @ is a variable for the external situation. ( )

?? Paul

portrayed [dp a particular unicorn].

()

?? Paul

portrayed [dp Superman] (from the theatrical release poster of Man of Steel).

For ideas about the treatment of such atypical, i.e., non-factive, uses of portray, the reader is referred to our solution to the challenge from concealed iterated attitudes (see Sect. 5.2). 22 Arguably, this analysis would equally serve to answer Challenge 3 for Hamlet ellipsis. (I thank two anonymous referees for SCI-LACompLing2018 for pointing out this possibility.). A similar point can, in fact, be made for Challenges 2 and 4. However, in contrast to (the suitably modified variant of) Hamlet ellipsis, only Hamlet semantics solves Challenges 1 and 5.

Saving Hamlet Ellipsis

35

(51) [dp [cn ]] from Jacob’s dream ≡ Jacob’s dream-[cn ] = λQλjλP [ f c (λk.Qk (λlλy. Pl (y)))( j)] The interpretation in (53) uses the intermediate step in (52): (52) an angel from Jacob’s dream ≡ Jacob’s dream-angel   ≡ from Jacob’s dream an angel   = λQλjλP [ f c (λk.Qk (λlλy. Pl (y)))( j)] λj λP ∃x. angel j (x) ∧ P j (x)   ≡ λjλP [ f c (λk. λP ∃x. angelk (x) ∧ Pk (x) (λlλy. Pl (y)))( j)] ≡ λjλP [ f c (λk ∃x. angelk (x) ∧ Pk (x))( j)] (53) Ferdinand paints [dp an angel [pp from Jacob’s dream]]i i ≡ Ferdinand  paints [dp Jacob’s dream-angel]  i ≡ paint Ferdinand, from Jacob’s dream(an angel) = λQλz [painti (z, f c (λj.Q j (λkλy. E k (y))))]  ferdinand, λlλP [ f c (λk ∃x. angelk (x) ∧ Pk (x))(l)] ≡ painti (ferdinand, f c (λj.[ f c (λk ∃x. angelk (x) ∧ E k (x))( j)])) ≡ painti (ferdinand, f c (λj. f c (λk ∃x. angelk (x))( j))) The formula in the last line of (53) asserts the angel’s existence in (the set, f c (λk ∃x. angelk (x)), that represents) Jacob’s dream-world. The report (15a) (qua (50a/b)) is thus equivalent to the grammatically slightly odd (15c) (copied in (54a)): (54) a. Ferdinand painted [cp [dp an angel] to BE in Jacob’s dream-world]. ≡ b. Ferdinand painted [cp there BEING [dp an angel] in Jacob’s dream-world]. Notably, in virtue of the above, the depicted angel is neither an inhabitant of the world (or situation) in which Jacob is dreaming (s.t. (15a) ≡ (15b)) nor of the (real) world in which Ferdinand is painting (see Challenge 2). This is in line with Zimmermann’s intuition from [51, p. 435].23 We close this section with a remark on the relation between (50a) and (16a) (copied in (55a)) or (55b): (55) a. Ferdinand painted [cp Jacob dreaming of [dp an angel]]. ≡ b. Ferdinand painted [cp Jacob dreaming of [cp there being [dp an angel]]]. Intuitively, (50a) and (55a) are only related by unidirectional entailment (i.e. (55a) ⇒ (50a), but (50a)  (55a)). In particular, it seems possible for Ferdinand to paint Jacob’s dream-angel without painting Jacob dreaming of this angel. We explain this intuition through the different interpretations of (50a) (see (53)) and (55a) (in (57)). The interpretation in (57) uses the semantics of dream in (56): (56) dream [pp of [dp ]] = λQλzλj [dream j (z, f c (λk.Qk (λlλy. El (y))))] 23 Thus,

Zimmermann [51, p. 435] writes, “for [(17b)] to be true, the picture would not need to imply that there be a real angel—a dream angel would be enough; and obviously the dream angel also suffices to make the sentence [(15a)] true, which would thus be aptly captured by [(17b)].”.

36

K. Liefke

(57) Ferdinandpaints [cp Jacob dreamingof [dp an angel]]i  ≡ painti Ferdinand, dream of Jacob, an angel = λQλz [painti (z, f c (λj. Q j (λj λy.E j (y))))]  ferdinand, λkλP. dreamk (jacob, f c (λl∃x. angell (x))) ≡ painti (ferdinand, f c (λj. dream j (jacob, f c (λk∃x. angelk (x))))) In contrast to (53), (57) additionally demands that Jacob is dreaming in Ferdinand’s depicted situation. As a result of this extra demand, the situation that is represented by f c (λj. dream j (jacob, f c (λk∃x. angelk (x)))) contains the situation that is represented by f c (λj. f c (λk ∃x. angelk (x))( j)) as a proper part. The set-theoretic inclusion of the second argument of the occurrence of paint from (57) in the second argument of the occurrence of paint from (53) then explains the unidirectionality of the above entailment.

5.3 Solving Challenge 4 (Disabled Inferences) The above already captures the intuitive entailment relation between (16a) and the pragmatic version, (50a/b), of (15a) (see Sect. 2.4). To capture the entailment between (16a) and the pre-pragmatic version of (15a) (which is not analyzed as (50a)), we supplement our interpretation of depiction verbs from (31) with an agent-sensitive interpretation of the object DP of these verbs. This interpretation is inspired by the interpretation of the DP Jacob’s dream-angel in (53) (see (51)) and by the possibility of further reducing the information about the identity of the cognitive agent (in (53): Jacob) and the agent’s particular cognitive access to the DP’s referent (in (53): dreaming) in this interpretation. This possibility suggests the interpretation of (15a) as (58) (in (59)). In this interpretation, the event-parameter of f is generalized to all events that happen (or are located) in the depicted situation, i.e. to (ηe)[happen j (e)]. (58) Ferdinand painted [dp someone’s somehow conceived angel]. (59) painti (ferdinand, f c (λj. f (ηe)[happen j (e)] (λk ∃x. angelk (x))( j))) The interpretation in (59) integrates some (relevant) agent’s perspective on the depicted angel into the semantics of the depiction report. This integration may proceed either through the semantics of the DP an angel (see (60a)) or through the semantics of the verb paint (see (60b)): (60)

a. an angel ≡ someone’s somehow conceived angel = λjλP [ f (ηe)[happen j (e)] (λk ∃x. angelk (x) ∧ Pk (x))( j)] b. paint [dp [cn ]]i ≡ paint someone’s somehow conceived [cn ]i = λQλz [painti (z, f c (λj. f (ηe)[happen j (e)] (λk. Qk (λlλy. El (y)))( j)))]

We leave the choice between these two alternatives to the reader.

Saving Hamlet Ellipsis

37

The interpretation of (15a) (i.e. Ferdinand painted an angel) as (59) validates the inference in (16), as desired. This is due to the fact that the second argument of the occurrence of paint in (59) (see (61b)) is semantically included in the second argument of the occurrence of paint in the last line of (57) (see (61a)). (61)

a. f c (λj. dream j (jacob, f c (λk∃x. angelk (x)))( j)) ≡ f c (λj. dream j (jacob, f (ιe)[dream @ (e)∧agent@ (e)=jacob ] (λk∃x. angelk (x)))( j)) b. f c (λj. f (ηe)[happen j (e)] (λk ∃x. angelk (x))( j))

Admittedly, the adoption of (59) as a universal interpretation strategy for depiction reports may look like somewhat of an overshoot. However, an interpretation like (59) proves fruitful even for simpler reports like (24): in contrast to the ‘standard’ Hamlet-interpretation of this report (see (62a)), this interpretation (in (62b)) captures the potential difference between the depicted situation (for (24): the situation represented by f c (λj ∃x. tank j (x))) and the real-world situation (represented by f (ηe)[happen j (e)] (λk ∃x. tankk (x))) that contains the depicted tank. (62) Yekaterina paints [dp a tank]i a. painti (yekaterina, f c (λj ∃x. tank j (x))) b. painti (yekaterina, f c (λj. f (ηe)[happen j (e)] (λk ∃x. tankk (x))( j)) The difference between these two kinds of situations enables us to account for fictional depictions that integrate real-world elements (e.g., for (3a): Penny’s depicting an imaginary situation in which the real-world penguin Pebbles—as observed by her at a specific location and point in time—soars above Summerland Beach like an eagle). This completes our presentation of answers for challenges that use the situatedness of depiction complements. We next turn to answers that use other aspects of the interpretation of depiction complements in Hamlet semantics, i.e. the informational incompleteness of situations (Challenge 5) and the interpretation of object DPs as intensional generalized quantifiers (Challenge 1).

5.4 Solving Challenge 5 (Unwarranted Inferences) In Sect. 3.2, we have identified situations with informationally incomplete worldparts that may only contain one—but not the other—of two necessarily co-existing objects or facts. This identification holds an easy answer to Zimmermann’s challenge from unwarranted inferences (see Sect. 2.5). In particular, since it allows for imagined situations that contain live unicorns without (also) containing unicorn hearts (see (63b)), Hamlet semantics blocks counterintuitive entailments like (19) (see (63)). Our interpretation of (19a) and (19c) is given in (63a) respectively in (63c):

38

K. Liefke

(63)

a. Mary imagines [a live unicorn (to be)]i = imaginei (mary, f c (λj ∃x. unicorn j (x) ∧ is-alive j (x)))  b. (∃l s ) f c (λj ∃x. unicorn j (x) ∧ is-alive j (x))(l) ∧  ¬ f c (λk ∃x. unicorn-heartk (x))(l)

 c. Mary imagines [a unicorn heart (to be)]i = imaginei (mary, f c (λj ∃x. unicorn-heart j (x)))

5.5 Solving Challenge 1 (Unattested Readings) We have shown in Sect. 4.2 that Hamlet semantics interprets the object DPs in depiction reports in the classical type of DPs. As a result, there is no elided infinitive to be (there) in (9) (copied, for convenience, below) that can be modified by the temporal adverbial yesterday. The impossibility of modifying this infinitive explains the unavailability of the low-scope reading of the temporal modifier in (9) (see (10b)) and excludes the ambiguity of (9) between (10a and b). (9) Mary imagined [dp a unicorn] yesterday. (10)

a. Mary’s imagining of a unicorn occurred yesterday b. Mary imagined yesterday’s existence of a unicorn

The unavailability of (10b) as a reading of (9) is further supported by the difficulty of finding a suitable interpretation for this reading. This is due to the requirement that yesterday in (10b) must be interpreted with respect to the utterance context of (9) (here identified, for simplicity, with the external situation i) (see [45, pp. 63–66]). However, the obvious interpretation of (10b) that respects this requirement (see (64a)) fails to capture the unicorn’s property of existing yesterday. The obvious alternative to this interpretation, which violates this requirement (see (64b)), inadequately attributes the property of existing a day ago to Mary’s imagined unicorn. Since this alternative allows for the possibility that the unicorn has passed away in the meantime, (64b) may be used to formalize the report Mary imagined a dead/recently deceased unicorn. But this is clearly not the intended interpretation of (64). (64) Mary imagined [dp the existence yesterday of a unicorn]i i ≡ Mary imagined [cp [dp a unicorn] to exist-yesterday]   ≡ imaginei Mary, exist-yesterday(a unicorn) = a. imaginei (mary, f c (λj ∃x. unicorn j (x) ∧ E i−1 day (x))) = Mary imagined a unicorn (this unicorn still existed yesterday)i = b. imaginei (mary, f c (λj ∃x. unicorn j (x) ∧ E j−1 day (x))) = Mary imagined a unicorn that existed the day before the point in time of Mary’s imagined situationi

Saving Hamlet Ellipsis

39

?

= c. imaginei (mary, f c (λj. t j = ti − 1 day ∧ (∃x. unicorn j (x) ∧ E j (x)))) = Mary imagined a unicorn as one would have been yesterdayi The formula in (64c) seems closer to the intuitive meaning of (10b): this formula asserts the obtaining-in-i of the imagining relation between Mary and the proposition that is situated w.r.t. Mary’s imagined situation in which there is a unicorn, where this situation is temporally located at the day before the time of i (i.e. t j = ti − 1 day). (64c) suggests that, in low-scope readings of the temporal modifier in depiction reports (see (10b), (12)), the modifier specifies the time of the internal situation. According to this suggestion, (10b) and (12) are both interpreted as (64c). The preference to express this reading with (12) for ‘objectual’ depiction reports (here: (9)) and with (10b) for propositional depiction reports (here: (10)) can be explained with reference to the fact that temporal modifiers typically adjoin to larger-than-DP constituents. The interpretation of yesterday that gives rise to (64c) is given below: (65)

a. [dp ] yesterday = λQ λjλP [Q j (λkλy. tk = t@ − 1 day ∧ Pk (y))] b. [vp ] yesterday = λP(s((s(et))t))t λQ [P(λkλP. tk = t@ − 1 day ∧ Qk (P))]

In particular, (65a) and (65b) enable the interpretation of the modifier narrowscope reading of (9) and (10) as follows: (66)

i a. Mary imagined  [[dp a unicorn] yesterday]   i ≡ imagine Mary, yesterday a unicorn = λQλz [imagine i (z, f c (λj.Q j (λkλy. E k (y))))]   mary, λlλP [tl = ti − 1 day ∧ (∃x. unicornl (x) ∧ Pl (x))] ≡ imaginei (mary, f c (λj. λP [t j = ti − 1 day ∧  (∃x. unicorn j (x) ∧ P j (x))] (λkλy. E k (y)))) ≡ imaginei (mary, f c (λj.t j = ti − 1 day ∧ (∃x. unicorn j (x) ∧ E j (x))))

i ≡ b. Mary imagined  [cp [dp a unicorn] [[to be] yesterday]]   i ≡ imagine Mary, yesterday(to be) a unicorn = λQλz [imagine i (z, f c (λj.Q j (λj λy.E j (y))))]   mary, λkλP. tk = t@ − 1 day ∧ (∃x. unicornk (x) ∧ E k (x))

≡ imaginei (mary, f c (λj. t j = t@ − 1 day ∧ (∃x. unicorn j (x) ∧ E j (x))))

6 Conclusion In this paper, we have provided a variant of Parsons’ propositional(ist) account of depiction reports, i.e. Hamlet semantics, that answers the familiar challenges for this account. This is achieved by adopting a uniform variant of Quine’s lexical decomposition-account of objectual attitude verbs that interprets all DP-taking occurrences of depiction verbs V as ‘V [cp there being [dp ] (in the subject’s V’ed

40

K. Liefke

situation)]’ or, equivalently, as ‘V [cp [dp ] to exist (in this situation)]’. Our interpretation of depiction complements as existential situated propositions answers all challenges for Parsons’ account. Hamlet semantics shows that—contrary to the received view (see [10, 13, 49, 51])—the complements of depiction verbs do not constitute evidence against a propositional(ist) analysis of attitude complements. Furthermore, it opens up the possibility for a propositional analysis of the complements of other objectual attitude verbs (e.g. want, owe; see (i), below) and for the analysis of semantic DP/CP relations (see (ii)): (i) Schwarz [44, pp. 271–275] has shown that the complements of want- and needconstructions that resist an analysis through the implicit predicate HAVE (e.g. (67)) can be analyzed through the use of the existential closure operator ∃ (see (67a)). This operator applies to an object DP to assert the existence of the DP’s referent in the subject’s relevant attitudinal alternatives (in (67a): the set of John’s need-alternatives in i, Needjohn, i ). (67) John needs [dp a marathon]. (said by John’s coach) # ≡ John needs [cp for [tp pro to HAVE [dp a marathon]]]. a. John needs [∃ [dp a marathon]]. = 1 ⇔ (∀ j)[Needjohn, i ( j) → (∃x)[marathon j (x)]] b. John needs [cp there to be [dp a marathon]].  needi (john, f c (λj ∃x. marathon j (x))) Our interpretation of (67) (in (67b), where c := (ιe)[need @ (e) ∧ agent@ (e) = john]) captures the core of Schwarz’ analysis. We leave the further development of Hamlet semantics for the above constructions (in particular, the adaptation of this semantics to capture these constructions’ obligatory de se-interpretation) as a project for future work. (ii) The uniform interpretation of depiction complements as propositions also facilitates the semantic treatment of DP/CP coordinations in such complements (see (68); cf. [1, 43]) and captures semantic inclusion relations between DPs and CPs in these complements (e.g. the semantic inclusion, in all contexts, of the complement of the occurrence of imagine from (69a) in the complement of the occurrence of imagine from (69b); cf. [14, 27, 46]). The associated inference (in (70)) shares key aspects of the inference from (55a) to (50a). petting it]]i (68) Mary imagined  [[dp a unicorn] and [cp that John was   ≡ imaginei Mary, and a unicorn, that John pets a unicorn (69)

a. Mary imagined [dp a unicorn]. b. Mary imagined [cp John petting [dp a unicorn]].

(70)

a. Mary imagines [cp John petting [dp a unicorn]]i = imaginei (mary, f c (λj ∃x. unicorn j (x) ∧ pet j (john, x))) b. (∀ p)(∀q)[(∀ j. p( j) → q( j)) → (∀c)( f c (q) ≤ f c ( p))]

Saving Hamlet Ellipsis

41

c. (∀ j)(∀z)[imaginei (z, j) → (∀k. k ≤ j → imaginei (z, k))]

(cf. (46c))

⇒ d. Mary imagines [dp a unicorn] = imaginei (mary, f c (λj ∃x. unicorn j (x))) i

The semantic treatment of (68) and (69) poses a challenge for existing semantics for depiction reports, which assume different lexical entries for DP- and CP-taking occurrences of depiction verbs (see [11]; cf. [49, 50]) and/or for vivid and nonvivid occurrences of these verbs (see [47]). We leave the further investigation and modelling of the above constructions as a future project. Acknowledgements I wish to thank three anonymous referees for SCI-LACompLing2018 for valuable comments on an earlier version of this paper. The paper has profited from discussions with Sebastian Bücking, Eugen Fischer, Friederike Moltmann, Frank Sode, Carla Umbach, Dina Voloshina, Markus Werning, and Ede Zimmermann. The research for this paper is supported by the German Research Foundation (via Ede Zimmermann’s grant ZI 683/13-1).

References 1. Bayer, S.: The coordination of unlike categories. Language 72(3), 579–616 (1996) 2. Bücking, S.: Painting cows from a type-logical perspective. In: Sauerland, U., Solt, S. (eds.) Proceedings of Sinn und Bedeutung 22, pp. 277–294. ZAS Papers in Linguistics, Berlin (2018) 3. Carlson, G.N.: Thematic roles and their role in semantic interpretation. Linguistics 22(3), 259–280 (1984) 4. Champollion, L.: The interaction of compositional semantics and event semantics. Linguist. Philos. 38(1), 31–66 (2015) 5. Davidson, D.: The logical form of action sentences. In: Rescher, N. (ed.) The Logic of Decision and Action, pp. 81–95. University of Pittsburgh Press, Pittsburgh (1967) 6. Devlin, K., Rosenberg, D.: Information in the study of human interaction. In: Adriaans, P., van Benthem, J. (eds.) Philosophy of Information, pp. 685–710. Elsevier, Amsterdam (2008) 7. Falkenberg, G.: Einige Bemerkungen zu perzeptiven Verben. In: Falkenberg, G. (ed.) Wissen, Wahrnehmen, Glauben, pp. 27–45. Niemeyer, Tübingen (1989) 8. Fine, K.: Properties, propositions, and sets. J. Philos. Log. 6, 135–191 (1977) 9. Fine, K.: Critical review of Parsons’ non-existent objects. Philos. Stud. 45, 95–142 (1984) 10. Forbes, G.: Attitude Problems: An Essay on Linguistic Intensionality. Oxford University Press, Oxford (2006) 11. Forbes, G.: Content and theme in attitude ascriptions. In: Grzankowski, A., Montague, M. (eds.) Non-Propositional Intentionality, pp. 114–133. Oxford University Press, Oxford (2018) 12. Goodman, N.: Languages of Art. Hackett Publishing, New York (1969) 13. Grzankowski, A.: Not all attitudes are propositional. Eur. J. Philos. 23(3), 374–391 (2015) 14. Grzankowski, A., Montague, M.: Non-propositional intentionality: an introduction. In: ibid. Non-Propositional Intentionality, pp. 1–18. Oxford University Press, Oxford (2018) 15. Heusinger, K. von: The salience theory of definiteness. In: Perspectives on Linguistic Pragmatics, pp. 349–374. Springer, New York (2013) 16. Higginbotham, J.: Remembering, imagining, and the first person, 212–245 (2003) 17. Israel, D., Perry, J.: Where monsters dwell. Log. Lang. Comput. 1, 303–316 (1996) 18. Kaplan, D.: Demonstratives: an essay on the semantics, logic, metaphysics, and epistemology of demonstratives and other indexicals. In: Almog, J., Perry, J., Wettstein, H. (eds.) Themes from Kaplan, pp. 489–563. Oxford University Press, Oxford (1989)

42

K. Liefke

19. Karttunen, L.: Discourse referents. In: McCawley, J. (ed.) Syntax and Semantics 7: Notes from the Linguistic Underground, pp. 363–385. Academic Press, New York (1976) 20. Kratzer, A.: An investigation of the lumps of thought. Linguist. Philos. 12, 607–653 (1989) 21. Kratzer, A.: Facts: particulars or information units? Linguist. Philos. 25(5–6), 655–670 (2002) 22. Kratzer, A.: Decomposing attitude verbs: handout from a talk in honor of Anita Mittwoch on her 80th birthday. Hebrew University Jerusalem (2006) 23. Kratzer, A.: Situations in natural language semantics. In: Zalta, E.N. (ed.) The Stanford Encyclopedia of Philosophy (Spring 2019 Edition). https://plato.stanford.edu/archives/spr2019/ entries/situations-semantics/ 24. Larson, R.K.: The grammar of intensionality. In: Logical Form and Language, pp. 369–383. Clarendon Press, Oxford (2002) 25. LePore, E.: The semantics of action, event, and singular causal sentences. In: LePore, E., McLaughlin, B. (eds.) Actions and Events: Perspectives on the philosophy of Donald Davidson, pp. 151–161. Blackwell, Oxford (1985) 26. Liefke, K.: A single-type semantics for natural language. Doctoral dissertation. Tilburg Center for Logic and Philosophy of Science, Tilburg University (2014) 27. Liefke, K., Werning, M.: Evidence for single-type semantics - an alternative to e/t-based dualtype semantics. J. Semant. 35(4), 639–685 (2018) 28. Maienborn, C.: On the limits of the davidsonian approach: the case of copula sentences. Theor. Linguist. 31, 275–316 (2005) 29. Maier, E., Bimpikou, S.: Shifting perspectives in pictorial narratives. In: Espinal, M.T., Castroviejo, E., Leonetti, M., McNally, L., Real-Puigdollers, C. (eds.) Proceedings of Sinn und Bedeutung 23, 91–105. https://semanticsarchive.net/Archive/Tg3ZGI2M/Maier.pdf 30. McCawley, J.D.: On identifying the remains of deceased clauses. Lang. Res. 9, 73–85 (1974) 31. Moltmann, F.: Intensional verbs and quantifiers. Nat. Lang. Semant. 5(1), 1–52 (1997) 32. Moltmann, F.: Intensional verbs and their intentional objects. Nat. Lang. Semant. 16(3), 239– 270 (2008) 33. Montague, R.: On the nature of certain philosophical entities. Monist 53(2), 159–194 (1969) 34. Montague, R.: Universal grammar. Theoria 36(3), 373–398 (1970) 35. Parsons, T.: Events in the Semantics of English. The MIT Press, Cambridge, MA (1990) 36. Parsons, T.: Meaning sensitivity and grammatical structure. In: Broy, M., Dener, E. (eds.) Structures and Norms in Science, pp. 369–383. Springer, Dordrecht (1997) 37. Percus, O.: Constraints on some other variables in syntax. Nat. Lang. Semant. 8, 173–229 (2000) 38. Pustejovsky, J.: Where things happen: on the semantics of event localization. In: Proceedings of the IWCS 2013 Workshop on Computational Models of Spatial Language Interpretation and Generation (CoSLI-3) (2013) 39. Pustejovsky, J.: Situating events in language. Conceptualizations of Time 52, 27–46 (2016) 40. Quine, W.V.: Quantifiers and propositional attitudes. J. Philos. 53(5), 177–187 (1956) 41. Ross, J.R.: To have have and to Not have have. In: Jazayery, M., Polom, E., Winter, W. (eds.) Linguistic and Literary Studies, pp. 263–270. De Ridder, Lisse, Holland (1976) 42. Sæbø, K.J.: ‘How’ questions and the manner-method distinction. Synthese 193(10), 3169–3194 (2016) 43. Sag, I., Gazdar, G., Wasow, T., Weisler, S.: Coordination and how to distinguish categories. Nat. Lang. Linguist. Theory 3(2), 117–171 (1985) 44. Schwarz, F.: On needing propositions and looking for properties. In: Proceedings of SALT XVI, pp. 259–276 (2006) 45. 
Schlenker, P.: A plea for monsters. Linguist. Philos. 26(1), 29–120 (2003) 46. Stainton, R.J.: Words and Thoughts: Subsentences, Ellipsis and the Philosophy of Language. Oxford University Press, Oxford (2006) 47. Stephenson, T.: Vivid attitudes: centered situations in the semantics of ‘remember’ and ‘imagine’. In: Proceedings of SALT XX, pp. 147–160 (2010) 48. Umbach, C., Hinterwimmer, S., Gust, H.: German ‘wie’-complements: manners, methods and events in progress (submitted)

Saving Hamlet Ellipsis

43

49. Zimmermann, T.E.: On the proper treatment of opacity in certain verbs. Nat. Lang. Semant. 1(2), 149–179 (1993) 50. Zimmermann, T.E.: Monotonicity in opaque verbs. Linguist. Philos. 29(6), 715–761 (2006) 51. Zimmermann, T.E.: Painting and opacity. In: Freitag, W., et al. (eds.) Von Rang und Namen, pp. 427–453. Mentis, Münster (2016)

Temporal Representations with and without Points Tim Fernando

Abstract Intervals and events are analyzed in terms of strings that represent points as symbols occurring uniquely. Allen interval relations, Dowty’s aspect hypothesis and inertia are understood relative to strings, compressed into canonical forms, describable in Monadic Second-Order logic. That understanding is built around a translation of strings replacing stative predicates by their borders, represented in the S-words of Schwer and Durand. Borders point to non-stative predicates, including forces that may compete, succeed to varying degrees, fail and recur.

1 Introduction To analyze temporal relations between events, James Allen treats intervals as primitive (not unlike [9]), noting There seems to be a strong intuition that, given an event, we can always “turn up the magnification” and look at its structure. . . . Since the only times we consider will be times of events, it appears that we can always decompose times into subparts. Thus the formal notion of a time point, which would not be decomposable, is not useful. [1, p. 834].

Sidestepping indivisible points, Allen relates intervals a and a  in 13 mutually exclusive ways (reviewed in Sect. 2 below). An example is a overlaps a  , which can be pictured as the string a a, a  a 

(1)

of length 5, – starting with an empty box for times before a, – followed by a for times in a but not a  , – followed by a, a  for times in a and a  , T. Fernando (B) Computer Science Department, Trinity College Dublin, Dublin, Ireland e-mail: [email protected] © Springer Nature Switzerland AG 2020 R. Loukanova (ed.), Logic and Algorithms in Computational Linguistics 2018 (LACompLing2018), Studies in Computational Intelligence 860, https://doi.org/10.1007/978-3-030-30077-7_3

45

46

T. Fernando

– followed by a  for times in a  but not a, – followed by for times after a  .1 Now, if, in addition, a third interval a  overlaps both a and a  , we can bring a  into view by turning up the magnification (as Allen puts it) on (1) for the string a  a, a  a, a  , a  a, a  a 

(2)

of length 7, – splitting the first box

and third box a, a  in (1) each into two,

a  and

a, a  , a  a, a  , respectively, whilst – adding a  to a for a, a  . To understand the change from (1) to (2), it is useful to define for any set A and string s = α1 · · · αn of sets αi , the A-reduct of s to be the intersection of s componentwise with A, written ρ A (s) ρ A (α1 · · · αn ) := (α1 ∩ A) · · · (αn ∩ A). For instance, the {a, a  }-reduct of (2) is a a, a  a, a  a  which we can then compress to (1) by applying a function bc (for block compression) that, given a string α1 · · · αn , deletes every αi such that i < n and αi = αi+1 ⎧ if n < 2 ⎨ α1 · · · αn else if α1 = α2 bc(α1 · · · αn ) := bc(α2 · · · αn ) ⎩ α1 bc(α2 · · · αn ) otherwise. Let us agree to call a string α1 · · · αn stutterless if αi = αi+1 whenever 1 ≤ i < n. Then clearly, bc(s) is stutterless and s is stutterless

⇐⇒ s = bc(s).

The finite-state approach to temporality in [5–7] reduces a string s of subsets of a set A to its stutterless form bc(s), on the assumption that every element a ∈ A names a stative predicate pa , understood according to David Dowty’s hypothesis that the different aspectual properties of the various kinds of verbs can be explained by postulating a single homogeneous class of predicates — stative predicates — plus three or four sentential operators or connectives. [3, p. 71]. 1 Boxes are drawn instead of ∅ and curly braces {·} to reduce the risk of confusing, for example, the

empty language ∅ with the string  of length one (not to mention the null string of length 0).

Temporal Representations with and without Points

47

A stative predicate here amounts to a set p of intervals such that for all intervals I, J whose union I ∪ J is an interval, I ∈ p and J ∈ p ⇐⇒ (I ∪ J ) ∈ p

(3)

(with =⇒ making p cumulative, and ⇐= making p divisive). For example, rain is stative insofar as it holds of an interval I iff it holds of any pair of intervals whose union is I , illustrated by the equivalence between (a) and (b). (a) It rained from 8 am to midnight. (b) It rained from 8 am to noon, and from 10 am to midnight. For any finite linear order ≺, the requirement (3) on a stative predicate p over intervals (relative to ≺) is equivalent to reducing p to the set of subintervals of the set p↓ of points t for which the interval {t} is in p p = {I ⊆ p↓ | I is an interval} where p↓ := {t | {t} ∈ p}. For example, relative to the string a a, a  a 

,

we can interpret a and a  as the subsets Ua = {2, 3} and Ua  = {3, 4} of the set {1, 2, 3, 4, 5} of string positions where a and a  (respectively) occur, and then lift Ua and Ua  to stative predicates pa and pa  over intervals, using Ua as ( pa )↓ pa = {I ⊆ Ua | I is an interval} and Ua  as ( pa  )↓ pa  = {I ⊆ Ua  | I is an interval}. Over any string, we can repackage any stative predicate as a subset U of string positions. But now, can we take for granted Dowty’s hypothesis that aspect can be based on stative predicates and assume a string representing an event is built solely from stative predicates? This is far from clear. The event nucleus of [14], for instance, postulates not only states but also events that can be extended or atomic, including what Moens and Steedman refer to as “points” (Comrie’s semelfactives), which should not be confused with the points that a linear order compares. The present work is concerned with yet another notion of point, defined relative to a string s over the alphabet 2 A . An element a ∈ A is said to be an s-point if it occurs exactly once in s—i.e.,

48

T. Fernando

ρ{a} (s) ∈



a



(4)

Just as a string of statives can be compressed by removing stutters through bc, a string s of points can be compressed by deleting all occurrences in s of the empty box  for d (s). More precisely, d () :=  (where  is the string of length 0), and  d (αs) :=

d (s) if α =  α d (s) otherwise.

Line (4) above simplifies to the equation d (ρ{a} (s)) = a . To formulate a corresponding equation for an s-interval a, it is useful to pause and note that in general, a string s = α1 · · · αn of n subsets αi of a set A specifies for each a ∈ A, a subset of the set [n] := {1, . . . , n} of string positions, namely, the set Ua := {i ∈ [n] | a ∈ αi } of positions where a occurs. If we repackage s as the model Mod A (s) := [n], Sn , {Ua }a∈A  over [n] with successor relation Sn := {(i, i + 1) | i ∈ [n − 1]} then a theorem due to Büchi, Elgot and Trakhtenbrot says the regular languages over the set 2 A of subsets of A are given by the sentences ϕ of MSO A as {s ∈ (2 A )∗ | Mod A (s) |= ϕ} where MSO A is Monadic Second-Order logic over strings with unary predicates labeled by A (e.g., [13]).2 The Büchi-Elgot-Trakhthenbrot theorem is usually formulated for strings over the alphabet A (as opposed to 2 A above), but there are at least two advantages in using the alphabet 2 A . First, for applications such as (1) and (2), it is convenient to put zero, one or more symbols from A in boxes for a simple temporal construal of succession. The second advantage has to do with restricting 2 Regularity of languages is interesting here for computational reasons; for instance, since inclusions

between regular languages are computable (unlike inclusions between context-free languages), so are entailments in MSO.

Temporal Representations with and without Points

49

an MSO A -model M = [n], Sn , {Ua }a∈A  to a subset A of A. The A -reduct of M is the MSO A -model M  A = [n], Sn , {Ua }a∈A  obtained from M by keeping only the unary predicates Ua with a in the subset A . As with the componentwise intersection ρ A (s) of s with A , only elements of A are observable. The two notions of A -reduct coincide Mod A (s)  A = Mod A (ρ A (s)) making the square

commute. Notice that a string s fed to the function ρ A must be formed from sets for ρ A to carry out intersection (componentwise). But what if we “turn up the magnification” by allowing inside a box a label for a non-stative predicate? For example, we might expand string (1) a a, a  a  to the string l(a) a, l(a  ) a, a  , r (a) a  , r (a  ) introducing labels l(a) and l(a  ) for the left (open) border of a and a  respectively and r (a) and r (a  ) for the right (closed) border of a and a  respectively. The introduction of borders is made precise in Sect. 2 through a function b on strings, turning the equation d (ρ{a} (s)) = a

for an s-point a

into the equation d (b(ρ{a} (s))) = l(a) r (a)

for an s-interval a

50

T. Fernando

(Propositions 1 and 2), and replacing interiors a, a  by borders l(a), l(a  ), r (a), r (a  ) for a picture d (b(

a a, a  a 

)) = l(a) l(a  ) r (a) r (a  )

of the ordering of borders characteristic of the Allen relation a overlaps a  . In [4], Schwer and Durand call a string s of non-empty sets an S-word (S for set), and define for any set A, the S-projection over A of s to be d (ρ A (s)), i.e., the A-reduct of s with all occurrences of  deleted. Let the vocabulary of a string α1 · · · αn of sets αi be the union n  αi voc(α1 · · · αn ) := i=1

(making voc(s) the ⊆-least set A such that s ∈ (2 A )∗ ). Let us say s projects to s  if s  is the S-projection over voc(s  ) of s d (ρvoc(s  ) (s)) = s  . Every subset of voc(s) specifies a potentially different string to which s can project. The problem of satisfying several statements of projection (each statement describing a feature of the same situation) is taken up in the account of superposition in Sect. 3 below. The translation b is inverted in Sect. 4, with an eye to points other than the borders l(a) and r (a). In particular, actions in [2] that give rise to events are described, leading to a formulation of inertia associated with statives. That said, special attention is paid in Sects. 2 and 3 to Allen interval relations and the transitivity table in [1] enumerating the Allen relations that can hold between three intervals. The present work steps beyond the previous work [5–7] in exploring non-stative predicates given by the border translation b and actions over and above borders of statives. Dowty’s aspect hypothesis is tested with and without points, understood in different ways, one of which is indivisibility at a fixed granularity. (More in the Conclusion.)

2 Points and the Border Translation Given a string s of subsets of A, an s-point is an element a of A that occurs exactly once in s. This condition is expressed in MSO through a unary predicate symbol Pa labeled by a (interpreted Ua by Mod A (s)) as the MSO{a} -sentence (∃x)(∀y)(Pa (y) ≡ x = y) (with biconditional ≡) stating there is a position x where a occurs and nowhere else.

Temporal Representations with and without Points

51

Proposition 1 For any a ∈ A and s ∈ (2 A )∗ , the following are equivalent (i) (ii) (iii)





ρ{a} (s) ∈ a Mod(s) |= (∃x)(∀y)(Pa (y) ≡ x = y) s projects to a .

Points marking the borders of an interval are made explicit by a string function b mentioned in the introduction, to which we turn next. Let l and r be two 1-1 functions with domain A such that the three sets A, {l(a) | a ∈ A} and {r (a) | a ∈ A} are pairwise disjoint. It is useful to think of l(a) and r (a) as syntactic terms (rather than say, numbers), and to collect these in A• := {l(a) | a ∈ A} ∪ {r (a) | a ∈ A}. Now, let the function

b A : (2 A )∗ → (2 A• )∗

map a string α1 · · · αn of subsets αi of A to a string β1 · · · βn of subsets βi of A• with βi := {l(a) | a ∈ αi+1 − αi } ∪ {r (a) | a ∈ αi − αi+1 } βn := {r (a) | a ∈ αn }.

for i < n

For example, b{a,a  } (

a a, a  a 

) = l(a) l(a  ) r (a) r (a  )

and in general, for A ⊆ A,

commutes. To simplify notation, we will often drop the subscript A on b A . The idea behind b is to describe a half-open interval a as (l(a), r (a)] with open left border l(a) and closed right border r (a). For an interval analog of Proposition 1, let boundeda (x, y) be the MSO{a} -formula boundeda (x, y) := (∀z)(Pa (z) ≡ (x < z ∧ z ≤ y)) saying a picks out (via Pa ) string positions after x but before or equal to y, and observe that for any string s of subsets of A,

52

T. Fernando

b(ρ{a} (s)) = ρ{l(a),r (a)} (b(s)). Proposition 2 For any a ∈ A and s ∈ (2 A )∗ , the following are equivalent +

+



(i) (ii) (iii)

ρ{a} (s) ∈ a Mod(s) |= (∃x)(∃y)(x < y ∧ boundeda (x, y)) ∗ ∗ ∗ l(a) r (a) b(ρ{a} (s)) ∈

(iv)

b(s) projects to l(a) r (a) .

Let us define an s-interval to be an element a that satisfies any (equivalently, all) of (i)–(iv) in Proposition 2. To the list (i)–(iv), we can add (v)

bc(ρ{a} (s)) =

a or bc(ρ{a} (s)) =

a

.

The case of bc(ρ{a} (s)) =

a

in (v) is that of a period a in [2]. Alternatively, we can relax any assumption of boundeness by dropping on either side of a , expanding (v) to bc(ρ{a} (s)) ∈ { a , a

,

a,

a

}.

It is convenient for what follows to work with the more restrictive notion described by Proposition 2. When considering strings s over the alphabet 2 A• (as opposed to 2 A ), we overload the definition of an s-interval to apply to a when s projects to l(a) r (a) . We say s demarcates A if each a ∈ A is an s-interval. For any finite set A, we collect the strings of non-empty subsets of A that demarcate A in the language L• (A) := {s ∈ (2 A• − {})∗ | every a ∈ A is an s -interval}. For example, L• ({a}) = { l(a) r (a) } and for syntactically distinct a, a  , L• ({a, a  }) = {s R (a, a  ) | R ∈ AR} where AR is the set AR := {, d, di, f, fi, m, mi, o, oi, s, si, =}

Temporal Representations with and without Points

53

Table 1 Allen interval relations as strings of points, after [4] R a Ra  s R (a, a  ) R −1 a




l(a) r (a)

l(a  )

r (a  )

s

a starts

d

a during a 

l(a  ) l(a) r (a) r (a  )

di

l(a) l(a  ) r (a  ) r (a)

f

a finishes a 

l(a  ) l(a) r (a), r (a  )

fi

l(a) l(a  ) r (a), r (a  )

l(a), l(a  )

=

=

a equal

a

r (a), r (a  )

of 13 interval relations R in [1], pictured in Table 1 (from [4]) by a string s R (a, a  ) with vocabulary {a, a  }• = {l(a), r (a), l(a  ), r (a  )} such that for s ∈ (2 A )∗ , a Ra  holdsins ⇐⇒ b(s) projects to s R (a, a  ). Note that a Ra  is said to hold in a string s of subsets of A, rather than A• .3 Interval networks based on Allen relations treat a set A of interval names as the set of vertices (nodes) of a graph with edges (arcs) labeled by the set of Allen relations understood to be possible between the vertices. The obvious question is: given a specification f : (A × A) → 2AR of sets f (a, a  ) of Allen relations possible for pairs (a, a  ) from A, is there a string s that meets that specification in the sense of (5) below? for all a, a  ∈ A, there exists R ∈ f (a, a  ) such that a Ra  holds in s

(5)

This question is approached in [1] through a transitivity table T : (AR × AR) → 2AR mapping a pair (R, R  ) from AR to the set T (R, R  ) of relations R  ∈ AR such that for some intervals X, Y and Z,

3 The strings s

R (a, a

 ) can be derived from strings s◦ (a, a  ) over the alphabet {a, a  } by the equation R

s R (a, a  ) = b(s◦R (a, a  )). For example, s◦< (a, a  ) = a

a

and s◦m (a, a  ) = a a  .

A full list of s◦R (a, a  ), for every Allen relation R, can be found in Table 7.1 in [5, p. 223].

54

T. Fernando

X R Y and Y R  Z and X R  Z. For example, T ( 0 main n (A/B) = main n (B\A) = main n−1 (A) Remark If A/B (or B\A) is the main subtype of depth k for a formula C then A is the main subtype of depth k + 1 for this formula C. Defintion 14 (Deduction Constraints) Let G be a rigid categorial grammar, its deduction constraint of depth n is:

140

A. Foret and D. Béchet r l T ab(G; n)[ a, i , b, j ] ∈ {⊥,  L , L } where 0 ≤ i, j ≤ n G : a → A, b → B, iff ∃A, B, C, D : = Lr main j (B) = C/D, main i (A) NL D  G : a → A, b → B, l iff ∃A, B, C, D : =L main j (B) = D\C, main i (A) NL D =⊥ elsewhere

Theorem 3 [16] Two rigid grammars with the same deduction constraints for each depth, (or equivalently for their maximum FA-arit y) generate the same FA-structure (FA-arit y≤n) languages: let n ≥ 0, G, G  ∈ CG 1 , if T ab(G; n) = T ab(G  ; n) then FL(G) = FL(G  ) An RG-like Method See [21, 39] for a presentation of the so-called RG-algorithm performing on structures sentences (FA-structures) for rigid AB grammars. An algorithm for NL is proposed in [15]. In the context of NL grammars, rules Lr and Ll given in Sect. 3.2 play the role of classical forward and backward elimination Lr and Ll given in Sect. 1.2 We give below an algorithm called RGC. Given a set of positive examples D, it computes RGC(D) composed of a general form of grammar together with derivation constraints on its type variables. The main differences between RG and RGC appear in steps (3) and (6). Algorithm for RGC(D)—Rigid Grammar with Constraints 1. assign S to root; 2. assign distinct variables xi to argument nodes; 3. compute the other types on functor nodes and the derivation constraints according to Lr and Ll rules as follows: for each functor node in a structure corresponding to an argument xi and a conclusion Ai assign one of the types (according to the rule) Ai /xi or xi \Ai , where the xi variables are all distinct and add the constraint xi xi ;

4. collect the types assigned to each symbol, this provides G F + (D); and collect the derivation constraints, this defines GC + (D); let G FC(D) = G F + (D), GC + (D) 5. unify the types assigned to the same symbol in G F + (D), and compute the most general unifier σmgu of this family of types. 6. The algorithm fails if unification fails, otherwise the result is the application of σmgu to the types of G F + (D) and to the set of constraints, this defines RGC(D) = RG + (D), RC + (D) consisting in the rigid grammar RG + (D) = σmgu (G F + (D)) and the set of derivation constraints RC + (D) = σmgu (GC + (D)); the result is later written RGC(D) = σmgu (G FC(D)).

On Categorial Grammatical Inference and Logical Information Systems

141

A Summary We now give a summary of some results on the learnability of NL classes of grammars from strings and generalized FA-structures. The table also shows results on intermediate structures, well-bracketed strings can be seen as the generalized FAstructures without rule labels. Each line corresponds to a class of grammars. Each column is specific to the kind of structures (for the examples). Grammar class\Structures all k-valued k-valued and t-arit y bounded k-valued and FA-arit y bounded

Strings no no [11] yes [10] yes [16]

Well bracketed strings no no [14] yes corollary of [10] yes [16]

Generalized FA no yes [16] yes corollary of [10] yes [16]

Remark Similar structures have been studied for NL ∅ [14]: the key point is to add to rules Lr and Ll a “pseudo-application” of a functor to an “empty” argument. To our knowledge, an adaptation of learning results for NL ∅ has not been described yet.

4 Learning CDG and Link Grammars Categorial grammars are close to some grammatical formalisms based on dependencies or links between words. The section presents related results on Categorial Dependency Grammars (CDG) and on Link Grammars (LG).

4.1 Categorial Dependency Grammars Dependency grammars define relations between words, definitions of generative dependency grammars can be found in [23] and in [22, 24] a formal model categorial dependency grammars (CDG) that is completely lexicalized and handles projective and non-projective dependencies in the form of a type calculus generalizing AB grammars. Properties of Dependency Structure grammars (DSG) and CDG are further studied in [5]. Like categorial grammars, a CDG associates to each word one or several types. Each type is composed of a flat categorial type and a potential. A flat categorial type is a classical categorial type where \ and / cannot appear on subtypes. Moreover, it is possible to introduce iterative atomic types C ∗ , optional atomic types C ? and repetitive atomic types C + . Two atomic types of the types of two words in a sentence that have the same name (one must be an argument and the other must be the main subtype) can create a projective dependency whose name is the name of the atomic types. A projective dependency cannot cross another projective dependency.

142

A. Foret and D. Béchet

The potential is a list of valencies composed of a name, a polarity (positive or negative) and a direction (left or right). Valencies are never iterated, optional or repetitive. A positive valency and a negative valency with the same name and the same direction can be the two ends of a non-projective dependency. Non-projective dependencies can cross each other (except if they have the same name and the same orientation) and can cross projective dependencies. For instance, [A\B ? \S/C ∗ ]DE is a CDG type where the flat categorial type is A\B ? \S/C ∗ and the potential is  D  E. A, B ? and C ∗ are atomic types on an argument position, S is the main subtype and  D and  E are two valencies of the potential. They are both on the left direction but  D is positive and  E is negative. Table 2 presents the CDG rules (only left rules are shown, rules are similar on the right). Using the lexicon of a CDG, they determines dependency structures. For instance, a CDG can be defined by the following lexicon: elle la lui donn ee ´ a

→ [ pr ed] → [#( clit −a −obj)]clit−a−obj → [#( clit −3d −obj)]clit−3d−obj → [aux]clit−3d−objclit−a−obj → [#( clit −3d −obj)\#( clit −a −obj)\ pr ed\S/@ f s/aux −a −d]

Table 2 CDG rules (only left rules) Name

Rule

Ll

H P1 [H \β] P2 [β] P1 P2

John ran N N \S S

creates a local dependency H C P1 [C ∗ \β] P2 [C ∗ \β] P1 P2

Il

nice city ∗ A A∗ \N A \N

creates a local dependency C l

[C ∗ \β] P [β] P

city A∗ \N N

Dl

α P1 (V )P(V )P2 α P1 P P2 ,

if the potential (V )P(V ) satisfies the

creates a distant dependency V

FA pairing principle: P has no occurrences of V, V.

Rules for option C P1 [C ? \β] P2 [β] P1 P2

Ol

the cities D D ? \N N

creates a local dependency C l

[C ? \β] P [β] P

city A? \N N

C P1 [C + \β] P2 [C ∗ \β] P1 P2

nice city ∗ A A+ \N A \N

Rules for repetition Rl

creates a local dependency C

On Categorial Grammatical Inference and Logical Information Systems

143

Fig. 1 A non projective dependency structure

Fig. 2 A dependency structure with iterative cir c∗ dependency

The CDG rules with the lexicon allow the non projective dependency structure of Fig. 1. Projective dependencies whose name begins with # are anchors of nonprojective dependencies. They are displayed at the bottom of the figure for a better readability. Dashed arrows are non-projective dependencies. The other arrows are projectives dependencies. Figure 2 shows a dependency structure with iterative circumstantial dependencies that comes from an iterative atomic type cir c∗ in the type of fallait. For this example, the word fallait can be assigned to the type: [cir c∗ \ pr ed\S/@ f s/cir c∗ /a −obj]. The left cir c∗ is associated to 3 circumstantial dependencies and the right one is associated to none.

4.2 Learnability of CDG As expected, the general class of all the Categorial Dependency Grammars is not learnable from strings or from dependency structures. Nevertheless some subclasses are learnable. The first idea consists in the limitation of the number of types for each word. In fact, in contrast with classical categorial grammars, the classes of rigid CDG or k-valued CDG for any k > 0 are not learnable from strings or from dependency structures. This fact comes from the presence of optional or iterated atomic types

144

A. Foret and D. Béchet

(C ? or C ∗ ). If optional and iterated atomic types are forbidden and even if repetitive atomic types C + and potentials are enabled, the classes of rigid CDG and k-valued CDG for any k > 0 become learnable. Paper [9] considers CDG with iterative types C ∗ , but also the case of optional ? C and repetitive types C + . It proves that the particular classes of rigid CDG and k-valued CDG for any k > 0 without optional or iterative atomic type has finite elasticity and so are learnable from strings. It shows the results as summarized below (for rigid or k-valued CDG):

Class A∗ A? A+

Learnable Finite elasticity Finite elasticity Finite-valued from strings on strings on structures relation no ⇒ no yes ⇒ no no ⇒ no yes ⇒ no yes ⇐ yes ⇐ yes yes

Limit Points for CDG with Iterated Subtypes Below we give some details showing the unlearnability of the class of rigid CDG (or 1-valued CDG) with optional atomic types (without iterative or repetitive atomic type and with empty potential) and a similar result for CDG with iterative atomic types. Of course, the results can be extended to larger classes of CDG like the class of k-valued CDG for any k > 0 with or without optional, iterative or repetitive atomic types and with any potentials. Defintion 15 Let S, A, B be dependency names. We define G n , G ∗ and G n , G ∗ : C0 = S Cn+1 = Cn /A? C0 = S  = Cn /A∗ /B ∗ Cn+1

G 0 = {a → [A], c → [C0 ]} G n = {a → [A], c → [Cn ]} G ∗ = {a → [A/A? ], c → [S/A? ]} G 0 = {a → [A], b → [B], c → [C0 ]} G n = {a → [A], b → [B], c → [Cn ]} G ∗ = {a → [A], b → [A], c → [S/A∗ ]}

Theorem 4 These constructions yield the limit points as follows [9]: L(G n ) = {ca k | k ≤ n} L(G ∗ ) = c{a}∗ L(G n ) = {c(b∗ a ∗ )k | k ≤ n} L(G ∗ ) = c{b, a}∗ Corollary 1 The constructions show the non-learnability from strings for the class of the rigid (or 1-valued) CDG allowing optional atomic types (A? ) and for the similar class of CDG allowing iterative atomic types (A∗ )

On Categorial Grammatical Inference and Logical Information Systems

145

We observe that in these constructions, the number of optional or iterative atomic types (A? or A∗ ) is not bound. Learning from Untyped Dependency Nets Learnability or non-learnability of rigid or k-valued CDG from structures are studied in [9]. The structures used are Untyped Dependency Nets that can be defined as dependency structures where the names of the projective and non-projective dependencies are erased. The following picture shows an example of an untyped dependency net. The symbol l means that the dependency is projective (i.e. local) and d means that the dependency is non-projective (i.e. distant).

Learning Algorithm from Untyped Dependency Nets The learning algorithm for rigid (1-valued) CDG from untyped dependency nets proposed in [9] is based on Buszkowski and Penn’s original algorithm for rigid classical categorial grammars. The algorithm can only be applied to the class of rigid (or 1-valued) CDG with repetitive atomic types and any potential but without optional and iterative atomic types. Learning from Functor-Argument Structures Another close structure can be defined from the derivation of a dependency structure. In this case, the structure is called the functor-argument (or FA) structure. For instance, for the following lexicon: J ohn → [N ] ran → [N\S/A∗ ] f ast → [A] yester day → [A] A derivation and the corresponding dependency structure are: [N \S/A∗ ] A [N \S/ A∗ ]

Ir

[N \S/A∗ ]

D1 : N

[N \S]

A

Ir

r

(dependency structure)

Ll

S The labelled FA-structure is: Ll [N ] (J ohn, Lr [A] (Lr [A] (ran, f ast), yester day)) The unlabelled FA-structure is: Ll (J ohn, Lr (Lr (ran, f ast), yester day))

146

A. Foret and D. Béchet

In FA-structure the nodes are annotated with the kind and the orientation of dependencies introduced in the derivation. Unary derivation rules like r do not appear in the FA-structure. Ir introduces a (right) dependency A from ran to f ast that corresponds to the iterative atomic type A∗ of [N\S/A∗ ]. r erases the iterative atomic type A∗ . Ll introduce a (local and left) dependency from ran to J ohn. About RG-like (Rigid) Algorithms and Iteration In fact, an RG-like algorithm, when the class of grammars is restricted to rigid (1-valued) CDG with iterative atomic type, when positive examples are FA-structures (without dependency names), cannot converge (in the sense of Gold). This can be shown, as in [8], using the same grammars as in the limit point construction for string languages in [9], involving iterated atomic types. In this case, the FA-structures are all flat structures, with only/operators. Using Labelled Functor-Argument Structures [8] also gives another limit point that establishes the non-learnability from labelled FA-structures for the classes of grammars (rigid or not) allowing iterative categories (A*). The similar question for rigid or k-valued CDG with optional atomic type is left open. Models of Learning and Representing with Iterated Dependency Types Papers [6–8] study learnability of Categorial Dependency Grammars (CDG) expressing all kinds of projective, discontinuous and repeatable dependencies enabled by iterative atomic types. For these grammars, it is known [8] that they are not learnable from dependency structures or from FA-structures. These papers propose different ways of modelling the repeatable dependencies through iterated atomic types and the two corresponding families of CDG which cannot distinguish between the dependencies repeatable at least K times and those repeatable any number of times. For both cases, the papers show that they are incrementally learnable in the limit from dependency structures. Moreover, each paper presents an incremental learning algorithm that can learn a grammar in three variant subclasses of CDG called K -star-generalization CDG (for any K ≥ 2) from the dependency structures generated by the grammars. The difficulty in these tasks consist in finding, for the types of each word in the lexicon, where the iterative atomic types must be introduced.

4.3 Link Grammars Link grammars [49] are also close to dependency grammars. For a link grammar, a link between two words can be seen as a dependency between the words except that links do not have a left or a right orientation. Like CDG, a link grammar in defined by the types associated to the words. The analysis of a sentence is a link net that connects all the words of the sentence with links. The graph must be planar and connected and there must be at most one

On Categorial Grammatical Inference and Logical Information Systems

147

link between two words. Each link has a name. The lexicon determines the possible left and right link ends of each word. Thus, each type is a list of left link end names and a list of right link end names. The assignment: the cat who chased mouse died

→ Ds+ → Ds − Ss + Bs + C+ → C − Z + → Z − Bs − O+ → Ds − O− → Ss−

determines the following link net: Ss Bs Ds

the

C

cat

O Z

who

Ds

chased

the

mouse

died

[4] studies the learnability of this class of grammar. The whole class is not learnable but the classes of rigid link grammars and k-valued link grammars for any k > 0 are learnable from string or from unlabelled link nets. The proofs are very close to the proofs concerning k-valued CDG without optional, iterative and repetable atomic types and with empty potentials.

5 Logical Information Systems (LIS) The Logical Information Systems approach, allows for navigation, querying (classifying), updating, and analysis of heterogeneous data collections where data are given (logical) descriptors. While the LIS approach is a general purpose one, when considered on linguistic data—with different kinds of linguistic units and tasks—it may still take several directions, both theoretically and practically. First, depending on the linguistic framework a grammar may be seen as a case of Logical Information context; this holds for lexicalized grammars, in particular for categorial grammars when words are seen, on the information side, as data attached to their categories (logical types), the logic used in the context can be a standard one. LIS have also been considered for the development of pregroup grammars [17]. On the logic side of the system, due to their logical nature (categories as logical properties and parsing seen as a deduction), Lambek-like categorial grammars can be seen as a case of Logical Information System with their specific logic: some of them have been studied within the logic functors paradigm so as to extend them by composition with other logics and calculi [30, 31].

148

A. Foret and D. Béchet

On the practical side, logical information contexts have been built to enrich the presentation of data in lexicons or treebanks [18], or of some their transformations [18], providing a help for exploration and control. Camelis9 is a tool that implements the logical information system approach in which we can load and manage logical information contexts.

5.1 Logical Concept Analysis and LIS Logical Information Systems (LIS) are based on Logical Concept Analysis (LCA). LCA [27] is an extension of Formal Concept Analysis that allows to use logical formulas for rich object descriptions and expressive queries. The LCA framework [27] applies to logics with a set-valued semantics similar to description logics [2]. It is sufficient here to define a logic (see [27] for a detailed presentation) as a pre-order of formulas. The pre-ordering is the logical entailment, called subsumption: e.g., an interval included in another one, a string matching some regular expression, a graph being a subgraph of another one. Logic

Logical Context

Extent

It is a pre-order LT = (L , T ), where L is a set of formulas, T is a customizable parameter of the logic, and T is a subsumption relation that depends on T . The relation f T g reads “ f is more specific than g” or “ f is subsumed by g”, and is also used to denote the partial ordering induced from the pre-order. It is a tuple K = (O, LT , X, d), where O is a finite set of objects, LT is a logic, X ⊆ L is a finite subset of formulas called the navigation vocabulary, and d ∈ (O → LT ) is a mapping from objects to logical formulas. For any object o, the formula d(o) denotes the description of o. The extent of a query formula q in a logical context K is defined by K .ext (q) = {o ∈ O | d(o) T q}.

A key feature of LIS, is to allow the tight combination of querying and navigation [26]. The system returns a set of query increments that suggest to users relevant ways to refine the query, i.e. navigation links between concepts, until a manageable amount of answers is reached. A query is a logical formula, and its answers are defined as the extent of this formula, i.e. the set of objects whose description is subsumed by this formula.

9 www.irisa.fr/LIS/softwares.

On Categorial Grammatical Inference and Logical Information Systems

149

Fig. 3 A toy grammar as a LIS context: pgtype [o] versus pgtype [n] (n ≤ o in the pregroup grammar)

5.2 Categorial Grammars and/as LIS In [32] different perspectives are explored on how categorial grammars can be considered as Logical Information Systems (LIS)– where objects are organized and queried by logical properties – both theoretically, and practically. In direct adaptations, the information attached to words (as in Lefff10 ) is rendered as attributes, see [33]. Another view is to take more advantage of the logic nature of categorial grammars as in [33]. An important aspect of LIS is genericity w.r.t. the logic: a toolbox of logic functors is proposed in [28] (logic components, that can be assembled at a high level); it can be used in Camelis. A dedicated logic has then been used in [32] to represent pregroup types, in order to describe words, phrases, and sentences (Fig. 3). The tool permits query, navigation, but also the construction of new objects and new attachments to properties. Another benefit consists in the execution of actions from Camelis, and connexions with linguistic resources and tools such as parsers. Such systems can assist a user in several modes from a standard one (simple browsing, knowledge acquisition) to an expert one (such as in the control of data quality or the construction of a type-logical grammar).

10 Lefff

stands for: “Lexique des Formes Fléchies du Français/Lexicon of French inflected forms” (see http://alpage.inria.fr/~sagot/lefff-en.html).

150

A. Foret and D. Béchet

Fig. 4 Simplified vicinities computed on corpus Sequoia

5.3 Experiments on Dependency Treebanks From Dependency Treebanks to Vicinities The workflow in [18] applies to data in the Conll format,11 in particular to Sequoia data (http://deep-sequoia.inria.fr/). The CDG potentials in this section are considered as empty.12 For each governor unit in each corpus are computed (using MySQL and Camelis): (1) its vicinity in the root simplified form [l1 \...\ln \r oot/rm /..../r1 ] (where l1 to ln on the left and r1 to rm on the right are the successive dependency names from that governor), then (2) its generalization as star-vicinity, replacing consecutive repetitions of dk on a same side with dk∗ ; and (3) its generalization as vicinity_2seq following the LML mode of the algorithm. Such a development allows to mine repetitions and to call several kinds of viewers: the item/word description interactive viewer Camelis and the sentence parse conll viewer13 or grew.14 Figure 4 on its left, shows the root simplified vicinities computed on corpus Sequoia; the resulting file has been loaded as an interactive information context, in Camelis; this tool manages three synchronised windows: the current query is on the top, selecting the objects on the right, their properties can be browsed in the multi-facets index on the left.

11 http://universaldependencies.org/format.html. 12 This

complies with Sequoia data, but may be a simplification for other corpora.

13 https://universaldependencies.org/conllu_viewer.html. 14 http://talc2.loria.fr/grew/.

On Categorial Grammatical Inference and Logical Information Systems

151

6 Conclusion We have revisited results where learning is viewed as a symbolic issue in an unsupervised setting, from raw or from structured data, for some variants of Lambek grammars and of categorial dependency grammars. This shows a great variety of classes of grammars and a gap between the AB categorial grammars and many of the extensions. This also shows the need for appropriate structures, for class restrictions or for further information. Beyond the theoretical properties, we are interested in the definition and the application of learning algorithms to treebanks. This should avoid some hand-written types and rules, to build robust grammars. In this perspective, using systems such as Logical Information Systems should ease the task and provide guidance in the study.

References 1. Angluin, D.: Inductive inference of formal languages from positive data. Inf. Control 45, 117– 135 (1980) 2. Baader, F., Calvanese, D., McGuinness, D.L., Nardi, D., Patel-Schneider, P.F. (eds.): The Description Logic Handbook: Theory, Implementation, and Applications. Cambridge University Press (2003) 3. Bar-Hillel, Y.: A quasi arithmetical notation for syntactic description. Language 29, 47–58 (1953) 4. Béchet, D.: k-valued link grammars are learnable from strings. In: Jäger, G., Monachesi, P., Penn, G., Wintner, S. (eds.) Proceedings of the 8th conference on Formal Grammar (FGVienna), Vienna, Austria, pp. 9–18 (2003) 5. Béchet, D., Dikovsky, A., Foret, A.: Dependency structure grammar. In: Blache, P., Stabler, E., Busquets, J., Moot, R. (eds.) Logical Aspects of Computational Linguistics, 5th International Conference, LACL 2005, Bordeaux, France, April 28–30, 2005, Proceedings. Lecture Notes in Artificial Intelligence (LNAI), vol. 3492, pp. 18–34. Springer (2005). https://doi.org/10.1007/ b136076. http://signes.labri.fr/LACL/cfp.htm 6. Béchet, D., Dikovsky, A., Foret, A.: On dispersed and choice iteration in incrementally learnable dependency types. In: Logical Aspects of Computational Linguistics - 6th International Conference, LACL 2011, Montpellier, France. Lecture Notes in Computer Science (LNCS), vol. 6736, pp. 80–95. Springer (2011) 7. Béchet, D., Dikovsky, A., Foret, A.: Sur les itérations dispersées et les choix itérés pour l’apprentissage incrémental des types dans les grammaires de dépendances. In: Conférence Francophone d’Apprentissage 2011 (CAP), Chambéry, France (2011) 8. Béchet, D., Dikovsky, A., Foret, A.: Two models of learning iterated dependencies. In: de Groote, P., Nederhof, M.J. (eds.) Formal Grammar, 15th and 16th International Conferences, FG 2010, Copenhagen, Denmark, August 2010, FG 2011, Lubljana, Slovenia, August 2011, Revised Selected Papers. Lecture Notes in Computer Science (LNCS), vol. 7395, pp. 17–32. Springer (2012). https://doi.org/10.1007/978-3-642-32024-8 9. Béchet, D., Dikovsky, A., Foret, A., Moreau, E.: On learning discontinuous dependencies from positive data. In: Proceedings of the Formal Grammar Conference (FG 2004) (2004) 10. Béchet, D., Foret, A.: Apprentissage des grammaires de Lambek rigides et d’arité bornée pour le traitement automatique des langues. In: Actes de la Conférence d’APprentissage 2003 (CAP’2003) (2003)

152

A. Foret and D. Béchet

11. Béchet, D., Foret, A.: k-valued non-associative Lambek categorial grammars are not learnable from strings. In: ACL (ed.) Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics (ACL 2003) (2003) 12. Béchet, D., Foret, A.: k-valued non-associative Lambek grammars are learnable from functionargument structures. In: Proceedings of the 10th Workshop on Logic, Language, Information and Computation (WoLLIC’2003). Electronic Notes in Theoretical Computer Science, vol. 85 (2003) 13. Béchet, D., Foret, A.: Remarques et perspectives sur les langages de prégroupe d’ordre 1/2. In: Actes, Dixième conférence de Traitement Automatique des Langues Naturelles (TALN 2003), pp. 309–314. ATALA (2003). (Poster) 14. Béchet, D., Foret, A.: k-valued non-associative Lambek grammars (without product) form a strict hierarchy of languages. In: Blache, P., Stabler, E., Busquets, J., Moot R. (eds.) Logical Aspects of Computational Linguistics, 5th International Conference, LACL 2005, Bordeaux, France, April 28–30, 2005, Proceedings. Lecture Notes in Artificial Intelligence (LNAI), vol. 3492, pp. 1–16. Springer (2005). https://doi.org/10.1007/b136076. http://signes.labri.fr/ LACL/cfp.htm 15. Béchet, D., Foret, A.: On rigid NL Lambek grammars inference from generalized functorargument data. In: FGMOL’05, the Tenth Conference on Formal Grammar and the Ninth on the Mathematics of Language, Edinburgh, Scotland (2005) 16. Béchet, D., Foret, A.: k-valued non-associative Lambek grammars are learnable from generalized functor-argument structures. J. Theor. Comput. Sci. 355(2) (extended version of [12]) (2006) 17. Béchet, D., Foret, A.: A pregroup toolbox for parsing and building grammars of natural languages. Linguist. Anal. J. 36 (2010) 18. Béchet, D., Foret, A.: Categorial dependency grammars with iterated sequences. In: Amblard, M., de Groote, P., Pogodalla, S., Retoré, C. (eds.) Logical Aspects of Computational Linguistics. Celebrating 20 Years of LACL (1996–2016) - 9th International Conference, LACL 2016, Nancy, France, December 5–7, 2016, Proceedings. Lecture Notes in Computer Science (LNCS), vol. 10054, pp. 34–51 (2016). https://doi.org/10.1007/978-3-662-53826-5_3 19. van Benthem, J., ter Meulen, A. (eds.): Handbook of Logic and Language. North-Holland Elsevier, Amsterdam (1997) 20. Bonato, R., Retoré, C.: Learning rigid Lambek grammars and minimalist grammars from structured sentences. In: Third workshop on Learning Language in Logic, Strasbourg (2001) 21. Buszkowski, W., Penn, G.: Categorial grammars determined from linguistic data by unification. Studia Logica 49, 431–454 (1990) 22. Dekhtyar, M., Dikovsky, A.: Categorial dependency grammars. In: Proceedings of International Conference on Categorial Grammars, pp. 76–91. Montpellier (2004) 23. Dikovsky, A.: Polarized non-projective dependency grammars. In: de Groote, P., Morill, G., Retoré, C. (eds.) Proceedings of the Fourth International Conference on Logical Aspects of Computational Linguistics. Lecture Notes in Artificial Intelligence (LNAI), vol. 2099, pp. 139–157. Springer, Le Croisic, France (2001) 24. Dikovsky, A.: Dependencies as categories. In: Kruiff, G.J., Duchier, D. (eds.) Proceedings of Workshop “Recent Advances in Dependency Grammars”. In conjunction with COLING 2004, pp. 90–97. Geneva, Switzerland (2004) 25. Dudau-Sofronie, D., Tellier, I., Tommasi, M.: Learning categorial grammars from semantic types. In: 13th Amsterdam Colloquium (2001) 26. 
Ferré, S.: Camelis: a logical information system to organize and browse a collection of documents. Int. J. Gen. Syst. 38(4) (2009) 27. Ferré, S., Ridoux, O.: An introduction to logical information systems. Inf. Process. Manag. 40(3), 383–419 (2004) 28. Ferré, S., Ridoux, O.: Logic functors: a toolbox of components for building customized and embeddable logics. Research Report RR-5871, INRIA, 103 pp. (2006). http://www.inria.fr/ rrrt/rr-5871.html

On Categorial Grammatical Inference and Logical Information Systems

153

29. Florêncio, C.C.: Consistent identification in the limit of the class k-valued is NP-hard. In: de Groote, P., Morill, G., Retoré, C. (eds.) Proceedings of the Fourth International Conference on Logical Aspects of Computational Linguistics. Lecture Notes in Artificial Intelligence (LNAI), vol. 2099, pp. 125–138. Springer, Le Croisic, France (2001) 30. Foret, A.: Pregroup calculus as a logical functor. In: Proceedings of WOLLIC 2007. Lecture Notes in Computer Science (LNCS), vol. 4576. Springer (2007) 31. Foret, A.: A modular and parameterized presentation of pregroup calculus. Inf. Comput. J. 208(5), 395–604 (2010) 32. Foret, A., Ferré, S.: On categorial grammars as logical information systems. In: Kwuida, L., Sertkaya, B. (eds.) International Conference on Formal Concept Analysis. Lecture Notes in Computer Science (LNCS), vol. 5986, pp. 225–240. Springer (2010) 33. Foret, A., Ferré, S.: On categorial grammars and logical information systems: using camelis with linguistic data. In: System Demonstration at LACL 2012 (2012). https://lacl2012.sciencesconf. org/conference/lacl2012/demo_proceedings.pdf 34. Foret, A., Nir, Y.L.: Lambek rigid grammars are not learnable from strings. In: COLING’2002, 19th International Conference on Computational Linguistics. Taipei, Taiwan (2002) 35. Foret, A., Nir, Y.L.: On limit points for some variants of rigid Lambek grammars. In: ICGI’2002, the 6th International Colloquium on Grammatical Inference. Lecture Notes in Artificial Intelligence (LNAI), vol. 2484. Springer, Amsterdam, the Netherlands (2002) 36. Girard, J.Y.: Linear logic: its syntax and semantics. In: Advances in Linear Logic, pp. 1–42. Cambridge University Press (1995). http://iml.univ-mrs.fr/~girard/Articles.html 37. Gold, E.M.: Language identification in the limit. Inf. Control 10, 447–474 (1967) 38. de la Higuera, C.: Grammatical Inference: Learning Automata and Grammars. Cambridge University Press (2010) 39. Kanazawa, M.: Learnable classes of categorial grammars. Studies in Logic, Language and Information. FoLLI & CSLI (1998) 40. Lambek, J.: The mathematics of sentence structure. Am. Math. Mon. 65, 154–169 (1958) 41. Lambek, J.: On the calculus of syntactic types. In: Jakobson, R. (ed.) Structure of Language and Its Mathematical Aspects, pp. 166–178. American Mathematical Society (1961) 42. Lecomte, A.: Grammaire et théorie de la preuve: une introduction. Traitement Automatique des Langues 37(2), 1–38 (1996) 43. Moortgat, M.: Categorial type logic. In: van Benthem and ter Meulen [19] , Chap. 2, pp. 93–177 44. Moot, R., Retoré, C.: The logic of categorial grammars. In: LNCS FOLLI 6850. Springer (2011). http://hal.archives-ouvertes.fr/hal-00607670. LCNS 6850 45. Nicolas, J.: Grammatical inference as unification. In: Rapport de Recherche RR-3632, INRIA (1999). http://www.inria.fr/RRRT/publications-eng.html 46. Partee, B.: Montague grammar. In: van Benthem and ter Meulen [19], Chap. 1, pp. 5–92 47. Pentus, M.: Lambek grammars are context-free. In: Logic in Computer Science. IEEE Computer Society Press (1993) 48. Shinohara, T.: Inductive inference from positive data is powerful. In: The 1990 Workshop on Computational Learning Theory, pp. 97–110. Morgan Kaufmann, San Mateo, California (1990) 49. Sleator, D., Temperley, D.: Parsing English with a link grammar. In: Third International Workshop on Parsing Technologies (1993). ftp://bobo.link.cs.cmu.edu/pub/sleator/link-grammar/ 50. Wright, K.: Identifications of unions of languages drawn from an identifiable class. 
In: The 1989 Workshop on Computational Learning Theory, pp. 328–333. San Mateo, California (1989)

A Scope-Taking System with Dependent Types and Continuations Justyna Grudzinska ´ and Marek Zawadowski

Abstract Different scope-taking mechanisms have been proposed for capturing quantifier scope alternation, including covert operations of quantifier movement, type-changing rules, storage devices. In this paper we propose a new scope-taking system with dependent types and continuations. The key elements of our formal framework are: (i) richly typed system; (ii) contexts for determining the relative scoping of quantifiers; (iii) recursive procedure by which the interpretation is computed and the dependently typed context is built along the surface structure tree. The main advantage of our proposal is that it does not overgenerate—it produces all and only the attested readings for the defined fragment. The core idea behind the proposal is that certain lexical elements are responsible for inverting scope: relational nouns and locative prepositions. This allows us to provide a principled solution to the question of why certain constructions missing such elements block inverse scope.

1 Introduction

Multiply quantified constructions are reported to be ambiguous, with different readings corresponding to the relative scoping of the quantifier expressions involved. The examples below are ambiguous between the so-called surface and inverse readings:

(1) A man from every city
(2) Maud draped a sheet over every table.

The inverse linking construction in (1) can be understood to mean that there is a different man coming from every city (inverse reading: every city > a man); it can


also be understood to mean that there is some one man coming from all the cities (surface reading: a man > every city). The example in (2) can be understood to mean that there is a different sheet draped over every table (inverse reading: every table > a sheet); it can also be understood to mean that there is one sheet draped over all the tables (surface reading: a sheet > every table). Different scope-taking mechanisms have been proposed for capturing quantifier scope alternation, including covert operations of quantifier movement [17, 18], type-changing rules [12], and storage devices [5]. One problem for the existing proposals is overgeneration. The examples below do not admit inverse readings:

(3) One person with every key
(4) Maud draped a table with every sheet.

The construction in (3) can only be a statement about one person who happens to have every key (one person > every key); the inverse reading is disallowed (*every key > one person). The example in (4) exhibits the so-called frozen scope, i.e., only the surface reading is possible (a table > every sheet) and the inverse reading is disallowed (*every sheet > a table). The existing mechanisms, however, generate such unattested readings. To illustrate with an example: under the standard LF-movement analysis, the inverse reading of the inverse linking construction in (1) is attributed to the application of quantifier raising (QR). QR replaces the QP every city with the coindexed trace (t1) and adjoins it at DP:

(SS): [DP [Det a] [NP [NP man] [PP [P from] [DP every city]]]]
(LF): [DP [QP1 every city] [DP [Det a] [NP [NP man] [PP from t1]]]]

There is no principled explanation of why prepositions like with should block QR:

(SS): [DP [Det some] [NP [NP person] [PP [P with] [DP every key]]]]
(LF): [DP [QP1 every key] [DP [Det some] [NP [NP person] [PP with t1]]]]

In this paper, we propose a new scope-taking system with dependent types [13, 15, 16] and continuations [1, 2, 8]. Our approach is directly compositional (we do not posit LFs or any structures that go beyond the overt syntax). Moreover, it does not overgenerate—it produces all and only the attested readings. The core idea behind the proposal is that certain lexical elements are responsible for inverting scope: relational nouns and locative prepositions. This allows us to provide a principled solution to the question of why certain constructions missing such elements block inverse scope.

The structure of the paper is as follows. Section 2 provides the core components of our proposal. Section 3 defines the syntax of the system and Sect. 4 explains our notion of context used for determining the relative scoping of DP arguments. Sections 5 and 6 define the semantic operations used in the system, and Sect. 7 defines our recursive procedure by which the interpretation is computed and the context is built along the surface structure tree. Finally, we provide a worked-out example to illustrate our procedure.

2 Core Proposal

The idea of having just one type e of all entities originated with Frege and is widely adopted in natural language semantics (strictly speaking, standard Montague-style semantics has two base types: type e and type t of truth values, and a recursive definition of functional types). Our system includes a large number of base types and a number of type constructors. The variables of our system are always typed: x : X, y : Y, .... Types are interpreted as sets: |X|, |Y|, .... Types can depend on the variables of other types, e.g., if x is a variable of type X, we can have a type Y(x) depending on the variable x. The fact that Y is a type depending on X is modeled as a function π : |Y| → |X|. For any element a ∈ |X|, we have a set (fiber) |Y|_a = π^{-1}(a), which is the interpretation of the type Y(x), with x being interpreted as a.

Context, another core trait of our system, is a list of (possibly dependently) typed variables:

Γ = x : X, y : Y(x), z : Z(x, y), u : U, ...

Dependencies given in the context determine the relative scoping of quantifiers. For example, in the above context the interpretation of a sentence with two quantifier expressions where Q1 x:X outscopes Q2 y:Y(x) (Q1 x:X > Q2 y:Y(x)) is available, and the interpretation where Q2 y:Y(x) outscopes Q1 x:X (Q2 y:Y(x) > Q1 x:X) is not available, because the indexing occurrence of the variable x in Y(x) is outside the scope of the binding occurrence of x (next to Q1). Types and contexts are constructed by a recursive procedure along the surface structure tree.
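To make the fiber construction concrete, here is a minimal Python sketch (ours, not from the paper; the sets, the function π and all names are invented for illustration): a dependent type Y(x) over X is modeled as a function into |X|, and the interpretation of Y(a) is the fiber over a.

```python
# Minimal sketch (illustrative data): a dependent type Y(x) over X is
# modeled as a function pi : |Y| -> |X|; the interpretation of Y(a) is
# the fiber |Y|_a = pi^{-1}(a).

def fiber(pi, Y, a):
    """Return |Y|_a, the fiber of pi over a."""
    return frozenset(y for y in Y if pi[y] == a)

C = frozenset({"France", "Spain"})                    # base type of countries
R = frozenset({"r1", "r2", "r3"})                     # representatives
pi = {"r1": "France", "r2": "France", "r3": "Spain"}  # pi : |R| -> |C|

print(fiber(pi, R, "France"))   # the two French representatives: r1, r2
print(fiber(pi, R, "Spain"))    # the single Spanish representative: r3
```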

2.1 Scope Ambiguities of DPs

Our treatment of scope ambiguities of DPs builds on the analysis in [6, 9] and is motivated by the phenomenon of inverse linking. We propose to distinguish two kinds of DPs: DPs with surface interpretation (DPsi) and DPs with inverse interpretation (DPii). The central observation driving this distinction is that PPs containing DPs that give rise to inverse readings cannot freely change places with regular postnominal modifiers (e.g. relative clauses, RCs) but must be DP-final, as illustrated by the examples below:

(a) One person [RC who was famous] [PP from every city] was invited.
(b) #One person [PP from every city] [RC who was famous] was invited.

Sentence (a) can be understood to mean that every city x is such that one famous person from x was invited, while sentence (b) is semantically odd—it only allows a surface reading saying that one person who came from every city and who was famous was invited. Inverse readings are possible when PPs follow RCs (as in (a)), while non-final PPs give rise to surface readings only (as in (b)). Based on this argument, Zimmermann [21] proposed a different structure for the inverse reading, where the PP (from every city) is not a regular postnominal modifier (as in (b)) but is right-adjoined to one person who is famous (as in (a)):

(a) Inverse Reading:
[DPii [DPsi,1 [Det1 one] [NP1 [N1 person] [RC who is famous]]] [PPii [Pii from] [DPsi,2 every city]]]

(b) Surface Reading:
[DPsi,1 [Det1 one] [NP1 [NP1 [N1 person] [PPsi [Psi from] [DPsi,2 every city]]] [RC who is famous]]]

Our further assumption is that DPs with inverse interpretation (DPii) can only be set off by the presence of relational nouns (e.g. representative) or locative prepositions (e.g. from):

(c) (IR): [DPii [DPsi,1 a representative] [PPii [Pcon of] [DPsi,2 every country]]]
(d) (IR): [DPii [DPsi,1 a man] [PPii [Pii from] [DPsi,2 every city]]]


If no relational noun or locative preposition is involved, DPs can only receive a surface interpretation:

(e) (SR): [DPsi,1 [Det1 one] [NP1 [N1 person] [PPsi [Psi with] [DPsi,2 every key]]]]

Sortal common nouns (e.g., man, city) are treated as base types; on the semantic side, they are interpreted as sets. Relational common nouns (e.g., representative, sister) are treated as dependent types. On the semantic side, they are interpreted as polymorphic indexed sets, e.g. representative takes a set, say the set of countries, and yields the family of sets of representatives indexed by that set. We also have a polymorphic interpretation of determiners and predicates. A DP like some country is interpreted over the set of countries: |some|(|C|) = {X ⊆ |C| : X ≠ ∅}. A DP like some representative of France is interpreted over the fiber of the representatives of France: |some|(|R|_France) = {X ⊆ |R|_France : X ≠ ∅}.

Relational nouns and locative prepositions are two kinds of lexical elements responsible for inducing dependencies reversing the ordering of the DPs involved. Representative, as in a representative of every country, is modeled as a dependent type, interpreted as the function π : |R| → |C|, the intended meaning being that for any country a ∈ |C|, we have a set (fiber) |R|_a = π^{-1}(a) of the representatives of that country. By quantifying over this dependency, we get the inverse ordering of the DPs involved: ∀c:C ∃r:R(c). The interpretation where ∃ outscopes ∀ is not available, because the indexing occurrence of the variable c (in R(c)) is outside the scope of the binding occurrence of that variable: ∃r:R(c) ∀c:C. By making the type of representatives dependent on (the variables of) the type of countries, our analysis forces the inversely linked reading without positing any extra scope mechanisms.

Inverse readings are also available for DPs involving locative prepositions, as in a man from every city. We assume that the relational use of sortal nouns (e.g. man) can be coerced by the presence of locative prepositions (e.g. from).¹ This is due to the fact that locative prepositions imply 'disjointness' (entities do not occur at more than one place simultaneously), and hence can be interpreted as partial functions (from any type of physical objects to any type of locations). For example, from can be interpreted as the partial function p : |M(an)| → |C(ity)|. Restricting the set of people to the people coming from cities gives the total function π : |M_c| → |C|. By quantifying over this dependency, we get the inverse ordering of the two DPs: ∀c:C ∃m:M_c(c). The reason why inverse readings are blocked with certain prepositions (e.g. with) is that non-locative prepositions do not imply 'disjointness' (they cannot be interpreted as partial functions), and hence do not induce dependencies reversing the ordering of the DPs involved.

¹ In [19], Partee and Borschev emphasize the permeability of the boundary between sortal and relational nouns, and the fact that sortal nouns can often be coerced to undergo 'sortal-to-relational' shifts, as in uses of sortal nouns with an overt argument, e.g. book expresses a sortal concept but its relational use can be coerced in books of ….
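Returning to the coercion mechanism just described, it can be sketched in a few lines of Python (a toy model of our own, with invented individuals; only the mechanism—restriction of a partial function to its domain—comes from the paper):

```python
# Sketch with invented data: the locative preposition 'from' as a partial
# function p : |M| -> |C|; restricting |M| to dom(p) yields a total function
# and hence a genuine dependency of men on cities.

M = frozenset({"m1", "m2", "m3"})
C = frozenset({"Paris", "Oslo"})
p = {"m1": "Paris", "m2": "Oslo"}            # partial: m3 is from no city

M_c = frozenset(m for m in M if m in p)      # dom restriction: men from cities

def fiber(c):
    """|M_c|_c: the men coming from city c."""
    return frozenset(m for m in M_c if p[m] == c)

# Inverse reading 'forall c : C, exists m : M_c(c)': every city has a man.
print(all(len(fiber(c)) > 0 for c in C))     # True on this toy model
```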

2.2 Scope Ambiguities of VPs

Our treatment of scope ambiguities of VPs is motivated by the frozen scope phenomenon. In the double object construction in (5) scope is fixed to the surface order (a student > every book):

(5) Someone gave a student every book.

The prepositional dative construction in (6) permits both the surface and the inverse scope order (a book > every student, every student > a book):

(6) Someone gave a book to every student.

The same asymmetry can be found in constructions involving verbs that participate in the spray-load alternation:

(7) Someone draped a table with every sheet.
(8) Someone draped a sheet over every table.

In the with variant of the spray-load alternation in (7) scope is fixed to the surface order. The locative variant in (8) permits both the surface and the inverse scope order. Two main accounts of the observed scope asymmetry have been proposed. One account argues that (5–7) and (6–8) are two distinct constructions with different syntax and semantics: frozen scope is due to an extra layer of non-overt structure [3, 4, 14]. We follow an alternative account, where (6–8) and (5–7) have parallel (surface) structures: a small clause in both cases but with the PP headed by different prepositions, a locative preposition in the ambiguous examples (6–8) and a non-locative preposition in the frozen variants (5–7) (a null preposition encoding possession in (5) and an instrumental preposition in (7)) [10, 11, 20]:

[VP [V drape] [SC [DPsi,1 a sheet] [PP [P over] [DPsi,2 every table]]]]
[VP [V drape] [SC [DPsi,1 a table] [PP [P with] [DPsi,2 every sheet]]]]

The semantic distinctions between the two structures are caused by the differences in the semantic contributions made by the two different P heads. Locative prepositions can (and non-locative prepositions cannot) invert scope. This is due to the fact that locative prepositions can be interpreted as partial functions, whereas non-locative prepositions are interpreted as mere binary relations. Our system uses contexts for determining the relative scoping of the DP arguments of the main predicate of the sentence. A context is a list of (possibly dependently) typed variables subject to certain restrictions. The context (with the variable corresponding to the subject argument omitted) can be extended both from the left and from the right. An extension from the left is by a new variable of a constant type, and an extension from the right is by a new variable of a new type that depends on the variable last added to the context. By a recursive procedure we equip surface structure trees with a context, extending it by new typed variables either from the left (any preposition) or from the right (locative preposition). In (8), the locative preposition over is likely to induce the function π : |S| → |T| (sheets can be dependent on tables). Thus the surface structure tree corresponding to (8) can be equipped with the context t : T, s : S(t). By quantifying over this dependency, we get the inverse ordering of the two DP arguments: ∀t:T ∃s:S(t). Similarly, the locative preposition to in (6) is likely to induce the function π : |B| → |S| (books can be dependent on students). Thus the surface structure tree corresponding to (6) can get equipped with the context s : S, b : B(s). By quantifying over this dependency, we get the inverse ordering of the two DP arguments: ∀s:S ∃b:B(s). By contrast, non-locative prepositions cannot be interpreted as partial functions, and hence cannot invert scope. In (7), the instrumental preposition with can only induce the context corresponding to the surface ordering of the two DP arguments: ∃t:T ∀s:S. Similarly, the silent preposition encoding possession P_HAVE in (5) can only induce the context corresponding to the surface ordering of the two DP arguments: ∃s:S ∀b:B.

3 Syntax

This section introduces the syntax of our system. The only new element with respect to the informal discussion above is that we consider sentences involving verbs with n arguments (where n is an arbitrary positive natural number). Including verbs with more than three arguments is a source of the increased complication of our system (this might serve as a partial explanation of why humans are rather reluctant to consider such verbs).

3.1 Notation

Description of syntactic categories:

S     sentence
VP    verb phrase
SC    small clause
DP    determiner phrase
DPsi  DP with surface interpretation
DPii  DP with inverse interpretation
Det   determiner
NP    noun phrase
N     noun
V     verb
Vit   intransitive verb
Vt    transitive verb
RC    relative clause
RP    relative pronoun
PP    prepositional phrase
P     preposition
PPsi  PP with surface interpretation
PPii  PP with inverse interpretation
Psi   surface scope preposition
Pii   scope inverting preposition
Pcon  connector to a relational noun (of)
Y     Y clause

3.2 Rules

Sentence:             S → DPsi VP,      S → DPii VP
Verb phrase:          VP → V,           VP → V SC
Determiner phrase:    DPii → DPsi,      DPii → DPsi PPii,    DPsi → Det NP
Noun phrase:          NP → NP PPsi,     NP → NP RC,          NP → N
Relative clause:      RC → RP Y
Y clause:             Y → Vit,          Y → Vt DPsi
Small clause:         SC → DPii,        SC → DPii PP
Prepositional phrase: PPii → Pii DPii,  PPii → Pcon DPii,    PPsi → Psi DPsi,    PP → P SC
Preposition:          P → Psi,          P → Pii

Remarks:
1. Trees generated by the above rules are binary: a node is either a leaf or it has two daughters (a one-one rule just renames the label of a node).
2. We insist that the surface structure tree of a sentence has the same shape as the tree that computes its truth value (in each of its readings).

4 Contexts

Context is a list of (possibly dependently) typed variables. Types denote collections which can be quantified over. The order of appearance of variables in the context is the order of the quantification in the sentence. Thus the number of the variables in the (fully evaluated) context of a sentence is equal to the number of DP arguments of the (main) predicate in the sentence.

1. The context, if not empty, has a head declaration, which is the first declaration of a variable in the context. The head variable and the head type are the variable and the type of the head declaration. No type in the context depends on the head variable. The tail context is the context with the head declaration omitted.
2. The tail context, if not empty, has an active variable declaration. The active variable declaration in the context is the first or the last variable declaration of the tail context. As in the case of the head declaration, the active variable and the active type are the variable and the type of the active declaration (the notion of an active variable declaration is added for reasons of generality, to cover cases of sentences involving more than three arguments).
3. We can extend contexts in two ways: either by adding a variable declaration of a constant type at the beginning of the tail context, or by adding a variable declaration of a type that depends on the active variable at the end of the tail context. The added declaration in the extended context becomes the active declaration.
4. Examples. The first variable added to the contexts below has index 1. The other variables are added in the decreasing order of their indices.
   a. In the context t1 : T1, t2 : T2, t4 : T4, t3 : T3(t4), the declaration t1 : T1 is the head declaration and t2 : T2 is the active declaration.
   b. In the context t1 : T1, t4 : T4, t3 : T3(t4), t2 : T2(t3), the declaration t1 : T1 is the head declaration and t2 : T2(t3) is the active declaration.
   c. The context (4b) can be extended by a variable t of a constant type T to the context t1 : T1, t : T, t4 : T4, t3 : T3(t4), t2 : T2(t3).
   d. If we want to extend context (4b) by a variable t of a dependent type T, we get the context t1 : T1, t4 : T4, t3 : T3(t4), t2 : T2(t3), t : T(t2).

A context is interpreted as a diagram of sets and functions. Its categorical limit is the parameter space of the context, i.e., the set of compatible n-tuples of the elements of the sets corresponding to the types involved.
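The two extension operations can be pictured with a small sketch (our own illustration, not the paper's code; a declaration is represented as a triple of variable, type, and the variable its type depends on, if any):

```python
# Sketch: a context as a list of declarations (variable, type, dependency).
# The head is the first declaration; the tail is everything after it.

ctx = [("t1", "T1", None), ("t4", "T4", None),
       ("t3", "T3", "t4"), ("t2", "T2", "t3")]          # context (4b)

def extend_left(ctx, var, typ):
    """Add a constant-type declaration at the beginning of the tail."""
    return ctx[:1] + [(var, typ, None)] + ctx[1:]

def extend_right(ctx, var, typ):
    """Add a declaration depending on the active (last-added) variable."""
    return ctx + [(var, typ, ctx[-1][0])]

print(extend_left(ctx, "t", "T"))    # yields context (4c)
print(extend_right(ctx, "t", "T"))   # yields context (4d)
```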

5 Typed Operations

In this section we introduce notation and describe the set-theoretic operations that will be used in our semantics. All these operations have a categorical flavour (this will be made explicit in another paper). For the unexplained notions, the reader may consult [7, 8].


5.1 Continuation (Fibered) Monad

Set denotes the category of sets and functions. Let t = {0, 1} be the two-element set. For a set X, P(X) denotes the power-set of X, and Set/X is the category of functions into X, which can be thought of as families of sets indexed by the set X. The value of the continuation monad C : Set → Set on a set A is C(A) = PP(A); thus it can be identified with the set t^(t^A) (when needed). The continuation monad on Set has a lift to the continuation monad C_X : Set/X → Set/X on Set/X. It is defined fiber by fiber, i.e., for a : A → X,

C_X(A, a) = ∐_{x∈X} C(A_x) → X,

where A_x = a^{-1}(x) is the fiber of a over x, and elements of C(A_x) are sent to x. Any element x ∈ X can be continuized via the unit of C, i.e., the map η_X : X → C(X) such that η_X(x) = {U ⊆ X : x ∈ U}. We often omit the index X in η_X.
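Over finite sets the monad can be computed directly. A minimal sketch (ours; feasible only for very small sets, since C(A) = PP(A) grows doubly exponentially):

```python
from itertools import chain, combinations

def powerset(A):
    """All subsets of A, as frozensets."""
    A = list(A)
    return [frozenset(s) for s in
            chain.from_iterable(combinations(A, r) for r in range(len(A) + 1))]

def C(A):
    """C(A) = PP(A): the value of the continuation monad on A."""
    return powerset(powerset(A))

def eta(A, a):
    """Unit of C: the principal ultrafilter {U subset of A : a in U}."""
    return frozenset(U for U in powerset(A) if a in U)

A = {1, 2}
print(len(C(A)))     # 2^(2^2) = 16 elements of PP(A)
print(eta(A, 1))     # the subsets of A containing 1: {1} and {1, 2}
```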

5.2 Operations Σ and Π

Any function f : Y → X induces two functors Σ_f, Π_f : Set/Y → Set/X such that, for b : B → Y, we have:

Σ_f(B, b) = (B, f ∘ b),        Π_f(B, b) = ∐_{x∈X} ∏_{y∈f^{-1}(x)} B_y → X.

5.3 Pile up Operations

For any sets X, Y, we have pile up operations, left and right:

pile up^l_{X,Y}, pile up^r_{X,Y} : C(X) × C(Y) → C(X × Y)

defined, for Q ∈ C(X) and Q′ ∈ C(Y), as follows:

pile up^l_{X,Y}(Q, Q′) = {R ⊆ X × Y : {x ∈ X : {y ∈ Y : R(x, y)} ∈ Q′} ∈ Q},
pile up^r_{X,Y}(Q, Q′) = {R ⊆ X × Y : {y ∈ Y : {x ∈ X : R(x, y)} ∈ Q} ∈ Q′}.
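A hedged Python sketch of the two operations (our illustration; quantifiers are represented extensionally as membership tests for elements of C(X), and the relation 'drape' is invented):

```python
# Quantifiers on a finite set, as membership tests for elements of C(X):
some  = lambda X: (lambda U: len(U) > 0)          # {U : U nonempty}
every = lambda X: (lambda U: U == frozenset(X))   # {U : U = X}

def pile_up_l(Q, Qp, X, Y):
    """R is in pile_up^l(Q, Q') iff {x : {y : R(x,y)} in Q'} in Q."""
    return lambda R: Q(frozenset(
        x for x in X if Qp(frozenset(y for y in Y if (x, y) in R))))

def pile_up_r(Q, Qp, X, Y):
    """R is in pile_up^r(Q, Q') iff {y : {x : R(x,y)} in Q} in Q'."""
    return lambda R: Qp(frozenset(
        y for y in Y if Q(frozenset(x for x in X if (x, y) in R))))

# Toy model: each sheet draped over its own table.
S, T = frozenset({"s1", "s2"}), frozenset({"t1", "t2"})
drape = frozenset({("s1", "t1"), ("s2", "t2")})

print(pile_up_l(some(S), every(T), S, T)(drape))  # surface  (exists-forall): False
print(pile_up_r(some(S), every(T), S, T)(drape))  # inverse  (forall-exists): True
```

The two operations thus yield exactly the two relative scopings of a pair of quantifiers, which is why the choice between left and right pile up tracks the choice between surface and inverse readings.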

5.4 Dependent Right Pile up Operation

The operation of the (right) dependent pile up is defined for a pair of composable functions Z --g--> Y --f--> X:

dpile up^r_{g,f} : Σ_f C_Y(Z, g) ×_X C_X(Y, f) → C_X(Σ_f(Z, g)).

For x ∈ X, Q ∈ (Σ_f C_Y(Z, g))_x and Q′ ∈ C_X(Y, f)_x, it is defined as follows:

dpile up^r_{g,f}(Q, Q′) = {R ⊆ Z_x : {y ∈ Y_x : R_y ∈ Q(y)} ∈ Q′},

where Z_x = (f ∘ g)^{-1}(x) and R_y = R ∩ g^{-1}(y), for y ∈ Y_x. An element Q ∈ Σ_f(C_Y(Z, g)) is a dependent quantifier on (Z, g), combined along f. Then, for y ∈ Y_x, Q(y) ∈ C(Z_y) is a quantifier in the fiber Z_y = g^{-1}(y). If X = 1, we omit the index f (which is ! in this case). This operation combines quantifiers in the fibers of f with the quantifiers in the fibers of g and returns (polyadic) quantifiers in the fibers of f ∘ g.
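For the special case X = 1, the operation admits a direct sketch (ours, with invented data): Q assigns to each y ∈ Y a quantifier on the fiber Z_y, Q′ is a quantifier on Y, and the result tests subsets of Z.

```python
some  = lambda X: (lambda U: len(U) > 0)
every = lambda X: (lambda U: U == frozenset(X))

def dpile_up_r(Q, Qp, Y, Z, g):
    """R is in dpile_up^r(Q, Q') iff {y : R ∩ g^{-1}(y) in Q(y)} in Q'."""
    return lambda R: Qp(frozenset(
        y for y in Y if Q(y)(frozenset(z for z in R if g[z] == y))))

# 'a chair on every truck' (inverse reading): chairs fibered over trucks.
T = frozenset({"t1", "t2"})
Ch = frozenset({"c1", "c2", "c3"})
g = {"c1": "t1", "c2": "t2", "c3": "t2"}          # g : Ch -> T

fiber = lambda t: frozenset(c for c in Ch if g[c] == t)
Q = lambda t: some(fiber(t))                      # 'a' inside each fiber

print(dpile_up_r(Q, every(T), T, Ch, g)(Ch))                  # True
print(dpile_up_r(Q, every(T), T, Ch, g)(frozenset({"c1"})))   # False: t2's fiber empty
```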

5.5 CPS-Transforms

For a function f : X × Y → Z, we can define CPS-operations, left and right:

CPS^l_f, CPS^r_f : C(X) × C(Y) → C(Z),

so that, for Q ∈ C(X) and Q′ ∈ C(Y), we have:

CPS^l_f(Q, Q′) = C(f)(pile up^l(Q, Q′)),
CPS^r_f(Q, Q′) = C(f)(pile up^r(Q, Q′)).

The function f in most cases is the obvious one and then we might omit it.
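The composition with C(f) can be sketched as follows (ours; the left pile up from the previous sketch is inlined, and the map f is an invented pairing, so this is an illustration of the mechanism rather than the paper's implementation):

```python
from itertools import product

def cps_l(f, Q, Qp, X, Y):
    """CPS^l_f(Q, Q') as a test on subsets S of Z: is f^{-1}(S) in pile_up^l(Q, Q')?"""
    def holds(S):
        R = frozenset(p for p in product(X, Y) if f(*p) in S)   # f^{-1}(S)
        return Q(frozenset(
            x for x in X if Qp(frozenset(y for y in Y if (x, y) in R))))
    return holds

some = lambda X: (lambda U: len(U) > 0)
X, Y = frozenset({1, 2}), frozenset({10, 20})
f = lambda x, y: x + y                     # an invented f : X × Y -> Z
print(cps_l(f, some(X), some(Y), X, Y)(frozenset({11, 22})))   # True
```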


5.6 Dom Operation

Last, there is a simple operation of 'taking the domain of a partial function'. For any partial function f : X → Y, we define:

dom(X, f, Y) = {x ∈ X : f(x)↓},

i.e., the set of those x ∈ X for which f(x) is defined.

6 Polymorphic Data and Polymorphic Comprehension

Many syntactic symbols will be interpreted as polymorphic operations. By this we mean that they in fact constitute a family of interpretations that depend on some other precisely specified data (e.g., a set, two sets, a function, etc.).

6.1 Polymorphic Data

1. A polymorphic indexed set T associates to every set X a function π_X : T(X) → X.
2. A polymorphic unary (binary, n-ary) relation R associates to every set X (pair of sets X1, X2; n-tuple of sets X1, ..., Xn) a relation R_X ⊆ X (a binary relation R_{X1,X2} ⊆ X1 × X2; an n-ary relation R_{X1,...,Xn} ⊆ X1 × ··· × Xn). If we apply one set to an n-ary relation, the result is an (n−1)-ary relation (in the obvious sense).
3. A polymorphic (partial) function p is a polymorphic binary relation that associates to every pair of sets X1, X2 a (partial) function p : X1 → X2.
4. By a quantifier Q we understand an association to every set X of an element Q(X) ∈ C(X) (such quantifiers can be applied to fibers: Q(A, a) = ∐_{x∈X} Q(A_x) → X). We distinguish the notion of a quantifier (which is polymorphic, as above) from the notion of a quantifier Q on a set X, which simply means an element Q ∈ C(X).

Having such polymorphic data, we can consider polymorphic operations on them.

6.2 Comprehension 1

This operation is performed on a polymorphic unary relation R : Set → Set. It is just evaluation on a set, i.e., for a set X, we have:

Cph1(R, X) = R_X ⊆ X.


6.3 Comprehension 2

The operation Cph2 takes a polymorphic binary relation R, a set Y and a quantifier Q on the set Y, and returns a unary polymorphic relation so that, for a set X, it returns:

Cph2(R, Q, Y)(X) = {x ∈ X : {y ∈ Y : R_{X,Y}(x, y)} ∈ Q} ⊆ X.
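Over finite data the operation can be sketched directly (ours; the relation and the counts are invented, modeling a 'with three doors' restriction):

```python
def cph2(R, Q, Y):
    """Cph2(R, Q, Y): maps a set X to {x in X : {y in Y : R(x, y)} in Q}."""
    return lambda X: frozenset(
        x for x in X if Q(frozenset(y for y in Y if R(x, y))))

doors = frozenset({"d1", "d2", "d3"})
has = {("truck1", "d1"), ("truck1", "d2"), ("truck1", "d3"), ("truck2", "d1")}
three = lambda U: len(U) == 3                 # the quantifier 'three'

trucks = frozenset({"truck1", "truck2"})
with_three_doors = cph2(lambda x, y: (x, y) in has, three, doors)
print(with_three_doors(trucks))               # only truck1 qualifies
```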

7 Computation of Truth Conditions

The computation of the truth value of the sentence follows the shape of the syntactic tree of that sentence. The computation can be thought of as visiting all the nodes of the tree in post order, i.e., we compute the value of the left subtree, we then compute the value of the right subtree, and finally we compute the value of the node itself (following the procedures explained below). In the S node, the root, there is a minor departure from this: we compute the left argument and we put the head declaration into the context before calculating the value of the right subtree (VP) and computing the value of the whole sentence.

Each node of the tree has a label which is one of the syntactic categories listed in Sect. 3. The syntactic categories are paired with the (semantic) computation procedures to be performed in a given node. Each node of the tree is either a leaf (with a word attached to it) or an inner node, i.e., it has two daughters Left and Right. The semantic value of any leaf is the interpretation of the word attached to that leaf. The semantic value of inner nodes is computed using the semantic values of the nodes Left and Right, according to the procedures described below. Below, we specify for each syntactic category (i.e. label of a node of the surface structure tree) how we compute the value given the arguments and how and when we extend the context.
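The traversal itself can be sketched generically (our illustration only; the actual procedures of Sects. 7.2–7.4 additionally thread a context through the traversal, and the S node departs from pure post-order as noted above):

```python
# Sketch of post-order evaluation: leaves yield lexical interpretations,
# inner nodes combine the values of Left and Right by a label-indexed rule.

def evaluate(node, lexicon, combine):
    if isinstance(node, str):                  # leaf: a word
        return lexicon[node]
    label, left, right = node
    lval = evaluate(left, lexicon, combine)    # visit Left first,
    rval = evaluate(right, lexicon, combine)   # then Right,
    return combine[label](lval, rval)          # then the node itself

lexicon = {"some": "SOME", "man": "MAN"}       # placeholder interpretations
combine = {"DPsi": lambda q, t: (q, t)}        # DPsi returns the pair <Q, T>
print(evaluate(("DPsi", "some", "man"), lexicon, combine))
```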

7.1 Leaves (Data)

1. Det - determiner. Data: Q - quantifier (polymorphic).
2. N - noun. Data: T - set or polymorphic indexed set.
3. V - verb. Data: V - n-ary relation (polymorphic).
4. Vit - intransitive verb. Data: P - unary relation (polymorphic).
5. Vt - transitive verb. Data: R - binary relation (polymorphic).


6. RP - relative pronoun. Data: none.
7. Psi - surface scope preposition. Data: p - polymorphic binary relation.
8. Pii - inverting preposition (locative preposition). Data: p - polymorphic partial function.
9. Pcon - preposition (connector to a relational noun). Data: p - polymorphic function.

7.2 Passing Value Nodes

1. RC - relative clause
   Left: relative pronoun with empty semantic content (e.g. who).
   Right: P - polymorphic unary relation.
   Value: return P.

2. DPsi - DP with surface interpretation
   Left: Q - (polymorphic) quantifier.
   Right: T - set or polymorphic indexed set.
   Value: ⟨Q, T⟩.

3. PPii - prepositional phrase with inverse interpretation (in DPii)
   Left: p - polymorphic (partial) function (preposition Pii or Pcon).
   Right: ⟨Q, T⟩, where T is a set and Q a quantifier on T.
   Value: ⟨p, Q, T⟩.

7.3 Nodes with Operations

1. VP - verb phrase
   Left: V - n-ary relation with n > 1.
   Right: ⟨Q, T⟩, where:
      a. T - set that is the interpretation of the active type in the context;
      b. Q - polyadic quantifier on the parameter space of the tail context.
   Value: CPS(η(V̄), Q) ∈ CP(T1).
   Computation:
      a. let V̄ be the interpretation of V in the parameter space of the whole context;
      b. compute CPS(eps)(η(V̄), Q) ∈ CP(T1) on the morphism

         eps : P(∏_{i=1}^{n} Ti) × ∏_{i=2}^{n} Ti → P(T1).

2. DPii - DP with inverse interpretation
   Left: ⟨Q, T⟩, where Q is a quantifier.
   Right: ⟨p′, Q′, T′⟩, where Q′ is a quantifier on the set T′. Moreover:
      Case 1: either p′ is a polymorphic partial function (Pii) and T is a set;
      Case 2: or p′ is a polymorphic function (Pcon) and T is a polymorphic indexed set.
   Value: ⟨Q̃, T̃⟩, where T̃ is a set and Q̃ is a quantifier on T̃.
   Computation:
      Case 1: Put T̃ = dom(T, p′_{T,T′}). So π (= p′_{T,T′}) : T̃ → T′ is a function.
      Case 2: Apply the polymorphic indexed set T to the set T′, getting π = p′_{T(T′),T′} : T̃ = T(T′) → T′.
   Thus in both cases we get a function π : T̃ → T′. We apply the quantifier Q to π : T̃ → T′, getting Q(T̃, π) ∈ ∐_{t:T′} C(T̃(t)), and Q′ to T′, getting Q′(T′) ∈ C(T′). Then the quantifier Q̃ quantifies over the set T̃ = ∐_{t∈T′} T̃(t) and is given by the dependent pile up:

      Q̃ = dpile up^r(Q(T̃, π), Q′(T′)).

3. NP - noun phrase
   Left: T - set.
   Right: P - polymorphic unary relation.
   Value: Cph1(P, T) - set.

4. Y - Y clause
   Left: R - polymorphic binary relation.
   Right: ⟨Q, T⟩, where T is a set and Q is a (polymorphic) quantifier.
   Value: Cph2(R, Q, T) - polymorphic unary relation.

5. PPsi - prepositional phrase with surface interpretation
   Left: p - polymorphic binary relation (Psi).
   Right: ⟨Q, T⟩, where T is a set and Q is a quantifier on T.
   Value: Cph2(p, Q, T) - polymorphic unary relation.


7.4 Nodes with Operations and Context Extensions

1. S - sentence
   Left: ⟨Q, T1⟩ - T1 is a set and Q is a quantifier on T1.
   Right: V - continuized (unary) relation on T1.
   Value: CPSε(Q(T1), V)(id_t) ∈ t (truth value of the whole sentence).
   Computation:
      a. put t1 : T1 into the context (as the head type in the context); T1 is a new type interpreted as the set T1;
      b. choose a suitable ε ∈ {l, r} and compute CPSε(eps1)(Q(T1), V) ∈ C(t), where eps1 : T1 × P(T1) → t is the obvious map.

2. SC - small clause
   Left: ⟨Q, T⟩ - T is a set, and Q a quantifier.
   Right: ⟨p′, Q′, T′⟩, where:
      a. p′ - polymorphic binary relation (Pii or Psi);
      b. T′ - set that is the interpretation of the active type in the context (Tk or Tm);
      c. Q′ - polyadic quantifier on the parameter space of the tail context.
   Value: ⟨Q̃, T̃⟩, where:
      a. Q̃ - quantifier on the parameter space of the newly extended tail context;
      b. T̃ - set.
   Computation 1 (no proviso: p′ is any preposition; the value of p′ is not used):
      a. T̃ = T;
      b. add to the context a declaration t : T̃ of a new variable t of a constant type T̃ interpreted as the set T̃, following the declaration of the head type variable: t1 : T1, t : T̃, tk : Tk, ..., tm : Tm;
      c. Q̃ = pile up^l(Q(T̃), Q′(T′)).
   Computation 2 (proviso: p′ is Pii and p′_{T,T′} : T → T′ is a partial function):
      a. T̃ = dom(T, p′_{T,T′});
      b. add to the context a declaration t : T̃ of a new variable t of a dependent type T̃(t′) (where t′ is the variable of the active type interpreted as T′, i.e., either Tk or Tm) at the end of the context: t1 : T1, tk : Tk, ..., tm : Tm, t : T̃(t′);
      c. Q̃ = dpile up^r(Q(T̃), Q′(T′)).

3. PP - prepositional phrase (in VP)
   Left: p - polymorphic partial function (Pii or Psi).
   Right: ⟨Q, T⟩, where:
      a. T - interpretation of the type last added to the context;
      b. Q - polyadic quantifier on the parameter space of the new tail context extended by t : T.
   Value: ⟨p, Q, T⟩.
   Computation: if a declaration t : T does not occur in the context yet, add it to the context (this applies to the last argument only when the context has only the declaration of the head type): t1 : T1, t : T.

7.5 Worked-Out Example

The table below serves to record the dynamic aspect of the computation process of the meaning of the example sentence below, the value at the root in node S. We first explain the conventions used in the table that record the dynamics of the process. A schematic row has the form:

   node: Lk | Return: value to be returned | Left: L′k′ =? | Right: L″k″ =? | Ctx: x : X | in: start | out: finish

Each row of the table records the computation of a single procedure. In the first column, node, there is the label of the node, say Lk, which is at the same time the name of the procedure L performed (k is the number that distinguishes this call of the procedure L). The column Return contains the semantic value computed in that row (by the procedure Lk) and returned to the suitable place Lk =? in a column Left or Right below. In the columns Left and Right there are names of procedures (numbered), followed by =?, say L′k′ =? and L″k″ =?. These are the procedures called by the procedure L: first Left (L′), and then Right (L″); if the name of the procedure is the label of a leaf node with a word attached, say xyz, we directly provide the semantic interpretation ‘xyz of such a word. We then perform the computation of the value of the procedure Lk. The moment the computation process begins for the given call of a procedure, start, is recorded in the column in, and the moment the computation process for this call of the procedure is finished, finish, is recorded in the column out. In the column Ctx the declaration of the variable x : X that is added to the context is recorded. One can think of the table as if it were the content of the stack reflecting


the recursive character of the computation. What is currently on the top of the stack is the procedure currently being computed. We wanted this process to be somehow related to the process of understanding of an utterance of such a sentence.

Example: Some man loaded a chair on every truck with three doors.

We assume that the utterance of the above example has already been parsed into a syntactic tree:

[S [DPsi,1 [Det1 some] [N1 man]]
   [VP [V loaded]
       [SC [DPsi,2 [Det2 a] [N2 chair]]
           [PP1 [Pii,1 on]
                [DPsi,3 [Det3 every]
                        [NP3 [N3 truck]
                             [PPsi,1 [Psi,1 with]
                                     [DPsi,4 [Det4 three] [N4 doors]]]]]]]]

The computation process by which the truth value of the above sentence is computed and the dependently typed context is built along the (syntactic) tree is presented below.

node   | Return                                                         | Left          | Right            | Ctx       | in | out
DPsi,4 | ⟨Det4, N4⟩                                                     | Det4 = ‘three | N4 = ‘doors      |           | 12 | 13
PPsi,1 | Cph2(Psi,1, DPsi,4)                                            | Psi,1 = ‘with | DPsi,4 =?        |           | 11 | 14
NP3    | T = Cph1(N3, PPsi,1)                                           | N3 = ‘truck   | PPsi,1 =?        |           | 10 | 15
DPsi,3 | ⟨Det3, NP3⟩                                                    | Det3 = ‘every | NP3 = T =?       |           |  9 | 16
PP1    | ⟨Pii,1, ⟨Det3, NP3⟩⟩                                           | Pii,1 = ‘on   | DPsi,3 =?        | t : T     |  8 | 17
DPsi,2 | ⟨Det2, N2⟩                                                     | Det2 = ‘a     | N2 = ‘chair      |           |  6 |  7
SC     | C̃ = dom(N2, Pii,1, NP3), Q̃ = dpile up^r(Det2(N2), Det3(NP3)) | DPsi,2 =?     | PP1 =?           | c : C̃(t) |  5 | 18
VP     | CPS(η(V), Q̃)                                                  | V = ‘load     | SC = ⟨Q̃, C̃⟩ =? |           |  4 | 19
DPsi,1 | ⟨Det1, N1⟩                                                     | Det1 = ‘some  | N1 = ‘man        |           |  2 |  3
S      | CPSε(Det1(N1), VP)(id_t)                                       | DPsi,1 =?     | VP =?            | m : M     |  1 | 20

Context: m : M, t : T, c : C̃(t)

Now we describe step by step the computation recorded in the table. Each inner node can occur on three (in principle different) occasions: when first discovered (in), when the Left argument is computed, and when the Right argument is computed. We then finish the computation of the value to be returned. The labels preceding the description of each action consist of the name of the procedure governing the action in question, and the number(s) which correspond to the moments in and/or out (if applicable).

S; 1: To compute the Left argument, we call the procedure DPsi,1.
DPsi,1; 2,3: Both arguments are leaves. The Left argument is the semantic value of Det1, i.e., the polymorphic quantifier ‘some. The Right argument is the semantic value of N1, i.e., the interpretation ‘man of the constant type Man. Then the pair ⟨‘some, ‘man⟩ is Returned to the computation S.
S: Control comes back to S to put the head declaration m : Man into the context and proceed to compute the Right argument VP.
VP; 4: Left directly returns the interpretation of V: the polymorphic relation ‘load, and we proceed to compute the Right argument SC.
SC; 5: Left calls procedure DPsi,2.
DPsi,2; 6,7: Left directly returns the value of Det2: the polymorphic quantifier ‘a. Right directly returns the value of the leaf N2: the interpretation ‘chair of the constant type Chair. Then the pair ⟨‘a, ‘chair⟩ is Returned to SC.
SC: Right calls procedure PP1.
PP1; 8: Left returns the interpretation of Pii,1: the polymorphic partial function ‘on. Right calls DPsi,3.
DPsi,3; 9: Left returns the semantic value of the terminal node Det3: the polymorphic quantifier ‘every. Then Right calls NP3.
NP3; 10: Left returns the semantic value of the terminal node N3: the interpretation ‘truck of the constant type Truck. Right calls PPsi,1.
PPsi,1; 11: Left returns the value of the leaf Psi,1: the polymorphic binary relation ‘with. Right calls DPsi,4.
DPsi,4; 12,13: Left returns the value of the leaf Det4: the polymorphic quantifier ‘three. Right returns the value of the leaf N4: the interpretation ‘doors of the constant type Door. Return the value ⟨‘three, ‘doors⟩ to PPsi,1.
PPsi,1; 14: Return the value Cph2(‘with, ⟨‘three, ‘doors⟩) to NP3.
NP3; 15: Return Cph1(‘truck, Cph2(‘with, ⟨‘three, ‘doors⟩)) (= ‘truck with three doors) to DPsi,3.
DPsi,3; 16: Return the value ⟨‘every, ‘truck with three doors⟩.
PP1; 17: Add the variable declaration t : T(ruck with three doors) to the context: m : M, t : T.
SC; 18: Return the value ⟨‘on, ⟨‘every, ‘truck with three doors⟩⟩. The locative preposition on can be interpreted as the partial function p_{C(hair),T(ruck with three doors)} : C → T. We restrict the set of chairs to the set of chairs loaded on trucks with three doors: C̃ = dom(‘chair, ‘on, ‘truck with three doors). We extend the context by adding a declaration of a new variable c of the dependent type C̃(t): m : M, t : T, c : C̃(t). Then the quantifier Q̃ quantifies over the set C̃ = ∐_{t∈T} C̃(t) and is given by the dependent pile up: Q̃ = dpile up^r(‘a(C̃), ‘every(T)).
VP; 19: We apply all the types in the context to V to get a relation V̄_{M,T,C̃(t)} on the parameter space of the context M × ∐_{t∈T} C̃(t) ≅ M × C̃. Thus V̄_{M,T,C̃(t)} is a relation on triples ⟨m, t, c⟩ such that m ∈ M, t ∈ T and c ∈ C̃(t). We continuize V̄ and apply it to Q̃: CPS(eps)(η(V̄), Q̃) ∈ CP(M).
S; 20: Finally, we choose a suitable ε ∈ {l, r}, compute the CPS-transform on the arguments, and evaluate it at id_t to get a truth value in t, i.e., we get the truth value of the whole sentence: CPSε(eps1)(‘some(M), V)(id_t) ∈ t. Depending on whether we use CPS^l or CPS^r, we get either the surface reading (∃m:M ∀t:T ∃c:C̃(t)) or the inverse reading (∀t:T ∃c:C̃(t) ∃m:M) for the sentence.

Thus VM ,T , C(t) is a relation on triples m, t, c such that m ∈ M , t ∈ T and c ∈  C(t). Finally, we choose a suitable ε ∈ {l, r}, compute CPS-transform on the arguments, and evaluate it at idt to get a truth-value in t, i.e., we get the truth-value of the whole sentence CPSε (eps1 )(‘some(M ), V )(idt ) ∈ t. Depending on whether we use CPSl or CPSr , we get either the ) or the inverse reading (∀t:T ∃c: ∃ ) for surface (∃m:M ∀t:T ∃c: C(t) C(t) m:M the sentence.

8 Conclusion In this paper we provided a new scope-taking system with dependent types and continuations. We defined the recursive procedure by which the interpretation is computed and the dependently typed context is built along the (syntactic) tree. Using this procedure, we can interpret surface structures for inverse readings directly and in a fully compositional way. A further advantage of our system is that it does

176

J. Grudzi´nska and M. Zawadowski

not overgenerate—it produces all and only the attested readings for the defined fragment. The core idea is that certain lexical elements are responsible for inverting scope: relational nouns (treated as dependent types) and locative prepositions (that can induce dependencies). This allowed us to provide a principled solution to the question of why certain constructions missing such elements block inverse scope. Acknowledgements This article is funded by the National Science Center on the basis of decision DEC-2016/23/B/HS1/00734. The authors would like to thank the anonymous reviewers for valuable comments.

References 1. Barker, C.: Continuations and the nature of quantification. Nat. Lang. Semant. 10(3), 211–242 (2002) 2. Barker, C., Shan, C.-C.: Continuations and Natural Language, vol. 53. Oxford Studies in Theoretical Linguistics (2014) 3. Bruening, B.: QR obeys superiority: frozen scope and ACD. Linguist. Inq. 32(2), 233–273 (2001) 4. Bruening, B.: Double object constructions disguised as prepositional datives. Linguist. Inq. 41(2), 287–305 (2010) 5. Cooper, R.: Quantification and Semantic Theory. Reidel, Dordrecht (1983) 6. Grudzi´nska, J., Zawadowski, M.: Inverse linking: taking scope with dependent types. In: Proceedings of the 21st Amsterdam Colloquium, pp. 285–295 (2017) 7. Grudzinska, J., Zawadowski, M.: Scope ambiguities, monads and strengths. J. Lang. Model. 5(2), 179–227 (2017) 8. Grudzinska, J., Zawadowski, M.: Continuation semantics for multi-quantifier sentences: operation-based approaches. Fundam. Inform. 164(4), 327–344 (2019) 9. Grudzi´nska, J., Zawadowski, M.: Inverse linking, possessive weak definites and Haddock descriptions: a unified dependent type account. J. Log. Lang. Inf.. 1–22 (2019) 10. Harley, H.: Possession and the double object construction. Linguist. Var. Yearb. 2(1), 31–70 (2002) 11. Harley, H., Jung, H.K.: In support of the phave analysis of the double object construction. Linguist. Inq. 46(4), 703–730 (2015) 12. Hendriks, H.: Studied flexibility: categories and types in syntax and semantics. Institute for Logic, Language and Computation (1993) 13. Makkai, M.: First order logic with dependent sorts, with applications to category theory. http:// www.math.mcgill.ca/makkai (1995) 14. Marantz, A: Implications of asymmetries in double object constructions. In: Theoretical Aspects of Bantu Grammar, pp. 113–150. CSLI (1993) 15. Martin-Löf, P.: An intuitionistic theory of types: predicative part. Stud. Log. Found. Math. 80, 73–118 (1975) 16. Martin-Löf, P., Sambin, G.: Intuitionistic type theory, vol. 9. Bibliopolis Napoli (1984) 17. May, R: The grammar of quantification. Ph.D. thesis, Massachusetts Institute of Technology, 1978 18. May, R.: Logical Form: Its Structure and Derivation, vol. 12. MIT Press (1985) 19. Partee, B.H., Borschev, V.: Sortal, relational, and functional interpretations of nouns and Russian container constructions. J. Semant. 29(4), 445–486 (2012) 20. Pesetsky, D.M.: Zero Syntax: Experiencers and Cascades. MIT Press (1996) 21. Zimmermann, M.: Boys buying two sausages each: on the syntax and semantics of distancedistributivity. Ph.D. thesis, Netherlands Graduate School of Linguistics, 2002

On the Coevolution of Language and Cognition—Gricean Intentions Meet Lewisian Conventions Nikola Anna Kompa

A long and complex train of thought can no more be carried on without the aid of words, whether spoken or silent, than a long calculation without the use of figures or algebra Charles Darwin

Abstract How might human language have emerged; and is it essentially based on convention or intention? The paper is an attempt at outlining an answer to the first question by trying to answer the second. It is not an attempt at formally modeling the evolution of language-apt creatures or the emergence of human communication. The aim is to achieve a better understanding of the phenomena (language and communication) to be modeled. Our starting point will be the idea that linguistic and other cognitive capacities must have coevolved and provided mutual scaffolding for one another. We will follow Tomasello, Bickerton and others in assuming that cooperation is the key to language evolution. This will motivate us to model the emergence of communication and meaning along game-theoretic lines, as suggested by Lewis and others. One pressing problem is the problem of equilibrium selection. We will briefly discuss an alleged solution but will find it wanting as it does not model the emergence of human language. The remainder of the paper will be devoted to arguing that combining Grice’s intention-based model of meaning with Lewis’ account of conventions helps explain how human language might have unfolded and solve the problem of equilibrium selection.

N. A. Kompa (B) Institute of Philosophy, Osnabrück University, 49076 Osnabrück, Germany e-mail: [email protected] © Springer Nature Switzerland AG 2020 R. Loukanova (ed.), Logic and Algorithms in Computational Linguistics 2018 (LACompLing2018), Studies in Computational Intelligence 860, https://doi.org/10.1007/978-3-030-30077-7_8

177

178

N. A. Kompa

1 Introduction Although language was a topic of interest throughout antiquity and the middle ages, philosophers did not speculate much about the origin of language. Many Christian thinkers thought that language was a gift from God anyway. While this was still the dominant view in the 18th century, some people nonetheless began to, carefully, speculate about the origin of language. It quickly became the subject of heated debates in the second half of the 18th and the first six decades of the 19th century ([1], pp. 389–400, for a historical overview). And with the advent of the idea of transformation in nature in the 18th century, (what may anachronistically be called) biological theories of evolution began to unfold and to affect the study of language and its origin.1 But then, after the publication of Darwin’s The Origin of Species in 1859 theorizing became plagued by outlandish speculations. By 1866 this situation had deteriorated to such an extent that the influential Societé de Linguistique de Paris imposed a ban on all discussions of the topic and effectively excluded all theorizing about language evolution from scientific discourse for more than a century ([4], p. 300).

Other linguistic societies followed suit. For all that, a couple of interesting ideas were already around at that time. One was the idea that language and thought (or cognition, as it is commonly put these days) must have coevolved. More specifically, many authors noted that language and cognition seem to presuppose one another. As Jean-Jacque Rousseau pointed out in his discourse on inequality (1755), “if men needed speech in order to learn to think, they had a still greater need for knowing how to think in order to discover the art of speaking” ([5], p. 57). In 1766, Johann Peter Süßmilch famously drew the conclusion that both language and thought had to be gifts of God because they could not have come into existence independently of one another [6]. By way of reply, Johann Gottfried Herder noted that this is not an argument for God’s intervention, because even if they were gifts of God, the problem would remain: If we were not able to think yet, how could God have taught us language, and if we were not yet language-apt, how could he have taught us the use of reason? And while Rousseau ended up with a sort of conventional account of language, Herder claimed that language and thought came into existence simultaneously–yet without the aid of God: the first act of reason was the first linguistic act [7]. Closely related was the idea that language scaffolds the development of certain cognitive skills–pointed out by Étienne Bonnot de Condillac as early as 1746. Condillac not only claimed a role for language in enhancing memory (as others had done before him, e.g., Thomas Hobbes and John Locke) but pointed out that by learning to voluntarily produce arbitrary signs we learn to conjure up associated ideas or concepts at will, as a “single arbitrary sign is enough for a person to revive an idea by himself” ([8], I, 2, §49); a notion that nowadays goes by the name offline thinking 1 The idea that species or life forms transform was already around at the beginning of the 19th century

(cf. [2]). Given that, before Darwin, the term ‘evolution’ was primarily used in an embryological sense (cf. [3]), precursors to Darwin’s account are occasionally called ‘transformist’ accounts.

On the Coevolution of Language and Cognition—Gricean Intentions …

179

(cf. [9], pp. 79–80; [10], p. 107; [11]. He also emphasized that arbitrary signs help the individual to direct attention “away from those [things] it has before the eyes at the moment” ([8], I, 2, §47); they make us “the masters of the exercise of our imagination” ([8], I, 2, §49). This is a precursor to what today is called ‘cognitive control’ (or ‘executive functioning’) [12, 13]. Others stressed a role of language for more complex cognitive accomplishments, such as the carrying out of a “long and complex train of thought” ([14], p. 45). The augmenting role of language for certain cognitive functions is substantiated by current research (cf. [15, 16]; and cf. [17] for an overview over more recent discussions). On the other hand, there is a growing body of evidence for the assumption that language mastery in turn presupposes certain (socio-)cognitive functions, such as cognitive control, i.e., “the ability to pursue goal-directed behaviours, in the face of otherwise more habitual or immediately compelling behaviours” ([18], p. 3).2 Also, language imposes rather substantive cognitive “demands on short-term memory and processing capacities” ([19], p. 122). As Stephen Levinson and Russell Grey put it: There is a growing consensus that there is a substantial nonlinguistic cognitive infrastructure to human language, consisting of the motivational structure, the cooperative basis, the theory of mind and universal interactional ethology […] ([20], pp. 70–171).

More fully, then, it seems as if language and cognition provide–and might, arguably, have done so in phylogeny–scaffolding for one another. This, then, will be my starting point. I will assume, in what follows, that neither human language nor human cognition could have come about in one fell swoop but that they must have, slowly, coevolved and thereby mutually scaffolded or bootstrapped one another. Little by little, language might have helped to improve our cognitive capacities and they in turn might have helped to improve our linguistic capacities. A certain cognitive achievement (e.g., the voluntary production of a sign), might have made certain (proto-)linguistic capacities possible (e.g., the production of the sign in order to evoke an idea3 in another one’s mind), which in turn might have fostered certain cognitive developments (e.g., the conjuring up of an idea in one’s own mind by voluntarily producing the sign associated with it), which in turn might have made more advanced uses of signs possible (e.g., combining them), and so on and so forth. Accordingly,

2 There is accumulating evidence that cognitive control, which requires the inhibition of a prepotent

(or automatic) response as well as cognitive flexibility, i.e., the ability to entertain and switch between (possibly conflicting) task sets or representations, is necessary for language processing and, in turn, augmented by having a symbolic language at one’s disposal (cf., e.g., [21–25]; cf. [26] for an overview). Analogical reasoning and transfer, i.e. “the application of learned regularities to novel domains and/or modalities” ([27], p. 117), seem also required for–and at the same enhanced by–language mastery (cf. [28–31]). Language and categorization are intimately intertwined, too (cf. [32] for review). This list could be continued (cf. [33] for a discussion). 3 By using the term ‘idea’ I don’t mean to suggest that it is a pictorial or percept-like representation that is thus evoked; still, it seems plausible to assume that more embodied (instead of fully amodal) representations are evoked by those early uses of signs and at this stage of cognitive and linguistic development (cf. [33]).

180

N. A. Kompa

the (ultimate) goal will be to develop a coevolutionary and incremental account of the evolution of language (–not a saltationist account à la [34].4 A Darwinian theory of the evolution of language must be incremental: to explain the transition from a hominin baseline with great ape grade communicative capacities to languageequipped hominins in a series of small steps ([35], p. 271).

Still, one might wonder what exactly needs to be explained in order to explain the evolution of human language. At the most basic level, I take language to be a system of labels and rules (or norms). Moreover, the account to be developed here is premised on the assumption that a major (but nonetheless gradual) difference between human language and animal communication systems lies in the way in which our ancestors came to use labels and how they thereby became sensitive to certain rules or norms (cf. [37]. More specifically, as a result of greater flexibility in using labels, our ancestors came to mean something by producing those labels. Then, as certain (communicative) behavioral regularities emerged, the labels acquired conventional meaning and syntactic properties. This, is turn, enabled our ancestors to engage in more elaborate forms of communication and pragmatic inferences. Consequently, I suggest that the three things most in need of explanation are (i) how our ancestors came to (non-naturally) mean something by using signs or labels and learnt to engage in (increasingly complex) pragmatic reasoning (i.e., how Gricean communication came about); (ii) how signs acquired conventional meaning (how behavioral regularities and a responsiveness to norms emerged); and (iii) how signs acquired syntactic properties. In this paper, I will have only very little to say to (iii) and focus instead on the former two aspects. Yet even with respect to (i) and (ii), the picture will remain rather impressionistic. So in what follows, I will defend, in an admittedly exploratory way, the idea that if we aim at understanding–and carefully modeling–how human language and communication emerged, we need to explain the coevolution and mutual bootstrapping of linguistic (or communicative) and cognitive skills by explaining how semantic and mental content–or conventional meaning and communicative intentions–might have coevolved.5,6 More fully, we will start from the assumption that cooperation was 4 Cf.,

e.g., Jackendoff ‘s ([36], p. 238) suggestion as to possible incremental evolutionary steps.

5 Did language evolve primarily as a means of communication or a means of thought? Noam Chom-

sky and Robert Berwick, e.g., claim that “…language evolved as an instrument of internal thought, with externalization a secondary process” ([34], p. 74). Lev Vygotsky and Alexander Luria, on the other hand (and adopting, in contrast to Chomsky and Berwick, a developmental perspective), have it that “the sign primarily appears in the child’s behaviour as a means of social relations, as an interpsychological function. Becoming afterwards a means by which the child controls its behaviour”, this being an example “of the transformation of means of social behaviour into means of individual psychological organization” ([13], p. 138). The ‘ecumenical’ account suggested here sees the two functions as being intimately intertwined (with communication being given a head start). 6 An intriguing question is how mental states with content– mental representations–might have evolved in the first place. This is a problem Teleosemantics aims to tackle (the locus classicus is [38]). It aspires to explain how certain inner states acquire the function of reliably indicating certain states of the world, and how they thus acquire (natural) meaning [39]. More specifically, as Karen Neander explains: “According to teleological theories of content, what a representation represents

On the Coevolution of Language and Cognition—Gricean Intentions …

181

key in the evolution of our species in general and of language in particular (Sect. 2). This will allow us to model the emergence of communication (and meaning) along game-theoretic lines as suggested by David Lewis (Sect. 3). One pressing problem of the model is the problem of equilibrium selection. We will briefly discuss replicator dynamics as an alleged solution but will find it wanting (Sect. 4) as it does not (yet) model the emergence of human communication (Sect. 5). The remainder of the paper (Sects. 6–8) will be devoted to arguing that combining Paul Grice’s model of communication with David Lewis’ account of conventions (in the spirit of Jonathan Bennett’s 1976) [42]) will take us some way towards explaining the emergence of human communication and solving the problem of equilibrium selection. The two accounts need to be seen as intimately intertwined. That will also shed some light on the perennial linguistic-philosophical problem of how intention and convention relate to one another–or so I will argue. We will, at times, adopt an evolutionary perspective. Yet we will nowhere make claims about how language actually evolved, as the empirical basis for answering this question is too thin to base any definite claim on it. Think of it as a thought experiment, empirically informed; let us ask how language might have evolved, then.7 Given the speculative nature of the project, the task will be to tell as coherent a story as possible which is compatible with the empirical evidence gathered so far, has as few explanatory gaps as possible, invokes only scientifically acceptable mechanisms (such as natural selection, exaptation, etc.), and does not secretly presuppose (sociocognitive) capacities that could not have been in place yet. Obviously, I will not be able to deliver on the promise of providing a comprehensive account of language evolution; the aim is more moderate as I will only try to outline the beginnings of a story of how our ancestors transformed into creatures of considerable linguistic and socio-cognitive accomplishment. Many important steps will be missing!

depends on the functions of the systems that produce or use the representation. The relevant notion of function is said to be the one that is used in biology and neurobiology […]. Proponents of teleological theories of content generally understand such functions to be what the thing with the function was selected for, either by ordinary natural selection or by some other natural process of selection” ([40], p. 1). These theories face various indeterminacy problems ([40], p. 1) and are said to classify too many states as representational states (cf. [41] for discussion). From the perspective of this paper, the most pressing problem is, again, to provide an incremental account of the transition from basic forms of representational states to fully propositional, complex thought and language, and of the coevolution of mental and semantic content. 7 From a strictly biological point of view, it is not language but language-apt-creatures that evolved; the focus would be on the purely biological prerequisites of language production and comprehension. In the paper, I will assume a broader perspective, as I am interested in the socio-cognitive prerequisites and effects of human language.


2 Cooperation

There is almost consensus by now among evolutionary scholars that cooperation was a key ingredient in the evolution of our species in general and of human language and communication in particular. There is less (to no) consensus on the question of how exactly cooperative behavior evolved. Also, cooperation of some sort or other can be found in other species as well. What is the distinctive feature of human cooperation, then; and what adaptive value would linguistic capacities therein have had? Many authors stress the role of cooperative breeding, i.e. the sharing of infant caretaking duties within a group ([43]; cf. also [44], chap. 26). Focusing on the evolution of language and communication, they emphasize the importance of niche construction and point out shifts in economy and foraging strategies. Michael Tomasello and colleagues, e.g., emphasize the role of mutualism and propose that cooperative communication was adaptive initially because it arose in the context of mutualistic collaborative activities in which individuals helping others were simultaneously helping themselves ([45], p. 170; cf. [46, 47]).

Kim Sterelny (in promoting a model of gene-culture coevolution; cf. also [48]) argues that there have been major changes in hominin lifeways when our ancestors became skilled tool-users and foragers [49]. Economy shifted from mutualism to indirect reciprocation, with reputation becoming increasingly important ([19]; cf. fn. 9 on the role of reputation). Coordination over larger time scales required the ability to refer to points in time with some precision [19]; communication became more important as information increased and cumulative culture “depended on high-fidelity, high volume social learning” ([19], p. 128). Herbert Gintis and colleagues, on the other hand, argue that once weapons were available that allowed the suppression of dominant males, our ancestors created a niche in which non-authoritarian leadership was valued and with it “individuals with the ability to communicate and persuade” ([50], p. 327). Derek Bickerton also has it that a ‘protolanguage’ developed as our ancestors began to construct and inhabit a new niche. Yet on his account it was the high-end scavenging niche in which our ancestors no longer cracked bones in order to extract marrow but took to exploiting the carcasses of large herbivores [51]. As there were usually other scavengers at the scene, it was only by recruiting as many group members as possible that our ancestors had a chance of fighting back against competitors and of exploiting the food source themselves ([51], p. 158). By beginning to inhabit a niche in which cooperation paid off, our ancestors began to inhabit a niche in which honestly sharing information paid off and which exerted a selection pressure favoring language-apt creatures. And while there is little consensus on the details of the story (and on the exact relation of socio-economic and linguistic developments), for present purposes nothing turns on the specific economy and foraging strategy our ancestors engaged in back then, as long as cooperation would have proven beneficial.


3 Coordination Problems and Signaling Systems

If we agree that the need for cooperation was a driving force in the evolution of language-apt creatures,8 and if we also agree that cooperation requires coordination, we can describe the problem our ancestors were facing as a coordination problem. More specifically, they had to solve two types of coordination problems. On the one hand, they had to coordinate their non-linguistic activities. Imagine, to use an example of David Lewis’, two ancestors of ours, call them Harry and Sally, who want to collect firewood. Neither of them cares whether Harry goes north and Sally goes south or whether it is the other way around, as long as they manage to collect as much firewood as possible. Table 1 shows a possible payoff matrix.

Table 1 Game of coordination

Harry/Sally   South   North
South         1, 1    5, 5
North         5, 5    1, 1

This is a symmetric two-player game. Harry, the row player, has two strategies to choose between. Sally, the column player, has the same two strategies from which to choose. They face a coordination problem in Lewis’ sense, as the situation meets the following two criteria: (i) their relevant interests coincide; and (ii) there is more than one equilibrium–otherwise it will not be a problem, as rationality alone will dictate what to do (cf. [58], pp. 16, 22, 24). The notion of equilibrium in play is that of a pure-strategy Nash equilibrium familiar from game theory.9 The two (or more) equilibria may be equally good, as in the case described. But there are cases with two equilibria where everyone is better off at the one than at the other, as for example in Rousseau’s famous stag hunt. Players may nonetheless converge on the ‘worse’ equilibrium. And while language is not necessary to solve coordination problems in general, having a means of communication could help players to avoid converging on the ‘worse’ equilibrium and to coordinate action more successfully (cf. [59]).

8 Again, that is neither to deny that other species have to coordinate their behavior as well. The claim is not that the need for coordination is a sufficient condition for the evolution of language. Nor is it to deny that other mechanisms (than coordination) might have played a role, too. In the animal kingdom, indices (the roar of a red deer as a reliable index of his size) and handicaps (the peacock’s tail reliably indicates prowess; [52]) might guarantee stable communication (cf. [53]). More importantly, the danger of losing reputation might also foster what is called honest communication (cf., e.g., [53–56]). And, as Mitchell Green points out, signaling may become reliable when animals have to pay a social cost, such as the loss of reputation or credibility, if they fake the signal (cf. [57]).
9 Here is a definition: “An assignment of strategies to players is a Nash equilibrium iff no agent can improve his payoff by deviating unilaterally from it. An equilibrium is strict iff each agent decreases his payoff by deviating unilaterally from it” ([60], p. 10). More formally, a “strategy vector s* = (s1*, s2*, s3*, …, sn*) is a Nash equilibrium if πi(si*, s−i*) ≥ πi(si, s−i*) for all si and all i” ([61], p. 64). Lewis himself defines it thus: “In an equilibrium combination, no one agent could have produced an outcome more to his liking by acting differently, unless some of the others’ actions also had been different” ([58], p. 8).
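To make the equilibrium test in fn. 9 concrete, here is a minimal sketch in Python (my illustration, not part of the original analysis; only the payoffs and strategy names come from Table 1):

```python
# A minimal sketch (my illustration, not from the text): enumerating the
# pure-strategy Nash equilibria of the Table 1 firewood game, per fn. 9.
payoffs = {  # (Harry's strategy, Sally's strategy) -> (Harry's, Sally's payoff)
    ("South", "South"): (1, 1), ("South", "North"): (5, 5),
    ("North", "South"): (5, 5), ("North", "North"): (1, 1),
}
strategies = ["South", "North"]

def is_nash(h, s):
    """No agent can improve his payoff by deviating unilaterally."""
    u_h, u_s = payoffs[(h, s)]
    return (all(payoffs[(h2, s)][0] <= u_h for h2 in strategies)
            and all(payoffs[(h, s2)][1] <= u_s for s2 in strategies))

print([(h, s) for h in strategies for s in strategies if is_nash(h, s)])
# -> [('South', 'North'), ('North', 'South')]: two equally good equilibria,
#    which is exactly what makes this a coordination problem.
```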


Moreover, even in cases in which the two (or more) equilibria are equally good, pre-game communication (if occurring between sufficiently trustworthy agents) could solve the problem of equilibrium selection. Consequently, if there had been a strong selection pressure towards coordination, those who managed to coordinate their non-linguistic actions more successfully would have enjoyed an evolutionary advantage.

On the other hand, communication or linguistic understanding “is a special case of coordination” ([42], p. 179). As a result, the evolution of communication (and meaning) itself can be modeled as a particular type of coordination problem–what Lewis calls a signaling problem. Bickerton’s recruitment problem is a case in point. In a signaling problem, the speaker observes a state of the world and sends a(n) (arbitrarily chosen) signal; this is the speaker’s contingency plan or strategy. The addressee observes the signal and acts upon it; this is the addressee’s strategy. Speaker and addressee have an interest in common, as they both prefer a particular state of the world to be acted upon in a particular manner. Suppose the speaker (the communicator, as Lewis calls him) has the following two strategies:

Fc1 If you observe a dead mammoth, send signal S1. If you observe a lion, send signal S2.
Fc2 If you observe a dead mammoth, send signal S2. If you observe a lion, send signal S1.

The addressee (audience) has the following two strategies:

Fa1 If you observe signal S1, leave the cave and meet the speaker outside. If you observe signal S2, hide in the cave.
Fa2 If you observe signal S2, leave the cave and meet the speaker outside. If you observe signal S1, hide in the cave.

Table 2 shows a possible payoff matrix.

Table 2 Signaling game

C/A    Fa1      Fa2
Fc1    5, 5     0, −10
Fc2    0, −10   5, 5

Again, there are two equally good equilibria: ⟨Fc1, Fa1⟩ and ⟨Fc2, Fa2⟩. At both equilibria, the players are playing a best response to each other.10 The equilibria are even coordination equilibria:

10 “A strategy si* is a best response to a strategy vector s−i* of the other players if πi(si*, s−i*) ≥ πi(si, s−i*), for all si” ([61], p. 64).


An equilibrium, we recall, is a combination in which no one would have been better off had he alone acted otherwise. Let me define a coordination equilibrium as a combination in which no one would have been better off had any agent alone acted otherwise, either himself or someone else ([58], p. 14).
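Lewis’ two conditions (no one gains from his own unilateral deviation, and no one would prefer anyone else to deviate either) can be checked mechanically. Here is a minimal sketch (my illustration; only the payoffs come from Table 2):

```python
# A minimal sketch (my illustration): checking Lewis' coordination-equilibrium
# condition for every strategy pair of the Table 2 signaling game.
payoff = {  # (communicator, audience) -> (communicator's, audience's payoff)
    ("Fc1", "Fa1"): (5, 5), ("Fc1", "Fa2"): (0, -10),
    ("Fc2", "Fa1"): (0, -10), ("Fc2", "Fa2"): (5, 5),
}
C, A = ["Fc1", "Fc2"], ["Fa1", "Fa2"]

def is_coordination_equilibrium(c, a):
    """No one would be better off had ANY agent alone acted otherwise."""
    u_c, u_a = payoff[(c, a)]
    unilateral = [payoff[(c2, a)] for c2 in C] + [payoff[(c, a2)] for a2 in A]
    return all(v_c <= u_c and v_a <= u_a for v_c, v_a in unilateral)

print([(c, a) for c in C for a in A if is_coordination_equilibrium(c, a)])
# -> [('Fc1', 'Fa1'), ('Fc2', 'Fa2')]: the two signaling systems.
```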

If players maintain a coordination equilibrium, they not only don’t want to act differently themselves; they also don’t want the other one to act differently. There is a general preference for conformity (on condition that the others conform).11 Moreover, signaling systems, on Lewis’ account, are coordination equilibria in a signaling game. So, eventually, Lewis defines a signaling convention as

any convention whereby members of a population P who are involved as communicators or audiences in a certain signaling problem S do their part of a certain signaling system ⟨Fc, Fa⟩ by acting according to their respective contingency plans. If such a convention exists, we also call ⟨Fc, Fa⟩ a conventional signaling system ([58], p. 135).

A signaling convention, just as any other convention, is a behavioral regularity ([58], p. 51; for his definition of convention, cf. [58], pp. 42, 58, 78). Just as any other convention, it does not determine “every detail of behavior” ([58], p. 51). But even if that is granted, the question remains of how to choose between equilibria, given that signaling problems always allow for more than a single equilibrium (as they are a type of coordination problem). How do players manage to converge on one signaling convention rather than another? Furthermore, as Lewis concedes, although there may be verbal signaling conventions in our language, our language is more than a set of signaling conventions. Several important components are missing, such as the ability to create new sentences, to chat or talk about one’s feelings and beliefs; there are not enough moods, no ambiguity nor indexicality, and no grammar ([58], pp. 160–177). Nonetheless, signals in a signaling system have meaning, although no meaning has been deliberately assigned. They acquire meaning by repeatedly and successfully contributing to solving coordination problems. But then, must not the first signals have been non-specific in various ways? Consequently, (at least) two questions need to be answered:

1. How is the problem of equilibrium selection to be solved; what could break the symmetry?
2. What meaning do signals–by figuring in signaling conventions–acquire?

I will address them in turn.

11 According to Lewis, players’ interests need not be perfectly aligned; it suffices if “coincidence of interest predominates” ([58], p. 24).


4 Equilibrium Selection and Replicator Dynamics

How do we (as players) choose between signaling equilibria? As Lewis admits, (i) explicit agreement would be the safest way to go ([58], pp. 24–42). We could simply agree on what the signals are supposed to mean. But at the dawn of language, that was not an option. (ii) Precedent would do as well; but, again, this is not how it was solved in the beginning. (iii) It might be solved by chance; you observe a state of the world and send a random signal; I react in a random manner. If we are lucky, the reaction is just the appropriate reaction to the state of the world you observe–given our common interest. Yet what are the odds of that, one might wonder. But then, if we try long enough, a signaling system will emerge–at least under certain conditions [62]. This is the lesson to be learned from replicator dynamics, a popular approach within evolutionary game theory. Evolutionary game theory applies orthodox game theory to biology. The basic idea is that strategies that do well grow, others decline, as Ben Polak succinctly puts it.12 Strategies thereby turn into genetic programs that determine a particular type of behavior. Thus, agents need not mentally represent these strategies. Payoffs get related to an individual’s fitness, conceived of as the number of offspring. An individual ‘gets paid’ by having more offspring who replicate its strategy/genetic program. More fully, there are two approaches to evolutionary game theory ([63], § 2). The first goes back to the work of John Maynard Smith and George Price and centers on the concept of an evolutionarily stable strategy; the aim is to provide an analysis of evolutionary stability. The second approach, replicator dynamics, is interested in population dynamics and models strategy frequencies in a population over time. The basic idea of evolutionary stability, very roughly, is that if a population of individuals of type σ (i.e., playing strategy σ) is invaded by a group of mutants of type τ (playing strategy τ), and those playing τ die out (so that the incumbent population cannot be invaded), then strategy σ is an evolutionarily stable strategy.13 Replicator dynamics are differential equations used to model how strategies change over time. The basic idea, similarly, is that if the payoff of playing strategy σ against population τ is greater than playing τ against itself, those using strategy σ will have–on average–more offspring than those playing τ. The proportion of those playing σ should thus increase. And the growth rate of those playing σ should be proportional to the difference in payoffs (cf. [64], pp. 232–235; cf. also [63], § 2.2; [62], p. 54). In other words,

[t]he replicator dynamics relates the growth rate of each type of individual to its expected payoff with respect to the average payoff of the population ([65], p. 163).
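To illustrate, here is a rough numerical sketch (my construction, not from the literature cited) of two-population replicator dynamics for the signaling game of Table 2, using a small-step Euler discretization of the growth-rate idea just described: the share of a strategy grows in proportion to its payoff advantage over the population average.

```python
import random

# A rough sketch (my construction): two-population replicator dynamics for
# the binary signaling game of Table 2, via a small-step Euler update.
# A[i][j]: communicator payoff, B[i][j]: audience payoff, for Fc(i+1) vs Fa(j+1).
A = [[5, 0], [0, 5]]
B = [[5, -10], [-10, 5]]

random.seed(1)
x = random.uniform(0.01, 0.99)  # share of communicators playing Fc1
y = random.uniform(0.01, 0.99)  # share of audiences playing Fa1
dt = 0.01

for _ in range(100_000):
    fx = [A[0][0] * y + A[0][1] * (1 - y),   # payoff of Fc1 vs audience mix
          A[1][0] * y + A[1][1] * (1 - y)]   # payoff of Fc2
    fy = [B[0][0] * x + B[1][0] * (1 - x),   # payoff of Fa1 vs communicator mix
          B[0][1] * x + B[1][1] * (1 - x)]   # payoff of Fa2
    # Each share grows in proportion to its payoff advantage over the average.
    x += dt * x * (fx[0] - (x * fx[0] + (1 - x) * fx[1]))
    y += dt * y * (fy[0] - (y * fy[0] + (1 - y) * fy[1]))

print(round(x, 2), round(y, 2))
# Typically ends at (1.0, 1.0) or (0.0, 0.0): the population settles on one
# of the two signaling systems, but which one is a matter of chance.
```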

12 Cf. the Yale Open Online Course Game Theory by Ben Polak.

13 Similarly, an evolutionarily stable state of a population of type σ can be defined thus: “A population state σ is evolutionarily stable if and only if for all τ ≠ σ: (i) π(σ, σ) ≥ π(τ, σ); (ii) if π(τ, σ) = π(σ, σ), then π(σ, τ) > π(τ, τ)” ([64], p. 208).
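The stability condition just quoted translates directly into code. A small helper (my sketch; the toy payoffs are illustrative, not from the text):

```python
# A small helper (my sketch) implementing the quoted stability test for a
# symmetric game with payoff function pi(incumbent, opponent).
def is_evolutionarily_stable(sigma, strategies, pi):
    for tau in strategies:
        if tau == sigma:
            continue
        if pi(tau, sigma) > pi(sigma, sigma):
            return False  # condition (i) fails: mutants beat incumbents
        if pi(tau, sigma) == pi(sigma, sigma) and not pi(sigma, tau) > pi(tau, tau):
            return False  # condition (ii) fails: mutants hold their own
    return True

# Toy common-interest game: both conventions come out evolutionarily stable,
# which restates the equilibrium-selection problem in evolutionary terms.
pi = {("A", "A"): 5, ("A", "B"): 0, ("B", "A"): 0, ("B", "B"): 5}
print(is_evolutionarily_stable("A", ["A", "B"], lambda a, b: pi[(a, b)]),
      is_evolutionarily_stable("B", ["A", "B"], lambda a, b: pi[(a, b)]))  # True True
```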


With the help of these models, it can be shown that under certain conditions signaling systems emerge in a population. More to the point, it has been shown that in binary Lewis signaling games (2 states, 2 signals, 2 acts) information transmission is perfect (cf. [65–67]); signaling system equilibria always arise and there is global convergence on one of the signaling systems (at least if all states are equiprobable, [66]). Yet when the number of states, signals, and acts increases (n > 2), things become complicated. As Simon Huttegger and colleagues [65, 66] have shown, sometimes partial pooling equilibria evolve, sometimes signaling system equilibria. In a partial pooling equilibrium, sender and receiver differentiate between certain states but ‘pool’ others.14 In a total pooling equilibrium, the sender always sends the same type of signal, independently of what the observed state of the world happens to be; and the receiver always acts in the same manner, independently of what signal has been sent. Consequently, sometimes there is perfect information transmission, sometimes not [66]. Similar results have been achieved in models of individual learning such as reinforcement learning (cf. [65] for an overview; cf. also [67]).15 Huttegger et al. conclude:

Both models of evolution and of individual learning often result in the emergence of somewhat successful communication. Such success is not always guaranteed however. In signaling games with more than two states, signals, and acts, perfect communication is not guaranteed to emerge. […] These conclusions hold both for evolution and learning models ([65], p. 173).
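As a concrete illustration of such learning models, here is a rough sketch (my construction, loosely in the spirit of the reinforcement-learning models cited, not their actual code) of urn-based reinforcement in a Lewis signaling game:

```python
import random

# A rough sketch (my construction): Roth-Erev style reinforcement in a Lewis
# signaling game with n equiprobable states, n signals, and n acts.
def simulate(n, rounds=100_000, seed=0):
    rng = random.Random(seed)
    sender = [[1.0] * n for _ in range(n)]    # sender[state][signal] weights
    receiver = [[1.0] * n for _ in range(n)]  # receiver[signal][act] weights

    def draw(weights):  # sample an index with probability proportional to weight
        r = rng.uniform(0, sum(weights))
        for i, w in enumerate(weights):
            r -= w
            if r <= 0:
                return i
        return len(weights) - 1

    for _ in range(rounds):
        state = rng.randrange(n)
        signal = draw(sender[state])
        act = draw(receiver[signal])
        if act == state:  # success reinforces the choices just made
            sender[state][signal] += 1.0
            receiver[signal][act] += 1.0

    # Fraction of states communicated correctly if both now play greedily.
    best_sig = [max(range(n), key=lambda s: sender[st][s]) for st in range(n)]
    best_act = [max(range(n), key=lambda a: receiver[sg][a]) for sg in range(n)]
    return sum(best_act[best_sig[st]] == st for st in range(n)) / n

print(simulate(2))  # binary games reliably end in a signaling system (1.0)
print(simulate(3))  # with three states, some runs end in partial pooling (< 1.0)
```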

The main problem–from the present perspective–with this type of model as a model of the evolution of communication is not that if n > 2, perfect communication can no longer be guaranteed to emerge, or is likely to emerge only under certain conditions (cf. [66, 69]). Rather, the problem is that it does not (yet) model human communication.16 This is not to deny that replicator dynamics is an exciting line of research that provides important insights and greatly enhances our understanding of signaling systems. It is also not to deny that the evolution of human communication is amenable to formal modeling. It is just to point out that there are important differences between human language and signaling systems (as modeled by replicator dynamics). And the problem is not only to explain in game-theoretic terms how novel signals can emerge or how agents can learn to follow a rule (although these are intriguing questions; cf. [70, 71]). The problem is to explain how a particular form of flexible, intentional communication (also called ostensive inferential communication, [72]) came about.17

14 Here is an example of Huttegger and colleagues’ [66]: The sender may send signal 1 in response to state 1 and state 2, and signal 2 or signal 3 in response to state 3; the receiver may react to signal 1 by performing act 1 or act 2, but performs act 3 upon signal 2 and 3.
15 According to Patrick Grim, pragmatic features (such as behavior in accordance with Grice’s maxims) also “fall out as a natural result of information maximization in informational networks” ([68], p. 134). He employs the framework of spatialized game theory in which “agents do not interact randomly with all other agents in the population. They interact (or interact with increased probability) with specific agents–their neighbors in the network structure” ([68], p. 135). Given spatialization, communication emerges “in response to environmental pressures on the basis of individual gains” ([68], p. 139). Note that the framework presupposes a cooperative, non-competitive context, as otherwise individuals would not be willing to freely share (valuable) information.
16 As Sterelny puts it, “language is not a signaling system”, nor is it a kind of “super-vervetese” ([73], p. 220).


Immediately following the passage just quoted, Huttegger et al. note that they “did find that signaling can emerge with very little cognitive sophistication” ([65], p. 173). Brian Skyrms emphasizes that it does not even take a mind to develop that type of signaling system ([62], p. 7). These types of signaling systems also emerge in bacteria ([74], p. 81). They model information transmission. But the information thus transmitted need not be represented by the organisms that engage with one another in a signaling system. Nor need those signals be ‘interpreted’ by the receiver. Or rather, ‘interpreting’ here basically means ‘reacting appropriately’, just as in the case of animal signals. Therefore, these accounts model animal signaling at best. They don’t model human language and communication.18

What is missing, then? In what does human language differ from (animal) signaling systems? What is missing, I will argue, is an account of how agents learnt to flexibly produce signs (and to interpret the signs thus produced) and to intentionally share ideas and information. In order to appreciate the difference, let us look at some comparative studies, which investigate differences between animal communication systems and human language (usage) (cf. [76] for a more comprehensive discussion of animal signals). Comparative studies, if treated with care, can be a valuable source of hypotheses concerning human language.19

5 Animal Signals Versus Human Symbols

Commonly, when comparing human language and animal communication systems, the productivity and compositionality of human language are emphasized the most. There is some evidence, though, that animal signals can also be combined and morphologically as well as phonetically modified. Campbell’s monkeys, e.g., modify the alarm calls given in response to leopards and eagles with an affix ‘-oo’ in order to generate new calls, given in response to general disturbances [77]. Southern pied babblers combine calls into a ‘mobbing’ call [78]. There is also evidence of different great-ape call-cultures [79] and vocal learning in chimpanzees [80]. Whether these complex signals exhibit genuine compositionality, i.e. whether the meaning of the complex signal is a function of the meaning of its parts and their mode of composition, is less clear; partly because it is not clear which semantic-syntactic properties the simple signals exhibit in the first place (cf. [82], p. 410).

17 According to Dan Sperber and Deirdre Wilson, “[e]very act of ostensive communication communicates a presumption of its own optimal relevance” ([72], p. 15). And in ostensive inferential communication, one makes “manifest to an audience one’s intention to make manifest a basic layer of information” ([72], p. 54).
18 Someone might think that models of strategic communication based on games of incomplete information, as developed by Crawford and Sobel [75], fare better in this respect. It is somewhat questionable whether they can model the emergence of signals (how they acquire meaning), though. More pragmatic models will be briefly discussed below.
19 My aim is not (as that would require another paper) to come up with a watertight definition of language but only to pinpoint some of the key features human language exhibits; there are further candidates, of course (the locus classicus is [81]).


Also, these forms of composition fall far short of the productivity, i.e. the property that allows for the production and comprehension of new linguistic structures, that human language exhibits. Human language symbols can–within certain limits and subject to certain qualifications–be compositionally combined and productively elaborated on. They have syntactic properties that they, arguably, acquired in tandem with semantic properties (see the following).

More important for present purposes, though, is another feature of language. Some animal signals are produced in a recipient-specific manner.20 Many monkey species produce different alarm calls in response to different types of predators, thereby eliciting different responses in their fellow monkeys. Vervet monkeys, e.g., produce alarm calls that differ acoustically in response to leopards, eagles and snakes [83].21 And call production seems to be subject to audience effects, such as taking the epistemic situation of the recipient into account, or modifying the signals if one hasn’t been fully understood ([54, 85, 86], pp. 72–73; cf. [44], p. 416). These calls can thus be flexibly produced. But “flexibility” is used to mean a variety of things. Occasionally, it is used to mean the ability to adapt to novel situations or to solve hitherto un-encountered problems [87]. Also, one ought to distinguish between behavioral flexibility and cognitive flexibility (fn. 2), which “may be defined as the ability to simultaneously consider multiple conflicting representations of a single object or event and flexibly shifting between these representations […]” ([23], p. 632). But even focusing on the flexible production of signs, ‘flexibility’ can mean different things:

a. being able to abstain from producing the signal in the presence of a trigger, i.e., not producing the signal even if there is a leopard nearby;
b. being able to voluntarily produce the signal in the absence of an external trigger, i.e., producing the signal even if no leopard is around;
c. being able to voluntarily produce the signal in different contexts, i.e., not bound to a particular context (some speak of signal flexibility here; cf. [88]);
d. being able to voluntarily produce the signal to different ends (also called ‘functional flexibility’, cf. [88]);
e. being able to voluntarily create idiosyncratic signals to familiar ends (e.g., in mother-infant dyads, cf. [87]);
f. being able to voluntarily produce the signal in a recipient-specific manner.

20 Some animals, in producing signs, seem to also display a certain sensitivity to communicative or social rules, as when duetting birds engage in some sort of turn-taking (cf. [84]).
21 Skyrms claims that “[n]ature has presented vervets with something very close to a classic Lewis signaling game” ([62], p. 23).


In all these cases, the animal has to have (cognitive) control over the production of the signal. However, these forms of flexibility are differentially complex. Most importantly, in some of these cases, the production may (or even must) be not just under voluntary control but also goal-directed, intentional. This takes us to a new (cognitive) level, as flexibility now requires that the animal

g. be able to intentionally produce the sign, i.e., to produce it with a particular goal in mind (cf. [89]).

Yet the signaler might intend either

i. that the addressee act in a particular manner, or
ii. to inform the addressee, or ‘shape’ his mind.

The goal might be behavioral or ‘mental’. The latter has us conceive of the other as an organism possessed of a mind. And it makes heavy cognitive demands on the addressee as well, especially when it involves voluntary production of a signal to different, maybe even novel, ends or in novel contexts. Flexible production with an intention to share information requires that the addressee be able to reason about the speaker’s intentions; at least it encourages the development of these abilities. The following can be seen as an attempt at modeling (or at highlighting the difficulties one encounters when attempting to model) the transition from voluntary to intentional(i) to intentional(ii) production.22

It is not controversial that there are cases of voluntary production in the animal kingdom (in the sense of a–f). What is controversial is the extent to which animals can intentionally produce signals. Animal signals evolve as a means of influencing behavior; many (non-human) primate gestures thus have imperatival force [87]. In other cases, it may be indeterminate whether a signal has imperatival or declarative force. Still, signals usually have one urgent function–such as helping to escape predators. They are (mostly) bound to a particular context of use and ‘meaningful’ in that context only. Their meaning amounts to the informational content they evolved to carry, given the function they evolved to serve. But while this is information in the organism, it is not necessarily information to the organism (a distinction due to [90]). Signals often derive from natural signs that had no prior signaling function [91]. Dogs, for example, are in the habit of baring their teeth before they attack another animal (cf. [92], p. 17). They just retract their lips in preparation for a bite. Yet he who sees what is coming–who anticipates the bite to come and reacts appropriately–has a better chance of survival. As soon as the baring of the teeth reliably results in flight behavior, the dog might begin to snarl at others in order to threaten them (cf. [82], p. 412). A similar mechanism is exhibited by what Tomasello calls “ontogenetically ritualized intention-movement gestures” ([45], p. 23). Young chimpanzees, e.g., touch the backs of their mothers as a request to be carried, yet these signals are “basically abbreviations of full-fledged social actions …” ([45], p. 23) that “rest on the natural tendency of recipients to anticipate the next step in an action sequence …” ([45], p. 62).23

22 It is controversial whether there could have been a smooth transition from animal signals to human ostensive communication (cf. [93] vs. [35, 94]).


What once was a sign of the behavior to come turns into a signal as the addressee learns to anticipate the next step in the action sequence and to react appropriately; the ‘sender’ learns to produce the signal in order to elicit that type of behavior (cf. [88] for a more detailed account of how signs turn into signals). Learning to interpret a signal is tantamount to learning to appropriately act upon it (mirror neurons might help to anticipate what action is coming, cf. [11, 95]).

Human language symbols, on the other hand, can be produced in a fully flexible manner. Human speakers can think about leopards whenever they like, not just when one is present (thus going offline). Also, they don’t have to ‘call’ even if there is a leopard nearby. And they can produce signs in all kinds of contexts and to all kinds of ends. The symbols of human language usually don’t have one urgent function; nor are they used for one particular (behavioral) purpose only. Their usage is less context- and purpose-bound, more flexible. Most importantly, they can be used in order to evoke the idea (or representation) of a leopard (or something else) in another mind.24 They are used to manipulate not only behavior, but minds (cf. [97] for a different take on the matter). There is, in most cases, no single type of behavior that would count as the most apt response to the symbol, as symbols are primarily a means of influencing minds (cf. [19], p. 123; cf. also [93]). This, arguably, results from the increasingly flexible use they are amenable to. Intentional, flexible production requires that the hearer recognize that the speaker acts with communicative intent. The communicator, in turn, has to let the addressee read her mind (she will do so, presumably, only in a cooperative, trusting context). The signs no longer carry information ‘naturally’, i.e. by reliably indicating a particular state of the world [39], but can be used by speakers to mean something; utterances thereof come to be endowed with non-natural (speaker) meaning [98]. As will be outlined in (slightly) more detail in Sect. 6, with the advent of symbols, a new form of communication (ostensive inferential communication) began to slowly unfold and gain traction; and mindreading has become a popular pastime ever since. Human language is a communication system that employs signs that can be productively combined and intentionally produced; their utterances carry non-natural meaning; their interpretation requires the recognition of the speaker’s communicative intent. This, I claim, is a key feature of human language [93].25,26

23 There is some evidence that these movements are less ritualized than Tomasello and colleagues suggest. Studies investigating how travel of mother-infant dyads in wild chimpanzees gets initiated found that there is more idiosyncrasy, flexibility, variability, and social negotiation in general in the use of these signals than has previously been acknowledged (cf. [96]). Still, these gestures are used to influence behavior, not minds.
24 I am grateful to Christian Nimtz for very helpful discussion on this point.
25 Skyrms would disagree; he insists that “all meaning is natural meaning” ([62], p. 1).
26 More recently, pragmatic reasoning and inference has also become a subject of neuroscientific studies [99]; and pragmatic assumptions have been put to experimental test [100].


6 Gricean Intentions

How might ostensive communication, which is grounded in speakers’ attempts at making their communicative intention manifest, have come about from a phylogenetic point of view? In order to answer the question, we need to explain (a) why a speaker should begin to intentionally send a signal; and (b) how the listener could come to interpret that as an attempt at communication in the first place. To that end, let us engage in a thought experiment. It may strike you as a piece of paleo-fiction, and as somewhat cartoonish at that. And it is! But note that it is only meant to highlight some of the ingredients that had to be in place and some of the steps our ancestors had to take–with, admittedly, many a step missing. Also, it might help to illustrate the difficulties in modeling our ancestors’ first communicative attempts.

Suppose, then, that one of our ancestors, call her ‘Eve’, had found a dead mammoth. She just gets home in search of recruits. In particular, she wants to bring another ancestor of ours, call him ‘Adam’, to join her so they can exploit the food source together. She might try various things: (i) she might point in the direction of the dead mammoth. Given that apes don’t point much (as Tomasello is eager to stress), it is unclear whether pointing (deictic gestures) was already at our ancestors’ disposal. (ii) She might produce a natural sign that is somehow ‘naturally’ related to mammoths, such as a mammoth’s tooth or a scrap of its pelt. That presupposes a certain understanding of proxy relations, of one thing being metonymically related to another, such as a part to the whole or an attribute to the one possessing it; something our closest relatives are not very good at either. (iii) Alternatively, she might engage in (panto)mime and thereby produce an iconic sign that somehow resembles what it denotes.27 She might imitate the noises mammoths make or their movements (cf. [51], pp. 159–160).28 Producing and interpreting mimetic signs presupposes mimetic skills ([101], p. 49). Given that apes don’t imitate much, it is, again, unclear whether these skills were at our ancestors’ disposal. But then, since the last common ancestor of non-human and human primates lived about 6–7 million years ago, our ancestors had time to develop at least some of these skills. In what follows, I will presuppose (and so owe you an argument for the claim) that at the dawn of language, our ancestors had already acquired mimetic skills–perhaps in the context of play–and were able to voluntarily produce iconic signs.29 Yet, in the beginning, they need not have produced these signs with communicative intent. Maybe, at first, Eve just imitates mammoth noises (or movements) on encountering a dead mammoth; maybe the group has formed the habit of imitating the sounds (or movements) of the dead animals they find (or take whatever is your favorite account of how an inchoate association between a gesture or another proto-sign and an object was initially formed).

27 Iconicity is a tricky issue, as the notion of resemblance or similarity is notoriously hard to spell out (cf. [102], or [103] for an overview). For present purposes, suffice it to say that an iconic sign somehow–by mere association or in combination with other cognitive skills–tends to evoke an image of the thing denoted in the listeners’ mind.
28 The earliest uses of the mammoth-sign may have been multi-modal, employing whatever means were available; gestures, sounds, pantomime, etc. (cf. [104]).
29 Or maybe mimetic skills were acquired in the context of social learning, as more elaborate tool-using techniques were transmitted from one generation to the next. To learn complex action sequences requires high-fidelity transmission, which in turn is aided by imitation learning ([49]; cf. also [105]).


The first two steps would thus be

1. the imitation of mammoth noises or sounds, i.e. the voluntary production of a mimetic sign; and
2. the forming of an association between mammoths and these signs.

She then, we may suppose, inadvertently produces the sign as she is dragging Adam along to the place where she found the mammoth. After a while, she no longer drags Adam along but begins to voluntarily produce the ‘sign’ whenever she has found a mammoth–yet only when Adam (or someone else) is around. She will not act with communicative intent yet; the production may simply be motivated by her desire for Adam to come with her to the carcass of the mammoth.30 Yet she also has information that she is willing to share; and if she is reliable, the sign may come to carry this information (see Sect. 8). The next step would thus be

3. the voluntary production of the sign in a recipient-specific manner.

Given that the sign is iconic, it will, arguably, evoke in Adam’s mind the idea (or picture, cf. fn. 3) of a mammoth. Also, Adam, in order to anticipate the next step in her action sequence and in light of their previous (successful) encounters, might begin to follow her. Slowly, Eve might begin to intentionally produce the sign, i.e., produce it with the intention (or desire) to influence Adam’s behavior, to solicit a response [106, 107] and get him to join her (whenever she has found the carcass of a large herbivore). And Adam might do the same whenever he needs her help in securing something to eat. In that way, they begin to produce the sign in diverse contexts (but still to the same end), and to honestly provide information (this need not be their intention, at first). A next step would thus be

4. the intentional production of the sign in diverse contexts, intending to thereby bring others to act in a particular manner.

Fortunately, all that is required for Eve’s however inchoate intention to be fulfilled is that it be recognized (cf. [74, 109]). For as soon as Adam comes to recognize her intention (note that both the intention to inform him about the mammoth and the intention that they exploit the food source together would do, as they would both yield the same result, once recognized), he will act as she wants him to, given that they are both hungry and given that they (both realize that they) can exploit the food source only together.

30 It may, but need not, be an expression of that desire. Expressive behavior, according to Dorit Bar-On, is a sort of proto-Gricean behavior that exemplifies “a significant intermediate stage between mere code-like signaling and full (post-)Gricean linguistic communication […]” ([108], p. 348). And while I agree that expressive behavior may be interestingly different from mere ‘code-like signaling’ and have great explanatory value, I don’t quite see how to get from expressive communication to Gricean communication (cf. [107]).


But then, as Eve is producing the sign in more diverse contexts, and as she begins to produce the sign even more flexibly and to diverse ends, Adam, in order to interpret her behavior, has to conceive of her action as an attempt at communicating in the first place, as an attempt at ‘signaling signalhood’ ([110]; cf. also [111], p. 379). Unless he thinks that she is not moving and howling just for the fun of it, he might not interpret the sounds, gestures or movements she makes as being of relevance to him. If he does, Adam thereby turns Eve’s sounds or gestures into a socially meaningful act of communication–an idea that is Vygotskian in spirit (cf. [112], p. xxvii). Accordingly, a fifth step might be

5. the recognition that the speaker acts with communicative intent, that she is (not covertly but overtly) trying to influence his behavior (or mind) by producing the sign.

This is a precursor to the Gricean mechanism [98, 42], according to which the speaker intends to produce a particular response in the listener (belief or action), and the listener’s recognition of the speaker’s communicative intention is supposed to be effective in producing that response.31 More fully, according to Grice’s analysis of (speaker or non-natural) meaning, “U meant something by uttering x” is true iff, for some audience A, U uttered x intending:

(1) A to produce a particular response r;
(2) A to think (recognize) that U intends (1);
(3) A to fulfill (1) on the basis of his fulfillment of (2)” ([98], p. 92).

The speaker intends to produce a particular response in the audience simply due to the recognition of her intentions. Yet neither Eve nor Adam will intend the full Gricean mechanism to be operative in the beginning; they simply exploit the mechanism (without reflecting on it); they rely on it, as Bennett puts it ([42], p. 125). Arguably, the Gricean intentions must have unfolded one after another [49, 93, 94]; and the full Gricean model may be unnecessarily sophisticated anyway (for most purposes). Maybe in communication a speaker is simply intending to make her intention manifest [113]. The third intention may be unnecessary for intentional communication at all; it may suffice if the addressee comes to fulfill (1) on the basis of (2), whether the speaker intends this to be so or not [114]. In any case, the trimmed-down Gricean mechanism comes into play as soon as agents begin to intentionally produce signs and interpret the signs thus produced. It requires only steps 1–5 but requires neither the second nor the third intention of the Gricean analysis on the speaker’s part. It only requires that the speaker produce the sign, intending to thereby elicit a response, and that the listener recognize that this is so.

31 Communicative intentions are out in the open. This is not to deny that speakers happen to have ignoble intentions and use linguistic means in the pursuit of ignoble goals. One thus ought to distinguish between communicative intentions (concerning what one is trying to communicate) and other intentions (concerning what else, beyond being understood, one is trying to achieve by communicating). Yet communication always aims at being understood. And only when signals have acquired fairly stable meaning can deception by linguistic means (lying) get off the ground.
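The contrast between Grice’s full analysis and the trimmed-down mechanism can be made explicit. A toy formalization (entirely my own gloss, not Grice’s or the author’s):

```python
from dataclasses import dataclass

# A toy formalization (entirely my own gloss) of the contrast drawn above.
@dataclass
class CommunicativeAct:
    intends_response: bool        # (1) U intends A to produce response r
    intends_recognition: bool     # (2) U intends A to recognize (1)
    intends_response_via_recognition: bool  # (3) U intends (1) fulfilled via (2)
    response_intention_recognized: bool     # A in fact recognizes (1)

def full_gricean(act: CommunicativeAct) -> bool:
    """Speaker meaning on Grice's full analysis: all three intentions present."""
    return (act.intends_response and act.intends_recognition
            and act.intends_response_via_recognition)

def trimmed_down(act: CommunicativeAct) -> bool:
    """Steps 1-5 only: an intention to elicit a response, recognized as such."""
    return act.intends_response and act.response_intention_recognized

# Eve's early mammoth-sign: she intends a response and Adam recognizes this,
# but she entertains no higher-order intentions about that recognition.
eve = CommunicativeAct(True, False, False, True)
print(trimmed_down(eve), full_gricean(eve))  # True False
```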


Slowly, as Eve begins to intentionally produce signs more often, and in even more diverse contexts and to different ends, Adam begins to interpret her behavior as primarily an attempt at sharing information or evoking an idea in his mind, and vice versa (as they both begin to realize that by sharing information one can make people do things–as long as the action serves a common purpose of theirs). There is no longer a single function that the sign fulfills or a particular action that the sign is supposed to elicit. Rather, humans desire to share information and ideas. Non-human primates, on the other hand, lack our Mitteilungsbedürfnis (our urge to communicate) [1]. And “what great apes lack is the interest in engaging in turn-taking dialogues” ([44], p. 421). Eventually, Adam and Eve will come to attribute mental states (intentions) to each other and to read each other’s mind.32

7 Equilibrium Selection and Optimal Communication

But then (and for the attribution of specific communicative intentions to become so much as possible), Adam not only has to realize that Eve is trying to communicate but also what she is trying to communicate (and that is where all the trouble began, some say). This brings us back to the problem of equilibrium selection. Replicator dynamics solves it by showing that signaling systems emerge under certain circumstances by chance; no symmetry-breaking is necessary; no reasoning processes of any sort are required on the part of the agents. Yet the results are signaling systems, not human (ostensive inferential) communication, because, provocatively put, communication is conceived of as information transmission (that can do without minds), not as the pragmatic endeavor of reading minds.

Let us, therefore, go back to Adam and Eve. In order to recruit Adam, Eve could have used any sign she liked, as long as Adam would have had a chance of coming to understand what she was trying to get across. But suppose she had used an iconic sign (cf. [42]). Mightn’t she thereby have increased the chances of Adam coming to understand that there is a mammoth nearby? Mightn’t iconicity help do the symmetry-breaking, help solve the problem of equilibrium selection? Iconicity might have eased interpretation, as an iconic sign can be interpreted–at least in part–‘naturally’ ([115], p. 127); the relation between the sign and what it denotes is not fully arbitrary. At the same time, iconicity goes some way towards explaining how non-natural meaning might have emerged, as an iconic sign also has non-natural meaning; the iconic relation underdetermines (speaker) meaning; the sign still stands in need of interpretation. Luckily, Adam might be able to narrow down the range of interpretations if the sign conveys something relevant. It has to be of relevance to both of them if it is to be interpretable; relevant to the fulfillment of a common purpose of theirs; otherwise Adam would be at a loss–and without any motive–as to how to interpret it (cf. [72, 116]). If it were only of relevance to Adam, why would Eve bother to tell him; if it were only of relevance to Eve, why would Adam bother to listen (cf. [37])?

32 Language might help to augment meta-cognitive capacities by engaging us in cognition-enhancing loops (cf. [33] for a discussion).


He needs to see her “gesture as communicative and as relevant to [their] current activity” ([45], p. 203), as communicative relevance will be engendered by joint activities and purposes [45, 46, 117]. This is perfectly in line with Lewis’ suggestions as to how the problem of equilibrium selection might be solved. There is not just agreement, precedence, or chance; players can also go for an equilibrium that is somehow salient ([58], p. 35). ‘Salience’, it has been objected, is not a particularly clear notion [60]. The suggestion of this paper is that it be spelled out as iconicity + relevance. This presupposes an inchoate form of the (trimmed-down) Gricean mechanism: a first attempt at forming a communicative intention and a first attempt at interpreting the other one as trying to communicate something (by means of a sign).33 Fortunately, communicative intentions, unlike many other intentions, are out in the open. In sum, Eve might choose an iconic sign so as to ease interpretation and to increase the chance of communicative success. Adam will choose an interpretation that is relevant; he will perform an inference to the best interpretation, that is.

This fits in nicely with a basic assumption of formal pragmatics, according to which speaker and listener try to choose optimal form/interpretation (or sign/meaning) pairs. It is the core idea of optimality-theoretic pragmatics (for an overview of more recent applications, cf. [118, 119]), according to which listener and speaker optimize interpretation and form respectively; i.e., they choose an optimal interpretation i for form f and an optimal form f to express i (cf. [118], p. 2). In light of the above considerations, a form can be said to be optimal if it increases ease of interpretation, and an interpretation can be said to be optimal if it maximizes relevance (or perfectly balances relevance and processing costs; Sperber and Wilson 1995). According to Michael Franke and Gerhard Jäger, probabilistic pragmatics in general “follows Grice in assigning an important role to goal-oriented, optimal behavior” ([120], p. 5). As they point out, linguistic behavior is behavior that is optimal in light of the beliefs and preferences of speaker and listener. Probabilistic pragmatics accommodates speakers’ and listeners’ subjective uncertainty and models uncertainty as probability distributions (cf. [120], p. 9). But note that the first reasoners would not yet have been fully mature Gricean reasoners and thus would not have been able to entertain higher-order representations and engage in complex forms of mindreading and pragmatic inferences (certain models within epistemic game theory may be promising here; cf. [121] for an overview34; [118]). Although the first act of successful communication was more than the production and interpretation of a signal, it would not have been performed (and interpreted) by a mature reasoner in possession of a full-fledged theory of mind yet.

33 Consequently, some form of Gricean reasoning has to be built into the model–contrary to what Lewis claims; he has it that “meaning-nn is a consequence of conventional signaling” ([58], p. 154). He already presupposes a certain level of cognitive sophistication, as agents who engage in signaling conventions have cognitively demanding (higher-order) mutual expectations, on his account (an assumption Skyrms is eager to drop; cf. [67]).
34 As Franke explains, the “main idea is that pragmatic reasoning starts by considering some salient, perhaps unstrategic behavior of either speaker or listener. Call this level-0 behavior. Level-(k + 1) behavior is then defined, in simplified terms, as a rational strategy against level-k behavior” ([121], p. 278). This may help to model our ancestors’ first communicative attempts.


Another challenge for probabilistic models is that it is not clear what role conventional, linguistic meaning is supposed to play (cf., e.g., [122, 123]).35 This brings us back to Lewis, as human communication relies on intention as well as convention.

8 Specificity of Content and the Adjustment of Strategies

Let us, therefore, come back to the question of what (stable) meaning signals acquire by figuring repeatedly in solutions to coordination problems (the second question posed above). The earliest signs must have been non-specific in at least two different ways, just as the communicator’s communicative intention would not have been very specific at first either.36 As long as one works on the assumption that there are n perfectly distinct and distinguishable states of the world and n perfectly distinct and distinguishable ways of acting upon the world (one that is ‘right’ for each state), the problem of non-specificity does not surface. But once that assumption is dropped, we have to acknowledge that various forms of non-specificity are not just pervasive in everyday online language processing but percolated communication right from the start. Whatever sign Eve will use in order to recruit Adam, it will at first be ambiguous between a request (“Come join me!”) and a description (“There is a dead mammoth!”). It will be ambiguous between a signal-that and a signal-to–as Lewis puts it ([58], p. 144); imperative and indicative at the same time (as is acknowledged by game-theorists; cf. [124]); what Ruth Millikan calls a pushmi-pullyu representation ([125], p. 119).

35 As Daniel Lassiter and Noah Goodman point out, in applying Bayesian models of linguistic communication “we assume that speakers and listeners maintain probabilistic models of each other’s utterance planning and interpretation process, and that these models drive pragmatic language use” ([122], p. 3806). More fully, it is assumed that “a listener L updates her information state, given that some utterance has been made, by reasoning about how the speaker would have chosen utterances or other actions in various possible worlds, weighing the result by the probability that those worlds are indeed actual:

(7) PL(w | u) ∝ PS(u | w) × PL(w).

Conversely, a speaker chooses utterances by reasoning about how the listener will interpret the utterance, together with some private utterance preference PS(u) (representing, for example, frequency effects or a preference for brevity and ease of retrieval):

(8) PS(u | w) ∝ PL(w | u) × PS(u).

These equations are both instantiations of Bayes’ rule. However, since they are mutually recursive, reasoning could go on forever, unless we impose some bound. In addition, it is not obvious where in (7) and (8) literal meaning, as studied in compositional semantics, intrudes (cf. Franke 2009: Chap. 1)” [122].
36 It is highly implausible that our pre-linguistic ancestors already had mental attitudes (such as intentions) with perfectly precise mental content. It is equally implausible that they became language-apt yet were not able to think.
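Equations (7) and (8), together with the level-k idea of fn. 34, suggest a bounded mutual recursion that bottoms out in a literal listener. The following toy sketch (my construction with an invented mini-lexicon; not Lassiter and Goodman’s implementation) shows the general shape:

```python
# A toy sketch (my construction) of the bounded recursion behind (7) and (8):
# a literal listener anchors the bottom level, per the level-k idea in fn. 34.
worlds = ["mammoth", "lion"]
utterances = ["S1", "S2", "beast"]
meaning = {"S1": {"mammoth"}, "S2": {"lion"}, "beast": {"mammoth", "lion"}}
prior = {w: 0.5 for w in worlds}          # P_L(w)
utt_pref = {u: 1.0 for u in utterances}   # P_S(u): flat utterance preference

def normalize(d):
    z = sum(d.values())
    return {k: (v / z if z else 0.0) for k, v in d.items()}

def listener(u, depth):  # equation (7), with a literal listener at depth 0
    if depth == 0:
        return normalize({w: prior[w] * (w in meaning[u]) for w in worlds})
    return normalize({w: speaker(w, depth)[u] * prior[w] for w in worlds})

def speaker(w, depth):   # equation (8), reasoning about the listener below
    return normalize({u: listener(u, depth - 1)[w] * utt_pref[u]
                      for u in utterances})

print(listener("beast", 2))
# With symmetric priors the vague sign stays ambiguous; skewed priors or
# utterance costs would let the recursion sharpen the interpretation.
```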


Yet depending on the reaction it will come to reliably elicit, it will either–if others treat it as a request–acquire imperatival force and become a signal-to, or–if it is treated as providing a piece of information about the world–acquire indicative force and turn into a signal-that. This might have taken millennia. Also, the earliest signs would lack syntactic specificity [82]. They still need to acquire syntactic properties: turn into a verb, a noun, an adjective, etc. This will require that speakers and listeners learn to distinguish not only content from illocutionary role (request, warning, etc.) but also different roles from one another (subject, object); actor from action; referring (picking something out) from predicating (saying something about it). They also have to learn to hierarchically structure thoughts, to embed clauses within other clauses. Not surprisingly, the evolution of syntax is as complex an issue as it is controversial (and requires a much more thorough treatment than I can afford here).

But these earliest signs are still used in reaction to a particular type of situation only. For them to turn into full-fledged symbols, our ancestors must have begun to produce them, again, more flexibly: in different contexts and to different ends. But also, and crucially, in reaction to more abstract features of the world that they observed various situations to have in common. Only if the sign comes to be used when and only when a mammoth in whatever state is present, for example, or when and only when a dead animal is present, and listeners will begin to act as though that is the information conveyed–e.g., use that information as a premise in their practical or theoretical deliberation–will the sign be endowed with a more abstract, context-free meaning. The sign will no longer be used to elicit a particular type of behavior in response to a particular state of the world but will be used to express and evoke ideas. This, again, must have taken millennia. But once signs had been endowed with more abstract meaning, and once the number of signs had increased, our ancestors would have become able to form ever more elaborate communicative intentions and to engage in ever more elaborate forms of pragmatic inferences and mindreading. In turn, they could have used the signs even more flexibly, to novel ends. Mental and semantic content would have shaped one another.

On Lewis’ account, signs acquire meaning by figuring repeatedly in solutions to coordination problems. But what meaning signs acquire depends on the strategies speakers and listeners respectively pursue. Moreover, speakers and listeners will constantly adjust their strategies so as to approximate those of the other and maintain equilibrium (cf. [58], p. 144); there is nothing to be gained from semantic free-riding [37].37 This elegantly accommodates language change. Yet in each particular communicative exchange, a certain amount of uncertainty and leeway will remain, which makes pragmatic, Gricean reasoning indispensable. There always remains a gap between linguistically encoded meaning and speaker meaning that needs to be pragmatically bridged [72, 93, 126–128].

37 While a free-rider in the moral sense is someone who benefits from the fact that certain norms are in place in their community yet, occasionally at least, secretly contravenes these norms if it is to their own benefit, a semantic free-rider would benefit from there being semantic norms (or conventions) in place but also secretly contravene them. Given that communication aims at being understood and that communicative intentions are out in the open, secretly contravening semantic norms (using words in another than their conventional meaning and trying not to let this on) does not seem like a sensible thing to do; at least it is hard to see what could thus be gained.
9 The Storyline In a nutshell, then, the idea is that at some time in history our ancestors ran into increasingly pressing coordination problems, some in the form of signaling problems (e.g., recruitment), as they engaged in more and more cooperative activities. Our ancestors began first to voluntarily, then to intentionally produce signs, and to interpret each other as trying to communicate something. Those early signs were interpretable only against the backdrop of joint activities and goals that generated communicative relevance. Iconicity might have eased interpretation and solved the problem of equilibrium selection. In repeated exchanges, the signs’ iconicity got lost. Ever more flexible, intentional production (in different contexts and to different ends) encouraged the development of more sophisticated theory-of-mind capacities. Also, as signs were increasingly used in response to more abstract features of the world, their meaning became more abstract and the signs acquired syntactic properties, which allowed for even more productive use. There are various accounts within game theory in particular and formal pragmatics in general that successfully model certain aspects of linguistic behavior. For all that, a coevolutionary and incremental model of how communicative behavior emerged is still pending. From a modeling perspective, the problem seems to be how to best model the emergence of intentional communication. A trimmed-down Gricean mechanism might help to explain how speakers began to mean something by certain sounds or gestures as others began to interpret their behavior as attempts at communication. A broadly Lewisian, game-theoretic account, might then be added to explain how signs acquire more specific, fairly stable meaning and syntactic properties. At the same time, language change has to be accommodated. Finally, a fully Gricean (probabilistic) account, might explain how our ancestors began to entertain (and attribute to each other) more complex representations and how they learnt to resolve the uncertainty that remains in any communicative exchange as conventional meaning commonly un(der)-determines speaker meaning. It might be in this way that we can capture the fact that literal meaning is conventional and speaker meaning intentional. Acknowledgements I am tremendously grateful to Roussanka Loukanova, who organized a workshop on Logic and Algorithms in Computational Linguistics at the University of Stockholm in August 2017, and gave me the opportunity to present an earlier version of the paper; many thanks also to the other participants for valuable comments and critique. I am also very grateful to audiences at the Universities of Münster, Osnabrück and Zurich, especially to Susanne Boshammer, Charles Lowe, Christian Nimtz, Sebastian Schmoranzer and Niko Strobach. Finally, I would like to thank three anonymous referees for very helpful feedback and Rudi Owen Müllan and Charles Lowe for proofreading the manuscript.


References

1. Fitch, T.W.: The Evolution of Language. Cambridge University Press, Cambridge (2010)
2. Corsi, P.: Transformist conceptions in European natural history. J. Hist. Biol. 38(1), 67–83 (2005)
3. Sloan, P.: The concept of evolution to 1872. In: Zalta, E.N. (ed.) The Stanford Encyclopedia of Philosophy (Spring 2017 edn.). https://plato.stanford.edu/archives/spr2017/entries/evolution-to-1872/ (2017)
4. Christiansen, M., Kirby, S.: Language evolution: consensus and controversies. Trends Cogn. Sci. 7(7), 300–307 (2003)
5. Rousseau, J.-J.: Basic Political Writings (translated and edited by D.A. Cress; introduction and annotation by D. Wootton), 2nd edn. Hackett Publishing Company, Indianapolis (2011) [1755]
6. Süßmilch, J.P.: Versuch eines Beweises, daß die erste Sprache ihren Ursprung nicht von Menschen, sondern allein vom Schöpfer erhalten habe. Berlin (1766)
7. Herder, J.G.: Abhandlung über den Ursprung der Sprache (edited by W. Proß; commentary and materials by W. Proß, vol. 12). Carl Hanser Verlag, München (1978) [1771]
8. de Condillac, É.B.: Essay on the Origin of Human Knowledge (Origin) (Cambridge Texts in the History of Philosophy; edited by Hans Aarsleff). Cambridge University Press, Cambridge (2001) [1746]
9. Bickerton, D.: More than Nature Needs: Language, Mind, and Evolution. Harvard University Press, Cambridge (2014)
10. Bouchard, D.: The Nature and Origin of Language. Oxford University Press, Oxford (2013)
11. Hurley, S.: The shared circuits model (SCM): how control, mirroring, and simulation can enable imitation, deliberation, and mindreading. Behav. Brain Sci. 31, 1–58 (2008)
12. Vygotsky, L.: Thought and Language (translation newly revised and edited by A. Kozulin). MIT Press, Cambridge (1986)
13. Vygotsky, L., Luria, A.: Tool and symbol in child development. In: Van der Veer, R., Valsiner, J. (eds.) The Vygotsky Reader, pp. 99–174. Blackwell Publishers, Oxford (1994)
14. Darwin, C.: The Descent of Man. Wordsworth Editions, Ware (2013) [1871]
15. Clark, A.: Magic words: how language augments human computation. In: Carruthers, P., Boucher, J. (eds.) Language and Thought, pp. 162–183. Cambridge University Press, Cambridge (1998)
16. Clark, A.: Language, embodiment, and the cognitive niche. Trends Cogn. Sci. 10(8), 370–374 (2006)
17. Carruthers, P.: Language in cognition. In: Margolis, E., Samuels, R., Stich, S.P. (eds.) The Oxford Handbook of Philosophy of Cognitive Science, pp. 382–401. Oxford University Press, Oxford (2012)
18. Cohen, J.D.: Cognitive control: core constructs and current considerations. In: Egner, T. (ed.) The Wiley Handbook of Cognitive Control, pp. 3–28. Wiley Blackwell, Hoboken (2017)
19. Sterelny, K.: Language: from how-possibly to how-probably? In: Joyce, R. (ed.) The Routledge Handbook of Evolution and Philosophy, pp. 120–135. Routledge, New York (2018)
20. Levinson, S.C., Gray, R.D.: Tools from evolutionary biology shed new light on the diversification of languages. Trends Cogn. Sci. 16(3), 167–173 (2012)
21. Apperly, I.A., Carroll, D.J.: How do symbols affect 3- to 4-year-olds' executive function? Evidence from a reverse contingency task. Dev. Sci. 12(6), 1070–1082 (2009)
22. Carlson, S.M., Davis, A.C., Leach, J.G.: Less is more: executive function and symbolic representation in preschool children. Psychol. Sci. 16(8), 609–616 (2005)
23. Cragg, L., Nation, K.: Language and the development of cognitive control. Top. Cogn. Sci. 2, 631–642 (2010)
24. Gooch, D., Thompson, P., Nash, H.M., Snowling, M.J., Hulme, C.: The development of executive function and language skills in the early school years. J. Child Psychol. Psychiatry 57(2), 180–187 (2016)
25. Khanna, M.M., Boland, J.E.: Children's use of language context in lexical ambiguity resolution. Q. J. Exp. Psychol. 63(1), 160–193 (2010)
26. Winsler, A., Fernyhough, C., Montero, I. (eds.): Private Speech, Executive Functioning, and the Development of Verbal Self-Regulation. Cambridge University Press, Cambridge (2009)
27. Frost, R., Armstrong, B.C., Siegelman, N., Christiansen, M.H.: Domain generality versus modality specificity: the paradox of statistical learning. Trends Cogn. Sci. 19(3), 117–125 (2015)
28. Christie, S., Gentner, D.: Language helps children succeed on a classic analogy task. Cogn. Sci. 38, 383–397 (2014)
29. Gentner, D., Christie, S.: Relational language supports relational cognition in humans and apes. Behav. Brain Sci. 31, 136–137 (2008)
30. Gentner, D.: Why we're so smart. In: Gentner, D., Goldin-Meadow, S. (eds.) Language in Mind, pp. 195–235. MIT Press, Cambridge (2003)
31. Gentner, D.: Bootstrapping the mind: analogical processes and symbol systems. Cogn. Sci. 34, 752–775 (2010)
32. Lupyan, G., Bergen, B.: How language programs the mind. Top. Cogn. Sci. 8, 408–424 (2016)
33. Kompa, N.: Language and embodiment—or the cognitive benefits of abstract representations. Mind Lang. (To appear)
34. Berwick, R.C., Chomsky, N.: Why Only Us? The MIT Press, Cambridge (2016)
35. Sterelny, K.: Deacon's challenge: from calls to words. Topoi 35, 271–282 (2016)
36. Jackendoff, R.: Foundations of Language: Brain, Meaning, Grammar, Evolution. Oxford University Press, Oxford (2002)
37. Kompa, N.: Language evolution and linguistic norms. In: Roughley, N., Bayertz, K. (eds.) The Normative Animal? On the Anthropological Significance of Social, Moral, and Linguistic Norms, pp. 245–264. Oxford University Press, Oxford (2019)
38. Millikan, R.G.: Language, Thought, and Other Biological Categories. MIT Press, Cambridge (1984)
39. Dretske, F.: Explaining Behavior: Reasons in a World of Causes. The MIT Press, Cambridge (1988)
40. Neander, K.: Teleological theories of mental content. In: Zalta, E.N. (ed.) The Stanford Encyclopedia of Philosophy (Spring 2012 edn.). http://plato.stanford.edu/archives/spr2012/entries/content-teleological/ (2012)
41. Schulte, P.: Perceptual representations: a teleosemantic answer to the breadth-of-application problem. Biol. Philos. 30, 119–136 (2015)
42. Bennett, J.: Linguistic Behaviour. Cambridge University Press, Cambridge (1976)
43. Hrdy, S.B.: Mothers and Others: The Evolutionary Origins of Mutual Understanding. Harvard University Press, Cambridge, MA (2009)
44. van Schaik, C.P.: The Primate Origins of Human Nature. Wiley Blackwell, Hoboken (2016)
45. Tomasello, M.: Origins of Human Communication. MIT Press, Cambridge (2008)
46. Tomasello, M.: Why We Cooperate. MIT Press, Cambridge (2009)
47. Zlatev, J.: The co-evolution of human intersubjectivity, morality, and language. In: Dor, D., Knight, C., Lewis, J. (eds.) The Social Origins of Language, pp. 249–266. Oxford University Press, Oxford (2014)
48. Levinson, S.C.: Language evolution. In: Enfield, N.J., Kockelman, P., Sidnell, J. (eds.) The Cambridge Handbook of Linguistic Anthropology, pp. 309–324. Cambridge University Press, Cambridge (2014)
49. Sterelny, K.: Cumulative cultural evolution and the origins of language. Biol. Theory 11, 173–186 (2016)
50. Gintis, H., van Schaik, C.P., Boehm, C.: Zoon politikon: the evolutionary origins of human political systems. Curr. Anthropol. 56(3), 327–353 (2015)
51. Bickerton, D.: Adam's Tongue: How Humans Made Language, How Language Made Humans. Hill and Wang, New York (2009)
52. Zahavi, A.: Mate selection—a selection for a handicap. J. Theor. Biol. 53, 205–214 (1975)
53. Scott-Phillips, T.C.: Evolutionarily stable communication and pragmatics. In: Benz, A., Ebert, C., Jäger, G., van Rooij, R. (eds.) Language, Games, and Evolution: Trends in Current Research on Language and Games. Lecture Notes in Artificial Intelligence, FoLLI, pp. 117–133. Springer, Berlin (2011)
54. Clay, Z., Zuberbühler, K.: Vocal communication and social awareness in chimpanzees and bonobos. In: Dor, D., Knight, C., Lewis, J. (eds.) The Social Origins of Language, pp. 141–156. Oxford University Press, Oxford (2014)
55. Dunbar, R.: Gossip in evolutionary perspective. Rev. Gen. Psychol. 8(2), 100–110 (2004)
56. Dessalles, J.-L.: Why talk? In: Dor, D., Knight, C., Lewis, J. (eds.) The Social Origins of Language, pp. 284–296. Oxford University Press, Oxford (2014)
57. Green, M.: Speech acts, the handicap principle and the expression of psychological states. Mind Lang. 24(2), 139–163 (2009)
58. Lewis, D.: Convention. Harvard University Press, Cambridge (1969)
59. Duguid, S., Wyman, E., Bullinger, A.F., Herfurth-Majstorovic, K., Tomasello, M.: Coordination strategies in chimpanzees and human children in a Stag Hunt game. Proc. R. Soc. B 281, 20141973 (2014)
60. Rescorla, M.: Convention. In: Zalta, E.N. (ed.) The Stanford Encyclopedia of Philosophy (Summer 2017 edn.). https://plato.stanford.edu/archives/sum2017/entries/convention/ (2017)
61. Dutta, P.K.: Strategies and Games: Theory and Practice. The MIT Press, Cambridge, MA (1999)
62. Skyrms, B.: Signals: Evolution, Learning, & Information. Oxford University Press, Oxford (2010)
63. Alexander, J.M.: Evolutionary game theory. In: Zalta, E.N. (ed.) The Stanford Encyclopedia of Philosophy (Fall 2009 edn.). https://plato.stanford.edu/archives/fall2009/entries/game-evolutionary/ (2009)
64. Schecter, S., Gintis, H.: Game Theory in Action: An Introduction to Classical and Evolutionary Models. Princeton University Press, Princeton (2016)
65. Huttegger, S.H., Zollman, K.J.S.: Signaling games: dynamics of evolution and learning. In: Benz, A., Ebert, C., Jäger, G., van Rooij, R. (eds.) Language, Games, and Evolution: Trends in Current Research on Language and Games. Lecture Notes in Artificial Intelligence, FoLLI, pp. 160–176. Springer, Berlin (2011)
66. Huttegger, S.H., Skyrms, B., Smead, R., Zollman, K.J.S.: Evolutionary dynamics of Lewis signaling games: signaling systems vs. partial pooling. Synthese 172(1), 177–191 (2010)
67. Skyrms, B.: Pragmatics, logic and information processing. In: Benz, A., Ebert, C., Jäger, G., van Rooij, R. (eds.) Language, Games, and Evolution: Trends in Current Research on Language and Games. Lecture Notes in Artificial Intelligence, FoLLI, pp. 177–187. Springer, Berlin (2011)
68. Grim, P.: Simulating Grice: emergent pragmatics in spatialized game theory. In: Benz, A., Ebert, C., Jäger, G., van Rooij, R. (eds.) Language, Games, and Evolution: Trends in Current Research on Language and Games. Lecture Notes in Artificial Intelligence, FoLLI, pp. 134–159. Springer, Berlin (2011)
69. Hofbauer, J., Huttegger, S.: Feasibility of communication in binary signaling games. J. Theor. Biol. 254(4), 843–849 (2008)
70. Alexander, J.M., Skyrms, B., Zabell, S.I.: Inventing new signals. Dyn. Games Appl. 2, 129–145 (2012)
71. Barrett, J.A.: Rule-following and the evolution of basic concepts. Philos. Sci. 81(5), 829–839 (2014)
72. Sperber, D., Wilson, D.: Relevance, 2nd edn. Blackwell Publishers, Oxford (1995)
73. Sterelny, K.: Language and niche construction. In: Oller, D.K., Griebel, U. (eds.) Evolution of Communicative Flexibility: Complexity, Creativity, and Adaptability in Human and Animal Communication, pp. 215–234. MIT Press, Cambridge (2008)
74. Cloud, D.: The Domestication of Language: Cultural Evolution and the Uniqueness of the Human Animal. Columbia University Press, New York (2014)
75. Crawford, V.P., Sobel, J.: Strategic information transmission. Econometrica 50(6), 1431–1451 (1982)
76. Maynard Smith, J., Harper, D.: Animal Signals. Oxford University Press, Oxford (2003)
77. Collier, K., Bickel, B., van Schaik, C.P., Manser, M.B., Townsend, S.W.: Language evolution: syntax before phonology? Proc. R. Soc. B 281, 20140263 (2014)
78. Engesser, S., Ridley, A.R., Townsend, S.: Meaningful call combinations and compositional processing in the southern pied babbler. PNAS 113(21), 5976–5981 (2016)
79. Wich, S.A., Krützen, M., Lameira, A.R., Nater, A., Arora, N., et al.: Call cultures in orangutans? PLoS ONE 7(5), e36180 (2012)
80. Watson, S., Townsend, S., et al.: Vocal learning in the functionally referential food grunts of chimpanzees. Curr. Biol. 25, 495–499 (2015)
81. Hockett, C.: The origin of speech. Sci. Am. 203, 88–111 (1960)
82. Burling, R.: Words came first: adaptations for word-learning. In: Tallermann, M., Gibson, K.R. (eds.) The Oxford Handbook of Language Evolution, pp. 406–416. Oxford University Press, Oxford (2012)
83. Cheney, D.L., Seyfarth, R.M.: How Monkeys See the World: Inside the Mind of Another Species. University of Chicago Press, Chicago (1990)
84. Hausberger, M., Henry, L., Testé, B., Barbu, S.: Contextual sensitivity and bird song: a basis for social life. In: Oller, D.K., Griebel, U. (eds.) Evolution of Communicative Flexibility: Complexity, Creativity, and Adaptability in Human and Animal Communication, pp. 121–138. MIT Press, Cambridge (2008)
85. Clay, Z., Zuberbühler, K.: Communication during sex among female bonobos: effects of dominance, solicitation and audience. Sci. Rep. 2, 291 (2012)
86. Zuberbühler, K.: Cooperative breeding and the evolution of vocal flexibility. In: Tallermann, M., Gibson, K.R. (eds.) The Oxford Handbook of Language Evolution, pp. 71–81. Oxford University Press, Oxford (2012)
87. Call, J.: How apes use gestures: the issue of flexibility. In: Oller, D.K., Griebel, U. (eds.) Evolution of Communicative Flexibility: Complexity, Creativity, and Adaptability in Human and Animal Communication, pp. 235–252. The MIT Press, Cambridge (2008)
88. Green, M.: Organic meaning: an approach to communication with minimal appeal to minds. In: Capone, A., Carapezza, M., Lo Piparo, F. (eds.) Further Advances in Pragmatics and Philosophy: Part 2 Theories and Applications. Springer, Berlin (To appear)
89. Townsend, S., Koski, S.E., et al.: Exorcising Grice's ghost: an empirical approach to studying intentional communication in animals. Biol. Rev. 92, 1427–1433 (2017)
90. Clark, A., Karmiloff-Smith, A.: The cognizer's innards: a psychological and philosophical perspective on the development of thought. Mind Lang. 8(4), 487–519 (1993)
91. Krebs, J.R., Dawkins, R.: Animal signals: mind-reading and manipulation. In: Krebs, J.R., Davies, N.B. (eds.) Behavioural Ecology: An Evolutionary Approach, pp. 380–402. Blackwell Publishers, Oxford (1984)
92. Rogers, L.J., Kaplan, G.: Songs, Roars and Rituals: Communication in Birds, Mammals and Other Species. Harvard University Press, Cambridge (1998)
93. Scott-Phillips, T.C.: Speaking Our Minds. Palgrave Macmillan, New York (2015)
94. Sterelny, K.: From code to speaker meaning. Biol. Philos. 32, 816–838 (2017)
95. Rizzolatti, G., Arbib, M.A.: Language within our grasp. Trends Neurosci. 21, 188–194 (1998)
96. Fröhlich, M., Wittig, R.M., Pika, S.: Should I stay or should I go? Initiation of joint travel in mother-infant dyads of two chimpanzee communities in the wild. Anim. Cogn. 19, 483–500 (2016)
97. Gauker, C.: How to learn a language like a chimpanzee. Philos. Psychol. 4(1), 139–146 (1990)
98. Grice, P.H.: Meaning. Philos. Rev. 66(3), 377–388 (1957). Reprinted in: Grice, P.H.: Studies in the Way of Words, pp. 213–223. Harvard University Press, Cambridge (1989) (quotations are from Grice 1989)
99. Hagoort, P., Levinson, S.C.: Neuropragmatics. In: Gazzaniga, M.S., Mangun, G.R. (eds.) The Cognitive Neurosciences, 5th edn., pp. 667–674. MIT Press, Cambridge (2014)
100. Noveck, I.A., Reboul, A.: Experimental pragmatics: a Gricean turn in the study of language. Trends Cogn. Sci. 12(11), 425–431 (2008)
101. Donald, M.: Mimesis and the executive suite: missing links in language evolution. In: Hurford, J., Studdert-Kennedy, M., Knight, C. (eds.) Approaches to the Evolution of Language, pp. 44–67. Cambridge University Press, Cambridge (1998)
102. Fischer, O., Nänny, M. (eds.): The Motivated Sign: Iconicity in Language and Literature. John Benjamins Publishing Company, Amsterdam (2001)
103. Haiman, J.: Natural Syntax: Iconicity and Erosion. Cambridge Studies in Linguistics. Cambridge University Press, Cambridge (1992)
104. de Waal, F.B.M., Pollick, A.S.: Gesture as the most flexible modality of primate communication. In: Tallermann, M., Gibson, K.R. (eds.) The Oxford Handbook of Language Evolution, pp. 82–89. Oxford University Press, Oxford (2012)
105. Donald, M.: Key cognitive preconditions for the evolution of language. Psychon. Bull. Rev. 24, 204–208 (2017)
106. Moore, R.: Social cognition, stag hunts, and the evolution of language. Biol. Philos. 32(6), 797–818 (2017)
107. Moore, R.: Gricean communication, language development, and animal minds. Philos. Compass 13(12), e12550 (2018). https://doi.org/10.1111/phc3.12550
108. Bar-On, D.: Origins of meaning: must we 'go Gricean'? Mind Lang. 28(3), 342–375 (2013)
109. Kemmerling, A.: Gricy actions. In: Cosenza, G. (ed.) Paul Grice's Heritage, pp. 73–99. Brepols Publishers, Turnhout (2001)
110. Scott-Phillips, T.C., Kirby, S., Ritchie, G.R.S.: Signalling signalhood and the emergence of communication. Cognition 113(2), 226–233 (2009)
111. Hurford, J.R.: The origins of meaning. In: Tallermann, M., Gibson, K.R. (eds.) The Oxford Handbook of Language Evolution, pp. 370–381. Oxford University Press, Oxford (2012)
112. Kozulin, A.: Vygotsky in context. In: Vygotsky, L.: Thought and Language (translation newly revised and edited by A. Kozulin), pp. xi–lvi. MIT Press, Cambridge (1986)
113. Green, M.: Self-Expression. Oxford University Press, Oxford (2007)
114. Moore, R.: Gricean communication and cognitive development. Philos. Q. 67(267), 303–326 (2017)
115. Hurford, J.R.: The Origins of Grammar: Language in the Light of Evolution, vol. II. Oxford University Press, Oxford (2012)
116. Grice, P.H.: Logic and conversation. In: Grice, P.H.: Studies in the Way of Words, pp. 22–40. Harvard University Press, Cambridge (1989) [1967]
117. Tomasello, M.: A Natural History of Human Thinking. Harvard University Press, Cambridge (2014)
118. van Rooij, R., Franke, M.: Optimality-theoretic and game-theoretic approaches to implicature. In: Zalta, E.N. (ed.) The Stanford Encyclopedia of Philosophy (Winter 2015 edn.). https://plato.stanford.edu/archives/win2015/entries/implicature-optimality-games/ (2015)
119. Blutner, R.: Formal pragmatics. In: Huang, Y. (ed.) The Oxford Handbook of Pragmatics, pp. 101–119. Oxford University Press, Oxford (2017)
120. Franke, M., Jäger, G.: Probabilistic pragmatics, or why Bayes' rule is probably important for pragmatics. Zeitschrift für Sprachwissenschaft 35(1), 3–44 (2016)
121. Franke, M.: Game theoretic pragmatics. Philos. Compass 8(3), 269–284 (2013)
122. Lassiter, D., Goodman, N.D.: Adjectival vagueness in a Bayesian model of interpretation. Synthese 194, 3801–3836 (2017)
123. Franke, M.: Signal to Act: Game Theory in Pragmatics. Unpublished Ph.D. thesis (2009)
124. Huttegger, S.H.: Evolutionary explanations of indicatives and imperatives. Erkenntnis 66, 409–436 (2007)
125. Millikan, R.: Styles of rationality. In: Hurley, S., Nudds, M. (eds.) Rational Animals?, pp. 117–126. Oxford University Press, Oxford (2006)
126. Carston, R.: Thoughts and Utterances: The Pragmatics of Explicit Communication. Blackwell Publishing, Oxford (2002)
127. Kompa, N.: Contextualism in the philosophy of language. In: Petrus, K. (ed.) Meaning and Analysis: New Essays on H.P. Grice, pp. 288–309. Palgrave Macmillan, Basingstoke (2010)
128. Kompa, N.: Meaning and interpretation. In: Conrad, S.-J., Petrus, K. (eds.) Meaning, Context, and Methodology, pp. 75–90. De Gruyter, Berlin (2017)