Language Change, Variation, and Universals (ISBN 0198865392, 9780198865391)

This volume explores how human languages become what they are, why they differ from one another in certain ways but not in others, and why they change in the ways that they do.


English · 336 [335] pages · 2021




OUP CORRECTED PROOF – FINAL, 20/6/2021, SPi

Language Change, Variation, and Universals


Language Change, Variation, and Universals: A Constructional Approach

Peter W. Culicover



Great Clarendon Street, Oxford, OX2 6DP, United Kingdom

Oxford University Press is a department of the University of Oxford. It furthers the University’s objective of excellence in research, scholarship, and education by publishing worldwide. Oxford is a registered trade mark of Oxford University Press in the UK and in certain other countries

© Peter W. Culicover 2021

The moral rights of the author have been asserted

First Edition published in 2021
Impression: 1

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Oxford University Press, or as expressly permitted by law, by licence or under terms agreed with the appropriate reprographics rights organization. Enquiries concerning reproduction outside the scope of the above should be sent to the Rights Department, Oxford University Press, at the address above

You must not circulate this work in any other form and you must impose this same condition on any acquirer

Published in the United States of America by Oxford University Press, 198 Madison Avenue, New York, NY 10016, United States of America

British Library Cataloguing in Publication Data
Data available

Library of Congress Control Number: 2021931013

ISBN 978–0–19–886539–1
DOI: 10.1093/oso/9780198865391.001.0001

Printed and bound in Great Britain by Clays Ltd, Elcograf S.p.A.

Links to third party websites are provided by Oxford in good faith and for information only. Oxford disclaims any responsibility for the materials contained in any third party website referenced in this work.


Contents

Acknowledgments  ix
Preface  xi
List of Abbreviations  xv

PART I. FOUNDATIONS

1. Overview  3
   1.1 The problem  3
   1.2 Constructions  9
       1.2.1 Basics  9
       1.2.2 Constructions are not derivations  12
   1.3 Antecedents  14
2. Constructions  16
   2.1 Introduction  16
   2.2 What a grammar is for  16
   2.3 A framework for constructions  19
       2.3.1 Representing constructions  19
       2.3.2 Licensing  27
       2.3.3 Linear order  28
   2.4 Appendix: Formalizing constructions  31
       2.4.1 Representations on tiers  31
       2.4.2 Connections between tiers  36
       2.4.3 Licensing via instantiation  36
3. Universals  41
   3.1 Classical Universal Grammar  41
       3.1.1 Core grammar  42
       3.1.2 Parameters  44
       3.1.3 UG and emerging grammars  46
   3.2 Another conception of universals  50
   3.3 On the notion ‘possible human language’  53
       3.3.1 Possible constructions  53
       3.3.2 An example: Negation  56
       3.3.3 Another example: The imperative  61
   3.4 Against uniformity  66
4. Learning, complexity, and competition  68
   4.1 Acquiring constructions  68
   4.2 Constructional innovation  75
   4.3 Constructions in competition  77
       4.3.1 Multiple grammars vs. multiple constructions  78
       4.3.2 Defining competition  81
       4.3.3 When do we actually have competition?  86
   4.4 Economy  87
       4.4.1 Representational complexity  88
       4.4.2 Computational complexity  90
       4.4.3 Interpretive complexity  96
   4.5 Simulating competition  100
   4.6 Summary  106

PART II. VARIATION

5. Argument structure  111
   5.1 Introduction  111
   5.2 Argument structure constructions (ASCs)  112
       5.2.1 Devices  112
       5.2.2 CS features  118
   5.3 Differential marking  120
       5.3.1 Differential subject marking  120
       5.3.2 Differential object marking  130
   5.4 Modeling differential marking  133
       5.4.1 Acquisition of ASCs  134
       5.4.2 Simulation  139
   5.5 Summary  144
6. Grammatical functions  145
   6.1 Introduction  145
   6.2 The notion of ‘subject’  146
   6.3 Morphologically rich ASCs  147
       6.3.1 Plains Cree argument structure  148
       6.3.2 Incorporation  152
       6.3.3 Complexity in ASCs  156
   6.4 Split intransitive  158
   6.5 The emergence of grammatical functions  160
   6.6 Summary  165
7. A′ constructions  166
   7.1 Foundations  166
   7.2 Doing A′ work  169
       7.2.1 Gaps and chains  169
       7.2.2 Relatives  174
       7.2.3 Topicalization  175
   7.3 Scope in situ  177
       7.3.1 Wh-in-situ  178
       7.3.2 In situ in polysynthesis  180
       7.3.3 Other in situ  182
       7.3.4 Cryptoconstructional in situ  183
   7.4 Extensions of A′ constructions  183
   7.5 Toward an A′ constructional typology  189
   7.6 Summary  194

PART III. CHANGE

8. Constructional change in Germanic  197
   8.1 Introduction  197
   8.2 Basic clausal constructions of Modern German  198
       8.2.1 Initial position in the clause  200
       8.2.2 Position of the finite verb in the main clause  202
       8.2.3 Position of the verb in a subordinate clause  203
       8.2.4 Position of the verb in questions  203
   8.3 The development of English  204
       8.3.1 The position of the verb  205
       8.3.2 The ‘loss’ of V2 in English  208
       8.3.3 The loss of case marking  213
   8.4 The development of Modern German from Old High German  215
   8.5 Verb clusters  219
   8.6 Conclusion  223
9. Changes outside of the CCore  225
   9.1 English reflexives  225
       9.1.1 Reflexivity in constructions  225
       9.1.2 Variation and change in reflexive constructions  227
   9.2 Auxiliary do  230
       9.2.1 The emergence of do  230
       9.2.2 The spread of do  234
   9.3 Preposition stranding  235
       9.3.1 Why p-stranding?  235
       9.3.2 P-passive  237
       9.3.3 Coercion  239
   9.4 Conclusion  241
10. Constructional economy and analogy  242
   10.1 The elements of style  244
   10.2 Analogy  249
       10.2.1 Maximizing economy  250
       10.2.2 Routines  252
       10.2.3 Pure style  257
   10.3 Beyond parameters: Capturing the style  262
       10.3.1 Baker’s Polysynthesis Parameter  262
       10.3.2 Greenberg’s universals  264
       10.3.3 Non-Greenbergian universals  267
   10.4 Summary  272
11. Recapitulation and prospects  274

References  279
Language Index  309
Author Index  311
Subject Index  316


Acknowledgments

This book has been a long time in the making, and has been profoundly influenced by many people. As always, thanks to Ray Jackendoff and Susanne Winkler for their friendship, collegiality, advice, and support. Jack Hawkins read several early versions of the manuscript and generously shared his insights—they can be seen throughout. I am very grateful to Jefferson Barlew for the formal description of the constructional framework developed as part of our collaboration on minimal constructions, which appears in the Appendix to Chapter 2 and is based on Barlew & Culicover (2015). And I owe a tremendous debt to Giuseppe Varaschin, who read the entire manuscript in various incarnations and made countless detailed and constructive suggestions, virtually all of which have been incorporated into the current version.

Thanks also to Brian Joseph and Noah Diewald, from whom I learned so much in the course of our discussions in our Cree Reading Group, to Greg Carlson, Ashwini Deo, Adele Goldberg, Björn Köhnlein, Andrew McInnerney, Rafaela Miliorini, Lorena Sainz-Maza Lecanda, Richard Samuels, Yourdanis Sedaris, Andrea Sims, Shane Steinert-Threlkeld, Elena Vaiksnoraite, and Joshua Wampler for stimulating discussions on a range of topics, to Philip Miller for helpful comments on the material in Chapter 3 and for his general perspective on constructional approaches to grammar, to Afra Alishahi for our collaboration on the simulation of change in argument structure constructions, to Andrzej Nowak for our collaboration on language change, and to Marianne Mithun for helpful insight into active-stative languages. Thanks to Zoe Edmiston, whose interest in Plains Cree stimulated my own, and to Morten Christiansen and Nick Chater for giving me the opportunity to write the foreword to their recent book and to think freshly about the foundations of linguistic theory.

I am especially indebted to several anonymous reviewers, whose constructive suggestions have pointed to a rethinking of this book in ways that have led to significant improvements. Of course, any errors and deficiencies that remain are my responsibility alone.

The Department of Linguistics and The Ohio State University awarded me a Special Assignment in the Autumn of 2016, which made it possible for me to make significant progress on the first draft. My thanks to Shari Speer, for her constant encouragement and support.

I owe a deep debt to the work of Ivan Sag and Partha Niyogi, and I wish I could thank them now—sadly, they left us far too soon, with so much left for us to do without their guidance and insights. I am grateful as well to the late John Davey of Oxford University Press, whose unflagging support and encouragement over many years helped me through the publication of five substantial volumes, each of which contributes significantly to the foundations of the present work. And it is with great regret and sadness that I am unable to share this work with my dear friend and colleague, the late Michael Rochemont.

Finally, deepest thanks and much love as always to Diane, Daniel, and Jen for always being there.


Preface

This book began with a nagging worry. By and large, the grammatical literature assumes that grammatical functions such as subject and object are universal and independently represented in syntax, and play an integral role in the description of the form/meaning correspondences that comprise what we call ‘language’. But these supposed universals, like many others, pose several mysteries. Where do they come from? Are they part of the biological endowment for language that is encoded in our genes? Is there a device called Universal Grammar in our brains that incorporates such notions as subject and object, or the equivalent? Did biological evolution select for languages whose grammars make use of these grammatical functions?

As I looked more into the literature on grammatical functions, it became clear that they are not universal—not all languages appear to make use of them, and where it appears that they are used, they are not the same crosslinguistically. Of course, we find the terms ‘subject’ and ‘object’ used all the time to distinguish the participants in a relation such as ‘The tiger bit Sandy’ in a given language. And it is possible to stipulate that ‘the tiger’ in some language has the syntactic representation of what we call ‘subject’ in a language like English. But on closer investigation, often what is being distinguished are the phrases that denote entities with thematic roles like agent and patient.

It is often said that “A journey of a thousand miles begins with a single step”, and each journey starts from a different place. This is one such journey. No matter where we start from in syntactic theory, the interconnections take us to places that we did not envision at the outset. In this case, as I thought about grammatical functions and how arguments are distinguished crosslinguistically, I found myself engaged in something much more far-reaching and ambitious: the explanation of language change, variation, and typology. Why does change proceed in certain directions, why do we get the variation that we do, why do we get certain variants and not others? How do grammars carry out the task of encoding the expressive functions of language, and what, if any, are the limits on how this is done? Why are certain patterns ubiquitous, across languages and in a single language, while others are rare or nonexistent?


It is of course possible to formulate descriptions of change, variation, and typological patterns in any reasonably explicit descriptive framework. But in order to properly explain them, we need the right descriptive framework. I had been working on constructional phenomena for some time, beginning in fact with my dissertation (Culicover 1971), more recently with the publication of my books Syntactic Nuts (Culicover 1999), Grammar and Complexity (Culicover 2013c), and Explaining Syntax (Culicover 2013b), and in my collaboration with Ray Jackendoff on Simpler Syntax (Culicover & Jackendoff 2005). It seemed promising to pursue a constructional approach to these issues.

So in the end, this book is not about grammatical functions, although that is one strand. It is about how and why grammars vary and change, and hence why there are distinct languages, understood as overt expressions of different grammars. To address these questions, I argue for a particular constructional approach to the representation of grammatical knowledge, and I seek to show how this approach helps us understand how different languages and typological patterns might arise out of grammatical change and competition between grammars in the natural social and cognitive environment.

The organization of the book is as follows. Part I lays out the foundations of this exploration. It covers a statement of the problem, a reassessment of the notion of Universal Grammar, a theory of constructions, and the conceptual relationships between syntactic theory, grammatical variation, and grammatical change. This part of the book consists of four chapters.

Chapter 1 (Overview) lays out the general problem of explaining the form of grammars, and relates this problem to that of characterizing grammatical complexity. Following much recent work, I take the view that reduction of complexity is a driving explanatory force whose effects can be seen in change and in variation. In the overview, I sketch out the general perspective that I adopt on universals, conceptual structure, constructions, complexity and change and variation, and how they are related.

Chapter 2 (Constructions) sets out an approach to grammatical description in which the notions of Chapter 1 can be formally implemented. The account is a constructional one. I outline the formalism, show how it is used to account for grammatical phenomena, and highlight its utility in describing variation and change.

Chapter 3 (Universals) reviews the approach to universals in contemporary grammatical theory, which is that they are expressions of Universal Grammar (UG), the human faculty of language. In practice UG is assumed to constrain syntax, and thus constitutes an explicit statement of grammatical universals. This chapter formulates a different view, which is that what is universal is


conceptual structure, and grammatical universals and typological patterns arise as a consequence of pressures to formulate constructional grammars that express conceptual structure as simply as possible. While it is possible to characterize change and variation in any sufficiently expressive descriptive framework, I argue that the constructional approach provides a natural framework for explaining language variation and change.

Chapter 4 (Learning, complexity, and competition) explores how envisioning a language learner as acquiring a grammar consisting of constructions allows us to account for change and variation. I develop the idea that change is not solely the responsibility of early language learners, but may also occur as innovations initiated by adult speakers. One key explanatory component is constructional complexity; another is competition between constructions that have overlapping functions.

In Part II I look at a number of cases of variation in constructions that deal with two core expressive functions of a language: argument structure and A′/filler-gap constructions. I show how the constructional framework provides a formal apparatus that is suitable both for describing the phenomena in a given language and for accounting for the observed variation. This part of the book consists of three chapters.

Chapter 5 (Argument structure) applies the theory to variation in systems for expressing argument structure. A central point is that there are multiple devices of comparable complexity that encode the thematic roles; hence it is not necessary to assume that all languages share a uniform syntactic structure at some abstract level of representation.

In Chapter 6 (Grammatical functions) I return to the question that triggered this project, the source of grammatical functions (GFs). I review evidence that not all languages require GFs, and show how to capture the relevant correspondences between form and meaning in constructional terms.

Chapter 7 (A′ constructions) applies the theory of the preceding chapters to A′ constructions, such as wh-questions and relative clauses. The main result of this chapter is that there is a range of ways in which the conceptual ‘work’ associated with these constructions can be expressed in the correspondence between syntax, phonology, and meaning. None of them involve ‘movement’ in the classical sense, although constructions can express links between constituents not in canonical position relative to their governing heads, giving the illusion of movement.

Part III applies the models of constructional learning and network interactions developed in Chapter 3 to establish the plausibility of the account of change sketched out in Part I. This part of the book consists of three chapters.


Chapter 8 (Constructional change in Germanic) tracks several of the major changes in English and German word order and accounts for them in terms of constructional change as formulated in Chapter 3. It argues that the changes in Germanic are relatively simple in constructional terms, although the superficial results are quite dramatic. Among the topics addressed are clause-initial position, V2, VP-initial and VP-final verb position, the loss of V2 and case marking in English, and verb clusters in Continental West Germanic.

Chapter 9 (Changes outside of the CCore) shows the broader applicability of the constructional approach. I look at three well-documented developments in English that do not fall into the category of ‘core phenomena’ as understood in Chapter 3: reflexivity, do support, and preposition stranding. These changes are not as central to the expressive function of language as argument structure, operator/scope, and similar phenomena. I argue that these phenomena provide additional evidence that the constructional approach is well-suited for providing genuine explanations for language change and variation.

In Chapter 10 (Constructional economy and analogy), I look more deeply into what constitutes constructional economy, and why it plays a role in shaping the form of grammars. I argue that constructional economy is the consequence of what has been called ‘analogy’ in the traditional linguistics literature. Specifically, I suggest that economy in constructions derives from placing a high value on the use and reuse of the components of the processing routines associated with constructional correspondences. I apply this general idea to seek explanations for a range of typological patterns that I refer to generally as ‘style’.
Chapter 11 (Recapitulation and prospects) summarizes the main results of this book and lays out some general propositions about how to think further about language variation and change from the perspective of constructional grammars.


List of Abbreviations

Adj          adjective
ASC          Argument structure construction
AUX          Auxiliary
BDT          Branching Direction Theory
CCore        Conceptual Core
CS           Conceptual Structure
CWG          Continental West Germanic
DM           Distributed Morphology
DOM          Differential object marking
DSM          Differential subject marking
GB (theory)  Government Binding theory
GF           grammatical function
HPSG         Head-driven Phrase Structure Grammar
IPP          Infinitivus Pro Participio
IS           information structure
LFG          Lexical Functional Grammar
LID          Lexeme Identifier
MGG          Mainstream Generative Grammar
MOD          Modal
ModE         Modern English
ModG         Modern German
N            noun
Neg          Negative
NP           noun phrase
OE           Old English
OHG          Old High German
P&P          Principles and Parameters Theory
PA           Parallel Architecture
PG           Proto-Germanic
PLD          primary linguistic data
PP           prepositional phrase
RG           Relational Grammar
SAI          subject Aux inversion
SD           Standard Dutch
UG           Universal Grammar
V            verb
VP           Verb Phrase
VPR          Verb projection raising
WALS         World Atlas of Language Structures
WF           West Flemish
ZT           Zürich German


PART I

FOUNDATIONS


1
Overview

Language Change, Variation, and Universals: A Constructional Approach. Peter W. Culicover, Oxford University Press. © Peter W. Culicover 2021. DOI: 10.1093/oso/9780198865391.003.0001

1.1 The problem

My primary concern in this book is how human languages get to be the way they are, why they are different from one another in certain ways and not in others, and why they change in the ways that they do. Given that language is a universal creation of the human mind, the central question is why there are different languages at all. Why don’t we all speak the same language? This chapter lays out the general foundations of inquiry into this question in contemporary linguistic theory, and the specific assumptions that inform the answers developed in this book. I call this central question ‘Chomsky’s Problem’.1

Chomsky’s own answer, hinted at in Chomsky (1965) and further developed in Chomsky (1973, 1981) and other work, has been that in a sense we do all speak the ‘same language’. What we produce is the external manifestation of a universal, biologically determined, abstract faculty of the human mind, called Universal Grammar (UG). This classical Chomskyan account, which I refer to throughout as Mainstream Generative Grammar (MGG), has the following main components:

(i) There is a set of very general grammatical principles, structures, and mechanisms, UG, which define the core grammar shared by all languages.
(ii) These principles and mechanisms are biological universals and constitute I-language.
(iii) Some observable variation is due to differences in the values of core parameters; hence I-language takes different forms as determined by these parameters. The parameters are set by learners on the basis of exposure to primary linguistic data (PLD).
(iv) The set of actual sentences and their meanings, produced by a group of speakers, referred to as E-language, is itself of no theoretical significance, except insofar as it counts as evidence about I-language, UG, and the parameters of variation.
(v) Phenomena that are outside of UG are in the periphery. The periphery contains idiosyncrasies, irregularity, and exceptions. It can vary widely from language to language, although not without principled constraints.
(vi) The principles and mechanisms of UG constitute an optimal solution to the problem of expressing thought.

Frameworks that fall under the general perspective of (i)–(v) are Government/Binding Theory (Chomsky 1981), Principles and Parameters Theory (Chomsky 1981; Chomsky & Lasnik 1993) and, with (vi), the Minimalist Program (Chomsky 1995b, 2000a).

I assume that any solution to Chomsky’s Problem must have the general structure of (i)–(v), but only in the following sense.

(vii) It must explain why languages share so many properties, both in form and in function.
(viii) It must attribute these properties to some universal source—biological, cognitive, or social, or a combination of these.
(ix) It must account for the possibility of variation and for the range of variation.
(x) It must accommodate not only regularities and generalizations, but idiosyncrasies, irregularity, and exceptions.
(xi) It must explain why certain properties are common while others are rare or do not occur at all.
(xii) It must explain how learners arrive at mental representations of language—that is, grammars—that are suitably close to but not necessarily identical to the representation(s) of members of the linguistic community that they are learning from.

The solutions that I propose in this book are inspired by Chomsky’s Problem and the Chomskyan program of MGG, and play off of them, but they are in many important respects very different, both in spirit and in substance. Crucially, Chomsky’s approach has been to assume point (vi), i.e. that “language design may really be optimal in some respects, approaching a ‘perfect solution’ to minimal design specifications” (Chomsky 2000a, 93). Chomsky’s notion of optimality is based on an abstract notion of economy, one that is not linked in any straightforward way to the cognitive capacities and limitations of human beings (Johnson & Lappin 1997, 1999).

1 The parallel with Chomsky’s “Orwell’s Problem” and “Plato’s Problem” is intended.


In contrast, I assume that grammars are not computationally optimal in some abstract sense, but reflect the outcome of the neo-Darwinian evolution of a complex cognitive system (Ladd et al. 2008; Kinsella 2009; Kinsella & Marcus 2009). The evolution of this system is driven, at least in part, by the pressure to reduce the complexity of the mental representation of grammatical competence and associated computational complexity along specific dimensions. I use the term economy to refer precisely to this pressure.

The approach that I argue for here has the following general structure.

(i) Regarding the core, I assume that it is grounded in conceptual structure, not in a set of formal constraints on grammars. The basic job of language is to express thought.2 I use the term CCore (for Conceptual Core) here to refer to the set of expressive functions that are central to human thought and discourse. A grammar of a language encodes these functions more or less efficiently and transparently—functions such as argument structure, thematic structure, interrogation, imperatives, description, binding, reference and coreference, restrictive modification, negation and quantification, discourse structure, and so on.3 The exact extent of the CCore, and its origins, remain open questions.
(ii) I assume that the expressive functions that languages encode are cognitive universals. The grammatical devices for expressing them, on the other hand, are social universals, in the sense that they reside in the minds of speakers in social networks in virtue of their linguistic competence, and are transmitted socially, not biologically, across cultures and generations, through contact between individuals and groups of individuals. These social universals ‘live’ in the social network, that is, in the linguistic competence of all speakers of all languages across time and space. They are universals in the sense that they are universally available for the expressive functions of the CCore. However, no particular way of expressing a particular function needs to be absolutely universal in the sense that it is active in the grammar of every possible language, a point that I return to below.

2 Thus I agree with Chomsky (1972). The expression of thought is logically prior to the communication of thought, at least thought that corresponds to representations that are constructed combinatorially out of primitive elements (Fitch 2011). Externalization is subsequent to the construction of thought, and is manifested at the interface with sound and gesture. This is not to deny, however, that the expression of thought is central to communication, and that some aspects of linguistic form may be explained in terms of constraints imposed by the communicative task. Chomsky (2005) seems to accept this view in his citation of ‘third factor’ explanations, which comprise communicative efficiency among other things.

3 Chomsky’s focus in the Minimalist Program on minimal mechanisms for expressing argument structure and extraction (external and internal binary Merge) is arguably a very restricted variant of the approach that I am taking here, taking syntax as a proxy for a limited portion of conceptual structure, and setting aside most conceptual and grammatical phenomena.

OUP CORRECTED PROOF – FINAL, 18/6/2021, SPi

(iii) Regarding variation, I assume that anything that can possibly be expressed as a correspondence between sound and meaning is, in principle, possible in language. Hence in principle, variation is unbounded. However, economy, that is, the pressures on grammars to reduce complexity, leads to a significant winnowing of the logical possibilities, in the spirit of markedness (Chomsky 1965, chapter 1). We expect that, other things being equal, the simplest ways of expressing the CCore will be most widespread in social networks and perhaps even completely universal in some cases.⁴

The challenge posed by Chomsky’s Problem is a fundamental one. How is it possible to have a restrictive theory of the human language capacity that nevertheless allows for the massive superficial variation and idiosyncrasy that is attested in the world’s languages? Even if we entertain Chomsky’s view of a highly restrictive UG, we must account for the variation—simply banishing it to the periphery will not suffice as an explanation. It is important to always keep in mind that the full range of grammatical phenomena is acquired by learners, not just the parametric variation defined over some characterization of a restricted UG core (Culicover 1999; Culicover & Jackendoff 2005). Regarding this challenge for the learner, Chomsky (2013, 37) says that either there is an infinity of options, in which case challenging and perhaps hopeless abductive problems arise if the task is taken seriously; or there is a finite number, and the approach falls in principle within P & P [Principles and Parameters Theory – PWC]. That leaves open many questions as to how parameters are set, and what role other cognitive processes might play in setting them.

⁴ Dufter et al. (2009, 12–13) raise a number of important questions about how to explain apparent constraints on constructional variation: One such question is whether the factors that influence variant distribution should be an integral part of the grammar or not. A second theoretical issue that needs to be explored further is the nature of generalization. Can generalizations about structural properties within a language be formulated in parallel fashion to cross-linguistic generalizations, using the same types of inheritance networks? Shouldn’t there be a somewhat different status accorded to more general typological principles, constraints, or parameters? Is there really no difference between the modeling of micro-variation and macro-variation? It seems that construction schemata can be postulated rather ad hoc, such that the limits of what is possible in language do not follow from the theory (as it is claimed by models of the Chomskyan tradition). I believe that the approach to economy that I develop in this book offers a useful way to address such questions.


Challenging, yes, but not hopeless. In Culicover (1999), I argued that the idiosyncrasies that languages have—the so-called periphery—are as robust as the more general phenomena that have classically been the province of parameter theory, e.g. word order and movement. They are learned, and native speakers have clear intuitions about them. At first glance, the periphery appears to allow unpredictable variation. However, the peripheral phenomena of a language prove to be related in systematic and often revealing ways to the more general and regular phenomena. And the latter are by no means always fully general and fully regular.

The classical view of variation is that it is parametric, in the sense that a parameter has a finite number of possible values, preferably two, and each language has a particular setting for each parameter. I argue in the course of this book that characterizing variation in terms of parameters does not shed much light on the nature and scope of variation. The approach that I take is that what is essentially universal is not the inventory of formal devices that define the syntax of a language, as in MGG, but the CCore, that set of conceptual structure functions that a language must encode. Expressing the CCore is the ‘work’ that a grammar does. Syntactic structure is one way of encoding these functions and organizing them into sounds, morphological form is another.⁵ As a consequence, the universal architecture of the CCore is inevitably reflected cross-linguistically in the organization of syntax and morphology, and economy restricts the range of ways in which this work is accomplished.

This perspective is guided by a set of ideas and intuitions. Some of these are shared with MGG and some are not.

(i) Thought and the expression of thought are universal. All humans are born with the same cognitive apparatus for forming thoughts and the same drive to express and communicate these thoughts in sound and gesture.
(ii) Grammar is a distillation of thought. Through this distillation, grammar becomes autonomous, at least to some extent, and its categories only indirectly and imperfectly correspond to the categories of thought (see (iii)). By this I mean that the categories and relations in our conceptual representations are reflected imperfectly, and in a tighter and more restricted way, in the categories and relations that constitute grammars.⁶ For example, our conceptual categories of time are very complex—they cover the past, the present, and the future, and times that precede and follow reference times relative to them and to the time of utterance (Reichenbach 2012 [1958]). The tense systems of languages reflect some of the basic distinctions of conceptual time, but are for the most part simpler than the semantics, and in general do not map directly into particular times and temporal relationships.

(iii) Syntax is autonomous. Autonomy of syntax means that although there are correlations between syntactic structure and meaning, to a significant extent, syntax is not reducible to meaning, that is, propositional semantics, information structure, and discourse structure. For example, there are many distinctions in the thematic roles that individuals may play in events and states, but few of these distinctions are grammatically marked (see, for example, Dowty 1991, as well as Chapter 5).

(iv) Alignment and packaging. Languages may vary in terms of which aspects of meaning are packaged together in particular morphosyntactic units and corresponding phonological forms. Simple examples are the expressions enter and go into in English. In the case of enter, the meaning components go′ and into′ are packaged into a single word, while in the case of go into they correspond to distinct words. But packaging can be quite a bit more complex than this (Jackendoff 2002; Slobin 2004).

These intuitions lead to the characterization of a grammar as consisting of constructions. The formal description, function, and scope of constructions are discussed at greater length in subsequent chapters; in this chapter I provide an informal sketch of the constructional approach to grammatical description and explanation.

⁵ While some syntactic theories have sought to reduce morphological form to syntactic derivation (see Harley & Noyer 1999 for a review), I argue in Chapter 5 that there is no empirical motivation for such a step.
⁶ An early expression of this relationship can be found in Paul (1890, 288): Every grammatical category is produced on the basis of a psychological one. The former is originally nothing but the transition of the latter into outward manifestation. As soon as the agency of the psychological category can be recognised in the use of language, it becomes a grammatical category. Its agency, however, by no means ends with the creation of the latter. It is itself independent of language. As it existed before the grammatical category, so it does not cease to operate when this comes into being. In this way the original harmony between the two may be in the course of time disturbed. The grammatical category is to some extent a petrifaction of the psychological. Instead of Paul’s “petrifaction” I use the term “distillation,” but the basic idea is the same.



1.2 Constructions

This section summarizes the essential features of the constructional approach to grammar, and contrasts it with other syntactic theories.

1.2.1 Basics

First, the basic architecture of a constructional approach. I assume, following Jackendoff (2002) and many others, that the minimal description of a linguistic object such as a word or a phrase—a construct—involves a description of its sound (phon), its grammatical structure (syn), and its conceptual structure, or meaning (cs). These are the tiers. Culicover & Jackendoff (2005) also assume that the grammatical functions are represented on a distinct gf tier. cs may comprise discourse and information structure, or these may be distinct tiers. I leave open here the question of precisely how to incorporate these aspects of meaning, as it appears to be a matter of notation, not substance.

Simpler Syntax adopts the view taken in Head-driven Phrase Structure Grammar (HPSG; Pollard & Sag 1994), Lexical-Functional Grammar (LFG; Bresnan & Kaplan 1982), and varieties of Categorial Grammar (Jacobson 1992; Kubota & Levine 2013a,b; Morrill 1995; Oehrle et al. 1988; Pollard 2004; Steedman 1993; Uszkoreit 1986; Zeevat et al. 1987) that syntax is monostratal. That is, there is a single syntactic representation that corresponds to representations on other levels. From this it follows that there is no ‘movement’ per se. Chains that relate constituents external to the basic clause and gaps, pronominal copies, or affixes are products of the correspondence between phon, syn, and cs, and are not produced by deriving one syntactic structure from another.

A construction in the sense of Culicover & Jackendoff (2005) describes a relationship between representations on two or more tiers that licenses the properties of constructs. The essential difference between construction and construct is the difference between a description and the objects that satisfy such descriptions; otherwise they have the same general architecture, and are composed of more or less the same elements following the same principles.
An important difference is that a construction may contain variables, while a construct does not. With other constructional grammarians, I assume that the constructions, including general grammatical constructions, idioms, and individual lexical items, reside in the extended lexicon, often called the ‘constructicon’ (Jurafsky 1996).


To illustrate with a concrete example, there is a construction in English that stipulates that under normal circumstances, the verb in a verb phrase precedes its arguments and adjuncts. A sequence of words that consists of a verb and strings that constitute arguments and adjuncts of the verb must meet this condition. Such a sequence is licensed by the construction for the English VP. A sequence of constituents that does not meet this condition is not licensed by this construction, and is ill-formed as a VP, unless there is another construction that licenses it. So, the VP in (1a) is a well-formed VP of English, while the VP in (1b) is not. But the VP in (1c) is well-formed, because there is a special idiomatic construction that licenses it.

(1) a. Chris does not [VP like pepperoni pizza].
b. ∗Chris does not [VP pepperoni pizza like].
c. One swallow does not [VP a summer make].

Thus, the function of a grammar is to state what properties a construct must or may have in order to be well-formed in the language. A formalization of this relationship between construction and construct is given in the Appendix to Chapter 2. The grammar of a language is the set of constructions that state the licensing conditions that together define the set of well-formed constructs in this language.

An important consequence of the constructional approach is that it is possible for a grammar to contain two constructions that license alternative forms of the same syntactic structure, or alternative ways of encoding a particular CS function syntactically. For example, for a VP that consists of V and NP, a grammar may have one construction that licenses [VP V–NP], and one that licenses [VP NP–V]. There are languages, including older forms of English, which show such variation.
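The licensing idea just illustrated with (1) can be given a toy computational sketch. The following is my own illustration, not part of the formalism developed in this book: a grammar is modeled as a set of licensing conditions, and a candidate VP is well-formed if at least one construction, general or idiomatic, licenses it. All names and data structures are hypothetical simplifications.

```python
# Toy sketch of constructional licensing (illustrative only).
# A construct is a list of (category, word-string) pairs, e.g. a candidate VP.
vp_ok    = [("V", "like"), ("NP", "pepperoni pizza")]   # (1a)
vp_bad   = [("NP", "pepperoni pizza"), ("V", "like")]   # (1b)
vp_idiom = [("NP", "a summer"), ("V", "make")]          # (1c)

def head_initial_vp(construct):
    """General construction: the verb precedes its NP argument."""
    return [cat for cat, _ in construct] == ["V", "NP"]

def swallow_idiom(construct):
    """Special idiomatic construction licensing 'a summer make'."""
    return [w for _, w in construct] == ["a summer", "make"]

GRAMMAR = [head_initial_vp, swallow_idiom]

def licensed(construct):
    # A construct is well-formed if some construction in the grammar licenses it.
    return any(c(construct) for c in GRAMMAR)

print(licensed(vp_ok))     # True
print(licensed(vp_bad))    # False
print(licensed(vp_idiom))  # True
```

The point of the sketch is only that licensing is satisfaction of a description, not derivation: nothing is moved to rule (1b) out; it simply matches no construction in the grammar.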
The flexibility that the constructional approach affords is what makes it so useful for our understanding of variation and change.⁷

The descriptions on each tier are representations of the familiar sort—phonological forms, syntactic structures, and semantic representations. For an individual word, the construction that defines it is a lexical entry that specifies its phonological form, its syntactic category and morphological features, and its meaning (Jackendoff 2002; Jackendoff & Audring 2020). I assume for concreteness that for the most part the elements of phon are determined by the language-specific paradigm function (in the sense of Stump 2001). For English I represent this function as ΦEnglish (or just Φ when it is clear), and so on. For example, the representation of the bare form of the verb kick is, roughly, the correspondence shown in (2).

(2)
⎡ phon   ΦEnglish(kick) = /kɪk/             ⎤
⎢ syn    [V kick]                           ⎥
⎣ cs     λy.λx.kick′(agent:x, patient:y)    ⎦

Crucially, when the verb has particular morphosyntactic properties in a sentence, the corresponding forms specified by Φ appear in phon. For example, the verb kick has the third person singular present tense inflection in Chris kicks Fido; the paradigm function ΦEnglish specifies the form as in (3).

(3) ΦEnglish(kick, 3, sg, pres) = /kɪks/

For regular inflections, like the English third person singular present, the paradigm function may be formulated in terms of a regular correspondence, adding the appropriate realization of -s to the form of the bare verb. For irregular cases, the form must be stipulated, as in (4).

(4) ΦEnglish(be, 3, sg, past) = /wəz/

The precise formal characterization of Φ or an equivalent alternative is not a central concern here—what we require is simply some way of expressing the fact that the phonological form in phon reflects the morphosyntactic properties in syn. Inflectional and paradigmatic complexities make it impossible to always state one-to-one correspondences between features of syn and phonetic strings in phon. Assuming an appropriate Φ for each language allows us to set aside the problem of exactly what is in phon in the formal statement of constructional correspondences.

The specification of the syntactic representation as [V kick] identifies the lexical properties, in this case, the lexical identity, of the word, which is one of the terms that the paradigm function applies to. As noted by Sag (2012), the phonological form and the meaning together are not sufficient to distinguish the lexical item, since it may lack an interpretation in an idiom, e.g. kick the bucket.⁸ And the use of λ and the thematic roles agent, patient in the semantic representation roughly indicates the argument structure that corresponds to this verb. This is the basic picture, to be refined as we proceed.

⁷ It is, of course, possible to recruit other devices to represent such variation, e.g. multiple grammars, or multiple settings for the same parameter (e.g. Yang 2002). There are no empirical differences, as far as I can tell, and in the end the choice of representation must be based on what architecture is most natural.

⁸ In this respect I depart from the practice in Culicover & Jackendoff (2005) and elsewhere.
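The behavior of Φ in (2)–(4) can be sketched as a small lookup-plus-rule procedure: stipulated irregular forms are consulted first, and otherwise a regular correspondence applies. This is only an illustrative toy with an ad hoc feature encoding, not Stump’s formalism.

```python
# Toy sketch of a paradigm function Φ_English (illustrative only).
# Irregular forms are stipulated, as in (4); regular inflection is a rule, as in (3).
IRREGULAR = {
    ("be", ("3", "sg", "past")): "/wəz/",
}

STEMS = {"kick": "/kɪk/", "be": "/bi/"}

def phi(lexeme, features=None):
    """Map a lexeme plus morphosyntactic features to a form in phon."""
    if features is None:                    # bare form, as in (2)
        return STEMS[lexeme]
    if (lexeme, features) in IRREGULAR:     # stipulated, as in (4)
        return IRREGULAR[(lexeme, features)]
    if features == ("3", "sg", "pres"):     # regular correspondence, as in (3)
        return STEMS[lexeme][:-1] + "s/"    # add the realization of -s
    raise ValueError("form not defined in this toy fragment")

print(phi("kick"))                       # /kɪk/
print(phi("kick", ("3", "sg", "pres")))  # /kɪks/
print(phi("be", ("3", "sg", "past")))    # /wəz/
```

The ordering of clauses mirrors the text: stipulation pre-empts the regular rule, so the same interface handles both Chris kicks Fido and Chris was here.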


Finally, as in Relational Grammar (Perlmutter 1983; Perlmutter & Rosen 1984), the constructional framework as described here assumes that grammatical functions have a role in grammatical descriptions. However, unlike RG (but following Simpler Syntax), I do not assume that GFs such as subject and object are universal primitives, nor do I assume that the grammar is formulated entirely in terms of GFs. It is formulated in terms of constructions, which may make reference to GFs. Relations between constructions may then also be expressed in terms of the GFs. I argue in Chapter 5 that in certain cases the GFs are categories of correspondences that emerge in the course of language change as generalizations of features of thematic relations. In others, they are descendants of grammatical devices for marking information structure. They appear to play a useful computational role, and thus form part of the resources that constitute the collective knowledge of language in the social network.

1.2.2 Constructions are not derivations

In order to appreciate the constructional approach, it is useful to contrast it with MGG.⁹ Some of the main differences are the following:

(i) MGG licenses linear ordering by moving constituents from canonical positions to designated positions. In many constructional approaches, including the one in this book, the grammar states a correspondence between a particular syntactic structure and a linear ordering, which gives the appearance of movement when constituents are adjacent in syn (e.g. they are sisters) but their phonological forms are not adjacent in phon, or when there is more than one possible ordering for the phonological forms corresponding to a particular syntactic structure.

(ii) MGG assumes a strictly compositional semantics; a constructional grammar does not (although it must account for compositionality where it occurs)—cf. Jackendoff’s (1997) notion of ‘enriched composition’.

(iii) MGG assumes a uniform syntactic structure for the same meaning in a single language and cross-linguistically; a constructional grammar does not.¹⁰

⁹ I am using the term ‘constructional’ in a generic sense, in order to maintain a distinction between the approach that I develop here and the specifics of Construction Grammar (Fillmore 1988; Goldberg 1995; Kay 2002; Michaelis 2012). Goldberg (2013) uses the term ‘constructionist’.

¹⁰ For an extended critique of the application of uniformity in MGG, see Culicover & Jackendoff (2005, chapters 2 & 3).


(iv) MGG is rigid in its licensing conditions; a constructional grammar is flexible.

(v) In MGG, the form of an inflected or otherwise complex word is derived through movement and adjunction; the resulting constituent is spelled out according to information stored in a list of forms.¹¹ On the constructional approach, the form of these words is defined by the paradigm function Φ or the equivalent.

An extended discussion of each of these points would take us too far afield, so I elaborate only the first point, which is central to much of what is taken up elsewhere in this book. Consider the example of adjective-noun ordering in French and Italian. The canonical order is Adj-N, but the order N-Adj is also possible, as shown by the French examples in (5).

(5) French
a. une belle maison
   a beautiful house
   ‘a beautiful house’
b. une voiture rouge
   a car red
   ‘a red car’

Taking the underlying order as Adj-N, the order N-Adj is derived in MGG by raising the N to a preadjectival position (Cinque 1994, 2005). To the extent that the two orders correspond to a semantic difference, the raising of N can be triggered by a feature on Adj that corresponds to this semantic difference and must be ‘checked off’ by adjoining N to it in the observed particular configuration. Or the meaning difference can be associated with invisible elements in the syntactic representation with the appropriate interpretation that differentially trigger movements to different superficial positions. In a sense the alternation here is ‘parametric’, because for a language to have a particular order it must have a particular feature value or invisible element, while a language with the other order has a different feature value or element (or lacks an element).
But the use of features or invisible elements to trigger the different orders and the corresponding meaning differences is actually constructional: the observed constructional differences are implemented in terms of features, abstract elements, and movement. I have called this approach ‘cryptoconstructionalism’ (Culicover 2017), and will argue as we proceed that an explicit constructional approach captures the phenomena more transparently and in most cases more simply.¹²

The constructional alternative in the case of adjective-noun order is to formulate the two orders in the description of distinct constructions, and to associate the meaning difference directly with the two orders licensed by the constructions; the correspondence is mediated through the phonological correspondence with the syntactic categories. The argument for one approach over the other is not one of empirical coverage: the two approaches cover the same phenomena. The marking of one adjective with a feature that triggers movement, but not another, is in fact the implementation in a derivational framework of a constructional fact. What is constructional is that the linear ordering is sensitive to lexical properties, perhaps arbitrary lexical properties.¹³

Many, if not most, derivational analyses of linear ordering in MGG are in fact cryptoconstructional. That is, they use the derivational apparatus of feature checking and movement to encode exactly the information that can be encoded directly in a construction. In other words, they are actually constructional analyses in which the linear order and phonological form of elements are mediated by derivations, and not expressed directly. For this reason, I do not dwell too much on the specifics of such analyses in this book, except in a few cases where it is useful to emphasize this point. Chapter 2 elaborates the constructional approach further and compares it with the alternatives.

¹¹ For an extreme version of this view, see recent work in Distributed Morphology (e.g. Halle & Marantz 1993, 1994; Embick & Noyer 2007; Harley & Noyer 1999). There is no lexicon per se in DM. Rather, the sound and meaning information that we associate with lexemes is distributed over a Vocabulary and an Encyclopedia, while the categorial information is in the syntax.
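The constructional alternative for the adjective-noun alternation in (5) can be pictured as a pair of order-meaning pairings that are directly sensitive to lexical properties. The following toy sketch is my own illustration; the semantic labels (‘evaluative’ vs. ‘classifying’) are ad hoc stand-ins for whatever the actual meaning difference turns out to be, and the lexical restrictions are deliberately arbitrary.

```python
# Toy sketch: two constructions, each pairing a linear order with a
# (simplified) meaning, stated directly rather than derived by movement.
CONSTRUCTIONS = [
    {"order": ("Adj", "N"), "adjs": {"belle"}, "cs": "evaluative modification"},
    {"order": ("N", "Adj"), "adjs": {"rouge"}, "cs": "classifying modification"},
]

def license_np(words):
    """words: (category, form) pairs for a candidate [Adj N] or [N Adj]."""
    cats = tuple(cat for cat, _ in words)
    adj = next(form for cat, form in words if cat == "Adj")
    for c in CONSTRUCTIONS:
        if c["order"] == cats and adj in c["adjs"]:
            return c["cs"]        # licensed, with constructional meaning
    return None                   # not licensed by any construction

print(license_np([("Adj", "belle"), ("N", "maison")]))   # evaluative modification
print(license_np([("N", "voiture"), ("Adj", "rouge")]))  # classifying modification
print(license_np([("Adj", "rouge"), ("N", "voiture")]))  # None
```

The sketch makes the cryptoconstructionalism point concrete: the lexical sensitivity that a derivational account encodes as a movement-triggering feature on the adjective appears here simply as a lexical condition on the construction.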

1.3 Antecedents

Since the scope of the present study is substantial, it touches on many issues and topics that have been addressed in many syntactic theories. And, since space is limited, I am not going to be able to refer in detail to most theoretical antecedents and alternatives to my own proposals. In lieu of that, I offer the following general observations, which are not intended to be exhaustive. All contemporary theories are right about something, although not everything, and not the same things. There is considerable insight in work within Government Binding (GB) theory, Principles and Parameters theory, the Minimalist Program, HPSG, LFG, Relational Grammar, Role and Reference Grammar, and Combinatory Categorial Grammar. All of these have an illuminating perspective on some (but often different) fundamental property or properties of language. I have taken much from all of them, in the spirit of Simpler Syntax, which stresses an eclectic approach to understanding how language works.

¹² For a remark that appears to accord with this sentiment, see Chomsky et al. (2019, 251).
¹³ For one approach, see Bouchard (2009).

• MGG is right about the existence of a universal faculty that underlies language. But this faculty is not just about syntax—it is about CS and how phonological and morphological forms are recruited to systematically express aspects of CS. What is universal is the architecture of correspondences that constructions express. Formal devices that do an effective job of expressing the obligatory cs-phon correspondences are likely to persevere in competition between languages and eventually will find their way into many grammars. While they are not universals in the sense that they must be found in every language, they are ubiquitous.

• Within MGG, GB theory is right about modularity. But I take the modules to be different. In a constructional theory, there are independent principles and rules of combination for phonology, syntax, morphology, and conceptual structure.

• Within MGG, Principles and Parameters theory is right in a certain sense about variation. But the parameters of variation are not about UG, and are not universal. Some are defined at the constructional level, in terms of the logical possibilities for characterizing the cs-phon correspondences. I discuss how to understand parameters in constructional terms at greater length in Chapter 10.

• The Minimalist Program is right about the explanatory relevance of economy. But the measure is not about some abstract computation, but about the actual cs-syn-phon correspondences that a grammar must express.

• HPSG is right about monostratality and the projection of structure from heads.

• LFG is right about correspondences between levels of representation and the basic grammatical architecture.
• Relational Grammar is right that grammatical functions have a role in grammar, as discussed in section 1.2.1.

• Combinatory Categorial Grammar is right about the architecture of correspondence from cs to phon. The relationship is mediated by constructions, however, so that the correspondence may be more or less faithful and economical. One consequence is the possibility of variation in generality, transparency, and complexity.

OUP CORRECTED PROOF – FINAL, 20/6/2021, SPi

2 Constructions

2.1 Introduction

In order to be able to talk in precise terms about language variation and change we must be able to characterize languages in terms of grammatical variation and change. Hence we must be able to characterize grammars. As already discussed in Chapter 1, what defines a language is not a set of sentences, but a mental representation that embodies a grammar. This book argues that a useful formulation of this mental representation, from the perspective both of description and explanation, is in terms of constructions. In this chapter I outline the constructional theory that I assume in order to frame the scenario of language variation and change, with the goal of providing a solution to Chomsky’s Problem: why are there different languages at all? Why don’t we all speak the same language?

In section 2.2 I sketch out the constructional approach to knowledge of a language. In 2.3 I elaborate the constructional formalism and explain how constructions characterize the well-formedness of linguistic expressions. I also provide a number of examples of English constructions to illustrate how the system works. A more formal development of these ideas is given in the Appendix to this chapter.

2.2 What a grammar is for

In this section I summarize how a constructional approach characterizes what it is that speakers know when they know a language. The standard assumption in linguistic theory is that the job of a grammar is to characterize the well-formed expressions of a language. That is, a grammar is a generative grammar in the usual sense. A well-formed expression is an expression whose form and other properties conform to the conditions imposed on it by the grammar. To know a language is to have a mental representation that captures the well-formedness conditions of expressions in that language.

Language Change, Variation, and Universals: A Constructional Approach. Peter W. Culicover, Oxford University Press. © Peter W. Culicover 2021. DOI: 10.1093/oso/9780198865391.003.0002


To take a simple example, Sandy snores is well-formed in English because the NP Sandy precedes the verb snores and, since Sandy is a third person singular noun phrase, the verb snores shows the third person singular present tense form. That is, subject and verb ‘agree’ in person and number. Some ill-formed expressions that are not licensed by the grammar are ∗Sandy snore, ∗snores Sandy, and ∗snore Sandy, which violate the requirement of agreement, of linear order, or both. The grammar of English must in some way state explicitly the ordering and agreement requirements that only Sandy snores satisfies.

There is another sense in which to understand well-formedness with respect to Sandy snores, however. Suppose that I said that Sandy snores means “The square root of two is irrational.” While the form of the sentence would not be in violation of the traditional rules of English grammatical form, and the corresponding meaning is coherent, there would be a failure of correspondence between the form and the meaning. The word Sandy does not correspond to the meaning ‘the square root of two’, and the word snores does not correspond to the meaning ‘is irrational’.

In conventional approaches to the relationship between form and meaning, there cannot be a failure of correspondence because the meaning of a string of words is determined by a computation that takes as its input the individual words and their meanings and the syntactic structure in which they are arranged. So, assuming that the meaning of Sandy is s and the meaning of snores is snore′, the syntactic structure tells us to apply the meaning of the verb to the meaning of the NP to get (roughly) snore′(s). This approach to semantic interpretation is called ‘compositional’ or ‘Fregean’ (Partee et al. 1990).
Non-Fregean approaches, such as the one sketched out here, argue that there are aspects of interpretation that cannot be localized in the meanings of the individual words but must be associated with the structure in which they appear (Goldberg 1995, 2003). A classic example is the double object construction, V-NP1-NP2, which conveys the notion ‘transfer possession of NP2 to NP1 by means of V-ing’, whether or not the verb has transfer of possession as part of its lexical meaning.1 The examples in (1) illustrate. Give literally means ‘transfer possession’, so (1a) is consistent with the ‘transfer possession’ meaning associated with the construction. But in (1b) head is a verb that means simply ‘hit with one’s head’.

1 It is of course possible to complicate the syntax in order to represent all aspects of meaning in a cryptocompositional format, by positing phonologically empty elements that have the noncompositional components of the meaning. A constructional approach applies Occam’s Razor to arrive at a naive, simple syntactic representation in the spirit of Culicover & Jackendoff (2005), one in which such ad hoc meaning-bearing empty elements are ruled out in principle.


‘Transfer possession’ in this case is part of the interpretation in virtue of the construction itself, not the meaning of the verb.

(1) a. Sandy gave Chris the money.
b. Sandy headed Chris the ball. (= ‘Sandy transferred possession of the ball to Chris by hitting it with his/her head.’)

On a constructional approach, then, well-formedness applies not just to the organization and morphological form of a string of words, but to a phonological string—in this case, a string of words—with a corresponding interpretation. A representation consisting (minimally) of a string of words and a meaning is a construct. In order for a construct to be well-formed, it must be licensed. That is, its form must satisfy the phonological, morphological, and syntactic conditions and its meaning must satisfy the semantic conditions imposed by the constructions that constitute the grammar, as in the Sandy snores, etc. examples discussed above. Such a correspondence can be very general, as in the case we have just been considering, or it can license an idiomatic expression or a single word.2

An interesting and important property of the constructional approach that is elaborated in Chapter 3 is that constructions are ‘elastic’: the licensing conditions for a construction can be expanded or shrunk to cover larger or smaller sets of elements or categories, to account for variation or to characterize change. So dialects and even individual grammars may differ just in terms of whether or not a particular lexical item participates in a given construction (cf. give NP1 NP2 vs. ∗donate NP1 NP2).3 Furthermore, it is possible to have multiple constructions in a grammar that overlap or conflict. Under such circumstances, native speaker competence is what we would expect if there were ‘multiple grammars’ (Kroch 1989) competing with one another in the social network.
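The non-Fregean contribution of the double object frame in (1) can be sketched computationally: the construction itself supplies the ‘cause to have’ layer of meaning, while the verb contributes only the means. This is my own toy illustration; the CS representations are ad hoc strings, not a worked-out semantic proposal.

```python
# Toy sketch of constructional meaning in the double object frame
# V-NP1-NP2 (illustrative only).
VERB_MEANING = {"give": "give'", "head": "hit-with-head'"}

def interpret_double_object(agent, verb, np1, np2):
    """The frame supplies 'cause to have'; the verb supplies only the means."""
    return (f"cause'({agent}, have'({np1}, {np2}), "
            f"by-means-of: {VERB_MEANING[verb]})")

# (1a) Sandy gave Chris the money.
print(interpret_double_object("Sandy", "give", "Chris", "the money"))
# (1b) Sandy headed Chris the ball: the transfer meaning comes from the
# construction, since 'head' means only 'hit with one's head'.
print(interpret_double_object("Sandy", "head", "Chris", "the ball"))
```

The key property is that the transfer-of-possession predicate appears in the output for (1b) even though it is no part of the lexical meaning of head; it is keyed to the frame, not the verb.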
But it is not necessary to appeal to multiple grammars if single grammars can incorporate multiple 2 Note that many constructionalists prefer to reserve the term ‘construction’ for syntactically complex correspondences that involve non-compositional meaning. 3 The ditransitive form with donate is generally claimed by generative grammarians to be impossible, but is arguably not ruled out on principled grounds. In fact, the following appears quite innocently in the 1901 Transactions of the Annual Convocation of the Royal Arch Masons, Grand Chapter (Mich.). (i) If not, can we donate him the amount of the fee, so that he can pay for his degrees, and if so should it be done previous to the petition? And here is an example where the indirect object is a full NP. (ii) The library now is looking for some organization to donate the library a subscription to the Denton Record-Chronicle, so that a bound file may be kept of it …(https://www.newspapers.com/ newspage/11158344/) For discussion of the elasticity of constructions, see Goldberg (2019).

OUP CORRECTED PROOF – FINAL, 20/6/2021, SPi

2.3 a framework for constructions 19 constructions with licensing conditions of varying generality. It is sufficient to assume that there are multiple constructions that are competing, not the grammars (Henry 2008, 274). For elaboration of this point, see Chapter 4.

2.3 A framework for constructions

Let us now consider in more detail what a theory of constructions looks like. A theory of constructions is a theory of licensing of correspondences, not a theory of syntactic derivations or of constraints (Michaelis 2012). The goal is to explain what determines the well-formedness of an arbitrary construct in a language, that is, a form/meaning correspondence representing a token of the language, in terms of the constructions that comprise the grammar of that language.

2.3.1 Representing constructions

Following the Parallel Architecture of Jackendoff (2002), I take a construction to be a correspondence between elements of four ‘tiers’: phon, syn, gf (for ‘grammatical functions’), and cs (for conceptual structure). Each tier has its own primitives, rules of combination, and constraints. The well-formed expressions of a language are just those that exemplify the correspondences between the tiers that are stipulated by the constructions of the grammar. A construction is a generalization over such correspondences that states what correlated properties of representations on the tiers determine the well-formed expressions of the language. A construct that satisfies the conditions of a construction is said to be licensed by that construction.

Crucially, phon is responsible for all aspects of overt form, such as the pronunciation of individual words, their morphological inflections, and their temporal ordering.⁴ syn is responsible for the hierarchical organization of the words and phrases. I assume that linear order is represented only in phon, and not in syn.⁵ This is a point that calls for some elaboration, and I return to it in more detail in section 2.3.3.

⁴ A complication that I set aside here is the proper treatment of prosody.

⁵ This approach to constructions assumes that linear order is a phonological property, while morphosyntactic category, constituency, and hierarchical structure are represented in syn. The separation of phonological and syntactic properties echoes one originally made by Curry (1963). It was not assumed in early MGG, but is seen in a number of other approaches, such as HPSG and LFG. For additional discussion of this and related points, see section 2.3.3.


To illustrate the kinds of constructions that can be defined in these terms, I formulate a series of examples of increasing complexity. I start with individual lexemes, look at several idioms, and work up to phrasal constructions.

Individual lexemes are minimal constructions. Each lexeme specifies a correspondence between a string of sounds, morphosyntactic properties, and a meaning, and this correspondence must be represented explicitly in the lexicon of the language. For example, the lexical entry of the verb kick is (2).

(2) kick
    ⎡ phon  ΦEnglish(kick) = /kɪk/1
    ⎢ syn   [V kick]1
    ⎣ cs    [λy.λx.kick′1(agent:x, theme:y)]

I assume for convenience that the interpretation consists of a primitive concept, here represented by the word marked with a prime, following common practice in formal semantics. Thus, I do not address directly the lexical semantics of the individual words—that is a major topic in its own right, quite apart from the current focus. The interpretation includes as well thematic relations such as agent and patient that distinguish the arguments. The decomposition of the thematic relations into more basic thematic features, along the lines of Dowty (1991), is discussed at some length in Chapter 5.

There are two parts to the correspondence for kick that we must consider, the phon-syn correspondence and the syn-cs correspondence. As discussed briefly in Chapter 1, the entry in (2) is approximate, because it does not take into account the fact that the lexeme kick may take different forms depending on its inflection: kick, kicks, kicked, kicking. For irregular verbs, the set of forms is larger, e.g. go, goes, went, gone, going. This information is captured by the paradigm function Φ, applied to the lexical item and its inflectional features.

In order to simplify the exposition and develop the main intuitions of our approach to constructions, it will be useful to adopt several notational conventions.
As shown in (2), the corresponding parts of a construction on different tiers are coindexed. The terms in a construction may be constants, as in (2), or they may be variables, when categories are involved. For the phonological constants, I use the orthographic form in phon instead of the paradigm function or its phonetic value. Thus, (2) would be written as (3).


(3) kick
    ⎡ phon  kick1
    ⎢ syn   [V kick]1
    ⎣ cs    [λy.λx.kick′1(agent:x, theme:y)]

For constructions that state correspondences between categories in syn and their representations in phon and cs I use just the indices. For example, consider a construction in language L where the verb is inflected. Letting V denote the verb and φ its inflectional features, we have the construction in (4).

(4) V in language L
    ⎡ phon  ΦL(1,2)
    ⎣ syn   V1[φ2]

To simplify, I use the index itself in phon to refer to the value of the paradigm function applied to the item with that index; other indices may correspond to relevant morphosyntactic properties. Thus, the content of (4) is written as (5).

(5) V in language L
    ⎡ phon  1(2)
    ⎣ syn   V1[φ2]

Finally, many constructions involve complex correspondences between phonological form and items with inflectional and syntactic features. These items are easiest to represent as attribute-value matrices, e.g. the attribute category has the value V, the attribute person has the value 3rd, etc. I represent the lexical identity as an attribute whose value is the particular lexical item; this is the lexeme identifier, or lid (a notion adapted from Sag 2012; see also Stump 2001; Spencer 2013; Harley 2014; Blevins 2016).

(6) ⎡ category  V
    ⎢ lid       kick
    ⎢ person    3rd
    ⎢ number    sg
    ⎣ tense     pres

When the full detail is not needed, I continue to use representations like [V kicks].
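To make the role of the paradigm function concrete, here is a toy sketch in Python. The encoding is entirely my own: the feature keys (`"past"`, `"prog"`, etc.) and the irregular-form table are illustrative assumptions, not part of the book's formalism, which treats Φ abstractly.

```python
# Toy paradigm function Phi_L: spell out a lexeme plus its inflectional
# features as a phon form. Regular lexemes get a suffix; irregular
# (lexeme, features) pairs override the regular pattern, as with 'go'.
# Feature labels and the tables below are illustrative assumptions.

REGULAR_SUFFIX = {"base": "", "pres3sg": "s", "past": "ed", "prog": "ing"}

IRREGULAR = {
    ("go", "past"): "went",
    ("go", "pres3sg"): "goes",
    ("go", "perf"): "gone",
}

def phi_english(lid, features):
    """Spell out lexeme identifier `lid` under inflectional `features`."""
    if (lid, features) in IRREGULAR:
        return IRREGULAR[(lid, features)]
    return lid + REGULAR_SUFFIX.get(features, "")

print(phi_english("kick", "past"))  # kicked
print(phi_english("go", "past"))    # went
```

This is why notation (5) can abbreviate phon as 1(2): the index 1 stands for the lexeme handed to the paradigm function, and 2 for its feature bundle.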


Now let’s look at idioms. The representation of the multiword expression by and large is given in (7). The ordering of the individual words is critical and must be specified in the construction in phon; cf. *large and by. I use ‘–’ to represent strict ordering in phon, and > to represent relative ordering. As noted earlier, I assume that there is no ordering in syn, and that syn represents only the hierarchical structure, the categories, and the morphological features (see section 2.3.3).

(7) Construction: by and large
    ⎡ phon  [by2–and3–large4]1
    ⎢ syn   [ADV [P by]2, [CONJ and]3, [ADJ large]4]1
    ⎣ cs    mostly′1

The construct *large-and-by with the meaning ‘mostly’ is not licensed by this construction, since while it has the correct syn and cs, it does not have the correct phon.

Consider next the idiom kick the bucket. The stipulation that ‘kick’ precedes ‘the bucket’ should be cost free in the syntactic description, because this is a perfectly normal VP in English and thus corresponds in a regular way to the syntactic structure (see (11)). In the case of idioms with normal structure, the linear ordering is licensed by the appropriate ordering conditions for the corresponding syntactic categories, as are the grammatical functions. The phonological form of the verb, as in kicks the bucket, kicked the bucket, kicking the bucket, etc., does not need to be specified in the construction itself, since it follows independently from the paradigm function ΦEnglish. The only thing that is necessary to distinguish the idiom from the literal expression with the same form is the identification of the set of words that constitute the idiom with the corresponding meaning. So we can represent kick the bucket in the lexicon as in (8). The lids are represented by the lexemes in italics. GF4 is the grammatical function of the bucket.

(8) kick the bucket
    ⎡ syn  [VP [V kick]1, [NP [ART the]2, [N bucket]3]4]5
    ⎢ gf   GF4
    ⎣ cs   [λx.die′5(experiencer:x)]

Next consider sell down the river, which is partially lexically specified, but contains a variable term.


2.3 a framework for constructions 23 (9) sell NP down the river ⎡syn [VP [V sell]1 , NP2 , [PP [P down]5 , [NP [ART the]6 , [N river]7 ]8 ]3 ]4 ⎤ ⎢gf GF2 ⎥ ⎢ ⎥ [λy.λx.betray′ 1+3 (agent:x, theme:y)(2′ )]4 ⎣cs ⎦ There is no phon defined for this construction, because it is built around a VP. As before, it is incumbent on this construction to specify the hierarchical structure on the syn tier, but not the order of elements. Other constructions, such as those that license VP, NP, and PP, license the order of elements and the grammatical functions. Thus, the ordering 1–2–3 is just one possible ordering in VP (10a). Another is 1–3–2 (10b), which is licensed if 2 corresponds to a ‘heavy’ NP, as discussed in Culicover et al. (2017). Sandy (10) a. They sold { them } down the river. the people who trusted them ?Sandy b. They sold down the river { ∗ them }. the people who trusted them This ordering is the responsibility of a construction or set of constructions that link the order of constituents in the VP to information structure and other factors (Wasow 2002; see also Culicover & Winkler 2008). An example of a fully general construction is the one that licenses VP in English. This construction is not specified for any particular V or set of Vs.⁶ The construction says that a VP may consist of a V followed by other, possibly null, material; it has the form in (11). The phon of (11) says that any daughter in VP follows V.⁷ Since for each VP the interpretation is determined by the semantics of the verb, we do not specify the CS in the formulation of this particular construction. (11) Construction: VP-initial V phon [1>2]3 [ ] syn [VP V1 , X2 ]3 ⁶ Strictly speaking, the notation should distinguish between V as the category of an individual lexical item and V as a variable over items in this category. I have chosen not to complicate the notation, leaving it to the different contexts in which the category symbols appear to distinguish them. 
⁷ A more comprehensive account might reflect the fact that this condition may be violated by preverbal adverbs, as in Sandy will completely fail the test, if such adverbs are daughters of VP. If they form a complex V with the lexical verb, or are attached higher to the VP, then the generalization holds as stated.
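As a rough illustration of how a construction like (11) licenses linear order, consider the following sketch (my own encoding, not part of the framework): syn supplies an unordered set of indexed daughters, and the construction admits only those phon orders in which the form indexed to V precedes every other daughter's form.

```python
# Sketch of construction (11), VP-initial V: syn is unordered; the
# construction licenses a phon ordering only if V's form comes first.
# The index-based encoding here is an illustrative simplification.

def licenses_vp_initial_v(phon_order, syn_elements):
    """phon_order: list of indices in temporal order.
    syn_elements: dict index -> category (unordered, as in syn)."""
    v_indices = [i for i, cat in syn_elements.items() if cat == "V"]
    if len(v_indices) != 1:
        return False
    v = v_indices[0]
    others = [i for i in syn_elements if i != v]
    # V must precede every other daughter of VP in phon.
    return all(phon_order.index(v) < phon_order.index(i) for i in others)

syn = {1: "V", 2: "NP"}
print(licenses_vp_initial_v([1, 2], syn))  # True:  'kick Fido'
print(licenses_vp_initial_v([2, 1], syn))  # False: '*Fido kick'
```

Note that the check says nothing about the NP itself; as the text goes on to explain, the sisters of V must be licensed by their own constructions.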


Crucially, (11) alone does not fully license an actual VP with non-null sisters of V. The complements and adjuncts must be licensed by other constructions for expressions of the relevant categories, e.g. NP, CP, AP, PP, VP, and so on.

Consider next the GFs. Subject and object are the traditional names for the highest ranked GF and the next highest ranked GF in the GF hierarchy; I assume that they are not primitive universals.⁸ The constructions in (12)–(13), based on Culicover & Jackendoff (2005), explicitly express the correspondence between the subject and object GFs and syntactic configurations. (12) says that the NP daughter of S corresponds to the highest grammatical function in the domain of that S, and (13) says that the NP sister of V corresponds to the next highest function.

(12) Construction: Subject
    ⎡ syn  [S NP1, AUX2, …]
    ⎣ gf   [GF1 (> …)]2

(13) Construction: Object
    ⎡ syn  [VP V, NP1]2
    ⎣ gf   [GF > GF1]2

The Subject construction (12) does not specify the linear position of the subject NP. That is the responsibility of other constructions, such as declarative, which places it before the inflected verb, subject Aux inversion (SAI), which places it immediately after an inflected auxiliary, and special focus constructions, which place it after V (Culicover & Levine 2001; Culicover & Winkler 2008). Note that the Object construction assigns the second grammatical function in the hierarchy to the direct object sister of V.⁹ We also need to assign a GF to the complement of certain prepositions.

(14) Construction: Oblique object
    ⎡ syn  [VP V, [PP P, NP1]]2
    ⎣ gf   [GF > GF1]2

For discussion of the history of this last construction, see section 9.3.

⁸ In fact, the highest ranked GF need not have the same properties across all languages. For discussion, see Chapter 6.

⁹ Following Culicover & Jackendoff (2005), I assume that the structure of this VP is flat.
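The configurational reading of (12) and (13) can be sketched as follows. The tree encoding (nested tuples) and the rank numbers are my own illustrative assumptions; the book states the GF hierarchy relationally, not numerically.

```python
# Sketch of the Subject (12) and Object (13) constructions: GFs are
# ranks in a hierarchy, read off the syntactic configuration.
# Tuple-based trees and integer ranks are illustrative assumptions.

def assign_gfs(tree):
    """tree: ('S', [(cat, content), ...]) with a flat VP, e.g.
    ('S', [('NP', 'Chris'), ('VP', [('V', 'kick'), ('NP', 'Fido')])]).
    Returns (phrase, rank) pairs; rank 1 = highest GF."""
    gfs = []
    _label, daughters = tree
    for cat, content in daughters:
        if cat == "NP":                      # (12): NP daughter of S
            gfs.append((content, 1))
        elif cat == "VP":
            for cat2, content2 in content:
                if cat2 == "NP":             # (13): NP sister of V
                    gfs.append((content2, 2))
    return gfs

tree = ("S", [("NP", "Chris"), ("VP", [("V", "kick"), ("NP", "Fido")])])
print(assign_gfs(tree))  # [('Chris', 1), ('Fido', 2)]
```

As the text emphasizes, nothing here fixes linear position: word order is the business of other constructions (declarative, SAI, focus constructions) that correspond these GFs to orderings in phon.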


The well-known double object construction, exemplified by Chris gave Sandy a book, can be formulated as in (15). The specification means:1′ in cs is the non-compositional part of the meaning; informally, it is the action denoted by the verb by means of which the transfer is accomplished (Goldberg 1995).

(15) Construction: Double object
    ⎡ phon  [1–2–3]4
    ⎢ syn   [VP V1, NP2, NP3]4
    ⎢ gf    [GF > GF2]4
    ⎣ cs    [λz.λy.λx.transfer′(source:x, goal:y, theme:z, means:1′)(2′)(3′)]4

The application of this construction to (Chris) gave Sandy a book is shown in (16). s and b are the cs representations of Sandy and a book, respectively.

(16) gave Sandy a book
    ⎡ phon  [gave1–Sandy2–a-book3]4
    ⎢ syn   [VP V[give, past]1, NP[Sandy]2, NP[a book]3]4
    ⎢ gf    [GF > GF2]4
    ⎣ cs    [λz.λy.λx.transfer′(source:x, goal:y, theme:z, means:give′1)(s2)(b3)]4
            ⇒ λx.transfer′(source:x, goal:s2, theme:b3, means:give′1)
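The curried cs application in (16) can be mimicked directly with nested Python closures. This is only a mechanical illustration of the lambda reduction; the argument-application order (theme first, then goal, leaving the source open) is simplified to match the reduced form shown in (16), and the dictionary encoding of transfer′ is my own.

```python
# Curried transfer': lambda z . lambda y . lambda x .
#   transfer'(source:x, goal:y, theme:z, means:...)
# Applying it to the theme and goal leaves a one-place predicate over
# the source argument, as in the reduction step in (16).

def transfer(means):
    return lambda z: lambda y: lambda x: {
        "pred": "transfer",
        "source": x, "goal": y, "theme": z, "means": means,
    }

# gave Sandy a book: theme b, goal s; the source stays open.
open_over_source = transfer("give")("b")("s")
print(open_over_source("c"))
# {'pred': 'transfer', 'source': 'c', 'goal': 's', 'theme': 'b', 'means': 'give'}
```

Saturating the last argument with c (the cs representation of Chris) yields the fully specified transfer relation, just as combining (16) with the Subject construction would.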

Note that I do not associate a GF with the direct object in the double object construction. This is because there are no grammatical phenomena in English, such as passive, that refer to this constituent. The subject argument of the passive in Standard English corresponds to the object that is closest to the verb in the active. So, when there are two objects, only the first object can become a passive subject.¹⁰

¹⁰ However, in some dialects, and particularly the style of the King James Bible, the second object may become the subject of the passive. This is most common when the first object is a pronoun.
(i) a. Therefore I prayed, and prudence was given me; I pleaded and the spirit of Wisdom came to me.
    b. Do not neglect the gift you have, which was given you by prophecy when the council of elders laid their hands on you.
    c. So they took the bull which was given them, and they prepared it . . .
    d. They were given this land on long-time payments, the water was supplied, every possible assistance was given those people.
For such dialects we could assign GF2 to the direct object when the indirect object is pronominal. Plausibly, the analysis of the VP in such cases would not treat the pronominal argument as NP2, but as a clitic adjoined to V, in which case the object construction in (13) would suffice; alternatively, there might be a separate construction for pronominal direct objects, as well as full NPs as in (i.d), in which both arguments have GFs. I leave the question open here.


(17) a. Chris gave Sandy a book.
     b. Sandy was given a book.
     c. *A book was given Sandy.

(18) a. Chris gave a book to Sandy.
     b. A book was given to Sandy.
     c. *Sandy was given a book to.

(19) a. Chris talked to Sandy about Lee.
     b. Sandy was talked to about Lee.
     c. *Lee was talked to Sandy about.

(20) a. Chris talked about Lee to Sandy.
     b. Lee was talked about to Sandy.
     c. *Sandy was talked about Lee to.

This type of correspondence between active and passive is central to accounting for phenomena that have been described in terms of meaning-preserving transformations in classical MGG. In a constructional framework, we can state two constructions as schemas, and express the relationship between them by indexing the parts of the schemas that they share. When two constructions share CS representations and differ only in syntactic structure and corresponding phonological form, they have the appearance of classical transformations. But the correspondences may be less exact, and it is in principle possible for them to have different representations on the cs tier.

For example, the construction that licenses yes-no questions in English is related to the construction that licenses declaratives. There are two differences between the two constructions. First, the inflected Vaux is in initial position in the question, and in second position, after the subject, in the declarative. Second, the question has an interrogative interpretation, while the declarative does not. So we can represent the relationship between the two constructions as in (21), where the superscripts x, y mark the terms that correspond across the constructions. The schemas are sister schemas in the sense of Jackendoff & Audring (2020).¹¹

(21) Correspondence: declarative ⇔ yes-no question
     ⎡ phon  [1–2–…]3
     ⎢ syn   [S NPx1, Vy[finite]2, …]3
     ⎣ cs    3′
     ⇔
     ⎡ phon  [2–1–…]3
     ⎢ syn   [S NPx1, Vauxy[finite]2, …]3
     ⎣ cs    Q(3′)

¹¹ Ray Jackendoff (p.c.) suggests that such constructional correspondences are realizations of the original notion of ‘transformation’ due to Harris (1951, 1957).
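The sister-schema correspondence in (21) can be sketched as a mapping between representations: the question's phon swaps the first two indexed terms of the declarative's phon, and its cs wraps the declarative's cs in Q(…). The dictionary encoding below is my own simplification for illustration; it is not the book's formal machinery.

```python
# Sketch of the declarative <=> yes-no question correspondence (21):
# phon [1-2-...] corresponds to [2-1-...], and cs 3' to Q(3').
# The flat-list phon and string cs are illustrative assumptions.

def to_yes_no_question(declarative):
    phon = declarative["phon"]
    return {
        "phon": [phon[1], phon[0]] + phon[2:],  # invert subject and Vaux
        "cs": ("Q", declarative["cs"]),         # interrogative wrapper
    }

decl = {"phon": ["Sandy", "will", "snore"], "cs": "snore'(s)"}
print(to_yes_no_question(decl))
# {'phon': ['will', 'Sandy', 'snore'], 'cs': ('Q', "snore'(s)")}
```

Because the two sides of (21) differ on the cs tier as well as in phon, this is a correspondence between constructions, not a meaning-preserving transformation in the classical sense.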


This relational formulation says explicitly that the question that corresponds to a declarative is one in which the order of subject and Aux is inverted.¹² Similarly, the passive is formulated in (22) as a relation between active and passive. This relational statement says that the argument that corresponds to the second ranked GF in the active corresponds to the first ranked GF when the verb is a passive participle.

(22) Relation: active ⇔ passive
     ⎡ syn  [S …, [VP Vy, {NP1 / [PP P, NP1]}]]2
     ⎣ gf   [GF > GFx1]2
     ⇔
     ⎡ syn  […, [VP Vy[passive]]]2
     ⎣ gf   [GFx]2

This formulation departs from the treatment of passive in Culicover & Jackendoff (2005), and reflects the Relational Grammar intuition that passive is a construction that promotes a 2-argument to a 1-argument, by suppressing the higher argument. Since we do not have to explicitly represent the position of the corresponding GFs in either active or passive in the statement of this correspondence between constructions, we are able to account for passives where there is no overt subject, e.g. in cases of control (23a) and reduced relatives (23b). In these constructions, the highest GF does not correspond to anything in syn.

(23) a. Chris expected to be [VP[passive] selected].
     b. the student [VP[passive] selected as our representative]

See Culicover & Jackendoff (2005, 194ff) for discussion of such correspondences.¹³
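The promotion-by-suppression idea in (22) can be sketched as an operation on the GF hierarchy alone, with no reference to syntactic position. Encoding the hierarchy as a Python list (highest GF first) is my own illustrative assumption.

```python
# Sketch of relation (22): passive suppresses the highest-ranked GF,
# so the active hierarchy [GF_y, GF_x, ...] corresponds to the passive
# hierarchy [GF_x, ...]. The list encoding is an illustrative assumption.

def passivize(active_gfs):
    """active_gfs: GF hierarchy, highest first, e.g. ['chris', 'sandy']
    for 'Chris gave Sandy a book'."""
    if len(active_gfs) < 2:
        raise ValueError("passive needs a second-ranked GF to promote")
    return active_gfs[1:]   # suppress the highest GF; next one is promoted

print(passivize(["chris", "sandy"]))  # ['sandy'] -> 'Sandy was given a book'
```

Since the operation never mentions where the promoted GF is realized in syn or phon, it extends without change to passives lacking an overt subject, as in the control and reduced-relative cases in (23).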

¹² For the distribution of do-support in this and other constructions, see the discussion in section 9.2. The description of more complex constructions such as wh-questions, topicalization, and Germanic V2 is taken up in Chapters 7 and 8.

¹³ This relational treatment of passive also avoids a problem with ‘double passives’ noted by Müller (2013, 925–7) in connection with the treatment in Culicover & Jackendoff (2005).

2.3.2 Licensing

The examples of the preceding section illustrate two key aspects of licensing: (i) each element of each tier of a construct must satisfy some condition of some construction, and (ii) all aspects of the construct must be licensed. For instance, in the construct Chris kicked Fido, the phon tier must represent the appropriate forms for each of the words in the appropriate order, and the meaning must be one in which the relation kick′ holds of the entities c, the CS representation of Chris, and f, the CS representation of Fido. Similarly, Sandy kicked the bucket is licensed by the idiom construction and the more general constructions if die′ is predicated of s, and if the proper linear ordering is observed.¹⁴ While the idea is intuitively simple, the implementation of a definition of licensing that satisfies this description is non-trivial. The most important parts of the definition are given in sections 2.4.1–2.4.3 in the Appendix to this chapter.
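A minimal sketch of the exhaustiveness requirement, clause (ii), flattens a construct to a set of (tier, index) pairs and asks whether every pair is claimed by some construction. This encoding is my own drastic simplification; the actual definition of licensing is spelled out in sections 2.4.1-2.4.3.

```python
# Toy exhaustive-licensing check: a construct is licensed only if every
# indexed element on every tier is covered by at least one construction.
# (tier, index) pairs are an illustrative flattening, not the book's
# formal representation.

def is_licensed(construct_elements, constructions):
    """construct_elements: set of (tier, index) pairs in the construct.
    constructions: list of sets of (tier, index) pairs each licenses."""
    covered = set().union(*constructions) if constructions else set()
    return construct_elements <= covered

construct = {("phon", 1), ("phon", 2), ("syn", 1), ("syn", 2), ("cs", 1)}
kick_lex = {("phon", 1), ("syn", 1), ("cs", 1)}   # licensed by 'kick'
fido_lex = {("phon", 2), ("syn", 2)}              # licensed by 'Fido'

print(is_licensed(construct, [kick_lex, fido_lex]))  # True
print(is_licensed(construct, [kick_lex]))            # False: Fido unlicensed
```

The False case illustrates clause (ii): even with kick fully licensed, the construct as a whole fails if any element on any tier instantiates no construction.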

2.3.3 Linear order

In section 2.3.1 I noted that in the current framework, linear order is represented only in phon, while syn is where hierarchical structure is represented. This does not mean that syntactic structure has nothing to do with linear order. In fact, licensing of linear order is largely dependent on syntactic structure—more so in some languages than others, of course—and it would be impossible to state generalizations about linear order in a language without reference to representations in syn. The nature of this relationship is a complex one with a long history. Documenting it adequately could easily occupy a monograph on its own. In order not to depart too much from the current narrative, I devote just this brief section to the question and leave a fuller discussion to another venue.

To begin, it is a truism that linear order is associated with particular phrase structure configurations in many languages. For instance, in English the verb is initial in the VP. So we could say, following a long tradition, that the syntactic structure for VP in English represents the initial position of V, and this corresponds redundantly to the linear order of the corresponding elements in phon. The syntactic representation would essentially be the familiar [VP V XP]. Alternatively, we could say that the syntactic structure does not represent linear order. The order is defined through the correspondence with phon, which does represent linear order. The syntactic representation in this case is that of (11), i.e., [VP V, XP]. Only correspondences in which the portion of phon that corresponds to V precedes the portion of phon that corresponds to the rest of VP are licensed.

It may appear, then, that the information about the correspondences between linear order in phon and syntactic structure in syn can be represented in more than one equivalent way. However, this is not entirely true. If we encode linear information in syn, then we are assuming that there is a general correspondence between linear order and structure. That means that each aspect of linear order must be associated with a particular syntactic configuration. The most extreme version of this assumption is that of Kayne (1994), where linear order is completely determined by asymmetric binary branching.

There are problems with this tight association between structure and order. While V is necessarily initial in a non-idiomatic English VP, the ordering of other constituents is sensitive to focus and discourse properties (Rochemont & Culicover 1990), as well as computational complexity and prosody (Hawkins 1994; Wasow 2002). The assumption that there is a single fixed ‘underlying’ order for the constituents of VP in syn requires movement to produce multiple syntactic surface structures. (In the terminology of classical generative grammar, each constituent must move to a ‘landing site’ that corresponds to its position in the linear order, and the movement must be ‘checked’ in some way in order to be licensed.) These movements, and the structures that trigger them, are largely unmotivated beyond the need to derive various orders from a single source (Rochemont & Culicover 1997; Chomsky et al. 2019). So, for example, a constituent in an A′ position in English must be in the Specifier of the phrasal projection of some head. In the case of topicalization, illustrated in (24), there must be two such heads. One, H01, which is higher than C0, is for main clauses. The other, H02, which is lower than C0, is for subordinate clauses.

(24) a. At the party H01 what [C0 did] you drink?
     b. I said [C0 that] at the party H02 I drank only beer.

These two structures are certainly logically possible. In fact, they have been reified into an entire theory of the left periphery in the Cartography program (Rizzi 2004; Cinque & Rizzi 2008; van Craenenbroeck 2009; Rizzi & Cinque 2016). On this view, every fact about linear order must correspond to an asymmetry in the syntactic structure: if the phonological form corresponding to α precedes that corresponding to β, then α must be higher than β in the tree.

¹⁴ There is a possibility that the ability of an idiom to appear in various structures, e.g. passive, correlates with the transparency of the correspondence between syn and cs, as suggested by Nunberg et al. (1994); see also Sag (2012). Similar considerations may account for the degree of productivity in derivational morphology; see Jackendoff & Audring (2020).


Requiring movement of phrases to Specifier positions requires a head, or at the very least some kind of trigger, for every such movement. A constituent should be able to move to any arbitrary landing site of the appropriate type, and must be constrained from doing so by supplementary principles, assumptions, and stipulations. This is a particular concern, for example, of Rizzi (2007, 2014), who needs to find some way (‘criterial freezing’) of blocking a constituent that has moved to an A′ position in order to agree with a head from moving still higher in the structure.

Moreover, movements from a fixed underlying order are not required in order to state the correspondence between focus and other properties that are sensitive to order and the ordering in phon. Take, for example, the Heavy NP Shift construction of English illustrated in (25).

(25) Chris put on the table [NP the groceries that we need to make dinner].

The VP-final NP is a focus (Rochemont & Culicover 1990), as captured by the construction in (26). This construction participates in licensing the correspondence in (27). (IS is information structure, which I represent here in a separate tier.)

(26) ⎡ phon  [1–…–2]
     ⎢ syn   [VP V1, NP2, …]
     ⎣ is    focus(2)

(27) ⎡ phon  [Chris4–put1–[on–the–table]3–[the groceries that we need to make dinner]2]5
     ⎢ syn   [S [NP Chris]4, [VP [V put]1, [NP the groceries …]2, [PP on the table]3]]5
     ⎢ cs    [put′1(c4, g2, [on′(t)]3)]5
     ⎣ is    focus(g2)

The issue becomes more acute when we consider languages such as Russian in which constituent order is relatively free, and polysynthetic languages (e.g. Plains Cree, cf. section 6.3) in which arguments and adjuncts of V may be incorporated into a single word. In the first case, it is most natural to assume a single hierarchical structure with no ordering in syn, and to require that the order in phon reflect the discourse and focus structure, along the lines of (26). In the second case, the linear order in phon is determined by the morphology, and there is no obvious role for representing order in syn as well.

To conclude, I assume throughout that the constituents of a phrase are unordered in syn. Their corresponding forms are ordered in phon.

2.4 Appendix: Formalizing constructions

2.4.1 Representations on tiers

The basic idea of constructional licensing is that a construct is licensed if it can be exhaustively analyzed as a set of correspondences, each of which meets the conditions of some construction—in this case, the construct is said to instantiate the construction. Consider, for example, the construct kick Fido as in (28).

(28) ⎡ phon  /kɪk1 faydo2/3
     ⎢ syn   [VP V[kick]1, NP[Fido]2]3
     ⎣ cs    [λx.kick′1(agent:x, patient:f2)]3

This construct is licensed because the lexical entries for kick (29) and Fido (30) consist of the same correspondences.

(29) kick
     ⎡ phon  /kɪk1/
     ⎢ syn   [V kick]1
     ⎣ cs    λy.λx.kick′1(agent:x, theme:y)

(30) Fido
     ⎡ phon  /faydo2/
     ⎢ syn   NP[Fido]2
     ⎣ cs    f2

The forms in (28) corresponding to the syntactic constituents are in the proper linear order for an English VP, as stipulated by the VP-initial V construction in (11), repeated here.


(11) Construction: VP-initial V
     ⎡ phon  [1>2]3
     ⎣ syn   [VP V1, X2]3

The construct may have other properties that instantiate other constructions. If all of the properties of the construct instantiate some construction or other, we say that these constructions license the construct. In generative grammar terms, they generate it.

Following Jackendoff’s (2002) Parallel Architecture (PA), I assume that each tier has its own well-formedness rules. The phon tier represents the sound of a construct. As indicated earlier, we can sidestep the phonological details by assuming that for every language L there is a paradigm function ΦL that specifies how a lexeme will be spelled out in phon. For languages with complex inflectional paradigms, the paradigm function must take as its argument the lexeme identifier, any predictable information associated with roots, and the morphological features. Leaving aside many of the technical details, we say that a representation on phon is a string of sounds (of a given language). Moreover, we define the relations of precedence (>), immediate precedence (–), and inclusion (,) on representations in phon as follows:

(31) Relations:
     a. Substring: ⊆ =def Given strings 1 and 2, 1 ⊆ 2 iff pronouncing 2 involves pronouncing 1.
     b. Concatenation: – =def For any strings 1 and 2, 1–2 is the result of pronouncing first 1 and then 2, with nothing in between.
     c. Temporal precedence: > =def Given strings 1, 2, and 3, 1 >3 2 iff
        i. 1 ⊆ 3, and
        ii. 2 ⊆ 3, and
        iii. when 3 is pronounced, 1 is pronounced and 2 is pronounced later, and
        iv. ∄4. 4 ⊆ 1 ∧ 4 ⊆ 2 (1 and 2 do not overlap).
     d. Immediate temporal precedence: – =def Given strings 1, 2, and 3, 1 –3 2 iff
        i. 1 ⊆ 3, and
        ii. 2 ⊆ 3, and
        iii. when 3 is pronounced, 1–2 is pronounced.


e. Inclusion: , =def Given strings 1, 2, and 3, [1, 2]3 iff
   i. 1 ⊆ 3, and
   ii. 2 ⊆ 3, and
   iii. ∄4. 4 ⊆ 1 ∧ 4 ⊆ 2

For transparency of representation, I write 1 >3 2 as [1>2]3 and 1–3 2 as [1–2]3 in the statement of constructs and constructions. The concatenation and precedence relations defined in (31) have obvious applications to defining word order. Inclusion is used when a construction requires all of the constructs that instantiate it to include certain strings but does not say anything about how those strings must be ordered. A case in point is free word order in a language like Russian.

Next, consider representations on syn. It is important to keep in mind that even though a construction licenses correspondences between phon, syn, and cs representations, we also have to say independently what constitutes well-formed representations on each tier. These representations do not come 'for free' in any theory, not even a constructional theory. Since word order is represented by phon, it is sufficient to represent syn using a representation that does not impose order.

The syn tier is represented set-theoretically, including multisets. Multisets, notated [ , ], are just like sets in that the order of elements does not matter. For instance, the multisets [a, b] and [b, a] are equivalent. Multisets differ from sets in that the duplication/repetition of elements matters. For example, the sets {a, a, b, b} and {a, b} are equivalent, but the multisets [a, a, b, b] and [a, b] are not.

I assume for convenience here that there are primitive syntactic categories. Each syntactic category is represented as a multiset. Some categories are given in (32) and abbreviated with capital letters. I do not assume that this set is exhaustive or universal.15

(32) syn categories
a. V = verb
b. N = noun
c. VP = verb phrase
d. NP = noun phrase
e. PP = prepositional phrase

15 Where the categories come from, what they are, and how many there are are independent questions that I set aside here. For some relevant discussion, see Kuryłowicz (1965); Emonds (1985); Croft (1991); Culicover (1999); Baker (2003); Wiltschko (2014). In Chapter 4, I discuss how categories might emerge in the course of learning.
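Returning briefly to the phon tier, the relations in (31) are precise enough to prototype. The following Python sketch is my own illustration, not part of the book's formalism: phon-tier strings are modeled as ordinary Python strings, and for simplicity only the first occurrence of a substring is checked, so this approximates rather than fully implements the definitions.

```python
# Toy sketch (not the book's formalism) of the phon-tier relations in (31),
# with "pronouncing" a string modeled as reading it left to right.

def substring(s1, s2):
    """(31a): s1 is a substring of s2 iff pronouncing s2 involves pronouncing s1."""
    return s1 in s2

def concat(s1, s2):
    """(31b): s1-s2 pronounces first s1, then s2, with nothing in between."""
    return s1 + s2

def precedes(s1, s2, s3):
    """(31c), simplified: s1 and s2 both occur in s3, s1 before s2,
    without overlapping (first occurrences only)."""
    if not (substring(s1, s3) and substring(s2, s3)):
        return False
    i = s3.find(s1)
    # s2 must start only after s1 has been pronounced in full
    return s3.find(s2, i + len(s1)) != -1

def immediately_precedes(s1, s2, s3):
    """(31d): pronouncing s3 involves pronouncing s1 and s2 with nothing between."""
    return substring(concat(s1, s2), s3)
```

For instance, in a construct whose phon tier is kick–fido, kick both precedes and immediately precedes fido, while the reverse ordering fails.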


Furthermore, in practice I take a syntactic element to be defined by fixing the values of one or more features, corresponding to specific syntactic and morphological properties. For example, general categorical information is represented as [category category-value], and subcategory information such as person, number, etc. is represented similarly. Consider once again the representation of the lexeme kick. (33) elaborates somewhat the representation in (29).

(33) kick
⎡phon kick1 = /kɪk1/ ⎤
⎢syn [category V, lid kick]1 = V[kick]1 ⎥
⎣cs λy.λx.kick′1(agent:x, theme:y) ⎦

The construction in (33) says that kick is a member of the category V. Importantly, (33) does not say that the syn tier of kick consists of a variable over members of V. There is only one member of the category that is associated with the phon tier kick and the cs tier λy.λx.kick′(agent:x, theme:y). It is the one with the lid kick.

Example (34) shows how the syn tier works for a construct with two lexical items.

(34) Construct: kick Fido
⎡phon [kick1–fido2]3 ⎤
⎢syn [VP V[kick]1, [NP Fido]2]3 ⎥
⎣cs [λx.kick′1(agent:x, theme:f2)]3 ⎦

Focusing on the syn tier, (34) shows that the construct consists of a multiset which is a member of the category VP and which has two elements. The first element, indicated by V, is an element of the category V. Specifically, it is the element V[kick]. The second is an element of the category NP.

In cases such as the plural, this approach allows us to dissociate the syntactic marking, the semantic interpretation, and the form. So, for dogs bark, we can mark the agreement in the syntax using the syntactic feature [number plural], while the pl feature in the semantics is only on dog′, and there is no overt marking on the verb, as shown in (35).


(35)
⎡phon ΦEnglish(dog1, pl2)–ΦEnglish(bark3, 4) = dog1(pl2)–bark3(4) = dog1s2–bark3,4 = /dɔg1z2–bark3,4/ ⎤
⎢syn [S [NP lid dog1, ϕ [number plural]2 ], [VP [V lid bark3, ϕ [person 3rd, number plural, tense present]4 ]]] ⎥
⎣cs [bark′3(agent:dog′1; plural2)]4 ⎦

Some nouns are syntactically plural but semantically singular—the pants are expensive.

(36)
⎡phon ΦEnglish(pant1, pl2) = pant1(pl2) = pants1,2 = /pænts1,2/ ⎤
⎢syn [category N, lid pant1, ϕ [number plural]2 ] ⎥
⎣cs pants′1 ⎦

In this case, there is nothing in cs corresponding to the plural morphology.

Crucially, which categories combine in which ways is determined by the constructions of each language. For present purposes, I assume that the regular syntactic combinations that constitute representations in syn can be expressed in terms of context-free phrase structure rules. In fact, one could assume any syntactic theory for the purpose of specifying syn. I follow the spirit of Simpler Syntax (Culicover & Jackendoff 2005) in seeking to make syn no more complex than it needs to be in order to account for the correspondences.

Although variables over members of syntactic categories do not figure in (33) and (34), we do need such variables. For example, the VP-initial V construction in section 2.3 applies to constructs regardless of the particular verb and complements involved. Thus V in a construction is a variable over members of the category V, unless of course it is a feature of a particular lexeme.

For CS, I represent the cs tier using a higher-order logic with at least the basic types e for individuals and t for truth values, and with the subtype of individuals ε for Davidsonian events. Assume the functional type constructor ⟨ , ⟩. Because semantic frameworks of this sort are widespread, and the details


are not central to the issues I am focusing on, I do not attempt to define the logic here. The only modification is a non-standard representation of thematic roles. Thematic roles can be understood as properties of predicates indexed by argument position or as relations between events and individuals (Dowty 1989, 1991). On the second approach, in e.g. John runs, the predicate run introduces an event, call it e. agent is then a relation between events and individuals. In this case, agent(e,j) is true of the running event. Given suitable assumptions about how thematic roles are defined, for any eventuality-introducing predicate P and individual x, if x is an argument of P, then in principle it is possible to specify x's thematic role in the event introduced by P. For such a translation, the representation could be P(e) ∧ role(e,x). However, to keep things simple, I write P(role:x) to mean the same thing. Thus, for the translation of John runs, ∃e.run′(e) ∧ agent(e,j) and run′(agent:j) are notational variants.
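The equivalence between the two notations is mechanical, which the following sketch spells out. It is my own illustration (Python as metalanguage, with formulas as plain strings), not part of the book's formalism.

```python
# Toy illustration (mine): expanding the shorthand P(role:x) into the
# event-based notation ∃e.P(e) ∧ role(e,x), with formulas as strings.

def expand(pred, roles):
    """Turn a shorthand like run′(agent:j) into '∃e.run′(e) ∧ agent(e,j)'."""
    conjuncts = [pred + "′(e)"]  # the predicate introduces the event e
    conjuncts += ["{}(e,{})".format(role, arg) for role, arg in roles]
    return "∃e." + " ∧ ".join(conjuncts)

# The shorthand for the translation of John runs:
print(expand("run", [("agent", "j")]))  # ∃e.run′(e) ∧ agent(e,j)
```

The same expansion applies to multi-argument predicates, so kick′(agent:c, theme:f) unpacks to ∃e.kick′(e) ∧ agent(e,c) ∧ theme(e,f).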

2.4.2 Connections between tiers

Having defined what representations look like on the various tiers, we now have to be precise about correspondences between elements on different tiers before we can define licensing. Connections between tiers are represented using subscripts. For example, consider again the lexical entry for kick in (33). The subscripts indicate that particular elements of particular tiers are connected. The phon material kick1 (representing /kɪk/) and the syn material [V kick]1 characterize the same grammatical element.
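As a concrete data-structure analogy, one might encode a lexical construction as a record whose tiers are linked by shared indices playing the role of the subscripts. The encoding below is entirely hypothetical (the field names are mine, not the book's):

```python
# Hypothetical encoding (mine) of the three-tier lexical entry for kick
# in (33): each tier is a field, and shared integer indices model the
# connecting subscripts.
kick = {
    "phon": {"form": "kick", "index": 1},                        # /kɪk/ indexed 1
    "syn":  {"category": "V", "lid": "kick", "index": 1},        # V[kick] indexed 1
    "cs":   {"term": "λy.λx.kick′(agent:x, theme:y)", "index": 1},
}

# The connection between tiers is simply consistent co-indexing:
indices = {tier["index"] for tier in kick.values()}
assert indices == {1}  # all three tiers describe the same grammatical element
```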

2.4.3 Licensing via instantiation

Consider next how constructions license constructs. There are two basic relations that hold between constructs and constructions. The first is term instantiation. Here 'term' is used to refer to expressions in constructs and constructions across all tiers and types: variables, functions, sets, etc. Term instantiation holds between a construct and a construction if some term in the construct is an instance of a term in the construction. Two types of term instantiation are defined in (37).

(37) Term Instantiation: A term t in a construct instantiates a term T in a construction iff


a. t is identical to T (instantiation by identity), or
b. t is a value of type α, and T is a λ-bound or free variable of type α (instantiation by substitution).

I illustrate (37b) by considering how the construction for kick (i.e. the lexical entry for kick) is instantiated in the construct Chris kicked Fido, which is given in (38). From this point forward, phonetic representations in phon and the details of morphosyntactic correspondences are omitted when they are not central to the discussion, and syntactic representations are simplified where possible.

(38) Construct: Chris kicked Fido.
⎡phon [chris1–kicked2,3–fido4]5 = /krɪs1-kɪk2t3-faydo4/5 ⎤
⎢syn [IP NP[Chris]1, [VP V[lid kick2, tense past3], NP[Fido]4 ]]5 ⎥
⎣cs [kick′2(agent:c1, theme:f4)]5 ⎦

In the phon tier, the phonological material /kɪk/ in the construct is identical to the phonological material in the lexical construction for kick. Similarly, in the syn tier V in the construct is identical to V in the construction. These descriptions apply, ceteris paribus, to the instantiation of the phon, syn, and cs tiers of the constructions for Chris and Fido. Therefore, only the construction for Fido is provided.

(39) Fido
⎡phon fido ⎤
⎢syn NP[Fido]1 ⎥
⎣cs f1 ⎦

The instantiation of kick in the cs tier of (38) involves both identity and substitution. First, the meaning and argument structure of kick are instantiated by identity. In both the construct in (38) and the kick construction in (33), the meaning of kick is kick′ and its argument structure involves an agent and a theme. In contrast, the variables in the cs tier of kick, y and x, are instantiated by the values f and c, respectively. This is instantiation by substitution. It is acceptable because x and y are variables of type e, and f and c are constants of type e.
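Term instantiation as defined in (37) can also be sketched computationally. The classes and type labels below are my own simplification (semantic types as strings, with 'e' for individuals), not the book's formalism:

```python
# Sketch (my simplification) of term instantiation (37): a construct term t
# instantiates a construction term T iff t is identical to T (37a), or T is
# a λ-bound or free variable and t is a value of the same type (37b).

class Var:
    """A λ-bound or free variable of some type, e.g. y of type e in kick."""
    def __init__(self, name, typ):
        self.name, self.typ = name, typ

class Val:
    """A constant of some type, e.g. the individual f (Fido) of type e."""
    def __init__(self, name, typ):
        self.name, self.typ = name, typ
    def __eq__(self, other):
        return isinstance(other, Val) and (self.name, self.typ) == (other.name, other.typ)

def instantiates(t, T):
    if isinstance(T, Var):                         # (37b): substitution
        return isinstance(t, Val) and t.typ == T.typ
    return t == T                                  # (37a): identity

# f can substitute for the λ-bound variable y of type e in kick's cs tier...
assert instantiates(Val("f", "e"), Var("y", "e"))
# ...and also instantiates the constant f of Fido's own entry, by identity,
assert instantiates(Val("f", "e"), Val("f", "e"))
# but cannot instantiate a variable of a different type:
assert not instantiates(Val("f", "e"), Var("P", "<e,t>"))
```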


The instantiation of the cs tiers of kick and Fido in (38) illustrates an important point. A given term in a construct may instantiate terms in multiple constructions simultaneously. In (38), f simultaneously instantiates f in the construction Fido by identity and the λ-bound variable y in the construction kick by substitution.

Instantiation by substitution is further illustrated by the way in which (38) instantiates the VP-initial V construction, repeated here.

(11) VP-initial V
⎡phon [1>2]3 ⎤
⎣syn [VP V1, X2]3 ⎦

The string /kɪk2t3/ in (38) instantiates the string variable 1 in (11) by substitution. Similarly, /faydo/ instantiates 2 in (11). In the syn tier, the variables V and X are instantiated by V[kick] and NP[Fido], respectively.

Instantiation by substitution, defined in (37b), is defined only over λ-bound or free variables. This restriction prevents an individual constant from instantiating a quantificationally bound variable. For example, assume that every has the CS λP.λQ.∀x.P(x) → Q(x). If (37b) were defined over all variables, then we could substitute j, the interpretation of John, for x. This would result in vacuous quantification, and make incorrect predictions about the interpretation of utterances with every. For this reason, among bound variables, only λ-bound variables can be instantiated by substitution. Free variables are included in (37b) to deal with contextually supplied implicit arguments, such as the implicit location argument needed for the interpretation of local (Partee 1989).

An additional kind of term instantiation is illustrated by the relation between elements of the construct in (38) and elements of the VP-initial V construction. This is instantiation by satisfaction. Instantiation by satisfaction is possible when a relation between terms encoded in a construction holds between terms in a construct.
In this case, the linear precedence relation between the variables 1 and 2 encoded in (11) holds between the terms kick and Fido, which instantiate 1 and 2. Instantiation by satisfaction is defined in (40). It is limited to the phon tier, because allowing it to apply to the cs tier could result in the cs tier of a construct instantiating the cs tier of any construction that it entails.

(40) Instantiation by satisfaction: Given construct c with terms t and t′ in the phon tier, and construction C with terms T and T′ in the phon tier and relation R such that R(T, T′), c instantiates R iff


a. t instantiates T, and
b. t′ instantiates T′, and
c. R(t, t′).

Instantiation by satisfaction is needed primarily to address the mismatch between the kinds of relations available for the phon tier of constructions and those available for the phon tier of constructs. As mentioned in section 2.3.1, the phon tier of a construct can only involve strings and concatenation. However, the phon tier of a construction can express generalizations over constructs using other relations, such as the linear precedence relation. Instantiation by satisfaction specifies how such relations are satisfied in particular constructs.

Now it is necessary to say what it means for a construct to instantiate a construction. This is done in (41).

(41) Construction instantiation: A construct, c, instantiates a construction, C, iff
a. for all terms T in C, there is some term t in c such that t instantiates T (term exhaustion), and
b. for all relations R in the phon tier of C, if R is not instantiated by a term in c, then c instantiates R by satisfaction (relation exhaustion), and
c. all connections between tiers that hold in C also hold in c (consistent co-indexing).

The first two conditions in (41) say that for a construct to instantiate a construction, it must instantiate every term and relation in that construction. The final condition requires that every connection between tiers in the construction must be replicated in the construct. With construction instantiation defined, we are in a position to define licensing in (42).

(42) Construct licensing: A construct c is a licensed construct iff
a. for every term t in c, t instantiates a term T in some construction C, and
b. for every such C, c fully instantiates C.

(42) presents two conditions for a construct to be licensed, or well-formed, in the current framework. The first condition is that every element of the


construct must instantiate some element of some construction. This means, in essence, that nothing in the construct can be outside the grammar of the language. The second condition is that any construction involved in licensing a construct must be fully instantiated in the construct. In other words, constructs cannot pick and choose parts of constructions to instantiate. Either they instantiate the entire construction, or they do not instantiate it at all. This definition of licensing, combined with the constructions posited above in (11), (30), and (39), plus a suitable construction for a sentence, correctly predicts the Chris kicked Fido construct in (38) to be grammatical.

Finally, note that as formulated this system does not differentiate between the correct representation of the construct Chris kicked Fido in (38) and a minimally different representation in which the order of arguments is switched in the cs tier: kick′2(agent:f4, theme:c1). The definitions predict both versions to be acceptable, even though the minimally different cs does not correspond to an acceptable interpretation of Chris kicked Fido. To resolve this problem, it is necessary to add additional complexity to the system, at least for a language like English that uses linear order to distinguish arguments. As discussed in the main body of this chapter, I assume an additional grammatical function tier. This tier connects cs and syn. I also assume that there are constructions in English that stipulate the linear order of subject and object with respect to the verb. Together, these constructions ensure that the mapping between phon, cs, and syn is appropriate.

The definitions of instantiation and licensing developed here provide a mechanism by which a set of constructions licenses a construct without inheritance hierarchies or complex constructions built recursively out of more primitive constructions.
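The interplay of (41) and (42) can be made concrete with a deliberately flattened toy model of my own, in which constructs and constructions are just bags of terms (ignoring tiers, relations, and co-indexing):

```python
# Toy model (mine) of construct licensing (42), flattening away tiers and
# relations: a construct is licensed iff (i) every construct term
# instantiates a term of some construction, and (ii) every construction so
# used is instantiated in full — no picking and choosing.

def fully_instantiates(construct, construction, inst):
    """(41a), simplified: every construction term is hit by some construct term."""
    return all(any(inst(t, T) for t in construct) for T in construction)

def licensed(construct, constructions, inst):
    # Constructions touched by at least one construct term:
    used = [C for C in constructions
            if any(inst(t, T) for t in construct for T in C)]
    # (42a): every construct term instantiates some construction term
    covered = all(any(inst(t, T) for C in constructions for T in C)
                  for t in construct)
    # (42b): every construction used must be fully instantiated
    return covered and all(fully_instantiates(construct, C, inst) for C in used)

inst = lambda t, T: t == T  # instantiation by identity only, for the demo
assert licensed(["kick", "Fido"], [["kick", "Fido"]], inst)
assert not licensed(["kick"], [["kick", "Fido"]], inst)          # partial use
assert not licensed(["kick", "zorp"], [["kick", "Fido"]], inst)  # unlicensed term
```

The two failing cases correspond to the two conditions in (42): a construct that uses only part of a construction, and a construct containing material outside the grammar.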


3 Universals

In this chapter I lay out how a constructional approach to grammar can in principle offer a satisfactory solution to Chomsky's Problem, introduced in Chapter 1: Given that language is a universal creation of the human mind, why are there different languages at all, and why don't we all speak the same language? The approach must address the substantial variation that can be observed in grammar, including local idiosyncrasy, but it must also account for the existence of universals, including non-accidental typological patterns and strong constructional correlations.

The question of where universals come from is addressed in sections 3.1–3.3. Section 3.1 reviews the approach to universals in MGG. Sections 3.2–3.3 develop the constructional alternative. There I argue that the basis for universals is conceptual structure. Conceptual structure comprises a universal set of functions for expressing human thought. Grammar organizes sounds (and gestures) so as to systematically and efficiently express conceptual structure representations.

Along with the universals, we must be able to account for the variation. On a constructional approach, there can be multiple solutions to the task of representing conceptual structure. The constraints of efficiency and simplicity have the consequence that these solutions, while they may differ in detail, tend to reflect the structure of the representations more or less faithfully. Sections 4.1–4.3 of the next chapter look at the problem of identifying the correspondences between sound and meaning from the perspective of the language learner, and show how solutions to this problem in terms of constructions can account for variation and change.

3.1 Classical Universal Grammar

In this section, I review the classical MGG approach to universals. The essential notions are these: (i) there is a universal syntactic 'core', and (ii) there is a restricted, universal set of 'parameters' that define possible variation in this core. I discuss how evidence for a universal syntactic core seen in creoles and

Language Change, Variation, and Universals: A Constructional Approach. Peter W. Culicover, Oxford University Press. © Peter W. Culicover 2021. DOI: 10.1093/oso/9780198865391.003.0003


in language acquisition may rather be understood as evidence for a conceptual core, under the assumption that there is a force that drives speakers to find ways to express aspects of conceptual structure using whatever linguistic devices are available to them.

3.1.1 Core grammar

In MGG, Universal Grammar constitutes the human faculty of language. It is the component of the human mind/brain that explains both the fact that humans have language and the fact that human languages have particular properties. The faculty of language determines the mental representation of language that a learner acquires through exposure to exemplars of the language, and accounts for the capacity of speakers to produce novel utterances. Given that humans have language and no other creatures do, it is plausible to assume that this capacity for language is due to our particular genetic makeup—it is wired into our brains.

A key distinction is that between the internal, mental representation of language and the external manifestation of language in the form of sentences. I take it as given that there is a fundamental distinction between a speaker's knowledge of language and the set of sentences that the speaker can and does produce and understand. Generalizing over a language community—to the extent that this is a coherent notion—there is a fundamental distinction between the knowledge shared by the speakers and the corpus that they produce.1

This said, the solution space defined by this conception of language is vast. It is a singular achievement of contemporary theoretical linguistics that it has been possible to define the problem narrowly enough that a beginning of understanding can be achieved. In order to formulate a hypothesis about what the faculty of language looks like, we must have some idea of what the faculty of language is, how a learner forms mental representations that are candidates for being included in a grammar, what sorts of data are available to and used by a learner in the course of formulating hypotheses about grammar, and how a learner makes corrections if it makes an error.

1 These distinctions are somewhat different from Chomsky’s (1986) notions of E-language and I-language. For Chomsky, E-language is the set of sound/meaning correspondences, and on this view, the grammar “may be regarded as a function that enumerates the elements of the E-language” (p. 20). I-language, on the other hand, is an internalized object in the mind of a speaker (p. 22). For discussion and a critique, see Pullum & Scholz (2010).


Given the vastness of this problem, a natural assumption has been that UG is very restricted. If there were actually only one language wired into the mind, an instinct such as birdsong or bee communication,2 the problem of learning would be trivial—the simple fact of birth, or perhaps exposure to language, would trigger UG, and the knowledge would be activated. All substantive mainstream proposals regarding UG until very recently have tried to keep as close to this ideal as possible. For example, Principles & Parameters Theory (Chomsky 1981; Chomsky & Lasnik 1993) assumes that UG is defined by a single, restricted 'core' grammar that allows for limited parametric variation. Those aspects of grammar that do not belong to the core are part of the 'periphery'. The core is part of the human biological endowment for language, and the value of a parameter is set by the learner on the basis of minimal linguistic input.

Over the years there have been many proposals in the mainstream literature regarding the substantive content of UG. These proposals consist of specific conceptions of linguistic structure, operations on those structures, formalisms for stating rules, constraints on rules, and so on. Chomsky himself has been careful until recently to avoid positing that UG consists of specific rules, mechanisms, or grammatical structures. Rather, it is the computational capacity and the general architecture that are universal. He has also been careful not to adopt any dogma that rules out explanations for universals in terms of computational complexity3 or even other factors (Chomsky 2005). However, Chomsky's most recent proposal appears to restrict UG to a basic mechanism of recursion, i.e. 'merge' (Hauser et al. 2002; Berwick et al. 2013; Bolhuis et al. 2014).4 And Chomsky has been clear that the rules of grammar themselves do not reflect functional considerations such as processing complexity (Chomsky 1972).
Variants of this general view of UG can be found in virtually all work in contemporary syntactic theory that seeks to account for the superficial forms of a given language in terms of syntactic derivations. An early instantiation was the Universal Base Hypothesis, the assumption that all languages are generated by different sets of transformations applying to a universal set of phrase structures (Peters & Ritchie 1969; Wexler & Culicover 1980). More recently, Kayne (1994) proposed that the underlying syntactic structures of

2 Hence the title of Pinker's The Language Instinct.
3 See, for example, Chomsky's (1980) critique of Kintsch (1977), or Chomsky (1976), where he writes "one might propose that once process models are developed we will find that all relevant facts are explained without any abstraction to a rule system that articulates the speaker-hearer's knowledge of his language. This thesis might prove correct . . .".
4 For a general critique of this type of approach to UG, see Pinker & Jackendoff (2005).


all languages are right-branching and binary-branching, and thus head-initial. From these assumptions it follows that languages that appear to be head-final are actually head-initial but have rampant movement to the left.5 And complex head-initial phrases require movement of the head from the most deeply embedded position in the phrase to the highest.6 Related work has been pursued in the Cartography program (Rizzi & Cinque 2016). And research on islands (e.g. Ross 1967; Chomsky 1973) represented early attempts to spell out substantive UG principles. In fact, the vast majority of work in Principles & Parameters Theory and Minimalism assumes that a uniform abstract syntactic structure corresponds to a given interpretive function, and assumes invisible functional heads and movement as needed to discharge features in order to derive observed differences in linear order and other aspects of superficial form, including complex inflectional morphology.7

5 So the derivation of [VP NP-V] is (roughly) [ ] [VP [V NPi]] ⇒ [NPi] [VP [V ti]].
6 So the derivation of what superficially appears to be [VP V NP PP PP] is (roughly) [VP ν1 [NP ν2 [PP1 ν3 [PP2 Vi]]]] ⇒ [VP [[[Vi+ν3]+ν2]+ν1] [NP ti [PP1 ti [PP2 ti]]]] (Larson 1988).
7 Thus, in many respects contemporary minimalism closely resembles the generative semantics of the 1960s and early 1970s, an approach long rejected in MGG (Chomsky 1971). For an extended critique of Uniformity as a principle for justifying grammars, see Culicover & Jackendoff (2005, chapters 2 & 3).

3.1.2 Parameters

Given that languages do vary, the UG solution is to assume that I-language is parameterized. It is certainly true that there are dimensions of variation in grammar, and grammars can be described in terms of different ways in which a particular dimension of variation is realized. The differences can be as dramatic as whether a language employs A′ movement for wh-questions (as in English) or wh-in-situ (as in Chinese). Or they can be as trivial as whether a past participle agrees in a relative clause with the head of the clause in expressions that gloss as 'the tables that he repainted' (it does in French but not in Italian—Belletti & Rizzi 1996, 16).

However, in MGG some parameters—or at least the theoretically interesting ones—were assumed to be part of UG, relatively abstract, and universal. By 'relatively abstract' I mean that a classical parameter in UG is not supposed to be a way of describing an observed superficial difference between languages—it is a 'macroparameter' that is intended to cover a set of correlated differences between grammars. For instance, the Pro-drop Parameter (Rizzi 1982, chapter 2) was intended to capture a cluster of properties that distinguished


languages like Italian that allow null subjects from those like English that do not. For Baker (2008), this interpretation of macroparameters is not empirically supported. On his view, they are even more general and abstract, and still part of UG. They define the 'feel' of a language, e.g. polysynthetic vs. analytic. I return to Baker's perspective in Chapter 10, and discuss how we can capture the 'feel' of a language without assuming that there is a set of parameters that constitutes part of UG.

More recently, MGG has extended the notion of parameters to 'microparameters' (Kayne 2005b) and 'nanoparameters' (Baunaz et al. 2018). Biberauer & Roberts (2017) distinguish 'mesoparameters' as well. Macroparameters are assumed to apply generally to heads, mesoparameters to particular categories of heads, and microparameters to classes of functional heads; nanoparameters are idiosyncratic properties of basic lexical items. With such a flexible notion of 'parameter', we arrive at a notion of constructions. At this point the distinctions are mainly terminological, setting aside theory-specific assumptions about implementation, e.g. that there are abstract functional categories at play.

I argued in Culicover (1999) that these steps from macroparameters to narrower and lexically specific formulations render the otherwise substantive notion of 'parameter' vacuous as a theoretical construct (see also Haspelmath 2008; Newmeyer 2017 and, for a more optimistic view, Boeckx 2011). Variation can be much more specific, and in many cases quite arbitrary, depending on the generality of the CS function that is expressed. And, contrary to what has been conjectured, it is not restricted to properties of atomic lexical items.8 With a sufficient number of auxiliary assumptions, e.g. abstract features, functional heads, etc., any grammatical distinction can be formulated as 'parametric' in some way.
So, when Biberauer & Roberts (2017) conclude at the end of a very interesting study of conditional inversion in English (had I known that . . .) that "[t]his general view, then, is consistent with the conception of different kinds of parameters," I believe that they are assigning an unwarranted theoretical status to the observed variation. Using the term 'parameter' suggests, wrongly in my view, that the variation is an aspect of UG, and not simply a descriptive category.9

8 See discussion of the 'Borer-Chomsky Conjecture' in Baker (2008): "All parameters of variation are attributable to differences in the features of the functional heads in the Lexicon." Given the vast power of abstract functional heads (e.g. Cinque 1999), it is difficult to see how such a conjecture could be falsified.
9 Another view of parameters is that the principles that govern grammars are parametrized. For arguments against this view, see Boeckx (2011).


I take up this issue throughout this book, arguing that what we are dealing with when we talk about parameters, mesoparameters, microparameters, and nanoparameters is simply constructional variation of different degrees of generality and involving different dimensions of variation. Strictly speaking, there are no universal parameters in the sense discussed above. To avoid confusion, instead of 'parameter', I use the terms dimension and variant to describe constructional variation.

3.1.3 UG and emerging grammars

A particularly instructive, albeit dated, proposal about the substantive content of UG is Bickerton (1984), who proposed a universal set of grammatical rules as comprising part of UG. It is particularly useful to see how the notion of UG is linked to the way language is manifested in newly created languages, that is, creoles. "Creole similarities stem from a single substantive grammar consisting of a very restricted set of categories and processes, which will be claimed to constitute part, or all, of the human species-specific capacity for syntax" (178). The rules of this universal grammar are given in (1).10

(1) a. S′ → COMP, S
    b. S → N‴, INFL, V‴
    c. N‴ → ({S′ / Determiner}), N″
    d. N″ → (Numeral), N′
    e. N′ → (Adjective), N
    f. V‴ → V″, (S′)
    g. V″ → V′, (N‴)
    (Bickerton 1984, 179)

Bickerton argues that creole languages display certain traits that can be described using the rules in (1). A major feature of Bickerton's proposal is that the creole grammar is restricted to certain syntactic categories, such as N and V. Bickerton's proposal is a somewhat extreme example of the standard view about UG, which is that there is a set of specific grammatical principles, structures, and devices that constitute the biologically encoded, human endowment

10 I have slightly modified Bickerton's notation.


for language. Proposals with very different content, but offering the same general perspective, can be found throughout the literature. These range from Bickerton's, to the Conditions Framework (Chomsky 1973), to the 'Toolkit Hypothesis' of Jackendoff (2002) and Culicover & Jackendoff (2005),11 to the Merge operation in the Minimalist Program (Chomsky 1995b, 2005, 2015).

What we have to ask ourselves is where this knowledge, to the extent that we have characterized it more or less correctly, resides. It is certainly logically possible that rules such as those in (1) evolved to constitute the substantive content of core grammar in UG. Or that the language faculty evolved a toolkit of devices for building grammars. Or that there are universal constraints on derivations that govern movement operations. Or that there is a single, universal operation of Merge for composing syntactic representations, and that the remainder of the apparatus of human language is the consequence of experience and general constraints on computation (Chomsky 2005; Friederici et al. 2017).

But a more plausible alternative interpretation, to my mind, is that what is universal is not syntax, in the sense in which the term UG is typically understood. Rather, what is universal is conceptual structure (CS) (Jackendoff 1972, 1983, 1990, 1997, 2002; Culicover & Jackendoff 2005). CS is a representational system for encoding thoughts—thoughts that are as elaborate as any of those that a human might express. All of the fundamental categories of thought are wired in—reference, action, causation, time, etc. Thoughts as represented in CS have structure, which is determined by the CS categories and rules of combination. In addition, I assume that not only do all humans have CS, but they are driven by the need to express thoughts and communicate them. I will call this the Force.
The Force is fundamental, and is most clearly manifested in contexts where communication is necessary. The most dramatic such contexts are those where there is no existing fully functional system that humans can acquire so that the Force can be fully satisfied. The circumstances that lead to the emergence of pidgins and creoles are such contexts, as are those where children are deprived of linguistic input, that is, deaf children of hearing parents who do not expose their children to existing, mature sign languages. These speakers, who arguably have no external linguistic input on the basis of which to construct a system for the expression of CS, seize upon their own ability to form gestures and create signs in order to express their thoughts

¹¹ "The language faculty, developed over evolutionary time, provides human communities with a toolkit of possibilities for cobbling together languages over historical time" (5).


(Senghas & Coppola 2001; Coppola 2002; Senghas et al. 2004; Mylander & Goldin-Meadow 1991; Goldin-Meadow 2005). They invent signs not only to refer to things in the world, actions, and states, but also to the future, the immediate future, and the past. They invent signs to signal questions, imperatives, and negation (Goldin-Meadow 2005, 214). Moreover, they use sequences of signs to express properties of things in the environment. They individuate the signs for things and for properties. The same sign can be used to refer to a thing, when it precedes a verbal sign, or a characteristic action, when it follows a sign that refers to a thing (Goldin-Meadow 2005, 215¹²).

The full sentences that these deaf children produce tend to follow certain regular patterns, but are far from what Bickerton's specific proposal would predict. Typically only a single argument is expressed, and typically, arguments referring to patients (i.e. affected things), recipients, and actors precede the verb (Goldin-Meadow 2005; Progovac 2015, 205). (But other patterns are also seen, as in the case of a child who positioned transitive actors after the verb (Goldin-Meadow & Mylander 1998, 280).) For instance, most if not all sentences contain a single argument, even when the verb is transitive ('hit') or ditransitive ('give'). Thus, it appears that the grammatical systems of home sign are far from those of fully mature languages.

Similarly, it appears that speakers of an emerging language seize upon whatever materials are available to them in the linguistic environment to encode basic CS elements, functions, and relations. A pidgin would be the first step in this process, a common system created by adult speakers of different languages with different vocabularies and rules of grammar, and the need to communicate. A creole would be the next step in the process, where first language learners impose a degree of regularity and generality on the more primitive constructions of the pidgin.
I suggest that creolization reflects the human capacity to elaborate systems by reducing complexity of representation as the systems are used for expression and communication. As Hawkins (1994) and others have argued, grammars reflect the consequences of reducing, even eliminating, expressive effort, but also comprise devices that increase clarity and reduce ambiguity. A closer look at the categories that Bickerton identifies as being common to creoles suggests that this idea is on the right track. He cites the use of verbs

¹² "Thus, the grammatical categories noun and verb are elements within the deaf child's syntactic system and, as such, are governed by the rules of that system, just as nouns and verbs are governed by the rules of syntax in natural language."


meaning 'go' and 'come' to indicate direction, a preposition meaning 'for' or a verb meaning 'go' to indicate intention, the verb 'stay' to indicate duration, the number 'one' as an indefinite article, a verb meaning 'give' to mean 'for', and so on (Bickerton 1984, 175–6, 179). I cite just a few of Bickerton's examples.

(2) Hawaiian Creole
    dei kam in da mawning taim go skul
    'They came to school in the morning.' [go used to express direction]

(3) Hawaiian Creole
    dei wen go ap dea in da mawning go plaen
    'They went up there in the morning to plant [things].' [go used to express purpose]

(4) Hawaiian Creole
    wan taim wen wi go hom inna nait dis ting stei flai ap
    'Once when we went home at night this thing was flying about.' [stay used to express contemporaneous action]

(5) Saramaccan Creole
    a suti di hagimbeti da di womi
    he shoot the jaguar give the man
    'He shot the jaguar for the man.' [give used to express benefactive]

It appears in fact that creoles are created out of the need for humans to find ways to explicitly express more complex CS representations than pidgins allow. They are arguably as expressive in many respects as languages with longer histories, and they become more expressive over time as innovations continue to accrue. The formal devices that they employ reflect to some extent the grammars of the languages that constitute their source, and they develop novel complexity of their own. See, for example, Aboh (2015) and work cited there on complexification in creoles, Senghas & Coppola (2001) and Senghas et al. (2004) on Nicaraguan Sign Language, Baptista (2009) on complexity in Cape Verdean and Guinea-Bissau creoles, and Heine (2005) on reflexives in creoles.

The idea of the Force as driving learners to seek correspondences in the environment for aspects of CS has clear implications for language acquisition in typically developing pre-linguistic children, as well.
The evidence is that the essential components of CS are active as early as it is possible to test children.


For example, Fisher et al. (2010) review research that demonstrates that very young children understand the correspondence between words referring to individuals and abstract thematic roles. In one experiment, for example,

    25-month-olds who heard the transitive sentence 'The duck is kradding the bunny' looked longer at an event in which a duck acted on a bunny than at an event in which the duck and bunny acted independently, while those who heard the intransitive sentence 'The duck and the bunny are kradding' did not. (143)

They conclude that

    Children are biased to represent experience with language in an abstract mental vocabulary, permitting rapid generalization of newly acquired syntactic knowledge to new verbs. (144)

This “abstract mental vocabulary” is CS, and in this particular case, argument structure. All normally developing children have the CS core in which the fundamental components of actions and events are represented. The Force impels them to seek correspondences between these components and whatever communicative devices are available. If they have linguistic input, they seek to form correspondences with this input. If they do not, then they invent their own correspondence devices for this purpose, seizing on whatever is available and reasonably stable in the environment.

3.2 Another conception of universals

The foregoing observations lead to the following proposal: what is fundamentally universal in grammar is CS. It is plausible, then, given the Force, that apparent grammatical universals arise because languages must compute the same relations between sound and meaning. They are subject to the same need to balance effort and explicitness, and arrive at more or less equivalent solutions. Not all CS relations are overtly represented, or represented equally richly, in the grammars of all languages. Taking this perspective allows us to talk about universals without claiming absolute universality of particular grammatical mechanisms. As discussed at greater length in Chapter 4, relative complexity is a key determinant of 'survival' in this arena—the simpler beats


out the more complex, other things being equal (Culicover 2013c). What we see as universals or quasi-universals are relatively economical ways of carrying out the work of encoding CS representations using grammatical form.

There are two criteria that must be imposed on any particular grammar that expresses the form/meaning correspondence. First, it must be sufficient—it must be possible to express every CS function that is essential to human thought and communication. A language that does not provide speakers with a way to ask a question or refer to a person is not sufficient. Second, it must be computable—it must be possible to encode every essential CS function in a way that does not exceed the computational capacities of speakers and hearers.

These two criteria give rise to two kinds of universals. The criterion of sufficiency gives rise to architectural universals—grammars will tend to have much the same categories and combinatorial possibilities, since these match, albeit more or less faithfully, the categories and combinatorial possibilities of CS. For example, the fact that humans perceive that there are things and categories of things in the world, that they are countable, and that they have properties worth remarking upon means that languages will have devices for referring to types and tokens, that they will have ways to note how many things are being referred to, and that they will have ways to distinguish things according to their properties. However, it does not follow that all languages will have exactly the same grammatical categories as English, such as nouns, quantifiers, and adjectives. What is universal is the expression of CS, not the exact syntactic devices for accomplishing this task.¹³

Connected to the criterion of computability is the idea that universals emerge as a consequence of the drive to reduce the complexity of computing form/meaning correspondences—this is what I have referred to as economy.
If there are two or more ways of expressing a form/meaning correspondence that are of different complexity, and if they are in competition, then the simpler ones will drive out the more complex ones, other things being equal. The notion of competition is crucial; in the absence of competition, a complex device for expressing a function of the CS core can survive indefinitely.
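The dynamic just described can be illustrated with a toy simulation (my own sketch; the book develops its own simulations in Chapter 4, and the weighting scheme here is an illustrative assumption): two competing constructions are sampled in proportion to their current frequency discounted by their complexity, and the simpler one comes to dominate even from a minority starting position.

```python
import random

def compete(freq_simple, freq_complex, cost_simple=1.0, cost_complex=2.0,
            generations=200, sample_size=1000, seed=1):
    """Toy model of constructional competition.

    Each generation, learners sample tokens of the two constructions.
    A token's chance of being produced is its current frequency weighted
    by the inverse of its complexity (simpler = cheaper = more likely
    to be reused). Returns the final share of the 'simple' construction."""
    random.seed(seed)
    f_s, f_c = freq_simple, freq_complex
    for _ in range(generations):
        w_s = f_s / cost_simple          # complexity-discounted weights
        w_c = f_c / cost_complex
        p_s = w_s / (w_s + w_c)
        produced = sum(random.random() < p_s for _ in range(sample_size))
        f_s = produced / sample_size
        f_c = 1 - f_s
    return f_s

# Starting from a minority position, the simpler construction still wins:
final_share = compete(freq_simple=0.3, freq_complex=0.7)
print(f"final share of simpler construction: {final_share:.2f}")
```

With the costs reversed, the same model drives the costly variant out instead, which is the "other things being equal" caveat made concrete: the outcome tracks relative complexity, not initial frequency.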

¹³ Wiltschko's (2014) view of categories is similar in many respects to this one. She argues against a universal inventory of categories (cf. Culicover 1999) and against the idea that there are no universals (Croft 2000). Her proposal for a "Universal Spine" purports to be a syntactic universal (she eschews any claims about psychological reality (pp. 6, 24)), but I would argue her proposal is really about the functional organization of conceptual structure: (discourse) linking, anchoring (event coordinates such as person, tense, location, and realis), point-of-view (aspect), and classification (argument structure).


An important corollary of computability is that a form/meaning correspondence that is very difficult to compute will at best be rarely attested in a grammar.¹⁴ Many universals that arguably arise from the pressure to reduce complexity are typological universals of the sort originally explored by Greenberg (1966). Some of these universals have been claimed to be categorical, e.g. Universal 16.

    Universal 16. In languages with dominant order VSO, an inflected auxiliary always precedes the main verb. In languages with dominant order SOV, an inflected auxiliary always follows the main verb.¹⁵

But others are probabilistic, e.g. Universal 17.

    Universal 17. With overwhelmingly more than chance frequency, languages with dominant order VSO have the adjective after the noun.

These have been argued by Whitman (2008) to be typological generalizations, not universals, arising as a consequence of language change. Hawkins (1994, 2004, 2014) argues that the Greenbergian universals are the consequence of a pressure toward reducing the computational complexity of the form/meaning correspondence. On his view, orderings that violate the preferred ordering are more costly to compute. Essentially, the complexity metric is based on the minimal syntactic structure that provides sufficient information to identify the argument structure of the sentence. Paired with a theory of how differential costs of computation play out in the competition among grammars—see section 4.3—such a theory offers an account of why one alternative for expressing a particular CS function is more frequent than the others, as in the case of Universal 17, or even completely universal, as has been claimed for Universal 16.

But, as Hawkins also notes, similar considerations can account for putative universals such as the island constraints of Ross (1967); see also Hofmeister & Sag (2010); Hofmeister et al. (2013); Sag et al. (2007a). Hofmeister et al.

¹⁴ This idea is related to that of 'attainability' (Wexler & Culicover 1980, 35–40): "Let 𝒢 be a class of (possible) grammars. . . . [W]e define the class of attainable grammars as those grammars in 𝒢 that can be learned by [the learning procedure] LP from data of the appropriate kind. . . . [C]onceivably there are possible grammars that are not attainable. These might include, for example, grammars whose form is perfect and appropriate and that in fact are learnable by procedure LP, but for which LP demands very complicated data or information, data not available to the language learner."

¹⁵ But see Dryer (1992, 100) (also Hawkins 2013), who shows that both disharmonic orders are rare, but possible.


(2015) and Culicover & Winkler (2018) make a similar argument for the unacceptability of certain putatively universal 'freezing' configurations. In each case, the complexity of processing renders the particular configuration less acceptable than the alternatives. Hence it is relatively rare, and in the limit, fails to occur.

The upshot of this scenario is that we may be able to refine the Toolkit Hypothesis (Jackendoff 2002; Culicover & Jackendoff 2005) and provide a principled distinction between those properties that are genuine universals, and those that are not. Some of the hypothesized tools appear to be architectural, e.g. X-bar theory: phrases of category XP, headed by lexical category X; syntactic heads map to semantic functions; syntactic arguments (subjects and complements) map to semantic arguments; syntactic adjuncts map to semantic modifiers. These may prove to be universal in the strong sense, since they bear on the sufficiency criterion—arguably, a grammar that lacks one of these tools would be unable to express the full range of conceptual structures. Others appear to reflect computability considerations: e.g. prefer putting all heads on the same side of their complements; topic/given information early; focus/new information late; short constituents early; long constituents late, etc. These would not be universals, but universal tendencies.

3.3 On the notion ‘possible human language’ I consider next the question of how to define ‘possible human language’ in constructional terms. In section 3.3.1 I consider the aspects of CS that constructions must be able to refer to. In section 3.3.2 I illustrate how a constructional formulation is able to deal with a range of ways of expressing one such aspect, negation, in constructional terms, and in section 3.3.3 I do the same for the imperative. Both phenomena show substantial variation crosslinguistically, which has typically been dealt with by MGG through derivation from a uniform underlying syntactic structure. I argue that such derivational analyses are cryptoconstructional, and that the constructional approach is better suited to capturing the variation.

3.3.1 Possible constructions The answer to the question of what constitutes a possible human language in constructional terms must make reference to the forms and functions and the


correspondences between them, not the characteristics of the corpus of expressions. There are three parts to the task of expressing such correspondences: the formal characterization of a correspondence, the possible meanings, and the possible forms. The first is addressed in Chapter 2. For the second, I assume the following are among the major CS properties that a grammar must express. These are CS universals, not grammatical universals.

• propositional content; predication
• reference and coreference
• reflexivity
• concrete and abstract properties
• thematic function
• imperatives (commands, requests, suggestions, etc.)
• questions, both truth value and wh
• negation, including non-existence and denial
• quantification
  • numerosity
  • degree
  • comparison
• binding
• discourse structure
  • given–new
  • information structure (focus, contrast, etc.)
  • speech act participants
  • discourse distance (speaker, hearer, 3rd person, in and out of immediate context)
  • speaker's relationship to veridicality of the propositional content (evidentiality)
  • social distance (kinship, humanness, animacy, etc.)
• definiteness and completeness (and incompleteness and indefiniteness)
• temporal relations
  • to utterance time
  • to time of events referred to
  • aspect
• causation

This is, of course, a partial list, although it does cover a large swath of what people say.


Given the primitive components of CS, such as those above, it is possible to define relations over them of arbitrary specificity. Some of these may be very general and be associated with human concerns independent of particular culture, such as imperatives and questions. Others may be specific to a particular culture and be reflected in unique vocabulary. For example, Evans & Levinson (2009) cite Mithun's (2001) example of a preposition in Karuk that means 'in through a tubular space'. And Everett (2012) has argued that the Pirahã do not use referring expressions that express relatively complex quantification or combinations of descriptive properties. Some languages mark evidentiality grammatically, while others do not (Aikhenvald & Dixon 2003).

On the basis of the occurrence of idiosyncratic constructions in particular languages, Evans & Levinson (2009) hold that "languages reflect cultural preoccupations and ecological interests that are a direct and important part of the adaptive character of language and culture." While this may be true for many of the distinctions that grammars mark lexically and morphologically, it does not entail that there are no universal CS functions. Nor does it contradict the assumption that some of these functions are so central to human experience and human communication that some means of encoding them must be present in every grammar.¹⁶

The third part of the task of expressing the form/meaning correspondence concerns the form. Contrary to Evans and Levinson's arguments, the fact that it is possible to 'package' the CS functions in various ways to express the form/meaning correspondence is fully consistent with the assumption that there is a universal CS.¹⁷ What is crucial, from our perspective, is that there are different ways for a grammar to encode a particular CS function.
It may encode the relation directly, or it may allow speakers and hearers to approach it paratactically or inferentially. If the grammar encodes a relation directly, it must do it in a way that is learnable and computable. Those ways of encoding relations that are difficult to learn or compute are in principle possible, but may not be sustainable, particularly in contact situations. And it is quite possible that a particular CS function is not expressed at all in a particular language. On this view, purely syntactic universals are the accidental consequence of complexity, computability, and centrality to expression applied to universal CS.

¹⁶ For reactions to Evans and Levinson's paper, many of which agree with this point, see the commentary on their article, and the papers in Lingua 120.12 (2010). See also Dryer (1997), for arguments that many putative universals such as grammatical relations and grammatical categories can be accounted for in terms of the "complex interaction of functional and cognitive principles or 'forces' that play a role in language change, in language acquisition, and in language use."

¹⁷ For some important observations about packaging, see Slobin (1987) and Jackendoff (1990).



3.3.2 An example: Negation

To illustrate the central point of the preceding section, let us consider a simple example drawn from the list of CS universals: negation. All languages mark negation.¹⁸ In English, negation is marked by not, which is similar in its distribution to a class of adverbials but in many respects is sui generis. In particular, sentence negation must immediately follow an inflected auxiliary, producing the sequence Vaux-not, e.g. will not. For simplicity, let us say that the CS interpretation of a sentence S that negates some proposition P with this sequence is ¬P.

In some Romance languages sentence negation is marked by a clitic, e.g. ne in French, derived historically from non > nen 'not' (Jespersen 1917, 7). This form was strengthened to ne . . . pas, and ne is dropped in contemporary spoken French. In Spanish and Italian, immediately preverbal no/non mark sentence negation. In Arizona Tewa, negation is marked morphologically on the verb using the template "negative prefix, pronominal prefix, verb stem, tense/aspect (where marked), and negative suffix" (Kroskrity 2010). In Warlpiri, negation of a finite clause is marked by putting the form kula 'Neg' in the second-position AUX (Laughren 2000). In the Salish languages, there are several patterns, including "a nominalized subordinate clause, typically introduced by whichever determiner/complementizer the language employs to introduce non-factive subordinate clauses" (Davis 2001, 56; see also Davis 2005). In Yupik-Inuit, negation is marked using derivational morphology on the verb (Sadock & Woodbury 2018). The cross-linguistic distribution of the main ways of marking negation is shown in Table 3.1.

Table 3.1 Negative marking, from Dryer & Haspelmath (2011)

Value                                          Representation
Negative affix                                 395
Negative particle                              502
Negative auxiliary verb                        47
Negative word, unclear if verb or particle     73
Variation between negative word and affix      21
Double negation                                119

¹⁸ For reviews of the typology of negation, see Miestamo (2000, 2007).
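The relative frequencies implied by Table 3.1 can be computed directly; the quick sketch below (the dictionary simply transcribes the table's counts) confirms the proportions discussed in the text: the particle strategy is the most common, at roughly 43% of the sample, and double negation accounts for roughly 10%.

```python
# Counts transcribed from Table 3.1 (WALS, Dryer & Haspelmath 2011).
negation_counts = {
    "negative affix": 395,
    "negative particle": 502,
    "negative auxiliary verb": 47,
    "negative word, unclear if verb or particle": 73,
    "variation between negative word and affix": 21,
    "double negation": 119,
}

total = sum(negation_counts.values())          # 1157 languages in the sample
shares = {k: v / total for k, v in negation_counts.items()}
for strategy, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{strategy:45s} {share:5.1%}")
```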


Assuming that there is a single CS function being expressed by these various grammatical devices, it is reasonable to ask whether there is any evidence that there is also a uniform, universal, abstract syntactic representation for sentential negation. It is of course possible to stipulate that there is such a representation, say [NegP Neg [IP . . . ]] for concreteness, which corresponds more or less directly to the semantic interpretation. This is what is typically done in MGG. On such an approach, it is then necessary to assume that each language has a particular derivational scheme that spells out this syntactic configuration in the appropriate way. For example, in English the derivation would have to be something like (6), where the subject and the inflected auxiliary I⁰ raise to the left so that they precede the phonological form that spells out Neg, i.e. not.¹⁹

(6) [ . . . [NegP Neg [IP NPi I⁰j . . . ]]] ⇒ . . . [NPi I⁰j [NegP Neg [IP ti tj . . . ]]]
    E.g. [NegP [Neg not] [IP [NP Sandy]i [I⁰ will]j . . . ]] ⇒
         [[NP Sandy]i [I⁰ will]j [NegP [Neg not] [IP ti tj . . . ]]]

Let us consider a very different solution. The World Atlas of Language Structures (WALS) cites Kolyma Yukaghir as a typical example of a language in which negation is marked with a negative affix on the inflected verb. (7) illustrates.

(7) Kolyma Yukaghir (Maslova 2003, 492)
    met numö-ge el-jaqa-te-je
    1sg house-loc neg-achieve-fut-intr.1sg
    'I will not reach the house.'

The derivation of the surface form in (7) from a structure like that assumed in (6) for English would be as in (8); I've omitted most structural details for simplicity.

(8) [NegP [Neg el] [[I⁰ te-je]j [VP jaqai . . . ]]] ⇒
    [NegP [Neg el] [[jaqai+[I⁰ te-je]j]k [VP ti . . . ]]] ⇒
    [NegP [Neg el]+[jaqai+[I⁰ te-je]j]k [tk [VP ti . . . ]]]

¹⁹ To keep things simple, I'm restricting the discussion to declaratives.


Here, the verb has to raise to adjoin to the inflection I⁰, and then the whole package raises to adjoin to Neg.²⁰ The resulting complex structure Neg+[V⁰+I⁰] is spelled out as a single inflected form, presumably listed in a lexicon in case it is not completely analytic. In a language where negation is marked on an auxiliary verb, the derivation would presumably involve Neg combining with some expletive Vaux, and similarly for the other mechanisms for marking negation. So forms like English won't, can't would be derived more or less like (8), with appropriate spelling out.

It should be apparent that deriving the surface form of negation from a uniform syntactic structure is a cryptoconstructional solution to stating the correspondence between this form and the CS representation ¬P. For example, the form in (7), shown in the construct in (9), is licensed by the construction in (10). (Here, as elsewhere, the CS representation ¬4(5′) is understood as the negation operator scoping over the interpretation of 5 without negation.)

(9) phon: […el4-jaqa1-te2-je3…]5
          …neg-achieve-fut-intr.1sg…
    syn:  [S …, [category jaqa1 'achieve', tense te2 'fut',
                 person/number je3 'intr.1.sg', polarity el4 'neg'], …]5
    cs:   ¬4(5′)

(10) phon: […4-1-2-3…]5
     syn:  [S …, [category V1, tense tense2,
                  person/number person/number3, polarity neg4], …]5
     cs:   ¬4(5′)
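The licensing relation between the construct in (9) and the construction in (10) amounts to feature matching with consistent index bindings. The following is a minimal computational sketch of that idea (my own simplified encoding; the book's formalization is in section 2.4, and the dictionary-based representation here is an illustrative assumption): each indexed slot of the construction must be instantiated by a morph of the construct, and the morphs must spell out the construction's linear order.

```python
# A construction pairs a phon template with syn features and a CS schema.
# Integer indices are variables that a construct must instantiate.
NEG_AFFIX_CONSTRUCTION = {             # cf. (10): neg-V-tense-agr
    "phon_order": [4, 1, 2, 3],
    "syn_values": {"polarity": "neg"}, # fixed (non-variable) feature values
    "cs": "neg4(5')",
}

YUKAGHIR_CONSTRUCT = {                 # cf. (9): el-jaqa-te-je
    "phon": "el-jaqa-te-je",
    "morphs": {1: "jaqa", 2: "te", 3: "je", 4: "el"},
    "syn": {"category": "V", "tense": "fut",
            "person/number": "intr.1sg", "polarity": "neg"},
}

def licenses(cxn, construct):
    """A construct is licensed if (i) its fixed syn features match the
    construction, (ii) every variable slot is instantiated by a morph, and
    (iii) the morphs realize the construction's linear order in phon."""
    for feat, val in cxn["syn_values"].items():       # (i)
        if construct["syn"].get(feat) != val:
            return False
    if not all(i in construct["morphs"] for i in cxn["phon_order"]):
        return False                                  # (ii)
    realized = "-".join(construct["morphs"][i] for i in cxn["phon_order"])
    return realized == construct["phon"]              # (iii)

print(licenses(NEG_AFFIX_CONSTRUCTION, YUKAGHIR_CONSTRUCT))  # → True
```

A construct with positive polarity, or with the morphs in the wrong order, fails to be licensed by this construction, which is the sense in which (10) picks out exactly the negative-affix pattern.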

The constructional formulation provides a natural way to capture differences in complexity between alternative ways of expressing negation. The most common device is the negative particle, which explicitly and directly expresses negation that can scope over the S that contains it. For maximal transparency of the correspondence, the particle is attached either at the left edge of the clause, as in (11), or at the right edge.

(11) phon: 1–2
     syn:  [S …, [category V, polarity neg1, …], …]2
     cs:   ¬1(2′)

²⁰ Following the standard MGG assumption that there is no lowering.

The next most common device is the verbal affix, e.g. (10), which requires a certain amount of processing in order to dissociate it from the verb and map it to a scope with respect to the sentence headed by the V that it is attached to. The last is negation in the verbal auxiliary, which must be dissociated from the verbal auxiliary itself and translated into an operator that has scope over the entire clause. Because it involves two separate mechanisms, we would expect the double negation strategy of French to be relatively rare, and in fact the WALS data suggest that it is: double negation appears in about 10% of the cases.

It is therefore interesting to observe that ne is typically dropped in colloquial French, so that only the particle is used: J'ai pas lu ce livre instead of Je n'ai pas lu ce livre (Ashby 1981; Martineau & Mougeon 2003; Armstrong & Smith 2002). This is what we expect under the assumption that, other things being equal, grammars change in the direction of simpler constructions for doing a particular task. (Of course, since other things are rarely if ever equal, such movement toward greater simplicity does not occur across the board, and often introduces complexity at the same time on other dimensions.)

But French does not exhaust the possibilities. According to Evans & Levinson (2009), citing Master (1946) and Pederson (1993), and Miestamo (2010), citing Pilot-Raichoor (2010, 268–9), negation in Classical Tamil and other Dravidian languages is marked by zero tense. It is fairly straightforward to state this correspondence in constructional terms. In fact, it is not specified directly in the construction itself—the paradigm function Φ maps the negative polarity feature of V to a zero phonological form in the position of the tense affix. The paradigm is shown in (12), and the construction that licenses it is given in (13).

(12) Old Kannada (Pilot-Raichoor 2010, 268–9)
     a. no:ḍ-uv-eṃ
        see-fut-1sg
        'I will see.'


     b. no:ḍ-id-eṃ
        see-pst-1sg
        'I saw.'
     c. no:ḍ-eṃ
        see-1sg
        'I do/did/will not see.'

(13) phon: […Φ(3,1,4)…]2
     syn:  [S …, [category V3, polarity neg1, tense tense4, …], …]2
     cs:   ¬1(2′)
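The role of the paradigm function Φ in (13) can be sketched as follows (a toy implementation of my own, using the Old Kannada affixes from (12) in a simplified ASCII transliteration): Φ spells out the tense slot only under positive polarity, so negation surfaces precisely as the absence of a tense affix.

```python
# Toy paradigm function Phi for the Old Kannada data in (12):
# the tense affix is realized only when polarity is positive; under
# negation the tense position receives a zero exponent, and that zero
# is what marks the clause as negative.
TENSE_AFFIXES = {"fut": "uv", "pst": "id"}   # affixes from (12a-b)
AGR_AFFIXES = {"1sg": "em"}

def phi(stem, polarity, tense, agr):
    """Spell out V + polarity + tense + agreement as one inflected form."""
    parts = [stem]
    if polarity == "pos":
        parts.append(TENSE_AFFIXES[tense])   # tense slot filled
    # negative polarity: zero exponent in the tense position
    parts.append(AGR_AFFIXES[agr])
    return "-".join(parts)

print(phi("no:d", "pos", "fut", "1sg"))   # 'I will see'
print(phi("no:d", "pos", "pst", "1sg"))   # 'I saw'
print(phi("no:d", "neg", None, "1sg"))    # 'I do/did/will not see'
```

The point of the sketch is that the construction in (13) never mentions a zero morph directly; the gap falls out of how Φ realizes the polarity feature.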

In fact any systematic marking of negation can work, as long as it is unambiguous with respect to scope and is identifiable in the course of parsing the phonological string and in learning. Another example that shows that, in principle, any arbitrary correspondence can be a construction is seen in (14) from Brazilian Portuguese.

(14) Brazilian Portuguese
     a. Algum rapaz chegou.
        some boy arrived
        'Some boy arrived.'
     b. Rapaz algum chegou.
        boy some arrived
        'No boy arrived.'

Quite strikingly, the inverse order of algum and the noun has the interpretation of wide-scope negation. While it is possible to devise a derivation of this form that shares an underlying structure with standard negation, e.g. by incorporating an abstract negative head in DP (Martins 2015), it is clear that the derivation is cryptoconstructional, since it is unique to Brazilian Portuguese.²¹

²¹ Thanks to Giuseppe Varaschin for bringing this construction to my attention.



3.3.3 Another example: The imperative

Imperatives constitute one of the three main types of sentences that are found in all natural languages, along with declaratives and questions (Sadock & Zwicky 1985). These sentence types correspond to three communicative functions that are innate and universally represented in CS. However, while the latter two are central concerns of MGG and play important roles in defining the 'core', imperatives have had a significantly more peripheral role. Imperatives very often appear to contradict generalizations about classical parameters in languages like English. For example, English does not have null pronouns, but the subject of an English imperative is typically null (or absent). Moreover, imperatives show considerable idiosyncrasy and therefore constitute a challenge to the view that idiosyncrasy should be relegated to the 'periphery'.

A rich source of evidence regarding variation in imperatives is the Romance languages. In Spanish, for example, the imperative takes different forms in the second person singular and plural. Although clitics precede the finite verb, they must follow the verb in the imperative.

(15) Spanish (based on examples in Biezma 2008)
     a. Cierra las ventanas!
        close.2sg.imp the.pl windows
        'Close the windows!'
     b. Cerrad las ventanas!
        close.2pl.imp the.pl windows
        'Close the windows!'
     c. Cierra las! (*Las cierra!)
        close.2sg.imp them
        'Close them!'
     d. Cerrad las! (*Las cerrad!)
        close.2pl.imp them
        'Close them!'

But the negative imperative uses the infinitival form, which takes postverbal clitics (16a), or the subjunctive, which takes preverbal clitics (16b).

(16) Spanish (Han 1999)
     a. No leer lo!
        neg read.inf it
        'Don't read it!'


b. No lo leas!
   neg it read.2sg.subj
   'Don't read it!'

In Italian, there are also singular and plural imperative forms, and as in Spanish, the clitics must follow.

(17) Italian (Han 1999)
a. Telefona le!
   call.2sg.imp her
   'Call her!'
b. Telefonate le!
   call.2pl.imp her
   'Call her!'

In the negative imperative, the infinitive is used, as is the second person plural (but not singular) imperative form. The clitics follow the verb.

(18) Italian (Han 1999)
a. Non telefonare le!
   neg call.inf her
   'Don't call her.'
b. Non telefonate le!
   neg call.2pl.imp her
   'Don't call her!'
c. *Non telefona le!
   neg call.2sg.imp her
   'Don't call her!'

What is particularly striking is that negation can be used with the second person plural imperative in Italian but not in Spanish. A summary of the distribution of clitics in several Romance languages is given in Table 3.2, based on data from Han (2000, 2001) and standard grammatical descriptions. The important thing to notice is that there is no single form that reliably predicts where the clitic appears in any other form. As in the case of negation, derivational accounts of imperatives assume a uniform syntactic structure for both positive and negative imperatives, and differential selection and movements to derive the observed forms.


Table 3.2 Distribution of clitics in Romance indicatives, infinitives, and imperatives

Language              Finite    Infinitive          Imperative   Neg. Imp.
Spanish               cl-Vfin   Vinf-cl             Vimp-cl      Vinf-cl | cl-Vsubj
Italian               cl-Vfin   cl-Vinf | Vinf-cl   Vimp-cl      Vimp-cl | Vinf-cl | cl-Vinf
French                cl-Vfin   cl-Vinf             Vimp-cl      cl-Vimp
Quebec French         cl-Vfin   cl-Vinf             Vimp-cl      Vimp-cl
Romanian              cl-Vfin   Vinf-cl             Vimp-cl      cl-Vimp
Brazilian Portuguese  cl-Vfin   cl-Vinf             cl-Vimp      cl-Vimp

Isac (2015, 14) summarizes Zanuttini's (1991) explanation for why certain imperatives cannot appear with negation as follows:

   true imperatives (Class I) are incompatible with negation because languages that have Class I true imperatives use a preverbal negative marker which subcategorizes for a TP, while true imperatives are morphosyntactically defective or reduced, and lack a TP by assumption [emphasis mine – PWC]

On the other hand, according to Isac (2015, 14), negation selects a Mood Phrase.

   The Mood feature hosted by the Mood head can be checked by an infinitive verb, for example, but not by a true imperative verb, given that the latter are defective and do not have Mood features. The ungrammaticality of a negated Class I true imperative is thus the result of the Mood feature remaining unchecked.

Variants of this type of approach can be found in all minimalist accounts of the imperative. Rivero (Rivero 1994; Rivero & Terzi 1995) takes a similar approach, but attributes the differences between imperative constructions to whether the V moves to C0 before or after 'spellout', and whether a 'mood' feature is on C0 or I0 to attract V. A set of assumptions about where V lands and where clitics are located in the structure derives the ordering between them: clitics are attached to a head I0 that is lower than C0 and therefore follow V.

(19) Vi+C0[mood] [ … CL+I0 [ … ti … ]]


But Neg0 blocks V movement. If clitics are attached to I0 and V moves to I0, the order Neg-Cl-V+I0 is possible in languages that allow negative imperatives.

(20) C0 [ Neg0 CL+Vi+I0[mood] [ … ti … ]]

Again, for discussion and a critique, see Han (2000, chapter 2). It is quite clear from such derivational analyses that the surface forms can be derived through a judicious use of ad hoc, cryptoconstructional assumptions about the location of negation in the tree, the hierarchy of functional heads, the number of negative heads, the selectional properties of negation, the adjunction of heads to other heads, the features of I0 and C0, the ordering of various steps in the derivation with respect to 'spell out', and the presence or absence of other functional heads or features such as Mood. Such an approach is taken even by Han (2000), who follows her critique of earlier proposals with one in which I0 adjoins to Neg in languages like French, where negation clusters with clitics in preverbal position in negative imperatives. But Han has no account for the difference between Spanish and Italian second person negative imperatives.22 Moreover, on the assumption that postverbal clitics are the result of V raising to a higher functional head, the fact that both orders are possible in the infinitival imperative in Italian is a puzzle that requires an ad hoc solution.

In many ways the idiosyncrasies of the English imperative are even more extreme. A negative imperative in English incorporates don't/do not, which appears to be a regular case of do-support. But a wider look shows that very little about the English imperative follows from more general constructions, except the relative order of constituents. For example, don't must be used with be, which in finite sentences is an auxiliary that supports negation itself.

(21) a. You {aren't / are not} very accommodating.
b. {Don't (you) / Do not} be so quick to judge.
c. {*Ben't / *Be not} so quick to judge.

22 “The question remains as to why the verb in 2nd person plural imperatives moves only up to Inf0 , whereas the verb in 2nd person singular imperatives moves up to C0 in the overt syntax. We do not have an answer for this question at this point.”


In cases of coordinated imperatives with overt subjects, a third person subject pronoun must have the objective form. The model for this construction is (22).

(22) a. The girls be the cops, the boys be the robbers! (Zanuttini 2008, 206)
b. You be the doctor, and {him / *he} be the nurse!
c. Don't you sit down and {him / *he} just stand there!
d. Don't sit down and {him / *he} just stand there!

The English imperative, it appears, lacks finite tense in the main clause. This may explain the behavior of be and the objective case of the subject. But it does not account for do, which is a classic reflection of the presence of finite tense (Chomsky 1957). Nor does it account for the fact that English negative imperatives with overt subjects show inversion, as in (23), since inversion is a classic reflection of raising of the tensed auxiliary to a position higher than the subject (Pollock 1989).

(23) a. {You / Everyone / Someone} take care of this problem.
b. Don't {you / everyone / anyone} start talking.

It is reasonable to conclude, then, that the idiosyncrasies of the imperative in English (and other languages) are matters of constructional stipulation. To the extent that the ordering of the verb with respect to its arguments follows the regular pattern of the language, nothing special needs to be said. If there are idiosyncrasies in the inflection, this needs to be stipulated in the paradigm function of the language that realizes verbal inflection. If there are idiosyncrasies of order, this needs to be stipulated in the particular construction for expressing imperative force. As I argue in Chapter 10, idiosyncrasies that nevertheless reflect regular properties of the language are a consequence of economy. And all this can be done explicitly in statements of the constructional correspondences, without any need for derivations and the assumptions that go along with them.
I conclude that the imperative, like negation, is a CS universal that may be expressed cross-linguistically in a variety of ways. However, there is no


evidence to support the idea that the imperative construction is a morphosyntactic universal. This said, because of the relationship between imperative semantics and the semantics of other verbal constructions, and the role of economy in shaping grammars, we find that there are overlaps between the morphological devices that are used to mark imperatives—e.g. infinitives and second person indicatives and subjunctives.

3.4 Against uniformity

Let us take note of the more general methodological strategy that leads to cryptoconstructional analyses of the sort discussed in section 3.3, namely, uniformity. The picture that emerges from even these simple examples is that if we attempt to represent a uniform semantic interpretation as a uniform abstract syntactic representation, the syntax of a language in which the derived structure does not closely resemble the assumed logical form can become quite complex. Among the mechanisms that are required are: invisible heads, movement and adjunction of heads to heads, spelling out of complex adjoined heads as inflected forms, spelling out of abstract operators as particles, movement of phrases from lower positions to higher positions, and locality constraints on the movement of heads and phrases. These are, of course, all familiar from mainstream approaches to syntax and in fact are often claimed to be necessary components of the syntactic apparatus. But notice that all of them are required because of the original assumption that there is a uniform abstract syntactic structure from which all of the surface forms in all languages are derived.

Echoing Culicover & Jackendoff (2005, chapters 2 & 3), this insistence on uniformity of underlying syntactic representation gives rise to unnecessary complexity in syntax, complexity which Simpler Syntax is at pains to eliminate from linguistic theory. The Simpler Syntax alternative is that the correspondences between form and meaning can be represented as constructions, that is, correspondences between syntactic representations, phonological forms, and CS representations. The best argument for a uniform syntactic treatment would be that all languages express the same CS function using the same syntactic devices. But this is not the case.

With this in mind, consider Chomsky's (2000b) claim about what a purely "objective" "Martian scientist" would find upon examining human language:


. . . in their essential properties and even down to fine detail, languages are cast to the same mold. The Martian scientist might reasonably conclude that there is a single human language, with differences only at the margins (7).

Clearly, a lot turns here on the phrases "cast to the same mold" and "differences only at the margins." If we take "cast to the same mold" to refer to universals of CS, then this statement makes sense, but if we understand it as a claim about narrow syntax, then it seems less plausible. "Differences only at the margins" may be connected to centrality of CS functions, but does not appear to have a principled status with respect to syntax.23

The implausibility can be eliminated in a sense by simplifying our understanding of 'syntax' to the point that it encompasses only a single very primitive operation, e.g. Merge, as in recent instantiations of the Minimalist Program (Bolhuis et al. 2014; Berwick & Chomsky 2015). But this does not really make the "cast to the same mold" claim more convincing, because now literally everything that needs to be accounted for by a linguistic theory is "at the margins", except the fact that sentences have phonological form and semantic interpretations.

Our fundamental challenge is to account for the universality of language and the variation that can be observed—Chomsky's Problem (Chapter 1). The following chapters develop further the idea, set out at the beginning of this chapter, that the universal 'core' of language is CS and that form/meaning correspondences are represented constructionally.

23 One could also question the very idea that this is what the Martian scientist is likely to do; see for example Searle (2002).


4 Learning, complexity, and competition

A widely accepted intuition is that the source of variation and change is language acquisition, specifically, the formulation of grammatical hypotheses by language learners. Section 4.1 provides a formal approach to the acquisition of constructions in terms of generalization and extension of individual constructs. Section 4.2 looks at how new constructions may be created not only through generalization, but in response to pressures to represent aspects of CS. Section 4.3 discusses how the formation of constructional grammars plays out when constructions are in competition over the same CS function. A key factor in how competition is resolved is complexity, which is the subject of section 4.4. Finally, section 4.5 demonstrates through simulation how differences in complexity may lead to the dominance of one constructional alternative over another.

Language Change, Variation, and Universals: A Constructional Approach. Peter W. Culicover, Oxford University Press. © Peter W. Culicover 2021. DOI: 10.1093/oso/9780198865391.003.0004

4.1 Acquiring constructions

Let us begin with the misidentification of licensing conditions as a possible source of constructional change. There is abundant evidence that errors occur in lexical acquisition by young children (see Ambridge et al. 2013 for a review). The correspondence between a word and its meaning is constructional under the definition of Chapter 2. In the short term, lexical errors made by children are for the most part corrected through experience and direct and indirect feedback, and do not result in observable grammatical change. But the very substantial grammaticalization literature leaves no doubt that in the long term, misidentification of licensing conditions produces very salient changes in word meaning and even grammatical category (Condoravdi & Deo 2014; Deo 2015; Eckardt 2006; Gisborne & Patten 2011; Jäger 2007a; Noël 2007; Sweetser 1988; Traugott 2003, 2008; Trousdale 2008, 2012).

Grammatical constructions differ from lexical constructions in that the grammatical constructions state conditions for correspondences involving categories, phrases, and complex structures, not simply single lexical items. However, on a constructional approach they are formally identical, in that


they state conditions in which the representations on each of the tiers may correspond to representations on the other tiers. And these licensing conditions are also susceptible to misidentification. If imprecise hypotheses about the licensing conditions for a construction are introduced in the course of learning and are shared among learners in a social network, there will be constructional change and hence grammatical change.

The challenge for the learner, then, is to identify the particular correspondences and relevant feature values of the constructs that exemplify the constructions of the target to be learned, given PLD consisting of (presumably) well-formed constructs. Unlike parameter setting models, I do not presume that the learner already (and paradoxically) has the capacity to parse complex input and use that input to determine what the grammar looks like. Rather, the learner must take the constructs as they are given more or less at face value, and on the basis of this input, hypothesize the constructions, through generalization. And I do not assume that there is a prior set of universal primitive syntactic categories that each new word can be assigned to on the basis of its combinatorial properties (Culicover 1999).

Given this, the most conservative strategy for the learner would be to take every construct as a sui generis construction. This would make our model of linguistic competence an exemplar model (Culicover & Nowak 2003; Bybee 2006; Bod 2006, 2009; Gahl & Yu 2006; Tomasello 2003). On this approach, every construct that the learner encounters is recorded in memory, and a weight is assigned to each item that measures how often it occurs.1 The trick is to explain how a learner is able to generalize beyond the PLD that is represented in the exemplars. The alternative, that every exemplar remains sui generis, and that there is no generalization, does not comport with the facts.
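The exemplar model just described can be given a minimal computational sketch. This is an illustrative toy, not the book's formalism: constructs are modeled as form/meaning pairs, and each recorded exemplar carries a weight tracking its token frequency.

```python
# Toy sketch of an exemplar store (all names hypothetical): every construct
# (form/meaning pair) the learner encounters is recorded, and a weight
# tracks how often it occurs.
from collections import Counter

class ExemplarStore:
    def __init__(self):
        self.weights = Counter()   # construct -> token frequency

    def record(self, construct):
        """Store one encountered construct, incrementing its weight."""
        self.weights[construct] += 1

    def weight(self, construct):
        """Token frequency of a construct; 0 if never encountered."""
        return self.weights[construct]

store = ExemplarStore()
for c in [("red sock", "red'(sock'(x))"),
          ("red sock", "red'(sock'(x))"),
          ("red herring", "misleading'(clue'(x))")]:
    store.record(c)

print(store.weight(("red sock", "red'(sock'(x))")))   # prints 2
```

Such a store records token frequency directly; type frequency over a schematic slot can then be computed by counting the distinct exemplars that instantiate it.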
For example, there is no question that even very young children generalize beyond the explicit input, e.g. Naigles et al. 1986; Naigles 1990; Naigles & Kako 1993; Naigles & Hoff-Ginsberg 1995. In order to allow for generalization, any model, including the exemplar model, must therefore assume that the learner is able not only to analyze and categorize the input, but to group items on the basis of similarity. The parsing of the input into units is a prerequisite for grouping the units into categories

1 An open question is whether any individual exemplars remain in memory in some form forever, or are ultimately replaced by more general representations. We do not have to resolve this question if we agree that learners formulate constructions by generalizing over constructs. It may be that generalizations are represented as operations over sets of exemplars, or they may be represented individually as replacements for sets of exemplars, or both. According to Bybee (2013, 52f):

   The stored representations in the form of exemplars respond to usage by allowing the representation of both token and type frequency; these frequency patterns are important for understanding the categories that are formed for the schematic slots in constructions.


and forming generalizations about the distribution of these categories and their correspondences on the various tiers. Since constructs and constructions are made up of the same components, the only difference between a construct and a construction is one of generality. Constructs consist entirely of constants with fixed forms, while constructions may also have variables in the form of categories that go beyond such constants.

To see how constructions can be learned, let us first consider the problem of forming constructions when the learner has explicit evidence about how to represent the tiers of the construct. For concreteness I assume the general framework of Briscoe (2000), in which a learner takes each construct, that is, a form/meaning pair, and tries to parse it using the current grammar. In constructional terms, the learner determines if the pair is licensed. If not, it hypothesizes a new category—in our terms, a correspondence. Briscoe shows that for perfect and complete information from the linguistic environment, the learner will converge on the source grammar with high probability.

Briscoe takes the syn–phon–cs correspondences of constructions to constitute a categorial grammar with rules of combination and semantic interpretation. These rules specify whether a given ordering is functor-argument or argument-functor. So, for example, the category (S\NP)/NP with meaning λy.λx.V′(x,y) of a transitive verb V in SVO order encodes what would be expressed as (1) in terms of the syn–phon correspondence.

(1) a. VO
    ⎡phon  [1–2]3                                        ⎤
    ⎢syn   [VP V1, NP2]3                                 ⎥
    ⎣cs    [1′(2′)]3 [=λy.λx.1′(x,y)(2′)=λx.1′(x,2′)]3   ⎦

    b. SVO
    ⎡phon  [4–3]5                                        ⎤
    ⎢syn   [S NP4, VP3]5                                 ⎥
    ⎣cs    [3′(4′)]5 [=λx.1′(x,2′)(4′)=1′(4′,2′)]5       ⎦
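The function application encoded in (1) can be mimicked directly with curried functions. The sketch below is illustrative only (the predicate names are invented): the transitive-verb meaning λy.λx.V′(x,y) applies first to the object (the VO step), then to the subject (the SVO step), yielding V′(subject, object).

```python
# Illustrative sketch of the strictly compositional interpretation in (1).
# A transitive verb denotes a curried function λy.λx.V'(x,y); meanings are
# modeled here as tuples (predicate, agent, patient).

def transitive(pred):
    """Curried transitive-verb meaning: λy.λx.pred(x, y)."""
    return lambda y: (lambda x: (pred, x, y))

see = transitive("see'")   # category (S\NP)/NP
vp  = see("mary'")         # VO step:  λx.see'(x, mary')
s   = vp("john'")          # SVO step: see'(john', mary')
print(s)                   # prints ("see'", "john'", "mary'")
```

The asymmetry in (1) falls out of the currying: the object saturates the inner argument before the subject is supplied, so surface order VO/SV corresponds to successive function application.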

A categorial grammar of this type is equivalent to a constructional grammar with strictly compositional interpretation driven by the syntactic categories, which are restricted to those of classical generative grammar.2 Constructional grammars depart from this characterization of the form/meaning correspondence in that they allow for interpretations that are associated with the

2 I understand strict compositionality to be a correspondence between the syntactic structure and the semantic interpretation such that the meaning of a phrase is homomorphic to the syntactic structure: one element in the syntactic structure corresponds to the function, and the others correspond to the arguments.


hierarchical structure of the construction itself, independently of the lexical items. Moreover, they may permit arbitrary subcategories and individual items in the statement of licensing conditions.

I begin with acquisition in the two-word case, where the category of one word is fixed, the category of the other is not, but the meaning and category of the phrase is. Briscoe (2000) gives the example of red sock, where the learner knows that the phrase means red′(sock′(x)), that red means red′, and that sock means sock′ and is of category N. If the learner has other adjectives, then the learner may assume that the category of red is N/N. If it does not have words of this category, then it must add it to the grammar. Translating this into constructional terms, the learner encounters a construct of the form in (2).

(2) ⎡phon  [red1–sock2]3                                        ⎤
    ⎢syn   [N [category X, lid red1], [category N, lid sock2]]3 ⎥
    ⎣cs    [red1′(sock2′(x))]3                                  ⎦

If the grammar already has the category Adj=N/N, the construct in (2) is licensed if there is a construction of the form in (3), and X is Adj.

(3) ⎡phon  [1–2]3        ⎤
    ⎢syn   [N A1, N2]3   ⎥
    ⎣cs    [1′(2′)]3     ⎦

If (3) is not already in the grammar, then the learner must hypothesize that there is a new category, call it CATred, that combines with N to produce an N with the interpretation shown in (2). From the constructional perspective, red sock could be idiomatic, and I take this to be the learner's first hypothesis in the absence of further evidence. What leads the learner to change this hypothesis is evidence that there is a productive interpretive relationship between red and sock—namely, red′ is a functor that takes the interpretation associated with the N as its argument. Such evidence could consist of another instance of red-N with a different N.
Or, depending on the CS representations, the evidence could consist of other instances of the form X-N, where the interpretation of N is known and the interpretation of X has the feature [attribute color].3

3 I am abstracting here away from the fact that red has different extensions depending on the noun, e.g. red sock/nose/fire engine etc. (McNally 2013).
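The licensing-and-hypothesis cycle described above can be given a toy rendering. This is a drastic simplification of Briscoe's framework, with hypothetical names: a construction licenses a construct if its constants match, with "X" standing in for a variable slot, and an unlicensed construct is stored as a new, sui generis (idiom-like) construction.

```python
# Toy rendering (hypothetical names) of the learner loop: try to license
# each incoming construct against the current grammar; on failure,
# hypothesize a new sui generis construction.

VAR = "X"   # variable slot matching any item

def licenses(construction, construct):
    """A construction licenses a same-length construct if every constant
    matches; variable slots match anything."""
    return (len(construction) == len(construct)
            and all(c == VAR or c == w
                    for c, w in zip(construction, construct)))

def learn(grammar, construct):
    if not any(licenses(c, construct) for c in grammar):
        grammar.append(tuple(construct))   # first hypothesis: an idiom
    return grammar

g = []
g = learn(g, ("red", "sock"))     # unlicensed -> new construction
g = learn(g, ("red", "sock"))     # now licensed -> nothing added
g = learn(g, ("red", "herring"))  # unlicensed -> another new construction
print(len(g))                     # prints 2
```

Generalization, on this picture, is a separate step that replaces constants in stored constructions with variable slots once productive correspondences are detected.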


So, following Briscoe, I characterize grammar formation, and in this case, construction formation, as a process of category formation. Learners begin by hypothesizing sui generis categories. A plausible scenario for more inclusive category formation begins with words that denote things in the world, which we categorize as N, and words that denote actions, which we categorize as V (Tomasello 2003; Culicover & Nowak 2003). These categories are inspired by semantic properties, but are not reducible to them—not all languages have the same set of categories, or use the same categories to express the same concepts (Culicover 1999; Dixon 1982, 2004; Croft 2001; Evans & Levinson 2009). Ultimately, the categories are defined by their morphological and distributional properties, and correspond imperfectly to their semantics.⁴

As a consequence of category formation, the learner proceeds through stages where the constructions are of increasing generality. The generalization stops at the point that the interpretation is not strictly compositional. So, for instance, when the learner encounters red herring the construct does not have a CS in which red corresponds to red′, even though red is an Adj, and herring corresponds to herring′.

(4) ⎡phon  [red1–herring2]3                                          ⎤
    ⎢syn   [N [category A, lid red1], [category N, lid herring2]]3   ⎥
    ⎣cs    [misleading′(clue′(x))]3                                  ⎦

Since neither red nor herring corresponds to any part of misleading′(clue′(x)), they must both have sui generis subcategories, namely Ared-herring and Nred-herring (Sag et al. 2007b). And the generalization ends here.

As noted, constructional approaches to grammar depart from mainstream approaches by allowing for the possibility that correspondences are not strictly compositional in the traditional sense. On such a view, the meaning associated with an expression may depend in part on its structure independent of the meaning of the individual constituents.
As an example, let us consider the case of dance into the room, meaning 'go into the room dancing'. Assume that the learner knows the meanings of the

⁴ Questions that are too complex to go into here in detail are why languages have certain categories that seem to be drawn from a small set, why not all languages have exactly the same categories, and why the number of categories appears to be limited cross-linguistically. I speculate that, like the constructions themselves, the categories are projections of the predominant aspects of CS, including marking of thematic roles. If this is right, any apparent universals are not due to the existence of primitive categories in UG, but to the universal conceptual categories of CS.


individual words, and the meaning of the construct is clear in context. That is, the learner is presented with (5).

(5) ⎡phon  [dance1–[into–the–room]2]3                     ⎤
    ⎢syn   [VP V[dance]1, [PP P[into], NP[the room]]2]3   ⎥
    ⎣cs    [λx.go′(x,goal:r2,manner:dance′)]3             ⎦

One instance of this construction would be the basis for formulating an idiom. But multiple instances of the same construction would be sufficient evidence to generalize from the individual manner and location, yielding (6).

(6) ⎡phon  [1–2]3                            ⎤
    ⎢syn   [VP V1, XP2]3                     ⎥
    ⎣cs    [λx.go′(x,goal:2′,manner:1′)]3    ⎦

The key assumption that allows this generalization to happen is that the CS representation is available to the learner from the context.

Extending this idea a bit further, consider what the learner does when it has no prior experience with the construction that licenses a given construct. I have assumed that in such a case, the learner forms a new construction. Thus, each term in the construct is a candidate for the construction. If the learner takes every term in the construct to be relevant to the construction that it instantiates, then every construction that is newly hypothesized will be an idiom like red herring. But most of the constructs of the language do not exemplify idioms. So, sooner or later the creation of new idiom constructions will lead to generalization along the lines described here. For example, if we take Fido barks and Fifi barks both to be idioms initially, there is a possibility of forming the construction [S NP barks], and, further along, [S NP VP], along the lines just sketched out. This is the familiar scenario of template construction (Ambridge & Lieven 2015; Culicover & Nowak 2003; Tomasello 2003).

Another type of constructional generalization is categorial—a construction is first licensed for a single lexical item, then extends to another, and ultimately is generalized to a much larger set that shares certain features of the initial items.
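The step from constructs like (5) to a schematic construction like (6) can be sketched as a toy anti-unification: where two constructs of the same shape differ, the differing position is replaced by a variable slot. The flat representations and names below are illustrative, not the book's formalism.

```python
# Toy anti-unification (illustrative): generalize two same-shape constructs
# by replacing each position where they differ with a variable slot "X".

def generalize(c1, c2):
    """Least general common pattern of two equal-length constructs."""
    assert len(c1) == len(c2)
    return tuple(a if a == b else "X" for a, b in zip(c1, c2))

dance_in = ("V:dance", "PP:into-the-room")
skip_in  = ("V:skip",  "PP:into-the-room")
walk_out = ("V:walk",  "PP:out-of-the-house")

g1 = generalize(dance_in, skip_in)   # ('X', 'PP:into-the-room')
g2 = generalize(g1, walk_out)        # ('X', 'X'): the fully schematic pattern
print(g2)                            # prints ('X', 'X')
```

Repeated application over accumulating exemplars yields progressively more schematic templates, matching the Fido barks → [S NP barks] → [S NP VP] trajectory described above.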
I presume that this is what happens in the case of very general constructions, e.g. the ordering of heads with respect to their complements. A more idiosyncratic example, from the history of Germanic, is the IPP (Infinitivus Pro Participio) construction. This construction is a three-verb cluster, for example one headed by the auxiliary corresponding to ‘have’, in


which V2 is infinitival rather than participial. While the normal order in the German subordinate clause is [[VP …V2]-V1] (7a,b), the order in IPP is [V1 [[VP …V3] V2]] (7c); the tensed verb V1 is initial in the cluster rather than final.

(7) German (Jäger 2018, 302)
a. dass sie ihn gehört2 hat1
   that she him hear.past.part has
   'that she heard him'
b. dass sie ihn hören2 kann1
   that she him hear.inf can
   'that she can hear him'
c. dass sie ihn hat1 hören3 können2/*gekonnt
   that she him has hear.inf can.inf/can.past.part
   'that she could hear him'

Jäger (2018) documents the historical development of this construction. The constructional change here is located in the character of V2. The construction is first attested in the fourteenth century with lâzen 'let', extends to bitten 'ask', then in the fifteenth and sixteenth century it is generalized to the modals mögen 'may, to want to' and wollen 'to want'. It begins to be used with können 'can', dürfen 'may/need to', sehen 'to see', and machen 'to make'. At the same time, the construction is competing with, and ultimately prevails over, the one in which V2 is a past participle, e.g. lesen gekönnt hat 'read.inf can.past.part have.3.sg.pres' (vs. hat können lesen). Other verbs participate as V2 in the IPP construction in due course in the sixteenth and seventeenth centuries (sollen 'should', wissen 'to know', anfangen 'to begin', pflegen 'to use to', brauchen 'to need', fühlen 'to feel', suchen 'to seek'); some of these drop out in Modern Standard German (Jäger 2018, 307). While it is not a trivial matter to track the exact trajectory of change in terms of particular semantic features, the gradual extension of the construction to a broader set of verbs is clear evidence of generalization.
Since the direction of generalization can vary from case to case, there is considerable dialectal variation in the IPP construction across the West Germanic languages, in terms of the V1 verbs that govern it, the ordering of V2 and V3, and whether V2 has the infinitival or past participle form (Schmid 2002). This is precisely what we expect in a constructional framework.

It is important to observe that the conception of grammar learning sketched out here does not suffer from some of the idiosyncrasies of other learning scenarios. For example, the learner in Wexler & Culicover (1980) has to acquire a transformational grammar, and must hypothesize what the transformations


are based on the phonological string and the meaning from a universal underlying syntactic structure. The assumption that there is a universal underlying syntactic structure that corresponds to surface linear order through movement transformations appears to be too strong, although it has been quite persistent in contemporary theory (e.g. Kayne 1994). In Gibson & Wexler's (1992) model of learning, the learner has to make an inference from a surface structure to the values of a set of parameters. This results in indeterminacy in cases where there are multiple parameter settings that are consistent with a single surface pattern (Berwick & Niyogi 1996). In Fodor (1998), the learner sets parameters based on examples in the PLD. It is necessary on this approach to suppose that the learner has prior access to a set of pieces of structure that signal what the parameter values are. Both scenarios depend, counterintuitively, on the learner being able to parse the input prior to formulating hypotheses about the parameter settings.⁵

4.2 Constructional innovation

While it is uncontroversial that young children do not acquire all constructions perfectly upon first exposure, the extent to which constructional change is due to errors made by young learners is an open question (cf. Slobin 2005; Cournane 2017). For the present study, there is no need to insist that language change occurs solely as the consequence of learner errors. Rather, I assume that change may occur spontaneously in speakers of any age, even when there is no explicit evidence for it in the PLD. Such change falls into three categories.

One is generalization after a construction has been acquired. For example, if a construction explicitly requires a particular set of lexical items for licensing, generalization can occur by extending the construction to semantically related lexical items (Bybee 2013, 58). A construction that licenses a correspondence on the basis of a semantic condition can further generalize by dropping that condition, as discussed in Culicover (2014). I see no reason to assume that such generalization is restricted to young first language learners.

Another type of innovation that does not result from direct exposure to PLD is predicated on an existing construction, but conflicts with the evidence for that construction. Suppose, for example, that the PLD for a construction is consistent with a single linear ordering of constituents. In such a case,

⁵ For discussion of some of the issues surrounding the application of parameter setting to language acquisition and related matters, see, among others, Gibson & Wexler (1994); Berwick & Niyogi (1996); Fodor (1998); Culicover (1999); Newmeyer (2004, 2017). See Yang (1999, 2000, 2002, 2004a,b, 2010) for a more nuanced approach to language acquisition as parameter setting.

OUP CORRECTED PROOF – FINAL, 19/6/2021, SPi

the syn–phon mapping of a hypothesized construction specifies the realization of a particular constituent in phon with respect to another category. A typical example is the ordering of V and NP in VP. While the evidence in the PLD may support only one ordering, both linear orderings of the hierarchical structure in the syn for this mapping are in principle available to speakers as logical possibilities.⁶ In other words, while learners are motivated to faithfully replicate the specifics of the constructions that they are exposed to, they are also able to spontaneously adjust their constructions without supporting input, in order to respond to other pressures.

What kinds of pressure would cause a departure from the overt evidence? I assume that the linear ordering of a construction as exemplified by the constructs available from the PLD may be influenced by an alternative ordering that optimizes the correspondence on some other dimension. One force for reordering is the maximization of branching harmony (e.g. Hawkins 1994, 2004, 2014), which falls under our notion of economy. Another is the optimality of alignment of linear order and focal accent (Culicover & Winkler 2008), which I discuss in Chapter 8.

Finally, a plausible case can be made for complexity being introduced by speakers through the creation of a construction that explicitly expresses a CS function that previously was unmarked. We know that this type of innovation is possible from the study of language creation in home sign and village sign—see section 3.1.3. In these cases, speakers who have no linguistic input seize upon aspects of the environment to encode core aspects of CS. It is entirely plausible that speakers who have linguistic input, either children or adults, also have this capacity.
A nice example is noted in the following comment by Kemenade & Westergaard (2012, 96) regarding the tendency of Norwegian children to override V2 when the subject is pronominal: "It thus seems that young Norwegian children are producing I[nformation]S[tructure] based word order patterns, even in the absence of input for such distinctions" [emphasis mine—PWC]. Another example is the use of a preposition that means 'because' to serve as a complementizer in Akkadian (Deutscher 2000). As in all cases of innovation, if a novel construction is found by speakers to be sufficiently useful, it will become part of the grammar even though it plays no role in the PLD.⁷

⁶ Here I am borrowing an insight captured by Gen in Optimality Theory; cf. Tesar & Smolensky (1998).
⁷ If we assume that language originated with a single group of early humans, the extent of constructional diversity that we observe in natural languages suggests that this view of innovation must be essentially correct. Initially, there would have been no contact with other languages that could lead to change in learning. Furthermore, assuming dispersion and low population density, a lot of subsequent


I should note that a well-documented driver of innovation is contact. Trudgill (2011) and many others have argued that contact may cause both grammatical complexification and simplification. Basically, complexification arises when children attempt to reconcile two or more grammars, while simplification occurs when adults do. The rationale is that as competent language learners, children seek to accommodate the diverse evidence from the two languages, which may result in the properties of one language being incorporated into the grammatical constructions of the other. On the other hand, adults are fully competent in one language but not the other, and are not competent language learners. As a result they formulate constructions for the second language, L2, that are simpler than the actual grammar of that language. Most significantly, they fail to grasp inflectional complexity.

These are complex matters; the type and extent of change under contact is quite varied, and depends on many factors that are independent of the grammars of the languages in contact, and it is rarely if ever possible to predict what the changes will be (Thomason 1997, 2001, 2008, 2010, 2017). But it does appear possible to account for at least some outcomes of contact in terms of the grammatical properties of the source grammars (Baptista 2020) and a range of complexity factors (Filipović & Hawkins 2013, 2019). This said, the effects of contact are not particularly specific to the grammatical theory that one uses to describe and explain them. Contact-induced innovation can be characterized not only in terms of constructions, but also in terms of multiple grammars, alternate parameter settings, and so on. For this reason, I do not focus here on describing contact in constructional terms.
However, I do appeal to contact in section 4.3, where economy, formulated in constructional terms, may play a role in determining which construction survives constructional competition.

4.3 Constructions in competition

The picture sketched out thus far portrays grammars as consisting of sets of constructions that together accomplish the task of expressing the core functions of conceptual structure. Grammars may differ from one another in terms of the precise formulation of particular constructions. Moreover, it is logically possible for a grammar to have more than one construction that

contact would have been precluded after initial innovation. So the diversity must have arisen at least in part from innovation.


licenses the same set of form/meaning correspondences. Here I consider some of the useful implications of characterizing grammars in constructional terms.

4.3.1 Multiple grammars vs. multiple constructions

In real learning situations, a learner is often, if not always, faced with competing evidence about how to express a given CS function. As suggested in Chapter 2, constructions provide a natural framework for representing such situations, in which speakers can be said to have 'multiple grammars' (Kroch 1989; Kroch & Taylor 2000; Kroch 2003; Pintzuk 2002; Santorini 1992). On the constructional characterization of such situations, speakers do not have multiple grammars per se; they simply have grammars with multiple constructions that allow different licensing conditions for the same CS function.⁸

A simple case of variation where we would not want to invoke multiple grammars is one where there are two expressions for the same concept. A well-documented example is the word used to refer to carbonated beverages. In one dialect, the term used is pop, with the lexical entry (8). In another, the term is soda, with the lexical entry (9). The entries are of course identical, except for the phon representation and the lexical identifier.

(8)  [ phon  /pɑp/
       syn   [category N, lid pop]
       cs    [category thing, property carbonated, physical liquid,
              function drink, property non-alcoholic, …] ]

⁸ The way to represent such variance in a conventional parametric approach is to allow a grammar to have two settings for a single parameter. From the present perspective, this is just the limiting case where the dimension of variation allows for just two alternatives. As the number of parameters and possible parameter values increases, the conventional parametric approach becomes equivalent to a constructional one.


(9)  [ phon  /sodɑ/
       syn   [category N, lid soda]
       cs    [category thing, property carbonated, physical liquid,
              function drink, property non-alcoholic, …] ]
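As an informal aside, tiered entries like (8) and (9) can be approximated as nested attribute–value structures. The following sketch is my own illustration, not the book's formalism; `make_entry` and the field names are conveniences chosen to mirror the entries above.

```python
# Sketch only: lexical entries as phon-syn-cs triples, mirroring (8)-(9).
def make_entry(phon, lid, cs):
    """Build a lexical entry; `lid` is the lexical identifier on the syn tier."""
    return {"phon": phon, "syn": {"category": "N", "lid": lid}, "cs": cs}

# The CS representation shared by both dialectal variants.
beverage_cs = {
    "category": "thing",
    "property": ["carbonated", "non-alcoholic"],
    "physical": "liquid",
    "function": "drink",
}

pop = make_entry("/pɑp/", "pop", beverage_cs)
soda = make_entry("/sodɑ/", "soda", beverage_cs)

# The entries are identical except for phon and the lexical identifier.
assert pop["cs"] == soda["cs"]
assert (pop["phon"], pop["syn"]["lid"]) != (soda["phon"], soda["syn"]["lid"])
```

A speaker with both variants simply has both entries in the grammar; nothing in the representation forces an appeal to two lexicons.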

In a case like this, we would not want to say that we have two lexicons, let alone two grammars. Speakers who have one or the other form have one or the other construction for the same concept. Speakers who have both forms have both constructions for the same concept. The two constructions may be indexed to different social contexts and social identities, or they may be in direct competition. In any case, the most natural description appears to be one that invokes multiple constructions.

Extending this notion to more complex syntactic constructions, the variance may be dramatic, as in the case of language contact. For example, language A may express wh-questions as A′ constructions, while language B may use wh-in-situ. Learners may resolve the conflict in a number of ways, depending on social factors, as seen in pidgins and creoles (Bickerton 1988) and bilingualism (Trudgill 2011).

For example, again considering the history of English, multiple grammars have been invoked to explain the fact that in Early Modern English, speakers (such as William Shakespeare) appeared to have a grammar in which the main verb preceded negation and underwent SAI, and a grammar in which only an auxiliary verb preceded negation and underwent SAI.⁹ Thus we have examples like those in (10) from Act I of King Lear. Note in particular both the use and the absence of do-support in (10a) and the apparently free use of doth in (10d).

(10) a. It did always seem so to us: but now, in the division of the kingdom, it appears not which of the dukes he values most
     b. Think'st thou that duty shall have dread to speak …
     c. Do you smell a fault?

⁹ See Bækken (1999) for documentation of the competition between do and main verb in Early Modern English, and Bækken (2000) for review of the loss of SAI of the finite verb from OE to EME.


     d. Which of you shall we say doth love us most?
     e. Thy youngest daughter does not love thee least;

In a constructional grammar, it is perfectly possible in principle for there to be two constructions that license different phon correspondences with the same CS representations, mediated by the appropriate syn. In the case of SAI, for example, one construction licenses the tensed main verb before the subject (11), while the other (the contemporary one) licenses the tensed AUX before the subject (12).

(11)  [ phon  [2–1]3
        syn   [S NP1, V[tense]2, …]3 ]

(12)  [ phon  [2–1]3
        syn   [S NP1, Vaux[tense]2, …]3 ]
Both of these constructions appear to exist in Shakespeare's grammar, perhaps indexed for different situations in terms of rhythm, rhyme, and other prosodic factors, as well as register and other social factors. I assume that such a state of affairs is the norm, and abstract away from such details in order to focus on the formal representational issues.

There is a vast number of examples in the literature along similar lines, far too many to list here. For instance, Mondorf (2014, 214) reviews the following cases in which the grammar of a language maintains both analytic and synthetic constructions for the same function.

• English comparative alternation: fuller vs. more full
• English genitive alternation: the topic's relevance vs. the relevance of the topic
• English future tense alternation: will vs. be going to
• English mood alternation: if he agree-∅ (subjunctive) or if he agrees (indicative) vs. if he should agree (modal periphrasis)
• Spanish future tense alternation: comeré vs. voy a comer
• German past tense alternation: Sie brauchte … vs. Sie hat … gebraucht

Mondorf suggests that competing constructions such as these are sustained over long periods of time because each responds differently to the demands of economy. The analytic construction is easier to process in cognitively


demanding contexts, while the synthetic construction is perhaps favored by Hawkins's (2014) Minimize Forms.

4.3.2 Defining competition

In this section I consider how to define when constructions are in competition, using the descriptive formalism already developed. While it may be straightforward to describe a grammar at a particular time as containing multiple constructions, as in the case of English inversion in the preceding section, a constructional approach also offers the potential for explaining how competition between constructions may be resolved over time, leading ultimately to grammatical change. Such an explanation requires that we seek explicit answers to two questions.

1. Under what circumstances are two (or more) constructions actually in competition with one another?
2. What are the dynamics of the system such that constructions in competition acquire more or less weight, even to the point that one completely eliminates the other in the limit?

These are complex questions, and a detailed comprehensive exploration of them would easily outgrow the boundaries of a single section or chapter. Nevertheless, we can make certain plausible, preliminary assumptions about the architecture and dynamics of the competition that suggest how constructions play an explanatory role in change and variation. This section sketches an idealized model of competition and identifies the key factors that determine how competition between alternatives is resolved. Section 4.4 reviews several contributors to constructional complexity that may determine which alternative wins out in the competition. Finally, section 4.5 illustrates the dynamics of such competition, using some simple computational simulations.

Consider first the dynamics of a system in which constructions are in competition. In the simplest case there are two alternatives for expressing a particular function, e.g. agenthood, interrogative scope, reflexivity.
One of the central arguments of this book is that in many cases, there are more than two ways of expressing such functions, which is naturally accommodated by a constructional approach. However, for present purposes, restricting the alternatives to two is a useful simplification for exploring the dynamics of competition.


Note that even this restriction is an oversimplification, because even in cases of apparent binarity, there can be variation in the licensing conditions. One example is 'pro-drop', where a pronominal argument, typically the subject, corresponds to null in phon. A null argument alternates with an overt pronominal. The two constructions can be represented simply as (13).

(13) a. Construction: Pro-drop
        [ phon  ∅1
          syn   NP[pro]1 ]
     b. Construction: Overt pro
        [ phon  1
          syn   NP[pro]1 ]

While from a generic perspective there are only two possibilities, as in (13), in fact there are more than two constructional alternatives in the set. In Old High German and Old French, both of which showed loss of pro-drop in the course of their evolution to the modern languages, the person and number features of the argument played a role in determining whether a pronominal subject was null or overt (Axel 2005).1⁰ So it is not possible to say that at any given stage, the grammar had (13a), or (13b), or both. In fact it had multiple constructions, variants of both, indexed to particular morphosyntactic features of the argument.

Setting aside such complications, it is useful for modeling purposes to assume there are only two constructions in competition for each CS function. If a grammar consists of several constructions that are in two-member alternative sets, then we can view the competition as holding simultaneously but independently for all of these constructions.11

To take a concrete example, consider the phenomenon of V2 in a language that allows for an XP in clause-initial position, which is a classic problem of parameter setting (Gibson & Wexler 1994; Berwick & Niyogi 1996). Simplifying somewhat, there are two possible parameter settings for SVO.

1⁰ See also Sprouse & Vance (1999): "A curious property of the gradual loss of null subjects is that it often affects different persons of the verb at different rates."
11 This assumption corresponds to the independence assumption for classical parameter theory. What this means is that each piece of evidence that the learner is exposed to unambiguously signifies exactly one value for each parameter. Moreover, there is no evidence that is ambiguous with respect to two or more sets of parameter settings. Parametric independence is one of the major advantages of a constructional approach. For discussion of learnability problems that arise when parametric independence does not hold, see Gibson & Wexler (1994) and Berwick & Niyogi (1996).


(14) a. SVO & no V2
     b. topicalization of S & V2

A derivation under (14b) would give SVO by moving S to the initial slot and applying V2, as in (15).

(15) [ ] [ ] S [V O] ⇒ [S] [ ] tS [V O] ⇒ [S] [V] tS [tV O]

This structural ambiguity of SVO is not available in a constructional approach in Simpler Syntax, because there are no derivations, and because it assumes the simplest structures compatible with the observed linear orderings. A language that shows only SVO will have just one construction that links the constituent that precedes the VP to the subject GF and corresponding semantic properties (16a). A language with V2 will have one construction that specifies a correspondence between V and the position in phon that immediately follows the string corresponding to some initial constituent (16b).

(16) a. [ phon  [1–2–…]3         [ phon  [4 > 5]6
          syn   [S NP1, VP2]3  ;   syn   [VP V4, X5]6 ]
          gf    GF1 ]
     b. [ phon  [1–2]3
          syn   [S …, XP1, …, V2, …]3 ]

A more precise formalization of these constructional alternatives is given in Chapter 7.12

Now we are in a position to consider what happens when there are competing alternative constructions. Niyogi (2006) presents an important formalization of competition that clearly lays out the dynamics of competition between two alternatives in the abstract. In the simplest model, knowledge of language resides in a population of N agents A = {A1, A2, …, AN}, some of whom are adults and some of whom are learners. As a first approximation, suppose that there are two languages represented in the population, corresponding to two competing constructions. Call these C1 and C2. Each agent has both constructions in its grammar with weights W1 and W2, respectively. The weight of a construction determines the frequency with which an agent will use that construction.

12 For example, the verb must be more precisely specified because only the highest, inflected verb can appear in second position in this construction.


This situation is one that Niyogi characterizes as 'bilingualism'—the grammars of speakers and learners consist of two competing variants in some proportion. This proportion is what Niyogi calls the mix factor λ ∈ [0, 1]. For convenience, let λi be the proportion of C1 for each agent Ai, that is, W1i/(W1i + W2i). Niyogi is concerned with the conditions under which λ becomes 0, that is, when a variant is eliminated from the population. While each speaker has a different mix factor, suppose as a first approximation that a learner interacts with the entire adult population, for whom λ is the total number of exemplars of one variant divided by the total number of all competing exemplars. While there is variation among the individual λs, Niyogi shows13 that the average mix factor is (17)

(17) E[λ] = ax / (ax + b(1 − x))

where E[λ] is the average value of λ in the population of speakers, a (= W1) is the probability that a speaker will produce evidence for the variant C1, and b (= W2) is the probability that a speaker will produce evidence for the variant C2. From this it follows that

(18) E[λt+1] = aE[λt] / (aE[λt] + b(1 − E[λt]))

where E[λt] is the average value of λ at time t. Letting xt be E[λt], we have Niyogi's formula (19):

(19) xt+1 = axt / (axt + b(1 − xt))

This equation describes a dynamical system, where the average mix at each time t+1 depends on the average mix at time t. If the weights of the two variants are equal, there is a stable mix at .5. But if the weight of one variant is greater than that of the other, the mix will favor the larger variant and in the limit will go to 1 (Niyogi 2006, 330). I call this result 'Niyogi's Law'. The essential point of Niyogi's Law is that if there is more evidence for one variant than another in the network, then the more frequent variant will win in the long run, other things being equal.

From the perspective of accounting for actual change, the modeling is complicated by the fact that it depends on some prior notion of competition; moreover, it

13 Actually, he says that "it is possible to show."


presumes that we have some way of determining the relative frequency of the alternatives.

The first point bears on question 1 at the beginning of this section: What counts as competition? Arguably, in the case of lexical representations there is rarely if ever true competition because there is rarely if ever true synonymy. That is, it is very rare that two words or phrases have precisely the same meaning. This is certainly true if 'meaning' is understood to include not only reference to states of affairs in the world, but social and psychological associations. In the domain of syntax, however, the situation is not as fluid. There is a finite number of basic CS functions to express, and they do not have to do with states of affairs but with thematic structure, logical relations, discourse and pragmatic functions, and referentiality. These functions for the most part do not allow for the range of meaning variation that we find in the lexical categories—there are many fewer dimensions of variation in the CS functions than in the possible nouns or verbs. As I argued in Chapter 3, the reason why grammars of natural language tend to resemble one another is precisely that they are all engaged in expressing the same relatively limited set of universal CS functions.

Given this, I conclude that it is reasonable to assume that two constructions are true competitors when they maintain different licensing conditions for the same CS function, even though genuine synonymy may be rare or nonexistent. One outcome of such a situation is that one construction drives out the other. Alternatively, the two constructions may become differentiated by adding secondary functions. For example, suppose that a language has two constructions for marking yes-no questions, one with inversion, and the other with intonation.
A possible development would be that one construction is used as a direct question, while the other is used as a question with a presumption of a positive answer.

The second point is more fraught. There is no feasible way to determine the relative frequency of competing constructions for a given learner, or even for a community of learners, although it is possible to get a rough measure using corpus data. Moreover, there is no precise way to determine what contributes to the relative frequency of competing alternative constructions. If we take it as given, though, that a bias of some type against a particular construction will lead to its ultimate demise, other things being equal, it is useful to consider the kinds of factors that might contribute to the bias. In cases of contact, construed broadly as we have done here, there are both social and linguistic factors (Thomason 2008, 2010, 2017). I focus on the linguistic factors in section 4.4, under the broad rubric of economy.



4.3.3 When do we actually have competition?

In the simplest case of competition that we have been tacitly assuming, exactly the same CS function is covered by two constructions. The example just cited of different ways to indicate a yes-no question is an instance of such competition. To take another example, which is discussed in some detail in Chapter 5, one construction would mark the subject using linear order, say 'precedes the VP', and the other would use case marking, say 'nominative case'. Yet another example would be free ordering of the constituents of VP, what is typically referred to as 'scrambling'. Here is a general characterization: if μ is the CS function and Σ1 and Σ2 and π1 and π2 are the corresponding representations in syn and phon, respectively, then we have simple competition as represented in (20).

(20)  [ phon  …{π1x | π2x}…
        syn   …{Σ1x | Σ2x}…
        cs    …μx… ]

If the correspondence π1 ⇔ Σ1 is less costly overall than π2 ⇔ Σ2, we expect it to win out in a direct competition, other things being equal. This appears to be the case, for example, of word order taking over from case marking in English and in L2 contact situations.1⁴ If π1 ⇔ Σ1 is less costly than π2 ⇔ Σ2 on one dimension but π2 ⇔ Σ2 is less costly than π1 ⇔ Σ1 on another dimension, then we would expect both to survive, and perhaps even to replace one another cyclically over time. In section 8.5 I argue that this type of competition accounts for variation in Continental West Germanic verb clusters.

In addition to simple competition there can be complex competition. In this case, a set of CS functions is covered by two sets of constructions, but no individual pair of constructions is in simple competition. An example, to be discussed in detail in section 8.3, is topicalization and V2 in Germanic. The two constructions individually have different functions in present-day German and English, although together they appear to cover roughly the same set of

1⁴ See, for example, Wiese & Wallah (2010); Wiese et al. (2014) for discussion of Kiezdeutsch, a variant of German that has developed in multilingual communities.


functions. We can represent such a situation as (21), which subsumes (20) in the limiting case.

(21)  [ phon  …{Π1x | Π2x}…
        syn   …{Σ1x | Σ2x}…
        cs    …Mx… ]

      where Π = {π1 … πi}, Σ = {Σ1 … Σj}, M = {μ1 … μk}

Since the individual constructions are not in competition, head-to-head comparisons do not promise to be that revealing. In order to formulate a preliminary hypothesis about what constitutes relative complexity for sets of constructions, it is essential to consider some actual cases. I do this in Chapter 8, where I track the details of the development of the basic word order licensing conditions for Modern English and Modern Standard German. The picture that emerges is that at least in some cases, changes in one construction that reduce complexity on one dimension push changes in related constructions so that coverage of the original set of CS functions is maintained.

4.4 Economy

As part of the goal of explaining constructional change and variation in terms of economy, we have to formulate a measure of constructional complexity. I suggest in this section that there are three main contributors to complexity that arguably operate at the level of relatively regular constructions and give a specific shape to the grammar of a language.1⁵ These are representational complexity (section 4.4.1), computational complexity (section 4.4.2), and interpretive complexity (section 4.4.3). These are very likely variants

1⁵ Since I am focusing on syntactic constructions, I do not consider factors that regulate constructional competition at the lexical level. For discussion, see Rohdenburg (2007), who considers ‘cognitive complexity’, and Goldberg (2019), who considers preemption. There may well be other factors at play in determining the outcome of competition besides complexity. For instance, Jäger (2007b, 75) proposes that “grammars that increase communicative success and minimize the speaker effort are considered ‘fitter’ than competing grammars that are less successful in this respect.”


of the same thing, but it is useful to distinguish them for the purpose of analysis.1⁶

4.4.1 Representational complexity

Representational complexity has to do with the generality of constructions. There is ample evidence that much of language change is in the direction of greater constructional generality, other things being equal. The loss of licensing conditions is a key component of virtually all accounts of grammaticalization, where specific semantic properties of a word are dropped from a construction as the word takes on a purely grammatical function.1⁷

Minimization of representational complexity also rules out complex representations of simple structures. For example, it rules out representations in which constituents that are adjacent in phon are not strictly local in syn, as in (22).

(22) a. [whoi [ti [called]]] (instead of [who [called]])
     b. [gavei [[a book]j [ti [tj [to Sandy]]]]] (instead of [gave [a book] [to Sandy]])

We can characterize relative generality straightforwardly in constructional terms. Constructions express syn–phon correspondences and syn–cs correspondences.1⁸ A syn–phon correspondence is comparable to what is called 'spell out' in P & P. A syn–cs correspondence is semantic interpretation. It is not necessarily compositional, although again, reduction of complexity will favor relative transparency.

Variation with respect to generality occurs when the conditions on a particular tier of one construction are more restrictive than the corresponding conditions of another construction, but otherwise the constructions are identical. An example would be a construction in which a condition on syn includes

1⁶ The literature on linguistic complexity is substantial and varied, and I do not attempt here to review it systematically. For a range of perspectives, see Aboh & Smith (2009); Ansaldo et al. (2018); Barton et al. (1987); Culicover (2013c); Dahl (2004, 2009); Givón & Shibatani (2009); Miestamo et al. (2008); Newmeyer (2007); Newmeyer & Preston (2014); Sampson et al. (2009); Trudgill (2011).
1⁷ Although there may be other factors involved as well, and generalization is not restricted to grammaticalization; see, for example, Harris & Campbell (1995); Bybee (2003); Jäger (2007a); Kiparsky (2011); Traugott & Heine (1991); Traugott (2003, 2008); Trousdale (2008, 2010); Fried (2009); Noël (2007).
1⁸ Exclamations such as Yikes! are phon–cs correspondences that lack associated syn representations (Jackendoff 2002, 131).


a member of a subcategory of V. For instance, the causative construction in English applies to a lexically specified set of verbs, e.g.

(23) a. The water {boiled / simmered / evaporated}.
     b. I {boiled / simmered / evaporated} the water.
     c. The dish fell.
     d. *I falled the dish.

Loss of the lexical restriction leads to generalization of the construction; in fact, (23d) is attested in child language. Similarly, the dative alternation in English is lexically restricted, as seen in well-known examples like (24).

(24) a. I {gave / donated} some books to the library.
     b. I {gave / *donated} the library some books.

Yet instances of donate in the double object construction of (24b) are attested, suggesting that the lexical restriction does not hold for all speakers, as noted in section 2.2.

Alternatively, generalization may occur in the syn–cs correspondence, where a restriction on a term μn in CS is lost. For example, Greenberg (1978) proposes that demonstratives regularly undergo the diachronic sequence in (25).

(25) demonstrative > definite article > non-generic article > noun marker (classifier or gender)

Assume that a demonstrative, such as this, has the CS licensing conditions illustrated in (26).1⁹

1⁹ For convenience I use an AVM here to capture the semantic facts, rather than the λ-notation used elsewhere.


(26) [phon  this
      syn   [category DEM, lid this]
      cs    [category thing, referentiality referential,
             definiteness definite, location proximate]]

The evolution in (25) can then be seen as the loss of each of the conditions in turn, from location, to definiteness, to referentiality. Loss of location as a licensing condition yields an expression that marks definite reference in context, i.e. a definite article. Loss of definiteness yields an expression that marks referentiality, i.e. a generic article. And loss of referentiality yields a nominal marker.

In sum, one consequence of economy is to drive the simplification of representations, which results in generalization. The tendency toward generalization has of course long been recognized in linguistics; its role can be seen in virtually all descriptions of all phenomena and in all formalisms.20 This said, it has a particularly natural characterization in constructional terms.
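The generalization-as-condition-loss idea can be sketched computationally: treat a construction's CS licensing conditions as a set of feature requirements, and licensing as satisfaction of all of them. Dropping a condition then strictly widens the set of licensed contexts. The function and feature names below are illustrative only, not part of the formalism developed in Chapter 2.

```python
# Sketch: CS licensing conditions as feature requirements.
# Dropping conditions (generalization) monotonically widens
# the set of contexts a construction licenses.
# All names and values here are illustrative, not the book's formalism.

def licenses(conditions, context):
    """A context is licensed iff it satisfies every condition."""
    return all(context.get(feat) == val for feat, val in conditions.items())

demonstrative = {"referentiality": "referential",
                 "definiteness": "definite",
                 "location": "proximate"}

# Greenberg's cline (25) as successive loss of conditions:
definite_article = {f: v for f, v in demonstrative.items() if f != "location"}
generic_article  = {f: v for f, v in definite_article.items() if f != "definiteness"}
noun_marker      = {}   # no CS conditions left

context = {"referentiality": "referential",
           "definiteness": "definite",
           "location": "distal"}        # not proximate

print(licenses(demonstrative, context))     # False: location condition fails
print(licenses(definite_article, context))  # True: location no longer checked
print(licenses(noun_marker, context))       # True: licenses any nominal context
```

Each step of the cline removes one condition, so every context licensed at one stage is still licensed at the next; this monotonic widening is the constructional analogue of generalization.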

4.4.2 Computational complexity

A compelling case for the role of various types of computational complexity in explaining variation has been made by Hawkins (1994, 2004, 2014), so I will only summarize the main themes here. Hawkins shows that speakers prefer to produce sentences in which the length of dependencies is minimized as much as possible. This means that speakers seek to minimize the length of filler-gap chains, the length of dependencies between subcategorized arguments and their heads, and the length of dependencies between the parts of multiword constructions. Hawkins argues for a number of other measures of complexity, but dependency length will suffice for present purposes.

The effects of dependency length minimization are shown by Hawkins to be of two types. In those languages where two linear orderings are possible, there is a clear preference for the one that results in smaller dependency length, in terms of frequency of occurrence in corpora. And in those languages that allow for just one of these orderings, the majority have the ordering with the smaller dependency length.21

An additional type of computational complexity, discussed in Culicover & Nowak (2002), Hofmeister et al. (2015), and Culicover & Winkler (2018), has to do with chain interactions. It has been known since Ross (1967) that extraction from extraposed constituents reduces acceptability, as in (27).

(27) a. a person whoi Sandy saw a picture of ti on the post office wall
     b. *a person whoi Sandy saw a picture tj on the post office wall [of ti]j

Such configurations are said to be 'frozen', and the unacceptability of examples such as (27b) has been attributed to a grammatical principle blocking extraction from the extraposed constituent (e.g. the Freezing Principle of Wexler & Culicover 1980). However, a case can be made that what is responsible for the unacceptability here is not a grammatical principle, but computational complexity. In fact, Hofmeister et al. (2015) show that the acceptability of extraction from a constituent that itself has been extracted from its base position is a linear function of the lengths of the two chain dependencies. Such evidence suggests that judgments in these freezing configurations can be accounted for in terms of processing complexity, and do not require a grammatical principle that blocks certain derivations.

(28) shows the relationships between the chains in derivations that involve subextractions from constituents that have themselves been extracted; these are called Right Surfing and Left Surfing.22 The patterns are schematized in (28).

20 Chater & Vitányi (2003) suggest that the drive toward simplicity may in fact have a general explanatory role in cognition.
21 Harris (2008) argues that some unusual constructions are rare because the circumstances that must come together to make them possible are unlikely. The question, of course, is then: why are these circumstances so unlikely? Harris argues against the notion that they are disfavored by the configuration of the language faculty, that they are the consequence of processing difficulty, or that they are not easily learned. Her argument invokes the circularity of the analysis, the post hoc nature of the learnability scenario, and the lack of independent evidence for processing difficulty. Instead, she argues that these rare constructions require a sequence of many particular changes, each of which is less than inevitable. Thus the chance of the rare construction occurring is low. It is plausible nevertheless that the frequency of constructions depends on relative computational complexity or learnability, regardless of whether or not we yet have a satisfactory way of measuring complexity. It is important not to overlook existing proposals, e.g. those of Gibson and Hawkins, cited elsewhere. Moreover, the strategy of basing an account of computational complexity on the relative frequency of constructions is circular only if it is intended to explain the frequency of those constructions. But Gibson and Hawkins are careful to show that their measures have more general applicability.
22 The term 'surfing' to refer to extractions from extracted constituents originates with Sauerland (1999).
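Hawkins' preference can be illustrated with a toy calculation over two orderings of the same postverbal dependents. The word-count distance used below is a deliberate simplification of Hawkins' domain-based measures, and the sentence material is invented for illustration.

```python
# Toy illustration of dependency length minimization (after Hawkins).
# Distance = number of words from the head to the first word of each
# dependent; a simplification of Hawkins' domain-based measures.

def total_dependency_length(dependents):
    """Sum of head-to-dependent distances for dependents that
    follow the head in the given order."""
    total, offset = 0, 1   # first dependent starts 1 word after the head
    for dep in dependents:
        total += offset
        offset += len(dep.split())
    return total

short_pp = "to Mary"
long_np  = "the book that I bought yesterday"

# 'gave to Mary the book that I bought yesterday' (short before long)
print(total_dependency_length([short_pp, long_np]))   # 4
# 'gave the book that I bought yesterday to Mary' (long before short)
print(total_dependency_length([long_np, short_pp]))   # 8
```

The short-before-long order minimizes total dependency length, matching the corpus preference that Hawkins reports for languages where both orderings are possible.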


(28) a. Right Surfing
        ?the person who I think that he gave a picture t to Mary of t
     b. Left Surfing
        *the person who I think that to t he gave a book t

The idea that it is the complexity of the chain interaction that is responsible for unacceptability in the freezing cases, and that it is not a matter of grammatical well-formedness, is made more plausible by the fact that there are non-surfing chain interactions that also yield unacceptability. Consider the Nesting and Crossing configurations, illustrated in (29).23

(29) a. Nesting
        *the person who I think that books he gave t to t
     b. Crossing
        *the book which I think that to Mary he gave t t

It is, of course, an open question why these chain interactions should be problematic for processing. We can speculate, for example, that the processing demands memory resources that exceed what is normally available (Gibson 1998, 2000), or that the processing of multiple chains runs into confusion because it is trying to manage multiple fillers and multiple gaps in the same space (Lewis 1993; Lewis et al. 2006). We do not need to know precisely how the complexity arises, as long as we have evidence that there is complexity. The unacceptability judgments, and the extreme rarity and even non-existence of configurations such as (28)–(29), lend support to the idea that they are complex. There are, as far as I know, no languages in which a surfing or a crossing configuration has been grammaticalized as a fixed construction, e.g. as a collocation, idiom, or proverb. Interestingly, English has a nesting collocation, exemplified in (30).

(30) a. That's something whichj I have never been able to figure out whati to do ti {with/about} tj.
     b. She's someone whoj I have never been able to figure out whati to buy ti for tj.

This construction is somewhat restricted—it does not easily allow finite tense (31a), and seems less acceptable with other V-P combinations (31b–d).

(31) a. ?That's something whichj I already forgot whati I should do ti with tj.
     b. ?That's something whichj I already forgot whati to clean ti with tj.
     c. ?That's something whichj I already forgot whati to put ti into tj.
     d. ?She's someone whoj I already forgot whoi to introduce ti to tj.

23 I do not intend to suggest that Nesting and Crossing will automatically yield precisely the same kinds of judgments in every instance. The psycholinguistic mechanisms of chain processing are far from being well understood, and there are well-known lexical effects that can ameliorate judgments in otherwise complex sentences. For discussion, see Lewis et al. (2006) and references cited there.

Of course, the argument from frequency of occurrence is circular until we have an independent justification for attributing complexity to these configurations, but there is a prima facie case for complexity in the fact that there is nothing per se that is ill-formed about the offending configurations, as far as the canonical grammatical structures of the language are concerned. The alternative, which is to stipulate that there are universal constraints against configurations such as (28)–(29), is even less satisfactory, because it is a stipulation that begs for a deeper explanation.

This type of complexity also arises with respect to phon in 'flagging' a particular construction, particularly when it is possible to have phonologically null realizations of syntactic constituents. An example that nicely illustrates this is the Old High German (OHG) null subject. Axel (2005) demonstrates that null subjects in OHG were gradually replaced by overt subjects. Crucially, unlike in Romance, it cannot be argued that null subjects were licensed in OHG by strong agreement with the verb. According to Axel, null subjects occur in OHG only when they follow the inflected V, i.e. in V1, V2, and V3 configurations.

(32) a. V1: V[fin]–NPsubj–…
     b. V2: X–V[fin]–NPsubj–…
     c. V3: X–Y–V[fin]–NPsubj–…


Axel assumes that the structure of the OHG main clause is that of (33), while that of the subordinate clause is (34). The VP is V-final, as in (35).24

(33) Root clause: [Cmax Spec1 [C C0 [Imax Spec2 [I I0 …]]]]

(34) Embedded clause: [Cmax Spec1 [C C0 …]]

(35) VP-final V: [Vmax Spec [V′ X1 [V′ X2 V0]]]

According to Axel, null subjects occur with different frequency with respect to different feature values: "In the older OHG prose texts referential null subjects are attested in all persons and numbers. However, it is only in the 3rd person singular and plural that the null variant is used more frequently than the overt variant." Axel attributes the loss of null pronouns to the relatively greater difficulty of parsing sentences with null pronouns: "Crucially, referential null subjects are lost despite the stability of a rich verbal inflection." However, Axel provides no basis for attributing greater processing complexity to this option.

24 The same structures are assumed for Modern German by Culicover & Winkler (2019), following Travis (1991) and Sternefeld (2006). I assume simpler variants of these structures for the discussion of the historical change in Germanic in Chapter 8.


Let us focus on the V2 case (32b), where the inflected verb must follow the first constituent in the clause. Why would the configuration [IP X V[fin] [VP NPsubj tV …]] be more complex if NP is null than if it is overt? And why would there be an advantage to null pronouns in the first place?

Let us assume that an earlier predecessor of OHG lacked unrestricted topicalization to Spec,IP, and hence lacked overt inversion of the inflected verb and the subject NP in main clauses (see Chapter 8). In this language, which presumably had rich inflection, the canonical order would be that of (36).

(36) [IP NPi V+I0 [VP ti ([…tV ])]]

Dropping of pronominal NPs is a reduction in complexity for such structures, because of the redundancy between the overt pronominal form and the rich inflection. I assume that such reduction falls under Hawkins' (2014: 15) principle of Minimize Forms. Next we have an innovation in which any constituent can appear in Spec,IP, as in Modern German (Chapter 8). The constructs that are licensed are of the form in (37).

(37) [IP XP V+I0 [VP (NP) […tV ]]]

The licensed constructs have the V in second position in a canonical declarative main clause. But if the NP in the VP happens to be a null pronoun, the sequence does not satisfy the licensing conditions of the V2 construction. The processor expects an NP immediately following V+I0, and when it does not see an NP there, must infer a null subject pronoun in order to license the sequence. This last step in parsing the sequence is an additional cost that accrues just to those sequences in which the finite verb is early in the sequence. In other words, a structure such as (37) is a systematic garden path.

On an approach to parsing that assumes a probabilistic context-free grammar (Schuler et al. 2010), the processor uses a parsing strategy in which it follows the most probable trajectory until it runs into counter-evidence. In that case, it switches to the next most probable, and so on.
On the assumption that XP–V–NP–… is a more likely parsing path for V2 constructs with topics than XP–V–…, there will in general be a greater cost to the processing of V2 structures with null subjects than with overt subjects. So we have a case of constructional competition in which V2-with-overt-subject is simpler than V2-with-null-subject and ultimately forces it out.
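The garden-path cost posited here can be given a toy surprisal characterization: the processing cost of a continuation is the negative log of its probability, so the rarer the null-subject continuation after XP–V[fin], the costlier it is when it occurs. The probabilities below are invented for illustration, not estimated from OHG corpora.

```python
import math

# Toy surprisal sketch of the V2 null-subject cost.
# Probabilities are invented for illustration, not corpus estimates.

def surprisal(p):
    """Processing cost (in bits) of an outcome with probability p."""
    return -math.log2(p)

p_overt = 0.9   # assumed: parser expects an overt NP after XP-V[fin]
p_null  = 0.1   # assumed: the null-subject continuation is the rarer path

print(round(surprisal(p_overt), 2))   # 0.15 bits: the expected, cheap path
print(round(surprisal(p_null), 2))    # 3.32 bits: the garden-path penalty
```

As the null-subject construction loses ground, p_null falls and the penalty grows, which is one way to model the feedback loop by which the overt-subject variant ultimately wins.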



4.4.3 Interpretive complexity

The third type of complexity has to do with the problem of identifying the precise interpretation given a particular form. I assume that a grammar in which a form has more than one meaning is more complex from the perspective of interpretation than a grammar that has one form for each meaning.25

A well-documented example of such complexity is discussed by Deo (2015). Deo shows that the interpretation of a simple finite tense marker as denoting both simple present and progressive is unstable; the instability is resolved by the emergence of a distinct form for the progressive. An interesting consequence of reducing complexity by introducing a new construction is that it increases complexity in another domain of the grammar. As Deo (2015) shows, when the new construction is introduced, it overlaps in its CS function with the older construction. One outcome of this competition is that as the new construction competes with the older construction, it generalizes, and ultimately supplants it.

Another instance of this type of complexity is the English contraction of is.26

(38) Who's hungry?

If is is immediately adjacent to a phonologically null complement, contraction is impossible (Sag 1978, 146). The examples in (39) show that this is the case regardless of the reason for the null complement. In (39a), the null complement is the trace of an NP. In (39b), it is the trace of an AP, and in (39c), it is the trace of an Adv. In (39d), it is an ellipsis site, which many would argue is a null proform (see for example Lobeck 1995 and Culicover & Jackendoff 2012). Example (39e) shows that contraction is blocked before comparative ellipsis as well. Example (39f) shows that contraction is blocked in the case of Right Node Raising, where the complement does not appear immediately adjacent to the copula, whether or not there is a null constituent there.

(39) a. I never found out who the villain is/*'s.
     b. You won't believe how rich Sandy is/*'s.
     c. Please tell me where Kim is/*'s.
     d. Sandy is funny and Kim is/*'s, too.

25 For a derivational version of this idea, see Reinhart's (2006) principle 'Minimize Interpretive Options', which disfavors applying operations that proliferate interpretations for a single phonological form.
26 Thanks to Jack Hawkins for suggesting this example to me.


     e. Sandy is taller than Kim is/*'s.
     f. Sandy is/*'s and Kim isn't likely to succeed.

Arguably, the overt presence of is marks the site of its complement, which can then be identified as null and properly interpreted. However, if is is contracted to 's, this identification is more difficult, on the assumption that, as a consequence of the attachment of the auxiliary to the subject, there is no longer a head that can serve as the governor for the phonologically null complement, in Lobeck's (1995) sense.27

In fact, we might speculate that a number of classical grammatical effects that appeal to the absence of 'proper government' are due to the greater cost of identifying structure under certain specific conditions. For example, that-trace effects and more generally complementizer-trace effects are ameliorated by phrases between the complementizer and the trace. The examples in (40) are less acceptable than those in (41), which show the Adverb Effect originally noted by Bresnan (1977) and discussed by Culicover (1993).

(40) a. *Sandy is the only person who I admitted that t might have known the answer.
     b. *This is a problem that they couldn't figure out how t should be addressed.
     c. *That is the person who I was wondering whether t ought to be discouraged from running for office.

(41) a. Sandy is the only person who I admitted that with a little bit of help t might have known the answer.
     b. This is a problem that they couldn't figure out how under the present circumstances t should be addressed.
     c. That is the person who I was wondering whether in times like these t ought to be discouraged from running for office.

A possible account of the Adverb Effect is that the complementizer can only flag the presence of a sentential complement, but not an initial gap in the complement, perhaps for prosodic reasons (e.g. Sobin 2002; Sato & Dobashi 2016, but cf. Ritchart et al. 2016). The intervening adverb provides sufficient prosodic material to create the expectation of a subject gap, which is thereby less surprising. Such a story must be taken with a grain of salt, though, since as

27 An alternative explanation that does not appeal to government is that cliticization of 's is to the following constituent (Bresnan 1971). If the following constituent is phonologically null, cliticization is blocked.


far as I know there is no independent evidence about the expectations triggered by a complementizer that make such a scenario non-circular. It is just as easy to say that the complementizer does not generate the expectation for a subject gap immediately adjacent to it because of a constraint against *C0-t.28

A related case concerns the presence or absence of that and relative proforms on the left edge of relative clauses. In Culicover (2013a) I show that otherwise well-formed relative clauses are unacceptable when there is no proper signal of the left edge of the clause. This signal occurs when there is a left-adjoined constituent in the relative clause, as in (42)–(45).

(42) a. He is a man to whomi libertyj, we could never grant tj ti.
     b. This is the beeri which on the tablej, Sandy deposited ti carefully tj. (Baltin 1981)

(43) He is a man to whom because of this I will be unable to give any money.

(44) ?He is a man to whom never would I give any money.

(45) a. They then told me about the time when right into the room walked the Queen of England.
     b. Please give me one reason why under this very table is sitting the fabulous Hope Diamond.
     c. Detroit is a town which/that in almost every garage can be found a car manufactured by GM.

When a complementizer is absent, the same kinds of relative clauses are much less acceptable.29

(46) a. *He is a mani libertyj, we could never grant tj to ti.
     b. *This is the beeri on the tablej, Sandy deposited carefully ti tj. (Baltin 1981)

(47) *He is a man because of this I will be unable to give any money to.

(48) *He is a man never would I give any money to.

(49) a. *They then told me about the time right into the room walked the Queen of England.

28 Rizzi (1990) derives such a constraint from the failure of Spec-head agreement when there is an overt complementizer.
29 There are other factors at play here as well: the cases in (46a,b) with multiple argument gaps are worse than those in (47)–(48) with one argument gap and one adjunct gap.


     b. *Please give me one reason under this very table is sitting the fabulous Hope Diamond.
     c. *Detroit is a town in almost every garage can be found a car manufactured by GM.

In Culicover (2013a) I argue that the generalization about these cases is that they lack an overt complementizer and have a constituent that precedes the subject. This configuration inhibits recognition of the left edge of the clause. Similar observations can be made for extraposed and stacked relatives. Moreover, when there is no preposed constituent in the relative clause, but the subject is not a canonical NP, similar difficulties arise, as seen in the examples in (50)–(53).

(50) that-clause
     a. *Otto appears to be a man [∅ [S that it is snowing hard] apparently doesn't bother t].
     b. ?Otto appears to be a man [who [S that it is snowing hard] apparently doesn't bother t].
     c. Otto appears to be a man [(who) it apparently doesn't bother t [S that it is snowing hard]].

(51) for-to infinitive
     a. *Colette is the kind of woman [∅ [S for me to speak better French] would probably have pleased t].
     b. ?Colette is the kind of woman [who [S for me to speak better French] would probably have pleased t].
     c. Colette is the kind of woman [(who) it would probably have pleased t [for me to speak better French]].

(52) embedded wh-question
     a. *We interviewed a candidate [∅ [S whether it is polite to make eye contact] apparently was not obvious to t].
     b. ?We interviewed a candidate [who [S whether it is polite to make eye contact] apparently was not obvious to t].
     c. We interviewed a candidate [who it apparently was not obvious to t [whether it is polite to make eye contact]].

(53) gerund
     a. *I don't know anyone [∅ [S speaking better French] wouldn't appeal to t].
     b. I don't know anyone [who [speaking better French] wouldn't appeal to t].


All of the examples with embedded complex subjects and nothing in complementizer position are quite unacceptable. A plausible explanation is that they are grammatical, but structured in such a way that they systematically create interpretive complexity.

4.5 Simulating competition

In this section I show through simulation how complexity may lead to the dominance of less complex constructional alternatives over more complex ones, as Niyogi's Law predicts.30 The basic idea of the simulation is that there is a network of learners, each of whom has one of a set of constructions for expressing a particular CS function. Each construction may have a negative bias associated with it, which is understood here as relative complexity. When two or more constructions are in competition, the simplest variant, that is, the one with the least negative bias, pushes out the more complex ones, other things being equal. However, since other things are not always equal, it is possible for complex constructions to survive, sometimes indefinitely. The absence of contact is a key factor that contributes to the survival of complex constructions—there is no competition. This result is very similar to what is seen cross-linguistically, where languages in isolation can preserve constructions that are arguably very complex.

To begin, let us assume a population of agents that both produce and are exposed to linguistic expressions. Assume that each agent in the population interacts with the same number of other agents over a specified distance. The language of each agent is defined by values on a number of features, corresponding to dimensions of variation. Each value has a bias associated with it, which defines its relative complexity. We suppose for purposes of the simulation that there are three constructional features, A, B, and C, each of which permits four alternatives, i.e. {A1,A2,A3,A4}, {B1,B2,B3,B4}, and {C1,C2,C3,C4}. For purposes of the simulation, a language is defined as some combination of alternative values of features A, B, and C. A typical network of agents is given in Figure 4.1; the distribution of languages over the population is shown in the histogram in Figure 4.2.

30 The simulation environment was programmed by Wojtek Borkowski and is reported on in Culicover & Nowak (2003).


Figure 4.1 Initial state, 25,000 speakers.

Figure 4.2 Initial histogram. 64 languages: three features, four values each.


Each gradation on the grayscale corresponds to a particular combination of feature values. Since there are three features with four values each, the number of possible combinations is 64. In Figure 4.1, the alternatives are evenly distributed across the population, as seen in the histogram in Figure 4.2.

An agent changes its value for a given feature if the weight of the evidence points to some other value. When there is neutral (that is, 0) bias on a feature, this will occur if the majority of partners that the agent interacts with have the other value. However, if there is bias, the bias against a particular feature value is subtracted from the total strength of the feature values of the interaction partners. Thus, if there is a bias against a particular feature value, it is less likely to be adopted by learners in the course of interaction, even if it is widespread among the population. The bias, if it is strong enough, may facilitate the growth of a minority feature value.

To make the simulation realistic, we may suppose that three of the values of each feature correspond to attested linguistic phenomena, while the fourth value is something that is very unlikely in a natural language, if it is possible at all.

• Argument structure, where linear order, case marking systems, and verbal agreement compete. The fourth 'impossible' value is relative vowel length on the stressed syllable of the head noun.
• Wh-questions, where the competition is between a wh-phrase in initial position, with a long or short dependency with the gap, and wh-in-situ. The fourth 'impossible' value has the wh-phrase in final position.
• Yes-no questions, where word order, morphological marking on the head of the sentence, and intonation compete. The fourth 'impossible' value is omission of inflection on the head of the sentence.
Let us also suppose that for each of the three features, values A1, B1, and C1 are the most preferred; that is, they have the lowest bias against them, they are the least complex, etc. And for feature values 2, 3, and 4 there is increasing bias. So, for X = {A,B,C}, X1 has 0 bias, X2 has a bias of .25 against it, X3 has a bias of .5 against it, and X4 has a bias of .75 against it. The most preferred language will thus be [A1,B1,C1]. Its three features have strengths [1,1,1], with no bias against any feature values. All other combinations will be less preferred. The least preferred will be [A4,B4,C4], with feature strengths [.25,.25,.25], i.e. with .75 bias against each feature value. The preferences for the 64 combinations are shown in Figure 4.3, obtained by multiplying the biased values of the features.
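These preferences can be computed directly: a value with bias b has strength 1 - b, and a language's strength is the product of its three feature strengths. A minimal sketch (the variable names are mine, not the simulation environment's):

```python
from itertools import product

# Strength of value i: 1 minus the bias against it
# (biases 0, .25, .5, .75 for values 1-4).
strength = {1: 1.0, 2: 0.75, 3: 0.5, 4: 0.25}

# Every language is a combination of one value per feature A, B, C;
# its overall strength is the product of its feature strengths.
languages = {}
for a, b, c in product(strength, repeat=3):
    languages[f"A{a}B{b}C{c}"] = strength[a] * strength[b] * strength[c]

print(len(languages))        # 64 possible languages
print(languages["A1B1C1"])   # 1.0: the most preferred language
print(languages["A4B4C4"])   # 0.015625: the least preferred
```

The resulting table reproduces the pattern of Figure 4.3: strength falls off multiplicatively as more dispreferred values are combined.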


Figure 4.3 Strengths of feature value combinations for all languages.

The initial state of the simulation is shown in Figure 4.1. If we run the simulation with no bias for 1,000 steps we get the kind of result shown in Figure 4.4.

Figure 4.4 Simulation with no bias after 1,000 steps.


Since there is clustering, the populations of the languages vary, but they are all attested. There is no way to predict the population of any particular language, and the results will be different each time the simulation is run. Figures 4.5 through 4.10 show the progression of the language distribution when there is bias along the lines represented in Figure 4.3. Note how the distribution of languages begins to approach the pattern predicted by Figure 4.3. The final result, if we run the simulation long enough, is that only the strongest combination of feature values survives.

This simulation illustrates our central point: a bias against a particular construction, due to complexity or other factors, will lead to the demise of this construction in the long term, other things being equal. Our discussion of complexity in section 4.4 suggests how to give a natural characterization of

Figure 4.5 First snapshot.
Figure 4.6 Second snapshot.
Figure 4.7 Third snapshot.
Figure 4.8 Fourth snapshot.
Figure 4.9 Fifth snapshot.
Figure 4.10 Sixth snapshot.

complexity in terms of constructions. Thus we can establish a plausible basis for using constructions as the descriptive framework for accounts of change and variation. Moreover, we can reason backwards and hypothesize that if a particular feature value or combination of feature values is rare, then it is possible that the cause is a complexity bias. This hypothesis is most plausible if the rarity is seen across all language families and geographical regions, since it is then less likely to be due to genetic factors, where the ancestor lacks a certain property and this lack is transmitted to its descendants. This reasoning is at the root of the work of Hawkins cited earlier, of Culicover (2013c), and of the constructional perspective argued for in this book.
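The dynamics of this section can be sketched in miniature: one feature, two competing values, agents on a ring, and an update rule in which each agent adopts the value with the greatest bias-discounted support among its neighbors. This is a drastic simplification of the Culicover & Nowak environment (one feature instead of three, a one-dimensional ring instead of a 2D network), intended only to show the biased value being forced out.

```python
import random

random.seed(1)

# Miniature constructional competition: one feature, two values.
# Value 0 has no bias; value 1 carries a complexity bias of .25.
# A drastic simplification of the Culicover & Nowak (2003) setup.
bias = {0: 0.0, 1: 0.25}
N = 100
agents = [random.choice([0, 1]) for _ in range(N)]   # mixed initial state

def step(agents):
    """Each agent adopts the value with the greatest bias-discounted
    support among its four nearest neighbors on a ring."""
    new = []
    for i in range(N):
        neighbors = [agents[(i + d) % N] for d in (-2, -1, 1, 2)]
        support = {v: neighbors.count(v) * (1 - bias[v]) for v in (0, 1)}
        new.append(max(support, key=support.get))
    return new

for _ in range(50):
    agents = step(agents)

# The unbiased value comes to dominate, as Niyogi's Law predicts.
print(agents.count(0), agents.count(1))
```

With the bias set to zero for both values, the same code instead yields stable coexisting clusters, mirroring the contrast between Figure 4.4 and Figures 4.5 through 4.10.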

4.6 Summary

To review, I argue that a possible human language is one that is defined by a grammar consisting of constructions. What is fundamentally universal in language is conceptual structure and the need to express the core functions of conceptual structure. A grammar may contain competing constructions for the same CS function. Constructions that express a particular CS function can be ranked in terms of their complexity. More general constructions are less complex than more specific constructions, other things being equal. And constructions that require less computation are less complex.

Assuming that constructions compete in the course of learning and language contact, economy prefers less complex over more complex


constructions, other things being equal. Consequently, constructions that rank low in complexity are likely to be very frequent or even universal, compared to their more complex competitors. Those that rank high are likely to be rare or even nonexistent. Within this framework, we have a characterization of the notion 'possible human language', as well as the basis for a solution to Chomsky's Problem: the vocabulary for describing constructions, as developed in Chapter 2, defines the dimensions of variation, while the complexity metric accounts for the relative universality of variants.


PART II

VARIATION


5 Argument structure

5.1 Introduction

In this chapter I apply the constructional approach to universals and variation set out in Chapters 1–4 to argument structure marking. Argument structure, on this approach, is the individuation of participants in a CS relation. Classically, these participants have been distinguished in CS in terms of thematic, or semantic, roles, such as agent, patient, theme, instrument, etc. (Gruber 1965; Fillmore 1968; Jackendoff 1972). It is the job of the grammar, and in particular of the mapping between syn and phon, to distinguish the phrases in a sentence that refer to the participants in a relation in such a way that their roles can be reliably identified. I focus in this chapter on the morphosyntactic devices for mediating the correspondences between CS functions and phonological form. My main focus is on how the variety of such devices can be accommodated in a constructional approach. Discussion of grammatical functions is left largely to Chapter 6. The chapter is structured as follows. Section 5.2 reviews the formal devices used to mark argument structure in natural languages, and considers the role of constructional complexity in constraining the variation and the CS properties that argument structure constructions express. Section 5.3 analyzes some representative cases of differential marking of arguments, with the goal of demonstrating the flexibility and precision afforded by the constructional approach. In these cases, what appears to be a single grammatical argument type is differently marked depending on its CS properties. For example, what would be a subject in a language like English might in certain circumstances be marked with dative case, the canonical case for an indirect object. For differential marking, we ask whether it is possible to understand the divergences in terms of alternate paths toward reducing constructional complexity.
While differential marking appears to introduce complexity into the grammar, a case can be made that there are competing dimensions where the introduction of differential marking results in reduced complexity. Section 5.4 shows through computational simulation

Language Change, Variation, and Universals: A Constructional Approach. Peter W. Culicover, Oxford University Press. © Peter W. Culicover 2021. DOI: 10.1093/oso/9780198865391.003.0005


how differential marking might emerge as the consequence of such complexity reduction.

5.2 Argument structure constructions (ASCs)

The argument structure of a sentence is a function of the predicate that denotes the CS relation, and the phrases that denote the arguments of this relation. The predicate is whatever corresponds to the CS functor, e.g. go′, fall′, eat′, give′, break′, etc. I assume for simplicity that these are primitives of CS, and that each one specifies a CS argument structure and the roles of the arguments.1

5.2.1 Devices

The typological literature suggests that there are three kinds of devices that languages use to distinguish arguments (Nichols 1986).

Linear order: A phrase is in a designated linear position with respect to the phrase that contains it (initial position, second position, or final position, or left- or right-adjacent to the head of the phrase). These are the positions that can be specified independently of the internal properties or length of the phrase (Culicover 1999).

Head-marking: A phrase agrees with particular morphological properties marked on the head of the phrase that governs it.

Dependent-marking: A phrase is morphologically marked within the phrase that contains the head that governs it.

Linear order is the most restricted option for distinguishing arguments. Given two arguments, it is only possible to vary their relative order and their order relative to the verb. Constructions that use linear order to distinguish arguments were introduced in Chapter 2 for English subject and object; I repeat them here.

(1) Construction: Subject
    syn [S NP1, AUX2, …]3
    gf  [GF1 (> …)]3

1 Thus I do not attempt to represent the lexical semantics of those verbs that can be decomposed into more primitive CS predicates; see Jackendoff (1983, 1990, 1997, 2002) for one prominent approach to lexical semantics.
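To make the idea of a construction as a correspondence between tiers concrete, here is a toy encoding in Python. The dictionary representation and the crude `licenses` check are my own simplifications for illustration; they are not the instantiation-based formalism of Chapter 2.

```python
# Illustrative sketch (not the book's formalism): a construction pairs
# tiers such as syn and gf; a construct is (crudely) licensed only if
# it supplies every tier the construction specifies.
subject_construction = {
    "name": "Subject",
    "syn": "[S NP1, AUX2, ...]3",  # NP1 in initial position of S
    "gf":  "[GF1 (> ...)]3",       # NP1 corresponds to the highest GF
}

def licenses(construction, construct):
    # A deliberately crude stand-in for the instantiation relation:
    # the candidate construct must instantiate every non-name tier.
    return all(tier in construct for tier in construction if tier != "name")

construct = {"syn": "[S [NP Sandy] [AUX will] ...]", "gf": "[SUBJ > OBJ]"}
print(licenses(subject_construction, construct))  # -> True
```

A construct lacking one of the specified tiers (say, one with syn but no gf correspondence) would fail this check, which is the intuition behind licensing via instantiation, stripped of all formal detail.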


(2) Construction: Object
    syn [VP V, NP1]2
    gf  [GF > GF1]2

In the absence of a ‘movement’ construction, the phon of an argument is in reasonably close proximity to the phon of its predicate. For example, the subject in English is for the most part immediately adjacent to the inflected head of its phrase.2 If it is not in an argument position, it is in a designated non-argument (A′) position that cannot be confused with an argument position.3 Similarly, the object is more or less adjacent to the predicate, unless it is in a designated A′ position or scrambled to the end of the sentence for reasons of focus (Rochemont & Culicover 1990). While the exact definition of where the A-position of an argument is in the linear ordering may be somewhat flexible, what does not occur is large-scale interpolation, in violation of locality. In such a case, the predicates and their arguments would be interspersed among one another in such a way that the subject of V1 is closer to predicate V2 than the subject of V2 is. Consider, for instance, the correspondence in (3). For readability I have added lexical items.

(3) phon taylor5–told1–kelly6–chris3–loves2–peyton4
    syn  [S NP3[Chris] [VP V1[told] NP4[Peyton] [S NP5[Taylor] [VP V2[loves] NP6[Kelly]]]]]
    gf   [GF3 > GF4]1, [GF5 > GF6]2
    cs   tell′1(c3, p4, love′2(t5, k6))

Note the absence of alignment between the phon and syn here. Constituents that are adjacent in the tree do not correspond to adjacent portions of phon at all. The subject of V1 appears immediately before V2, while the subject of V2 appears immediately before V1. And each object appears in a position adjacent to the other verb. As far as I know, such radical scrambling does not occur in natural languages. Even when the arguments are distinguished by case-marking or other morphology, relative locality appears for the most part to be preserved. Typically, where one argument appears in place of another argument or in another clause, the argument is in a designated focus position or intonationally marked, as illustrated in the examples of Russian ‘prolepsis’ in (4).

2 Departures from immediate adjacency in phon include simple adverbs (Sandy slowly opened the door) and parentheticals (Sandy, smiling broadly, opened the door).
3 A wh-subject, e.g. Who called?, is in situ in argument position and not in an A′ position. See section 7.3 for discussion.

(4)

Russian (Fortuin & Davids 2013)
    a. U menja otgadaj kto teper’ ostanovilsja.
       with me guess who now stayed
       ‘Guess who is staying with me now.’
    b. On očen’ ploxo gotovit. Ne znaet daže šči, kak svarit’.
       he very bad cooks. neg knows even cabbage.soup how to.cook
       ‘He is a very bad cook. He doesn’t know how to boil even cabbage soup.’
    c. Ja Saše xoču, čtoby Boris pozvonil.
       I S. want, that B. calls
       ‘I want Boris to call Sasha.’

Alternatively, a verb may be marked to indicate that its apparent subject or object is an argument of another predicate, as in the case of wh-extraction in Chamorro (5).

(5) Chamorro (Chung 1994, 14)
    Hafa masangane-mmu ni policeman [t pära un-cho’gui t]?
    what? wh[obj2].be.told-agr obl policeman [t fut wh[obj].agr-do t]
    ‘What were you told by the policeman you should do?’

It is possible, in principle, that a language can use differences in linear order to mark a semantic distinction. For example, if S must precede O for independent reasons, a single language could use just the orders SVO and VSO to mark some distinction in the subjects using the order with respect to V, or


SVO and SOV to mark some distinction in the objects. An example of the latter is Armenian, where SVO is used to mark definiteness, while SOV marks indefiniteness or non-specificity.

(6) Armenian (Dum-Tragut 2009, 562)
    a. Ani-n kard-um ē ir nor girk’-ě
       Ani.nom-the read-ptcp.pres she.is her new book.nom-the
       ‘Ani is reading her new book.’
    b. Ani-n girk’ ē kard-um
       Ani.nom-the book.nom she.is read-ptcp.pres
       ‘Ani is reading a book.’ (Lit.: Ani reads books.)

A schema for dependent-marking is given in (7). 1(2) is the value of the paradigm function for NP1 with morphological features 2.

(7) Schema: Dependent-marking ASC
    phon […–1(2)–…]3
    syn  [S [cat NP, lid lid1, ϕ [case case2, …]], …]3
    gf   [GF1]3

A typical example is Russian, where the canonical subject has nominative case and the canonical object has accusative case.

(8) Ivan knigu čitaet
    Ivan.nom book.acc reads
    ‘Ivan is reading the book.’

    phon [Ivan1(nom.sg2)–knig3(acc.sg4)–čitaet5]6 = [Ivan1,2–knigu3,4–čitaet5]6
    syn  [S [category N, lid Ivan1, ϕ [case nom, number sg]2], [category N, lid knig-3, ϕ [case acc, number sg]4], V[čitaet]5]6
    gf   GF1 > GF3
    cs   [read′5(agent:i1, patient:b3)]6


In comparison with configuration, case morphology offers the richest options for marking thematic distinctions. For example, Finnish has fifteen different cases and Agul has twenty-five, many of them referring to spatial relationships. The richness of case morphology explains why differential marking is typically reflected in case distinctions. But case does not necessarily map one-to-one to thematic roles. For example, in Kuuk Thaayorre, both participants in a part-whole relationship may have the same case in a simple sentence, and may appear in either position with respect to one another. An example is given in (9).

(9) Kuuk Thaayorre (Gaby 2005, 19)
    kuta nhul paant glass-ak rokr
    dog.nom 3sg.nom head.nom glass-dat enters
    ‘The dog puts his head into the jar.’

As Gaby points out, while the dog’s head is in the jar, the dog is in the jar as well, in a sense. For such cases, we must presume a construction that allows two NPs with the same case to correspond to a part-whole relation in CS. Which is the part and which is the whole is not marked syntactically. Finally, let us consider the head-marking, or agreement, schema. Agreement is somewhat more flexible than linear order and less flexible than case marking. An ASC must distinguish an argument at CS and specify its CS properties. These two functions of the ASC need not be determined by the same components of the morphosyntactic representation. For example, in agreement, the GF of an overt argument is indicated in one of two ways: one is by verbal inflectional agreement, the other by agreement morphemes or clitics. In the case of head-marking, certain features φ of an argument are marked morphologically on the verb itself, as represented by 2(4) in the phon of (10). There may or may not be an overt phrase that corresponds to this argument. If there is, agreement is represented by establishing the identity of the φ properties of the overt phrase and the morphologically specified argument.
(10) Schema: Head-marking ASC
     phon [1, 2(4), …]3
     syn  [S ([cat NP, lid lid1, ϕ φ4]), [cat V, lid lid2, arg φ4], …]3
     gf   [GF1]3
     cs   [2′(θ:[1′∪4′], …)]3


Verbal inflection restricts the set of discourse referents that may satisfy it. Typically these restrictions are those of person, number, and noun class (including gender). The more specific referential properties of the argument are supplied by an overt NP. But in the absence of an overt NP, it is still possible to identify, more or less precisely, the discourse entity that has a particular role on the basis of the inflectional restrictions. To take a simple example, consider Italian verbal inflection.

(11) Italian
     a. Dorme.
        sleep.3sg
        ‘He sleeps/is sleeping.’
     b. Gianni dorme.
        G. sleep.3sg
        ‘Gianni sleeps/is sleeping.’

There is an agreement relation between Gianni and the 3sg inflection that identifies the subject as a singular third person individual. Gianni narrows the reference of this individual. In other words, on this analysis (11b) is functionally equivalent to English Left Dislocation, as in (12).

(12) John, he’s sleeping.

This functional equivalence does not necessarily mean, however, that the morphological inflection is the syntactic equivalent of a pronoun, as proposed, for example, by Jelinek (1984, 1985).⁴ As the construct in (13) shows, it is possible to capture the functional equivalence without taking this step. In a constructional framework the inflection may be stated directly in terms of the morphological features of the syntactic constituents.

(13) phon [gianni1–dorme2,4]3
     syn  [S [cat NP, lid Gianni1, ϕ [person 3rd, number sg]4], [cat V, lid dorm2, arg [person 3rd, number sg]4]]3
     gf   GF1
     cs   [sleep′2(g1∪4′)]3
⁴ For arguments against Jelinek’s analysis, see LeSourd (2006).


The argument of dorme is third person singular. These properties are merged with and consistent with the overt argument, Gianni, so the CS representation is well-formed.⁵ If the NP is not a third person, e.g. io ‘I’, then there is a conflict produced by the third person feature 4′ of the verb and the first person feature of the NP, and the construct is morphosyntactically and semantically ill-formed.⁶ Similar considerations hold for pronominal agreement morphemes or clitics. I assume that these, like verbal inflection, are the values of the paradigm function in phon. Again, it is possible to treat such morphology as the output of syntactic operations, and there is a long tradition of such analyses in MGG (e.g. Kayne 1975). But even the earliest work on morpheme structure, e.g. Perlmutter (1970), showed that the morphological structure of ‘pronominal’ agreement morphology cannot be accounted for simply in terms of movement and adjunction in the syntax. It is still necessary to stipulate the overt form of the inflection in phon, which is the job of the paradigm function.
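The feature matching at work in (13) — the verb's 3sg argument features merging with those of the overt NP, with any clash yielding ill-formedness — can be sketched as a simple unification routine. This is my own illustrative simplification, not the book's formal mechanism.

```python
# Minimal sketch of φ-feature matching for the Italian example (13):
# the argument features on the verb unify with those of the overt NP;
# a conflict (e.g. io 'I' with 3sg inflection) makes the construct
# ill-formed. Feature names are illustrative.
def unify(phi1: dict, phi2: dict):
    # Merge two feature bundles; return None on any conflicting value.
    merged = dict(phi1)
    for feat, val in phi2.items():
        if feat in merged and merged[feat] != val:
            return None  # conflict: construct is ill-formed
        merged[feat] = val
    return merged

verb_arg = {"person": 3, "number": "sg"}  # dorme's argument slot
gianni   = {"person": 3, "number": "sg"}  # overt NP Gianni
io       = {"person": 1, "number": "sg"}  # overt NP io 'I'

print(unify(verb_arg, gianni))  # -> {'person': 3, 'number': 'sg'}
print(unify(verb_arg, io))      # -> None (person clash)
```

Note that, as the text goes on to observe, such merger of features cannot simply be reduced to referential consistency at CS; the sketch models only the morphological side of the relation.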

5.2.2 CS features

Turning now to CS, the roles agent and patient constitute the fundamental distinction between the arguments of a transitive predicate. In their prototypical form they constitute prototypical transitivity (Hopper & Thompson 1980). Dowty (1991) proposed a decomposition of the classical θ-roles agent and patient into more fine-grained thematic entailments, along the lines of (14) and (15).⁷

(14) Agentive properties
     a. sentience (and/or perception)
     b. volitional involvement in the event or state
     c. causing an event or change of state in another participant
     d. movement (relative to the position of another participant)
     e. existence independently of the event named by the verb

⁵ It is necessary to stipulate that the inflectionally marked argument in Italian corresponds to the external argument in CS, that is, x in … λx.f(x,…), regardless of the number of arguments.
⁶ However, I assume that morphological agreement cannot be reduced to referential consistency at CS. While in most cases the two are fully compatible, examples such as English These pants are expensive; This furniture is expensive show that morphological agreement has to be stated independently.
⁷ For alternatives and extensions to Dowty’s original proposal, see among others Ackerman & Moore (2001), Arkadiev (2008), Grimm (2011), Primus (1999), and Wier (2011).


(15) Patientive properties
     a. undergoes change of state
     b. incremental theme
     c. causally affected by another participant
     d. stationary relative to movement of another participant
     e. no existence independently of the event, or no existence at all

An argument with all of the agentive properties is often taken to be a prototypical agent, and one with all of the patientive properties, a prototypical patient. But in the grammatical literature the terms ‘agent’ and ‘patient’ are often used to refer to individuals that are not agents or patients in the prototypical sense. For example, the two arguments of a predicate meaning ‘see’ will usually be referred to as the ‘agent’ and the ‘patient’, even though neither argument has many of the prototypical properties listed in (14) and (15). In addition, as Primus (2012) has argued, there are non-thematic properties of CS arguments that are highly correlated with some of the properties in (14) and (15). These non-thematic properties may, in principle, also be associated with particular morphosyntactic devices for marking arguments. The property animate, for example, correlates highly with sentient, cause, and movement. Definiteness appears to correlate with causally affected and complete. As suggested in the representation of dependent marking in (7), ASCs are distinguished not only by the particular grammatical form assigned to an argument, but by the corresponding CS properties. In principle we could have as many ASCs as there are subsets of the set of CS properties, and grammars that consist of arbitrary combinations of these ASCs.
However, as work going back at least as far as Hopper & Thompson (1980) and continuing through Dowty (1991) and related work shows, there are certain aspects of CS that are particularly salient, most notably causal agency and affectedness.⁸ These are the thematic properties that are associated with linear order by very young children learning English (see, for example, Fisher et al. 2010). Even prelinguistic children already have a sense of what constitutes an intentional agent (Kuhlmeier et al. 2003). In what follows I assume these properties to be innate and central to human cognition and therefore reliably and regularly reflected in grammatical constructions cross-linguistically.⁹ ⁸ See Arkadiev (2008) for discussion of a range of CS properties that appear to be relevant for distinguishing arguments. ⁹ See Jackendoff (2002) for arguments that these properties are central to an account of the evolution of grammar.
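One way to make the Dowty-style decomposition in (14)–(15) concrete is a toy classifier that counts which entailments an argument carries. The property names and the simple majority rule below are my illustrative simplifications, not Dowty's actual Argument Selection Principle (which compares the two arguments of a predicate against each other).

```python
# Sketch of proto-role classification based on the property lists in
# (14) and (15): count agentive vs patientive entailments. The property
# names abbreviate the entries in (14)-(15); the majority rule is an
# illustrative simplification.
AGENTIVE = {"sentience", "volition", "cause", "movement",
            "independent_existence"}
PATIENTIVE = {"change_of_state", "incremental_theme", "causally_affected",
              "stationary", "dependent_existence"}

def proto_role(properties: set) -> str:
    a = len(properties & AGENTIVE)
    p = len(properties & PATIENTIVE)
    if a > p:
        return "proto-agent"
    if p > a:
        return "proto-patient"
    return "unclassified"

killer = {"sentience", "volition", "cause", "movement"}
victim = {"change_of_state", "causally_affected"}
print(proto_role(killer))  # -> proto-agent
print(proto_role(victim))  # -> proto-patient
```

The arguments of a verb like 'see' would each carry only one or two of these properties, which is exactly why, as noted above, calling them 'agent' and 'patient' stretches the prototypical sense of the terms.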



5.3 Differential marking

The organization of CS argument structure is reflected in differential marking, a phenomenon that is particularly well documented in the typological literature. Briefly, differential subject marking (DSM) marks a ‘subject’ NP with different morphology than is used for the prototypical agent of a prototypical transitive, and similarly for differential object marking (DOM). And in split intransitive languages1⁰ the sole argument of the intransitive is marked differently depending on whether it is closer to a prototypical agent or a prototypical patient, at least in the canonical case. In this section I give a few representative examples of three types of differential marking that illustrate their characteristic properties. Then, in section 5.4, I argue that differential marking is represented in a natural way in terms of variation in ASCs, and I show how it can arise through constructional change. Patterns of differential marking are quite diverse, and I make no effort to survey the full range of possibilities here. For broader reviews, see Aikhenvald et al. (2001); Aissen (2003); Arkadiev (2009); Bhaskararao & Subbarao (2004); Birchall (2014); de Hoop & de Swart (2009); Fauconnier (2012); Malchukov (2008); Malchukov & de Hoop (2011); Seržant & Kulikov (2013); Seržant & Witzlack-Makarevich (2018); Sinnemäki (2009); Wichmann (2008); Witzlack-Makarevich & Seržant (2018).

5.3.1 Differential subject marking

Differential marking often expresses departure from ‘prototypical transitivity’. Prototypical transitivity as characterized by Hopper & Thompson (1980) has two participants: one is a volitional and ‘high potency’ subject, the other an affected object, and the relation between them denotes a telic, active, and realis event. There are many variants of this notion in the literature—see for example Dowty (1991); Næss (2007); Primus (1999); Ackerman & Moore (2001); Grimm (2011). DSM constructions typically mark arguments as not prototypical agents. Such arguments are typically experiencers, recipients, and agents acting without intention. This property is often characterized as ‘non-volitional’ (Klaiman 1981). However, in many cases DSM appears to have become lexicalized, in

1⁰ Also called stative-active, agent-patient, agentive—see Wichmann (2008) for a review and discussion.


that it is not possible to establish a strict correspondence between morphological marking and thematic role.11 Moreover, there are functions of DSM that do not bear on thematic functions (Seržant & Witzlack-Makarevich 2018). I illustrate DSM here with just a few examples, in order to establish the constructional approach to such phenomena; for many more, see the work cited at the beginning of this section. The Icelandic example (16) is a good illustration of DSM—a subject is normally marked nominative in Icelandic, but the subject of hvolfdi ‘capsized’ is marked dative.12

(16) Icelandic (Andrews 2001, 103)
     Bátnum hvolfdi.
     the.boat.dat capsized
     ‘The boat capsized.’

The subject of intransitive hvolfdi is marked with the same case as the object of hvolfdi in the corresponding causative.

(17) Icelandic (Andrews 2001, 103)
     þeir hvolfa bátnum.
     they capsize the.boat.dat
     ‘They capsize the boat.’

And in (18), the object of the causative and the corresponding subject are marked accusative.

11 For some particularly clear examples, see Birchall (2014), Holton (2008), and Klamer (2008). Klamer (2008, 221) observes that in some Semantic Alignment systems, the criterial semantic feature refers to the agentive or patientive characteristics of the participant (resulting in an ‘agent/patient’ system), in others, it is the inherent aspect of the predicate as state vs. event that crucially determines the alignment (resulting in an ‘active/stative’ system), yet other systems are based on participant’s semantics as well as inherent aspect of the predicate. And Birchall (2014) writes, in regard to native languages of Latin America, that [e]ven within the descriptive verbs that express physical properties of their subjects, the different marking patterns are conditioned by the lexical class of the predicate rather than strictly semantic criteria. 12 For discussion of the history of dative subjects in Icelandic, see Barðdal (2011). DSM in Icelandic is considerably more complex than the few examples given here might suggest. While there are some patterns of correspondence between Icelandic subject case and thematic roles, the overall correspondences do not suggest any one-to-one relationships (Thráinsson 2007, 206).


(18) Icelandic (Andrews 2001, 103)
     a. Bátinn fyllir.
        the.boat.acc fills
        ‘The boat fills.’
     b. þeir fylla bátinn.
        they fill the.boat.acc
        ‘They fill the boat.’

A typical variant of DSM involves marking the absence of ‘control’ on the part of the causal agent.13 Control is rarely if ever entailed by a verb—it is usually an implicature (Primus 1999, 51). For example, Dowty (1991, 552) points out that murder entails that the causal actor acts with intention, but kill does not, since even inanimates like poisons and crashes can kill. However, in normal conversation, the statement Susan killed the mouse, with an animate subject, would implicate that Susan intended to kill the mouse, because Susan is animate. It is of course possible to defeat this implicature by providing sufficient context, e.g. Susan suddenly rolled over in her sleep and crushed the adjacent mouse. DSM overtly marks this departure from typicality, obviating the need for context.1⁴ And in some languages, the distinction may be marked elsewhere than on the subject. For example, Aikhenvald (2003, 15) cites the use of non-visual evidential marking on the verb in Tariana and in East Tucanoan languages to indicate lack of control, while visual evidential marking denotes control. The following data from Agul illustrate some conditions under which an actor may be considered to be non-controlling.1⁵ Agul is an ergative language in which the agent of a transitive is canonically marked by the ergative case, while the sole argument of an intransitive and the patient are (zero-)marked by the absolutive case.

(19) Agul (Ganenkov et al. 2009, 173)
     ze dad maskaw.di-as χab aldarku-naa
     my father.abs Moscow-in.elat back return-res
     ‘My father has come back from Moscow.’

13 Following Primus (1999, 38), who cites Dik (1978), I use the term ‘control’ as a cover term for the mental properties of a proto-agent. 1⁴ Primus (1999, 56) points out that thematic properties may be either entailments or implicatures, depending on the properties and the lexical items. 1⁵ Agul is “a language from the Lezgic branch of the East Caucasian (Nakh-Daghestanian) family” (Ganenkov et al. 2009, 173).


(20) Agul (Ganenkov et al. 2009, 173)
     dad.a guni ʕut’u-ne
     father.erg bread.abs eat-pst
     ‘Father ate bread.’

Actors can be marked with a different case, however. It appears that differential marking occurs when the transitive relation is not prototypical, in the sense of Hopper & Thompson (1980) and Dowty (1991). While ergative case in Agul signals the interpretation that the action is autonomously carried out, adelative/adessive case marks lack of intention. There are four related interpretations, which appear to share this property. The interpretations are summarized in (21)–(25) (all examples and glosses from Ganenkov et al. 2009).

1. Involuntary Action
(21) Agul
     ruš.a-f-as berʜem kura-se
     girl-ad-elat dress.abs get.dirty-fut
     ‘The girl will unwittingly soil the dress.’
     (the little girl was told that she has to be careful, but she cannot remember that all the time, and she will most probably soil the new dress while playing)
(22) Agul
     za-f-as k’eǯ lik’a-s xu-ne
     I-ad-elat letter.abs write-inf become-pst
     ‘I managed to write a letter.’

2. Undesirable Action
(23) Agul
     za-f-as gi-s unaq’u-b xu-ne
     I-ad-elat that-dat call-msd become-pst
     ‘It so happened that I had to invite him.’
     (I did not plan to do this, but when I was inviting other people, he was near, and it would have been impolite not to invite him)

3. Possibilitive: expresses the possibility of the relation as contrasted with the execution of it. (Lit. ‘it became/happened from him to do it’.)
(24) Agul
     ze gada.ji-f-as wa-s kümek aq’a-s xa-se
     my son-ad-elat [you-dat help.abs do-inf] become-fut
     ‘My son will be able to help you.’


4. Causative
(25) Agul
     baw.a ruš.a-f-as || ruš.a-w xed χa-s q’u-ne
     mother.erg girl-ad-elat || girl-ad water.abs bring-inf do-pst
     ‘Mother made the girl bring water.’

The constructional representation of Agul case specifies a correspondence between a non-volitional agent, represented here as actor[–control], and ad-elat.1⁶ I use actor to signal that the argument does not necessarily have prototypical agent properties. The constructions are given in (26) and (27), respectively, with illustrations.

(26) a. phon [1(2) > 3, …]4
        syn  [S [cat NP, lid lid1, case ad-elat2], V3, …]4
        cs   [λy.λx.3′(actor[–control]:x, patient:y)(…)(1′)]4
     b. phon [ruš.a-f-as1,2 berʜem3,4 kura-se5]6 = [[girl-ad-elat]1,2 [dress.abs]3,4 [get.dirty-fut]5]6
        syn  [S [cat NP, lid ruš1, case ad-elat2], [cat NP, lid berʜem3, case abs4], V[kura-se]5]6
        cs   [λy.λx.get.dirty′5(actor[–control]:x, patient:y)(d3)(g1)]6

(27) a. phon [1(2) > 3, …]4
        syn  [S [cat NP, lid lid1, case erg2], V3, …]4
        cs   [λy.λx.3′([agent]:x, patient:y)(…)(1′)]4
     b. phon [dad.a1,2 guni3,4 ʕut’u-ne5]6 = [[father.erg]1,2 [bread.abs]3,4 [eat-pst]5]6
        syn  [S [cat NP, lid dad1, case erg2], [cat NP, lid gun3, case abs4], V[ʕut’u-ne]5]6
        cs   [λy.λx.eat′5([agent]:x, y)(b3)(f1)]6

1⁶ The default correspondence is between an agent and ergative case.
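The case alternation encoded by (26) and (27) amounts to a conditional correspondence between a CS property of the actor and its morphological case. The sketch below is my own shorthand: the boolean `control` parameter stands in for the [–control] feature of the constructional representation, and the function names are invented for illustration.

```python
# Illustrative encoding of the Agul pattern in (26)-(27): the actor of
# a transitive corresponds to ergative case by default, but to
# adelative/adessive (ad-elat) case when it lacks control. The patient
# is absolutive throughout. Feature and function names are my own.
def actor_case(control: bool) -> str:
    # actor[-control] <-> ad-elat; otherwise the default agent <-> erg.
    return "erg" if control else "ad-elat"

def case_frame(control: bool) -> dict:
    return {"actor": actor_case(control), "patient": "abs"}

print(case_frame(True))   # as in (20): 'Father ate bread.'
print(case_frame(False))  # as in (21): involuntary action
```

Stating the alternation this way makes the constructional point plain: the choice of case is read directly off a single CS feature, with no intermediate derivational structure.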

The constructional formulation simply stipulates the case marking correspondences in Agul. It is worth considering briefly whether a derivational account might be more explanatory. On standard approaches to case marking, the case is licensed through agreement with an invisible functional head. For example, in the tree in (28), adapted from Woolford (2009) for a sentence with three arguments, the ‘light verb’ vA assigns the nominative case and the agent role to the subject, vG assigns dative case and the goal role to one object, and V assigns accusative case and the theme role to the other object.1⁷

(28) [vP external argument [ vA [vP exp/goal [ vG [VP V theme/internal argument ]]]]]

Clearly, this hierarchical representation of the case-thematic correspondences is a notational variant of the constructional correspondence, unless there is independent evidence for the particular details of the structure. However, there appears to be no such evidence—the tree in (28) is motivated on the basis of a theory in which the only way to represent properties of the phonological representation (“Spell Out”) or interpretation (“Logical Form”) is the configuration of the tree and the individual functional heads like vA. In the absence of independent evidence for the hierarchical structure, Occam’s Razor suggests that a constructional formulation is to be preferred, other things being equal.1⁸

Let us consider now a very common type of DSM, found especially in South Asian languages. In these languages, dative case is used to mark what is typically referred to as ‘experiencer’. Consider the following examples from Hindi, Kannada, and Punjabi.

(29) Hindi (de Hoop & Narasimhan 2009, 64–5)
     a. raam=ne patthar=ko / patthar-∅ toDl-aa
        Raam=erg stone=acc / stone-nom break-pfv.sg.m
        ‘Raam broke a/the stone.’
     b. raam=ko ghar jaa-naa hae
        Raam=dat home go-inf be.pres.3sg
        ‘Raam wants to/has to go home.’
     c. raam=ko ek kitaab-∅ mil-ii
        Raam=dat one book-nom receive-pfv.sg.f
        ‘Raam received a book.’

(30) Kannada (Sridhar 1979, 109)
     avanige tāyiya jnapāka bantu.
     he-dat mother’s remembrance-nom.(neut) came(neut)
     ‘He remembered his mother.’

(31) Punjabi (Bhatia 1993, 170–1)
     Tuàà nüü shor sunaaii dittaa.
     you.obl dat noise.m hear give.past.m
     ‘You heard the noise.’

As the translations show, the notion of experiencer, to the extent that it is shared by the subjects of ‘want’, ‘receive’, ‘remember’, and ‘hear’, must be rather abstract. Grimm (2011) proposes to locate this abstract meaning as a region of a lattice of thematic properties based on an extension of Dowty’s (1991) thematic entailments. On Grimm’s analysis, the experiencer is a sentient participant that lacks all of the other properties that are prototypically associated with agents, e.g. volition, control, causal action, etc.

1⁷ Woolford’s treatment is essentially that of Larson’s (1988) shell structure.

¹⁸ Not surprisingly, accounting for more complexities in differential marking and cross-linguistic variation requires more complexity in a derivational account. See for example Kalin (2018), who invokes a variety of licensors in addition to v for various case marking patterns, as well as notions such as obligatory and secondary licensing, uninterpretable abstract Case, and the projection of nominal features as functional heads in nominal structure.


Another factor that enters into DSM is aspect. For example, in Hindi the canonical case-marking has a nominative agent in the imperfect and an ergative agent in the perfect.

(32) Hindi (de Hoop & Narasimhan 2005, 327)
  a. wo-∅ ek bakraa-∅ / ek bakre=ko bec-taa hae
     he-nom one goat-nom / one goat=acc sell-ipfv.sg.m be.pres.3sg
     'He sells a goat/the goat.'
  b. us=ne ek bakraa-∅ / ek bakre=ko bec-aa
     he=erg one goat-nom / one goat=acc sell-pfv.sg.m
     'He sold a goat/the goat.'

The ergative/nominative distinction also appears with a small set of intransitive verbs associated with bodily function.

(33) Hindi (de Hoop & Narasimhan 2005, 335)
  a. raam-ne chiikh-aa
     Raam=erg scream-pfv.sg.m
     'Raam screamed (purposefully).'
  b. raam-∅ chiikh-aa
     Raam-nom scream-pfv.sg.m
     'Raam screamed.'

De Hoop and Narasimhan argue that differential marking has two functions in Hindi. One is to mark the relative thematic distance between subject and object: when the object in a transitive is more agent-like, the arguments are more likely to be marked. This makes them more easily distinguished. The other function is to mark the 'strong' agents, that is, arguments that have more of the features that characterize prototypical agents, along the lines of Hopper & Thompson (1980).

These observations about Hindi demonstrate that a case-marking system, while flexible, is still restricted in its correspondences with CS features. We can see in other cases how differential case marking interacts with head-marking to yield interesting clusters of properties. In the examples in (34), the case of the agent appears to depend on the particular lexical verb, while the verb bears a portmanteau morpheme that agrees with the agent and the patient or recipient, which bear nominative case.


(34) Gyarong (lCog-rtse rGyal-roṅ, Sino-Tibetan; Himalayas) (Nagano 1984, cited by Bickel 2011, 403)
  a. nəyo-ki chigyo kəw-nasno-ch ko.
     2s-erg(A1) 1d.nom(O) 2>1-scold-1d AUX
     'You (s) scold us (dual).'
  b. nəyo chigyo kəw-wu-ch ko.
     2s.nom(A2) 1d.nom(G) 2>1-give-1d AUX
     'You (s) give (it to) us (dual).'

Such correspondences can be represented directly in a constructional approach by indexing particular forms in phon (through the paradigm function) and clusters of feature values in syn. For instance, the form kəw-nasno-ch ko in (34a) is the paradigm function realization of the inflected verb. The morphosyntactic arguments indexed as arg2 and arg3, shown in (35) for (34a), correspond to agentive and patientive arguments in CS.

(35)
  [phon  Φ: scold1(2nd2 > 1st.dual3, 1st.dual3) = kəw2,3-nasno1-ch3
   syn   [category V, lid scold1, arg [person 2nd]2, arg [person 1st, number dual]3]
   cs    scold′1(agent: you2, patient: us3)]

Similar examples of such 'rich morphology' are discussed in further detail in section 6.3. Finally, note that the thematic distinctions associated with differential subject marking may be reflected in other ways than on the subject per se. For example, in Montana Salish, the morphology that nominalizes a verb differs depending on whether the cause is an agent or an instrument (McKay 2019). For example, the suffix mín in (36a) indicates that the open argument is agentive ('one who uses it to x'), and the suffix tín in (36b) indicates that it is an instrument ('something that does x'). When both are used, as in (36c), the open argument is a patient ('something that x applies to').¹⁹

¹⁹ McKay (2019) argues for a derivational analysis in which these morphemes are decomposed into more basic elements.


(36) Montana Salish (Mengarini et al. 1877–9, cited by McKay 2019, 111)
  a. chalmín
     čál-mín
     cut-inst1
     'scissors (regarding the man)'
  b. aiptín
     ʕay-p-tín
     fast-inch-inst2
     'what carries a fellow fast'
  c. čalmínten
     čal-mín-tín
     cut-inst1-inst2
     'scissors in regard to the thing cut'

Along related lines, in Blackfoot, certain intransitive verbs select animate subjects that need not refer to sentient individuals. These verbs, which are of type AI in the traditional Algonquian terminology, are morphologically distinguished from verbs that select inanimate subjects. This is a head-marking variant of DSM, in contrast to the more familiar dependent-marking type (Seržant & Witzlack-Makarevich 2018). One example from Blackfoot is oo 'go', shown in (37).

(37) Blackfoot (Kim 2018, 151)
  Anna saahkomaapi/ainaka'si waamis-oo-wa.
  dem boy/wagon up-go.ai-3s
  'The boy/wagon went up/moved upwards.'

Kim (2018) shows that if there is a specified end point of the action, that is, if it is telic, a sentient subject is required. We may interpret this condition as requiring intention on the part of the referent of the argument.

(38) Blackfoot (Kim 2018, 151)
  a. Anna saahkomaapi itap-oo-wa oomi isspahkoyi.
     dem boy goal-go.ai-3s dem hill
     'The boy went to that hill.'
  b. *Anna ainaka'si itap-oo-wa oomi isspahkoyi.
     dem wagon goal-go.ai-3s dem hill
     'The wagon went to that hill.'


Similarly, only sentient arguments can be the subjects of AI causatives, conveying intention.

(39) Blackfoot (Kim 2017, 134)
  a. Anna saahkomaapi inakataki-wa (pokon)
     dem boy roll.ai-3s (ball)
     'That boy rolled (a ball).'
  b. *Anna ainaka'si inakataki-wa (pokon)
     dem wagon roll.ai-3s (ball)
     'That wagon rolled (a ball).' (Intended meaning: 'That wagon made a ball roll.')

To summarize, a subject may depart from prototypicality in a number of ways. In some languages, such departures are marked by DSM. Such correspondences can be directly captured using the constructional formalism.

5.3.2 Differential object marking

Let us now consider differential object marking (DOM). In nominative-accusative languages DOM is typically accomplished with an oblique case such as dative, instrumental, or genitive instead of accusative case, by marked vs. unmarked forms, as in (40), or with an adposition, as in (41).

(40) a. Spanish (Primus 2012, citing García 2007)
     Conozco ∗este actor / a este actor.
     know.prs.1sg this.m.sg actor / obj this.m.sg actor
     'I know this actor.'
     Conozco esta película / ∗a esta película.
     know.prs.1sg this.f.sg film / obj this.f.sg film
     'I know this film.'
  b. Hindi (Mohanan 1994, 59)
     bacce=ne kitab paṛh-i
     child.obl=erg book read-pfv
     'The child read a book.'
     Ila=ne bacce=ko uṭha-ya
     Ila=erg child.obl=acc lift-pfv
     'Ila lifted the child.'


  c. Zaza (Arkadiev 2009, 167)
     televe kitav cên-o
     student(dir) book(dir) take-prs.3sg
     'The student is taking the book.'
     televe malım-i vinen-o
     student(dir) teacher-obl see-prs.3sg
     'The student sees the teacher.'
  d. Maltese (Comrie 1982)
     Marija qatlet far.
     Marija killed.3sg.f rat
     'Marija killed a rat.'
     Marija qatlet lill-far.
     Marija killed.3sg.f acc.the-rat
     'Marija killed the rat.'
  e. Russian
     Ja vižu dorogu.
     I see.1sg road.acc.sg
     'I see the road.'
     Ja ne vižu dorogi.
     I neg see.1sg road.gen.sg
     'I don't see any road.'
  f. Finnish (Sands & Campbell 2001, 296)
     Pekka syö omena-n.
     Pekka.nom eat.3sg apple-acc
     'Pekka will eat the apple.' (future, telic)
     Pekka syö omena-a.
     Pekka.nom eat.3sg apple-part
     'Pekka is eating the/an apple.' (progressive, atelic)

(41) a. Sandy chewed the meat.
        Sandy chewed on the meat.
     b. Sandy painted the wall.
        Sandy painted on the wall.

In an ergative-absolutive language, DOM is typically expressed with an oblique case instead of absolutive case, as in (42).


(42) a. Basque (Haspelmath 2001, 58)
     Ez ditut lore-ak erosi.
     neg I.have.them flower-pl.abs bought
     'We haven't bought the flowers.'
     Ez dut ogi-rik erosi.
     neg I.have.it bread-ptv bought
     'We did not buy any bread.'
  b. Kalkatungu (Blake 1976, 286)
     kupmuru-tu caa kalpin lai-na
     old.man-erg here young.man.abs hit-past
     'The old man hit the young man.'
     kuparjuru caa kalpin-ku lai-mina.
     old.man.abs here young.man-dat hit-imperf
     'The old man is hitting the young man.'

The functions of DOM are somewhat more difficult to pin down than those of DSM. For example, oblique marking of an object in Spanish is typically claimed to be conditioned by animacy (Bossong 1985; Aissen 2003). However, Primus (2012) cites evidence that the oblique marking occurs just in case the two arguments are capable of playing the same role, presumably on the basis of 'typicality'.²⁰ So in (43) the objects are inanimate but oblique.

(43) Spanish (Primus 2012)
  a. El profesor reemplaza al libro.
     the professor replace.prs.3sg obj.def.m.sg book
     'The professor takes the place of the book.'
  b. En esta receta, la leche puede sustituir al huevo.
     in this.f.sg recipe the.f.sg milk can.prs.3sg replace obj.def.m.sg egg
     'In this recipe, egg can be replaced by milk.'

Whether this use of oblique marking reflects the core function of DOM in Spanish or is simply parasitic on the core function is an open question.

²⁰ Primus (2012, 15): "It is not animacy per se that counts but rather the semantic function of the object. It must be a potential protoagent in the situation denoted by the predicate."


In the Maltese examples in (40d), the difference between canonical marking and differential marking appears to be one of definiteness. In Warlpiri, an ergative language, dative instead of absolutive case is selected by certain verbs. Legate (2002) cites the verbs in (44) as having this property.

(44) warrirni 'seek', kurriyi-mani 'entrap, ambush', riwarri-mani 'consume completely', wurru-mardarni 'ambush', ngurru-ngarni 'desire strongly', pun-pun-ngarrirni 'advise', lawa-nyanyi 'fail to see', wapal-nyanyi 'search for', yarnta-yarntarlunyanyi 'stare angrily at with an intent to harm', wapalpa-pangirni 'search by digging', pulka-pinyi 'praise', pututu-pinyi 'warn', …

For most, but not all, of these verbs, the objects are not physically affected by the action. In his review of the typology of argument structure constructions, Bickel (2011) notes that the particular marking depends "mostly in a probabilistic rather than categorical way, on such referential properties as animacy, humanness, definiteness, specificity or more general notions of saliency."

I interpret this situation as follows: There is a semantic basis for the emergence of a particular way of marking an argument. As in other cases, it is difficult for learners to precisely identify the licensing conditions for this correspondence (Chapter 4); in addition to the core semantic basis, there are many properties that are typically but not uniformly associated with the core, as discussed in section 5.2. Thus, the correspondence ceases to be associated with a particular set of semantic properties and instead becomes associated with particular lexical items or classes of lexical items. In the next section, I model how such a misidentification of features relevant to a syn-cs correspondence might arise.

5.4 Modeling differential marking

Now let us see how a differential marking ASC may emerge from a more general ASC. Such a change is a type of innovation, based on existing constructions but not directly exemplified in the PLD, as discussed in section 4.2. Change in the other direction, loss of differential marking in favor of canonical marking, is a more predictable consequence of generalization (Seržant 2013).



5.4.1 Acquisition of ASCs

The examples of argument marking systems considered in this chapter suggest that differential marking arises as a way of distinguishing arguments with particular non-prototypical properties. The modeling of constructional learning in Chapter 4, as well as some of the examples of differential marking, suggest that change may occur when speakers fail to precisely replicate the licensing conditions for particular correspondences. Putting the two together, it is reasonable to suppose that differential marking may arise through the association of particular thematic properties with particular morphological forms. Beyond this, the evidence suggests that differential marking may ultimately become dissociated from the exact semantic correspondence. Its original semantic basis can still be identified, but the correspondence is no longer reducible to a set of necessary and sufficient semantic conditions. Under such circumstances, differential marking may become associated with particular lexical items, only some of which have the prototypical semantic properties that triggered the change.

Barðdal (2011) documents a change in differential subject marking in Icelandic that exemplifies such a scenario. At an earlier stage of the language, both accusative and dative case are used to mark subjects that are not canonical agents. They encode such non-volitional roles as "experience-based" (emotion, attitudes, cognition, perception, bodily states, changes in bodily states, decline) and "happenstance" (failing/mistaking, success/performance, ontological states, social interaction, gain, personal properties). For example, consider (45).

(45) Icelandic (Barðdal 2011, 61)
  a. Mig langar í ís.
     me.acc longs in ice-cream
     'I want ice cream.'
  b. Mér langar í ís.
     me.dat longs in ice-cream
     'I want ice cream.'

On Barðdal's account, which is a constructional one compatible with the one developed in this book, as more verbs fit a particular case-marking pattern the thematic condition must become less restrictive, in order to accommodate the full range of cases. The accusative subject construction exemplified in (45a)


and the dative subject construction exemplified in (45b) have been in competition. The dative subject construction has grown over time at the expense of the accusative subject construction, which was lower in type frequency, as predicted by Niyogi's Law (section 4.3.2). At the present time, the dative construction covers more verb classes; a plausible prediction is that over time it will continue to absorb more and more of the classes that now fall under the accusative construction.

We can define acquisition of ASCs in terms of the schemas in (7) and (10), repeated here. (For examples, please see the original discussion of these schemas.)

(7) Schema: Dependent-marking ASC
  [phon  […–1(2)–…]3
   syn   [S [cat NP, lid lid1, φ [case case2, …]], …]3
   gf    GF1]

(10) Schema: Head-marking ASC
  [phon  [1, 2(4), …]3
   syn   [S ([cat NP, lid lid1, φ φ4]), [cat V, lid lid2, arg φ4], …]3
   gf    GF1
   cs    [2′(θ: [1′ ∪ 4′], …)]3]

In constructing grammars, speakers create individual constructions that sit somewhere in the space of possibilities defined by these schemata. For concreteness, let us focus on (7). This schema links GFs with case forms. Suppose that for a particular case form casei, there is a correspondence that satisfies this schema, where the CS features are {f1, …, fk}. In this correspondence, the relationship between the case form and the CS features is mediated by GF1. Let us suppose, finally, that every construct with casei that the learner encounters in the PLD meets these CS conditions.


The question before us is, what kinds of extensions can speakers make in constructing the correspondence between CS features as they appear in the PLD and morphological forms such as case? While it is the task of a learner to associate GF1 with casei, the actual CS features that are present in the PLD go beyond those that have traditionally been used to define thematic roles. The thematic features of the sort that Dowty (1991) identified can be organized into 'tiers' in the sense of Culicover & Wilkins (1986) and Jackendoff (1987).

(46) a. Extensional: exist, appear, move, cause, contact, consume, …
     b. Intensional: sentient, animate, control, intention, volition, …
     c. Psychological: perception, emotion, sensation, …

The extensional properties can be adduced by a learner by observing events and states in the physical environment. The intensional properties require some understanding of the nature of action and cannot be adduced simply by observing the physical. The psychological properties can be adduced by a learner by observing events and states in its internal environment. The distinctions are fundamental. They are at the root of understanding how actions such as buy and sell are the same (extensionally) and how they are different (intensionally); similarly for frighten and fear, please and like, and so on.

In addition, participants also have incidental properties that do not define their roles in events, but may be highly correlated with thematic properties (Alishahi & Stevenson 2010).

(47) a. Physical: size, dimensions, solid/liquid, compact/diffuse, shape, color, …
     b. Biological: animacy, gender, age, …
     c. Social: rank, kinship, …
     d. Discourse: definiteness, QUD²¹, topic, focus=new/old, …
     e. Aspect: completeness, telicity, …
     f. Space: proximity, direction, …

The extensional and some of the incidental properties are most accessible to the learner, because they are physically instantiated.
The intensional and psychological properties, and incidental properties such as aspect and kinship, are more challenging to identify with precision, since they are more abstract.

²¹ Question Under Discussion (Roberts 2012).


But in any instance the actual properties of the argument that is marked with casei will be a superset of those that participate in the target constructional correspondence. So the task of a learner is to figure out which properties of CS are the actual licensing conditions for this construction. And these superset properties are available to speakers in many if not all instances in which casei is used.

To make the situation more complex (and more realistic), suppose that there are several constructions that relate different case forms to different sets of thematic features. Suppose furthermore that the five distinct basic types of agentive arguments discussed by Dowty (1991) are distinguished by five distinct morphological forms. For convenience, each case is named after the corresponding thematic property. The constructs in (48) are adapted from Dowty (1991, 572–3). The set of properties of the entity denoted by Sandy are shown for each case.

(48) a. Sandy.perception sees Chris. ⇔ {sentient, animate, human, perception}
     b. Sandy.control wants a cookie. ⇔ {sentient, animate, human, control}
     c. Sandy.cause causes unhappiness. ⇔ {sentient, animate, human, cause}
     d. Sandy.move passed the window. ⇔ {sentient, animate, human, move}
     e. Sandy.exist needs a new car. ⇔ {sentient, animate, human, exist}

In principle it is possible to figure out that although Sandy is an argument in all cases, with the features {sentient, animate, human}, these are not licensing conditions for the correspondences, and the relevant features are the last ones in each set. But to do this a speaker would require a massive amount of carefully curated input that provides minimal pairs for all of the case forms. For example, there must be evidence that when the cause is inanimate, the case form corresponds to cause and not control.
However, the more realistic assumption is that the input is noisy and not curated, and that some forms are statistically rarer than others. Although the language makes distinctions, it is not clear that the speaker is able to determine exactly which semantic distinctions correspond to which morphological distinctions. The speaker has to figure out which form of Sandy to use in a novel utterance. The speaker may have heard Sandy.control murdered Chris once. But there is a chance that the speaker will decide that the relevant feature is


sentient or control or move, or a combination of these, especially if these are the most frequent forms that the speaker has encountered.

A similar conflation can be modeled for agreement, as long as the inflectional morphology is sufficiently flexible. Suppose that the speaker is exposed to the following constructs, where what is to the left of ⇔ is the form, and what is to the right is the relevant part of the CS representation.

(49) a. Sandy 3.perception.sees Chris. ⇔ {sentient, animate, human, perception}
     b. Sandy 3.control.wants a cookie. ⇔ {sentient, animate, human, control}
     c. Sandy 3.cause.causes unhappiness. ⇔ {sentient, animate, human, cause}
     d. Sandy 3.movement.passed the window. ⇔ {sentient, animate, human, move}
     e. Sandy 3.exist.needs a new car. ⇔ {sentient, animate, human, exist}

Again, the correspondence is between the particular morphological form and the particular CS property. In this case, we might expect a simplification of the paradigm function, for example where the phonological form of 3.movement.V falls together with that of 3.cause.V. It would still be possible to mark a salient different feature, e.g. control, but the force of simplification is in the direction of reducing the set of morphological options until a general subject correspondence emerges. But there are, in principle, opposite forces that could sustain the distinctions or even introduce new ones.

Since, in most cases, agents will have more than one of the CS properties, we would expect that over time the speaker (and a community of speakers) would lose most if not all of the distinctions. If there is a particularly salient distinction, such as whether the agent acted with intent, then we might expect this distinction to be modeled sufficiently often that speakers will be able to pick it up and preserve it in their grammar. Or speakers could innovate such a distinction.
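The learning problem just described can be made concrete with a minimal sketch (an illustration only, not the model discussed in the text): a learner that hypothesizes the licensing condition of each case form to be the intersection of the CS feature sets it has observed with that form. With curated minimal pairs the target feature is isolated; when every exemplar is a prototypical human agent, the superset features {sentient, animate, human} are never eliminated, and distinct forms cannot be told apart by their shared residue.

```python
def learn_conditions(observations):
    """observations: list of (case_form, set_of_CS_features) pairs.
    The hypothesized licensing condition for each form is the set of
    features observed in EVERY exemplar of that form."""
    hypotheses = {}
    for case, features in observations:
        if case in hypotheses:
            hypotheses[case] &= features  # keep only features seen every time
        else:
            hypotheses[case] = set(features)
    return hypotheses

CORE = {"sentient", "animate", "human"}

# Curated input: an inanimate cause provides the crucial minimal pair.
curated = [
    ("cause", CORE | {"cause"}),
    ("cause", {"cause"}),               # e.g. an inanimate cause
    ("control", CORE | {"control"}),
]
print(learn_conditions(curated))        # 'cause' is correctly isolated

# Realistic input: only prototypical human agents are attested, so both
# forms retain the superset features CORE in their hypothesized conditions.
realistic = [
    ("cause", CORE | {"cause"}),
    ("control", CORE | {"control"}),
]
print(learn_conditions(realistic))
```

With the realistic input, the learner has no evidence that {sentient, animate, human} are not themselves the licensing conditions, which is the conflation scenario described above.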
What we expect, and in fact find, is that there is a single default correspondence between a grammatical form and a set of semantic properties, which is typically called the ‘subject’, and then differential marking of noncontrolling agents and experiencers. As we have seen in the examples cited in sections 5.3.1 and 5.3.2, such correspondence varies cross-linguistically. In one limiting case it involves the most agentive available argument, even one with no agentive properties, which


is the English type of subject. In the other limiting case it is restricted to an actual agent, which is the active-stative type. While the narrowest possible characterization of 'agent' would be a prototypical agent that acts volitionally, I have not found evidence of an active–stative language that restricts marking in this way.²²

5.4.2 Simulation

In order to demonstrate the plausibility of the scenario sketched out in the previous sections, it is useful to simulate the attested change. Culicover et al. (2016) report on a preliminary computational simulation of constructional learning based on that of Alishahi & Stevenson (2008, 2010). In this simulation, a network of learners is presented with exemplars of English sentences and representations of their meaning expressed as sets of feature values drawn from (46)–(47). Based on this input, the model constructs an association between the syntactic pattern, the argument position, and the semantic representation associated with the verb. Thus it is possible for the model to associate different clusters of features with the same argument position of a given syntactic pattern. The model may acquire differential marking, and even hypothesize differential marking when it does not actually exist in the target language.

In the current simulation, learners see only (randomly generated) examples of certain constructions; a particular learner might not get exposed to instances of all the constructions in the first round (or even the first few rounds). When a learner has to produce a frame for its neighbors, it has to generalize from what it has seen and learned so far, which often leads to mistakes. The input that a Target produces is noisy, as some features are occasionally removed or changed randomly (according to some pre-specified parameter). These erroneous instances can mislead the learners and result in them learning imperfect constructions.

We constructed two learning scenarios. In one, the network of learners is fully connected, and each learner receives input from the Target and from each of the other learners according to specified parameters. In the other, the network of learners is partially connected, so that all learners receive input from the Target, but the learners are divided into two disconnected groups.
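The two scenarios can be sketched in miniature (a toy dynamics assumed for illustration, not the Alishahi & Stevenson / Culicover et al. model): the Target's grammar pairs a form with the features {animate, no-move}; its exemplars are noisy, and learners also receive (noisy) productions generalized from their group-mates' current grammars. The feature names, noise levels, and update rule are all assumptions of the sketch.

```python
import random

def run(groups, rounds=50, noise=0.05, peer_weight=0, seed=1):
    """Each round every learner receives one noisy exemplar from the
    Target plus `peer_weight` exemplars produced by random peers in its
    own group. A learner's grammar is the set of features present in at
    least half of the exemplars it has heard so far."""
    rng = random.Random(seed)
    target = {"animate", "no-move"}
    heard = {i: [] for g in groups for i in g}   # exemplars received

    def grammar(i):
        n = len(heard[i])
        return {f for f in target
                if sum(f in ex for ex in heard[i]) >= n / 2}

    def noisy(features):
        # Noise: each feature is occasionally dropped from an exemplar.
        return {f for f in features if rng.random() > noise}

    for _ in range(rounds):
        for g in groups:
            for i in g:
                heard[i].append(noisy(target))           # Target input
                for _ in range(peer_weight):             # peer input
                    j = rng.choice([k for k in g if k != i])
                    heard[i].append(noisy(grammar(j)))
    return {i: grammar(i) for g in groups for i in g}

# Fully connected, light noise: all learners converge on the Target.
print(run(groups=[[0, 1, 2, 3]]))

# Two disconnected groups, heavy noise, more peer input: early errors can
# be amplified within a group, so a group may drift from the Target (the
# outcome varies with the seed).
print(run(groups=[[0, 1], [2, 3]], noise=0.3, peer_weight=3, seed=7))
```

The second run illustrates the mechanism only; as in the reported simulation, whether an isolated group drifts depends on the noise parameter and the particular random history.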
Figure 5.1 shows the starting state of affairs in the fully connected scenario.

²² In contrast, according to Marianne Mithun (p.c.), the active-stative languages that she is familiar with treat inanimate causes as agents.


[Figure 5.1: Learning simulation: start. The Target node and learners 1–4.]

The sequence of network states in Figure 5.2 visualizes a progression in this scenario. While learners diverge from the Target in the earlier stages of the simulation, over time they all converge on the Target.

[Figure 5.2: Learning simulation: fully connected. Four successive network states of the Target and learners 1–4.]

The sequence of network states in Figure 5.3 visualizes a progression in the partially connected scenario. While one group of learners converges on the Target, the other group does not.

[Figure 5.3: Learning simulation: partially connected. Three successive network states of the Target and learners 1–4.]


A computational simulation, reported on in Culicover et al. (2016), shows precisely these types of changes.²³ When all of the learners in the simulation interact with one another and with the Target, the grammar of the Target is acquired uniformly. But when the learners are divided into two groups that do not interact with one another, they begin to drift apart. The difference between the two scenarios is shown in the heat maps in Figures 5.4 and 5.5.

[Figure 5.4 (heat maps after 2 rounds and after 20 rounds, for the agent and for Argument 1, the transitive theme): Heat map of the similarity between learners' profiles for each argument position in a generic Transitive construction. Twenty learners are partitioned into two disjoint subpopulations of ten.]

23 I am grateful to Afra Alishahi for the computational simulation and the heat maps.

[Figure 5.5 (heat map): Heat map of the similarity between learners' profiles for agent position in a generic Transitive construction. Twenty learners form a single partition.]

correspondences for an 'active' verb such as kiss, a 'perception' verb such as see, and a 'transfer' verb such as give. To simplify the representations, NPs such as [cat NP, case nom]1 are written as NP-nom1, etc.

(50) a. Target correspondence 'active':
  [syn  [NP-nom1, NP-acc2, [category V, lid kiss3]]4
   gf   GF1 > GF2
   cs   [kiss′3([animate, move, …]:1′, […]:2′)]4]

  b. Target correspondence 'perception':
  [syn  [NP-nom1, NP-acc2, [category V, lid see3]]4
   gf   GF1 > GF2
   cs   [see′3([animate, no-move, …]:1′, […]:2′)]4]


  c. Target correspondence 'transfer':
  [syn  [NP-nom1, NP-acc2, NP-dat3, [category V, lid give4]]5
   gf   GF1 > GF2
   cs   [give′4([animate, move, …]:1′, [inanimate, move, …]:2′, [animate, no-move, …]:3′)]5]

The critical property of this set of correspondences is that the dat case corresponds to [animate, no-move, …], as seen in (50c). A perceiver, or more generally, an experiencer, has just these properties. One possibility is that speakers accurately reconstruct the Target correspondences, summarized in (51).

(51) a. NP-nom ⇔ [animate]
     b. NP-acc ⇔ [inanimate]
     c. NP-dat ⇔ [animate, no-move]

But there is another possibility that is consistent with some but not all of the input. Rather than take animate to correspond to nom regardless of other factors, as in (51a), speakers extend NP-dat to the [animate, no-move] arguments of perception verbs, and restrict NP-nom to the [animate, move] arguments of action verbs, as in (52).

(52) a. NP-nom ⇔ [animate, move]
     b. NP-acc ⇔ [inanimate]
     c. NP-dat ⇔ [animate, no-move]

What we have, then, are scenarios in which languages that at one stage do not distinguish 'subject' NPs in terms of their thematic properties change in a


later stage in such a way that special cases are isolated in terms of particular thematic and correlated properties and generalized with other arguments that share these properties.
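The difference between the accurate hypothesis (51) and the innovative hypothesis (52) can be checked concretely. In the sketch below (the exemplar feature sets are illustrative simplifications of the Target correspondences in (50)), (51) licenses every Target exemplar, while (52) fails on nominative experiencer subjects; a learner adopting (52) would then produce dative marking on exactly those arguments, yielding the differential pattern.

```python
EXEMPLARS = [
    # (observed case form, CS features of the argument), simplified from (50)
    ("nom", {"animate", "move"}),      # agent of kiss/give
    ("nom", {"animate", "no-move"}),   # experiencer subject of see
    ("acc", {"inanimate", "move"}),    # theme of give
    ("dat", {"animate", "no-move"}),   # recipient of give
]

# Hypothesis (51): nom corresponds to animate regardless of other features.
H51 = {"nom": {"animate"},
       "acc": {"inanimate"},
       "dat": {"animate", "no-move"}}

# Hypothesis (52): nom is restricted to animate movers.
H52 = {"nom": {"animate", "move"},
       "acc": {"inanimate"},
       "dat": {"animate", "no-move"}}

def fraction_licensed(hypothesis, exemplars):
    """Proportion of exemplars whose observed case form's licensing
    condition is satisfied by the argument's CS features."""
    ok = sum(hypothesis[case] <= feats for case, feats in exemplars)
    return ok / len(exemplars)

print(fraction_licensed(H51, EXEMPLARS))  # 1.0: consistent with all input
print(fraction_licensed(H52, EXEMPLARS))  # 0.75: fails on the experiencer
```

Under (52), the unlicensed exemplar (the nominative experiencer of see) is the one the innovating speaker re-marks with dative case.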

5.5 Summary

This chapter has demonstrated how to formulate the three types of argument structure constructions in terms of the constructional framework developed in Chapter 2. All three carry out the same function of licensing correspondences between phonological form and CS arguments. The CS arguments are clusters of thematic and other semantic features. Not all languages mark all features in the same way, but there is a core of thematic features that appear to participate in all ASCs. These features are part of the CCore, the universals of CS that grammars have evolved to express.

We have also seen how differential marking may arise as a consequence of realignment of the case marking for experiencers so that it marks non-volitional agents as well as recipients. On the other hand, speakers may also produce generalization of correspondences, so that thematic licensing conditions are lost. Such a scenario is plausible as an explanation for the emergence of the general grammatical functions subject and object, the focus of Chapter 6.


6 Grammatical functions

6.1 Introduction

I argue in this chapter that the work of distinguishing arguments in order to reliably map them to thematic roles does not necessarily involve grammatical functions (GFs) such as subject and object. In some languages it does, and in some it does not. Constructional formulations accommodate the observed variation without requiring uniform syntactic representations and uniform correspondences, both within a single language and cross-linguistically. The GFs are appropriate to the grammatical description of a language just when the correspondences between thematic roles and overt form cannot be read off directly from phon, e.g. from morphological form or linear order. In such a situation, it is appropriate to say that this aspect of syntax has become 'autonomous'. The GFs are participants in indirect correspondences, in that they go beyond the predicate in determining the thematic role for a particular phrase with a particular morphosyntactic form. Thus, the GFs are informal terms for morphosyntactic properties that correspond to relative positions on the thematic hierarchy. What we call 'subject' is the highest GF, 'object' is the next highest GF, and so on.

I adapt Dowty's (1991) algorithm for mapping CS arguments to GFs: in a language like English, the argument of a relation that has more 'agentive' properties such as volition, animacy, and causal action corresponds to the higher GF. In case there is only one argument, it is necessarily the subject. And when there is no argument, there must be an expletive subject.

The chapter is organized as follows. Section 6.2 briefly reviews the notion of 'subject' and corresponding thematic properties. Section 6.3 considers in some detail the rich morphological system of Plains Cree, a polysynthetic language.
The goal here is to demonstrate that the constructional framework allows for the natural and transparent formulation of fairly complex correspondences that are dramatically different from those in a language like English. Crucially, these correspondences do not require positing GFs. The alternative in MGG, which assumes a uniform correspondence between thematic role

Language Change, Variation, and Universals: A Constructional Approach. Peter W. Culicover, Oxford University Press. © Peter W. Culicover 2021. DOI: 10.1093/oso/9780198865391.003.0006


and morphosyntactic representation, is to force both languages into the same descriptive syntactic mold. Doing so does not achieve greater transparency or explanation, but does place great burdens on the descriptive apparatus because of the substantial constructional differences between the two types of languages. Section 6.4 applies the constructional perspective to other ways that languages can express correspondences between CS and syntax without intervening GFs. Section 6.5 develops an account of the English-style GFs subject and object in terms of complexity reduction. I show how it is not necessary to assume a primitive notion of ‘subject’ in order to capture generalizations about how CS corresponds to syntactic structure.

6.2 The notion of ‘subject’

The function of the notion ‘subject’ is that it expresses a flexible correspondence between thematic structure and a designated morphosyntactic property. The thematic role of the subject is not fixed, but is constrained by the alignment of grammatical and thematic hierarchies.1 The morphosyntactic property may be constituent order, case marking, agreement, or a combination. It is therefore entirely plausible that what we call ‘subject’ in one language has different morphosyntactic properties than what we call ‘subject’ in another language (Keenan 1976), and that in yet another language there is no generalization of form/meaning correspondences that requires the notion subject (Mithun 1991). Dowty (1991) argued that in English, the argument that bears more agentive properties corresponds to the configuration that we call ‘subject’. I assume that the precise definition of what characterizes ‘more agentive properties’ varies cross-linguistically. Variation in this domain could arise because of the difficulty for a learner to identify precisely what the relevant properties are based on experience, as discussed in Chapter 5. In the absence of explicit instruction, specific agentive properties can only be inferred from introspection, the visible behavior of others, and already established constructional patterns (Dautriche et al. 2014; Gleitman 1990; Gleitman & Landau 1994; Gleitman et al. 2005; Naigles et al. 1986).

1 For some specific proposals, see Aissen 1999; Dowty 1991; Culicover & Jackendoff 2005.
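The Dowty-style selection principle discussed above can be sketched computationally. The following is a minimal illustration only, not part of the constructional formalism; the specific property names and the simple counting rule are expository assumptions, since the text emphasizes that what counts as ‘more agentive’ varies cross-linguistically.

```python
# Minimal sketch of a Dowty-style mapping from CS arguments to GFs.
# The property inventory and unweighted counting are illustrative
# assumptions; real grammars individuate 'agentive properties' differently.

PROTO_AGENT = {"volition", "sentience", "causation", "movement"}

def agentivity(props):
    """Count how many proto-agent properties an argument bears."""
    return len(PROTO_AGENT & set(props))

def assign_gfs(args):
    """Rank arguments by agentivity: highest -> subject, next -> object."""
    ranked = sorted(args, key=lambda a: agentivity(a["props"]), reverse=True)
    return {arg["role"]: gf for arg, gf in zip(ranked, ["subject", "object"])}

# A transitive relation: the volitional causer outranks the affected argument.
print(assign_gfs([
    {"role": "patient", "props": {"movement"}},
    {"role": "agent", "props": {"volition", "sentience", "causation"}},
]))
# -> {'agent': 'subject', 'patient': 'object'}
```

With a single argument the ranking is trivial, so that argument is necessarily the subject, in line with the statement in the chapter introduction.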


In some languages the highest position on the GF hierarchy appears to be restricted to agentive arguments, and does not allow patients. In this case, the language does not appear to have a subject GF at all, since arguments with agentive properties are invariably marked in one way. Such a language may have a ‘functional’ passive, in the sense that the agentive argument may be omitted, but there is no promotion of a non-agentive argument to subject.2

6.3 Morphologically rich ASCs

The constructional schemata in section 5.2 allow for arbitrarily complex correspondences between morphosyntactic form and phonological form, especially when paradigm functions are allowed to specify arbitrarily complex forms for particular combinations of morphosyntactic properties. It is possible, of course, to envision maximally simple or ‘canonical’ paradigm functions, those that are very regular and transparent. But as Corbett (2015) demonstrates, deviations from maximal simplicity in inflectional morphology can occur in a wide variety of ways. The complexity of an inflectional correspondence can be seen clearly in a constructional representation. A particularly compelling instance of morphosyntactic complexity occurs in the argument structure constructions of Plains Cree. In this section, I look at how arguments are marked in Plains Cree, how arguments are incorporated into verbs, and how overt arguments are interpreted with respect to the morphology. The goal of this section is to demonstrate the need for the full expressive power of the constructional formalism. I show that the morphosyntax of Plains Cree, while very different from that of English, is quite up to the task of specifying which morphosyntactic argument goes with which thematic role. And it does this work without requiring GFs.
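A paradigm function of the kind invoked here maps combinations of morphosyntactic properties to phonological forms. The following sketch treats such a function as a finite lookup table, using affixes drawn from the Plains Cree examples cited below (independent-order ni-...-n, conjunct -akik/-icik); the table-based implementation and the argument encoding are simplifying assumptions for illustration, with ‘+’ marking bound affixation as in the text.

```python
# A paradigm function as a lookup from morphosyntactic feature bundles
# to affix material. Forms are taken from the Plains Cree examples cited
# in this section; the finite table is an expository simplification.

PARADIGM = {
    # (order, ordered argument specification) -> (prefix, suffix)
    ("independent", ("1sg",)):       ("ni", "n"),
    ("conjunct",    ("1sg", "3pl")): ("", "akik"),   # 1st acts on 3pl
    ("conjunct",    ("3pl", "1sg")): ("", "icik"),   # 3pl acts on 1st
}

def spell_out(root, order, args):
    """Spell out a verb: '+' marks bound affixation in phon."""
    prefix, suffix = PARADIGM[(order, tuple(args))]
    return "+".join(p for p in (prefix, root, suffix) if p)

print(spell_out("pimipahta:", "independent", ["1sg"]))  # ni+pimipahta:+n
print(spell_out("nipah", "conjunct", ["1sg", "3pl"]))   # nipah+akik
```

The point of the sketch is simply that the correspondence between property bundles and forms may be arbitrarily irregular: nothing in the table forces the conjunct forms to be segmentable into person and number pieces.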

2 Such a language is Central Pomo, where the verb is marked passive but the patientive arguments retain their morphological case. E.g.

(i) Central Pomo (Mithun 2008, 229)
    Yal mi: dá:-’-č’-a-w ṱhin.
    1pl.patient there want-rfl-imprf.pl-passive-prf not-imprf.sg
    ‘We’re not wanted there.’



6.3.1 Plains Cree argument structure

I look first at how Plains Cree distinguishes agentive and patientive arguments in a transitive relation. First, note that Plains Cree has four different types of verbs. The intransitives have different roots depending on whether the subject is animate (VAI) or inanimate (VII), and the transitive verbs have different roots depending on whether the object is animate (VTA) or inanimate (VTI). The examples in (1) illustrate.

(1)  a. VAI: ni-pimipahta:-n
             1-run-sg
             ‘I run.’
     b. VII: mihkwa:-w-a maskisin-a
             be-red-3-pl shoe-pl
             ‘The shoes are red.’
     c. VTA: ni-wa:pam-a:-w
             1-see-dir-3
             ‘I see him/her.’
     d. VTI: ni-wa:paht-e:-n
             1-see-theme-sg
             ‘I see it.’

Consider, now, the example in (2), which illustrates the morphologically intransitive verb a:pacihta- ‘use’ and the morphologically transitive verb nipah- ‘kill’.

(2)  Plains Cree (Ahenakew 1986)
     pe:yak mo:swasiniy piko ni-t-a:pacihta:-n e-nipah-akik
     one shell only 1-t-use.anim-1 subord-kill.anim.1|3pl
     ni:sosa:p pihe:-sis-ak, pe:yak pihe:w, pe:yak
     twelve prairie.chicken-dim-pl one prairie.chicken one
     kinose:w, ekwa pe:yak mo:swa.
     fish and one moose
     ‘I used only one shell to kill twelve baby prairie-chickens, one [grown] prairie-chicken, one fish, and one moose.’


Here, the involvement of a first person argument in the relation denoted by the main predicate is marked both by the prefix ni- and the suffix -n in ni-t-a:pacihta:-n ‘I use’. The prefix appears when the verb is used in the so-called ‘independent order’, roughly corresponding to a direct assertion. The participation of first person singular and third person plural arguments in the relation is marked by the inflection akik ‘anim.1|3pl’ in e-nipah-akik ‘I kill them’. Here there is no overt prefix for first person because the verb is in the so-called ‘conjunct order’. The schematic correspondences are given in (3) and (4), explicitly spelling out the paradigm function. I use ‘+’ to indicate bound affixation in phon.

(3)  phon  [ni2+1+n3]4
            1st-V-sg
     syn   [cat  V
            lid  lid1
            arg  [person 1st2, number sg3] ]4

(4)  phon  [1+akik2,3]4
            V-1st|3rd
     syn   [cat  V
            lid  lid1
            arg  [person 1st]2
            arg  [person 3rd, number pl]3 ]4

Note that in the morphosyntactic representation, the arguments of the verb are listed as features on the verb, rather than as separate constituents. This is because there is no overt syntactic evidence that these arguments are pronominal, although of course it is possible to formulate syntactic analyses in which the information contained in these features is represented as empty pronouns in a syntactic tree.3 The evidence for the properties of the arguments is the CS representation. Also, the arguments are not distinguished as subject and object, because there is no clear evidence that Plains Cree (and most, if

3 For an analysis where the arguments are represented as NPs in a conventional phrase marker, see Oxford (2014).


not all of the Algonquian languages) distinguish arguments in terms of GFs (Dryer 1997; Dahlstrom 2017). I have left the cs tier empty in these correspondences because of a distinctive property of Plains Cree (and more generally, Algonquian) morphology: the determination of which argument corresponds to which thematic properties depends on the verbal suffixal morphology. In some cases this morphology can be separated from the person/number suffix, and in some cases it cannot be. In (2), the suffix -akik on e-nipah-akik ‘I kill them’ indicates that the first person argument is the agent (or most agentive) while the third person argument is patient (or most patientive). If the ending were -icik, as in e-nipah-icik, the meaning would be ‘They kill me’, and not ‘I kill them’. Following standard practice in Algonquian linguistics, I refer to the feature that determines the direction of argument interpretation as ths (for ‘theme sign’) with values direct and inverse (Wolfart 1973, 12), as well as marking for animacy and transitivity in some cases (Dahlstrom 1991, 25). Whether the direct or inverse morphology is used in a given case depends on a thematic hierarchy, typically expressed as 2nd > 1st > 3rd > 3rd-obv (Dahlstrom 1991). If the argument higher in the hierarchy is the agent (or more agentive), then the direct morphology is used; otherwise the inverse morphology is used.⁴ With this information in hand, we can complete and generalize (4). The direct value of ths produces the interpretation 1′(agent:2′, patient:3′), while the inverse value produces the interpretation 1′(agent:3′, patient:2′). The properties of the arguments are spelled out by the paradigm function as a suffix to the verbal root. 
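The selection of the theme sign from the person hierarchy can be stated as a small decision procedure. The sketch below is illustrative only: the numeric ranks are an implementation convenience, not part of the analysis, and the case of two third-person arguments (resolved by discourse prominence, as noted in the footnote) is set aside.

```python
# Sketch of theme-sign (ths) selection from the Algonquian person
# hierarchy 2nd > 1st > 3rd > 3rd-obviative. Numeric ranks are an
# implementation convenience; lower number = higher on the hierarchy.

RANK = {"2nd": 0, "1st": 1, "3rd": 2, "3rd-obv": 3}

def theme_sign(agent, patient):
    """direct if the hierarchically higher argument is the agent."""
    return "direct" if RANK[agent] < RANK[patient] else "inverse"

print(theme_sign("1st", "3rd"))  # direct:  'I kill them' (e-nipah-akik)
print(theme_sign("3rd", "1st"))  # inverse: 'They kill me' (e-nipah-icik)
```

The -akik/-icik contrast in (2) then falls out as the direct and inverse values of this single binary choice.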
(5)  Construction: direct thematic correspondence
     phon  [1(2,3,4)]5
     syn   [cat   V
            lid   lid1
            arg1  [person person, number number]2
            arg2  [person person, number number]3
            ths   direct4 ]5
     cs    [λy.λx.1′(agent:x, patient:y)(3′)(2′)]5

⁴ For discussion of a range of proposals for deriving theme signs as person agreement in Potawatomi, a related Algonquian language, see Lewis (2019) and work cited there. If the arguments are both third person, then discourse prominence is used to distinguish them, since the two arguments cannot be distinguished in terms of their personal relationship to the discourse participants (Rhodes 2017).


     Construction: inverse thematic correspondence
     phon  [1(2,3,4)]5
     syn   [cat   V
            lid   lid1
            arg1  [person person, number number]2
            arg2  [person person, number number]3
            ths   inverse4 ]5
     cs    [λy.λx.1′(agent:x, patient:y)(2′)(3′)]5

As can be seen, the constructions are identical except for the value of ths and the corresponding alignment of arguments with the thematic roles. The formulation of the direct/inverse alternation is thus quite straightforward as a construction. It is of course possible to derive the same correspondence using a derivation; see for example Oxford (2014). An important property of a derivational analysis is that the idiosyncrasies of the correspondence between morphological properties and phonological form have to be built into the phrase structure and the rules that govern the adjunction of individual morphemes to roots. These idiosyncrasies, which are inescapable regardless of the grammatical framework, are dealt with directly in terms of constructions. This difference between constructions and derivations is an important one that warrants highlighting. It is striking that the internal morphological structure of a polysynthetic verb does not necessarily reflect the abstract syntactic structure that we would posit through an isomorphic correspondence with semantic structure. A particularly nice illustration of this is the following example from Bininj Gun-wok, an Australian Aboriginal language.

(6)  Bininj Gun-wok (Evans 2017, 315)
     A-ban-yawoyʔ-wargaʔ-maɳe-gaɲ-giɲe-ŋ.
     1sg.subj-3pl.obj-again-wrong-ben-meat-cook-pstpf
     ‘I cooked the wrong meat for them again.’

The literal translation ‘I them again wrong for meat cooked’ shows that the benefactee is disjoint from the benefactive morpheme, and the property ‘wrong’ is disjoint from the morpheme that denotes ‘meat’. 
While it is certainly possible to derive this linear order by moving around the constituents of a conventional syntactic structure, such a derivation fails to address the question of why these constituents move to these landing sites. Such an analysis, like the


derivational analysis of polysynthesis in Plains Cree, is cryptoconstructional. It does not appear that accounting for the constructional properties in a derivational format provides us with any greater insight into either the grammar of the language or the properties of derivations.

6.3.2 Incorporation

Now let us consider argument incorporation in Plains Cree. In a sense, all arguments are incorporated in Plains Cree, in that the morphology of the verb marks the person, and sometimes the number, of the arguments. However, it is possible to have an incorporated argument that restricts one of the arguments in the same way that the person and number features do in (4), but adds specific information to the referent. In (7a), e: is the theme marker for transitive animate, and a: is the theme marker for intransitive animate. The forms for ‘hand’ are shown in boldface.

(7)  Plains Cree (Hirose 2004, 108)
     a. ni-kisi:pe:k-in-e:-n ni-cihciy-a
        1-wash-by.hand-1.ths-non3 1-hand-pl
        ‘I washed my hands.’
     b. ni-kisi:pe:k-in-cihciy-a:-n
        1-wash-by.hand-hand-intr-non3
        ‘I washed my hands.’

The correspondence between the incorporated argument and the CS representation of (7b) is given in (8).

(8)  phon  [ni2-kisi:pe:k-in1-cihciy3-a:4-n2]5
            [12-wash.by.hand1-hand3-intr4-non32]5
     syn   [cat  V
            lid  wash.by.hand1
            arg  [person 1st, number sg]2
            arg  [person 3rd|my.hands]3
            ths  [animate2, intransitive4] ]5
     cs    [λy.λx.wash.by.hand′1(agent:x, patient:y)(h′3′)(ego2′)]5


More generally, the incorporation construction may be stated as (9).⁵

(9)  phon  [2+1+3+4+2]5
     syn   [cat  V
            lid  lid1
            arg  [person person, number number]2
            arg  [person 3rd|incorp, number number]3
            ths  [±animate2, intransitive4] ]5
     cs    [λy.λx.1′(agent:x, patient:y)(3′)(2′)]5

Note that this construction is morphologically intransitive, since the patient argument is incorporated. This fact highlights the important distinction between syntactic transitivity and semantic transitivity. The former holds when two arguments are expressed morphosyntactically. The second holds when there are two arguments in CS, regardless of whether they are both expressed morphosyntactically.⁶ Our analysis makes two predictions. First, if there is only one argument, it cannot be incorporated, since the construction is stated in terms of two arguments. This prediction is correct; cf. (10).

(10)  Plains Cree (Hirose 2004, 142)
      a. pa:hp-i-w awa:sis
         laugh-intr-3 child
         ‘a/the child is laughing’
      b. ∗pa:hp-a:was-o-w
         laugh-child-intr-3

The second prediction is that there should be verbs that lack incorporation and are semantically transitive, but morphosyntactically intransitive. These do

⁵ I omit a number of additional details, including marking for the obviative/proximate distinction.
⁶ The distinction is relevant not only for argument marking in languages such as Plains Cree, but more generally in such areas as binding theory. Cf. the distinction drawn by Reinhart & Reuland (1993) between morphosyntactic binding and semantic binding.


exist—they are the so-called AI+O verbs (Dahlstrom 2013). In fact, we have already encountered one such example, namely a:pacihta:- ‘use’ in (2). For a simpler case, consider (11).

(11)  Plains Cree (Dahlstrom 2013)
      ahpe:nemo-wa o:si:me:h-ani
      depend.on-3/ind his-younger.sibling-anim.obv.sg
      ‘He relies on his younger brother.’

To see how this construction follows from our general formulation, consider (9). The second argument is specified as 3rd|incorp. The incorporated part combines with the verbal root. Now, if the correspondence of incorp with phon is null—that is, ‘zero morphology’—the same construction will license an intransitive construct with a transitive interpretation, along the lines of (12).

(12)  phon  [2+1(4,2,3)+2]5
      syn   [cat  V
             lid  lid1
             arg  [person person]2
             arg  [person 3rd]3
             ths  intransitive4 ]5
      cs    [λy.λx.1′(agent:x, patient:y)(pro3′)(2′)]5

Since 3 is null, this semantic argument may be supplied by the linguistic context, in the form of a full NP. This is in fact what happens. As Dahlstrom (2013, 63) says, “[t]he Algonquianist label ‘AI+O’ means that it is inflected as an AI (Animate Intransitive) verb, but it takes an object (O) as well.”⁷ In this case, the construct has a representation like that in (13), for (11). ⁷ This construction is actually a variety of DOM. Dahlstrom (2009, 229f) cites the following transitive verb meanings from Meskwaki, another Algonquian language: ‘depend on, sit on, dance on, lie together with, lie having, be buried with, keep holding, have as a mother, marry, have as a friend, have as an enemy’. Dahlstrom (2013), discussing a range of Algonquian languages, observes that these are verbs whose objects are not affected patients.


(13)  phon  [ahpe:nemo1-wa4,2,3 [o:si:me:h-ani]6]5
             [depend.on1-3/ind4,2,3 [his-younger.sibling-anim.obv.sg]6]5
      syn   [cat  V, NP[his-younger.sibling]6
             lid  depend.on1
             arg  [person 3rd]2
             arg  [person 3rd]3
             ths  intransitive4 ]5
      cs    [λy.λx.depend.on′1(agent:x, patient:y)(pro3′ ∪ 6′)(pro2′)]5

Now consider transitive verbs with full NP arguments. As we have seen, the inflected verb expresses a full proposition. The NPs contribute descriptive and referential information to the proposition. The structure is therefore essentially what has been proposed by Jelinek (1985) and Baker (1996): the NPs are adjuncts and bind the pronominal arguments. The constructional analysis is straightforward, extending the analysis of Italian agreement in section 5.2.⁸ The NPs have the relevant morphological properties and perform the same function as the ARGs in constructions such as direct and inverse (5). Thus the morphological inflection corresponds both to the NP and to the verbal argument. These correspondences are shown in (14).

(14)  a. mihkwa:-wa maskisin-a
         be.red-3pl shoe-pl
         ‘The shoes are red.’
      b. phon  [mihkwa:1-wa2 maskisin3-a4]5
               [be.red1-3pl2 shoe3-pl4]5
         syn   [ [cat  V
                  lid  red1
                  arg  [person 3rd, number pl]2 ],
                 [cat  NP
                  lid  shoe3
                  person 3rd
                  number pl4 ] ]5
         cs    [λx.red′1(theme:x)(s3′[plural4] ∪ 2′)]5

⁸ The similarity is noted by Kayne (2005a, 7) and Baker (2008, 8).


(15)  a. wa:pam-e:wa o-si:me:h-ani
         look.at-3>3′/ind his-younger.sibling-anim.obv.sg
         ‘He looks at his younger brother.’
      b. phon  [wa:pam1-e:wa2,3,4 o-si:me:h-ani6]5
               look.at-3>3′/ind his-younger.sibling-anim.obv.sg
         syn   [ [cat   V
                  lid   look.at1
                  arg1  [person 3rd]3
                  arg2  [person 3rd]4
                  ths   transitive2 ],
                 [cat  NP
                  lid  younger.sibling
                  person 3rd
                  number sg ]6 ]5
         cs    [λy.λx.look.at′1(agent:x, patient:y)(3′)([younger.sibling′[sg]]6 ∪ 4′)]5

6.3.3 Complexity in ASCs

To summarize this section, I have shown that the constructional framework is flexible enough to capture the idiosyncrasies of Plains Cree argument structure, a morphologically complex polysynthetic language. All of the Plains Cree constructions considered in this section fall under the general head-marking schema of (10), repeated here without the GF tier, which is irrelevant in Plains Cree.

(10)  Schema: Head-marking ASC
      phon  [1,2(4)]3
      syn   [ ([cat  NP
                lid  lid1
                ϕ    φ4 ]),
              [cat  V
               lid  lid2
               arg  φ4 ], ... ]3

Much of the complexity of Plains Cree resides in the paradigm function that spells out the combinations of arguments of the transitive verb as individuated bound morphemes in phon. The direct/inverse marking of the verb manages the correspondence of the transitive arguments with the CS arguments. Another factor, which is not directly related to argument structure, is that the form of the verbal root often depends on the animacy of the theme argument, both in the transitive and the intransitive. This property can be captured


by slightly expanding the licensing conditions to include animacy in the formulation of the arguments, so that we arrive at the general constructional schema in (16).

(16)  phon  [1(2,3,4,5)]6
      syn   [cat  V
             lid  lid1
             arg  [person   person2
                   number   number3
                   animacy  [animate|inanimate]4 ]
             ...
             ths  [direct|inverse
                   animate|inanimate
                   transitive|intransitive ]5 ]6

Along related lines, it is noteworthy that there is considerable variation cross-linguistically in which arguments may appear incorporated into complex verbs, even among closely related languages. For example, consider incorporation in the Gunwinyguan languages of Australia, documented by Evans (2017, 316–17). In the most restricted case, Bininj Gun-wok and Dalabon, only absolutive arguments, that is, intransitive subjects and transitive objects, can be incorporated. In Warray, absolutive arguments, destination, and source can be incorporated. In Rembarrnga, absolutive arguments, location, destination, source, and means can be incorporated. And in Wubuy, absolutive arguments, themes, locations, and instruments may be incorporated. This variation is clearly thematic, and reflects familiar hierarchical relations. A constructional approach conforms nicely to the observed diversity, in that it allows for variability and generalization in the thematic specification, along the lines shown in (17). Θ is a non-agentive section of the thematic hierarchy, and incorp is the incorporated argument.

(17)  phon  [...-2-1-...]3
      syn   [cat  V
             lid  lid1
             arg  [person incorp]2
             ... ]3
      cs    [λx.1′(θ ∈ Θ:x, ...)(2′)]3
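The cross-linguistic variation just described amounts to each language licensing incorporation for a different region of the thematic hierarchy. A minimal sketch, with role labels and the set-based check as expository assumptions, encodes the Gunwinyguan pattern reported by Evans (2017):

```python
# Sketch of per-language incorporable-role regions (Θ in (17)),
# following the Gunwinyguan data cited in the text. Role labels and
# the set-membership check are illustrative simplifications.

INCORPORABLE = {
    "Bininj Gun-wok": {"absolutive"},
    "Dalabon":        {"absolutive"},
    "Warray":         {"absolutive", "destination", "source"},
    "Rembarrnga":     {"absolutive", "location", "destination", "source", "means"},
    "Wubuy":          {"absolutive", "theme", "location", "instrument"},
}

def can_incorporate(language, role):
    """A role incorporates only if it falls in the language's region Θ."""
    return role in INCORPORABLE[language]

print(can_incorporate("Warray", "source"))       # True
print(can_incorporate("Dalabon", "instrument"))  # False
```

The constructional statement in (17) does exactly this work: the grammars differ only in how far down the non-agentive section of the hierarchy the region Θ extends.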


An illustration is given in (18), using data from Rembarrnga in Evans (2017) (citing Saulwick 2003, 338–45).

(18)  phon  [ga4-juɭa2-yuʈ1]3
             3min.s-water-run(npst)
             ‘It runs into the water.’
      syn   [cat  V
             lid  run1
             arg  [person water]2
             arg  [person 3rd]4
             ... ]3
      cs    [λy.λx.run′1(agent:x, direction:y, ...)(pro4)(w)]3

6.4 Split intransitive

I turn now to split intransitive languages. These are languages in which the morphological form of the intransitive argument appears to depend for the most part on the semantic role of the argument or on lexically determined aspectual properties of the verb. In typical examples, the marking of the intransitive argument is the same as that of a transitive argument with the same or a similar role.⁹ This is in contrast to ergative systems, where the argument of the intransitive is marked the same as the patientive argument, regardless of its thematic properties. GFs appear to play no role in these languages in characterizing the correspondence between thematic roles and morphosyntactic form. I illustrate with a few examples. In Guaraní, the pronominal prefix for an intransitive actor is a-, while the prefix for intransitive theme of a state is šé-, as shown in (19a). (19b) shows that the patient of a transitive is marked with the prefix šé-, while (19c) shows that the agent of a transitive is marked with a-.

(19)  Guaraní (Mithun 1991, 511)
      a. a-xá.          ‘I go.’
         a-puʔã́.        ‘I got up.’
         šé-rasɨ́.       ‘I am sick.’
         šé-ropehɨí.    ‘I am sleepy.’

⁹ For reviews of patterns of split intransitivity, see van Valin (1990) and Creissels (2008).


      b. šé-rerahá.           ‘It will carry me off.’
         šé-yukà varã́ moʔã́.   ‘He would have killed me.’
      c. a-gwerú aı̃́na.        ‘I am bringing them now.’
         ha upépe a-gařá šupé  ‘and there I caught him’

As expected, argument marking in split intransitive languages can be accomplished with case, word order, or agreement morphology. Donohue (2008, 28) cites Ambonese Malay as a language in which word order distinguishes the thematic properties of the intransitive argument—the order in transitives is SVO, while the order in intransitives is SV for agentive subjects and VS for patientive subjects. Again, the constructional formulation is straightforward.

(20)  a. phon  [1–2]3
         syn   [S NP1, V2]3
         cs    [2′(agentive:1′)]3
      b. phon  [2–1]3
         syn   [S NP1, V2]3
         cs    [2′(patientive:1′)]3

Another pattern can be seen in Acehnese, in which agentives are marked as proclitics and patientives as enclitics.

(21)  Acehnese (Durie 1988)
      kèe h’an geu-patéh-kuh gopnyan
      me not 3-believe-1 (s)he
      ‘(S)he doesn’t believe me.’

The sole argument of an intransitive is expressed as a proclitic or an enclitic depending on its role.

(22)  Acehnese (Durie 1988)
      a. geu-jak gopnyan
         3-go (s)he
         ‘(S)he goes.’
      b. gopnyan rhët-(geuh)
         (s)he fall-(3)
         ‘She falls.’


Here, the constructional formulation is similar to the one for Guaraní, with head-marking instead of dependent-marking.

(23)  a. phon  [2+1–4]3   (e.g. geu2+jak1 gopnyan4; cf. (22a))
         syn   [ [category  V
                  lid       lid1
                  arg       [person person]2 ], NP4 ]3
         cs    [λx.1′(agentive:x)(2′ ∪ 4′)]3
      b. phon  [4–1+2]3   (e.g. gopnyan4 rhët1-(geuh2); cf. (22b))
         syn   [ [category  V
                  lid       lid1
                  arg       [person person]2 ], NP4 ]3
         cs    [λx.1′(patientive:x)(2′ ∪ 4′)]3

Languages with split intransitive marking are not nominative-accusative, since the intransitive argument does not pattern with what we would identify as the subject of the transitive. Nor are they ergative-accusative, since the intransitive argument does not pattern with the object. Consequently, it appears that there is no basis for categories such as subject and object in these languages (Mithun 1991; Dryer 1997).
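The Guaraní pattern in (19) can be rendered as a two-way choice keyed directly to the thematic role, with no GF mediating the correspondence. In the sketch below, the ASCII prefixes ‘a-’ and ‘se-’ stand in for a- and šé- of the cited examples, and the role labels are expository assumptions.

```python
# Sketch of Guarani-style split-intransitive marking: the sole argument
# of an intransitive takes the agentive or the patientive prefix according
# to its thematic role, matching the transitive pattern. ASCII 'se-'
# stands in for the prefix glossed še- in the cited data.

PREFIX = {"agentive": "a-", "patientive": "se-"}

def mark_intransitive(root, role):
    """Select the pronominal prefix from the argument's thematic role."""
    return PREFIX[role] + root

print(mark_intransitive("xa", "agentive"))      # a-xa   'I go'
print(mark_intransitive("rasi", "patientive"))  # se-rasi 'I am sick'
```

The point is that the lookup is from role to form directly: nothing in the mapping corresponds to ‘subject’ or ‘object’, which is exactly the observation made in the surrounding text.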

6.5 The emergence of grammatical functions

I conclude this chapter with a reprise of the question that originally motivated this study, which is, Where do the GFs subject and object come from? Are they primitives of the theory, or are they constructional generalizations of more restricted argument structure constructions?10 I argue in this section for a perspective on GFs that is close to that of Keenan (1976). Keenan argues that there is a diverse set of properties that characterize the notion of subject. Many of them are semantic, e.g. binding of reflexives and control of infinitival complements, while others are morphosyntactic or syntactic. Languages vary according to which collection of properties is associated with a particular configuration, case-marking, or agreement pattern. Keenan argues that some types of sentences are semantically more basic than others; these appear to be more or less what have been called prototypical transitives. The subjects of such sentences have most if not all of the typical properties, while subjects of non-basic sentences may have fewer.11 As suggested in the introduction to this chapter, and touched on in the discussion thus far, there is considerable typological evidence to suggest that the GFs are not universal primitives. Rather, they are reflections of the fact that in some languages, a particular grammatical device corresponds not to a particular thematic role, but to an argument in a thematic hierarchy. In such a case the actual thematic role is determined by the semantics of the verb, by voice marking of the verb, and perhaps other factors. Let us consider, for example, what it means for an NP to be the subject of a sentence in English. Effectively, there are no necessary CS properties associated with this argument in virtue of having the highest GF. That is, an NP may be licensed in the ‘subject’ position regardless of its thematic properties, or even if it lacks thematic properties, if other licensing conditions are satisfied. When there are two thematic arguments, the subject must be the most agentive argument that is licensed by the verb and that is consistent with the voice marking of the verb in the sentence (Dowty 1991). More generally, as a first approximation we can envision a hierarchy based on the thematic features, with agentive features outranking patientive features, such that the argument that is higher in the hierarchy corresponds to ‘subject’ in syn.12 Figure 6.1 shows the feature lattice of Grimm (2011).

10 The literature on these questions is vast, and I do not have the space to review it adequately here. For some representative work, see Bhat (2002); Bickel (2011); Burgess et al. (1995); Dryer (1997); Dziwirek et al. (1990); Givón (1997); Johnson (1974); Marantz (1984); Mithun (2012); Palmer (1994); Paul (2010); Primus (2009); Schachter (1976, 1996); Shibatani (1977); Siewierska & Bakker (2012); Witzlack-Makarevich (2011); Wolvengrey (2005).
Following Grimm, let us suppose that the ASCs in a language with case marking consist of correspondences between regions of this lattice and particular morphological forms. As we saw in our examination of differential marking, the number and scope of such correspondences may vary among languages. But considerations of simplicity predict that the regions will be coherent, not arbitrary. So while there can be a case marking for experiencers that distinguish them from volitional agents, we would not expect to find such

11 For a critique of Keenan’s proposals, see Johnson (1977), who argues, correctly I believe, that the notion ‘subject’ itself plays no theoretical role when the various properties are properly identified. See also Siewierska & Bakker (2012), who review the notion of ‘strength’ of a GF in terms of the variety of semantic roles that it can bear. 12 There are exceptions to the general pattern, e.g. where a verb licenses a non-thematic subject, cases like frightens where a verb requires a thematic assignment that conflicts with the hierarchy, in idioms whose arguments simply lack thematic properties, in obligatorily passive constructions such as Sandy is said by everyone to be a genius.

OUP CORRECTED PROOF – FINAL, 20/6/2021, SPi

[Figure not reproduced: a lattice over the features Instigation, Motion, Sentience, and Volition, running from Maximal Agent (all four features) down to Maximal Patient along an Agentivity Axis and a Patientivity Axis, with levels labeled Total Persistence, Qualitative Persistence (Beginning), and Existential Persistence (Beginning).]

Figure 6.1 Agentive/patientive feature lattice. (Grimm 2011, 15)

a case marking for disjoint regions of the lattice, e.g. experiencers and objects that lack independent existence. And if such a situation were to arise through an historical accident, we would expect it to be unstable.13

Culicover & Jackendoff (2005), drawing on insights of Relational Grammar, propose that the passive construction in English is consistent with this type of cs-syn correspondence. As discussed in section 2.3, in the passive the highest-ranking argument is typically an oblique, and the subject is lower on the hierarchy. The passive can be understood as a construction that backgrounds the highest-ranking argument by marking it as oblique. In this way the next highest argument in the hierarchy is the one that corresponds to subject. As discussed by Givón (2009, chapter 3), there are a number of functionally

13 See Steinert-Threlkeld & Szymanik (2020) for a computational demonstration that learning of color terms is optimal when the meaning is a compact region of the semantic space. In principle, the same notion of optimality should apply to correspondences with any semantic space, and in particular the correspondence between grammatical marking and thematic properties.


equivalent ways in which this can be done. Similarly, an antipassive construction backgrounds the most patientive argument by marking it as oblique.

Furthermore, in many languages the verb agrees with a designated syntactic argument, quite independently of its thematic function or case marking. Such agreement makes it appear that this designated argument is 'subject'. The examples in (24) from Basque illustrate this point; similar examples can be drawn from many languages with split intransitivity or ergative-absolutive case marking.

(24) Basque (Creissels 2008)
a. Gizon-ak ur-a edan du
   man-sg.erg water-sg.abs drink.pfv aux.prs.P3sg.A3sg
   'The man has drunk the water.'
b. Gizon-a etorri da
   man-sg.abs come.pfv aux.prs.S3sg
   'The man has come.'
c. Ur-ak irakin du
   water-sg.erg boil.pfv aux.prs.P3sg.A3sg
   'The water has boiled.'

Here, the case marking may be sensitive to the thematic properties of the argument, with absolutive picking out the theme, or case may be lexically governed. But verbal agreement appears to depend on the highest-ranked argument in the thematic hierarchy.

A similar situation holds in Choctaw (Heath 1977; Broadwell 2006), in which arguments are head-marked according to the grammatical function hierarchy, but also dependent-marked according to thematic functions. Choctaw is normally subject-object-verb, with nominative case on subjects and optional accusative case on objects. E.g.

(25) Choctaw (Broadwell 2006, 39)
John-at tákkon(-a) chopa-h.
John-nom peach(-acc) bought
'John bought a peach.'

However, the orders OSV and SVO are also possible (generally accompanied by intonational breaks). In these instances, accusative case is obligatory on the object.


(26) Choctaw (Broadwell 2006, 39)
Tákkon-a John-at chopa-h.
peach-acc John-nom bought
'John bought a peach.'

At the same time, there are three case-specified series of pronominal affixes that attach to the verb: agentive, patientive, and dative. Agentive is used to denote active or voluntary activity (e.g. 'to go', 'to stand', 'to die'). Patientive is used with inactive or involuntary intransitives ('to be sick', 'to be good', etc.).

(27) Choctaw (Heath 1977, 204)
a. Či-pi:sa-li-h
   2pat-see-1agent-prst
   'I see you.'
b. Či-(y)abi:ka-h
   2pat-be.sick-prst
   'You are sick.'
c. Iš-iya-h
   2agent-go-prst
   'You are going.'

Subjects of stative intransitive verbs are marked with the patientive marker, while the subjects of active intransitive verbs are marked as agentive.

In a language like English the generalization that there is a syntactic realization of the highest argument on the hierarchy is a real one—this syntactic realization is what we call 'subject.' In English, the licensing of a clause requires that what we call 'subject' is realized overtly in phon. In the absence of a highest argument in this position, there must be an expletive it or there to satisfy the licensing conditions.14

The view that I am arguing for here thus accepts a variant of the classical MGG approach to GFs in English, in that they are configurationally realized. However, I do not take the particulars of the grammar of English to be universal, and thus allow for languages in which this constructional generalization does not hold.

14 Effectively, every predicate licenses an external argument that must be realized overtly in certain constructions. A verb like rain requires external it, which corresponds to the subject GF in It's raining and I expect it to rain. The English raising-to-object construction requires that the subject GF1 of the infinitival predicate correspond to the object GF2 of the higher verb. For discussion, see Culicover & Jackendoff (2005, chapter 6).


The problem before us, now, is to explain how ASCs for subject and object could have evolved from ASCs that express correspondences between morphology and thematic properties directly. There are two parts to the problem: (i) how did configuration and position become the relevant criteria in place of case marking, and (ii) how did the narrow semantic conditions on particular correspondences between case and thematic features come to be lost in favor of a thematic hierarchy? The connection between these two phenomena is clear: configuration and linear order offer significantly fewer options for distinguishing various regions of the thematic lattice than does case marking. So as configuration becomes the mechanism for marking argument structure, the corresponding regions must be expanded. In this way, 'agent' becomes reinterpreted as 'most agentive'. On the other side of the coin, differential subject marking can persist only when the ASCs involve case marking or head-marking. On this view, the loss of case marking in languages like English is not a cause, but at least in part a result, of the emergence of configurational ASCs. I explore this idea in more detail in Chapter 8.
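The arithmetic behind "fewer options" can be made concrete with a toy contrast. This is my own illustration, not the author's model: the three-region case inventory is hypothetical, and the two-slot configurational mapping stands in for subject and object positions.

```python
# Toy contrast (invented illustration): a case system can in principle
# give distinct exponents to several lattice regions, while a
# configurational system distinguishes arguments only by relative rank,
# collapsing those regions into 'most agentive' vs. the rest.

# Hypothetical case system: three semantically narrow regions, three markers.
case_marking = {"volitional_agent": "erg", "experiencer": "dat", "theme": "abs"}

def configurational_marking(args_ranked_by_agentivity):
    """Only position is available: first slot = subject, second = object."""
    return dict(zip(["subject", "object"], args_ranked_by_agentivity))

# The case system keeps experiencers distinct from volitional agents...
print(len(set(case_marking.values())))   # 3 distinct markers

# ...but configuration offers just two slots, so either argument type is
# simply 'subject' whenever it is the highest-ranked argument in its clause:
print(configurational_marking(["volitional_agent", "theme"])["subject"])
print(configurational_marking(["experiencer", "theme"])["subject"])
```

However many regions the semantics distinguishes, the configurational mapping has only as many cells as there are positions, which is the sense in which 'agent' must be reinterpreted as 'most agentive'.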

6.6 Summary

I have argued in this chapter that the work of distinguishing arguments in order to reliably map them to thematic roles does not necessarily involve grammatical functions (GFs) such as subject and object. In some languages it does, and in some it does not. Using constructions, we can state the correspondences and accommodate the observed variation. We do not need to assume that there is a uniform underlying syntactic structure and configuration for subject and object that is realized in different ways in different languages. GFs emerge when thematic roles do not correspond directly to overt form in phon, whether morphological marking or linear order. The GFs then play an indirect role in correspondences, mediating between CS and morphosyntactic representations.


7 A′ constructions

This chapter outlines a constructional approach to A′ constructions. In mainstream approaches, these constructions are typically derived through movement: the moved constituent appears in a designated non-argument position, and its function is captured by an invisible copy in its canonical position. The constructional analysis captures the generalizations about A′ constructions without appealing to overt movement, covert movement, movement of invisible constituents, or invisible levels of representation, and relates A′ constructions to non-A′ constructions with similar interpretive properties.

In section 7.1 I review the properties that A′ constructions share, and how these properties have been captured in mainstream syntactic theory. A key property is that these constructions have a 'gap' that corresponds to a binding relation in CS. Section 7.2 isolates the essential functions of A′ constructions, and develops a constructional analysis. Section 7.3 looks at 'in situ' constructions that have the functions of A′ constructions, but not the form. I show how the constructional approach is able to capture the common functions without stipulating that an in situ construction has a gap in its syntactic representation. Section 7.4 extends the analysis of A′ constructions with gaps to less well-studied cases. There I show how a constructional approach accounts for the properties of these constructions without stipulating that they share all of the syntactic properties of wh-questions and relative clauses. Section 7.5 shows how a constructional approach is able to accommodate in a natural way the range of A′ and A′-like constructions found crosslinguistically.

Language Change, Variation, and Universals: A Constructional Approach. Peter W. Culicover, Oxford University Press. © Peter W. Culicover 2021. DOI: 10.1093/oso/9780198865391.003.0007

7.1 Foundations

I begin by reviewing the properties that characterize A′ constructions. Conventionally, an A′ construction is one in which an argument or an adjunct is in a designated position and acquires its functional interpretation


in virtue of a structural relationship with a gap elsewhere in the structure. The designated position is an A′ position. As we will see, this characterization is somewhat theory-dependent, since in many so-called A′ constructions, there is no obvious candidate for a constituent in an A′ position. Typically, but not always, there is a gap in some functional position. And there is a binding relation, at least in CS.

To appreciate the difference between an A′ position and the canonical argument position that corresponds to it, consider the examples in (1). Qα is the interrogative operator in CS that takes scope over the proposition and binds a variable corresponding to the object.

(1) a. You are eating a pizza.
    CS: eat′(you′,p)
b. What are you eating ___?
    CS: Qα.eat′(you′,α)

It has been firmly established for decades that what signals that what is the object and hence the patient of eat cannot be the initial position in (1b). This function is canonically expressed in English by the post-verbal position, indicated by '___' in (1b) and corresponding to a pizza in (1a). In classical terms, this argument position is governed by the head eating.1 In order to properly interpret (1b), it is necessary to establish some kind of relationship between the clause-initial constituent and the post-verbal position, which is where a corresponding non-interrogative phrase typically resides. This relationship is in principle unbounded, which precludes the possibility of interpreting the A′ constituent as an argument or adjunct of the nearest verb; cf. (2).

(2) What did Otto say Sandy claimed (everyone knew . . .) you were eating ___?

The relationship between the A′ constituent and the argument (A) position is usually marked in contemporary linguistic theory by coindexing the A′ constituent and some designated symbol, e.g. t (for trace), or a phonetically null copy of the A′ constituent, in the position of the gap.
This phonologically null constituent bears the appropriate configurational relationship to the verb. (3) illustrates.

1 In various instantiations of MGG this position is assigned case and a thematic role under government either by the verb, or by an invisible functional head that is closely connected to the verb; see Chomsky (1981) for one influential proposal.


(3) a. whati are you eating ti / whati
b. whati did Otto say Sandy claimed you were eating ti / whati
c. whati did Otto say Sandy claimed everyone knew you were eating ti / whati

The A′ constituent and ti or whati constitute a chain. The standard device in mainstream theory for marking scope and licensing A′ chains is movement,2 where the structure containing the A′ constituent in its canonical position is mapped into a structure in which the A′ constituent is in A′ position and there is a coindexed trace or copy in the canonical position. The height of attachment of the moved constituent in the derived structure corresponds to its scope in the interpretation. There are alternatives that do not involve multiple structures, but treat the chain as a relation defined over a single syntactic representation (Brame 1976; Koster 1978; Pollard & Sag 1994). Following Culicover & Jackendoff (2005) and much work in a variety of non-derivational frameworks, I assume the latter perspective here.

The main point of this chapter is to show that it is possible and, in fact, desirable to account in constructional terms for the typology of constructions for CS functions like (1b). To understand how the current constructional approach differs from classical derivational approaches in generative grammar, it is useful to first review briefly the main points of Chomsky (1977). In that piece, Chomsky observed that a significant number of constructions in English have certain striking properties in common. Among other things,

(i) they have a gap in the canonical position, like (1b);
(ii) they appear to obey island constraints (Ross 1967);
(iii) they often, but not always, have an overt constituent in an A′ position;
(iv) they express unbounded dependencies between the A′ position and the gap, like (2).

A problem that Chomsky had to deal with in accounting for A′ constructions is property (iii). While all of the constructions that he considered do express an unbounded dependency, in some cases there does not appear to be a constituent in an A′ position that participates in this dependency. A typical example is (4).

2 internal merge in the Minimalist Program


(4) I baked [NP the pizza [S Otto says Sandy claimed . . . you are now eating ti]]

There does not appear to be a constituent in A′ position in the relative clause Otto says Sandy claimed . . . you are now eating ti that forms a chain with ti. In order to account in a uniform way for all A′ constructions, Chomsky assumed that there is always a constituent in the A′ position, even if it is not visible. That is, it is syntactically present but has no phonological form. Thus property (iv) holds in all cases, by stipulation. On this account, the sentence in (4) has the representation in (5), where OP is an invisible phrase (an empty operator) in A′ position.

(5) I baked the pizza OPi Otto says Sandy claimed . . . you are now eating ti.

I refer to those cases where there is an overt A′ constituent as visible chains, and those where there is no A′ constituent as invisible chains. In the remaining sections I sketch out how to capture in a constructional approach the properties that such constructions have in common, as well as the idiosyncrasies of the invisible chain constructions.

Looking at A′ constructions from the perspective of constructional typology, there are two main questions that we have to address. First, for each type of A′ construction—that is, an A′ construction that expresses a particular kind of CS relation—what are the possible formal correspondences that do this work? Second, how do these possible alternatives compare in terms of complexity, and can the distribution of common, rare, and unattested correspondences be accounted for in terms of complexity?

7.2 Doing A′ work

7.2.1 Gaps and chains

As discussed in Chapter 3, what is at issue from the perspective of grammatical theory is whether or not the same expressive work is necessarily done by the same syntactic devices in all languages. The simplifying assumption of MGG is that it is; the result has been a (parametrized) Universal Grammar.3

3 And to the extent that the work appears to be done by morphology, it has led to a syntacticized theory of morphology, i.e. Distributed Morphology (Embick & Noyer 2007; Halle & Marantz 1993, 1994; Harley & Noyer 1999).


But as we have seen, the richness of grammatical variation suggests that UG is an imperfect and impoverished idealization at best. When it is extended to account for the actual phenomena, the idea of a restricted set of parametric variations around a universal core either falls short in terms of descriptive adequacy, or becomes a way of implementing constructional variation in terms that are not well-suited for it. That is, it is cryptoconstructional in the sense of Chapter 3.

The phenomena of A′ constructions highlight this issue particularly clearly. In all languages it is possible to express the kinds of relations that in English are expressed by wh-questions, relative clauses, topicalization, and so on. But not all languages have these constructions, at least not in the form that they take in English. The question thus arises: given that these functions are all in the CCore, how do we account for the variation?

A clue comes from a comparison of wh-questions with relative clauses. Informally, a wh-question clause defines a property and asks the hearer to supply one or more entities (in the abstract logical sense) that have this property.4 For example, if I ask Who ate the pizza? I am asking the addressee to supply the value of the individual or individuals α such that the property [λx.x ate the pizza] holds of α. If I ask What are you reading?, I am asking you to supply the identity of the thing α such that the property [λx.you are reading x] holds of α. And so on.

In contrast, a relative clause attributes a property to an entity. For example, in the man who ate the pizza, the reference of the NP headed by man is the man α such that the property [λx.x ate the pizza] holds of α. While the constructions are syntactically different, they do share certain syntactic properties and certain corresponding semantic properties.
The central assumption that I make here is that it is possible to decompose the work that a question does into three parts: (i) it invokes the question operator that represents the fact that the speaker is asking a question, (ii) it specifies the kind of entity or property α that the question is about, and (iii) it expresses the property that holds of α. Similarly for relative clauses, ceteris paribus. (However, topicalization may be different, for reasons that I note in section 7.2.3.) Crucially, in English the property denoted by these constructions corresponds to the part of the A′ configuration that contains the gap. For the

4 For formal approaches to the semantics of questions, see Ginzburg & Sag (2000); Krifka (2001, 2008) and work cited there.


two questions in (1) above, for example, we may represent the meaning as in (6).5

(6) a. whoi [ti ate the pizza] ⇔ Qα.λx.eat′(x,p)(αperson)
b. whati [you read ti] ⇔ Qα.λx.read′(you′,x)(αthing)

In (6a), Qα corresponds to "supply the identity of α", λx.eat′(x,p) corresponds to "such that the property 'x ate the pizza' ", and αperson, as the argument of this property, corresponds to "holds of the person α". Thus, the meaning of who ate the pizza is "supply the identity of the person α such that the property 'x ate the pizza' holds of α." Similarly for (6b).6

The key to the constructional typology is that it locates the universal properties of these kinds of questions (and other constructions) in the CS representations that they express, not in the syntactic representations that express them. So the questions in (6) in a language that lacks overt wh-phrases in initial position do not actually have wh-phrases in initial position at some invisible level of syntactic representation, or invisible wh-phrases in initial position. But they do have the same CS representations.

From the point of view of constructions, such a language would lack an A′ construction for this function. That is, the language has a construction that expresses the CS function, but does the job without a correspondence with an A′ chain. Since we are concerned not with universals of syntactic form, but with universals of CS function, I discuss such cases here with the A′ constructions, that is, those that involve a gap configuration.

The correspondence between the gap configuration in an A′ construction and the CS property is central to this analysis. Since there are many A′ constructions, a plausible hypothesis is that what they share is the property component of their CS interpretation. So, unlike in the MGG approach, what is common to A′ constructions is not the syntactic operation of wh-movement that produces the A′ chain.
It is the semantic representation that corresponds to the tail end of the A′ chain. In this way, we relocate the universal from the syntax to the semantics. We have seen the beginning of how this works in the analysis of the wh-questions. The non-A′ part of the sentence, typically a clause containing a gap, corresponds to its regular CS representation. The gap corresponds to the

5 In order to make things as clear as possible, I depart from standard logical structures by incorporating aspects of Jackendoff's formalism for CS, in particular the superscript for binding (Jackendoff 1987). As usual, the reader should feel free to substitute any notation that s/he prefers.

6 I am assuming here for simplicity that a questioned subject in English forms a chain in syn, as does a questioned adjunct of time, place, etc. There are other ways to get the correspondence with the semantic representation without making this assumption; see (12) below.


λ-bound variable x. The constructional correspondence may thus be represented as (7).

(7) Construction: Gap
⎡ phon  [. . . –∅1– . . .]2     ⎤
⎢ syn   [S . . ., XP1, . . .]2  ⎥
⎣ cs    λx.2′(x1)               ⎦

This construction treats the canonical position of the A′ constituent (if there is one) as syntactically real but corresponding to null phonological material. In essence, it is a trace. 2′ is the interpretation of the whole sentence, minus the part that corresponds to the gap. If just one constituent is questioned, as in (6), then the constituent in the canonical position will correspond simply to a λ-bound variable. For example, you read t has the interpretation in (8). Similarly for relative clauses.

(8) λy.λx.read′(agent:x, patient:y)(you′)(y) ⇒ λy.read′(agent:you′, patient:y)

Since the portion of phon corresponding to the XP is empty, we have a correspondence between the phonological gap, an argument or adjunct in canonical position in syn that lacks any lexical content, and a λ-bound variable in CS.7 This correspondence is uniform across the A′ constructions. What the CS function applies to, and whether there is an XP spelled out elsewhere in phon, are matters for individual constructions. So, for example, in the case of the interrogative, the lexical representation of the wh-phrase is exemplified by (9a), with a CS representation of the form λP.Qα.P(α).8 In a question, it appears in phon in initial position and in CS it takes the representation of the clause S2 containing the gap as its argument, as in (9b).9

(9) a. what
⎡ phon  what1                     ⎤
⎢ syn   ⎡ category     NP    ⎤   ⎥
⎢       ⎢ subcategory  wh    ⎥   ⎥
⎢       ⎣ lid          what1 ⎦   ⎥
⎣ cs    [λP.Qα.P(αthing)]1        ⎦

7 For arguments that adjuncts also bind null syntactic constituents, see Hukari & Levine (1995).

8 This formulation is a variant of that in Muskens (2003).

9 Here, as elsewhere, using S to refer to the clause sets aside questions about whether it is a projection of a functional head.


b. Construction: wh-question
⎡ phon  [1–2]3            ⎤
⎢ syn   [S XP[wh]1, S2]3  ⎥
⎣ cs    [1′(2′)]3         ⎦

So the phon of what did you eat? corresponds to (10).

(10) [λP.Qα.P(αthing)]1(λx.eat′(you′,x)2)
⇒ Qα.λx.eat′(you′,x)(αthing)
⇒ Qα.eat′(you′,αthing)

One important feature of this representation is that there is no direct syntactic relationship between the NP what and the canonical position that corresponds to the gap. The chain has three parts: (i) the correspondence between what in syn and Qα in CS, (ii) the link between Qα and α in CS, and (iii) the correspondence between α and an empty NP in canonical position (Culicover 2009).

Notice also that (9b) does not say anything about the presence of a gap. S2 must contain an empty constituent of the same type as XP1, so that the gap correspondence is licensed and the interpretation is well-formed. If there is no object gap, the S does not correspond to a property, and the combined interpretation is incoherent. For example, the ungrammatical sentence (11a) has the representation (11b).

(11) a. *What did you eat a pizza?
b. λP.Qα.P(αthing)1(eat′(you′,p)2) ⇒ Qα[eat′(you′,p)2(αthing)1]

Since there is no gap, there is no λ-bound variable x and no way to assign a proper interpretation to α. But if what is questioned is a subject, then there does not need to be a gap. All that is necessary is that the meaning denotes a property, i.e. that it contains a λ-bound variable. This is seen in the representation of the construct for Who called? in (12).

(12)
⎡ phon  [who1–called2]3                          ⎤
⎢ syn   [S NP[who]1, V[called]2]3                ⎥
⎢ cs    [λP.Qα1.P(αperson,1)(λx.call′2(x))]3 ⇒   ⎥
⎢       [Qα1.λx.call′2(x)(αperson,1)]3 ⇒         ⎥
⎣       [Qα1.call′2(αperson,1)]3                 ⎦


There is no gap in this construct because the λ-bound variable x in the representation of the verb call corresponds to the subject.
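The function-application steps in (10) and (11) can be paraphrased executably. The following is my own toy encoding, not the author's notation: the wh-word is a higher-order function λP.Qα.P(α), and the gapped clause is a one-place property.

```python
# Executable paraphrase (a toy, not the author's formalism) of the
# composition in (9)-(11): the wh-word denotes λP.Qα.P(α), and the
# gapped clause denotes a property (a one-place function).

def what(P):
    """λP.Qα.P(α_thing): bind the variable and apply the property to it."""
    return ("Q", "a_thing", P("a_thing"))

# 'did you eat __' ⇔ λx.eat'(you',x): the gap supplies the λ-bound variable.
gapped_clause = lambda x: "eat'(you'," + x + ")"

# (10): what + gapped clause ⇒ Qα.eat'(you',α_thing)
print(what(gapped_clause))

# (11): with no gap, the clause denotes a closed proposition, not a
# property, so there is nothing for the operator's variable to bind
# (*What did you eat a pizza?).
gapless_clause = "eat'(you',p)"
print(callable(gapless_clause))   # False: composition cannot proceed
```

The typing does the work here: the grammatical question succeeds because the clause is a function awaiting the bound variable, and the ungrammatical one fails because a gapless clause is already saturated.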

7.2.2 Relatives

Next, I extend the constructional approach to the analysis of English relative clauses, following the analysis of Culicover (2011). The common part of the correspondence is the non-A′ part of the sentence that contains the gap. The precise form of the relative clause corresponds to the identification of the operator and its representation in phon. As in questions, the chain includes a binding relation between the operator and the variable in CS. English finite relative clauses come in three main varieties: wh-, that-, and ∅-, as summarized in (13).

(13) a. the book whichi you read ti (wh-relative)
b. the book that you read ti (that-relative)
c. the book you read ti (∅-relative)

The interpretation in each case is, informally, "the book α such that the property [λx.you read x] holds of α".

(14) . . . [S you read ti] ⇔ λx.read′(you′,x)

Each relative construction defines a slightly different part of the syntactic representation that corresponds to the 'operator', in this case, the head that the property is attributed to.

(15) a. N1 [S XP[wh] . . .] ⇔ 1′α.λP.P(α) [wh-relative]
b. N1 [S that . . .] ⇔ 1′α.λP.P(α) [that-relative]
c. N1 [S ∅ . . .] ⇔ 1′α.λP.P(α) [∅-relative]

These correspondences, combined with the correspondence for the relative clause itself as an adjunct of the head, give the CS representation for (13) in (16).

(16) book′α[λP.P(α)(λx.read′(you′,x))]
⇒ book′α[λx.read′(you′,x)(α)]
⇒ book′α[read′(you′,α)]

Informally, this is the representation of "the book α such that you read α".
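The shared semantics of the three relative varieties in (13) can likewise be paraphrased as a toy of my own (not the author's notation): whichever form the relativizer takes, the relative clause contributes a property that is intersected with the head noun's denotation.

```python
# Toy paraphrase (invented illustration) of (13)-(16): wh-, that-, and
# ∅-relatives differ in form, but each combines the head noun with the
# property denoted by the gapped clause.

def relative(head_property, rc_property):
    """N1 + [S ... gap ...]: the entities α that have both properties."""
    return lambda a: head_property(a) and rc_property(a)

book = lambda a: a in {"novel", "manual"}        # book'
you_read = lambda x: x in {"manual", "letter"}   # λx.read'(you',x)

the_book_you_read = relative(book, you_read)
print([a for a in ["novel", "manual", "letter"] if the_book_you_read(a)])
```

Since the combination is stated once over the property, nothing about it depends on which relativizer introduced the clause, mirroring the fact that (15a-c) all share the same CS side.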


The interpretation of S2 as a property in (16) is what is shared by all A′ constructions. Hence it applies to pied-piping (17) and, if further extended to [N N, VPinf], to infinitival relatives (18).

(17) a. the book [on which]i you put the glass ti
b. the book [on which]i to put the glass ti

(18) a book (*which) (for NP) to read ti

Each of these syntactic configurations participates in a correspondence whose CS representation is essentially the same as the ones in (15). If the infinitival VP construction in (18) lacks a subject, it has the interpretation of arbitrary control. The key is that the constituent in the syntax that corresponds to the gap in phon also corresponds to the λ-bound variable x in CS, while in the case of pied-piping, α is part of the representation of the A′ constituent. To illustrate this last point, let us suppose for the sake of the example that the CS representation of on which is on′(α). Then the CS representation of (17a) is (19).10

(19) book′α[λP.P(on′(α))(λx.put′(arb,glass′,x))]
⇒ book′α[λx.put′(arb,glass′,x)(on′(α))]
⇒ book′α[put′(arb,glass′,on′(α))]

As this derivation shows, reduction has the effect of what has been accomplished by 'reconstruction' in classical treatments of A′ constructions—the variable α falls under the scope of the operator just as though it were in situ (Sternefeld 2000; Culicover 2011).

7.2.3 Topicalization

A final construction to consider as we develop the foundations for A′ constructions is topicalization, illustrated in (20).

(20) a. [People like that]i, I like ti.
b. He may be educated, but [smart]i he isn't ti.
c. [On the table]i, I put the remains of the dinner ti.
d. [That Sandy is sorry]i, I refuse to believe ti.
e. [Slowly]i, Sandy stirred the milk into the sauce ti.

10 I omit representation of the deontic force associated with to in this construction. For discussion, see Landau (2001).


It appears at first sight that topicalization is like wh-questions and relativization, and traditionally it has been treated as being essentially the same (e.g. Chomsky 1977). In fact, they all appear to involve potentially unbounded chains, as in (21).

(21) On the table, Sandy said that Chris believed that . . . I put the remains of the dinner t.

However, unlike in wh-questions and relativization, there is no apparent pro-form in topicalization that corresponds to a variable that can be bound by an operator in CS. Moreover, it is not obvious that the proper interpretation of topicalization should be expressed in terms of such an operator. The situation is further complicated by the fact that topicalization may apply to VPs in English, as in (22).

(22) a. They said that we should put the beer that we bought in the swimming pool . . . and [put the beer that we bought in the swimming pool] we did.
b. They said that we should put something in the swimming pool . . . and [put in the swimming pool] we did [the beer that we bought].
c. They said that we should put the beer that we bought somewhere cool . . . and [put the beer that we bought] we did [in the swimming pool].

Similar VP-topicalization is even freer in German (Culicover & Winkler 2019). For example,

(23) German (Culicover & Winkler 2019)
a. Den Mercedes 220 gekauft hat keiner.
   the.acc Mercedes 220 bought has no.one.nom
   'No one bought the Mercedes 220.'
b. Gekauft hat den Mercedes 220 keiner.
   bought has the.acc Mercedes 220 no.one.nom
   'No one bought the Mercedes 220.'
c. Verkaufen wird er seinen Mercedes 220 nie.
   sell.inf will he his.acc Mercedes 220 never
   'He will never sell his Mercedes 220.'
d. Seinen Mercedes 220 zu verkaufen, versuchte er erst gar nicht.
   his.acc Mercedes 220 to sell.inf tried he first even not
   'He didn't even try to sell his Mercedes 220.'


     e. Zu verkaufen versuchte er seinen Mercedes 220 erst gar nicht.
        to sell.inf tried he his.acc Mercedes 220 first even not
        'He didn't even try to sell his Mercedes 220.'

And as is well known, the position of the topicalized constituent in English is not the same as that of the wh-phrase in A′ position. In main clauses, the topicalized phrase precedes a wh-phrase (24a), while in embedded clauses, it follows the complementizer that (24b).

(24) a. To Sandy, which book are you planning to give?
     b. I think that to Sandy you should give this book.

It is possible, therefore, that topicalization is not a special case of an A′ construction, contrary to appearances. A possible alternative is that there is no operator, and the topicalized constituent is in a non-canonical position in phon and must be interpreted as though it is in its canonical position. Thus, the topicalization construction would state simply that an XP may appear in clause-initial position. On such an analysis, topicalization would be a variant of long-distance scrambling (Ross 1967). Scrambling in languages like Japanese, Russian, and German allows for the relatively free ordering of constituents, preserving the interpretation corresponding to the hierarchical structure in syn. To work out such an analysis in detail and compare it to one in which topicalization is a special type of A′ construction would take us far beyond the scope of the current work. At this point, therefore, I leave the question open.

7.3 Scope in situ

In this section I show how to express correspondences in which an element scopes over a clause but is in its canonical position. A typical case is that of wh-questions, but the phenomenon occurs as well with relatives in some languages. As noted in section 7.2.1, I classify such constructions as A′, because they do the work of A′ constructions even though they do not literally involve a syntactic alternation between a constituent in canonical position and a constituent in an overt A′ position.



7.3.1 Wh-in-situ

Let us begin with wh-in-situ. In some languages, such as French, wh-in-situ can be used to ask a direct question. An example is (25).

(25) Tu manges quoi?
     you eat what
     'What are you eating?'

In order to license an interrogative interpretation for wh-in-situ, we must have a construction in which the interrogative phrase corresponds to an operator in CS that scopes over the clause and binds a variable. The interrogative phrase must also correspond to a λ-bound variable in the clause. The construction is given in (26).

(26) Construction: wh-in-situ
     syn  [S . . . , YP[wh]1, . . .]2
     cs   [λP.Q1α.P(α1)(λx.2′(x))]

This construction licenses the French example in (25), as shown in (27).

(27) phon [tu3–manges4–quoi1]2
     syn  [S NP[tu]3, V[manges]4, NP[quoi]1]2
     cs   [λP.Q1α.P(αthing,1)(λx.eat′4(you′3,x))]2 ⇒ [Q1α.eat′4(you′3,αthing,1)]2
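The β-reduction in (27) can be mirrored with ordinary lambdas. The following Python sketch is my own illustration, not part of the book's formalism; CS terms are represented as strings, and all names are illustrative.

```python
# A minimal sketch of the reduction in (27): the clause contributes the
# property λx.eat'(you', x); the wh-in-situ construction wraps it in the
# question operator Q_α and applies it to the operator-bound variable α.

def eat(agent, patient):
    return f"eat'({agent},{patient})"

def wh_in_situ(clause_property):
    """λP.Q_α.P(α): build a question term by applying the clause
    property to the operator-bound variable α."""
    alpha = "α_thing"
    return f"Q_α.{clause_property(alpha)}"

clause = lambda x: eat("you'", x)   # λx.eat'(you', x) for 'Tu manges quoi?'
print(wh_in_situ(clause))           # Q_α.eat'(you',α_thing)
```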

English also has wh-in-situ in multiple wh-questions such as (28).

(28) a. Who read what?
     b. Which student read which book?
     c. Who thinks that we should visit which museum?

But English allows wh-in-situ only when there is a wh-phrase in A′ position, which must be a licensing condition on the correspondence in (26). In the CS representation of wh-in-situ in English, the clause containing the wh-phrase in situ must correspond to an expression with a wh-operator scoping over the clause, with the position of the wh-phrase corresponding to a λ-bound variable. I represent these questions with multiple operators. For (28c), for example, the representation would be (29).


(29) λP.Qα.P(αperson)(λR.Qβ.R(βmuseum)(λy.λx.think′(x, should-visit′(we′,y)))) ⇒
     λP.Qα.P(αperson)(Qβ.λy.λx.think′(x, should-visit′(we′,y))(βmuseum)) ⇒
     Qα.λP.P(αperson)(Qβ.λx.think′(x, should-visit′(we′,βmuseum))) ⇒
     Qα.Qβ.λx.think′(x, should-visit′(we′,βmuseum))(αperson) ⇒
     Qα.Qβ.think′(αperson, should-visit′(we′,βmuseum))

Multiple wh-questions require special constructions, given that they do not occur in some languages, and take decidedly different forms in others (Dayal 2017). In the English case, the wh-phrase is spelled out in its canonical position, and the interpretation is an extension of that of the simple wh-question. It allows the wh-in-situ to be interpreted as a wh-question with a Q operator only if it is in the scope of a wh-phrase in A′ position.

(30) syn  [S XP[wh]1, …, YP[wh]3, …]2
     cs   [λP.Q1α.P(α1)(λR.Q3β.R(β3)(λy.λx.2′(x,y)))]
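The multi-step reduction in (29) can likewise be checked mechanically. Here is a hedged Python sketch in my own encoding (the predicate names are glosses, not the book's notation): the curried property is applied to the two operator-bound variables in turn.

```python
# Mirrors the chain of β-reductions in (29): the curried property
# λy.λx.think'(x, should-visit'(we', y)) is applied first to the
# museum variable β, then to the person variable α.

def think(x, p):
    return f"think'({x},{p})"

def should_visit(x, y):
    return f"should-visit'({x},{y})"

# λy.λx.think'(x, should-visit'(we', y))
body = lambda y: (lambda x: think(x, should_visit("we'", y)))

alpha, beta = "α_person", "β_museum"
result = f"Qα.Qβ.{body(beta)(alpha)}"
print(result)  # Qα.Qβ.think'(α_person,should-visit'(we',β_museum))
```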

This construction licenses (28c) along the lines sketched out in (29). Since the wh-in-situ construction does not restrict the S that the wh-phrase scopes over, the dependency can be local or long distance. Hence, we immediately account for the well-known Chinese data from Huang (1982) in (31). Whether the wh-phrase gets a narrow or a wide-scope interpretation depends on whether the verb is one that can select an interrogative complement. Ask takes an interrogative complement, believe and think take only declarative complements, and know takes either, as the translations show.

(31) Chinese (Huang 1982)
     a. Ni xihuan shei?
        you like who
     b. Zhangsan wen wo [shei mai-le shu].
        Zhangsan ask me who bought books
        'Zhangsan asked me who bought books.'
     c. Zhangsan wen wo [ni maile shenme].
        Zhangsan ask me you bought what
        'Zhangsan asked me what you bought.'
     d. Zhangsan xiangxin [shei mai-le shu].
        Zhangsan believe who bought books
        'Who does Zhangsan believe bought books?'


     e. Zhangsan renwei [ni maile shenme].
        Zhangsan think you bought what
        'What does Zhangsan think you bought?'
     f. Zhangsan zhidao [shei mai-le shu].
        Zhangsan know who bought books
        'Who does Zhangsan know bought books?' / 'Zhangsan knows who bought books.'

7.3.2 In situ in polysynthesis

A language in which the wh-phrase is in situ in a different sense is Plains Cree. Blain (1997) shows that the wh-question in Plains Cree is a focusing construction similar to a cleft, with the wh-phrase as the focus. The typical wh-question resembles a relative clause, where the subordinators ka- or e- shown in (32) introduce the subordinate clause and the wh-phrase is obligatorily in initial position.11

(32) Plains Cree (Blain 1997)12
     a. awi:na ka:-oce:m-a:-t Mary-wa
        who-prox rel-kiss-dir-3 Mary-obv
        'Who(prox) kissed Mary(obv)?'
     b. awi:na ka:-wa:pam-at
        who-prox rel-see-2>3
        'Who did you see?'
     c. awi:na e:-itwe:-yan e:-ite:yiht-am-an John e:-oce:m-a:-t
        who conj-say-2 conj-think-th-2 John conj-kiss-dir-3
        'Who did you say you think John kissed?'

In CS the wh-phrase corresponds to a wide-scope operator, ka- corresponds to a λ-operator, and the proximate/obviative marking on the wh-phrase specifies the bound argument. The correspondence for (32a) is given in (33).

11 There are differences between ka- and e- that I do not pursue here. It appears that ka- is a relative marker while e- is simply a subordinator. What this means is that in the wh-question with e-, the λ-operator has to be introduced by the wh-phrase itself as part of the construction, while in the ka- case there is a compatibility between the interpretation of the subordinate clause and the interpretation induced by the wh-operator.
12 Key to the glossing: prox proximate, obv obviative, rel relative, conj conjunction, th theme marker, dir direct (thematic alignment).


(33) phon [awi:na1–ka:1-[oce:m2-a:7-t3]4 mary5]6
          [who-prox1 rel1-kiss2-dir7-33 Mary-obv5]6
     syn  [S NP[who]1, NP[Mary]5, V(category V; lid kiss2; arg 3rd1; arg 3rd3; theme direct7)4]6
     cs   [Q1α.λy.λx.kiss′2(agent:x, patient:y)(αperson,1)(m3∪5)]6
          ⇒ [Q1α.kiss′2(agent:αperson,1, patient:m3∪5)]6

What we see here is a relatively transparent correspondence between the morphosyntactic representation and the conceptual structure. Mixed strategies for wh-questions can also be found. In Meskwaki, another Algonquian language, in a main clause wh-question a wh-phrase is in leftmost position and binds an argument in the inflected verb, as in Plains Cree. The examples in (34) illustrate.13

(34) Meskwaki (Dahlstrom 2019)
     a. kašiča:hi išina:kosiwaki?
        kaši=ča:hi išina:kosi-waki
        how=so appear.thus-3p/ind.ind
        'What did they look like?'
     b. we:ne:hča:hi i:ni e:ta?
        we:ne:ha=ča:hi i:ni IC-i-ta
        who=so that IC-say.thus-3/part/3
        'Who said that?'

However, in an embedded question the verb is inflected in the 'interrogative order'. In this case, the inflection on the verb is sufficient to indicate that the embedded clause is interrogative, and the interrogative operator binds one of the arguments marked in the verbal inflection. Example (35) illustrates.

(35) Meskwaki (Dahlstrom 2019, citing Michelson 1930, 118)
     wi:hasemiha:kwe:hini e:hpwa:wi–kehke:nema:či
     e:h-pwa:wi–kehke:nem-a:či [IC-wi:h-asemih-a:kwe:hini]
     aor–not–know-3>3′/aor IC-fut-help-3>3′/int.part/3′
     'He (prox) didn't know whom (obv) he (prox) should help.'

13 The gloss ind.ind indicates that the verb is inflected as an indicative of the 'independent order', one of the verbal inflectional paradigms of the language. part marks the participial order, yet another paradigm. IC indicates 'initial change ablaut rule'.


In this example, the interrogative inflection on the embedded verb indicates that the variable corresponding to the obviative argument—the object of 'help', indicated by 3′—is bound by the wh-operator.

7.3.3 Other in situ

To the extent that a CS interpretation involves an operator of some sort, we should expect to find examples of languages in which this operator corresponds to a constituent in situ. Relative clauses in Korean are a case of this phenomenon. Korean not only has wh-in-situ, it has internally headed relatives. An example of an internally headed relative is given in (36a), where the head noun is in situ in the relative clause. Example (36b) is an externally headed relative clause, where the relative clause is an adjunct to the NP-final head noun.

(36) Korean (Chung & Kim 2002, 43)
     a. Tom-un [NP [S sakwa-ka cayngpan-wi-ey iss-nun] kes]-ul mekessta
        Tom-top apple-nom tray-top-loc exist-pne kes-acc ate
        'Tom ate an apple, which was on the tray.'
     b. Tom-un [NP [S cayngpan-wi-ey iss-nun] sakwa]-ul mekessta.
        Tom-top tray-top-loc exist-pne apple-acc ate
        'Tom ate an apple that was on the tray.'

The interpretation of the internally headed relative works essentially the same way as does wh-in-situ. The internal NP corresponds to the operator in CS, and the rest of the clause functions as the property that applies to it.14

(37)

7.3.3 Other in situ To the extent that a CS interpretation involves an operator of some sort, then we should expect to find examples of languages in which this operator corresponds to a constituent in situ. Relative clauses in Korean are a case of this phenomenon. Korean not only has wh-in-situ, it has internally headed relatives. An example of an internally headed relative is given in (36a), where the head noun is in situ in the relative clause. Example (36b) is an externally headed relative clause, where the relative clause is an adjunct to the NP-final head noun. (36) Korean (Chung & Kim 2002, 43) a. Tom-un [NP [S sakwa-ka cayngpan-wi -ey iss-nun] Tom-top apple-nom tray-top -loc exist-pne kes]-ul mekessta kes-acc ate ‘Tom ate an apple, which was on the tray.’ b. Tom-un [NP [S cayngpan-wi -ey iss-nun] sakwa]-ul Tom-top tray-top -loc exist-pne apple-acc mekessta. ate ‘Tom ate an apple that was on the tray.’ The interpretation of the internally headed relative works essentially the same way as does wh-in-situ. The internal NP corresponds to the operator in CS, and the rest of the clause functions as the property that applies to it.1⁴ (37)

     syn  [NP [S NP1, …]2]
     cs   [1′α.λP.P(α)(λx.2′(x))]

14 Korean also has topicalization, which might suggest that it allows A′ constructions. But since Korean has null pronouns, it is entirely possible that what appears to be topicalization does not have a gap, but a resumptive pronoun. In this case the construction is one that explicitly links the A′ constituent to the variable that corresponds to the pronoun.
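As a rough illustration of how the internally headed relative composes, here is a hedged Python sketch (my own simplification of the formalism in (37); the predicate names are glosses, not the book's notation).

```python
# The internal head supplies the restriction on the operator-bound
# variable; the remainder of the relative clause is the property
# predicated of it, as in (36a) 'the apple that was on the tray'.

def interp_internal_relative(head_property, clause_property):
    """1'_α.λP.P(α)(λx.2'(x)), flattened: conjoin the head's
    restriction with the clause property over the bound variable α."""
    alpha = "α"
    return f"{head_property}({alpha}) ∧ {clause_property(alpha)}"

cs = interp_internal_relative("apple'", lambda x: f"on-tray'({x})")
print(cs)  # apple'(α) ∧ on-tray'(α)
```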



7.3.4 Cryptoconstructional in situ

It should be noted that the classical approach to wh-in-situ in MGG is to assume that there is movement that does not result in visible reordering of phonological material. As reviewed by Cheng (2009), this movement can be of several different kinds, depending on theoretical assumptions: it can be invisible movement at LF, a level of syntactic representation that does not correspond to phon; it can be invisible movement of a feature that is not spelled out; or it can be copying of a constituent such that the copy is not spelled out in the moved position, but the original is spelled out in the original position. These are all cryptoconstructional solutions to the extent that they require stipulations in order to get wh-questions in one language to behave differently than wh-questions in another language. Stipulating a direct correspondence between the wh-phrase and its interpretation, as I propose here, seems more straightforward. Moreover, these movement-based approaches predict that the properties of wh-in-situ should be similar, if not identical, to overt movement. But, as Cheng notes, wh-in-situ does not behave like overt movement in many respects. This raises the very real possibility that there is no movement in the in situ constructions, and that the constraints apply only to the processing of dependencies that involve disjoint segments of phon, however they are characterized in formal terms. For discussion, see Culicover (2013c) and the references cited there.

In summary, we have seen that there is a family of constructions that can be viewed from two perspectives. On the one hand, there are members that can represent the same CS relation—interrogation, relative clause, and so on. On the other hand, there are members that use the same formal devices for correspondences with syn and phon—basically, A′ or in situ. All of these options can be naturally characterized in terms of the formal vocabulary of constructions.

7.4 Extensions of A′ constructions

In English there are many constructions that display some or all of the properties of A′ chains. This observation was the basis for Chomsky's (1977)


proposal to unify them by treating them all as the result of wh-movement. In this section I show that such constructions can be easily integrated into the constructional account of A′ constructions without appealing to movement, deletion, or invisible A′ constituents. A number of these constructions are listed in (38), with illustrations. In each example the gap is marked as t. All of these constructions permit long chains in principle, although some are pragmatically less felicitous than others.

(38) a. pseudo-clefts: What (Lee says) Sandy ate t was a pizza.
     b. clefts: It was a pizza that (Lee says) Sandy ate t.
     c. too/enough: This pizza is {too hot / hot enough} to {eat t / admit that you ate t}.
     d. tough/fun: This pizza will be {tough / fun} to {eat t / admit that you ate t}.
     e. Slifting: Sandy is intelligent, I think (Lee said) t.
     f. for: A pizza like this is good for {eating t / denying that you ate t}.
     g. worth: This pizza is not worth {eating t / denying that you ate t}.
     h. Comparative correlatives: The more pizza (Lee says) I eat t, the happier (Lee says) I am t.
     i. Comparatives: Sandy is taller than (Lee says) Chris is t.

What is characteristic of many of these examples is that there is no apparent A′ constituent. In (38a,b) the antecedent of t is arguably the focus a pizza, which is not in an A′ position. In (38c,d), it appears that the antecedent of t is the subject of the sentence, which is not an A′ position. This problem is a longstanding one in movement accounts of chains, and is addressed at length by Chomsky (1977, 1981), who invokes the movement of invisible wh-phrases in order to derive them. For us, however, the problem is to state the correspondence between syntactic structure and CS in such a way that the interpretation associated with the Gap construction contributes to the overall interpretation. Consider the tough/fun construction, for example. Assume that the syn-cs correspondence is as in (39).15

15 For simplicity, I do not include the option that the complement of fun is an infinitival clause, e.g. For us to eat the pizza is fun.


(39) syn  [AP A[fun]1, VP2]3
     cs   [fun′1(2′)]3

This correspondence is simple, and in fact follows from the more general default head-complement correspondence. But it actually accounts for the three variants in (40), with some plausible additional assumptions.

(40) a. To eat the pizza is fun.
     b. It is fun to eat the pizza.
     c. The pizza is fun to eat.

Taking [to eat the pizza] in (39) to correspond to VP2, (40a) is licensed by (39). The construct for (40b) is (41). In this case we must further assume that, as a default, a sentence without a thematic subject must have dummy it as the subject—this is the 'last resort' strategy found in many approaches.

(41) phon [it–is–fun1–[to–eat]2–[the–pizza]3]4
     syn  [S [AP A[fun]1, [VP V[eat]2, NP[the,pizza]3]]]4
     cs   [fun′1(eat′2(arb,p3))]4

On the other hand, if VP contains an NP that corresponds to a gap, that correspondence is licensed by Gap. The entire VP corresponds to λx.eat′(arb,x), and this in turn is predicated of the subject by a general correspondence.

(42) phon [[the–pizza]3–is–fun1–[to–eat]2–∅5]4
     syn  [S NP[the,pizza]3, [AP A[fun]1, [VP V[eat]2, NP5]]]4
     cs   [λx.fun′1(eat′2(arb,x))(p3)]4 ⇒ [fun′1(eat′2(arb,p3))]4

I assume that similar correspondences license too/enough, the semantics of which are somewhat more complex. It is of course necessary to consider what it is about fun that makes this correspondence possible. Why is (43b) not acceptable?

(43) a. Chris is happy to bake the pizza.
     b. ∗The pizza is happy to bake t.

The answer is that happy does not select as its theme argument a VP that denotes a proposition. Cf. ∗To bake the pizza is happy; ∗It is happy to bake the pizza. The explanation thus is a lexical-semantic one—everything else follows from the simple correspondences and the selectional restrictions of the predicates. Fun selects a proposition, and happy selects an individual.16

Next, let us consider the two focus constructions, pseudo-cleft and cleft. The pseudo-cleft has an overt A′ constituent. But to get the focus to be understood as filling the gap in the pseudo-cleft, it is necessary to state a rule of interpretation for the copula (Jacobson 1984). In the cleft, the constituent in focus position is arguably the antecedent of the gap, but the syntax does not make this relationship transparent. Movement of the focus from the that-clause is not plausible, because the focus position is not an A′ position,17 and the that-clause is effectively a relative clause. However, a constructional formulation is straightforward, as in (44).18

(44) phon [it1–…–be2–3–4]5
     syn  [S NP1, …, V[be]2, YP3, S4]5
     cs   [4′(3′)]5
          focus(3′)

Here, the focus constituent YP3 is an argument of the cleft clause in CS. Since this clause contains a gap, as in (38b), the interpretation is well-formed.

(45) phon [it1–was2–[a–pizza]3–[that–Sandy–ate–∅]4]5
     syn  [S NP1, …, V[be]2, NP[a pizza]3, [S that, Sandy, ate, NP]4]5
     cs   [λx.eat′(sandy′,x)]4(p3)
          focus(p3)
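The composition in (45) can be sketched directly: the gapped that-clause denotes a property that applies to the focus. The following Python illustration is my own hedged encoding, not the book's formalism.

```python
# (45): the that-clause with a gap corresponds to λx.eat'(sandy', x);
# the cleft construction applies it to the focused constituent and
# additionally marks that constituent as the focus.

def eat(agent, patient):
    return f"eat'({agent},{patient})"

cleft_clause = lambda x: eat("sandy'", x)   # that Sandy ate ∅
focus = "pizza'"

cs = cleft_clause(focus)
print(cs)                 # eat'(sandy',pizza')
print(f"focus({focus})")  # focus(pizza')
```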

Consider next the worth construction. The VP with the gap corresponds to λx.F(x). Worth takes this as its argument, and the entire expression is predicated of the subject. The interpretation of (46a) is shown in (46b).

16 A special construction is needed for a nice person to talk to in (i), since the interpretation does not follow immediately from applying the Gap construction of to talk to to a nice person.
(i) Chris is a nice person to talk to.
17 However, the focus position is not an argument position; it is a dedicated position with a special focus function. In this respect it is similar to A′ positions on the left edge of the clause. The notion of A′ position in MGG is configurational, not functional. So under usual MGG assumptions it is problematic to call the focus position of a cleft or pseudo-cleft an A′ position.
18 I am ignoring some of the complexities of the English copula construction in the syn representation of (44) in order to focus on the essential points. Most importantly, when the main verb is be and there is no auxiliary verb in Aux, be functions as the Aux.


(46) a. The pizza is worth eating.
     b. [cs worthwhile′(λx.eat′(arb,x)(p))]

As in the case of fun, whether a predicate may participate in this construction is a function of the selectional properties of the adjective. Worth is very idiosyncratic: the predicate must select a complement that denotes an action, the complement must be progressive, the complement must be extraposed, and the complement must contain a gap. So we do not get ∗Sandy is happy seeing, because happy does not select an action or a progressive complement; hence ∗Seeing Sandy is happy; ∗It is not happy seeing Sandy, etc. Difficult does select an extraposed action complement in the progressive, but the complement cannot contain a gap when the complement is progressive; hence Eating pizza is difficult; It is difficult eating pizza; ∗Pizza is difficult eating, but Pizza is difficult to eat. For these reasons, the worth construction is restricted to worth and worthwhile. The construction in (47) states the possible correspondences. (2 in phon(a) should be understood as excluding 4, which appears elsewhere in the string.19)

(47) phon [{(a) 4–X–be–1–2 / (b) it–X–be–1–2}]3
     syn  [S A[worth]1, [VP[prog] …, NP4]2]3
     cs   [worthwhile′1(2′)]3

(48) applies this construction to Pizza is worth eating.

(48) phon [pizza4–is–worth1–[eating]2]3
     syn  [S A[worth]1, [VP[prog] V[eating], NP[pizza]4]2]3
     cs   [worthwhile′1(λx.eat′(arb,x)(p4))2]3

Another construction that takes advantage of the interpretation of a clause with a gap, and that does not fall naturally under a movement analysis, is Slifting (Ross 1973). Ross discusses a range of cases exemplified by (49).

19 A possibility that I have not explored is that this option might work for constructions with fun, etc.


(49) a. Sandy is intelligent, I think.
     b. Sandy is intelligent, I think you realize.
     c. Sandy is intelligent, it seems to me.
     d. Sandy is intelligent, I'm happy to say.

On Ross's analysis, the main clause is the complement of the verb in the tag, e.g. think, and is moved to the left of it. This analysis is problematic for a number of reasons, including the fact that a clause with an overt that cannot participate in this construction (50a,b), that not every verb allows slifting (50c), and that the 'slifted' clause in interrogatives is clearly a main clause (50d).

(50) a. ∗That Sandy is intelligent, it seems to me.
     b. ∗Sandy is intelligent, it seems to me that.
     c. ∗Sandy is intelligent, I forgot. [cf. I forgot Sandy is intelligent.]
     d. Is Sandy intelligent, do you think. [cf. ∗Sandy is intelligent, do you think?; ∗Do you think is Sandy intelligent?]

Slifting can of course be derived by assuming that the parenthetical tag contains an invisible operator in the position of the gap that moves to the left edge of the tag and is somehow linked to the main clause. It should be apparent by now that this device is simply a syntacticization of the Gap construction. The interpretation can be derived directly without movement, as shown in (51). This correspondence is an approximation in that it omits the fine details of the function of the parenthetical as a hedge, an evidential, etc. (Asher 2000), as well as the prosodic constraints on where the parenthetical may appear in phon.

(51) Construction: Slifting
     phon [1–2]3
     syn  [S S1, [S2]]3
     cs   [2′(1′)]3

Finally, for discussion of the comparative correlative, see Culicover (2013c). The technical details there are somewhat different from what is proposed here, but the analysis is essentially the same.20

20 I leave the comparative construction aside here since its complications go far beyond what can be encoded as an invisible chain construction. For descriptive details and typological analysis, see Beck et al. (2004, 2009); Beck (2011).
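The Slifting correspondence (51) is simple enough to sketch in the same style. This is my own hedged encoding; the predicate names are glosses, not the book's notation.

```python
# (51): cs is [2'(1')] — the parenthetical tag S2 is interpreted as a
# predicate applied to the proposition denoted by the main clause S1.

def think(subject, proposition):
    return f"think'({subject},{proposition})"

main_clause = "intelligent'(sandy')"     # Sandy is intelligent
tag = lambda p: think("i'", p)           # I think ∅

cs = tag(main_clause)
print(cs)  # think'(i',intelligent'(sandy'))
```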



7.5 Toward an A′ constructional typology

Thus far in this chapter I have developed the general constructional approach to A′ and comparable in situ constructions and shown how to apply this approach to particular correspondences. The shared component of many A′ constructions is the Gap correspondence (7), which specifies that a clause containing an XP that corresponds to ∅ in phon simultaneously corresponds to λx.F(x) in CS, with x corresponding to the XP.

(7) Construction: Gap
    phon […–∅1–…]2
    syn  [S …, XP1, …]2
    cs   λx.2′(x1)

The specific constructions for wh-questions, relative clauses, and so on specify where the phonological material goes that corresponds to the operator. The lexicon specifies precisely how each operator is realized in phon. On the other hand, the in situ constructions provide alternative ways of expressing the same CS function in the syn-phon correspondence. From the perspective of constructional typology, each of these components of the correspondence offers a space for variation. The A′ correspondence must satisfy the following requirements:21

• It must specify how the operator is expressed in phon.
• It must specify how the scope of the operator is marked in phon.
• It must specify how the variable that is bound by the operator is marked in phon.
• It must specify how the canonical position corresponding to the variable is represented.

21 Thus the constructional approach corresponds to simple one-step movement in a derivational approach. It is more restrictive, however, because the derivational approach also allows for derivations that involve multiple steps (Chomsky 1981). One complexity that arises in the derivational approach is that each such step in principle could leave some overt copy of its operation in phon or not. This predicts 2ⁿ possible realizations for an n-step derivation. In fact, what we see is that the overt manifestation of a long-distance A′ dependency is uniform across all potential steps—either there is some visible reflex, or there is not (Hukari & Levine 1995). It is, of course, possible to stipulate how many copies are overt, but the issue does not arise on the constructional approach.

Since the scope of an operator is propositional, the marking of scope in phon must be associated with a corresponding constituent that has a


propositional interpretation. This marking may be more or less transparent, and we would expect the degree of transparency of the marking to correspond to the relative complexity of the construction. Let us call the morphosyntactic device for marking scope the operator marker, the location of the device in phon the operator position, the functional position in the syntactic structure of the operator the source position, and the marking of the source position in phon the source marker. Every construction that licenses a correspondence such as a wh-question or a relative clause must stipulate the operator marker and the operator position. For example, in the English wh-question the operator marker is the wh-phrase itself, the operator position is the left edge of the clause, and the source marker/position in phon is a gap. In a language with wh-in-situ, the operator marker may be morphological, in which case the operator position is typically the verb of the clause that the operator scopes over, or there may be no operator marker, and the source marker is the operator itself. As the data from Chinese in (31) in section 7.3.1 show, it is not necessary to actually mark the operator overtly in order to get a scope interpretation. Overt marking is, of course, more explicit. Which combination of these options for marking scope is found in a language will depend in part on the morphosyntax of the language. For A′ constructions, for example, if the language is right branching then the left boundary of the clause is a position that is unique to that clause, as shown in (52). Let us represent the operator marker as δ. The S in which δ is in the operator position in these two trees indicates whether the operator scopes over the proposition denoted by S1 or that denoted by S2. The operator position in phon is determined by how high in the tree the operator is attached.

(52) a. Wide scope:   [S2 δ … [S1 …] …]
     b. Narrow scope: [S2 … [S1 δ …] …]


The operator marker δ may be the A′ constituent itself in conventional A′ constructions such as English wh-questions. Or it can be a designated morpheme, such as expletive was 'what' in German partial wh-movement (Fanselow 2017; Sabel 2000), illustrated in (53). In this example was is in the A′ operator position of the main clause, marking the scope of the question, while the operator wen is in the A′ position of the embedded clause.

(53) German
     Was glaubst du, wen der Fritz gesehen hat.
     what think you who.acc the.nom Fritz seen has
     'Who do you think Fritz saw?'

The expletive wh-phrase is an operator marker for conveying the wide scope of wen 'who' without 'long movement'. German lacks wh-in-situ, so the wh-phrase must appear in an A′ position, as seen in (54).

(54) German (Müller 1997)
     ∗Was glaubst du [CP dass sie wann gekommen ist]?
     Q think you that she when came is
     'When do you think she came?'

Similar constructions can be found in languages such as Hungarian (Horvath 1997) and Hindi (Fanselow & Mahajan 2000; Mahajan 2000); see Mycock (2004) for a review of the typology.22 Interestingly, leftward adjunction of δ does not unambiguously mark scope if the language is left branching. This is shown in (55), where phon and syn are conflated for convenience.23

(55) [S2 δ [S1 …] …] vs. [S2 [S1 δ …] …]

22 Note that the expletive strategy does not appear to be employed in simple questions in languages that use the left edge as the operator position. So (i) is not possible in German.
(i) German (Müller 1997, 276)
    ∗Was ist sie warum gekommen?
    Q is she why came
    'Why did she come?'
Thus, the expletive strategy appears to be a way of avoiding long distances between the operator and the gap, a well-known source of computational complexity (Gibson 1998, 2000).
23 'Leftward adjunction' of a constituent δ to a constituent X is the ordering of the form of δ immediately before the form of X in phon.


If the material in S1 precedes everything in S2, then adjoining δ to either S1 or S2 will locate it in the same place relative to the rest of phon, leading to greater interpretive complexity.24 Thus we have a straightforward complexity-based explanation for the typological observation that V-final languages often lack wh-movement. This is Greenberg's (1966) Universal 12.

Universal 12: If a language has dominant order VSO in declarative sentences, it always puts interrogative words or phrases first in interrogative word questions; if it has dominant order SOV in declarative sentences, there is never such an invariant rule.

One alternative strategy is to adjoin the operator marker to the clause-final inflected verb, giving rise to phons corresponding to the orderings [S2 [S1 …V1] …V2+δ] and [S2 [S1 …V1+δ] …V2], or [S2 [S1 …V1] …δ+V2] and [S2 [S1 …δ+V1] …V2]. Logically, δ could be a constituent, e.g. a wh-phrase, with the gap in canonical position.25 But then the operator will be in a position that follows the source position that it binds. As discussed in Culicover (2013c), putting the dependent before the constituent that binds it is more complex in terms of computing the interpretation than the alternative ordering. So on grounds of complexity, we would expect to find that languages that are strongly left branching and have wh-in-situ would have a morphological marking of the inflected verb to indicate the scope. In fact, in left branching languages such as Japanese and Korean that do not permit canonical A′ constructions, scope is marked morphologically on the verb.

(56) Japanese (Watanabe 2001)
     a. dare-ga ringo-o tabeta no?
        who-nom apple-acc ate Q
        'Who ate an apple?'
     b. Boku-wa [CP [IP John-ga nani-o katta] ka] shiritai.
        I-top John-nom what-acc bought Q want-to-know
        'I want to know what John bought.'

24 The interpretive complexity can of course be avoided by putting a constituent of S2 before the embedded left branching S1, but doing this may run afoul of the tendency to put heavy constituents first in V-final languages (Yamashita & Chang 2001) and introduces center-embedding of S as well, which appears to be dispreferred (Kuno 1973).
25 Note that putting it after V would violate linear ordering constraints in a strictly V-final language. 'Rightward movement' in wh-questions is in fact not found in strictly V-final languages (Bresnan 1974; Kayne 1994).


In Hungarian and Hindi, on the other hand, the operator marker is the operator itself, and the operator position is the immediate preverbal position. The following examples show wh-questions with both expletives and the operators in this position.

(57) a. Hungarian (Horvath 1997)
        mit mondott János, hogy ki-vel táncolt?
        what.acc say.past.3sg John.nom that who-with dance.past.3sg
        'With whom did John say that he had danced?'
     b. Hindi (Mahajan 2000, 317)
        Sitaa-ne kyaa socaa, ki Ravii-ne kis-ko dekhaa?
        Sita-erg what thinks that Ravi-erg who-dat see.past
        'Who does Sita think Ravi saw?'

If a language is VSO, then the position of the verb uniquely marks the left boundary of the clause. The most transparent marking would therefore be to put the operator immediately adjacent to the verb, as formulated in Greenberg's Universal 12 above. Adjunction of the operator marker to the right edge of the clause is also a logical possibility, but it is very rare if it occurs at all. According to Whaley (2010), it has been claimed to occur in Khasi and Tennet. Ackema & Neeleman (2002) propose that rightward movement is rare because of processing difficulties. Apparently, wh-movement to the right is possible in sign languages (Cecchetto et al. 2009), but this dislocation occurs in a single clause. Unbounded rightward movement is problematic because of limitations on short-term memory: "when the parser reaches the right peripheral wh-phrase, the unit that contains its trace has already been closed off and the dependency should not be processed" (Cecchetto 2007, 5).

At this point we have the beginning of a constructional typology based on complexity considerations. In left branching languages that are strictly V-final we expect to find only wh-in-situ, with the operator marker being a morphological marking of the verb.
In right branching languages, the operator marker may be the constituent corresponding to the operator, and the operator position may be the left edge of the clause or a designated position with respect to the verb, e.g. immediately preceding the verb. The source marker may be a gap, which corresponds to a λ-bound variable in CS. Or the operator marker may be an expletive in the A′ position at the left edge of the clause, and the operator may be in situ or in a designated position, e.g. the left edge of a lower clause or the immediate preverbal position.



7.6 Summary

I have shown in this chapter how to account for the properties of A′ constructions, and of in situ constructions that do the work of A′ constructions, within our constructional framework. I have suggested that one advantage of the constructional approach is that it allows us to unify the constructions that involve chains without invoking movement or deletion, thereby satisfying the objectives of Simpler Syntax. The constructional formalism also allows us to assess the relative complexity of alternative ways of doing the 'work' of A′ constructions, and to explain why certain alternatives appear to be more common than others.


PART III

CHANGE


8 Constructional change in Germanic

8.1 Introduction

In this chapter I trace the development of the grammars of English and German from a hypothesized protolanguage in constructional terms. I show that the constructional framework offers considerable insight into the nature of the variation and the trajectories of change. The differences between the languages are for the most part minimal in terms of the individual constructions, although their superficial effects may be significant when they are combined in a single grammar. Most importantly, the changes appear to fit well into the framework of constructional change laid out in Chapter 4.1

The chapter is organized as follows. Section 8.2 uses Modern German (ModG) to isolate five basic constructions that give the Germanic languages their character. These have to do with (i) what appears in initial position in the main clause (the result of topicalization), (ii) the position of the finite verb in the main clause, (iii) the position of the verb in a subordinate clause, (iv) the position of the verb in the VP, and (v) the position of the verb in yes-no and wh-questions. Section 8.3 lays out the corresponding constructions in Old English (OE) and traces the development of Modern English (ModE) in terms of competition among constructions, and changes in the licensing conditions of the individual constructions. Section 8.4 pushes the history back to some hypotheses about Proto-Germanic (PG) and the trajectory of change through Old High German (OHG) to ModG.

1 There is a rich literature on syntactic change in English, Germanic more generally, as well as other languages. I do not attempt to address or even cite all of it here. Moreover, many recent articles, monographs, and collections use the MGG framework, especially functional heads and triggered movement, to track the changes. See, for example Kemenade & Vincent (1997); Pintzuk (1999); Fischer et al. (2000); Pintzuk et al. (2000); Roberts & Roussou (2003); van Kemenade & Los (2006); Crisma & Longobardi (2009); Roberts (2010); Biberauer & Sheehan (2013); Pintzuk (2014). Because of space limitations, I am able to reformulate here only a few of the phenomena dealt with in such work in constructional terms.

Language Change, Variation, and Universals: A Constructional Approach. Peter W. Culicover, Oxford University Press. © Peter W. Culicover 2021. DOI: 10.1093/oso/9780198865391.003.0008


Section 8.5 takes a constructional look at verb clusters and the phenomenon of verb projection raising, which produces various orderings of the head and its complements in a number of Germanic languages.

8.2 Basic clausal constructions of Modern German

The basic facts of German word order are these.

• In a declarative main clause there must be a constituent in initial position, and the tensed verb appears in second position (1).

(1) Gestern habe ich dieses Buch gelesen.
    yesterday have.1sg I this book read.past.part
    'I read this book yesterday.'

• In a subordinate clause the finite verb follows all other verbs (2).

(2) Ich sagte, dass ich dieses Buch gestern gelesen habe.
    I said that I this book yesterday read.past.part have.1sg
    'I said that I read this book yesterday.'

• In direct yes-no questions the finite verb appears in first position (3),

(3) Hast du dieses Buch gestern gelesen?
    have.2sg you this book yesterday read.past.part
    'Did you read this book yesterday?'

• while in direct wh-questions the finite verb immediately follows the clause-initial wh-phrase (4).

(4) Welches Buch hast du gelesen?
    which book have.2sg you read.past.part
    'Which book did you read?'

The examples in (5) illustrate further the requirement that the initial position of a main clause must be filled, and that the finite verb must appear in second position. (5a) is well-formed with the subject in initial position and the verb in second position. (5b) shows that these requirements may be satisfied by an expletive in initial position, and (5d–h), repeated from Chapter 7, show that a non-subject and in fact a non-argument may fill the initial position. (6a–c) show that violations of these requirements result in ill-formedness.

(5) German (Culicover & Winkler 2019)
    a. Der Fritz hat dieses Buch gelesen.
       the.nom Fritz have.3sg.pres this book read.past.prt
       'Fritz read this book.'
    b. Es haben gestern viele Menschen Pudding gegessen.
       it have.3pl.pres yesterday many people pudding eat.past.prt
       'Many people ate pudding yesterday.' (Zwart 1997)
    c. Leider hat der Fritz nicht das Buch gelesen.
       unfortunately have.3sg.pres the.nom Fritz not the book read.past.prt
       'Unfortunately, Fritz didn't read the book.'
    d. Den Mercedes 220 gekauft hat keiner.
       the.acc Mercedes 220 bought has no.one.nom
       'No one bought the Mercedes 220.'
    e. Gekauft hat den Mercedes 220 keiner.
       bought has the.acc Mercedes 220 no.one.nom
       'No one bought the Mercedes 220.'
    f. Verkaufen wird er seinen Mercedes 220 nie.
       sell.inf will he his.acc Mercedes 220 never
       'He will never sell his Mercedes 220.'
    g. Seinen Mercedes 220 zu verkaufen, versuchte er erst gar nicht.
       his.acc Mercedes 220 to sell.inf tried he first even not
       'He didn't even try to sell his Mercedes 220.'
    h. Zu verkaufen versuchte er seinen Mercedes 220 erst gar nicht.
       to sell.inf tried he his.acc Mercedes 220 first even not
       'He didn't even try to sell his Mercedes 220.'


(6) a. ∗Der Fritz hat gelesen dieses Buch.
    b. ∗Der Fritz dieses Buch gelesen hat.
    c. ∗Dieses Buch der Fritz hat gelesen.
    d. ∗Dieses Buch hat gelesen der Fritz.

In addition, there are main clause exclamatives and subordinate conditionals with the verb in initial position, and verb-initial order is also possible in a special narrative style (Brandner 2010). It is also possible under certain circumstances for the verb to be in third position in the main clause (Müller 2003). In the following discussion I leave these additional constructions aside. Finally, when there are multiple verbs in a single clause, the positioning of the verbs with respect to one another in languages such as Dutch and Swiss German is another source of variation that affords interesting insights. I take up these cases in a separate section on 'verb clusters' (section 8.5).

8.2.1 Initial position in the clause

For our constructional analysis we need a construction that licenses a root declarative when there is a constituent in initial position. For convenience let us refer to a finite declarative main clause as Sroot, leaving open the question of whether there is a richer structure. 'Root' comprises main clauses and certain subordinate clauses (Blümel 2017; Gärtner 2000; Gärtner 2002; Reis 2006). Correspondingly, the VP must be headed by a finite verb.

Virtually all treatments of German word order in MGG assume that there is a dedicated phrasal position for the topicalized constituent, and a dedicated head position for the finite verb in V2. The clause is assumed to consist of CP, a projection of C0, and IP, a projection of I0, with V2 being movement of the finite verb to I0 and/or C0. I adopt here a constructional variant of this analysis, one that involves neither movement nor functional heads. Crucially, however, I assume that a root clause has a designated, phonologically null specifier that corresponds to the initial position in the clause, and that the topicalized constituent corresponds to this position. This position is the traditional Vorfeld 'prefield' in German. It must be filled by an expletive es if no other constituent appears in it, as in (5b). The first licensing condition is stated in (7). '3⨝1' is used to indicate that the position of X1 is identical to the position of Spec3. Since X1 corresponds to this position, it is not expressed in the phon of S that is subscripted '2'.2

2 I notate the initial constituent in (7) as X, not XP. Following Chomsky (1995a) and other work, I assume that there is no syntactic difference between heads and phrases.


(7) Construction: German Topicalization
    phon  3⨝1–2
    syn   [Sroot Spec3, …, X1, …]2
    cs    2′

Note that I am not treating topicalization in German as an A′ construction in the sense discussed in Chapter 7, since there is no CS variable. Rather, I assume here that it is a type of (possibly long-distance) scrambling, in which a constituent is spelled out in phon in the position of Spec rather than in its canonical position.

It follows from (7) that a finite declarative main clause that lacks a constituent in initial position is not well-formed, unless of course there is another construction that licenses another constituent ordering. Since X may be an expletive, there is no specific requirement in the construction that X in initial position be linked to a null phrase in situ. X cannot be the tensed V, because then the construct would not satisfy the second position licensing condition of the V2 construction, to be developed in section 8.2.2. The initial phrase may be a sentential adverb like leider 'unfortunately' or a sentence connective like also 'therefore'. Or it may be an argument or an adjunct. So, for example, on the assumption that the subject NP in German is a constituent of VP, an SVO sentence like (5a) will be a construct along the lines of (8).

(8) phon  [[der-fritz]2⨝1–[hat3-[dieses-buch]4-gelesen5]]6
    syn   [S Spec1 [VP[fin] [V[fin] hat]3 [VP [NP der Fritz]2 [V′ [NP dieses Buch]4 [V gelesen]5]]]]6
    cs    [λy.λx.read′5(x,y)(b4)(f2)]6

For a topicalized verb, as in (5e), we have (9).


(9) phon  [gekauft2⨝1–hat3–[den-mercedes-220]4–keiner5]6
    syn   [S Spec1 [VP[fin] [V[fin] hat]3 [VP [NP keiner]5 [V′ [V gekauft]2 [NP den Mercedes 220]4]]]]6
    cs    [neg5(λx.λy.(buy′2(x,y))(m4)(pro5))]6

Similarly for other constituents.

8.2.2 Position of the finite verb in the main clause

The finite verb in root clauses is in second position. The licensing condition for the position of the verb in the finite main clause is now given as (10).

(10) Construction: V2
     phon  [1–2–3]4
     syn   [Sroot Spec1, [VP Vroot[fin]2, …]3]4
     cs    4′

Crucially, this construction simulates local movement of the finite V to second position by stating phon so that the finite verb immediately follows the position corresponding to the initial Spec, which together with the topicalized constituent is in initial position. The reason why this analysis only simulates movement is that there is no syntactic structure on which the ordering is defined. Rather, the ordering is a linear 'spelling out' of the constituents according to the hierarchical structure and the interpretation. Such ordering is possible when the linear ordering of the constituents is a local matter. In this case, what is at issue is the linear position of the finite verb in the clause. In the root clause, it appears in second position. In a subordinate clause it appears in final position, as discussed in the next section.



8.2.3 Position of the verb in a subordinate clause

A verb that is not in second position must be in final position in its VP. This is the default position for a verb. The licensing condition is given in (11).

(11) Construction: VP-final V
     phon  [2>1]3
     syn   [VP V1, XP2]3

Following Pintzuk's (2014) proposal for Old English (see also Culicover & Winkler 2019), I assume that in ModG a declarative subordinate clause is an S with a complementizer, e.g. dass 'that', ob 'whether', etc., which selects a tensed VP complement. Subordinate clauses are for the most part not root clauses. Thus, the VP-final V construction licenses final V in VPs of finite subordinate clauses, as well as VP-final past participles. Some subordinate clauses count as root clauses, and these also have V2 (Gärtner 2000).

8.2.4 Position of the verb in questions

Finally, we have to specify the position of the finite verb in a question. In a yes-no question the finite verb is in initial position, and in a root wh-question it is in second position, immediately after the wh-phrase. I formulate the construction for the position of the verb in questions in (12). Since the verb is initial in yes-no questions, it is not possible to invoke V2 to license it unless we assume that there is an invisible constituent in initial position. Since we assume Spec as a constituent of every root clause, this initial position is readily available: it is the position of Spec. The finite verb precedes Spec in questions.

(12) Construction: Inversion (in question)
     phon  2–[1–…]3
     syn   [Sroot Spec1, [VP V[fin]2, …]]3
     cs    Q(3′)

The construction for the position of the wh-phrase was stated in Chapter 7 and is repeated in (13). The wh-phrase is left adjoined to the S that it scopes over, both in main and subordinate clauses. The Gap construction of Chapter 7 accounts for the interpretation.


(13) Construction: Wh-question
     phon  [1–2]3
     syn   [S XP[wh]1, S2]3
     cs    [1′(2′)]3

Since Spec is a constituent of root sentences, root sentences that have the order XP[wh]–V[fin]–… are licensed by Inversion and Wh-question together, if the corresponding interpretation is interrogative.

Variants of these constructions are found in all of the Germanic languages. They form the basis for our investigation into the constructional history of Germanic and the source of variation in the grammars of the Germanic languages. It is important to recognize that characterizing this variation in constructional terms does not introduce complexity into the description—the complexity, such as it is, is inherent in the diversity of possible positions for the finite verb and the possibility of topicalization. Of particular significance is the difference between main and subordinate clauses, which must be stipulated on any account.
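Taken together, the constructions just stated fix the linear position of the finite verb for each clause type of Modern German. As a purely illustrative sketch (the flat token lists, the clause-type labels, and the function name are my own simplifications, not part of the constructional formalism, which of course also involves syn and cs), the licensing conditions on verb placement can be caricatured as position checks:

```python
# Illustrative sketch: the ModG verb-placement constructions as linear-order
# checks over a flat list of constituents. "Vfin" marks the finite verb;
# the other tokens stand in for a topicalized XP, arguments, and adjuncts.

def licensed(clause_type, constituents):
    v = constituents.index("Vfin")
    if clause_type == "decl_main":        # Topicalization + V2: exactly one
        return v == 1                     # constituent precedes the finite verb
    if clause_type == "subord":           # VP-final V: finite verb comes last
        return v == len(constituents) - 1
    if clause_type == "yn_question":      # Inversion: finite verb precedes
        return v == 0                     # Spec, i.e. is clause-initial
    if clause_type == "wh_question":      # Wh-question + Inversion: V[fin]
        return v == 1                     # immediately follows the wh-phrase
    raise ValueError(clause_type)

# (1) Gestern habe ich dieses Buch gelesen.
assert licensed("decl_main", ["gestern", "Vfin", "ich", "diesesBuch", "gelesen"])
# (6c) *Dieses Buch der Fritz hat gelesen.  (V3 in a main clause)
assert not licensed("decl_main", ["diesesBuch", "derFritz", "Vfin", "gelesen"])
# (2) ... dass ich dieses Buch gestern gelesen habe.
assert licensed("subord", ["ich", "diesesBuch", "gestern", "gelesen", "Vfin"])
# (3) Hast du dieses Buch gestern gelesen?
assert licensed("yn_question", ["Vfin", "du", "diesesBuch", "gestern", "gelesen"])
```

The point of the sketch is only that each construction constrains a single local ordering fact, so the grammar of verb placement decomposes into small independent conditions.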

8.3 The development of English

In this section I trace the current structure of ModE back to that of OE in terms of constructional change. There are three major factors leading to the development of ModE.

• First, in OE the verb could occupy several positions, including the final position in the clause, while in ModE the verb is initial in the VP.
• Second, in OE there is robust V2, while in ModE there is no V2 in cases of topicalization, although there is still inversion in questions and in certain other more restricted constructions.
• Third, OE had robust case marking, while case marking has largely been lost in ModE and is reflected only in the pronominals.

At first glance it appears that these changes are unrelated: (i) OV became VO, (ii) V2 was lost except in cases of subject Aux inversion (SAI), and (iii) case was lost. In contrast, I suggest that all three changes are connected, and the connections can be clearly traced in constructional terms. This connectedness among diverse superficial features of a language is what I call 'style' in Chapter 10.


At the center of the analysis is the following proposition: V2 was not lost in the development of ModE; rather, the licensing condition for V2 narrowed from XP in initial position of the clause to NP in initial position. With this change, essentially all of the other constructions of OE are unchanged, or underwent only slight modifications in the transition to ModE. The following sections set out the details of this scenario.

8.3.1 The position of the verb

Let us begin with the position of the verb in OE. Van Kemenade (1987) argues that the VP in OE is invariably OV, with movement of the verb to various positions, while Pintzuk (1993) argues that both OV and VO are possible basic orders.3 My constructional approach follows that of Pintzuk, since, with her, I assume that both constructions are in the grammar and in competition. That is, in addition to the VP-final V construction (11), OE had an active version of VP-initial V, as in (14).

(14) Construction: OE VP-initial V
     phon  [1>2]3
     syn   [VP V1, XP2]3

Given the competition between OV and VO, the shift from OV to VO cannot be characterized as a change of parameter value in the classical sense (Allen 2000, 12). Rather, over time, the construction that licenses VO won out over the construction that licenses OV. The question then arises, why did this happen? And perhaps more significantly, why does the competition between these two orders never result in OV winning out over VO (Kiparsky 1996)? I assume that both VO and OV are always available to speakers even in the absence of positive evidence for one of them (section 4.2). The fact that there are two constituents that must be ordered with respect to one another entails that both orderings in phon are in principle possible. Speakers are able to spontaneously reorder the head in order to reduce complexity on some dimension, even in the absence of positive evidence for the new ordering.

3 The original OV order is supported by the form of compounds, which are head-final (Bean 1983, 49ff). This in turn supports the assumption that Proto-Germanic was OV, a point that I return to in section 8.4.
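The competition dynamic just described (VO and OV both always available, with the winner determined by strengthening against the evidence plus an asymmetry in complexity) is the kind of process simulated in section 4.5. The following toy model is my own illustration, not the book's simulation: each construction has a strength updated toward the orders the learner hears, and a small constant production advantage for VO, standing in for its complexity benefits, is enough to make VO win even from a robustly OV starting state.

```python
import random

random.seed(1)

# Toy model of competition between the OV and VO constructions: strengths
# are updated toward the orders a learner hears, and a small complexity
# advantage (e.g. earlier access to thematic structure) biases production
# toward VO. All numbers here are arbitrary, for illustration only.
strength = {"OV": 0.9, "VO": 0.1}   # start from a robustly OV language
ADVANTAGE = 0.05                    # extra production weight for VO
RATE = 0.01                         # learning rate for strength updates

def produce():
    w_ov = strength["OV"]
    w_vo = strength["VO"] + ADVANTAGE
    return "VO" if random.random() < w_vo / (w_ov + w_vo) else "OV"

for _ in range(20000):
    heard = produce()
    for c in strength:              # strengthen the heard order, weaken rival
        target = 1.0 if c == heard else 0.0
        strength[c] += RATE * (target - strength[c])

print(strength)                     # VO strength ends up near 1
```

The asymmetry of the bias is what makes the outcome directional: with the advantage on the VO side, the only stable state is one in which VO has won, matching the observation that the OV-to-VO shift is not reversed.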


So the question arises, why did OV give way to VO? Taylor & Pintzuk (2012a,b), among others, have argued that the choice between OV and VO turned on the different properties of the two constructions with respect to "grammatical weight, complexity, and information status".⁴ They write:

    there is a clear difference in the frequency of VO for pronominal objects, lexical objects, and clausal objects . . . with more complex objects favoring post-verbal position more strongly. In addition, . . . even among the lexical objects, we find higher rates of VO with complex than with simple objects. (Taylor & Pintzuk 2012a, 837–8)

Along related lines, Pintzuk & Taylor (2006) and others have argued that even in an OV language there can be instances in which V is not VP-final, owing to the extraposition of heavy or focal material postverbally. On such a view, some instances of VO reflect the emergence of a new ordering, while others reflect OV order with extraposed material. At the same time, the language can be VO at some stage, with OV resulting from fronting of O for discourse purposes (Pintzuk 2005). A constructional approach is fully compatible with these multiple options, and appears to account for the data more successfully than a parametric one. If the competition between VO and OV is characterized in terms of binary parameters, it is difficult to capture what is going on in VPs in which the verb is medial. In contrast, in the constructional approach, one VP construction reflects the robustness of the VO order, another reflects the robustness of the OV order. Each order reduces complexity on a different dimension and is licensed by a different construction. In addition, various other constructions license the ordering of direct objects and other constituents in positions that are not licensed by one or another of these VP constructions. To the extent that one VP construction becomes stronger, it may subsume one or more of the other constructions. For example, as VO grows in strength, placement of a constituent before V may be licensed by a weaker construction that allows for XP-V, but is in conflict with the licensing conditions for VO. That is, they both express a correspondence

⁴ Here they are part of a robust tradition in the study of the older Germanic languages, including Bech (2001), Bies (1996), Hinterhölzl & Petrova (2009), Hinterhölzl (2015), Hinterhölzl (2017), Kemenade & Los (2008), Petrova (2009), and Sapp (2011).


with the same CS function. To resolve the conflict, the XP-V construction may weaken further, to the point where it drops out, following Niyogi's Law.

When we consider the position of Aux as well, the situation becomes even clearer. Taylor & Pintzuk (2012b) cite data showing that VO is more frequent when V follows Aux. They argue that VAuxO is an alternative to OVAux, the consequence of positioning O late in the VP for reasons of information structure and grammatical weight. The order AuxVO is compatible both with VO ordering and with positioning O late in the VP for these reasons, as well as with branching harmony. This analysis thus fits into our constructional perspective, and is difficult to reconcile with a parametric approach.

There are effectively five constructions governing the linear ordering of Aux, V, and O. All of the constructions may be active in the grammar of a single speaker (Taylor & Pintzuk 2012b, 38), and some are in competition. There are two constructions governing the ordering of Aux and VP, namely AuxVP and VPAux. There are two constructions governing the ordering of V and O, namely VO and OV. And there is one (complex) construction licensing the clause-final position of O for reasons of information structure and grammatical weight.

In the competition, if there is evidence for the order VO in the PLD, the construction governing that order is strengthened. Then, for reasons of maximizing harmonic ordering, VO favors AuxVP, and so we get a gradual loss of VAuxO. There may be an additional factor supporting VO as well. This is V2—the construction that requires the finite verb to follow the initial constituent. When the finite verb is able to appear before the VP arguments in some main clauses, pressure is put on the system to have all verbs before the VP arguments in all clauses.
Again, the motivation is branching harmony: if other phrases are head-initial, there is some pressure to position V to the left of its complements. At the same time, ordering V before its complements makes the thematic structure available in processing before the arguments are encountered, which optimizes the part of interpretation that assigns thematic roles to the arguments in CS. Of course, there are additional competing pressures from the perspective of constructing the discourse representation, where there is a preference for the arguments coming as early in the processing as possible. The branching harmony motivation is in fact offered by Kiparsky (1996, 154): "it harmonizes the mixed Germanic system in which heads took their complements sometimes to the left and sometimes to the right, depending on a complex set of conditions." In other words, the shift to VO reduces the complexity of a partially disharmonic grammar.
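The five-construction inventory can be made concrete with a toy enumeration (a sketch of my own; the labels AuxVP, VPAux, VO, OV, and the late-O flag are used here only for exposition). Two constructions order Aux relative to VP and two order V relative to O, yielding four basic patterns; the late-O construction is what turns OVAux into VAuxO:

```python
from itertools import product

# The two Aux~VP orders and the two V~O orders yield four basic surface
# patterns; the late-O construction (heavy or informationally prominent
# objects placed clause-finally) adds VAuxO as a fifth. Toy sketch only.
def surface(aux_order, vp_order, late_o=False):
    vp = ["V", "O"] if vp_order == "VO" else ["O", "V"]
    clause = ["Aux"] + vp if aux_order == "AuxVP" else vp + ["Aux"]
    if late_o:                       # reposition O to the clause-final slot
        clause.remove("O")
        clause.append("O")
    return "".join(clause)

basic = {surface(a, v) for a, v in product(["AuxVP", "VPAux"], ["VO", "OV"])}
print(sorted(basic))                         # the four basic patterns
print(surface("VPAux", "OV", late_o=True))   # prints VAuxO
```

The enumeration reproduces the observation in the text: VAuxO is not a basic pattern, but falls out of OVAux plus late placement of O, while AuxVO is doubly supported, being derivable both as a basic pattern and with late O.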



8.3.2 The 'loss' of V2 in English

This brings us to the question of the 'loss' of V2. I suggest that V2 was not actually lost in ModE; it just took on a new character. Specifically, the topic became externalized, and was not associated with the first position in the finite clause. At the same time, the subject NP became the primary occupant of clause-initial position. Thus, the current SVO ordering in English is V2 in the finite clause, with the subject in first position and the finite verb in second position. The scenario is this:

1. The domain in which the position of the finite verb is defined is a clause, and
2. the topic is not considered to be part of this domain, and
3. the initial constituent of the clause is NP instead of XP.
4. Then the basic order in ModE follows without stipulating 'loss' of V2.

Points 1–2 are supported by the observation in Chapter 7 that English topicalization puts the topicalized phrase outside of the wh-phrase in a wh-question. The relevant example is repeated here.

(15) To Sandy, which book are you planning to give?

The externalization of English topicalization is a minimal constructional change from the German topicalization of (7): it is the loss of the requirement that the topicalized constituent correspond to the position of Spec. But this constituent must still be initial in its domain. Externalization was made possible by the narrowing of the OE German Topicalization construction to NP as the first term in S.

In OE there was no absolute requirement that the finite verb appear in second position in the clause (Bean 1983, 79). It was possible in OE for a clause to have topicalization without V2, particularly with a pronominal subject. (16) is an OE example.

(16) Old English (Walkden 2015)
     æfter his gebede he ahof þæt cild up
     after his prayer he lifted the child up
     'After his prayer he lifted the child up.'
     (cocathom2,+ACHom_II,_2:14.70.320)


Bean (1983) provides data showing that even at the earlier stages of documented OE, the verb could be in a variety of positions in both main and subordinate clauses. Bean's percentages of SVX versus V-final sentences for nine epochs in the development of English from pre-755 to 1140 are shown in Table 8.1.⁵

Table 8.1 Percentages of SVX and SXV in OE, from Bean (1983, 67).

    Dates        SVX   SXV+XSV+OSV
    Pre-755      26    47
    755–860      32    34
    865–884      18    20
    885–891      38    19
    892–900      27    12
    958–1001     26    12
    1048–1066    44    16
    1122–1124    38    14
    1132–1140    35    21

Bean (1983, 136) concludes that "OE was neither verb-second nor topic-verb, but verb-third." And evidence from Early Middle English shows a significant number of verb-final VPs along with VPs that are verb-initial or verb-medial (Allen 2000). Subordinate clauses show roughly the same sort of distribution, although the percentages are different for each stage. For example, in stage V (892–900), the percentage of V-final main clauses is 12%, while the percentage of V-final subordinate clauses is 38%, and the percentage of V-final relative clauses is 68%. We can interpret this shift not as loss of V2, but as a shift to requiring V2 in the lower clause. I suggested in Culicover (2008, 2013c) that this change was possible because of the independent requirement in OE that pronominal clitics be adjoined to the left edge of the VP (van Kemenade 1987). Clearly this interpretation of the status of pronominal clitics cannot be absolute, given that pronominal subjects licensed V2 in main clause declaratives. But if there is any opportunity for the pronominal subject to count as other than a

⁵ I have adapted Bean’s Table 4.2 by adding the percentages for SXV, XSV, and OSV, and by replacing her notation for the nine epochs with the dates given on page 64: (I) pre-755, (II) 755–860 (excluding the Cynewulf/Cyneheard episode in 755), (III) 865–84, (IV) 885–91, (V) 892–900, (VI) 958–1101, (VII) 1048–66, (VIII) 1122–4, (IX) 1132–40. For additional details about the sources, see Bean (1983).


phrasal constituent, it is possible for the V2 construction to license constructs such as (16).⁶

On this scenario, the V2 construction and the clitic construction are in competition over the position of the finite verb. That is, we have two constructions licensing different orders for the same CS functions, as illustrated in (17). (17a) licenses a construct in which a subject NP follows the finite verb if a constituent other than the subject is in initial position, i.e. XP–V–NP–…, while (17b) licenses a construct in which a pronominal NP is VP-initial, and hence precedes the finite verb, i.e. (XP)–NPpro–V–….

(17) a. phon  [1–2–3]4
        syn   [Sroot Spec1, [VP Vroot[fin]2, …]3]4
     b. phon  1–2
        syn   [VP …, NP[pro]1]2

As the strength of the second construction grows, the scope of V2 becomes restricted to constructs where the verb precedes a full NP. At the same time, the subject clitic must be able to count as being in first position in order to license the SVO ordering. So we must be able to formulate the grammar in terms of competing constructions that license different linear orderings. The constructional approach affords us just this type of flexibility.

The requirement that the finite verb be in second position in the narrower domain is now somewhat relaxed in ModE. It is possible for an adverb to appear between the subject and the finite verb, as in (18a). But the legacy of V2 still persists, as shown by the fact that it is less acceptable for a non-parenthetical complex PP to appear in this position, as shown in (18b).

(18) a. Sandy {then / quickly / deliberately} opened the package.
     b. ∗Sandy {at that time / with considerable speed / on purpose} opened the package.

⁶ For an account of the ordering of subject pronouns and full NPs in English in terms of constraint reranking in Optimality Theory, see Clark (2011). In this analysis, full NP subjects are phrasal, and hence can appear in Spec,IP or Spec,VP, while pronominals are not, and must appear in Spec,IP, by stipulation. The gradualness of the change from V2 to SVO is achieved by introducing stochastic information into the constructions.


Finally, let us consider the grammar of questions in OE. Kiparsky (1996) reviews evidence due to van Kemenade (1987) that shows that the landing site of the wh-phrase in a wh-question in OE is not the same as the topicalization site. First, while it is possible to topicalize without V2 when there is a clitic subject, it is not possible to have a wh-question without V2 when there is a clitic subject. This follows if the wh-phrase in OE is ordered in syn external to the clause that contains the topicalized phrase—they are in different positions in the hierarchical structure. Second, a wh-question cannot have a VP-final V, but must show inversion.⁷ So the data suggest that OE already had the same construction for questions as the one in ModG, namely (13), repeated here.

(13) Construction: wh-question
     phon  [1–2]3
     syn   [S XP[wh]1, S2]3
     cs    [1′ (2′)]3

In addition, English retained the Inversion construction (12), repeated here. The position of the finite verb in ModE is identified with that of the auxiliary verbs, i.e. modals, have/be, and do.

(12) Construction: Inversion (in question)
     phon  2–[1–…]3
     syn   [Sroot Spec1, [VP V[fin]2, …]]3
     cs    Q(3′)

To sum up, then, the change from OE to ModE with regard to word order assumes that OE already had topicalization, V2, inversion in questions, clause-initial wh-questions, VP left-edge pronominal clitics, and VP-final V. The rise of VP-initial V led to the reordering of OV to VO. Cliticization and the frequent occurrence of V2 with NP in topic position led to the generalization of NP–V[fin]–… to all instances of V2 and all types of NP subjects. Thus the cliticization construction was subsumed. Finally, the constructions generalized from main clauses to all clauses, leading to the present-day picture in which English appears to be strictly S(Aux)VO. The changes are summarized in Table 8.2.

⁷ Kiparsky also says that a pronominal clitic may not immediately follow a clause-initial wh-phrase or negative phrase, but this claim is disputed by Axel (2007, 284–5).

Table 8.2 Development of major English constructions from OE to ModE. (Angled brackets indicate a weaker competitor.)

       Topicalization     Main clause                              Subord. clause                VP                       Questions
OE     [S [Spec XP]–…]    <[S [Spec XP]–V[fin] …]> [S V[fin] …]    V2, VP-medial, VP-final       [VP XP–V], [VP V–XP]     [S (XP[wh]–)V[fin] …]
ME     [S [Spec XP]–…]    [S [Spec NP]–V[fin] …]                   V2, <VP-medial>, <VP-final>   <[VP XP–V]>, [VP V–XP]   [S (XP[wh]–)V[fin] …]
ModE   [S XP–S]           [S NP–V[fin] …]                          V2                            [VP V–XP]                [S (XP[wh]–)Vaux[fin] …]


8.3.3 The loss of case marking

Another major innovation of English from OE was the loss of case marking. The literature for the most part attributes the fixing of the subject position (as in 8.3.2) to the loss of case, which required that grammatical functions be distinguished in terms of position.⁸ Kiparsky (1997) observes that “lack of inflectional morphology implies fixed order of direct nominal arguments,” but not the converse. Kiparsky’s explanation relies on a rather complex interaction between the introduction of Infl into phrase structure, raising of V to Infl, abstract Case as an instantiation of thematic roles, a thematic role hierarchy and abstract features that fix each role on the hierarchy, and a complex set of assumptions about the relationship between case-marking, the location of clitics, and the thematic features of specifiers and complements.

A simpler approach to the same general insight is to observe that when linear order starts to become fixed, case marking starts to become redundant. There is no need to mark a subject with nominative case if it is necessarily preverbal, as required by the narrowing of V2 to cases with initial NP. There is of course a residue in the case of the pronouns, arguably due to their high frequency. Loss of case is then a way of reducing the representational complexity of the grammar.

The inverse causal story, where the loss of case leads to the fixing of subject and object positions, fails to explain why case marking was lost in the first place. It has been suggested that phonological weakening led to the loss of case distinctions, but this leaves open the question of why there was phonological weakening in the first place. We might plausibly take phonological weakening as accompanying and not causing grammaticalization (Campbell 2000).⁹ The constructional trajectory is straightforward.
In the OE stage, the highest ranked argument in the thematic hierarchy is assigned nom case, with the position of the argument determined by other factors.

⁸ Kroch (2003): “…the loss of morphological case distinctions due to phonological weakening at the ends of words is generally thought to lead to rigidity of word order to compensate for the increase in ambiguity induced by the loss of case.” Taking loss of case to be a consequence rather than a cause of fixed order rules out accounts such as that of Hinterhölzl (2017).
⁹ Jespersen (1909) appears to have been the first to suggest this possibility. Along related lines, the phonological processes that lead to case distinctions may well have been the result of configurational marking, rather than the cause. Moreover, contact with case-less Scandinavian may have hastened the loss of case in the English spoken by bilinguals (Hornung 2017). Allen (2006) concludes that “[w]hile the reduction of case marking surely had an important role in the final loss of some word orders, nevertheless the movement toward less variation in word order seems difficult to explain as a result of the loss of case marking on its own.”


(19) Construction: OE-nominative
     phon  [1(2), …]3
     syn   [S [category NP, case nominative2]1, …]3
     gf    GF1 (> …)

Then, the V2 construction narrows to NPs in S-initial position.

(20) Construction: NP-initial
     phon  [1(2)>…]3
     syn   [S [category NP, lid lid1, case nominative2], …]3
     gf    GF1 (> …)

Finally, the case requirement is lost, except for pronominals.

(21) Construction: ModE subject
     phon  [1>…]3
     syn   [S NP1, …]3
     gf    GF1 (> …)

The loss of nominative case in English as a condition on the thematic correspondence is reminiscent of the fact that children learning English typically fail to produce subject–Aux inversion in wh-questions, even when they are producing subject–Aux inversion in yes-no questions (Brown 1973). That is, they will say What your name is? instead of What’s your name? In this case, the initial wh-phrase is sufficient to mark the sentence as a wh-question, while inversion is simply a formal requirement with no semantic force and thus less salient for the language learner (Culicover 2000). In fact, inversion does not occur in embedded wh-questions in standard English—I wonder what your name is—showing that it is not necessary to signal the interrogative. Similarly, the scenario sketched out here treats nominative case on the subject as a purely formal requirement, at the point at which clause-initial position took over as the signal for marking agentivity. This view is supported by the fact that the phonological changes required for the loss of English


case-marking were not ‘neogrammarian’ changes, but were morphologically conditioned.

8.4 The development of Modern German from Old High German

Having sketched out the major German constructions and the trajectory of change for English, we are now in a position to look at the development of ModG from OHG. ModG, unlike ModE, has preserved the original Topicalization construction, which requires that there be a constituent in S-initial position. According to Axel (2007, 181), OHG allowed a topic on the left periphery of a sentence both without V2 and with V2. These options are licensed by the German topicalization construction in (7) for ModG. At the same time, it appears that unlike ModG, OHG allowed V-initial declaratives (Axel 2007, chapter 3). Axel (2007, 168) proposes that early Germanic clauses were in fact uniformly V-initial, with particles marking their function. Thus, the simplest constructional formulation of V-initial declaratives is to treat them in the same way as V2, but without the requirement that there be an initial constituent in topic position. That is, given the basic structure of VP in (22), the construction in (23) licenses a linear ordering in which the verb is left-adjacent to the VP, and hence is initial.

(22) [S … [VP … XP [V′ NP V]]]

(23) Construction: OHG V1
     phon  [2–3]4
     syn   [Sroot [VP Vroot[fin]2, …]3]4

Given (23), a declarative sentence in OHG with an initial V is licensed. Given the topicalization construction (7), a declarative sentence in OHG with an initial XP is also licensed. The innovation that yields ModG is the correlation between topicalization and initial V, expressed in the ModG V2 construction (10), repeated here.


(10) Construction: V2
     phon  [1–2–3]4
     syn   [Sroot Spec1, [VP Vroot[fin]2, …]3]4
     cs    4′

In other words, the scope of V2 in ModG main clauses is narrowed from the OHG case, in that in ModG the V is spelled out in the position before the rest of the VP only when there is an XP in initial position in the clause.

The accepted assumption in MGG that V2 is a syntactic operation—movement of the inflected verb to C¹⁰—introduces a number of complexities that do not arise in the constructional analysis. Absence of V2 requires absence of C, or of some feature on C. A language like OHG in which the movement is optional must have competing syntactic structures, some with C that triggers the movement and some without C. In contrast, a constructional approach allows us to put these complexities aside, since all that is at issue is the linear order of constituents and not a derived phrase structure. For OHG, the optionality of V2 reflects the fact that both V2 and OHG V1 are active constructions in competition.

Let us consider now the position of finite V in declarative main and subordinate clauses and nonfinite V. Axel (2007, chapter 2) marshals evidence that shows that in the absence of topicalization, the finite V could remain in VP-final position, although such cases in main clauses are relatively rare (Axel 2007, 49–51, 62). By and large the finite verb appears in first or second position in main clauses. In subordinate clauses, many instances of nonfinite verbs are in final position, e.g. (24a); however, there are also cases where the nonfinite verb precedes arguments or adjuncts of the V, e.g. (24b).

(24) Old High German (Axel 2007, 49–51, 62)
     a. …thaz thaz kind bisnitan uuvvrdi
        …that the child circumcised became
        ‘that the child was circumcised’
     b. dhazs ir chicoria uuari gote
        that he obedient was God.dat
        ‘that he was obedient to God’

It thus appears that both OV and VO were possible constructions in OHG, as in OE.
However, there are numerous complexities associated with this

¹⁰ Originally proposed by Evers (1975).


assumption. For example, separable particles must precede the verb (Axel 2007, 94–5). Axel contrasts the dual structure analysis with the assumption that the only possible underlying structure is OV, combined with the assumption that there is the possibility of extraposition of material to the right of V, including the direct object. Since extraposition in this case is a cryptoconstructional alternative to saying that V need not be VP-final, the alternatives appear to be descriptively equivalent, and each requires conditions on what can be post-verbal.

Regarding yes-no questions and wh-questions, it appears that the other constructions of ModG that we have considered were already in place in OHG. Returning to topicalization, a case can be made that the function of topicalization became more general in the evolution from PG to ModG, to the point where it is now simply a formal syntactic requirement even when there are no discourse functions associated with it (Axel 2007). Such a change is characterized by the loss of a licensing condition, in this case, one having to do with the interpretation assigned to the topicalized constituent.

Thus, comparing OE and OHG, it appears that we can make the following broad generalizations.

1. Both had the same constructions for yes-no questions and wh-questions.
2. Both had OV and VO constructions for VP, with differences having to do with the placement of pronouns, particles, and other prosodically light elements (Axel 2007, 105–6). OE arguably had more VO in subordinate clauses than OHG did.
3. Both OE and OHG had topicalization.
4. Both OE and OHG had V2 in competition with V-in-situ in main clauses.
5. OHG had V1 in main clauses without topicalization; OE did not.

We may then plausibly hypothesize that characteristics 1–4 were inherited by both languages from PG. Equally plausibly, OHG inherited V1 from PG, while OE did not.
To summarize, the trajectory from OHG to ModG appears to be that shown in Table 8.3. Working our way back from OHG and OE gives us the hypothesized constructions for PG in the first line.¹¹

¹¹ Sapp (2016) argues that OHG was basically OV, with extraposition producing instances of VO. He argues against the idea that the head position of V in VP is underspecified (Haider 2010; Schallert 2010; Schlachter 2012). On a constructional approach, both orders must be licensed by competing constructions. However, each of the two orders may be consistent with different licensing conditions that are linked to discourse and information structure and prosody.

Table 8.3 Trajectory of change from PG to ModG.

       Topicalization       Finite verb                                                   VP                      Questions
PG     [S [Spec XP]–…][?]   [S V[fin] …]                                                  [VP XP–V], [VP V–XP]    [S (XP[wh]–)V[fin] …]
OHG    [S [Spec XP]–…]      [S [Spec XP]–V[fin] …], [S [Spec XP]–…V[fin]], [S V[fin] …]   [VP XP–V], [VP V–XP]    [S (XP[wh]–)V[fin] …]
ModG   [S [Spec XP]–…]      [S XP–V[fin] …]                                               [VP XP–V]               [S (XP[wh]–)V[fin] …]


8.5 Verb clusters

I conclude this chapter with a constructional analysis of verb clusters and verb projection raising (VPR). I show how the range of possibilities can be accounted for in terms of complexity. Variation in verb clusters is found in Continental West Germanic (CWG; Wurmbrand 2004, 2006). Verb clusters are sequences of modals (MOD), have- and be-type auxiliaries (AUX), and various control verbs (V). Typical two-verb clusters are given in (25), using ModG forms for the sake of illustration. The verb order is shown in parentheses, where 1 is the highest verb (i.e. the one closest to the root) in a standard syntactic analysis, 2 the next highest, and so on. So in (25a), for example, 1 is kann and 2 is singen.

(25) Maria glaubt, dass
     Maria believes that
     a. sie die Arie singen kann. (21)
        she the aria sing.inf can
        ‘…she can sing the aria.’
     b. sie die Arie kann singen. (12)

The order 21 of (25a) is found in ModG; the 12 order of (25b) is found in a number of Dutch dialects, Swiss German dialects, and other CWG varieties. I proposed in Culicover (2014) that 12 is a response to pressure to order functional scope-taking elements, such as AUX and MOD, before their complements, while 21 is a response to pressure to position heads and their dependents as close as possible to one another. In a verb-final language, these two pressures result in alternative orderings for the members of the cluster.¹² A comparable situation does not arise in a verb-initial language, where V1–V2–XP order is optimal both for scope and for dependency on the head.

In addition to verb clusters, CWG has verb projection raising (VPR). VPR appears to be the result of reordering not simply V, but a projection of V, after the higher V. Some examples are given in (26). The V-projection is underlined.

¹² Assuming a standard derivational framework, Bobaljik (2004, 129ff) considers a number of explanations of why reordering in verb clusters is found only in verb-final languages, but does not arrive at a definitive solution.


(26) Zürich German (Haegeman & van Riemsdijk 1986, 419)
     a. das de Hans es huus chaufe2 wil1 (21)
        that the Hans a house to.buy wants
        ‘…that Hans wants to buy a house.’
     b. das de Hans es huus wil1 chaufe2 (12)
        that the Hans a house wants to.buy
     c. das de Hans wil1 es huus chaufe2 (VPR)
        that the Hans wants a house to.buy

The orders NP-21, NP-12, and 1-NP-2 are possible as well in West Flemish (WF). However, only the first two are possible in Standard Dutch. And only the first is possible in ModG. Haegeman and van Riemsdijk’s solution to representing the variation is essentially a constructional one: they adapt a proposal of Huybregts (1984) to the effect that the linear ordering of the verbs is determined by a “reanalysis” process that affects only the linear ordering, while retaining the hierarchical structure. This is a reordering that applies “in the phonology” (p. 420).

We can formulate this basic idea in our constructional framework as follows. First, there must be a construction that specifies that the complements and adjuncts of the lexical head must precede it, regardless of how each is ordered with respect to other constituents in the larger VP (p. 428). This is the VP-final V construction (11). Second, there is a construction that specifies the relative ordering of verbs in the particular dialect or language. In WF and Zürich German (ZT) the construction has the form in (27). I show all of the projections as V, so that V2 is phrasal.

(27) phon  (a) [2–1]3  (b) [1–2]3
     syn   [V V1, V2]3
     cs    [1′ (2′)]3

V1 is the head, as indicated by the CS representation. Either it follows its complement (27a), an ordering that is subsumed under VP-final V (11), or it precedes its complement (27b). If the order is 21, then the arguments of the verb must precede 2 in order to be licensed by VP-final V, so we have NP-21. If the order is 12, then either NP-12 or 1-NP-2 is licensed by VP-final V, which says that V must follow its complements.
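For concreteness, the effect of these two constraints on a two-verb cluster can be sketched as a small enumeration (a hypothetical illustration; encoding "VP-final V" as a precedence check is my own simplification, not the book's formalism). Since (27) leaves the relative order of V1 and V2 free in WF and Zürich German, the only surviving hard requirement here is that the NP complement precede its head V2:

```python
from itertools import permutations

# Elements of a two-verb cluster: V1 (the higher verb), V2, and the NP
# complement of V2.  VP-final V requires a complement to precede its
# head; (27) allows either order for the V1/V2 pair, so the only
# remaining constraint is NP < V2.
def licensed(order):
    pos = {x: i for i, x in enumerate(order)}
    return pos['NP'] < pos['V2']

orders = {'-'.join(o) for o in permutations(['NP', 'V1', 'V2']) if licensed(o)}
print(sorted(orders))
# → ['NP-V1-V2', 'NP-V2-V1', 'V1-NP-V2'], i.e. NP-12, NP-21, and 1-NP-2
```

One could model the narrower grammars by adding conditions, e.g. requiring V2 to be adjacent to its NP complement for Standard Dutch, or restoring VP-final V on the verb pair for ModG; those restrictions are my own guesses at an encoding, offered only to show the shape of the analysis.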


Suppose, now, that 2 is a VP headed by a V that itself takes a VP complement; then there are three verbs. The tree in (28) shows the structure with the canonical VP-final ordering.

(28) [V6 [V5 [V4 NP V3] V2] V1]

In this tree, the orderings between V5 and V1, between V4 and V2, and between NP and V3 are all licensed by VP-final V, and thus the ordering 321 is licensed. (27) licenses 231, 123, and 132. But not all orderings are possible in all dialects and languages. While the 321 ordering is possible in ModG, it is not possible in Standard Dutch (SD), WF, and ZT. Moreover, even in ModG, this ordering is not possible when V1 is an auxiliary that takes a modal as the head of its complement.¹³ In that case, V1 is ordered before its complement even in a subordinate clause (29a), and instead of a participle, the head of the complement is infinitival—this is the so-called Infinitivus Pro Participio (IPP) construction, noted in Chapter 4.

(29) a. German (Schmid 2002, 11)
        dass Peter das Buch hat lesen können.
        that Peter the book has read.inf can.inf
        ‘that Peter was able to read the book.’
     b. German (Schmid 2002, 10)
        Peter hat das Buch lesen können.
        Peter has the book read.inf can.inf
        ‘Peter was able to read the book.’
     c. ∗dass Peter das Buch lesen gekönnt hat
        ∗Peter hat das Buch lesen gekönnt.

Since IPP does not follow from the other constructions governing the ordering of projections of V, it must be a distinct construction.¹⁴

¹³ Although the possibility exists in some dialects with some verbs in V1 position (Zwart 1995; Augustinus & Van Eynde 2017).
¹⁴ For discussion of the history of IPP, see Zwart (2007); Jäger (2018).
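The set of cluster orders jointly licensed by VP-final V and (27) can be generated mechanically: at each projection, the head either follows its complement projection (VP-final V) or immediately precedes it (in the manner of (27b)). A minimal sketch of this derivation (my own illustration, not the book's formalism):

```python
def cluster_orders(n):
    """Orders licensed for an n-verb cluster, with verb 1 the highest.
    At each level the head either follows its (contiguous) complement
    projection, as required by VP-final V, or immediately precedes it,
    as allowed by (27b)."""
    def lin(i):
        if i == n:
            return [[i]]
        out = []
        for sub in lin(i + 1):
            out.append(sub + [i])  # head follows complement projection
            out.append([i] + sub)  # head immediately precedes it
        return out
    return {''.join(map(str, seq)) for seq in lin(1)}

print(sorted(cluster_orders(3)))
# → ['123', '132', '231', '321']
```

For three verbs this yields exactly 321, 231, 123, and 132—and neither 213 nor 312—matching the observation below that 213 is not licensed by any combination of VP-final V and (27).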


Beyond this special construction, there are many other idiosyncratic stipulations that govern the various combinations of two- and three-verb structures. These have to do with the particular features of V1, e.g. whether it is a modal or an auxiliary, and the type of V2, e.g. whether it is causative, benefactive, durative, etc. For example, Schmid (2005) gives an overview of the three-verb possibilities in Zürich German shown in Table 8.4. Note the possibility of 213—this ordering is not licensed by any combination of VP-final V and (27).

Table 8.4 Overview of verb order patterns in Zürich German (adapted from Schmid 2005, 76). Rows: the type of V2 (Causative, Modal, PV, Benefactive, Durative, Inchoative, CV); columns: Perfect with past-participle V2, Perfect with IPP V2, and Future. Each cell lists the attested orders among 321, 231, 213, 123, and 132; some cells contain ? or ∗.

Schmid gives similar, but different, tables for other Swiss German dialects. Given such dialectal variability and lexical sensitivity, I have argued (Culicover 2013c, 2014) that the possible verb clusters should be licensed constructionally. Moreover, I have argued that those orderings that are very common reflect relatively low computational complexity of the correspondence between the linear order and CS, as well as branching harmony. 321 and 123 show branching harmony. 321 optimizes the connection between the complements of the main verb 3, since the ordering is NP-321. The ordering 123 optimizes computation of the scope of tensed auxiliaries and modals. Mixed orderings reflect the fact that these two forces are in opposition and that different orderings reflect optimization on different dimensions.

To conclude, let us recall the observation above that 21 is possible in SD, but not 321 (cf. (26)). There is no formulation of licensing conditions that will rule in 21 and rule out 321, whether the conditions are formulated in terms of constructions, in terms of a parameter setting, in terms of reanalysis as in Haegeman & van Riemsdijk (1986), or in terms of movement and adjunction (Wurmbrand 2006). On our constructional approach, all logically possible orderings are available to speakers, but only some will be selected. The ones that are not selected in a particular grammar are ranked higher in complexity on some dimension, and it is this dimension that is optimized in this grammar.


8.6 conclusion 223 In fact, Bach et al. (1986) studied the relative complexity of 321 and 123 in German and Dutch and found them to be similar in processing difficulty. But 321 was significantly more complex than 21 and 123 was significantly more complex than 12. Moreover, the difficulty for the German speakers of the 321 structures was greater than that of the 123 structures for Dutch speakers. A plausible account of the preference for 123 over 321 is that 321 puts V1 too late in the string for the easy interpretation of logical scope.1⁵ While 12 is preferred over 21 on the same grounds, the shallower embedding of 21 versus 321 renders the difference smaller, hence the preference difference is less. Interestingly, SD permits 213 when V2 is an inchoative or control verb. (30) Dutch (Schmid 2005) a. dat het opgehouden2 heeft1 te regenen3 that it stopped has to rain ‘that it has stopped raining’ b. dat hij nooit geprobeerd2 heeft1 te doen3 alsof that he never tried has to do as-if ‘that he never tried to pretend’ This ordering puts V1 earlier in the sequence than the expected 321. It is possible that the other disharmonic cases 231 and 132 that occur in various dialects may reflect the pressure to order functional elements before their arguments—23 instead of 32, and 1[32] instead of [32]1. Exploration of these and similar hypotheses calls for a research project that goes beyond the scope of this study, one that seeks independent evidence for the relative complexity of these alternatives based on judgments of acceptability and corpus frequency; I leave the matter for future research.1⁶

8.6 Conclusion

This chapter traced the development of the grammars of the contemporary Germanic languages from the proto-language in constructional terms. The

¹⁵ See Stabler (1994) for a formalization of a related idea in terms of ‘connectivity’.
¹⁶ Cecchetto (2007) offers suggestions that are compatible with my own regarding the role of linear order. For example, the ordering ∗[[INFL VP] COMP] does not occur because “by the time the selecting Head COMP is met the portion of the structure containing INFL might have already been closed off [by the parser]” (p. 9) and “notice that, for this explanation to work, it must be the case that there is an asymmetry between COMP and INFL, namely the occurrence of COMP guarantees that an INFL node will be present somewhere in the incoming sentence, while the presence of INFL does not warrant the presence of COMP” (p. 10). See also Cecchetto (2009).


constructional framework offers insight into the nature of the variation and the trajectories of change. Stated in constructional terms, the differences between the languages are for the most part minimal in terms of the individual constructions, although their superficial effects may be significant when they are combined in a single grammar. Most importantly, the changes appear to fit well within the framework of change and variation laid out in Chapter 4.

I considered several core constructions in Germanic that are responsible for topicalization and the position of the finite and non-finite verb. The German topicalization construction appears to be essentially unchanged from PG through OHG, ModG, and OE; it is generalized in ModE in terms of the syntactic configuration that licenses it. Similarly, all of the languages appear to have V2, where the finite verb appears in second position. I argued that the main innovation of English was the restriction of V2 to finite clauses with initial NP, where topicalization has been externalized. Similarly, all of the languages had essentially the same construction for questions, where the finite verb appears in initial position in a yes-no question and after the fronted wh-phrase in a wh-question. English departs from the other Germanic languages in the emergence of the category Vaux, which appears in SAI in questions and other constructions. Finally, ModE has lost the VP-final V option, while ModG has lost the VP-initial V option seen in earlier stages.


9 Changes outside of the CCore

In this chapter I focus on three well-documented developments in English as further support for applying the constructional approach to language change and variation. Section 9.1 looks at the history of English reflexives, and illustrates how a constructional approach can track the development of the present-day system, many aspects of which fall under the binding theory of GB theory. The essential observation of this study is that the morphosyntactic marking of a pronoun as a reflexive is not identical to the semantic relation of binding, a difference emphasized in the work of Reinhart & Reuland (1993). The semantic relation of binding is a universal one, while the particular marking of reflexivity that a language employs is not. The English method of marking reflexivity appears to have evolved over some time, a fact that casts some doubt on the universality of the classical binding-theoretic approach to reflexivity of, for example, Chomsky (1981, 1986). But as we will see, the insight of Condition A of the binding theory is essentially correct.

Section 9.2 shows how the grammaticalization of do as an auxiliary, that is, do-support, receives a natural description in constructional terms, with minimal stipulation. While do-support is an idiosyncrasy of English grammar, it turns out that it is just an extreme example of a widely attested phenomenon, do-periphrasis. The historical trajectory of English do-support has a particularly natural description in terms of constructional generalization.

Section 9.3 takes up preposition-stranding (p-stranding) and shows how the development of this characteristic property of English also receives a natural account in a constructional approach. Again, the trajectory is one of constructional generalization.

9.1 English reflexives

9.1.1 Reflexivity in constructions

I begin my review of the evolution of the English reflexive with the distinction between morphosyntactic reflexivity and binding in CS. In ModE,

Language Change, Variation, and Universals: A Constructional Approach. Peter W. Culicover, Oxford University Press. © Peter W. Culicover 2021. DOI: 10.1093/oso/9780198865391.003.0009


an argument that is ‘local’ with respect to an argument with which it is semantically identified must be marked as reflexive. This is captured by Condition A of the classical binding theory, narrowly stated here in terms of the reflexive instead of the term ‘anaphor’ in the original formulation.

Condition A:

a reflexive must be locally bound.

There are three factors that enter into the constructional characterization of reflexivity, which to a considerable extent echo Condition A.

• The reflexive pronoun is overtly marked in phon as reflexive.
• The GF of the reflexive is lower on the GF hierarchy than the argument that binds it within the same syntactic domain.
• The argument in cs corresponding to the reflexive pronoun is semantically identified with another argument.

A reflexive construction is one that expresses a correspondence between these properties. On this definition, reflexivity occurs when there is a particular marking that distinguishes a reflexive element that is locally bound. Binding and locality are universal primitives; reflexivity is a constructional expression of local binding, although it may be used for other purposes as well. The reflexive construction for English is stated in (1).

(1) Construction: English reflexive¹
    phon  […, 2, …]3
    syn   [NP[pro,refl]2, …]3
    gf    [GF1 > GF2]3
    cs    1′ = 2′

The sentence Mary saw herself is licensed by this construction, as shown in (2).

(2) phon  [Mary1–saw4–her-self2]3
    syn   [S NP[Mary]1, V[saw]4, NP[pro,fem,sg,refl]2]3
    gf    [GF1 > GF2]3
    cs    see′4(agent:m1^α, patient:m2^α)

¹ There is no requirement that GF1 in this construction correspond to a syntactic constituent, because of cases like the imperative Help yourself!, control like Protecting/To protect oneself/yourself/myself is essential, and nominals like Gifts to/from myself always embarrass me.


Because this construction stipulates a correspondence between local binding of a coargument and self-marking, a construct in which two arguments corefer in CS and stand in a binding relationship, but the corresponding string lacks reflexive marking, is not licensed. Thus, given (1), examples like (3) cannot be licensed in correspondences where the two arguments are covalued.

(3) Mary saw her.

The reflexive construction thus preempts the interpretation of her in (3) as 'Mary' (Dowty 1980; Levinson 1987; Safir 2004). In this way, we derive the effects of Condition B of the Binding theory (Chomsky 1981) for syntactic configurations in which two arguments corresponding to local GFs are covalued.
The ECM construct in (4) is also licensed by (1). While the reflexive does not correspond to a thematic argument of expect, it corresponds to a locally bound argument on the GF tier, because the thematic subject of the infinitive is the syntactic direct object of expect.²

(4) ⎡ phon [Napoleon1–expects2–[him-self]3–to-win4]5 ⎤
    ⎢ syn  [S NP[Napoleon]1, V[expects]2, NP[pro,masc,sg,refl]3, VP[to.win]4]5 ⎥
    ⎢ gf   [GF1 > GF3]5 ⎥
    ⎣ cs   [λy.λx.expect′2(exp:x, theme:y)(nα1)(λz.win′4(agent:z)(α3))]5 ⎦
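The preemption logic at work here can be rendered as a toy truth table. This is only an illustrative sketch of the blocking idea, not the book's formal licensing machinery; the function name and Boolean encoding are my assumptions:

```python
# Toy sketch of preemption (Condition B effects as blocking). Encoding is an
# illustrative assumption, not the constructional formalism of the text: for a
# clause with two local coarguments, a construct is licensed only if reflexive
# marking and CS covaluation match, so 'Mary saw her' cannot mean 'Mary saw Mary'.

def licensed(reflexive_marked, covalued):
    """A construct is licensed iff overt reflexive marking matches covaluation."""
    return reflexive_marked == covalued

# 'Mary saw herself'           -> covalued, marked:   licensed
# 'Mary saw her' (her = Mary)  -> covalued, unmarked: preempted
# 'Mary saw her' (her != Mary) -> not covalued, unmarked: licensed
print(licensed(True, True), licensed(False, True), licensed(False, False))
```

The single equality check is the whole point: the reflexive construction pairs marking with covaluation, and the unmarked pronoun survives only where that pairing fails to apply.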

9.1.2 Variation and change in reflexive constructions

The power of the constructional account is that it allows for independent variation and change in the syn-phon correspondence and in the syn-cs correspondence. Regarding the first, we can envision changes in the phonological expression of arguments that are covalued and in the configuration indicated in (1). Regarding the second, we can envision reflexivity for configurations other than the core reflexive construction captured in (1). Examples of this are logophoric anaphora and lexically governed reflexives such as English behave/perjure oneself, obligatorily reflexive verbs like French se demander 'to wonder', and

² See Culicover & Jackendoff (2005) for an account of ECM in constructional terms. I omit the details here.


Russian bojat'sja 'to fear'. In these cases, the lexical construction licenses reflexivity without CS binding, but the local GF condition must still be satisfied; see Varaschin (2018) and references cited there. Naturally, each noncanonical case requires its own licensing conditions.
Focusing on the first type of change, it is striking that in OE, binding (in CS) by a coargument (in GF) was expressed using simple pronouns (examples from Peitsara 1997; König & Siemund 2000; König & Gast 2002, respectively).

(5) Old English
    a. hine      he bewerað  mid  wæpnum
       him(self) he defended with weapons
       'he defended himself with weapons' [ElG 96.11]
    b. Ic on earde bad / …ne  me swor  fela
       I  on earth was / …neg me swear many
       'I was around on earth …I never perjured myself' [Beo 1722]
    c. þæt  we us gehydan mægon
       that we us hide    may
       'that we may hide ourselves' [Junius, Christ, & Satan 100]

In the absence of reflexive marking there was no preemption of this interpretation, and hence no Condition B effects along the lines discussed in section 9.1.1 for ModE.
The reflexive marker -self began to be introduced around 1500. It appeared most frequently with third person pronouns, and only gradually became generalized to all persons (van Gelderen 2000, 53). And, as Siemund (2002, 9) notes, bare form pronouns were the default for reflexive indirect objects, only giving way to the marked form at the beginning of the Early ModE period.
The situation in 1500 is a clear case of high interpretive complexity (section 4.4.3). The trajectory of change can easily be described in terms of the standard generalization of constructional licensing conditions, driven by pressure to reduce interpretive complexity. The first attested cases of the change show introduction of -self when the CS is of the form in (1) and NP2 in syn is 3.acc. A reasonable speculation is that this innovation occurred in order to disambiguate the third person pronoun.³ Crucially, the first and

³ See van Gelderen (2000, chapter 2) for evidence that reflexives are introduced in first and second person as well. Van Gelderen argues that this fact shows that the innovation did not begin with the third person. However, this logic is questionable, since there is no reason that the generalization cannot begin to extend to the first and second person forms before the generalization is complete in the third person.


second person pronouns uniquely denote speaker and hearer, and thus are not ambiguous. Subsequent generalization to reduce representational complexity would then drop the third person condition and the acc condition, which yields the present state of affairs.⁴
In contrast to this scenario, let us consider the account of the historical changes in an MGG framework. Certainly the most extensive treatment is that of van Gelderen (2000). Van Gelderen's summary of her explanation of the observed trajectory is as follows.

1. Pronouns have features.
2. Features may be 'strong' or 'weak'.
3. If the features are 'weak', a pronoun cannot be 'left out', and verbal inflection cannot be reduced. In other words, strong features allow pro-drop; loss of pro-drop is an indication of weakening of features.
4. Pronouns with 'strong' features may not be anaphors; pronouns with 'weak' features may be anaphors.
5. Inherent Case is gradually lost in Middle English. A pronoun with inherent Case may function reflexively. So as inherent Case is lost, reflexive marked anaphors must be used.
6. The features change from Interpretable (e.g. inherent Case) to Uninterpretable (e.g. structural Case).
7. So at a stage where a pronoun expresses reflexivity, the pronoun has 'strong' features. But as the features become 'weak', the pronoun is excluded from this function, and a reflexively marked anaphor must be used.

It is clear that the scenario summarized here is formulated in terms of the vocabulary of a specific theory, and of course we could dispute particular assumptions of that theory or even the entire approach. Moreover, the notions 'strong' and 'weak' appear to substitute for 'has/lacks pro-drop', and do not correspond to any independent properties.

⁴ It is striking that in languages like French that use clitics to mark pronominal arguments, only the third person reflexive is distinguished in phon: me 'I, myself'; te 'you, yourself'; but le, la 'he, she', se 'him/herself'.

But putting these caveats aside, it is striking that the account of change is expressed in terms of stipulated changes in the elements of the formal representation of the syntactic elements, with no explanation of why these particular elements changed. Why is inherent Case lost? Why do pronouns


change from 'strong' features to 'weak' features? What is the relationship between specified features and interpretability, and what is the reason for changes in the status of these features?
Thus, van Gelderen's treatment is cryptoconstructional: it substitutes abstract, non-independently motivated features for distinctions that otherwise can be represented constructionally. For a more extensive critique of van Gelderen (2000), see Gast (2002).⁵

9.2 Auxiliary do

In this section I consider English auxiliary do and the phenomenon of do-support. To understand English do-support, we must understand first how it came into being, and then how it changed over time. In section 9.2.1 I discuss the forces that led to the use of English do as an auxiliary. In section 9.2.2 I discuss its historical trajectory to the present time.

9.2.1 The emergence of do

To track the emergence of auxiliary do, it is important to first be specific about present-day auxiliary do, in order to fix the end point of the change. The simplest (and most accurate) generalization about auxiliary do is that it is a modal verb that selects a VP headed by a lexical verb, thus not auxiliary have and be. Like other modals, do cannot be the head of an infinitive. Given that a sentence may be headed by a tensed auxiliary, do may otherwise appear in any construction that licenses modals such as will, can, as well as tensed have and be (Sag et al. 2020). There are in fact several such constructions in English, including the canonical simple declarative clause with Verum Focus (6a), SAI (6b), negation (6c), VPE (6d), VP topicalization (6e), pseudo-gapping (6f), and clausal relatives (6g). Each of these explicitly requires a tensed auxiliary (Culicover 2013c; Sag et al. 2020).⁶

⁵ Another approach to reflexives is that of Bergeton & Pancheva (2011). They argue that the Modern English reflexive is actually a combination of a null pronoun and an intensifier of the form pro+self, e.g. himself. The account is constructional, in that it does not presume to explain the change in minimalist or derivational terms; rather, it argues for reanalysis of overt forms in terms of their function and distribution.
⁶ Sag et al. (2020) note that in standard spoken English NP-do-V-… with unstressed do does not occur. In Culicover (2013c) I propose that this is because it is preempted by NP-V-…. On this view, NP-do-V-… with unstressed do is not impossible, and in fact was quite commonly used in the seventeenth century (Kroch 1989). It is also encountered in formal legal indictments, e.g. ". . . and the fact that


(6) a. Chris did call.
    b. Did Chris call?
    c. Chris didn't call.
    d. Chris did.
    e. …and call Chris did.
    f. …Kim eats more carrots than Chris does onions.
    g. They said I eat carrots, which I do.

It might be objected that classical do-support as a 'last resort' captures a generalization without positing a multitude of different constructions: Tense is realized on do when it is unable to adjoin to an adjacent verb, as originally proposed by Chomsky (1957). But this objection overlooks the fact that it is necessary to specify the individual contexts in which this last resort must be invoked, that is, the contexts in which Tense is not adjacent to a verb, such as SAI, negation, VPE, VP topicalization, and pseudo-gapping. None of these comes for free; each must be explicitly stated as a construction of English. Moreover, using the condition of non-adjacency does not explain the occurrence of do to express Verum Focus without the assumption that there is an invisible morpheme that intervenes between Tense and the verb, a cryptoconstructional device. Finally, as has been recognized for a long time, an intervening adverb does not license do-support (7).

(7) Chris did slowly walk into the room.

Accepting the characterization of do as an auxiliary with particular selectional properties, the question arises as to why the main verb do was recruited for this purpose at an earlier stage in the history of the language. An apt statement of the key intuition is the following passage from Jäger (2005, 86).

    . . . a tendency to employ verbal periphrasis optionally can be observed in (first) language acquisition, where it serves as an avoidance strategy for morphological (paradigmatic) complexity (Tieken-Boon van Ostade 1990, 20–21, Kroch 1994, 191, Auwera 1999, 60). Here the acquisition of a single verbal paradigm, namely that of the auxiliary, compensates for that of inflectional rules for lexical verbs and the frequent paradigmatic variation thereof (various conjugation or inflection classes) (also cf. Slobin 1985a). As my findings suggest, a congener of English do is a likely candidate for this strategy by virtue of its schematicity and is thus frequently employed cross-linguistically.

he sent the children to the schoolhouse is no defense whatever, for he did refuse to send them to school, prepared to attend, as the law requires, and he did fail and refuse to furnish them equivalent instruction . . ." (New York Supplement, 1915). And it can be used for emphasis without being stressed, e.g. "I do so much like the fact that you are not going to eat all of the caviar."

Jäger is suggesting here that the reason for a language to create a periphrastic construction with a dummy auxiliary verb is that it reduces complexity.⁷ This reduction works in several ways. First, as Jäger suggests, since do selects a complement headed by an uninflected verb, the irregularities associated with the conjugation of that verb are avoided. In English, for example, it is necessary to master just three forms, do, does, did, instead of the regular present along with many irregular past tense forms for a fairly large number of verbs, e.g. ate, saw, brought, lost, spoke, took, drank, sang, ….⁸
Second, as noted in connection with (6) above, auxiliary do participates in constructions such as SAI and negation, which in earlier forms of English, as in German, applied to the tensed main verb. So in a construction such as SAI, the main verb would be separated from its arguments and adjuncts. The a-examples in (8) and (9) show the ordering without do-support.

(8) a. Ate you the pickles?
    b. Did you eat the pickles?

(9) a. You ate not the pickles.
    b. You didn't eat the pickles.

The lexical heads are in the closest possible proximity to one another, which Hawkins has shown is preferred quite generally and facilitates processing (Hawkins 1994, 2004, 2014). It is not surprising that the form used is a variant of do, since it is semantically the most neutral verb in which the subject and other referential expressions have θ-roles. Nevertheless, the shift to periphrastic do in English, at least, requires thematic neutralization, since main verb do requires that the subject be an agent, but periphrastic do does not. As in the case of other changes, this generalization can be modeled in constructional terms through the loss of licensing conditions. If the lexical entry for lexical do is (10), the loss of the argument structure requirement yields the modal do in (11).

(10) ⎡ phon 1 ⎤
     ⎢ syn  V1 ⎥
     ⎣ cs   λy.λx.act′1(agent:x, action:y) ⎦

⁷ This idea is explored to some extent in Culicover (2013c).
⁸ Haspelmath & Sims (2013) show that there are 46 distinct patterns of irregularity in English.


(11) ⎡ phon 1 ⎤
     ⎢ syn  V1 ⎥
     ⎣ cs   ∅ ⎦

We can simply stipulate that this verb is categorized as a modal, or speculate that its category follows from its non-thematic lexical structure; I leave the question open.
Let us turn now to the function of periphrastic do. English is not unique in having such an element—Jäger (2006) documents the periphrastic use of do in 200 genetically and typologically diverse languages. Thus, the change from (10) to (11) is found at least in part in other languages, a fact that should not be surprising if we accept the motivation from reduction of complexity. Given this, it is interesting to observe that a good portion of the earliest evidence for do-periphrasis in English consists of do in non-negative, non-interrogative clauses. The results of a search using Google Ngram of the corpus of Early English Books Online for instances of he doth V compared with doth he V and he doth not V are given in Figure 9.1.

Figure 9.1 Occurrence of doth in affirmative declarative, interrogative, and negative with subject he.

Unlike the pattern observed by Kroch (1989) based on the data collected by Ellegård (1953), this source of data, to the extent that it reflects the spoken language, suggests that the growth of he doth is essentially identical to that of doth he from 1470 to about 1640; at around that point, both begin to decline, but the decline in he doth is greater, and as we know, falls off precipitously into the ModE era.


But putting aside the relative frequencies of these constructions, it is clear that whatever motivation can be found for the innovation of do-periphrasis in interrogatives and negation, that motivation is not relevant to simple declaratives. The motivation for he doth V can only be that it makes it possible to avoid inflecting the main verb, which distinguished first, second, and third person in the present singular.

9.2.2 The spread of do

This brings us to the question of why do-periphrasis is so widespread and absolute in ModE. Here I appeal to the characterization of competition in Chapter 4. Recall that there I characterized the competition between constructions in terms of Niyogi's (2006) notion of bilingual competition. The two languages may be identical except for the formulation of just a single construction. In the present case, we can envision SAI and negation as each contributing to bilingual competition. That is, one variant has inversion of the tensed verb, and one variant has inversion of the tensed auxiliary. Similarly, one variant has the tensed verb before not, and one variant has the tensed auxiliary before not. Taking each pairing to be an instance of bilingual competition, we can ask, which variant will win the competition? If there is even a slight bias due to lower complexity in favor of the construction that specifies 'tensed auxiliary', then by Niyogi's Law, in the long run the variant with that construction will win.
From section 9.2.1 I hypothesize that the key to this competition is the innovation of do-periphrasis in order to avoid marking inflection on the main verb. The availability of do as a modal permitted innovation of the other constructions that applied to tensed auxiliaries, preserving the integrity of the VP. Thus, it is no accident that the original innovation was do-periphrasis in affirmative declaratives, which was then picked up for SAI and negation. Finally, given the generalization of do as a modal, it is straightforward to apply it to any construction in which a tensed auxiliary is required, e.g. VPE, VP topicalization, pseudo-gapping, and clausal relatives. In fact the Early English Books corpus has a number of examples as early as the mid-sixteenth century of …than he doth NP, which is pseudo-gapping (12), and …which he doth, which is a clausal relative (13).
(12) if a man loue any thyng more by any way than he doth god, that in heuen is on hye than is that man to god vnkynde⁹

⁹ "The Prick of Conscience, Parts 1-3" (1542), by R. Wyer.


(13) hee will make all his creatures to worke against them: which he doth, to signifie to them, that hee is against them also¹⁰

It is thus reasonable to conclude that once it was available at all, modal do could be used for reduction of complexity, and quite generally for any construction that specifies a tensed auxiliary as part of its description.
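The competition dynamics appealed to in section 9.2.2 can be illustrated with a minimal simulation. This is only a sketch: the replicator-style update, the function names, and the parameter values are my illustrative assumptions, not Niyogi's (2006) actual model or the simulations of Chapter 4. It shows the qualitative point that any constant bias in favor of one variant, however slight, drives that variant toward fixation:

```python
# Minimal sketch of bilingual competition: two variants of a single
# construction compete, and variant A (say, 'invert the tensed auxiliary')
# carries a slight advantage `bias` because it is less complex. Each
# generation adopts A in proportion to its bias-weighted share of the input
# from the previous generation. All names and values are illustrative.

def simulate(p0, bias, generations):
    """Return the frequency trajectory of variant A across generations."""
    p = p0
    trajectory = [p]
    for _ in range(generations):
        # Bias-weighted adoption, renormalized so frequencies sum to 1.
        p = p * (1 + bias) / (p * (1 + bias) + (1 - p))
        trajectory.append(p)
    return trajectory

if __name__ == "__main__":
    traj = simulate(p0=0.05, bias=0.05, generations=200)
    # Even starting from a 5% share, a 5% bias carries A close to fixation.
    print(f"start: {traj[0]:.3f}, end: {traj[-1]:.3f}")
```

The update rule guarantees that the frequency of the favored variant rises monotonically whenever the bias is positive, which is the content of the 'Niyogi's Law' intuition used in the text: the long-run winner is determined by the sign of the bias, not its size.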

9.3 Preposition stranding

The next change that I consider is p(reposition)-stranding. P-stranding is exemplified in (14). For convenience I mark the site of the phonologically absent argument as a trace t, and coindex it with the element that is linked to it; no particular syntactic analysis is intended by this marking.

(14) a. Whati are you looking at ti?
     b. The elephanti (that) I was looking at ti was huge.
     c. The video evidencei is tough to look at ti.
     d. The video evidencei was looked at ti, but no conclusions were drawn.

I show that p-stranding is handled nicely on a constructional approach, without requiring the assumption of an unintuitive syntactic structure for the V+P combination.

9.3.1 Why p-stranding?

There are several long-standing questions that arise in connection with p-stranding. The most fundamental one is, why do languages like English have it for cases like (14), while other languages do not? Where p-stranding is not possible, as in Standard French, pied-piping is required (15c,d).¹¹

(15) a. At what are you looking?
     b. The elephant at which I was looking was huge.
     c. Avec qui  parles-tu? [French]
        with who  speak-you
        'Who are you speaking with?'
     d. *Qui parles-tu avec?

¹⁰ "A commentary vpon the vvhole booke of Iudges" (1615), by Richard Rogers.
¹¹ For discussion of the constructional formulation of p-stranding and pied-piping, see Chapter 7.


A less central but equally challenging question is why English (and the Scandinavian languages) allow p-stranding in the passive (14d). From the perspective of change, we observe that p-stranding was restricted in OE to relative clauses with the fixed complementizer þe (Allen 1980); the question is, why did the grammar change so that the cases in (14) are licensed in ModE?
Let us start with the prepositional-, or p-passive, i.e. p-stranding in the passive. The p-passive cannot be explained in terms of a typical constructional generalization, following p-stranding in relatives, since the passive is an A-construction, not an A′-construction. This said, it is plausible that there is some kind of a connection between the two types of p-stranding: languages like Swedish that allow p-stranding in A′ constructions also allow the p-passive (Engdahl & Laanemets 2015). And it does not appear that there are any languages that allow only the p-passive and not p-stranding in A′ constructions.
In order to explore this issue in more depth, we must formulate a suitable account of the p-passive in synchronic terms. I base my analysis on that of Goh (2000), who proposes what is essentially a constructional account. Goh proposes that V+P functions as a syntactic unit in the passive, and in the active has a structure in which V and P do not constitute a unit.
Adapting and extending Goh's proposal somewhat, I assume there are two types of V+P pairs, those that are compositional and those that are non-compositional in interpretation. The syntactic structure is always the same, that is, [VP V [PP P NP]]. In the compositional case, the preposition assigns a θ-role to its complement. In the non-compositional case, V+P functions as a semantic unit and this unit assigns a θ-role in CS to the complement of the preposition. In many cases, such as rely on, approve of, think about, etc., this interpretation is lexically determined, but in some cases the non-compositional interpretation is constructionally coerced.
The examples in (16)–(18) illustrate the compositional and the coerced non-compositional cases. A compositional V+P occurs when the preposition assigns its own interpretation, independently of the verb, as in (16a,b). Because the role is not assigned by V+P, passive is not possible. In contrast, a non-compositional V+P is one where the V selects the P, and the θ-role is specified by the entire lexical entry, e.g. rely on, as in (17). When p-passive applies and there is no lexical selection of the preposition by the verb, as in (18b), constructional coercion nevertheless forces an interpretation in which the NP gets a θ-role from V+P.

(16) a. George slept in New York.
     b. *New York was slept in by George.


(17) a. Chris relied on Sandy.
     b. Sandy was relied on by Chris.

(18) a. George slept in the bed.
     b. The bed was slept in by George.

The analysis that I propose to account for these facts makes crucial use of the relational treatment of the passive outlined in section 2.3. The passive is licensed by the relational schema in (19), in which the second ranked GF in the active corresponds to the highest ranked GF in the passive.

(19) Relation: active ⇔ passive
     ⎡ syn [S …, [VP Vy, {[PP P, NPx1]}]]3 ⎤
     ⎣ gf  [GF > GFx1]3 ⎦
       ⇔
     ⎡ syn [S …, [VP Vy[passive]]]3 ⎤
     ⎣ gf  [GFx]3 ⎦

The corresponding GF is implicated in the explanation for the p-passive in section 9.3.2, as case marking is lost in English, and the formulation of the passive construction is central to the coercion explanation in section 9.3.3.

9.3.2 P-passive

With the foregoing in place, we can explain the extension of passive to the p-passive in terms of the identification of the object GF. When V+P is compositional, P assigns a role to the NP and hence the GF domain is the PP. In this case, the NP does not have a GF with respect to V. But when V+P is non-compositional, the domain is the VP. The difference is illustrated in (20).

(20) a. Compositional:
        ⎡ syn [S …, [VP V, …, [PP P, NP1]2, …]]3 ⎤
        ⎣ gf  [GF]3, [GF1]2 ⎦
     b. Non-compositional:
        ⎡ syn [S …, [VP V, …, [PP P, NP1], …]2]3 ⎤
        ⎣ gf  [GF > GF1]3 ⎦


The emergence of the p-passive is the consequence of the change from (20a) to (20b). Crucially, the treatment of the prepositional complement as having a GF with respect to V+P is relevant only to a construction that explicitly refers to this GF in its licensing conditions, namely, the passive, thereby capturing Goh's intuition that V+P functions as a unit only in the passive.
So the p-passive is licensed in English because V+P may have a non-compositional interpretation. The lexical entry for cases like rely on associates the θ-role theme with the complement of the preposition, where we understand theme in the sense of Dowty (1991) or Ackerman & Moore (2001). In the default correspondence for English, theme corresponds to the second GF. So a p-passive is licensed by the passive schema.
But more needs to be said, since a non-compositional interpretation does not necessarily lead to the possibility of a p-passive in a language. Languages like German have many non-compositional V+P pairs, but lack the p-passive.

(21) a. Er hat auf sie  bestanden.
        he has on  them insisted
        'He insisted on them.'
     b. *Sie  wurden auf bestanden.
         they were   on  insisted
        'They were insisted on.'

The crucial difference between German and English is that the preposition in German case-marks its complement as either acc or dat, while the preposition in English does neither. Interestingly, the only argument that counted as direct object in OE with respect to the passive was the accusative argument (Bennett 1980). That is, non-accusative arguments did not passivize. Thus, it appears that the change that allowed for passivization of the object of a preposition was the extension of the object GF associated with the V to the object of the preposition, made possible by the loss of case distinctions and the emergence of the non-compositional interpretation of the V+P pair.¹² These changes are formulated in constructional terms as follows.
First, in the emergence of non-compositional interpretations, a lexical item acquires a prepositional complement, as in (22).

¹² According to Bennett (1980), the non-compositional interpretation was not present in OE, but was beginning to emerge in the Middle English period (Goh 2000, 8).


(22) ⎡ syn [VP V1, NP2]3 ⎤
     ⎣ cs  λy.λx.1′(agent:x, theme:y2) ⎦
     >>
     ⎡ syn [VP V1 [PP P4, NP2]] ⎤
     ⎣ cs  λy.λx.1′(agent:x, theme:y2) ⎦

Second, the default case for complements of prepositions is accusative.

(23) {dat, gen, acc} >> acc

Third, accusative case is identified with GF1.

(24) ⎡ syn NP[acc]1 ⎤
     ⎣ gf  GF > GF1 ⎦

Finally, as a consequence of the default case, the domain for assigning a GF to the complement of the preposition shifts from the PP (20a) to the VP (20b):

(25) ⎡ syn [S …, [VP V, …, [PP P, NP1]2, …]]3 ⎤
     ⎣ gf  [GF]3, [GF1]2 ⎦
     >>
     ⎡ syn [S …, [VP V, …, [PP P, NP1], …]2]3 ⎤
     ⎣ gf  [GF > GF1]3 ⎦

This last change can be understood as a generalization that arises in the course of learning, as learners form hypotheses about the correspondences between case and GFs. As the case distinctions are lost between complements of V and complements of P, the latter take on the appearance of the former. Thus it is a natural step to take the V as the domain of the GF.

9.3.3 Coercion

In addition, we can explain the difference between (16) and (18)—The bed/*New York was slept in—in the same terms, using constructional coercion. In order for a passive to be licensed, the second ranked GF of the active verbal form and the first ranked GF in the passive verbal form must be understood as corresponding to one another. If the second GF in active verbs is by default the theme, then the highest GF in passive forms acquires this property by default. Hence in (16) and (18), both New York and the bed are coerced to be interpreted as theme.


(16) a. George slept in New York.
     b. *New York was slept in by George.

(18) a. George slept in the bed.
     b. The bed was slept in by George.

This interpretation as theme is plausible in the case of the bed but not New York, a fact that has been observed by many—the subject of this type of passive is 'affected', which is understood here to be a possible interpretation for the argument of sleep in.
It is of course well known that the passive applies to the object regardless of its θ-role. It need not be a theme. But the point here is that in a case where there can be a compositional or a non-compositional interpretation of the PP complement of the verb, the subject of the passive is coerced into the non-compositional interpretation and is linked to the object GF, and hence it must be a possible theme.
Analyzing the p-passive in this way allows us to avoid claiming that the complement of the preposition can always have the object GF with respect to the V; rather, it has this GF in the passive only, as Goh proposes, and by coercion, which is my extension of his proposal. Thus, the complement of the preposition is not predicted to undergo heavy NP shift and otherwise act like a direct object with respect to other constructions. It is a direct object only with respect to the passive, under the non-compositional interpretation. The examples in (26) illustrate. Note that heavy shift applies to the PP, but not to its complement. These differences are explained if the PP is in the syntactic domain of the V, but its complement is not, regardless of the semantic interpretation.¹³

(26) a. Sandy {approved of / liked} the decision that you made very much.
     b. *Sandy {approved of / liked} very much [the decision that you made].
     c. Sandy approved very much of the decision that you made.

¹³ Thanks to Giuseppe Varaschin for pointing out that it is possible to state this coercion constructionally, as in (i).
    (i) ⎡ syn [VP V1 [PP P2 NP3]]4 ⎤
        ⎢ gf  GF > GF3 ⎥
        ⎣ cs  [λy.λx.[1+2]′(θ:y, theme[+affected]:x)(3′)]4 ⎦



9.4 Conclusion

I have argued in this chapter that the constructional approach to change can illuminate not only very general and uniform, but also relatively idiosyncratic phenomena in a language. Characterizing the phenomena and the changes in constructional terms gives us a way to understand the dynamics of the system quite independently of the centrality of the phenomena. In fact, there does not appear to be a need to make sharp distinctions between constructions that deal with argument structure and operator binding, on the one hand, and reflexivity, do-support, and preposition stranding, on the other. They appear to require the same descriptive devices and to be sensitive to the same pressures regarding economy.


10 Constructional economy and analogy

The central argument of this book is that by characterizing grammatical knowledge in terms of constructions we can shed light on why and how languages change, and on the nature and limits of linguistic variation. An important part of the argument is that constructions are appropriate alternatives to parameters. As noted in Chapter 3, the classical view of parameters has not been supported by empirical results. The contemporary view of parameters is cryptoconstructional: it accounts for variation and change at all levels of description in terms of derivations from a universal structure, movements and adjunctions triggered by the properties of abstract functional heads and features.
In Culicover (2005) I compared the appearance of linguistic structure to the perception of an image that is not actually present in a picture but signaled by various features of the picture; the image emerges when we squint at it.¹ In the present case, the analogy is that parameters as typically characterized do not exist, but what does exist gives rise to the appearance of parameters, as an approximation. The question is, what is it that has the appearance of parameters when we 'squint' at it? It is not sufficient to argue against a classical or cryptoconstructional approach to variation and change—we must explain why the structure emerges, and why certain patterns are found cross-linguistically, while others are not.
In this book, I argue that this can be done by recognizing that the universal structure is conceptual structure, and that the patterns emerge as the consequence of economy, that is, the pressure to reduce complexity. I have argued that many changes can be understood in terms of economy, following the general line of argument established by Hawkins (1994) and elsewhere, and explored in Culicover (2013c). My primary focus has been on demonstrating that the vocabulary of constructions is well-suited to the description of change and variation. Economy underlies a wide range of phenomena, including but not restricted to the generalization of constructional representations, branching harmony, and minimization of dependency length.

¹ The image that I used is Dali's Lincoln in Dalivision, easily found on the Internet.

Language Change, Variation, and Universals: A Constructional Approach. Peter W. Culicover, Oxford University Press. © Peter W. Culicover 2021. DOI: 10.1093/oso/9780198865391.003.0010

OUP CORRECTED PROOF – FINAL, 19/6/2021, SPi

But we can go even further and ask: why does economy play such a central role in determining the form of grammars? I argue in this chapter that economy in constructions derives from placing a high value on the use and reuse of the components of the processing routines associated with constructional correspondences. The grammatical effects of such use and reuse are what we call ‘analogy’.2 The result of economy is that particular patterns spread through the grammar of a language. Spreading may reach a point where two constructions might be combined into a more general construction, which would have the status of a parameter setting in the classical sense. But it does not have to go this far, and often does not. The consequence is that a description in terms of classical parameters falls short, but the tendency toward certain characteristic patterns nevertheless emerges. Developing this idea, I argue in this chapter that economy allows us to answer a number of questions about language change and variation, e.g.:

Questions
1. Why is there pressure to reduce the complexity of grammatical representations?
2. Why does language contact result in language change?
3. Why do languages become more complex in certain respects, even as they become less complex in other respects?
4. Why are parameters an appealing device for characterizing variation, even where they are descriptively inadequate?
5. Why are there distinctive typological patterns, if there are no parameters?

The structure of the chapter is as follows. In section 10.1 I review Baker’s (1996) argument that there is a polysynthesis parameter. I suggest we need a more distributed and less categorical notion that I call style, which results from a composite of grammatical devices across categories and constructions. To develop this notion of style more precisely, I compare how Plains Cree and English express simple propositions and reference to individuals.
Each language has a characteristic style that can be seen in the mapping between form and meaning, in particular in the use of the formal devices that signal this mapping. While some of the differences can be described in terms of

2 A similar argument is suggested by Jackendoff & Audring (2020, 159) with respect to morphology.


‘parameters’, they are more fine-grained than prototypical parameters and also more irregular. Crucially, style extends to constructions that are functionally very dissimilar. The constructions cannot be combined, suggesting that style cannot be reduced in a natural way to basic parametric distinctions.3 The review of cross-categorial similarities in section 10.1 brings us to the classical notion of analogy as a way of accounting for similarity that falls short of uniformity. Section 10.2 proposes that what has been called ‘analogy’ is the computational system’s response to the pressure to reduce energy expenditure in processing form/meaning correspondences. Such reduction of energy expenditure leads to greater uniformity across paradigms and constructions, but only occasionally to complete uniformity. This view of uniformity contrasts strongly with Chomsky’s (2001) Uniformity Principle, which takes complete uniformity as the norm rather than as the limiting exception. Section 10.3 sums up the key ideas, addressing question 5—why are there typological patterns? The short answer, which is elaborated at some length in section 10.3, is that the patterns are augmented and highlighted by analogy. A particular constructional solution to a form/meaning correspondence is reified throughout the grammar. Thus, constructions that do not express the same relation nevertheless may make use of the same formal device. The result is the emergence of a distinctive style, the way that a language does its work. In cases of linguistic change, stylistic devices are likely to be preserved, so that descendants share certain constructional properties. In situations of contact, various aspects of style may be exchanged between languages, yielding more distributed stylistic patterning and Sprachbund effects.

10.1 The elements of style

Baker (1996) cites Sapir (1921, 120) regarding the structural “genius” of a language. Since Sapir’s perspective is very relevant to the concerns of this book, I reproduce it here:

For it must be obvious to any one who has thought about the question at all or who has felt something of the spirit of a foreign language that there is such

3 In theories like HPSG (Pollard & Sag 1994) and Sign-Based Construction Grammar (Sag 2012), the work of parameters is taken up by the inheritance of abstract constructions. I argue here that what is needed is an even more flexible approach to capture the relations between constructions (see also Jackendoff & Audring 2020).


a thing as a basic plan, a certain cut, to each language. This type or plan or structural “genius” of the language is something much more fundamental, much more pervasive, than any single feature of it that we can mention, nor can we gain an adequate idea of its nature by a mere recital of the sundry facts that make up the grammar of the language.

Baker’s solution to the problem of capturing this “genius” is “the polysynthesis parameter”. Specifically,

I defend the view that Mohawk and similar languages have a single property that distinguishes them from other language types and that influences the form of virtually every sentence of the language. Thus, I will argue that Sapir was right in an important sense in saying that languages have a “structural genius.” (p. 4)

Baker makes it clear that Sapir’s notion of “genius” is far more general than the classical view of parameters, which bear on relatively specific linguistic differences, to the extent that they are ‘microparameters’, ‘mesoparameters’, or even ‘nanoparameters’. Baker’s idea is that of a ‘macroparameter’, one that governs quite globally the way a language goes about expressing meaning using sound. He concludes, “I propose the very strong position that polysynthetic languages differ from other languages in exactly one macroparameter” (p. 8). Baker’s demonstration that the grammar of a language like Mohawk differs in fundamental and global ways from the grammar of a language like English is compelling. In fact, when we look closely at the grammars of other languages—Chinese, Tagalog, Pirahã, German, Igbo, Jacaltec, Plains Cree, Walbiri, Russian, and so on—we are likely to come away with the same feeling: this language has a characteristic way, a style of going about its business of expressing meaning using sound. We can see a particular style in genetically related languages, and in genetically unrelated languages that live in the same geographical region. As a way of getting at what defines the style of a language, let us first consider what it means to say that a grammar expresses one value of a ‘parameter’. As I argued in Chapter 3, what this means in present terms is that there is some CS function that a language must express,⁴ there is some formal device that the grammar employs to express this function, and there is some phonological form that corresponds to this formal device. That is, there is a construction.

⁴ With the caveat that some languages may not express certain functions for cultural reasons. Pirahã (Everett 2012) appears to be an extreme example.


Let us take the familiar example of argument structure. As we saw in Chapter 5, while English strings out constituents in a particular linear order to express argument structure, Plains Cree uses polysynthesis. An example is given in (1).⁵

(1) Plains Cree (Dahlstrom 1991, 11)
    ni-  wa:pam-  a:-      w-  ak
    1    see      direct   3   3pl
    ‘I see them.’

It is a familiar observation in X′ theory that noun phrases have complex structure similar to that of sentences (Chomsky 1970; Abney 1987). In particular, just as the sentence has a subject, i.e. an agentive argument in the canonical case, so the noun phrase may have a possessor.⁶ So an obvious question is, what would we expect to be the form of a possessed noun phrase in Plains Cree? Not surprisingly, perhaps, the NP in Plains Cree does not resemble the NP in English at all. In Plains Cree, phrases headed by a verb and phrases headed by a noun follow very similar ‘templates’. A typical Plains Cree phrase headed by a noun is given in (2), where the polysynthetic approach and the more analytic approach of English can be clearly compared.

(2) Plains Cree (Dahlstrom 1991, 15)
    ni-  ta:nis-    ina:n-  ak
    1    daughter   1p      pl
    ‘our daughters’

⁵ I adjust Dahlstrom’s notation slightly here to align the gloss and to represent long vowels.
⁶ In fact, in some languages, such as Japanese, the same marker may be used to mark a subject and a possessor (Saito 2004).

Although the form in (2) is perhaps not surprising, there are still some questions to be answered.

1. Why doesn’t Plains Cree use polysynthesis for phrases headed by verbs and English-style analytic phrasing for noun phrases (for example)?
2. For that matter, while English uses more or less analytic morphology, with some inflection, for sentences, why doesn’t it use Plains Cree-style morphology for noun phrases?
3. And why does Plains Cree use the form ni ‘first person’ and ak for plural in verb-headed and noun-headed phrases, rather than forms that are


idiosyncratic to the phrase types? It is certainly true that inflectional forms do not have to be the same across paradigms even of the same category in a language—is this cross-category identity in Plains Cree an accident, or is there something that motivates it?
4. Similarly, why aren’t the plural and/or person markers prefixes in one construction and suffixes in the other?

The short answer to question 1—Baker’s answer—is that each language has a value of the polysynthesis parameter. Plains Cree is polysynthetic and English is not. Baker’s (1996) informal statement of the Polysynthesis Parameter is, “Every argument of a head element must be related to a morpheme in the word containing that head” (p. 16). According to Baker, languages that have the value +polysynthetic for this parameter not only have morphemes in the word for the subject and object arguments, but may also have noun incorporation as well as other properties.

Notice that Baker’s formulation of the polysynthesis parameter assumes a key component of what it seeks to account for. The notion “morpheme in the word containing [a] head” presupposes that the language has complex morphology, so that arguments can be expressed within the word. Baker assumes X′ syntax for the arguments, and assumes that the interpretation of the arguments is mediated by movement derivations that result in structures in which they are linked to the corresponding morphemes. For instance, (3a) is the spelling out of the structure in (3b).

(3) Mohawk (Baker 1996, 12)
    a. Wa’-ke-nákt-a-hnínu-’.
       fact-1sS-bed-∅-buy-punc
       ‘I bought the/a bed.’
    b. [S [NP I] [VP [V [N bed]i [V buy]] [NP ti]]]


So Baker is right—there is such a thing as style. A language has certain patterns that show up across a range of constructions. But it turns out that only some of these are closely related, although they share certain properties. An example that I have already discussed in some detail is p-stranding in English, which is a characteristic of the language. P-stranding can be seen in a wide range of constructions, both A and A′. Another case is subject Aux inversion, which shows up in many English constructions with quite different semantic functions. A third case is English do-support, which occurs in a range of constructions with different superficial properties. I return to them in section 10.2.3.

Because of this lack of tight cohesion across constructions, I believe that Baker’s notion of an overarching macroparameter is too strong. This said, we do want to understand why a language should reflect a style across different categories and in different domains. That is, we want to understand why the same formal devices tend to be used in the same way across categories, albeit not uniformly. To a considerable extent, we may appeal to the approach to typology of Hawkins, cited elsewhere, as part of our answer to questions such as 1–4. Hawkins proposes that the tendency of languages to show uniform ‘parametric’ values reflects a pressure to increase communicative efficiency. For example, disharmonic branching direction involving heads and complements across categories in a single language delays or interferes with the identification of core structure and increases computational complexity—see the discussion in section 3.2. Hawkins (2014, 34–5) defines communicative efficiency this way:

Communication is efficient when the message intended by S is delivered to H in rapid time and with the most minimal processing effort that can achieve this communicative goal

and Acts of communication between S and H are generally optimally efficient; those that are not occur in proportion to their degree of efficiency.

In other words, the encoding of a message is efficient when it requires minimal effort by the hearer to decode it. But this does leave open the question of whether the effort exerted by the speaker in encoding it is also relevant, and whether there is any tradeoff between speaker and hearer effort.


While such notions of efficiency are helpful, as in the case of branching direction, it is also necessary to explain why the same formal device would be used across categories, or why the same form would be used for different, but related functions. Hawkins’ proposal in this respect is the principle of Minimize Forms (Hawkins 2014, 15):

The human processor prefers to minimize the formal complexity of each linguistic form F (its phoneme, morpheme, word, or phrasal units) and the number of forms with unique conventionalized property assignments, thereby assigning more properties to fewer forms. These minimizations apply in proportion to the ease with which a given property P can be assigned in processing to a given F.

What we have, Hawkins argues, is a competition between clarity of expression, which yields “minimal processing effort” in one sense, and economy of encoding, which yields “minimal processing effort” in another sense.⁷ My focus in what follows is on the latter.

10.2 Analogy

These ideas have a natural characterization in terms of constructional complexity. The more constructional complexity is reduced along the lines discussed in preceding chapters, the more similar the correspondences become. Constructional similarity corresponds in many respects to the traditional concept of analogy. Analogy is a notion that has been criticized as being too vague to form the basis of genuine explanation (Chomsky 1966, 66). The main difficulty is to identify the properties used to compute the analogy in a principled way. I suggest in this section that we can understand analogy in terms of processing economy. The reduction of distinct forms and distinct form/meaning correspondences reduces the number of distinct processing steps required to compute correspondences. As I show, such an explanation goes well beyond the more obvious examples of change, where correspondences are generalized

⁷ In fact, this tension between the apparent efficiency of reusing the same devices and the apparent clarity of using different devices for clarity of communication is prominent in the literature; see for example the papers in Sampson et al. (2009) and MacWhinney et al. (2014), and Wasow (1997a,b, 2002).


250 constructional economy and analogy and exceptions are eliminated. It offers a basis for explaining the broader typological patterns that I have characterized as style.

10.2.1 Maximizing economy

Chomsky’s criticism notwithstanding, it seems unobjectionable that humans are constantly seeking to identify patterns and to form generalizations. I take this as a given.⁸ In the case of language, it is plausible that, in part, this has to do with the reduction of energy expenditure in processing linguistic expressions—that is, maximization of economy. In some cases, maximization of economy is accomplished by true generalization, where the categories of the participants in the constructional correspondence overlap.⁹ The key idea that relates constructional complexity to economy is that constructions are represented in memory as schemas (Jackendoff & Audring 2020, chapter 2) and are realized in actual use as processing routines (Jackendoff & Audring 2020, chapter 7). This idea is captured in the following from Bybee (2013):

From the broader perspective of usage-based theory, however, constructions can be viewed as processing units or chunks—sequences of words (or morphemes) that have been used often enough to be accessed together.

What Bybee refers to as “words (or morphemes)” may, of course, be categories, given that many constructions are non-idiomatic, general components of grammars. Consider, for example, the task of expressing the agent argument of a verb denoting action in a language like English. Having accessed the phonological form of the verb V, and having constructed the form of the noun phrase that

⁸ For discussion of the issues surrounding analogy and a review of attempts to deal with it in psychology and linguistics, see Blevins & Blevins (2009). For other perspectives, see Traugott & Trousdale (2013) and Kiparsky (2011). De Smet & Fischer (2017) discuss the role of analogy in accounting for language change, and in particular the utility of constructional representations and the relationships between constructions in defining what is at play in analogical extensions: “It has been observed, for instance, that an analogical extension is the more likely, the more its outcome resembles one or more already existent patterns” (p. 243). ⁹ The idea that grammaticalization is the result of analogy under pressure to reduce computational complexity is argued for by Fischer (2013), who cites as antecedents Itkonen (2005); Wanner (2011); Saussure (1922 [1983]); Slobin (1985b), as well as Hermann Paul.


denotes the agent, call it ‘Agt’, the routine of producing the sentence must order Agt before V. Similarly, in understanding a sentence of the form Agt–V–…, the hearer must assign to Agt an interpretation that it is the agent of the verb. Let us call this processing routine ‘Subject’. Now consider the task of expressing the experiencer of a verb denoting perception in the same language. Clearly, it requires less processing effort to ‘piggyback’ the processing of this semantic structure on the Subject routine, especially because the experiencer shares certain properties with the agent; they are both the more agentive arguments of their respective predicates—see Chapter 5. It is of course possible to have different routines for the two interpretations, and there are languages that do distinguish agents and experiencers (as discussed in Chapter 5), but that is more costly in terms of computational complexity. We can invoke this sort of post hoc, processing-based explanation for all cases of generalization, across all domains and all properties. In the area of word order, for example, we can add this type of explanation to Hawkins’ principles, arguing that generalizing the routines that speakers and hearers have established for identifying arguments and adjuncts with respect to heads counts as reduction of energy expenditure, even as it reduces memory load and the time needed to identify the skeletal argument structure. In polysynthetic languages, locating most of the affixes on one side of the root or the other allows a particular routine for relating linear position to semantic function to be generalized.10 Aikhenvald (2017, 287) reports that in the Amazonian Basin, there is a tendency for languages to have more suffix positions than prefix positions; in the extreme, Panoan languages, and Urarina, an isolate from north-western Peru, have one prefix position and many suffix positions.
More generally, the data in Dryer & Haspelmath (2011, chapter 26A) show that suffixation is significantly preferred over prefixation.11

10 Mobile affixes exist, but are rare (Jenks & Rose 2015).
11 For an explanation of this preference in terms of learning complexity, see Hana & Culicover (2008).

The ordering of affixes is an area where it does not appear to be possible to appeal to the kinds of factors that have been argued to play a role in the ordering of syntactic constituents. For instance, Hawkins (1994) explains harmonic branching as the result of a pressure to keep heads as close to one another as possible, thereby reducing the processing required to identify the skeletal


argument structure of a sentence. Since there is no constituent structure and no branching in the form of a word as it is expressed phonologically, this is not a motivation for any analogous harmonic morphological effects. In section 10.2.2 I show that there are in fact such effects, and attribute them to economy. Similarly, the use of polysynthesis across categories discussed in section 10.1 does not in any obvious way facilitate communication. But using the same processing routines for mapping between word-internal morphology and argument structure across domains requires less energy than using different processing routines for the same type of relation across domains. Consider also cases where there is a phon-syn correspondence that does not involve CS, and therefore no constructional generalization. Such cases are what Culicover & Jackendoff (2005) characterized as a ‘signature’ of a language. The signatures, taken together, define the style of a language. They are components of constructions that are identifiable in the routines by which correspondences are computed. While they are stated in terms of correspondences, they do not necessarily inherit their properties from more general constructions because they do not overlap in function. By the same token, the use of the same devices across categories does not appear to generalize to the level of a parameter in the classical sense. So, there can be languages in which polysynthesis is a property of verbs and not of other categories (Mithun 2017b, 257).

10.2.2 Routines

Let us now work through a more explicit version of the notion of analogy as underlying style by formulating and then comparing the argument structure processing routines for prefixation, suffixation, and analyticity. Consider first (1), repeated here.

(1) Plains Cree (Dahlstrom 1991, 11)
    ni-  wa:pam-  a:-      w-  ak
    1    see      direct   3   3pl
    ‘I see them.’

The constructional correspondence that licenses this sentence is (4), repeated from section 6.3.1.


(4) Construction: direct thematic correspondence

    phon  [1(2,3,4)]5
    syn   [cat V, lid 1,
           arg1 [person, number]2,
           arg2 [person, number]3,
           ths  direct4]5
    cs    [λy.λx.1′(agent:x, patient:y)(3′)(2′)]5

Here, agent and patient are understood to be the most agentive and patientive arguments, in the sense of Dowty (1991). A plausible routine for producing phon given CS is given in (5). → is to be understood as phonological production. The steps are partially temporally ordered, in that identification of a component of CS must precede production corresponding to that component. The ordering of steps in the routine is not arbitrary: since the parts of phon that correspond to particular elements in syn are strictly ordered, the corresponding steps in the routine are similarly ordered.

(5) 1. identify relation: see′1;
    2. identify arguments: agent:ego2, patient:3rd3+PL4;
    3. ego > 3rd: direct5;
    4. compute ΦPlainsCree 1(2,3,4,5):
       (a) ego2 → ni-
       (b) see′1 → wa:pam-
       (c) direct5 → a:-
       (d) 3rd3+PL4 → w-ak

Parsing of the form reverses the steps.12 Step 3 in (5) is an example of what Slobin (1987) calls “thinking for speaking”: a speaker of Plains Cree must assess the relative positions in the hierarchy of the two arguments in order to determine whether the theme will be direct or inverse. A speaker of English does not need to pay attention to the hierarchy, because it has no grammatical consequences in English. Compare

12 For recent computational approaches to parsing Plains Cree morphology, see Schmirler et al. (2018); Arppe et al. (2016); Harrigan et al. (2017).


(5) with the routine for the same concept in English. Note that the two routines are identical up to step 3. The paradigm function of Plains Cree is of course more elaborate than that of English—compare step 4.

(6) 1. identify relation: see′1;
    2. identify arguments: agent:ego2, patient:3rd3+PL4;
    3. ego2 → I
    4. compute ΦEnglish 1(2):
       (a) see′1 → see
    5. 3rd3+PL4 → them
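As a toy illustration of how routines (5) and (6) share their first two steps, here is a sketch in code. Everything in it—the function names and the dictionary encoding of CS—is invented for this example and is not part of the book’s formalism; only the direct form of the Plains Cree paradigm is implemented, with the patient’s exponents hard-coded.

```python
# Toy sketch of production routines (5) and (6).
# The CS encoding and all names are invented for illustration.

def produce_plains_cree(cs):
    """Routine (5): CS -> phon for Plains Cree 'I see them.'"""
    relation = cs["relation"]                      # step 1: identify relation
    agent, patient = cs["agent"], cs["patient"]    # step 2: identify arguments
    # Step 3 ("thinking for speaking"): check the person hierarchy
    # to choose direct vs. inverse marking.
    theme = "direct" if agent == "ego" else "inverse"
    # Step 4: the paradigm function; only this one direct form is covered,
    # and the patient's exponents (w-ak) are hard-coded for the example.
    return "ni-" + {"see": "wa:pam"}[relation] + {"direct": "-a:"}[theme] + "w-ak"

def produce_english(cs):
    """Routine (6): identical to (5) up to step 2; no hierarchy check."""
    relation = cs["relation"]                      # step 1
    agent, patient = cs["agent"], cs["patient"]    # step 2
    subj = {"ego": "I"}[agent]                     # step 3
    verb = {"see": "see"}[relation]                # step 4
    obj = {"3rd+PL": "them"}[patient]              # step 5
    return f"{subj} {verb} {obj}"

cs = {"relation": "see", "agent": "ego", "patient": "3rd+PL"}
print(produce_plains_cree(cs))  # ni-wa:pam-a:w-ak
print(produce_english(cs))      # I see them
```

The point of the sketch is that the first two steps are shared verbatim across the two routines; on the account developed here, it is precisely such shared steps that reduce processing cost.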

The routine for the Plains Cree NP uses many of the same steps as the routine for the verb in (5). The correspondence for (2) ni-ta:nis-ina:n-ak ‘our daughters’ is given in (7).

(7) 1. identify relation: daughter′1;
    2. identify arguments: possessor:ego2+PL3;
    3. compute ΦPlainsCree 1(2,3):
       (a) ego2 → ni-
       (b) daughter′1 → ta:nis-
       (c) ego2+PL3 → -ina:n-ak

By hypothesis, executing parts of routines that are shared across constructions is less costly than executing parts of routines that are specific to a particular construction. Intuitively, the computation of the Plains Cree NP is less costly than it would be if it used the English routine for the English syntax, given in (8), even setting aside differences in lexical form.

(8) 1. identify relation: daughter′1;
    2. identify arguments: possessor:ego2+PL3;
    3. compute ΦEnglish 2(3):
       (a) ego2+PL3 → our
    4. daughter′1 → daughter

The steps in the processing routines of course reflect parallelisms in the morphosyntax. My hypothesis is that it is the use of the same steps that contributes to economy, and not the morphosyntax per se. But because of the close correspondence between the morphosyntax and the computation, the morphosyntax is a good proxy for the computation and the notion of


‘analogy’ captures similarities in processing routines. And this in turn allows us to introduce a substantial simplification in representing the routines by recognizing the relationship between the routines and the correspondences. As noted, the constructions are representations of the schemas underlying the routines used to compute constructs. They are representations of the relationships between particular sounds and meanings. For example, (8) represents the computation of the correspondence in (9).

(9) phon  [our2,3–daughter1]4
    syn   [NP [NP our]2,3 [N daughter]1]4
    cs    [daughter′1 [possessor:ego2+PL3]]4

The production routine can be constructed by starting with the CS representation, and then proceeding to the corresponding phon representation, going from left to right. In this simple case, then, there is a series of intermediate constructs that are associated with the steps in (8), as illustrated in (10).

(10) phon  [ our2,3 (Step 3)  –  daughter1 (Step 4) ]4
     syn   [NP [NP our]2,3 (Step 3)  [N daughter]1 (Step 4) ]4
     cs    [ daughter′1 (Step 1)  [possessor:ego2+PL3] (Step 2) ]4

In this way, we can isolate the parts of related constructional schemas. Consider harmonic branching, e.g. head-final VP and PP. The shared portion of the routines is shown as cosubscripted boxes, rendered here as ⟨…⟩, in the constructions.13

(11) Head-final VP:  phon  2–⟨1⟩i   syn  [VP ⟨V⟩j 1, XP2]
     Head-final PP:  phon  2–⟨1⟩i   syn  [PP ⟨P⟩j 1, NP2]

The substance of the shared portion is the step ‘order the phon corresponding to the head following all other material in phon that corresponds to other constituents in the phrase’.

13 Here I am adapting the notation that Jackendoff & Audring (2020) use for representing the relatedness between morphological constructions that do not inherit their shared properties from a more general construction. This notion of relatedness can be traced back to Harris (1957).
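The shared step—ordering the phon of the head after everything else in the phrase—can be sketched as a single reusable routine component. The function names and the Japanese-flavored word shapes below are invented for illustration.

```python
# Sketch: one ordering step shared by head-final VP and head-final PP.
# Names and example word shapes are invented for illustration.

def head_final(head_phon, rest_phon):
    """Shared routine step: phon of the head follows all other material."""
    return f"{rest_phon} {head_phon}"

# The two constructions differ only in the category of the head;
# both reuse the same ordering step, which is where the economy lies.
def head_final_vp(verb, complement):
    return head_final(verb, complement)

def head_final_pp(preposition, np):
    return head_final(preposition, np)

print(head_final_vp("taberu", "ringo-o"))  # ringo-o taberu
print(head_final_pp("de", "Tokyo"))        # Tokyo de
```

The design point is that the ordering logic lives in one place: adding another head-final category reuses `head_final` rather than introducing a new ordering routine.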


Degree of similarity can also be measured in terms of the constructional formulation. As I noted in the version of Culicover (1973) reprinted in Culicover (2013b), similarity between two constructions corresponds to Levenshtein distance (Levenshtein 1966), that is, the number of elementary transformations needed to change one into the other. In the case of (11), for example, we might take the Levenshtein distance to be one, since the only difference is the category of term 1. Without going through the details, it should be clear that the Levenshtein distance between the construction for the Plains Cree sentence in (4) and the construction for the English sentence is substantially larger—there is virtually no overlap at all. What we see, then, is a way to get at the style of a language while maintaining individual constructions when the facts warrant. To take just one more example, Haugen (2011) shows that one of the key traits of polysynthesis, pronominal subject and object incorporation in the verb, developed separately for subjects and objects in Southeast Puebla Nahuatl. Thus, Baker’s (2001) proposal that polysynthesis is the property of expressing GFs internally to the verb is too strong, in that it does not allow for marking of one argument without the other. Baker rejects the idea that there may be one parameter for each GF, arguing that there is no object agreement without subject agreement. But Haugen shows that there are languages that have just object agreement, e.g. O’odham. According to Haugen, following Mithun, the order of morphemes reflects the historical development of the agreement morphology, and both orders are possible (p. 320). From a constructional perspective, pronominal subject and object incorporation are different, but related phenomena, as formalized in (12). For purposes of illustration, I assume that they correspond to prefixing.14

(12) a.
subject incorporation

     phon  1(2)=2+1
     syn   [category V, lid 1, arg [person, number]2]
     cs    λx.1′(θ:x)(2′)

14 It is interesting to observe that this incorporation follows an ergative pattern. When there is a single argument, it is the first, and only, incorporated argument. When there are two arguments, the object is the first incorporated argument.

OUP CORRECTED PROOF – FINAL, 19/6/2021, SPi

b. object incorporation

     phon  1(2,3)=3+2+1
     syn   [category V, lid 1, arg [person, number]2, arg [person, number]3]
     cs    λy.λx.1′(θ:x,θ:y)(2′)(3′)

The possibility of pronominal subject incorporation into the verb facilitates, by analogy, incorporation of (i) pronominal objects, (ii) non-pronominal objects, and (iii) other categories. There are no parameters, only extensions of parts of processing routines to related constructions.¹⁵
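To make the similarity measure discussed at the start of this section concrete, here is a short sketch of the Levenshtein computation over constructional terms. The feature tuples are hypothetical stand-ins for flattened constructional descriptions, not the actual representations in (11) or (12); Python is used purely for illustration.

```python
# Sketch: Levenshtein distance over sequences of constructional terms.
# The tuples below are hypothetical stand-ins for flattened
# constructional descriptions, not the full AVMs in (12).

def levenshtein(a, b):
    """Number of insertions, deletions, and substitutions
    needed to turn sequence a into sequence b."""
    m, n = len(a), len(b)
    # dist[i][j] = distance between a[:i] and b[:j]
    dist = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dist[i][0] = i
    for j in range(n + 1):
        dist[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j] + 1,        # deletion
                             dist[i][j - 1] + 1,        # insertion
                             dist[i - 1][j - 1] + cost)  # substitution
    return dist[m][n]

# Two constructions differing only in the category of term 1,
# as in the discussion of (11): distance 1.
subj_incorp = ("V", "lid", "arg:person+number", "cs:1'(x)(2')")
obj_variant = ("N", "lid", "arg:person+number", "cs:1'(x)(2')")
print(levenshtein(subj_incorp, obj_variant))  # 1
```

Two constructions with no shared terms at all, like the Plains Cree and English constructions compared above, would come out with a distance close to the full length of the longer description.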

10.2.3 Pure style

Cases where the similarity in routines is not matched by similarity in construction are particularly interesting. These are cases where the same grammatical device, that is, the phon-syn correspondence, is used across unrelated constructions. They show clearly that what we have been calling 'analogy' cannot be reduced to shared parameters or inherited cs-syn-phon correspondences. The reason for this is that the same processing steps are used in computing two constructions, but the two constructions do not express overlapping CS functions. Thus, they cannot be viewed as being special cases of a more general construction, as characterized in section 10.2. I refer to this situation as 'constructional irreducibility'.¹⁶

As mentioned in section 10.1, one instance of constructional irreducibility is preposition stranding in English. In Chapter 9, I showed that p-stranding in passives is formally very close to p-stranding in A′ constructions. But the passive is not an A′ construction, and it is impossible to treat both passive and

¹⁵ Haugen (2011, 330) makes just this case for Nahuatl—first, the development of subject agreement, and then the development of incorporation. However, he couches his scenario in terms of resetting the various parameters, which is just a different way of saying constructional change.
¹⁶ Constructional irreducibility thus appears to reflect generalization that does not form a natural class, in the sense of Harris & Campbell (1995, 102).


A′ constructions as inheriting the property of p-stranding from a more general construction.¹⁷ Nevertheless, both make use of the same component of processing, one that treats the verb and the preposition as a unit with respect to assignment of a thematic role to the complement of the preposition. Crucially, this processing component is not a matter of syntactic representation, but of thematic role assignment. The preposition cannot be stranded in the passive when there is thematically independent material between the verb and the preposition (13a), but it can be if the verb and the intervening material form a semantic unit, such as speak (softly) to (13b) and take advantage of (13c).

(13) a. *Chris was given the book to.
     b. Chris was spoken (softly) to.
     c. Chris was taken advantage of.

In contrast, the traditional notion of 'reanalysis' isolates the complement of the preposition by treating the verb, the preposition, and intervening material as a syntactic unit (Hornstein & Weinberg 1981), as in (14). Therefore it does not distinguish the cases in terms of how the thematic role is assigned.

(14) a. *Chris was [V given the book] to.
     b. Chris was [V spoken (softly)] to.
     c. Chris was [V taken advantage] of.

The implausibility of such a syntactic analysis was shown in Chapter 9.¹⁸ Since the assignment of thematic roles is a matter of CS, it does not correspond to a simple configurational relationship in syn or a local relationship in phon.¹⁹ The generic computational routine that captures this generalization is given in (15).

(15) 1. identify arguments: agent:α2, patient:β3
     2. …

¹⁷ This observation can be understood to apply very generally, regardless of how we choose to characterize constructions. For example, suppose we use movement derivations to describe A′ and A constructions. In order to explain preposition stranding in A′ constructions, we might appeal to an 'escape hatch' that allows the NP to leave the PP (cf.
van Riemsdijk 1982). But then we are faced with the problem that an argument cannot move from an escape hatch to a higher argument position (Chomsky 1973).
¹⁸ See also Baltin & Postal (1996).
¹⁹ Here, I am obviously departing from an idea that was prominent in GB theory (cf. Chomsky 1981) that the assignment of thematic roles is a syntactic phenomenon, analogous to the assignment of case.


     3. P1 NP3 → 1 – ∅3
     4. …

This routine says that the complement of a preposition that corresponds to the patientive argument may be null in phon. That is, English has preposition stranding. But this argument must be licensed in some other position in order for the construct to be well-formed. So, if the complement is in A′ position, it will be spelled out in initial position in the string. Similarly, if this argument is the subject of a passive, as in (13b,c), it will be spelled out in subject position.

Other constructions that p-stranding appears to have been generalized to are those that may not be reducible to the semantics of the general A′ construction discussed in Chapter 7, and thus may be additional cases of constructional irreducibility. These are exemplified in (16).

(16) a. Chris needs a good talking to.
     b. Your problem is definitely worth thinking seriously about.
     c. Your problem is fun to think about.
        It is a fun problem to think about.
     d. Your problem is too hard to work on.
        It is too hard a problem to work on.
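The thematic-unit condition illustrated in (13) can be sketched as a toy licensing check. The inventory of verb-preposition semantic units below is a hypothetical simplification; the actual routine in (15) operates over CS representations rather than word strings.

```python
# Toy check of the thematic-unit condition on passive p-stranding:
# a stranded preposition is licensed only if the verb, any intervening
# material, and the preposition form a semantic unit (speak to,
# take advantage of), not when thematically independent material
# intervenes (*given the book to).

# Hypothetical inventory of verb(+material)+P semantic units.
SEMANTIC_UNITS = {
    ("spoken", "to"),
    ("spoken", "softly", "to"),
    ("taken", "advantage", "of"),
}

def licenses_stranding(verb_plus_p):
    """verb_plus_p: tuple of words from the passive participle
    through the stranded preposition."""
    return tuple(verb_plus_p) in SEMANTIC_UNITS

print(licenses_stranding(("spoken", "to")))                # True  (13b)
print(licenses_stranding(("taken", "advantage", "of")))    # True  (13c)
print(licenses_stranding(("given", "the", "book", "to")))  # False (13a)
```

The point of the sketch is only that the licensing condition is a lookup over semantic units, not a check on syntactic constituency, which is what distinguishes this account from reanalysis as in (14).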

Stepping back a bit, what we see here is that English has a distinctive style, captured by the routine in (15). It is not specific to a particular construction, but is involved in licensing across a set of constructions. This stylistic property of English is similar, in this respect, to the cross-categorial polysynthesis property of Plains Cree. In neither case is a parameter either necessary or sufficient. And at least for English p-stranding, it is not clear what such a parameter would look like. Of course, as always, it is possible to stipulate cryptoconstructional 'micro-parameters' such as [allows p-stranding in A′ constructions], [allows p-stranding in passives], [allows p-stranding with need], [allows p-stranding with fun], etc. But then the notion of parameter does no explanatory work. And such an approach raises the puzzling question of why there are no languages with p-stranding in passives but no p-stranding in A′ constructions. Clearly, the answer is that p-stranding in passives is an extension of the core A′ cases, by analogy.

Another characteristic English property that conveys a similar stylistic flavor is do-support. In section 3.3.3 I observed that in the English negative imperative we see what looks like do-support, e.g.


(17) a. Don't touch that button.
     b. Don't you touch that button.

However, because there is no tense in the imperative (Culicover 1971), it is difficult to collapse this case of do-support with the regular variant that appears in tensed sentences. Thus, I concluded in Chapter 3 that the occurrence of don't in negative imperatives is simply a matter of constructional stipulation. But this conclusion does not explain why the same form is used in the tensed and imperative cases, of course. From the perspective of constructional representations, the form in the negative imperative could just as well be *gront. But it is obvious that the use of don't in the negative imperative is not accidental—it takes advantage of an existing form with a particular function. In the context of the current discussion, we can understand this phenomenon as another instance of analogy, albeit a very restricted one. Basically, the routine that is used to convey negation in the tensed sentence is recruited to convey negation in the imperative. A proposed routine is sketched in (18). Again, it is essential to recognize that recruitment of such a routine is quite different from shared use of the same grammatical construction—don't in the negative imperative when the subject is you is not reducible to the don't in a negative tensed sentence when the subject is you, even though they are the same form.

(18) 1. identify arguments: agent:addressee1, …
     2. identify polarity: neg2
     3. …
     4. neg2 → don't
     5. …

addressee1 is realized as you in the appropriate linear ordering with respect to don't as licensed by the particular construction, e.g. declarative, interrogative, imperative.

Yet another example from a different grammatical domain involves nonfinite sentential complementation in English, which uses the forms -ing and to. De Smet (2013) tracks the history of these forms from Middle English to Modern English. For instance, many uses of infinitival to convey potentiality. However, there are also many that do not (De Smet 2013, 24f). Thus it is not possible to account for the widespread use of this form of complementation as having been driven by the association of the form with a particular meaning, which would be a straightforward constructional generalization.


De Smet argues that while there may be some semantically based generalizations governing the use of to, these conflict with and may be overridden by other constructional considerations. Hence, it is a formal device for marking complementation that is somewhat idiosyncratically governed by particular verbs and particular constructions. Similarly, the gerund spreads from marking the complements of a relatively small set of verbs in Middle English to a progressively larger set, based on what De Smet calls 'narrow' and 'broad paradigmatic analogy' (De Smet 2013, 68f; 144f)—these are semantic and formal similarities.

A significant point, in the current context, is that infinitival to and the gerund show constructional irreducibility, where the particular form simply has to be stipulated in the contexts in which it is licensed. And the argument for their widespread use across constructions is that they are available devices for signaling complementation that can be extended and repurposed to the extent required. For example, logically, different prepositions could have been used with the infinitive consistent with different meanings, as in (19), but they are not.

(19) a. want to/*in eat the pizza
     b. sorry to/*for eat the pizza
     c. like to/*at/*with eat pizza
     d. fun to/*on/*by eat pizza

The reason why the preposition to is used in all cases is analogy, the maximization of economy.

To take another example from English, SAI occurs in questions, with topicalized negative constituents that have sentential scope, and with so. In Culicover (2013c, 89) I argue against the claim of Goldberg & Giudice (2005) that there is a semantic factor that unifies all cases of SAI. But this leaves open the question, why did the history of English lead to the use of the same formal device for different correspondences? Logically, questions could have SAI, sentential negation could have no inversion, and so could license the auxiliary in final position. Granted, this fractionation of the constructional grammar is counterintuitive. But to explain why it does not occur, we have to appeal to some principle. Similarity of form leads to reduction in the complexity of the grammatical description when the functions overlap. But when the functions do not overlap, we have constructional irreducibility. We cannot appeal to economy of representation. Then we have to appeal to maximal use of the same formal


devices purely on the basis of economy of computation. In the end, SAI is a signature of English, just like preposition stranding, don't, infinitival to, and gerundive -ing in complements. It signals the presence in initial position of a constituent that corresponds to an operator that scopes over the entire sentence.

10.3 Beyond parameters: Capturing the style

I have argued in this chapter that the way to capture typological patterns is through constructional economy, realized as analogy. In this section I review a number of classical typological generalizations and show how they may be understood in these terms. I start with Baker's Polysynthesis Parameter and recapitulate briefly the observations made thus far about the phenomena that it seeks to account for. Then I consider Greenberg's (1966) universals that bear on morphology and syntax, with additional reference to Dryer & Haspelmath (2011). I conclude with some additional candidates for universals of the Greenbergian sort, inspired by the constructional approach to analogy.

10.3.1 Baker's Polysynthesis Parameter

Recall that our central principle regarding typological patterning is that, driven by economy, speakers will tend to shape constructions that express more or less related form/meaning correspondences in ways that use existing computational schemata. This is what we mean by 'analogy'. This extension of schemata may in the limit achieve a level of generality expressed in classical parameter theory, but need not. The many cases where the extensions of schemata fail to show complete generality are the arguments against classical parameter theory, and arguments for the more fine-grained expressive power of constructions.²⁰

Recall that Baker's Polysynthesis Parameter has to do with whether or not arguments are morphologically incorporated in the verb. With this in mind, it is clear that the evidence that Baker himself provides in Baker (1996) is overwhelmingly against this macroparameter, even as he argues for it in great detail. Perhaps the most salient instance of this is that in virtually every

²⁰ As noted elsewhere, the phenomena themselves do not argue against more fine-grained versions of parameters from the perspective of descriptive adequacy.


polysynthetic language, third person direct objects are not morphologically marked. To quote Baker (1996, 21):

    It must be conceded that the necessity of NI or agreement is not at all apparent on the surface in Mohawk. I have already alluded to the fact that verbs with neuter objects and no incorporated noun root seem to have no morpheme on the verb that indicates the object.

The solution, of course, is the traditional one of zero-morphology. He continues:

The solution, of course, is the traditional one of zero-morphology. He continues: However, this problem disappears if we assume that Mohawk has a phonologically null third person neuter morpheme on the verb in these cases. Positing the existence of such zero morphemes is a common practice in both the descriptive and theoretical literature; for example, this device is used explicitly in Jelinek 1984, as well as in most descriptions or analyses of head-marking languages (e.g. Mithun 1986b for Mohawk). The important point is that this ∅ element is restricted to third person neuter; for other person-gender categories the form is not in general ∅ and the need for either agreement or incorporation shows up clearly.

Occam's Razor, as well as the constructional approach that I argue for here, suggests that there is no morphology in these cases, rather than zero-morphology. The lexical entry of a prototypical transitive verb itself entails that there is a patient. For example, it is impossible to engage in an act of hitting without hitting something. And it is impossible to engage in an act of killing without killing some animate thing. These entailments are constructional, of course, since they do not correspond to the absence of overt morphosyntax in every language. It is the paradigm function that determines whether there is anything in phon that corresponds to a third person argument. In Plains Cree, the lexical form of the verb itself indicates whether the patient argument is animate (wâpam- 'see her/him') or inanimate (wâpaht- 'see it/something'). In addition, there are AI+O verbs in Plains Cree, as discussed in section 6.3.2. These are syntactically intransitive verbs that are semantically transitive. That is, they have only one morphologically marked argument, the subject, yet correspond to two-argument relations in CS. The difference between these verbs and 'zero-morphology' verbs is that in the latter case, the second argument is not marked on the verb only if it is third person.


Thus, the AI+O verbs have the general default constructional form of (20).

(20)
  [ phon  1(2)
    syn   [ category  V
            lid       lid1
            arg       arg2 ]
    cs    1′(agent:2′, patient:pro) ]

In addition, in languages like Plains Cree and Mohawk that mark arguments on the verb, datives are transitive with respect to agent and recipient, with the theme being a non-agreeing argument—essentially, they are TA+O. This is another case where an unaffected theme argument is part of the lexical semantics of a verb, but does not enter into the syntactic construction.

Other properties that Baker associates with the Polysynthesis Parameter are similarly variable across languages. As noted in section 6.3.3, incorporation of non-pronominals is not uniform across languages, but extends to non-arguments in some languages, e.g. location and instrument. While word order is quite free in Mohawk, it is more restricted in other polysynthetic languages (Baker 1996, 117). More generally, Sadock (2017) argues that 'polysynthesis' is a subjective term that has different instantiations across languages (see also Fortescue 2017). Mithun (1986, 2017a) has shown that noun incorporation, which is characteristic of polysynthesis, comes in several varieties, and Evans (2017, 333) provides evidence from the languages of Northern Australia that "the typological traits associated with polysynthesis are not strongly linked."

Thus, it appears that the notion of a parameter setting for an entire language, even one as intuitively distinctive as polysynthesis, does not capture the actual extent of variation observed in polysynthetic languages. It is too categorical, and hence does not give Baker the flexibility to adequately characterize the empirical phenomena that he identifies, at least not without appeal to auxiliary parameters and other devices such as null morphemes. What we do have is style, the consequence of analogy driven by economy.

10.3.2 Greenberg's universals

I turn next to Greenberg's universals (Greenberg 1966). As is generally recognized, most of these universals are not universal in a strict sense. They have


the status of statistically significant correlations: a language with property X is more likely to have property Y than the logically possible alternative(s). Such correlations are to be expected in the current framework, where the same grammatical devices are likely to be used across constructions in order to maximize economy. Many familiar Greenbergian universals have to do with branching direction, where heads tend to precede their complements across categories. In fact, branching harmony is characterized in precisely this way by Dryer (1992, 89), who proposes the following principle.

    The Branching Direction Theory (BDT): Verb patterners are nonphrasal (nonbranching, lexical) categories and object patterners are phrasal (branching) categories. That is, a pair of elements X and Y will employ the order XY significantly more often among VO languages than among OV languages if, and only if, X is a nonphrasal category and Y is a phrasal category.

Dryer's formulation captures the formal similarity of branching direction across categories, without trying to reduce all the cases to a common function. This formal similarity is precisely what I am trying to capture with constructional analogy. I review several Greenbergian universals that appear to fall under this general approach, with comments where appropriate.

Universal 2. In languages with prepositions, the genitive almost always follows the governing noun, while in languages with postpositions it almost always precedes.

The orderings P-NP & N-Gen and NP-P & Gen-N are instances of branching harmony, and fall under the rationale offered by Hawkins (1994) and Dryer (1992). As is generally the case for branching harmony, we may also make a case for a shared processing routine, consistent with the BDT: the head of the phrase defines the category of the phrase, which in turn determines its function in the interpretation. Carrying out this step in a uniform way reduces processing complexity.

Universal 3.

Languages with dominant VSO order are always prepositional.

Universal 4. With overwhelmingly greater than chance frequency, languages with normal SOV order are postpositional.

These two universals reflect the generalization of one of two processing strategies: (i) the head determines the thematic roles of the arguments, which


follow it; (ii) the arguments are identified, and then slotted into the roles determined by the head. In case (i) we get VSO and P-NP; in case (ii) we get SOV and NP-P.

Universal 9. With well more than chance frequency, when question particles or affixes are specified in position by reference to the sentence as a whole, if initial, such elements are found in prepositional languages, and, if final, in postpositional.

Processing strategies similar to those for Universals 3 and 4 may be relevant here, but the analogy is less clear.

Universal 13. If the nominal object always precedes the verb, then verb forms subordinate to the main verb also precede it.

This appears to fall under the BDT, as a generalization of the processing strategies for integrating verbs with their complements.

Universal 18. When the descriptive adjective precedes the noun, the demonstrative and the numeral, with overwhelmingly more than chance frequency, do likewise.

Greenberg's universal is supported by data from WALS. According to Dryer & Haspelmath (2011, 87A–89A), of 803 languages with fixed ordering where there is a distinct demonstrative (as opposed to an affix), 554 (69%) are strictly head-initial or head-final, while 229 (31%) show mixed ordering.

Table 10.1 Ordering of Dem, Num, and Adj with respect to N.

  ADJ/N   DEM/N   NUM/N   #
  ADJ N   DEM N   NUM N   210
  ADJ N   DEM N   N NUM    14
  ADJ N   N DEM   NUM N    11
  ADJ N   N DEM   N NUM    13
  N ADJ   DEM N   NUM N    66
  N ADJ   DEM N   N NUM    79
  N ADJ   N DEM   NUM N    78
  N ADJ   N DEM   N NUM   334

While they may well be distinct categories from the perspective of morphosyntax, demonstratives, numerals, and adjectives all have similar semantic functions—they restrict the reference of a noun to a subset. The subset may be restricted in terms of discourse availability (demonstrative), numerosity (number), or particular properties (adjective). Hence, a plausible routine for constructing a noun phrase might involve first identifying the


relevant noun given the set referred to, then identifying the restriction, and then ordering the restriction before the noun. Multiple restrictions would be more simply expressed by ordering them all before the noun. Ordering some restrictions before the noun and some after on the basis of more refined semantic properties is certainly possible, but requires more attention and selection.²¹

Universal 40. When the adjective follows the noun, the adjective expresses all the inflectional categories of the noun. In such cases, the noun may lack overt expression of one or all of these categories.

In this case, the inflectional marking denotes the grammatical function of the NP. If the noun is unmodified, then it is final in the phrase. If the adjective follows the noun, the adjective is final. Therefore, economy yields general marking of the NP in final position.

10.3.3 Non-Greenbergian universals

I conclude this discussion of typology and universals with some non-Greenbergian candidate (C-)universals that appear to follow from the constructional framework developed in this book. These are rather speculative, and limitations of time and space do not allow a comprehensive and broad cross-linguistic survey to test their validity. For some of these, data is available in WALS. For those where WALS data is not available, I have checked them against a small but diverse set of languages for which detailed analyses are readily found in the literature: Chinese, English, Italian, Jacaltec,²² Japanese, Mohawk,²³ Navajo, Plains Cree, Russian, Tagalog, Warlpiri.²⁴ In this case, the correlations, such as they are, are merely suggestive. These C-universals are all based on the premise that relative homogeneity in constructional form for similar function is preferred over heterogeneity, other things being equal. I have organized them into three rough groups: A′ and related constructions, argument structure, morphology. Most of them are

²¹ I leave open here the question of whether apparent absolute universals reflect anything other than the limits of analogy in generalizing processing routines. For discussion, see Whitman (2008); Dryer (1992); Kiparsky (2008). According to Whitman, even many generalizations that have been classified as 'absolute' in the literature have turned out to have exceptions. Dryer's position is that all of these generalizations are statistical, while Whitman and Kiparsky hold that some of them reflect universal linguistic principles.
²² Craig (1977). ²³ Baker (1996). ²⁴ Hale (1968).


fairly obvious, but are worth stating explicitly, since they are by no means logically necessary.

A′ and related constructions

C-universal 1: A language that uses an A′ construction for direct wh-questions is most likely to use A′ for indirect wh-questions, and a language that uses wh-in-situ for direct wh-questions is most likely to use wh-in-situ for indirect wh-questions.

It makes sense that a language would use the same formal device for direct and indirect wh-questions, but it is not a tautology. The data in Table 10.2 is consistent with the prediction. Note that the designation 'clause-initial' means simply that a wh-question is an apparent filler-gap construction—for example, it has been argued that the wh-question in Plains Cree is a type of cleft construction (Blain 1996). The same may be true for other languages that appear to have A′ constructions.

Table 10.2 Devices for direct and indirect wh-questions.

  Language      Direct wh-Q                           Indirect wh-Q            Predicted?
  Chinese       wh-in-situ                            wh-in-situ               ✓
  English       A′                                    A′                       ✓
  Italian       A′                                    A′                       ✓
  Jacaltec      A′                                    A′                       ✓
  Japanese      wh-in-situ                            wh-in-situ               ✓
  Mohawk        clause-initial²⁵                      clause-initial           ✓
  Navajo        initial or wh-in-situ w/particle -lá  initial w/ particle -lá  ✓
  Plains Cree   clause-initial                        clause-initial           ✓
  Russian       A′                                    A′                       ✓
  Tagalog       A′;²⁶ in situ²⁷                       A′                       ✓
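The consistency claim behind Table 10.2 amounts to a simple check that the direct-question device predicts the indirect-question device. The sketch below transcribes the table into Python, simplifying the Navajo and Tagalog entries to a single device label each; it is an illustration of the check, not a survey tool.

```python
# Check C-universal 1: the device for direct wh-questions predicts
# the device for indirect wh-questions. Data transcribed from
# Table 10.2 (Navajo and Tagalog entries simplified).

WH_DEVICES = {
    "Chinese":     ("wh-in-situ", "wh-in-situ"),
    "English":     ("A-bar", "A-bar"),
    "Italian":     ("A-bar", "A-bar"),
    "Jacaltec":    ("A-bar", "A-bar"),
    "Japanese":    ("wh-in-situ", "wh-in-situ"),
    "Mohawk":      ("clause-initial", "clause-initial"),
    "Navajo":      ("initial w/ -la", "initial w/ -la"),
    "Plains Cree": ("clause-initial", "clause-initial"),
    "Russian":     ("A-bar", "A-bar"),
    "Tagalog":     ("A-bar", "A-bar"),
}

# A language is consistent if it uses the same device in both contexts.
consistent = [lg for lg, (direct, indirect) in WH_DEVICES.items()
              if direct == indirect]
print(len(consistent), "of", len(WH_DEVICES), "languages consistent")
# → 10 of 10 languages consistent
```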

C-universal 2: If a language uses an A′ construction for wh-interrogatives, it is likely to use an A′ construction for relative clauses. If it uses wh-in-situ for wh-interrogatives, it is likely to use head-internal or head-final relative clauses without A′.

²⁵ An NP argument may precede (Baker 1996, 118). ²⁶ Absolutive case only (Aldridge 2002). ²⁷ Law & Gärtner (2005).


This predicted correlation is based on the observation that wh-questions and relative clauses use the Gap construction for the relation between filler and gap. The data in Table 10.3 appears to be consistent with this prediction.

Table 10.3 Direct wh-questions and relative clauses.

  Language      Direct wh-Q                           Relative clause          Predicted?
  Chinese       wh-in-situ                            head-final               ✓
  English       A′                                    A′                       ✓
  Italian       A′                                    A′                       ✓
  Jacaltec      A′                                    A′²⁸                     ✓
  Japanese      wh-in-situ                            head-final               ✓
  Mohawk        clause-initial                        A′ and head-internal²⁹
  Navajo        initial or wh-in-situ w/particle -lá  head-internal & final
  Plains Cree   clause-initial                        relative marker ka:³⁰    (✓)
  Russian       A′                                    A′                       ✓
  Tagalog       A′; in situ                           head-internal & final³¹  (✓)

C-universal 3: Wh-questions are constructions that express operator scope, while topicalization has to do with discourse structure.

If there is no operator in topicalization that corresponds to the wh-operator, as suggested in Chapter 7, we would expect there to be no necessary generalization between the two. So we predict that a language could have an A′ construction for wh-questions, and another type of construction for topicalization. Table 10.4 suggests that this prediction may be correct.

C-universal 4: If a language has a clause-typing particle to indicate a direct yes-no question, it need not appear in the indirect question. But no language marks the indirect question with a particle and not the direct question.

Based on Cheng (1991, 34–5) this appears to be true; the data shown in Table 10.5 from the languages in the current survey appear to be consistent.

²⁸ No relative marker but gap (Craig 1977). ²⁹ Baker (1996, 163).
³⁰ Wolfart (1996, 394). ³¹ Law (2000).


Table 10.4 Devices for expressing wh-questions and topicalization.

  Language      Wh-question                           Topicalization             Predicted?
  Chinese       in situ                               clause-initial             ✓
  English       A′                                    single clause-initial      ✓
  Italian       A′                                    multiple clause-initial    ✓
  Jacaltec      A′                                    clause-initial+resumptive  ✓
  Japanese      in situ                               clause-initial with affix  ✓
  Mohawk        clause-initial, possibly split³²      multiple clause-initial
  Navajo        initial or wh-in-situ w/particle -lá  clause-initial
  Plains Cree   clause-initial                        clause-initial³³
  Russian       A′                                    scrambling
  Tagalog       A′; in situ                           A′

Table 10.5 Marking of direct and indirect yes-no question.

  Language      Particle in direct Q?    Particle in indirect Q?  Predicted?
  Chinese       ma                       none                     ✓
  English       N/A                      N/A
  Italian       N/A                      N/A
  Jacaltec      N/A                      N/A
  Japanese      -ka                      -ka                      ✓
  Mohawk        kʌ³⁴                     kʌ³⁵                     ✓
  Navajo        daʔ…(-ish)³⁶             -go
  Plains Cree   na:³⁷                    none                     ✓
  Russian       N/A                      N/A
  Tagalog       intonation; optional ba  kung 'whether'

C-universal 5: If subject precedes V, then the possessive precedes the N.

Given that the functions of the subject and the possessor are closely related, and in some cases thematically identical (Chomsky 1970), economy will prefer them in the same position with respect to their heads. The data in Table 10.6 from Dryer & Haspelmath (2011, 82A; 86A) appears to confirm this prediction.

³² Baker (1996, 158). ³³ Mühlbauer (2003). ³⁴ Baker (1996, 93, n. 28).
³⁵ Feurer (1977, 33); thanks to Mark Baker for the citation. ³⁶ Schauber (1975). ³⁷ Ellis (2000, 9).


Table 10.6 Ordering of subject and possessor with respect to their heads.

  S-V order   Poss-N order   #     Predicted?
  SV          Poss N         597   ✓
  SV          N Poss         273
  VS          Poss N          18
  VS          N Poss         136   ✓





Morphology

C-universal 6: If head-marking is used in the clause, it is likely to be used in the NP and other phrases, and if dependent-marking is used in the clause, it is likely to be used in the NP and other phrases.

Dryer & Haspelmath (2011) has data regarding marking of core arguments from 236 languages. The distribution is given in Table 10.7. The data appear to confirm what is predicted.

Table 10.7 Distribution of argument marking devices (Dryer & Haspelmath 2011, 25A).

  Marking                                  #
  consistent head-marking                  47
  consistent dependent-marking             46
  consistent head and dependent-marking    16
  consistent zero-marking                   6
  inconsistent or other                   121

The basic data for head-marking from Dryer & Haspelmath (2011, 24A; 23A) are given in Table 10.8.

Table 10.8 Head vs. dependent marking in clause and NP.

  S        NP           #    Predicted?
  HM       HM           49   ✓
  HM       DM            3
  HM       no marking    8
  HM       HM & DM       1
  DM       HM           12
  DM       DM           47   ✓
  DM       no marking    6
  DM       HM & DM       5
  HM & DM  HM & DM      22
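As a quick tally of Table 10.8, the sketch below computes how often the clause-level marking type recurs in the NP, using the counts as given; the ratio is derived here, not quoted from the source.

```python
# Tally Table 10.8: how often does the clause-level marking type
# (head-marking HM, dependent-marking DM, or both) recur in the NP?
# Counts as given in Dryer & Haspelmath (2011, 24A; 23A).

ROWS = [
    ("HM", "HM", 49), ("HM", "DM", 3), ("HM", "none", 8), ("HM", "HM&DM", 1),
    ("DM", "HM", 12), ("DM", "DM", 47), ("DM", "none", 6), ("DM", "HM&DM", 5),
    ("HM&DM", "HM&DM", 22),
]

total = sum(n for _, _, n in ROWS)
matching = sum(n for s, np, n in ROWS if s == np)
print(f"{matching}/{total} languages use the same marking in S and NP")
# → 118/153 languages use the same marking in S and NP
```

The matching rows (HM/HM, DM/DM, and HM&DM/HM&DM) account for the large majority of the sample, which is what C-universal 6 leads us to expect.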


C-universal 7: A language will be uniformly analytic or synthetic across S and NP.

The data in Table 10.9 suggest that this is correct.

Table 10.9 Analytic/synthetic across categories.

  Language      S          NP         Predicted?
  Chinese       analytic   analytic   ✓
  English       analytic   analytic   ✓
  Italian       analytic   analytic   ✓
  Jacaltec      synthetic  synthetic  ✓
  Japanese      analytic   analytic   ✓
  Mohawk        synthetic  synthetic  ✓
  Navajo        synthetic  synthetic  ✓
  Plains Cree   synthetic  synthetic  ✓
  Russian       analytic   analytic   ✓
  Tagalog       analytic   analytic   ✓

C-universal 8: If a language uses a particular type of affixation for case marking, it will use the same type of affixation for number marking. Constructional analogy predicts that a range of grammatical functions will be encoded in the same way. The data in Table 10.10 from Dryer & Haspelmath (2011, 33A; 51A) suggest that this prediction is correct for suffixation; the number of cases of prefixation is too small to permit a judgment.

Table 10.10 Affixation type for case and number.

Case     Number   #     Predicted?
suffix   suffix   273   ✓
suffix   prefix     3
prefix   suffix    14
prefix   prefix    14   ✓



10.4 Summary

Let me review the key points of this chapter. I began with Baker’s invocation of Sapir’s notion of the ‘genius’ of a language, which I call ‘style’. Pursuing a theme developed throughout the book, I observed that while the classical notion of parameter can approximate style, it is too inflexible to adequately capture the variation that we actually encounter.


I proposed that style can be more adequately captured by appealing to analogy, giving it a more precise formulation in terms of economy considerations applied to constructional representations. On this view, constructions are descriptions of the processing schemas that speakers use to compute correspondences between form and meaning. To the extent that different schemas use the same processing steps in the same relation to each other, they are more economical. ‘Analogy’ is the term that I use to refer to the situation in which one schema adopts some of the properties of another. I applied these notions to the grammatical phenomena covered by Baker’s Polysynthesis Parameter and some of Greenberg’s implicational universals. The data covered by these accounts, as well as the data that fall outside of them, appear to be accommodated fairly well in terms of constructional analogy. Finally, I proposed several additional candidate universals that are suggested by the idea that grammatical devices used in one construction are likely to be used in other, more or less related constructions. It is fair to say that ‘constructional analogy’ is not as conceptually elegant as a classical macroparameter. However, the ways that languages seek to maximize economy appear to be more compatible with a constructional formulation, in which there are many opportunities for generalization as well as variation.
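The idea that schemas sharing processing steps are jointly more economical can be given a toy quantitative rendering. In the sketch below, the step names and the cost metric are illustrative assumptions of my own, not part of the book's formalism:

```python
from difflib import SequenceMatcher

# Treat a construction as a sequence of processing steps. Steps shared by two
# constructions in the same relative order are represented only once, so
# shared structure lowers the cost of representing the pair.
declarative = ["match-subject", "match-verb", "link-agent", "linearize-head-last"]
possessive = ["match-possessor", "match-noun", "link-agent", "linearize-head-last"]

def joint_cost(a, b):
    """Total steps needed to represent both schemas, sharing ordered overlap."""
    shared = sum(block.size
                 for block in SequenceMatcher(None, a, b).get_matching_blocks())
    return len(a) + len(b) - shared

print(joint_cost(declarative, possessive))   # shares two steps: 4 + 4 - 2 = 6
print(joint_cost(declarative, declarative))  # full overlap: 4 + 4 - 4 = 4
```

Under such a metric, a pair of constructions that reuse the same devices, the situation constructional analogy favors, costs less to represent jointly than a pair with disjoint devices.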


11 Recapitulation and prospects

I have taken the position in this book that languages can be individuated in terms of their constructions, and that language change and language variation are best accounted for in terms of constructional change and variation. I assume, with MGG, that what we want to account for is the knowledge of language in the minds of speakers. However, I have developed a different view about how grammars are to be characterized and about what the universals of linguistic theory are. Grammars are composed of constructions that explicitly state correspondences between conceptual structure and other aspects of meaning, syntactic structure, and phonological form. The core universals are not syntactic, but universals of conceptual structure. Languages change and differ from one another in terms of the particular grammatical devices that participate in expressing these conceptual structure universals. Syntactic universals, whether absolute or implicational, arise in part as a consequence of economy, that is, the pressure to reduce complexity in representing and computing these correspondences. This reduction of complexity is the basis for one construction being selected over another by language learners. Those constructions that survive this competition are those that are actually attested in the languages of the world. The virtue of characterizing grammars in terms of constructions is that the vocabulary for stating constructions is for the most part uniform across all descriptions. That is, we have a uniform way of describing the generalizations that count as the ‘rules’ of language, the idiosyncrasies of idioms, irregularities and exceptions, the primary data on the basis of which constructions are hypothesized by learners, and grammatical change and variation. The only variable, as discussed in Chapter 2, is the degree of generality of terms on the phon, syn, and cs tiers.
Very specific terms are required for the description of particular constructs, and for lexically restricted correspondences. More general terms are used to state correspondences involving sets of lexical items, subcategories, and general categories.

Language Change, Variation, and Universals: A Constructional Approach. Peter W. Culicover, Oxford University Press. © Peter W. Culicover 2021. DOI: 10.1093/oso/9780198865391.003.0011

In Chapter 3 I argued that the true universals of grammar are those of CS. What is universal, on this view, are the functions and relations that grammars must express, not the formal devices that they use. This assumption frees syntactic theory from the artificial requirement that all superficial structures of all sentences of all languages should be derived from the same, or minimally different, abstract syntactic representations. One welcome consequence is Simpler Syntax (Culicover & Jackendoff 2005)—the syntactic devices no longer have to do the work that semantics and pragmatics must be able to do anyway. But with no universal constraints on syntactic representations, there must be some way of accounting for the fact that the range of grammatical variation is much narrower than what is logically possible. How are arbitrarily complex ways of representing particular CS functions and relations to be ruled out? The answer must be that there are selective pressures that weed out arbitrarily complex correspondences and favor those that are relatively simple.

The question of how to characterize complexity was taken up in Chapter 4. There, I argued for a number of contributors, including the complexity of grammatical representations, the complexity of computing phon-syn correspondences, and the overall complexity of distinguishing the correspondences responsible for licensing sets of similar constructs. Given a characterization of complexity, it is then possible to begin to understand the process by which the more complex correspondences are weeded out and the less complex are favored. In Chapters 5–9 I showed how to view a number of particular linguistic phenomena in such terms. In Chapter 10 I showed how the pressure to reduce complexity favors the use of particular constructional devices across categories and functions in a single language, leading to correlations of grammatical properties and distinctive typological patterns.

With the foregoing as prologue, it is natural to consider whether and how this approach might be extended.
Can a theory of constructional change, using the notions of complexity, economy, and analogy proposed here, account more broadly for typological variation? I believe that it can, no doubt with modifications, refinements, and corrections as the empirical basis is enriched. Given the scope of such a project, a fuller exploration of this possibility must await another occasion. Another domain of potential extension is ‘evolution of language’. In the broadest sense, this term comprises the emergence of language in previously non-linguistic humans, grammatical change over the course of prehistory, and documented change in the historical era. Focusing just on grammatical change, I believe that the constructional perspective might be useful in modeling language evolution as a variety of neo-Darwinian evolution. Notably, Darwin saw languages as similar to species in how they occupy their spaces


and compete with one another, and how they and their components came into being, spread, and died.

Languages, like organic beings, can be classed in groups under groups; and they can be classed either naturally according to descent, or artificially by other characters. Dominant languages and dialects spread widely, and lead to the gradual extinction of other tongues. A language, like a species, when once extinct, never, as Sir C. Lyell remarks, reappears. The same language never has two birth-places. Distinct languages may be crossed or blended together. …We see variability in every tongue, and new words are continually cropping up; but as there is a limit to the powers of the memory, single words, like whole languages, gradually become extinct. [Darwin 1871, 466]

Most striking, from our perspective, is that Darwin saw the strong similarity between the competition among species and their members to survive in the physical world and the competition among languages and their elements—what he called “words and grammatical forms”—to survive in the human social world.

As Max Muller[sic] (’Nature,’ January 6th, 1870, p. 257) has well remarked: – ‘A struggle for life is constantly going on amongst the words and grammatical forms in each language. The better, the shorter, the easier forms are constantly gaining the upper hand, and they owe their success to their own inherent virtue.’ To these more important causes of the survival of certain words, mere novelty and fashion may be added; for there is in the mind of man a strong love for slight changes in all things. The survival or preservation of certain favoured words in the struggle for existence is natural selection. [Darwin 1871, 466]1

This raises some interesting questions. Most importantly, is the analogy based on something real? Is it more than a metaphor? What is it about species and languages that explains the fact that they appear to evolve in much the same way? When we look at the Darwinian theory of evolution in the context of contemporary biology, is the analogy simply suggestive, as it was in Darwin’s day, or do the correspondences go deeper?

The potential role of a constructional approach in addressing these questions is intriguing. Constructional complexity is an appealing analogue to Darwin’s notion of fitness—a construction is fitter than the alternatives if it is more economical. I have already discussed the notion of competition among constructional variants. An analogue to mutation would be generalization and error in the specification of constructional terms in the course of learning. And the analogue to selection would be the spread of one constructional variant through a social network at the expense of others. In fact, the terms of a construction may be seen as analogues to the genes, so that constructions that emerge and persist in the competition can be seen as having their internal makeup determined by their success in reducing complexity. This said, the task of translating this, or any, grammatical formalism into an explanatory evolutionary model is non-trivial, to say the least.2 I hope that the present work will serve as a useful basis for a deeper understanding of the mechanisms governing grammatical change and variation, and in the end, perhaps even the evolution of languages.

1 Darwin appears to have taken a few liberties with Müller’s text. Here is the original:

A much more striking analogy, therefore, than the struggle for life among separate languages, is the struggle for life among words and grammatical forms which is constantly going on in each language. Here the better, the shorter, the easier forms are constantly gaining the upper hand, and they really owe their success to their own inherent virtue. Here if anywhere, we can learn that what is called the process of natural selection, is at the same time, from a higher point of view, a process of rational elimination; for what seems at first sight mere accident in the dropping of old and the rising of new words, can be shown in most cases to be due to intelligible and generally valid reasons.

For further discussion, see Mark Dingemanse’s http://dlc.hypotheses.org/399.

2 Many thanks to an anonymous reviewer for steering me away from this precipice.
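The analogues just listed (complexity as inverse fitness, learner error as mutation, spread through a population as selection) can be sketched as a toy simulation. This is a minimal illustration under parameters of my own choosing, not a serious evolutionary model:

```python
import random

random.seed(0)  # reproducible run

# Two competing constructional variants; lower complexity = higher 'fitness'.
complexity = {"economical": 2.0, "baroque": 5.0}
population = ["economical"] * 50 + ["baroque"] * 50

def generation(population, mutation_rate=0.01):
    # Selection: learners adopt variants in proportion to fitness (1/complexity).
    weights = [1.0 / complexity[v] for v in population]
    offspring = random.choices(population, weights=weights, k=len(population))
    # Mutation: occasionally a learner mis-acquires and picks a variant at random.
    return [random.choice(list(complexity)) if random.random() < mutation_rate else v
            for v in offspring]

for _ in range(100):
    population = generation(population)

# After many generations, the less complex variant dominates the population.
print(population.count("economical"), population.count("baroque"))
```

Even this crude setup reproduces the qualitative pattern Darwin and Müller describe: the shorter, easier form gains the upper hand, with mutation keeping a trickle of variation alive.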



References

Abney, Steven. 1987. The English Noun Phrase in its Sentential Aspect. Cambridge, MA: Department of Linguistics, MIT Ph.D. dissertation.
Aboh, Enoch O. 2015. The Emergence of Hybrid Grammars: Language Contact and Change. Cambridge: Cambridge University Press.
Aboh, Enoch O. & Norval Smith. 2009. Simplicity, simplification, complexity and complexification. In Enoch O. Aboh & Norval Smith (eds.), Complex Processes in New Languages, 1–26. Amsterdam/Philadelphia: John Benjamins Publishing Company.
Ackema, Peter & Ad Neeleman. 2002. Effects of short-term storage in processing rightward movement. In Sieb Nooteboom, Fred Weerman, & Frank Wijnen (eds.), Storage and Computation in the Language Faculty, 219–56. Dordrecht: Kluwer Academic Publishers.
Ackerman, Farrell & John Moore. 2001. Proto-properties and Grammatical Encoding. Stanford, CA: CSLI Publications.
Ahenakew, Freda. 1986. Kiskinahamawâkan-âcimowinisa, vol. 2. Department of Native Studies, University of Manitoba.
Aikhenvald, Alexandra Y. 2003. Mechanisms of change in areal diffusion: New morphology and language contact. Journal of Linguistics 39(1). 1–29.
Aikhenvald, Alexandra Y. 2017. Polysynthetic structures of Lowland Amazonia. In Michael Fortescue, Marianne Mithun, & Nicholas Evans (eds.), The Oxford Handbook of Polysynthesis, 284–311. Oxford: Oxford University Press.
Aikhenvald, Alexandra Y. & R. M. W. Dixon. 2003. Studies in Evidentiality. Amsterdam/Philadelphia: John Benjamins Publishing Company.
Aikhenvald, Alexandra Y., R. M. W. Dixon, & Masayuki Onishi. 2001. Non-canonical Marking of Subjects and Objects. Amsterdam/Philadelphia: John Benjamins Publishing Company.
Aissen, Judith. 1999. Markedness and subject choice in optimality theory. Natural Language and Linguistic Theory 17(4). 673–711.
Aissen, Judith. 2003. Differential object marking: Iconicity vs. economy. Natural Language & Linguistic Theory 21(3). 435–83.
Aldridge, Edith. 2002. Nominalization and wh-movement in Seediq and Tagalog. Language and Linguistics 3(2). 393–426.
Alishahi, Afra & Suzanne Stevenson. 2008. A computational model of early argument structure acquisition. Cognitive Science 32(5). 789–834.
Alishahi, Afra & Suzanne Stevenson. 2010. A computational model of learning semantic roles from child-directed language. Language and Cognitive Processes 25(1). 50–93.
Allen, Cynthia L. 1980. Movement and deletion in Old English. Linguistic Inquiry 11(2). 261–323.
Allen, Cynthia L. 2000. Obsolescence and sudden death in syntax: The decline of verb-final order in early Middle English. In Ricardo Bermúdez-Otero, David Denison, Richard M. Hogg, & C. B. McCully (eds.), Generative Theory and Corpus Studies: A Dialogue from 10 ICEHL, 3–25. Berlin/New York: Mouton de Gruyter.


Allen, Cynthia L. 2006. Case syncretism and word order change. In Ans van Kemenade & Bettelou Los (eds.), The Handbook of the History of English, 201–23. Wiley Online Library.
Ambridge, Ben & Elena Lieven. 2015. A constructivist account of child language acquisition. In Brian MacWhinney & William O’Grady (eds.), The Handbook of Language Emergence, 478–510. Wiley-Blackwell. doi:10.1002/9781118346136.ch22. https://onlinelibrary.wiley.com/doi/abs/10.1002/9781118346136.ch22.
Ambridge, Ben, Julian M. Pine, Caroline F. Rowland, Franklin Chang, & Amy Bidgood. 2013. The retreat from overgeneralization in child language acquisition: Word learning, morphology, and verb argument structure. Wiley Interdisciplinary Reviews: Cognitive Science 4(1). 47–62.
Andrews, Avery D. 2001. Non-canonical A/S marking in Icelandic. In Alexandra Y. Aikhenvald, Robert M. W. Dixon, & Masayuki Onishi (eds.), Non-canonical Marking of Subjects and Objects, 85–112. Amsterdam/Philadelphia: John Benjamins Publishing Company.
Ansaldo, Umberto, Walter Bisang, & Pui Yiu Szeto. 2018. Grammaticalization in isolating languages and the notion of complexity. In Heiko Narrog & Bernd Heine (eds.), Grammaticalization from a Typological Perspective, 219–34. Oxford: Oxford University Press.
Arkadiev, Peter M. 2008. Thematic roles, event structure, and argument encoding in semantically aligned languages. In Mark Donohue & Søren Wichmann (eds.), The Typology of Semantic Alignment, 101–11. Oxford: Oxford University Press.
Arkadiev, Peter M. 2009. Differential argument marking in two-term case systems and its implications for the general theory of case marking. In Helen de Hoop & Peter de Swart (eds.), Differential Subject Marking, 151–71. Dordrecht: Springer.
Armstrong, Nigel & Alan Smith. 2002. The influence of linguistic and social factors on the recent decline of French ne. Journal of French Language Studies 12(01). 23–41.
Arppe, Antti, Jordan Lachler, Trond Trosterud, Lene Antonsen, & Sjur N. Moshagen. 2016. Basic language resource kits for endangered languages: A case study of Plains Cree. CCURL 1–8.
Ashby, William J. 1981. The loss of the negative particle ne in French: A syntactic change in progress. Language. 674–87.
Asher, Nicholas. 2000. Truth conditional discourse semantics for parentheticals. Journal of Semantics 17(1). 31–50.
Augustinus, Liesbeth & Frank Van Eynde. 2017. A usage-based typology of Dutch and German IPP verbs. Leuvense Bijdragen: Tijdschrift voor Germaanse Filologie 101. 101–22.
Auwera, Johan van der. 1999. Periphrastic ‘do’: Typological prolegomena. In Guy A. J. Tops, Betty Devrient, & Steven Geukens (eds.), Thinking English Grammar: To Honour Xavier Dekeyser, Professor Emeritus, 457–70. Leuven/Paris: Peters.
Axel, Katrin. 2005. Null subjects and verb placement in Old High German. In Stephan Kepser & Marga Reis (eds.), Linguistic Evidence: Empirical, Theoretical and Computational Perspectives, 27–48. Berlin/New York: Mouton de Gruyter.
Axel, Katrin. 2007. Studies on Old High German Syntax: Left Sentence Periphery, Verb Placement and Verb-second. Amsterdam/Philadelphia: John Benjamins Publishing Company.
Bach, Emmon, Colin Brown, & William Marslen-Wilson. 1986. Crossed and nested dependencies in German and Dutch: A psycholinguistic study. Language and Cognitive Processes 1(4). 249–62.
Bækken, Bjørg. 1999. Periphrastic do in Early Modern English. Folia Linguistica Historica 33 (Issue Historica-vol-20-1-2). 107–28.
Bækken, Bjørg. 2000. Inversion in Early Modern English. English Studies 81(5). 393–421.
Baker, Mark C. 1996. The Polysynthesis Parameter. Oxford: Oxford University Press.


Baker, Mark C. 2003. Lexical Categories. Cambridge: Cambridge University Press.
Baker, Mark C. 2008. The macroparameter in a microparametric world. In Theresa Biberauer (ed.), The Limits of Syntactic Variation, 351–74. Amsterdam/Philadelphia: John Benjamins Publishing Company.
Baltin, Mark. 1981. Strict bounding. In C. Lee Baker & John McCarthy (eds.), The Logical Problem of Language Acquisition. Cambridge, MA: MIT Press.
Baltin, Mark & Paul M. Postal. 1996. More on reanalysis hypotheses. Linguistic Inquiry 27(1). 127–45.
Baptista, Marlyse. 2009. Economy, innovation and degrees of complexity in creole formation. In Enoch O. Aboh & Norval Smith (eds.), Complex Processes in New Languages, 293–315. Amsterdam/Philadelphia: John Benjamins Publishing Company.
Baptista, Marlyse. 2020. Competition, selection, and the role of congruence in creole genesis and development. Language 96(1). 160–99.
Barlew, Jefferson & Peter W. Culicover. 2015. Minimal constructions. Unpublished manuscript. The Ohio State University.
Barðdal, Jóhanna. 2011. The rise of dative substitution in the history of Icelandic: A diachronic construction grammar account. Lingua 121(1). 60–79.
Barton, Edward G. Jr., Robert C. Berwick, & Eric Sven Ristad. 1987. Computational Complexity and Natural Language. Cambridge, MA: MIT Press.
Baunaz, Lena, Liliane Haegeman, Karen De Clercq, & Eric Lander (eds.). 2018. Exploring Nanosyntax. Oxford: Oxford University Press.
Bean, Marian C. 1983. The Development of Word Order Patterns in Old English. London & Canberra: Croom Helm.
Bech, Kristin. 2001. Word Order Patterns in Old and Middle English: A Syntactic and Pragmatic Study. Bergen: University of Bergen dissertation.
Beck, Sigrid. 2011. Comparison constructions. In Claudia Maienborn, Klaus von Heusinger, & Paul Portner (eds.), Semantics: An International Handbook of Natural Language Meaning, 1341–89. Berlin: Mouton de Gruyter.
Beck, Sigrid, Sveta Krasikova, Daniel Fleischer, Remus Gergel, Stefan Hofstetter, Christiane Savelsberg, John Vanderelst, & Elisabeth Villalta. 2009. Crosslinguistic variation in comparison constructions. Linguistic Variation Yearbook 9(1). 1–66.
Beck, Sigrid, Toshiko Oda, & Koji Sugisaki. 2004. Parametric variation in the semantics of comparison: Japanese and English. Journal of East Asian Linguistics 13(4). 289–344.
Belletti, Adriana & Luigi Rizzi. 1996. Su alcuni casi di accordo del participio passato in francese e in italiano. In Paola Benica, Guglielmo Cinque, T. de Mauro, & Nigel Vincent (eds.), Italiano e dialetti nel tempo. Saggi di grammatica per Giulio Lepschy, 7–22. Rome: Bulzoni.
Bennett, Paul A. 1980. English passives: A study in syntactic change and relational grammar. Lingua 51(2–3). 101–14.
Bergeton, Uffe & Roumyana Pancheva. 2011. A new perspective on the historical development of English intensifiers and reflexives. In Dianne Jonas, John Whitman, & Andrew Garrett (eds.), Grammatical Change: Origins, Nature, Outcomes, 123–38. Oxford: Oxford University Press.
Berwick, Robert C. & Noam Chomsky. 2015. Why Only Us: Language and Evolution. Cambridge, MA: MIT Press.
Berwick, Robert C., Angela D. Friederici, Noam Chomsky, & Johan J. Bolhuis. 2013. Evolution, brain, and the nature of language. Trends in Cognitive Sciences 17(2). 89–98.
Berwick, Robert C. & Partha Niyogi. 1996. Learning from triggers. Linguistic Inquiry 27(4). 605–22.


Bhaskararao, Peri & Karumuri Venkata Subbarao (eds.). 2004. Non-nominative Subjects. Amsterdam/Philadelphia: John Benjamins Publishing Company.
Bhat, D. N. Shankara. 2002. Grammatical Relations: The Evidence against Their Necessity and Universality. London: Routledge.
Bhatia, Tej K. 1993. Punjabi: A Cognitive-Descriptive Grammar. London: Routledge.
Biberauer, Theresa & Ian Roberts. 2017. Conditional inversion and types of parametric change. In Bettelou Los & Pieter de Haan (eds.), Word Order Change in Acquisition and Language Contact: Essays in Honour of Ans van Kemenade, 57–77. Amsterdam/Philadelphia: John Benjamins Publishing Company.
Biberauer, Theresa & Michelle Sheehan. 2013. Theoretical Approaches to Disharmonic Word Order. Oxford: Oxford University Press.
Bickel, Balthasar. 2011. Grammatical relations typology. In Jae Jung Song (ed.), The Oxford Handbook of Linguistic Typology, 399–444. Oxford: Oxford University Press.
Bickerton, Derek. 1984. The language bioprogram hypothesis. Behavioral and Brain Sciences 7(2). 173–88.
Bickerton, Derek. 1988. Creole languages and the bioprogram. In Frederick J. Newmeyer (ed.), Linguistics: The Cambridge Survey (II), 268–84. Cambridge: Cambridge University Press.
Bies, Ann. 1996. Syntax and Discourse Factors in Early New High German: Evidence for Verb-Final Word Order. Philadelphia, PA: University of Pennsylvania dissertation.
Biezma, María. 2008. On the consequences of being small: Imperatives in Spanish. In Anisa Schardl, Martin Walkow, & Muhammad Abdurrahman (eds.), Proceedings of the 38th Annual Meeting of the North Eastern Linguistics Society. Amherst, MA: GLSA.
Birchall, Joshua. 2014. Argument Marking Patterns in South American Languages. Nijmegen: Radboud Universiteit dissertation.
Blain, Eleanor M. 1996. The covert syntax of wh-questions in Plains Cree. In Annual Meeting of the Berkeley Linguistics Society, vol. 22, 25–35. Berkeley, CA: Berkeley Linguistics Society.
Blain, Eleanor M. 1997. Wh-constructions in Nehiyawewes (Plains Cree). Vancouver, B.C.: University of British Columbia dissertation.
Blake, Barry J. 1976. On ergativity and the notion of subject: Some Australian cases. Lingua 39(4). 281–300.
Blevins, James P. 2016. Word and Paradigm Morphology. Oxford: Oxford University Press.
Blevins, James P. & Juliette Blevins. 2009. Introduction: Analogy in grammar. In James P. Blevins & Juliette Blevins (eds.), Analogy in Grammar: Form and Acquisition, 1–12. Oxford: Oxford University Press.
Blümel, Andreas. 2017. Exocentric root declaratives: Evidence from V2. In Leah Bauke, Andreas Blümel, & Erich Groat (eds.), Labels and Roots, 263–90. Boston/Berlin: Walter de Gruyter.
Bobaljik, Jonathan D. 2004. Clustering theories. In Katalin É. Kiss & Henk van Riemsdijk (eds.), Verb Clusters: A Study of Hungarian, German, and Dutch, 121–46. Amsterdam/Philadelphia: John Benjamins Publishing Company.
Bod, Rens. 2006. Exemplar-based syntax: How to get productivity from examples. The Linguistic Review 23. 291–320.
Bod, Rens. 2009. From exemplar to grammar: A probabilistic analogy-based model of language learning. Cognitive Science 33. 752–93.
Boeckx, Cedric. 2011. Approaching parameters from below. In Anna Maria Di Sciullo & Cedric Boeckx (eds.), The Biolinguistic Enterprise: New Perspectives on the Evolution and Nature of the Human Language Faculty, 205–21. Oxford: Oxford University Press.
Bolhuis, Johan J., Ian Tattersall, Noam Chomsky, & Robert C. Berwick. 2014. How could language have evolved? PLoS Biology 12(8). e1001934.


Bossong, Georg. 1985. Empirische Universalienforschung: Differentielle Objektmarkierung in den Neuiranischen Sprachen. Tübingen: Gunter Narr.
Bouchard, Denis. 2009. A solution to the conceptual problem of cartography. In Jeroen van Craenenbroeck (ed.), Alternatives to Cartography, 245–74. Berlin: Mouton de Gruyter.
Brame, Michael K. 1976. Conjectures and Refutations in Syntax and Semantics. Amsterdam: North Holland.
Brandner, Ellen. 2010. On the syntax of verb-initial exclamatives. Studia Linguistica 64(1). 81–115.
Bresnan, Joan. 1971. Contraction and the transformational cycle in English. Unpublished manuscript, MIT.
Bresnan, Joan. 1974. The position of certain clause-particles in phrase structure. Linguistic Inquiry 5(4). 614–19.
Bresnan, Joan. 1977. Variables in the theory of transformations. In Peter W. Culicover, Thomas Wasow, & Adrian Akmajian (eds.), Formal Syntax, 157–96. New York: Academic Press.
Bresnan, Joan & Ronald Kaplan. 1982. Lexical-Functional Grammar: A formal system for grammatical representation. In Joan Bresnan (ed.), The Mental Representation of Grammatical Relations, 173–281. Cambridge, MA: MIT Press.
Briscoe, Edward J. 2000. Evolutionary perspectives on diachronic syntax. In Susan Pintzuk, George Tsoulas, & Anthony Warner (eds.), Diachronic Syntax: Models and Mechanisms, 75–108. Oxford: Oxford University Press.
Broadwell, George Aaron. 2006. A Choctaw Reference Grammar. Lincoln and London: University of Nebraska Press.
Brown, Roger. 1973. A First Language. Cambridge, MA: Harvard University Press.
Burgess, Clifford S., Katarzyna Dziwirek, & Donna B. Gerdts (eds.). 1995. Grammatical Relations: Theoretical Approaches to Empirical Questions. Stanford, CA: CSLI Publications.
Bybee, Joan. 2003. Cognitive processes in grammaticalization. In Michael Tomasello (ed.), The New Psychology of Language: Cognitive and Functional Approaches to Language Structure, Volume II, 145–67. New York: Psychology Press.
Bybee, Joan. 2006. From usage to grammar: The mind’s response to repetition. Language 82(2). 711–33.
Bybee, Joan. 2013. Usage-based theory and exemplar representations of constructions. In Thomas Hoffmann & Graeme Trousdale (eds.), The Oxford Handbook of Construction Grammar, 49–70. Oxford: Oxford University Press.
Campbell, Lyle. 2000. What’s wrong with grammaticalization? Language Sciences 23(2–3). 113–61.
Cecchetto, Carlo. 2007. Some preliminary remarks on a “weak” theory of linearization. Sezione di Lettere 2(1). 1–13.
Cecchetto, Carlo. 2009. Backwards dependencies must be short: A unified account of the final-over-final and the right roof constraints and its consequences for the syntax/morphology interface. In Caterina Donati, Chiara Branchini, Teresa Biberauer, & Ian Roberts (eds.), Challenges to Linearization, 57–92. Berlin: De Gruyter Mouton.
Cecchetto, Carlo, Carlo Geraci, & Sandro Zucchi. 2009. Another way to mark syntactic dependencies: The case for right-peripheral specifiers in sign languages. Language 85(2). 278–320.
Chater, Nick & Paul Vitányi. 2003. Simplicity: A unifying principle in cognitive science? Trends in Cognitive Sciences 7(1). 19–22.
Cheng, Lisa Lai-Shen. 1991. On the Typology of Wh-questions. Cambridge, MA: MIT dissertation.


Cheng, Lisa Lai-Shen. 2009. Wh-in-situ, from the 1980s to now. Language and Linguistics Compass 3(3). 767–91.
Chomsky, Noam. 1957. Syntactic Structures. The Hague: Mouton.
Chomsky, Noam. 1965. Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.
Chomsky, Noam. 1966. Cartesian Linguistics: A Chapter in the History of Rationalist Thought. New York: Harper and Row.
Chomsky, Noam. 1970. Remarks on nominalization. In Roderick A. Jacobs & Peter S. Rosenbaum (eds.), Readings in English Transformational Grammar, 184–221. Waltham, MA: Ginn.
Chomsky, Noam. 1971. Deep structure, surface structure, and semantic interpretation. In Danny Steinberg & Leon Jacobovits (eds.), Semantics in Generative Grammar, 183–216. Cambridge: Cambridge University Press.
Chomsky, Noam. 1972. Problems of Knowledge and Freedom: The Russell Lectures. London: Fontana/Collins.
Chomsky, Noam. 1973. Conditions on transformations. In Stephen Anderson & Paul Kiparsky (eds.), A Festschrift for Morris Halle, 232–86. New York: Holt, Rinehart, & Winston.
Chomsky, Noam. 1976. On the biological basis of language capacities. In Robert Rieber (ed.), The Neuropsychology of Language: Essays in Honor of Eric Lenneberg, 1–24. New York: Plenum Press.
Chomsky, Noam. 1977. On wh-movement. In Peter W. Culicover, Thomas Wasow, & Adrian Akmajian (eds.), Formal Syntax, 71–132. New York: Academic Press.
Chomsky, Noam. 1980. Rules and Representations. New York: Columbia University Press.
Chomsky, Noam. 1981. Lectures on Government and Binding. Dordrecht: Foris.
Chomsky, Noam. 1986. Knowledge of Language. New York: Praeger Publishers.
Chomsky, Noam. 1995a. Bare phrase structure. In Gert Webelhuth (ed.), Government Binding Theory and the Minimalist Program, 383–439. Oxford: Oxford University Press.
Chomsky, Noam. 1995b. The Minimalist Program. Cambridge, MA: MIT Press.
Chomsky, Noam. 2000a. Minimalist inquiries: The framework. In Roger Martin, David Michaels, & Juan Uriagereka (eds.), Step by Step: Essays on Minimalist Syntax in Honor of Howard Lasnik, 89–156. Cambridge, MA: MIT Press.
Chomsky, Noam. 2000b. New Horizons in the Study of Language and Mind. Cambridge: Cambridge University Press.
Chomsky, Noam. 2001. Derivation by phase. In Michael Kenstowicz (ed.), Ken Hale: A Life in Linguistics, 1–52. Cambridge, MA: MIT Press.
Chomsky, Noam. 2005. Three factors in language design. Linguistic Inquiry 36(1). 1–22.
Chomsky, Noam. 2013. Problems of projection. Lingua 130. 33–49.
Chomsky, Noam. 2015. Problems of projection: Extensions. In Elisa Di Domenico, Cornelia Hamann, & Simona Matteini (eds.), Structures, Strategies and Beyond: Studies in Honour of Adriana Belletti, 1–16. Amsterdam: John Benjamins Publishing Company.
Chomsky, Noam, Ángel J. Gallego, & Dennis Ott. 2019. Generative grammar and the faculty of language: Insights, questions, and challenges. Catalan Journal of Linguistics 18. 229–61.
Chomsky, Noam & Howard Lasnik. 1993. Principles and parameter theory. In Arnim von Stechow, Wolfgang Sternefeld, & Theo Vennemann (eds.), Syntax: An International Handbook of Contemporary Research, 506–69. Berlin: Walter de Gruyter.
Chung, Chan & Jong-Bok Kim. 2002. Differences between externally and internally headed relative clause constructions. In Proceedings of HPSG 2002, 3–25. Stanford, CA: CSLI Publications.
Chung, Sandra. 1994. Wh-agreement and “referentiality” in Chamorro. Linguistic Inquiry 25(1). 1–44.

OUP CORRECTED PROOF – FINAL, 19/6/2021, SPi

Cinque, Guglielmo. 1994. On the evidence for partial N movement in the Romance DP. In Guglielmo Cinque, Jan Koster, Jean-Yves Pollock, Luigi Rizzi, & Rafaela Zanuttini (eds.), Paths towards Universal Grammar: Studies in Honour of Richard S. Kayne, 85–110. Washington, DC: Georgetown University Press.
Cinque, Guglielmo. 1999. Adverbs and Functional Heads: A Cross-linguistic Perspective. New York: Oxford University Press.
Cinque, Guglielmo. 2005. Deriving Greenberg’s universal 20 and its exceptions. Linguistic Inquiry 36(3). 315–32.
Cinque, Guglielmo & Luigi Rizzi. 2008. The cartography of syntactic structures. Studies in Linguistics 2. 42–58.
Clark, Brady. 2011. Subjects in early English: Syntactic change as gradual constraint reranking. In Dianne Jonas, John Whitman, & Andrew Garrett (eds.), Grammatical Change: Origins, Nature, Outcomes, 256–74. Oxford: Oxford University Press.
Comrie, Bernard. 1982. Syntactic-morphological discrepancies in Maltese sentence structure. Communication & Cognition 15(4). 281–306.
Condoravdi, Cleo & Ashwini Deo. 2014. Aspect shifts in Indo-Aryan and trajectories of semantic change. In Chiara Gianollo, Agnes Jäger, & Doris Penka (eds.), Language Change at the Syntax-semantics Interface, 261–92. Berlin: De Gruyter Mouton.
Coppola, Marie. 2002. The Emergence of Grammatical Categories in Home Sign: Evidence from Family-based Gesture Systems in Nicaragua. Rochester, NY: University of Rochester dissertation.
Corbett, Greville G. 2015. Morphosyntactic complexity: A typology of lexical splits. Language 91(1). 145–93.
Cournane, Ailís. 2017. In defense of the child innovator. In Éric Mathieu & Robert Truswell (eds.), From Micro Change to Macro Change, 10–24. Oxford: Oxford University Press.
Craenenbroeck, Jeroen van (ed.). 2009. Alternatives to Cartography. Berlin/New York: Mouton de Gruyter.
Craig, Colette Grinevald. 1977. The Structure of Jacaltec. Austin, TX: University of Texas Press.
Creissels, Denis. 2008. Remarks on split intransitivity and fluid intransitivity. In Olivier Bonami & Patricia Cabredo Hofherr (eds.), Empirical Issues in Syntax and Semantics, vol. 7, 139–68. Paris: CNRS.
Crisma, Paola & Giuseppe Longobardi (eds.). 2009. Historical Syntax and Linguistic Theory. Oxford: Oxford University Press.
Croft, William. 1991. A conceptual framework for grammatical categories. Journal of Semantics 7. 245–80.
Croft, William. 2000. Explaining Language Change: An Evolutionary Approach. Harlow: Pearson Education Ltd.
Croft, William. 2001. Radical Construction Grammar: Syntactic Theory in Typological Perspective. Oxford & New York: Oxford University Press.
Culicover, Peter W. 1971. Syntactic and Semantic Investigations. Cambridge, MA: MIT dissertation.
Culicover, Peter W. 1973. On the coherence of syntactic descriptions. Journal of Linguistics 9. 35–51.
Culicover, Peter W. 1993. Evidence against ECP accounts of the that-t effect. Linguistic Inquiry 24(3). 557–61.
Culicover, Peter W. 1999. Syntactic Nuts: Hard Cases, Syntactic Theory, and Language Acquisition. Oxford: Oxford University Press.


Culicover, Peter W. 2000. Language acquisition and the architecture of the language faculty. In Miriam Butt & Tracy Holloway King (eds.), Proceedings of the Berkeley Formal Grammar Conference Workshop. Stanford, CA: CSLI Publications.
Culicover, Peter W. 2005. Squinting at Dali’s Lincoln: On how to think about language. In Proceedings of the Annual Meeting of the Chicago Linguistic Society, vol. 41(2), 109–28. Chicago: Chicago Linguistic Society.
Culicover, Peter W. 2008. The rise and fall of constructions and the history of English do-support. Journal of Germanic Linguistics 20(1). 1–52.
Culicover, Peter W. 2009. Natural Language Syntax. Oxford: Oxford University Press.
Culicover, Peter W. 2011. A reconsideration of English relative clause constructions. Constructions 2. 1–14.
Culicover, Peter W. 2013a. English (zero-)relatives and the competence-performance distinction. International Review of Pragmatics 5. 253–70.
Culicover, Peter W. 2013b. Explaining Syntax. Oxford: Oxford University Press.
Culicover, Peter W. 2013c. Grammar and Complexity: Language at the Intersection of Competence and Performance. Oxford: Oxford University Press.
Culicover, Peter W. 2014. Constructions, complexity and word order variation. In Frederick J. Newmeyer & Laurel B. Preston (eds.), Measuring Linguistic Complexity, 148–78. Oxford: Oxford University Press.
Culicover, Peter W. 2017. Cryptoconstructionalism. In Keir Moulton & Anne-Michelle Tessier (eds.), Festschrift for Kyle Johnson. University of Massachusetts at Amherst.
Culicover, Peter W., Afra Alishahi, & Elena Vaiksnoraite. 2016. The constructional evolution of grammatical functions. Workshop on Events in Language and Cognition, University of Florida.
Culicover, Peter W. & Ray Jackendoff. 2005. Simpler Syntax. Oxford: Oxford University Press.
Culicover, Peter W. & Ray Jackendoff. 2012. A domain-general cognitive relation and how language expresses it. Language 88(2). 305–40.
Culicover, Peter W., Ray Jackendoff, & Jenny Audring. 2017. Multiword constructions in the grammar. Topics in Cognitive Science 9(3). 552–68.
Culicover, Peter W. & Robert D. Levine. 2001. Stylistic inversion in English: A reconsideration. Natural Language & Linguistic Theory 19(2). 283–310.
Culicover, Peter W. & Andrzej Nowak. 2002. Markedness, antisymmetry and complexity of constructions. Linguistic Variation Yearbook 2(1). 5–30.
Culicover, Peter W. & Andrzej Nowak. 2003. Dynamical Grammar: Minimalism, Acquisition and Change. Oxford: Oxford University Press.
Culicover, Peter W. & Wendy Wilkins. 1986. Control, PRO, and the projection principle. Language 62(1). 120–53.
Culicover, Peter W. & Susanne Winkler. 2008. English focus inversion. Journal of Linguistics 44. 625–58.
Culicover, Peter W. & Susanne Winkler. 2018. Freezing, between grammar and processing. In Susanne Winkler, Jutta Hartmann, & Andreas Konietzko (eds.), Freezing, 353–86. Berlin: De Gruyter Mouton.
Culicover, Peter W. & Susanne Winkler. 2019. Why topicalize VP? In Verner Egerland, Valeria Molnar, & Susanne Winkler (eds.), The Architecture of Topic. Berlin: Walter de Gruyter.
Curry, Haskell B. 1963. Some logical aspects of grammatical structure. In Roman Jakobson (ed.), Structure of Language and its Mathematical Aspects: Proceedings of the Twelfth Symposium in Applied Mathematics, 56–68. American Mathematical Society.


Dahl, Östen. 2004. The Growth and Maintenance of Linguistic Complexity. Amsterdam/Philadelphia: John Benjamins Publishing Company.
Dahl, Östen. 2009. Increases in complexity as a result of language contact. In Kurt Braunmüller & Juliane House (eds.), Convergence and Divergence in Language Contact Situations, 41–52. Amsterdam/Philadelphia: John Benjamins Publishing Company.
Dahlstrom, Amy. 1991. Plains Cree Morphosyntax. New York: Garland Publishing, Inc.
Dahlstrom, Amy. 2009. Obj without Obj: A typology of Meskwaki objects. In Miriam Butt & Tracy Holloway King (eds.), Proceedings of the LFG09 Conference, 223–39. Stanford, CA: CSLI Publications.
Dahlstrom, Amy. 2013. Argument structure of quirky Algonquian verbs. In Tracy Holloway King & Valeria de Paiva (eds.), From Quirky Case to Representing Space: Papers in Honor of Annie Zaenen, 61–71. Stanford, CA: CSLI Publications.
Dahlstrom, Amy. 2017. Seeking consensus on the fundamentals of Algonquian word order. In Monica Macaulay & Margaret Noodin (eds.), Papers of the Forty-fifth Algonquian Conference, 59–72. East Lansing, MI: Michigan State University Press.
Dahlstrom, Amy. 2019. Embedded questions in Meskwaki: Syntax and information structure. In Monica Macaulay & Margaret Noodin (eds.), Papers of the Forty-eighth Algonquian Conference, 69–86. East Lansing, MI: Michigan State University Press.
Darwin, Charles. 1871. The Descent of Man. New York: Modern Library, Random House.
Dautriche, Isabelle, Alejandrina Cristia, Perrine Brusini, Sylvia Yuan, Cynthia Fisher, & Anne Christophe. 2014. Toddlers default to canonical surface-to-meaning mapping when learning verbs. Child Development 85(3). 1168–80.
Davis, Henry. 2001. On negation in Salish. In Proceedings of the 36th International Conference on Salish and Neighbouring Languages, Chilliwack, BC, 8–10.
Davis, Henry. 2005. On the syntax and semantics of negation in Salish. International Journal of American Linguistics 71(1). 1–55.
Dayal, Veneeta. 2017. Multiple wh-questions. In Martin Everaert & Henk van Riemsdijk (eds.), The Wiley Blackwell Companion to Syntax, Second Edition, 1–54. Wiley Online Library.
De Smet, Hendrik. 2013. Spreading Patterns: Diffusional Change in the English System of Complementation. Oxford: Oxford University Press.
De Smet, Hendrik & Olga Fischer. 2017. The role of analogy in language change: Supporting constructions. In Marianne Hundt, Sandra Mollin, & Simone E. Pfenninger (eds.), The Changing English Language: Psycholinguistic Perspectives, 240–68. Cambridge: Cambridge University Press.
Deo, Ashwini. 2015. The semantic and pragmatic underpinnings of grammaticalization paths: The progressive to imperfective shift. Semantics and Pragmatics 8. 1–52.
Deutscher, Guy. 2000. Syntactic Change in Akkadian: The Evolution of Sentential Complementation. Oxford: Oxford University Press.
Dik, Simon C. 1978. Functional Grammar. Amsterdam: North Holland.
Dixon, R. M. W. 1982. Where Have All the Adjectives Gone? And Other Essays in Semantics and Syntax. Berlin: Walter de Gruyter.
Dixon, R. M. W. 2004. Adjective classes in typological perspective. In R. M. W. Dixon & Alexandra Y. Aikhenvald (eds.), Adjective Classes: A Crosslinguistic Typology, 1–49. Oxford: Oxford University Press.
Donohue, Mark. 2008. Semantic alignment systems: What’s what, and what’s not. In Mark Donohue & Søren Wichmann (eds.), The Typology of Semantic Alignment, 24–75. Oxford: Oxford University Press.


Dowty, David. 1980. Comments on the paper by Bach and Partee. In J. Kreiman & A. Ojeda (eds.), Papers from the Parasession on Pronouns and Anaphora, 29–40. Chicago: Chicago Linguistic Society.
Dowty, David. 1989. On the semantic content of the notion of ‘thematic role’. In Gennaro Chierchia, Barbara H. Partee, & Raymond Turner (eds.), Properties, Types and Meaning: Semantic Issues, 69–129. Dordrecht: Kluwer Academic Publishers.
Dowty, David. 1991. Thematic proto-roles and argument selection. Language 67. 547–619.
Dryer, Matthew S. 1992. The Greenbergian word order correlations. Language 68(1). 81–138.
Dryer, Matthew S. 1997. Are grammatical relations universal? In Joan Bybee, John Haiman, & Sandra A. Thompson (eds.), Essays on Language Function and Language Type, 115–43. Amsterdam/Philadelphia: John Benjamins Publishing Company.
Dryer, Matthew S. & Martin Haspelmath. 2011. The World Atlas of Language Structures Online. Munich: Max Planck Digital Library. http://wals.info.
Dufter, Andreas, Jürg Fleischer, & Guido Seiler (eds.). 2009. Describing and Modeling Variation in Grammar. Berlin: Mouton de Gruyter.
Dum-Tragut, Jasmine. 2009. Armenian: Modern Eastern Armenian. Amsterdam/Philadelphia: John Benjamins Publishing Company.
Durie, Mark. 1988. Preferred argument structure in an active language: Arguments against the category ‘intransitive subject’. Lingua 74(1). 1–25.
Dziwirek, Katarzyna, Patrick Farrell, & Errapel Mejías-Bikandi. 1990. Grammatical Relations: A Cross-theoretical Perspective. Stanford, CA: CSLI Publications.
Eckardt, Regine. 2006. Meaning Change in Grammaticalization: An Enquiry into Semantic Reanalysis. Oxford: Oxford University Press.
Ellegård, Alvar. 1953. The Auxiliary Do: The Establishment and Regulation of its Use in English. Gothenburg Studies in English. Stockholm: Almqvist and Wiksell.
Ellis, C. Douglas. 2000. Spoken Cree, Level 1: West Coast of James Bay. Edmonton: University of Alberta Press.
Embick, David & Rolf Noyer. 2007. Distributed morphology and the syntax/morphology interface. In Gillian Ramchand & Charles Reiss (eds.), The Oxford Handbook of Linguistic Interfaces, 289–324. Oxford: Oxford University Press.
Emonds, Joseph E. 1985. A Unified Theory of Syntactic Categories. Dordrecht: Foris.
Engdahl, Elisabet & Anu Laanemets. 2015. Prepositional passives in Danish, Norwegian and Swedish: A corpus study. Nordic Journal of Linguistics 38(3). 285–337.
Evans, Nicholas. 2017. Polysynthesis in Northern Australia. In Michael Fortescue, Marianne Mithun, & Nicholas Evans (eds.), The Oxford Handbook of Polysynthesis, 312–35. Oxford: Oxford University Press.
Evans, Nicholas & Stephen C. Levinson. 2009. The myth of language universals: Language diversity and its importance for cognitive science. Behavioral and Brain Sciences 32(5). 429–48.
Everett, Daniel L. 2012. What does Pirahã grammar have to teach us about human language and the mind? Wiley Interdisciplinary Reviews: Cognitive Science 3(6). 555–63.
Evers, Arnold. 1975. The Transformational Cycle in Dutch and German. Bloomington, IN: Indiana University Linguistics Club.
Fanselow, Gisbert. 2017. Partial wh-movement. In Martin Everaert & Henk van Riemsdijk (eds.), The Wiley Blackwell Companion to Syntax, 437–92. Wiley Online Library.
Fanselow, Gisbert & Anoop Mahajan. 2000. Towards a minimalist theory of wh-expletives, wh-copying, and successive cyclicity. In Uli Lutz, Gereon Müller, & Arnim von Stechow (eds.), Wh-scope Marking, 195–230. Amsterdam: John Benjamins Publishing Company.
Fauconnier, Stefanie. 2012. Constructional Effects of Involuntary and Inanimate Agents: A Cross-linguistic Study. Leuven: Katholieke Universiteit Leuven dissertation.
Feurer, Hanny. 1977. Questions and Answers in Mohawk Conversation. Montreal: McGill University dissertation.
Filipović, Luna & John A. Hawkins. 2013. Multiple factors in second language acquisition: The CASP model. Linguistics 51(1). 145–76.
Filipović, Luna & John A. Hawkins. 2019. The complex adaptive system principles model for bilingualism: Language interactions within and across bilingual minds. International Journal of Bilingualism 23(6). 1223–48.
Fillmore, Charles J. 1968. The case for case. In Emmon Bach & Robert T. Harms (eds.), Universals in Linguistic Theory, 1–88. New York: Holt, Rinehart, and Winston.
Fillmore, Charles J. 1988. The mechanisms of construction grammar. In Proceedings of the Fourteenth Annual Meeting of the Berkeley Linguistics Society, 35–55. Berkeley, CA: Berkeley Linguistics Society.
Fischer, Olga. 2013. An inquiry into unidirectionality as a foundational element of grammaticalization: On the role played by analogy and the synchronic grammar system in processes of language change. Studies in Language 37(3). 515–33.
Fischer, Olga, Ans van Kemenade, Willem Koopman, & Wim van der Wurff. 2000. The Syntax of Early English. Cambridge: Cambridge University Press.
Fisher, Cynthia, Yael Gertner, Rose M. Scott, & Sylvia Yuan. 2010. Syntactic bootstrapping. Wiley Interdisciplinary Reviews: Cognitive Science. doi:10.1002/wcs.17.
Fitch, W. Tecumseh. 2011. The evolution of syntax: An exaptationist perspective. Frontiers in Evolutionary Neuroscience 3. 9.
Fodor, Janet Dean. 1998. Unambiguous triggers. Linguistic Inquiry 29(1). 1–36.
Fortescue, Michael. 2017. What are the limits of polysynthesis? In Michael Fortescue, Marianne Mithun, & Nicholas Evans (eds.), The Oxford Handbook of Polysynthesis. Oxford: Oxford University Press.
Fortuin, Egbert & Ico Davids. 2013. Subordinate clause prolepsis in Russian. Russian Linguistics 37(2). 125–55.
Fried, Mirjam. 2009. Construction grammar as a tool for diachronic analysis. Constructions and Frames 1(2). 262–91.
Friederici, Angela D., Noam Chomsky, Robert C. Berwick, Andrea Moro, & Johan J. Bolhuis. 2017. Language, mind and brain. Nature Human Behaviour 1(10). 713–22.
Gaby, Alice. 2005. Some participants are more equal than others: Case and the composition of arguments in Kuuk Thaayorre. In Mengistu Amberber & Helen de Hoop (eds.), Competition and Variation in Natural Languages: The Case for Case, 9–39. Amsterdam: Elsevier.
Gahl, Susanne & Alan C. L. Yu. 2006. Introduction to the special issue on exemplar-based models in linguistics. The Linguistic Review 23(3). 213–16.
Ganenkov, Dmitry, Timur Maisak, & Solmaz Merdanova. 2009. Non-canonical agent marking in Agul. In Helen de Hoop & Peter de Swart (eds.), Differential Subject Marking, 173–98. Dordrecht: Springer.
García, M. 2007. Differential object marking with inanimate objects. In Georg A. Kaiser & Manuel Leonetti (eds.), Proceedings of the Workshop ‘Definiteness, Specificity and Animacy in Ibero-Romance Languages’, 63–84.
Gärtner, Hans-Martin. 2000. Are there V2 relative clauses in German? The Journal of Comparative Germanic Linguistics 3(2). 97–141.


Gärtner, Hans-Martin. 2002. On the force of V2 declaratives. Theoretical Linguistics 28(1). 33–42.
Gast, Volker. 2002. Review of Elly van Gelderen, “A history of English reflexive pronouns: Person, self, and interpretability”. Language 78(3). 583–5.
Gelderen, Elly van. 2000. A History of English Reflexive Pronouns: Person, Self, and Interpretability. Amsterdam/Philadelphia: John Benjamins Publishing Company.
Gibson, Edward. 1998. Linguistic complexity: Locality of syntactic dependencies. Cognition 68(1). 1–76.
Gibson, Edward. 2000. The dependency locality theory: A distance-based theory of linguistic complexity. In Alec Marantz, Yasushi Miyashita, & Wayne O’Neil (eds.), Image, Language, Brain, 95–126. Cambridge, MA: MIT Press.
Gibson, Edward & Kenneth Wexler. 1992. Parameter setting, triggers and V2. GLOW Newsletter 28. 16–17.
Gibson, Edward & Kenneth Wexler. 1994. Triggers. Linguistic Inquiry 25(4). 407–54.
Ginzburg, Jonathan & Ivan A. Sag. 2000. Interrogative Investigations. Stanford, CA: CSLI Publications.
Gisborne, Nikolas & Amanda Patten. 2011. Construction grammar and grammaticalization. In Heiko Narrog & Bernd Heine (eds.), The Oxford Handbook of Grammaticalization, 92–104. Oxford: Oxford University Press.
Givón, Talmy. 1997. Grammatical Relations: A Functionalist Perspective. Amsterdam/Philadelphia: John Benjamins Publishing Company.
Givón, Talmy. 2009. The Genesis of Syntactic Complexity: Diachrony, Ontogeny, Neurocognition, Evolution. Amsterdam: John Benjamins Publishing Company.
Givón, Talmy & Masayoshi Shibatani. 2009. Syntactic Complexity: Diachrony, Acquisition, Neuro-cognition, Evolution. Amsterdam/Philadelphia: John Benjamins Publishing Company.
Gleitman, Lila. 1990. The structural sources of verb meanings. Language Acquisition 1(1). 3–55.
Gleitman, Lila & Barbara Landau. 1994. The Acquisition of the Lexicon. Cambridge, MA: MIT Press.
Gleitman, Lila, Kimberly Cassidy, Rebecca Nappa, Anna Papafragou, & John C. Trueswell. 2005. Hard words. Language Learning and Development 1(1). 23–64.
Goh, Gwang-Yoon. 2000. The Synchrony and Diachrony of the English Prepositional Passive: Form, Meaning, and Function. Columbus, OH: The Ohio State University dissertation.
Goldberg, Adele E. 1995. Constructions: A Construction Grammar Approach to Argument Structure. Chicago: University of Chicago Press.
Goldberg, Adele E. 2003. Constructions: A new theoretical approach to language. Trends in Cognitive Sciences 7(5). 219–24.
Goldberg, Adele E. 2013. Constructionist approaches. In Thomas Hoffmann & Graeme Trousdale (eds.), The Oxford Handbook of Construction Grammar, 15–31. Oxford: Oxford University Press.
Goldberg, Adele E. 2019. Explain Me This: Creativity, Competition, and the Partial Productivity of Constructions. Princeton, NJ: Princeton University Press.
Goldberg, Adele E. & Alex Del Giudice. 2005. Subject-auxiliary inversion: A natural category. The Linguistic Review 22(2–4). 411–28.
Goldin-Meadow, Susan. 2005. What language creation in the manual modality tells us about the foundations of language. The Linguistic Review 22(2–4). 199–225.
Goldin-Meadow, Susan & Carolyn Mylander. 1998. Spontaneous sign systems created by deaf children in two cultures. Nature 391. 279–81.


Greenberg, Joseph H. 1966. Some universals of grammar with particular reference to the order of meaningful elements. In Joseph H. Greenberg (ed.), Universals of Language, 2nd edn, 73–113. Cambridge, MA: MIT Press.
Greenberg, Joseph H. 1978. How does a language acquire gender markers? Universals of Human Language 3. 47–82.
Grimm, Scott. 2011. Semantics of case. Morphology 21(3–4). 515–44.
Gruber, Jeffrey S. 1965. Studies in Lexical Relations. Cambridge, MA: MIT dissertation.
Haegeman, Liliane & Henk van Riemsdijk. 1986. Verb projection raising, scope, and the typology of verb movement rules. Linguistic Inquiry 17(3). 417–66.
Haider, Hubert. 2010. Wie wurde Deutsch OV? In Arne Ziegler (ed.), Historische Textgrammatik und historische Syntax des Deutschen, 11–32. Berlin/New York: De Gruyter.
Hale, Kenneth L. 1968. Preliminary observations concerning the order of constituents in Walbiri sentences. Unpublished manuscript, 35 pp., MIT, Cambridge, MA.
Halle, Morris & Alec Marantz. 1993. Distributed morphology and the pieces of inflection. In Kenneth Hale & Samuel Jay Keyser (eds.), The View from Building 20, 111–76. Cambridge, MA: MIT Press.
Halle, Morris & Alec Marantz. 1994. Some key features of distributed morphology. In Andrew Carnie & Heidi Harley (eds.), MITWPL 21: Papers on Phonology and Morphology, 275–88. Cambridge, MA: MIT.
Han, Chung-hye. 1999. Cross-linguistic variation in the compatibility of negation and imperatives. In K. Shahin, S. Blake, & E.-S. Kim (eds.), Proceedings of the Seventeenth West Coast Conference on Formal Linguistics, 265–79. Cambridge: Cambridge University Press.
Han, Chung-hye. 2000. The Structure and Interpretation of Imperatives: Mood and Force in Universal Grammar. New York: Garland Publishing, Inc.
Han, Chung-hye. 2001. Force, negation and imperatives. The Linguistic Review 18. 289–325.
Hana, Jiri & Peter W. Culicover. 2008. Morphological complexity outside of universal grammar. Ohio State University Working Papers in Linguistics. Reprinted in Peter W. Culicover (2013), Explaining Syntax, 84–108. Oxford: Oxford University Press.
Harley, Heidi. 2014. On the identity of roots. Theoretical Linguistics 40(3–4). 225–76.
Harley, Heidi & Rolf Noyer. 1999. Distributed morphology. Glot International 4(4). 3–9.
Harrigan, Atticus G., Katherine Schmirler, Antti Arppe, Lene Antonsen, Trond Trosterud, & Arok Wolvengrey. 2017. Learning from the computational modelling of Plains Cree verbs. Morphology 27(4). 565–98.
Harris, Alice C. 2008. On the explanation of typologically unusual structures. In Jeff Good (ed.), Linguistic Universals and Language Change, 54–80. Oxford: Oxford University Press.
Harris, Alice C. & Lyle Campbell. 1995. Historical Syntax in Cross-linguistic Perspective. Cambridge: Cambridge University Press.
Harris, Zellig S. 1951. Methods in Structural Linguistics. Chicago: University of Chicago Press.
Harris, Zellig S. 1957. Co-occurrence and transformation in linguistic structure. Language 33(3). 283–340.
Haspelmath, Martin. 2001. Non-canonical marking of core arguments in European languages. In Alexandra Y. Aikhenvald (ed.), Non-canonical Marking of Subjects and Objects, 53–84. Amsterdam/Philadelphia: John Benjamins Publishing Company.
Haspelmath, Martin. 2008. Parametric versus functional explanations of syntactic universals. In Theresa Biberauer (ed.), The Limits of Syntactic Variation, 75–108. Amsterdam: John Benjamins Publishing Company.


Haspelmath, Martin & Andrea Sims. 2013. Understanding Morphology. London: Routledge.
Haugen, Jason D. 2011. On the gradual development of polysynthesis in Nahuatl. In Dianne Jonas, John Whitman, & Andrew Garrett (eds.), Grammatical Change: Origins, Nature, Outcomes, 315–31. Oxford: Oxford University Press.
Hauser, Marc D., Noam Chomsky, & W. Tecumseh Fitch. 2002. The faculty of language: What is it, who has it, and how did it evolve? Science 298(5598). 1569–79.
Hawkins, John A. 1994. A Performance Theory of Order and Constituency. Cambridge: Cambridge University Press.
Hawkins, John A. 2004. Efficiency and Complexity in Grammars. Oxford: Oxford University Press.
Hawkins, John A. 2013. Disharmonic word orders from a processing efficiency perspective. In Theresa Biberauer & Michelle Sheehan (eds.), Theoretical Approaches to Disharmonic Word Orders, 391–406. Oxford: Oxford University Press.
Hawkins, John A. 2014. Cross-linguistic Variation and Efficiency. Oxford: Oxford University Press.
Heath, Jeffrey. 1977. Choctaw cases. In Proceedings of the Third Annual Meeting of the Berkeley Linguistics Society, 204–13. Berkeley, CA: Berkeley Linguistics Society.
Heine, Bernd. 2005. On reflexive forms in creoles. Lingua 115(3). 201–57.
Henry, Alison. 2008. Variation and syntactic theory. In J. K. Chambers, Peter Trudgill, & Natalie Schilling-Estes (eds.), The Handbook of Language Variation and Change, 267–82. Oxford: John Wiley & Sons.
Hinterhölzl, Roland. 2015. An interface account of word-order variation in Old High German. In Theresa Biberauer & George Walkden (eds.), Syntax over Time: Lexical, Morphological, and Information-structural Interactions, 299–317. Oxford: Oxford University Press.
Hinterhölzl, Roland. 2017. From OV to VO in English: How to Kroch the nut. In Bettelou Los & Pieter de Haan (eds.), Word Order Change in Acquisition and Language Contact: Essays in Honour of Ans van Kemenade, 9–34. Amsterdam/Philadelphia: John Benjamins Publishing Company.
Hinterhölzl, Roland & Svetlana Petrova. 2009. Information Structure and Language Change: New Approaches to Word Order Variation in Germanic. Berlin/New York: Walter de Gruyter.
Hirose, Tomio. 2004. Origins of Predicates: Evidence from Plains Cree. London: Routledge.
Hofmeister, Philip, Peter W. Culicover, & Susanne Winkler. 2015. Effects of processing on the acceptability of frozen extraposed constituents. Syntax 18. 464–83.
Hofmeister, Philip & Ivan A. Sag. 2010. Cognitive constraints and island effects. Language 86. 366–415.
Hofmeister, Philip, Laura Staum Casasanto, & Ivan A. Sag. 2013. Islands in the grammar? Standards of evidence. In Jon Sprouse & Norbert Hornstein (eds.), Experimental Syntax and the Islands Debate, 42–63. Cambridge: Cambridge University Press.
Holton, Gary. 2008. The rise and fall of semantic alignment in North Halmahera, Indonesia. In Mark Donohue & Søren Wichmann (eds.), The Typology of Semantic Alignment, 252–76. Oxford: Oxford University Press.
Hoop, Helen de & Bhuvana Narasimhan. 2005. Differential case-marking in Hindi. In Mengistu Amberber & Helen de Hoop (eds.), Competition and Variation in Natural Languages: The Case for Case, 321–45. Amsterdam: Elsevier.
Hoop, Helen de & Bhuvana Narasimhan. 2009. Ergative case-marking in Hindi. In Helen de Hoop & Peter de Swart (eds.), Differential Subject Marking, 63–78. Dordrecht: Springer.


Hoop, Helen de & Peter de Swart (eds.). 2009. Differential Subject Marking. Dordrecht: Springer.
Hoop, Helen de & Peter de Swart. 2009. Cross-linguistic variation in differential subject marking. In Helen de Hoop & Peter de Swart (eds.), Differential Subject Marking, 1–16. Dordrecht: Springer.
Hopper, Paul J. & Sandra A. Thompson. 1980. Transitivity in grammar and discourse. Language 56. 251–99.
Hornstein, Norbert & Amy Weinberg. 1981. Case theory and preposition stranding. Linguistic Inquiry 12(1). 55–91.
Hornung, Annette. 2017. English: The Grammar of the Danelaw. Arizona State University dissertation.
Horvath, Julia. 1997. The status of ‘wh-expletives’ and the partial wh-movement construction of Hungarian. Natural Language and Linguistic Theory 15(3). 509–72.
Huang, C.-T. James. 1982. Logical Relations in Chinese and the Theory of Grammar. Cambridge, MA: MIT dissertation.
Hukari, Thomas E. & Robert D. Levine. 1995. Adjunct extraction. Journal of Linguistics 31(2). 195–226.
Huybregts, M. A. C. 1984. The weak inadequacy of context-free phrase structure grammars. In Gert de Haan, Mieke Trommelen, & Wim Zonneveld (eds.), Van Periferie naar Kern, 81–99. Dordrecht: Foris.
Isac, Daniela. 2015. The Morphosyntax of Imperatives. Oxford: Oxford University Press.
Itkonen, Esa (ed.). 2005. Analogy as Structure and Process: Approaches in Linguistics, Cognitive Psychology and Philosophy of Science. Amsterdam/Philadelphia: John Benjamins Publishing Company.
Jackendoff, Ray. 1972. Semantic Interpretation in Generative Grammar. Cambridge, MA: MIT Press.
Jackendoff, Ray. 1983. Semantics and Cognition. Cambridge, MA: MIT Press.
Jackendoff, Ray. 1987. The status of thematic relations in linguistic theory. Linguistic Inquiry 18(3). 369–411.
Jackendoff, Ray. 1990. Semantic Structures. Cambridge, MA: MIT Press.
Jackendoff, Ray. 1997. The Architecture of the Language Faculty. Cambridge, MA: MIT Press.
Jackendoff, Ray. 2002. Foundations of Language: Brain, Meaning, Grammar, Evolution. Oxford: Oxford University Press.
Jackendoff, Ray & Jenny Audring. 2020. The Texture of the Lexicon. Oxford: Oxford University Press.
Jacobson, Pauline. 1984. Connectivity in phrase structure grammar. Natural Language and Linguistic Theory 1(4). 535–81.
Jacobson, Pauline. 1992. Flexible categorial grammars: Questions and prospects. In Robert D. Levine (ed.), Formal Grammar: Theory and Implementation, 129–67. Oxford: Oxford University Press.
Jäger, Agnes. 2018. On the history of the IPP. In Agnes Jäger, Gisella Ferraresi, & Helmut Weiß (eds.), Clause Structure and Word Order in the History of German, 302–23. Oxford: Oxford University Press.
Jäger, Andreas. 2005. The cross-linguistic function of obligatory ‘do’-periphrasis. In Ilana Mushin (ed.), Proceedings of the 2004 Conference of the Australian Linguistic Society. Sydney: Australian Linguistic Society. (Available from http://ses.library.usyd.edu.au/bitstream/2123/111/1ALS-20050630AJ.pdf.)
Jäger, Andreas. 2006. Typology of Periphrastic ‘Do’-constructions. Bochum: Brockmeyer.


Jäger, Andreas. 2007a. Grammaticalization paths of periphrastic ‘do’-constructions. Studies van de Belgische Kring voor Linguïstiek 1–18.
Jäger, Gerhard. 2007b. Evolutionary game theory and typology: A case study. Language 83(1). 74–109.
Jelinek, Eloise. 1984. Empty categories, case, and configurationality. Natural Language and Linguistic Theory 2(1). 39–76.
Jelinek, Eloise. 1985. The projection principle and the argument type parameter. Unpublished manuscript, University of Arizona.
Jenks, Peter & Sharon Rose. 2015. Mobile object markers in Moro: The role of tone. Language 91(2). 269–307.
Jespersen, Otto. 1909. Progress in Language: With Special Reference to English. London: George Allen, Limited.
Jespersen, Otto. 1917. Negation in English and Other Languages. København: A.F. Høst.
Johnson, David E. 1974. On the role of grammatical relations in linguistic theory. In M. W. LaGaly, R. Fox, & A. Bruck (eds.), Proceedings from the Tenth Regional Meeting of the Chicago Linguistic Society, 269–83. Chicago: Chicago Linguistic Society.
Johnson, David E. 1977. On Keenan’s definition of “subject of”. Linguistic Inquiry 8(4). 673–92.
Johnson, David E. & Shalom Lappin. 1997. A critique of the minimalist program. Linguistics and Philosophy 20(3). 273–333.
Johnson, David E. & Shalom Lappin. 1999. Local Constraints vs. Economy. Stanford, CA: CSLI Publications.
Jurafsky, Daniel. 1996. A probabilistic model of lexical and syntactic access and disambiguation. Cognitive Science 20(2). 137–94.
Kalin, Laura. 2018. Licensing and differential object marking: The view from Neo-Aramaic. Syntax 21(2). 112–59.
Kay, Paul. 2002. An informal sketch of a formal architecture for construction grammar. Grammars 5(1). 1–19.
Kayne, Richard S. 1975. French Syntax: The Transformational Cycle. Cambridge, MA: MIT Press.
Kayne, Richard S. 1994. The Antisymmetry of Syntax. Cambridge, MA: MIT Press.
Kayne, Richard S. 2005a. Pronouns and their antecedents. In Movement and Silence, 105–35. Oxford: Oxford University Press.
Kayne, Richard S. 2005b. Some notes on comparative syntax: With special reference to English and French. In Guglielmo Cinque & Richard S. Kayne (eds.), The Oxford Handbook of Comparative Syntax, 3–69. Oxford: Oxford University Press.
Keenan, Edward. 1976. Towards a universal definition of “subject of”. In Charles Li (ed.), Subject and Topic, 303–33. New York: Academic Press.
Kemenade, Ans van. 1987. Syntactic Case and Morphological Case in the History of English. Dordrecht: Foris.
Kemenade, Ans van & Bettelou Los (eds.). 2006. The Handbook of the History of English. Oxford: Wiley-Blackwell.
Kemenade, Ans van & Bettelou Los. 2008. Discourse adverbs and clausal syntax in Old and Middle English. In The Handbook of the History of English, 224–48. New York: John Wiley & Sons.
Kemenade, Ans van & Nigel Vincent. 1997. Parameters of Morphosyntactic Change. Cambridge: Cambridge University Press.
Kemenade, Ans van & Marit Westergaard. 2012. Syntax and information structure: Verb-second variation in Middle English. In Anneli Meurman-Solin, Maria Jose Lopez-Couso, & Bettelou Los (eds.), Information Structure and Syntactic Change in the History of English, 87–119. Oxford: Oxford University Press.


Kim, Kyumin. 2017. Animacy and transitivity alternations in Blackfoot. In Monica Macaulay & Margaret Noodin (eds.), Papers of the Forty-sixth Algonquian Conference, 123–40. East Lansing, MI: Michigan State University Press.
Kim, Kyumin. 2018. The role of final morphemes in Blackfoot: Marking aspect or sentience? In Monica Macaulay & Margaret Noodin (eds.), Papers of the Forty-seventh Algonquian Conference, 147–64. East Lansing, MI: Michigan State University Press.
Kinsella, Anna R. 2009. Language Evolution and Syntactic Theory. Cambridge: Cambridge University Press.
Kinsella, Anna R. & Gary F. Marcus. 2009. Evolution, perfection, and theories of language. Biolinguistics 3(2–3). 186–212.
Kintsch, Walter. 1977. Memory and Cognition. New York: John Wiley and Sons.
Kiparsky, Paul. 1996. The shift to head-initial VP in Germanic. In Höskuldur Thráinsson, Samuel David Epstein, & Steve Peter (eds.), Studies in Comparative Germanic Syntax, 140–79. Dordrecht/Boston/London: Kluwer Academic Publishers.
Kiparsky, Paul. 1997. The rise of positional licensing. In Ans van Kemenade & Nigel Vincent (eds.), Parameters of Morphosyntactic Change, 460–94. Cambridge: Cambridge University Press.
Kiparsky, Paul. 2008. Universals constrain change; change results in typological generalizations. In Jeff Good (ed.), Linguistic Universals and Language Change, 23–53. Oxford: Oxford University Press.
Kiparsky, Paul. 2011. Grammaticalization as optimization. In Dianne Jonas, John Whitman, & Andrew Garrett (eds.), Grammatical Change: Origins, Nature, Outcomes, 15–51. Oxford: Oxford University Press.
Klaiman, M. H. 1981. Toward a universal semantics of indirect subject constructions. In Proceedings of the Seventh Annual Meeting of the Berkeley Linguistics Society, 123–35. Berkeley, CA: Berkeley Linguistics Society.
Klamer, Marian. 2008. The semantics of semantic alignment in Eastern Indonesia. In Mark Donohue & Søren Wichmann (eds.), The Typology of Semantic Alignment, 221–51. Oxford: Oxford University Press.
König, Ekkehard & Volker Gast. 2002. Reflexive pronouns and other uses of self-forms in English. Zeitschrift für Anglistik und Amerikanistik 50(3). 1–14.
König, Ekkehard & Peter Siemund. 2000. The development of complex reflexives and intensifiers in English. Diachronica 17(1). 39–84.
Koster, Jan. 1978. Locality Principles in Syntax. Dordrecht: Foris.
Krifka, Manfred. 2001. For a structured meaning account of questions and answers. In Caroline Féry & Wolfgang Sternefeld (eds.), Audiatur Vox Sapientiae: A Festschrift for Arnim von Stechow, 287–319. Berlin: Akademie Verlag.
Krifka, Manfred. 2008. The semantics of questions and the focusation of answers. In Chungmin Lee, Matthew Gordon, & Daniel Büring (eds.), Topic and Focus: Crosslinguistic Perspectives on Meaning and Intonation, 139–50. Dordrecht: Springer.
Kroch, Anthony. 1989. Reflexes of grammar in patterns of language change. Language Variation and Change 1(3). 199–244.
Kroch, Anthony. 1994. Morphosyntactic variation. In K. Beals, J. Denton, R. Knippen, L. Melnar, H. Suzuki, & E. Zeinfeld (eds.), Papers from the 30th Regional Meeting of the Chicago Linguistics Society: Parasession on Variation and Linguistic Theory, 180–201. Chicago: Chicago Linguistics Society.
Kroch, Anthony. 2003. Syntactic change. In Mark Baltin & Chris Collins (eds.), Handbook of Contemporary Syntactic Theory, 699–729. Oxford: Blackwell.
Kroch, Anthony & Ann Taylor. 2000. Verb-object order in early Middle English. In Susan Pintzuk, George Tsoulas, & Anthony Warner (eds.), Diachronic Syntax: Models and Mechanisms, 132–63. Oxford: Oxford University Press.


Kroskrity, Paul V. 2010. Getting negatives in Arizona Tewa: On the relevance of ethnopragmatics and language ideologies to understanding a case of grammaticalization. Pragmatics 20(1). 91–107.
Kubota, Yusuke & Robert D. Levine. 2013a. Coordination in hybrid type-logical categorial grammar. In Ohio State University Working Papers in Linguistics. Columbus, OH: Department of Linguistics.
Kubota, Yusuke & Robert D. Levine. 2013b. Empirical foundations for hybrid type-logical categorial grammar: The domain of phenomena. Unpublished manuscript, Ohio State University.
Kuhlmeier, Valerie, Karen Wynn, & Paul Bloom. 2003. Attribution of dispositional states by 12-month-olds. Psychological Science 14(5). 402–8.
Kuno, Susumu. 1973. Constraints on internal clauses and sentential subjects. Linguistic Inquiry 4(3). 363–85.
Kuryłowicz, Jerzy. 1965. The evolution of grammatical categories. Diogenes 13(51). 55–71.
Ladd, D. Robert, Dan Dediu, & Anna R. Kinsella. 2008. Languages and genes: Reflections on biolinguistics and the nature-nurture question. Biolinguistics 2(1). 114–26.
Landau, Idan. 2001. Elements of Control: Structure and Meaning in Infinitival Constructions. Dordrecht: Kluwer Academic Publishers.
Larson, Richard. 1988. On the double object construction. Linguistic Inquiry 19(3). 335–92.
Laughren, Mary. 2000. Constraints on the pre-auxiliary position in Warlpiri and the nature of the auxiliary. In John Henderson (ed.), Proceedings of the 1999 Conference of the Australian Linguistic Society.
Law, Paul. 2000. On relative clauses and the DP/PP adjunction asymmetry. In Artemis Alexiadou, Paul Law, André Meinunger, & Chris Wilder (eds.), The Syntax of Relative Clauses, 161–99. Amsterdam: John Benjamins Publishing Company.
Law, Paul & Hans-Martin Gärtner. 2005. Post-verbal wh-phrases in Malagasy, Tagalog, and Tsou. UCLA Working Papers in Linguistics 12: The Proceedings of AFLA XII. 211–26.
Legate, Julie Anne. 2002. Warlpiri: Theoretical Implications. Cambridge, MA: MIT dissertation.
LeSourd, Philip S. 2006. Problems for the pronominal argument hypothesis in Maliseet-Passamaquoddy. Language 83(2). 486–514.
Levenshtein, V. I. 1966. Binary codes capable of correcting deletions, insertions, and reversals. Soviet Physics Doklady 10(8). 707–10.
Levinson, Stephen C. 1987. Pragmatics and the grammar of anaphora: A partial pragmatic reduction of binding and control phenomena. Journal of Linguistics 23(2). 379–434.
Lewis, Richard. 1993. An architecturally-based theory of human sentence comprehension. In Proceedings of the 15th Annual Conference of the Cognitive Science Society, 108–13. Hillsdale, NJ: Erlbaum.
Lewis, Richard, Shravan Vasishth, & Julie Van Dyke. 2006. Computational principles of working memory in sentence comprehension. Trends in Cognitive Science 10(10). 447–54.
Lewis, Robert E. 2019. Theme signs in Potawatomi as object agreement and the inverse. In Monica Macaulay & Margaret Noodin (eds.), Papers of the Forty-eighth Algonquian Conference, 123–42. East Lansing, MI: Michigan State University Press.
Lobeck, Anne. 1995. Ellipsis: Functional Heads, Licensing and Identification. New York: Oxford University Press.
MacWhinney, Brian, Andrej Malchukov, & Edith Moravcsik (eds.). 2014. Competing Motivations in Grammar and Usage. Oxford: Oxford University Press.


Mahajan, Anoop. 2000. Towards a unified treatment of wh-expletives in Hindi and German. In Uli Lutz, Gereon Müller, & Arnim von Stechow (eds.), Wh-scope Marking, 317–32. Amsterdam/Philadelphia: John Benjamins Publishing Company.
Malchukov, Andrej L. 2008. Animacy and asymmetries in differential case marking. Lingua 118(2). 203–21.
Malchukov, Andrej L. & Helen de Hoop. 2011. Tense, aspect, and mood based differential case marking. Lingua 121(1). 35–47.
Marantz, Alec. 1984. On the Nature of Grammatical Relations. Cambridge, MA: MIT Press.
Martineau, France & Raymond Mougeon. 2003. A sociolinguistic study of the origins of ne deletion in European and Quebec French. Language. 118–52.
Martins, Ana Maria. 2015. Negation and NPI composition inside DP. In Theresa Biberauer & George Walkden (eds.), Syntax over Time: Lexical, Morphological and Information-structural Interactions, 102–22. Oxford: Oxford University Press.
Maslova, Elena. 2003. A Grammar of Kolyma Yukaghir. Berlin: De Gruyter Mouton.
Master, Alfred. 1946. The zero negative in Dravidian. Transactions of the Philological Society 45(1). 137–55.
McKay, Isabel. 2019. The internal structure of Montana Salish instrumental nominals. In D. K. E. Reisinger & Gloria Mellesmoe (eds.), Papers for the International Conference on Salish and Neighbouring Languages, vol. 54, UBCWPL. Vancouver, BC: UBC.
McNally, Louise. 2013. Semantics and pragmatics. Wiley Interdisciplinary Reviews: Cognitive Science 4(3). 285–97.
Mengarini, Gregory S. J., Joseph Giorda, Leopold van Gorp, Josep Bandini, & Joseph Guidi. 1877–9. A Dictionary of the Kalispel of Flat-head Indian Language. St. Ignatius, MT: St. Ignatius.
Michaelis, Laura A. 2012. Making the case for construction grammar. In Hans Christian Boas & Ivan A. Sag (eds.), Sign-based Construction Grammar, 31–68. Stanford, CA: CSLI Publications.
Michelson, Truman. 1930. Contributions to Fox Ethnology II, Bureau of American Ethnology Bulletin, vol. 95. Washington, DC: G.P.O.
Miestamo, Matti. 2000. Towards a typology of standard negation. Nordic Journal of Linguistics 23(1). 65–88.
Miestamo, Matti. 2007. Negation – an overview of typological research. Language and Linguistics Compass 1(5). 552–70.
Miestamo, Matti. 2010. Negatives without negators. In Jan Wohlgemuth & Michael Cysouw (eds.), Rethinking Universals: How Rarities Affect Linguistic Theory, 169–94. Berlin: Walter de Gruyter.
Miestamo, Matti, Kaius Sinnemäki, & Fred Karlsson. 2008. Language Complexity: Typology, Contact, Change. Amsterdam: John Benjamins Publishing Company.
Mithun, Marianne. 1986. On the nature of noun incorporation. Language 62(1). 32–7.
Mithun, Marianne. 1991. Active/agentive case marking and its motivations. Language 67. 510–46.
Mithun, Marianne. 2001. The Languages of Native North America. Cambridge: Cambridge University Press.
Mithun, Marianne. 2008. Does passivization require a subject category? In Greville G. Corbett & Michael Noonan (eds.), Case and Grammatical Relations: Studies in Honor of Bernard Comrie, 211–40. Amsterdam/Philadelphia: John Benjamins Publishing Company.
Mithun, Marianne. 2012. Core argument patterns and deep genetic relations. In Pirkko Suihkonen, Bernard Comrie, & Valery Solovyev (eds.), Argument Structure and Grammatical Relations: A Crosslinguistic Typology, 257–94. Amsterdam/Philadelphia: John Benjamins Publishing Company.


Mithun, Marianne. 2017a. Argument marking in the polysynthetic verb and its implications. In Michael Fortescue, Marianne Mithun, & Nicholas Evans (eds.), The Oxford Handbook of Polysynthesis, 30–58. Oxford: Oxford University Press.
Mithun, Marianne. 2017b. Polysynthesis in North America. In Michael Fortescue, Marianne Mithun, & Nicholas Evans (eds.), The Oxford Handbook of Polysynthesis, 235–59. Oxford: Oxford University Press.
Mohanan, Tara. 1994. Argument Structure in Hindi. Stanford, CA: CSLI Publications.
Mondorf, Britta. 2014. (Apparently) competing motivations in grammar and usage. In Brian MacWhinney, Andrej Malchukov, & Edith Moravcsik (eds.), Competing Motivations in Grammar and Usage, 209–28. Oxford: Oxford University Press.
Morrill, Glyn. 1995. Discontinuity in categorial grammar. Linguistics and Philosophy 18(2). 175–219.
Mühlbauer, Jeff. 2003. Word-order and the interpretation of nominals in Plains Cree. Unpublished manuscript, University of British Columbia.
Müller, Gereon. 1997. Partial wh-movement and optimality theory. Linguistic Review 14. 249–306.
Müller, Stefan. 2003. Mehrfache Vorfeldbesetzung. Deutsche Sprache 31(1). 29–62.
Müller, Stefan. 2013. Unifying everything: Some remarks on simpler syntax, construction grammar, minimalism, and HPSG. Language 89(4). 920–50.
Muskens, Reinhard. 2003. Language, lambdas, and logic. In Geert-Jan Kruijff & Richard Oehrle (eds.), Resource-sensitivity, Binding and Anaphora, 23–54. Dordrecht: Kluwer Academic Publishers.
Mycock, Louise. 2004. The wh-expletive construction. In Miriam Butt & Tracy Holloway King (eds.), Proceedings of the LFG 2004 Conference, 370–90. Stanford, CA: CSLI Publications.
Mylander, Carolyn & Susan Goldin-Meadow. 1991. Home sign systems in deaf children: The development of morphology without a conventional language model. Theoretical Issues in Sign Language Research 2. 41–63.
Næss, Åshild. 2007. Prototypical Transitivity. Amsterdam/Philadelphia: John Benjamins Publishing Company.
Nagano, Yasuhiko. 1984. A Historical Study of the rGyarong Verb System. Tokyo: Sheishido.
Naigles, Letitia. 1990. Children use syntax to learn verb meanings. Journal of Child Language 17(2). 357–74.
Naigles, Letitia G. & Edward T. Kako. 1993. First contact in verb acquisition: Defining a role for syntax. Child Development 64(6). 1665–87.
Naigles, Letitia R., Lila Gleitman, & Henry Gleitman. 1986. Children acquire word meaning components from syntactic evidence. In Esther Dromi (ed.), Language and Cognition: A Developmental Perspective, 104–40. Norwood, NJ: Ablex.
Naigles, Letitia R. & Erika Hoff-Ginsberg. 1995. Input to verb learning: Evidence for the plausibility of syntactic bootstrapping. Developmental Psychology 31(5). 827.
Newmeyer, Frederick J. 2004. Against a parameter-setting approach to typological variation. Linguistic Variation Yearbook 4(1). 181–234.
Newmeyer, Frederick J. 2007. ‘More complicated and hence, rarer’: A look at grammatical complexity and cross-linguistic rarity. In Simin Karimi, Vida Samiian, & Wendy Wilkins (eds.), Phrasal and Clausal Architecture: Syntactic Derivation and Interpretation. In Honor of Joseph E. Emonds, 221–42. Amsterdam/Philadelphia: John Benjamins Publishing Company.
Newmeyer, Frederick J. 2017. Where, if anywhere, are parameters? A critical historical overview of parametric theory. In Claire Bowern, Laurence Horn, & Raffaella Zanuttini (eds.), On Looking into Words (and Beyond), 547–69. Berlin: Language Science Press.


Newmeyer, Frederick J. & Laurel B. Preston (eds.). 2014. Measuring Linguistic Complexity. Oxford: Oxford University Press.
Nichols, Johanna. 1986. Head-marking and dependent-marking grammar. Language 62(1). 56–119.
Niyogi, Partha. 2006. The Computational Nature of Language Learning and Evolution. Cambridge, MA: MIT Press.
Noël, Dirk. 2007. Diachronic construction grammar and grammaticalization theory. Functions of Language 14(2). 177–202.
Nunberg, Geoffrey, Ivan A. Sag, & Thomas Wasow. 1994. Idioms. Language 70(3). 491–538.
Oehrle, Richard T., Emmon W. Bach, & Deirdre Wheeler (eds.). 1988. Categorial Grammars and Natural Language Structures. Dordrecht/Boston: Reidel.
Tieken-Boon van Ostade, Ingrid. 1990. The origin and development of periphrastic auxiliary do: A case of destigmatisation. NOWELE: North-Western European Language Evolution 16(1). 3–52.
Oxford, William Robert. 2014. Microparameters of Agreement: A Diachronic Perspective on Algonquian Verb Inflection. Toronto: University of Toronto dissertation.
Palmer, Frank Robert. 1994. Grammatical Roles and Relations. Cambridge: Cambridge University Press.
Partee, Barbara H. 1989. Binding implicit variables in quantified contexts. In Caroline R. Wiltshire, Randolph Graczyk, & Bradley Music (eds.), Papers from the 25th Regional Meeting of the Chicago Linguistics Society, Part One, 342–65. Chicago: University of Chicago.
Partee, Barbara H., Alice G. B. ter Meulen, & Robert E. Wall. 1990. Mathematical Methods in Linguistics. Dordrecht/Boston: Kluwer Academic.
Paul, Hermann. 1890. Principles of the History of Language. London: Longmans, Green, and Co.
Paul, Ileana. 2010. Subjects: Grammatical relations, grammatical functions and functional categories. Language and Linguistics Compass 4(9). 890–902.
Pederson, Eric. 1993. Zero negation in South Dravidian. In Lise M. Dobrin, Lynn Nichols, & Rosa M. Rodriguez (eds.), Papers from the 27th Regional Meeting of the Chicago Linguistics Society 1991, Part Two: Parasession on Negation. Chicago: Chicago Linguistic Society.
Peitsara, Kirsti. 1997. The development of reflexive strategies in English. In Matti Rissanen, Merja Kytö, & Kirsi Heikkonen (eds.), Grammaticalization at Work: Studies of Long-term Developments in English, 277–370. Berlin/New York: Walter de Gruyter.
Perlmutter, David M. 1970. Surface structure constraints in syntax. Linguistic Inquiry 1(2). 187–255.
Perlmutter, David M. 1983. Studies in Relational Grammar 1. Chicago: University of Chicago Press.
Perlmutter, David M. & Carol Rosen. 1984. Studies in Relational Grammar 2. Chicago: University of Chicago Press.
Peters, P. Stanley & Robert W. Ritchie. 1969. A note on the universal base hypothesis. Journal of Linguistics 5(1). 150–2.
Petrova, Svetlana. 2009. Information structure and word order variation in the Old High German Tatian. In Roland Hinterhölzl & Svetlana Petrova (eds.), Information Structure and Language Change: New Approaches to Word Order Variation in Germanic, 251–80. Berlin/New York: Mouton de Gruyter.
Pilot-Raichoor, Christiane. 2010. The Dravidian zero negative: Diachronic context of its morphogenesis and conceptualisation. In Jan Wohlgemuth & Michael Cysouw (eds.), Rara and Rarissima: Documenting the Fringes of Linguistic Diversity, 267–304. Berlin/New York: Mouton de Gruyter.


Pinker, Steven & Ray Jackendoff. 2005. The faculty of language: What’s special about it? Cognition 95(2). 201–36.
Pintzuk, Susan. 1993. Verb seconding in Old English: Verb movement to Infl. The Linguistic Review 10(1). 5–36.
Pintzuk, Susan. 1999. Phrase Structures in Competition: Variation and Change in Old English Word Order. New York: Garland Publishing, Inc.
Pintzuk, Susan. 2002. Verb-object order in Old English: Variation as grammatical competition. In David Lightfoot (ed.), Syntactic Effects of Morphological Change, 276–99. Oxford: Oxford University Press.
Pintzuk, Susan. 2005. The syntax of objects in Old English. In Montse Batllori, Maria-Luïsa Hernanz, Carme Picallo, & Francesc Roca (eds.), Grammaticalization and Parametric Variation, 251–66. Oxford: Oxford University Press.
Pintzuk, Susan. 2014. Phrase Structures in Competition: Variation and Change in Old English Word Order. London: Routledge.
Pintzuk, Susan & Ann Taylor. 2006. The loss of OV order in the history of English. In Ans van Kemenade & Bettelou Los (eds.), The Handbook of the History of English, 249–78. Oxford: Wiley-Blackwell.
Pintzuk, Susan, George Tsoulas, & Anthony Warner. 2000. Diachronic Syntax: Models and Mechanisms. Oxford: Oxford University Press.
Pollard, Carl. 2004. Higher-order categorical grammar. In Proceedings of the Conference on Categorial Grammars (CG2004), Montpellier, France, 340–61.
Pollard, Carl & Ivan A. Sag. 1994. Head-driven Phrase Structure Grammar. Chicago: University of Chicago Press and CSLI Publications.
Pollock, Jean-Yves. 1989. Verb movement, UG and the structure of IP. Linguistic Inquiry 20(3). 365–424.
Primus, Beatrice. 1999. Cases and Thematic Roles: Ergative, Accusative and Active. Tübingen: Max Niemeyer.
Primus, Beatrice. 2009. Case, grammatical relations, and semantic roles. In Andrej L. Malchukov & Andrew Spencer (eds.), The Oxford Handbook of Case, 261–75. Oxford: Oxford University Press.
Primus, Beatrice. 2012. Animacy, generalized semantic roles, and differential object marking. In Monique Lamers & Peter de Swart (eds.), Case, Word Order and Prominence, 65–90. Dordrecht: Springer.
Progovac, Ljiljana. 2015. Evolutionary Syntax. Oxford: Oxford University Press.
Pullum, Geoffrey K. & Barbara C. Scholz. 2010. Recursion and the infinitude claim. Recursion in Human Language 104. 113–38.
Reichenbach, Hans. 2012 [1958]. The Philosophy of Space and Time. Mineola, NY: Courier Corporation.
Reinhart, Tanya. 2006. Interface Strategies. Cambridge, MA: MIT Press.
Reinhart, Tanya & Eric Reuland. 1993. Reflexivity. Linguistic Inquiry 24(4). 657–720.
Reis, Marga. 2006. Is German V-to-C movement really semantically motivated? Some empirical problems. Theoretical Linguistics 32(3). 369–80.
Rhodes, Richard A. 2017. Obviation, inversion, and the notion of topic in Algonquian. In Monica Macaulay & Margaret Noodin (eds.), Papers of the Forty-sixth Algonquian Conference, 197–212. East Lansing, MI: Michigan State University Press.
Riemsdijk, Henk van. 1982. A Case Study in Syntactic Markedness. Dordrecht: Foris.
Ritchart, Amanda, Grant Goodall, & Marc Garellek. 2016. Prosody and the that-trace effect: An experimental study. In Proceedings of the 33rd West Coast Conference on Formal Linguistics, 320–8.


Rivero, María Luisa. 1994. Negation, imperatives and Wackernagel effects. Rivista di Linguistica 6(1). 39–66.
Rivero, María Luisa & A. Terzi. 1995. Imperatives, V-movement and logical mood. Journal of Linguistics 31(2). 301–32.
Rizzi, Luigi. 1982. Issues in Italian Syntax. Dordrecht: Foris.
Rizzi, Luigi. 1990. Relativized Minimality. Cambridge, MA: MIT Press.
Rizzi, Luigi. 2004. On the cartography of syntactic structures. In Luigi Rizzi (ed.), The Structure of CP and IP, 3–15. Oxford: Oxford University Press.
Rizzi, Luigi. 2007. On some properties of criterial freezing. Studies in Linguistics 1. 145–58.
Rizzi, Luigi. 2014. Some consequences of criterial freezing. In Peter Svenonius (ed.), Functional Structure from Top to Toe: The Cartography of Syntactic Structures, 19–46. Oxford: Oxford University Press.
Rizzi, Luigi & Guglielmo Cinque. 2016. Functional categories and syntactic theory. Annual Review of Linguistics 2. 139–63.
Roberts, Craige. 2012. Information structure: Towards an integrated formal theory of pragmatics. Semantics and Pragmatics 5(6). 1–69.
Roberts, Ian. 2010. Agreement and Head Movement: Clitics, Incorporation, and Defective Goals. Cambridge, MA: MIT Press.
Roberts, Ian & Anna Roussou. 2003. Syntactic Change: A Minimalist Approach to Grammaticalization. Cambridge: Cambridge University Press.
Rochemont, Michael & Peter W. Culicover. 1990. English Focus Constructions and the Theory of Grammar. Cambridge: Cambridge University Press.
Rochemont, Michael & Peter W. Culicover. 1997. Deriving dependent right adjuncts in English. In Dorothee Beerman, David LeBlanc, & Henk van Riemsdijk (eds.), Rightward Movement, 277–300. Amsterdam: John Benjamins Publishing Company.
Rohdenburg, Günter. 2007. Functional constraints in syntactic change: The rise and fall of prepositional constructions in early and late modern English. English Studies 88(2). 217–33.
Ross, John R. 1967. Constraints on Variables in Syntax. Cambridge, MA: MIT dissertation.
Ross, John R. 1973. Slifting. In Maurice Gross, Morris Halle, & Marcel P. Schützenberger (eds.), The Formal Analysis of Natural Languages, 133–69. The Hague: Mouton.
Sabel, Joachim. 2000. Partial wh-movement and the typology of wh-questions. In Uli Lutz, Gereon Müller, & Arnim von Stechow (eds.), Wh-scope Marking, 409–48. Amsterdam/Philadelphia: John Benjamins Publishing Company.
Sadock, Jerrold. 2017. The subjectivity of the notion of polysynthesis. In Michael Fortescue, Marianne Mithun, & Nicholas Evans (eds.), The Oxford Handbook of Polysynthesis, 99–115. Oxford: Oxford University Press.
Sadock, Jerrold M. & Anthony C. Woodbury. 2018. Negation in Yupik-Inuit, a family in which productive negation is expressed as derivational morphology. Paper presented at Syntax of the World’s Languages VIII, Paris, Sept 3–5, 2018.
Sadock, Jerrold M. & Arnold M. Zwicky. 1985. Speech act distinctions in syntax. Language Typology and Syntactic Description 1. 155–96.
Safir, Ken. 2004. The Syntax of Anaphora. Cambridge, MA: MIT Press.
Sag, Ivan A. 1978. Floated quantifiers, adverbs, and extraction sites. Linguistic Inquiry 9(1). 146–50.
Sag, Ivan A. 2012. Sign-based construction grammar—a synopsis. In Hans C. Boas & Ivan A. Sag (eds.), Sign-based Construction Grammar, 61–197. Stanford, CA: CSLI Publications.
Sag, Ivan A., Rui P. Chaves, Anne Abeillé, Bruno Estigarribia, Dan Flickinger, Paul Kay, Laura A. Michaelis, Stefan Müller, Geoffrey K. Pullum, Frank Van Eynde, et al. 2020. Lessons from the English auxiliary system. Journal of Linguistics 56. 1–69.


Sag, Ivan A., Philip Hofmeister, & Neal Snider. 2007a. Processing complexity in Subjacency violations: The complex noun phrase constraint. In Proceedings of the 43rd Annual Meeting of the Chicago Linguistic Society. Chicago: University of Chicago.
Sag, Ivan A. 2007b. Remarks on locality. In Proceedings of the 14th International Conference on Head-driven Phrase Structure Grammar, 394–414.
Saito, Mamoru. 2004. Genitive subjects in Japanese. In Peri Bhaskararao & Karumuri V. Subbarao (eds.), Non-nominative Subjects, Volume 2, 103–18. Amsterdam/Philadelphia: John Benjamins Publishing Company.
Sampson, Geoffrey, David Gil, & Peter Trudgill (eds.). 2009. Language Complexity as an Evolving Variable. Oxford: Oxford University Press.
Sands, Kristina & Lyle Campbell. 2001. Non-canonical subjects and objects in Finnish. In Alexandra Y. Aikhenvald, R. M. W. Dixon, & Masayuki Onishi (eds.), Non-canonical Marking of Subjects and Objects, 251–306. Amsterdam/Philadelphia: John Benjamins Publishing Company.
Santorini, Beatrice. 1992. Variation and change in Yiddish subordinate clause word order. Natural Language and Linguistic Theory 10(4). 595–640.
Sapir, Edward. 1921. Language: An Introduction to the Study of Speech. New York: Harcourt, Brace, and Co.
Sapp, Christopher D. 2011. The Verbal Complex in Subordinate Clauses from Medieval to Modern German. Amsterdam/Philadelphia: John Benjamins Publishing Company.
Sapp, Christopher D. 2016. Word order patterns in the Old High German right periphery and their Indo-European origins. Diachronica 33(3). 367–411.
Sato, Yosuke & Yoshihito Dobashi. 2016. Prosodic phrasing and the that-trace effect. Linguistic Inquiry 47(2). 333–49.
Sauerland, Uli. 1999. Erasability and interpretation. Syntax 2(3). 161–88.
Saulwick, Adam. 2003. Aspects of the Verb in Rembarrnga: A Polysynthetic Language of Northern Australia: Grammatical Description, Texts and Dictionary. Melbourne: University of Melbourne.
Saussure, Ferdinand de. 1922 [1983]. Cours de Linguistique Générale. Paris: Payot.
Schachter, Paul. 1976. Subject in Philippine languages. In Charles Li (ed.), Subject and Topic, 491–518. New York: Academic Press.
Schachter, Paul. 1996. The subject in Tagalog: Still none of the above. In UCLA Occasional Papers in Linguistics, vol. 15. Los Angeles, CA: UCLA.
Schallert, Oliver. 2010. Als Deutsch noch nicht OV war. In Arne Ziegler (ed.), Historische Textgrammatik und Historische Syntax des Deutschen: Diachronie, Althochdeutsch, Mittelhochdeutsch, vol. 1, 365–94. Berlin: Walter de Gruyter.
Schauber, Ellen. 1975. Theoretical Responses to Navajo Questions. Cambridge, MA: MIT dissertation.
Schlachter, Eva. 2012. Syntax und Informationsstruktur im Althochdeutschen: Untersuchungen am Beispiel der Isidor-Gruppe. Heidelberg: Universitätsverlag Winter.
Schmid, Tanja. 2002. West Germanic IPP-constructions: An Optimality Theoretic approach. Stuttgart: University of Stuttgart dissertation.
Schmid, Tanja. 2005. Infinitival Syntax: Infinitivus Pro Participio as a Repair Strategy. Amsterdam/Philadelphia: John Benjamins Publishing Company.
Schmirler, Katherine, Antti Arppe, Trond Trosterud, & Lene Antonsen. 2018. Building a constraint grammar parser for Plains Cree verbs and arguments. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), 2981–8.


Schuler, William, Samir AbdelRahman, Tim Miller, & Lane Schwartz. 2010. Broad-coverage parsing using human-like memory constraints. Computational Linguistics 36(1). 1–30.
Searle, John. 2002. End of the revolution. The New York Review of Books 49(3). 33–6.
Senghas, Ann & Marie Coppola. 2001. Children creating language: How Nicaraguan Sign Language acquired a spatial grammar. Psychological Science 12(4). 323–8.
Senghas, Ann, Sotaro Kita, & Asli Özyürek. 2004. Children creating core properties of language: Evidence from an emerging sign language in Nicaragua. Science 305(5691). 1779–82.
Seržant, Ilja A. 2013. Rise of canonical subjecthood. In Ilja A. Seržant & Leonid Kulikov (eds.), The Diachronic Typology of Non-canonical Subjects, 283–310. Amsterdam/Philadelphia: John Benjamins Publishing Company.
Seržant, Ilja A. & Leonid Kulikov. 2013. The Diachronic Typology of Non-canonical Subjects. Amsterdam/Philadelphia: John Benjamins Publishing Company.
Seržant, Ilja A. & Alena Witzlack-Makarevich. 2018. Differential argument marking: An introduction. In Ilja A. Seržant, Alena Witzlack-Makarevich, & K. Mann (eds.), The Diachronic Typology of Differential Argument Marking, 1–40. Berlin: Language Science Press.
Shibatani, Masayoshi. 1977. Grammatical relations and surface cases. Language 53(4). 789–809.
Siemund, Peter. 2002. Reflexive and intensive self-forms across varieties of English. Zeitschrift für Anglistik und Amerikanistik 50(3). 250–68.
Siewierska, Anna & Dik Bakker. 2012. Three takes on grammatical relations: A view from the languages of Europe and North and Central Asia. In Pirkko Suihkonen, Bernard Comrie, & Valery Solovyev (eds.), Argument Structure and Grammatical Relations: A Crosslinguistic Typology, 295–323. Amsterdam/Philadelphia: John Benjamins Publishing Company.
Sinnemäki, Kaius. 2009. Differential object marking: A cross-linguistic study. Presentation at the workshop ‘Differential object marking: Theoretical and empirical issues’, Helsinki.
Slobin, Dan I. 1985a. The child as linguistic icon-maker. In John Haiman (ed.), Iconicity in Syntax, 221–48. Amsterdam: John Benjamins Publishing Company.
Slobin, Dan I. 1985b. Crosslinguistic evidence for the language-making capacity. The Crosslinguistic Study of Language Acquisition 2. 1157–249.
Slobin, Dan I. 1987. Thinking for speaking. In J. Aske, N. Beery, L. Michaelis, & H. Filip (eds.), Proceedings of the 13th Annual Meeting of the Berkeley Linguistics Society, 435–45. Berkeley, CA: Berkeley Linguistics Society.
Slobin, Dan I. 2004. The many ways to search for a frog. In S. Strömqvist & L. Verhoeven (eds.), Relating Events in Narrative: Typological and Contextual Perspectives, 219–57. Hillsdale, NJ: Lawrence Erlbaum.
Slobin, Dan I. 2005. From ontogenesis to phylogenesis: What can child language tell us about language evolution? In Sue Taylor Parker, Jonas Langer, & Constance Milbrath (eds.), Biology and Knowledge Revisited: From Neurogenesis to Psychogenesis, 255–85. London: Routledge.
Sobin, Nicholas. 2002. The comp-trace effect, the adverb effect and minimal CP. Journal of Linguistics 38(3). 527–60.
Spencer, Andrew. 2013. Lexical Relatedness. Oxford: Oxford University Press.
Sprouse, Rex & Barbara Vance. 1999. An explanation for the decline of null pronouns in certain Germanic and Romance dialects. In Michel DeGraff (ed.), Language Creation and Language Change: Creolization, Diachrony and Development, 257–84. Cambridge, MA: MIT Press.

OUP CORRECTED PROOF – FINAL, 19/6/2021, SPi

Sridhar, Shikaripur N. 1979. Dative subjects and the notion of subject. Lingua 49(2–3). 99–125.
Stabler, Edward P. 1994. The finite connectivity of linguistic structure. In Charles Clifton, Jr., Lyn Frazier, & Keith Rayner (eds.), Perspectives on Sentence Processing, 303–36. Hillsdale, NJ: Erlbaum.
Steedman, Mark J. 1993. Categorial Grammar. Lingua 90. 221–58.
Steinert-Threlkeld, Shane & Jakub Szymanik. 2020. Ease of learning explains semantic universals. Cognition 195. 104076.
Sternefeld, Wolfgang. 2000. Semantic vs. syntactic reconstruction. SfS-Report 02–00, Universität Tübingen.
Sternefeld, Wolfgang. 2006. Syntax: Eine Morphologisch Motivierte Generative Beschreibung des Deutschen. Tübingen: Stauffenburg.
Stump, Gregory T. 2001. Inflectional Morphology: A Theory of Paradigm Structure. Cambridge: Cambridge University Press.
Sweetser, Eve E. 1988. Grammaticalization and semantic bleaching. In Rachel Wojdak, Marc Ettlinger, Nicholas Fleisher, & Mischa Park-Doob (eds.), Proceedings of the 14th Annual Meeting of the Berkeley Linguistics Society, 389–405. Berkeley: Berkeley Linguistics Society.
Taylor, Ann & Susan Pintzuk. 2012a. Rethinking the OV/VO alternation in Old English: The effect of complexity, grammatical weight, and information status. In Terttu Nevalainen & Elizabeth C. Traugott (eds.), The Oxford Handbook of the History of English. Oxford: Oxford University Press.
Taylor, Ann & Susan Pintzuk. 2012b. Verb order, object position and information status in Old English. York Papers in Linguistics 2. 29–52.
Tesar, Bruce & Paul Smolensky. 1998. Learnability in optimality theory. Linguistic Inquiry 29(2). 229–68.
Thomason, Sarah G. 1997. Contact Languages: A Wider Perspective. Amsterdam/Philadelphia: John Benjamins Publishing Company.
Thomason, Sarah G. 2001. Language Contact. Edinburgh: Edinburgh University Press.
Thomason, Sarah G. 2008. Social and linguistic factors as predictors of contact-induced change. Journal of Language Contact 2(1). 42–56.
Thomason, Sarah G. 2010. Contact explanations in linguistics. In Raymond Hickey (ed.), The Handbook of Language Contact, 31–47. Wiley Online Library.
Thomason, Sarah G. 2017. Contact as a source of language change. In Brian D. Joseph & Richard D. Janda (eds.), Handbook of Historical Linguistics, 687–712. Wiley Online Library.
Thráinsson, Höskuldur. 2007. The Syntax of Icelandic. Cambridge: Cambridge University Press.
Tomasello, Michael. 2003. Constructing a Language: A Usage-based Theory of Language Acquisition. Cambridge, MA: Harvard University Press.
Traugott, Elizabeth Closs. 2003. Constructions in grammaticalization. In Brian D. Joseph & Richard D. Janda (eds.), Handbook of Historical Linguistics, 624–47. Oxford: Blackwell Publishers.
Traugott, Elizabeth Closs. 2008. Grammaticalization, constructions and the incremental development of language: Suggestions from the development of degree modifiers in English. In Regine Eckardt, Gerhard Jäger, & Tonjes Veenstra (eds.), Variation, Selection, Development—Probing the Evolutionary Model of Language Change, 219–50. Berlin/New York: Mouton de Gruyter.


Traugott, Elizabeth Closs & Bernd Heine. 1991. Approaches to Grammaticalization: Volume II. Types of Grammatical Markers. Amsterdam/Philadelphia: John Benjamins Publishing Company.
Traugott, Elizabeth Closs & Graeme Trousdale. 2013. Constructionalization and Constructional Change. Oxford: Oxford University Press.
Travis, Lisa deMena. 1991. Parameters of phrase structure and verb-second phenomena. In Robert Freidin (ed.), Principles and Parameters in Comparative Grammar, 339–64. Cambridge, MA: MIT Press.
Trousdale, Graeme. 2008. Constructions in grammaticalization and lexicalization: Evidence from the history of a composite predicate construction in English. In Graeme Trousdale & Nikolas Gisborne (eds.), Constructional Approaches to English, 33–67. Berlin: Walter de Gruyter.
Trousdale, Graeme. 2010. Issues in constructional approaches to grammaticalization in English. In Ekaterini Stathi, Elke Gehweiler, & Ekkehard König (eds.), Grammaticalization: Current Views and Issues, 51–71. Amsterdam/Philadelphia: John Benjamins Publishing Company.
Trousdale, Graeme. 2012. Grammaticalization, constructions and the grammaticalization of constructions. In Kristin Davidse, Tine Breban, Lieselotte Brems, & Tanja Mortelmans (eds.), Grammaticalization and Language Change: New Reflections, 167–98. Amsterdam/Philadelphia: John Benjamins Publishing Company.
Trudgill, Peter. 2011. Sociolinguistic Typology: Social Determinants of Linguistic Complexity. Oxford: Oxford University Press.
Uszkoreit, Hans. 1986. Categorial Unification Grammars. In Proceedings of COLING-86. Also appears as Center for the Study of Language and Information Report No. CSLI-86-66, Stanford, CA.
Valin, Robert D. van, Jr. 1990. Semantic parameters of split intransitivity. Language 66. 221–60.
Varaschin, Giuseppe. 2018. A simple theory of English reflexives: How good is it? Unpublished manuscript, Federal University of Santa Catarina.
Walkden, George. 2015. Verb-third in early West Germanic: A comparative perspective. In Theresa Biberauer & George Walkden (eds.), Syntax over Time: Lexical, Morphological, and Information-structural Interactions, 236–48. Oxford: Oxford University Press.
Wanner, Dieter. 2011. The Power of Analogy: An Essay on Historical Linguistics. Berlin: Walter de Gruyter.
Wasow, Thomas. 1997a. End-weight from the speaker’s perspective. Journal of Psycholinguistic Research 26(3). 347–61.
Wasow, Thomas. 1997b. Remarks on grammatical weight. Language Variation and Change 9(1). 81–105.
Wasow, Thomas. 2002. Postverbal Behavior. Stanford, CA: CSLI Publications.
Watanabe, Akira. 2001. Wh-in-situ languages. In Mark Baltin & Chris Collins (eds.), The Handbook of Contemporary Syntactic Theory, 203–25. Malden, MA/Oxford: Blackwell Publishers.
Wexler, Kenneth & Peter W. Culicover. 1980. Formal Principles of Language Acquisition. Cambridge, MA: MIT Press.
Whaley, Lindsay. 2010. Syntactic typology. In Jae Jung Song (ed.), The Oxford Handbook of Linguistic Typology. Oxford: Oxford University Press.
Whitman, John. 2008. The classification of constituent order generalizations and diachronic explanation. In Jeff Good (ed.), Linguistic Universals and Language Change, 233–52. Oxford: Oxford University Press.


Wichmann, Søren. 2008. The study of semantic alignment: Retrospect and state of the art. In Mark Donohue & Søren Wichmann (eds.), The Typology of Semantic Alignment, 3–23. Oxford: Oxford University Press.
Wier, Thomas R. 2011. Georgian Morphosyntax and Feature Hierarchies in Natural Language. Chicago: University of Chicago dissertation.
Wiese, Heike, Horst J. Simon, Marianne Zappen-Thomson, & Kathleen Schumann. 2014. Deutsch im mehrsprachigen Kontext: Beobachtungen zu lexikalisch-grammatischen Entwicklungen im Namdeutschen und im Kiezdeutschen. Zeitschrift für Dialektologie und Linguistik 81(3). 274–307.
Wiese, Heike & Spass Wallah. 2010. Kiezdeutsch – ein neuer Dialekt des Deutschen. Aus Politik und Zeitgeschichte, Ausgabe 2–10.
Wiltschko, Martina. 2014. The Universal Structure of Categories: Towards a Formal Typology. Cambridge: Cambridge University Press.
Witzlack-Makarevich, Alena. 2011. Typological Variation in Grammatical Relations. Leipzig: University of Leipzig dissertation.
Witzlack-Makarevich, Alena & Ilja A. Seržant. 2018. Differential argument marking: Patterns of variation. Diachrony of Differential Argument Marking 19. 1.
Wolfart, H. Christoph. 1973. Plains Cree: A Grammatical Study. Philadelphia, PA: American Philosophical Society Transactions.
Wolfart, H. Christoph. 1996. Sketch of Cree, an Algonquian language. Handbook of North American Indians 17. 390–439.
Wolvengrey, Arok. 2005. Inversion and the absence of grammatical relations. In Casper de Groot & Kees Hengeveld (eds.), Morphosyntactic Expression in Functional Grammar, 419–45. Berlin: Walter de Gruyter.
Woolford, Ellen. 2009. Differential subject marking at argument structure, syntax, and PF. In Helen de Hoop & Peter de Swart (eds.), Differential Subject Marking, 17–40. Dordrecht: Springer.
Wurmbrand, Susi. 2004. West Germanic verb clusters: The empirical domain. In Katalin É. Kiss & Henk van Riemsdijk (eds.), Verb Clusters: A Study of Hungarian, German and Dutch, 43–85. Amsterdam/Philadelphia: John Benjamins Publishing Company.
Wurmbrand, Susi. 2006. Verb clusters, verb raising, and restructuring. In Martin Everaert & Henk van Riemsdijk (eds.), The Blackwell Companion to Syntax, vol. 5, 229–343. Oxford: Blackwell Publishers.
Yamashita, Hiroko & Franklin Chang. 2001. Long before short preference in the production of a head-final language. Cognition 81(2). B45–B55.
Yang, Charles. 2010. Three factors in language variation. Lingua 120(5). 1160–77.
Yang, Charles D. 1999. A selectionist theory of language acquisition. In Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics, 429–35. Association for Computational Linguistics.
Yang, Charles D. 2000. Internal and external forces in language change. Language Variation and Change 12(3). 231–50.
Yang, Charles D. 2002. Knowledge and Learning in Natural Language. Oxford: Oxford University Press.
Yang, Charles D. 2004a. Toward a theory of language growth. In Lyle Jenkins (ed.), Variation and Universals in Biolinguistics, 37–56. Amsterdam: Elsevier Science.
Yang, Charles D. 2004b. Universal grammar, statistics or both? Trends in Cognitive Sciences 8(10). 451–6.
Zanuttini, Raffaella. 1991. Syntactic Properties of Sentential Negation: A Comparative Study of Romance Languages. Philadelphia, PA: University of Pennsylvania dissertation.


Zanuttini, Raffaella. 2008. Encoding the addressee in syntax: Evidence from English imperative subjects. Natural Language and Linguistic Theory 26(1). 185–218.
Zeevat, Henk, Ewan Klein, & Jo Calder. 1987. Unification Categorial Grammar. In Nicholas J. Haddock, Ewan Klein, & Glyn Morrill (eds.), Categorial Grammar, Unification Grammar and Parsing, vol. 1, Edinburgh Working Papers in Cognitive Science. Centre for Cognitive Science, University of Edinburgh.
Zwart, Jan-Wouter. 1995. A note on verb clusters in the Stellingwerf dialect. Linguistics in the Netherlands 12(1). 215–26.
Zwart, Jan-Wouter. 1997. Transitive expletive constructions and the evidence supporting the multiple specifier hypothesis. In Werner Abraham & Elly van Gelderen (eds.), German: Syntactic Problems—Problematic Syntax, 105–34. Tübingen: Max Niemeyer.
Zwart, Jan-Wouter. 2007. Some notes on the origin and distribution of the IPP-effect. GAGL: Groninger Arbeiten zur germanistischen Linguistik 45. 77–99.


Language Index

Acehnese 159
Agul 122–5
Akkadian 76
Ambonese Malay 159
Arizona Tewa 56
Armenian 115
Basque 132, 163
Bininj Gun-wok 151
Blackfoot 129–30
Brazilian Portuguese 60, 63
Chamorro 114
Chinese 179, 190, 268–70, 272
Choctaw 163–4
Classical Tamil 58
Dalabon 257
Dutch 219–21, 223
East-Tucanoan 122
Finnish 116, 131
French 13–14, 44, 56, 59, 63, 64, 82, 227, 229 n4, 235
German 73–4, 76, 80, 82–3, 86, 93, 94–5, 176–7, 198ff, 200–12, § 8.4–5, 220–3, 238
Germanic 73, 74, 86, 197–224
Guaraní 158
Gyarong 128
Hindi 126–7, 130, 191, 193
Hungarian 191, 193
Icelandic 121–2, 134
Italian 13, 44–5, 56, 62–4, 117, 155, 268–70, 272
Jacaltec 268–70, 272
Japanese 177, 192, 246 n6, 268–70, 272
Kannada 126
Khasi 193
Kiezdeutsch 86
Kolyma Yukaghir 57
Korean 182, 192
Kurok 55
Kuuk Thaayorre 116
Maltese 131, 133
Meskwaki 181
Middle English 209, 229, 238, 260–1
Mohawk 245, 247, 263–4, 268–70, 272
Navajo 268–70, 272
Norwegian 76
OHG, see Old High German
Old English 204ff, 232
Old French 82
Old High German (OHG) 82, 93–5, 215–18, 224
Old Kannada 59
O’odham 256
Panoan 251
Pirahã 55, 245 n4
Plains Cree 30, 180–1, 246–7, 252–4, 256, 259, 263–4, 268–70, 272
Proto-Germanic (PG) 205 n3, 217, 224
Punjabi 126
Rembarrnga 158
Russian 30, 33, 114, 115, 131, 177, 228, 268–70, 272
Salish 56, 128, 129
Southeast Puebla Nahuatl 256
Spanish 56, 61–4, 80, 130, 132
Tagalog 268–70, 272
Tariana 122
Tennet 193


Urarina 251
Warlpiri 56, 133
Warray 157
West Flemish (WF) 220–1
Wubuy 167
Yupik-Inuit 56
Zaza 131
Zürich German (ZT) 220–2


Author Index

Aboh, E. 49
Ackema, P. 193
Ackerman, F. 118 n7, 120, 238
Ahenakew, F. 148
Aikhenvald, A. 55, 120, 122, 251
Aissen, J. 120, 132, 146 n12
Aldridge, E. 268 n26
Alishahi, A. 136, 139
Allen, C. 205, 209, 213 n9, 236
Ambridge, B. 73
Andrews, A. 121, 122
Ansaldo, U. 88 n16
Arkadiev, P. 118 n7, 119 n8, 120
Armstrong, N. 59
Arppe, A. 253 n12
Ashby, W. 59
Audring, J. 10, 26, 28 n14, 243 n2, 244 n3, 250, 255 n13
Augustinus, L. 221 n13
Auwera, J. van der 231
Axel, K. 82, 93, 94, 211 n7, 215, 216, 217
Bach, E. 223
Bækken, B. 79
Baker, M. 33, 40, 50, 155, 243, 244, 245, 247, 248, 256, 262, 263, 264, 267 n23, 268 n25, 269 n29, 270 nn32 and 34, 272, 273
Baltin, M. 98, 258 n18
Baptista, M. 49, 77
Barðdal, J. 121 n12, 134
Barton, E. 88 n16
Baunaz, L. 45
Bean, M. 205 n3, 208, 209
Bech, K. 206 n4
Belletti, A. 44
Bennett, P. 238
Bergeton, U. 230 n5
Berwick, R. 43, 67, 75, 82
Bhaskararao, P. 120
Biberauer, T. 45, 197 n1
Bickel, B. 128, 133

Bickerton, D. 46–9, 79
Bies, A. 206 n4
Biezma, M. 61
Birchall, J. 120, 121 n11
Blain, E. 180, 268
Blevins, James 250 n8
Blevins, Juliette 250 n8
Blümel, A. 200
Bobaljik, J. 219 n12
Bod, R. 69
Boeckx, C. 45
Bolhuis, J. 43, 67
Borkowski, W. 100
Bouchard, D. 14 n13
Brandner, E. 200
Bresnan, J. 9, 97, 192 n25
Briscoe, E. 70, 71, 72
Broadwell, G. 163, 164
Brown, R. 214
Bybee, J. 69, 75, 88 n17, 250
Campbell, L. 88 n17, 131, 213, 257 n16
Cecchetto, C. 193, 223 n16
Chang, F. 192 n24
Chater, N. 90 n20
Cheng, L. 269
Chomsky, N. 3–7, 14 n12, 29, 41, 42 n1, 43–5, 47, 65–7, 167 n1, 168–9, 170, 183–4, 189, 200, 225, 227, 231, 244, 246, 249, 250, 258 nn17 and 18, 270
Chung, C. 182
Chung, S. 114
Cinque, G. 13, 29, 44, 45 n8
Clark, B. 210 n6
Condoravdi, C. 68
Coppola, M. 48, 49
Cournane, A. 75
Craenenbroeck, J. van 29
Craig, C. 267 n22, 269 n28
Creissels, D. 158 n9, 163
Crisma, P. 197 n1


Croft, W. 33, 51 n13, 72
Culicover, P. 6, 7, 9, 11 n8, 12 n10, 14, 17 n1, 23, 24, 27, 29, 30, 33 n15, 35, 43, 44 n7, 45, 47, 51, 52 n14, 53, 66, 69, 72, 73, 74, 75, 76, 88 n16, 91, 94 n24, 96, 97, 98, 99, 100 n30, 106, 112, 113, 136, 139, 141, 146 n1, 162, 164 n14, 168, 173, 174, 175, 176, 183, 188, 192, 199, 203, 209, 214, 219, 222, 227 n2, 230, 232 n7, 242, 248 n7, 251 n11, 252, 256, 260, 261, 275
Dahl, Ö. 88 n16
Dahlstrom, A. 150, 154, 181, 246, 252
Darwin, C. 276
Dautriche, I. 146
Davids, I. 114
Davis, H. 56
De Smet, H. 250 n8, 260, 261
Deo, A. 68, 96
Deutscher, G. 76
Dik, S. 122 n13
Dixon, R. 55, 72
Dobashi, Y. 97
Donohue, M. 159
Dowty, D. 8, 20, 36, 118, 119, 120, 122, 123, 136, 137, 142, 145, 146 n1, 161, 227, 238, 253
Dryer, M. 52, 55 n16, 56, 150, 160, 251, 262, 265, 266, 267 n21, 270, 271, 272
Dufter, A. 6
Dum-Tragut, J. 115
Durie, M. 159
Eckardt, R. 68
Ellegård, A. 233
Ellis, C. 270 n37
Embick, D. 13 n11, 169
Emonds, J. 33
Ende, F. van 221 n13
Engdahl, E. 236
Evans, N. 55, 59, 72, 151, 157, 158, 264
Everett, D. 55, 245 n4
Evers, A. 216 n10
Fanselow, G. 191
Fauconnier, S. 120
Feurer, H. 270 n35
Filipović, L. 77
Fillmore, C. 12 n8, 111

Fischer, O. 197 n1, 250
Fisher, C. 50, 119
Fitch, W. 5 n2
Fodor, J. 75
Fortuin, E. 114
Fried, M. 88 n17
Friederici, A. 47
Gaby, A. 116
Gahl, S. 69
Ganenkov, D. 122, 123
Gärtner, H.-M. 200, 203, 268 n27
Gast, V. 228, 230
Gelderen, E. van 228–30
Gibson, E. 75, 82, 91 n21, 92, 191 n22
Ginzburg, J. 170
Gisborne, N. 68
Giudice, A. 261
Givón, T. 88 n16
Gleitman, L. 190
Goh, G.-Y. 236, 238 n12, 240
Goldberg, A. 12 n9, 17, 18 n3, 25, 87 n15, 261
Goldin-Meadow, S. 48
Greenberg, J. 52, 89, 264
Grimm, S. 118 n7, 120, 126, 161, 162
Haegeman, L. 220, 222
Haider, H. 217 n11
Hale, K. 267 n24
Halle, M. 13 n11, 169 n4
Han, C.-h. 61, 62, 64
Hana, J. 251 n11
Harley, H. 7 n5, 13 n11, 21, 169 n3
Harrigan, A. 253 n12
Harris, A. 88, 91 n21, 257 n16
Harris, Z. 26
Haspelmath, M. 45, 56, 132, 232 n8, 251, 262, 266, 270, 271, 272
Haugen, J. 256, 257 n15
Hauser, M. 43
Hawkins, J. 29, 48, 52, 76, 77, 81, 90, 91 n21, 95, 96 n26, 106, 232, 242, 248, 249, 251, 265
Heath, J. 163, 164
Heine, B. 49, 88 n17
Hinterhölzl, R. 206 n4, 213 n8
Hirose, T. 152, 153
Hoff-Ginsberg, E. 69


Hofmeister, P. 52, 91
Holton, G. 121 n11
Hoop, H. de 120, 126, 127
Hopper, P. 118
Hornstein, N. 258
Hornung, A. 213 n9
Horvath, J. 191, 193
Hukari, T. 171 n6, 172 n7, 189 n21
Huybregts, R. 220
Itkonen, E. 250 n9
Jackendoff, R. 6, 8, 9, 10, 11 n8, 12, 17 n1, 19, 24, 26, 27, 28, 32, 35, 43 n4, 44 n7, 47, 55 n17, 66, 69, 88 n18, 96, 111, 112 n1, 119 n9, 136, 146 n1, 162, 164 n14, 168, 171 n5, 227 n2, 243 n2, 244 n3, 250, 252, 255 n13, 275
Jacobson, P. 9
Jäger, Agnes 74, 221
Jäger, Andreas 231–3
Jäger, G. 68, 87, 88 n17
Jelinek, E. 117, 155, 263
Jenks, P. 251 n10
Jespersen, O. 56, 213 n9
Johnson, D. 4
Jurafsky, D. 9
Kako, E. 69
Kalin, L. 126 n18
Kaplan, R. 9
Kay, P. 12 n9
Kayne, R. 43, 45, 75, 118, 155 n8, 192 n25
Keenan, E. 146, 160, 161
Kemenade, A. van 76, 197 n1, 205, 206, 209, 211
Kim, J.-B. 182
Kim, K. 129, 130
Kinsella, A. 5
Kintsch, W. 43
Kiparsky, P. 88 n17, 205, 207, 211, 213, 250 n8, 267 n21
Klaiman, M. 120
Klamer, M. 121 n11
König, E. 228
Krifka, M. 170 n4
Kroch, A. 18, 78, 213 n8, 230 n6, 231, 233
Kroskrity, P. 56
Kubota, Y. 9

Kuhlmeier, V. 119
Kulikov, L. 120
Kuno, S. 192 n24
Kuryłowicz, J. 33 n15
Laanemets, A. 236
Ladd, D. 5
Landau, B. 146
Landau, I. 175 n10
Lappin, S. 4
Larson, R. 125 n17
Lasnik, H. 4, 43
Laughren, M. 56
Law, P. 268 n27, 269 n31
LeSourd, P. 117
Levine, R. 9, 24, 171 n6, 172 n7, 189 n21
Levinson, S. 55, 59, 72, 227
Lewis, Richard 92
Lewis, Robert 150 n4
Lieven, E. 73
Lobeck, A. 96, 97
Longobardi, G. 197 n1
Los, B. 197 n1, 206
MacWhinney, B. 249 n7
Mahajan, A. 191, 193
Malchukov, A. 120
Marantz, A. 13 n11, 169
Marcus, G. 5
Martineau, F. 59
Maslova, E. 57
Master, A. 59
McKay, I. 128, 129
McNally, L. 71 n3
Mengarini, G. 129
Michaelis, L. 12 n9, 19
Miestamo, M. 56, 59, 88 n16
Mithun, M. 5, 139 n22, 146, 147 n2, 158, 160, 252, 256, 263, 264
Mondorf, B. 80
Moore, J. 118 n7, 120, 238
Morrill, G. 9
Mougeon, R. 59
Mühlbauer, J. 270 n33
Müller, G. 191
Müller, S. 27 n13, 200
Mycock, L. 191
Mylander, C. 48


Næss, A. 120
Nagano, Y. 128
Naigles, L. 69, 146
Narasimhan, B. 126, 127
Neeleman, A. 193
Newmeyer, F. 45, 75 n5, 88 n16
Nichols, J. 112
Niyogi, Partha 75, 82–4, 234
Noël, D. 68, 88
Nowak, A. 69, 72, 91, 100 n30
Noyer, R. 7 n5, 13 n11, 169 n3
Oehrle, R. 9
Oxford, W. 149 n3, 151
Pancheva, R. 230 n5
Partee, B. 17, 38
Patten, A. 68
Paul, H. 8
Pederson, E. 59
Peitsara, K. 228
Perlmutter, D. 12, 118
Peters, S. 43
Petrova, S. 206 n1
Pilot-Raichoor, C. 59
Pinker, S. 43 nn2 and 4
Pintzuk, S. 78, 197 n1, 205, 206, 207
Pollard, C. 9, 168, 244 n3
Pollock, J.-Y. 65
Preston, L. 88 n16
Primus, B. 118 n7, 119, 120, 122, 130, 132, 160 n10
Progovac, L. 48
Pullum, G. 42 n1
Reichenbach, H. 8
Reinhart, T. 98 n25, 153 n6, 225
Reis, M. 200
Reuland, E. 153 n6, 225
Rhodes, R. 150 n4
Riemsdijk, H. van 220, 222, 258 n17
Ritchart, A. 97
Ritchie, W. 43
Rizzi, L. 29, 30, 44, 98 n28
Roberts, C. 136 n21
Roberts, I. 45, 197 n1
Rochemont, M. 29, 30, 113

Rohdenburg, G. 87 n15
Rose, S. 251 n11
Rosen, C. 12
Ross, J. 44, 52, 91, 168, 177, 187, 188
Roussou, A. 197 n1
Sabel, J. 191
Sadock, J. 56, 61, 264
Safir, K. 227
Sag, I. 9, 11, 21, 28 n14, 52, 72, 96, 168, 170 n4, 230, 244 n3
Sampson, G. 88 n16, 249 n7
Sands, K. 131
Santorini, B. 78
Sapir, E. 244–5, 272
Sapp, C. 206 n4, 217 n11
Sato, Y. 97
Saulwick, A. 158
Saussure, F. de 250 n9
Schallert, O. 217 n11
Schauber, E. 270 n36
Schlachter, E. 217 n11
Schmid, T. 74, 221, 222, 223
Schmirler, K. 253 n12
Scholz, G. 42 n1
Searle, J. 67 n23
Senghas, A. 48, 49
Seržant, I. 120, 121, 129, 133
Shakespeare, William 79, 80
Sheehan, M. 197 n1
Shibatani, M. 88 n16
Siemund, P. 228
Sims, A. 232 n8
Sinnemäki, K. 120
Slobin, D. 8, 55 n17, 75, 231, 250 n9, 253
Smith, A. 59
Smolensky, P. 76 n6
Sobin, N. 97
Sprouse, R. 82 n10
Sridhar, S. 126
Stabler, E. 223 n16
Steedman, M. 9
Sternefeld, W. 94 n24, 175
Stevenson, S. 136, 139
Stump, G. 11, 21
Subbarao, V. 120
Swart, P. 120
Sweetser, E. 68


Taylor, A. 78, 206, 207
Tesar, B. 76 n6
Thomason, S. 77, 85
Thompson, S. 118
Thráinsson, H. 121 n12
Tieken-Boon van Ostade, I. 231
Tomasello, M. 69, 72, 73
Traugott, E. 68, 88 n17, 250 n8
Travis, L. 94 n24
Trousdale, G. 68, 88 n17, 250 n8
Trudgill, P. 77, 79, 88 n16
Uszkoreit, H. 9
Valin, R. van 158 n9
Vance, B. 82 n10
Varaschin, G. 228
Vincent, N. 197 n1
Vitányi, P. 90 n20
Walkden, G. 208
Wallah, S. 86 n14
Wanner, D. 250 n9
Wasow, T. 23, 29, 249 n7
Watanabe, A. 192
Weinberg, A. 258

Westergaard, M. 76
Wexler, K. 43, 52 n14, 74, 75, 82, 91
Whaley, L. 193
Whitman, J. 52, 267 n21
Wichmann, S. 120
Wier, T. 118 n7
Wiese, H. 86 n14
Wiltschko, M. 33 n15, 51 n13
Winkler, S. 23, 24, 53, 76, 91, 94 n24, 176, 199, 203
Witzlack-Makarevich, A. 120, 121, 129, 160
Wolfart, H. 150, 269 n30
Woodbury, A. 56
Woolford, E. 125
Wurmbrand, S. 219, 222
Yamashita, H. 192 n24
Yang, C. 10 n7, 75 n5
Yu, A. 69
Zanuttini, R. 63, 65
Zeevat, H. 9
Zwart, J.-W. 199, 221 n13
Zwicky, A. 61


Subject Index

Adverb Effect 97–8
agent role 36, 118–20, 127–8, 134, 137–9, 147, 161, 165, 214, 236, 250–1, 264
analogy 243–4, 262, 264–6, 267 n21, 272–3, 275
animacy 119, 122, 129, 132–3, 136–9, 142–3, 145
antipassive 163
autonomy of syntax 8, 145
bias 85, 100, 102–4, 106, 234
branching harmony, see linear order, harmonic
Cartography 29, 49
case-marking 86, 114, 116, 125, 126 n18, 127, 134, 141, 146, 161–3, 165, 237, 272
Categorial Grammar 9, 14, 15, 70
CCore, see Conceptual Core
chain 9, 90–2, 168–9, 173–4, 176, 184
Chomsky’s problem 3, 4, 6, 16, 41, 67, 107
clitics 25 n10, 56, 61–4, 209–11, 229 n4
coercion 236, 239–40
cognition 4, 5, 6, 7, 55 n16, 80, 87 n15, 119, 134
complexity
  computational 5, 29, 52, 87, 190, 222, 248, 250 n9, 251
  interpretive 192, 228
  representational 213, 229
compositionality 12, 17, 18 n2, 25, 70, 72, 88
  and p-passive 236–8, 240
Conceptual Core (CCore) 5, 6, 7, 51, 144, 170
conceptual structure 5, 7, 9, 15, 19, 41, 42, 47, 51 n13, 53, 77, 106, 181, 242, 274
construct 9, 10, 18
  licensing of 31–6
contact 5, 55, 76 n7, 77, 79, 85, 86, 100, 106, 213 n9, 243, 244

Construction Grammar 12 n9
constructions (formalized)
  direct thematic correspondence 150, 253
  English reflexive 226
  gap 172, 189
  German topicalization 201
  inverse thematic correspondence 150
  inversion (in question) 203, 211
  ModE subject 214
  NP-initial 214
  object 24, 113
  OE-nominative 214
  OE VP-initial V 205
  OHG V1 215
  overt pro 82
  Pro-drop 82
  Slifting 188
  subject 24, 112
  VP-final V 203
  VP-initial V 23, 32
  V2 202, 216
  wh-question 173, 204, 211
  wh-in-situ 178
contraction of is 96–7
core grammar 3, 41–3, 47
correspondences (constructional) 17, 18 n2, 19–21, 24, 29, 31, 36–7, 66, 68–70, 88, 133
correspondences (formalized)
  active ⇔ passive 27, 237
  declarative ⇔ yes–no question 26
creole languages 41, 46–9, 79
cryptoconstructional 14, 53, 58, 60, 64, 78 n3, 152, 170, 183, 217, 230, 231, 242, 259
dependency length 90–1, 242
dependent-marking 112, 115, 135, 160, 271
discourse structure 8, 9, 54
double object construction 17, 25, 89


E-language 3, 42 n1
economy 4–7, 15, 51, 65–6, 76–7, 80, 85, 87–90, 106, 241–3, 249–50, 252, 254, 261–2, 264–5, 267, 270, 273–5
ellipsis, see VP ellipsis
ergativity 122, 123, 127, 133, 158, 160, 163, 256 n14, 277
evolution of language 5, 275, 277
experiencer role 120, 126, 138, 143–4, 161–2, 251
extraposition 206, 217
focus 24, 29, 30, 53, 113, 186
Force (the) 47, 49, 50
freezing 30, 53, 91, 92
gap 97–8, 102, 166–8, 170–5, 182 n14, 184–93, 269
GB, see Government Binding theory
generalization 6 n4, 50, 68–75, 89–90, 96, 133, 139, 160, 225, 228–30, 239, 250–2
GF hierarchy 24, 147, 226
GFs, see grammatical functions
Government Binding theory (GB) 4, 14, 15, 225, 258 n19
grammatical functions (GF) 9, 12, 15, 19, 22, 23, 24, 25 n10, 111, 145, 147, 158, 160–5
grammaticalization 68, 88, 213, 225, 250 n9
Head-Driven Phrase Structure Grammar (HPSG) 9, 15, 19 n5, 244 n3
head marking 112, 116, 127, 129, 135, 156, 160, 165, 263, 271
Heavy NP Shift 30
HPSG, see Head-Driven Phrase Structure Grammar
I-language 3, 4, 42 n1, 44
idioms 9, 11, 20, 22, 23, 28, 73, 93, 161 n12, 274
incorporation 152–3, 157, 247, 256–7, 263
information structure 8, 9, 12, 23, 30, 54, 207, 217 n11
innovation (constructional) 49, 75–7, 95, 133

inverse morphology 150–1, 155, 253, 263
IPP (Infinitivus Pro Participio) construction 73–4, 221–2
island constraints 52, 168
Left Dislocation 117
Levenshtein distance 256
Lexical-Functional Grammar (LFG) 9, 14, 15, 19 n5
licensing 9, 10, 17–19, 27–8, 31–40, 88
linear order
  harmonic 68 n15, 76, 112, 207, 222–3, 242, 248, 251, 252, 255, 265
  in constructions 19, 22, 29
macroparameter 44–5, 245, 248, 262, 263–4
Mainstream Generative Grammar (MGG) 3, 4, 7, 12, 13, 14, 15, 19 n5, 26, 41, 42, 44, 44 n7, 45, 53, 57, 58 n20, 61, 118, 145, 164, 167 n1, 169, 171, 183, 186 n17, 197 n1, 200, 216, 229, 274
mesoparameter 45, 245
MGG, see Mainstream Generative Grammar
microparameter 45, 245
Minimalist Program (MP) 4, 5 n3, 14, 15, 44, 47, 67, 168 n2
movement 9, 12–4, 29–30, 44, 62, 66, 75, 168, 187
  rightward movement 192 n25, 193
  V-movement 200, 202, 205, 216
  wh-movement 171, 184, 191, 192, 193
multiple grammars 10 n7, 18, 77, 78–9
nanoparameter 45, 245
Niyogi’s Law 84, 100, 135, 207, 234
null subjects 45, 82 n10, 93–5
object construction, see constructions (formalized)
oblique object 24
Optimality Theory 76, 210 n6
OSV order 163, 209
paradigm function 10, 11, 13, 20, 21, 32, 59, 65, 118, 128, 147, 263
Parallel Architecture 19, 32


parameters 3, 4, 6, 7, 15, 41, 44–6, 61, 75, 78 n8, 139, 206, 242–5, 257, 259, 262, 264
partial wh-movement 191
participle agreement 44
passive 25–7, 28 n14, 147, 161 n12, 162, see also p-passive
patient role 158, 159, 161–4, 167, 263
periphery 4, 6, 7, 29, 43, 61
PLD, see primary linguistic data
polysynthesis 152, 180, 245–7, 252, 256, 259, 262–4
p-passive 236–40
prepositional passive, see p-passive
preposition stranding, see p-stranding
primary linguistic data (PLD) 3, 69, 75, 76, 133, 135, 136, 207
Principles and Parameters theory (PPT) 4, 6, 14, 15, 43, 44
pro-drop 44, 82, 229
p-stranding 248, 257–9
recipient role 264
reflexive 49, 160
Relational Grammar 12, 14, 15, 27, 162
relative clause 98–100, 169–70, 172, 174–5, 189, 190, 236
Role and Reference Grammar 14
SAI, see subject Aux inversion
scope 189–90
  interrogative 167–8, 178–80, 191–2, 203, 269
  negative 58–60, 261–2
  relative 175, 178
scrambling 86, 114, 177, 201, 270
sign language 47–9, 193
Sign-Based Construction Grammar 244
Simpler Syntax 9, 12, 15, 35, 66, 83, 194, 275
sister schemas 26
social network 5, 6, 12, 18, 69, 277
SOV order 52, 115, 192, 265, 266
split intransitivity 120, 158–60, 163
subject Aux inversion 24, 79, 81, 203–4, 211, 214, 224, 230, 231, 232, 234, 248, 261, 262
subject construction, see constructions (formalized)
SVO order 70, 82–3, 114, 115, 159, 163, 201, 208, 210
thematic hierarchy 145, 150, 157, 161, 163, 213
thematic roles 8, 36, 50, 72 n4, 136, 145, 161, 207, 213, 258, 265, see also agent role, patient role
tiers 9, 19, 20, 31, 36–8, 39, 69, 70, 136, 274
Toolkit Hypothesis 47, 53
topicalization 29, 83, 86, 95, 170, 182 n18, 197, 201, 204, 208, 211–12, 215–18, 224, 230–1, 234, 269–70
transitivity 118, 120, 150, 153, 158 n9
UG, see Universal Grammar
uniformity 12, 44 n7, 53, 57, 58, 66–7, 244
Universal Base Hypothesis 43
Universal Grammar (UG) 3, 41–4, 45–8, 169
universals 3, 5, 7, 15, 24, 41, 50–3, 55, 67, 72 n4, 111, 144, 171, 262, 264–72
verb clusters 86, 198, 219–23
verb projection raising (VPR) 219–20
VP ellipsis (VPE) 96, 230–1, 234
VP-final V 94, 203, 205–6, 211, 216, 220–1
VP-initial V 23, 31, 32, 35, 38, 205, 211, 224
VP-topicalization 176–7, 231, 234
VSO order 52, 114, 192, 193, 265, 266
V2 76, 82–3, 93, 95, 200–12
wh-in-situ 44, 79, 102, 183, 190–3, 268–70
wh-question 99, 102, 170, 173, 176, 178–83, 190–3, 198, 203–4, 211, 214, 217, 224, 268–70
yes-no question 26, 85, 102, 198, 203, 214, 217, 224, 269, 270

subject construction, see constructions (formalized) SVO order 70, 82–3, 114, 115, 159, 163, 201, 208, 210 thematic hierarchy 145, 150, 157, 161, 163, 213 thematic roles 8, 36, 50, 72 n4, 136, 145, 161, 207, 213, 258, 265, see also agent role, patient role tiers 9, 19, 20, 31, 36–8, 39, 69, 70, 136, 274 Toolkit Hypothesis 47, 53 topicalization 29, 83, 86, 95, 170, 182 n18, 197, 201, 204, 208, 211–12, 215–18, 224, 230–1, 234, 269–70 transitivity 118, 120, 150, 153, 158 n9 UG, see Universal Grammar uniformity 12, 44 n7, 53, 57, 58, 66–7, 244 Universal Base Hypothesis 43 Universal Grammar (UG) 3, 41–4, 45–8, 169 universals 3, 5, 7, 15, 24, 41, 50–3, 55, 67, 72 n4, 111, 144, 171, 262, 264–72 verb clusters 86, 198, 219–23 verb projection raising (VPR) 219–20 VP ellipsis (VPE) 96, 230–1, 334 VP-final V 94, 203, 205–6, 211, 216, 220–1 VP-initial V 23, 31, 32, 35, 38, 205, 211, 224 VP-topicalization 176–7, 231, 234 VSO order 52, 114, 192, 193, 265, 266 V2 76, 82–3, 93, 95, 200–12 wh-in-situ 44, 79, 102, 183, 190–3, 268–70 wh-question 99, 102, 170, 173, 176, 178–83, 190–3, 198, 203–4, 211, 214, 217, 224, 268–70 yes-no question 26, 85, 102, 198, 203, 214, 217, 224, 269, 270