A Unified Theory of Syntactic Categories [Reprint 2015 ed.]
 9783110808513, 9783110130546

Table of contents :
Acknowledgments
Introduction: The Bar Notation and the Adjacency Hypothesis
Chapter 1: The Source of Categorial Asymmetries; Indirect θ-Roles and Generalized Case-Marking
1.1. Primitive Categories and Heads of Phrases
1.2. A Preliminary Account of Specifiers
1.3. Subject Phrases
1.4. Two Base Composition Rules
1.4.1. Complement Phrases outside X̄
1.4.2. Complement Phrases inside X̄
1.5. Direct θ-role Assignment Exemplified
1.6. Indirect θ-role Assignment
1.7. The Asymmetry in Noun and Verb Complement Systems
1.8. A Generalized Theory of Abstract Case
1.8.1. The Nature of Case Categories
1.8.2. A Uniform Principle of Case Assignment
1.8.3. Examples of Case Assignment
1.8.4. Case-based Definitions of Grammatical Relations
Chapter 2: The Revised θ-Criterion, Clausal Subcategorization, and Control
2.1. The “Understood Subject Property” of Non-finite Verbs
2.2. The Distribution of Gerunds
2.3. Subjects of Gerunds
2.4. The Revised θ-Criterion
2.5. Structure-Based Arguments for Bare VP’s
2.6. Technical Implications of Bare VP’s for Lexical Insertion, Subcategorization, and Obligatory Control
2.7. A Note on Recoverability
2.8. Implications for Morphology and Concluding Remarks
First Appendix to Chapter 2: The Empty Head Principle
Second Appendix to Chapter 2: Verb Raising in Dutch and German
Chapter 3: Clausal Word Order and Structure-Preservation
3.1. General Characteristics of S Expansions
3.2. Base Word Orders and Observed Word Order Patterns
3.3. Motivations for S and VP as Vᵏ, k ≥ 2
3.3.1. Motivations for S as Vᵐᵃˣ
3.3.2. Motivations for VP as V²
3.4. Subject Prominence vs. Topic Prominence inside S
3.5. The Structure-Preserving Constraint
3.6. Movements of SP(V) = I
3.7. Local Movements of V
3.8. Greenberg’s VSO Universals as Evidence for a Universal VP under S
3.9. Conclusion
Chapter 4: Grammatical Formative Categories and the Designation Convention
4.1. The General Nature of Syntactic Categories
4.2. Closed Categories
4.3. Disguised Lexical Categories
4.4. Unique Syntactic Behavior of Closed Class Items
4.5. Suppletion as a Property of Closed Categories
4.6. The Designation Convention and the Epiphenomenon of Auxiliary Verbs
4.7. Late Lexical Insertion
4.8. Post-transformational Insertion of Grammatical Verbs, Nouns, and Adjectives
4.9. Conclusion
Chapter 5: Principles of Inflectional Morphology
5.1. Inflectional vs. Derivational Morphology
5.2. The Genesis of Inflectional Morphology
5.3. The Relation between Words and Syntactic Units
5.4. The Language-Particular Nature of Inflection and the Adjacency Hypothesis
5.5. Tense-Inflection, Modals, and Verbs
5.6. A Comparison of French and English Verbal Inflection
5.7. The Source of Morphological Case
5.8. The False Case of English Pronouns
5.9. Conclusion: the Ephemeral Morphological Component
Chapter 6: Subordination and the Category P
6.1. The Range of PP Structures
6.2. Intransitive Prepositions
6.3. The Prepositional Copula as
6.3.1. The Predicate Nominal after Non-comparative as
6.3.2. The PP Status of Non-comparative as with NP
Chapter 7: S̄ as P̄ and COMP as P
7.1. [P, -WH] = that/ø
7.2. [P, +WH] = if/whether
7.3. P = lexical subordinating conjunction
7.4. [P, +GOAL] = for/ø
7.5. P as a Landing Site for WH-movement
7.6. Subcategorization for Elements in COMP
7.7. Explanations of Transformations Using COMP and S̄
7.7.1. Extraposition of S̄
7.7.2. Topicalization of S̄
7.7.3. Main Clause COMP-Deletion
7.7.4. Appositive Relative Structures
7.8. The Role of S̄ and COMP in a Theory Constraining Movements
7.9. The Status of Comparative than/as Clauses
7.10. Conclusion
Appendix to Chapter 7: The Generalized Distribution of WH
Bibliography
Index


A Unified Theory of Syntactic Categories

Studies in Generative Grammar

The goal of this series is to publish those texts that are representative of recent advances in the theory of formal grammar. Too many studies do not reach the public they deserve because of the depth and detail that make them unsuitable for publication in article form. We hope that the present series will make these studies available to a wider audience than has hitherto been possible.

Editors: Jan Koster, Henk van Riemsdijk

Other books in this series:
1. Wim Zonneveld, A Formal Theory of Exceptions in Generative Phonology
2. Pieter Muysken, Syntactic Developments in the Verb Phrase of Ecuadorian Quechua
3. Geert Booij, Dutch Morphology
4. Henk van Riemsdijk, A Case Study in Syntactic Markedness
5. Jan Koster, Locality Principles in Syntax
6. Pieter Muysken (ed.), Generative Studies on Creole Languages
7. Anneke Neijt, Gapping
8. Christer Platzack, The Semantic Interpretation of Aspect and Aktionsarten
9. Noam Chomsky, Lectures on Government and Binding
10. Robert May and Jan Koster (eds.), Levels of Syntactic Representation
11. Luigi Rizzi, Issues in Italian Syntax
12. Osvaldo Jaeggli, Topics in Romance Syntax
13. Hagit Borer, Parametric Syntax
14. Denis Bouchard, On the Content of Empty Categories
15. Hilda Koopman, The Syntax of Verbs
16. Richard S. Kayne, Connectedness and Binary Branching
17. Jerzy Rubach, Cyclic and Lexical Phonology
18. Sergio Scalise, Generative Morphology

Joseph E. Emonds

A Unified Theory of Syntactic Categories

1985 FORIS PUBLICATIONS Dordrecht - Holland/Cinnaminson - U.S.A.

Published by: Foris Publications Holland P.O. Box 509 3300 AM Dordrecht, The Netherlands Sole distributor for the U.S.A. and Canada: Foris Publications U.S.A. P.O. Box C-50 Cinnaminson N.J. 08077 U.S.A.

CIP-DATA Emonds, Joseph E. A Unified Theory of Syntactic Categories / Joseph E. Emonds. - Dordrecht [etc.] : Foris. (Studies in Generative Grammar ; 19) With ref. ISBN 90-6765-091-9 bound ISBN 90-6765-092-7 paper SISO 837.4 UDC 801.56 Subject heading: syntax ; generative grammar.

ISBN 90 6765 091 9 (Bound) ISBN 90 6765 092 7 (Paper) © 1985 Foris Publications - Dordrecht. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission from the copyright owner. Printed in the Netherlands by ICG Printing, Dordrecht.

This book is dedicated to my friends Mitsou Ronat, Judith McA'Nulty, and Alfredo Hurtado

It would be painful and perhaps pointless to bring out here the many ways in which I feel I benefited both personally and professionally from knowing these linguists. It is a great sorrow to me to think that these three are gone. I wish each of them could read this book.

Acknowledgments

My primary intellectual debt in writing this book is to all those recent authors in generative grammar who collectively continue to create linguistic science. Foremost among them, of course, is Noam Chomsky. Among these, I would also like to single out some colleagues who have, in the four years I have been writing this book, repeatedly encouraged me in concrete ways to continue to develop my research. Besides those more directly involved who are mentioned below, these include Ann Banfield, Hagit Borer, Margot Griffin, Randall Hendrick, Kazuko Inoue, Masaru Kajita, Jean-Claude Milner, Anne Rochette, Thomas Roeper, and Wendy Wilkins. Indebtedness on a personal scale runs, of course, on a continuum, and there are many friends in linguistics whom I am indebted to besides those mentioned here.

In particular, Alfredo Hurtado invited me to work on the topic of control in Spanish gerunds (Ch. 2) and provided a forum where it could be fruitfully discussed (the Workshop on Spanish Syntax at Simon Fraser University). Shalom Lappin likewise arranged for a series of lectures at the University of Ottawa, which proved to be an opportunity for productive interchange with those that attended. Mark Baltin, Noam Chomsky, Judith McA'Nulty, Lisa Travis, and Hanna Walinska de Hackbeil have each been kind enough to go over a chapter carefully and provide me with very useful material and commentary.

This book has been written while I have been in the Linguistics Department at the University of Washington. It is certainly the members of this department who are most responsible for maintaining a tension-free, intellectually stimulating, personable, and progressive atmosphere in which teaching and writing can be a pleasure as well as labor. They do this in spite of the obstacles of woefully insufficient funding, particularly for graduate students.
All of my colleagues on the faculty, Michael Brame, Heles Contreras, Georgette Ioup, Ellen Kaisse, Frederick Newmeyer, and Sol Saporta have encouraged and assisted me in dozens of ways, directly and indirectly, both with respect to the content of this book and with respect to daily aspects of my job, each in ways characteristic of them alone. I thank each of them for their intellectual, personal, and political integrity. It is a rare privilege to be an academic who has for each of his departmental colleagues a deep personal and intellectual respect.

The students in the University of Washington department have made many important contributions to this project; it is they who have suffered through certainly erroneous preliminary and partly thought-out versions of the ideas developed here, both in courses and in hours of discussions. Their patience, interest, criticism, and suggestions have been an indispensable means for improving my material. I am especially appreciative of those who have discussed many of the topics in this book while doctoral candidates, even though their treatments and mine diverge in countless ways. These include Abdulaziz Alshalan, Hanna Walinska de Hackbeil, Nobuko Hasegawa, Hajime Hoji, Mi-Jeung Jo, Anne Lobeck, Rosemarie Whitney, Young-Jae Yim, and Karen Zagona. Similarly, I profited from many discussions with Julia Horvath, Sharon Klein, and Vida Samiian, especially while on their U.C.L.A. doctoral committees.

Henk LaPorte and the staff employed by Foris have been conscientious throughout the production process; I especially have appreciated his enthusiastic and cooperative attitude about this project. Anita Hoffman and Rita Liang have been very perceptive and efficient in the difficult preparation of the bibliography and subject index, respectively, and I am grateful to each of them.

My son, Peter Emonds-Banfield, has (usually) been understanding and tolerant when my plea of "having to work," both on this book and at the university, has left him spending many hours on his own during which he might have rightfully expected my company. For his patience I am grateful.

I acknowledge a special debt to the departmental staff at the University of Washington, Anita Tabares-Laws, Peter Skaer, and Corinne Moore, for their cooperation and help throughout the arduous task of preparing the manuscript. Like the faculty, they have facilitated the administrative part of my job in every way, so as to make this research possible. Moreover, they have patiently typed and retyped the greatest part of the manuscript and seen to its being reproduced properly. Anita Tabares-Laws in particular has kept a whole range of departmental affairs running smoothly to the benefit of faculty research.
Without her, I could not have chaired the department, written a book, and remained arguably sane. My gratitude goes out especially to Frederick Newmeyer, Henk van Riemsdijk, and Rosemarie Whitney, who have each edited almost the entire manuscript, reading it both critically and sympathetically. They have given generously of their time, bringing to bear on this task intellectually specialized appreciations of the issues that I have greatly benefited from. Many of their suggestions have been incorporated or otherwise taken into account without specific acknowledgment. Colleagues like them have provided me during the last three years with the confidence I needed to follow through on my own ideas, whether with or against the current.


Introduction

The bar notation and the adjacency hypothesis

For centuries, the determination of the categories of syntax and the principles governing their combination have formed the study of grammar. Within the framework of generative grammar, the central morpheme categories "X" have been determined to be the noun, verb, adjective, and preposition (X = N, V, A, P). All phrasal categories used inside sentences are hypothesized to be "projections" of the lexical categories Xʲ (j = a small integer), where each Xʲ has one and only one X as its "lexical head." The centrality of the X and the restriction of phrasal categories to Xʲ is called the "bar notation" (Chomsky, 1970; Emonds, 1976, Ch. 1; Jackendoff, 1977).

Questions immediately arise: Do all languages realize the same inventory of categories? Do these categories combine in the same way in all languages? Are their principles of combination relatively simple? Are the combinatorial properties of these categories to some extent autonomous, rather than being completely derivable from other principles, such as the principles of the lexicon or of semantics? Finally, and I think centrally, what are the essential, defining properties of N, V, A, and P? I will comment on the first four questions and then return to the final question concerning the nature of each bar notation category. Basing my conclusions both on other work and on the material presented in this book, I intend to justify strong affirmative claims for the first four questions above, with the following qualifications:

(i) Categorial Uniformity. The categories defined in terms of the bar notation, Xʲ and SPECIFIER (X) (cf. sections 1.2, 4.1, and 4.2 for specifiers), do not differ from language to language, but their subcategories which are realized in each language's syntax may vary.

(ii) Hierarchical Universality. The range of permitted hierarchical combinations of syntactic categories does not vary from language to language at the level of deep structure.¹ However, different restrictions on the linear order of constituents may be stated for this level.

(iii) Syntactic Asymmetry. The principles determining the hierarchical combinations of syntactic categories at deep structure are simple, in the sense that no device as complex as a set of "phrase structure rules" is needed. However, these principles are not simply a "natural logic" of predicate-argument structure, as envisaged in the tradition of generative semantics; rather, the lexical categories X = N, V, A, P appear in them in limited non-symmetric ways, with a level of complexity somewhat akin to that envisaged by Gruber (1965). The asymmetries are determined by the defining characteristics of each category, to be discussed below.

(iv) Autonomy from the Lexicon. The theory of permitted deep structure categorial combinations is independent of the lexicon, in the sense that these combinations do not follow solely from the organization of the lexicon. The permitted deep structures are partly dependent, however, on the semantic component, i.e., on the principles which determine the semantic interpretation of predicate-argument structures, as I will argue in Ch. 1.

1. The syntactic level of deep structure, at which Hierarchical Universality across languages holds, is understood here essentially as presented in Chomsky (1976). Such deep structures are related to partially "observable" syntactic surface structures by a highly restricted set of transformational operations, many of whose properties will be discussed in this book, particularly in Ch. 3. Surface structures are "observable" in that the left-right sequence of morphemes in a well-formed surface structure always corresponds to an acceptable string of pronounced morphemes in a language. However, surface structures also contain hierarchical structure, empty categories, and possibly indices which are not pronounced. Moreover, some surface structures are ill-formed by virtue of filters or restrictions that apply to the logical and phonological forms derived from them. Finally, some acceptable strings of morphemes may either be not directly generated by the grammar at all (so-called "derivatively generated" strings) or be generated only by virtue of an optional stylistic reordering of a surface structure string. Since surface structures are not to be equated with "occurring strings of morphemes," for the above reasons, I will typically refer to them by the more abstract term of "s-structures", in conformity with much current usage.

The answers that I will develop for the above questions - that a small inventory of grammatical categories organized according to universal and simple principles at an abstract deep structure level in a syntactically autonomous way suffices for elegant and empirically adequate descriptions of natural languages - clearly identify this work as Chomskyan (for example, according to the criteria in the Introduction of Newmeyer, 1980). If the academic field of linguistics were a science, the above statement, except for the reference to the subject matter (e.g., "grammatical," "syntactically"), would be a truism. However, since linguistics contains subfields in which a scientific approach has not been successful, as well as an overflow of practitioners within syntax and phonology who deny the scientific status of even these areas either explicitly or implicitly, it is appropriate to state at the outset that, as a generativist, I am attempting to construct a scientific theory of syntax, and by this attempt, I affirm that it is presently possible to do so.

Each of the above claims (i)-(iv) has controversial implications, several of which will be developed in detail in this book. Thus, Categorial Uniformity (i) implies that all languages, including verb-initial and verb-final languages, should have a verb phrase (VP) distinct from the sentence (S) if and only if one language does (Ch. 3). Hierarchical Universality (ii) implies, for example, that if a certain syntactic category exists in a language, it must appear in the same deep structure position as it does in other languages. For example, verbal INFLECTION, being the category of the English modals, can be argued to be a deep structure sister to VP (Emonds, 1976, Ch. 6). By Hierarchical Universality INFLECTION can be a deep structure sister to VP in English if and only if it appears in the same position in French deep structures, since both languages clearly exhibit this syntactic category in their surface structure tense endings on verbs.

While (i) and (ii) are important and far from trivial claims about natural language, what distinguishes this book from other recent generative treatments of the base, or deep structures, are the claims of Syntactic Asymmetry (iii) and Autonomy from the Lexicon (iv). For concreteness, I will first compare the approach of this book to that of two other generativists who have recently written extensively on the base component, Jackendoff (1977) and Stowell (1981), and then I will discuss my approach on its own terms. Both of these authors have relatively well-worked-out theories and share many of my assumptions. We agree not only on the existence of a transformational component that maps deep structures onto surface structures ("s-structures"), but also more or less on the range of constructions that are to be considered "transformationally derived" rather than "base-generated." We further agree on many aspects of the bar notation theory; i.e., that there are four lexical categories, that phrasal categories are projections of lexical categories, and that other rules of grammar are to be stated in terms of the bar notation categories.
Thus, it is of interest to highlight, in a preliminary and cursory way, some basic differences between this book and the work of Jackendoff and Stowell.

Jackendoff's view of (iii) is that, of the asymmetries among N, V, A, and P, the fundamental ones are: (a) only nouns and verbs may take subjects, (b) only verbs and prepositions take prepositionless objects.² In my view, these statements are not sufficiently general. With respect to (a), I argue in Chs. 1, 2, and 3 that the fundamental asymmetry between the verb and other lexical categories is that the verb takes an extra projection (or "bar level") not allowed with N, A, and P. From this, several properties peculiar to V, and its projections VP and S, will be shown to follow, including some differences between subjects in S's and subjects in NP's that do not fall out naturally in Jackendoff's system. Similarly, I will argue that Jackendoff's (b) should be generalized to "only verbs and prepositions take prepositionless complements," and that this statement, when properly formalized, significantly broadens the scope and import of Jackendoff's proposal.³

2. At the outset, Jackendoff (1977, 32) states that the feature names that differentiate the lexical categories have only a "heuristic, nontheoretical significance." If this is meant to be true of his features (+ SUBJ, + OBJ), then Jackendoff's phrase structure rules are at best a catalog of descriptive generalizations which, on their own terms, invite further study to separate out the fundamental from the derivative distinguishing characteristics of the lexical categories. However, Jackendoff subsequently claims, for example, that the (surely theoretical) definition of "subject" utilizes the feature SUBJ (p. 41). Thus, my statement accurately reflects Jackendoff's practice, in spite of his disclaimer.

In contrast to the asymmetries just discussed, Jackendoff imposes a parallelism across the lexical categories which commits him to a claim that parallel subcategorizations and interpretations largely determine the deep structure categorial combinations; i.e., his view on (iv) is that the base component is less autonomous than the one I develop here. Specifically, Jackendoff's claim (1977, 36) is that parallel grammatical relations in deep structure are expressed by parallel hierarchical configurations; in Ch. 1 and 2, I strongly dispute this. While I retain structural definitions of "subject" and "object" as NP arguments to a lexical head X which state that they are external and internal, respectively, to X¹, I do not require, as does Jackendoff, that the subject be a sister to a fixed projection of X and that the object be a sister to X. Rather, my definitions of grammatical relations, while simply stated, allow a range of differently situated N and NP to serve as subjects and objects. The resulting interplay between the determination of the grammatical relations (by the semantic component) and the possible deep structures (by the categorial component) permits, as I argue in Ch. 1 and 2, explanations of otherwise unmotivated morphological distinctions and morpheme-insertion rules.

The asymmetries among the lexical categories in Stowell's work are attributed to theoretical statements involving case-theory and government-theory, and are not directly expressed in deep structure categorial combinations.
This might seem to be a metatheoretical advantage over my claim (iii), that there are asymmetries in the behavior of various X at deep structure, provided Stowell's statements actually had wider empirical coverage than do the ones I will propose in their stead. But, as I will argue, the opposite is true. For example, for Stowell, case theory determines that V and P can have prepositionless NP objects, while N and A cannot (the discussion here is of English). As mentioned above, I will establish in Ch. 1 the more general proposition that only V and P can have prepositionless complements. In order to accommodate this, the theory of NP case must be extended in some way. My proposal (for simplifying the categorial component) is to replace Stowell's enrichment of the stipulations of case theory with an asymmetric principle of θ-role assignment. A simplified general principle of base-dependent case assignment then operates freely on deep structures constrained rather by principles of θ-role assignment. I claim for this system both empirical and theoretical advantages over Stowell's system.

3. The status of NP complements to A in Chinese (Huang, 1982, Ch. 2), Persian (Samiian, 1983, Ch. 3), and Korean (Jo, in preparation) invites clarification. In languages with morphological case such as German and Latin, these NP's invariably exhibit some oblique case rather than the accusative case (van Riemsdijk, 1983). There is syntactic evidence that oblique cases are PP's at deep structure (cf. Schein, 1981) with a phonologically empty P. Thus, it may be that the NP complements to A established in the above works are of the form [PP [P ø] NP] at deep structure. Morphological case gives additional support for the dative being associated with a deep PP structure: the German dative is the usual case after a lexical preposition, and in Latin, the unmarked prepositional case (the "ablative") is always identical to the dative in the plural. The hypothesis that a dative belies a deep structure PP is discussed in more detail at the end of Ch. 1 and in section 5.7.

Stowell derives the special properties of the subject of a V indirectly from the role that the special grammatical category INFLECTION plays in his theory of government. Translating terminology somewhat, a second asymmetry across lexical categories in Stowell's system is that VP is the only maximal projection which is always the sister to the grammatical formative category INFLECTION that appears with it (by virtue of a special categorial rule expanding S as a projection of INFLECTION that supplements the theory of government). In contrast, the grammatical formative category that characteristically appears with N, namely DET (determiner), is a daughter, not a sister to NP, and similarly for the grammatical category of DEG (degree words) that appears with A. Within the bar notation, this asymmetry follows from my proposal that V, but not N, A, or P, has a third projection in the bar notation.
F o r in t h e bar n o t a t i o n , each lexical category X is paired with a c o r r e s p o n d i n g g r a m m a t i c a l formative category SP(X), called a specifier, which is a d a u g h t e r t o t h e maximal projection of X. If S = V 3 , then we can t a k e I N F L E C T I O N to be the specifier of V, a n d it follows t h a t it is the sister to VP( = V 2 ), while S P ( N ) a n d SP(A) are d a u g h t e r s to N P a n d AP. T h u s , m y general claim t h a t only V has a third projection explains why a special g r a m m a t i c a l f o r m a t i v e category associated with V (i.e. I N F L ) c a n a p p e a r outside V 2 , while t h e same is n o t true for N a n d A. But since t h e claim t h a t S = V 3 has o t h e r implications as well (set out in s o m e detail in Chs. 1, 2, a n d 3), this statement is m o r e general t h a n a s e p a r a t e rule, used by Stowell, which stipulates t h a t a special category I N F L is a sister t o VP.4 W i t h respect to A u t o n o m y f r o m the Lexicon (iv), Stowell's a p p r o a c h differs f r o m m i n e in that he assumes t h a t the t h e o r y of the lexicon, yet t o be specified, d e t e r m i n e s the u p p e r limits of complexity for subcategorization frames. While I agree t h a t this is true for the subcategorization of V's, I will a r g u e that a n u m b e r of systematic discrepancies in t h e s u b c a t e g o r i z a t i o n f r a m e s associated with V's a n d the c o r r e s p o n d i n g N's a n d A's can be predicted f r o m t h e interplay of verb s u b c a t e g o r i z a t i o n s

4. This introductory discussion of the role of INFLECTION in both Stowell's and my theories deliberately glosses over the rather complicated theoretical apparatus we utilize to explain various characteristics of non-finite clauses, and is meant only to give the reader a very general idea of the direction to be pursued in this study. A more complete exposition of my own ideas on non-finite clauses appears in Ch. 2 and 7 of this book.


with asymmetric θ-role assignment principles (Chs. 1 and 2), so that the burden of explanation provisionally placed on the as-yet-undeveloped theory of possible lexical entries is greatly reduced.

I hope that this brief comparison of the approach of this book with those of its "closest relatives," the works by Jackendoff and Stowell, has given some indication of how much room there is for argument, even among those who agree on a relatively wide range of methodological and theoretical points. In order to adequately justify what I feel are considerable improvements over some of their formulations and some points in Chomsky (1981), a book-length study has seemed necessary. Needless to say, if I have succeeded in making such improvements, I owe these authors a great debt for their thorough and insightful contributions to the elucidation of the same basic problems.

I would not want the reader, however, to conceive of this book first and foremost as a comparison of my views with those of other authors. The book has its internal logic. In it, I attempt to treat rather exhaustively all the principles that I feel bear on the deep structures of language. A universal syntax of deep structure must include statements of combinations allowed in the bar notation, as well as definitions of the basic grammatical relations (subject, object, indirect object) and a theory of how heads assign semantic roles (= "θ-roles") to complements. As these statements are developed in Ch. 1, in particular for the "open" lexical categories N, V, and A, what I take to be fundamental laws governing and setting apart the category N and the category V quickly emerge. These laws are closely linked to the notions of grammatical subject and complement. Only a noun (phrase) can be a subject. Only a verb can take complement types freely. The implication of these extreme restrictions on combining open lexical categories is that a fourth ("closed") head category P emerges.
The property of P is that, like V, it can take any complement, and at the same time "transmit" a semantic role from an open lexical category head (N, V, A) to its object. I argue at length in Ch. 1 that P's provide sufficient but also necessary structure for free combinations of open lexical categories. The emphasis placed on P as the sine qua non of many grammatical combinations is the basis for the subject matter of the last chapters in the book. In particular, Ch. 5 argues for the necessity of P in indirect objects and in other structures exhibiting oblique morphological case. Ch. 6 shows that a (non-case-marking) P is structurally required with a wide range of predicate attributes. Ch. 7 claims that any "subordinator," including the much-discussed S-introductory COMP, is also a structurally-induced P, allowing an X to assign a θ-role to an S. Thus, these chapters all testify to the centrality of the structural links provided only by P, as discovered in Ch. 1. Another fundamental difference between V and all other heads which


emerges in Ch. 1 is that only V has distinct second (VP) and third (S) projections in the bar notation. This distinction leads to separate studies, in Chs. 2 and 3 respectively, of the conditions under which V^2 (VP) can occur alone, and of the properties of V^3 (S). Since the third projection of V tolerates more transformational deformations of deep structures than do the X^2, Ch. 3 is the natural place to introduce the structure-preserving principle, which sets limits on the divergence between deep structures and s-structures. In particular, the notion of local, language-particular transformation is introduced and exemplified. Once the effects of such rules on regular underlying structures are taken into account, it can be seen how certain recalcitrant language types in fact conform to Categorial Uniformity and Hierarchical Universality at deep structure.

Chs. 4 and 5 further investigate the role of local, language-particular rules in obscuring similarities among underlying categorial combinations across languages. Further principles which apply only to closed categories, the Designation Convention of Ch. 4 and the Invisible Category Principle of Ch. 5, are shown to combine with local rules so as to yield apparently quite diverse surface structures, particularly in the area of inflection. But at the same time, these principles, as well as the local rules themselves, can sanction only a limited range of operations, so that the strong claims about the sparsity and uniformity of deep categorial combinations developed in Chs. 1-3 can stand. While I have made some claims about what local, language-particular rules can do in Chs. 3-5, I have deliberately excluded much material, to be published separately, that elaborates a restricted theory of language-particular transformations.
In many sections of this book, I develop ideas which I consider crucial for universal grammar, even though they are not presently central points of contention with many authors working in a generative framework. This is especially true in the last two chapters, where I concentrate on the role of the categories P and PP as the main means of subordination and free recursion in syntax. In the last chapter, where I assimilate the category COMPLEMENTIZER to P and the category S to P, I thereby deny that any phrasal category can escape the bar notation. These results, which I think have been fruitful and are certainly potentially controversial, should not be understood as opposed to presently elaborated alternative theories.5 Rather, my proposals about COMP and S are natural simplifications of a presently utilized system of categories, some of whose members happen to stand in a quite complex and previously unnoticed relation of complementary distribution (e.g., P and S are in this relation).

5. The work of Freiden and Babby (1983), which argues that INFLECTION is the head of S, has come to my attention after work on Ch. 7 was completed. One of their arguments is that there is an agreement rule between COMP and INFL, analogous to an agreement between SP(N) and N, which suggests to them that COMP = SP(INFL). My proposals in Ch. 2 and 7 attribute the infinitival "to" to a different cause; I argue that an infinitive form of an S arises if and only if the surface subject NP contains no terminal element. The analysis of for-phrase subjects which supports my position is given in Ch. 7, and that of English "raising to object" constructions is given in Emonds (1980a).

6. Culicover and Wilkins (1984) is organized around a locality principle of exactly this type. M. Halle (pers. comm.) attributes the following remark to the physicist L. Tisza: "All physics is an attempt to define adjacency." If so, the Adjacency Hypothesis would suggest that "physics" be replaced by "science."

Another hypothesis about the form of universal grammar which is not widely contested but which is centrally important in much present-day theory construction is what I will call the "Adjacency Hypothesis." At several points in this book and throughout my separate work on language-particular transformations, I try to refine and strengthen this proposal, even though it is not the principal subject matter here. Borer's (1984) parametric model, for example, incorporates such a constraint.

Adjacency Hypothesis: No language-particular rule of any type makes use of a string variable.

For many years, and still in the minds of many linguists, syntax has been the only component of linguistic description in which it is thought necessary to use symbols which refer to strings of arbitrary length (= "string variables"). If we can demonstrate that such variables are never necessary in language-particular statements, an extremely strong claim about natural language systems embodied in the Adjacency Hypothesis emerges: Given a complete and accurate definition of "adjacent", no child can ever learn a dependency particular to some but not all natural languages which is stated in terms of elements related at a distance.6 A quick survey of the components of a typical formal linguistic model reveals that the burden of demonstrating the generality of the Adjacency Hypothesis falls mainly on syntacticians. Thus, it is a commonplace for those working on morphology, whether or not they postulate separate morphological components, to claim that rules of morphology involve only segments that are syntactically adjacent (cf. Roeper and Siegel, 1978, and Ch. 5 here). Similarly, most of the recent developments in formal phonology, both in the "metrical" and the "autosegmental" veins, have been motivated by the twin observations that (a) the majority of phonological processes involve obviously adjacent segments, and (b) the language-particular aspects of those which do not (e.g., prosodic and harmony phenomena) can be recast as local by the proper elaboration of universal phonological theory (personal communication, Morris Halle, Jean-Roger Vergnaud). A recent summary of the formal differences and similarities of autosegmental and metrical phonology (stressing their motivated similarities and the lack of motivation for their differences) makes a point of just this sort:

"A final parallel between the two systems rests on an observation made by John Goldsmith . . . In autosegmental phonology, there are two types of spreading. One is maximal and proceeds by a general convention, the well-formedness condition, and the other is confined to a limited domain, e.g. only one or two syllables, and this is achieved by a language-specific rule . . . In metrical phonology, analogously, there are essentially two types of tree: unbounded and binary, whether we follow McCarthy's 1979 system or Hayes' 1980 system." (Leben, 1982, 6; I am indebted to Ellen Kaisse for pointing out the passage to me.)

Within syntax proper, there have never been string variables in language-specific base rules. It seems therefore more than plausible that theoretically constrained formal replacements for such mechanisms (such as those developed here in subsequent chapters) can maintain the claim that a deep structure categorial component requires no string variables in its language-specific aspects. Lexical entries, even in their syntactic specifications, are generally assumed to be structured so that no string variables are needed; this is Chomsky's (1965, Ch. 2) claim that lexical subcategorization is local; in this book, local subcategorization will be discussed in Chs. 1 and 2.

The rules of formal semantic interpretation, while probably requiring string variables, supposedly do not vary greatly across languages. To the extent that they do, the required rules can hopefully be stated without recourse to a string variable. Thus, suppose that English and Chinese differ by some semantic rule which is equivalent to "Quantifier Q_i may have wide scope." (For discussion of the relevant data, see Huang, 1982, Ch. 4).
If the Adjacency Hypothesis holds true for semantics, then the notion of "wide scope" would have to be automatically determined by universal grammar, even though the actual value of Q_i could vary across languages, and even depend on other factors of the grammar of the language in question. It seems needlessly pessimistic to conclude that universal semantics could not provide definitions of notions like "wide scope", and for this reason I have not included any provisos whatsoever with the Adjacency Hypothesis, since I am confident it can be made to hold in syntax.

Within syntax, the only component that even plausibly falsifies the hypothesis by containing language-specific string variables is the transformational component. Not many years ago, say in 1970, it would have been non-controversial within transformational grammar to claim that the Adjacency Hypothesis is easily falsified even by the most obvious syntactic differences among languages. Thus, one would have said: Japanese and English differ in that English and not Japanese has a transformational rule preposing a phrase marked with the feature WH to sentence-initial position over a string variable Y. That is, the following rule would have been assumed to be part of the grammar of English, but


not of Japanese (X^max = a maximal phrase, such as NP, AP, or PP):

Y - X^max[WH] - Z ==> 2 + 1 - ∅ - 3

Similarly, French, but not English or Japanese, would have been assumed to contain a transformation with a string variable Y by means of which pronominal objects are moved to a pre-verbal clitic position:

NP - V + Y - PRONOUN - Z ==> 1 - 3 + 2 - ∅ - 4 (Kayne, 1975, Ch. 2).

Today, following a research program initiated in Chomsky (1976), it is assumed practically throughout generative grammar, however different the analyses proposed for such phenomena as "WH-fronting" and "clitic-placement", that these discrepancies in the particular grammars of Japanese, English, and French are not due to language-particular statements containing string variables. The differences are rather attributed to differences in these languages' structures at the "landing site" positions of these movement operations; e.g., Japanese might have no sentence-initial COMP node, and English and Japanese might have no preverbal clitic position defined in their base components; alternatively, within a general transformational "move α" schema, a Japanese-particular statement might specify "α ≠ WH" or English and Japanese statements might require "α = PHRASE". Whether it is a question of specifying base positions or of setting restrictions of the form "A ≠ B" elsewhere in the syntax, the language-particular statements in question do not contain a string variable. In Ch. 3 and 5, I will go into more details about how and why we can consider movements across string variables in syntax to be always due to an interplay of universal statements involving such variables and language-particular statements involving category memberships or category adjacency conditions. More extensive syntactic justification for the Adjacency Hypothesis is also the subject matter for another volume on language-particular rules, alluded to above.
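The numbered structural changes in the two older-style rules above can be read mechanically. The following Python sketch is only an illustration under assumptions of mine: the function name `apply_structural_change` and the list-of-lists encoding of the change are hypothetical and are not notation from the transformational literature. Terms of the factored string are numbered from 1, each output slot lists the terms placed there, and an empty slot stands for ∅.

```python
def apply_structural_change(terms, change):
    """Rearrange the factored terms of a string according to a structural change.

    terms  -- the substrings matched by the structural description, in order
    change -- one list per output slot, holding 1-based term numbers;
              an empty list stands for the null symbol
    """
    result = []
    for slot in change:
        # Join the terms assigned to this slot; an empty slot yields "".
        result.append(" ".join(terms[n - 1] for n in slot))
    return result


# Clitic placement as once assumed for French:
# NP - V+Y - PRONOUN - Z  ==>  1 - 3+2 - (null) - 4
clitic = apply_structural_change(
    ["NP", "V Y", "PRONOUN", "Z"],
    [[1], [3, 2], [], [4]],
)

# WH fronting as once assumed for English:
# Y - X^max[WH] - Z  ==>  2+1 - (null) - 3
wh = apply_structural_change(
    ["Y", "WH-phrase", "Z"],
    [[2, 1], [], [3]],
)
```

The sketch shows only what the older notation encodes. The argument of the text is that the variable Y in such rules need not appear in any language-particular statement.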
But even at places where I do not emphasize the Adjacency Hypothesis in this book, the relevance of it to various proposals will be remarked, and in a number of places the hypothesis will influence the choice of formal treatment. The Adjacency Hypothesis, my hypotheses about S and COMP in Ch. 7, and a number of others throughout the book (e.g., those on inflectional morphology in Ch. 5 and the Designation Convention in Ch. 4) are independently justified in a wide variety of ways, and may be easily accepted by readers who might not wish to accept some of the more controversial views of the first three chapters. I would have liked to begin with the less controversial material, but it has seemed in the course of


writing that the basic syntactic and semantic relations between heads and complements, which include subcategorization, control, and what counts as "unmarked movement," must logically precede in my exposition the last four chapters, which deal with grammatical formative categories. So the reader is unfortunately to be exposed to the most controversial (but to my mind, equally well-supported) material first.

The formal plan of the book is then as follows:

Chapter 1: the projections of phrases built around lexical heads X^j; general principles of θ-role assignment, the definitions of grammatical relations, and a theory of abstract case.
Chapter 2: the lexical representation of head-complement relations; subcategorization, the θ-Criterion, and obligatory control; establishing that VP does not imply S.
Chapter 3: the simple unmarked syntactic movements; establishing that S does imply VP.
Chapter 4: grammatical formative categories which are independent words, especially those of category X = N, A, V; the Designation Convention.
Chapter 5: grammatical formative categories which are bound morphemes.
Chapter 6: the non-lexical head-of-phrase P and the range of its projections.
Chapter 7: reducing COMP to P and S to P; the role of the nonrecursive, initial symbol E.

In general, the intention of this book is to elucidate as much as possible the formal and empirically justified relations among the bar notation categories, by studying in a relatively complete way those syntactic phenomena in English and French (with some reference to work on other languages when appropriate) which highlight both the differences and similarities among these categories. The goal of such a study, like that of any study of formal syntax, is to set limits on the type of theoretical devices and categories that need to be employed in insightful descriptions of both universal and language-particular grammatical processes.
By contemplating these devices and the relations among these categories, we then see a likeness of the power and beauty of the speaking mind, and why it must be respected and developed in each creature that has it.

Chapter 1

The source of categorial asymmetries; indirect θ-roles and generalized case-marking

1.1. Primitive Categories and Heads of Phrases

Traditional and generative grammar agree that central among the categories of syntax are the "major lexical categories": nouns (N), verbs (V), and adjectives (A). The characteristic of major lexical categories is that in English and in most languages they usually contain upwards of a thousand members listed in a lexicon. Also, in typical daily use of language, neologisms ("coinings" of new words) are restricted to these categories. Besides these three major lexical categories, languages have only what can be called "grammatical categories", that is, categories which have at most about twenty or so members, and which are not modified by neologism. Throughout this study, the symbol L varies over exactly the three values N, V, A.1 So we begin with the claim that in the realm of syntax, all morphemes are either in a lexical category or in a grammatical category.

It is by now familiar in formal accounts of syntax that these lexical and grammatical categories can combine only in certain sequences, and that many of these combinatorial regularities are to be expressed by a set of phrase structure rules or principles of deep structure that generate labelled bracketings of morpheme sequences, called deep structures. The well-formed morpheme sequences of deep structure we will label here as E (for "expression", following Banfield, 1973). Among the various E, the type that for centuries has rightfully been a principal focus of investigation by students of grammar and semantics is the one which may express a "judgment", in the sense of Frege (see the discussion in Kuroda, 1975). This type or "expansion" of E is called a sentence or, when attention is on its structure rather than on its sense, a clause, and it is notated S.
1. The productive class of adverbs that end in -ly in English is considered to consist of adjectives with an ending. Cf. the discussions in Jackendoff (1977, section 2.3) and Hendrick (1978). The category preposition has more than twenty members in many languages. Its special status as a grammatical category which is also a "head of a phrase" is taken up in detail in the latter part of this chapter.

In the familiar Indo-European languages, morpheme sequences which have the structure of an S generally must contain a grammatical category


expressing "tense" and must not be contained in a larger S in order to express a judgment. However, even if these two criteria are not met, grammarians usually do not hesitate to assign the label S to the sequences in question, if their deviance from the structure of well-formed judgments is minimal. Thus, in the grammar developed here, all the italicized sequences in (1) are called "S", even though only the first sequence expresses a judgment. (1)

Yes indeed, somebody will start dishing the children out their lunch.
Somebody start dishing the children out their lunch.
I don't know if somebody will start dishing the children out their lunch.
For somebody to start dishing the children out their lunch wouldn't be appropriate.

Once the categories E and S, the major lexical categories L (= N, V, A), and the grammatical categories are admitted into the theory of syntax, the question that arises is whether any intermediate subsequences of categories should be assigned category labels. Again, it is by now widely accepted that any other such category which can occur in deep structure and which can contain a lexical category is structured around an obligatory "head" category X, and is notated X^i, where i is a small integer. The possible values for X are L and also P, where P (which usually corresponds to the traditional term "preposition") is a head which is a grammatical rather than a lexical category. The categories L^i and P^i are called "phrases", and when i is maximal for a given X (notated "X^max"), we say that X^i is an "X-phrase", or a "maximal projection of X". Thus, N^max is a "noun phrase", and X^max is alternatively notated "XP". This notation for phrases is called the "bar notation", in that X^i can be alternatively written as X with i bars over X (e.g. X^2 = X̿). In this regard, I will sometimes use the term "particle" for P, and hence PP = P^max = particle phrase. In this study, no formal difference distinguishes "prepositional" and/or "post-positional phrase" from "particle phrase"; for justification, see section 6.2.

2. Variants in which certain non-maximal projections can be sisters to X are proposed in Ronat (1973) and Zagona (1982); a restricted use of this idea is made in Ch. 2 of this study. In coordinated constituents, we probably want to say that there are multiple heads. Cf. Dougherty (1970).

The particular variant of the bar notation I will begin with here assumes that each deep structure phrase X^i, i > 0, consists of a unique and obligatory head X^j, possibly accompanied by certain non-head grammatical categories, by maximal projections, and by S.2 The sequences of categories which make up X^i or S are called the immediate constituents or daughters of X^i or S. Some interesting questions can be raised concerning the value of "max" appropriate for each value of X. Jackendoff (1977) advances the "uniform


three-level hypothesis", where max = 3 in all cases. In Emonds (1979), I argue that for N, the value of max is 2. In this study, I further claim that V contrasts with N, A, and P in that the value of max for V is 3; I return to this matter below.

It is generally recognized that phrase structure rules are inadequate for expressing linguistic generalizations about well-formed bar notation deep structures. One thorough critique of phrase structure rules can be found in Stowell (1981, Ch. 2). Even Jackendoff, in his attempt to provide a more or less complete set of English phrase structure rules, expresses reservations in the end on whether such rules can adequately express descriptive generalizations (1977, 81-85). Many of these criticisms can be summed up under the following two very general and I think very telling points: First, phrase structure rules make it impossible to clearly distinguish the contributions of universal grammar from aspects of particular grammars; among other things, certain left-to-right orderings expressed by classical phrase structure rules are language-particular, while most of the hierarchical structure they assign is either arguably universal, or in any case can hardly be asserted to be language-particular. Second, phrase structure notation wrongly implies that too large a number of different sets of base rules for deep structures are possible.

As an example of these two points, consider the fact that grammatical formatives which are not affixes or clitics precede their head in the majority of languages and across all values of X^j. (X varies across N, A, V, P and j varies from 0 to 2.)
Even if all languages do not conform to this type, it is surely the case that many languages (e.g., English and French) conform "on the whole"; individual phrase structure rules for expanding the various X^j would thus fail to reflect this property in a revealing way. This generalization can be expressed as (2).

(2) Head Placement for Non-phrasal Modifiers: In deep structure, all immediate constituents of X^i which are not clauses or phrases precede the head of X^i. (Y^j, where j > 0, is called a phrase.)3

Some ways in which English and French conform to (2) are as follows. The English verb is preceded not only by the auxiliary, but also by negation and by certain unmodifiable adverbs A (of the scarcely type; cf. Emonds, 1976, Ch. 5.). French negation words (pas, point, guère, jamais, etc.) and certain other unmodifiable adverbs follow the finite verb in surface structure, but it is shown on independent grounds in Emonds (1978) that the finite (but not infinitival) verb moves transformationally to the left over these grammatical formatives, so that in fact French verbal negation confirms (2) in an interesting way.

3. For a treatment of grammatical categories in which a similar principle plays a role in a Categorial Grammar framework, see Flynn (1983).


Similarly, the English and French head noun is preceded in deep structure not only by determiners, but also by negation, by unmodifiable adjectives, by numerals, etc. While such grammatical formatives are not necessarily members of a single archi-category such as SPECIFIER(X), they are not phrases Y^i, i > 0, and hence, by (2), they precede the head:

(3) Not one person did I see. (only order possible)
The three houses on the block are old.
*The houses three on the block are old.
The (*very) other reason for this is ...
*The reason other for this is ...
John's (*most) principal objections to that are ...
*John's objections principal to that are ...
A mere mention of that would ...
*Too mere a mention of that would ...
*A mention mere of that would ...
John has (*very) barely finished.
*John has finished barely.

A possible objection to (2) might be made on the basis of the English post-verbal "particle" node PRT, which often appears in discussions of the English VP. However, it has been argued in Emonds (1972) and never seriously refuted that such particles are instances of PP in deep structure.4

A striking confirmation of the validity of Head Placement (2) comes from a consideration of morphology. In Chapter 5, it will be argued that all English and French inflectional morphology (which follows the head in surface structure) is derived either from pre-head positions in deep structure or from a transformational adjunction to X' in such a way that (2) is not violated.5 Thus, in deep structure, the categories from which the English tense suffixes, the English plural marker, the English adjectival comparative markers, etc. are derived all precede the X' to which they are attached in surface structure.

4. Jackendoff (1977) considers such particles to be a bar notation phrase, and as such, they could not be a counterexample to (2). However, Jackendoff attributes phrasal status to all categories, so a generalization akin to (2) in his system would have to be stated in terms of his features +COMP and +DET. Without some revision in the distribution of these features, the generalization cannot be straightforwardly expressed in such terms.

5. My argumentation that inflectional morphology is transformationally derived would not have been controversial ten years ago. A different position is put forward in Lieber (1980). In Chapter 5, I will counter some of her arguments, but I accept what seems to be her most interesting claim, namely that rules of the same type are needed to express certain generalizations about derivational and inflectional morphology (cf. her Ch. 2). I think what is crucial is that when syntactic phrases Y^j (j > 0) are being composed, it is forbidden to use subcategorizations whose domain is Y^0. Beyond this, it may well be that (derivational) rules which apply inside Y^0 and (syntactic) rules which apply outside Y^0 are of the same type.
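Head Placement (2) can be stated as a simple check on the daughters of a phrase. The following Python sketch is a minimal illustration, assuming a flat list of daughters in which each one is marked as a phrase (or clause) or not and exactly one is marked as the head; the class `Daughter` and the function `satisfies_head_placement` are hypothetical names of mine, and the category labels DET and NUM are used only for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Daughter:
    label: str
    is_phrase: bool = False  # True for a phrase (bar level above 0) or a clause
    is_head: bool = False


def satisfies_head_placement(daughters):
    """Return True if every non-phrasal, non-head daughter precedes the head."""
    head_positions = [i for i, d in enumerate(daughters) if d.is_head]
    if len(head_positions) != 1:
        raise ValueError("expected exactly one head")
    head = head_positions[0]
    return all(
        i < head
        for i, d in enumerate(daughters)
        if not d.is_phrase and not d.is_head
    )


# "The three houses on the block": the numeral precedes the head noun.
good = [
    Daughter("DET"),
    Daughter("NUM"),
    Daughter("N", is_head=True),
    Daughter("PP", is_phrase=True),
]

# "*The houses three on the block": the numeral follows the head noun.
bad = [
    Daughter("DET"),
    Daughter("N", is_head=True),
    Daughter("NUM"),
    Daughter("PP", is_phrase=True),
]
```

Under these assumptions `satisfies_head_placement(good)` holds and `satisfies_head_placement(bad)` does not, matching the contrast in (3). The phrasal PP is left unconstrained, since (2) says nothing about the order of phrases and heads.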


The scope of (2) extends even to derivational morphology. If we follow the interesting argumentation of Williams (1981) and Lieber (1980) to the effect that the head of a word composed by derivational morphology is its category-determining derivational suffix (speaking for example of English, Latin, Polish, French, etc.), as exemplified in (4) and (5), then (2) holds for the value i = 0, as well as for the syntactic cases, where i>0. (4)

[N [V [N organ] [V ize]] [N ation]]

(5) [tree diagram; structure not recoverable from the source; node labels A and N; terminals: al, nuclear, physic(s), ist]
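The claim that the category-determining suffix is the head of a derived word, as in (4), can be sketched in a few lines of Python. This is an illustration under assumptions of mine, not the formalism of Williams or Lieber: the mini-lexicon `LEXICON`, the tuple encoding of trees, and the function names `leaf`, `attach`, and `bracket` are all hypothetical. Each entry records a morpheme's own category and, for an affix, the category it is subcategorized for.

```python
# Hypothetical mini-lexicon: (own category, category selected), where the
# second member is None for a plain lexical item with no subcategorization.
LEXICON = {
    "organ": ("N", None),
    "ize": ("V", "N"),
    "ation": ("N", "V"),
}


def leaf(morpheme):
    """Build a one-morpheme tree labelled with the morpheme's category."""
    category, _ = LEXICON[morpheme]
    return (category, morpheme)


def attach(base, suffix):
    """Combine a base tree with a suffix; the suffix is the head."""
    category, selects = LEXICON[suffix]
    if selects is None:
        raise ValueError(f"{suffix!r} has no subcategorization feature")
    if base[0] != selects:
        raise ValueError(f"{suffix!r} selects {selects}, not {base[0]}")
    # The derived word takes the category of its head, the suffix.
    return (category, base, leaf(suffix))


def bracket(tree):
    """Render a tree as a labelled bracketing."""
    category, *rest = tree
    if len(rest) == 1 and isinstance(rest[0], str):
        return f"[{category} {rest[0]}]"
    return f"[{category} " + " ".join(bracket(t) for t in rest) + "]"


organize = attach(leaf("organ"), "ize")
organization = attach(organize, "ation")
```

Here `bracket(organization)` gives the labelled bracketing `[N [V [N organ] [V ize]] [N ation]]`. Attaching `ation` directly to `organ` is rejected, because the affix is listed as selecting V.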

Taking up Lieber's suggestion, the difference between a derivational affix which is, say, an N and a lexical N is simply the presence or absence of a subcategorization feature. Thus, (a)tion is +N, +V, while organ is simply +N. Further, it is rather obvious, as has been observed by H. Hoji, that subcategorizations of particular items do not vary significantly across languages with differences in word orders: so the subcategorization mechanism should not refer to linear order. Lieber points out some exceptions to (2) among English affixes: thus, the verb-forming be- as in befriend, besiege is a prefix. She acknowledges their atypical status. We can assume the verb-forming prefix be, parallel to ize, is listed as taking a +N complement with the added exceptional stipulation that be is a prefix: (6)

[V ize], +N
[V be], +N, "is a prefix".

Now let us assume that linguistic theory requires at a given level of structure that less general statements, being marked, supersede more general statements. Here, the algorithm that determines (6) to be less general than (2) is the obvious one that (6) contains a constant (be) while (2) does not.6

6. Actually, this algorithm is what Sommerstein has proposed for phonology, where he also incorporates the claim that less general statements precede more general ones: Proper Inclusion Precedence: If every (logically possible) form meeting the structural description of rule A (here rule 6, JE) also meets the structural description of rule B (here rule 2, JE), and the converse is not the case, then rule A has precedence over rule B (Sommerstein, 1977, 186). Added support for Lieber's contention that affixes like be- and -able have the categories of heads comes from the fact that a synchronic grammar can reflect the naturalness of certain grammatical formatives representing both lexical categories and derivational affixes by the use of parentheses in subcategorization: be, +V, +(N); able, +A, +(V).
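The way the exceptional statement (6) supersedes the general statement (2) can be sketched in Python. This is an illustration only, under an encoding I am assuming: each statement carries a set of conditions, a statement applies to a form that meets all of its conditions, and a statement whose conditions properly include another's is the less general one and takes precedence. The names `Statement`, `select`, and `derive`, and the attribute labels, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    name: str
    conditions: frozenset  # (attribute, value) pairs a form must carry
    affix_position: str    # "prefix" or "suffix"


def applies(statement, form):
    return statement.conditions <= form


def select(statements, form):
    """Pick an applicable statement that no other applicable one properly includes."""
    applicable = [s for s in statements if applies(s, form)]
    for s in applicable:
        # s loses if some other applicable statement has strictly more conditions.
        if not any(s.conditions < other.conditions for other in applicable):
            return s
    raise LookupError("no applicable statement")


# General case: the affixal head follows the non-head, i.e. it is a suffix.
HEAD_PLACEMENT = Statement(
    "(2)", frozenset({("role", "affixal head")}), "suffix"
)
# Marked case: the statement mentions the constant "be".
BE_PREFIX = Statement(
    "(6)", frozenset({("role", "affixal head"), ("morpheme", "be")}), "prefix"
)


def derive(stem, affix):
    form = frozenset({("role", "affixal head"), ("morpheme", affix)})
    winner = select([HEAD_PLACEMENT, BE_PREFIX], form)
    return affix + stem if winner.affix_position == "prefix" else stem + affix
```

With these definitions `derive("organ", "ize")` yields `organize` through the general statement, while `derive("friend", "be")` yields `befriend` because the statement containing the constant wins.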


Even though the Head Placement Principle (2) holds for a variety of both verb-second and verb-final languages, it is not clear at this point whether a language-particular statement is involved, or whether Head Placement is in fact a universal, subject only to language-particular exceptions such as (6). If Head Placement should turn out to be language-particular, then its eventual formal statement must, according to the Adjacency Hypothesis discussed in the Introduction, be stated without an internal variable. This might be done in a number of ways, depending on how languages which are marked with respect to head placement fail to conform to (5); presumably, the language-particular statements would describe departures from (5).

It is more likely that Head Placement, given its applicability to widely differing languages, is a consequence of universal grammar. But it could nonetheless follow from very different theories of grammar: (i) In one theory, the head follows everything in deep structures in the unmarked case, and verb-second and verb-initial languages share a language-particular stipulation that places the head before (only) maximal phrases and clausal complements. (ii) In another theory, Head Placement (5) holds universally, but it is not formally related to the relative order of the head and its maximal phrase complements. (iii) In a third theory, Head Placement is the consequent in an implicational universal that applies to languages with a fixed word order base. Given these divergent possibilities, it seems sufficient to leave Head Placement as it stands, with the purpose of factoring its content out of the statements that determine the hierarchical constituency relations in deep structures.

1.2. A Preliminary Account of Specifiers

Many, and probably most, of the grammatical categories which are not prepositions (that is, not of the form X') are assigned in deep structure to categories called the "specifiers" of X, notated here SP(X). Languages like English and French accord with Head Placement (2), in that the specifiers of various heads precede the head. Previously (e.g., in Chomsky, 1970) this was expressed in the following phrase structure rule:

(7) X^max → SP(X) - X^(max-1)

The most clear-cut representatives of SP(X) in English are as follows, ignoring for the moment the many contextual restrictions on the various specifier morphemes. (8)

SP(N) = DETERMINER = this, that, these, those, the, a(n), each, every, all, both, half, some, any, no, which, what.
SP(A) = INTENSIFIER = very, so, as, more, most, less, least, too, enough, how, somewhat, rather, quite, real, this, that.


SP(V) = AUXILIARY = will, would, can, could, may, might, shall, should, must, ought, need, dare.
SP(P) = right, clear, straight.

I return in section 1.4 to why SP(P) has so few members. The grammatical behavior of the members of SP(X) is dealt with in some detail in Ch. 4 and 5 below. Here, it suffices to make two general points about the category SP(X).

The first point about the category symbol SP(X), as expressed, for example, in (7), is that a fundamental and uniform syntactic relation between SP(X) and X across values of X (= N, A, V, P) is implied. But moderate reflection on the nature of SP(V) (= tense and modal categories), SP(N) (= demonstratives, quantifiers, numerals), and SP(A) (= expressions of degree and intensity) strongly suggests that with respect to their semantic interpretation, any parallels among these categories are at best secondary to the fundamentally unique roles they play in logical semantics, so much so that Jackendoff (1977, 37) denies that the various SP(X) are formally related at all. However, to deny syntactic status to SP(X) would miss the striking fact that each of the central lexical categories L is tightly associated with a particular closed grammatical category SP(L) both in the base and in the operation of the local transformations (cf. Ch. 5 below). Moreover, minor parallels have been discovered among the SP(X), especially between SP(N) and SP(A) (e.g., the this-that contrast and the possibility of a WH member: which, what, how). Thus, there are reasons for treating all SP(X) as parallel, and for having a single symbol available to refer to them.

In contrast, the uniqueness among the SP(X) is in interpretation. The rules for logical interpretations of tense morphemes based on surface configurations given in Emonds (1975) have no counterparts in the other SP(X) systems; similarly, rules of interpretation for other SP(X) proposed by other authors (Jackendoff, 1977, Ch. 5-6; Milner, 1978, Ch. 7-8) usually do not generalize across values of X. The category SP(X) therefore conforms to the following principle, in large part suggested by Stowell (1981), and which is a working hypothesis throughout this chapter. (9)

Category-neutral Syntax: Syntactic principles of the base component generalize across values of X; rules of semantic interpretation are often based on category-specific values of X.

I take this principle to be the source of the autonomy of syntax and semantics. In my view, this pervasive discrepancy between parallel syntax and asymmetric semantics has been either misread or at best insufficiently articulated by previous authors. Jackendoff explicitly claims that fundamental semantic relations and deep syntactic structures are parallel (1977, 37). Stowell (1981, Ch. 2) raises the possibility that apparent asymmetries in the base are due to rules of logical form, but, as will become clear, I do not believe he accurately locates these asymmetries.


At this point, I will not develop further the membership system, the syntactic behavior, or the semantic interpretation of SP(X), but I do consider that the general properties of these categories give credence to the above claim that rules of semantic interpretation are often category-specific, while syntactic principles of the base, including those which involve SP(X), tend to generalize across values of X.

A second point about the category SP(X) is that we must specify its existence in universal grammar by factoring out the linear order expressed in (7), now captured by Head Placement (2). The residue of (7), which is almost certainly a universal, is expressed in (10a): (10)

(a) SP(X) can only be a daughter of X^max and a sister of X^(max-1).

This formulation allows us to accommodate the possibility that some languages may be "flatter" than others; that is, the value of "max" might be less for some languages than for others, and/or some languages might require that the head of X^j always be X^(j-1). I will not be concerned with these possibilities here. Two types of trees generable by (10a) are exemplified in (10b-c): (10)

(b) [V^max ... V^(max-1) ... SP(V) ...]

(c) [N^max ... SP(N) ... N^(max-1) ...]

Of course, (10b) is ruled out as a deep structure in a language which conforms completely to Head Placement of Non-phrasal Modifiers (2). (10c) is exemplified by English. Formally, I will write the operation in (10) as (11):

(11) X^max → SP(X), X^(max-1)

C → A, B should be read as "C may dominate the immediate constituents A and B." It is not implied that A and B are the only daughters of C in a well-formed tree conforming to such a rule. Moreover, C can be left unexpanded. I call such rules "base composition rules." In my conception, this kind of statement is typically part of universal grammar. Statements of this sort express the Categorial Uniformity and the Hierarchical Universality discussed in the Introduction.

I do not totally exclude the possibility that a base composition rule may be language-particular. However, I tentatively propose that language-particular base composition rules are limited to expansions of non-phrasal nodes - for example, differing possibilities for compound L or for expansions of COMP in different languages. An English-particular base composition rule is also provided by (12):

(12) SP(X) → NP, X ≠ V; English-specific
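The reading of base composition rules given above ("C may dominate the immediate constituents A and B") can be made concrete with a rough sketch. It is offered only as an illustration under stated assumptions: the rule table instantiates (11) and (12) for X = N, the category labels are plain strings, and the function names are hypothetical. Linear order is deliberately ignored, since it is factored out of these rules.

```python
from typing import Iterable, List, Set, Tuple

# Each entry reads "the left-hand category MAY dominate these immediate constituents".
# The table is an assumed instantiation of (11) and (12) for X = N.
BASE_COMPOSITION_RULES: List[Tuple[str, Tuple[str, ...]]] = [
    ("N^max", ("SP(N)", "N^(max-1)")),  # instance of (11)
    ("SP(N)", ("NP",)),                 # instance of the English-specific (12)
]


def licensed_daughters(parent: str) -> Set[str]:
    """Collect every category that some rule allows `parent` to dominate."""
    allowed: Set[str] = set()
    for left, right in BASE_COMPOSITION_RULES:
        if left == parent:
            allowed.update(right)
    return allowed


def is_licensed(parent: str, daughters: Iterable[str]) -> bool:
    """Check that each daughter is licensed by some rule for `parent`.

    An unexpanded node (no daughters) is always licensed, and several rules
    with the same left-hand side may jointly license the daughters.
    """
    return set(daughters) <= licensed_daughters(parent)


print(is_licensed("N^max", ["SP(N)", "N^(max-1)"]))
print(is_licensed("N^max", []))
print(is_licensed("N^max", ["SP(V)"]))
```

Because daughters are compared as a set, the same check accepts either order of specifier and head; ordering is left to a separate statement such as Head Placement.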

The English possessive NP is one type of phrase which is generated by (12). Such phrases alternate with a range of determiners; in particular, a possessive NP has a distribution almost identical to that of that/those and this/these. Chomsky (1970) provides evidence that some possessive NP's are base-generated. Rule (12) should not be considered to play a role only in the possessive construction. Pre-head measure phrases, which almost certainly fit into the specifier system of both A and P, also can be generated by (12). Like possessives, they alternate with individual specifier morphemes. (13)

(a) The house is very high.
The house is ten feet high.
*The house is very ten feet high.
*The house is ten feet very high.
John is standing right behind the house.
John is standing a short distance behind the house.
*John is standing right a short distance behind the house.
*John is standing a short distance right behind the house.

Measure phrases modifying nouns, as in John's two mile driveway, are not generated by (12). Like adjectives, these phrases follow the SP(N), and they also fail to exhibit the plural morpheme. French, in which the distribution of phrases in deep structure is otherwise almost identical to that in English, exhibits neither lexical possessive phrases nor lexical pre-head measure phrases inside AP and PP. Thus, it seems to be a correct generalization that the expansion of SP(X) as an NP is English-specific. (13)

(b) *La maison est dix mètres haute.
"The house is ten meters high."
*Jean se trouve une petite distance derrière la maison.
"John is standing a short distance behind the house."

Finally, specifiers, like other non-head constituents, are typically optional, even though classes of heads, such as count nouns, may require a specifier. Exceptionally, the SP(V) appears to be an obligatory constituent of S; I return to this point in Ch. 3. In summary, while base composition rules for expanding a morpheme category may be language-specific, I claim that the possible expansions for phrases are determined by universal rules whose proper form factors out linear ordering conditions and is typified by rule (11). These syntactic rules are universal and category-neutral.

1.3. Subject Phrases

As mentioned above, my formalization of base composition rules does not imply that the category on the left of such a rule may dominate only categories specified on the right of that same rule. The same category may


appear on the left in different base composition rules; each base composition rule is a maximally simple expression of a single generalization. Some generalizations about the possible dominance relations in the base overlap in interesting ways, even though their interplay cannot and should not be expressed in a single formula. For example, the rule for specifiers (11), which applies across all values of X, co-exists, in my view, with a rule generating certain subject phrases Y^max which holds at least for the value X = V. (14)

X^max → Y^max, X^(max-1)

For X = V, (11) and (14) together yield the structure [V^max Y^max - SP(V) - V^(max-1)]. For reasons to be given below, this reduces to the familiar expansion of S, [S NP - AUX - VP]. The base composition rules being proposed thus contrast with phrase structure rules or a bar notation schema in that a single composition rule does not necessarily specify all the daughters of a single parent node. However, no base composition rule can be used more than once to expand a single symbol; for example, multiple subjects and specifiers for a single X^j are not allowed.

Regarding (14), there are two questions: What are the possible values of X (that is, which types of phrases may contain subject phrases)? And, what are the possible values of Y (that is, which types of phrases may serve as subjects)? Stowell (1981, Ch. 4) takes the position that both X and Y may vary over all head-of-phrase categories. While I agree with Stowell's research heuristic of a category-neutral base component, I think he is mistaken on both counts with regard to the issue of "subjects across categories."

First, let us consider the question of whether all phrasal types may serve as subjects. I contend that there is a universal consonance between subjects and the category NP, and that Stowell has failed to undermine the arguments for it. His account of this correlation is that categories which receive abstract case (e.g., NP) appear in case-marked subject positions, while categories which do not are excluded. But AP's typically exhibit morphological case (in Indo-European case-marking languages) and presumably take abstract case, yet they cannot regularly be subjects. Conversely, PP's and S's, which do not receive abstract case, should freely appear as subjects of infinitives, yet they do not. That is, whatever devices might be called upon to explain these facts, there are no patterns which suggest that the theory of case alone can explain why non-NP's consistently fail to occur as subjects.

Moreover, it is precisely when the correlation between NP's and subject position appears weakened (by, for example, verbs which seemingly have S subjects) that more thorough investigation has revealed that NP (and N) are inexorably present as deep structure subjects. While Stowell accepts the descriptive generalization of Emonds (1976, Ch. 4) to the effect that S


and PP are not in fact in subject position in surface structure (they must be topicalized or extraposed), he does not address the explanation I gave for why any S and PP generated as deep structure subjects (and NP objects) move. In brief, my explanation was and is that these are NP positions, and that the empty head N required by the bar notation must be either removed or co-indexed during the transformational derivation in order to yield a well-formed surface structure. Hence, it follows from the bar notation and the trace theory of movement rules that the S generated as the sole lexical phrases in base NP positions (necessarily with an empty N sister) must move and leave a co-indexed element in these empty surface NP positions.⁷ Therefore, the explanation for why S's do not appear in surface NP positions depends directly on a restriction like (15): (15)

The Subject Principle: Phrasal arguments of X external to X (i.e., subject phrases) must be NP's.

The Subject Principle should not be coalesced with (14) because it has wider applicability. For example, (15) ensures that a possessive subject phrase generated as a daughter of SP(N) by (12) will be an NP; that is, in light of (15), the symbol NP in (12), intended to encompass both subject phrases and measure phrases, can be replaced by the category which includes both NP's and AP's, since AP's also serve as measure phrases (three dollars cheaper, very much cheaper; three dozen books, too few books, etc.). In succeeding sections of this chapter and in Ch. 2, I will also establish that subjects can occur outside the maximal projection of their predicate, and in these positions also they are consistently NP's, even though they are not generated by (14).

7. I return in sections 1.6, 7.7.1, and 7.7.2 to details of how this co-indexing is achieved and to how the empty N's are removed when S is topicalized or extraposed. At the time I originally made my proposal, trace theory was embryonic, and my own formulations incomplete and in certain ways ad hoc. In more detail, Stowell's proposal is that S's and PP's are generated directly as subjects (with no empty N), and that they are forced to vacate this position because they are incompatible with nominative case. But there are two interpretations of the "case-resistance" of categories like S and PP; one is that they don't receive case and that the proximity of a case-assigning category has no effect on them, and the other is that the proximity of a case-assigning category actually forces them to move. The only other instance of such "forced movement" adduced by Stowell concerns his claim that S (but not PP) inside X can be interpreted only by virtue of being extraposed. But in just those instances where my original proposal contrasts with his (where I claimed there was no S extraposition involved: the S complements to verb classes like seem, murmur, and persuade, and to nouns and adjectives), Stowell fails, in my view, to show that S moves. In some cases, he utilizes ad hoc devices; in others, the verb classes are not dealt with; and in others, incomplete and unrepresentative data suggests patterns that are just not there (here I refer to his treatment of S complements to nouns). Thus, I don't believe that Stowell has established that S or PP exhibit any "forced movement" other than not being compatible with subject position or subcategorized object NP position.


For these reasons, the Y in (14) should remain category-neutral, in accord with Stowell's (9). The Subject Principle (15) always restricts this Y to the value N. This restriction can be thought of as part of category-specific semantics, in the sense that (15) concerns the way predicates are related to arguments; the external argument of X must be an NP. Actually, the Subject Principle as stated above may well be a "syntactified" version of an even more obviously logical or semantic requirement, and might easily be correlated with general observations brought forward by Keenan and other writers that subjects often are required to be "referential" in ways that other NP's are not (cf. Keenan, 1976). We might rephrase (15), for example, as "Specifiers of external arguments must be able to express quantification or co-reference." This directly suggests its semantic nature.

Let us turn to whether all maximal projections can contain subjects; should X in (14) be restricted? While I agree that members of all head categories X may impose selection restrictions on and be in a grammatical relation with a subject phrase external to X, I will argue in the next chapter that the structural definition of subject phrase ("= argument external to X") is satisfied by a variety of configurations, and not only by daughters of a maximal projection as required by Stowell, and also Jackendoff (1977, section 3.4). In this I agree with Travis and Williams (1983) and Williams (1983). The contribution of (14) to generating subject phrases is that of satisfying the requirement that clauses, in contrast to other X^max, must contain expressed or understood syntactic subjects of X. In English, X takes on only the value V in (14); that is, it is not the case that the subject of X is always inside X^max (cf. Ch. 2). In Chs. 2 and 3, I will argue that X is limited to V in (14) not by direct stipulation, but from the fact that universally, at least in the unmarked situation, only V has three rather than two bar notation projections. This is expressed by reformulating (14) as (16), and imposing (17), a category-specific statement which describes the unmarked case. (16)

X^3 → Y^max, X^2

(17)

Only V can have 3 rather than 2 projections.

Taking into account the Subject Principle and the fact that subjects are typically clause-initial, (16) reduces in phrase structure rule format to "X^3 → NP - X^2." This is the putative universal rule proposed in Williams (1984) for languages with verbless deep structure clauses, if we equate X^3 here with Williams' S. (Alternatively, X^3 could be replaced by V^3 in (14), and verbless sentences would be "exocentric"; the category of the projection and of the lexical head would not agree.) If Williams has the right analysis for such sentences, (17) is an unmarked category-specific option for the category-neutral (16). If there are no verbless deep structure S, so


that (17) is universal, it can still be thought of as category-specific semantics; only verbs can have arguments external to X^2.⁸ Category-specific generalizations such as (15) for nouns and (17) for verbs delineate the fundamental distinguishing properties of the lexical categories, and can account for much of their asymmetric behavior in individual paradigms. Such generalizations come closer to explaining these pervasive asymmetries, which we will be examining in detail in the rest of this chapter, than do formal cross-classifications of N, V, A, and P in terms of two binary features. The generalizations of (15) and (17) are essentially correlations between the type of meanings expressed by SP(X) (reference for N and modality for V) and the deep structures or logical forms the corresponding X^max can appear in. The differences among N, V, A, and P seen in this way cannot be referred to freely either in the transformational component or in other statements of deep structure and logical form. This restrictiveness is a welcome development, even though it might appear at present to be too strong a claim to say that (15) and (17) are the only statements in grammar that distinguish among N, V, and A.

The base composition rules (11) and (16), the Subject Principle (15), and the Head Placement Principle (2) allow for three types of deep structure clauses: NP - AUX - V^2, AUX - NP - V^2, and AUX - V^2 - NP, where AUX = SP(V). In order to complete the specification of the English deep structure S as the first of these, we must ensure that the deep structure subject is initial in S. Since this is most likely the unmarked case in universal grammar, there is no need for a further stipulation in the grammar of English. How exceptions to this word order constraint are stated for other languages is important, particularly because of the Adjacency Hypothesis of the Introduction. However, serious consideration of language-particular rules outside of English and typologically similar languages is beyond the empirical scope of this study.

Since subjects of clauses are arguably obligatory (Chomsky, 1981, Ch. 2), there arises the question of when a category that appears on the right of a base composition rule is obligatory. Beyond the fact that each X^j must have an obligatory head X^k in deep structure, we know that the principles of the base should allow complement phrases to be optional, with obligatory occurrence being stipulated by subcategorizations of individual lexical items. How far should we extend the notion that nonheads are syntactically optional? In Ch. 3, I return to the question of what renders subjects obligatory.

8. In Emonds (1980b), three arguments are given, apparently contrary to the idea that the subject of a V is inside V^max, that the subject NP is external to VP. But these three arguments are easily made compatible with the present treatment by replacing X^max in the principles I formulate there with X^k. That is, the arguments in Emonds (1980b) can be construed either as arguments for Hornstein's (1977) position that S ≠ V^max or for the position that max for V is greater than max for the other head categories. This latter position will be developed in Chs. 2 and 3. In either case, these arguments are evidence against Jackendoff's "uniform 3-level hypothesis" in which S = V^max.
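The count of three deep structure clause types mentioned above can be checked by brute enumeration. The sketch below is only an illustration: it assumes that the three daughters of V^3 are NP, AUX (= SP(V)), and V^2, and it models the contribution of Head Placement (2) as the single requirement that AUX precede V^2. The function name is hypothetical.

```python
from itertools import permutations
from typing import List, Tuple


def deep_structure_clause_orders() -> List[Tuple[str, ...]]:
    """Enumerate orders of the daughters of V^3 in which SP(V) precedes its head V^2."""
    daughters = ("NP", "AUX", "V^2")
    return [
        order
        for order in permutations(daughters)
        # Assumed effect of Head Placement (2): the specifier precedes the head.
        if order.index("AUX") < order.index("V^2")
    ]


for order in deep_structure_clause_orders():
    print(" - ".join(order))
```

Of the six logically possible orders, only the three in which AUX precedes V^2 survive; a further statement placing the subject first would then single out the English-type order.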


With preliminary but I think plausible principles and base composition rules for heads, specifiers, and subjects in mind, we can now turn to the central problem addressed in this chapter, the asymmetries across syntactic categories found in the complement structures to N, A, V, and P.

1.4. Two Base Composition Rules

In this section, I will argue that the expansions of X^k, k = 1 or 2, provide solid evidence for the principle of Category-neutral Syntax (9). The Subject Principle (15) will play an important role throughout. Careful consideration of the broadest generalizations holding of these expansions will provide more support as well for factoring out universal statements of dominance relations from language-particular ordering statements.⁹ Concerning left-right order, the fixing of a general word-order parameter has the following effect on head-initial languages such as English: (18)

Head Placement for Phrasal Complements: A phrase cannot be a left-sister to the head of a deep structure X^k, k ≤ 2.¹⁰

The verb-second and the verb-first languages of Greenberg (1963) are those which are subject to (18). The status of verb-first languages is treated in some detail in Emonds (1980b), and will be returned to here in Ch. 3. My claim that subject phrases in sentences are outside V^2, as expressed in (16) and (17) above, allows subjects of sentences to escape the effect of (18), which does not apply to a third projection of X. Sentence-initial adverbials also escape (18), and the free recursion they exhibit (Unfortunately for the average person who has a moderate income, ...) confirms that V should have a third bar notation projection.

In all other cases where phrases apparently precede heads in English, an argument can be made that either (i) a deep structure configuration is not involved, (ii) the apparent phrase actually is an X° in deep structure, or (iii) the phrase is a daughter of SP(X), and not itself a left-sister to the head. An English pre-head phrase which exemplifies sometimes (i) and sometimes (iii) is the possessive NP. As argued in section 1.2, base-generated possessive NP are daughters of SP(N). Such pre-head NP's are not found in general across head-initial languages, and they do not violate (18), if they are generated by a rule like (12).¹¹

9. Some of the empirical considerations leading to my conclusions are also taken into account in Stowell (1981). However, my conclusions are different in many ways, as will be noted at appropriate points.

10. Head Placement rules out a left-branching verbal complex in French, of the type proposed in Emonds (1978), unless my V' there is re-interpreted as V°. The principle of Head Placement allows the "complements" of traditional grammar to be defined as phrases internal to X, and "modifiers" to be defined as either "external to X or a non-phrase." These characterizations are independent of left-right order, and thus seem superior to the definitions in Jackendoff (1977).

11. Even if my claim that English possessive and measure NP's are daughters of SP(X)


Case (ii) holds when certain pre-head adjectival structures are deep structure sisters to X°, as in (19). (19)

The main reason he stood directly behind the door is a virtual mystery. Those who were really thin scarcely have eaten.

It can be argued that the base category involved in these constructions is A°, not a projection of A; cf. Ch. 5. Keeping in mind provisions (i)-(iii) then, (18) stands as an ordering generalization about English deep structures, as well as being more obviously the case for a wide range of verb-initial and verb-second languages.

1.4.1. Complement Phrases outside X̄

I will first discuss a base composition rule for complements outside of X. In Emonds (1976, Ch. 5), I justify phrase structure rules such as N^max → N^max - P^max and V^2 → V^2 - P^max (translating the rules there into bar notation terms). But these phrase structure rules miss generalizations in two ways: first, the left-right order is clearly due to the general fact that heads of phrases in English precede phrasal complements (18); second, a defining distributional characteristic of P^max is that it can appear freely as the daughter of essentially any phrase, and not just in a few stipulated positions, as such individual phrase structure rules imply.

In view of this, Williams (1978) suggests that the proper generalization governing the distribution of PP is a rule that applies across bar levels, having a form something like X^j → X^(j-1) ... (PP)(S). I agree with Williams, but would further observe that S should be identified with PP (the principal hypothesis of Ch. 7). This step eliminates the problem of generating sequences like ... PP-S-PP-S under an X^2, as pointed out to me by H. van Riemsdijk. Since (18) orders any PP after a head within any X^k (k ≤ 2), [...]

(20) X^j → X^k, P^max

Footnote 11, continued: rather than sisters of N cannot be maintained, it is still the case that the possibility of lexical NP as variants of SP(X) is English-particular. The principle appealed to in section 1.1 ("less general left-right ordering statements supersede more general ones") can be applied to "lexical NP's can precede X in English." This latter statement is less general than Head Placement (18), and so supersedes it. We might allow non-lexical subjects (i.e., the PRO of Chomsky, 1981) inside NP's in languages like French by restricting (18) to "No lexical phrase" rather than to "No phrase." This is then consistent with Milner's treatment of subjects in French NP's (Milner, 1982, Ch. II).


In the next subsection, a distinct composition rule allows X°, but not X^1 or X^2, to have other phrasal sisters in addition to the PP provided by (20). But between the lowest X^1 and the highest X^2 in a given X^max there can only be one phrasal sister per head (e.g., languages observe binary branching), since the only base composition rule for generating such "outer complements" is (20), or generalizations of (20) that provide for PP and other categories as well.

In line with considerations presented in Emonds (1976, Ch. 5) and those in Hornstein and Lightfoot (1981), it is important to allow k to equal j - 1 or j in (20). This permits a number of anaphoric processes involving expressions such as do so and one(s) to consist of co-indexing of full constituents. Cf. also Jackendoff (1977, Ch. 4). The anaphoric processes described this way also support my claim that V has a third projection in the bar notation, since they involve VP (= V^(max-1) in the present system) and N (= N^(max-1)), and hence can be uniformly stated as co-indexings of empty X^(max-1). In terms of the category-neutral feature system of Muysken (1983), the anaphors are characterized simply by the features +PROJECTION, -MAXIMAL. For example, all the bracketed phrases in John [[[spoke to Mary] at the party] briefly] are of the same category V^(max-1), and as a result, the anaphoric V^(max-1) do so can replace any of them; all of the following are acceptable continuations of the preceding example: and Sue did so ((after lunch) in detail).

Base principle (20) is the only case where the bar notation that I employ in this work departs from the restriction that every X^j has a head whose superscript is j - 1. This principle, including the condition on the superscript, is another syntactic, category-neutral, and universal principle of the base. For clarity, I have at this point left the PP as a stipulated category in (20), but this will be generalized in the next chapter. For the moment, I am not concerned with the category-specific rules of semantic interpretation that take complements outside X produced by (20) as input; this is in part the task of chapters 2 and 7, but for the most part is not addressed in this work.

It should be noted that the PP's generated outside X by (20) must be classified by universal grammar as non-arguments, or else a conflict with the Subject Principle (15) would arise. If for some reason it turns out that complements outside X should be called arguments, then (15) should be recast as "phrasal arguments external to X not generated by (20) must be NP's."¹²

12. One possible generalization of (20) is to make it category-neutral, replacing P^max by Y^2, and then making the revision of (15) just mentioned a bi-conditional: "Phrases external to X are NP's if and only if they are not generated by (20)." This would allow AP's and VP's to appear outside X in the same contexts as PP's do (probably a correct decision, in light of constructions to be discussed in Chapters 2 and 7), and yet would exclude NP's. I think this is also correct, but it would require a discussion of measure NP's and also of appositive NP's that would lead us too far afield here.
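The do so paradigm discussed above lends itself to a short illustration. The sketch below assumes a right-adjoined structure in which each outer complement added by (20) creates a new V^(max-1), so that do so may stand for the innermost constituent or for any larger one. The function name and the string representation are hypothetical conveniences, not part of the analysis.

```python
from typing import List


def do_so_continuations(outer_sisters: List[str]) -> List[str]:
    """List the strings obtained when 'did so' replaces each nested V^(max-1).

    `outer_sisters` are the phrases adjoined outside the innermost verb phrase,
    in left-to-right order. Replacing a larger constituent swallows more of them.
    """
    return [
        " ".join(["did so", *outer_sisters[i:]])
        for i in range(len(outer_sisters) + 1)
    ]


# John [[[spoke to Mary] at the party] briefly]
for continuation in do_so_continuations(["at the party", "briefly"]):
    print("and Sue " + continuation)
```

With two outer sisters there are three nested constituents and hence three continuations, matching the three bracketings in the example sentence.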


In a similar vein, the measure phrases generated as daughters of SP(A) and SP(P) by (12) must also be classified as non-arguments, or else measure phrases would wrongly qualify as subjects of A and P. Equivalently, AP and PP cannot contain the external argument of their head. The justification for considering neither PP's outside X nor measure phrases to be arguments of X is that X does not appear to assign a semantic role (a "θ-role") to these kinds of phrases.

1.4.2. Complement Phrases inside X̄

I now turn to the principles that generate complement phrases inside X. Since these principles must include those that allow for direct and indirect objects, prepositional objects, predicate attributes, a variety of complement clauses, and must furthermore account for the discrepancies across values of the head whenever these complements appear, this topic is obviously central to any discussion of the categorial component of a grammar. It is in the system of principles that interpret the internal arguments of X that I claim that fundamental asymmetries occur across values of X, in accord with the principle of Category-neutral Syntax (9). In accord with usage established in Jackendoff (1972) and Chomsky (1981), I refer to these interpretations of internal arguments as "thematic relations" or "θ-relations." That is, the basis for the interpretation of a given Y² or Y³ which is a sister to some head X⁰ is that X and Yᵏ usually (not always, as will be seen below) stand in exactly one of a small number of θ-relations called (head, agent), (head, theme), (head, source), (head, goal), etc. For discussion of typical cases of these relations, see Jackendoff (1972, Ch. 2). In all such cases, we say that X assigns a θ-role to Yᵏ, and that Yᵏ is the theme (or agent, etc., as the case may be) of X⁰. The requirement that each complement Yᵏ be assigned exactly one θ-role, and that each expressed θ-role be assigned to exactly one Yᵏ, is called the "θ-criterion" (Chomsky, 1981, 36). It is adopted here, subject to an important clarification in Ch. 2.

The asymmetries in the way θ-roles are assigned are, as I see it, due to the role played by the unique grammatical head-of-phrase category P. Briefly, informally, and partially inaccurately, V, unlike N and A, is fundamentally a "relational" lexical category. As a result, V can, in terms I will develop presently, "assign θ-roles directly." (A paradigm case concerns direct objects, hence the term "direct.")
But all of N, A, V can also assign θ-roles "indirectly" (again, a paradigm case being the indirect object) by means of an intermediate relational category P. P in turn, like V, can assign θ-roles directly, but not indirectly. Hence, there are two ways that θ-roles are assigned: directly, by V and P, and indirectly, by N, A, V. I will argue that these are the distinctions among the head-of-phrase categories from which many others follow and in terms of which all others can be stated. This system fails to distinguish between N and A, of course, but N and A are distinguished by the Subject Principle (15), so no fundamental opposition between lexical categories is left unexpressed.

In order to develop and explain the system of asymmetric θ-role assignment, I find it useful to retrace an idealized and I suppose imaginary evolutionary development from a more primitive universal syntactic system to the universal grammar UG I aim to describe here. In this more primitive system LG, the category P does not exist. Rather, the only head-of-phrase categories that exist are lexical: N, A, and V. Since I am describing the projected immediate ancestor to UG, LG has all the features of UG which are independent of the existence of P; for example, LG contains the grammatical categories SP(L), the maximal projections Lᵐᵃˣ, and the distinction between S and VP. However, in LG, only V is relational, in that only V can take complement phrases. Moreover, when this complement phrase is an (object) NP, it is distinguished from subject NP's (the external argument) by an "abstract case-mark", which is nothing else than the projection of the feature V onto its sister NP. Thus, there is a principle of θ-role assignment (21):

(21) θ-role assignment in LG: V assigns a θ-role to one Lᵏ (L = N, A, V); further, if L = N, then V is projected onto NP as a case-mark to distinguish internal and external arguments, in semantic interpretation and possibly phonologically.

In traditional terms, what I consider to be the feature V on an NP is called "accusative case", but I see no reason to introduce this superfluous category name except for expository reasons. A little reflection shows that LG is woefully inexpressive compared to UG: (i) N and A can take no complement phrases; (ii) there are no indirect objects; (iii) even if directional phrases are expressed by, say, a VP, there are no transitive verbs which take both an NP theme and a directional phrase; (iv) assuming that S is to be assimilated to PP, at least in some way (Ch. 7 argues they are the same), there are no indirect questions or other clausal complements with marked complementizers; (v) there are no clausal subjects (in LG, this follows from the Subject Principle (15), although UG escapes the limitation, as explained in section 1.3). This partial list shows that LG badly needs enrichment, if it is to be comparable to UG. There is no reason to assume that the mode of enrichment discovered, invented, or arrived at was "determined" by the previous state LG, but it can be teleologically described as the following "plan" of LG speakers: "We do not need more lexical items; we need rather to express more relations among the various existing X⁰ and Yᵏ. Thus, we do not need a lexical category, but we do need a new grammatical category P which is like V in that it is relational. It is the projections of this new relational category which will provide the richness of complement interpretations to N, A, and even V, lacking in LG."¹³

The three sentences in the above "plan" can be formalized by factoring the essentially non-autonomous formulation (21) into a base composition rule, a θ-role assignment principle, and a case-marking principle. All three of these are central features of universal grammar. The first sentence of the "plan" consists in allowing a wider range of internal arguments to heads. This is achieved by the following base composition rule. In accordance with (9), the autonomy of syntax and semantics in UG, this rule is category-neutral:

(22) X¹ → X⁰, Yᵏ, Zʲ, k and j ≥ 2.¹⁴

The second sentence of the "plan" assimilates the new relational category P to V in the way that it assigns θ-roles. Thus, (21) is replaced by a principle that includes both V and P:

(23) Direct θ-role Assignment: If a V or P is a sister to a phrase Yᵏ (k ≥ 2) and is subcategorized for Yᵏ, then Yᵏ may receive a θ-role. This is called "direct θ-role assignment." Each V or P may assign at most one θ-role to a sister.
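The conditions in (23) can be restated as a small well-formedness check. The Python sketch below is only an illustration under stated assumptions: the names `Phrase` and `licit_direct_assignment`, the encoding of a subcategorization frame as a set of category labels, and the use of bar level 2 for sample phrases are hypothetical conveniences, not part of the formalism in the text.

```python
from dataclasses import dataclass

# Categories that (23) allows to assign θ-roles directly.
DIRECT_ASSIGNERS = {"V", "P"}


@dataclass(frozen=True)
class Phrase:
    """A phrasal sister Y^k of the head (hypothetical encoding)."""
    category: str  # "N", "A", "V" or "P"
    bar: int       # projection level k


def licit_direct_assignment(head_cat, subcat, sisters, receivers):
    """Check a proposed set of direct θ-role receivers against (23).

    head_cat  -- category of the head ("N", "A", "V" or "P")
    subcat    -- set of categories the head is subcategorized for
    sisters   -- list of Phrase objects that are sisters of the head
    receivers -- the sisters claimed to get a θ-role directly from the head
    """
    if len(receivers) > 1:
        return False  # each V or P assigns at most one θ-role to a sister
    for phrase in receivers:
        if head_cat not in DIRECT_ASSIGNERS:
            return False  # only V and P assign θ-roles directly
        if phrase not in sisters or phrase.bar < 2:
            return False  # the receiver must be a phrasal sister, k >= 2
        if phrase.category not in subcat:
            return False  # the head must be subcategorized for it
    return True  # an empty receiver list is fine: (23) says "may"
```

On this encoding, a verb with an NP sister and an AP sister passes the check when either sister is the receiver and fails when both are, since (23) caps direct assignment at one θ-role per head.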

It is not implied by (23) that every Yᵏ that is assigned a θ-role directly receives that θ-role from V or P itself; I return to the consequences of this below. My terminology is unfortunately and perhaps misleadingly close to Chomsky's (1981, 38) use of the terms "direct θ-marking" (of internal complements) and "indirect θ-marking" (of subjects). As will be seen, my direct and indirect θ-role assignment does not correspond to his direct and indirect θ-marking. Since I do not make use of his distinction in my discussion, at least the readers of this paragraph will not be confused.

The simple case of a single complement phrase being assigned a θ-role by a sister V or P is straightforward. However, (22) and (23) also permit multiple sister complements to V and P in UG. Direct θ-role assignment alone automatically provides one way of interpreting such a construction, even though this is not stipulated in the "plan": one of the two sisters of V or P, say Zʲ in (22), is assigned a θ-role by V or P, and the other complement Yᵏ is an external argument to Zʲ rather than an internal argument of V or P.

(24) Mary [V considers] [Yᵏ Bill] [Zʲ my best friend]

13. The reader will forgive me for not conforming to the grammar of LG in citing the ideas of its foremost thinkers.

14. I have no stake in limiting the number of internal arguments to 2. Possibly 3 or even more are sometimes required, but I know of no absolutely convincing cases, such as a transitive verb with two idiomatic prepositional complement phrases. This observation is due to N. Chomsky (pers. comm.). "Category-neutral" in (22) means just that: all of X, Y, and Z can vary over N, A, V, and P.

When this situation arises, case-marking, originally a device in LG which distinguishes an object of V from a subject of the same predicate, is extended to cover the need in UG for distinguishing a subject of Zʲ from a subject of V. In this way, case-marking by V or P and assignment of a θ-role by V or P become autonomous; case-marking in UG is not always associated with an internal argument of a verb.

(25) Case-marking: V and P are projected as features onto adjacent sister NP's as case-marks (Stowell, 1981, Ch. 3; cf. section 1.8 here).
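As a rough illustration of how (25) operates independently of θ-role assignment, the following Python sketch marks an NP with the category of an adjacent V or P head. It rests on simplifying assumptions that are mine rather than the text's: the function name `case_marks` is hypothetical, sisters are given in linear order after a head-initial head, and "adjacent" is modeled as "first sister in that order."

```python
def case_marks(head_cat, sisters):
    """Return {phrase: case feature} for NP sisters adjacent to the head.

    head_cat -- category of the head ("N", "A", "V" or "P")
    sisters  -- list of (category, words) pairs in linear order after the head
    """
    marks = {}
    if head_cat in ("V", "P") and sisters:
        category, words = sisters[0]  # only the adjacent sister is considered
        if category == "NP":
            marks[words] = head_cat   # the case feature is the head's category
    return marks


# Modeled on (24): Mary considers [Bill] [my best friend]
example = case_marks("V", [("NP", "Bill"), ("NP", "my best friend")])
```

In the sample call, Bill receives the feature V although, on the analysis in the text, it is not an internal argument of considers; N and A heads mark nothing under this sketch.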

Again, I see no reason to call the case assigned by P by a special name such as "oblique", "ablative", or "dative"; it is just the feature P projected onto NP. Since P is a head-of-phrase category in UG, there exist automatically in the unstipulated case a Pᵐᵃˣ = P² and also an SP(P). But, because SP(P) arises as a consequence of a change elsewhere in the grammar, and is not developed specifically as a means of expressing something previously unexpressed, we might expect that SP(P) would not have a wide range of members or a unique logical function. This expectation, as noted earlier in section 1.2, is borne out.¹⁵

Finally, the richness of complement interpretations described in the third sentence of the "plan" is to be obtained by allowing all the lexical categories to assign θ-roles "indirectly", with the help of the new category P. Before I elaborate on how this is done, it is appropriate to digress and exemplify in some detail the range of constructions that are interpreted by direct θ-role assignment, in order to better understand what remains to be examined under indirect θ-role assignment.

1.5. Direct θ-role Assignment Exemplified

In LG, there is no P and hence no complements to P. But the base composition rule (22) in UG makes possible both intransitive P and a range of complements which receive a θ-role from P. These complements are discussed in detail in Ch. 6.

(26) P + NP: into the room, without John, etc.

15. One might object that the transition from LG to UG is so ancient that an impoverishment of SP(P) at that point would have been remedied by now. But it is not clear that really new functions in any of the other SP(X) systems (e.g., the invention of a new type of quantifier, modal, or degree of comparison) have emerged in any recent or observable stage of language.

(27) P + S: while John sang, because John sang, if John sang, since John sang, etc. Here P = the subordinating conjunctions; cf. Ch. 7.

(28) P + VP: while reading Kant, since reading Kant, after reading Kant, etc. These participial VP's with subordinating conjunctions of time are discussed in Ch. 2. Cf. also the VP's after as: he struck me as knowing the answer (Ch. 6).

(29) P + AP: There are few such cases, but it must be recalled that the ratio of transitive V to V with AP complements is high, so it is not surprising that the same holds for P.
He suddenly changed from sad to radiantly happy.
John strikes me as disturbed. Cf. Ch. 6 for argument that as = P.
Mary took John for sensitive. (idiomatic)

(30) P + PP:¹⁶
John is from near St. Louis.
They'll judge me as without qualifications.
Sue is in for a surprise. (idiomatic)
When the fight started, Bill made for behind the counter. (idiomatic)

(31) P without a Yᵏ complement: These are the English post-verbal particles, studied in Fraser (1965) and Emonds (1972). Bring the cat in, tear the house down, etc.

Since (23) is the only principle that allows P to assign a θ-role to a complement, P should not have more than one phrasal sister. In general, this is borne out empirically, but there are two cases where this prediction might be thought to be incorrect. The first concerns the existence of PP's like to Boston from New York and in New York at the Hilton, first shown to be constituents by Jackendoff (1973).

(32) It's from New York to Boston that he traveled.
It was in New York at the Hilton that he was found.

These PP's are assigned the structure (33) here, and require no further comment.

(33) [PP [Pᵏ [P in] [NP New York]] [PP [P at] [NP the Hilton]]]
(upper branching generated by base composition rule (20))

16. Hendrick (1976) argues convincingly that some of the P + PP structures proposed in Jackendoff (1973, 1977) are to be analyzed otherwise. But the structures in (30) still seem to me best analyzed as P + PP.


A second construction where P seems to have two complements, and in which both are arguably within P, is the "absolute phrase." These complements have been studied extensively by Ruwet (1978) for French, van Riemsdijk (1978) for Dutch, and Ishihara (1982) for English. In English, the construction is exemplified in (34):

(34) How can you work with children in the room?
With John president, certain tasks will get attended to.
The fact that I invited company with my kitchen empty shocked him.
With John being so difficult to please, you won't have a pleasant time.

The two complements in P in an absolute phrase, an NP and a Y², are clearly in a subject-predicate relation. Ishihara (1982), who utilizes the requirement in Chomsky (1981, Ch. 2) that each sister of a head receive a θ-role from that head, concludes that NP's such as children, John, and my kitchen in (34) cannot be sisters to P; they must rather form, with Y², "small clause" (= verbless S) complements to P. But these small clauses seem poorly motivated to me, since the conditions under which S appears without a verb are not made precise. Moreover, the added requirement in grammatical theory which entails the existence of these small clauses, that an X-internal NP necessarily receives its θ-role from X, strikes me as wholly unnecessary. By allowing an NP the freedom to be either an internal argument of X or an external argument of some Y² within X, the problem of specifying the distribution of verbless S's disappears. In the next chapter, I will define external argument (= subject phrase) more carefully, but here it suffices to say that the lowest NP c-commanding Y can be taken to be the subject of Y. That is, in (35), the first NP is the external argument of Y for all values of Y, and is thus assigned its θ-role (directly) by Y, not by P.

(35) with [NP children] [Y² in the room]
with [NP John] [Y² president]
with [NP my kitchen] [Y² empty]
with [NP John] [Y² being so difficult]

Y², on the other hand, can be assigned its θ-role by P, also in accordance with direct θ-role assignment. And furthermore, the principles given so far, which allow P to assign only one θ-role, in fact correctly predict that if P has two sister phrases, then one must be the external argument of the other (i.e., the one must be the subject of the other, or, in terms of traditional grammar, the phrase P must be an absolute construction). Possibly, the defining semantic characteristic of an absolute phrase is that Y² receives no θ-role at all; its semantic relation to the rest of the sentence would then be entirely pragmatic.

When we turn to θ-role assignment by the category V, we find a richer range of complement structures than with P. This is due to the fact that V is the only category which may assign θ-roles directly and indirectly. In this section, I limit discussion to those constructions where direct θ-role assignment by V occurs, and here we find counterparts to all the constructions with P discussed above. Since V is a lexical and P is a grammatical category (more on this distinction is given in Ch. 4), the number of V that enter into any one given construction is always greater than the number of P.

(36) V + NP: the ordinary case of transitive verbs.

(37) V + S: hope John will sing, say John will sing, etc. S complements to V are discussed in detail in Ch. 7.

(38) V + VP: in Ch. 2, I argue that the verbs of temporal aspect which take complements in V + ing have neither NP nor reduced S sisters; the complements are VP's.
They continued clearing the street.
Did David start doing his project?
She should cease describing those machines.

(39) V + AP: this is the typical case of predicate adjectives.
John appeared reluctant to leave.
That dessert tasted sweeter than candy.
His plant grew tall.
My friend stayed sober.

(40) V + PP: these are the intransitives whose sisters cannot be NP.¹⁷
John fell into the street.
*His older brother fell John into the street.
The train lurched into the tunnel.
*The engineer lurched the train into the tunnel.

(41) V without a complement: these are the strict intransitives, whose only "complements" are outside V.

17. I have not found any clear reason when V is intransitive to distinguish between V assigning a θ-role directly to a PP or indirectly to the phrase immediately dominated by PP. We might say that an obligatorily intransitive verb can assign a θ-role directly only to PP, since direct θ-role assignment applies to at most one sister of V. Another possibility is that PP's do not receive θ-roles directly. It can be observed that this is consistent with the "plan" transforming the prepositionless LG into UG; the purpose of P is to assign θ-roles, not to receive them. Note that the PP's that receive θ-roles in (30) above are good candidates for receiving them indirectly, since they are objects of grammatical formatives.


Another instance of V + NP besides those where NP is the direct object is when NP is a predicate nominal, as in (42).

(42) John became a surveyor.
He plays the student very well.
He has remained an assistant.
This chair resembles your couch.
She could appear the unwelcome guest.
I arrived a poor man.
John stayed a day-laborer all his life.

The main purpose of Ch. 6 is to demonstrate that even this construction has its counterpart in the PP system, and that P's such as noncomparative as and into take predicate nominal NP complements. It is appropriate to point out here, as was done above with prepositions, some instances where V is subcategorized for two sisters, and one is the external argument of the other. That is, V assigns a θ-role to one of the two sisters, and the θ-role of the second is assigned by the first, as in (43). These are the constructions analyzed as "small clauses" in Stowell (1981).

(43) V assigns θ-role directly to Yᵏ; Y assigns θ-role to external argument NP.

The various values of Yᵏ are illustrated in (44).

(44) Yᵏ = PP (Stowell, 1981, Ch. 4)
I expect John off the ship.
I dislike you in that suit.

Yᵏ = NP
We elected John secretary.
Chomsky considered that paradigm an interesting problem.
I judge this the best entry.
They appointed me guardian of your estate.
This law makes me an illegal alien.
The home secretary should classify them political prisoners.

Yᵏ = AP
The organization considered that law repressive.
The passenger disliked the reggae music loud.
Bill preferred his steak rare.
Kathy proved him wrong.
Few believe a labor party capable of redistributing wealth.
We should prepare the meat dry.

Yᵏ = VP (The non-NP, non-S status of these VP is shown in Ch. 2.)
John found Bill studying in the library.
I caught the assistant stealing from the drawer.

This concludes the exemplification of direct θ-role assignment by V and P, as well as that of cases where one subcategorized complement is the external argument of the other. Throughout, I have been utilizing the Subject Principle (15), which states that external arguments are always NP's, as well as the universal characteristic of V that S is a third projection of V that contains an obligatory external argument of V (as stated in (16) and (17)). As a result of this latter restriction, it is impossible for V³ to qualify as one of the Yᵏ in a structure (43) generated by the universal expansion rule for X¹ (22).

1.6. Indirect θ-role Assignment

The final sentence of the "plan" which transforms LG to UG can now be formalized: properties of P are to "provide the richness of complement interpretations to N, A, and even V, lacking in LG." The device which supplements the direct θ-role assignment by an X⁰ to its sister phrases is called "indirect θ-role assignment." However, indirect θ-role assignment does not specifically mention P; nor should it, since P is sometimes not involved. Since the special role of P is not stipulated but rather derived from other statements, this section is regrettably somewhat technical and may appear non-empirical. Nonetheless, the distinction between direct and indirect θ-role assignment is amply confirmed empirically in the following section (1.7), in my analysis of control (section 2.6), in the discussion of oblique cases (section 5.7), and in the behavior of clausal complements (esp. sections 7.1.1 and 7.1.2). Precisely because the justification for the distinction is pervasive, the introduction to it in this section is necessarily limited to technical definitions, supplemented with only a few examples.
The reader familiar with the government-binding framework may wish to bear in mind that my indirect θ-role assignment roughly corresponds to a range of situations where a phrase receives a θ-role from a certain lexical X, but does not receive case from X or SP(X); such phrases include indirect objects, complements to nouns, clausal subjects and objects, and infinitives of obligatory control. The notion of indirect θ-role assignment using P which I develop here is related to the "compositional θ-role assignment" suggested in Stowell (1981). In direct θ-role assignment, if D is a subcategorized argument of some head X, then D and X are sister constituents. For indirect θ-role assignment, I allow subcategorized elements D to "constitute" a sister or a subject NP to X, where "constitute" is defined as follows:

(45) D constitutes a C if and only if C dominates D and the only terminal elements under C are under D.
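Definition (45) is a condition on tree geometry, so it can be checked mechanically. The Python sketch below is one possible rendering under assumptions of my own: the `Node` class, the function names, and the convention that a childless node without a word counts as an empty node are all hypothetical, and D is treated as a single node rather than a sequence. The two sample trees are modeled on the "destruction ... the city" and "blame ... on John" configurations in (46).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)
    word: Optional[str] = None  # None on a childless node marks an empty node


def terminals(node):
    """Return the non-empty terminal nodes under (or equal to) node."""
    if not node.children:
        return [node] if node.word is not None else []
    found = []
    for child in node.children:
        found.extend(terminals(child))
    return found


def dominates(c, d):
    """True if d lies properly inside the subtree rooted at c."""
    return any(child is d or dominates(child, d) for child in c.children)


def constitutes(d, c):
    """(45): d constitutes c iff c dominates d and every terminal under c is under d."""
    if not dominates(c, d):
        return False
    under_d = {id(t) for t in terminals(d)}
    return all(id(t) in under_d for t in terminals(c))


# Empty P: the NP supplies every terminal under PP, so it constitutes the PP.
np_city = Node("NP", [Node("Det", word="the"), Node("N", word="city")])
pp_empty = Node("PP", [Node("P"), np_city])

# Lexical P: "on" is a terminal under PP but not under NP.
np_john = Node("NP", [Node("N", word="John")])
pp_on = Node("PP", [Node("P", word="on"), np_john])
```

With these trees, `constitutes(np_city, pp_empty)` holds because the P is empty, while `constitutes(np_john, pp_on)` fails because on is a terminal outside the NP; in the latter case it is the sequence on + NP, not the NP alone, that constitutes the PP.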

Some example configurations where D "constitutes" a sister or a subject NP of X, and hence is available for indirect θ-role assignment by X, are given in (46), along with fixed values for X.¹⁸

(46) (a) destruction, N, + ___ NP (D = NP):
[N destruction] [PP [P ∅] [NP the city]]

(b) sell, V, + ___ NP NP (D = NP):
[V sell] [NP books] [PP [P ∅] [NP John]]

(c) blame, V, + ___ NP on NP (D = on + NP):
[V blame] [NP troubles] [PP [P on] [NP John]]

(d) wonder, V, + ___ WH S (D = WH + S):
[V wonder] [S̄ [WH if] [S he left]]

(e) amaze, V, + S ___ NP (D = S, a clausal subject):
[NP [N ∅] [S that he left]] [SP(V) could] [VP [V amaze] [NP her]]

18. Actually, in the terminology of early transformational grammar, D "is a" C, but it seems too confusing to say that under indirect θ-role assignment, I require that subcategorized D "be a" sister to X, when I don't mean that D and X are necessarily sisters. In order to prevent indirect θ-role assignment from over-generating, it probably should be required that one category in D be subjacent to some projection Xᵏ of the head. But since subcategorization features are strictly speaking local transformations, this follows from conditions on transformations in UG automatically.

An even wider range of structures where indirect θ-role assignment is used will be presented in Ch. 2. The formal statement of the third sentence of the "plan" is as follows:

(47) Indirect θ-role Assignment: If direct θ-role assignment is not possible, a phrase Yᵏ (k ≥ 2) subcategorized by a member of a lexical category L, possibly together with an introductory grammatical formative, can be assigned a θ-role if it constitutes a sister or subject of L. This is called "indirect θ-role assignment."

It can be demonstrated formally why P plays a central role in indirect θ-role assignment. Suppose that a head X⁰ has a subcategorization feature + ___ F Yᵏ, with F possibly ∅, and that X⁰ is not a possible sister to Yᵏ (i.e., direct θ-role assignment is excluded). In order for (47) to apply, there must be some sister C of X⁰ such that F + Yᵏ constitutes C. Then either (i) C⁰ is empty, or (ii) C⁰ is not empty and dominates F.

(i) An empty C⁰ which is a P can be seen in (46a-b); an empty C⁰ which is N appears in (46e). Of course, these structures can surface only if there is a surface insertion of a lexical item under C⁰, or if some rule or principle operates so as to erase C⁰ in surface structure; that is, empty nodes throughout a derivation are ill-formed. In Ch. 4, 5, and 7, it will be shown that grammatical formatives often are inserted in contexts defined by surface structure, so that an empty C⁰ which is a P can be well-formed in two ways (via insertion of a formative or an erasure principle), while an empty C⁰ which is an L (N, A, V) can be well-formed only through an erasure. Examples like (46e), derived by such an erasure, will be discussed in Ch. 7. Thus, P plays a more central role than L.

(ii) It remains to discuss the case where C⁰ is not empty and dominates F, as in (46c-d). Since members of lexical categories must be inserted in deep structure frames that satisfy their own lexical specifications, a lexical category member F under C⁰ would assign a θ-role to the phrase Yᵏ under C; this would mean that Yᵏ could not be assigned another θ-role by the X⁰ outside of C, or else the θ-criterion would be violated. Therefore, a non-empty C⁰ dominating F must be of a grammatical category, whose members do not necessarily assign θ-roles. But the only non-lexical head-of-phrase category is P, so a non-empty C⁰ must be P; i.e., F is a preposition.¹⁸ᵇ

I have assumed that the sister C to X⁰ constituted by F + Yᵏ is endocentric (has a head). If C is S, and has no non-phrasal head, then, by the above reasoning, F could be any grammatical formative. However, I will argue in Ch. 7 that the head of S is COMP = P, so that this case reduces to the endocentric one. Thus, we see that the existence of the category P is precisely the increment that separates UG from LG. In the typical cases just described, if a subcategorized D receives a θ-role indirectly, it will be the object of an empty P (46a-b) or an entry-particular subcategorized P (46c-d). Indirect θ-role assignment allows the list of missing complements in LG to appear in UG: indirect objects, all complements to N and A, directional complements to transitive verbs, indirect questions, clausal subjects, etc. The only modifications in the subcategorization features of the lexical entries entailed by indirect θ-role assignment are the desired additions (e.g., + ___ S, + ___ NP NP, etc.) and extensions of a feature like + ___ NP from a verb to all the lexical nouns and adjectives related to that verb. Indirect θ-role assignment is not accompanied by any case-marking device, basically because of the sparsity of base categories available for marking each other.¹⁹

Some indirect θ-role assignments involve P which are empty nodes in deep structure, and which may or may not be accompanied by a syntactic feature such as DIRECTIONAL. A more exact account of how these various empty P are filled with morphemes such as of, to, from, as, etc. is provided in section 1.8. There are also indirect θ-role assignments by X to a phrase Yᵏ which depend on the presence of a subcategorized grammatical formative such as WH, a certain idiomatic P, or even an element such as it (he hates it that you stop so often) that together with Yᵏ constitutes a sister to X; cf. (46).²⁰

18b. In Ch. 4, we will see that certain V are "grammatical", rather than lexical; that is, there is a closed subclass of V which, like P, are non-lexical heads of phrases. Nothing in theory prevents such V from participating in indirect θ-role assignment, exactly analogously to the way P participate. In languages like English and French, grammatical V are not utilized in this way for indirect θ-role assignment (or for "case-marking"). This may be due to a parameter which differentiates language types. Languages which have "serial verb constructions" often apparently lack PP structures; Chinese, as I. Roberts has pointed out to me, is a clear case. The more grammaticalized serial verbs may just be closed class V participating in indirect θ-role assignment. In fact, a language like Chinese may replace all or almost all of the English uses of PP with VP structures, whose V are in the closed or "grammatical" subset of V to be discussed in Ch. 4. English and Chinese would then differ as to which category is used in indirect θ-role assignment, P or V.

19. If case names were other than the base categories themselves (counter to my claim here), then nothing in principle would prevent there from being as many case categories as there are structural relations. But empirically, languages have a restricted number of cases, each of which is used to express a range of structural relations. This fact supports my position that cases correspond to projected categories rather than to the much greater number of structural relations these categories can enter into.

20. A technical summary of all possible θ-role assignments to sisters of P may be in


The final formulation of the part of the θ-criterion that has been developed in this section is as follows:

(48) Every θ-role obligatorily present in the lexical entry of some X⁰ must be assigned to exactly one argument of X⁰. Every phrase "(lexical P) - (Yᵏ)" which is or constitutes an argument of X must be assigned a θ-role.
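The first sentence of (48) amounts to a uniqueness condition that is easy to state as a check over a set of role assignments. The Python sketch below is illustrative only: the function name `first_clause_satisfied`, the representation of assignments as (role, argument) pairs, and the sample roles "theme" and "goal" are hypothetical and stand in for whatever a given lexical entry actually lists.

```python
def first_clause_satisfied(obligatory_roles, assignments):
    """Check the first sentence of (48) for a single head.

    obligatory_roles -- θ-roles obligatorily present in the head's lexical entry
    assignments      -- iterable of (role, argument) pairs for that head
    """
    receivers = {}
    for role, argument in assignments:
        receivers.setdefault(role, set()).add(argument)
    # every obligatory role must end up on exactly one argument
    return all(len(receivers.get(role, ())) == 1 for role in obligatory_roles)


# Hypothetical entry with two obligatory roles.
roles = {"theme", "goal"}
ok = first_clause_satisfied(roles, [("theme", "books"), ("goal", "John")])
missing = first_clause_satisfied(roles, [("theme", "books")])
doubled = first_clause_satisfied(
    roles, [("theme", "books"), ("theme", "troubles"), ("goal", "John")]
)
```

The sketch deliberately ignores how each role is assigned (directly or indirectly) and does not model the second sentence of (48).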

The second part of (48) will be simplifed in Ch. 2, where a final aspect of the 0-criterion is examined ("every N P receives at most one 0-role"). In what follows, indirect 0-role assignment to complements of V will be subsumed under other topics; in particular, indirect objects are discussed in section 1.8 under case-marking, and clausal subjects and indirect questions are discussed in Ch. 7. As evidence in favor of distinguishing between direct and indirect 0-role assignment, I next discuss how indirect 0-role assignment permits nouns and adjectives to have a complement system. In my view, this is the only device which allows these categories to have phrasal complements. It is fairly controversial to hold that 0-roles are assigned differently to complements of verbs and to complements of nouns, as I claim. Chomsky (1981, Ch. 3) and Stowell (1981, Ch. 3) attribute the asymmetry in the noun and verb complement systems to how these two categories assign case, and claim (if I understand correctly) that subcategorization and 0-role assignment is otherwise parallel in the two

Footnote 20 (continued): order. Assume first that P is lexical, rather than empty. The lexical item inserted under P may be subcategorized to forbid complements (e.g., the particle together), or it may take one, or more than one complement. If this lexical P takes one complement (any Yᵏ, k ≥ 2), then either it assigns a θ-role directly to Yᵏ, or it (P) is the grammatical formative which participates in the indirect assignment of a θ-role to Yᵏ by the higher head which is a sister to PP (cf. the next section); both of these situations commonly occur. Suppose that the lexical item under P is subcategorized to take more than one sister phrase, say C and C'. At most one of these, C, may receive a θ-role directly from P, by (23). So C' can get a θ-role only by (a) being subcategorized for and assigned a θ-role by a higher head, or (b) being an external argument of some other head Z. But (a) is impossible since C' does not constitute a sister to a higher head (because of C). So C' must be the external argument of and hence c-command Zᵐᵃˣ. The only candidate for Zᵐᵃˣ under PP is C (C being the constituent given a θ-role directly by P). Hence, if P is subcategorized to take more than one sister, one of the sisters is the subject of the other; with, whose subcategorization is + ___ NP (Yᵏ), may head an absolute phrase if and only if Yᵏ is chosen. Finally, assume that P is empty. Clearly, it cannot assign a θ-role itself. Thus, a sister Yᵏ of P must either be the external argument of some other head Z, or else Yᵏ must get a θ-role indirectly from the higher sister to PP. In the former case, Zᵐᵃˣ must also be a sister to P, but qualifies for neither direct (P is empty) nor indirect (because of Yᵏ) θ-role assignment, and hence violates the θ-criterion. So, when P is empty, it has at most a single sister which receives a θ-role indirectly from a higher head.
This is an exhaustive list of the possibilities for θ-role assignment to sisters of P, and it turns out that all and only the structures predicted to exist by principles (22), (23), (47), and the θ-criterion do in fact occur.


systems, with any irregularities being typical of the kind of variation found across lexical entries. Put in another way, for Chomsky and Stowell, θ-role assignment and subcategorization are not autonomous, but case assignment is. In my view, there is a limited autonomy between subcategorization and θ-role assignment. (This is already expressed, for example, when I claim that some subcategorized complements receive an external θ-role from another complement rather than an internal one from their governing head.) But, given this autonomy, case-marking is no longer an autonomous set of category-specific statements; in the framework of section 1.8, it is reduced to a single general principle about bar notation categories "projecting" onto N and A.²⁰ᵇ

1.7. The Asymmetry in Noun and Verb Complement Systems

If a noun or adjective is subcategorized as + ___ NP in the lexicon (e.g., most derived nominals of transitive verbs are so subcategorized), the only way this NP can receive a θ-role is via a PP structure. That is, the subcategorization can be satisfied only by a P - NP sequence which constitutes a PP sister of the head and in which the P is empty in deep structure. Such a P is typically realized as of in surface structure, although a principle of derived structure (the "Empty Head Principle" in the first appendix of Ch. 2) licenses an empty P in certain transformationally altered constructions. In a case language like German, the NP's resulting from the feature + ___ NP on a V typically receive morphological accusative case. Complements to N and A may have the surface status of a prepositionless NP, but then they are invariably marked as dative or genitive rather than accusative (van Riemsdijk, 1983). As will be seen in the next section and in more detail in section 5.7, I attribute this sort of oblique case inflection to the presence of a deep structure PP (as with indirect objects of V), so that, as far as phrasal sisters to X⁰ are concerned, all and only the NP's which are sisters to X⁰ are sisters to V or P. Thus, contrasting with direct θ-role assignment to NP's in examples like (26) and (36), we find, in the N system, either a PP structure with an empty or entry-particular P for direct objects, or that the feature + ___ NP does not carry over to the derived nominal at all: (49)

the description of a city; the promise of reform; the answer {to/*of} a question; the blame {for/*of} the accident; John's marriage {to/*of} Sue; *John's anxious expectation of this bad news; *Mary's reception of a phone call.

20b. AP's are also case-marked, but not directly by the V of which they are a subcategorized complement. Languages with morphological case-marking indicate that case-marked AP's receive case from the Nʲ they modify (Latin, German), or possibly from being in a PP structure (the Polish and Russian uses of the instrumental case for various predicate adjectives).


Verbs which take predicate nominals either do not have corresponding derived nominals (if the principal use of the verb is in this context), or the derived form cannot take predicate nouns without an introductory as. (The prepositional status of this as is established in Ch. 6.) (50)

*I was disappointed by John's unexpected remaining a cook.
*The becoming {of/as} an adult entails responsibilities.
*Her ten-year stay a political prisoner ruined her career.
Cf. Her ten-year stay behind bars ruined her career.
*My arrival a poor man surprised my family.
Her appearance {as/*Ø} the unwelcome guest was embarrassing.
The chair's resemblance {to/*Ø} a couch is surprising.

Since "linking verbs" as in (50) do not assign case, the asymmetric case-marking ability of N and V in the Chomsky-Stowell system does not explain why derived nominals never tolerate predicate nouns. But the requirement that complements within N be θ-marked indirectly, by being within a PP, does. Predicate adjectives are also not case-marked by V, as a cursory survey of languages which mark case morphologically will show. But still we find that AP sisters to V are tolerated, while those to N are not. This again suggests that the ability to case-mark is not the source of asymmetry. (51)

John appeared reluctant to leave.
*We were surprised by John's appearance reluctant to leave.
That dessert tasted sweeter than candy.
*That dessert's taste sweeter than candy overwhelmed us.
His plant grew tall.
*His plant's tall growth is easily explainable.
My friend stayed sober for years.
*I am happy about my friend's stay sober for years.

Superficially complicating the issue here is the existence of many derived nominals which paraphrase verb-predicate attribute combinations: (52)

(a) She could recognize that the trumpet sounded flat.
(b) She could recognize the trumpet's flat sound.

But in cases like (52b), the trumpet's flat sound is also a paraphrase of "N is — A" (the trumpet's sound was flat), and some interpretive device ID must account for this latter alternation independently of "V-predicate attribute" combinations, as shown in (53): (53)

(a) She could pick out the trumpet's flat note. The trumpet's note was flat. *The trumpet noted flat.


(b) We disliked that flat rendering of the Davis tune. That rendering of the Davis tune was flat. *They rendered the Davis tune flat.

Not only can the device ID work on N + A when the corresponding V is not + ___ AP, it may not work on N + A when V, + ___ AP exists but N-is-A does not: (53)

(c) *John's appearance was reluctant to leave. *His plant's growth was tall.

It must be concluded then that the interpretation of A + N combinations as in (53) does not result from any parallelism with V + AP combinations, and hence does not involve a similar assignment of θ-roles. That is, the adjectives that modify nouns are not arguments of the nouns, but rather modifiers of a different sort. Both in traditional and early transformational grammar, it was assumed that these adjectives are directly related to relative clauses, at least in their mode of interpretation; as such, they do not receive θ-roles from N, but are interpreted by some independent mechanism. I retain this assumption here, as there is no evidence to contradict it. With this assumption providing the explanation for the existence of apparent counterexamples like (52), the examples of (51) provide direct confirming evidence that N does not directly θ-mark AP, while V does.

Further confirming evidence for my view that N and V assign θ-roles differently (and that asymmetric case marking is only a special case of the more general contrast) can be obtained by examining predicate attributes (NP's and AP's) to transitive as well as intransitive verbs. In section 1.5, I gave examples of V and P in which the head has two phrasal sisters which receive a θ-role directly, one from the head (V or P) and the other from the complement of which it is the subject NP. The direction of θ-role assignment was diagrammed in (43) and typical examples are repeated here: (54)

P has two sisters: I invited company with my refrigerator empty. V has two sisters: The organization considered that law repressive.

Since sisters to N can be assigned θ-roles only indirectly, it follows that derived nominals corresponding to examples in (44) should be excluded. They are so excluded: *my expectation John off the ship, *our election John secretary, *my judgment this the best entry, etc. Again, this cannot be completely attributed to the fact that a derived nominal does not assign case. Lack of case is remedied in the Chomsky-Stowell system by the insertion of the case-marking P of, and yet this does not produce acceptable sentences. (Cf. the acceptable sentences of (44).)

(55)

(a) *Our election of John secretary was illegal.
*Chomsky's consideration of that paradigm an interesting problem was a turning point.
*My judgment of this the best entry was criticized.
*Their appointment of me guardian of your estate was a mistake.
*Their making of me an illegal alien was unprecedented.
*Any classification of them political prisoners would be a step forward.
*The consideration of that law repressive is evidence of an open mind.
*Her dislike of reggae music loud cost her a friend.
*Bill's preference of his steak rare came as no surprise.
*Kathy's proof of him wrong came at a good moment.
*A belief of a Labor Party capable of redistributing wealth was a post-war characteristic.
*They recommended the preparation of the meat dry.

The theory of θ-role assignment does not determine by itself when the use of the prepositions of and as together suffices to create a derived nominal corresponding to an example as in (44). It only predicts that if such a derived nominal exists, all the θ-role assignments within it will be indirect. So there are acceptable examples as in (55b): (55)

(b) Our election of John as secretary was illegal. Your opinion of that law as repressive is evidence of an open mind.

When no alternative with as is available, the derived nominal for the verb with a subcategorization feature + ___ NP AP or + ___ NP NP either does not exist, or is not compatible with these complements.²¹ The reader may verify that derived adjectives corresponding to verbs which take predicate attributes do not appear with AP and NP complements either. Just a few typical examples are given here:

21. Chomsky (1970) briefly discusses "action nominalizations", which for the most part exhibit internal structure typical of NP's, and whose head is of the form V-ing. Predicate attributes in such nominals are marginally acceptable: ?Their painting of the White House bright red disturbed even the Secretary of Labor. ?His calling of the rebels Communists gave the signal to the death squads. The restrictions that action nominalizations are subject to, such as the above and several others pointed out by Chomsky, suggest to me that they are derivatively generated, for reasons entirely analogous to those given in Chomsky (1970) in his discussion of noun-modifying adverbial clauses (cf. section 7.3 here).

(55)

(c) *He criticises desserts tasty too sweet. *We were not considerate of our guests very comfortable. *Mary was judgmental of John ill-tempered.

So far, I have shown that N's and A's cannot assign θ-roles directly to NP and AP sisters, whatever their grammatical function. I defer until the next section the explanation of how the asymmetric case-marking properties of N and V follow automatically from the asymmetries imposed on deep structures by θ-role theory. For the moment, my purpose is to show that elaboration of case-marking theory does not suffice to explain the asymmetries of complement structure between V on the one hand, and N and A on the other.

I now turn to the question of whether N can assign θ-roles directly to a projection of V. The system of θ-role assignment proposed here again predicts an asymmetry in the array of possible clausal complements to N and V. V and P should be able to take Vᵏ sister complements, where k = 2 or 3, without benefit of an introductory grammatical formative of the category P. In these cases, V and P but not N can assign a θ-role directly to its Vᵏ sister.

The first such gap in complements to N's that concerns us here has to do with a restricted kind of non-finite complement in English. In a previous extensive study of English gerundives (i.e., the non-finite clausal complements introduced by V + ing; Emonds, 1976, Ch. 4), I gave a series of syntactic tests in support of a similar hypothesis of Rosenbaum (1967) that gerundives are noun phrases throughout a transformational derivation, with two exceptions: (a) The V + ing complements of intransitive verbs of temporal aspect are not NP's (examples italicized in (56)), because they do not undergo various NP movements and otherwise do not behave as NP's. (b) For similar reasons, both the V + ing complements and certain "bare", i.e., to-less infinitives after transitive verbs of perception (italicized in (57)) are not NP's.²²

(56) They continued clearing the street.
Did David start doing his project?
She should cease describing the machines.

(57) One should see a cat fight(ing) another cat.
They noticed me tak(ing) a tooth brush off the rack.
They will arrest Bill picketing that home.

22. In certain cases, perception verbs may also take a gerund NP object (My/?Me taking the tooth brush off the rack will never be noticed), so one needs some argument that there is an alternative non-NP structure. In such cases, the bare infinitive is more transparently not an NP: *Me take a tooth brush off the rack was noticed right away. Also, ordinarily, the subject of a gerund cannot be extracted (*Who do they enjoy playing the piano?), so that (OK) Who did they notice taking tooth brushes indicates that notice can have either a gerundive direct object NP or an NP - VP complement sequence, where the VP is either a bare infinitive or a V + ing form. The latter two exemplify the non-progressive/progressive distinction, again indicating a difference from gerundive NP's, where the ing form is not necessarily progressive.


It will be argued in Ch. 2 that the English infinitival marker to appears only in the category S. The non-finite complements in (56)-(57) are exactly the VP complements to verbs which, by virtue of lacking to, may qualify as base V² rather than V³ complements. Since such status would then encompass all the gerunds which fall outside the generalization that gerunds are NP's, it is all the more plausible that (56) and (57) exemplify verbs subcategorized as + ___ V² and + ___ N² V² respectively. All gerunds in English other than these are NP's, while (non-NP) V³ complements are never gerunds or bare infinitives; they are always finite, or infinitives with to. If the non-finite complements in (56)-(57) are V² complements, then the limitation of direct θ-role assignment to complements within V and P predicts that derived nominals should exhibit a gap with respect to (a) and (b) type complements. And this is the case: (58)

*The continuation (of) clearing the street was a surprise.
*We were all relieved at David's start (of) doing his project.
*A cessation (of) describing the machines would be welcome.
*The sight of a cat fight another cat is interesting.
*Their notice of me take tooth brushes led to my arrest.
*The arrest of Bill picketing that house went unnoticed.

That is, there is no grammatical P which is insertable in N ___ VP to permit indirect θ-role assignment to the base VP's in (58). The contrast of (56)-(57) vs. (58) cannot be explained by a difference in case-marking ability between V and N.

It may be remarked that French has non-finite complements that correspond to (57): the infinitives following transitive perception verbs. But it does not have non-finite complements corresponding to (56) after verbs of temporal aspect. However, there may be no "gap" in the French paradigm; rather, the V - V² structures in French are possibly exemplified by a different semantic class. The initial V's in these structures are the motion verbs such as monter, descendre, sortir, partir, etc. It has been noted in the literature (e.g., Gross, 1968) that the prepositionless infinitives that follow these verbs may not be negated, passivized, or otherwise modified. It is also the case that the derived nominals of these verbs may not appear followed by such infinitives: (59)

Michel est sorti acheter du vin. "Michael went out to buy wine." Sa sortie (*acheter du vin) n'a pas été remarquée. "His leaving (to buy wine) was not noticed." Marie va descendre voir ses amies. "Mary will go down to see her friends." Sa descente (*voir ses amies) sera probablement périlleuse. "Her going down (to see her friends) will probably be dangerous."


Elle part faire du tourisme. "She's leaving for some touring." Son mari est triste de son départ (*faire du tourisme). "Her husband is sad over her leaving (for some touring)."

By analyzing these infinitives as VP complements to V, the basis is laid for an account of their inability to passivize, to be negated, etc. And furthermore, the impossibility of direct θ-role assignment between a derived nominal and a VP can explain the contrasts in (59). The VP sister complements to P corresponding to (56) are the participles that appear after before, while, since, etc. (to be discussed in Ch. 2), and those corresponding to (57) are the absolute phrases discussed in section 1.5.

Having now seen instances of V² complements to V and P which are lacking inside N, we can turn our attention to whether there are V³ (= S) complements with the same skewed distribution, as the present θ-role theory predicts. In most recent generative work, an embedded S is assumed to be generated by the following rule: (60)

S̄ → COMP - S

In Ch. 7, I will argue in detail that S̄ should be identified with P̄ and that the grammatical formative category COMP should be identified with P. Given this result, the above question becomes, can V take a range of S complements directly, without an intervening COMP and S̄ (just as P = COMP can), while the categories that cannot assign θ-roles directly, N and A, cannot? That is, the distinction between direct and indirect θ-role assignment suggests that N and A can have S complements only with an intervening COMP and S̄, while V should be able to take an S sister.

Before giving the evidence for an affirmative answer to the question, which will confirm the θ-role theory being presented here, I should recall the many parallelisms among the N, A, and V heads concerning the types of clausal complements (S) they may take. As is well-known, verbs and derived nominals have similar and often identical subcategorizations with respect to most clausal complement types. Thus, when for-to infinitives, that-clauses, indirect questions, present subjunctives, and infinitives with necessarily missing subjects ("control infinitives") appear with a verb, they generally appear with the corresponding derived nominal also: (61)

{They reported that/the report that} the Polish Party resumed negotiations with the Soviet envoys. {*They reported/*the report} for the Polish Party to resume negotiations. {Someone explained/someone's explanation of} which cities to visit. {*She believed/*her belief of} which cities to visit.


{*One generally avoids/*a general avoidance} that people work for free.
{One generally avoids/a general avoidance of} working for free.
{They demand that/a demand that} each expenditure be (?is) recorded on paper.
{They hope that/a hope that} each expenditure is (*be) recorded on paper.

These parallelisms are suggested in Rosenbaum (1967). As expected, since gerunds are NP's (Emonds, 1976, Ch. 4), the derived nominal structure requires a gerund to be introduced by of, like any other NP corresponding to a direct object of a verb.

When the parallelisms between clausal complements to N and V break down, we can expect, according to my principles of θ-role assignment, that there will be cases where V and S are sisters, while N and S are not. An instance of this in English seems to be that the COMP can be absent between certain V and a finite clause: (62)

John feared Mary would be late. *John's fear Mary would be late turned out to be justified. She decided no one qualified. *Have you heard about her decision no one qualified?

In my terms, the COMP (= P) that is providing the structure for indirect θ-role assignment, which is allowed for both N and V; but V can also have a subcategorized S sister which is θ-marked directly. Thus, it is correctly predicted that that will appear optionally with some verbs, that it must appear with nouns, and that, if it appears after a P, it will be part of an atrophied lexical P such as in that, now that, etc.

It might appear that examples such as (63) involve direct θ-role assignment to two sisters to V, contradicting the restrictions on direct θ-role assignment stipulated in (23). (63)

John told Sam we would be late. Did she promise you I would be hired?

However, I have previously mentioned that the indirect object NP receives a θ-role indirectly, by virtue of an empty P in deep structure. The examples in (63) are among the cases where it is justified to consider the post-verbal NP as the indirect object in deep structure rather than the direct object. Thus only the embedded S's in (63) receive a θ-role directly, in accordance with (23).

In French, there is no large class of verbs where que "that" can be omitted before the complement S. But significantly, with only one or two exceptions involving WH (si "if", quand "when"), que is also always required between subordinating conjunctions P and a complement S:


pendant que "while", avant que "before", puisque "since", lorsque "when", bien que "although", dès que "as soon as", etc. If we make the plausible assumption that que is inserted in French with every finite S, the following rule can be postulated, essentially as part of the mapping from surface structure to phonological form: (64)

Ø ⇒ que / C ___ finite S; C ≠ WH; obligatory
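As an illustrative aside (not part of the original analysis), rule (64) can be read procedurally. The sketch below is a minimal toy rendering in Python, assuming a hypothetical `Governor` record for the item C and a plain string for the clause; it ignores elision (qu') and everything about syntactic category assignment, and models only the stated conditions: the clause is finite and C is not WH.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Governor:
    """The item C preceding the clause in rule (64) (hypothetical type)."""
    form: str            # surface form, e.g. "pendant", "si"
    category: str        # "V", "N", "A", or "P"
    is_wh: bool = False  # True for WH items such as si "if", quand "when"


def insert_que(c: Governor, clause: str, finite: bool) -> str:
    """Insert que between C and a finite S unless C is WH (toy version of (64))."""
    if finite and not c.is_wh:
        return f"{c.form} que {clause}"
    # Non-finite clause or WH governor: the rule does not apply.
    return f"{c.form} {clause}"


print(insert_que(Governor("pendant", "P"), "Marie dort", finite=True))
# prints: pendant que Marie dort
print(insert_que(Governor("si", "P", is_wh=True), "Marie dort", finite=True))
# prints: si Marie dort
```

The `category` field plays no role in the insertion itself; it is kept only because the discussion of (64) distinguishes the cases C = N, A, lexical P, and V.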

If C = N or C = A, there is an empty P required by indirect θ-role assignment between N, A and a finite S, so que will fill that P in phonological form. If C = a lexical P, there will be no such empty P following, and if C = V, there need not be. In these latter cases, que will be inserted into the terminal string but will not be assigned a syntactic category. With this clarification for French, it appears that the prediction that V and P assign θ-roles to S complements in parallel fashion can stand (both for French and English).

In the case of infinitival V³ (= S) complements,²³ French infinitives and English infinitives with lexically realized subjects (impossible in French) both provide direct evidence in favor of the contention that P and V assign θ-roles directly, while N, A, and V assign them indirectly. Huot (1981) argues that the morphemes à and de which introduce French infinitives are members of COMP. Here, using my results of Ch. 7, this means they are P; this is of course corroborated by the fact that à and de are also P's which appear in the context + ___ NP in French. It is then predicted that N and A can have infinitival complements only if they are introduced by a P (such as à and de), while some V may be subcategorized to take infinitives without a COMP. Again, this is borne out: (65)

Il préfère (*à) boire du vin blanc avec le poisson. "He prefers to drink white wine with fish."
Sa préférence {à/*Ø} boire du vin blanc avec le poisson a été encouragée. "His preference for white wine with fish was encouraged."
Claire a voulu (*de) changer de travail. "Clare wanted to change her job."
La volonté (de Claire) {de/*Ø} changer de travail n'a pas été respectée. "Clare's wish to change jobs wasn't respected."
Il dit pouvoir influencer ses parents. "He says he can influence his parents."
On a discuté son pouvoir {d'/*Ø} influencer ses parents. "They discussed his ability to influence his parents."

23. I return in Ch. 2 to the task of establishing the S status of infinitives; while this is taken as established in much generative work, I will argue that a distinction between infinitival S complements and non-S participial clauses has not been adequately explained in previous work. This will entail re-examining the rationale for analyzing an infinitive as S.

The main verbs in (65) assign θ-roles to clausal complements directly, while their noun counterparts cannot do this in principle. No proposal in terms of a motivated case-marking asymmetry between N and V can explain the contrasts in (65). Incidentally, it is no more surprising that some lexical P in French take infinitives obligatorily introduced by à and de (afin de, jusqu'à) than it is that some verbs do.

Turning again to English, it is a striking fact that the relatively marked situation whereby the subject NP of an infinitive may be lexically realized obtains in English both after a COMP (= P) for and after a relatively small set of verbs of belief and desire. For a variety of reasons, Chomsky (1981, Ch. 2) argues that in the latter cases, no S̄ can intervene between the verb and its S complement, italicized in (67):

(66) For Sue to lose would upset us.

(67) He expected there to be lasting peace.
We prefer the weather to be cool.

In Chomsky's system, there is no natural connection between the fact that English has one COMP (P) which takes an infinitive with a lexical subject, and the fact that some verbs in English trigger a rule of "S̄-deletion." But here, items such as for, expect, and prefer can share an identical subcategorization for an infinitival S sister, and hence only one formal device in the grammar of English gives rise to infinitives with lexical subjects.²⁴ The impossibility of direct θ-role assignment by N in the present system implies that derived nominals corresponding to (67) can contain clausal complements only if they are assigned a θ-role indirectly, via an intervening COMP (= P). And this is the case:

(68) *His expectation (of) there to be a lasting peace was never met.
*Our preference (of) the weather to be cool should be taken into account.
Our preference for the weather to be cool should be taken into account.

24. A more complete analysis of a complementizer which takes an infinitive that allows a lexical subject will be given in Ch. 7. Throughout the discussion there, it remains crucial that V and P can take S sisters, while N and A cannot. This is all that is at issue in this section. English infinitives without a lexical subject appear after N, A, and V, but rarely with P; they appear only with P = in order, rather than, so as, about. I leave the scarcity of "P + obligatory control infinitive" combinations as an unexplained lexical discrepancy for the moment.


We have now seen several instances of Vᵏ complement structures which bear out the theory of θ-role assignment developed above, namely, that V and P assign θ-roles directly to sisters, while N, A, and V assign them indirectly, to Vᵏ embedded in PP. The Vᵏ structures which exemplify this claim are (i) the V² participial clauses of temporal aspect which appear after V and P (e.g., begin, before) but not in derived nominals; (ii) the V² constructions which appear with transitive perception verbs and in absolute constructions (P = with), but not after the nominals derived from perception verbs; (iii) the COMP-less finite V³ complements which are direct objects of English verbs like fear, decide, and promise; (iv) the French (subjectless) V³ infinitives which can be immediately preceded by a governing V or P, but never by a governing N or A (an intervening P being necessary); and (v) the English infinitives with lexical subjects after the P for and the V's like expect and prefer, excluded after derived nominals such as expectation and preference.

This constitutes an ample demonstration that several asymmetries between the clausal complements to V and those to N (and A) are due to the ability of only V and P to assign θ-roles directly to Vᵏ complements. Since, in the system devised by Stowell (1981, Ch. 3), the various Vᵏ do not even receive case, this asymmetry cannot be properly considered as due to the theory of case-marking. Rather, the more general principle of asymmetric θ-role assignment developed here is the appropriate explanation for the much wider range of facts discussed throughout this section. Differences in case-marking should in fact follow from the influence of the θ-role interpretation theory, rather than being stipulated as primitives in universal grammar. It is to this topic that I now turn.

1.8. A Generalized Theory of Abstract Case

Recent generative work in the framework of Chomsky (1981) has crucially used a theory of abstract case-marking of NP's to account for some of the N-V asymmetries discussed in the preceding section. As I have argued in section 1.7, case theory is incapable of expressing the full scope of these differences. But the theory of case has also been justified by the role it plays in accounting for aspects of constructions derived by NP-movement, such as the passive and the subject-to-subject raising constructions. Finally, abstract case-marking can be related to morphologically represented case rather directly, especially in languages that inflect for case productively, such as Arabic, German, Classical Greek, Japanese, Latin, etc. (Henceforth, I abbreviate Classical Greek as Greek.)

1.8.1. The Nature of Case Categories

Current generative work on case has simply taken over the traditional names of morphological cases from the Greek and Latin tradition: nominative, accusative, etc. For example, a verb assigns a feature "accusative" to an NP. To take this type of statement seriously as a formal operation


would weaken an early implicit claim of transformational grammar, namely, that the base component is the "categorial" component, the component where all categories are defined by the relations they stand in with each other. Under this conception, the transformational component contains operations which alter the relations of categories defined in the base, but introduce no new categories. As one example, an agreement transformation can be taken as a rule which copies a category in the base structures into a non-base position, but introduces no new category. A case-feature such as "accusative" would be a category that does not appear in the base, but only when case is assigned (at s-structure for Chomsky, 1981, and at NP-structure for van Riemsdijk and Williams, 1981); this would violate the restriction that all categories of grammar are defined by base configurations.

This objection can be eliminated if case features are taken as projections of the case-marking bar notation categories themselves onto NP's and certain AP's, either as features or as indices. This move also has the advantage of predicting that the number of cases is restricted; the only possible case features are as in the following list, where typical correspondences with morphological cases are given for concreteness.

[NP, SP(V)]: Arabic, German, Greek, Latin nominative and Japanese ga.
[NP, V]: Arabic, German, Greek, Latin accusative and Japanese o.
[NP, P] (P ≠ Ø): Arabic genitive, German and Greek dative, Latin ablative, and Japanese ni.
[NP, P] (P = Ø): Arabic accusative, German, Greek and Latin dative, and Japanese ni.²⁵
[NP, SP(N)]: Arabic, German, Greek, Latin genitive and Japanese no.
[NP, SP(A)] and [NP, SP(P)]: the case of measure phrases.

The number of surface cases can be expanded somewhat, since apparently case-marking can project one or two gross syntactic features of at least P onto NP. Thus, the Slavic languages distinguish among dative, instrumental, and locative cases.
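As an illustrative aside (not part of the original analysis), the correspondences in the list above can be written down as a small lookup table. The sketch below is a minimal Python rendering; the names `CASE_REALIZATION` and `surface_cases` are hypothetical, the labels "P overt" and "P empty" stand for P ≠ Ø and P = Ø, and the measure-phrase configurations are omitted because the list gives no language-specific realization for them.

```python
# Structural configuration -> morphological case (or particle) per language,
# transcribed from the list in the text. Names here are illustrative only.
CASE_REALIZATION = {
    "[NP, SP(V)]": {"Arabic": "nominative", "German": "nominative",
                    "Greek": "nominative", "Latin": "nominative", "Japanese": "ga"},
    "[NP, V]": {"Arabic": "accusative", "German": "accusative",
                "Greek": "accusative", "Latin": "accusative", "Japanese": "o"},
    "[NP, P], P overt": {"Arabic": "genitive", "German": "dative",
                         "Greek": "dative", "Latin": "ablative", "Japanese": "ni"},
    "[NP, P], P empty": {"Arabic": "accusative", "German": "dative",
                         "Greek": "dative", "Latin": "dative", "Japanese": "ni"},
    "[NP, SP(N)]": {"Arabic": "genitive", "German": "genitive",
                    "Greek": "genitive", "Latin": "genitive", "Japanese": "no"},
}


def surface_cases(language: str) -> list[str]:
    """Return the distinct surface forms a language uses across the configurations."""
    return sorted({row[language] for row in CASE_REALIZATION.values()})


print(surface_cases("Arabic"))  # ['accusative', 'genitive', 'nominative']
print(surface_cases("German"))  # ['accusative', 'dative', 'genitive', 'nominative']
```

Collapsing the table per language shows how the same five configurations map onto three surface forms in Arabic and four in German, which is the kind of many-to-one projection the list describes.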
The same sensitivity to the gross structure of the case-marking category P is also observed when a language has instead a slightly restricted set of surface cases. Arabic, which has only three surface cases, uses the accusative to translate Latin and German datives where there is an empty P (in both indirect object and causative constructions), and the genitive to translate the Latin ablatives and German datives, the cases of overt P's. Such systems will be discussed in section 5.7.

25. A cursory knowledge of Latin supports the view that the dative and the ablative case are essentially the same. In it, they are always identical in the plural, and this is true of no other pair of cases; they are also identical in the second declension singular. In the singular declensions where the ablative and dative differ, a simple rule adding -ī to the stem derives all datives. We return to this in section 5.7.

If case-marking were actually viewed as a device which introduces new features into structures, essentially arbitrarily, then there would be no prediction that the number of truly independent morphological cases seems to hover closely around four, and (more importantly) that each of the four tends to translate isomorphically across case-marking languages.

A claim that case categories are actually the categories V and P realized as features on NP and (morphologically) on N might entail a technical difficulty if one were convinced that an archi-category unites V and N (Jackendoff, 1977) or P and N (Chomsky, 1981). However, I have not been convinced by the arguments put forward for these classifications. The possibility of X⁰ taking subjects inside its maximal projection, which is Jackendoff's principal motivation for uniting N and V, has been expressed in this chapter without positing a feature that combines N and V. Arguments that P and N form a class are in part based on the similar behavior of A and V in languages such as Japanese. But V and A can be taken as a single subcategory of L without implying that the residue N forms a class with P. Thus, throughout this work, I avoid using the categories +SUBJ of Jackendoff and -V of Chomsky.²⁶ For me, [N, V] is a noun marked "accusative" and [N, P] is a noun marked "dative", "ablative", "instrumental", etc. I leave unresolved whether these V and P are formally syntactic features on N or indices on N.

My treatment of case features as projections of the categories SP(V), V, P, and SP(N) parallels the treatment of "secondary" features on consonants (palatalization, labialization, etc.) proposed in Chomsky and Halle (1968). In that work, the central division of phonological segments into consonants and non-consonants does not involve complete cross-classification.
The sometimes realized sub-categorizations of consonants into palatalized, non-palatalized, etc. are described as the projection of vowel features onto whatever consonantal segments in languages exhibit these sub-categories. When a given sub-category of consonants is absent in a language, the rule describing the appropriate projection of a vowel feature is not part of the language's grammar. Similarly, a language such as French exhibiting no morphological case in noun phrases (Emonds, 1976, Ch. 6) has no rules which project case features onto N and A, although universal principles of abstract case assignment, to be given now, do project the case features onto the phrases NP and AP.

26. If Chomsky's +V is taken to refer to V and A together, then I define ±V only for members of L, so that −V is identical to my category N.

1.8.2. A Uniform Principle of Case Assignment

The most general device within case theory is the "case-filter" of Rouveret and Vergnaud (1980), by which all lexical NP's must have a syntactic case (nominative, accusative, etc.) in order to be well-formed. I utilize this filter here, as a principle of well-formed s-structure:

(69) Every lexical NP at s-structure (i.e., prior to semantic and phonological interpretation) must be associated with exactly one case.
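As a rough illustration of how a filter like (69) could be checked mechanically, the following Python sketch represents each NP as a small record and tests that every lexical NP is associated with exactly one case. The class `NP`, its fields, and the function `passes_case_filter` are hypothetical names introduced only for this illustration. Empty NP's are simply skipped, which is one possible reading and not a claim about their status.

```python
from dataclasses import dataclass, field

@dataclass
class NP:
    form: str                                  # surface form, for illustration only
    lexical: bool = True                       # False for an empty NP
    cases: set = field(default_factory=set)    # abstract cases associated with the NP

def passes_case_filter(nps):
    """True if every lexical NP is associated with exactly one case, as in (69)."""
    return all(len(np.cases) == 1 for np in nps if np.lexical)

# Hypothetical inputs: the case labels are assumptions, not outputs of any rule.
ok = [NP("he", cases={"NOM"}), NP("another person", cases={"ACC"})]
no_case = [NP("he", cases={"NOM"}), NP("another person")]
two_cases = [NP("he", cases={"NOM", "ACC"})]
with_empty = [NP("", lexical=False), NP("Mary", cases={"NOM"})]

for sample in (ok, no_case, two_cases, with_empty):
    print(passes_case_filter(sample))
```

The sketch only checks the output condition; it says nothing about how the cases are assigned.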

Typically, this means that every NP at s-structure must have a case feature, even though there are instances, under widely accepted analyses, of empty NP's which themselves have no case feature, but are instead coindexed with NP's which do have case. The status of these possibly caseless empty NP's is not central to the ideas I will develop here; cf. note 28 below. Stowell (1981, Ch. 3) develops the idea that bar notation features are sharply divided into case-marking and case-marked categories. I think this is correct, but that Stowell does not make clear what motivates this division. As I remarked in discussing the imaginary prepositionless language LG in section 1.4, case-marking can be thought of as a way to distinguish, by means of a grammatical feature, among various NP's in a sentence, or, more generally, in any maximal phrase. In contrast, other constituents, since they cannot be external arguments, either occur only once in a maximal phrase (i.e., S and VP) or occur introduced by a grammatical formative in a PP. In other words, case-marking is the distinguishing, by means of a grammatical formative, of various NP's in structurally different positions. This naturally leads to the question of how case-marking actually reflects the structural positions that determine it. In Chomsky's and Stowell's formulations, this is done by a set of stipulations: a V case-marks its adjacent NP sister, the inflectional element (= INFL = my SP(V)) case-marks its adjacent NP sister, etc. In my view, a general category-neutral statement suffices to predict when an element will case-mark:

(70) An X⁰ or a SP(X) which may have NP as a sister or daughter in deep structures is a potential case-marker at s-structure.

This statement reflects the idea that the multiplicity of NP's inside a given phrase requires each NP to be distinguished by a sort of "index" (the case-mark) which reflects its structural position - the index of an NP's structure is the NP's closest bar notation neighbor in that structure. Moreover, (70) takes advantage of the fact that V and P, by virtue of their ability to assign θ-roles directly, are potential case-markers, while N and A are not. Similarly, since Vᵐᵃˣ (= S) has a subject NP, by (14)-(17), SP(V) (= INFL) is also a case-marker. Finally, SP(N) can dominate NP in English, according to (12), repeated here as (71), and so is also a case-marker.27

(71) SP(N) → NP; English-specific

27. According to (71), English pre-head measure phrases are dominated by SP(X), X ≠ V. This means that these SP(X) can (and I presume do) assign case to the measure phrases. In Japanese, a measure phrase modifying a noun has the same overt case-marker as does a possessive phrase, which supports my view here (H. Hoji, personal communication). I return at the end of this section to why English measure phrases do not receive a morphological genitive case.

Statement (70), in conjunction with the theory of direct and indirect θ-role assignment, implies that V, P, and SP(V) are potential case-markers, while N and A are not. Thus, a case-marking principle which forms part of the theory of categorial asymmetries developed here does not mention particular categories. The list of case-marking contexts given in Chomsky (1981, 170) follows from the single statement (72):

(72) Case-Marking: At s-structure, a potential case-marking category C (cf. 70) may be copied onto any NP which is its sister or daughter, provided C and NP are not separated by a maximal phrase.

The proviso in (72) is a slight and I think appropriate weakening of the adjacency condition on case-marking proposed in Stowell (1981, Ch. 3). As Stowell explains, the direct object NP of a V cannot be separated from the V by phrases not case-marked by V.28 Stowell's condition makes other correct predictions about various constructions analyzed in this chapter; for example, it can be deduced that an external argument, at least prior to stylistic rules, must be adjacent to its case-marking sister:

(73) With John at the wheel, we'll be lucky to arrive.
     *With at the wheel John, we'll be lucky to arrive.
     Bill considered Mary late.
     *Bill considered late Mary.

28. A stricter adjacency requirement on case-marking would imply that the separable verb prefixes in German, argued in Koster (1975) to immediately precede the verb in deep structure, must form a larger constituent V with the lexical V; otherwise, this particle P will block abstract and morphological case-marking, which would not square with the facts. It is of interest to note that such a constituency is not excluded, as it would be in English, by the word order principle (2). Also consistent with this distinction between German and English, the German prefixes are written as a single word with the following V (aufstehen "get up", etc.), while in English, a post-verbal particle P is never joined in writing with the preceding verb. I am not concerned here with whether an empty deep structure SP(V), which is how I characterize an infinitival clause, can case-mark its sister NP. In the framework of Chomsky (1981), the subject NP of an infinitival does not receive a case-mark. This means that the "association" of every NP with a case-mark required by the case filter (69) must be interpreted as a requirement that every NP either is a lexical NP with case, or is an empty NP co-indexed with a lexical NP with case.

In most treatments of abstract case, such as those of Chomsky and Stowell, case-marking is defined in terms of government (an exception being the proposal of Kuroda, 1983). I agree with their conception, and feel that it allows (72) to be generalized in interesting ways. But these generalizations require that I define government slightly differently than has been done previously. The innovations that I feel are needed in order to properly express a more general case-marking condition are four:

(74)

(a) Since S = V³ and since sometimes case is assigned irregularly to the subject of an S from outside S, the absolute barrier to case-marking and government is V², and more generally X², rather than Xᵐᵃˣ.

(b) Several patterns of widely attested case-marking have not been assimilated into the theory of abstract case-marking proposed by Chomsky, principally because this theory lists case-marking contexts as primitives and is not forced to generalize. These include (i) the case-marking of predicate attributes, both with and without an introductory P as (cf. Ch. 6), (ii) the widespread tendency for directional PP's subcategorized by V to contain accusative NP's (German, Latin, Polish), and (iii) the fact that counterparts to direct and indirect objects in derived nominals should be differently case-marked. In order to allow for these accusative, nominative, and genitive case-markings on objects of certain P, I take the barrier to case-marking to be L² (L ≠ P, as discussed in preceding sections) rather than X².

(c) I wish to allow SP(V) to case-mark predicate attributes (when an intransitive verb which cannot assign case is present, such as be, become, etc.), and I also wish to allow SP(N) to case-mark the possessive and measure NP's they dominate (cf. note 27).

(d) In the system I am proposing, an NP in a given structural position does not have a unique potential case-marker. Rather, given a certain case-marking, only certain P can be inserted, and only certain θ-roles can be assigned. Then, if case-markers are not unique, and are defined under government, it follows that governors are not unique.

With these remarks in mind, I propose the following definition:

(75) Government: Let α be the lowest projection of Xᵢ⁰ over an Xᵢ⁰ or a SP(Xᵢ). Then Xᵢ⁰ or SP(Xᵢ) governs every phrase dominated by α, except what is contained in any Lⱼ² under α (i ≠ j).29
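The definition in (75) can be read as a tree traversal. The Python sketch below is one such reading and should not be taken as a definitive formalization. The `Node` class, the function names, and the toy clause are all hypothetical, and the numeric bar levels are a simplification (0 for heads and specifiers, 2 for NP, VP and PP, 3 for S). The tree shape assumes a clause with subject NP1, SP(V), and a VP containing V, NP2, and a PP over P and NP3. The adjacency proviso of the case-marking principles is deliberately not modelled.

```python
from dataclasses import dataclass, field

LEXICAL = {"N", "V", "A"}    # the class L of the text; P is excluded

@dataclass(eq=False)
class Node:
    label: str
    cat: str                 # category whose projection line the node belongs to
    bar: int                 # simplification: 0 head/specifier, 2 phrase, 3 S
    children: list = field(default_factory=list)

def parent_of(root, target):
    """Return the node immediately dominating target, or None."""
    for child in root.children:
        if child is target:
            return root
        found = parent_of(child, target)
        if found is not None:
            return found
    return None

def governed_phrases(root, governor):
    """Labels of the phrases governed by governor, under the reading assumed here."""
    # Assumption: in these toy trees the mother of the governor is the
    # lowest projection alpha of its category over it.
    alpha = parent_of(root, governor)
    result = []

    def walk(node):
        for child in node.children:
            if child.bar >= 2:
                result.append(child.label)
            # An L2 of a different category is governed itself,
            # but what it contains is not.
            barrier = (child.bar == 2 and child.cat in LEXICAL
                       and child.cat != governor.cat)
            if not barrier:
                walk(child)

    walk(alpha)
    return result

# Toy clause (assumed shape): [S NP1 SP(V) [VP V NP2 [PP P NP3]]]
np1, np2, np3 = Node("NP1", "N", 2), Node("NP2", "N", 2), Node("NP3", "N", 2)
sp_v, v, p = Node("SP(V)", "V", 0), Node("V", "V", 0), Node("P", "P", 0)
pp = Node("PP", "P", 2, [p, np3])
vp = Node("VP", "V", 2, [v, np2, pp])
s = Node("S", "V", 3, [np1, sp_v, vp])

print(governed_phrases(s, sp_v))   # VP is not a barrier for SP(V)
print(governed_phrases(s, v))
print(governed_phrases(s, p))
```

On this reading SP(V) reaches into VP because VP belongs to the same projection line, while PP is never a barrier because P is not a member of L.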

29. Actually, the L² of a conjoined structure does not block government, but this has generally been left out of discussions of government.

It has not been necessary to list which bar notation categories are governors (they all are), nor which ones are potential case-markers. By the theory of direct and indirect θ-role assignment of sections 1.4-1.7, N and A are not potential case-markers as defined by (70), even though they are governors. (This particular result accords exactly with the systems used in Chomsky (1981) and Stowell (1981).) It is now possible to state a very general, non-stipulative case-marking principle:

(76) Generalized Case-marking: At s-structure, a potential case-marking category C (cf. 70) may be copied onto any NP which it governs, provided C and NP are not separated by a maximal phrase.

It is a simple matter to extend (76) to encompass the case-marking of AP; in fact, it can be done by means of a category-neutral statement. It suffices to replace "NP" in (76) with "maximal projection of an X which does not assign θ-roles directly." While I think this is correct, the theory of case-marking of AP remains incomplete and awaits proposals about when such case-marking is obligatory (adjectival AP) and when it is blocked (adverbial AP).30 Assuming NP can be eliminated in (76), the syntactic case-marking theory is category-neutral, while the two principles for (semantic) θ-role assignment are partially category-specific. This accords well with Category-neutral Syntax (9).

1.8.3. Examples of Case Assignment

The following examples show the various ways in which generalized case-marking applies in a variety of sentence structures. It must be kept in mind, however, that this sketchy first attempt to extend abstract case-marking to predicate attributes and other non-sister constituents is intended to show a direction for research, and does not represent the results of detailed studies of morphologically case-marking languages.

(77) (a) He saw another person behind the curtain.
     (b) John told the story to Mary.
(78) (a) He pushed another person behind the curtain.
     (b) John made his friends ashamed drunk.
(79) (a) He became another person behind the curtain.
     (b) John seemed a tyrant as a manager.

Footnote 29, continued. It would be a trivial matter to restore the notion of "unique governor" on the basis of this definition. The "principal governor" could be the closest governor, or the closest non-empty governor, or (subsequent to case-marking) the case-marking governor. Which if any of these definitions were chosen would depend on empirical motivation.

30. This would require a theoretical development of one of two sorts. (i) One possibility is that case-marking of AP is syntactically optional; if it applies, it yields an adjectival AP and if it does not, it yields an adverbial AP. A theory of AP interpretation on a par with θ-role assignment to NP's remains to be developed; such a theory would be able to interpret case-marked AP's in certain positions and case-less AP's in others. In some instances, both syntactic options are realized and interpretable:

Bill [VP walked into the room] [AP fearful]; AP = nominative
Bill [VP walked into the room] [AP fearfully]; AP = no case

(ii) The other possibility is that the case filter should be extended to AP, and that adverbials are of a category other than AP. This would extend the proposal of Jackendoff (1977, Ch. 3).

(80) [Tree diagram; recoverable node labels: V, VP, NP2 (another person), P and NP3 (behind the curtain)]

I believe that (80) can be considered the deep structure for all three of the (a) sentences of (77)-(79). The only possible case-marker for NP1 in all three instances is SP(V), so He is uniformly nominative. NP2 can be case-marked by V or by SP(V), by (75) and (76); VP is not a barrier to government by SP(V) because of the condition "i ≠ j" in (75).31 NP3 can be case-marked by P, or, in the less typical situation which has not been noticed in studies of abstract case, by the case-marker which is on NP2.32

31. For the same reason, an NP or an AP such as drunk in (78) which is outside the main VP and modifies the subject NP can be case-marked as nominative by SP(V). That is, the gross structure for (78b) is:

[tree diagram]

The condition "i ≠ j" in (75) allows SP(V) to govern and case-mark drunk, and the only potential case-marker for drunk is SP(V). Generalized Case Marking must be defined such that drunk is not separated from its case-marker SP(V) by a maximal projection such as his friends; to ensure this, I assume that "maximal phrase" in (76) means "maximal phrase in the grossest constituent analysis containing C and NP," where "grossest constituent analysis" is as defined in Wilkins (1980).

32. This has not been noticed because before now, the case-marking category (e.g., V) and the case-mark (e.g., "accusative") have been treated as distinct. By conflating them, as I propose, it is now possible to treat case-marking by a head or specifier and case-marking of one NP by another with the same case as exactly the same phenomenon. This of course provides a further argument for conflating them.

Thus, using traditional labels for case features, the case frames that can occur in the s-structure of (80) are as follows:

(81) (i)   NP[+ACC] (P NP[+DAT])    (DAT = P)
     (ii)  NP[+ACC] (P NP[+ACC])    (ACC = V)
     (iii) NP[+NOM] (P NP[+DAT])    (NOM = SP(V))
     (iv)  NP[+NOM] (P NP[+NOM])
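The four frames in (81) can be derived by combining the options described for NP2 and NP3: NP2 takes its case from V or from SP(V), and NP3 takes its case either from P or from whatever case-marker is on NP2. The short Python sketch below enumerates these combinations. The names `CASE_OF`, `NP2_MARKERS`, `NP3_OPTIONS`, and `case_frames` are hypothetical, and the mapping from case-markers to traditional labels simply restates the equations given beside (81).

```python
from itertools import product

# Traditional labels for the case-marking categories, as given beside (81).
CASE_OF = {"V": "ACC", "SP(V)": "NOM", "P": "DAT"}

NP2_MARKERS = ["V", "SP(V)"]      # potential case-markers of NP2
NP3_OPTIONS = ["P", "copy"]       # "copy": NP3 shares the case-marker on NP2

def case_frames():
    """Enumerate (case of NP2, case of NP3) pairs for the structure in (80)."""
    frames = []
    for np2_marker, np3_option in product(NP2_MARKERS, NP3_OPTIONS):
        np2_case = CASE_OF[np2_marker]
        np3_case = CASE_OF["P"] if np3_option == "P" else np2_case
        frames.append((np2_case, np3_case))
    return frames

for number, frame in zip(["i", "ii", "iii", "iv"], case_frames()):
    print(number, frame)
```

The enumeration shows only which frames are structurally available; which verbs select which frame is a matter of subcategorization.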

I assume that an ordinary transitive verb, such as see or tell in (77) and push or make in (78), which has the features V, +___NP but no marked case frame feature, is to be interpreted as having the features V, +___[NP, V]. That is, an ordinary transitive verb takes an accusative object unless the verb takes "quirky" case; cf. section 5.7. More generally, the following must be a part of the theory of markedness for subcategorization features:

(82) A lexical item with the subcategorization feature +___C, where C is unspecified for a quirky case, must be inserted into +___[C, B] if B assigns a θ-role directly to C (i.e., whenever B and C are sisters).

Thus, unmarked transitive verbs which also take an unmarked locational or indirect object complement, like see and tell in (77), have the feature +___NP (P NP) and are inserted into frame (81i). In languages like German, Greek, and Latin, motional verbs with a locational complement such as push in (78a) take an accusative direct object and a PP whose object is also accusative. It seems plausible to assume that such verbs have the feature +___NP (P [NP, V]) (this being the general subcategorization feature that occurs with all transitive motional verbs) and are inserted into (81ii). A transitive verb whose object can be modified by a second accusative NP or AP, such as make in (78b), can be similarly subcategorized as +___NP ([NP, V]) or +___NP ([AP, V]).

The case frames (81iii) and (81iv) are utilized by linking verbs such as be, become, seem, etc. Exactly how such verbs are subcategorized to avoid taking accusative NP complements is not clear; one possibility is simply that they have features like V, +___NP, but are "invisible" for case assignment. Thus, become would have the feature +___NP (PP), and by virtue of its invisibility could appear in (81iii). The preposition as (cf. Ch. 6) would also be invisible for case assignment, which would allow an example like (79b). In such an example in German, the NP for "a manager" would follow the P als "as" and be morphologically nominative.

Another possible treatment of linking verbs is that they do not assign a θ-role to their external argument (their subject). This property is connected to the fact that their subject position is typically the landing site for NP movement. Burzio (1981) and Chomsky (1981, 113) have pointed out the following property as one which has not been completely assimilated into grammatical theory:

(83) A verbal element assigns Case to an NP that it governs if and only if it assigns a θ-role to its subject.

If we simply list linking verbs as not assigning θ-roles to their subjects, it follows from (83) that they will automatically utilize the case frames (81iii) and (81iv) rather than (81i) or (81ii), so that caseless subcategorization frames like +___NP (PP) will suffice for verbs such as be, become, and seem. Whatever the exact mechanisms for integrating the subcategorizations of linking verbs into grammatical theory, it can be seen that all the case frames of verbs provided for by Generalized Case-Marking (76) are utilized, and that (82) expresses which frames are the unmarked ones; that is, (81i-iv) correspond to a sort of increasing order of markedness of lexical subcategorization features.

1.8.4. Case-based Definitions of Grammatical Relations

The discussion of section 1.6 suggests that the simplest indirect θ-role assignment involves indirect objects. A good candidate for a universal characterization of indirect objects is thus a deep structure [PP [P ∅] NP], where P case-marks NP. (For ideas in a similar vein, see Czepluch, 1983, and Kayne, 1983.) This means that in s-structure, an indirect object has the form (84):

(84) [PP P [[NP, P] ... N ...]]

An "Invisible Category Principle," which I develop in section 5.7, allows the P node on the left in (84) to be empty if the feature P (DATIVE) is distinctively spelled out on the N or SP(N) of the NP in a productive number of NP's (e.g., as in German, Greek, Latin, Polish, etc.).33 In English, either a phonological rule spells the P as to or for, or, according to a condition to be discussed in the first appendix to Ch. 2, the P node obligatorily deletes when the NP is moved next to a V by (indirect object) movement. Similarly, in Japanese, the P is spelled out ni "to", or, again, the P must be deleted if the NP is moved to subject position in the passive. In French and Spanish, the empty P of s-structure gives rise to the quasi-case-mark prefixal a on NP; it is presently unclear to me whether the "Invisible Category Principle" is at work here, which would imply that P is empty throughout the derivation, or whether a is a preposition.34 Summarizing, the following is quite plausibly the universal definition of the indirect object grammatical relation:

(85) An indirect object is an [NP, +P] which constitutes (cf. the definition (45)) a sister to L⁰ at s-structure.

Since grammatical relations such as "indirect object" are used only for determining θ-role assignment and related aspects of semantic interpretation, and not by the transformational operations themselves, it makes sense that an NP cannot be uniquely determined as an indirect object prior to s-structure, the input to semantic interpretation in most current generative models. A final remark on indirect objects concerns the fact that the appropriate subcategorization feature for any V which assigns θ-roles to two NP complements is +___NP NP. At most one sister of V can receive a θ-role directly (23), so the second NP can only indirectly be assigned a θ-role. By indirect θ-role assignment (47), this further entails that the second NP can and must appear in a PP as in (84) if it is to receive a θ-role from V.

As an illustrative example of Generalized Case-marking (76) in non-clausal structures, I discuss an NP which exemplifies four different instances of case-marking.

(86) (a) John's introduction of Bill to Sue as Harry (was a poor joke).
     (b) [Tree diagram of (86a); recoverable labels: NP, (GEN = SP(N)), PP, introduction]

33. The spelling of DATIVE in (84) may be identical to the spelling of some other morphological case; e.g., Arabic uses the morphological accusative when P is empty (for indirect objects and embedded subjects in causatives).

34. Strozer (1976) presents several arguments that the indirect object a-phrase in Spanish is not a PP. These arguments list and discuss contexts in which PP's with a lexical preposition act differently than indirect objects. But in my view, none of these differences in behavior would resist an account in terms of the difference between (84) and a PP with a lexical head. That is, in my terms, an indirect object NP constitutes a PP and a sister to V, but this is not true of an object of a lexical preposition; moreover, an indirect object NP may c-command a clitic position (because PP doesn't branch) while an object of a lexical P does not. I feel that these structural distinctions will allow the differences Strozer points out to be adequately expressed. Jaeggli (1981, Ch. 2) argues that French a is not a preposition in certain contexts, a contention all the more plausible since the French partitive de is not a P.

Several comments are in order: (i) NP1, which is an L², is a barrier to any case-marking from outside NP1. (ii) NP2 can receive case only from SP(N); i.e., it is genitive. (iii) NP3 can receive case structurally from either P or SP(N), while NP4, because of the adjacency condition on case assignment, can receive case only from P. That is, the maximal phrase [PP ∅ + Bill] intervenes in the grossest constituent analysis (cf. note 31) containing NP4 and the case-marking SP(N).35 But a general prohibition against two (non-coordinated) NP's in the same grammatical relation with a head36 implies that NP3 should get some case other than P, namely, genitive case from SP(N). (iv) NP5 can receive case from P or from SP(N). However, as we will see in Ch. 6, the general characteristic of non-comparative as is precisely that it resembles the copula be and does not assign case to its object, so that NP5 in fact has a uniquely determined genitive case. The phonological reflexes of the case features in (86) are given by the following rules:

(87) P → to / ___ [NP, P]

(87) has been discussed above in connection with (84) and (85), and will be further discussed in section 4.8 with regard to certain irregular English datives.

35. Stowell (1981, Ch. 3) claims that NP's like John's introduction to Sue of Bill (*as Harry) are generable by virtue of all the NP's being case-marked by their P sister. I think that this type of analysis does not allow the most general characterizations of the various P-insertion rules (e.g., (87) and (88) in the text below). Nor does it correctly predict that when the of-phrase is not final in NP, it is unacceptable:

*The invasion by Hannibal of Italy with elephants was a disaster.
*A discussion with Larry of that trip at your house would have been useful.

These examples suggest that any of-phrase not adjacent to N is extraposed to the end of NP, and is not simply optionally placed anywhere to the right of N in N̄.

36. This is a consequence of the θ-criterion. That is, two NP's in the same grammatical relation with a given lexical head would receive the same θ-role, which violates (48).

(88) P → of / ___ [NP, SP(N)]

Rule (88) raises the question of whether every English genitive NP is marked with -'s. If so, one could not say that a phrase like of Bill in (86) is a genitive. But there is evidence for stipulating a language-particular structural limitation on where -'s can be added to English genitive NP's; in particular, -'s is added only to those genitives exterior to N̄. In Japanese, there is no similar restriction; cf. note 27. Supporting this restriction is the fact that neither derived nominal counterparts to direct objects inside N̄ nor pre-nominal measure phrases inside N̄ are marked with -'s. E.g., John's old six foot boat. In contrast, post-nominal genitive phrases of the form of + NP's can be argued to be outside N̄:

(89) That introduction of Bill to Sue as Harry of John's was a poor joke.

As the reader may verify, the phrase of John's is not acceptable in (89) in a position closer to the head N. Similarly, in (90), the phrases of NP's must follow any complements inside N̄.

(90) That invasion of Italy with elephants of Hannibal's was a disaster.
     *That invasion of Hannibal's of Italy with elephants was a disaster.
     I enjoyed those descriptions of Italy of Larry's.
     *I enjoyed those descriptions of Larry's of Italy.

Thus, I propose that the English rule for adding -'s is as in (91):

(91) [NP, SP(N)] ⇒ 1 + 's, where N fails to govern 1.

The grammatical relations in derived nominals can be characterized as follows: As discussed in section 1.3, a subject NP is one which is outside of X̄, so that the surface subject of a derived nominal is one marked by -'s in English. Indirect objects are defined exactly as in clauses, in terms of an abstract case-mark by an empty P, as in (85). Direct object NP's in derived nominals and clauses can be grouped together by using a characterization entirely parallel to (85) which again crucially uses the notion of "constitute" introduced with indirect θ-role assignment in section 1.6.

(92) A direct object is an [NP, −P] which constitutes a sister to L⁰ at s-structure.

Ultimately, some provision must be made so that (92) does not encompass predicate attributes to V. In English, we can say that any NP at s-structure which is co-case-marked with a preceding NP in the same clause is a predicate attribute, rather than a subject, object, or indirect object. How to state this formally is a technical problem not only here, but also in any theory of grammatical relations which defines direct object as the sister of L⁰.37

37. Alternatively, it may never be necessary in assigning θ-roles and other aspects of semantic interpretation to refer to direct objects of N and V as a natural class. In this case, how to exclude predicate attributes from this class becomes a non-problem.

This concludes the discussion of the sample derived nominal (86). The discussion has included English rules for the phonological reflexes of abstract case, and definitions for the grammatical relations of direct and indirect objects in terms of abstract case. The main purpose of this section has been to show that a category-neutral principle of case-marking (72), based on the theory of θ-role assignment of sections 1.4-1.6, can predict the list of category-specific case-marking statements of Chomsky (1981, 170), and can generalize well beyond these to a range of case-marking environments that would require simply a longer and non-explanatory list if case theory rather than θ-role theory is taken as primitive. My conception of "θ-dependent case theory" is based on the ample empirical evidence presented in section 1.7, as well as on the adequacy and explanatory breadth of Generalized Case-marking (76). Some caution is however in order. Principle (76) has been designed to encompass patterns of case assignment to predicate attributes and directional phrases which have not been previously scrutinized in a generative framework. In fact, the mere existence of certain broad classes of predicate attributes is first brought out in Ch. 6 of this study, so my attempt to generalize a principle of case-marking to cover them is probably deficient in some way, and must be regarded as tentative.

Throughout this chapter, I have tried to develop and refine the theory of universal grammar with respect to those principles that determine possible base structures, either directly (the base composition rules, the bar notation, the head placement principles) or by way of their influence on possible s-structures and logical forms (the theory of direct and indirect θ-role assignment, θ-dependent case theory). The syntactic principles of the bar notation and of case theory have been shown to be category-neutral; in contrast, specific categories appear in the semantic principles involving θ-role assignment, determination of subjects, and characterization of direct and indirect objects. These results suggest that the basic categorial distinctions of grammar are ultimately justified by their differing roles in the interpretive component. Throughout, the θ-criterion of Chomsky (1981), provisionally restated here as (48), has played a central role. However, one important aspect of the θ-criterion has not been examined in this chapter: the issue of whether each NP receives at most one θ-role. When we turn to sentences containing embedded non-finite clauses, it becomes controversial to maintain that this statement holds in an unmodified way. It is this issue that Chapter 2 is intended to clarify.

Chapter 2

The revised θ-criterion, clausal subcategorization, and control

2.1. The "Understood Subject Property" of Non-finite Verbs

In Chapter 1, I have developed a hypothesis about the fundamental distinction between the head-of-phrase categories N, A on the one hand and V, P on the other. The phrasal complements of N and A can receive a θ-role only indirectly, via an intervening P node, while phrasal sisters of V and P can receive a θ-role directly (without a P), either as an internal argument to V or P or else by virtue of being the subject of another phrasal sister. Moreover, a phrasal complement to P can receive a θ-role only directly. I have further concluded that these basic distinctions among N, A, V, and P, together with the restriction that a subject phrase must be an NP, subsume a wide range of pervasive syntactic asymmetries among complement systems of different heads of phrases. I have argued that these asymmetries are not lexical or language-particular, but follow directly from the theory of θ-role assignment in universal grammar. Any further discrepancies among complement systems I attribute to the sorts of item-particular or systematic semantic restrictions that are presently expressed by lexical subcategorization. No phrase structure rules are needed, other than the general base composition statements given in Chapter 1, supplemented by two principles of head placement in left-right sequences.

Chapter 1 has also attempted to demonstrate that case-marking of phrasal complements by heads is not the primitive asymmetry-introducing device assumed in recent work by Chomsky, Stowell and others; rather, case-marking asymmetries follow from the theory of θ-role assignment, as explained in the last section of Chapter 1. In establishing these conclusions, I have used the part of Chomsky's θ-criterion which ensures that each phrasal complement is assigned to at least one head-complement semantic relation (i.e., has at least one θ-role), and that θ-roles are necessarily expressed through phrasal arguments.
If a θ-role is syntactically rather than merely pragmatically "understood," or is expressed through a non-phrasal grammatical morpheme, an empty phrase will in fact be found to exist in the syntactic structure which bears the θ-role.

I have deliberately not yet addressed whether the θ-criterion should assert that each phrase receives at most one θ-role. This aspect of the θ-criterion is a central concern in this chapter. I will here try to resolve the issue of when, if ever, heads of different phrases H and H' can separately assign a different θ-role to the same NP. 1

In order to focus on what is at issue, let us assume that some X^max is the internal argument of two distinct lexical heads H and H'. (This will lead to a contradiction.) It must follow that one of H and H' minimally c-commands the other, since X^max must be in both H and H'.

Definition: α minimally c-commands β if and only if (i) neither dominates the other, and (ii) the lowest X' dominating α also dominates β.

Assume that H minimally c-commands H'. It now follows that H assigns a θ-role indirectly to X^max; by the properties of indirect θ-role assignment (section 1.6), we have the structure (1):

(1)

This contradicts the assumption that H' assigns a θ-role and is hence lexical. Thus, X^max is the external argument of either H or H'. So, without any stipulation, the requirement of the previous chapter that internal arguments be or constitute sisters to heads implies that a phrase which is assigned a θ-role by two distinct lexical heads must be the external argument of one of them. That is:

(2) A phrase which receives a θ-role from two heads must be the subject NP of one of them.

1. According to the preliminary version (i) of the θ-criterion in Chomsky (1981), heads of different phrases cannot assign separate θ-roles to the same NP.
(i) Each argument bears one and only one θ-role. (36)
However, in the more formalized version (ii), the NP position in a chain C from which the chain obtains its θ-role may itself receive separate θ-roles from more than one source (N. Chomsky, pers. comm.).
(ii) If α is an argument of S, . . . a θ-role is assigned to C_i by exactly one position P. (335)
Regarding (ii), I am restricting attention to chains C_i that involve no movement and have only one element α; moreover, ". . . when we restrict attention to A-(argument-) positions, . . . every NP is in one and only one chain." (333) Thus, statement (3) in the text below, the focal point of discussion in this chapter, brings out a crucial difference between (i) and (ii) since it follows from (i) but not from (ii). As will be seen, I hold that (i) is too strong but that (ii) is too weak. My proposal makes no claim about whether a single head can assign different θ-roles to the same phrase. There are proposals in the literature about multiple θ-roles being assigned to one complement by one head, as in Jackendoff's (1972) original treatment of agents and of secondary θ-roles, and in the adjunct θ-roles of Zubizarreta (1982). I am not concerned with these points here.


At this point, it can be asked whether this is too weak a restriction; is a stronger stipulation such as (3) empirically motivated?

(3) (?) No phrase can receive a θ-role from two distinct lexical heads.

For example, can there be cases of (2) where the two heads are V, where the subject NP of one V receives a second θ-role from a distinct V'? Such cases would be exemplified in structures like (4)-(5).

(4)-(5) [Tree diagrams garbled in the source. Recoverable node labels: NP, AUX, VP, V, VP', V'.]

The uncertain status of (3) is closely related to the much-debated question in recent work in formal grammar of whether there exist full verb phrases (VP) which are not immediately dominated by a sentence (S). Such "bare VP's" are not compatible with (3), and in fact inclusion of (3) in the first statement of the θ-criterion in Chomsky (1981, 36) is part of a tendency in the Chomskyan paradigm, beginning with Rosenbaum (1967), to not accommodate argumentation for the existence of such bare VP's. On the other hand, a systematic attack on the theoretical devices supporting the analysis of all VP's as daughters of S is made in Brame (1976, part II) and certain alternatives are proposed. In response, Koster & May (1982) present several arguments against analyzing infinitives as bare VP's. A third intermediate position is pursued by Culicover and Wilkins (1984).

I want to show here that the empirical properties of gerunds support the existence of bare VP's, but not in nearly the range of cases envisaged by Brame. While Brame (1976) claims that at least those infinitives which do not tolerate lexical subjects are bare VP's, I will argue that only gerunds instantiate deep structure bare VP's, and that infinitives are deep structure S, as claimed by Koster & May. As a result, I will not retain (3) as part of the θ-criterion. Instead, a central result of this chapter is a Revised θ-criterion which permits bare VP's under a quite restricted set of circumstances.

Many of the empirical arguments for my position will consist in pointing out that most of the Koster-May arguments for S fail when extended to gerunds. However, Koster & May's argumentation is embedded in a network of theoretical assumptions that render it impossible to entertain the existence of bare VP's, once a basic property of all non-finite clauses pointed out in Wasow & Roeper (1972) is acknowledged.
This property is that gerunds and infinitives, in contrast to lexical nominalizations, are typically predicates of a determinate subject NP in the same string, rather than having a syntactically unfixed and pragmatically understood subject. Thus, in (6), the understood subject is precisely the NP the man, whereas in (7) pragmatic context determines the subject of renovation.

(6) The man lived with friends (while) renovating his apartment.
(7) The man lived with friends during the renovation of his apartment.

For convenience, I will call this property of non-finite clauses the "understood subject property" ("USP"). In the Koster & May framework, the existence of the USP, in conjunction with a customary definition of subject NP, the "strong" θ-criterion (3), and seemingly uncontroversial employment of lexical subcategorization, renders bare VP a theoretical impossibility. In what follows, I will conclude, on the basis of the clear and predictable empirical differences between gerunds and infinitives, that the existence of these two fundamentally different kinds of non-finite clauses makes unavoidable certain modifications of these theoretical constructs: namely, the definition of subject, the θ-criterion, and lexical subcategorization. As a result of these modifications, I will show how "bare VP complements" are compatible with USP and many related phenomena that Koster and May claim indicate the presence of an S.

The issue of whether various types of non-finite clauses are S's will consequently no longer depend on USP-related arguments, such as subject-bound anaphors, predicate attributes, floating quantifiers, etc.; arguments of this type are neutralized. There remain, however, many arguments in favor of an S status for infinitival clauses which are independent of the USP. These arguments will be discussed in turn, and it will be seen how each of them suggests that some gerunds are not S's. Moreover, I will show how some previously unsolved problems in syntax have essentially simple solutions. The distribution of gerunds vs. infinitives will be shown to follow from the revised θ-criterion, and the distribution of bare infinitives (i.e., of obligatorily controlled VP's) will reduce to a trivial subcase of subcategorization.

2.2. The Distribution of Gerunds

Throughout this chapter, I will be focusing on those VP's which are called present participles or gerunds in traditional grammatical descriptions. In Spanish, the head of such VP's is V-ndo and in English, the head is V-ing. The examination of Spanish alongside of English is methodologically central here because it suggests the correctness of abstracting away from those English "NP gerunds" of the form (NP's)-VP, which do not have an exact Spanish counterpart. Such gerunds, the subject of much study in transformational grammar, are characterized by their ability to occupy all NP positions freely. 2 In contrast to English gerunds which are not NP's (and which will be studied in detail below), the NP gerunds are translated in Spanish by the infinitive construction introduced by el "the". The NP distribution of the Spanish "el + infinitive" construction has been established in Plann (1981).

I take it as significant that English and Spanish non-NP gerunds correspond in their distribution (as will be shown below), whereas English NP gerunds correspond rather to Spanish infinitives of a particular subtype. This, along with the fact that English NP gerunds are an innovation of Modern English (Emonds, 1971), while English and Spanish non-NP gerunds are Indo-European cognates, indicates the appropriateness of an English-particular device (integrated with an otherwise universal theory of gerunds) for generating NP gerunds. One such device is discussed in note 11, although other alternatives are compatible with the general theory of gerunds to be developed. Eliminating English NP gerunds from initial consideration in this way, as will be seen, allows for some generalizations that could otherwise escape our notice. I also abstract away from derived nominals, which are not VP's at all (Chomsky, 1970).

The gerunds that remain fall into very similar, but not identical, distributional patterns in English and Spanish. These patterns will be treated under five headings, which I believe to be an exhaustive list, idioms aside.

Aspectual Gerunds (i). In both Spanish and English, certain intransitive verbs of temporal aspect are followed by non-NP gerunds. These include Spanish ir, venir, llegar, andar, seguir, salir, continuar, estar, and English begin, start, go on, continue, keep, resume, finish, stop, be. The gerund after estar/be is called the "progressive" form.

(8) He {went on, ended up} complaining.
El {siguió, continuó} quejándose.

Reduced Relatives and "Perception Gerunds" (i). English allows a non-NP gerund traditionally termed a participle to modify an N̄ or an NP. These participles are interpreted essentially like restrictive relative clauses, and hence are also commonly referred to as "reduced relatives."

(9) The boy crying in the kitchen is my brother.
They burned a box containing books.

2. English NP gerunds include those in subject position, objects of prepositions such as to, for, in, from, about, and objects of verbs like try, avoid, explain, describe, prefer, forbid, suggest, remember, etc. NP gerunds sharply contrast with English finite and infinitival complements which are limited to topicalized NP and phrase-final positions. This is the main point made in Emonds (1976, Ch. 4). Throughout this chapter, when I refer to that study, I mean to direct the reader to chapter four, which deals with clausal properties. Turkish gerunds and infinitives contrast distributionally in entirely analogous ways (George and Kornfilt, 1981).


Most Spanish speakers I have consulted consider this construction marginal or derivative from English. Another type of gerund which is a sister to an N̄ or an NP may be called a "perception gerund." This construction, the object of an interesting study by Akmajian (1977), can occur in Spanish and English inside an NP complement governed by an N or V of perception, or by the P con/with.

(10) Un marido lavando a sus hijos representa una escena poco común.
A husband washing his sons is an uncommon scene.
(11) El bote está hundido en la arena con la proa mirando a la costa. (Keniston, 1937, 240)
The boat is sunk in the sand with the prow aiming at the coast.

I have found no difference that pertains to the subject matter of this chapter between these two NP-modifying types of gerunds, so I discuss them throughout under the rubric of "reduced relatives." Adverbial Gerunds (i). Both Spanish and English allow gerunds to appear in subject-controlled adverbial clauses, which are sisters to V, VP, or S. There is no limitation on which main verbs can appear with these clauses in either language. Adverbial gerunds are preposable with comma intonation. (12)

Juan siempre estudia química usando mis apuntes.
John always studies chemistry using my notebooks.

(13) Jugando en este parque, el niño podría tener mejor salud.
Playing in this park, the boy could get healthier.

Subject-controlled adverbial gerunds can also be introduced by a subordinating conjunction P; in English, temporal conjunctions are allowed (before, while, when, after), and in both Spanish and English the attribute-introducing P como "as", as if is allowed. The presence or absence of a P in this construction will have no effect on most of the arguments to be presented. Examples: (14)

(15)

My friend worked on her paper while listening to music. El hablaba como cantando. He spoke as if singing. While listening to music, my friend worked on her paper.

It might be thought that reduced relative participles can be nonrestrictive as well as restrictive.

(16) Eva, pensando en sí misma, votó por su marido para la presidencia.
Eva, thinking of herself, voted for her husband for president.

However, this is easily seen not to be the case; such gerunds are derived from subject-controlled adverbial gerunds generated in deep structure as right sisters to VP. Parenthetical formation, along the lines of Emonds (1976, 1979), then moves a "two-bar" projection X around the gerund to the right, accidentally leaving the gerund adjacent to the NP which is its controller. When we construct an example where a non-subject is adjacent to the gerund, it becomes clear that a relative clause source is not possible:

(17) Eva votó por su marido Juan, pensando en sí {misma/*mismo}, para la presidencia de la república.
(18) Eva voted for her husband Juan, thinking of {herself/*himself}, for the presidency of the republic.

Thus, "non-restrictive" participles reduce to adverbial gerunds.

Object-controlled Gerunds (i). In both Spanish and English, the NP object of certain classes of transitive verbs can be followed by a gerund, of which it is the understood subject. Most of these transitive verbs can be termed "perception" or "judgment" verbs; they include ver, encontrar, dejar, tener in Spanish and see, hear, observe, watch, smell, find, have, catch, notice in English.

(19) Juan encontró a la niña corriendo en el parque.
John found the girl running in the park.

Sometimes the object-controlled gerund is introduced by a P such as como/as; again, this P will not affect the argumentation here. (20)

I regard John as having too much property.

Absolutive Gerunds (i). What traditional grammar calls an "absolute construction" (because of its lack of close semantic affiliation with the main clause) is introduced in Spanish by ∅, en "in", or aunque "although", and in English by with or ∅.

(21) (With) weekly visits being limited to only five minutes, the prisoners mounted a protest.
(22) {Habiendo visto / En viendo} las mujeres a sus maridos por cinco minutos, la policía las obligó a volver a sus celdas.

A true absolute construction usually modifies only a main clause. However, constructions similar in form to absolutes introduced by con/with, when they seem to "set a scene," can be embedded. In this use (e.g., the examples of (11)), I have classed them with perception gerunds.

This completes the initial survey of five basic types of non-NP gerunds in Spanish and English. While there are some differences between the two languages concerning which lexical items enter into each of the five constructions, they do not impinge centrally on the syntactic arguments I want to pursue. That is, I claim that Spanish gerundives and English non-NP gerunds are a natural syntactic class. From here on, I will use the term gerund, for both Spanish and English, to refer to just this class, i.e., the five constructions just discussed.

Before discussing gerunds further, I want to make clear that I am assuming that Spanish has a VP constituent which does not include the subject NP. That is, a Spanish clause has the form [NP-VP] or [VP-NP], and is not a "flat" structure in which the subject and the object NP's are distinguished only by linear order. In making such an assumption, I imply that the VP is parallel to other seemingly subjectless constituents such as PP and AP.

In fact, a weaker position would suffice for this analysis. In a recent paper, McCloskey (1983) argues that Irish, a verb-initial language, has a VP that is clearly distinct from S, and that this VP dominates a range of participial constructions, for example, one that translates as an English progressive. McCloskey seems to indicate that his main conclusion, that participles are VP (not S, NP, PP, or AP), is compatible with the notion that Irish finite clauses in deep structure are verb-initial, "flat," and without an internal VP. That is, he does not accept, or at least remains skeptical of, the arguments of Emonds (1980b) and Harlow (1981) to the effect that verb-initial languages, like verb-second languages, can have a base VP.
Since it is not completely implausible that Spanish is verb-initial (such has been argued by Bordelois, 1974), one might entertain McCloskey's view for Spanish. That is, in Spanish there could be a VP for gerunds and participles, but finite clauses and infinitives would not have an internal VP. This would be compatible with the arguments in this chapter, since I say nothing about the internal structure of a finite clause in Spanish. However, in Chapter 3, I return to the issue of whether "VSO" languages can have "flat" deep structures. The fundamental question in this chapter can now be posed as follows. The Spanish and English gerunds, exemplified in (8)-(22) above, are VP's. This is presumably generally agreed upon (Brame, 1976; Koster & May, 1982; McCloskey, 1983, etc.) The question is whether these gerunds are also S's, with understood (lexically empty) NP subjects, parallel to English "bare infinitives" as analyzed in Koster & May (1982), or whether they are to be analyzed in some other way. 3 3.

One might at first consider adopting the Koster and May (1982) analysis of infinitives for gerunds as well. That is, one might say that gerunds are like bare infinitives, except that they are "verbal nouns" or "verbal adjectives." The Koster and May analysis

would be modified by saying that gerunds but not infinitives share some feature with nouns. Besides the general ad hoc and unconvincing nature of previous attempts to cross-classify the lexical categories beyond the well-justified N, A vs. V, P distinction, Spanish gerunds are a particularly unappealing domain for a further step in this direction. Unlike English, Spanish is lacking NP gerunds; the gerunds in Spanish appear only in non-NP positions, and so are quite clearly not "+N". Moreover, Spanish has an even clearer case than English of "verbal adjectives," namely, the passive participle, which is inflected for gender and number. However, gerunds are never so inflected (*-nda(s)), and thus seem to be "pure" V's, in the same sense as finite verbs.

2.3. Subjects of Gerunds

Koster & May present two types of argument in favor of bare infinitives as S's with an obligatorily empty subject. One type is based on the advantages of being able to identify a specific NP outside the VP in question as its subject NP; let us call these "USP-based arguments" (referring to the Understood Subject Property of Wasow & Roeper). A second type of argument is based on other S or S̄ properties of bare VP's, such as properties associated with COMP, AUX, S, or S̄. Let us refer to these as "structure-based arguments." These terms are purely expository.

Once this distinction is made, an intriguing division emerges. All of Koster & May's USP-based arguments carry over to gerunds, but only one or two of their structure-based arguments have even prima facie applicability to gerunds. The majority of their structure-based arguments suggest on the contrary that gerunds are not S and will be discussed in Section 2.5. Their USP-based arguments are neutralized by the formal design of my analysis, which provides a specific NP outside the gerund that satisfies the definition of a subject, without requiring gross modification in the rest of grammatical theory. This then allows me to use Koster and May's structure-based arguments to establish that gerunds are bare VP's.

Koster & May's USP-based arguments depend on the bare VP and its constituents having a specific NP subject in the lowest S or NP containing them (their "governing category" as defined in Chomsky, 1981, 211). Both bare infinitives, as shown by Koster & May, and gerunds, as the reader may verify, regularly exhibit five features: (a) "subject-oriented" adverbs; (b) predicate attributes which agree in morphological case, when the language in question has such case, with the NP subject of the bare VP; (c) certain subject-modifying predicates which are in general the type of predicates that are c-commanded by the NP they modify; (d) reflexives and reciprocals which must be c-commanded by an NP in their governing category; and (e) NP's which are necessarily disjoint in reference with the understood subject NP. And, beyond this, the USP property itself (Wasow & Roeper, 1972) holds of gerunds as well as infinitives; that is, the proper interpretation of non-finite VP's requires that they be related as predicates to a determinate NP in the string, as pointed out in the first section of this chapter. From these considerations, I conclude that a gerund VP must have a determinate NP subject inside the string. However, I do not agree that this NP must be in the same clause as the VP.

Is it possible to hold simultaneously that gerunds have determinate NP subjects in the string, and yet are not S's? It is not, if "subject NP of a VP" is defined solely by the configuration [S NP-VP]. However, the range of nominal elements that must be considered "SUBJECTs", in the sense of Chomsky (1981, Ch. 3), includes the feature N (noun) of number agreement on inflection and the possessive NP inside an NP. A definition of "subject NP" that includes these elements may be given as (23); a similar definition in a different framework is developed in Hasegawa (1981). (23)

Definition: the subject of an X is the closest maximal N^j which minimally c-commands X and is in all the same NP and S as X. 4

This definition of "subject NP" is independently motivated for X = A by the base structures (24a), discussed in Chomsky (1970), and (24b-c). (24)

[Tree diagrams (24a-c) garbled in the source. Recoverable terminal strings: (a) John felt guilty; (b) John ate meat raw; (c) John may visit us sober. Recoverable node labels: S, NP, AUX, VP, V, AP.]

The definition (23) eliminates the need for a "small clause analysis" for the AP's in (24), in which their subjects are empty NP's which form a constituent with the AP's and which are co-indexed with the lexical NP's understood as subjects (John, meat, John). A range of problems internal to such an analysis has been examined in Williams (1983). The theory of direct θ-role assignment in Ch. 1 and further refinements below render an empty subject "inside AP" unnecessary; in many respects, my proposals are very similar to Williams'. It should be observed that the notion "minimal c-command," which I am utilizing throughout this chapter, is not exactly the c-command that plays an important role in the theory of Chomsky (1981, Ch. 3). It is also closely related to the definition of c-command in Reinhart (1981).

4. A maximal X^j is one not immediately dominated by X^(j+1). Thus, in a finite clause, the Specifier of the Verb ("inflection") has the feature N from agreement with the subject; this N is maximal, and counts as the subject of VP in accord with Chomsky (1981, Ch. 3). In the case of a reduced relative structure [NP NP-VP], discussed more fully below, the inner and the outer NP are both maximal, by the present definition, but only the inner NP is the subject of the reduced relative, by (23).

The generalized and independently motivated definition of subject NP (23), together with my proposal that non-NP gerunds are bare VP's, now explains a previously unnoted distributional difference between infinitives and gerunds. The specific lexical or trace NP which both minimally c-commands a non-NP gerund (i.e., V) and is the closest such NP to the gerund at s-structure is in fact the NP which is interpreted as its subject. If non-NP gerunds are bare VP's, (23) always correctly picks out the NP that is its grammatical subject. For example, in (25a), John minimally c-commands visiting England, and is in each case its subject. (25)

(a) John began visiting England. John got sick visiting England. With John visiting England, I am lonely. We found John visiting England.
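As an expository aside, the way (23) selects a subject in (25a) can be sketched procedurally. The following Python fragment is only a minimal sketch under assumptions that are not part of the text: the class `Node` and the functions `minimally_c_commands` and `subject_of` are hypothetical names; the "lowest X' dominating α" is approximated by α's mother node; every node labelled NP is treated as maximal; "closest" is approximated by the candidate whose mother is lowest in the tree; and the two toy trees are simplified bracketings invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    label: str                                   # category label: "S", "NP", "VP", "V"
    children: List["Node"] = field(default_factory=list)
    word: str = ""                               # terminal string, if any
    parent: Optional["Node"] = field(default=None, repr=False)

    def __post_init__(self):
        for child in self.children:
            child.parent = self


def ancestors(n):
    """Nodes dominating n, nearest first."""
    out = []
    while n.parent is not None:
        n = n.parent
        out.append(n)
    return out


def dominates(a, b):
    return any(x is a for x in ancestors(b))


def minimally_c_commands(a, b):
    # (i) neither node dominates the other
    if a is b or dominates(a, b) or dominates(b, a):
        return False
    # (ii) ASSUMPTION: the "lowest X' dominating a" is taken to be a's mother
    return a.parent is not None and dominates(a.parent, b)


def same_np_and_s(a, b):
    """True if a and b are contained in exactly the same NP and S nodes."""
    bounding = lambda n: [id(x) for x in ancestors(n) if x.label in ("NP", "S")]
    return bounding(a) == bounding(b)


def all_nodes(root):
    yield root
    for child in root.children:
        yield from all_nodes(child)


def subject_of(x, root):
    """Sketch of (23): closest NP minimally c-commanding x within the same NP and S."""
    candidates = [n for n in all_nodes(root)
                  if n.label == "NP"
                  and minimally_c_commands(n, x)
                  and same_np_and_s(n, x)]
    # ASSUMPTION: "closest" = the candidate whose mother is lowest in the tree
    return max(candidates, key=lambda n: len(ancestors(n)), default=None)


def leaf(label, word):
    return Node(label, word=word)


def gerund():
    # hypothetical bare-VP bracketing of "visiting England"
    return Node("VP", [leaf("V", "visiting"), leaf("NP", "England")])


g1 = gerund()
began = Node("S", [leaf("NP", "John"),
                   Node("VP", [leaf("V", "began"), g1])])

g2 = gerund()
found = Node("S", [leaf("NP", "We"),
                   Node("VP", [leaf("V", "found"), leaf("NP", "John"), g2])])

print(subject_of(g1, began).word)   # John
print(subject_of(g2, found).word)   # John
```

In the second toy tree both We and John minimally c-command the gerund under these assumptions, and the closeness condition is what selects John; a different formalization of "closest" or of the bar levels could of course be substituted.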

With NP gerunds and infinitives, however, examples can easily be constructed in which the lexical NP which is interpreted as their subject, italicized in (25b), is either not a minimally c-commanding NP or else not the closest such NP at s-structure.

(25) (b) Partir de Mexico le desagradaría a Juan.
{To leave Mexico/Leaving Mexico} would bother John.
Juan le sugirió a su propio hijo ir a ver al doctor.
John suggested seeing a doctor to his own child.
John promised Bill to leave town.

The examples in (25b) would be counterexamples to any attempt to extend (23) to cover NP gerunds and infinitives; such counterexamples for non-NP gerunds cannot be constructed. Some transformational operations that induce comma-intonation produce non-NP gerunds that are not minimally c-commanded by their subjects:

(25) (c) While visiting England, it seems that John got sick. (example due to N. Chomsky)
John got sick, I suppose, while visiting England.

I assume that such operations apply after the level at which the determination of subjects for interpretation is fixed, e.g., as stylistic transformations in the model of Chomsky and Lasnik (1977). As such, they do not affect my claim that the subjects of non-NP gerunds are uniformly determined by (23).


In order to see how the definition of subject (23) renders the USP-based arguments of Koster and May irrelevant to the structure of gerunds, we must consider the interaction of (23) with the θ-criterion. This is taken up next. 5

2.4. The Revised θ-criterion

If no phrase can receive a θ-role from more than one head, as stated above in (3), the USP-based arguments of Koster and May, and in fact the USP property itself, exclude the possibility of subordinate bare VP's which are predicated of some NP which also has a θ-role with respect to a main verb. Thus, (3) excludes the possibility that reduced relatives or adverbial subordinate clauses are generated as VP's that are not S's.

In order to entertain the hypothesis that gerunds are bare VP's, the θ-criterion cannot include (3). Definitionally, let us say that any two heads are θ-related if and only if the maximal projection of one bears a θ-role with respect to the other. In the language of algebra, "θ-relatedness" is now a symmetric relation. My relaxation of the θ-criterion is that if two heads B, C are θ-related, and if two heads C, D are θ-related, then B, D may not be θ-related. Algebraically, this is equivalent to (26):

(26) The Revised θ-criterion (supplements (48), Ch. 1): θ-relatedness is an anti-transitive relation.

In general, Chomsky (1981) requires that a subcategorized complement C m a x to a head B° bear a 0-role with respect to B. Suppose C m a x is a bare VP complement to a V ( = B) and further than this V assigns a 0-roIe to its subject (e.g., try) or to an object (e.g., persuade). This situation is excluded by (3) but is excluded as well by (26), since the governing V, the subcategorized complement V, and the understood subject N of that complement (the subject of try or the object of persuade) are all pairwise 0related, violating anti-transitivity. However, unlike (3), (26) allows an N P to be related to two different V under quite specific lexical/structural conditions. It turns out that when these conditions are satisfied, we obtain gerunds rather than infinitives. In fact, the close examination of all five gerundive constructions introduced in section 2.2 which follows reveals that analyzing them as bare VP (nonS) structures always satisfies the Revised 0-criterion. Moreover, in all five constructions, the revised definition of subject (23) correctly identifies the subject of these bare VP's, so that the USP-based arguments in Koster and May for the presence of an S are irrelevant. 5. I will discuss at a later point in this chapter how transformational movements and subcategorizational requirements can be ordered so that no problems arise from my definition of subject N P . Definition (23) subsumes the Subject Principle (15) of Ch. 1, but I will continue to refer to the latter when focusing on this consequence of (23).


In contrast, whenever a bare VP would violate the Revised θ-criterion, non-finite clauses take infinitival rather than gerundive form. This correlation between the requirements imposed by (26) and the morphological form of non-finite clauses strongly suggests that (26) is responsible for their distribution.

Aspectual Gerunds (ii). There are numerous discussions of the fact that an aspectual verb imposes no selection restriction on its subject NP (e.g., Garcia, 1967). In terms of θ-roles, a subject of an aspectual receives no θ-role from that V. If this V is subcategorized for a bare VP complement, the subject NP satisfies the definition of subject (23) for the lower verb as well; that is, it is the closest maximal N minimally c-commanding the subcategorized VP. Moreover, NP may receive a θ-role from the lower VP without (26) being violated, precisely because it is not θ-related to the main (aspectual) verb.

(a) [Tree diagram not recovered; its legend distinguishes "θ-related" from "not θ-related" links, and the surviving node labels are VP and V.]

(b) Aspectuals italicized:
Nos vamos haciendo viejos sin darnos cuenta. (Fente, Fernández, Feijóo, 1972, 31)
We keep getting old without noticing it.
Esto lo venía vaticinando yo desde hace ya muchos años. (Fente, Fernández, Feijóo, 1972, 33)
This I had been predicting for several years already.

Adverbial Gerunds (ii). A subject NP can be the subject of a main verb and also of an appropriately c-commanded bare VP which has no θ-role with respect to the main verb; such a bare VP is outside the internal V or VP, and thus in "adverbial" (preposable) position. The subject NP can receive a θ-role from both V's in accord with (26), because there is no selection restriction or θ-relatedness between the main verb and the adverbial participle.⁶ The situation remains unchanged if the participle is introduced by an adverbial P; cf. (14)–(15).

6. We might want to say that a grammatical relation exists between the main VP taken as a unit and the adverbial constituent (whether the latter is a PP or a VP). This question is independent of our concerns here.

(28)

(a) [Tree diagram not recovered.]

(b) El general trató de retroceder, siendo insultado por la gente del barrio.
The general tried to back up, being insulted by the neighborhood people.
El libro fue alabado como siendo muy interesante.
The book was praised as being very interesting.

As shown in (29), subject control is required in adverbial gerunds:

(29)

*La policía atacó a los motociclistas disparando durante el crimen. (with object control)
The police stopped the cyclists while committing a crime. (with object control)

Reduced Relatives (ii). In a configuration [NP NP–VP], where the internal NP is the head of the external one, there is no grammatical relation, selection restriction, or θ-relation between the VP and the heads of larger constructions in which the exterior NP is embedded. Hence, the internal NP can be, according to anti-transitive θ-relatedness, the subject of the internal VP and also in a θ-relation with a head outside the inclusive NP. This structural situation corresponds to reduced relative clauses and perception gerunds. As far as I can see at this point, the only difference between these two constructions is that a different rule of semantic interpretation is operating on the same structure.

(30)

(a) Reduced relatives:
The boy crying in the kitchen is my brother.
The man sleeping doesn't want to talk to you.

(b) Perception gerunds (cf. the study of Akmajian, 1977):
La voz de Roberto maldiciendo a los trabajadores perforó mis oídos.
Robert's voice cursing the workmen grated on my ears.
Un marido lavando a sus hijos representa una escena poco común.
The husband washing his sons is an uncommon scene.

(31) [Tree diagram not recovered; surviving node labels: S, VP, NP, V.]

To the perception gerunds I assimilate the same grammatical construction introduced by the preposition con 'with'; in contrast to absolute constructions introduced by en, aunque, or ∅ (discussed below), this construction can sometimes be embedded (cf. also the discussion of this construction in Section 1.5).

(32)

El bote está hundido en la arena con la proa mirando a la costa. (Keniston, 1937, 240)
The boat is sunk in the sand with the prow aiming at the coast.
Los alumnos que puedan estudiar {con/*aunque} el profesor leyendo en la clase son muy buenos estudiantes.
Pupils who can study (even) with the professor reading aloud in class are good students.
El acusado que parece muy tranquilo {con/*aunque/*en} el abogado aduciendo un motivo justificante también será condenado por el juez.
The defendant, who seems very serene with his lawyer bringing out a mitigating circumstance, will nonetheless be condemned.

Absolutive Gerunds (ii). An absolute construction necessarily contains a non-embedded "bare VP" in no grammatical relation to the main verb. Since, under my present conception, there are VP's generated elsewhere than as daughters of S, it follows that we can expect them to be generated attached to roots, since other phrases can be generated there too (vocative NP's, "speaker-oriented" parenthetical PP's, etc.). The subject of such an absolutive VP is either the main clause subject, the speaker (stipulated as such in certain marked idioms, such as Seeing as how it's raining, the lawn can't be mowed; Knowing you, the secretary better make these copies), or an NP also generated directly under the root which has the (sole) role of being the subject of the absolutive participle. Intransitive θ-relatedness is not violated in any of these cases, so in the present framework, the existence of absolutive gerunds is also predicted.⁷

7. Structures that are immediately dominated by a root node do not have to conform to the bar notation. This is best expressed, I believe, by using the non-subordinate-able initial symbol E, and saying that E falls outside many of the restrictions of the bar notation. Thus, there are structures like the following:

[E [PP [P Into] [NP the cellar]] [PP [P with] [NP them]]]

and

[E [INTERJ damn] [NP the consequences]]

(33)

(With) Weekly visits being limited to only five minutes, the prisoners mounted a protest.
{Habiendo visto / En viendo} las mujeres a sus maridos por cinco minutos, la policía las obligó a volver a sus celdas.
Aunque aduciendo el abogado un motivo justificante, el juez condenó al acusado.

The requirement that absolutives not be embedded is exemplified in (32) and (34):

(34)

*The protest that the prisoners mounted (with) weekly visits being limited to only five minutes was severely repressed.
*El hecho de que el guardia empezó a gritar en (aunque) viendo la mujer a su marido por cinco minutos la motivó a escribir una carta de protesta.

Put in another way, an absolutive construction is one in which no item is θ-related to any item in the main clause, or to the main clause itself.

Object-Controlled Gerunds (ii). The Revised θ-criterion (26) would prohibit a "bare VP" from being a sister to a transitive verb, if both the verbs and the intervening direct object N are all pairwise θ-related. Indeed, as will be discussed in a subsequent section, many verbs can assign θ-roles both to an NP and a non-finite complement (e.g., persuade, urge, tell, force, help, etc.), and the non-finite complement is realized as an infinitive, in accord with the analysis presented here. However, as indicated in section 1.5, I do not impose the requirement that every subcategorized phrasal sister of X⁰ receive a θ-role from X⁰. Rather, influenced by the theory of predication in Williams (1980), I claim that a verb with a deep structure subcategorization frame such as + NP AP can assign a θ-role (directly) to only one complement, and that the second complement is interpreted by virtue of the NP being assigned a θ-role as the external argument (subject) of the AP.

Footnote 7, continued: [Tree diagram not recovered; surviving terminal strings: "the prosecution rested its case" and "not having come to light".]

A puzzling fact about Spanish is that the best position for lexical subjects in absolute gerunds is between the verb and the verb's other complements, as in the examples of (33). As will be discussed in Ch. 3, I take such word order to indicate a verb inversion. In the generative literature, verb-initial orders in Spanish are taken as evidence for a verb-initial base order by some authors (e.g., Bordelois, 1974), and as requiring an inversion analysis by others (Torrego, 1984).


This encompasses two possibilities. Williams (1980) gives examples where the θ-relation is between the V and the subject of the predicate attribute, with the AP being interpreted by means of predication.

(35) John ate the meat raw. [Diagram of θ-role assignments not recovered.]

The V also can assign a θ-role only to the predicate attribute, with the object NP getting a θ-role from this attribute.

(36) The medication rendered John helpless. [Diagram of θ-role assignments not recovered.]

It can be seen that the Revised θ-criterion (26), which prevents phrases from having "too many" θ-roles, is satisfied in both (35) and (36), while (3), rejected here, is violated in (35).

The two different analyses of verbs subcategorized as + NP AP which are available in the present framework can just as easily be applied to verbs which take object-controlled gerunds. They can be assigned the frame + NP VP. In (37), the main V is θ-related to either the object NP or the complement VP (but not to both, or else (26) is violated), and the NP object is the syntactic subject of the VP complement, by (23).

(37)

Juan encontró a la niña corriendo en el parque.
John found the girl running in the park.
Juan vió a la mujer siendo insultada en la calle.
John saw the woman being insulted in the street.
La mujer había dejado al niño durmiendo en la cama.
The woman had left the child sleeping in the bed.
Juan tuvo al niño llorando por largas horas.
John had the child throwing a tantrum for hours.

As might be expected, a good many of the verbs which take object-controlled gerunds are subcategorized for both + NP AP and + NP VP.

(38)

John found the movie dull.
John found Mary studying in the library.
John has never seen me sick.
John has never seen me complaining.


The general characterization of object-controlled gerunds is thus that they appear with verbs whose first complement is the external argument to its second, a VP. Their minimal lexical specification consists in their subcategorization frame with an indication as to which complement, the NP or the VP, receives a θ-role. These verbs are not irregular, since they fill a gap in possible lexical subcategorization, and their complements obey the principles of θ-assignment in every way. Object-controlled gerunds, like other non-NP gerunds in English and Spanish, are predicted to exist from general principles.⁸

8. I direct the reader to B. Schein's abstract in the GLOW Newsletter, 6 (1981) for discussion of some problems involving θ-role assignment in small clauses in Russian and English. His solution is similar in some respects to mine, but there are also differences.

On grounds independent of our concerns here, I argued in Emonds (1976) that the VP's introduced by V-ing after perception verbs are "telescoped progressives" with a deleted be, following a suggestion of Fillmore's. Thus, in English, we find that these gerunds are progressive in meaning and distribution, while NP gerunds, in contrast, can appear in nonprogressive contexts:

John saw the prisoners (*be) dying. (death not implied)
*John heard Bill (be) owning a Honda.
Cf. The outcry over the prisoners' dying should have toppled Thatcher.
Owning a Honda can save you money.

This then suggests that some object-controlled gerunds are reduced infinitives both in Spanish and in English, produced by a (usually obligatory) deletion of estar/be. If so, then these gerunds are really instances of infinitives, and hence of S (by my acceptance of Koster and May's position on infinitives), and no violation of the Revised θ-criterion is involved. This proposal is confirmed by the fact that when we postulate a deleted estar in Spanish, other verbs which take gerunds are also acceptable as infinitives. There is thus a "gap in the paradigm," which the deleted estar fills. (Of course, if estar is acceptable in some idiolect or context, the deletion is optional there.) The significance of this paradigm was pointed out to me by E. Chuaqui-Numan.

{Juan vió a la niña / Juan vió a la mujer} {(?estar) corriendo / salir corriendo / ir corriendo / andar gritando / comer quesadillas} {en el parque / en la calle}.

The same gap exists in English:

I heard the children {(*be) singing / stop singing / resume singing / go on singing / begin singing} that song.

While this deletion may be appropriate for some object-controlled gerunds, it is less justified for those which do not alternate with infinitives (e.g., those after find, catch, etc.). At least for these, the discussion of the θ-criterion in the text is crucial.


Some reflection on the range of possible phrase structure positions for clauses in both English and Spanish can demonstrate that the definition of subject (23) and the Revised θ-criterion (26) permit embedded "bare VP" in precisely the five positions that have been reviewed here: as complements of transitives and of intransitives, in adverbial position, in "adnominal" position (reduced relatives), and in an absolute construction. Thus, the revisions of the definition of subject and of the θ-criterion proposed here do not permit us merely to entertain the possibility that there exist bare VP's which are not S's (but which have determinate NP subjects elsewhere in the same string); they predict a certain limited number of syntactic positions for such bare VP's. And empirically we find that Spanish (and English, when we leave aside NP gerunds) has gerunds in precisely those positions.

I have shown that (non-NP) gerunds exist in both Spanish and English in those positions where my revised definition of subject and my revised θ-criterion predict that bare VP's may exist, and in no other positions. If we maintained the classical definition of subject (essentially, mutual minimal c-command of subject and predicate) and retained (3) as part of the θ-criterion, the question of why English and Spanish have non-finite verb forms would remain unanswered.

In this discussion, the revised definition of subject (23) has rendered the USP-based arguments of Koster and May (1982) in favor of clausal status neutral, as far as the structure of gerunds is concerned. I now turn to their structure-based arguments, to see whether they confirm or conflict with the predictions of the Revised θ-criterion. It turns out that these arguments differentiate gerunds and infinitives very clearly, confirming the distinction I draw here that the former are bare VP's and that the latter are reduced S's.⁹

9. According to the analysis of this chapter, the underlined adverbial gerund in the s-structure of (i) is not an S.

(i) We interviewed the students before admitting them.

Principle B of Chomsky's binding theory (Chomsky, 1981, 188), which utilizes c-command and not minimal c-command, would then wrongly predict that them must be disjoint in reference from students. My analysis is therefore in conflict with Principle B (N. Chomsky, pers. comm.). In my view, the correct account of disjoint reference is that a pronominal argument which receives its θ-role or case-marking from a lexical head cannot refer back to an argument of the same head. Thus, disjoint reference is not induced in any of (ii), because the pronoun is an s-structure argument of the italicized lexical head, while its antecedent is not.

(ii) John_i met his_i father.
John_i found oil near him_i.
An important lecture_i often has a cocktail party before it_i.

In all these examples, the governing category of the pronoun, as defined in Chomsky (1981, Ch. 3), dominates the antecedent, and so these pronouns violate Principle B. Examples like


2.5. Structure-Based Arguments for Bare VP's

Koster and May have found a variety of paradigms in which some infinitives have properties of COMP, AUX, a lexical subject NP, S or S̄. Eight such constructions will now be discussed. When we look at non-NP gerunds, it is striking that these properties are simply absent.

(i) Infinitives have (embedded) lexical subjects in some contexts, as do finite S. Koster and May are of course referring to English, but it is well-known that Spanish infinitives also may, in restricted environments, exhibit lexical subjects.

(39)

Al comprar Juan el libro de las Malvinas, decidió irse a la Antártica.
"Upon John buying the book about the Malvinas, he decided to go to Antarctica."
Por obtener un trabajo su hijo, María no tuvo que hablar con el director de la escuela.
"Because her son obtained a job, Mary didn't have to talk with the director of the school."

But non-NP gerunds, in both Spanish and English, never have independent lexical subjects. Their subjects are always in relation to some other predicate as well. (When one says, "infinitives have lexical subjects," one means precisely a lexical NP which has no grammatical role other than being the subject of the form in question; e.g., Bill in John believes Bill to be sick.)

(ii) English to is an AUX, an immediate constituent of S, in that to precedes VP-deletion sites. But ing can never be stranded outside VP in this way. One might imagine, for example, that do-insertion would allow ing as well as the finite endings to be realized outside an empty or moved VP, but this doesn't happen:

Footnote 9, continued: these suggest to me that disjoint reference is better formulated as a condition on arguments to the same head. Then, since the only arguments of admit in (i) are we and them, them and the students can co-refer. Under my view, the complements of a prepositional head which is empty in s-structure or has no argument-taking properties (such as a case-marking preposition) are in fact arguments to the higher head. They must then be disjoint in reference with any of its complements:

(iii) *John_i's description of him_i surprised me.
*John_i talked to Mary of him_i.
*John described Mary_i to her_i.
*John_i poured oil on him_i.

Thus, Chomsky's Principle B and my alternative both correctly account for examples as in (iii).

(40)

He drives fast, and she does too.
*John tried driving fast, and I tried doing too.
*Mary finished buying presents before I even started doing.

The contrast in (40) shows that ing doesn't behave like a (finite) member of AUX. There is no counterpart to a lexically realized AUX before a VP-deletion site in Spanish.

(iii) Infinitives can be the focus of pseudo-cleft sentences, as illustrated in Rosenbaum (1967). This means that a focused S in a pseudo-cleft can be co-indexed with the introductory what/lo que, and that (some) infinitives share this property.

(41) (a) What John decided was that he would fly at night.
Lo que Juan quiere es que María lo alabe.
What John decided was to visit his mother.
Lo que Juan ambicionaba era viajar en Europa.

Non-NP gerunds never exhibit this property:

(41) (b) *What John continued was smoking cigarettes.
*Lo que Juan continuó fue fumando cigarrillos.

This non-NP property distinguishes them from English NP gerunds (Emonds, 1976):

(41) (c) What John enjoyed was smoking cigarettes.

(iv) Infinitives can sometimes be conjoined with S or S̄.

(42) (a) Juan quería escribir una novela, y que ella fuera publicada en La Prensa.
John hoped to write a novel, and that it would be published in La Prensa.

Other examples are given by Koster and May, and in Emonds (1976). In contrast, non-NP gerunds can never be conjoined with S, S̄, or infinitives:

(42) (b) *Vino hablando de política y a trabajar en los afiches de propaganda.
*Ana trató de hablar como cantando y si fuera francesa.
*A professor knowing phonology and to talk to about it will be in soon.
*The man to fix that and talking on the phone is my husband.
*The woman drinking beer in there and who you met is rich.
*He began talking about politics and to work on the posters.
*I fixed that while cooking dinner and Bill was napping.

Since category identity is usually a necessary condition for coordination, the hypothesis that non-NP gerunds are bare VP's, while finite clauses and infinitives are S's, explains these unacceptable conjoinings.

(v) Certain infinitives have properties indicating that they are generated with a COMP (S-initial) node, such as an overt preposed WH-phrase, gaps in object position, and introductory COMP morphemes (italicized).

(43)

(a) La profesora no sabía a quién castigar.
El reloj es difícil de reparar.
¿No tenías tú que ver al editor esta mañana?
The professor didn't know who to punish.
The watch is difficult to repair.
The question of whether to invade the country was debated in NATO.

Gerunds do not exhibit these COMP properties; that is, they do not act like S.

(43) (b) We continued (*whether) visiting our mother for years.
*The tools buying at this store are expensive.
*They liked to work with whom humming tunes.
*The professor began which lessons teaching her students.
*A watch is difficult to listen to music repairing.

The only apparent COMP-related gaps in any non-NP gerunds are the "parasitic gaps" in adverbial gerunds, of the sort studied in Engdahl (1983), Taraldsen (1981), and Chomsky (1982). In (44a), t_i is the trace of WH-movement and e_i is the parasitic gap.

(44) (a) How many dishes_i should I dry t_i while putting away e_i?
I disliked the painting_i that the expert scrutinized t_i before describing e_i to the owner.
Which students_i can we criticize t_i while interviewing e_i for jobs?

In an excursus, I will now show that these parasitic gaps are not evidence for a COMP in adverbial gerunds. The reader not interested in this rather involved problem may wish to continue on at point (vi) in the text below.

Certain analyses of parasitic gaps (e.g., Contreras, 1984) require them to be indexed with an empty operator in their own clause. My claim that adverbial gerunds are bare VP's seems incompatible with such an analysis, since a bare VP contains no motivated syntactic position for an operator, empty or not. Parasitic gaps of the sort that appear in (44a) are far from unrestricted, however. The restrictions they are subject to suggest an analysis which reconciles the bare VP analysis with the need for an operator position and explains these restrictions at the same time. The first restriction is that the acceptable parasitic gaps in non-NP gerunds are adverbial gerunds introduced by P:

(44) (b) *How many dishes should I dry putting away?
*I disliked the painting that the expert scrutinized describing to the owner.
*Which students_i can we criticize t_i interviewing e_i for jobs?

The contrast between (44a-b) correlates with the fact that the adverbial gerunds with P in (a) translate into Spanish as infinitives (i.e., S's in my terms), while those without P (and not allowing gaps) in (b) translate as gerunds. Since Spanish and English are otherwise alike with respect to non-NP gerunds, this morphological difference is plausibly due to an irreducible language-particular distinction. I continue to assume that Spanish gerunds are deep structure bare VP's and that Spanish infinitives are S's, and I further assume that Spanish and English have the same deep structures for the constructions under discussion. Thus, the deep structure of a P-introduced adverbial gerund in both languages is as in (45), using the result of Ch. 7 that COMP is a P.

(45) [PP [P before, while, etc.] [S NP AUX VP]] (AUX = INFL = SP(V))

By the usual mechanisms (the case filter, etc.), the NP is lexical in (45) if and only if AUX is finite, and non-lexical if and only if AUX is infinitival (i.e., empty, prior to s-structure). Moreover, as argued in Ch. 7, a lexically filled P is incompatible with any movement to COMP (= P), or even with any operator in COMP. Therefore, there is no available 0-COMP in (45) to "license" a parasitic gap. The correctness of this is confirmed in Spanish by the absence of acceptable counterparts to (44a).

(46) *¿Cuántos poemas_i debería publicar t_i después de escribir e_i?
*No me gustó el cuadro_i que el experto examinó t_i antes de describir e_i al dueño.
*¿A cuáles cantantes_i criticó t_i al escuchar e_i?


The only language-particular stipulation needed for English is the deletion of AUX in (45) in the context "[lexical P, +TEMPORAL] ___." Assuming with much recent work (cf. Ch. 3) that AUX is the head of S, it is plausible that AUX-deletion also entails S-deletion, which would result in (47).

(47) [PP [P before, while, etc.] VP]

An s-structure as in (47) will automatically become a morphological gerund in English, since it is a bare VP. Furthermore, the subject of this VP, by (23), will correctly be determined as the subject of the main clause.

The question is now, where does English AUX-deletion in temporal clauses occur, and what happens to the NP in the underlying (45) when this deletion takes place? Suppose we say that this language-particular AUX-deletion takes place in the syntax. A general condition on AUX-deletion (that is, S-deletion) should be that whatever is automatically deleted with AUX, namely the NP in (45), is recoverable. Beyond this condition, there is no restriction on the content of this NP, since it is absent at s-structure, and hence has no effect either on post-s-structure interpretation (e.g., grammatical relations) or conditions like the Binding Theory.

One plausible interpretation of "recoverable NP" is that a recoverable NP is either empty or co-indexed with a c-commanding NP in some local domain. Since (45) is an adverbial clause (outside the VP in a higher clause), the only NP's that c-command it in the next highest clause are the subject NP and a possible NP in COMP. If the deleted NP in (45) is unindexed (i.e., empty) or co-indexed with the subject of the higher clause, no parasitic gap results; moreover, definition (23) determines that the higher clause subject is also interpreted as the subject of the gerund. On the other hand, if the NP to be deleted is co-indexed with an NP in the COMP of the higher clause, a parasitic gap results. For example, the structure to which AUX-deletion applies for a typical example in (44a) is (48):

(48) How many dishes_i should I dry t_i before [S [NP_i ∅] [AUX ∅] [VP put away e_i]]

Since the AUX-deletion which removes NP_i + AUX in (48) is in the syntax, NP_i can either be base-generated or moved into position by Move α, subject to subjacency (cf. Ch. 3). Interestingly, confirming data is provided in Contreras (1984) to show that the parasitic gap in examples like (48) is subject to subjacency. However, this NP_i will not be subject to the Binding Theory, which applies only to s-structures.


According to this analysis, then, the missing "operator" in adverbial gerund parasitic gap structures in English is not in COMP, so no s-structure S over them can thereby be justified. This operator is rather an empty NP in deep structure subject position, which is deleted prior to s-structure as an automatic side effect of AUX-deletion, the language-particular local rule which is in any case necessary to explain a discrepancy in the otherwise parallel distribution of gerunds and infinitives in Spanish and English.

This analysis predicts several more properties of adverbial gerund parasitic gaps besides the required presence of P (44a-b) and the absence of Spanish counterparts (46). Parasitic gaps in finite adverbial gerunds are correctly predicted to be less acceptable, since they can only be derivatively generated.

(49) (a) *I disliked the painting that the expert scrutinized before you described to the owner.
*Which books should I make a list of while you are putting away?
*Which students_i can we criticize t_i while you interview e_i for jobs?

An adverbial clause which is an absolute construction introduced by with cannot exhibit a parasitic gap because there is no empty subject NP at deep or s-structure. This result is quite unexpected in a framework that takes the missing operator in parasitic gaps to be in COMP.

(49) (b) I can't locate my papers with the staff putting them away so soon.
*The papers I can't locate with the staff putting away so soon are important.

The present analysis also explains without stipulation why no filled COMP or even an empty COMP necessary for interpretation ever appears in adverbial gerunds, since their only COMP is in fact the lexically filled P (before, etc.) which introduces the construction. An analysis where adverbial gerunds have an S and a COMP independent of this P cannot explain this.

Finally, the present analysis is compatible only with parasitic NP gaps, since the empty co-indexed "operator" is in deep structure subject position, and subjects must be NP's (Ch. 1). This is also confirmed:

(49) (c) *How sick_i did John say he felt t_i before getting e_i?
*How long_i does John drink t_i before lecturing e_i?
This is a topic you should think about before talking about.
This is a topic about which you should think before talking.

I conclude that the parasitic gaps in English adverbial gerunds, when fully investigated, strongly confirm my claim that no S or COMP (other than the introductory lexical P) is present in these structures. While non-finite adverbial clauses introduced by a lexical P are deep structure S's in both Spanish and English, they become s-structure VP's in English. They thereby conform to the general pattern according to which bare s-structure VP's are realized as morphological gerunds in both languages. (I am indebted to N. Chomsky for pointing out the apparent problem for my analysis posed by parasitic gaps.)

(vi) There is a range of anaphoric expressions (reflexives and reciprocals), floating quantifiers, and predicate attributes that ordinarily must be locally c-commanded by their antecedent or the modified NP. This restriction holds without exception for infinitives only if we hypothesize empty NP subjects, controlled outside their S, that c-command the expressions in question. Thus, the reflexive in {Shaving himself/To shave himself} disturbs Oscar is c-commanded by a co-indexed argument in the same S only if the infinitive is an S with an empty subject NP controlled by Oscar. In contrast, as we have seen earlier in the discussion of (23), the revised definition of "subject of a V" makes it unnecessary to postulate an additional subject NP inside non-NP gerunds in order to properly account for anaphors and predicate attributes. This extends the argument given at the end of section 2.3 above regarding the examples in (25).

(vii) Koster and May claim that some infinitives allow an object such as each other to have a "split antecedent," as in John proposed to Mary to help each other. They suggest that it is the empty NP subject of the infinitival S which has such a split antecedent, rather than each other, which never can. Such situations do not arise with gerunds, which reflect the prohibition against split antecedents for each other directly.

(50) *John talked with Mary, reminding {her/*each other} about their promise.
*After the talk with Mary, John continued helping each other.

(viii) Essentially the same range of heads take S and infinitive complements, as Koster and May discuss in their section 3.3. Indeed, the distribution of infinitives in the base is essentially that of S, which suggests of course that the same category is involved. (This point is established in detail in Emonds, 1976.) Gerunds, however, do not have the distribution of S. First, gerunds (i.e., bare VP's under my hypothesis) do not appear as sisters to N and A, as discussed in Ch. 1. 10

10. This discrepancy in distribution can be explained as follows, using principles developed in various places throughout this work. (i) It is established that S is a PP in Ch. 7. (ii) PP, including S, can combine with any L^j (L = N, V, A) to form X^{j+1}, as argued in Ch. 1. (iii) In contrast, L^2 (= NP, VP, AP) can be a sister to V in the base, but not to N or A; this generalizes the notion that only V, but not N or A, "case-marks", and has been discussed in section 1.8. Thus infinitives but not gerunds can be sisters to N.

(51)

Mis amigas no notaron su llegada (*fumando un cigarro).
*My friends didn't notice his arrival smoking a cigar.
Nadie notó la salida del tren (*chirriando).
*No one noticed the train's departure screeching.
*John's continuation correcting papers surprised me.

A second difference in base distribution between gerunds and S is that the former are allowed as complements to subordinating P, as are S, while S̄ are not. At this point, the exclusion of the base sequence P-S̄ may seem problematic, given expressions such as in that and in order that, but it is established with ample argument in Ch. 7. A third distributional difference between gerunds and S is easily stateable for Spanish, but the existence of English NP-gerunds necessitates a circumspect statement. This difference is as follows:

(52) Spanish gerunds and English non-NP gerunds are never NP's and hence cannot appear in topicalized position (cf. section 7.7.2).

(53) Spanish and English infinitives are in NP positions (only) when they are topicalized NP's (cf. Emonds, 1976, and Plann, 1981).

The mechanism by which S can appear as the sole element under NP as in (53), while a bare VP cannot, would take us into the material of Ch. 7, since it would involve a property of S. Nonetheless, the result of this mechanism is a third clear distributional difference between S and gerunds (that is, the distinction between (52) and (53)). 11

11. The interested reader may wish to know how I propose to derive the English NP gerunds. Let me state more carefully the rule for generating base VP sisters of N and V. For X = N and V, the base composition rule is X^j → X^i, V^2, where i < j by the bar notation restriction on heads. This base rule is given in a more general form in note 12 of Ch. 1. For X = N and i = 2 in the formula we have:

(a) [Tree diagram rooted in NP; not recoverable from the source.]

(a) is the structure of Spanish and English perception gerunds and English reduced relatives. We can also have the following, for X = N and i = 1:

(b), (c) [Tree diagrams not recoverable from the source; surviving node labels: NP, SP(N), N, N.]

Such VP in Spanish are subjectless, and thus ungrammatical, because the USP property is violated. However, in English, SP(N) can take the form of a possessive NP, so that such VP are interpretable, provided they have a θ-role.


There are thus eight structure-based arguments presented by Koster and May that bare infinitives are S's which, when extended to gerunds, indicate that for them the opposite conclusion is warranted. Namely, these arguments show that gerunds are bare VP's. Of the three other structure-based arguments that Koster and May give, two of them, based on extraposition and subjacency, can be shown to be orthogonal to the issue at hand. The third, concerning a proper choice for a maximal projection of V in the bar notation, leads to a theoretical clarification of exactly how a "bare VP" is related to a full S.

Koster and May show that some subjectless infinitival relatives (with a PP in COMP) extrapose, analogously to finite relative clauses. They point out, however, that no clause with an empty COMP extraposes, citing Chomsky and Lasnik (1977). Therefore, within their framework, bare reduced relative gerunds should not extrapose. This is exactly the case, as illustrated in (54). Since such gerunds are marginal in Spanish, I confine myself to English examples.

(54)

That man (who is) cleaning the table has a nice shirt.
That man has a nice shirt who is cleaning the table.
*That man has a nice shirt cleaning the table.
Somebody (who is) visiting from the capital wants to give a talk.
Somebody wants to give a talk who is visiting from the capital.
*Somebody wants to give a talk visiting from the capital.

In the present framework, Koster and May's characterization of which clauses extrapose (those with lexical COMP) can be adopted without change, and no argument for or against the bare VP status of reduced relatives emerges. 12

Footnote 11 (continued): If N assigns the θ-role to VP in (b) and (c), two things are wrong. A θ-role is being assigned outside N to an external argument which is not an NP, and moreover, anti-transitive θ-relatedness is being violated for the triplet N, VP, "possessive" NP. So a higher predicate, say V, must assign a θ-role to VP, by virtue of subcategorizing for the VP in (b) and/or (c). In section 2.6, it will be proposed that the subcategorization feature in question is + N VP, where N is empty.

It remains only to specify what can countenance the empty N in the s-structures (b) and (c). If it were due to universal grammar, any language with a possessive NP would have an NP gerund. But I have shown (Emonds, 1971) that Middle English falsifies this. So the gerund requires a language-particular innovation. The Modern English rule that must be stipulated particular to this construction is the deletion of [N 0] in [… VP] within English "phonology". Alternatively, this N is spelled out as ing, by a "deverbalization rule," as in Jackendoff (1977).

12. In Ch. 7, I argue that COMP is the head of S̄; further, as suggested in section 1.7, clauses without COMP are plausibly not S̄, but only S. Also, as in Ch. 1, S and VP are projections of V (V^k), while S̄ = P^max (Ch. 7) is not. If these results are accepted, Koster & May's claim may be reinterpreted as saying that PP's extrapose, while V^k do not. But still


Koster and May note that infinitives with filled COMP's act like S's by contributing to subjacency effects (*What does Mary wonder to whom to give?). Again, since gerunds, as seen above, never exhibit a filled COMP, an independent argument for or against my hypothesis about gerunds as bare VP's cannot be constructed from such examples. One can, however, contrast extractability from a gerund which is a sister to V (aspectual and object-controlled gerunds) and from a gerund which is a sister to V or to some projection of N (adverbial gerunds and reduced relatives).

(55)

(a) ¿A quién continuó Juan escribiendo cartas?
    What songs did he go on humming to himself?
(b) *¿Cuánto humo te habló echándote?
    *What songs did he write his paper humming to himself?
(c) *What songs would a friend humming to himself bother you?

There are no subjacency effects in the (a) sentence of (55), but it is well-known that adverbial clauses outside the VP in (b) or reduced relatives in (c) prohibit extraction. The same effect can be seen in PP's in English, a language in which P's can be stranded.

(56)

(a) What town did he ride a horse past? (PP inside V)
(b) *What town did he rent a horse near? (PP outside V)
(c) *What town did a friend from visit you? (PP inside NP)

Koster (1978) discusses this effect, attributing it not to subjacency, but to what he calls being (or not being) "nested" in a manner he defines. That is, phrases which are not nested can be extracted and others cannot. My version of this idea is as follows: (57)

B can be extracted from X^max only if B is in a minimal c-command relation with X. 13

Such a formulation is consistent with many often-noted subject-object asymmetries. By (57), material from inside a subject NP or an adverbial Y^max (itself outside the V containing the main verb) cannot be extracted from S, which I take to be V^max. Similarly, material within a PP outside V cannot be extracted, even though a PP itself can be, since the PP minimally c-commands the main V. Thus, (57) predicts the inability of P outside of V or inside NP to strand.

Footnote 12 (continued): there is no independent argument based on extraposition about whether a gerund is an S or not, since their inability to extrapose is consistent with their being either V^max or V^{max-1}.

13. Kayne (1981) accounts for many of the same effects by elaborating a system of proper government based on co-indexing of heads and complements.


In any case, gerunds exhibit restrictions on extraction only when they are outside the internal V and the same holds in English for the PP of the form P-NP, and so bare VP can be assigned the same role as such PP in any theory of bounding. But they should not be considered nodes that count for subjacency, as shown by the examples in (55a).

A final structure-based argument brought to bear by Koster and May against a bare VP status for infinitives is that it is theoretically preferable to have a single maximal projection for V within grammatical theory. I agree that the maximal projections for V within the bar notation should not differ in category: they should all be V^k. However, as discussed in sections 1.3 and 3.3, I do claim that a defining characteristic of V is that its maximal projection is in some cases V^3 (= S), while in others it is V^2 (= VP). Within this system, both X^2 (= NP, AP, VP) and X^max (NP, AP, S) form natural classes, and this is what expresses the fact that some grammatical generalizations treat VP parallel to NP and AP, while others treat S as parallel to NP and AP. Moreover, some other generalizations of grammatical theory also utilize X^k, k ≥ 2, and thus treat VP and S together, rather than disjunctively. 14

For purposes of this chapter, it is not necessary to agree that S is a maximal projection of V; it suffices to establish that VP = V^2. That such a "bare" VP can be generated as a sister to V, to V̄, and to some N^k, k > 0, is then my hypothesis. Whether S is V^3 or is outside the bar notation system is independent of the present argument. If VP could not be justified as V^2 rather than V^1, one could object that governing predicates, for example, aren't subcategorized for X^1 complements. Hence, generating bare VP (i.e., V^1 under this supposition) would violate a bar notation restriction against bare "minimal projections" in non-head positions in the base.
But, as long as bare VP are V^2, governing predicates uniformly select X^k, k ≥ 2. Note that this is Brame's (1976, part 2) argument for the existence of some VP complements.

Some corroborating justifications for VP = V^2 are as follows: in Emonds (1980b), I argue that a number of generalizations hold for X^2 (= NP, VP, AP, PP) but not for S. (i) Phrases to the left of the head inside X^2 exhibit severely limited recursion in verb-second languages, but subject NP and adverbial AP outside VP escape the restriction. (ii) The same set of phrases cannot be extracted, by Ross's (1967) left-branch constraint, while subjects of sentences can be. (iii) Governing predicates cannot (non-idiomatically) select anything but a head inside a complement X^2, even though they can select subject NP's (obligatory control) and SP(V) (e.g., subjunctives) inside S. (A generalization of essentially this principle appears in Chomsky, 1981, 300.) (iv) Schwartz's (1972) constraint against moving heads can be "updated" by saying that X^j (j < 1) cannot move around within X^2; but various projections of V including V itself do move within S and even into S. 15 From these considerations, I conclude that VP is a V^2, and as such it forms a natural class with N^2 (NP), A^2 (AP), and P^2 (PP).

To conclude this section, I summarize the argument I have presented so far based on the properties of English and Spanish gerunds. Koster and May (1982) have given two types of arguments that English infinitives introduced by to should be uniformly analyzed as S. One type of argument is based on the fact that they have understood subjects, while another type is based on a variety of structural considerations. I have shown here that a proper definition of subject (23), independently justified on a number of grounds (some discussed in section 1.5 and some here), neutralizes the subject-based arguments as they might apply to non-NP gerunds. The structure-based arguments stand, however, and in fact lead directly to the conclusion that non-NP gerunds, in contrast to infinitives, are not to be analyzed as S. Rather, they are "bare VP," i.e., V^2 which are not directly dominated by S. Moreover, all V^2, whether bare (gerunds) or not (finite and infinitival clauses), are correctly subsumed under a number of statements in grammatical theory that apply to X^2; cf. section 3.3. Finally, the requirement that an NP have at most one θ-role has been relaxed in order to allow the theoretical possibility of bare VP gerunds. This Revised θ-criterion (26) states simply that θ-relatedness is antitransitive. The requirement that an NP have at most one θ-role (3) has been rejected.

14. For a more detailed discussion of these generalizations, see section 3.3. In Emonds (1979), I argue against the view of Jackendoff (1977) that appositive modifiers justify postulating an additional third bar level for NP, AP, and PP. That is, I argue that appositive modifiers do not form a constituent with the X^j they modify. I am assuming this position here throughout.
Bare VP have turned out to have the deep structure distributional characteristics of AP's, which is to be expected if bare VP are V^2, and if V and A are considered to share a cross-classifying feature [+V], as in Chomsky (1970). Like AP's, bare VP's can be sisters to V (aspectual and object-controlled gerunds), sisters to N and NP (reduced relatives), and sisters to V and VP (adverbial gerunds; here an AP would have adverbial form). Also like AP's, VP's can be sisters to P, under restrictive choices of a head P. Lastly, VP's can occur directly under the initial symbol E in absolutive constructions, as can AP's (With John sick, ...). Thus, no special base composition rule is needed to specify where VP's occur, as opposed to other phrasal categories.

15. One might want to say that all traces, including traces of heads, must be free within X^2, if one is willing to dispense with NP movement within NP's. (Cf. Williams, 1982; Bates, 1983; Ch. 3 here.)

2.6. Technical Implications of Bare VP's for Lexical Insertion, Subcategorization, and Obligatory Control

The arguments outlined in the previous section to the effect that non-NP gerunds are bare VP's are essentially all the structure-based arguments of Koster and May for the S-status of bare infinitives, turned on their heads. The USP-based arguments of Koster and May and of Wasow and Roeper (1972) in favor of an S status for all clauses have been neutralized by the generalized definition of subject (23) and by the Revised θ-criterion (26). This revision (anti-transitive θ-relatedness) in turn strengthens the argument that infinitives, at least those introduced by to in English, are indeed reduced S's, and not bare VP's.

The hypothesis that bare VP's exist necessitates some remarks about the mechanical functioning of the grammatical model in which they are embedded. These clarifications in turn make it possible to reduce the phenomenon of "obligatory control" to subcategorization of a maximally simple sort.

If reduced participial relatives have a base structure [NP NP - VP], as I have tried to establish, then the fact that such relatives can be, for example, passive in form has consequences for our conception of lexical insertion. When the external NP in such a structure is an argument to a governing predicate, its head is subject to various lexical restrictions imposed by that predicate (and receives a θ-role from it, etc.). But this head (the internal NP) is not in its surface position until after the transformational cycle has completed its operations within the domain of the external NP. Thus, in an example like the boy being criticized, the internal NP becomes lexically filled only by NP-preposing from within the VP on the cycle of the external NP. This is no difficulty if we adopt the technical revision of lexical insertion first suggested in class lectures by E. Klima in 1966:

(58)

Cyclic Lexical Insertion: A lexical head X imposes restrictions on its arguments (such as θ-roles, selection restrictions, etc.) only after the transformational cycle is terminated in all the cyclic domains properly contained inside X^max.

Cyclic lexical insertion is independently supported by the fact that it permits us to subsume the notion of "obligatory control" under a general statement that holds of all obligatorily empty nodes. By this term I mean the instances where subject NP's of non-finite clauses must be phonetically null (e.g., John decided who (*the department) to hire; Mary tried (*Bill) to leave). The more general statement does not distinguish between obligatorily empty NP (= N^2) and empty lexical heads (X^0).

(59)

(a) Limitation on Empty Nodes. A category X^i (X = N, A, V, P) can be empty in deep structure if licensed by the interplay of lexical subcategorization and the θ-role assignment principles ((23) and (47) of Ch. 1). Such empty nodes will be called "induced."


(b) If X = L, an open category, (59a) is the only means by which empty categories are licensed in deep structure. 16

Sometimes, a lexical entry stipulates an empty node directly; for example, impersonal verbs such as English seem and French falloir "be necessary" have empty subject NP's. In other cases, those of interest here, empty nodes can be an automatic consequence of subcategorization features and the Revised θ-criterion. Recall that the theory of indirect θ-role assignment of sections 1.4-1.7 implies:

(60)

A lexical entry of the form B, + D is satisfied in a deep structure tree (a) when B and D are sisters or, if this isn't possible, (b) when D constitutes a sister C of B. 17

The relation between B and D in (60b) is what I have called indirect θ-role assignment. How (60) works in a structure not involving obligatory control is exemplified by the feature + NP NP shared by the verb elect and the noun election. Since an N cannot have NP sisters (Ch. 1), indirect θ-role assignment requires that the subcategorization of election be satisfied as follows (B = N, D = NP, and C = PP):

[Tree diagram not recoverable from the source; surviving labels: ∅ (surface of) John; ∅ (surface as) president.]

16. Since a P is a closed category, it can be an induced empty node in accordance with (59a), or it can fall under the "Invisible Category Principle" of Ch. 5, which licenses empty closed categories that are morphologically realized on phrasal sisters. These limitations on empty categories are probably still too strong for languages like Spanish and Japanese, where clitics and morphological case (respectively) seem to permit empty deep structure phrases under other circumstances. I became convinced of the need for limiting the types of empty nodes in deep structure in conversations with W. Wilkins; as several authors have pointed out, certain analyses in Emonds (1976) are flawed because empty nodes are allowed too freely. Whatever revision in (59) allows for the control of empty nodes by clitics (cf. Jaeggli, 1982, Ch. 1; Borer, 1983, Ch. 2; Hurtado, 1981), it should respect Bouchard's (1983) claim that no separate statements of grammatical theory should be required for empty nodes. Under such a perspective, (59) has only the status of a logical consequence of the definitions and principles of syntactic theory. That is, one can deduce from the requirement that the nodes mentioned in subcategorization frames (i.e., assigned θ-roles) not be empty in deep structure (cf. Ch. 4) that only those deep structure nodes not so mentioned can be empty.

17. The definition of "constitute," repeating here (45) of Ch. 1, is that D constitutes C if and only if the only lexical material under C is under a node D dominated by C. I developed this notion in large part because of arguments in Hendrick (1979, Ch. 1) that subcategorization cannot be limited to sister constituents.
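The interaction of (60) with the definition of "constitute" in note 17 can be made concrete with a small sketch. It is only an illustration under stated assumptions: the `Node` class, the function names, and the toy trees are hypothetical and not part of the text, and a leaf with no word is assumed to stand for an empty node. The sketch also does not model why direct sisterhood may be impossible (for instance, because of anti-transitive θ-relatedness); it only checks the two configurations named in (60).

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass(eq=False)
class Node:
    """Hypothetical tree node; a leaf with word=None stands for an empty node."""
    label: str
    children: List["Node"] = field(default_factory=list)
    word: Optional[str] = None


def descendants(node: Node) -> Iterator[Node]:
    """Yield every node properly dominated by `node`."""
    for child in node.children:
        yield child
        yield from descendants(child)


def lexical_leaves(node: Node) -> List[Node]:
    """Collect the leaves under `node` that carry lexical material."""
    if not node.children:
        return [node] if node.word is not None else []
    return [leaf for child in node.children for leaf in lexical_leaves(child)]


def constitutes(d: Node, c: Node) -> bool:
    """Note 17: D constitutes C iff C dominates D and all lexical
    material under C is under D."""
    if not any(n is d for n in descendants(c)):
        return False
    under_d = {id(leaf) for leaf in lexical_leaves(d)}
    return all(id(leaf) in under_d for leaf in lexical_leaves(c))


def frame_satisfied(parent: Node, head: Node, d_label: str) -> Optional[str]:
    """Check a frame 'B, + D' against (60): 'direct', 'indirect', or None."""
    sisters = [s for s in parent.children if s is not head]
    if any(s.label == d_label for s in sisters):
        return "direct"        # (60a): B and D are sisters
    for c in sisters:
        if any(d.label == d_label and constitutes(d, c) for d in descendants(c)):
            return "indirect"  # (60b): D constitutes a sister C of B
    return None


def clause(subject: Optional[str]) -> Node:
    """Assumed toy S: a subject NP, an empty SP(V), and the VP 'leave town'."""
    vp = Node("VP", [Node("V", word="leave"), Node("NP", [Node("N", word="town")])])
    return Node("S", [Node("NP", word=subject), Node("SP(V)"), vp])


try_v = Node("V", word="try")
controlled = Node("V'", [try_v, clause(None)])  # embedded subject is empty
overt = Node("V'", [try_v, clause("Bill")])     # embedded subject is lexical
print(frame_satisfied(controlled, try_v, "VP"))
print(frame_satisfied(overt, try_v, "VP"))
```

On these toy trees, a frame mentioning only VP is met indirectly when the embedded subject is empty and fails when it is lexical, which is the configuration the text associates with obligatory control.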


Returning now to obligatory control, suppose we express the subcategorization of a verb which takes a "bare infinitive" such as try or decide with the same minimal specification used for verbs of temporal aspect like begin: + VP. This is to say that the only lexical difference between these uses of try and begin is that begin, being in the marked lexical subclass of aspectuals, does not assign a θ-role to its subject, while try is entirely regular. 18 However, anti-transitive θ-relatedness prohibits the subject of try from also being the subject of a base VP which both is a sister to try and receives a θ-role from try. The Revised θ-criterion instead requires that the VP complement of try have its own subject (i.e., in a separate S), as in (62).

(62)

[Tree diagram not recoverable from the source; surviving labels: V', try, ∅, SP(V), VP, ∅ (surface to), leave town.]

Indirect θ-role assignment (60) now comes into play to require that the subcategorized VP be the only lexical material under some sister of try (an S or an S̄). This implies that the subject NP of the S in (62) (and COMP, if S̄ is involved) must be null at the point when try is lexically inserted. By cyclic lexical insertion (58), the subject is null after transformations apply on the domain of the embedded S, so what is implied by try, + VP is that this VP is in a separate S whose subject is empty after transformations apply within S. This is precisely the property of obligatory control that resisted adequate formal expression for many years, and has brought Chomsky (1981) to allow obligatorily controlled subjects of infinitives to have the unique status of "pronominal anaphors" in the binding theory. The fact that the obligatorily empty subject NP's of infinitives can be reduced to a special case of empty nodes entailed by principles of subcategorization confirms the analysis of non-NP gerunds which has made cyclic lexical insertion (58) necessary. 19

18. In other accounts in which the deep structure sister of try is an S, its subcategorization must be differentiated from that of verbs like prefer and intend, which also take infinitival S sisters. That is, the frame "+ S, S infinitival" does not suffice to distinguish try and prefer; some additional feature is needed. Under my account, "+ S, S non-finite" does suffice to uniquely pick out the class of verbs like prefer and intend, since try requires only the specification + VP. For discussion of verbs like prefer, see section 7.4.

19. Cyclic Lexical Insertion (58) has been argued for in previous works. Brame's "Stratified Cycle Hypothesis," which I believe has not been sufficiently investigated, shares many characteristics of my solution here: Let us now return to the difficulties noted in earlier chapters in connection with Equi-NP deletion. On the one hand, we would like to eliminate Equi-NP deletion so as to provide for the simple + VP, + PP VP, + NP VP, and other subcategorizations for examples such as try, persuade, expect, be easy,


Any infinitive of obligatory control can be handled in this same way. For example, infinitival indirect question complements of verbs like wonder and decide are generated by the following subcategorizations:

(63)

wonder, V, + WH⌢V^k
decide, V, + (WH)⌢V^k

Recall, k ≠ 0 or 1 in syntactic subcategorization, so that V^k = VP or S here, where S is finite. By indirect θ-role assignment, V^k together with its introductory grammatical category WH must constitute a sister of the head V when these V undergo (cyclic) lexical insertion. Since WH can only be generated in COMP (i.e., as argued in Koster and May (1982), it is ad hoc to permit WH as a feature or daughter of VP), the sister of V must in fact be S. Thus, indirect θ-role assignment coupled with the subcategorizations (63) gives rise to lexical insertions as in (64).

(64)

(a) The boys wondered how often {they should take/to take} the subway.
    The boys decided (how often) {they should take/to take} the subway.

(b) For k = 2 in (63):

[Tree diagram not recoverable from the source; surviving labels: how often; VP dominating take the subway.]

Footnote 19 (continued): etc. On the other hand, we would like to retain Equi-NP deletion (i.e., empty subject NP's of VP's in the base, J. E.) so as to preserve such basic transformations as Passive, . . . Suppose we therefore adopt a cyclic account of subcategorization, . . . Armed with these distinctions, we are able to suggest an alternative to the standard theory which circumvents much of the criticism levelled in chapter 5. Stratified Cycle Hypothesis: Embedding rules, Pruning, Co-occurrence checking, and simplex-sentence rules apply in that order on each cycle. (Emphasis mine, J. E.) (Brame, 1976, 127-128)

Brame eventually rejects this hypothesis on the basis of two characteristics; in his conception, it includes S-pruning and a gap of bare VP's in base phrase structure: "We are thus left with (v) (S-pruning, J. E.), and a weaker form of (vi) (the gap of bare VP's in phrase structure, J. E.) as the remaining objections to the Stratified Cycle Hypothesis." (Brame, 1976, 130-131). Since the analysis of this chapter uses no pruning and implies no such gap, Brame's argumentation in fact supports the Stratified Cycle; crucially, I retain the essential cyclic checking of subcategorization before cyclic transformations, as suggested by Brame.


Verbs like persuade and convince have the following subcategorizations:

(65) (a) + NP (X^k), X = N or V 20

They differ from verbs like find and catch in that persuade and convince assign θ-roles to two internal arguments, while find and catch assign a θ-role to only one internal argument. Hence, a second complement of persuade or convince (V^k or NP) must receive a θ-role indirectly via an empty P. The unmarked P is of before NP (persuade John of his importance) and that before V^3 (persuade John that he is great); cf. Ch. 7.

With k = 2 in (65a), a frame for persuade/convince is NP VP. If one of these verbs is inserted with both NP and VP sisters to which it assigns θ-roles, the Revised θ-criterion (anti-transitivity of θ-relatedness) is violated, since the NP will also be the sister of the VP and get a θ-role from it. Thus, when V^k = V^2 in (65a), this VP must be part of an S with a separate NP subject to which it can assign a θ-role. Moreover, since this VP must constitute a sister to the governing verb, the embedded subject and AUX outside of VP must be empty, giving rise to (65b):

(65) (b)

[Tree diagram not recoverable from the source; surviving labels: persuade; John; NP dominating ∅; AUX dominating ∅ (to); VP dominating leave town.]

Examples like (62), (64), and (65) show that infinitives have the same underlying structures in the system developed here as they do in systems where VP does not appear in subcategorization frames. Some verbs of temporal aspect accept both gerunds and bare infinitival complements:

(66)

John {began/started/continued} {riding horses/to ride horses} during his vacation.

English speakers often detect a subtle difference in meaning between the two verbal forms. The infinitive in these examples tends to convey more of a sense of "unrealized" activity. (As far as I know, the intuition never extends to differing truth value judgments.) Suppose we say that the feature which differentiates English modals from ordinary past and present tense in SP(V) is +M. (In section 5.5, I equate +M with -TENSE, where both +TENSE and -TENSE elements of SP(V) can be +PAST.) Begin, start, and continue can now be listed with the feature + (M) VP. If M is chosen, there is an element M in the complement which overtly expresses "unrealized" or "modal" force in the infinitive; moreover, since M is necessarily a feature on SP(V), an S complement with an empty subject is necessarily involved. If M is not chosen, nothing prevents the governing V and complement VP from being sisters, and an aspectual gerund results. 21 In contrast, temporal aspect verbs like finish, stop, and resume are + VP, and thus take only non-NP gerunds, while tend and fail are + M VP, and must appear with infinitives.

20. One might feel that the possibility in my system of subcategorizing differently for V^k (= S or an obligatorily controlled VP), V^3 (= S without obligatory control), or V^2 (obligatorily controlled VP) is too powerful a device. While we must be able to express easily the finite-infinitival alternation found in many clausal complements (e.g., after persuade, decide, wonder, etc.), many verbs take only obligatorily controlled infinitives (i.e., which are + V^2 in my system); some examples are try, hasten, decline (intransitives) and encourage, force, help, delegate (transitives). Other verbs take only full sentence complements and exclude obligatory control; answer, (re)assure, and demonstrate are some examples. As seen in the text, the verbs which take both types of complement clause contain V^k in their subcategorization frame.

(67)

*John {finished/resumed} to ride horses fast.

(68)

*John {tended/failed} riding horses fast.
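The pattern in (66)-(68) can be summarized as a small lookup. This is a minimal sketch, not the author's formalism: the dictionary, the status labels, and the function name are hypothetical, and it covers only the temporal aspect verbs named in the text. It deliberately leaves out verbs like try, whose + VP frame yields an infinitive for a different reason (the Revised θ-criterion).

```python
from typing import Dict, Set

# Status of M in each verb's frame, following the frames given in the text:
# "+ (M) VP" -> optional, "+ VP" -> absent, "+ M VP" -> required.
ASPECT_FRAMES: Dict[str, str] = {
    "begin": "optional", "start": "optional", "continue": "optional",
    "finish": "absent", "stop": "absent", "resume": "absent",
    "tend": "required", "fail": "required",
}


def complement_forms(verb: str) -> Set[str]:
    """Return the complement forms the frame allows for a temporal aspect verb."""
    m_status = ASPECT_FRAMES[verb]
    forms: Set[str] = set()
    if m_status in ("optional", "absent"):
        # M not chosen: the governing V and the VP are sisters, giving a gerund.
        forms.add("gerund")
    if m_status in ("optional", "required"):
        # M chosen: M sits on SP(V), so an S with an empty subject is involved.
        forms.add("infinitive")
    return forms


for verb in ("begin", "finish", "tend"):
    print(verb, sorted(complement_forms(verb)))
```

The design choice here is to encode only the presence, absence, or optionality of M, since the text attributes the gerund/infinitive split for these verbs to that single factor.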

The syntactic subcategorizations proposed here for various temporal aspect verbs seem to accord well with the semantics of individual verbs. The verbs in (67) are incompatible with the "unrealized" modal force of M, while those in (68) require this element. With temporal aspect verbs, it is not the Revised θ-criterion but simply the occurrence of M in the lexical insertion frame that induces obligatory control.

Along such lines, every instance of infinitives of obligatory control in English and Spanish can be explained, in the sense that the presence of an S, rather than of a bare gerundive VP, can always be attributed to a factor independent of control itself. An infinitive of obligatory control is always due to the Revised θ-criterion or to a subcategorization requirement for an M or a COMP with the infinitive, as we have seen. Since an infinitive is a reduced S or S̄, it contains possible reflexes of COMP, a subject NP, or SP(V), while a gerund does not. When something in the grammar requires the presence of one of these elements (e.g., subcategorization for WH in an indirect question; a "gap" in object position; an empty subject NP so that the θ-relatedness of lexical heads remains anti-transitive; or the "modal force" associated with SP(V)) then we find infinitives, as in (69):

(69)

El profesor sabía a qué estudiante castigar.
The professor knew which student to punish.
Este libro es difícil de leer.
That book is difficult to read.
María obligó a Juan a irse.
Mary persuaded John to go away.
The man to do that is John.

21. If begin is subcategorized as + M VP, one can ask what prevents a sentence like *John began must go. The example is excluded as follows. Since M VP must constitute a sister to V, these two elements must contain the only lexical material under that sister; in particular, they contain the only lexical material under the S which immediately dominates them. Thus, the subject NP of this S is empty, while the AUX (= SP(V) = INFLECTION) is lexical (e.g., must). As is standard in treatments of abstract case theory (e.g., Chomsky, 1981), a lexical AUX assigns case to its subject. But then, a case-marked empty category in this theory must be a variable and bound to an operator in COMP; the empty NP subject of M VP at the level of lexical insertion is not able to satisfy this requirement.

In the cases that remain, the grammar allows a bare VP to be generated; that is, as adverbial participles, as reduced relatives, in absolute constructions, as complements to aspectual verbs without θ-role subjects, and as complements to V which assign a θ-role to only one of two phrasal sisters. The rules that permit these VP in the base can be free, since anti-transitive θ-relatedness does the work of restricting their distribution. As mentioned earlier, if an AP complement is allowed by the theory of the base, then a VP is also allowed in that position. Thus, the category distinction of gerund vs. infinitive has been entirely reduced to the distinction between VP and S. A gerund is a VP that does not have a SP(V) as a sister (i.e., a gerund is a V² that is maximal; it is not immediately dominated by V³), while an infinitive is a VP that is not maximal. Besides the Revised θ-criterion, the main factor which induces S or S̄ with a subcategorized VP is a morpheme category such as WH or M in the subcategorization frame; this category, together with the VP, must constitute a sister of the governing head so as to receive a θ-role indirectly. For explicitness, I reserve the symbol ^ in subcategorization frames for linking a phrase and an associated adjacent morpheme category of this type. Thus:

Governing V    Subcategorization frame    Deep structure sisters of V
finish         +___VP                     VP
try            +___VP                     S or S̄*
fail           +___M^VP                   S or S̄
wonder         +___WH^VP                  S̄
find           +___NP VP                  NP and VP
persuade       +___NP VP                  NP, and S or S̄*
tell           +___NP ((WH)^VP)           NP and S̄

*The S node, it will be recalled, is required by the Revised θ-criterion.


I now wish to extend the use of subcategorization and indirect θ-role assignment to obligatory control in NP gerunds, and thereby reduce the latter phenomenon to principles needed elsewhere in the theory of grammar. Verbs like avoid and prevent impose obligatory control on their NP gerund complements:

(71) Mary avoided (*John's) being late.
     What prevents you from (*Sam's) leaving town?

The preposition from is not responsible for obligatory control here: we are exhausted from John's having harangued us for hours. There is no need to invent an ad hoc feature to ensure that a VP complement with avoid and prevent appears in an NP; features as in (72) suffice:

(72) (a) avoid, V, +___N^VP
     (b) prevent, V, +___NP from^VP

In (72a), the combination N + VP must constitute a sister to avoid, and (73a) is the minimal tree that expresses this. Similarly, since from is itself subcategorized for an NP complement, the minimal tree that satisfies (72b) is (73b). As discussed in note 11, Modern English has some device which licenses N as ∅ in the context N[ ___ VP].

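(73) (a) [Tree diagram: avoid with a sister constituent made up of N (= ∅) and the VP "being late"]
     (b) [Tree diagram: from with a sister constituent made up of N (= ∅) and the VP "leaving town"]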


It can thus be concluded that all obligatory control can be reduced to subcategorization, and the principles which govern it; further, no ad hoc features are needed to refer to distinctions between gerunds and infinitives. Distributional differences between non-NP gerunds and infinitives follow from the Revised θ-criterion and the presence or absence of some introductory morpheme category such as M or WH with the frame +___VP. Distributional differences between NP gerunds and infinitives follow from the presence of N or some noun-taking P in the subcategorization frame of the former. Finally, there is no need for a device which ensures that the subjects of unmarked infinitives must be lexically empty.

2.7. A Note on Recoverability

If adverbial participles are bare base VP's, this raises some questions about how a sentence such as (74) is to be derived:

(74) John got examined by a doctor after being arrested by the police

First, following Bresnan (1972), I will assume that "agent-postposing" of the by-phrase in passives is not a transformational process, today hardly a controversial position. Rather, I assume that the passive auxiliaries get, be, ser "be", chosen from among verbs which are +___AP, have a particular subcategorization of the form +___[AP VP] (NP), or perhaps, conforming to the types of features given in section 2.6, +___A^VP NP.

In this view, the deep structure of a simple passive sentence is:

I do, of course, retain NP-preposing of an object NP in a passive. This raises the following question: if the deep structure of (74) contains two VP's (and hence two instances of John to be moved), but only one "landing-site" empty NP to move them to, how can the hypothesis of this chapter be retained? A simple technical revision suffices, this time in the notion of "recoverability." I allow a lexical NPᵢ to move to a landing-site NPⱼ (replacing NPⱼ) provided that NPⱼ is empty, contains only dummy elements, or (this is the innovation) is precisely of the form NPᵢ already.


This retains the spirit of recoverability, and allows us to analyze (74) as in (76) without difficulty.

[Tree diagram residue; recoverable labels: be, A, en, VP, V, P, NP, ∅, arrest, John, the police]

We need not specify any ordering between the two NP movements; moreover, either of the VP's may also be active in form, in which case the subject NP is lexically filled when the transformational cycle on the S-domain begins.²² By this revision of recoverability, we can resolve a long-standing anomaly in transformational analysis, noted in Jackendoff (1977) and in Gazdar, Pullum, Sag, and Wasow (1982). The problem is exemplified by the derivation of a sentence like the following:

(77) The same man seemed to hate you, got drunk, and was arrested.

This example has the deep structure (78) in the framework presented here; its s-structure (79) is derived by moving each embedded NPᵢ to the surface subject position via the rule Move α (Ch. 3). In this way, three separate deep structure NPᵢ are conflated (recoverably) into one s-structure NPᵢ.
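The extended notion of recoverability used in this derivation can be made concrete with a short sketch. Everything in it is an illustrative assumption rather than part of the theory's formal apparatus: NP's are represented as plain strings standing for "form plus index", an empty landing site is `None`, and the set `DUMMIES` is a hypothetical stand-in for "contains only dummy elements".

```python
# Illustrative stand-in for landing sites that contain only dummy elements.
DUMMIES = {"it", "there"}

def can_replace(moving_np, landing_site):
    """Extended recoverability: the landing site may be empty, a dummy,
    or already precisely of the form of the moving NP."""
    return (
        landing_site is None
        or landing_site in DUMMIES
        or landing_site == moving_np
    )

def conflate(moving_nps, landing_site=None):
    """Move each NP in turn to the same landing site.
    No ordering among the movements is assumed to matter."""
    for np in moving_nps:
        if not can_replace(np, landing_site):
            raise ValueError(f"unrecoverable replacement of {landing_site!r} by {np!r}")
        landing_site = np
    return landing_site
```

On this sketch, three identical deep structure NP's, as in (77), can all land in one surface subject position, while two distinct lexical NP's cannot share a landing site.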

22. There is no lexical requirement that the deep subject of the passive auxiliary must be ∅. We could derive Johnᵢ was criticized from Johnᵢ was criticized Johnᵢ, except that no θ-role could be assigned to the first Johnᵢ at deep structure. To exclude such a derivation, we can require that θ-roles be assigned to phrases at deep structure, and are necessarily retained throughout a derivation. Recoverable erasure under identity is a syntactic operation. By looking at (78), we can see that the (semantic) θ-role of a lexical NPⱼ should not be erased, even when a lexical NPᵢ identical to NPⱼ replaces (erases) the latter. That is, the surface subject NP in (79) retains three θ-roles, but this is allowed because anti-transitivity of θ-relatedness is not violated.


[Tree diagrams for (78) and (79); recoverable labels: SP(V), to, same man, VP, get drunk, hate you, arrest, the same man]

The s-structure (79) is an appropriate input for semantic interpretation. The coordinated VP's and the surface spreading of TENSE are justified in Dougherty (1970). In (79), three θ-roles are assigned to one NP, the same man: one by hate, one by drunk, and one by arrest. Since no two of these predicates within conjoined VP's are in a θ-relation with each other, the anti-transitivity of the Revised θ-criterion (26) is respected. It can be noted that condition (3) on the θ-criterion, rejected here, is not respected in (78) or (79), and that it is in general incompatible with conjoined VP's in the base, which are argued for in Dougherty (1970). Thus, either a θ-criterion that includes (3) leads to the generative semantics position that coordinate predicates are always coordinate clauses (expressed in logical form with structure-building transformations), or an exception must be made for coordinate structures exactly of the form of the Revised θ-criterion. But dropping the stipulation that such a revised version holds only for coordinate structures leads precisely to the general position on the θ-criterion proposed in this chapter, thus confirming it via a separate line of reasoning. The technical revisions of the transformational model required by the Revised θ-criterion are thus the independently justified Cyclic Lexical Insertion (58) and a simple and plausible extension of the notion "recoverability."

2.8. Implications for Morphology and Concluding Remarks

I am now in a position to explain a little noted but I think interesting and important fact about verbal morphology. In English and Spanish, the gerund never has an irregular morphological form. Even irregular V's, such as be, ser, estar, etc., form the present participle regularly. One might think that in French, where in a handful of verbs the present participle and infinitive stems differ (e.g. sach-ant vs. sav-oir "know"), a few participles are irregular. But in these cases too, it can be argued on independent grounds that the present participle (= subjunctive and imperative stem) is regular, while the infinitive (= indicative) stem is irregular. This regularity of the present participle (gerund) stem can be explained as follows. The finite and infinitival forms of verbs are words within which the category V co-exists with some other syntactic feature, in particular, those that derive from a deep structure SP(V) by virtue of affix movement. The passive participle also combines the category V with the category A (cf. section 2.7).²³ In contrast, the V of the gerund is "pure." That is, it coexists with no other syntactic category inside the same word. Suppose we say that morphological exceptions to inflectional regularities are all of the form (80):

(80) The "regular stem" Z has the irregular variant Z' in the "context" of the syntactic feature F.

It is plausible to say that an accessible context for a lexical morphological condition like (80) must be within the same word. For example, we may say, "take has the irregular variant took when it is in the same word as +PAST." It now follows that a gerund, which in the present analysis is a V without any feature of SP(V) or of A, cannot in principle have an exceptional realization, since there is no other syntactic feature F that cooccurs in the same word with the head V of a pure VP. Some further anecdotal evidence that ing realizes a pure VP concerns the attempt of a contemporary American writer (Conroy, 1973) to render a rural Black English dialect.²⁴ The writer's technique is to suppress all verbal inflection (and most nominal inflection too). However, he consistently retains the ing of the non-NP gerund, indicating its different status. In Ch. 5 here, I will argue that other inflection reflects forms that are transformationally derived; in these terms, ing is retained since it is a direct morphological realization of a base structure (a bare VP).²⁵

23. In some recent work, this is simply stipulated, though it is transparently justified in Spanish by the adjective-like agreement that the passive participle exhibits.

24. The writer in question may be providing evidence about his own internalized grammar rather than about rural Black English. This is equally useful, if not more so, for the present argument.

My overall argument in this chapter has been to show that the morphological difference between non-NP gerunds and infinitives reflects an important structural difference, and further, that the differing distributions of these structures (viz., VP and S) are consequences of general grammatical principles, such as the Revised θ-criterion, the base composition rules and θ-role assignment principles of Ch. 1, and principles of subcategorization, e.g., (59) and (60). With one further language-particular device (cf. note 11), English NP gerunds can easily be integrated into this system, so that a rather complete theory of non-finite clauses emerges. Most of the empirical arguments for my position come from the fact that the structure-based arguments of Koster and May (1982), if they are taken to show that infinitives are S, equally well show that gerunds are not. Much of the material in sections 2.3, 2.4, and 2.5 has been developed to account for the fact that gerunds as well as infinitives have understood subjects (the USP of Wasow and Roeper). The theoretical modifications I have proposed with respect to the definition of subject, the Revised θ-criterion, Cyclic Lexical Insertion (58), and Recoverability may seem extensive, even though, as I have argued in section 2.6, obligatory control can be essentially subsumed under subcategorization as a result. In light of the explanations achieved, I do not think that hesitation before these theoretical modifications is justified. The previous notions of the subject, the θ-criterion, deep structure subcategorization, and recoverability are neither more elegant nor more empirically adequate than the ones proposed here.
The principles and definitions I have proposed retain all the virtues of their predecessors, and moreover provide an explanation in terms of universal grammar both of the structural differences between gerunds and infinitives and of their different distributions. This is sufficient to warrant their acceptance.
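The morphological argument of this section, that statements of the form (80) can only refer to a feature F inside the same word, can be illustrated with a minimal sketch. The table `IRREGULAR` and the function `realize` are hypothetical names; the single entry is the take/took example given in the text, and the spelling of regular forms is simplified (orthographic adjustments are not modeled).

```python
# Statements of the form (80): (regular stem Z, feature F) -> irregular variant Z'.
# Only the example from the text is listed; the table is illustrative.
IRREGULAR = {("take", "+PAST"): "took"}

def realize(stem, features_in_word, regular_suffix):
    """Spell out a verb form. An irregular variant is available only if
    some feature F of an (80)-type statement sits in the same word as V."""
    for feature in features_in_word:
        if (stem, feature) in IRREGULAR:
            return IRREGULAR[(stem, feature)]
    # No conditioning feature in the word: the regular form is the only option.
    return stem + regular_suffix
```

A finite form carries +PAST in the same word, so `realize("take", {"+PAST"}, "ed")` finds the irregular variant. A gerund is a "pure" V with an empty feature set, so a call such as `realize("be", set(), "ing")` can never reach the table and the regular form results.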

25. As was noted in section 2.6, the verbal ending -ing does "double-duty" in English. Like Spanish -ndo and Middle English -ing (cf. note 11), it represents the necessarily regular realization of V in a non-S structure. But Modern English -ing also realizes the structure N + VP, where N is empty in deep structure. The theory developed here claims that this syncretism is accidental, and that in general, if NP gerunds had an inflection different from non-NP gerunds, the inflection for NP gerunds would not have to be always regular; there could be statements conforming to (80) such as "take has the form take' when it is in a word which has the syntactic feature N." Nonetheless, given the syncretism (i.e., that non-NP gerunds and NP gerunds have the same form in English), it is predictable that the non-NP gerunds remain everywhere regular, since the conditions for an irregular variant (80) are never satisfied.

FIRST APPENDIX TO CHAPTER TWO

THE EMPTY HEAD PRINCIPLE

In this chapter, it has been proposed that empty nodes in deep structure are a result of subcategorization frames for heads of phrases. In particular, they are "induced" according to condition (59). For nodes that have no referential indices (e.g., heads of phrases and expletives) and for phrases that can be arbitrary in reference (the PRO of Chomsky, 1981), I claim that the distribution of empty nodes follows either from item-particular null subcategorizations (e.g., the subject of seem or of French falloir "be necessary") or from general principles of θ-role assignment. The principles include the Revised θ-criterion and the Indirect θ-role Assignment of Ch. 1, which then interact with lexical subcategorization frames of particular items to yield empty P's, empty subject N's, PRO subjects of infinitives, etc. I take phonetically empty NP's and VP's which have specific referential indices (e.g., the VP's in "VP-deletion" contexts) to not in fact be fully "empty" and to not fall under the requirement (59). Further principles may in turn predictively specify individual subcategorization frames as canonical syntactic realizations of particular item-particular θ-roles (Pesetsky, 1982, Chs. 1 and 2). For example, S may be the canonical realization of a "propositional" θ-role, and a VP of obligatory control (i.e., +___VP) may be similarly related to an "activity" θ-role. I do not develop this possibility here.

It is appropriate to specify how induced empty nodes fare under transformational derivations. In all untransformed structures and in many transformed ones, they observe the requirement for empty nodes given in Emonds (1976, Ch. 3), that empty nodes must be filled during the derivation of phonological form or referentially indexed during the derivation of logical form. In the model developed here, being filled via late lexical insertion (to be developed in Ch. 4) is the typical result for empty X°, SP(V) (e.g., INFL = to), COMP, and the subjects of seem and falloir. PRO subjects of infinitives must receive an index of control in logical form. However, in one circumstance, certain induced empty heads are obligatorily not realized phonetically. The condition under which heads are empty is quite general, and several examples of it will now be given.

In section 1.7, I postulated for verb/noun pairs like destroy/destruction and elect/election the subcategorizations +___NP and +___NP NP, respectively. As argued in Ch. 1, these complement phrases can be realized within NP's only through indirect assignment of θ-roles to objects of empty P's, as in the deep structures (i)-(ii).


[Tree diagrams (i)-(ii): NP deep structures with empty P's; recoverable labels: SP(N), the, destruction, ∅, president]

Via (59), these P's are licensed to be empty in deep structure, being induced by the subcategorization features. If no transformation applies to (i)-(ii), these P are necessarily spelled out in phonological form as of and as, depending on the abstract case-marking of their object NP's. For example, the second NP complement of elect/election is not assigned abstract case by P, and in this situation, P is typically spelled out as as. (Cf. Ch. 6 for more discussion.) A different set of verbs, those of transference such as send, buy, bring, etc., are also subcategorized as +___NP NP. But for these verbs, the second NP is assigned an abstract "dative" or "oblique" case by P, which in turn means that P is spelled out, before oblique case, as to or for. (Cf. section 5.7 for more discussion.) A typical deep structure is as in (iii); the abstract oblique case permits assignment of an indirect object linked θ-role, as shown in section 1.8.

(iii) [Tree diagram: send with the sisters NP "a present" and a phrase consisting of P (= ∅) and NP "John"]

There is another way the NP objects of the deep structure empty P in (i)-(iii) can surface; they can be moved to some other position, as in (iv) or (v):

(iv) The cityᵢ, they never described the destruction of [NPᵢ ∅].
     It's Johnᵢ they will send a present to [NPᵢ ∅].
(v)  [NPᵢ the city's] destruction [PP [P ∅] [NPᵢ ∅]]
     send [NPᵢ John] a present [PP [P ∅] [NPᵢ ∅]]

The difference between the movements in (iv) and (v) is that in (iv), Move α moves the NP's away from the positions that are case-marked into non-argument positions; in (v), Move α takes the NP's to positions that are case-marked, i.e., to the possessive position (where the abstract case is SP(N)), and to the position next to the verb (where the V can mark for case). Under the natural assumption that any chain of NP's co-indexed by Move α can receive only one abstract case, it appears that the examples in (v) should be derived from versions of (i) and (iii) in which the NP objects of the empty P are not marked for case. Thus, in (v), the traces are not case-marked, whereas in (iv) they are. This difference correlates with whether the induced empty deep structure P can remain empty in surface structure. When the trace is not marked for case, the P must remain empty.

(vi) **The city's destruction of.
     **Send John a present to.

To account for these induced empty P which remain empty throughout the derivation, I stipulate the following universal condition.

Empty Head Principle. If an empty head X° induced by subcategorization c-commands an adjacent empty and caseless Yᵐᵃˣ, X° has no phonetic realization.

While this statement at first appears ad hoc, its usefulness can be seen in a wide number of quite different constructions. For example, it explains why empty N with subject clauses are allowed in s-structure, when the S̄ subject is topicalized; cf. section 7.7.2 for a detailed account. The same thing happens in object clauses that are topicalized or extraposed. Their empty deep structure head N is licensed by the Empty Head Principle; cf. section 7.7.1. That is, since S̄ doesn't receive case, and is nonetheless a Yᵐᵃˣ (cf. Ch. 7, where I show that S̄ is a PP), the introductory N next to the trace of a topicalized or extraposed S̄ is not phonetically realized.

It will also be demonstrated in Ch. 7 that COMP is a head and thus a possible value of X° in the Empty Head Principle. As a result, any following empty and caseless subject NP (either a trace of NP movement or any PRO subject) is compatible only with an empty COMP, which is empirically confirmed. The reasoning applies to COMP which are empty in deep structure and subject to late lexical insertion, such as the COMP for of Standard English; this for is analyzed in detail in sections 7.4 and 7.5. The COMP for which appears in some dialects of English directly before an infinitive is, I assume, inserted in deep structure in S like a lexical subordinating conjunction.

Finally, in the government-binding theory of Chomsky (1981), the NP trace of an object of a passive verb is caseless, and the verb heads themselves are arguably adjectival. Yet the range of subcategorization frames found after a passive verb, when augmented by the NP trace, are just those found in a VP, not in an AP; these facts suggest a passive s-structure as in (vii). The representation of a passive participle as a "bare VP" in (vii) is akin to the result of this chapter that active participles are bare VP.


(vii) [Tree diagram: passive s-structure for elect; recoverable labels: VP, V (= ∅), NP (caseless, = ∅), other lexical phrases]

By the Empty Head Principle, the V head of VP in (vii) is allowed and required to be null. It seems then that the Empty Head Principle is a well-justified technical device within the theory being developed here. It would be desirable to state case theory and θ-role theory so that this principle could be derived as a theorem, but for the present I am not in a position to pursue this. So I allow it to stand as a multiply justified but purely technical stipulation which violates Bouchard's (1983) caution against "empty category principles."

There may be a situation where the Empty Head Principle applies to an X° even when the following Yᵐᵃˣ is marked for case. This arises when X° not only does not provide Yᵐᵃˣ with case (e.g., (i) and (ii)), but is not even a potential case-marker of Yᵐᵃˣ. The only time an X° can c-command an adjacent empty but case-marked NP, and still not be a potential case-marker of Yᵐᵃˣ (cf. section 1.8), is when X° = COMP and Yᵐᵃˣ is an adjacent, empty subject NP case-marked by the following SP(V). As is well-known, this is the "that-trace filter" context:

(viii) Who did you think [X°=COMP ∅] [NP ∅] left?
(ix) *Who did you think [COMP that] [NP ∅] left?

In order to derive the that-trace filter from the Empty Head Principle, we have to allow Yᵐᵃˣ in the latter to be either caseless or not potentially case-marked by X°. I leave it to the reader to decide if this is an appropriate modification.
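As a reading aid, the Empty Head Principle in its basic form (without the possible that-trace extension just mentioned) can be restated as a single predicate. This is a sketch under stated assumptions: the class `Phrase` and the function `head_stays_empty` are hypothetical names, and structural notions such as c-command and adjacency are passed in as ready-made booleans rather than computed from a tree.

```python
from dataclasses import dataclass

@dataclass
class Phrase:
    """A maximal projection following the head (illustrative representation)."""
    empty: bool        # phonetically empty, e.g. a trace or PRO
    case_marked: bool  # bears abstract case

def head_stays_empty(induced_empty_head, c_commands_adjacent, phrase):
    """True when the principle requires the head to have no phonetic
    realization: an induced empty head c-commanding an adjacent phrase
    that is both empty and caseless."""
    return (
        induced_empty_head
        and c_commands_adjacent
        and phrase.empty
        and not phrase.case_marked
    )
```

With these assumptions, the case-marked traces of (iv) give `False` (the P is spelled out as of or to), while the caseless traces of (v) give `True` (the P must remain empty, as (vi) shows).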

SECOND APPENDIX TO CHAPTER TWO

VERB RAISING IN DUTCH AND GERMAN

A construction which is relevant to several of the hypotheses of this and succeeding chapters is the "verb-raising" construction of Dutch and German. The principal work which analyzes this construction is Evers (1975); other important papers on the subject include Bech (1955), Huybregts (1976), and Bresnan, Kaplan, Peters and Zaenen (1982). In this construction, an infinitival verb V₁ which precedes an underlyingly VP-final V₂ is raised and adjoined to the governing V₂. German, at least in two-verb sequences, preserves the underlying order of the two verbs, while Dutch reverses them. Bresnan et al. (1982), using Evers' (1975, 38) verb-raising rule, argue that this rule moves a V as in (i) and produces the following derived structure, which they attribute to E. Klein. Throughout, I will use the surface order of Dutch.

(i) [Tree diagram: V₁ moves out of the lower VP, which contains ...(NP)... V₁, and is adjoined to V₂ under a V node in the order V₂ V₁]

Evers' verb-raising rule is a local transformation which restructures (in German) or interchanges (in Dutch) two V which are adjacent in underlying structure. Evers considers this rule to be evidence against the structure-preserving constraint (Emonds, 1976, Ch. 1, and Ch. 3 here), but ignores the possibility of analyzing verb-raising as a local transformation and thus removing its status as a counter-example. Viewed as a local rule, verb-raising fits nicely into the scheme of possible V-movements given in section 3.7 below. Evers' own formulation of the rule is not local because he utilizes a context predicate [COMP ∅] in the lower VP to prevent finite verbs from undergoing the rule. If it is necessary to make reference to finiteness in the rule, it can be read off the feature composition of V₁ and the context predicate can be eliminated.

A significant amount of language-particular evidence from Dutch and German provided by Evers himself supports the claim that verb-raising is local. Four of the conditions which block verb-raising are plausibly attributable to a single restriction that material in underlying structure between the two verbs renders the rule inapplicable. This has been pointed out to me by H. van Riemsdijk, and is attributed by Evers to P. Nieuwenhuijsen (cf. Evers, 1975, 39-40 for examples). If the higher verb, in addition to an infinitival complement, has either an idiomatic adjunct, an obligatory and inherently reflexive pronominal object, a directional particle, or a direct object which is the understood subject of the infinitive complement, verb-raising does not apply. All these constituents can be considered to be obligatorily closer to the governing verb in underlying structure than the infinitival complement, with the consequence that the optimal structural description for the rule appears to involve two adjacent V, as required by the Adjacency Hypothesis for language-particular statements of the Introduction.¹

While several characteristics of verb-raising are insightfully described and partially explained by Evers, he admits that the conditions under which governing V₂ permit raising "are probably more than a collection of lexical peculiarities, though they still escape a completely revealing description" (39). As just discussed, one such condition appears to be that the verbs be adjacent. I further propose that matching the categories of non-finite complements permitted by the Revised θ-criterion in this chapter with Dutch/German non-finite complement types can lead to the revealing account sought by Evers of when verb-raising is obligatory and when it is optional.

Dutch and German, like French, lack the present participle complements to verbs exemplified by English V + ing and Spanish V + ndo. In view of this gap, I would identify Dutch/German bare infinitives (those lacking te/zu "to") with the V-VP deep structure in (i), which is allowed by the Revised θ-criterion when a main verb fails to assign a θ-role to one among its subject, object, and clausal complement arguments. The main verbs which take these bare VP complements are those in Evers' classes I, II, and IV (cf. his section 1.0); they include the modal verbs and verbs of perception, just as English verbs which take bare VP complements include the temporal aspect verbs and the perception verbs. (The English translations of the Dutch/German modal verbs are not verbs but realizations of SP(V); cf. Chs. 4 and 5 for discussion.) The three possible θ-role assignment patterns permitted by the Revised θ-criterion are indicated by the arrows in the deep structures (ii)-(iv).²

1. According to Evers, particle verbs in German but not Dutch appear to permit verb-raising. This could be due to the presence of an intervening P in the German structural description, which would not affect the rule's status as local, or to the possibility that particles are at least optionally adjoined to V in German but not (when verb-raising applies) in Dutch. Evers' structural description also includes a PP term which intervenes between the two V, since he considers PP extraposition cyclic and the embedded V to be in an embedded S. A single phrasal term is permitted in a local rule, so his technical interpretation of the cycle does not affect my argument. However, later work to which I refer below allows extraposition to be formulated as optional and to apply on the cycle of the higher S, in accord with subjacency, across at least one cyclic domain boundary. Thus, the PP in Evers' structural description is unnecessary.

2. Evers' partition of bare infinitives into classes depends on his assumption that all complements must be S's in deep structure; his classes depend on whether his derivations include Equi-NP deletion or not. My three structures do not exactly correspond to his three classes.


(ii) "Class IV" θ-role assignments: [tree diagram; recoverable terminals: de kraanvogels, vergiftigen]

(iii) "Class I" θ-role assignments: [tree diagram; recoverable labels: S, NP Cecilia, NP de kraanvogels]

(iv) "Class II" θ-role assignments: [tree diagram; recoverable labels: S, V, de kraanvogels, fotograferen]

As Evers points out, V-raising is always obligatory if the lower V (V₁) is a bare infinitive (i.e., unmarked by te/zu). This could be due to one of two factors: either V-raising is obligatory for any infinitive which appears in a certain structural configuration, or the process is optional but subject to an output condition which filters out the configurations in (ii)-(iv). These alternatives will be discussed below.

Evers' Class V consists of subject-raising verbs; beginnen "begin", dreigen "threaten", schijnen "seem", hebben "have", zijn "be", and blijken "appear" are typical Dutch examples. I propose that they appear in the same deep structures (abstracting away from word order) as in English. As discussed in Ch. 1, raising of subjects is possible only out of complements to V's, which can have S sisters, and can't occur from complements to N's, which must have S̄ (i.e., by Ch. 7, PP) complements. The marker te/zu "to" can be taken to indicate a sentential rather than a bare VP structure, just as the English to represents the presence of an S, a view that has been defended throughout this chapter.

(v) [Tree diagram: subject-raising structure; recoverable labels: NPⱼ (= ∅), Move α, VP ... V₁ (-FINITE), governing V beginnen, schijnen, etc.]

In these four of Evers' five classes, V-raising is obligatory. The governing verbs in these classes are precisely those whose complements, by the argumentation of this chapter, we would expact might not be full S's in deep structure. That is, the verbs in these classes either plausibly fail to assign a 0-role to one of their arguments (the modal and perception verbs) or they are subject-raising verbs. This suggests that V-raising applies obligatorily to a non-finite head V in any complement which (i) is adjacent to its governing V, and (ii) is not in a separate S? This naturally leads to the question: what happens to an infinitival Vj which is adjacent to a governing V2 but also in a separate S? One option for such Vj is that the complement S that they are in is extraposed rightward over V2. This is not surprising, since extraposition is a process that typically affects S (cf. Ch. 7). Moreover, it is well-known that PP's in Dutch extrapose over the VP-final V, as established in Koster (1975) and further discussed in Evers' own work. In Ch. 7 here, I show that S is nothing else than a P P of a certain type. Taken together, these facts allow one to conclude that S complements, both finite and infinitival, can extrapose in Dutch because they are PP's. Reuland (1981) has observed a very interesting complementarity in Dutch; the verb in a clausal complement which precedes an underlyingly VP-final governing verb must be moved either by virtue of verb-raising or by extraposition. Rather than stipulating that extraposition is sometimes optional (for many sorts of PP's) and sometimes obligatory (at least for the complement clauses discussed earlier for which verb-raising is blocked), Reuland concludes that extraposition should be considered an optional effect of Move a, as it is in English. 
Then, some filtering device which applies to the output of transformations should mark as ill-formed a surface structure in which a verb is preceded by a complement Vᵏ (S or VP) containing its head V.⁴ Reuland elaborates the theory of abstract case to achieve this effect; I assume at least that his "filtering" approach is correct, and that extraposition is a uniformly optional operation on PP and S̄ in Dutch and German. An enlightening discussion and appropriate formulation of the V-V filter appears in van Riemsdijk and Williams (1981).

Using this approach, it is not necessary to consider V-raising as an obligatory rule either. It can be considered optional, but forced to apply because non-application in an S or VP complement will yield a surface structure in which a Vᵏ containing its head precedes a governing V. V-raising is the only movement rule available which can derive an acceptable surface structure from an underlying structure like (i)-(v).

3. The verbs of obligatory control in Evers' small class IIIa can also be analyzed as taking an S rather than S̄ complement in deep structure, and hence as obligatorily triggering the V-raising of the complement infinitive.
4. The presence of the unmarked complementizer dat/dass "that" in finite clauses doesn't nullify the filter; their extraposition is obligatory.

We have not yet considered the situation of a V which takes an infinitival S̄ complement whose head V is adjacent to the matrix V. It appears that there is an open class of matrix V which take such complements, namely those of Evers' Class IIIb (Evers, 1975; 5, 6, 42). With these verbs, the main verb of the preceding infinitive can escape being filtered in two ways. That is, there is an open class of V2 which appear in deep structures like (vi). These deep structures can become well-formed surface structures either through V-raising of V1 or through extraposition of S̄.

[Tree diagram (vi) not recoverable; surviving labels: V2 = vergessen/vergeten "forget", hoffen/hopen "hope", glauben/geloven "believe", versuchen/proberen "try"]

Reuland's proposal is that extraposition and V-raising are both optional rules, and that the exclusion of (vi) in surface structure is due to a post-transformational filtering device. We moreover appear justified in concluding that any sequence "infinitival V1 - V2" can undergo V-raising optionally. Those V1 which apparently must undergo the rule do so only because extraposition is not an available alternative for VP and S, as opposed to S̄ complements. Those V1 which may not undergo the rule are prevented from doing so by the presence of underlying material between V1 and V2 which blocks the adjacency required by a language-particular process.

In Ch. 3, I will introduce a principle proposed by Travis (1984) which constrains movements of heads. Clearly, such a principle should affect V-raising. Travis's proposal is that a head (here V1) can only move to the head which governs it. If this principle is to work, the empty COMP in (vi) must not "count" as a head which intervenes between V1 and V2; a condition ignoring an empty head can be easily incorporated into a formal


statement of Travis's principle. In fact, this condition may well provide the explanation for why V1 cannot be finite if verb-raising is to apply. In Dutch and German, an embedded COMP is always phonologically realized as dat/dass "that" in a finite clause; thus, the requirement that a head can only move to the next highest phonologically realized head in a chain of governing heads prevents verb-raising out of a finite clause. In Ch. 7, I will argue that the COMP "that" is the unmarked realization of P in the context S; thus, the COMP morphemes are assimilated to other subordinating conjunctions of category P.

A final context which blocks V-raising brought up by Evers is provided by subordinate clauses which are introduced by prepositional forms such as erom/darum. These forms are quite possibly lexical P in the position of "COMP" in deep structures like (vi). If so, Travis's principle would stop verb-raising from applying to the subordinate verb in such contexts.

One might object to the suggestion that V-raising is a completely general rule by arguing that the process is blocked in some cases even when an infinitival V1 is adjacent to V2. Evers (1975, 43) gives a few such cases, suggesting their idiosyncratic nature. If it turns out that this kind of objection can be sustained, it means that whether V-raising is blocked or allowed is, for S complements, essentially a lexical matter. However, since V-raising remains uniformly possible (and obligatory) for the bare infinitive complements which I have analyzed here as "VP complements", I would revise my proposal only partially. We could analyze all those S complements in which V-raising can take place as deep structure S sisters to V, and all those S in which it can't take place as deep structure S̄ sisters to V. Then, Travis's principle, unmodified, would block V movement out of S̄, but would permit it out of S and VP.
The V-movement rule itself would be stated without any stipulation as to the non-finiteness of V1. The only task would be to explain why so many verbs (Evers' Class IIIb) take both S and S̄[∅ - S] complements, since they are compatible with both verb-raising and extraposition; this perhaps could be made to follow from a subcategorization frame + S and a fully worked out theory of the "induced empty nodes" introduced in the first appendix to this chapter.

In conclusion, it seems that obligatory V-raising contexts include bare VP complements in Dutch and German, and that these are the counterparts to the English V + ing and Spanish V + ndo participles. V-raising itself is a local rule which mentions neither the finiteness of V1, nor the lexical subclass of the governing V2. If verbs must be partitioned into lexical classes according to which can be the V2 in verb-raising, the deep structure subcategorization frames of + S and + S̄ suffice to mark the appropriate distinction. In Ch. 4, it will be of some importance that rules mentioning V, such as verb-raising and the filter proposed by van Riemsdijk and Williams just cited, do not mention sub-classes of V in their formal statement.

Chapter 3

Clausal word order and structure-preservation

3.1. General Characteristics of S Expansions

In Ch. 1, I have claimed that sentences S are best characterized as a third projection of V in the bar notation, and that V contrasts with other heads of phrases precisely in allowing this third projection. The second projection of V, which is VP, corresponds to the second projection of the other heads (N, A, P), and is not always immediately dominated by V³ (= S), as seen in Ch. 2. In this chapter, I will argue that S does, however, always immediately dominate a VP, or at least it always dominates an X². That is, I will argue against the idea that a clause S can be a "flat structure" in which subject and object occupy structurally similar positions. In Ch. 1, the following principles for constructing an expansion of S have been proposed.

(1) X³ → Yᵐᵃˣ, X²
(2) Only V can have a third projection in the bar notation.
(3) SP(X) is the daughter of Xᵐᵃˣ.

Recall that in Ch. 2, the subject of X⁰ is defined as an argument of X⁰ which is outside X¹ and in all the same NP and S as X¹. I further proposed that the fundamental characterizing property of NP is (4):

(4) Only NP's may be subjects.¹

Moreover,

(5) In the unmarked situation, subjects precede predicates.

1. As pointed out in section 1.3, the fact that Yᵐᵃˣ is always an NP in languages considered up to this point is a special case of (4). Any phrase must be interpreted, and the only interpretations of a phrase are as an (external or internal) argument, or as some kind of "adverbial" complement. One type of non-argument NP is a measure phrase, but, as suggested in Jackendoff (1977, Ch. 6), measure phrases are external to X¹ but internal to X². I assume any other "adverbial" NP's are similarly restricted. A more accurate semantic alternative to (4), also pointed out in section 1.3, is (i):
(i) Only Yᵐᵃˣ whose specifiers can express reference can be subjects.


Of the principles (1)-(5), the last three are motivated outside the structure of S by the fact that they hold inside NP's, for adjective phrases, and for the bare participial VP's studied in Ch. 2.² The two principles (1)-(2) should and can easily be collapsed into one if both are truly universal. However, it might be that one or both are subject to parametric variation across languages, so I leave them separate.³

In section 2.5, I have given several justifications for considering V to have two different levels of phrasal projections which can occur as complements to other heads. In particular, some grammatical principles classify S with NP, AP, and PP (the class Xᵐᵃˣ), and others classify VP with NP, AP, and PP (the class X²). Moreover, especially in subcategorization and distributional statements, one sometimes (but crucially, not always) needs to be able to refer to Vᵏ (the class VP and S). The earlier justifications for S being a projection of V will be repeated below in section 3.3; sections 3.6 and 3.7 contain the more complex argumentation for this hypothesis, and form the empirical core of this chapter.

If we now make the assumption that subjects of sentences are obligatory (I return to this below), the consequences of (1)-(5) taken together are the following possible expansions of S:

(6) (a) S → NP SP(V) VP (English, French, etc.)
    (b) S → SP(V) NP VP
    (c) S → NP VP SP(V)
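The step from the principles to the three expansions in (6) can be checked mechanically. The following Python sketch is only an illustration under stated assumptions: it treats the three daughters of S (the subject NP, SP(V), and VP) as an unordered set, reads (5) as the single ordering constraint "NP precedes VP", and lists the orders that survive. The function name and the string labels are hypothetical conveniences, not part of the theory.

```python
from itertools import permutations

# Assumed daughters of S = V^3: a subject NP, the specifier SP(V), and VP.
DAUGHTERS = ("NP", "SP(V)", "VP")

def possible_expansions(daughters=DAUGHTERS):
    """Return every linear order of the daughters in which the subject NP
    precedes the predicate VP, i.e. the unmarked situation stated in (5)."""
    return [
        order
        for order in permutations(daughters)
        if order.index("NP") < order.index("VP")
    ]

if __name__ == "__main__":
    for order in possible_expansions():
        print("S ->", " ".join(order))
```

Of the six logically possible orders of three daughters, exactly three keep the subject before the predicate, and these correspond to (6a), (6b), and (6c).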

The possibility that Celtic languages exemplify (6b) and Dutch/German (6c) will be touched on in later sections. If we abstract away from the position of SP(V) in (6), we arrive at Chomsky's original phrase structure rule for expanding S, to wit:

(7) S → NP VP

This rule lays the basis for explaining the many properties which appear to distinguish the subject NP from other phrasal complements of a V, including its object NP's. Thus, in a number of the well-studied European languages and apparently in many others, the subject NP contrasts with various other NP's in sentences with regard to obligatory syntactically conditioned presence, verb agreement, agentive interpretation, possibilities for case marking, various properties stated in terms of

2. Since a participial VP is a V² but not a Vᵐᵃˣ, it follows that it has no specifier, i.e., it does not exhibit the tense/aspect/modality variation characteristic of SP(V). Bar notation theories in which the AUX or INFL node is not the SP(V) fail to explain the lack of some other V-modifying closed class specific to V (and not to N, A, and P), the semantic content of whose elements is recognizably related across languages.
3. For example, Taraldson (1983) argues that whether S is Vᵐᵃˣ is a parameter. See section 1.3 for a brief discussion of Williams' possible extension of (1) to permit exocentric (non-verbal) sentences in certain languages.


reference and co-reference (cf. Keenan, 1976), and accessibility to transformational removal. The sentential structure imposed by (7) allows various interpretive, transformational, and morphological rules and/or conditions on the functioning of such rules (e.g., the specified subject condition of Chomsky, 1973) to depend on the particular structural characteristic of the subject NP. Some of the less extensively studied properties of subjects that can be accounted for in terms of the subject's exteriority to VP will be reviewed in section 3.3. There does not seem to be any doubt, then, that the notion of a base NP outside the VP has been and continues to be a fruitful hypothesis for the formal and elegant expression of many generalizations about English and typologically similar languages.

The question that then poses itself is, how far should we extend Chomsky's rule for subjects (7) in describing languages other than those clearly like English? How wide a class of languages should be described by attributing to them the base rule (7) (i.e., one of those in (6))? An obviously important sub-question is, can (7) be used to describe languages whose canonical clausal word order pattern is verb-subject-object-... ("VSO")? A principal focus of this chapter, argued for in sections 3.5-3.8, is that VSO languages are better described by attributing to them a base SVO order, and hence some version of (6), than by considering such word order to be "base-generated." But even if not all VSO languages are derived from a grammar containing some version of (6), it is a groundless (but common) assumption that rule (7) is of interest only if it is universal; i.e., only if every language utilizes it. It is a priori just as interesting to find out that (7) has a more limited extension, and to find out exactly, in terms of other formal properties of the languages involved, what linguistic phenomena correlate with the presence of (7) in the grammar.
A more sophisticated error would be to reject (7) if some property correlated with (7) in languages like English (e.g., that the NP generated by this rule is always the subject of the following VP) were found not to be so correlated in some other language(s). Section 3.4 is devoted to exploring the possibility that (7) might appear in a grammar of a language without NP always being the subject in the same sense as this notion is understood for languages like English.

Before going on to discuss the status of (6)-(7) as universals, that is to say, of (1) and (2), in the face of differing observed canonical word orders, it is appropriate to recall that there is notional as well as purely syntactic content in the claim that an S is a V³. When a certain syntactic element in s-structures, call it I for "inflection", is lexically filled, it typically expresses not only verbal tense, aspect, and modality, but also the presence (and sometimes the express absence) of a judgment in a proposition. In the S's bracketed in (8), judgments are expressed, while in the minimally different examples in (9), the corresponding S's do not express judgments.


(8) [S My child go [I es] out on rainy days].
    It is important that [S he ha [I s] a good time].
    Since [S John [I, -past is] here], I am happy.

(9) (a) [S My child go out on rainy days]?
    (b) It is important that [S he have a good time]. (no lexical I in have)
    (c) It is important for [S him to have a good time]. (no lexical I in have)
    (d) If [S John [I, +past were] here], I would be happy.

Syntactically, the examples in (8) contain lexically filled I, while those in (9a-c) are derived from S's in which I is either absent or is interpreted as an empty element. Thus, it is clear that I, in particular, whether it is lexical or not, plays an important role in interpreting an S as a judgment. At the beginning of Ch. 1, I stated that grammatical tradition considers an S to be that syntactic structure which has the canonical form of a judgment, even though not all S's in fact express judgments. Then, we find that S, defined in this sense, has a typical deep structure form "NP - I - ..." When we further observe that I is distributionally dependent on a specific category, namely V, whose existence and distributional properties are independently defined, and that this distributional dependence can be expressed as that of SP(X) to X, this is a non-trivial result. Among other things, it reduces the category I to one which fills a gap in an independently defined system of categories, those of the bar notation. It seems to me that exactly this identification of I with SP(V) can be made, as various considerations throughout this chapter (some already given in Ch. 1) will indicate. That is, while it is of interest to claim universal or near-universal status for (10), where all the categories in the formula are independent, it is of even more interest to reduce (10) to (1)-(3).

(10) S → NP I VP

In this chapter, for brevity and to facilitate comparison with Chomsky (1981), where I = INFL, the symbol I is used as an interchangeable replacement for SP(V). It might be thought that I am committed to the position that I is not the head of S, contra McA'Nulty (1980) and Chomsky (1981). However, since S = V³ in my system, heads can be defined as X⁰ and X¹, and lacking these as SP(X), so that in fact I is the head of S.⁴ If necessary, we can distinguish between the head of α (= immediately dominated by α), and the lexical head of α (= X⁰, where α = Xʲ). A less technical and quite possibly better definition for the head of S is to say that only X⁰ and X¹ can be heads, and that I counts as the head of S (i.e., it governs the subject NP) only when it additionally has the features of an independently defined X⁰. Chomsky (1981, Ch. 3) points out that the AGR(eement) features on a finite I can be considered to be an N⁰. Thus, [I, +N⁰] would be the head of S, and be able to govern and case-mark the subject NP, etc. The nonfinite value of I, [I, -N⁰] (the infinitival to in English), on the other hand, could not be the head of an S. The role of head for an infinitive would then devolve onto V⁰ and/or V¹, which could not govern a subject NP, but could, as in fact happens, enter into selection restrictions imposed by a governing predicate outside of S in a way that a finite V does not. (For a discussion of this type of V-V selection restriction in French infinitives, see Lamiroy, 1983.) To the extent the question of headship of S enters into the exposition in this chapter, I take I (= SP(V)) to be the head of S. For this reason, it is of course obligatory in the expansion of S.

4. Another plausible alternative is that DET is the head of NP, but not of N, in ways analogous to how I is the head of S, but not of VP.

3.2. Base Word Orders and Observed Word Order Patterns

An oft-cited and oft-recast generalization about word order in language is directly stated in Greenberg (1963): languages tend to either put modified before modifier across all phrasal types (NP, VP, AP, PP) or vice-versa. Of his 25 universals on word order and syntax, numbers 2, 3, 4, 7, 13, 15, 21, 22, and 24 are all special cases of the idea that we almost always find the following orders together within a single language:

(11) P + NP, V + NP, V + VP, N + NP ("genitive NP"), N + S, A + S, V + Adverbial.
(12) NP + P, NP + V, VP + V, NP + N, S + A, Adverbial + V.⁵

An extension of principle (18) of Ch. 1 expresses Greenberg's generalization:

(13) Head Placement Parameter for Phrasal Complements. A phrasal sister must be on the (a) right (b) left of the head of a deep structure Xᵏ, k […] 1.

3 - 2 - 4

Quicoli's rule involves two phrasal categories and so is not local in the sense of (38). Perhaps, since his rule does not involve two maximal projections, the definition of locality I have given is too strict. However, at least one additional problem faces a claim that causatives are formed by a local V-fronting. Since causative constructions seem to be subject to rules of interpretation at logical form beyond those imposed by deep structure grammatical relations, any movement of V should precede the level of s-structure. The fact that there is no special morphology on causative V's in Romance languages provides some independent support for the idea that the causative rule applies "in the syntax". But then, the French (non-finite) causative construction, which is essentially the same as what is found in Italian and Spanish, must be lacking the VP-external I that McA'Nulty (1983) has argued for in finite French s-structures. Some ad hoc statement about French, whereby I is present outside VP in s-structure only in finite clauses, would seem to be necessary. An alternative is to allow V-fronting, including movement over I, to form the causative in French, and to attribute the appearance of the embedded subject to the right of the embedded verb to its earlier displacement by Move α, as argued for in Milner (1982, second appendix).


are possible in English because I retains its deep position outside VP in English s-structures and blocks any local inversions of V and the subject NP. I have not been concerned here with the details of how these V movement rules or the I movements of the previous sections are ordered, or when they are obligatory. These questions are partially answered in Ch. 5. It is plausible to assume, nonetheless, that language-particular rules cannot be arbitrarily stipulated as applying before or after s-structure. Following this logic, the V-fronting rules which "don't get the chance" to apply before s-structure in English cannot apply after English affix movement removes I from between the subject NP and V; that is, local V movements in English, except on the last cyclic domain of E, should be excluded in principle.

It can be observed that the local movements of V in Romance languages are limited to certain constructions (causatives, questions, etc.). It can be asked, should there exist languages in which local V-fronting is permitted in all clauses, provided I is always suffixal? The V-frontings in Romance are restricted either by structural descriptions in the syntax, or alternatively by the fact that only certain configurations are interpretable at logical form. For example, simple finite transitive declarative clauses (embedded or not) cannot have a locally fronted V in French: (49)

*(Il a remarqué que) cherche ta lettre Marie.
"(He noticed that) is looking for your letter Mary."
*(Il a remarqué que) cherche Marie ta lettre.
"(He noticed that) is looking Mary for your letter."

This is also true for Spanish, for the informants I have asked. A language in which local fronting of V is possible in all clauses without restriction turns out to be a language which has a clause-initial verb (a "VSO" language). Provided that I is always realized as a verbal suffix, the formulation of a local subject NP-V inversion rule derives a V-initial order from an underlying V-second order as in (14a). I think there is a growing amount of evidence that at least the Celtic languages are to be analyzed in this way. Anderson and Chung (1977) show that one possible topicalized constituent in Breton is a "semantic VP," where this VP contains just the combinations of clausal elements that we recognize as English VP's; e.g., verb + object(s) but not verb + subject, verb + subcategorized PP but not verb + time adverb PP, etc. (They establish independently that only constituents topicalize.) Thus, even though finite clauses in Breton exhibit VSO order, what I would consider the deep structure VP surfaces intact in topicalizations. An entirely similar conclusion is that of McCloskey (1983), who argues that a non-finite VP constituent in Modern Irish arises in a variety of constructions. This construction appears to me in many ways similar to the bare gerundive VP's studied here in Ch. 2. Harlow explicitly adopts an analysis of Welsh VSO word order as derived from an underlying SVO pattern by means of the following rule.

(50) TENSE - NP - V ⇒ 3 + 1 - 2 - ∅ (Harlow, 1981, 222)

Footnote 23, continued: In the potential English counterpart to a causative construction, the causative V1 and the embedded V2 are necessarily separated by two constituents at s-structure, the subject NP and the infinitival to (= I); the general condition on local rules (38) then prevents a local rule from adjoining V1 and V2.
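The structural change in rule (50) can be read as an operation on a three-term sequence. The Python sketch below is a minimal illustration, assuming that the terms TENSE, NP, V are numbered 1, 2, 3, that "3 + 1" means V is adjoined to the left of TENSE, and that the vacated third position is simply empty. The function names and the use of abstract category labels in place of Welsh forms are hypothetical choices made for the illustration.

```python
def front_verb(clause):
    """Apply the structural change 3 + 1 - 2 - (empty) to a clause analysed
    as the three terms (TENSE, NP, V) of the structural description."""
    tense, subject, verb = clause            # terms 1, 2 and 3
    # Term 3 is adjoined to term 1; term 2 stays; position 3 is left empty.
    return (f"{verb}+{tense}", subject, None)

def surface_string(terms):
    """Spell out a sequence of terms, skipping emptied positions."""
    return " ".join(term for term in terms if term is not None)

if __name__ == "__main__":
    underlying = ("TENSE", "NP", "V")        # verb-second base order
    print(surface_string(front_verb(underlying)))
```

On this reading, the subject-initial input yields a verb-initial output with the subject in second position, which is the VSO pattern the rule is meant to derive.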

The V-fronting rule in Celtic languages is obligatory, as is the fronting of I in Dutch/German. One candidate for a general principle from which it would follow that both these rules are obligatory is some extension of the Head Uniqueness Principle of Safir (1982). The various local V-frontings in verb-second languages of the English, Romance, and Celtic types, have now been discussed. Let us turn to the possibility of local V-movements in verb-final languages. Such languages can give rise to local V-movements which are not available in a V-second language. The asymmetry is due to the fact that in all the languages under discussion, a non-finite clausal complement to a V is positioned, at least in deep structure, to the right of all other complements to V. Thus, deep structure nonfinite complements in verb-second languages appear as in (51) and in verb-final languages as in (52): (51)

[Tree diagrams (51) and (52) not recoverable; surviving labels: VP in (51); (S), I, (S), VP, V2 in (52)]

Inspection of the above trees shows that the two V are not adjacent in most deep structures in a verb-second language, whereas in a verb-final language, the two V are adjacent. If the I in the embedded S in (52) is realized in the syntax as a verbal affix on V2 (as is always the case in Dutch, German, Japanese and Korean), or if V1 has a VP complement without an I, a local rule respecting the Generalized Head Restriction (40) and the SPC can adjoin the lower V2 to the higher V1. Clear examples of local V-movements of this type are exhibited by German and Dutch. Evers (1975) presents convincing evidence that in a range of nonfinite clausal complement constructions, sequences V2-V1 as in (52) are subject to a rule


of verb raising which makes them into a single V constituent. In Dutch, the two verbs are reordered as V1-V2, whereas in German, there is generally no reordering.²⁴

Thus, it can be seen that there exist local V-movements to all the positions where the SPC and the Generalized Head Restriction license such possibilities. A verb can raise to the I or V which governs it; when I is realized in the syntax on V and so does not intervene between V and the next highest head, a verb can also adjoin to a COMP (P) or to a V governing its S. When I does intervene, as in English or in French finite clauses, the maximum movement of V is around the I itself (English be-raising and French finite verb raising).

It is moreover of interest to observe that the possibilities for (in my opinion, language-particular) local movements allowed by the SPC and the Generalized Head Restriction are realized in languages in quite different ways. That is, within the rather strict limits set by these constructs of universal grammar, languages do vary, one might say, "as much as they can". It has not been necessary to search through a wide range of languages to find instances of local and apparently quite different V-movements. When a language (English) has seemed to lack full-fledged V-movement, it can be attributed to another factor, the uniform location of I outside VP at s-structure. When a verb-final language of the Korean/Japanese type has seemed to offer little to confirm the idea that V can move locally, it has only been necessary to look to Dutch/German to find the predicted type of movement.

24. See the second appendix to Ch. 2 for a discussion of various aspects of German/Dutch verb-raising.

3.8. Greenberg's VSO Universals as Evidence for a Universal VP under S

In the previous sections, we have seen that (i) we expect some VSO languages to be derived from underlying SVO order via local V-movements licensed by the SPC and the Generalized Head Restriction (40) (derived from Travis's (25)), and (ii) the Celtic languages appear to lend themselves to an analysis of this type. This naturally suggests that all VSO languages might be derived in a similar way.²⁵ While we have seen in section 3.4 that the Polynesian languages studied by Chung may not have external arguments with fixed θ-roles (i.e., subjects), I also suggested that their superficial structure and case-marking may simply be a misleading remnant of an earlier topic-prominent period. In the face of diverging language-particular analyses of VSO language families, it is reasonable to base a tentative hypothesis about the VSO typology on properties which all languages of this type share. The best source to date for examining these properties is the classic study of Greenberg (1963) on word order universals, which includes many statements specifically directed at VSO languages.

25. Another language often taken as a typical VSO language is Berber. M. Kenstowicz (pers. comm.) has suggested to me that an argument can be made that Berber subjects appear in their "bound form" only as a result of some constituent being fronted to S-initial COMP position. Then, since a subject in a sentence with VSO order is always in its bound form, V appears to be in a fronted rather than a deep structure position.

The overriding impression given by Greenberg's article is in the direction of a universal status for VP, rather than away from it. For let us observe two points:

(i) VSO languages are in fact rare compared even with the SVO type alone. Deriving them via a local rule from a "universal VP" predicts this, in the same way we would predict that verb-last languages with a V-second rule in main clauses are rare compared to pure V-last languages, or that SVOX languages lacking English's particular version of subject-auxiliary inversion or tag questions are more common than languages strictly like English in these respects. That is, certain types of rules (especially non-structure-preserving movement rules) make a language "more complicated" and hence, rarer. An alternative taxonomic typology simply has no explanation for why one type of word order is more common; the refusal to face this lack of explanation is nothing but dogmatic empiricism.

(ii) Greenberg points out another "universal", which should at least suggest the direction of careful analyses, although it is true that his terms "dominant" and "basic" order are unclear. (Dutch and German are taken by him as verb-second languages.)

All languages with dominant VSO order have SVO as an alternative or as the only alternative basic order. (Universal No. 6)

While Chung's work discussed in section 3.4 casts some doubt on the absolute universality of this pattern, the common occurrence of SVO alternates to VSO is again suggestive of an underlying VP. Let me now distinguish two independent typological characteristics or parameters which must be postulated for languages which are not VSO.
Suppose we say that languages may vary as to whether they have a sentence-initial COMP in the base or not, and that, among other things, the existence of this COMP in the language entails a WH-fronting transformation. We then have the following possibilities for languages:

(53)
Deep Structure Order    Initial COMP      No Initial COMP
SVO                     English, etc.     Thai
SOV                     German, Dutch     Japanese

For the class of COMP-initial languages, Greenberg's Universal 11 expresses an interesting correlation:

Inversion of statement order so that verb precedes subject occurs only in languages where the question word or phrase is normally initial. This same inversion occurs in yes-no questions only if it also occurs in interrogative word questions. (Universal No. 11)

Reformulating, only languages with an initial COMP have word orders with inverted (fronted) verbs. Although this is probably not Greenberg's intent, this restatement of Universal 11 correctly includes the V-final Dutch/German type of language. To capture this generalization, I adopt (54), which is essentially due to DenBesten (1977).

(54) All instances of movement to a pre-subject position by a grammatical transformation are attractions to a sentence-initial COMP node.²⁶
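The chain of inference that the following discussion draws from (54) can be sketched as a pair of one-way implications. The Python fragment below is a hedged illustration only: the class, its field names, and the two sample entries are hypothetical, and a return value of False means merely that nothing follows from these premises, not that the language lacks COMP or WH-fronting.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Language:
    name: str                     # hypothetical label, not a real language
    has_deep_vp: bool             # assumption: S dominates a VP in deep structure
    verb_precedes_subject: bool   # some surface order puts the verb before the subject

def entails_initial_comp(lang):
    """With a deep structure VP, a pre-subject verb must have been moved;
    by (54) such movement is attraction to a sentence-initial COMP."""
    return lang.has_deep_vp and lang.verb_precedes_subject

def entails_wh_fronting(lang):
    """Assumed premise from the text: an initial COMP entails WH-fronting."""
    return entails_initial_comp(lang)

if __name__ == "__main__":
    vso_with_vp = Language("hypothetical VSO language with a VP", True, True)
    sov_no_inversion = Language("hypothetical SOV language, no inversion", True, False)
    for lang in (vso_with_vp, sov_no_inversion):
        print(lang.name, entails_initial_comp(lang), entails_wh_fronting(lang))
```

Only the first sample entry triggers both implications, which mirrors the claim that a VSO language with an underlying VP must have both an initial COMP and a WH-fronting rule.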

I do not imply by (54) that all movements to pre-subject position are necessarily adjunctions to COMP, although it may be that such movements are always adjunctions to COMP "if they can be", i.e., if COMP is not otherwise filled. If COMP is filled by a phrase, a V-movement to COMP is necessarily local, by (46), and gives rise to a structure [S̄ COMP V - ...]. Principle (54) is exemplified both by subject-auxiliary inversion of English and by the root transformation of V-second in Dutch and German; i.e., (54) is not just a restatement of Greenberg's Universal 11, but a generalization of it.

26. (54) is not meant to encompass purely stylistic frontings, such as the verb-fronting of Korean poetry mentioned in note 20.

But it now follows that any VSO language which has a deep structure VP, since such a language exhibits a (local) movement of V around the subject, will necessarily have a COMP and hence, by our universal characteristic of COMP above, it will also have a WH-fronting rule. But this is in fact Universal 12 and the second part of Universal 10 of Greenberg (1963):

Question particles or affixes, when specified in position by reference to a particular word in the sentence, almost always follow that word. Such particles do not occur in languages with dominant order VSO. (No. 10)

If a language has dominant order VSO in declarative sentences, it always puts interrogative words or phrases first in interrogative word questions; if it has dominant order SOV in declarative sentences, there is never such an invariant rule. (No. 12)

We are interested here in the first part of Greenberg's Universal 12; the weakly stated second part is in fact not true if "dominant order" is replaced by "deep structure order" (Dutch, German). If we incorporate (54) into the theory of grammar to reflect and extend Greenberg's Universal 11, it will then follow for any VSO languages with a VP that they obey Greenberg's Universal 12, but for any VSO language without an underlying VP, their adherence to the latter "universal" will be accidental: if there are true deep structure VSO languages, we will have no explanation for why all such languages have deep structure COMP nodes (= WH-fronting rules) too. Thus, Greenberg's universals 11 and 12 independently lead us to conclude that VSO languages in fact are deep structure SVO languages. By analyzing VSO languages in this way, we explain why they all fall in the upper left group with English with respect to the parameters in (53). The above argument is in no way a priori; it is based on the extensive empirical investigation carried out by linguists, including work internal to the VSO languages, to establish Greenberg's universals and the generalization of one of them (54).

In this connection, we can also consider the distribution of question particles that are positioned "by reference to the sentence as a whole" (Greenberg). It is more restrictive to assume that such a particle is always a realization of COMP in any language which has COMP. (Perhaps because of his concentrating on main clauses, it escapes Greenberg's notice that English whether is such a particle.) We can leave open the question of whether languages in which such particles are sentence-final have sentence-final COMP (e.g., Japanese and Thai have such particles).
But now we can make a further deduction from our restrictive premises; if a language is VSO with a VP, then it has a sentence-initial COMP, by (54); but then, if it also has a question particle positioned "by reference to the sentence as a whole," it will follow that the particle must be sentence-initial and not in any other sentential position. But this is actually the case: cf. Greenberg's table 2. It is also the content of his Universal 9, where I think the significance is obscured by correlating these particle positions with prepositions vs. postpositions rather than with verb order. If we could replace our one-way implication ("if a language has a COMP and such a particle, the particle is a realization of COMP") with a two-way implication ("a language has a sentence-initial COMP if and only if it has such an initial particle"), then we could conclude not only that any question particles of VSO languages must be initial, but that such languages must have such particles - which is also the case for the six VSO languages in Greenberg's "sample". But this stronger prediction depends on whether the many other (e.g., verb-second) languages with sentence-initial COMP can be shown at least in some contexts to exhibit such a particle.

3.9. Conclusion

The goal of Sections 3.5-3.8 has been twofold: to show that the constrained theory of transformations that we now have at our disposal allows only one non-base "everywhere-generated" sentential word order, VSOX, to be derived from the well-attested SVOX and SXOYV orders, and that there is a reasonable amount of evidence from work on the VSOX languages to show that many such languages in fact are so derived. This in turn strengthens my hypothesis that a deep structure subject phrase external to VP is the universal defining structural characteristic of S. This basic hypothesis about the nature of S, supplemented by two general principles governing movement rules, the Structure-Preserving Constraint and Travis's Head Movement Restriction (utilized here in the form (40)), yields not only an interesting and predictive analysis of VSO languages, but also predicts the occurring range of other surface S patterns, as shown in the following table:

(55)

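Movement of (V) + I                               | Obligatory    | Optional; interacts with question interpretation | Literary or narrative style
--------------------------------------------------|---------------|--------------------------------------------------|----------------------------
Root, since conditions on local rules violated    | Dutch, German | English I-inversion                              | Korean poetry
Local, because I is attached to V before fronting | Celtic        | Romance V-fronting rules                         | English simple V inversion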

From table (55), it can be seen graphically just how the structure-preserving constraint allows the autonomous syntactic restrictions to be factored out of the eventual proper formulations of the interpretive components. The columns of table (55) correspond to different uses of the uniform output structures at various levels of interpretation. The distinction between the two rows of the table is completely accounted for by universal grammar, the independently stated distinction between English and languages where I is always an inflection, and the choice of the word order parameter (13).

I conclude therefore the following for the constituent S, which is the canonical form of a judgment:

(i) S always has VP and I daughters, where VP is classed with NP, AP, and PP in the bar notation as X^2.
(ii) S = V^3 and I = SP(V).
(iii) S contains an external argument Y^max, which appears to exhibit fixed θ-role assignments in most and perhaps all languages.
(iv) When Y^max has the fixed, agentive θ-role property (32), it is an NP and is obligatory, and is called the subject.

Chapter 4

Grammatical formative categories and the designation convention

4.1. The General Nature of Syntactic Categories

The categorial, or base, component of the grammar is a set of universal principles of natural language which defines the possible dominance relations among a certain set of categories before grammatical transformations apply. In Chapters 1-3 principles of the base have been formulated which determine where phrases Y^k (k ≥ 2) appear as constituents of larger, containing phrases X^j. The phrase X^1 is always the head of X^1 or X^2. The central base composition rules for X' and accompanying principles of θ-role assignment are the subject matter of Ch. 1. In Ch. 2, it is shown that a V^2 may occur independently of V^max as a "bare VP", while in Ch. 3 it is argued that it is a third projection V^3 which typically, and perhaps universally, dominates NP - SP(V) - V^2.¹ Furthermore, because of the Structure-Preserving Constraint, as revised in Ch. 3, the categorial component sets strict limits as well on the dominance relations that can arise during a transformational derivation.

On the other hand, the only proposals so far discussed on how non-phrasal categories appear as constituents of phrases have been the largely uncontroversial ones that (i) the four maximal projections contain specifiers SP(X) (as immediate constituents) and deep structure morpheme category heads X°, and (ii) in deep structure left-right order, the heads X' follow all non-phrasal sisters. In this chapter, I will determine in more detail the nature of non-phrasal categories and the dominance relations that can hold between them and the various phrasal categories. For convenience, an alternative designation for "non-phrasal category" is "morpheme category".

1. The base composition rules given in preceding chapters are summarized here:

X^1 → X^0, Y^k, Z^j (j, k ≥ 2)
(a) Sisters Y^k of P or V can receive a θ-role "directly".
(b) Phrases Y^k c-commanded by L (= N, A, or V) can receive a θ-role "indirectly" if both (i) L is subcategorized for (F)Y^k, where F is a morpheme category, and (ii) F + Y^k constitutes a sister of L.

X^j → X^k, Y^2 (Y ≠ N; k = j or j - 1)
X^max → SP(X), X^(max-1)
X^3 → Y^max, X^2 (Y^max is an argument of X)
SP(X) → NP (English; X ≠ V)

Only NP's can be arguments of X external to X.


The central morpheme categories of syntax have deep structure distributions in the languages of the world which appear quite constant. These include the major lexical categories noun (N), verb (V), and adjective (A). Moreover, each of these lexical categories X can be paired with a category of grammatical formatives SP(X) - the specifier of X - which typically appears with it in the same phrase, and not in combination with the other lexical categories. Thus, for the category N, languages invariably contain a small class of morphemes termed articles, demonstratives, and quantifiers that combine with nouns to form a larger syntactic unit, the "noun phrase". This class, in earlier generative treatments called DETERMINER, is notated SP(N), and is as universal as the category N itself. Likewise, there are classes of SP(V) and SP(A) which include, respectively, verbal inflection and modality markers, and degree modifiers (very, so, too, etc.). We will investigate all these classes in more detail as we proceed; at this point, I only reemphasize the universal existence of the categories X and SP(X). Throughout, X^j and SP(X) are referred to as "bar notation categories".

An insight of many grammarians, first partially incorporated into a formal framework by Z. Harris (1946), and one which has been at the core of generative work since Chomsky (1970), is that all phrases in syntax are formed by a combination of a lexical category "head" X, which fixes the category of the phrase itself, together with SP(X) and perhaps other grammatical formative categories, and optional complement phrases. In the regular cases, the category of the head centrally determines the syntactic behavior of the phrase; it is moreover obligatory, and selectionally dominant.² Under this conception, a phrase, defined as a non-idiomatic syntactic unit larger than a word, is always, at the level of deep structure, a combination of an obligatory lexical category X and its specifier and complements.
Phrases are therefore called "projections" of X and are notated X^j, where j is a small integer, probably 3 at most. The integer j is alternatively represented by writing j bars over X, e.g., X̄ = X^1; hence the term "bar notation".

2. It is the head of the phrase which enters into selectional restrictions with material outside the phrase (usually, the head of a larger phrase). If, with Chomsky (1981, Ch. 3), we take the relation between the head and its phrasal sisters to be government, the head being the governor, then selectional dominance is subsumed under the following principle (Chomsky, 1981, 300; Belletti and Rizzi, 1981, 117): The head of a maximal projection (a phrase which is not itself a head, J. E.) is accessible to an external governor but peripheral positions are not. This principle should be united, it seems to me, with the "head constraint" defended in van Riemsdijk (1978, 160, and references cited there).

Within generative grammar, there has been a gradual acceptance of an idea first argued for in Ruwet (1969), that preposition (P) is a fourth head-of-phrase category that is more "grammatical" than N, V, and A; for example, the latter are all productive in many languages, while P is not. The category P thus gives rise to bar notation phrases P^j. Some of the works that have contributed to establishing P as a bar notation head are Emonds (1972), van Riemsdijk (1978), Jackendoff (1973, 1977), and Baltin (1978b). One of the central purposes of this book is to further delineate the role of P in universal grammar (cf. Ch. 1, 6, and 7). To express the distinction between P and N, V, A, I refer to P as a grammatical category and to N, V, A as the "major" lexical categories.

A point of disagreement in formulations of the bar notation is whether the categories for embedded propositions, S and S̄, where [S̄ COMP - S] is the usual structure assumed, fall partly or completely outside the bar notation. One claim I want to defend in this book is that they do not. In Ch. 1 and 3, I have argued that S = V^3, and in Ch. 7 I try to establish that COMP = P and S̄ = P̄.

Once it is established that syntactic phrases, their lexical heads, and their most prominent grammatical formative category all fall under the bar notation, it should be clear that the following claim about the deep structure relations that can obtain among these categories is fairly strong.

(1) Bar Notation Uniformity: The dominance relations permitted in deep structures among the categories X^j and SP(X) are the same in all natural languages, and are determined by the principles of a universal categorial component.

It is not implied by (1) that the deep structures of languages are all the same. For example, deep structures are in large part determined by the principles and the particular entries of the lexicon, and these are not claimed as invariant. Secondly, I do not claim that the interpretive principles for grammatical relations are exactly the same in all languages; thus, I suggested in section 3.4 that [S NP - VP] could define a topic-comment relation in some languages and a subject-predicate relation in others. Similarly, thematic roles might be assigned to argument NP's somewhat differently across languages; cf. Levin (1983) for a detailed discussion of these possibilities. Thirdly, it is not necessarily the case that each X^j or SP(X) has the same characteristics in every language's deep structure; thus, which lexical categories are productive varies across languages, with N being productive in every language. The subcategories of X° and SP(X) appear to vary also. For example, Talmy (1975) discusses a pervasive (deep structure) difference in how languages use subclasses of verbs to express the "path" and "manner" of motion, which distinguishes English from Romance. Steele et al. (1981) present differences in what notions are expressed in AUX (my SP(V)) across languages, even though the possibilities are limited. Fourthly, some grammatical formative categories appear to fall outside the bar notation, and these might well vary from language to language; nonetheless, as we proceed, I will strongly circumscribe the variation possible across languages.³ Fifthly, as is well known, the categories of language can be ordered from left to right according to different principles; Bar Notation Uniformity says nothing about this, although in Ch. 1 I discussed some ordering principles that might have universal status. Finally, as is always the case in transformational grammar, a claim about deep structure such as (1) is justified by argument and analysis, and not by cursory inspection of surface forms.⁴ In any case, it can be seen that Bar Notation Uniformity, though it is a strong claim, is not a "Universal Base Hypothesis".

The claim of Bar Notation Uniformity is strengthened by any limits we can set on how many grammatical formative categories fall outside X° and SP(X). One proposal along these lines is that of van Riemsdijk (1978): "All syntactic features outside the bar notation are morphosyntactic in the sense that they are taken from a universal inventory and play a nonneutralized role in the morphology." I believe we can be even more restrictive. Van Riemsdijk's proposal is really a claim about how a language can be analyzed, given the morphology of the language. It does not set limits on how many such potential morphosyntactic features there are. I will argue here that, with few exceptions, all grammatical formative categories are either "disguised" instances of X° or SP(X) themselves, or are subcategories (= features) of X° and SP(X). The only counterexamples to this claim seem to be certain grammatical formative categories associated with coordination. However, when it is realized that the bar notation is fundamentally a formalization of subordination, it is not surprising that morpheme categories marking coordination might fall outside it.

It is possible that some of the universal dominance relations among categories, or even some categories themselves, may not be realized in a particular language.
For example, some languages seem to utilize three, rather than two, demonstrative determiners; some languages differentiate a syntactic dual from a syntactic plural while others do not; etc. But such variations might be simply different distributions of category cooccurrences, and not different distributions of inventories of categories at all. Thus, a "three-demonstrative language" might arise from the first-second-third person pronominal features being also realized on the DETERMINER, and not from some feature that is absent in a "two-demonstrative" language. Whatever the correct resolution of these puzzles, I will assume here that, outside of conjunction, language-particular variation among grammatical formative categories in deep structure is limited to which subcategories of X° and SP(X) are expressed, and with what distribution they are expressed.

3. Variations across languages are also to be expected in which notions show up in different types of morphological classes. Interestingly, Bickerton (1982) makes a very strong and convincing case that grammatical formative categories (outside the lexical categories) tend to be the same even in newly created Creoles which have evolved from completely different situations of contact with other languages. That is, Creoles, and possibly all languages, tend to have the same grammatical categories whatever their history.

4. Thus, the English gerunds of surface form [NP VP] seem to violate the bar notation, and also to have no counterparts in, say, French. But I go to some length in Emonds (1976, Ch. 4) and in section 2.6 here to show that their deep categorial structure is [NP N + clause], which is in conformity with the bar notation and analogous to structures found in languages like French. Rosenbaum (1967) reaches the same conclusion, but our two proposals differ considerably otherwise.

After arguing for analyzing grammatical formative categories as subcategories of X° and SP(X), I will go on to delimit the syntactic behavior of these "closed" classes. In particular, I will show that some of the most grammatically interesting of the closed classes are three previously unrecognized closed subsets of the open categories, which I term "grammatical" verbs, nouns, and adjectives. The principal theoretical result of this chapter, the "Designation Convention", explains why the syntax of an item in any of these closed classes typically diverges from that of other members of its class. An important correlate of this Convention is that the category "auxiliary verb" can be eliminated from formal grammar (section 4.6). To conclude, I will argue for a final distinguishing property of closed class items, their ability to satisfy s-structure or "post-transformational" insertion contexts.

4.2. Closed Categories

The systematic relations among phrases and heads within the bar notation provide the skeletal structure for grammatical behavior. However, the flesh and blood of grammar is the behavior of the non-phrasal, or morpheme, categories. Variation among syntactic systems is very apparent here, even though any sort of deeper look at the properties of these categories suggests that strong limitations can be imposed. As a first step toward establishing these limits, consider the classic division among syntactic categories between "open" and "closed" categories.

(2) The only possible open categories are the major lexical categories, N, A, and V.

An open category has the following two properties:

(3) Only open categories have indefinitely many members in the dictionary of a language - say several hundred at least. Closed categories have twenty to thirty members at most.⁵

(4) Conscious coining of new lexical entries by speakers is allowed only in the open categories.

5. Typically, there are thousands of nouns and verbs, not counting compounds. By "indefinitely many", I mean that a theory of language imposes no limit on their number. Thus, a natural language with 10,000 verbs is permitted by linguistic theory, but a language with 1,000 distinct determiner morphemes is not.
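Criteria (2)-(4) can be read as a simple decision procedure over a lexicon. The following Python sketch is only an illustration under stated assumptions: the function name `classify_category`, the numeric threshold standing in for "several hundred at least", and the sample counts are hypothetical and are not part of the text's proposal.

```python
# Hypothetical sketch of criteria (2)-(4); names and thresholds are illustrative assumptions.
MAJOR_LEXICAL = {"N", "A", "V"}  # (2): the only possible open categories
CLOSED_MAX = 30                  # (3): closed categories have "twenty to thirty members at most"
OPEN_MIN = 200                   # assumed stand-in for "several hundred at least"


def classify_category(name: str, member_count: int, allows_coining: bool) -> str:
    """Return 'open', 'closed', or 'unexpected' for a category in a toy lexicon."""
    if name in MAJOR_LEXICAL and member_count >= OPEN_MIN and allows_coining:
        return "open"
    if member_count <= CLOSED_MAX and not allows_coining:
        return "closed"
    # Any other pattern (e.g. 1,000 distinct determiners) is not expected by the theory.
    return "unexpected"


print(classify_category("V", 10_000, True))
print(classify_category("SP(N)", 25, False))
print(classify_category("SP(N)", 1_000, False))
```

The third call mirrors the contrast drawn in note 5: a language with 10,000 verbs is permitted, while one with 1,000 distinct determiner morphemes falls outside what the sketch treats as either open or closed.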


All the categories of syntax which are not open are called closed categories. They include the grammatical head-of-phrase category P, the SP(X), etc. A further property of the closed categories is that they are not an independent source of phrasal structure. That is, any non-idiomatic combination of words which makes up a deep structure syntactic unit is a phrase in the bar notation of the form X^j. If some word sequence is a closed category, then that sequence is either also a bar notation phrase, as in (5), or an idiom, as in (6):

(5) [DET [NP my father's]] house
    [SP(A) [NP three boring hours]] long

(6) [DET what the hell] more do you want
    [DET each and every] person here

There have been analyses which falsify this claim - e.g., the classic analysis of the branching AUX in Chomsky (1957), and many recent analyses of branching complementizers. I have argued elsewhere that the correct analysis of the auxiliary requires that any phrasal sequence under this closed category must be transformationally derived,⁶ and in Ch. 7 my analysis of the COMP involves no branching. Thus, I postulate:

(7) All rules of the deep structure categorial component that sanction branching refer to the branching of X^j.

That is, no closed category in the base dominates a sequence of categories prior to the lexical insertion of idioms, except by virtue of being expanded itself into a single bar notation phrase as in (5), or (under a slightly different conception) of occurring in the same feature complex as a bar notation phrase.

6. The analysis of Emonds (1976, Ch. 6) allows the surface English AUX doesn't to contain three linearly ordered morphemes, but this combination arises only because of two transformational operations on a non-branching base AUX dominating a single present tense morpheme.

Besides investigating the properties of the closed categories, we want to determine how numerous they are. As mentioned in section 4.1, I think it is a reasonable research program to reduce as many as possible of the seemingly disparate closed categories to the categories which form the core of syntactic theory, the X° and the SP(X). Not surprisingly, there are SP(X) whose categorial membership is not transparent (some typical cases are given just below). What is more interesting is how many putative closed categories are arguably instances of the various X (i.e., N, V, A, P). This latter, less obvious phenomenon is exemplified in the next section, following a brief discussion of various SP(X).

Many English demonstrative and other pronouns are plausibly just SP(N) which can occur with or without a following lexically realized N^1: this, that, these, those, some, any, none (variant of no), all, both, each, which, what (cf. Jackendoff, 1977, section 5.3). With respect to personal pronouns, Postal (1969) argues that at least the plural personal pronouns (us, you, them) are also DETERMINERS (e.g., you doctors, we three, etc.). We might extend this claim to singular personal pronouns as well; alternatively, since phrases modifying personal pronouns are arguably complements to NP's rather than to N's (Emonds, 1976, Ch. 4), it may be that some or all personal pronouns are simply NP's that cannot be analyzed into some other category.⁷

The archetypical members of the class SP(V) are the English modal auxiliaries (will, can, etc.). What is less often realized is that the English verbal affixes themselves are best analyzed not as co-occurring with modals but as alternative lexical expansions of the same SP(V) category. Modals and tense have in common, as opposed to the infinitive marker, that they can be further specified as +PAST. What characterizes infinitives is probably that their deep structure SP(V) is unexpanded into finer sub-features and is lexically empty. If we treat modals as +M expansions of SP(V) and morphological English tenses as simply [SP(V), -M], then it follows without further stipulation that modals exhibit neither the third person singular ending -s nor the regular past tense ending -ed. These properties of inflection are discussed in Emonds (1976, Ch. 6).

In the AP system, it is fairly clear that the comparative and superlative morphemes -er and -est are of the same category as the pre-adjectival modifiers SP(A) such as very, too, so, as, how, quite, etc. The evidence for this is that the suffixes in question alternate with, but do not combine with, other SP(A): *very smarter than Harry, *how smartest, etc.

The adverbial pairs now/then and here/there exhibit the same proximate/distant dichotomy as the SP(N) and SP(A), this/that.
Since these adverbials can also be objects of prepositions, parallel to NP's, it seems reasonable to say that they are NP's, whose internal analysis is suppletive for some NP; for example, [NP [DET here] N] substitutes for this place.

Finally, there is a category SP(P) which can be realized as right, clear, and perhaps as some other grammatical formatives. Cf. Hendrick (1976) for a study of these possibilities.
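The treatment of modals and tense as alternative expansions of SP(V), described above, can be pictured as a small feature structure. The Python sketch below is a minimal illustration only: the class name `SPV`, the encoding of +M / -M / unexpanded as `True` / `False` / `None`, and the function `inflectional_suffix` are assumptions introduced here, not notation from the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SPV:
    """Toy SP(V) node (hypothetical encoding).

    m:    True for +M (modal), False for -M (tense), None if SP(V) is
          unexpanded and lexically empty (the infinitive case).
    past: the further specification +PAST / -PAST, available to both
          modals and tense when SP(V) is expanded.
    """
    m: Optional[bool] = None
    past: Optional[bool] = None


def inflectional_suffix(spv: SPV, third_singular: bool) -> str:
    """Suffix spelled out on the verb; only [SP(V), -M] yields -s or -ed."""
    if spv.m is None:
        return ""   # infinitive: SP(V) is lexically empty
    if spv.m:
        return ""   # a modal is itself the expansion of SP(V), so no affix appears
    if spv.past:
        return "-ed"
    return "-s" if third_singular else ""


print(inflectional_suffix(SPV(m=False, past=False), third_singular=True))
print(inflectional_suffix(SPV(m=False, past=True), third_singular=False))
print(repr(inflectional_suffix(SPV(m=True, past=False), third_singular=True)))
```

Because the modal and the tense affix compete for the same SP(V) slot in this encoding, the absence of -s and -ed on modals needs no separate statement, which is the point the text makes about [SP(V), -M].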

7. Personal pronouns are arguably not nouns. In English an initial th before a vowel is voiceless if and only if the morpheme is in a major lexical category (e.g., thing, thin, think, vs. this, then, them, thus, etc.). This phonological pattern reinforces numerous syntactic considerations that indicate that pronouns are not in the category noun: pronouns do not take regular -s plurals, they are not preceded by adjectives or determiners, they are distinguished by morphological case in English while nouns are not, and they can appear in all syntactic positions with postposed universal quantifiers:

He ignored us both. John broke them all.
*He ignored the boys both. *John broke my chairs all.
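The phonological generalization in note 7 is a biconditional between voicing and category membership, and can be sketched as a one-line predicate. In the Python sketch below, the function name, the toy lexicon, and the cover label "closed" for every item outside N, A, and V are assumptions made for illustration; the text assigns these items to more specific categories.

```python
# Hypothetical sketch of the generalization in note 7; the lexicon is a toy example.
MAJOR_LEXICAL = {"N", "A", "V"}


def initial_th_is_voiceless(category: str) -> bool:
    """Initial prevocalic th is voiceless iff the morpheme is in a major lexical category."""
    return category in MAJOR_LEXICAL


TOY_LEXICON = {
    "thing": "N", "thin": "A", "think": "V",
    # "closed" is a simplifying cover label for all non-major categories.
    "this": "closed", "then": "closed", "them": "closed", "thus": "closed",
}

for word, category in TOY_LEXICON.items():
    voicing = "voiceless" if initial_th_is_voiceless(category) else "voiced"
    print(f"{word}: {voicing}")
```

On this sketch the voiced th of them patterns with this and then rather than with thing, which is the phonological support the note offers for keeping personal pronouns out of the category N.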


There remains, of course, a range of closed categories, for example, coordinate conjunctions, that cannot be plausibly included in the categories of SP(X). It is to the properties of some of these that we now turn.

4.3. Disguised Lexical Categories

I begin by proposing that there exist significant closed classes of grammatical formatives that are simply subclasses of the head-of-phrase categories N, V, A, and P. I delay characterizing these subclasses formally until section 4.4. For expository purposes, we can consider them to be composed of the most frequently used and least semantically specific members of each lexical category. I will call these closed subclasses the "grammatical nouns, verbs, adjectives, and prepositions." As an example, the closed class of grammatical nouns in English appears to include one, self, people, thing, place, time, and way, and the grammatical verbs include be, have, get, do, go, come, make, let, want, say, and possibly a very few others. The exact limit on these classes is an empirical question that depends on properties brought out later in this chapter.

Roughly speaking, one property of grammatical as opposed to lexical heads that will be developed in section 4.6 is that certain types of transformational statements apply only to them. The result of these statements in a grammar is that the grammatical heads differ in distribution from "ordinary" N, V, A, P; such transformationally derived grammatical N, A, V, P I will term "disguised lexical categories." The realization that many closed categories are simply X in disguise (i.e., are simply the product of a theory which contains X = N, V, A, P and a certain range of transformational operations) reduces the task of limiting the types and properties of those closed categories which cannot be analyzed as X or SP(X).

Disguised N. The proform one, as in John's good ones, appears to be an N, though perhaps it may be required to be alone in N^1 (Jackendoff, 1977, sec. 4.1).
Helke (1973) argues persuasively that the category of the reflexive self (cf. selves) is also N, even though it, of course, has idiosyncratic properties. In Ch. 5, I will argue that the morphemes one, thing, place, time, body which appear in English composite pronouns (someone, anybody, noplace, etc.) are in fact deep structure N which undergo a process that tolerates only grammatical formatives (closed classes). Thus, these are also disguised grammatical N.

Disguised A. Many adverbials appear to be transparent cases of A, differing from adjectives principally in their syntactic placement. Very frequently, the adverbs are simply A, either without an ending (hard, fast, long, etc.) or with an -ly suffix. But other adverbs, for example the pair often/seldom, never appear in a noun-modifying position. Since they take the same specifier system as do adjectives (very seldom, how often, etc.), they should be analyzed as A with a defective distribution.

The noun modifiers other, same, different, and such are candidates for membership in A, even though they cannot be modified by the full SP(A) system. They typically appear after determiners and, parallel to other adjectives, do not permit ellipsis of the following noun in English:

(8) John has your books, and I have {some/six/his} myself.
    John has your books, and I have {some good/some different/the same} ones myself.
    *John has your books, and I have {some good/some different/the same} myself.

Finally, translation of these words into Spanish reveals that their grammatical gender alternates like that of adjectives (either -o/-a or no alternation) rather than like determiners (-e/-a).

A final class of disguised A are the quantifiers many, few, much, little, studied in Bresnan (1973) and many subsequent works. They are not typical A, since they permit ellipsis of N in patterns like (8); that is, they are probably outside N^1, like DET, rather than inside, like ordinary A. Nonetheless, E. Klein (1980) has presented good arguments that these "Q" and the "QP" they appear in must be considered A and AP respectively; that is, they are A that appear in the SP(N) outside of N^1.

Disguised P. Ch. 6 demonstrates that English post-verbal particles, although not in base PP positions, are in fact P that have been transformationally moved. I also argue there that the connective as is a preposition in its non-comparative uses, with the peculiarity that it introduces a predicate attribute rather than an object NP. (Hence, it is not recognized as a preposition in traditional grammar.) The focus of Ch. 7 is to establish that the morphological COMP are subordinating conjunctions (P), which differ from ordinary P in their choice of complement (S rather than NP) and in how they are inserted into the string. Hence, there are many disguised P.⁸

8. Another class of P often not recognized as such, but not "disguised" in the exact sense I am using here, are the sentence connectives besides, moreover, however (suppletive for despite and although), nevertheless, (even) so, therefore (suppletive for because), and then (in the sense of French puis). As argued in Ch. 6, PP's can be objectless. Moreover, PP's can appear as parentheticals in a wide range of cases; in fact they are the typical parenthetical. Finally, many parenthetical PP's are of a form that is rarely or never allowed in non-parenthetical position: to my knowledge, in my opinion, in so many words, without doubt, etc. Sentence connectives are thus the intransitive parenthetical PP's whose existence these properties suggest.

Disguised V. A particularly pervasive type of disguised lexical category is the "auxiliary" verb - i.e., a verb which has verbal inflection but appears in positions not typical of V. In section 4.6 I will discuss this phenomenon in detail, arguing among other things that there is no need for a feature "AUX", once the interplay of the concepts "V" and "non-lexical formative" is properly understood.

A very general type of closed category which is a disguised X (= N, A, V) is that which surfaces as a bound inflectional morpheme. For example, the Japanese passive and causative morphemes (cf. Hasegawa, 1981, and Morikawa, 1982) are V which are transformationally incorporated with the V they govern into a single surface compound V. Similarly, it has been argued that the English passive suffix -en realizes a deep structure A which is transformationally amalgamated with V.

A final class of transformationally disguised X are the "case categories" such as "accusative," "dative," etc. In section 1.8, I argued at some length that such categories, if they were "created" by a transformation-like principle, as is often assumed, would violate an otherwise completely general property of natural language grammars:

(9) Base Generation of Categories: All syntactic categories of a grammar appear in the base component.

From the general validity of (9) in a range of non-controversial cases, I concluded that (closed) categories such as "accusative" and "dative" are realizations of the case-assigning categories themselves (V and P respectively) in non-base positions for these categories, i.e., on the N of the governed object phrase. For example, an accusative noun is [N, V]. In section 1.8, I pointed out that this type of analysis of case features suggests excluding any cross-classifying in the base of the case-assigning categories V, P with the case-assigned categories N, A. Nonetheless, detailed arguments for grouping V and A such as those in van Riemsdijk (1983) may require that the case-marks V ("accusative"), P ("dative"), etc. be treated as indices on (rather than features of) the case-marked categories. Thus, an accusative adjective would be properly represented as A_V, a dative noun as N_P, etc. For ease of exposition I usually write [N, P] rather than N_P, but I do not mean to exclude a possible cross-classification of N, A and V, P. It is simply that I do not crucially utilize this cross-classification in the text. 9

9. It is often asserted that Japanese and Korean provide evidence for the V, A feature, since these two categories take finite endings in these languages. However, I have not seen this argument made where a be-deletion in the context ___ AP and attachment of TENSE to X ( = any lexical category) are taken as a serious alternative. There must be descriptive generalizations that favor one of these treatments over the other. Van Riemsdijk points out (pers. comm.) that some elements appear to be in a neutralized category, e.g., the example of near discussed in Maling (1983), which may be in a category that neutralizes the distinction between A and P. Near occurs with both SP(A) and SP(P), and assigns a θ-role directly.
(i) The boat is {right/very} near the shore.
I am inclined to place near in two distinct categories, P and A, rather than in a neutralized category, analogously to the way need and dare are both in V and in M. The adjective near would then take a goal PP whose head P is optionally empty:

4.4. Unique Syntactic Behavior of Closed Class Items

Now that some idea has been given of the range of items included in the closed categories, and the specific categories X or SP(X) of many of the English grammatical formatives have been determined, it is appropriate to establish more syntactic characteristics which differentiate them from the open lexical categories. One such property (7) has already been proposed; by virtue of this restriction, no base rule can expand a closed category node as a branching sequence of categories. Another such property, based on the definition (10), is given in (11).

(10) Definition: A feature with semantic content not used in any syntactic rule is called a (purely) semantic feature.

(11) Unique Syntactic Behavior: Lexical items in the closed categories cannot be differentiated from each other solely by purely semantic features.
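As an informal illustration of (10) and (11), the following Python sketch checks a toy lexicon for pairs of closed-class items that would be differentiated solely by purely semantic features. The class LexicalEntry, the function violations_of_11, all feature labels (DEM, PROXIMAL, DEGREE, SEM_1, and so on), and the nonce determiners blick and block are hypothetical and serve only the example; none of them is an analysis proposed in the text.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class LexicalEntry:
    form: str
    category: str                       # e.g. "SP(N)", "SP(A)", "A"
    closed: bool                        # True for closed categories
    syntactic: frozenset = frozenset()  # features usable in syntactic rules
    semantic: frozenset = frozenset()   # purely semantic features, as in (10)


def violations_of_11(lexicon):
    """Return pairs of closed-class items of one category that share all
    syntactic features yet differ in purely semantic features."""
    closed_items = [e for e in lexicon if e.closed]
    return [
        (a.form, b.form)
        for a, b in combinations(closed_items, 2)
        if a.category == b.category
        and a.syntactic == b.syntactic
        and a.semantic != b.semantic
    ]


# Hypothetical toy lexicon; every feature label is an assumption.
LEXICON = [
    LexicalEntry("this", "SP(N)", True, frozenset({"DEM", "PROXIMAL"})),
    LexicalEntry("that", "SP(N)", True, frozenset({"DEM"})),
    # Same syntactic features, no semantic features: free variation.
    LexicalEntry("quite", "SP(A)", True, frozenset({"DEGREE"})),
    LexicalEntry("rather", "SP(A)", True, frozenset({"DEGREE"})),
    # Nonce determiners distinguished only semantically: excluded by (11).
    LexicalEntry("blick", "SP(N)", True, frozenset({"DEF"}), frozenset({"SEM_1"})),
    LexicalEntry("block", "SP(N)", True, frozenset({"DEF"}), frozenset({"SEM_2"})),
    # Open-class items may differ by purely semantic features alone.
    LexicalEntry("crimson", "A", False, frozenset(), frozenset({"HUE_A"})),
    LexicalEntry("scarlet", "A", False, frozenset(), frozenset({"HUE_B"})),
]

print(violations_of_11(LEXICON))
```

In this toy setting only the nonce pair is flagged: this and that differ by a syntactic feature, quite and rather count as free variants, and the open-class adjectives fall outside the scope of (11).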

Principle (11) is equivalent to claiming that any two morphemes that appear in a given non-head position must differ by a syntactic feature (or by no feature at all, in which case they are in free variation). This implies that in non-head positions, a language has at most one grammatical formative for a given set of syntactic features, outside of free variation. Since a given set of syntactic features α generally induces syntactic behavior different from that associated with any other set α′, the grammatical behavior of the formative that realizes α will generally be unique; hence, the name I give to (11).

Let me begin discussion of (11) by reviewing what justifies the use of any syntactic feature or category. Syntactic categories have been defined by Chomsky (1965; 88, 153) as exactly those categories which appear in syntactic rules. By syntactic rules, I mean the statements that play a role in determining the well-formedness of deep structures, surface structures, and logical form, but I do not mean purely semantic or phonological information associated with lexical entries. Thus, when we find that a certain rule of syntax in a formally constrained and empirically enlightening syntactic description must be expressed in terms of a certain category, that category (or feature) is called "syntactic." Except for their use in theoretically based and relatively well worked out syntactic descriptions, we have no basis for calling a semantic or cognitive category "syntactic." Chomsky (1965, Ch. 4) emphasizes that this definition of syntactic features or categories does not imply that they play no role in the semantic component. It is possible that syntactic features/categories are even primitives in semantic interpretation. In the elaboration of syntactic theory, "semantic feature/category" does not mean "any feature used in the theory of semantics," but means rather a feature not used in syntax. Hence, the import of principle (11) (Unique Syntactic Behavior) is that every feature/category that appears with morphemes in closed classes is syntactic, i.e., can be used to state syntactic rules.

Footnote 9 (continued):
(ii) The boat is very near (to) the shore.
     The boat is nearer (to) the shore than I expected.
It then follows that a SP(P) is as odd with [A near] as with other A. This seems borne out by the low acceptability of (iii).
(iii) ?The boat is right near to the shore.
      The boat is right far from the shore.
The dual rather than neutralized status of near is confirmed by its predictable behavior under WH-fronting, which moves PP's but not AP's in relative clauses:
(iv) The town [PP (right) near where] we saw an accident has been destroyed.
     ?The town [AP very near where] we saw an accident has been destroyed.

Before considering the differences between the syntactic and semantic features of the lexical categories, let us examine this consequence of the principle of Unique Syntactic Behavior (11) for the non-head categories. Any two morphemes in such a category which differ in meaning necessarily differ in a syntactic feature, and thus, there can be a rule of syntax which treats them differently. To the extent that many such rules actually exist, it is Unique Syntactic Behavior (11) that predicts their existence. That is, (11) predicts that members of categories such as SP(X) or CONJUNCTION will all potentially behave pairwise differently in syntax. This prediction is borne out very well in English, keeping in mind that syntax here includes all rules involving not only deep and surface structure, but also logical form. Here are some examples of pairwise differences in syntactic behavior of DETERMINERS:

(12) The boys {all/each/*every/*some} came.
     {Some/each/all/*every} sat down.
     {All/*some/*each/*every} the boys sat down.
     {All/some/*each/*every} boys were polite.

Even though no rule of English clearly distinguishes between this and that, whatever feature distinguishes them surely distinguishes here and there also, and this feature is used in the syntax (i.e., there but not here appears in existentials, etc.). What is distinguished from which by its appearance with else and by its use in existentials (*Which was there in the house?). The enumeration of syntactic differences between all pairs of putative SP(N) in English can easily be continued (you deletes in imperatives and we does not; John's many vs. *John's some), but the point is established: it appears that the feature that distinguishes any two elements of SP(N) is syntactic, as predicted by (11). At the risk of tedium, the same exercise can be carried out for SP(A) and SP(V). For SP(A):

(13) John is {very, very/too, too/*less, less/*rather, rather} tired.
     too green, so green, vs. green enough
     {as/so/too/how/*very/*less/*most/*quite} big a man
     {so/*as/*too/*very} tired that he went to sleep
     {*so/as/*too/*very} tired as I was
     {*so/*as/too/*very} tired to go outside

If there are SP(A) which are syntactically identical (quite vs. rather), (11) requires that they cannot differ by a semantic feature; i.e., they are in free variation in any dialect where they co-exist.

Pairwise differences among the SP(V) are as follows: will, can, would, and could are the only SP(V) in subjectless imperative tags, the first two being -PAST and the latter two +PAST. Will and would but not can or could are subject to contraction with a preceding subject. Shall interacts with the person of the subject: *Shall you catch a cold? Ought irregularly takes a following to. Need and dare are "negative polarity" modals, unlike the others: *Someone need help me. Only may and dare (in my speech) don't permit negative contraction: *He daren't do that. Inverted may has a non-question interpretation, not shared, for example, with might.

So, in general, I conclude that morphemes outside the lexical categories typically exhibit unique syntactic behavior. Actually (11) predicts that they may, not that they must, exhibit such behavior. If two morphemes in a closed category have the same behavior, it means that the syntactic feature which distinguishes them fails to appear in any syntactic rule in the language, an unusual but not excluded situation. But the pervasiveness of this unique behavior, in contrast to its absence with typical members of the open categories, testifies to the importance of the distinction between the categories which tolerate purely semantic features and those which do not. 10

10. It is of interest to point out an implication of (11) with respect to method. As we have seen, (11) predicts that morphemes outside the lexical categories potentially have, and in fact typically have, pairwise distinct syntactic behavior. If an empiricist begins investigating such a category, say the English SP(V) = AUX, he must take some pattern in the data as definitional of the category (the operational test). Since surface patterns reflect syntactic rule operations, he will no doubt observe that some other surface test (reflecting the pairwise syntactic differences predicted by (11)) fails to confirm the extension of the category set up on the basis of the first test. Every new test will suggest instead extending or diminishing the extension of AUX, and there will be no reason (within the empiricist framework) for taking one test over another as definitional. Thus, he will finally reject the category AUX, for lack of any principled basis for selecting one or more of these parameters as definitional for a base category. (Huddleston, 1978, note 13, wherein the author lists 18 surface properties of English modals, none of which coincides exactly with the others.) The empiricist is of course finally led to deny the existence of syntactic categories (other than the major lexical categories). This is an old road rediscovered by too many linguists to enumerate.

In fact, principle (11), in conjunction with the fact that there are a rather modest number of syntactic features available for each SP(X) and X (perhaps of the same order as there are phonological features to distinguish among consonants and among vowels), essentially predicts why closed categories have limited membership (3). Unique Syntactic Behavior (11) simply says that the fixed store of syntactic features available for differentiating among, say, determiners cannot be augmented by purely semantic features in order to express more meaning differences. Hence, syntax sets an upper limit on the number of determiners, and no more can be added to the system. 11

Let us turn our attention now from the closed SP(X) categories to the closed subsets of lexical categories. Within a closed category, it appears that universal grammar furnishes a number of syntactic features which typically allow on the order of a dozen or two elements of that category to be distinguished. These elements are not further specified or subdivided by purely semantic features; indeed, it appears they cannot be. As is quite unremarkable, the syntactic features that subdivide the closed category in question play an important role in semantic interpretation.

What might we expect to happen within a lexical category? It seems to me plausible to expect similar behavior with one important distinction. Within a category like V, universal grammar furnishes a number of syntactic features which allow perhaps a dozen or two basic elements of that category to be distinguished. Since syntactic features, as always, can play a central role in semantic interpretation, the permitted combinations are simply matched with spellings (as in any closed category), and the dozen or two elements are fully functioning verbs. However, we know that within V, there are hundreds and hundreds of items besides these basic ones, and empirically we know that they are not distinguished solely by syntactic features (i.e., by features that play a role in syntactic rules).
Thus it seems that universal grammar, within the lexical categories only, permits spellings to be associated with complexes of syntactic features that are further differentiated by very large numbers of purely semantic specifications. But there is no reason to believe that the basic verbs, both given meaning and differentiated among themselves by syntactic features, could be made more complex by the addition of purely semantic features to their lexical entries.

11. One might object that there are numerals from 2 through 9, and 10 through 19, and a set of multiples of 10, that do not differ by syntactic features (i.e., 6 and 7 are only semantically different). I would gladly grant that the number system represents a mental creation that has a unique and exceptional relation to an otherwise regular syntax. However, other solutions suggest themselves: higher numerals are quite possibly adjectives (A), or nouns (N) as argued in Jackendoff (1977, sec. 5.5). Another solution is that 6 and 7 are not different semantically, but only by virtue of a non-linguistic system. That is, they are semantically in free variation, as allowed by (11).

Thus, I propose that there are two kinds of V:
(i) A small, fixed set of basic V pairwise differentiated by features which play a role in syntactic processes and which, at the same time, function centrally in semantic interpretation; moreover, the meanings of these V are not embellished or particularized by purely semantic information, at least in non-idiomatic cases.
(ii) A much larger set of V which are pairwise differentiated by semantic features that play no role in the syntax, which function less centrally in semantic interpretation, and which in many cases have very particularized meanings. Within this class, conscious coining is allowed.

Class (i) contains what I will call grammatical verbs, and class (ii) contains the open lexical classes. In class (i) are found the English verbs be, have, get, do, go, come, let, make and probably a few others such as want and say. Other verbs belong in the open class (ii). Formally:

(14) A lexical item of category X ( = N, V, A, P) which contains no purely semantic feature in its lexical entry is called a grammatical X.

(15) Since the number of syntactic features available to distinguish among X is assumed to be roughly the same as for any syntactic category, the categories of grammatical X are closed.

By (4), no coining is allowed in the closed classes of grammatical X°. There is, at this point, a discrepancy in what has been developed. In a lexical category, it seems that there will be a grammatical verb associated with a syntactic feature complex (e.g., get = [V, INCHOATIVE]; cf. Kimball, 1973), together with large numbers of open class verbs with the same two syntactic features, the open class items also having semantic specifications. The question is, will the principle of Unique Syntactic Behavior (11) have any effect comparable to the prediction that each SP(V) behaves differently from each other one in the syntax? Suppose a rule in some language affects the term [V, INCHOATIVE]; will the grammatical verb realizing this complex have a unique behavior, or will the rule in question apply to all inchoative verbs in the language? Nothing in what has been proposed so far will prevent the second alternative.

It seems that transformations operate only (i) on the class of verbs as a whole (e.g., V-fronting in Dutch and German), or (ii) on what I call grammatical verbs, the verbs which spell out certain combinations of syntactic features like INCHOATIVE, ACTIVITY, MOTIONAL, and which at least in their principal meaning or use are not further specified by a semantic feature. The semantic features that differentiate, say, possess from have or exist from be can never be used in a rule of syntax. Perhaps, but not necessarily, this feature is nothing else but a marking that the verb is lexical rather than grammatical. 12 There are no syntactic rules which operate on both the grammatical and lexical motional verbs, or on all inchoative verbs, or just on individual lexical verbs, or on the class of verbs with animate objects, or on some semantic class such as hurt, destroy, wreck, ruin, injure, damage, harm, etc. For example, no rule could front all and only stative verbs in questions. Some principle of universal grammar is required to account for this, and hopefully, the principle will have predictive power beyond being just a stipulation with only this effect. After a digression in section 4.5, I return to the formulation of such a principle. 13

12. That is, if there is a purely semantic feature +F which is associated with possess, exist, etc., this feature can never appear in a syntactic rule, in either its + or - value. Here I do not try to solve the question of whether all languages utilize exactly the same set of syntactic features. If not, there is a universal set from which languages choose only certain subsets. As discussed in section 4.1 above, I think it is likely that the same set of features appears across languages, with members having somewhat different distributions. For syntax, what these features are has been most thoroughly discussed in Jackendoff (1977) and, from a more semantic perspective, in Talmy (1978) and (1983). In phonology, this question has been interestingly addressed and perhaps in large part solved by Kean (1975).

13. It may be that generally applicable transformational rules cannot refer to an unrestricted X° either, but establishing this would be an ambitious task. Koopman (1984) argues that V-fronting in Dutch-German is an operation on TENSE rather than on V, and my review of verb movements in Ch. 3 follows this position. On the other hand, I take the frontings of V in Romance languages to be movements of V°. My view on verb-raising for Dutch infinitives is that the V1 which raise in final V1 - V2 sequences are those which are the head V in an S or a VP sister to V2, while those that do not raise are in an S̄. Whether this view can be sustained I am not sure; cf. the Second Appendix to Ch. 2. But in general, whenever it seems like a subclass of verbs undergoes a rule, I tend to think there is either a characteristic structure that the verbs all share, or that the question reduces to one of deep structure subcategorization. For example, the English verbs that appear with "raising-to-object" complements I have argued are exactly those whose deep structure subcategorization frames include + NP [S] (Emonds, 1980a). From this deep structure, the independently motivated transformational processes of extraposition and English "NP-α Inversion" derive the surface forms.

4.5. Suppletion as a Property of Closed Categories

However suppletion is to be described formally, it can be conceptualized as a richness of syntactically conditioned phonological specification for a given lexical item. The lexical entries for members of closed classes, being devoid of purely semantic information (11), tolerate a qualitatively richer specification of phonological/syntactic irregularity. In traditional terms, the following types of highly irregular items in the lexical categories are said to be "suppletive."

Verbs: go/went; be/are; Latin ferre 'to bring'/tuli 'brought'/latus 'brought'
Nouns: person/people
Adjectives: good/better/well

Ordinary irregular variants such as stand/stood; catch/caught; woman/women are not generally considered suppletive. In historical terms, suppletion can be defined trivially as "words in the same paradigm which descend from different roots," but this definition is of no help in characterizing suppletion as a synchronic lexical phenomenon of extremely limited range. Looking at the English and Latin examples above, a more appropriate first approximation of what suppletion is would be "words in the same paradigm which have different initial and medial consonants"; this makes no appeal to non-synchronic information.

However this definition is to be sharpened, what is of more interest here is how to capture the fact that suppletion is so rare, compared to the more widespread phenomenon of simple morphological irregularity. That is, we want to predict that while English allows semantically specific open class verbs such as arise, bleed, blow, buy, catch, creep, forbid, etc. to be irregular, suppletive variants of such semantically specified elements are excluded in principle. It seems inconceivable that the past stem of arise could be bose, that gloot could be the past of bleed, etc., analogous to be/are, good/better, etc. My purpose then is to refine the principles governing the closed lexical class items defined in section 4.4 in such a way as to make suppletion in the open lexical classes impossible. That is:

(16) Suppletion occurs only in the closed categories.

In order to define suppletion appropriately, a notion of regular and irregular variant must be clarified:

(17) Two words are regular inflectional variants if they result from applying the (regular) syntactic and phonological rules of a language to a unique lexical stem in two different structural contexts.

Thus, helps/helped are regular variants. Moreover, identical forms can be regular variants, by (17); thus, have/have result from applying the rules of English to have in the two contexts [TENSE, PRESENT, 1 PERSON, SINGULAR] and [TENSE, PRESENT, 1 PERSON, PLURAL].

(18) Two different words are irregular variants if they differ only and precisely in the same structural contexts as does a pair of regular variants, but they cannot be obtained from the syntactic and phonological rules of the language.

Thus, eats/ate are irregular variants, as are am/are. Again, a pair of identical forms can be irregular variants (e.g., put/put in the contexts PRESENT and PAST). For languages like English and Latin, suppletion can now be characterized by (19), though I do not claim that this definition is universal.

(19) Two irregular variants are suppletive if and only if they differ in some non-stem-final consonant cluster.
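Definition (19) is concrete enough to be sketched as a small procedure. The Python fragment below is only an illustration under stated assumptions: stems are given as lists of phoneme symbols in a rough ASCII transcription of my own, the vowel inventory VOWELS is a simplification, and the function names are hypothetical. The procedure presupposes that the two stems have already been established as irregular variants in the sense of (18).

```python
# Assumed, simplified vowel inventory for the rough transcriptions below.
VOWELS = frozenset({"a", "ae", "e", "i", "o", "u", "ow", "ey", "er"})


def non_final_clusters(stem):
    """Split a stem into maximal consonant clusters and drop the
    stem-final cluster, if the stem ends in a consonant."""
    clusters, current = [], []
    for phoneme in stem:
        if phoneme in VOWELS:
            if current:
                clusters.append(tuple(current))
                current = []
        else:
            current.append(phoneme)
    if current:
        # The stem ends in a consonant cluster: by (19) it is ignored.
        return clusters
    return clusters  # the stem ends in a vowel: every cluster is non-final


def is_suppletive(stem_a, stem_b):
    """Apply (19) to two stems assumed to be irregular variants."""
    return non_final_clusters(stem_a) != non_final_clusters(stem_b)


# Rough transcriptions (assumptions, not taken from the text).
STEMS = {
    "go": ["g", "ow"], "went": ["w", "e", "n", "t"],
    "be": ["b", "i"], "are": ["a", "r"],
    "stand": ["s", "t", "ae", "n", "d"], "stood": ["s", "t", "u", "d"],
    "catch": ["k", "ae", "ch"], "caught": ["k", "o", "t"],
    "person": ["p", "er", "s", "e", "n"], "people": ["p", "i", "p", "e", "l"],
}

for a, b in [("go", "went"), ("be", "are"), ("person", "people"),
             ("stand", "stood"), ("catch", "caught")]:
    print(a, b, is_suppletive(STEMS[a], STEMS[b]))
```

Under these transcriptions the sketch separates go/went, be/are, and person/people from stand/stood and catch/caught, which differ only in the vowel and the stem-final cluster. The result depends entirely on the transcription chosen, so it should be read as an illustration of how (19) partitions the examples, not as a phonological analysis.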

The characterization of grammatical verbs as a closed subset of the least semantically specified verbs seems to suggest that it is exactly among these verbs that suppletion is allowed. That is, while verbs like be, go, take, get, etc. may be suppletive, verbs like boil, cough, chase, arise, bleed, blow, etc. cannot be. An empirical consequence of (16) and (3) taken together is that a language contains a maximum of about two dozen suppletive verbs, a clearly testable proposal. From (16) and (4) taken together, it also follows that suppletive variants cannot be coined. 14 Having now examined some typical closed class behavior of the grammatical V, N, and A, let us return to the question posed at the end of section 4.4: why do syntactic rules not seem to affect subclasses of X° which include both grammatical and lexical verbs?

14. Depending on how we read the phrase "structural contexts" in (17) and (18), we can extend the notion of suppletion to many closed categories that are not in the lexical categories. For example, suppose we have the contexts "all ___ are nice" and "all ___ is nice." From (17), bread/breads are regular inflectional variants. By (18), that/those are irregular variants. Similarly, suppose the contexts are "___ bothered him" and "destroying ___ bothers him." From (17), food/food are regular variants, and by (18) I/me are irregular variants. By (19) I/me are also suppletive. Such an extension of the notion of suppletion to all closed categories seems appropriate, but nothing in what follows depends on this extension.

4.6. The Designation Convention and the Epiphenomenon of Auxiliary Verbs

The distinction between grammatical and lexical (open class) verbs interacts in an interesting way with a general principle of universal grammar to explain a seemingly wide range of "irregular" behavior in the category V. The purpose of this section is to develop this principle, the "Designation Convention." A subcategory of a grammatical category consists of all the lexical items in that category which have a certain feature. Thus, all activity verbs (e.g., do, eat, repaint, certify, etc.) make up a subcategory of verbs notated as [V, ACTIVITY], and stative verbs (e.g., be, own, resemble, exude, etc.) are not in this category. With this definition, I can state the principle which I will be arguing for in the rest of this chapter.

(20) Designation Convention: A rule of syntax (whether of insertion, movement, deletion, or filtering) stated in terms of a subcategory of a head of a phrase [X°, F], where F is a syntactic feature, cannot affect any lexical items under X° associated with a purely semantic feature.

In particular, such a rule cannot affect an open class item. We can equivalently call the grammatical representative(s) of [X°, F] its designated element(s), in the sense of Chomsky (1965, Ch. 3, 144-145).
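The effect of the Designation Convention (20) on rule application can be pictured with a short Python sketch. It is a toy model under assumptions of my own: the class Verb, the function designated_elements, and the purely semantic labels (ON_FOOT, RED, POSSESSION) are hypothetical, and the assignment of one syntactic feature per verb is a simplification. Only get = [V, INCHOATIVE], the motional status of go, and the stative status of be and own follow the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Verb:
    form: str
    syntactic: frozenset                # syntactic features besides V
    semantic: frozenset = frozenset()   # purely semantic features

    @property
    def is_grammatical(self):
        # Definition (14): no purely semantic feature in the entry.
        return not self.semantic


def designated_elements(feature, lexicon):
    """Verbs that a rule stated on [V, feature] may affect under (20):
    those bearing the feature and no purely semantic feature."""
    return [v.form for v in lexicon
            if feature in v.syntactic and v.is_grammatical]


# Toy lexicon; the semantic labels are hypothetical placeholders.
LEXICON = [
    Verb("go", frozenset({"MOTIONAL"})),
    Verb("walk", frozenset({"MOTIONAL"}), frozenset({"ON_FOOT"})),
    Verb("get", frozenset({"INCHOATIVE"})),
    Verb("redden", frozenset({"INCHOATIVE"}), frozenset({"RED"})),
    Verb("be", frozenset({"STATIVE"})),
    Verb("own", frozenset({"STATIVE"}), frozenset({"POSSESSION"})),
]

print(designated_elements("MOTIONAL", LEXICON))
print(designated_elements("INCHOATIVE", LEXICON))
```

In this model a rule mentioning MOTIONAL can reach go but never walk, however the open class grows, because any coined verb carries at least one purely semantic feature.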

We can immediately see that the first correct prediction of the Designation Convention is that no rule could, for example, front all and only semantically inchoative verbs, or morphologically combine all and only semantically motional verbs with the head of their complements. That is, no syntactic rule can affect proper subclasses of verbs which include open class items. If a syntactic rule affects [V, M O T I O N A L ] , it will, by the Designation Convention, affect only go or possibly come, go (depending on a further determination of how syntactic features are to be distributed). I return below to the details of how Unique Syntactic Behavior of the grammatical verbs is guaranteed by (20). The Designation Convention also makes precise an aspect of the Recoverability Condition (Chomsky, 1965, Ch. 3). One of the consequences of (20), when applied to deletion rules, is that only designated elements can be deleted. Moreover, the notion of "designated element" is now clearly defined; such an element is one with no purely semantic feature. Thus, syntactic deletions of verbs like be, do, go, etc., and of nouns like self, one, people, etc. are possible in specified contexts, but deletions of arbitrary open class lexical items are not. A third desirable consequence of the Designation Convention is that it allows suppletion to be assimilated to other language-particular rules with target predicates of the form [X°, F], In section 4.5, suppletion was treated as a phenomenon outside the scope of the regular rules of grammar. However, by means of the Designation Convention (20), suppletion can be correctly predicted to be limited to the grammatical X. As an example, consider the past tense of English go. We can say that there is a local transformational rule of English that substitutes, or in some way associates, went with the syntactic feature complex [V, M O T I O N A L ] in the context PAST. 
By the Designation Convention, the rule can apply only to the grammatical formative with these features, and not to open class verbs; that is, it can apply only to go. Since suppletion rules typically are transformational and local, they will always apply to grammatical and not to open class X, so suppletion is not a separate property stipulated for grammatical X, but is rather a further justification of (20). To my mind, the most interesting aspect of the Designation Convention is the clarification it allows in the analysis of the catch-all category of "auxiliary verb." An explanatory characterization of the difference between auxiliary and main verbs has eluded generative grammarians. It is of no interest to postulate two types of verbs by means of an abstract, morphologically inoperative feature such as + AUXILIARY if no general predictions can be made as to what sorts of base, transformational, or lexical rules may apply to one but not the other of the two types. Even under van Riemsdijk's relatively weak requirement that categories outside the bar notation must be morphosyntactic (cf. section 4.1), the English auxiliaries could be said to constitute a category only by a nearly vacuous interpretation of "morphologically non-neutralized." In the rest of this section, I want to show that a complete analysis of

174

A unified theory of syntactic

categories

these elements eliminates the feature AUXILIARY from the class of syntactic features that can accompany a verb and be referred to in a syntactic rule. Throughout, I adhere crucially to the definition in Chomsky (1965) discussed in section 4.4 by which a "syntactic feature" is precisely a category of universal grammar that may appear in a syntactic rule. 1 5 The conclusion that the "auxiliary" verb is epiphenomenal does not mean that what traditional grammar has called auxiliary verb is not precisely identifiable; the contrary will be shown to be the case. An initial clarifying step is to recognize that in discussions of English, two quite different types of elements have been called auxiliaries. The modal auxiliaries (will, would, can, could, may, might, shall, should, must, ought, need, and dare in American English) lack any inflection and prohibit it on the following verb, while the conjugated auxiliaries (do, have, be, get, go, and come in American English) are fully inflected main verbs which also appear in certain syntactic contexts where n o productive or even semi-productive class of verbs is allowed. Typically, in fact, the syntactic distribution of each of the conjugated auxiliaries differs from that of the others. My focus here is on the conjugated auxiliaries. F o r the English modals, I build on and add to my previous analysis (Emonds, 1976, Ch. 6). As in that work, English modal auxiliaries are realizations of the specifier of the verb, perhaps misleadingly termed AUX. The category. VERB and the category SPECIFIER(V) ( = AUX), the latter comprising the modals and the finite verbal suffixes, share no syntactic feature and undergo no rules in common. In order to elucidate the nature of the conjugated auxiliaries, let us examine the various overlapping criteria that identify them. Modern English verbs which invert with the subject in questions (have, be, do), including sentences where be is the main verb, are termed auxiliaries. 
So are those followed by past participles (have, be, get). For French, various arguments can be given that the causative faire "do, make" is an auxiliary (Gross, 1968); the French verbs aller "go" and venir (de) "come" are also temporal auxiliaries. Finally, the English verbs come and go appear as "inflectionless" auxiliaries in he will come/go visit Sam, they come/go visit Sam, *he come(s)/go(es) visit Sam.

A preliminary characterization of a conjugated auxiliary verb is therefore as follows: A verb is an auxiliary if and only if it occurs in some syntactic position where main verbs do not occur, or if it cannot appear in some position where main verbs do occur (deep structure subcategorization for phrasal complements permitting). In the terminology of section 4.3, a verb is an auxiliary if and only if it is "disguised."

15. An obvious misunderstanding is to be avoided: syntactic features are themselves subject to semantic or at least pragmatic interpretation. The point is rather, as in Chomsky (1965, Ch. 2), that semantic features can have no syntactic effect.

Or: A verb is an auxiliary if and only if some transformational rule moves or inserts it into some non-verbal position, or deletes it, or prohibits ("filters") it in some verbal position.

By the Designation Convention (20), a transformational rule moves, inserts, deletes, or filters a verb in a non-verbal position if and only if the rule has [V, F] as a target predicate, in which case it will affect only a grammatical verb. Thus, an auxiliary verb is a grammatical verb (one without a purely semantic feature) whose characterizing syntactic feature F is one that appears in a language-particular transformational operation.¹⁶ So, in the light of the Designation Convention, the problem of describing English conjugated auxiliaries is re-focused as the problem of specifying the syntactic rules of English which contain terms of the form [V, F]. Incidentally, it might well be that these rules mention only F (e.g., INCHOATIVE, MOTIONAL, etc.), since in a rather obvious way the mention of V with these features is redundant.

That such language-particular rules apply to terms of the form [V, F] is no different than, and no more surprising than, the fact that there are similar rules that apply to [SP(X), F]. For example, verbal affix movement in French applies to all SP(V) (the future and the conditional are tense suffixes); in English, it applies only to [SP(V), -MODAL] (the future and the conditional will/would don't become suffixes).

Summarizing, the conjugated auxiliary verbs for a given language are just that subset of grammatical verbs whose features are target predicates in the syntactic rules of the language. While this characterizes auxiliaries precisely, such a collection clearly has no theoretical status; it is thoroughly epiphenomenal.

A final consequence of the Designation Convention is that no two conjugated auxiliaries are expected to have exactly the same behavior under transformation.
(Identical behavior can be expressed, but is marked.) There is no feature AUX encompassing all grammatical verbs for rules to refer to.¹⁷ Rather, each grammatical verb Vᵢ is the unmarked representative of a subcategory F of V; if some rule in the language-particular grammar mentions F, then Vᵢ will, all else being equal, behave unlike other auxiliaries.

16. In Ch. 3, I have discussed the restrictions that the theory of grammar imposes on such rules; as a result, almost all rules that affect auxiliaries are local transformations.

17. Some language might have only one rule with a target predicate of the form [V, F] for some syntactic feature F. For such a language, the grammatical verb(s) with the feature F would be co-extensive with what traditional grammar would recognize as auxiliary verbs.

This prediction seems correct. The uniqueness of English do requires no comment. Moreover, a number of studies have brought to light differences in the syntactic combinations of be + present participle, be + past participle, and have + past participle. Another obvious difference between have and be is that have cannot take the present participle. In particular, the past participle in the passive exhibits certain adjectival properties quite foreign to the past participle of the composed past tense (the "perfect"); cf. Kayne (1975, Ch. 5). Be and have also become part of the SP(V) in English under quite different conditions, while get is never so attached. Still other ways in which the Spanish past participle behaves differently in the passive and the composed past have been pointed out to me by C. Otero and E. Torrego; that is, there are various constructional differences between the Spanish auxiliary haber "have" and the auxiliary ser "be".

The unique behavior of the conjugated auxiliaries is not a disadvantage for the present framework, but rather a predicted consequence of the Designation Convention. One might object to the Convention by saying that it just masks the irregular behavior of certain verbs, but this objection would be operating with an a priori and atheoretical notion of "regular." No one expects the English plural morpheme or the Latin dative morpheme or English not or French dont "of which" to share the syntactic behavior of other morphemes, and no one calls the unique syntactic behavior of each of these "irregular." The Designation Convention is simply the theoretical statement of where we may expect the line between syntactic morphemes of unique behavior and those of non-unique behavior to fall: the line will fall between those grammatical head-of-phrase morphemes representing categories mentioned in syntactic rules, and other grammatical and all lexical members of head-of-phrase categories. Regarding verbs, the former group comprises what I have called in section 4.3 the "disguised verbs," and corresponds precisely to what traditional grammar calls auxiliaries.
Thus, the Designation Convention, a natural extension of Chomsky's Recoverability Condition, explains why open subclasses of X⁰ categories don't undergo transformational rules, reduces suppletion to typical language-particular transformational behavior, and explains the seemingly irregular nature of certain members of the grammatical X⁰: their behavior is regular, and due to the presence of rules in the language with terms of the form [X⁰, F]. In the next section, I will show how the Designation Convention interacts with and highlights the special character of the small class of grammatical X, further justifying separating them from the open category lexical X.

4.7. Late Lexical Insertion

The properties of the members of closed categories that have been discussed so far can be summarized as follows. Grammatical X (X = N, V, A, P), unlike open-class X, appear in the lexicon with no purely semantic features. As a result, grammatical X can be differentiated from each other only by the small number of syntactic features, and so are few in number and (outside of idioms) lacking in particularized semantic specification. Grammatical X, like other members of closed categories, cannot be consciously coined but they can be suppletive. Because of the Designation Convention, only grammatical and not open class X undergo rules, including those of suppletion, which apply to the target predicates [X, F] (F a syntactic feature).

In this section, I want to establish a further property of grammatical formatives: only such formatives satisfy insertion contexts which are produced by transformations. More specifically, certain grammatical formatives of category C are inserted after at least some transformations have applied on the smallest cyclic domain which contains C. Such insertion I will call "post-transformational," in contrast to "pre-transformational insertion," whereby a morpheme is inserted under a category C before transformations apply in the cyclic domain containing C. Pre-transformational insertion covers all open class items and presumably many closed class items as well; it is justified by the fact that the insertion contexts for such items are satisfied only in structures which exist prior to certain transformational operations; cf. (58), Ch. 2. Pre-transformational insertion encompasses both the familiar deep structure insertion of lexical items, as in Chomsky (1965, Ch. 2), and the cyclic lexical insertion that I have argued here must replace it (Ch. 2, 58), and which is also proposed in Evers (1975).

I do not pretend to completely determine how grammatical theory orders the post-transformational insertions under investigation with respect to other language-particular transformational rules. Since these insertions are simply spellings of unique complexes of syntactic features, they might be purely phonological. Among other possibilities, post-transformational insertions might be interspersed among other language-particular transformations or apply in a block at the end of a cycle.
In any case, what I want to establish here is that some representatives of all closed classes, including the grammatical X⁰, are inserted into contexts which are produced only after certain movement rules, especially NP movements, apply.

(21) Late Lexical Insertion: If a morpheme M inserted in a cyclic domain D has a contextual insertion feature that must be satisfied after (rather than before) transformations apply in D, then M is in a closed category.

I will also argue that some members of closed categories must be inserted post-transformationally.

Before discussing individual instances of post-transformational insertion, two general points about insertion of lexical items need to be made. The first is that the Designation Convention (20), under a quite plausible reading, can in fact predict the property we are interested in in this section, Late Lexical Insertion (21). Assume that the insertion of an open class item is a rule of insertion which applies to a head [X⁰, F]; e.g., let us say that a typical verb like paint is inserted by rule into the feature complex [V, ACTIVITY]. This is not a "rule of syntax." If we take "rule of syntax" in the Designation Convention to mean "cyclic or post-cyclic rule," then the Convention correctly predicts that if an item is of an open class, it must be inserted at deep structure or at the beginning of each cycle, exactly as desired. Moreover, by the contrapositive of the preceding statement, if an item is inserted according to a context created by transformations, it must be in a closed category; this is Late Lexical Insertion (21).

A second general point is that all lexical insertion, pre- or post-transformational, is subject to the following condition, which is a generalization of Chomsky's (1965) concept of a deep structure subcategorization feature:

(22) A contextual subcategorization feature + Xᵏ of a morpheme α is satisfied only by an Xᵏ which dominates a terminal element at the level at which α is inserted, unless Xᵏ is further stipulated as (possibly) empty by the feature in question.

Thus, items mentioned in subcategorization are quite unlike the empty nodes which are induced by, but not mentioned in, subcategorization features, as discussed in section 1.7. Applied to complements of heads, (22) means that arguments must be lexical, and not traces, at the level of insertion (i.e., at deep structure, for open class items).¹⁸ (22) is stated in such a way that it applies to either pre- or post-transformational insertion. One important justification for it is that it prevents traces from arising "accidentally" in deep structures, through free indexing of NP's, in positions where they violate subjacency. Of course, the idea that subcategorized complements of lexical heads are not empty when the latter are inserted in deep structure is a familiar characteristic of the extended standard theory of transformational grammar.

Given (22), it is necessary to maintain that an insertion frame, of a passivizable verb for example, determines well-formedness "at a level," and not simply anywhere in a derivation. The Projection Principle of Chomsky (1981, Ch. 2) thus does not in itself predict (22) as a consequence, even though it is consistent with (22).

It has been maintained (Otero, 1976) that all lexical insertion is at s-structure, and hence post-transformational. This implies that empty Xᵏ, including traces of movement, can satisfy subcategorization requirements. If we accepted this view, we would lose the ability to exclude traces which arise by base-generation of empty categories and which violate subjacency, as well as the explanation of (21) by means of the Designation Convention. But if one nonetheless holds to inserting all lexical items uniformly at s-structure, my arguments will establish that there is in any case a subtype of insertion of closed class items which is subject to (22). I refer to this type of insertion here as post-transformational.

18. A trace from the lexicon, if such exist (cf. Keyser and Roeper, 1984), cannot appear in the syntax, if (22) is correct. If there are phonologically empty pronominals (e.g., bound by clitics) in deep structure other than those resulting from obligatory control, they must involve syntactic terminal elements, or else (22) must be appropriately weakened. The empty NP's of obligatory control are discussed in the section on induced empty nodes in Ch. 2, and require no modification in (22).

I now will justify post-transformational insertion of a number of grammatical formatives in closed categories such as SP(X) and P. One large class of elements which are inserted post-transformationally are the inflectional morphemes. (The following chapter argues that all inflections result from some transformational operation on a base structure.) However, there are other moved elements of the same category as inflections which are not bound morphemes. For example, in (23), the SP(V) will and ed have both been moved, but only ed is inflectional.

(23) (a) Will John burn the paper?
     (b) John burned the paper.

A pervasive difference between moved inflectional morphemes and moved free morphemes is that the latter also appear unchanged in their deep structure positions:

(24) (a) John will burn the paper.
     (b) *John ed burn the paper.
     (c) cf. John did burn the paper.

A similar point can be made by comparing the SP(A) in (25)-(26):

(25) (a) John has bread enough. John is tall enough.
     (b) John is prouder than me.

It is widely accepted that the italicized elements in (25) are derived from deep structure sequences as in (26):

(26) (a) John has enough bread.
     (b) *John is er proud than me.
     (c) cf. John is more proud than me.

Accepting the argument of Ch. 5 that inflectional morphemes represent transformationally derived structures, this difference between moved free morphemes of category SP(X) (e.g., will, enough) and inflectional SP(X) (e.g., -ed, -er) suggests that the former are inserted in their deep structure positions and then moved, while the latter are inserted only into a context created by a transformation, i.e., into X⁰. Thus, the grammatical formatives in question have lexical listings as follows (throughout, -TENSE = +MODAL):

(27) will, SP(V), -TENSE, -PAST
     ed, SP(V), +TENSE, +PAST, + V
     enough, SP(A), . . . (no contextual feature) . . .
     more, SP(A), +COMPARATIVE, +POSITIVE, . . . (no contextual features) . . .
     er, SP(A), +COMPARATIVE, +POSITIVE, + A¹⁹
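The listings in (27) can be pictured as a small data structure. The following Python sketch is only an illustration under stated assumptions: the class name `LexicalEntry`, the field `context`, and the helper functions are hypothetical and not part of the text. Each entry pairs a form with its category, its syntactic features, and an optional contextual insertion feature; following the reasoning above, an entry that carries a contextual feature (ed, er) is treated as requiring a transformationally derived context. The unspecified features of enough (elided as ". . ." in (27)) are left empty rather than guessed.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class LexicalEntry:
    """One listing from (27); the field names are illustrative assumptions."""
    form: str
    category: str
    features: FrozenSet[str]
    context: Optional[str] = None  # contextual insertion feature; None = none listed


# The five listings of (27). The features of "enough" are elided in the
# source, so none are filled in here.
LEXICON: List[LexicalEntry] = [
    LexicalEntry("will", "SP(V)", frozenset({"-TENSE", "-PAST"})),
    LexicalEntry("ed", "SP(V)", frozenset({"+TENSE", "+PAST"}), context="V"),
    LexicalEntry("enough", "SP(A)", frozenset()),
    LexicalEntry("more", "SP(A)", frozenset({"+COMPARATIVE", "+POSITIVE"})),
    LexicalEntry("er", "SP(A)", frozenset({"+COMPARATIVE", "+POSITIVE"}), context="A"),
]


def needs_derived_context(entry: LexicalEntry) -> bool:
    """An entry with a contextual feature is inserted only into a derived context."""
    return entry.context is not None


def late_inserted_forms(lexicon: List[LexicalEntry]) -> List[str]:
    """Forms whose listing carries a contextual insertion feature."""
    return [e.form for e in lexicon if needs_derived_context(e)]


def free_counterparts(entry: LexicalEntry, lexicon: List[LexicalEntry]) -> List[str]:
    """Forms with the same category and features but no contextual feature."""
    return [
        e.form
        for e in lexicon
        if e is not entry
        and e.category == entry.category
        and e.features == entry.features
        and e.context is None
    ]


if __name__ == "__main__":
    print(late_inserted_forms(LEXICON))
    print(free_counterparts(LEXICON[4], LEXICON))
```

On this representation, more and er differ only in the contextual feature, which mirrors the contrast between (25b) and (26c).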

I have not taken up here several aspects of inflection, such as the exact nature of the language-particular rule(s) that allow it. (This topic is discussed more in Ch. 5.) Here, it suffices to note that the theory of grammar must guarantee something like (28):

(28) Complementary Distribution: If no contextual insertion frame for α contains a morpheme category (such as X⁰ or SP(X)), then α cannot be inserted as a bound morpheme.

That is, an ending like -ed must be a bound morpheme, and will can never be. However these general conditions on inflection are to be worked out, the insertion contexts for inflections, as represented in (27), do capture the generalization that these morphemes appear only in transformationally derived contexts, and thus support the idea that such morphemes are inserted post-transformationally.

In a similar vein, morphological realizations of abstract case features, often assimilated to inflection in traditional grammar, depend on post-transformational surface contexts. At the very least, the nominative case morphemes appear with NP's that have undergone movement from non-nominative positions in passive and raising constructions. In English, the proper context for the insertion of the subject pronouns I, he, she, we and they clearly can be determined only after NP movements.

There is also ample justification for inserting a wide range of free (non-inflectional) grammatical morphemes in post-transformationally determined contexts. I begin with two transparent examples, one from English and one from French, and then discuss some more complex cases. In English, there is a grammatical formative category, call it K, associated with coordination, which means roughly also and is realized as too in affirmatives and either in negatives.

(29) Mary will leave town, and John will (leave town) too.
     Mary won't leave town, and John won't (leave town) either.

19. This contextual feature for the comparative inflection must also include a specification that A can be comprised of at most one initially stressed phonological foot. This type of information makes it even clearer that the insertion context is post-transformational.

In root contexts, [K, -NEGATIVE] can be preposed into COMP, and is there realized as so:

(30) Mary will leave town, and {so will John / *too will John}.
     Mary won't leave town, and neither will John.

Thus, the insertion context for this use of so is a K which has been moved into COMP, while an unmoved K is too.

The French morpheme dont "of which, of whom" appears only in relative clauses in COMP position. Its counterpart in other WH-fronting contexts is de "of" followed by a WH-word (e.g., quoi "what", qui "who", lequel "which"). These WH-words have the status of NP's and occur in other NP positions, whereas dont appears only as a substitute for a fronted phrase [COMP de + NP] in relative clauses. Thus, dont is also inserted in a context which is determined post-transformationally.

An extensive system of free morphemes which are inserted in post-transformational contexts consists of the pairs of "polarity items" such as the SP(N) some/any, the disguised grammatical A often/seldom, the K morphemes too/either, the adverbials still/yet, etc. In a convincing treatment, Monaghan (1981) shows that the classic transformational analysis of these elements by Klima (1964a), with relatively few revisions, escapes the criticisms of Jackendoff (1972) and is superior to his purely lexical system. The pivotal innovation in her revision is that the morphemes involved are inserted after certain syntactic categories (= feature complexes) are moved in the transformational component. In her view, several negative morphemes (not, no, etc.) are also inserted after the transformational movement of NEG; this is of course consistent with my general claim that Late Lexical Insertion is characteristic of many closed class items. Some examples of the Klima-Monaghan system:

(31) Two deep structure options, one with NEG and one without:
     (a) John can (NEG) [A, FREQUENCY] visit us.
     Third option: NEG movement:
     (b) John can [A, FREQUENCY, NEG] visit us.

(32) The three realizations of (31) and two non-generable strings:
     (a) John can(not) often visit us.
     (b) John can seldom visit us.
     (c) *John cannot seldom visit us.
         *John can't seldom visit us.

(33) Two deep structure options, one with NEG and one without:²⁰
     (a) We have (NEG) visited [SP(N), QUANT, INDEF] friends.
     Two passive options for (a):
     (b) [SP(N), QUANT, INDEF] friends have (NEG) been visited (by us).
     Fifth option: NEG movement to SP(N) in the object:
     (c) We have visited [SP(N), QUANT, INDEF, NEG] friends.
     Sixth option: NEG movement to SP(N) in the subject, rendered obligatory by a filter against any in certain subjects:
     (d) [SP(N), QUANT, INDEF, NEG] friends have been visited (by us).

(34) Five realizations of (33) and four non-generable strings:
     (a) We have visited some friends.
         We have not visited any friends.
     (b) Some friends have been visited (by us).
         *Any friends have not been visited (by us). (Excluded by an independent filter; cf. *Any friends have visited us.)
     (c) We have visited no friends.
     (d) No friends have been visited (by us).
     (e) *We have visited any friends.
         *We have not visited no friends.
         *No friends have not been visited.

20. The system used by Monaghan retains the some/any alternations originally proposed as transformational by Klima. Some could be dropped from the system and be generated freely, and the argument here for post-transformational insertion would remain valid. It would then be based on alternations of the determiners no and not ... any, and the unacceptability of sequences like *any friends have not visited, *we have not visited no friends, and *no friends have not visited.
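The logic of (31)-(34) can be made concrete with a short Python sketch. It is a simplified illustration, not a statement of Klima's or Monaghan's actual rules: the function names, the boolean encoding of NEG and of its movement, and the fixed sentence frames are all assumptions introduced here. What it is meant to show is the ordering: NEG movement applies first, and the polarity morphemes (often/seldom, some/any/no, not) are spelled out only on the resulting configuration.

```python
from typing import List, Optional, Tuple

# Hypothetical encoding: a derivation is (NEG present in deep structure,
# NEG moved into the target feature bundle).
OPTIONS: List[Tuple[bool, bool]] = [(False, False), (True, False), (True, True)]


def move_neg(clause_neg: bool, move: bool) -> Tuple[bool, bool]:
    """Optional NEG movement.

    Returns (NEG remaining on the auxiliary, NEG inside the target bundle).
    """
    if move and not clause_neg:
        raise ValueError("NEG cannot be moved if it is absent from deep structure")
    return (clause_neg and not move, move)


def spell_frequency(clause_neg: bool, move: bool) -> str:
    """Spell out (31): [A, FREQUENCY] is often, [A, FREQUENCY, NEG] is seldom."""
    aux_neg, a_neg = move_neg(clause_neg, move)
    adverb = "seldom" if a_neg else "often"
    aux = "cannot" if aux_neg else "can"
    return f"John {aux} {adverb} visit us."


def spell_determiner(aux_neg: bool, det_neg: bool) -> str:
    """Spell out [SP(N), QUANT, INDEF] after movement: no, any, or some."""
    if det_neg:
        return "no"
    return "any" if aux_neg else "some"


def spell_quantified(clause_neg: bool, move: bool, passive: bool) -> Optional[str]:
    """Spell out (33); returns None for the string excluded by the filter on any."""
    aux_neg, det_neg = move_neg(clause_neg, move)
    det = spell_determiner(aux_neg, det_neg)
    if passive and det == "any":
        return None  # filter against any in certain subjects
    neg = " not" if aux_neg else ""
    if passive:
        return f"{det.capitalize()} friends have{neg} been visited."
    return f"We have{neg} visited {det} friends."


if __name__ == "__main__":
    for neg_present, moved in OPTIONS:
        print(spell_frequency(neg_present, moved))
    for is_passive in (False, True):
        for neg_present, moved in OPTIONS:
            print(spell_quantified(neg_present, moved, is_passive))
```

Because a moved NEG leaves nothing behind on the auxiliary in this encoding, strings such as *John cannot seldom visit us and *We have not visited no friends are never produced, and the three derivations times two voices yield the five realizations of (34).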

The basic workability of this type of analysis for polarity items confirms that a wide variety of closed categories in syntax have members which are inserted post-transformationally.

A somewhat different type of argument for late lexical insertion of a SP(N) can be given for the English universal quantifier every. The SP(N) every differs from its closest syntactic relatives each, all, and both (the other universal quantifiers) in apparently two ways: (i) every cannot appear in a "floating" position, separated from its head:

(35) My friends have {all, both, each, *every} left the city.
     I gave them {all, both, each, *every} some money.

(ii) every cannot appear alone as an NP, as a case of "N-anaphora":

(36) If you want the phone numbers, {all, both, each, *every} may be found in the directory.

These diverse facts can be succinctly expressed by a restriction in the lexical entry for every: every, SP(N), + N.²¹ But this restriction cannot simply mean "followed by an N in the tree," since every DET is so followed. The restriction becomes meaningful, and furthermore expresses all the restrictions in (35) and (36), only if it is a subcategorization feature requiring that every must be followed by an N whose head is a terminal element, as ensured by (22). If the rule which "floats" quantifiers out of an NP as in (35) is a transformation, then the feature + N for every describes the restriction in (35) only if it refers to a level after this transformation applies; that is, every should be inserted post-transformationally.

21. The requirement for every is that N rather than N̄ must be lexically filled: *every of the boys got sick. In Ch. 5, a convention will be introduced whereby the head of X̄ always counts as adjacent to material which in a given tree is only adjacent to X̄. Thus: every young boy got sick vs. *every young of the boys got sick.

I now turn to some P which I claim are inserted post-transformationally. The theory of θ-role assignment of Ch. 1 requires the deep structure subcategorization features + NP for a noun like destruction and + NP NP for a verb like send. These features, as discussed in section 1.7, induce empty deep structure P as in (37):

(37) (a) [N destruction] [PP [P ∅] [NP Rome]]
     (b) [V send] [NP books] [PP [P ∅] [NP John]]

These P are filled by the rules of (38), which apply obligatorily by virtue of the fact that empty nodes in s-structure would be ill-formed.

(38) (a) P ⇒ to / ___ [NP, P] (P = dative case-mark, as in section 1.8)
     (b) P ⇒ of / ___ [NP, SP(N)] (SP(N) = genitive case-mark, as in section 1.8)

In English, the NP objects of the empty P in (37) can be moved; in (37a) to the possessive position (e.g., the city's destruction) and in (37b) to either "dative position" (e.g., send John books) or subject position (e.g., John was sent a book). When the object of an empty P is so moved, the P typically has no surface reflex.²²

22. Van Riemsdijk (pers. comm.) suggests that if thereof is lexicalized in examples like (i), the rule of of-insertion may be called into question.
(i) The bombing of Vinh was massive and the destruction thereof was predictably total.
However, it seems to me that in the elevated style of (i), thereof, thereto, therefrom, etc. are productive.
(ii) Your letter and my response thereto are on file. But for him, I would speak thereof with pleasure.

French exhibits a somewhat different paradigm which nonetheless illustrates the same point about empty P. The pronominal object NP of a deep structure lexical P typically cannot be cliticized to the verb in French. But the pronominal object of an empty P in structures like (37) can be cliticized, by en "of it" and indirect object clitics, respectively. Again, when the object NP are moved away in this fashion from these empty P, there is no surface reflex of the P. Of course, if the objects of these empty P are lexical, rules like (38) apply also in French, to insert à "to" and de "of".

In the first appendix to Ch. 2, I proposed the "Empty Head Principle" by which a head X is empty (phonetically unrealized) if it c-commands an adjacent empty caseless phrase. (This principle is not stated in terms of movement, but rather in terms of empty categories, so that it applies to the objects of the French P just discussed, even if there is no movement of pronominals to clitic position.) The Empty Head Principle applies without modification to the empty P in (37) when transformational operations such as passive, indirect object movement, and possessive formation have removed the lexical NP object of these P. Similarly, in French, if the empty object NP of such P are coindexed with clitics, then these P are also phonologically unrealized. Clearly these NP movements and the accompanying Empty Head Principle must take effect before the rules (38) which insert the formatives to and of. The appropriate generalizations as to when these prepositions appear phonologically and when they are deleted by convention can be stated without ad hoc stipulations only if (38a-b) are post-transformational.

This concludes the discussion of post-transformational insertion of morphemes outside the lexical categories. The existence of this phenomenon is interwoven in several other analyses in this book, and so it has seemed appropriate to discuss several examples in detail in one section. In particular, the principal result of Ch. 7 is the characterization of COMP morphemes as P which are post-transformationally inserted in the context S.
Thus, I want it established that such insertion is in no way exceptional.

The principles proposed so far, in particular the very central Designation Convention, predict that grammatical members of lexical categories L can also be inserted post-transformationally, provided that their insertion is not associated with any purely semantic feature. In the final section of this chapter, I will show that this is indeed the case, and that some often-noted but unexplained syntactic facts follow from the claims that there are closed subclasses of the lexical categories, and that members of these closed classes can be inserted after transformational operations.

4.8. Post-transformational Insertion of Grammatical Verbs, Nouns, and Adjectives

In this section, I will mainly discuss how a number of grammatical verbs are subject to late lexical insertion. Some of the same arguments are available for grammatical nouns and adjectives, but I have not investigated them in as much detail.

We know that members of lexical categories which are suppletive must be grammatical L. In the school grammar of English, the purported difference between good and well is that the former is an adjective and the latter an adverb. Both these items have the same suppletive comparative form better and superlative form best.

(39) Mary is a good runner. Mary is a better runner. Mary is the best runner.
     Mary is well-known. Mary is better-known. Mary is best-known for running.

Good and well therefore form a suppletive pair of grammatical A, by (19). As discussed briefly in section 1.8 (cf. note 30), the case morphology of several languages suggests that the difference between adjectives and adverbs is that formally, adjectives are case-marked AP, while adverbials are caseless AP. Thus, we may say that the minimal representative of the feature complex [A, EVALUATIVE] (this term is from Katz, 1964) is good when A is case-marked and well otherwise. Now, case is generally thought to be assigned at s-structure, after certain transformations apply. If so, and if, as is plausible, the pair good/well is inserted under A according to whether A has case, it follows that this suppletive and minimally specified grammatical adjective good verifies the claim that late lexical insertion can occur in the grammatical subcategory A.

One clear example of a grammatical noun is self. Within the framework of Chomsky (1981), an NP of the form X-self cannot be the subject of a tensed verb because of principle A of Chomsky's binding theory. A competing explanation in terms of case has been proposed by Brame (1979b, Ch. 6), who claims that anaphoric forms like self are incompatible with nominative case, and otherwise are free to appear in any NP position. If Brame's line of explanation is correct, and again, assuming that case is assigned to NP's in s-structure, then self is a grammatical noun which exemplifies post-transformational insertion.

Turning now to grammatical verbs, we have already seen in previous sections that rules of suppletion are essentially post-transformational insertion rules which affect only the designated elements of the category V. For example, was/were is suppletive for be and went for go in the context of PAST. It has also been established that any main verbs used as auxiliaries (i.e., which appear in contexts where open class verbs are disallowed) are grammatical verbs.
In English, these include at least be, have, get, do, go, come (Will you come/go/*travel wash it?), let, and make (Let/make/have/*cause/*force him wash it).

The most obviously grammatical of verbs is be, which is both suppletive and used as an auxiliary. A further general restriction on be is that it must govern a maximal phrase in s-structure which either (i) dominates lexical elements, or (ii) is co-indexed with a phrase in a non-argument position such as COMP.

(40) There were lots of people who complained a lot.
     *Lots of people who complained a lot were.
     It is difficult to always be a good doctor.
     *[A good doctor]ᵢ is difficult to always be tᵢ.
     John was never a good teacher.
     *[A good teacher]ᵢ was never been tᵢ by John.
     [What kind of teacher]ᵢ has he been tᵢ?
     John was tᵢ to me [the best friend I had]ᵢ.

This restriction on be furnishes us with a contrast between be and other "there-insertion" verbs such as exist, remain, occur, appear, etc. The latter are free to appear without a post-verbal phrase. For this reason transformationalists have proposed the operation of t/iere-insertion and an optional rightward movement of the subject around these deep structure intransitives: (41)

Good doctors exist.
There exist good doctors.
A problem with the exhaust system remains.
There remains a problem with the exhaust system.
Something unexpected will occur at midnight.
There will occur something unexpected at midnight.
Lots of people who complained a lot appeared.
There appeared lots of people who complained a lot.

Recall that any subcategorization feature requires that the mentioned complement phrase dominate a terminal element when the head is inserted, by (22). So, if we say be is subject to a contextual feature + X², it appears, given (40), that this feature must be satisfied at the end of the transformational cycle on the smallest S which contains this be. 23 This correctly accounts for the examples in (40); when a transformation such as WH-fronting or complex NP shift applies in the cyclic domain of the S containing S, the subcategorized complement phrase of be can be removed. 24

23. I hesitate to make a strong statement for which I have not systematically marshalled evidence, but all the cases I have considered are consistent with the following principle: If α is subject to late lexical insertion, its contextual subcategorization features are satisfied at the end of the cycle on the smallest S containing α.

24. It may be that traces of movement to non-argument positions, throughout a derivation, either are or "count as" terminal elements, as argued in Whitney (1984). If this is so, I cannot claim that post-transformational insertion is precisely at the end of the cycle of the smallest containing S, as in note 23. Rather, it could be even later in the derivation. I note that the inability of be to take an NP complement with accusative case can be described by modifying its contextual restriction to + [X², -V].

Post-transformational insertion features similar to that for be occur with a number of other grammatical verbs. In order to see this, one must abstract away from certain idiomatic uses of these verbs, as in have a good time, let NP go, wanted (by the police), etc. In these idiomatic uses, grammatical verbs are lexically associated with purely semantic features and as such, by the Designation Convention, are like open class verbs and are inserted at deep structure. These uses are therefore not of interest here. What is of interest is that several transitive grammatical verbs do not appear in the passive; these include have, get, let, and want. (42)

*This car was had by John last year.
*This car was gotten by John last year.
*My friend was {had/let} (to) report for service.
*Malaria was gotten by Joan during her trip.
*Your house has been wanted by Bill for years.
*The dog is never wanted in the back yard.
Cf. They want the dog in the back yard.

These verbs differ minimally in meaning from their open class counterparts, which can appear in the passive, as the following examples show: (43)

This car was owned by John last year.
This car was obtained by John last year.
My friend was {urged/allowed} to report for service.
Malaria was {contracted/caught} by Joan during her trip.
Your house has been {coveted/desired} by Bill for years.
The dog is never kept in the back yard.

I propose that the failure of stative grammatical verbs to appear in passives is again due to post-transformational lexical insertion. Like other transitives, these verbs have a subcategorization feature + NP, but they are not inserted until after transformations apply in the smallest S containing them (cf. note 23). At this point, the object NP must still contain a lexical terminal, in accord with (22); their object NP cannot have been moved to subject position by passive NP movement, leaving only a trace of NP. Such late insertion is not available for open class verbs, so the contrast in (42)-(43) is predicted by virtue of the main verbs of (42) being in a closed category of grammatical verbs. It is of interest also that of the grammatical verbs, only the subclass of stative verbs are inserted post-transformationally; the grammatical activity verbs (e.g., do, go, make) can appear in passive constructions. That is, the following correlation seems to hold: (44)

Grammatical verbs are inserted (i) at deep structure if they are activity verbs, and (ii) post-transformationally if they are stative.
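
The correlation in (44), taken together with requirement (22), can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the verb set, the `ObjectNP` class, and the function names are hypothetical, and the passive is reduced to a single flag recording whether the object NP still dominates a lexical terminal at the point where the verb is inserted.

```python
from dataclasses import dataclass

# Hypothetical mini-lexicon: the stative grammatical verbs of (42).
# Grammatical activity verbs (do, go, make) and all open class verbs
# fall through to deep structure insertion.
STATIVE_GRAMMATICAL = {"have", "get", "let", "want"}


@dataclass
class ObjectNP:
    """An object NP position; `lexical` is False when only a trace remains."""
    lexical: bool


def insertion_level(verb: str) -> str:
    """Insertion level following (44); an assumption-laden simplification."""
    if verb in STATIVE_GRAMMATICAL:
        return "post-transformational"
    return "deep structure"


def passive_allowed(verb: str) -> bool:
    """Check the frame + NP against the object as it looks at insertion time."""
    if insertion_level(verb) == "deep structure":
        obj = ObjectNP(lexical=True)   # object NP has not yet been moved
    else:
        obj = ObjectNP(lexical=False)  # passive NP movement has left a trace
    return obj.lexical                 # (22): the NP must dominate a terminal


for v in ("have", "own", "want", "make"):
    print(v, insertion_level(v), passive_allowed(v))
```

On this toy model the verbs of (42) fail the check in the passive while their open class counterparts in (43) pass it, which is the contrast the text derives.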

It seems likely that (44) should follow from universal grammar, and should not be an independent stipulation. If we are willing to say that the insertion of a verb under [V, ACTIVITY] is "associated" with a purely semantic feature, then (44i) is yet another consequence of the Designation Convention (20). Up to this point, I have used "associated with a semantic feature" in (20) only to mean that a member of the category X° inserted into a tree is listed with a semantic feature in the lexicon. But if "associated with a semantic feature" also means "causes a purely semantic feature to be added to a tree," then the existence of a rule assigning an agentive feature to the subject of an animate activity verb suffices to ensure deep structure insertion of such verbs (i.e., 44i). Such a rule has been proposed in Chomsky (1972, 75).

We might be able to further develop a theory of insertion of closed class items for (44ii) by claiming that at least grammatical X° are always inserted post-transformationally, unless some consideration such as that of the preceding paragraph necessitates deep lexical insertion. But serious justification of a fully predictive theory of insertion level is beyond the scope of this chapter. I limit myself here to claiming simply that late insertion of some grammatical verbs exists, and do not claim to be able to fully predict when.

Some final predictions made by late lexical insertion, particularly in light of (44), concern the verb give. Oehrle (1976) has distinguished two main uses of this verb. In its most familiar use, give signifies a transfer of possession, requires an animate subject, and permits its indirect object to appear both with and without to. (45)

John is giving some books to Mary.
John is giving Mary some books.
A nurse gave shots to the child.
A nurse gave the child shots.

There are also extended uses of "transferring give" where some of the conditions are violated: (46)

We gave some thought to a trip.
We gave a trip some thought.

In a second use, give is essentially the causative of have, according to Oehrle. Its subject may be inanimate, and its indirect object cannot appear with to, except in certain contrastive contexts. (47)

Small shops give French towns a universal appeal.
*Small shops give a universal appeal to French towns.
Overeating gave him a stomach ache.
*Overeating gave a stomach ache to him.
Too much mathematics might give the students a lot of trouble.
*Too much mathematics might give a lot of trouble to the students.


The simplest subcategorization for give that suggests itself is + NP NP; give is in fact an archetypical three-argument verb exemplifying this frame. Transferring give is furthermore an activity verb, and in accord with (44i), it is inserted in deep structure. As discussed in section 1.6, indirect θ-role assignment requires that the second NP in the frame + NP NP appear in a PP with an induced deep structure empty P. The deep structure of transferring give is thus (48):

(48) [V̄ [V give] [NP some books] [PP [P ∅] [NP Mary]]]

When the empty P assigns case to the following NP, its interpretation as indirect object at surface structure is assured, by (85) of Ch. 1. If this indirect object NP is not moved, P is spelled as to (rule 87, Ch. 1), yielding the surface form (49a). If this indirect object NP is moved to immediately post-verbal position, the Empty Head Principle (first appendix to Ch. 2) requires the induced empty P to remain null, giving the surface form (49b). 25

(49) (a) [V̄ [V give] [NP some books] [PP [P to] [NP Mary]]]
     (b) [V̄ [V give] [NP_i Mary] [NP some books] [PP [P ∅] [NP ∅]]]

The causative give, as noted by Oehrle, is stative, in contrast to the transferring give. (50)

*What those small shops do is give French towns a universal appeal.
*What overeating did was give him a stomach ache.
*What too much mathematics might do is give the students a lot of trouble.

25. Whitney (1983) gives a plausible account of such movement in the framework of Chomsky (1981). There is a discrepancy in my formulation of dative movement. In order for the Empty Head Principle to apply to P, the object of P must be caseless. But according to (85) of Ch. 1, a case is required on this NP if it is to be interpreted as an indirect object. This problem will be solved in the next chapter, when the indirect object is defined slightly differently than in Ch. 1.


By (44ii), the stative causative give is expected to be inserted post-transformationally, into structures such as (49a-b) (with empty V's) rather than into the deep structure (48). But only (49b) satisfies the frame + NP NP, common to both main uses of give; the presence of lexical to in (49a) is not compatible with this frame. Thus, (44i-ii) essentially predicts the acceptability contrasts of (45)-(47). I have thus associated Oehrle's two uses of give with the general division into pre- and post-transformational insertion of grammatical verbs. The correctness of this step is confirmed by his observation that the causative (post-transformationally inserted) give permits neither NP object to be passivized (i.e., to be represented at s-structure by an empty category). This contrasts with the expected behavior of the transferring give, which allows both NP's to be passivized, subject to the well-known caveat that passivizing the second NP sounds rather forced in American English. (51)

Mary is being given some books.
Some books are being given Mary. (? in American English)
The child was given shots by a nurse.
Shots were given the child by a nurse. (?)
A trip was given some thought.
Some thought was given a trip. (?)

(52)

*French towns are given a universal appeal by small shops.
*A universal appeal is given French towns by small shops.
*He was given a stomach ache by overeating.
*A stomach ache was given him by overeating.
The students might be given a lot of trouble by too much mathematics.
*A lot of trouble might be given the students by too much mathematics.

These recalcitrant facts about two uses of give appear to fit nicely into the scheme developed here whereby "primitive" syntactic verbs - what I call grammatical verbs - are inserted sometimes pre-transformationally and sometimes post-transformationally. These facts therefore provide important support for this division, confirming both late lexical insertion for grammatical formatives, and my claim that closed classes of grammatical formatives include small subsets of the lexical categories L, as well as all non-L categories.

4.9. Conclusion

In summary, I will display the several contrasts between open and closed categories. The defining characteristics of open categories (2)-(4), as well as claims (7) and (16), serve mainly to underscore the fundamental dichotomy between lexical and closed classes, and to provide some clear independent indicators for when we have a closed category. As a result, Unique Syntactic Behavior and Late Lexical Insertion are contentful claims, rather than simply definitions of closed classes. Table (53) summarizes differences between lexical and closed morpheme classes.

(53)

Numbering in text | Open Categories of N, A, and V | Closed Categories of P, SP(X), CONJ, etc.
(3) | Indefinitely many | At most thirty members
(4) | Conscious coining allowed | No conscious coining
(7) | Branching rules OK (compounds) | No branching rules allowed
(11) | May differ by purely semantic features | Differ only by syntactic features (Unique Behavior)
(16) | No suppletion | Suppletion allowed
(21) | Inserted before transformations apply (by (58), Ch. 2) | Sometimes inserted after transformations apply (Late Lexical Insertion)

The principle of universal grammar which I think has been amply justified throughout this chapter is the Designation Convention (20). The Convention explains the limitation of suppletion and late lexical insertion to closed categories, and correctly predicts that closed class items have unique syntactic behavior. When applied to the closed class of grammatical verbs, this Convention reveals that "auxiliary verbs" are nothing other than verbs which have undergone particular types of language-particular transformational displacements from their base positions.

Chapter 5

Principles of inflectional morphology

5.1. Inflectional vs. Derivational Morphology

The heart of traditional grammatical study has been the regularities in form and distribution of the inflections on the primary lexical categories of noun (N), verb (V), and adjective (A). This interest goes back to the original Greek and Latin grammarians, but is also the focus of the historically independent traditions of Japanese, Arabic, and Sanskrit grammar. If a study of language is to be called "grammar", it must accord this concern an important role. And if grammar is to be a science, it must raise our understanding of these processes to a qualitatively higher level than what has been reached in traditional studies.

Inflections on lexical categories involve phonological modifications of these categories, most usually by the addition of bound morphemes, which are productive for the category in question and, in almost every case, particular to the category in question; i.e., a particular inflectional morpheme is bound to one of N, V, A, or to N and A, but not to the others. 1 By bound morpheme is traditionally meant one which can be separated from the lexical category which binds it (its "stem") by at most a sequence of other such morphemes. Thus, a sequence of preverbal French clitics are all bound to the following V, and German case endings are bound to a preceding A even when a comparative or superlative inflection intervenes. (1)

(a) Jean le lui propose souvent. 'John it to him proposes often.'
(b) Ein kleinerer Apfel 'a smaller apple'

No adverb such as souvent can intervene in the clitic-verb sequence le lui propose in (1a), and no free morpheme can interrupt the fixed head-bound morpheme sequence A-er-er in (1b).

1. Thus, the English third person singular marking on V and the plural marking on N (both phonologically s) are rightly taken as independent. A treatment which conflates the two cannot account in a natural way for the absence of s in sentences with conjoined singular subjects or with bare determiner plural subjects, e.g., few dispute this. For a treatment of a (past tense) inflectional morpheme on V which may also be an inflectional morpheme on N, see Hess and Hilbert (1976).


Bound morphemes on N, V, and A stems are not all termed "inflectional", however. A division is usually made between derivational and inflectional morphology. The pre-theoretical basis for this distinction among bound morphemes is discussed clearly in Siegel (1974), Aronoff (1976) and Anderson (1982), and some knowledge of this background is presupposed here. For example, non-productive affixes are typically considered to be derivational, and those most integrated into syntax, such as verbal tense and agreement and nominal/adjectival case endings, are taken to be inflectional. E. Selkirk (pers. comm.) suggests that grammatical tradition considers as inflectional those morphemes which alter neither the syntactic category nor the θ-role assignment properties of the stem they are bound to. Derivational morphemes then include all other bound morphemes. By this criterion, English inflection consists of: (2)

(a) the plural ending of nouns,
(b) the finite and participial endings of verbs, perhaps excluding the passive participle,
(c) the comparative and superlative endings on adjectives.

Inflections (2a-b) exist in French, German, and Spanish; German also has endings as in (2c). Further examples of inflections are the following: (2)

(d) German case endings on N and A.
(e) Gender agreement of A with N in French, German, and Spanish.
(f) Spanish and German diminutives, such as Spanish (c)ito and German chen/lein.
(g) Japanese subject-agreeing honorifics on verbs and adjectives.

In contrast, productive and syntactically regular bound morphemes such as the passive participle morpheme in all the above mentioned languages, the causative morpheme in Japanese, and the French and Spanish verbal clitics cannot be taken as inflectional, since they affect θ-role assignment. Inasmuch as traditional grammar has tended to treat as inflectional the bound morphemes which enter into regular grammatical processes, the classification of diminutive morphemes as inflectional and of passive and causative morphemes as derivational may be running afoul of some more principled account of the distinction in question.

In response to the inability of traditional criteria to distinguish inflectional and derivational morphology in an enlightening way, Anderson (1982, section 3) claims that the problem cannot be correctly approached in advance of a properly elaborated theory of syntax and word formation. He concludes that what has been traditionally termed inflection is "what is relevant to the syntax". In this way, at least Spanish diminutives fall outside the realm of inflection, and productive passive and causative morphemes fall within it.

Another response to the apparent murkiness of the inflection/derivation border has been the claim that all morphology is of a piece (Lapointe, 1981; Lieber, 1980). However, it cannot be plausibly maintained that the distribution of the inflectional morphemes of a language is autonomous from the syntactically determined distribution of its free morphemes, as will be shown below. Those authors who have argued that the same principles govern inflectional and derivational morphology have stressed the word-internal similarities of the two systems, both with regard to phonological processes and to how bound morphemes are arranged, to the neglect of how these morphemes interact with the syntactic categories in their immediate structural environment.

In what follows, I want to show how the domain of a word can be defined which allows both for an interesting syntactic characterization of inflection and for unified statements which capture the similarities between derivation and inflection. In particular, I will elaborate on the syntactic principles which give rise to bound morphemes and will demonstrate, I believe, that the traditionally conceived domain of inflection is a fairly well defined sub-area of syntax; in accord with Anderson, this sub-area can be understood only in terms of grammatical theory, and not by superficial properties of inflectional forms. In my view, Anderson's characterization is too broad, however; a more accurate though still pre-theoretical formulation is as follows: (3)

Inflections are those bound morphemes which are relevant to transformational (as opposed to deep structure) syntax.

Since transformations are exactly that part of syntax which involves the non-equivalence of deep structures and surface forms, my claim is that inflection includes any bound morphemes inserted by transformation (e.g., agreement) or s-structure-dependent principles (case), and any whose presence permits otherwise expected syntactic categories in their environment to be phonologically unrealized. Because I claim, with Anderson, that inflection is not null (i.e., that there exist transformational processes which give rise to bound morphology), it is incumbent on me to place theoretical limits on how much bound morphology transformations can produce. To this end, certain principles will be proposed, the most central of which are one which yields inflection as an output to transformations, and another by virtue of which certain closed categories (cf. Ch. 4) may remain phonologically unrealized. This latter principle I will term the "Invisible Category Principle".

I now turn to the relation of the inflection/derivation distinction to the question of productivity. A bound derivational morpheme can combine freely or not freely with other elements inside X°. I attribute this to derivational morphology being determined by lexical insertion at deep structure; deep structure combinations can be either productive or lexically restricted. Much of what is traditionally termed derivational morphology is in fact unproductive, as expected. But we can also consider a productive diminutive, such as the Spanish (c)ito, to be a deep structure N subcategorized for a preceding concrete noun, and hence, by (3), derivational. In contrast, if an inflection is produced by the transformational component, it follows that it should combine with all members of the stem category (N, V, or A) that it attaches to. The ability of a lexical category item to combine selectively within X° (e.g., to form a compound or a derived form) is characteristic of lexical insertion. In Ch. 4, it was shown that, with the exception of closed subclasses of grammatical N, V and A, lexical category items are inserted at deep structure. Thus, open class stems are present throughout a transformational derivation, and an inflection of given category, produced during this derivation, should freely combine with all open class items.

Exactly because inflection is generally productive, generative grammarians have usually described it by essentially exceptionless devices of the syntactic component. For example, we find in Chomsky (1957, 1964, 1973) an affix movement rule which transformationally generates English verbal inflection. My present conception of how to describe inflection accords with this tradition. It is sometimes pointed out that an inflectional process is not totally productive. However, the type of non-productivity associated with inflection differs from that typically found within derivational morphology.
For example, let us compare the English past tense suffix ed, which combines with the lexical category V, to the English causative suffix en/ize/ify which combines with the lexical category A to form a V (gladden, soften, deafen, broaden, blacken, stabilize, immunize, formalize, radicalize, intensify, humidify, rigidify, falsify). The inflectional status of ed is indicated by the fact that in the regular case, with a small and clear list of exceptions, if X is an infinitival root, then X + ed is also a V. The category A does not combine with en/ize/ify in this way. It is qualitatively speaking as common to find English adjectives which cannot take a causative suffix (happy, foreign, alert, clever, proud, precise, old, anxious, grey, dumb, intelligent, responsible, reticent, ugly, etc.) as it is to find ones that can. Subregularities of the type, "no adjective of the form F takes a causative suffix" are irrelevant to the point, which is that every V takes a regular or irregular past tense, with at most a specifiable handful of exceptions. Furthermore, any number of roots, X ≠ A, can combine with ize/en/ify to form a causative V (lengthen, strengthen, baptize, minimize, advertize, organize, canonize, cauterize, ratify, crucify, terrify, edify, etc.). This property does not hold of inflectional morphemes; when the root of an English past form is not itself a V, it is a highly exceptional case, and typically it is a suppletive form of a present tense root.


The appropriate analysis for derivational morphemes like en/ize/ify appears to be along the lines of Williams (1981), Lieber (1983), Selkirk (1982), and Walinska de Hackbeil (1983). The bound morphemes ize and en are deep structure heads of category V, subcategorized as A, which assign a θ-role (probably GOAL) to the A. (4)

[V [A stable] [V ize]] [NP the city]
[V [A soft] [V en]] [NP the city]

These affixal V may be subject to additional lexical restrictions, such as the requirement that the A's with en be monosyllabic. In addition, the items in the open class A may also be lexically restricted; for instance, proud and old cannot take en. Notably, it appears that A can combine with ize only if positively specified to do so; it is this which renders the addition of a causative suffix to A non-productive in English. These restrictions all hold at deep structure, and the trees in (4) are invariant throughout the transformational derivation. Thus, these derivational processes, whether or not they are productive, conform to the claim of (3).

The apparent non-productivity of some inflections is more limited in nature. For example, English "affix movement" adjoins the feature complex [SP(V), -MODAL] to the verb, without exception. By late lexical insertion (Ch. 4) the morpheme ed is the regular choice for [SP(V), +PAST] in the context V. To the extent that this inflectional morpheme does not combine with every V, the grammar of English must explicitly represent in its lexicon every case of non-combination. Moreover, the exceptional realizations of V + PAST, being instances of the post-transformational lexical insertion of Ch. 4, must be stated as properties of a closed category, namely of the affix and not of V. Most plausibly, the irregular allomorphs of [SP(V), +PAST], such as vowel-shortening t (e.g., meant, kept, lost, left, bit, hit, cut) and the vowel change to o (e.g., drove, broke, told, wrote, rode), have insertion contexts consisting simply of fixed lexical lists of stems. When a stem is not in these lists, a regular inflection such as ed, in contrast to what occurs in derivational morphological patterns, must be added to the stem. 2

2. Anderson (1982, section 3) brings up the existence of defective paradigms in inflectional morphology as instances of non-productivity. In any incomplete paradigms not describable by a filter, I would propose that a filtering allomorph be associated with the list of stems that are incompatible with the inflection in question. What is then correctly excluded is a situation where a productive but non-predictable set of items of the stem category fail to combine with a certain inflectional category. It can be noted that traditional grammars take note of defective paradigms, but do not make lists, for example, of forms which do not undergo a certain derivational process.

That is, from the claim that inflection is transformationally induced (3), the following theorem can be derived. (5)

If m is an inflectional morpheme that combines with stems of category L, the exceptional members of L which do not combine with m are stipulated in lexical lists.
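
Theorem (5), together with the insertion contexts just described for the allomorphs of [SP(V), +PAST], can be pictured as a lookup in closed lists with a regular default. The Python sketch below is a simplification and not a claim about the actual lexicon: the dictionary name, the function name, and the spelling adjustment for stems ending in e are assumptions, and the listed forms are only the examples cited in the text.

```python
# Hypothetical closed lists, one per irregular allomorph of [SP(V), +PAST].
# Each list pairs a stem with its past form; the forms are the text's examples.
IRREGULAR_ALLOMORPHS = {
    "vowel-shortening t": {
        "mean": "meant", "keep": "kept", "lose": "lost", "leave": "left",
        "bite": "bit", "hit": "hit", "cut": "cut",
    },
    "vowel change to o": {
        "drive": "drove", "break": "broke", "tell": "told",
        "write": "wrote", "ride": "rode",
    },
}


def past_tense(stem: str) -> str:
    """Insert an irregular allomorph if the stem is listed, else regular ed."""
    for stems in IRREGULAR_ALLOMORPHS.values():
        if stem in stems:
            return stems[stem]
    # Regular case: no listing is needed for the stem.
    # Assumed orthographic detail: stems ending in e take only d.
    return stem + ("d" if stem.endswith("e") else "ed")


for verb in ("keep", "drive", "walk", "blick"):
    print(verb, "->", past_tense(verb))
```

A novel stem such as the invented blick receives ed without any lexical entry, whereas each exception has to be stipulated in one of the lists.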

These lists rarely have more than, say, two dozen members each; that is, they are qualitatively of the same size as are the closed lexical categories themselves. According to (5), when a child who knows the English past tense learns a novel verb (in participial form, for example), (s)he automatically will subsequently use the regular inflectional past tense; vice-versa, (s)he will extrapolate from a past tense form to a regular present, if (s)he doesn't already know the verb is irregular. By contrast, when a child learns words like proud, precise, foreign, baptize, advertize, and cauterize, extensions by means of the derivational formula A + en/ize = V, in either direction, are made only sporadically and tentatively, analogously to lexical compounding.

In general, when two classes of syntactic elements combine freely under specifiable syntactic conditions subject only to short lists of exceptions, the combination is said to be productive. In this sense, inflectional morphemes, those produced by transformational processes, are productive, while derivational morphemes need not be.

5.2. The Genesis of Inflectional Morphology

In this section, I will sketch some typical ways that the transformational component produces inflections, but will not yet set a formal limit on how inflections can arise. I note at the outset that my claim that transformations give rise to inflection does not imply that transformations can themselves refer to word-internal structure. 3 That is, once a syntactic rule adjoins, for example, the English [SP(V), -MODAL] to V as inflection, I take it that the resulting V is not internally analyzable for syntactic purposes, other than as a complex of unstructured features (e.g., as {V, SP(V), -MODAL, +PAST}).
If this view is correct, there can be no "clitic-climbing" transformational process in Romance by which a clitic which is adjoined to one verb moves from that verb to a higher verb; rather, clitic-climbing must be analyzed either as base-generation of a clitic on a verb to which it is not θ-related, or as a movement from a phrasal position directly to a higher verb.

An examination of English transparently supports my claim that inflectional morphology arises from transformational processes. 4 The following table lists the English inflectional affixes, their deep structure category, and the argument for their being derived via a transformational movement or insertion rule.

Table 1 (cf. note 5)

Morpheme: Plural s, member of SP(N).
Argument: See below in the text.

Morpheme: Comparative er and Superlative est, members of SP(A).
Argument: They alternate with the preadjectival more and most; at the same time they cannot co-occur with other preadjectival specifiers: *very smarter, *as smartest, *how smarter, etc. 6

Morpheme: Finite verb forms, members of SP(V).
Argument: See section 5.6 below; essentially, the original arguments for "affix movement" in Chomsky (1957), bolstered by possibilities for explanatory adequacy provided by the Adjacency Hypothesis.

Morpheme: Present participle ing, inserted into bare VP (= VP not immediately dominated by S).
Argument: Complementary distribution of V + ing and all combinations of AUX + V (e.g., *to + V + ing, *MODAL + V + ing, and *V + TENSE + ing). This is one of the main hypotheses of Ch. 2.

Morpheme: Passive participle en, member of A, which amalgamates with a following V.
Argument: Classical arguments of Chomsky (1957) and Wasow (1977) that the verbal passive involves a transformational movement.

Morpheme: Perfect participle en, inserted into a bare VP following have.
Argument: What follows the perfective have has all the characteristics of the bare VP of Ch. 2; it fills the gap indicated by the absence of have + V + ing.

3. Lapointe (1981) holds that reference in the syntax to a feature that is internal to a word should be forbidden. But I see no argument against referring to such features considered as associated with the word as a whole, as long as the internal structure of the word is not accessible to the syntax.

4. A similar conclusion holds for French, with the possible exception of a range of preverbal clitics. It may well be that some or all French clitics are generated pre-verbally at deep structure, and are not part of X° at this level; French preverbal clitics are written as separate words, in contrast to postverbal clitics. Moreover, if they are base-generated, they do not violate the Head Placement word order principle of Ch. 1, whereby non-phrasal deep structure sisters must precede heads.

5. As is well-known, the possessive ending 's in English is not an inflection on the category X°, but is rather case-marking exterior to the possessive NP taken as a unit. However, since morphological case is assigned at s-structure, it can be considered an inflection and not violate (3).

6. In sections 4.7 and 4.8, I argued that many spelled-out grammatical formatives are inserted into syntactic feature matrices only after these matrices are in their surface structure positions. This is a restricted version of a hypothesis put forward in Otero (1976) and in more detail in Pranka (1983). Thus, it makes no sense to ask if, for example, the deep structure comparative morpheme of English is more or -er. The comparative morpheme is -er in the surface context [ a A ], and more in all other surface contexts. Similarly, from this point of view, there can be no "lexicalist" argument against a transformational relation holding among quantifiers such as no, any, and some proposed in Klima (1964a), if these are inserted in surface structure. For a discussion of the issues here, see Monaghan (1981).


The English plural morpheme might be thought to be an optional choice characteristic of a noun in deep structure. As such, it would violate the Head Placement Principle (2) of Ch. 1. However, several studies have pointed out that plurality is also a property of the specifier of the NP (for arguments and a review of the literature, see Samiian, 1983, Ch. 4):

(6) these boys, *these boy, this boy, *this boys
    two ants, *two ant, one ant, *one ants

In French and German, a wide range of noun specifiers show number overtly. Thus, SP(N) is marked ± PLURAL (with count nouns at least), even when the distinction is morphologically unrealized on SP(N) (I spoke to friends vs. *I spoke to friend). We can account for specifier-head number agreement with a transformation; if we take the source of the plurality alternation to be the specifier, then the grammar conforms to the Head Placement Principle of Ch. 1.⁷

(7) Plural Assignment: [SP(N) PLUR] ... — N ⇒ 1 — 2 + 1

The arguments listed in Table 1 justify the conclusion that English inflectional morphology is generated by transformations, rather than in deep structure. In most cases, these inflections result from the following possibility, which is a theorem derived from principles to be developed in this chapter.

(8) Language-particular transformations allow a feature of a SP(X) (which here includes verbal INFLECTION, as argued in Ch. 3) to be realized as an inflection bound to X°.⁸

Besides providing an account of why inflection is productive, this perspective on inflectional morphology has several other advantages. (i) If inflections were present in their surface positions in deep structure, they would violate my claim that only phrases can be right sisters to X° in the deep structures of languages like English. Thus: (9)

7. The Head Placement Principle, (2) of Ch. 1, stands as an exceptionless statement about the left-right order of nonphrasal elements in deep structure. A similar but distinct rule exists in Persian, confirming the partially language-particular nature of (7). Cf. Samiian (1983, Ch. 4).

8. Walinska de Hackbeil (1983) shows that some morphemes which are traditionally considered to be derivational in English are both productive with certain subclasses of stems and alternate with a specifier element. The example she works out in detail is the adjectival suffix -ish (as opposed to the noun suffix -ish). Nothing in what is said here excludes extending the term "inflection" to some of the English bound morphemes which Walinska de Hackbeil considers productive, except that note 9 would be modified.


(ii) As argued in Bresnan (1971, 1972b), many of the mechanisms for determining English word stress proposed in Chomsky and Halle (1968, Ch. 2 and 3) can apply prior to the insertion of open class lexical items in deep structure. From this point of view, the assignment of word stress is an essentially regular but lexical process in English. Derivational suffixes often affect word stress, and hence they must be inserted in open class items in deep structure. Since inflections are, according to my view here, transformational adjunctions to open class categories, (10) follows.

(10) Inflectional morphology does not affect word stress in English, while derivational morphology may.

(iii) It is widely accepted that deep structure N, V, and A should be unanalyzable by transformations (except perhaps as an unstructured feature bundle). While some languages may have "word templates" with internal slots that allow inflections inside an X° (P. Kiparsky, pers. comm.), English and many similar languages do not work in this way. Rather, transformational operations add onto X° but do not analyze their internal structure. Therefore:

(11) Inflectional suffixes follow derivational suffixes in all regular cases.

(iv) Finally, (12) will be discussed in the next section.

(12) A qualitatively better understanding of the relation between words and syntactic units is possible, since inflection is a reliable sign of transformational operations.⁹

5.3. The Relation between Words and Syntactic Units

The arguments of Table 1 clear the way for the statement of the following principle, which is related to that proposed in Chomsky and Halle (1968, section 6.2); it differs from theirs in that SP(X) as well as X° are set off by word boundaries in (13).

9. An interesting property of English inflections is that only one is allowed per word. Under the plausible assumption that the adverb-forming -ly suffix is inflectional on A, we then correctly predict that comparative and superlative suffixes cannot be added to it. Similarly, the English finite past cannot exhibit number agreement, and a plural noun in -s is made possessive by virtue of ∅ rather than by an -s suffix. *slowlier vs. friendlier; *freeliest vs. holiest; *trieds vs. hides; *some locks's usefulness vs. some box's usefulness. This property does not hold for French or German inflection. A French past participle can be further inflected for gender, and the French conditional/future suffix (-r) is followed by the finite endings. In German, the comparative and superlative endings can be followed by case endings. Thus, the English property would seem to be due to some theory of word form which holds at or subsequent to s-structure.

(13) Word Division. Word boundary symbols ## are inserted in deep structure terminal strings between any two adjacent syntactic units, prior to the insertion of X (= N, V, A, P).

Since the structure internal to X typical of derivational morphology and of compounding is here ascribed to lexical processes, the Word Division Principle is not the source of any internal word boundaries such lexical items may have.¹⁰ Word Division should not be taken as a principle of pronunciation; it is more directly reflected in how speakers "divide up a sentence into words" when asked, and in how languages like English and French are written. Complete phonological amalgamation across word boundaries, as in the French du maire 'from the mayor' (from the deep structure sequence de ## le ## maire), involves either phonological processes or some language-particular syntactic stipulation, whereas unreducible sequences such as contre le maire 'against the mayor' directly reflect unmodified deep structure strings produced by (13).¹¹

In the preceding section, I claimed that any English inflectional affix m in the context X (X = N, V, A) is not in this position in deep structure. It therefore follows, by Word Division (13), that no word boundary ## need separate X and m. But the question can now be raised, under what circumstances does a transformational attachment of one category to another result in a single word? That is, not all transformational (re)orderings yield inflection, so

10. Lexical X not only cannot be modified by Word Division, they cannot be modified by transformations either. Consequently, separable particle-verb combinations (e.g., English take down) cannot be deep structure V, as is often asserted without argument, in spite of arguments to the contrary in Emonds (1972) which are independent of what is being considered here. Similarly, idioms such as take advantage, French avoir lieu 'take place' etc. cannot be deep structure V, since they can be modified by transformation. But I have no objection to attributing the irregular status and behavior of the object nouns in these expressions to a transformation-like cliticization or reanalysis by which they are associable with idiomatic meanings and become (sometimes only optionally) a single syntactic V.

11. The presently most plausible treatment of, e.g., French du 'of the' is that a lexical item du is associated with the syntactic sequence P + DEF via a rule of post-transformational lexical insertion, as in Ch. 4. By the Designation Convention, the rule correctly applies only to the unmarked representatives of P and DEF (which would, in the "elsewhere" case, be associated with de and le, respectively). Any such insertion associating a morpheme with a sequence rather than with a single terminal is marked, and, a fortiori, language-particular. It is also transformational in that its output is not expressible by context-free phrase-structure rules (Cf. Piera, 1984). Then, by principle (14) in the text below, with X = P and C = DEF, internal word boundaries in the item so inserted are excluded. The association of du with the sequence P + DEF (or, for another example, of au(x) 'to the' with [P,DIR] + [DEF,(PLUR)]) is consistent with the treatment of higher order complex words argued for in Piera (1984). While Piera objects to a "post-insertion phonology for syntagmatic irregularity", his ideas are compatible with post-syntactic insertion of grammatical formatives for these irregularities, as argued for here.


what is proper to inflection? With regard to this question, English specifier-adjective pairs provide a minimal contrast: the comparative specifier becomes an inflectional suffix (-er) in post-adjectival position, while the specifier enough, if we are to trust the orthography, remains a separate word (intelligent enough). An inspection of the inflectional processes of Table 1 reveals a pattern; except for the participial endings, English inflection is a transformational conflation of a specifier element and its head (8). The rules of Plural Assignment (7), Affix Movement, and Comparative Postposing (of -er and -est) do not "carry along" the word boundaries of the moved specifiers.

(14) First Inflection Principle (tentative). A transformation attaching some non-phrasal closed category C to a head X° does not provide word boundaries between the two in derived structure.
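The interaction of Word Division (13) with the First Inflection Principle (14) can be sketched informally in code. The following Python fragment is only an illustration under stated assumptions: the `Unit` class, the `phrasal` and `closed` flags, and the function names are hypothetical conveniences, not part of the theory, and the treatment of enough as phrasal simply follows the NP-status argument cited in the text.

```python
from dataclasses import dataclass

BOUNDARY = "##"
HEADS = {"N", "V", "A", "P"}  # the head categories X named in (13)


@dataclass
class Unit:
    """A terminal syntactic unit (hypothetical representation)."""
    form: str
    category: str          # e.g. "P", "SP(N)", "N", "A", "SP(A)"
    phrasal: bool = False  # True if the unit has phrasal status
    closed: bool = False   # True for closed-class categories


def word_division(units):
    """Sketch of (13): place ## between any two adjacent units."""
    out = []
    for i, unit in enumerate(units):
        if i > 0:
            out.append(BOUNDARY)
        out.append(unit.form)
    return out


def attach(head, c):
    """Sketch of (14): attaching a non-phrasal closed category C to a
    head X yields a single word; otherwise a boundary separates them."""
    if head.category in HEADS and c.closed and not c.phrasal:
        return [head.form + c.form]
    return [head.form, BOUNDARY, c.form]


# Deep structure sequence de ## le ## maire
deep = word_division([Unit("de", "P"), Unit("le", "SP(N)", closed=True),
                      Unit("maire", "N")])

# Comparative specifier attached to its head: one word
taller = attach(Unit("tall", "A"), Unit("er", "SP(A)", closed=True))

# enough, assumed phrasal here, stays a separate word
enough = attach(Unit("intelligent", "A"),
                Unit("enough", "SP(A)", phrasal=True, closed=True))
```

On this toy encoding, the comparative suffix merges with its adjective while enough keeps its boundary, which is the minimal contrast the text draws.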

Principle (14) extends the potential for inflection from the major lexical categories to all heads of phrases X°; that is, it also allows P to be inflected. Two instances attesting such transformational movement are (i) the productive attachment of a locative demonstrative to P in Dutch, discussed in van Riemsdijk (1978, Ch. 3), yielding forms such as erop 'thereon', ervan 'therefrom', etc; (ii) the appearance of an agreement marker AG on P in Welsh (Harlow, 1981, 220). We might want to consider rule-particular exceptions for (14); that is, the rule would insert a word boundary for a case like enough. (Alternatively, enough may be too phonologically complex to become an affix.) However, on the basis of examples like (15), Bresnan (1973) has argued that enough has NP status, so this suffices to insure its nonsuffixal status, given the "nonphrasal" stipulation in (14).

(15) He didn't put {enough/*too/*so} in the box.

It is appropriate to note that noninflectional rules like Dutch-German verb-raising and Romance causative formation, under typical formulations, are not subject to (14), because both V's involved in both rules are open class categories. (Cf. the discussion of closed categories in the previous chapter.) Attachments that fall under (14) have another general property not shared with a variety of other transformational processes. While an inflectional ending such as the English past tense [SP(V) ed] has the insertion frame + V, such frames for inflections must be interpreted as applying strictly within the domain of a single word. By the Principle of Word Division (13), insertion in the context X° of a grammatical category distinct from X°, such as SP(X), can only be post-transformational; otherwise X° and SP(X) would be separated by word boundaries. Moreover, such late insertion is permitted in closed categories


by the Late Lexical Insertion of Ch. 4. Thus, it is predicted that inflectional endings involve elements that are obligatorily moved from their deep positions, without stipulating that the inflection-producing processes are themselves obligatory.

(16) Second Inflectional Principle (theorem). A transformational operation which produces an insertion frame inside X° for a given morpheme applies obligatorily.

(16) is clearly superior to any rule-particular stipulations of obligatoriness. The two inflectional principles in fact have wider applicability than to just what is traditionally termed inflection. An interesting confirmation of both (14) and (16) is provided by the English compound pronouns everything, someone, no place, anybody, etc. Two facts strongly suggest that these compounds of specifier and noun morpheme pairs appear in the surface position of SP(N). Like other specifiers, but unlike nouns, the compound pronouns precede simple adjectives:

(17) Somebody clever is invited.
     *Clever somebody is invited.
     *Housemates clever can be fun.
     Clever housemates can be fun.
     Some clever fellows are invited.
     *Clever some fellows are invited.

Moreover, compound pronouns, like non-demonstrative specifiers but unlike nouns, do not have plural forms (*someones vs. some young ones). Thus, it seems likely that compound pronouns are generated by a transformational attachment of a nonlexical noun (with the syntactic features ± ANIMATE, + PLACE) to a subclass of quantifying specifiers. 12 This attachment produces single words, in accordance with (14), as is shown by the orthography, the stress pattern (- -), and the vowel reduction. It is furthermore obligatory, in accordance with (16). So these principles apply beyond what in traditional or structuralist grammar would be labeled "inflection", and the formal boundary between inflectional morphology and traditionally conceived "syntax" is correspondingly weakened, as suggested by (3). Two other implications of Word Division (13) should be mentioned here. The proposal in Emonds (1976, Ch. 6) that there is a verbal suffix position in the base would now incorrectly entail word boundaries between V and this suffix. One important justification for this proposal

12. If we assume that quantifiers have some feature Q which distinguishes them from demonstratives and definites on the one hand and from WH words on the other, the subclass of quantifiers which form compound pronouns are just those which do not "float": *The boys will {any/no/every/some} come late. The boys will {all/both/each} come late.


was to insure a structure-preserving "landing site" for affix movement, which at that time I conceived of as a nonlocal rule. But in section 5.6 here, I return to a justification of the rule's earlier formulation in Chomsky (1957), in which the adjacency of TENSE and V plays a crucial (and, with the Adjacency Hypothesis, explanatory) role. Thus, affix movement here is no longer structure-preserving, but rather is local, and the motivation for a verbal suffix position in the base disappears.¹³

A second implication of (13) is that any base pre-verbal clitic node in Spanish or French will be (i) flanked by word boundaries, and (ii) subject to transformational modification. Now, (ii) accords with what happens; both languages have local rules that postpose clitics (cf. Emonds, 1976, Ch. 6), but (i) seems to be the wrong prediction for phonological purposes. At this point, one might assume that some theory of cliticization will eliminate the word boundaries, or that some special stipulations are needed in the grammars of French and Spanish to eliminate these boundaries. But I am skeptical of either move. First, pre-verbal clitics in these languages are written as separate words. Second, in French at least, there is clearly some principle involving elision, liaison, and stress that extends the phonological word well beyond the boundaries of the syntactic word, so the clitics may well have no special status with regard to this point (cf. Milner, 1973). Third, the post-verbal clitics in most dialects of Spanish do not attract the stress away from the root in the way that Spanish inflectional morphology typically can, so it may be that even postverbal clitics retain word boundaries. For the moment, therefore, I keep to the controversial implication that pre-verbal base clitic nodes in Romance are regularly assigned word boundaries by (13); this can explain why they do not attract the word stress in post-verbal position in Spanish, and why they are subject to transformational modifications.¹⁴

5.4. The Language-Particular Nature of Inflection and the Adjacency Hypothesis

According to the First Inflection Principle (14), transformational attachments of closed categories to heads result in bound morphemes. I hold that this is sufficient to account for all bound inflectional morphology. From this, my claim (3) that inflections are due to transformations follows.

13. The conclusion of Ch. 2 here that to rather than -ing is in a deep structure position eliminates the other motivation for a verbal suffix node given in Emonds (1976, Ch. 6).

14. Going a bit further, it appears that the pre-verbal base clitic positions in Spanish and French, with regard to object pronouns, are similar (Groos, 1978; Bok-Bennema, 1981). It then is plausible to assume, on the basis of French, that this base position is pre-verbal, and that the post-verbal position of Spanish clitics on nonfinite verbs results from a transformational interchange analogous to English inflectional morphology. It thus is explained by (13) why Spanish pre-verbal clitics are written as separate words, while the post-verbal clitics (counterparts now to inflectional morphology, being transformationally placed) are not.


In Ch. 3, I proposed that the unmarked movement rule of language, Move α, involves movement of a phrase to the boundary of a cyclic domain. Consequently, an attachment of a closed category to a head does not fall under Move α; inflectional processes are thereby predicted to be language-particular. This prediction seems correct. First, there are particular grammars (e.g., of Chinese) which produce almost no inflection; second, grammars of closely related particular languages seem superficially to vary most in their inflectional morphology, even when the languages under consideration are closely related syntactic systems. For example, English and German have comparative and superlative adjectival inflection while French does not; French has future and conditional endings, while English does not. German, but not English or French, has case endings on N. French, but not English or German, has preverbal clitics. English, but not French or German, has two nonfinite verbal forms for clausal arguments of predicates (the infinitive and the gerund). The German and French infinitive, but not the English, is marked by a bound morpheme. English and German, but not French, appear to have a sort of possessive morpheme for noun phrases. French and German, but not English, have grammatical gender and gender agreement between noun and adjective; the French and German gender agreements differ from each other. It is widely understood that traditional grammar is inexplicit about universal grammar, and concentrates on describing the language-particular. Traditional grammar lays so much stress on inflectional morphology precisely because so much language-particular syntax is concentrated in this area.

If inflections are a regular subcase of language-particular syntax, whose only distinguishing syntactic characteristic is that determined by the First Inflection Principle, the distinction between the processes that yield inflectional morphology and other language-particular processes becomes less sharp than what has been observed in traditional or structuralist grammatical studies. Local rules formally analogous to those which produce inflectional morphology explain a number of language-particular syntactic characteristics, such as the various frontings of V and INFL discussed in sections 3.6 and 3.7, Dutch and German verb-raising (cf. Evers, 1975, for a fairly complete initial study, and the second appendix to Ch. 2), a Romance causative rule, and certain Romance clitic placements. This assimilation of inflection to syntax is, in fact, the traditional approach in generative grammar. In the Introduction to this work, I put forward the "Adjacency Hypothesis", which claims that no language-particular rules can involve string variables. Consequently, since inflectional rules are language-particular, transformational processes that give rise to inflection can stipulate only categories that are adjacent to each other. Nonetheless, in section 5.2, I argued that a sentence like (18) involves the transformational copying of the feature PLURAL after N, as indicated by the arrow:


(18) [SP(N) Those] darned green slowly maturing [N tomato]
     SP(N): DEMON, PLUR (in the original diagram, an arrow runs from PLUR to the position after N)

The Adjacency Hypothesis requires that this language-particular rule (distinct, for example, from a similar rule of Persian which does not apply if Quantifier specifiers are overt) be stated without an internal variable, as in (19):

(19) PLUR — N ⇒ 1 — 2 + 1

Whatever the principle that permits such a statement, it should also prevent application of the rule to the italicized plural specifiers in (20):

(20) That new many-sided mirror; this impressive twenty story building.

So as to correctly generate (18) and (20), I propose a principle of grammar that defines the head X of a phrase to be adjacent to the boundary of the phrase X¹. The principle in question is a special case of the Variable Interpretation Convention of Wilkins (1980); she has suggested the formulation of a more limited version that I use here:

(21) Head Adjacency. If B is a head-of-phrase category, A - B in a structural description is to be interpreted as A - B¹[Y - B - Z], where Y contains no B* in its grossest constituent analysis.¹⁵ (The left-right order of A and B is irrelevant.)

For (19), A = PLUR and B = N. If English compound pronouns such as anyone are transformationally generated, as proposed in the preceding section, then Head Adjacency correctly allows this process to be stated in terms of adjacent categories:

(22) [Tree diagram: NP dominates SP(N) (some) and a constituent containing AP (very clever) and the head N; the N attaches to SP(N), yielding 'someone very clever'.]
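The effect of Head Adjacency (21) on Plural Assignment (19) can be illustrated with a small sketch. This is only an informal model under stated assumptions: the `Node` class, the label `"N1"` for the first projection of N, and the function names are hypothetical, and "grossest constituent analysis" is approximated by looking only at the immediate daughters of N¹.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A labeled tree node (hypothetical representation)."""
    label: str
    word: str = ""
    features: set = field(default_factory=set)
    children: list = field(default_factory=list)


def visible_head(n1, head_label="N"):
    """Return the head of N1 = [Y - N - Z]. Only immediate daughters are
    scanned, so material inside Y (e.g. inside an AP) is never inspected."""
    for child in n1.children:
        if child.label == head_label:
            return child
    return None


def plural_assignment(np):
    """Sketch of (19), PLUR - N, read through Head Adjacency: a PLUR-marked
    SP(N) adjacent to N1 counts as adjacent to the head N of that N1."""
    kids = np.children
    for left, right in zip(kids, kids[1:]):
        if (left.label == "SP(N)" and "PLUR" in left.features
                and right.label == "N1"):
            head = visible_head(right)
            if head is not None:
                head.features.add("PLUR")
                return True
    return False


# (18): Those darned green slowly maturing tomato(+PLUR)
tomato = Node("N", "tomato")
np18 = Node("NP", children=[
    Node("SP(N)", "those", {"DEMON", "PLUR"}),
    Node("N1", children=[Node("AP", "darned"), Node("AP", "green"),
                         Node("AP", "slowly maturing"), tomato]),
])

# (20): That new many-sided mirror; 'many' sits inside an AP, not in SP(N)
mirror = Node("N", "mirror")
np20 = Node("NP", children=[
    Node("SP(N)", "that", {"DEMON"}),
    Node("N1", children=[
        Node("AP", "new"),
        Node("AP", children=[Node("SP(A)", "many", {"PLUR"}),
                             Node("A", "sided")]),
        mirror,
    ]),
])

applied_18 = plural_assignment(np18)
applied_20 = plural_assignment(np20)
```

In this toy model the intervening adjective phrases in (18) do not block the rule, while the plural specifier buried inside an AP in (20) never triggers it.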

15. The grossest constituent analysis of X (Wilkins's term) is the shortest and "highest" string of constituents that dominate all of and only X in a given tree. Head Adjacency is the equivalent of Safir's (1982) "Projection Adjacency", except that I do not generalize beyond X with one bar at this point.


Head Adjacency has no effect on the postposing of SP(A) in English, because such a specifier (e.g., enough) is never separated from its head. Within the English NP, however, we can further verify the validity of Head Adjacency by considering the gap among post-nominal adjective phrases. As examined in detail in Emonds (1976, Ch. 5), an adjective phrase, save one that ends in its head, may appear after a noun in English:

(23) A description of the crime longer than mine appeared in the Congressional Record.
     The other relative of his very eager to work got terribly sick.
     An actual explanation of the accident acceptable to the police should be found.

The italicized AP's in (23) cannot precede the of-phrases, as the reader may verify. That is, a postnominal AP in English must follow not only N but also N̄. In contrast, an AP which ends in its head is unacceptable in any postnominal position:

(24) (a) *A description of the crime long appeared in the Congressional Record.
         *The other relative of his very eager got terribly sick.
         *An actual explanation of the accident acceptable should be found.
     (b) *A description long of the crime appeared in the Congressional Record.
         *The other relative very eager of his got terribly sick.
         *An actual explanation acceptable of the accident should be found.

The inability of simple adjectives to appear after the noun inside the NP is a fact particular to English, not shared by French and Spanish. That is, either a filter or a movement rule must be postulated for English, which takes the following configuration (25) as input, and either filters it out, or moves the simple A to prenominal position, as in (26).

(25) [NP ... N̄ [AP ... A] ... ]

(26) A long description of the crime appeared in the Congressional Record.
     The other very eager relative of his got terribly sick.
     An actual acceptable explanation of the accident should be found.

Precisely because of contrasts as in (23)-(24), most analysts agree that finite relatives, "reduced" participial relatives, and the AP's with complements that pattern with them are to be generated outside N̄ (again, see Emonds, 1976, Ch. 5, for arguments that such AP's fill a gap in the range of post-nominal participles). On the other hand, Jackendoff (1977) presents a number of generally accepted arguments that of-phrase complements to N are generated inside N̄. Thus, a language-particular transformational statement of the excluded pattern in (24) invoking the sequence N — A will call the Head Adjacency Condition into play, so that any intervening PP's will not block the rule in question. That is, in (24a), N PP constitutes an N̄. For example, a local movement of A into prenominal position stated as (27) can correctly derive the sentences of (26) from the strings in (24), and thus exemplifies the effect of Head Adjacency.¹⁶ Moreover, the formulation (27), which mentions N̄ and N, correctly predicts that the moved adjectives will not appear separated from the nouns, at the beginning of the N̄, as in (28).

(27) N — A ⇒ 2 + 1 — ∅; obligatory.

(28) *The very eager other relative of his got terribly sick.
     *An acceptable actual explanation of the accident should be found.
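As a rough illustration of how the local rule (27) behaves, the following sketch linearizes a noun phrase from three pieces. It is a simplification under stated assumptions: the tuple encoding, the labels `"MOD"`, `"PP"`, `"COMP"` and `"SP(A)"`, and the function name are all hypothetical, and an AP is taken to "end in its head" exactly when its last element is labeled `"A"`.

```python
def linearize_np(specifier, n_bar, ap):
    """Sketch of (27): if the postnominal AP ends in its head A, it is
    obligatorily placed immediately before the head N inside N-bar;
    otherwise it stays after N-bar.

    specifier: list of words preceding N-bar
    n_bar:     list of (label, text) pairs containing exactly one "N"
    ap:        list of (label, text) pairs making up the AP
    """
    head_index = next(i for i, (label, _) in enumerate(n_bar)
                      if label == "N")
    ap_words = [text for _, text in ap]
    n_bar_words = [text for _, text in n_bar]

    if ap and ap[-1][0] == "A":
        # AP ends in its head: prepose it directly in front of N
        words = (specifier + n_bar_words[:head_index] + ap_words
                 + n_bar_words[head_index:])
    else:
        # AP has a complement after its head: it remains postnominal
        words = specifier + n_bar_words + ap_words
    return " ".join(words)


long_np = linearize_np(
    ["A"],
    [("N", "description"), ("PP", "of the crime")],
    [("A", "long")],
)
eager_np = linearize_np(
    ["The"],
    [("MOD", "other"), ("N", "relative"), ("PP", "of his")],
    [("SP(A)", "very"), ("A", "eager")],
)
longer_np = linearize_np(
    ["A"],
    [("N", "description"), ("PP", "of the crime")],
    [("A", "longer"), ("COMP", "than mine")],
)
```

Because the adjective is placed next to N rather than at the left edge of N̄, the sketch yields the order of (26) (other very eager relative) and never the excluded order of (28).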

I conclude that the behavior of the N̄ in the English NP supports the contention in (21) that the head X is properly defined as "adjacent" to material which is next to X¹. This tentative conclusion, to be amply supported in the remainder of this chapter, can in fact be taken as part of the definition of what it means to say that X¹ is a "(minimal) projection" of X; X¹ not only contains X, and is labeled by virtue of how X is labeled, it

16. The discussion here is focused on the role of N̄ and N in (25) and (27), and does not directly depend on whether A⁰, A¹, or A² should appear in the appropriate rule. If the correct structure for an AP is as in (i), which appears to be assumed in Chomsky (1981, 165) and which is justified by the type of arguments in Hendrick (1978), then the rule should mention A¹.

(i) [Tree diagram; only the label SP(A) survives.]

The local rule (27) with A replaced by A¹ cannot apply to an A¹ in (i) because neither N nor A¹ would minimally c-command the other. On the other hand, if the specifier/complement structure within AP is parallel to that found in NP, as has been assumed here in Ch. 1, then the following considerations suggest that (27) should mention simply A. Such a rule would seem to be more easily learnable than one mentioning A¹. Moreover, as would then be expected, a number of SP(A) block prenominal adjectives.

(ii) Any description {that/so/three pages} long was rejected.
     *Any {that/so/three pages} long description was rejected.

We can assume that the SP(A) which appear prenominally in examples like (26) are essentially "cliticized" onto a following A, so that (27) need only mention A⁰.


also contains X as its only "visible" member, at least for purposes of the transformational calculations we are considering here.

5.5. Tense-Inflection, Modals, and Verbs

In this chapter, we began the investigation of some language-particular aspects of English syntax by noting some of the formal properties of the rules which yield the language's inflection. For the most part, these are local transformations that involve a subcategory of a specifier and an adjacent lexical category X, where "adjacent" has been defined in a nonobvious and yet elegant way (the Head Adjacency of (21)). Moreover, in order to assimilate the rules of inflectional morphology to the transformational component, two principles of inflection (14) and (16) have been introduced to account for what other traditions of linguistic analysis (traditional, structuralist, and also much recent lexical grammar) take as justification for separating morphology and syntax into different components. Up to this point, however, the rules discussed (plural formation, postposing of certain adjectival specifiers, preposing of simple adjectives, compound pronoun formation) have been chosen precisely because they do not interact in particularly interesting ways with the rest of the grammar; while they allow the principles that govern their operation to be stated straightforwardly, they do not shed light on problems of analysis that ordinarily attract the English grammarian's attention. When we turn to verbal inflection, however, even in a language as morphologically "streamlined" as English, the interest of theoretically-based explanations increases, because the English verb and its auxiliaries are always a source of fascination. Indeed, I think we can say, following Newmeyer (1980), that much of the intellectual success of Chomsky (1957) is due precisely to its stunningly attractive analysis of this inflection system. In the theory of language-particular rules I am trying to develop here, I feel that the most interesting aspects of this analysis can and should be maintained.
A number of generative studies, based on arguments presented in Chomsky (1957) and Emonds (1970), have demonstrated that the modal auxiliaries of English are not verbs, but rather realizations of the specifier of the verb, which is notated indifferently here as SP(V) or AUX. Any analysis of this type, in which English modals are not verbs, will be termed a "modal analysis (of English)".17 In a modal analysis, the AUX of English is inverted in direct questions, appears in tag questions, optionally attracts

17. These studies include Emonds (1970, 1976), Jackendoff (1972), Lightfoot (1974, 1979), Culicover (1977), Iwakura (1977) and Akmajian, Steele, and Wasow (1979). The modal analysis has not been established by the number of pages devoted to asserting it (by this criterion, the analysis of modals as main verbs is superior), but by showing that in actual formalized descriptions that restrict the notion and number of "possible rules of grammar," modals do not undergo verbal rules.


NEG to itself (as n't in normal American speech), remains undeleted in "VP-deletion" contexts, and itself undergoes certain contractions. In my development of a modal analysis, I have further hypothesized that modals do not co-occur with present (s) and past (ed) tense formatives (TENSE) exactly because the latter are alternatives to expanding AUX as a present or past modal. Thus, any claim that modals are verbs because they and verbs alone exhibit a present-past distinction rests on the false premise that verbs syntactically realize this distinction. The proper locus of the syntactic opposition present-past is rather the AUX, whether AUX is realized as a modal or as a tense suffix on a verb. Under this conception, the base rule for generating modals and tense is (29), where SP(V) is subject to (11) of Ch. 1.

(29) SP(V) ⇒ (optional)

Exactly the same rule can be used for French, which has no nonverbal modals; in French, the features that correspond to English modals (AUX, -TENSE) are realized by the future and conditional "tenses" and perhaps by the subjunctive mood. The French-English correspondences are given in Table 2:

Table 2 (see note 18)

AUX    | +TENSE                            | -TENSE
-PAST  | English and French present tense  | French future tense; English will, can, may, ought, etc.
+PAST  | English past and French imperfect | French conditional; English would, could, might
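The correspondences in Table 2 can be read as a lookup from feature values of AUX to surface realizations. The following is a minimal sketch in Python; the names TABLE_2, realization, and realized_as_word_class are illustrative assumptions and not part of the analysis itself, and only a few representative modals are listed.

```python
# Table 2 as a lookup: (TENSE value, PAST value) -> realization per language.
# "+" and "-" are the feature values used in the text.
# All identifiers here are hypothetical, introduced only for illustration.
TABLE_2 = {
    ("+", "-"): {"English": "present tense", "French": "present tense"},
    ("+", "+"): {"English": "past tense", "French": "imperfect"},
    ("-", "-"): {"English": "modals such as will, can, ought", "French": "future tense"},
    ("-", "+"): {"English": "modals such as would, could, might", "French": "conditional"},
}


def realization(tense: str, past: str, language: str) -> str:
    """Return how AUX with the given feature values surfaces in a language."""
    return TABLE_2[(tense, past)][language]


def realized_as_word_class(tense: str, language: str) -> bool:
    """In this sketch, only English -TENSE values of AUX surface as free words."""
    return language == "English" and tense == "-"


print(realization("-", "+", "French"))
print(realized_as_word_class("-", "English"))
```

The second function simply restates the observation that French realizes every value of AUX as a verbal affix, while English realizes the -TENSE values as the word class of modals.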

Like the base composition rules of Ch. 1, rules such as (29) for expanding SP(X) are optional. If (29) does not apply, then we obtain an infinitive form; that is, a syntactically empty SP(V). Within the modal analysis proposed here, as in my previous work,

18. At this point, traditional grammar's nomenclature cannot be reconciled with the verbal model proposed here. The problem is what to call will-would and the French future/conditional suffix, which have exactly the same feature composition here. If these are not TENSE, this goes against traditional Romance terminology. If they are TENSE, this goes against my claim that English modals alternate with but don't contain what American grammar has called the TENSE morpheme. My solution is to consider the French future/conditional and English modals as MODAL, and the French and English present/imperfect/simple past as -MODAL, and to drop the term TENSE. But for familiarity, I still often use TENSE as equivalent to -MODAL, implying that the French future is -TENSE.


AUX = SP(V) and V share no syntactic feature and undergo no rule in common. So it makes no sense to speak of "auxiliary verb" rather than "auxiliary" with respect to the modals.

It is perhaps of interest to analyze why studies continue to appear supporting the contrary view that English modals are verbs. One misconception already mentioned is that many think of verbs and modals, rather than TENSE and modals, as exhibiting past-present alternations, when this is not syntactically the case. A second empirical source of confusion is that the English verbs be and have are transformationally moved into the AUX position by a rule that precedes various operations on AUX and V (cf. Emonds, 1976, Ch. 6), so that these verbs appear to be auxiliaries in the same sense that modals are - and in fact they are dominated by AUX in surface structure. Because of this, even though earlier work has shown that at a deeper level, be and have have the distribution of verbs and not of modals, many authors still assume that a modal analysis cannot avoid postulating some feature shared by verbs and modals, which they then take to be the "true" category V. Thirdly, since the number of possible morpheme sequences, excluding adverbs and parentheticals, between subject NP's and what all agree are main verbs is on the order of a few hundred, empiricist bias leads to a formal solution which is a disguised listing of possible sequences (a "lexical solution"). The modal analysis here is a "transformational solution" because it employs a level of deep structure which is both nontrivially distinct from surface structure and precisely related to surface structure by formulated rules of some generality.

We are interested of course in how particular grammars express differences among languages. From Table 2, it is clear that English and French differ in that AUX is always realized as a verbal affix in French, while in English it is realized both as affixes and as a word class (the modals). Related to this is the fact that the number of words realizing AUX, -TENSE in Modern English is more than ten, while at most the future, conditional, and subjunctive forms correspond to these feature values in French. This latter fact amounts to saying that the English AUX has a finer division of syntactic features than does French. As an example of how this should be described, we can observe that all modal auxiliaries except for will, would, can, could share some feature, say NECESSITY, that blocks their appearance in the tags of imperatives (e.g., *Leave the room, may you?). A rule in English but not in French of the form (30) generates this feature: (30)

SP(V), -TENSE => NECESSITY (optional)

The full grammar for English modals will contain a small number of rules of this type; their exact form is to be determined by how the various


modals behave under syntactic and interpretive rules (e.g., initial may appears in wishes, should one is deleted or understood in "why + infinitive" expressions, need and dare can only be used in nonassertive contexts, etc.). While we can think of these rules as simply part of the English lexicon of grammatical formatives, this leaves the formal nature of this "lexicon" unresolved. I suggest that it consists in large part of rules like (30), which can be thought of as local transformations (hence the double arrow) with a single term and no adjunctions. We could not do this if the right side of (30) contained a phrase subject to deep structure lexical insertion, but since it cannot (by principle (7) of Ch. 4), and since there are arguments for allowing late insertion of grammatical formatives (section 4.7), there is no reason to assume that rules like (30) are in a different component from that which we know differs from language to language, that of the local transformations. Thus, one important difference between French and English, the existence of a greater number of realizations of AUX in English, is formally expressed by a small set of local transformations of the simplest possible type such as (30), in accordance with the Adjacency Hypothesis. Whether a more generally applicable rule like (29) should be assimilated to language-particular rules like (30) or should be part of the base composition rules depends on the status of (29) in languages that are far removed from English.

5.6. A Comparison of French and English Verbal Inflection

In English, a word class of modals realizes the feature combination AUX, -TENSE; in contrast, the AUX in French is always realized as a verbal suffix. That is, in the surface structure of French, all deep structure sequences SP(V) - V are permuted, whereas in English, only the sequences TENSE - V are.19 In both French and English, this permutation (i) gives rise to bound inflectional morphemes; (ii) is obligatory; (iii) must be ordered after diverse rules operating on AUX, on V, and on V; and (iv) is not blocked by adverbials intervening between AUX and V, including negative adverbials (French pas, point, jamais, guère, etc., English never, scarcely, etc.), with the exception of English not. The First and Second Inflection Principles (14) and (16) given earlier explain, respectively, (i) and (ii), so it remains to discuss (iii) and (iv).

19. At the end of Ch. 4, I proposed that English verbal inflections and English comparative and superlative adjectival inflections might be generated by a single local rule permuting SP(X) and X°. This rule is supplemented by entries for the various SP(X) morphemes which list them as either +X°, in which case they are inserted only as suffixes, or without a context, in which case they are by convention inserted only in the unpermuted, deep structure position of SP(X). This chapter is concerned with the details of how verbal inflection is generated; for clarity, I have not used this generalized form of affix movement, which mentions the sequence SP(X) - X° rather than the sequence TENSE - V°. As the argument in the text concerns the explanatory role of the Adjacency Hypothesis and principles of rule ordering, nothing hinges on exactly how the categories which appear in the affix movement rule are specified. It can be observed that under either choice of categories, the Second Inflection Principle (rules mentioning X° are obligatory) implies that affix movement is obligatory.

Differences between French and English are (a) a certain number of elements in English but not French block the TENSE - V permutation, namely, the negative not-n't, the emphatic particles so-too, and an inverted subject NP; (b) when these elements block English TENSE - V permutation, the auxiliary verb do is obligatorily inserted before TENSE;20 and (c) English but not French has other rules which move the AUX, such as subject-AUX inversion and tag question formation. Many of the above characteristics of English are accounted for in a descriptively adequate fashion in Chomsky (1957), who proposes two obligatory rules ordered late in a transformational derivation, affix movement and do-insertion. In particular, his formulations describe (i-iii) and (a-b), but not (iv) and (c). I have explained (i-ii) by the Inflection Principles. In what follows, other principles of universal grammar, including Head Adjacency and the Adjacency Hypothesis, are used to explain the other aspects of Chomsky's account, and also to explain (iv) and (c). I will refer back repeatedly to (i)-(iv) and (a)-(c).

A central feature of Chomsky's original "late do-insertion" is this: at a certain point in the transformational derivation, various elements render TENSE and V noncontiguous in English. These elements are the simple negation (not or n't), an inverted subject NP, and the emphatic particles so and too; the operations of V-deletion and V-preposing (Ross, 1967; Emonds, 1976) have the same effect. If none of these elements or operations intervenes, an obligatory local rule of affix movement applies to permute TENSE and V. But otherwise, the structural description of this rule (requiring adjacent TENSE - V) is not satisfied, and TENSE must then be phonetically realized on a dummy carrier do, which is inserted before any TENSE that has not been permuted with a following verb. Chomsky's account ignores the fact that other intervening adverbials do not block affix movement. This description of English tense clearly conforms in its outlines with the Adjacency Hypothesis, which provides the explanation for why intervening elements block English affix movement - point (a) above.21

20. When the English verb is be or in some cases have, (a) and (b) do not hold; that is, be and certain have move into the AUX position early in the syntactic derivation. This movement is fully discussed in Emonds (1976, Ch. 6), and correctly generalized in Iwakura (1977) to non-finite clauses. The way have and be are raised over VP-initial adverbs into the AUX position outside VP further supports the principle of Head Adjacency, with A in (21) taken as AUX and B as have-be.

21. In Emonds (1970, 1976), I accounted for the fact that most adverbials do not block affix movement by inserting the auxiliary do only when TENSE is adjacent to V rather than to V; that is, V-deletion, V-fronting, AUX-inversion, or the presence of a negative or emphatic particle outside V in effect makes affix movement onto the main verb impossible. Although this was a descriptively adequate account of points (iv) and (a), it did not explain why some elements block affix movement and some do not.


Moreover, the Head Adjacency Principle (21) now permits us to understand how Chomsky's local structural description (TENSE—V) is compatible with the fact that intervening adverbials don't interfere with the rule applying - point (iv) above. To apply this principle to affix movement, let A = TENSE and B = V, and generate the non-blocking adverbs (scarcely, never, etc.) inside V. In contrast to these adverbs, an element which blocks English affix movement (not) is positioned outside V. An important confirmation of this analysis is provided by the fact that not can immediately precede an empty (moved or ellipted) V, while non-blocking adverbs and "floating" quantifiers (being inside V) cannot, for the same reason, immediately precede a V-extraction site. (For more examples and discussion, see Sag, 1978, esp. notes 3 and 5.) (31)

*Others may visit Chicago, but will John {ever/finally}?
I lost a copy before they could all obtain a copy.
*I obtained a copy before they could {all/even}.
Even though they did {both/barely} fulfill the quota, everyone else now refuses to.
*Even though they did {both/barely}, everyone else now refuses to fulfill the quota.

The correlation that the elements that block affix movement are those which may precede a V-extraction site in English confirms the claim that these elements are outside V. Adverbials other than not are inside the V, since they neither block affix movement nor precede V-extraction sites. Affix movement is never blocked in French, because French neither separates SP(V) and V by AUX-inversion, nor has emphatic particles analogous to English so and too, or V-preposing rules; while there is some limited null anaphora of complement V's ("V-deletion"), a V that would be otherwise finite cannot delete leaving TENSE behind, as in English. The fact that negative adverbials in French do not block affix movement simply suggests that these and other French adverbs that can follow the finite verb are within the verb phrase V, and hence are included in the Y of Head Adjacency (21). The discussion of affix movement to this point has been ambiguous as to whether TENSE moves to the right around V or whether V moves to the left around TENSE. For French, there is an argument, given in more detail in Emonds (1978), for the latter analysis. No adverbs occur in French between the subject and the finite verb, yet a series of different constituents, including certain time adverbials, negative adverbials, and "floating" quantifiers, either may or must appear before a (subjectless) infinitive: (32)

*Jean {encore/maintenant} sait les réponses.
'John {still/now} knows the answers.'
*Mon ami {toujours/souvent} prépare du poisson.
'My friend {always/often} prepares fish.'

(33)
Jean sait {encore/maintenant} les réponses.
Mon ami prépare {toujours/souvent} du poisson.

(34)
Jean a peur de ne pas {encore/maintenant} savoir les réponses. (infinitive)
'John is afraid of not {yet/now} knowing the answers.'
Mon ami {préférerait/essayera} de (ne pas) {toujours/souvent} préparer du poisson. (infinitive)
'My friend {would prefer/will try} to (not) {always/often} prepare fish.'

(35)
*Jean a peur de ne pas savoir {encore/maintenant} les réponses. (infinitive)
*Mon ami {préférerait/essayera} de (ne pas) préparer {toujours/souvent} du poisson. (infinitive)

These contrasts are automatically accounted for if SP(V) - V interchange in French is formulated as a movement of V to the left, particularly if the Head-Adjacency Principle (21) is used to eliminate mention of the intervening material.22 Thus, the following local transformation is justified: (36)

French finite verb formation: SP(V) - V => 2 + 1 - 0 (see note 23)
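The effect of (36) on the adverb facts in (32)-(35) can be illustrated with a small sketch. It is written in Python under simplifying assumptions that are mine, not the text's: a clause is a flat list of (form, category) pairs, the category labels "SP(V)", "ADV", and "V" are hypothetical tags, and every token tagged "ADV" is taken to sit inside the verb phrase so that Head Adjacency lets the rule ignore it.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (form, category); the category tags are illustrative


def french_finite_verb_formation(tokens: List[Token]) -> List[Token]:
    """Sketch of (36): SP(V) - V => 2+1 - 0, with V moving leftward.

    Adverbs between SP(V) and V are skipped, standing in for Head Adjacency.
    If there is no SP(V) (an infinitive), nothing moves.
    """
    out = list(tokens)
    spv = next((i for i, (_, cat) in enumerate(out) if cat == "SP(V)"), None)
    if spv is None:
        return out
    j = spv + 1
    while j < len(out) and out[j][1] == "ADV":
        j += 1
    if j < len(out) and out[j][1] == "V":
        verb_form, _ = out.pop(j)
        # The verb is adjoined to the left of SP(V), which surfaces as a suffix.
        out[spv] = (verb_form + "+" + out[spv][0], "V")
    return out


def forms(tokens: List[Token]) -> List[str]:
    return [form for form, _ in tokens]


finite = [("Jean", "NP"), ("PRES", "SP(V)"), ("encore", "ADV"),
          ("savoir", "V"), ("les réponses", "NP")]
infinitive = [("ne pas", "NEG"), ("encore", "ADV"),
              ("savoir", "V"), ("les réponses", "NP")]

print(forms(french_finite_verb_formation(finite)))
print(forms(french_finite_verb_formation(infinitive)))
```

In the finite clause the verb ends up to the left of the adverb, as in (33); in the infinitive, where SP(V) is empty, the adverb stays before the verb, as in (34).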

With respect to affix movement in English, two arguments can be made based on the position of adverbials that this rule, in contrast to what occurs in French, moves the TENSE to the right around V, just as proposed in Chomsky (1957). First, adverbs are generally excluded in English in the context V NP but permitted preverbally (i.e., after AUX). If a simple finite verb moved into AUX position, we would expect "finite V + adverb + NP" to be acceptable (as in French), but it is not: (37)

*John should remove carefully the paint.
John should carefully remove the paint.
*John removed carefully the paint.
John carefully removed the paint.

Second, manner adverbs are not allowed before the AUX, but they are allowed just at the left of the V. If English affix movement is to the right, it correctly follows that these adverbs can (contrary to French) precede a simple finite verb, even though they cannot precede a modal. (38)

*John completely should remove the paint.
John should completely remove the paint.
John completely removed the paint.

22. When Ruwet (1967) proposes to move TENSE to the right around the V in French, he accounts for some of the facts of adverbial placement by positing a temporal constituent which unites the tense endings and some adverbials, and is moved around the verb as a unit. If one tried to follow out this approach, the italicized sequence in "nous ne mangeons pas souvent tout" "we don't often eat everything" would be the unlikely "constituent" that would undergo the French counterpart to English rightward affix movement.

23. It is of interest to note that in French, the arguments for (36) apply to the present participle as well as to the finite verb, suggesting that French -ant "-ing" is a realization of SP(V), while its infinitive and past participle morphemes are not. In this case, it is likely that the French infinitive results simply from the deletion of SP(V).

From these considerations, affix movement is rightward in English: (39)

English affix movement: TENSE - V => 0 - 2 + 1

As in French, adverbials inside the English V do not affect the adjacency of TENSE and V, by Head Adjacency. I assume here that any adverbials outside the V are, when (39) applies, to the left of AUX. By carefully attending to the formalism of local transformations, we can now partially collapse Chomsky's do-insertion with his affix movement (b). Material between TENSE and the following V (an inverted subject, etc.) blocks affix movement. Suppose now that the target predicate V in the structural description is in parentheses in (39). Given that every local transformation has implicit in its structural description two end variables, when the instruction "X - TENSE - (V) - Y => 1 - 0 - 3 + 2 - 4" is applied to a tree in which V is not adjacent to TENSE, the effect is to substitute TENSE for the identity under concatenation on its right. Let us say that in general the identity under concatenation in such operations is construed to be as high as possible in the tree - here, outside the V. What now must be added to affix movement is something which turns the vacuous movement of TENSE to its right in the absence of a following adjacent V into the insertion of the conjugated auxiliary do before TENSE. That is, when the term "(V)" is null, this term should be replaced by do as the unmarked or designated grammatical representative of some inserted category. If we simply insert the feature complex [V, ACTIVITY] by a post s-structure transformation, the Designation Convention of Ch. 4 insures that the grammatical verb do is the element inserted. To do this, I introduce a notational convention by which rules with multiple sets of parentheses can be required to be read with one or the other set, but not with different sets simultaneously: (40)

Multiple parentheses convention: When there are non-nested sets of parentheses in a single rule, the rule applies using at most one set of these parentheses.

By using (40), we can subsume do-insertion under affix movement, thus accounting for the behavior of the conjugated auxiliary do: (41)

English finite verb formation: TENSE - (V) => 0 - 2 + 1 (ACTIVITY)
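The way (41) folds do-insertion into affix movement can be sketched procedurally. The Python below is only an illustration under assumptions of my own: clauses are flat lists of (form, category) pairs, the tags "TENSE", "ADV", and "V" are hypothetical labels, tokens tagged "ADV" are taken to be inside the verb phrase and so invisible to the rule by Head Adjacency, and any other intervening token (not, an inverted subject, an emphatic particle) counts as blocking.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (form, category); the category tags are illustrative


def english_finite_verb_formation(tokens: List[Token]) -> List[Token]:
    """Sketch of (41): TENSE moves rightward onto an adjacent V;
    otherwise the designated verb do is inserted to carry TENSE."""
    out = list(tokens)
    idx = next((i for i, (_, cat) in enumerate(out) if cat == "TENSE"), None)
    if idx is None:
        return out
    tense_form = out[idx][0]
    j = idx + 1
    while j < len(out) and out[j][1] == "ADV":
        j += 1  # VP-internal adverbs do not disturb adjacency
    if j < len(out) and out[j][1] == "V":
        # Longest structural description: the adjacent V is term 2.
        out[j] = (out[j][0] + "+" + tense_form, "V")
        del out[idx]
    else:
        # No adjacent V: ACTIVITY (spelled out as do) carries TENSE in place.
        out[idx] = ("do+" + tense_form, "V")
    return out


def forms(tokens: List[Token]) -> List[str]:
    return [form for form, _ in tokens]


plain = [("John", "NP"), ("PAST", "TENSE"), ("carefully", "ADV"),
         ("remove", "V"), ("the paint", "NP")]
negated = [("John", "NP"), ("PAST", "TENSE"), ("not", "NEG"), ("leave", "V")]
inverted = [("PAST", "TENSE"), ("John", "NP"), ("leave", "V")]

for clause in (plain, negated, inverted):
    print(forms(english_finite_verb_formation(clause)))
```

The if/else mirrors the multiple parentheses convention (40): either the V term or the ACTIVITY term is used, never both.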


The multiple parentheses insure that the verbal sub-category ACTIVITY (whose unmarked representative is do) is attached to TENSE only in case the sub-tree in which (41) is to apply has no V adjacent to TENSE. As in Chomsky and Halle (1968), the longest structural description must be used if it can be - if V is adjacent to TENSE in a tree, this V necessarily counts as the second term in (41).24 Having found different and locally statable representations (36) and (41) for finite verbal inflection in English and French and having explained why adverbials don't block these processes, we can now return to the last property they have in common: their application late in a derivation. English affix movement, though its effects are entirely within the domain S, depends on configurations produced by rules that apply to AUX (tag question formation, AUX inversion) and to V (V-deletion and fronting), even though these rules clearly require access to a domain larger than S. Similarly, in French, finite verb formation must follow a quantifier movement rule, a rule determining past participle morphology (the same holds for English), a rule moving clitics into preverbal position, and probably the French version of V-deletion (Emonds, 1978). This apparent ordering goes against the cyclic principle by which rules apply in smallest S domains first, and in any order in a single domain. One possibility would be that the property by which finite verb raising unites two separate syntactic categories (AUX and V) into a single word determines its later ordering. Without ruling out this "morphological" alternative, the following seems to me more precise and at the same time more general: (42)

S-structure Principle. A language-particular rule which mentions an unrestricted lexical category X° and material outside a minimal phrase containing it depends on s-structure configurations.

A language-particular rule which adjoins, for example, AUX (= INFL) to V prior to s-structure is forbidden by (42). At first glance, this appears incompatible with the suggestion of Chomsky (1981, 257-258) that English affix movement takes place after s-structure, in accord with (42), while affix movement in a language like Italian takes place "in the syntax" (prior to s-structure). However, the extremely pervasive language type in which all values of INFL are realized on V (German, Romance languages, Japanese, etc.) is quite possibly not language-particular and hence not subject to (42). That is, the unmarked value of Chomsky's "pro-drop parameter" may be that INFL is "merged" completely with V at s-structure, so that only grammars in which there is no merger or partial merger (e.g., English, Chinese) contain language-particular rules subject to (42).

24. We might wish to alter this interpretation, and to say that contrastively stressed do results from the optional choice of both sets of parentheses in (41). But since this interpretation does not account for stress being obligatory on a do which is inserted even when a V is present (Joe did leave vs. *Joe d'd leave), it may be better to assume, with Chomsky (1957), that contrastive do is the surface realization of an abstract emphasis formative, generated to the right of AUX outside the V.

According to (42), any language-particular rule uniting V and material outside V will be post-cyclic; the domain of (42) is therefore wider than simply the rules that produce inflection. For example, principle (42) can explain the later ordering of the root transformation of verb-second in Dutch and German. Similarly, the local rule of verb-fronting in Breton (Emonds, 1980b), which may well have exactly the form of (36) with the addition of a context predicate NP between TENSE and V, is correctly predicted to follow rules such as topicalization. Harlow (1981, 222) proposes a similar rule for Welsh: TENSE - NP - V => 3 + 1 - 2 - 0. Finally, I can here make a suggestion for why French has no rules that move AUX (point (c) earlier) - recall that French finite verb formation is not an affix movement, but a V movement. While proposal (43) has a number of interesting implications, nothing in the rest of the development here logically depends on it. (43)

No movement transformation can apply in a language to (a subcategory of) a category which is not realized as a word class.

That is, by (43), English constructions such as inverted AUX, syntactically variable tag questions with AUX, and affix movement are made possible just because AUX is realized as the word class of modals in English. In this section, I have attempted to attribute certain aspects of French and English verbal inflection (i-iv), which would seem to distinguish these processes from other transformations, to some general principles that eliminate the need to stipulate any rule-particular conditions on (36) and (41). In brief, the Inflection Principles (14) and (16), justified independently of verbal morphology and to be further generalized in the following sections, explain (i) and (ii). The late ordering (iii) of (36) and (41) is attributed to the S-structure Principle (42), which is justified by other effects inside and outside the domain of morphology. And the difference between what elements block the adjacency of SP(V) and V in (36) and (41) and what elements do not (a and iv) is explained by the interaction of the Adjacency Hypothesis and the Head Adjacency expressed in (21), the latter also being justified for some non-inflectional processes by the earlier discussion in section 5.4. All the grammatical principles invoked, except for the First Inflection Principle (which is itself essentially the definition of inflection), have important roles to play in explaining aspects of grammar traditionally conceived of as syntax proper, as the discussion of each of them demonstrated. If one tries to take the step of separating inflection from transformational syntax, one risks losing the unified explanations these principles provide, as well as leaving unexplained the gaps in the specifier paradigms typically filled by inflections.


Another point that emerges from the discussion is that the statements of individual language-particular rules are essentially simple, rarely being as complex even as (41). While the most general formulations of the principles which obviate rule-particular stipulations may still escape us - for example, the post-cyclic principle (42) may be due to some general requirement of preserving phrasal typology through a derivation - it is nonetheless an advance if the analysis of morphology yields principles and hypotheses to study, criticize, and refine, in place of informal descriptions of paradigms. It is also perhaps reassuring that initial alluring aspects of the first widely read publication in generative grammar (Chomsky, 1957) can be retained in what is hopefully a more advanced theoretical framework, with correspondingly more mature explanatory goals. That is, Chomsky's original analysis of English verbal morphology involves a reordering which is obligatory, which is integrated with syntax, and which crucially depends on adjacency, late application in a derivation, and abstracting away from a range of adverbial-like elements. I have tried to show here that all these characteristics are justified. Indeed, some of the surest initial candidates for the language-particular local transformations which are the heart of the Adjacency Hypothesis have been found to stand up under the test of time.

5.7. The Source of Morphological Case

A final type of inflectional morpheme that will be considered here is morphological case. For example, German determiners and adjectives, and to a lesser extent, German nouns exhibit case-marking; case inflection on nouns, determiners, and adjectives is found pervasively in Classical Greek and Latin. On the other hand, case inflection is not found in NP's or AP's in English, French, or Spanish, except in a few pronoun contrasts and on possessive NP's. In Emonds (1976, Ch. 5) I argue that the remnants of case found on English pronouns should not be generated by the mechanisms for morphological case; it is the purpose of this section to elucidate what these mechanisms are. As a corollary, it will be seen why English pronominal case-marking is exempt from their effects. Similarly, in Emonds (1976, Ch. 6), I argue that the distinction between French indirect and direct object pronominal clitics is not a true case phenomenon, and this also will follow as a consequence of the theory proposed here. In general, a true case-marking rule has to be "healthy", or it ceases to be a case-marking rule at all; i.e., I claim that "marginal" case phenomena are always vestigial, in the sense that they do not realize any property of the universal theory of case other than being a historical reflex of an earlier system with such properties. In Ch. 1, a universal principle of case-marking (72), revised there as (76), assigns case to NP's and AP's in the transformational component, so that at least all lexical NP's and AP's interpretable as arguments to predicates must have abstract case-marks in surface structure. Traces of NP-movement may also be associated with abstract case-marks, as discussed in note 28 of Ch. 1. The universal case-marking principle, as was justified in Ch. 1, copies the categories in (44) onto the adjacent argument NP's, so that the categories of case are nothing more than copied grammatical categories of the base. The correspondences between the names for cases in traditional (Indo-European) grammars and their syntactic counterparts in formal grammar are essentially as in Chomsky (1980): (44)

nominative = SP(V); accusative = V; genitive = SP(N); dative and ablative = P.
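Since (44) treats each case as nothing more than a copied base category, the correspondence can be written down as a simple mapping. The Python sketch below is illustrative only; the names CASE_SOURCE and cases_copied_from are assumptions of mine, and the adjacency condition on the copying principle is not modeled.

```python
# The correspondences in (44): traditional case name -> base category
# whose label is copied onto the adjacent argument phrase.
# Identifiers are hypothetical, introduced only for illustration.
CASE_SOURCE = {
    "nominative": "SP(V)",
    "accusative": "V",
    "genitive": "SP(N)",
    "dative": "P",
    "ablative": "P",
}


def cases_copied_from(category: str) -> list:
    """List the traditional cases that correspond to one base category."""
    return sorted(case for case, source in CASE_SOURCE.items() if source == category)


print(cases_copied_from("P"))
print(CASE_SOURCE["nominative"])
```

The mapping is many-to-one: dative and ablative both correspond to P, so a traditional case inventory can be larger than the set of case-assigning categories.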

In this framework, "interpretable NP" is synonymous with "case-marked NP", at least for lexical NP's. The pre-theoretical contrast between languages with and without morphological case is replaced by a threefold distinction. A language may (i) realize its NP and AP case features by inflections on SP(X) and/or X, as in German, Greek, and Latin, or (ii) realize its case features on NP and AP as Xmax units, but not on SP(X) or X, as in Japanese or in English possessive NP's, or (iii) not realize its case features morphologically at all, as in French, Spanish, and elsewhere in English. For purposes of discussing inflection, the last two of these situations fall together, since neither involves a morpheme bound to a lexical category X or to a grammatical category SP(X), rather than to a phrase. I do not deny the importance of formulating case-marking rules for a language like Japanese, or showing what theoretical changes they might induce; I simply observe that such formulations are outside the scope of this chapter. I am here concerned solely with how the universal case-marking of phrases is translated into the language-particular case-marking of heads and specifiers of phrases. The theory of possible systems of inflectional case can be subsumed under the answers to two questions: when may a language have a German/Greek/Latin system of inflectional case, and if it does, what kinds of rules are available for realizing these inflections in different ways? In part, these two questions are interrelated, because when a language has inflectional case, we want to know what an unmarked realization of such a system is. I think it is most plausible to assume that in an unmarked inflectional system, case features are realized on the head N and A of NP's and AP's, and only secondarily elsewhere inside the phrase.
The case system of German can give one pause in this respect, because in standard German, morphological case is typically realized on the determiner and adjectival modifiers of the head noun, rather than on the N itself. However, case is by no means absent from the German N entirely; genitive singulars, dative plurals, and, for certain word classes, nominative singulars exhibit distinctive case. But when we look further, for example, at Greek, Latin, and Russian, the lexical head is clearly the central carrier


of case, since a determiner is apparently syntactically optional. This is a fortiori so in case-marked adjective phrases, since specifiers of AP typically show no case-marking at all. I therefore conclude that an unmarked inflectional case system is one in which the phrasal case feature is realized on the lexical head of that phrase.

I propose now to answer the question posed above as to when a language may have a true inflectional case system. Formally, I define a category C as morphologically transparent in XP if and only if the rules of morphology of a language yield a productive (indefinite) number of pairs of minimal XP which differ phonologically only by virtue of whether XP contains C. Then:

(45) Morphological Transparency: The abstract case features of lexical XP (i.e., of NP's and AP's) are syntactically realized on their lexical heads if and only if they are morphologically transparent in XP.
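As a rough illustration only, the biconditional in (45) can be sketched in Python. The names `CaseContrast`, `is_morphologically_transparent`, and `heads_carry_case` are hypothetical, and reducing "productive (indefinite) number of pairs" to a boolean flag is a simplifying assumption rather than part of the theory. The sample pairs are the English pronoun pairs and German determiner pairs cited in the discussion of (45).

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class CaseContrast:
    """Apparent case contrast among minimal XP's in one language (illustrative)."""
    language: str
    sample_pairs: Tuple[Tuple[str, str], ...]  # e.g. ("he", "him")
    # Assumption: True when the morphology yields an open-ended set of such
    # pairs, False when the pairs form a short closed list.
    productive: bool


def is_morphologically_transparent(contrast: CaseContrast) -> bool:
    # Transparency requires a productive (indefinite) number of minimal
    # pairs, not merely a handful of listed ones.
    return contrast.productive


def heads_carry_case(contrast: CaseContrast) -> bool:
    # (45) is a biconditional: case features are realized on lexical heads
    # if and only if they are morphologically transparent.
    return is_morphologically_transparent(contrast)


ENGLISH = CaseContrast(
    "English",
    (("I", "me"), ("he", "him"), ("she", "her"), ("we", "us"), ("they", "them")),
    productive=False,
)
GERMAN = CaseContrast(
    "German",
    (("der Mann", "den Mann"), ("dieser Apfel", "diesen Apfel"),
     ("ein Staat", "einen Staat")),
    productive=True,
)
```

On this sketch, `heads_carry_case(ENGLISH)` is false even though five pairs exist, because the list is closed, while `heads_carry_case(GERMAN)` is true.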

It should be noted that morphological transparency is a relation among C, XP, and the grammar of a language, and not among C, XP, and a particular derivation. I assume that conjoined phrases have multiple heads, one in each conjunct. By a minimal phrase XP is meant one which properly contains no other XP'. Put somewhat differently, Morphological Transparency means that a language-learning child will not recognize phonological phenomena as case inflections (indicative in turn of underlying case-marked heads) unless the phenomenon is productive. Non-productive phenomena will be regarded as realizing some other type of rule, some examples of which will be given in section 5.8.

Since the number of pairs of minimal NP's which exhibit apparent case in Modern English is five (I/me, he/him, she/her, we/us, and they/them), in French is zero (all apparent case distinctions are within the verbal clitic system), and in Spanish is two (i.e., outside the clitic system, the two pairs are yo/mí and tú/ti), it follows from (45) that N's in these three languages do not carry case features. In contrast, the German-speaking child is confronted with a productive number of pairs of minimal NP's which differ only by a case inflection on, in the German situation, the SP(N). For example, a range of determiners exhibit a nominative/accusative distinction with the productive class of masculine singular nouns: der Mann/den Mann, dieser Apfel/diesen Apfel, ein Staat/einen Staat, etc. By (45), the German-speaking child then considers the lexical head of the NP to be marked with the case features V (accusative), SP(V) (nominative), P (dative), or SP(N) (genitive).

German offers a good example for a preliminary discussion of language-particular case-marking rules. By Morphological Transparency, only the head of the phrase is case-marked. The following rule operates in German to copy these features onto the determiner:

(46) SP(N) - [N CASE FEATURE]

By Head Adjacency (21), the N in (46) is the head of the N which is adjacent to SP(N), i.e., N and SP(N) are head and determiner respectively of the same NP, as desired. In section 5.9, an extension of the Second Inflection Principle (16) will suffice to make rules of morphological case-marking like (46) obligatory. In all likelihood, (46) is part of a more general principle which insures agreement in not only case, but also in number and gender between a determiner and a noun; however, extending the scope of (46) to features other than case is not my purpose here.

Following the assignment of case features to the head noun by (45) and to the determiner by (46), German has rules which insert the grammatical formatives of morphological case. Typically, such rules apply in contexts created by transformations, i.e., in the post s-structure or "phonological" component, as discussed in section 4.7. The exact form of these phonological determinations is not at issue here; for explicitness, we can use the formalism of Jensen and Jensen (1984) for the lexical entries of inflectional endings:

(47) (a) -n / [+PLUR, P]
     (b) -s / [-PLUR, SP(N)]
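Read procedurally, rule (46) and the entries in (47) amount to a copy step followed by a context-sensitive lookup. The following Python sketch is a minimal illustration under stated assumptions: the `Form` class, the function names, and the encoding of contexts as predicates are hypothetical, and only the two endings listed in (47) are covered.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Form:
    """A determiner or noun with the features that (46) and (47) mention."""
    stem: str
    plural: bool
    # Abstract case feature: "V", "SP(V)", "P", or "SP(N)"; None if unmarked.
    case: Optional[str] = None


def copy_case_to_determiner(det: Form, noun: Form) -> Form:
    # Rule (46), read as feature copying: the determiner SP(N) receives
    # the case feature of its adjacent head N.
    return replace(det, case=noun.case)


# Entries (47), each as an ending paired with its context (hypothetical encoding).
ENDINGS: List[Tuple[str, Callable[[Form], bool]]] = [
    ("-n", lambda f: f.plural and f.case == "P"),            # (47a)
    ("-s", lambda f: (not f.plural) and f.case == "SP(N)"),  # (47b)
]


def ending_for(form: Form) -> Optional[str]:
    # Return the first ending whose context matches; other German endings
    # are outside the scope of this sketch, so None means "not covered".
    for ending, context in ENDINGS:
        if context(form):
            return ending
    return None


# Usage: a dative plural head noun passes its case to the determiner.
noun = Form("Kind", plural=True, case="P")
det = copy_case_to_determiner(Form("d", plural=True), noun)
```

Because the context in (47a) mentions only +PLUR and P, the same entry applies to the determiner and to the noun once (46) has copied the feature.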

The first formula in (47) means that -n appears in German phonological forms as an ending on all plural forms (determiners and nouns) with the dative case mark P. The second formula means that -s appears as an ending on singular forms with the genitive case mark SP(N).²⁵

25. In (46) and throughout this discussion, I ignore the case agreement morphemes that appear on German adjectives, that is, the adjectives which are preceded either by no determiner, or by a determiner which is not case-marked. The agreement on weak adjectives is discussed in van Riemsdijk (1983, 233-234). German strong and "mixed" adjectival case may be accounted for by (i) attaching non-case-marked "ein-words" as proclitics on following adjectives, and then (ii) moving adjectives themselves into any determiner position which is empty. Strong adjectives will then be in determiner position and will correctly receive the case endings of determiners. In rule (19), number is transferred from the determiner to the head, and in (46), case is transferred from the head to the determiner. The other agreement between determiners and head nouns in many languages is that of grammatical gender. It is most plausible that gender is a lexical feature of nouns; then its transfer to the determiner, parallel to that of case, is transformational. I do not mention -FEMININE in (47b) since I think that the confluence between the general plural and the feminine singular endings and pronouns in German is more than accidental. Grammatically, the German "feminine" may be a lexical and purely formal plural marking which is invisible in syntactic rules which mention +PLURAL, such as number agreement or dative plural case-marking (47a). It should be recalled that Latin first declension feminine nouns are known to have been originally syntactic plurals. In general, I am suspicious of the masculine-feminine dichotomy that traditional grammar ascribes to forms that we would today consider part of linguistic competence. More precisely, I think it is correct to posit a purely syntactic feature ±ANIMATE, whose role in the syntax of many languages including English is pervasive, and which plays an important role in semantic interpretation. But any semantics associated with what grammar calls masculine/feminine, on the other hand, seems to me to be purely pragmatic; moreover, any masculine/feminine differences in syntax can be arguably reduced to distinctions such as ±SINGULAR, ±PLURAL, ±ANIMATE, etc.

A central question of theoretical syntax posed by case-inflecting languages like German that are morphologically transparent (in the sense of (45)) concerns the status of those NP's marked for case which are not in the domain of some appropriate overt case-marking category (as defined in the Generalized Case Marking principle (76) of Ch. 1). Some simple examples of such NP's are German indirect objects in the dative case, German NP complements of adjectives in the dative and genitive, and Latin NP's of instrument and cause in the "ablative" case. Descriptively, we can call such NP's, using a term of W.D. Whitney and Kurylowicz, "adverbial case NP's", even though the term is misleading. Adverbial case NP's are not irregular in the languages in question, in the sense of being required by individual lexical items in lieu of NP's exhibiting some other, "regular" case. Rather, these adverbial case NP's, in their normal usage, are absolutely typical of the ways in which the languages where they appear use case-inflected NP's. (To refer to lexically irregular case, I will use the current term "quirky".)

I will argue here that the abstract case-marking theory of Ch. 1 does not need to be supplemented or complicated in any way by some subtheory or parameter which describes adverbial case NP's in terms of "autonomous case", "inherent case" or "semantic case". Rather, adverbial case NP's receive case, in all their central uses, by being objects of lexically empty structural P, sometimes accompanied by one or two syntactic feature values, as indicated in (48).

(48) Deep and surface structures of adverbial case NP's, including German datives, Latin and Greek datives and prepositionless ablatives, etc.²⁶

     P                       [NP, P]    (Assigned by (76), Ch. 1)
     (α DIRECTION)              |
     (β LOCATION)        adverbial case lexical NP
     |                          |
     ∅                       [N, P]     (Allowed by (45) above)

26. The accusatives with prepositions expressing "motion toward" in the Indo-European case-inflecting systems of German, Greek, Latin, Russian, Sanskrit, etc. are not assigned by P, as explained in section 1.8. Thus, they do not fall under the proposal shown in (48). Rather, these NP's are assigned accusative by virtue of the fact that directional complements are inside the smallest V; an intransitive V (such as go, run, etc.) or the V case feature on the direct object of a transitive V (such as push, drop, etc.) assigns them case and thus allows them to be interpreted as directional rather than locational complements. In Sanskrit and Greek, the P of motion typically appears compounded with the V (cf. Whitney, 1889, 395-400). Whether this compounding is lexical and appears in deep structure, or is achieved transformationally, it implies that at s-structure the P is not available to case-mark its semantic object. The latter function must be performed by V. Given this situation in the syntax, a rule of semantic interpretation is necessary to interpret such accusative NP's as directional rather than locational. I assume the same rule is operative even in the languages (German, Latin, Russian) where directional P's remain adjacent to their objects.

There are multiple and detailed justifications throughout this book (Chs. 1, 3, and 7) of empty P's in the analyses of English and French, so the existence of the structural type (48) is not novel. Recall further that PP's are freely generated (cf. Ch. 1) as sisters to any projection of N, V, or A, so we expect adverbial case NP's in a wide range of positions. What remains to be explained is how P with an adverbial case NP is allowed to remain empty throughout a transformational derivation. I will show that analyzing adverbial case NP's as instances of (48) has the following advantages:

(49) (i) The unmarked "adverbial" case (i.e., the case assigned by [P ∅]) is the same as the unmarked case assigned by lexical P; marked adverbial cases (e.g., genitives) will appear in the same proportion as they appear after lexical P (as "quirky" case).
     (ii) A general principle which complements Morphological Transparency (45) can be given which predicts when the P in (48) remains empty and when it must be filled.
     (iii) The universal definition of indirect object given as (85) of Ch. 1, with a slight refinement, is valid for both case-inflecting and isolating languages.
     (iv) The appearance of lexical P's in case-less languages like English and French is predicted, and differences between case-less and case-inflecting languages are reduced to a minimum.
     (v) Many apparently unique syntactic properties of adverbial case NP's fall into place.
     (vi) Analyses are available for capturing morphological generalizations for highly inflected languages which resist formulation when adverbial case NP's are taken as syntactic primitives.

I will first discuss the various points in (49) with respect to German. Then I will extend my claims in (49i, v, and vi) about language-particular aspects of adverbial case NP's to Latin, Greek, and Sanskrit, in the hope


that my proposal can be developed further by those more knowledgeable about other case-inflecting languages.

In the German four-case system, nominative, accusative, and genitive NP's appear as expected as subjects, objects, and complements to nouns, respectively. Once the accusative NP's after prepositions of motion are analyzed as in note 26, it is clear that the regular case assigned by lexical P is dative. Certain small sets of P assign lexical (quirky) accusative case to their object (e.g., für 'for', +[NP, V]) and others assign quirky genitive case (e.g., während 'during', +[NP, SP(N)]); these quirky cases are parallel to the irregular dative and genitive cases assigned by certain V.

As (48) and (49i) lead us to expect, the principal adverbial case in German is the case assigned by lexical P, namely the dative. I will first examine two instances, and return below to more details. The subcategorization for a typical German verb with a direct and indirect object (schenken 'send', lesen 'read') is exactly as in English (cf. section 1.6): +___NP NP. (As mentioned in Ch. 1, left-right order is determined by principles of word order and not by lexical entries.) By the principles of direct and indirect θ-role assignment and the universal definition of indirect object (85) developed in Ch. 1, this subcategorization feature is necessarily realized as (50); in German, the V is V-final in deep structure.

(50) [Tree diagram; only the label [N, P] survives.]

The empty P in (50) assigns dative case just as would a lexical P. No stipulations about case are needed, other than the rules of German morphology for datives.

The case on the NP complements to German adjectives is studied in some detail in van Riemsdijk (1983). While van Riemsdijk points out that some A take full PP complements, others are quite clearly subcategorized simply for NP. By the principles of Ch. 1, A cannot have an NP sister in the base, and is hence not itself a case-marking category. That is, its unmarked (non-quirky) subcategorization feature +___NP can only be realized as in (51), where NP "constitutes" a sister to A.²⁷

27. Van Riemsdijk (1983) points out that the subcategorized NP complements of A, whether they receive (regular) dative case or (quirky) genitive case, may not appear after the head A within the AP. He argues that the German A is parallel to the German V in not allowing an NP to follow the deep structure position of the head. In my framework, van Riemsdijk's restriction (33), namely, *[+V⁰ +N^max], must be interpreted to include all N^max which "constitute" a PP sister to +V⁰ as well as NP's which are sisters.

(51) [Tree diagram; only the label [N, P] survives.]

Again, as with indirect objects, the empty P in the structure (51), which results from the theory of subcategorization in section 1.6 and the simplest possible subcategorization frame, correctly case-marks these adjective complements as dative without a stipulation. It is clear from van Riemsdijk's discussion that dative is the regular case after a German A, and that genitives and accusatives are to be considered quirky, just exactly as they are after lexical P.

The present theory thus predicts that NP complements to adjectives and indirect objects will fall together as far as syntactic case-marking is concerned, and this is confirmed by both German and other languages to be considered below. This result does not depend on an unnatural extension of the notion indirect object to include (first) NP complements to adjectives. The universal principle for surface interpretation of indirect objects ((85), Ch. 1) is to be considered optional, and is rendered obligatory only by the intrinsic semantic representation of verbs of transferral.

The characteristic of German indirect objects and unmarked NP complements to A which is not explained by what has been developed so far is why the empty P's which provide the case-marking (and whose existence is required by the principle of Indirect θ-role Assignment in section 1.6) can remain empty throughout the transformational derivation. Intuitively, the reason is that the case-inflections themselves serve to indicate the structural presence of this P. A parallel can be made with verbal inflection; SP(V) can remain empty in its base position when it is spelled out, by virtue of affix movement, on the head of its phrasal sister VP. Formalizing,

(52) Invisible Category Principle. An obligatory closed category B (such as a SP(X) or P) with a feature C may remain empty throughout a derivation if C is morphologically transparent in a phrasal sister of B.

Thus, by (52), an INFL which is not a modal can remain empty (i.e., not dominate a lexical item) throughout a derivation since its feature TENSE (-MODAL) is morphologically transparent in its phrasal sister VP. Similarly, we can consider the English DET to be uniformly obligatory with count nouns (e.g., *bad student came in late), and use (52) to explain why plurals can appear without a DET (e.g., bad students came in late). Here, B = DET, PLUR = C, and the phrasal sister is N. The Invisible Category Principle equally well predicts why the P in (48) can remain empty in case-inflecting languages (49ii). Because of


Generalized Case-Marking, the category P is present on its object NP, universally. Taking P = B = C in (52), P can remain empty through a derivation only if there is a productive distinction between minimal NP's which are +P and those which are -P. Since German has such a distinction, (50) and (51) are licensed through a derivation. But, as seen earlier, there are no minimal NP's in English and French which differ only by virtue of being dative or not, much less a productive number of them. Thus, the Invisible Category Principle does not license empty P's in these languages, as required (49iv). There may of course be zero prepositions in the two languages, but these require a stipulation beyond (52).

Let me now turn to another instance of adverbial case NP's in German, brought out in van Riemsdijk (1983). While I do not follow van Riemsdijk's case-marking system here exactly, I owe much to his careful discussion. Van Riemsdijk discusses the possessive NP's in German which "double" a possessive pronoun.

(53) [[NP Dem Mann] [DET sein] Vater] 'the man his father'

It seems to me that (54) is a minimal descriptive statement of this German structure which English lacks:

(54) Full German NP's can be immediately preceded by a "doubling" NP coreferential with the possessive specifier.

Since the doubling NP immediately precedes the possessed NP, it is plausible to assume that they together constitute an NP. Moreover, since the possessed NP is selected and case-marked by the surrounding structure, it is the head of the containing NP:

(55) [Tree diagram; legible labels: NPᵢ dominating Dem Mann, DET dominating sein, NP, and N dominating Vater.]

By the base composition rules of Ch. 1, the only type of NP allowed as a complement to N^j is a PP, so the full structure of (55) must be (56):

(56) [Tree diagram; legible labels: NP, NPᵢ dominating Dem Mann, sein, Vater.]


The only case-marker available for NPᵢ is P, so necessarily, the theory I am proposing requires that this "NP-external" possessive be case-marked dative in German; since this is empirically so, the theory is supported.²⁸

28. The precedence requirement expressed in (54) appears to override the more general word order principle by which PP's follow the head in German. This might be best expressed by extending the filter for German discussed in note 27 to [L⁰ - NP]. Van Riemsdijk develops in his study a theory of dative as the unmarked oblique case in German, to account both for the dative of possession just discussed and for the fact that some appositive NP's have the option of either agreeing with the NP they modify or of being in the dative. If possible, I would like to avoid recourse to a "fall-back" morphological case, at least for German, and conserve the notion that whenever case-marking fails, then the case filter applies. Of the appositive NP types given by van Riemsdijk, all those which can alternatively appear in the dative in fact modify NP's which are analyzed here as being in PP's, while those appositives which must agree are not in PP's. In particular, dative case is not allowed for appositives to nominative and direct object NP's, while dative case is allowed for appositives to post-nominal genitives (cf. the discussion of (86), Ch. 1, to see that these are in PP's) or to quirky accusative or genitive objects of lexical prepositions. The theory developed here can express naturally when non-agreement in appositive NP's is allowed by a statement such as "appositive NP's may be alternatively case-marked by the closest potential case-marking category governing them". Only one of van Riemsdijk's examples, a dative appositive to a quirky genitive object of a verb, does not immediately fall under this statement. However, quirky genitives may in fact be PP's with an empty P that selects a genitive. As will be seen below in discussing Sanskrit and Greek, there are prepositions in those languages which are followed by genitives, including null P. If quirky genitives are always PP's, we expect that they sometimes will behave unlike simple NP's with respect to other rules of grammar, but like PP's instead.

We have now seen that the case-inflecting system of German supports the proposal that regular (non-quirky) adverbial case NP's be represented as objects of empty P, as in (48). Grosso modo, the German NP object of P is in the dative, whether P is lexical or not: when it is not, we have "adverbial" datives such as indirect objects, unmarked complements to A, possessive NP's, etc.

Latin presents a slightly more complicated case system, one which at first glance seems to undermine my approach to adverbial case NP's. For our purposes here, the Latin nominative, accusative, and genitive can be said to have the same fundamental distribution as their German counterparts. Among other things, lexical prepositions of "motion toward" take the accusative case, as in German; see note 26. While there are important differences in the behavior of these cases, such as the existence of a Latin accusative subject for infinitives, they do not impinge on a discussion of adverbial case NP's. Of course, there are also many small differences; for example, among the lists of quirky case assigners.

Corresponding to the German dative case, there are two Latin cases, traditionally termed the dative and the ablative. The Latin dative might seem to disconfirm my proposal that adverbial case NP's receive case from P, since no lexical Latin P takes the dative. But before turning to the dative, I discuss the ablative. In Latin, the ablative is the unmarked case assigned by lexical P. Moreover, quite as predicted in (49i), the ablative is the prototypical Latin adverbial case; there are prepositionless ablatives of separation, cause, comparison, instrument, means, price, and time when (Greenough et al., 1975). However, the Latin ablative differs from the German dative in two ways: (i) the Latin ablative never represents the prepositionless indirect object, and (ii) the Latin ablative can express all the above adverbial roles without a preposition, while the German dative cannot. Leaving aside (i) for the moment, the fact is that the present framework is fully compatible with (ii); i.e., Latin grammar need only specify that null prepositions exist for these relatively unmarked notions, and the ablative case-marking on their NP objects is automatic.

For further discussion, it will be useful to have a table of the fundamental syntactic features on prepositions which I believe are valid across languages, along with their typical realizations in English.

(57)

Table 3. Features on P:

                                    +LOCATION                   -LOCATION
+GOAL                               to (displacement toward)    for (benefactive)
Unspecified for GOAL (= 0 GOAL)     in/at (locative)            with (instrument, accompaniment)
-GOAL (SOURCE)                      from (ablative)             by (agent, cause, means)
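For concreteness, the feature table can be held in a small lookup structure. The Python sketch below is illustrative only: the dictionary name, the string encoding of feature values, and the helper functions are hypothetical, and the cell assignments assume the usual reading of the table (GOAL values as rows, LOCATION values as columns).

```python
from typing import Dict, List, Tuple

# Keys are (GOAL, LOCATION). GOAL is "+", "0" (unspecified), or "-" (SOURCE);
# LOCATION is "+" or "-". Values are the typical English realizations.
P_FEATURES: Dict[Tuple[str, str], str] = {
    ("+", "+"): "to",     # displacement toward
    ("+", "-"): "for",    # benefactive
    ("0", "+"): "in/at",  # locative
    ("0", "-"): "with",   # instrument, accompaniment
    ("-", "+"): "from",   # ablative
    ("-", "-"): "by",     # agent, cause, means
}


def english_realization(goal: str, location: str) -> str:
    """Look up the typical English P for a pair of feature values."""
    return P_FEATURES[(goal, location)]


def realizations_where(not_goal: str, location: str) -> List[str]:
    """Cells whose GOAL value differs from `not_goal` and whose LOCATION matches."""
    return [p for (g, loc), p in P_FEATURES.items()
            if g != not_goal and loc == location]
```

A call such as `realizations_where("+", "-")` picks out the cells that are not +GOAL and are -LOCATION, which is the region of the table the Latin prepositionless ablative is said to occupy.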

I assume these features on P are features that percolate to PP as well. The Latin ablative, as an adverbial case, typically specifies the notions characterized in the above table as not +GOAL and as -LOCATION. That is, Latin but not German has a [P ∅] which is -LOCATION, 0 GOAL. This expresses (ii). The universal definition of indirect object given in Ch. 1 is repeated here:

(58) An indirect object is an [NP, +P] which constitutes a sister to L⁰ at s-structure.

To explain (i), that the Latin ablative cannot be interpreted as an indirect object, (58) must be rendered slightly more specific:

(59) An indirect object is a sister to L⁰ at s-structure which is [PP, +GOAL NP].²⁹

29. This definition of an indirect object, no longer dependent on a case feature as in (58), is now consistent with the treatment of English prepositionless indirect objects sketched in the First Appendix to Ch. 2. I there assigned the following derived structure to prepositionless indirect objects; in line with Ch. 3, a transformational operation not falling under Move α is necessarily an adjunction to a head. Both lexical NP's receive an abstract accusative case from V by (76) of Ch. 1.

     V        [NPᵢ, V]     [NPⱼ, V]              PP
     |           |            |               /      \
    send       John       a present    [P, +GOAL]    NPᵢ
                                            |          |
                                            ∅          ∅

The NP-trace is caseless, by the Empty Head Principle of the First Appendix to Ch. 2. The interpretation of the chain of NPᵢ as an indirect object is, by (59), dependent on the presence of the +GOAL feature on the empty P and its PP projection. Under this interpretation, the trace NPᵢ is the indirect object of V, and NPⱼ is the direct object, as required.

Given this analysis by which the Latin ablative corresponds to the German dative, the following facts about Latin remain to be accounted for:

(60) (a) The prepositionless indirect object in Latin is in the dative, but our analysis so far predicts the ablative.
     (b) The most frequent case taken by adjectives with NP complements in Latin is the dative, but our analysis so far predicts the ablative.

Any grammar of Latin must be allowed one statement which asserts that Latin but not German has a second adverbial case. So our goal is to make one stipulation about the Latin dative (so far none has been made) and thereby express (60a-b). Before this step is taken, we may also note that any complete traditional grammar of Latin is required to make two further statements about the dative, including a central morphological generalization not expected in any theory which takes the various cases as primitives or autonomous.

(60) (c) A quirky dative never occurs after a preposition, even though verbs take quirky genitives, ablatives, and datives, and prepositions take quirky genitives and accusatives.
     (d) The Latin dative and ablative endings are always identical in the plural, and in the second declension singular. (Otherwise, add -i to the stem for singular datives.)

The striking morphological fact (60d) can be stated in the approach taken here in such a way as to make (60a-b-c) fall out immediately. I simply list the context for the only specifically dative ending in the lexicon as follows:

(61) -i / [N, +P, +SING, -2ND DECLENSION], when governed by [P, +GOAL ∅]
(62) "ablative endings" / elsewhere in [N, +P].
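The division of labor between (61) and (62) is a specific-context-then-elsewhere selection, which can be sketched as follows. This is a minimal Python illustration, not a morphological analyzer: the class names `Governor` and `Noun`, the function `ending_class`, and the encoding of an empty P as `lexical=False` are all hypothetical, and the function returns only the label of the ending class for an [N, +P], not an inflected Latin form.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Governor:
    """The P governing the noun; lexical=False models an empty (null) P."""
    goal: str      # "+", "0", or "-"
    lexical: bool


@dataclass(frozen=True)
class Noun:
    """An [N, +P] with the two features that (61) mentions."""
    singular: bool
    second_declension: bool


def ending_class(noun: Noun, governor: Governor) -> str:
    # (61): the specifically dative ending -i requires a singular noun
    # outside the second declension, governed by a null +GOAL P.
    governed_by_null_goal_p = (not governor.lexical) and governor.goal == "+"
    if noun.singular and not noun.second_declension and governed_by_null_goal_p:
        return "-i"
    # (62): elsewhere in [N, +P], the ablative endings appear.
    return "ablative endings"


NULL_GOAL_P = Governor(goal="+", lexical=False)
LEXICAL_P = Governor(goal="0", lexical=True)
```

Because the -i branch demands a null governor, no noun governed by a lexical P can receive it, which mirrors the observation in (60c) that a dative never occurs after a preposition.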


The statement of (61) implies that in the plural (and second declension singular), there is no Latin "dative"; in the plural, indirect objects and unmarked NP complements to adjectives are in fact "ablative", that is, in the unmarked prepositional case, just as required by (60a-b) without stipulation. The one stipulation necessary for the Latin dative is rather about the ablative - the ending changes in the singular after a null +GOAL preposition, and from this (60a-b-c) immediately follow. Thus, by virtue of a single rule for the Latin dative, the present system of analyzing adverbial case NP's captures the main facts of Latin that fall under (49i, v, and vi). Moreover, the Latin cases are related in maximally simple ways to the English and German constructs (whether NP's or PP's) that translate them, by virtue of analyzing the Latin adverbial case NP's as reflexes of underlying structural prepositions in universal grammar.

The use of identical base structures (48) for the adverbial case NP's of Classical Greek allows for a similar elegant characterization of a particularity of its case inflection. While the differences between Greek and Latin adverbial case NP's viewed in relation to each other seem extensive, they become comprehensible when seen as resulting from one fundamental deviation in each language (each in a different direction) from the regular abstract PP structures of (48).

Like German, Greek has four morphological cases, termed nominative, accusative, genitive, and dative. The first three appear as expected; moreover, as in other Indo-European case-inflecting languages, the lexical P of motion toward are followed by accusatives, as discussed in note 26. Exactly as in German, the typical case governed by other lexical prepositions is the dative, so I take the Greek dative endings (singular -i and plural -is/-si) to realize [N, P]. When the governing P is zero, an [NP, P] is, as expected, interpretable as an indirect object. In this situation, P has the feature +GOAL, as discussed for German. Also, in accord with the situation in German, NP complements to many classes of adjectives appear in the dative in Greek. Again, I take these to result from lexical entries whose form is A, +___NP, and from the requirement that A assign θ-roles indirectly, via an empty P.

Greek, like Latin, but unlike German, presents a range of uses for the prepositional case in which no overt P appears. These uses are termed datives of possession (with nouns and the copula), of instrument, of accompaniment (with or without lexical P), and of place where and time when (Ruck, 1968). Thus, as in Latin, the lexicon of Greek morphemes includes certain [P, +GOAL ∅] and [P, 0 GOAL ∅], some of which are allomorphs of lexical prepositions meaning with, in, etc.

The principal particularity of Greek case is the pervasiveness (non-quirkiness) of the genitive for "movement away". Lexical prepositions of separation such as apó 'away from', ek 'out of' and áneu 'without' are followed by genitives. Moreover, there are genitives of separation after both V's and A's (a lexical P appears for physical separation and a null P


for metaphorical separation), genitives of comparison (the idea of separation in a comparison emerges in the English alternation different from/than NP), and partitive genitives, after both nouns and verbs (eat "of" bread, etc.). The traditional grammarians attribute this use of the Greek genitive to the conflation of an earlier Indo-European ablative and genitive case in Greek (cf. the discussion of Sanskrit below). Synchronically, this conflation can be stated as a local and languageparticular Greek rule: (63)

Greek genitive: [ P , - G O A L ] - NP=>1 - [ 2 , S P ( N ) ] 3 0

In (63), the P can be +LOCATION or -LOCATION; as noted above, any P which is +LOCATION in Greek tends to be lexical rather than understood, whatever the value of GOAL involved. Rule (63) is in the synchronic grammars of neither German nor Latin. In the discussions of main clause absolute constructions in Ch. 1 and 2, it was pointed out that somewhat different P's can introduce absolutes: with/Ø in English and en/aunque/con/Ø in Spanish. In both Latin and Greek, absolutes are introduced by [P, -GOAL Ø]. Hence, Latin has ablative absolutes, and Greek, via (63), has genitive absolutes. In summary, a single language-particular rule for Greek added to the otherwise general scheme for representing adverbial case NP's (48) shows that a relatively complex case-inflecting system is actually minimally different from the other systems allowed by universal grammar, the isolating system of English/French/Spanish and the two case systems of German and Latin.

As a final example of a case-inflecting system, I will briefly discuss the adverbial case NP's of Sanskrit. I do not choose this example because it easily confirms my proposal for treating adverbial case NP's as structural PP's, as do German, Greek, and Latin. One might, in light of the traditional views on Sanskrit, take it rather as a language where this approach is bound to fail. The eminent Sanskritist W.D. Whitney, for example, claims that Sanskrit has "no proper class of prepositions (in the modern sense of that term), no body of words having for their prevailing office the "government" of nouns." However, he continues: "But many of the adverbial words indicated above are used with nouns in a way which approximates them to the more fully developed prepositions of other languages. If one and another of such words - as vina, rte - occurs almost solely in prepositional use, this is merely fortuitous and unessential. Words are thus used prepositionally along with all the noun-cases excepting the dative." (Whitney, 1889, 414). My reading of Whitney's sections on case and prepositions is that he

30. The quirky genitives after German verbs mentioned earlier may be irregular vestiges of this type of process. As such, their quirky representation may be in the form of a PP.


understands by preposition a word which must govern an NP (e.g., English for, with), whereas one that may (e.g., English in, down, before), when followed by a morphologically case-marked NP, "is directive only, determining more definitely, or strengthening, the proper case-use of the noun." (414) Empirically speaking, it seems that Whitney's claim is simply that few Sanskrit prepositions (even though he gives examples, as in the passage cited) have a sole subcategorization frame +___NP. I am quite content with this claim; in my terms, the Sanskrit counterparts to words like with and at are empty prepositions. Following Ostler (1976), I agree that in generative terms, Sanskrit prepositions and what Whitney calls "prepositional prefixes" on the verb must be classed in the category P. Whitney's impressions about the lack of solely transitive P in Sanskrit were strengthened by the fact that in classical Sanskrit, even lexically contentful P of direction are typically part of the V, and do not form a surface constituent with their logical object, as discussed in note 26. "In classical Sanskrit, the prefix stands immediately before the verbal form. In the earlier language, however (...), its position is quite free: it may be separated from the verb ... and may even come after the form to which it belongs;" (Whitney, 1889, 397-398, emphasis added). These complex verb constructions, which result in accusative NP's of "direction toward", do not directly bear on the hypothesis expressed in (48), since they are NP's whose case-marking category is V, not P, in any system. Alternatively, their case-marking is quirky, and again does not bear on the correctness of (48). Putting aside vocatives, which are related to nominatives in a regular way, the case system of Sanskrit is arrayed by Whitney as follows: (64)

The normal scheme of endings, as recognized by the native grammarians (and conveniently to be assumed as the basis of special descriptions), is this:

         Singular         Dual              Plural
         m.f.    n.       m.f.    n.        m.f.    n.
N.       s       -        āu      ī         as      i
A.       am      -        āu      ī         as      i
I.           ā               bhyām             bhis
D.           e               bhyām             bhyas
Ab.          as              bhyām             bhyas
G.           as              os                ām
L.           i               os                su

[Whitney, 1889, 106]

The dual endings in Sanskrit should be considered as reliable as the singular and plural, because dual number is a fully functioning part of the system, in no way vestigial or marginal. (W. Poser, pers. comm.) In the preceding discussion of Latin and Greek, two rules have been


proposed whose existence seems confirmed by their ability to straightforwardly derive essential aspects of what might seem like skewed systems from a four-case underlying pattern. The Latin rule provides a special ending for singular datives after P = Ø in the lexicon, while the Greek rule is a local transformation realizing the feature -GOAL as a genitive case-marking on the object of a P. The fact that these rules, whose plausibility seems to me established, explain peculiarities in two daughters of Indo-European suggests that slight modifications of them might also explain aspects of Sanskrit, another daughter of Indo-European. The lexical entry for the dative singular ending i in Latin, given as (61), is rather e in Sanskrit (ya after stems in a). However, the insertion context for the two entries is identical, except for the reference to the Latin declension. 31 Similarly, the Greek genitive rule (63) is also part of Sanskrit, except that it also is restricted to singular NP's in Sanskrit. Before summarizing the impact of these rules on the Sanskrit case system, I would remark that the instrumental plural ending does not appear to be unrelated to the other endings containing the obstruent bh. (Note this is the only stop in the system in (64).) To express this relation, I propose that the morphological instrumental plural in Sanskrit be derived by a late rule which drops the a in just this ending. Given these rules of syntax and morphology, what is left for the "elsewhere" rules of Sanskrit adverbial case morphology to express?

(65)  Adverbial Cases   Singular              Dual     Plural
      Instrumental      ā                     bhyām    bhyas
      Dative            already spelled out   bhyām    bhyas
      Ablative          like genitive         bhyām    bhyas
      Locative          i                     os       su

      (before a-drop)
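The division of labor summarized in this table can be made concrete with a small Python sketch. It is only an illustration of the argument, not part of the analysis: the function and table names are hypothetical, vowel length is not marked, and writing the stranded y as i after a-drop is an assumption made so that the output matches Whitney's attested bhis. One general "case assigned by P" supplies the endings, and three special rules (the Latin-type dative singular, the Greek-type ablative singular, and the instrumental plural a-drop) apply before it, with the locative kept as a separate stipulation.

```python
# Illustrative sketch only: all names are hypothetical, and vowel-length
# diacritics are omitted (bhyam stands for the ending with a long a).

# The single "case assigned by P" proposed in the text.
GENERAL_P_CASE = {"singular": "a", "dual": "bhyam", "plural": "bhyas"}

# Locative endings, which the text treats as a separate stipulation.
LOCATIVE = {"singular": "i", "dual": "os", "plural": "su"}

GENITIVE_SINGULAR = "as"  # taken from Whitney's scheme
DATIVE_SINGULAR = "e"     # Sanskrit counterpart of the Latin dative entry


def adverbial_ending(case: str, number: str) -> str:
    """Return an adverbial case ending, trying the special rules first."""
    if case == "locative":
        return LOCATIVE[number]
    if case == "dative" and number == "singular":
        return DATIVE_SINGULAR          # Latin-type rule
    if case == "ablative" and number == "singular":
        return GENITIVE_SINGULAR        # Greek-type rule, singular only
    ending = GENERAL_P_CASE[number]     # the "elsewhere" case
    if case == "instrumental" and number == "plural":
        # a-drop; spelling the stranded y as i is an assumption of this sketch.
        ending = ending.replace("ya", "i")
    return ending


if __name__ == "__main__":
    for case in ("instrumental", "dative", "ablative", "locative"):
        row = [adverbial_ending(case, n) for n in ("singular", "dual", "plural")]
        print(case, row)
```

Everything outside the three special rules and the locative falls out of the one general table, which is the sense in which the dual and plural syncretisms are unstipulated.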

31. In particular, the fact that rule (61) explicitly requires that P = Ø explains, in both Latin and Sanskrit, why the dative (quirky or not) never follows a lexical preposition, as observed by Whitney.

To me, it seems clear: there is a typical case assigned by P, lexical or not, in Sanskrit, just as there is in German, Greek, and Latin. Its endings are -ā, -bhyām, -bhyas. There is no name in traditional grammar for this case, because a Latin-type rule affecting dative singulars, a Greek-type rule affecting ablative singulars, and a Sanskrit-particular rule for instrumental plurals obliterate the regularity. But there is no plausible alternative analysis which somehow predicts these irregularities by simply displaying them in a seven by three array with all the slots filled; in fact, alternatives based on a notion of structure-independent autonomous morphological cases cannot in principle explain why syncretisms tend to prevail among


adverbial cases (and also among nominatives, accusatives, and vocatives, those NP's case-marked by V and SP(V)). The above account of this complex adverbial case system is still incomplete, since there still remains the fact that Sanskrit NP's which are governed by an empty [P, +LOCATION, OGOAL] (locatives) have their own specific endings. 32 Nothing in my approach here either predicts or is incompatible with this additional stipulation. But it is a stipulation, to be compared with the fact that the syncretisms evident in the Dual and Plural columns of (65) are predicted (unstipulated) in my approach. In the approach based on the autonomous or inherent (non-PP) status of adverbial and other morphological cases, syncretisms of any sort are in principle possible.

A final source of support for attributing adverbial cases to empty P's is based on the behavior of absolute constructions. Section 1.5 discussed English absolutes, proposing that they consist of an introductory P (with), an X² complement, and an NP subject of this X². Just as indirect object NP's can be introduced by lexically empty P in case-inflected Indo-European languages, it appears that absolute constructions can also be so introduced. Gary Holland (University of California at Berkeley colloquium, 1983) has argued that absolute constructions in Indo-European are of two and only two distinct types. The older "nominative absolute" is the only one traceable to the original Indo-European, having the form [S nominative NP - participial VP]. It is further, in his view, a main clause construction, coordinate with a preceding main clause but itself containing a deleted or empty finite verb, which can assign nominative case to its subject. The more recent absolute construction, which develops in parallel in many but not all branches of Indo-European, is, according to Holland, a subordinate construction. In line with what is developed here and in Ch. 6 and 7, subordination involves the category P, i.e., the subordinate absolute construction should involve three sisters, P, NP, and X². With the exception of Balto-Slavic, where subject NP's in all non-finite clausal constructions are dative, Holland argues that the case of the subject NP in the subordinate absolute is always a case which regularly accompanies prepositions of place: a genitive in Greek, the ablative in Latin, the dative in Gothic, etc. This sort of morphological case-marking is exactly what the present framework predicts; when the absolute construction becomes embedded, it must be introduced by P, and further this P, if not lexically realized (e.g., as with in English), projects itself as a typical morphological case induced by P onto its sister NP, the subject of the absolute. Thus, Holland's conclusion that subordinate absolutes in Indo-European follow

32. Like the rule for dative singulars, the rule for Sanskrit locatives should be restricted to NP's which follow an empty P. Whitney remarks: "This case is least of all used with words that can claim the name of preposition." (1889,414).


a single case-marking pattern, by which [P, +LOCATIVE] assigns case to the subject NP, lends support to my general claim that adverbial cases are always realizations of the category P, where P is sometimes lexical and sometimes empty.

Opposed to the quite derivative status which I assign to adverbial case is a view of traditional grammar that morphological cases are autonomous primitives of syntax whose structural relations with the rest of a sentence are only sometimes clarified or brought out by prepositions. More sophisticated versions of autonomous morphological case, such as that in Jakobson (1984), utilize distinctive features to group cases together in classes which allow syncretisms to be more easily expressed. But these cross-classifying features are not categories which appear elsewhere in grammar; they are ad hoc. The case-marking categories used here, which in fact do provide these cross-classifications, are motivated throughout grammar. Among them is the feature P, which groups together datives, locatives, instrumentals, benefactives, and ablatives. These adverbial cases are subdivided by the natural subclasses of prepositions given in (57). Nominatives and accusatives are a class by virtue of the fact that they are case-marked by V or SP(V); they are both arguments of V in their main uses. Nominatives and genitives are a class by virtue of being case-marked by SP(X). It seems to me then that the advantages claimed in (49v-vi) for uniformly deriving adverbial case NP's from structures as in (48) are real: unique syntactic properties of adverbial case NP's fall into place, and analyses are available for capturing morphological generalizations for highly inflected languages which resist formulation in terms of adverbial case NP's taken as syntactic primitives.

5.8. The False Case of English Pronouns

Let me now turn to comparing the genuine morphological case-marking of German and similar languages to the impoverished case distinctions among English pronouns. Here we will be interested in the version of American English where grammaticality is assigned as in (67-70). Of course, as in almost all fully learned versions of English, this version has the simple subject/object contrasts as in (66).

(66)  Lately, he (*him) usually makes dinner.
      Does John think that we (*us) like her (*she)?
      Betty knows they (*them) are talking about me (*I).

(67)  Mary and him are late.
      *Mary and he are late.
      Are your friends or us going to pick up John?
      *Are your friends or we going to pick up John?
      Only him and us were late.
      *Only he and we were late.
      Neither John nor her do I think deserve the trip.
      *Neither John nor she do I think deserve the trip.

(68)  Everyone but them gets on John's nerves.
      *Everyone but they gets on John's nerves.
      Students smarter than her get no scholarship.
      *Students smarter than she get no scholarship.
      Go in before him and get the wine first.
      *Go in before he and get the wine first.
      My new cars make me look as young as them when I'm driving them.
      *My new cars make me look as young as they when I'm driving them.

(69)  It is just us who John says are late.
      *It is just we who John says are late.
      Mary has a nice life, but you could never be her now.
      *Mary has a nice life, but you could never be she now.

(70)  Us commuters are often blamed for smog.
      *We commuters are often blamed for smog.

As linguists widely acknowledge, English-speaking children all seem to acquire the prescriptively incorrect usage in (67-70) side by side with (66), before certain of them are exposed to the corrections which reverse or partially reverse the judgments in (67-70), and which constitute what I call prestige subject pronoun usage. In a separate study, I argue that prestige subject pronoun usage in Modern English is not a version of natural language, but rather a mixture of some minor local rules (e.g., "use I after a coordinate conjunction"), avoidance of a range of constructions, overcorrection, formulas (It is I vs. ?Mary couldn't be he in the play), and constant semi-conscious sociologically determined self-correction. 33 For my argument here, it is not necessary to adopt the view that prestige subject pronoun usage is not natural language; it suffices to accept that the usage in (67-70) is significantly easier to learn than is prestige usage. The prestige usage could then be viewed as resulting from relaxing the

33. Prescriptivists themselves generally acknowledge that the incorrect usage of (67)-(70) is inevitably recurrent unless a constant vigilance is maintained. That is, as I claim, normal language learning conditions do not give rise to an internalized prestige subject pronoun usage. Overgeneration, which testifies to the inability of prestige dialect speakers to master "correct" subject pronoun usage, has occurred throughout the Modern English period. Fowler (1965) cites the following examples.
Leave Nell and I to toil and work. (Dickens)
Wagers lost and won between him and I. (Pepys)
All debts are cleared between you and I. (Shakespeare)
I saw a young girl ... whom I guessed to be she whom I had come to meet. (78)
To discover only one solitary person, and he a sentry, on the steps ... (78)
Let us be content - we Liberals, at any rate - to go on ... (669)
Whether ..., is not for we outside mortals to decide. (689)


Morphological Transparency Condition (45) under special language-learning conditions (e.g., vigilant parents). When Morphological Transparency is observed (as I believe is always the case), the nominative case feature SP(V) which universal abstract case-marking assigns to NP's, including conjoined subject NP's (67), subject NP's of interpretable but phonologically empty VP's (68), predicate nominative NP's (69), and subjects with pronominal demonstratives (70), is not realized in English on head N's or on any category within the NP. This means that some other language-particular (hence local) rule must assign nominative case to subject pronouns. This rule can be given in the following form: (71)

English Pronoun Rule: PRON - SP(V) => 1 - 2

Then, in the lexicon, pronouns without case features are listed as me, him, etc., while I, he, she, we, and they are listed as [PRON, SP(V)]. 34 It can now be shown how the unmarked usage in (67-70) follows. The structural description of the local rule (71) does not assign nominative case to the right, explaining (69); in contrast, the universal principle of abstract case-marking (72 in Ch. 1) assigns case in either direction, including to the right from the SP(V) when the verb itself does not assign case. The requirement that each term of a structural description (e.g., SP(V) in (71)) be in the terminal string in order for a rule to apply to it predicts correctly that the "isolated" subject NP's of empty VP's in (68) will exhibit pronouns with unmarked (objective) case. The Adjacency Hypothesis on language-particular rules, as explained in the Introduction, explains why (71) cannot apply in (70). 35 Finally, the principle of subjacency (or alternatively, the coordinate structure constraint) prevents (71) from affecting a conjoined NP subject; the circled cyclic domain nodes in (72) mean that PRON and SP(V) can't be related by a single language-particular rule.

34. It is immaterial to our argument here whether PRON is taken as a feature on SP(N) or N or as a separate expansion of NP. However, examples like (70) are among those adduced by Postal (1969) in support of the hypothesis that pronouns are determiners.

35. According to the analysis of parentheticals and appositive relative clauses in Emonds (1979), which can be extended to appositive NP's, constituents set off by commas following a subject NP are not in immediate post-subject position until a cycle subsequent to the one on which a rule such as (71) applies. Thus, these constituents do not interfere with the adjacency required in the language-particular (71), and we expect the following examples:

It is also the case that adverbs not set off by commas block neither (71) nor the abstract nominative case-marking of the subject, which also typically depends on adjacency (Stowell, 1981, Ch. 3). This problem can be circumvented either by analyzing these adverbs as parentheticals without commas, or by utilizing a post-syntactic stylistic rule to place them in post-subject position.
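The way the restrictions on rule (71) conspire to yield the unmarked usage can be illustrated with a short Python sketch. It is a simplified model under stated assumptions rather than a formalization of the rule: the token format and every name below are hypothetical, the terminal string is flattened to a list, and the `coordinated` flag merely stands in for the effect attributed above to subjacency or the coordinate structure constraint. A pronoun is spelled out as nominative only when SP(V) is the very next terminal and the pronoun is not a conjunct; otherwise the caseless lexical form appears.

```python
# Illustrative sketch only: the token format and all names are hypothetical.

NOMINATIVE = {"1sg": "I", "3sg.m": "he", "3sg.f": "she", "1pl": "we", "3pl": "they"}
UNMARKED = {"1sg": "me", "3sg.m": "him", "3sg.f": "her", "1pl": "us", "3pl": "them"}


def pron(phi, coordinated=False):
    """A pronoun token; `coordinated` marks a conjunct of a coordinate NP."""
    return {"cat": "PRON", "phi": phi, "coordinated": coordinated}


def word(cat, form):
    """Any non-pronominal terminal, e.g. word("SP(V)", "will")."""
    return {"cat": cat, "form": form}


def spell_out(terminals):
    """Spell out pronouns, using nominative forms only left-adjacent to SP(V)."""
    out = []
    for i, tok in enumerate(terminals):
        if tok["cat"] != "PRON":
            out.append(tok["form"])
            continue
        nxt = terminals[i + 1] if i + 1 < len(terminals) else None
        adjacent_to_spv = nxt is not None and nxt["cat"] == "SP(V)"
        if adjacent_to_spv and not tok["coordinated"]:
            out.append(NOMINATIVE[tok["phi"]])
        else:
            out.append(UNMARKED[tok["phi"]])
    return " ".join(out)


if __name__ == "__main__":
    print(spell_out([pron("3pl"), word("SP(V)", "will"), word("V", "see"), pron("3sg.m")]))
    print(spell_out([pron("1pl"), word("N", "commuters"), word("SP(V)", "will"),
                     word("V", "complain")]))
```

In the second call the noun commuters intervenes between the pronoun and SP(V), so the sketch returns the unmarked form, in the spirit of the adjacency account of (70).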

It should be noted that the universal "descent" of case from a phrasal node to a head in accord with Morphological Transparency is never stopped by subjacency (or the coordinate structure constraint); that is, in languages with full-blown morphological case inflection, coordination does not prevent case from being assigned to conjuncts. From this discussion, it can be seen that the loss of a productive case inflection will always give rise to a situation where pronouns can retain case distinctions only by the addition of an extra rule to the grammar. Since such a rule is a language-particular local transformation rather than the universal principle of Morphological Transparency, it necessarily fails to reproduce the pronoun distribution of the pre-case-loss period.

French pronouns which are in NP positions (the "strong forms") show no case distinctions at all. When pronouns are clitic prefixes or suffixes on the verb and not in NP positions (Kayne, 1975, Ch. 2), they differ in form depending on whether they are subjects or objects. But it is not necessary to have morphological case-marking on N to express these differences in form; the nominative case of the subject NP, assigned by abstract case-marking, can be transferred back to the subject clitic position by a local copying rule. (73)

French subject clitic rule: [αPERSON, βNUMBER, γGENDER, SP(V)] - VERB => 1 - 1 + 2

I assume that a rule such as (73) copies only the features mentioned in the rule, and not those that co-occur with it. Thus, an NP such as les tables 'the tables' case-marked as nominative would, by (73), cause the feature complex [SP(V), III PERSON, PLURAL, FEM] to be copied into preverbal position, but the unmentioned features such as NP, -HUMAN, and +COUNT would not be copied.
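The assumption that only the mentioned features are copied can be pictured with a minimal Python sketch. The dictionary encoding of feature complexes and the names used are hypothetical conveniences, not notation from the text; the point is only that the copy is a projection onto the features the rule names.

```python
# Illustrative sketch only: the dictionary encoding and names are hypothetical.

# Features mentioned in the structural description of the subject clitic rule.
MENTIONED_IN_RULE = ("CASE", "PERSON", "NUMBER", "GENDER")


def copy_clitic_features(np_features):
    """Copy only the features the rule mentions; co-occurring ones stay behind."""
    return {name: np_features[name] for name in MENTIONED_IN_RULE if name in np_features}


if __name__ == "__main__":
    # 'les tables', case-marked as nominative by abstract case-marking.
    les_tables = {
        "CATEGORY": "NP",
        "CASE": "SP(V)",
        "PERSON": "III",
        "NUMBER": "PLURAL",
        "GENDER": "FEM",
        "HUMAN": "-",
        "COUNT": "+",
    }
    print(copy_clitic_features(les_tables))
```

The features CATEGORY, HUMAN, and COUNT never reach the preverbal position, matching the example of les tables given above.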


Similarly, a special local rule of French inserts third person, nonreflexive direct object clitics in front of the verb, and is justified on grounds independent of what is being considered here in Emonds (1976, Ch. 6). This rule also need make no appeal to the appearance of morphological case within an NP. (74)

French le-la-les rule: [III PERSON, -REFLEXIVE, αNUMBER, βGENDER, +DEF] => 2 + 1 - Ø

Since Morphological Transparency (45) is exactly what characterizes a language as having inflectional case, the fact that the case-distinctions in the French clitic system (or equivalently, the clitic distinctions that reflect abstract case features on NP's) do not involve lexical heads implies that French is accurately termed a language without case inflection. French is thus like English, in that (45) never licenses morphological case on an N.

Spanish pronouns present a situation which is partly analogous to English and partly to French. While there are no overt subject clitics in Spanish, there are indirect reflections of abstract case among the object clitics. As in French, these never involve (45). Spanish first and second person singular pronouns apparently also differ in case in NP positions: yo 'I'/mí 'me' and tú 'you'/ti 'you' contrast. A difference between Spanish and English, however, is that in Spanish the forms after prepositions are the "marked" ones, while in English, as we have seen, the subject pronouns play this role. Thus, in Spanish, the "subject forms" yo and tú are found in predicate attribute position, in conjoined NP's, as subjects of comparative clauses with understood predicates, as lexical subjects of non-finite verbs in certain constructions, etc. The nonclitic object forms mí and ti are used exclusively in positions where a governing preposition occurs just to their left. 36 (75)

(75)  El hombre con gafas era {yo/*mí}.
      'The man with glasses was me.'
      Eran Juan y {tú/*ti} los que merecían el viaje.
      'It was John and you that deserve the trip.'
      Juan es más inteligente que {tú/*ti}.
      'John is more intelligent than you.'
      Al ver {yo/*mí} los resultados del examen, se puso furioso.
      'On my seeing the results of the exam, he got furious.'

(76)  María criticó a Juan y a {ti/*tú}.
      'Mary criticized John and you.'
      Ana compró esos para tu hermana y para {mí/*yo}.
      'Ana bought those for your sister and for me.'

36. In Spanish, prepositional NP's include all non-subject positions outside of predicate attributes, since human direct objects in NP positions are always preceded by the prepositional marker a, and possessive pronouns have adjectival forms which are not NP's. As H. Contreras and A. Hurtado have pointed out to me, children's pronoun usage in Spanish may go through a stage parallel to English I/me before the usage of standard Spanish is internalized. H. Contreras helped me to construct the appropriate examples in (75)-(77).

In particular, notice that Spanish object pronouns are not allowed in compound objects of prepositions where they are not adjacent to the preposition itself:

(77)  *María criticó a Juan y {ti/tú}.
      'Mary criticized John and you.'
      *Ana compró esos para tu hermana y {mí/yo}.
      'Ana bought those for your sister and me.'

Therefore, the unmarked nonclitic Spanish pronouns are the "subject" forms yo and tú, and a language-particular rule (78), which obeys the Adjacency Hypothesis, introduces mi and ti.

(78)  Spanish Pronoun Rule: P - [+PRON, -III, -PLUR] => 1 - [2, 1]
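The adjacency requirement on the Spanish rule can be sketched in Python in the same spirit. This is a toy model under stated assumptions, not the rule itself: the token format and all names are hypothetical, and only first and second person singular pronouns are covered. The unmarked forms yo and tú are the default, and mí and ti appear only when a preposition is the immediately preceding terminal.

```python
# Illustrative sketch only: the token format and all names are hypothetical.

SUBJECT_FORM = {"1sg": "yo", "2sg": "tú"}        # unmarked nonclitic forms
PREPOSITIONAL_FORM = {"1sg": "mí", "2sg": "ti"}  # forms introduced by the rule


def pron(phi):
    """A first or second person singular nonclitic pronoun."""
    return {"cat": "PRON", "phi": phi}


def word(cat, form):
    """Any other terminal, e.g. word("P", "para")."""
    return {"cat": cat, "form": form}


def spell_out_es(terminals):
    """Use the prepositional form only when P is immediately to the left."""
    out = []
    for i, tok in enumerate(terminals):
        if tok["cat"] != "PRON":
            out.append(tok["form"])
            continue
        prev = terminals[i - 1] if i > 0 else None
        after_p = prev is not None and prev["cat"] == "P"
        forms = PREPOSITIONAL_FORM if after_p else SUBJECT_FORM
        out.append(forms[tok["phi"]])
    return " ".join(out)


if __name__ == "__main__":
    print(spell_out_es([word("P", "para"), word("D", "tu"), word("N", "hermana"),
                        word("CONJ", "y"), word("P", "para"), pron("1sg")]))
    print(spell_out_es([word("V", "era"), pron("1sg")]))
```

The sketch only chooses between forms; it does not model why a conjoined object lacking its own preposition, as in (77), is excluded altogether.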

It can thus be seen that abstract case features do not descend from phrases to lexical heads via Morphological Transparency (45) in Spanish, French, or English. All three languages have vestigial remnants of case-marking preserved by means of language-specific local transformations, but none partakes of the pervasive case-marking on nouns provided for by the general theory of inflectional case.

5.9. Conclusion: the Ephemeral Morphological Component

It can be asked whether assimilating language-particular rules of morphological case-marking to other inflection necessitates modifications in the principles of grammatical theory proposed here to describe inflection. To begin with, no change is needed in the principle of Word Division (13), which attributes word boundaries to each category in a deep structure string, since I claim that the copying of case features is accomplished in the transformational component, in particular, before or at s-structure. This accords with most generative treatments of inflectional case that I am aware of. 37

37. Yim (1984) argues for a theory whereby abstract case is assigned at deep structure. However, it is consistent with Yim's theory to claim that the language-particular realizations of abstract case as inflections, exemplified by rules such as (46), (47), (61), (62), (63), apply in the transformational component or even after s-structure. In fact, since Yim's theory is that noun phrases and clauses receive case at deep structure, it is consistent even with the claim that Morphological Transparency (45) projects case from phrases onto lexical heads only in the transformational component. Moreover, one can claim that abstract case is assigned to phrases at deep structure, and not contradict the principle of Word Division (13).

Second, the transformational insertion of case allows one to retain generalizations about deep left-right order of phrases and non-phrases such as the Head Placement Principle (2) in Ch. 1. For example, the existence of case-marking morphemes to the right of the head noun in the German NP does not violate the claim that only phrases follow heads in deep structure. The rules of morphological case-marking do, however, suggest slight changes in the First and Second Inflection Principles (14) and (16). Both these changes are generalizations of my earlier tentative formulations. As with other inflection, case-marking rules typically do not induce word boundaries between a head N or A and the case ending (First Inflection Principle), and are obligatory (Second Inflection Principle), at least in head-argument configurations. 38 We therefore want to include morphological case-marking in the scope of these principles. This can be accomplished by referring to specifiers and heads uniformly in the two Principles.

38. There are some non-central situations where it appears that case-assignment rules are not obligatory, but most of them remain to be fully analyzed in terms of a formal theory of abstract case-marking. For example, certain aspects of case "attraction" in Classical Greek appear to be optional (Quicoli, 1982, Emonds, 1973); cf. also note 28.

(79)  First Inflection Principle (revised). A transformation attaching some non-phrasal closed category C to a head X° or a specifier SP(X) does not provide word boundaries between the two in derived structure.

Both the type of rule which moves a case feature (e.g., 46) and the type which spells out a case ending (e.g., 47) attach a non-phrasal closed category (the case ending or feature) to some lexical head (N or A) or specifier (SP(N)), and thus fall under (79). Similarly, the contraction of n't with SP(V) in English is now properly predicted to involve the loss of an intervening word boundary.

(80)  Second Inflection Principle (revised). A transformational operation which produces an insertion frame inside X° or SP(X) for a given morpheme applies obligatorily.

By extending the Second Inflection Principle to specifiers, all case-marking rules are included in its effects. A rule spelling out nominative case (SP(V)) on a DET would not fall under the unrevised version (16), but it is included under (80). Similarly, it follows that the English Pronoun Rule (71) is obligatory, as are the rules for the French and Spanish pronouns, (73), (74), and (78), provided we consider pronouns to be either specifiers, as in Postal (1969), or nouns. The Second Inflection Principle simply reflects the fact that a word-internal subcategorization frame for an item which realizes a deep structure category such as SP(X) or P can be satisfied only posttransformationally. These items, which are thus compatible only with transformationally produced, word-internal syntactic contexts, are called inflections. In other words, transformations must apply in order for these items to appear; and the transformations producing inflection are obligatory. The Second Inflection Principle is therefore derivative of Word Division (13) and the First Inflection Principle (79), and has no separate status in grammatical theory. The First Inflection Principle alone formally defines the domain of what is traditionally called inflection, namely, the expression of certain closed grammatical categories as affixes on heads, extended here to affixes on specifiers as well; it accomplishes the "merger" argued for in Pranka (1983). It is of course true that inflection is subject to other principles of syntax: subjacency, the Adjacency Hypothesis and Head Adjacency (21), the s-structure Ordering Principle (42), the Word Order Principles of Ch. 1, the Designation Convention of Ch. 4, the universal definition of Indirect Object (59), and the Invisible Category Principle (52). But these other principles, with the possible exception of the ICP, have wider scope than just inflection, and are properly called principles of syntax. Even the ICP is syntactic in that it specifies when closed categories can be syntactically empty.
Therefore, it means very little to say that inflectional rules exhibit a cluster of properties such as obligatory application, late ordering, adjacency of affected elements, being subject to subjacency, etc., along with their defining property given by the First Inflection Principle. This is analogous to insisting in chemistry that the element oxygen, besides being defined by its nuclear composition and valence, is subject to gravity, the conservation of energy, the laws for electrical conduction, etc. The point about chemical theory is that we can define oxygen with one statement about its nuclear composition and valence, and the other properties follow. Likewise here, I have tried to define a certain property of a sub-class of transformational operations which give rise to syntactic sequences without internal word boundaries, and a principle by which these operations license empty categories. What I hope to have established is that such rules of "inflection" automatically fall under generalizations expressed by other principles of syntax, and that a theory requiring a further elaboration of universal principles specifically for the relation of inflectional morphology to syntax is undesirable. I thus give an answer to the question raised in section 5.1 concerning which component of universal grammar inflectional morphology belongs in. I indicated there that early generative grammar tended to assimilate


inflection to syntax, while structuralist and traditionalist schools tend to make sharp distinctions between the two domains. The separation of inflection from syntax can also be seen in recent generative works, such as Lapointe (1981) and Lieber (1980). The similar treatment of inflection and derivation proposed by these authors, in terms of phonological properties and word-internal orderings, is arguably appropriate for how phonology expresses the categories of morphology, but not, I claim, for explaining what these categories are. To my mind, the categories of inflection are the categories of syntax, except that they are transformationally displaced. Whether the phonological expression of morphology is determined by an independent component, by more general lexical principles, by phonological principles, or even by partly syntactic ones is not my concern here. 39 My purpose has rather been to show how a general syntactic theory of deep structure categories and relations and of possible transformational operations can explain the possibilities and limits for inflection. Successful generalizations in syntax, of which a sample follows, require, whether they are basic or derived, that the categories of inflection be syntactically represented other than in their surface positions; otherwise, inflections constantly counter-exemplify these generalizations, as shown in (81). (81)

(a) Adjectives can be preceded by SP(A). Cf. *John is very taller than Sam. (b) An English count noun requires a preceding SP(N). Cf. I saw trees. (c) A member of SP(V) ( = INFL) precedes VP in all English S. Cf. John (*may) likes fish. (d) The only sisters which follow a head X° in English are phrases (Head Placement Principle 2 of Ch. 1). Cf. John likes fish. (e) The universal definition of indirect object implies the structural presence of P. Cf. Hans gab dem Madchen einen Hund. 'John gave the girl a dog.' ( f ) Deep structure categories outside an X° are separated by word boundaries. Cf. John like Is fish. (g) Simple adjectives precede head nouns in English. Cf. Someone clever came in. (h) Negative adverbs are VP-initial in French and English. Cf. Jean n'aime pas le poisson. 'John doesn't like fish.'

39. The research program of Walinska deHackbeil (1983, 1984) is based on arguments that principles of syntax, such as the bar notation and θ-role assignment, are of central importance in word-internal morphology.

(i) Every English sentence has a verb after the subject. Cf. Is John here?
(j) English questions and tags are formed by respectively inverting and copying the SP(V) (= INFL). Cf. John likes fish. Does John like fish? John likes fish, doesn't he?

Examples like these, which can be easily multiplied, demonstrate to what extent the categories of inflectional morphology fit exactly into gaps in otherwise regular deep structure paradigms; neither what these syntactic paradigms are, nor what the limits on inflection are, can be understood without postulating a direct transformational relation between deep structures and inflected forms. That is, syntactic theory, which includes a theory of categories, of deep structure relations, and of possible transformational operations (for inflection, transformational constraints such as the Adjacency Hypothesis and Head Adjacency are particularly important), is both necessary and sufficient for an understanding of what categories can surface as bound inflections, and of what categories can be their hosts. If all non-definitional properties in inflectional morphology can be predicted on the basis of independently motivated principles of a universal syntax, there is no point in speaking about a separate component for inflection. A rule of inflectional morphology is simply a syntactic rule which happens to fall into the domain covered by the First Inflection Principle (79). While such rules are also constrained by principles of wider scope, this does not justify considering them to form a "component".40 If this constituted a component, we could as well speak of a "roundedness component", which consists of a single statement that only non-low back vowels are rounded in the unmarked case. This use of "component", meaning essentially "rule" or "principle", does violence to the intuitive content of the term; the term should be reserved for a set of principles which have co-extensive domains.
Thus, I maintain that inflection is not a component, but, more properly speaking, is the interface between phonology and the transformational subcomponent of syntax.

40. I am not speaking here of derivational morphology, or in fact even of the rules such as (47) which spell out as morpheme sequences the single words induced by the First Inflection Principle. All these processes taken together, the spelling out of inflections within words and aspects of derivational morphology, might form a separate component. But they might also simply be subject to one further principle, plus principles of the lexicon, and thus still fail to qualify as a component, properly speaking. Walinska deHackbeil, in work in preparation, argues for this kind of approach; word-internal processes are subject to general principles of the lexicon, of syntax (the bar notation), and of other sub-theories of grammar (e.g., the assignment of θ-roles), but are not the domain of a set of principles in a separate component.

Chapter 6

Subordination and the category P

In Ch. 1, a general theory of base or deep structures, which includes principles of θ-role assignment and case assignment, has been elaborated. It is argued there that the category PP plays an important role as the source of indirect θ-role assignment. However, aside from a cursory listing of types of complements to P and a somewhat more thorough discussion of absolute phrases (headed by P) in section 1.5, relatively little attention has been given in preceding chapters to the details of the internal structure of PP. Since P is not an open category, much of the discussion of Ch. 4 establishing the existence of significant closed subsets of the lexical categories is not applicable to P. Similarly, as noted in Chs. 1 and 5, P has an impoverished specifier system and does not give rise to a rich inflectional system, so the treatment of these topics in Chs. 4 and 5 has not included much about PP. It is the purpose of this and the following chapter to focus on PP and justify in some detail the general claim of Ch. 1 that P can take any type of phrasal complement (Ch. 1, rule (22)) and that PP can appear as a sister to any bar notation head (Ch. 1, rule (20)). In this chapter, the first of these claims will receive our attention. According to the principle of direct θ-role assignment (Ch. 1, rule (23)), each P can assign a θ-role to at most one complement. Thus, if we abstract away from the absolute constructions already discussed in that chapter, where the head of one of the complements of P assigns a θ-role to the other, the structures we are to investigate in this chapter reduce to those generated by the following special case of rule (22) of Ch. 1. (1)

P′ → P, X^j, j ≥ 2.

As argued in section 1.8, the P in a tree generated by (1) is not always realized as a word, particularly when the complement is NP and P can be realized as a case feature on this NP. When this P is empty, its NP object "constitutes" a sister to a higher lexical head and receives a θ-role indirectly from that head (section 1.6) as its indirect object (Ch. 1, rule (85)). Whenever X^j = NP, NP can receive dative case-marking from P, whether or not P is lexical. I proposed in sections 1.8 and 5.7 that the "prepositionless-datives" of German, Greek, Latin, and Sanskrit are all case-marked by an empty adjacent c-commanding P. One supporting argument for this in section 5.7 was that while an empty P in Latin yields a morphological dative and a lexical P a morphological ablative, these two cases are clearly closely related, since they are always identical in the plural. A partly similar and partly different bifurcation in the morphological realizations of abstract dative (i.e., P) case occurs in Classical Arabic. This language has no separate morphological dative; an empty P (for indirect objects and embedded subjects in causatives) yields a morphological accusative, while a lexical P yields a morphological genitive.

It can thus be seen that the theory presented here ascribes a more pervasive presence to the structure PP generated by (1) than do traditional treatments, in which every PP consists precisely of a lexical P and a lexical NP. Similarly, when X takes on values other than N in (1), a wide range of structures which traditional grammar classifies under one or another type of "subordination" fall into place as instances of PP.

6.1. The Range of PP Structures

In order to fully justify (1), structures must be found which exemplify all values of X^j in (1): X^j = NP, S, AP, PP, VP and, since phrases other than heads in base composition rules are optional, X^j = ∅. Moreover, it must be determined whether (1) provides a rich enough system of complements for P as well. A number of interesting generative studies on internal PP structure have appeared, in particular, Jackendoff (1973), van Riemsdijk (1978, esp. Ch. 3), and Hendrick (1978). The first two of these argue for a system of complements to P richer than that in (1), while Hendrick attempts to reduce all base complementation of P to NP's. My position is closest to that of van Riemsdijk, as will be seen below, since the only clear case of multiple complements to P which he accepts is that of absolute constructions, a position I share. I take the existence of NP complements to P as uncontroversial. Moreover, there is wide acceptance of the proposal in Emonds (1970, 1976), first argued to my knowledge in class lectures of E. Klima, that finite and infinitival clauses introduced by subordinating conjunctions with intrinsic semantic content, as in (2), instantiate the PP structure in (3).1 (2)

John will burn those papers, {although/in case/(only) if/since/while/before/because/now that} it is raining.
Everyone knows you have worked hard {in order (for them)/so as} to get through.
You should pay his bills, lest he get arrested.

1. Arguments for the PP status of such complements and for the P status of the subordinate conjunction are also given in Geis (1970), but there S is taken to be a transformationally reduced NP.

(3) [VP [V pay] [NP his bills] [PP [P lest] [S he get arrested]]]

The justifications for (3), beyond the methodological desire to reduce the inventory of categories, are as follows:

(i) A subordinate conjunction of time or place (while, before, where, etc.), like other P of time and place, can be intensified by right, whereas no other category in Standard American may be.

(ii) The PP in (3) can be fronted with comma intonation and, under a range of lexical choices (before, while, because, only if, until, etc.), can serve as the focus of a cleft sentence. These are typical PP properties.2

(iii) Among verbs, many appear in both — NP and — (that) S contexts (e.g., believe, tell, find out, deny, insist (on), etc.). One does not conclude from the existence of other V such as wonder and describe, which are — NP and — S respectively, that the V which occur before NP are of different category from those that appear before S. In parallel fashion, many of the individual morphemes exemplified in (2) which are P, +___S also appear before NP: before, after, until, since, because/because of, in case/in case of, in that/in, like, as. (This holds for as independently of the comparative construction; as = since or while is +___S, and the "non-comparative as" to be introduced in section 6.3 is +___NP.) As with verbs, the fact that many P occur in both contexts is evidence not for setting up two categories, but for allowing a single category to appear in multiple contexts.

(iv) The VP-final position of P-S in (3) is also a base position for P-NP, and in fact it is typically here that the P of (iii) appear; thus, we say that before is a P which appears outside X̄, regardless of whether P has an NP or an S complement.

In conclusion, the existence of a structure as in (3) seems secure.

2. I return to the limitation on which PP's may be a focus in cleft sentences in section 7.2; cf. Emonds (1976, Ch. 4).

One might consider whether the S in (3) should be replaced by S̄, where S̄ = COMP + S, and COMP = that, for, and if. This could be suggested by compound subordinate conjunctions such as now that, in that, so that, in order that, in order for, and as if. In Ch. 7, it will be argued that these "COMP morphemes" are in fact the "grammaticalized" P in the context — S, entirely analogous to the grammaticalized P in — NP such as of, to, for, etc. Such grammatical formatives often occur as the second element of an atrophied lexical P with semantic content such as out of, because of, in case of, into, onto, and as for. In such cases there is no syntactic behavior justifying anything beyond a completely amalgamated lexical unit (cf. Hendrick, 1978). The same can be said of now that, so that, as if, etc. For example, even though PP and S̄ can ordinarily appear in the following gapping pattern (so-called "right node raising"), there is no corresponding PP or S̄ to break up the compound subordinate conjunctions and prepositions we are considering here. (4)

(a) John should argue because of, and Mary should argue in spite of, the chairman's opinion. I put flowers on, and he put flowers next to, the table.
(b) *John should argue because, and Mary should argue in spite, of the chairman's opinion. *Mary put some books in, and Sam put some books on, to the table.
(c) John should vote in that he's a member, and you shouldn't criticize him now that he's a member. *John should vote in, and you shouldn't criticize him now, that he's a member.3

3. The option available in (4a) for P-NP is often not allowed for P-S: *You should call in case, and I'll call even though, the train is on time. But these P-S combinations are then like certain COMP-S combinations, providing further support for their identification in Ch. 7. *He asked me whether, and John assured me that, the train was on time. ?I've been wondering whether, but wouldn't positively want to state that, your theory is correct. The acceptable status accorded this last example in Bresnan (1974) seems doubtful to me.

It is of minor interest that the proportion I claim to exist, namely, atrophied P + of : NP = atrophied P + that : S, also holds in French, with a slightly different twist. While these atrophied compound lexical P are fairly rare in English, French counterparts containing de "of" and que "that" are both more common: près de "near", autour de "around", à côté de "beside", le long de "along", lorsque "when", parce que "because", depuis que "since", pendant que "while", etc. By unifying the traditional categories of preposition and subordinating conjunction, we can say that this difference in the lexicons of English and French is due to a single tendency (within the category P), and not to more than one.

A final argument against taking compound lexical P as support for P-S̄ rather than P-S is that every other finite dependent clause context in English or French where S̄ has been argued to occur allows some COMP type morpheme (that, if, whether, as, than) to occur, at least optionally. But except for the compound P such as now that, in that, and as if, no COMP morpheme is ever permitted in P ___ S.4 I conclude that the S in (3) is correct, and should not be replaced with S̄.

The AP and PP complements to P are exemplified in section 1.5 in (29) and (30); He suddenly changed from sad to radiantly happy and John is from near St. Louis are two typical sentences of the larger inventory given there.

Turning next to the expansion of (1) when X^j = V², it can be observed that many of the subordinating conjunctions of time also may appear with a clausal complement whose subject is understood and whose predicate begins with V-ing: We were very happy, before/while/after/*until/*although/*because (*her) visiting you. These may be instances of the base structure P — V², just as in Chapter 2 the necessarily subjectless V-ing complements of verbs of temporal aspect (begin, etc.) have been analyzed as base V² complements. These complements to subordinating conjunctions of time are analyzed as V² because they exhibit no NP behavior; they cannot be coordinated with NP's or focused in cleft sentences: (5)

It's this kind of movie that I'm always so sad after. *It's discussing his past that I'm always so sad after. Before finishing the book and painting the house, I hope to visit you. After the conference on third language learning and my vacation, I hope to avoid him. *After the language conference and finishing my book, I hope to visit you.

4. By the same argument, one might claim that it is unjustified to postulate a COMP in main clauses, counter to Bresnan (1970). In section 7.7.3, I argue against main clause COMP on independent grounds. It might be thought that S̄ is needed to stop extraction from adverbial subordinate clauses in complements to conjunctions like before and because. However, I will argue in section 7.8 that the PP that immediately dominates S must be itself considered a bounding node for subjacency, and thus suffices to prohibit extraction.

Further, it hardly seems an accident that the base V² structures after verbs and after prepositions are in both cases complements to heads that lexically express time (independent of inflection). It can be supposed that it is semantically "natural" that such heads may take V² rather than V³ complements in the base, which are furthermore not NP's.

The position just set forth, that P's can have V² complements, is opposed in Hendrick (1978, section III). He offers three main considerations in support of his view: (i) he suggests that there is no explanation for why extraction is not permitted out of a P — V² structure; (ii) he argues that, analogously to P-NP, P should appear as PP sisters to V, while in fact such PP are always sisters to V¹; and (iii) he gives arguments in favor of identifying subordinating conjunctions with COMP. In fact, his proposal (iii) has been the inspiration for my hypothesis in Ch. 7; that is, I accept both his (iii) and the arguments I have given that such conjunctions are P. It then follows that my position is that the COMP category should be identified with P, which is precisely what I argue there. Among other things, this removes the force of Hendrick's (ii) above and, as will be seen, of his (i) as well.5

We have now seen that all types of X^j complements to P exist, where j ≥ 2; the next topic is the existence of P without complements, the last of the types predicted by (1).

6.2. Intransitive Prepositions

The material of this section is the first part of Emonds (1972). Though it has been reprinted and also summarized by other authors in a number of places, it belongs in a full treatment of the category P, and so is included with minimal modifications. In The Verb-Particle Combination in English, Fraser (1976) investigates in detail a class of post-verbal particles which may precede or follow nonpronominal direct objects. We can distinguish three main uses for these particles: (6)

(a) As directional adverbs: John carried the trunk up. Mary threw a box out. John carried up the trunk. Mary threw out a box. The teacher handed every paper back to the students. The teacher handed back every paper to the students.

5. Hendrick and van Riemsdijk present convincing arguments against Jackendoff's base expansions of PP in which P has more than one sister complement. In particular, Hendrick makes a good case for treating from-NP-to-NP structures as syntactic coordination; he notes that it begs the question to observe that a coordinate conjunction is not tolerated in this sequence. That is, there is no reason to assume that coordinate conjunctions are an operational test for coordinate structure. Hendrick has argued elsewhere (1978) that many comparatives are syntactically coordinate, and Emonds (1979) elaborates on the conjecture of Ross (1967) that non-restrictive relative clauses result from coordination in the base. Van Riemsdijk (1978, 62-86) argues that there is only one clear instance of a P having a double phrase as a complement, namely, the "absolute construction": With your chauffeur driving my car, we'll have time to chat. With Arthur sick, no preparations can be made. As developed at length in section 1.5, the principle of direct θ-role assignment allows two non-PP complements after a P only if one, an NP, is the subject of the other. Hence, this principle explains van Riemsdijk's conclusion. Cf. also the study of Ruwet (1978) on a very similar construction in French. A different view of absolute constructions, according to which the introductory with takes a single S complement in deep structure, is developed in Ishihara (1982).

They should gather these books together. They should gather together these books.
(b) In idiomatic verb-particle combinations: John will turn that job down. His offer really took John in. John will turn down that job. His offer really took in John. You shouldn't put such tasks off. You shouldn't put off such tasks. He has taken the government over. He has taken over the government.
(c) In "completive" verb-particle combinations: John fixed a drink up. John fixed up a drink. We painted the house up. We painted up the house. Cut the meat up. Cut up the meat.

Fraser demonstrates that noun phrases which follow particles are not their direct objects. That is, up the trunk is not a constituent in carry up the trunk. Rather, this verb phrase exhibits the structure of (7):

(7) [VP [V carry] [PRT up] [NP the trunk]]

I will argue that these post-verbal particles should not be assigned to a category PRT distinct from the category preposition (P). In other words, I will argue that these particles should be regarded as intransitive prepositions. We note first that post-verbal particles are almost all also transitive prepositions, and that when such a word is used as a directional adverb, it has the same intrinsic meaning whether or not it has an object. On the other hand, a non-productive syntactic category like modal (M) has no members in common with the other non-productive syntactic categories (particles, prepositions, determiners, or the "degree" words which modify adjectives such as too, enough, very, rather, quite, as, more, most, so, how, this, that, etc.). Similarly, there are no determiners or degree words which are also particles or prepositions. The fact that some degree words are determiners is in fact usually taken as evidence for a syntactic relation between the two categories, a suspicion strengthened by the existence of other properties common to the categories modified by determiners and degree words (nouns and adjectives, respectively). For example, adjectives and nouns are declined in languages with declensions; only nouns and adjectives are modified by wh-words; neither category has surface direct objects. Thus, it appears that in general non-related non-productive syntactic categories do not have members in common. In order to extend this generalization, we must formally relate the category of post-verbal particles to the category "preposition".

Basically, considering P and PRT as distinct categories can lead to the same problems that early transformationalists faced when they proposed two verb classes, V_t (transitive verb) and V_i (intransitive verb). These two classes shared many members whose only semantic differences seemed attributable to the presence or absence of an object and whose distribution was otherwise identical; furthermore, postulating two such distinct verb classes meant that phrase structure rules such as VP → V_i — NP and VP → V_t were formally possible. The solution to these difficulties is proposed in Chomsky (1965); he assumes that there is a single verbal category whose members are "sub-categorized" to take or not take objects. That is, a lexical subcategorization feature specifies for each verb whether it can or must be followed by a direct object. In this way, it becomes formally impossible to have verb phrases whose transitive verbs have no objects in deep structure or whose intransitive verbs have objects. Furthermore, the rules which determine the distribution of verbs need only to mention the category V, and the intrinsic meanings of verbs which optionally appear with or without objects need be given only once for each verb.

If we analyze post-verbal particles as prepositions which are subcategorized not to take objects, we explain in the same way why many prepositions are also particles, why the prepositions which are particles have essentially the same meanings in both usages, and why prepositions and particles have similar distribution. (Of course, there are differences in distribution, but we will see as we proceed that these differences are less fundamental than might seem apparent.) Moreover, there is no formal possibility of particles with objects as distinct from prepositions with objects. And lastly, we can retain the Head Placement Principle of Ch. 1: all post-head categories are phrasal in deep structure. The structure (7) excluded here would violate this principle. Thus, I propose that morphemes such as with, in, apart all be generated as prepositions by the base expansion rule PP → P, NP, and that these prepositions are associated in the lexicon with subcategorization features as in (8): (8)

with, +P, + ___ NP (similarly for at, for, toward, etc.)
in, +P, + ___ (NP) (similarly for out, down, around, etc.)
apart, +P, + ___ (similarly for away, back, together, etc.)
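The lexical entries in (8) can be pictured as a small table that a grammar consults before inserting a P. The following is only a sketch under stated assumptions: the frame strings "NP", "(NP)" and "" and the function name licenses are hypothetical encodings chosen for illustration, and only the three entries given in (8) are listed.

```python
# Hypothetical encoding of the subcategorization features in (8).
# "NP"   = object obligatory   (with: + ___ NP)
# "(NP)" = object optional     (in:   + ___ (NP))
# ""     = no object possible  (apart: + ___)
P_FRAMES = {
    "with": "NP",
    "in": "(NP)",
    "apart": "",
}


def licenses(p, has_object):
    """Return True if preposition `p` may occur with (or without) an NP object."""
    frame = P_FRAMES[p]
    if has_object:
        # An object is possible only if the frame mentions NP at all.
        return frame in ("NP", "(NP)")
    # No object: possible only if the frame does not require one.
    return frame in ("(NP)", "")


if __name__ == "__main__":
    for p in P_FRAMES:
        print(p, "transitive:", licenses(p, True), "intransitive:", licenses(p, False))
```

On this encoding a "particle" is simply a P used with has_object set to False, so no second category PRT is needed to state which words may stand alone.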

This proposal, for which I will continue to give independent justification in what follows, implies that the deep structure position of particles should be the same as that of other prepositional phrases: they should follow the direct object. (That is, an advantage of analyzing particles as prepositions is that the symbol PRT is eliminated from the base, because particles are generated in the position of PP.) This means in turn that the pre-direct object position of particles is due to a local particle movement transformation (9):6 (9)

V - NP - [PP P] ⇒ 1 - 3 - 2
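As a rough illustration of how a local rule like (9) operates, the sketch below applies it to a flat list of constituents. Everything about the representation is an assumption made for the example: a verb phrase is a list of (category, words) pairs, a PP that exhaustively dominates P is approximated as a PP containing exactly one word, and the names particle_movement and spell_out are hypothetical.

```python
def particle_movement(vp):
    """Apply rule (9), V - NP - [PP P] => 1 - 3 - 2, to a flat verb phrase.

    `vp` is a list of (category, words) pairs, where `words` is a list of
    strings. The rule applies only when a V is immediately followed by an NP
    and then by a PP consisting of a single word (a bare intransitive P).
    A new list is returned; if the rule cannot apply, it is an unchanged copy.
    """
    for i in range(len(vp) - 2):
        (c1, _), (c2, _), (c3, w3) = vp[i], vp[i + 1], vp[i + 2]
        if c1 == "V" and c2 == "NP" and c3 == "PP" and len(w3) == 1:
            # Swap terms 2 and 3, leaving everything else in place.
            return vp[: i + 1] + [vp[i + 2], vp[i + 1]] + vp[i + 3 :]
    return list(vp)


def spell_out(vp):
    """Flatten a verb phrase into a string of words."""
    return " ".join(word for _, words in vp for word in words)


if __name__ == "__main__":
    base = [("V", ["carried"]), ("NP", ["the", "trunk"]), ("PP", ["up"])]
    print(spell_out(base))
    print(spell_out(particle_movement(base)))
```

Because the third term must be a bare P, a PP with an object (up the stairs) is left in place by this sketch. The rule is optional in the text; the function shows only the output that results when it does apply.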

One might ask if the idiosyncratic verb-particle combinations such as those in (6b) should not be contiguous in deep structure; such would have to be the case if lexical entries, viewed as insertion transformations, are constrained to insert only continuous sequences into trees generated by the base composition rules. There are two answers to this question consistent with the decision that particles appear next to the verb in surface structure only by virtue of the transformation (9). The first answer is that there does not seem to be any convincing argument that lexical entries must be sequences of contiguous elements. It may be that lexical entries (insertion transformations) may consist of verbs and other sister constituents which are not necessarily contiguous. If so, we expect to find idioms of the form V — PP where direct objects obligatorily intervene between the two parts of the idiom. Such idioms do exist: (10)

John took his student to task. *John took to task his student. His proposal will bring the crisis to a head. *His proposal will bring to a head the crisis. John wants to put that car to the test. *John wants to put to the test that car. The shopkeepers took some students for a ride. *The shopkeepers took for a ride some students.

On the other hand, if one insists that idioms such as those in (10) should be continuous sequences of morphemes, it is clear that in order to account for them there must be an obligatory transformation which applies to idioms of the form V — PP that moves the PP part of the idiom into the position after the direct object, as in (11):

6. Rule (9) is assimilated to a more general local process of English that I call "NP-α Inversion" in Emonds (1980a). Among other things, I show there that the V and the PP of (9) need not be mentioned in the rule. Wasow (1975) discusses alternative ways to insure that the NP in (9) not be a pronoun (*Mary threw out them) results from a surface filter and is not part of the movement rule.

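(11) [Tree diagram: a VP in which V take is immediately followed by the PP of the idiom and then by the NP the government; an arrow indicates the movement of the PP to the position after the NP.]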

(Whether the PP in (11) is dominated by V or not is of no importance in this discussion.) Given a deep structure configuration for V — PP idioms as in (11) and an obligatory rule which separates them, as indicated by the arrow in (11), it is clear that the same analysis and rule will produce (only) the discontinuous idiomatic verb-particle combinations in (6b). It then follows that the particle movement rule (9) should apply to idiomatic particles just as it does to directional adverb particles to produce the examples in (6b) in which particles precede the direct object. I conclude that the question of whether or not V — PP idioms are contiguous in deep structure is independent of the hypothesis that particles come to precede direct objects in surface structure only by virtue of the rule (9).

Continuing with the justification for considering the post-verbal particles as prepositions, we next recall Fraser's (1968) point that certain verbs may or must have adverb complements of direction. For the sake of an example, consider the transitive verb put, the verb sneak, and the intransitive verbs glance and dart. These verbs must be followed by a directional adverb, but the directional adverb may be either a prepositional phrase or a post-verbal particle. By analyzing particles as intransitive prepositions, we can account for this by simply assigning these verbs the subcategorization feature + ___ PP, where the head of PP has the feature DIR (directional). How this condition is incorporated into the subcategorization mechanism is of no importance; it is simply a fact that PP's whose heads are non-directional prepositions like without, because, until, since, despite, for, etc. do not fulfill the requirement of an obligatory directional complement after these verbs.7

7. In section 1.8, I suggested that we might identify the subcategory DIR of P with V, the syntactic category of accusative case; that is, into but not without takes an NP which has abstract accusative case. Thus: into, P, + [NP, V]. A directional P is then one which has a certain contextual feature. This could account for the morphological accusative objects of such P in German, Greek, Latin, etc.

(12)

*John put some toys. John put some toys in the garage, downstairs, away, down, together, back, out, etc.
*John was sneaking (the food). John was sneaking (the food) into the theatre, out of the store, in, back, out, away, over, etc.
*The children darted. The children darted outside, toward the door, in, back, off, apart, away, etc.
*Why did you glance? Why did you glance at Mary, behind you, down, in, away, up, back, etc.
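The pattern in (12) can be restated as a check on a single feature of the head of the following PP. The sketch below is illustrative only: the sets DIRECTIONAL and NEEDS_DIR_PP contain a handful of items taken from the examples and surrounding text, not a full lexicon, and the function name complement_ok is hypothetical.

```python
# A few P that the examples in (12) treat as directional. The set mixes
# transitive and intransitive P on purpose: the verb's frame mentions only
# the category P and the feature DIR, never a separate category PRT.
DIRECTIONAL = {"in", "into", "away", "down", "back", "out", "toward"}

# Verbs which, according to the text, require a directional complement.
NEEDS_DIR_PP = {"put", "sneak", "dart", "glance"}


def complement_ok(verb, pp_head):
    """Check the frame + ___ PP[DIR] for `verb`.

    `pp_head` is the head P of the PP that follows the verb (and its direct
    object, if any), or None when no PP is present.
    """
    if verb in NEEDS_DIR_PP:
        return pp_head in DIRECTIONAL
    # Other verbs are not constrained by this sketch.
    return True


if __name__ == "__main__":
    print(complement_ok("put", None))       # *John put some toys.
    print(complement_ok("put", "away"))     # John put some toys away.
    print(complement_ok("put", "in"))       # John put some toys in the garage.
    print(complement_ok("dart", "without"))
```

The same check accepts away (no object) and in the garage (with object), which is the point of treating both as PP's headed by a directional P.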

If post-verbal particles were not prepositions, we would have to subcategorize such verbs as + ___ {PP, PRT}, where the head of the PP has the feature "directional". Similarly, for the very large number of verbs which take optional directional complements (move, shove, etc.), if we did not analyze particles as prepositions, we would have to always use the subcategorization feature + ___ ({PP, PRT}). Further, notice that no verbs of motion take directional adverb complements which must or must not be particles as opposed to prepositional phrases, as we might expect if these were distinct grammatical categories. It is of course true that the directional adverb PP complements which follow certain verbs are subject to certain (probably semantic) restrictions, but these restrictions cut across (are independent of) the distinction between particles and transitive prepositions, and so are not relevant to the discussion here: (13)

John sewed the material together, into one piece, on, onto the coat, back, up, etc. *John sewed the material apart, into shreds, off, out, out of the blanket, away, etc.

I conclude then that the subcategorization of every verb which takes a directional adverb complement will be simplified if we analyze post-verbal particles as intransitive prepositions. In much American speech, the emphatic word right modifies only prepositions of space and time, but no other syntactic categories such as adjectives, adverbs, modals, etc. (This observation due to E. Klima.) (14)

(a) *Bill visits Europe right often, frequently, etc. *Fights happened right seldom in that town. *Those girls were right attractive. *A proposal of that sort seems right unjust, wise, etc. *He ironed his shirt right wet. *Some right ignorant students asked those questions.
(b) Make yourself right at home. We went right along that road. Bill put the spices right on the meat. He lives right up the street. Some people can't work right before dinner.

It is also true that right modifies subordinating conjunctions of space and time, which actually is only one among several indications that subordinating conjunctions are prepositions in deep structure, summarized above in Section 6.1. Other arguments to this effect are given in Emonds (1970) and Geis (1970). The word right also modifies post-verbal particles, which is a further similarity of this class of words to prepositions:

(c) John came right in. He put the toys right back. Go right on to the stoplight. They looked it right up and left. John brought the bottles right down.

By analyzing the post-verbal particles as prepositions, we can simply state that right modifies only prepositions. Another advantage of my analysis is that the particle movement rule (9) accounts for the fact that particles modified by right do not precede the direct object, without the addition of any special conditions on the particle movement rule. Since right is plausibly a SP(P), as proposed in van Riemsdijk (1978), it is exempt from Head Adjacency (Ch. 5), and so interferes with the adjacency required between the terms of the local rule (9). On the other hand, if particles did not originate to the right of the direct object, we would have to add a condition on the rule moving them to that position: such a rule would ordinarily be optional, but it would have to be obligatory if the particles were modified by right, as shown by the examples of (14d):

(d) *He put right back the toys. *They looked right up the number. *John brought right down the bottles.
There are some idiomatic uses of particles which right can modify and some which it cannot, but the same is true of idiomatic prepositional phrases, so this irregular distribution is not an argument against the prepositional status of particles:

(15) He put his life savings right on the line. He put his coat right on. They put the enemy right to flight. *John put his vacation right off until Christmas. ?Bill should take those remarks right to heart. ?Bill should take the answer right down when it is given. *Bill should take his friends right to task. *Bill should take his friends right up on their offer. *The committee took the suggestion right into account. *The storekeepers took the students right in. (idiomatic sense)

Given the validity of using right as a diagnostic for the syntactic category preposition, we can see that post-verbal particles (i.e., directional intransitive prepositions) are not the only intransitive prepositions in English, as is shown by (16):

(16) John finished the task right before. John lives right outside. I heard something right overhead. They took the job right afterwards. You should do this right away.
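The diagnostic just described can be stated as a simple lookup. The following is a minimal sketch in Python, assuming a toy lexicon whose category labels are assigned by hand from the examples in (14) and (16); the names `TOY_LEXICON` and `right_can_modify` are hypothetical and serve only this illustration. The sketch deliberately ignores the idiomatic irregularities shown in (15).

```python
# Toy illustration of the "right" diagnostic: right modifies only category P.
# The category labels below are hand-assigned assumptions, not a real lexicon.
TOY_LEXICON = {
    "often": "ADV",       # *visits Europe right often
    "seldom": "ADV",      # *happened right seldom
    "attractive": "A",    # *were right attractive
    "unjust": "A",        # *seems right unjust
    "on": "P",            # right on the meat (transitive P)
    "along": "P",         # right along that road (transitive P)
    "in": "P",            # came right in (post-verbal particle)
    "back": "P",          # put the toys right back (post-verbal particle)
    "overhead": "P",      # heard something right overhead (intransitive P)
    "afterwards": "P",    # took the job right afterwards (intransitive P)
}


def right_can_modify(word: str) -> bool:
    """Return True only if the toy lexicon lists the word as category P."""
    return TOY_LEXICON.get(word) == "P"
```

On this encoding, treating particles as a separate category PRT would require a second clause in the test, which is the redundancy the text argues against.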

There is an expletive construction in English which consists of a directional adverb plus a prepositional phrase introduced by with, as exemplified in (17):

(17) Into the dungeon with that traitor! To the river with those sandbags! Out the door with it! To hell with this assignment!

If we consider the post-verbal particles as intransitive prepositions, the structure of this expletive is always of the form [P, +DIR] - (NP) - with - NP. However, if we consider particles to be distinct from the category P, another sequence of the form PRT - with - NP is a possible example of this construction:

(18) Off with his head! Down with the leadership! Out with what you know! Away with them!

This means that, whether such expletive constructions are to be accounted for by base rules or by deletion transformations, considering particles as instances of directional prepositions will simplify the analysis.


There is a preposing rule for directional adverbs in sentences whose verbs are in the simple past or present. If the subject of such a sentence is not a pronoun, it may invert with the verb when such preposing occurs. (Another difference between this preposing rule and other adverb preposing rules is that in this case there is no comma [breath pause] after the preposed adverb). (19)

Into the house he ran! Down the street rolled the carriage! Out the window jumped the cat! Into the sink they go!

This directional adverb preposing construction appears not only with transitive prepositional phrases, but also with post-verbal particles: (20)

In he ran! Down rolled the carriage! Out jumped the cat! Up she climbed!

By analyzing particles as intransitive prepositions we can simplify the description of this preposing construction by allowing it only with constituents of the form [PP [P, +DIR] - X]. Otherwise, we would have to state that it is allowed with two categories, directional prepositions and particles. It should be remarked that this construction is not allowed with particles or transitive prepositions which are part of V - PP idioms.

(21) Bill ran into trouble. *Into trouble ran Bill. John jumped at the chance. *At the chance jumped John. The gun went off. *Off went the gun. The meeting came off at six. *Off came the meeting. Bill went into detail. *Into detail went Bill. Bill came to a conclusion. *To a conclusion came Bill. The battery ran down. *Down ran the battery. The kitten rolled up. (into a ball) *Up rolled the kitten.
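The conditions on directional preposing in (19)-(21) can be summarized in a few lines. What follows is a minimal sketch in Python, assuming a hand-built record for each phrase; the class `PP`, its fields, and both function names are hypothetical labels chosen for this illustration. The further requirement that the verb be in the simple past or present is left out of the sketch.

```python
from dataclasses import dataclass


@dataclass
class PP:
    head: str                # the preposition or particle heading the phrase
    directional: bool        # assumed +DIR feature on the head P
    idiomatic: bool = False  # part of a V - PP idiom, as in (21)


def allows_directional_preposing(pp: PP) -> bool:
    """Preposing applies to [PP [P, +DIR] - X] that is not part of an idiom."""
    return pp.directional and not pp.idiomatic


def may_invert_subject_and_verb(pp: PP, subject_is_pronoun: bool) -> bool:
    """Inversion is an option only under preposing with a non-pronoun subject."""
    return allows_directional_preposing(pp) and not subject_is_pronoun
```

Because transitive phrases such as into the house and particles such as in are both recorded as `PP`, a single condition covers (19) and (20); no separate statement for particles is needed.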

A final consideration in favor of identifying particles and directional P is that the analogous categories in Dutch both undergo a "P-shift" rule (van Riemsdijk, 1978, section 3.4). Van Riemsdijk (personal communication) points out that in Standard Flemish, the feature +DIR is absent in the description of P-shift. In conclusion, the advantages and simplifications in the base rules and principles, in subcategorization, in specifying the distribution of right, in analyzing the expletive construction mentioned above, and in describing the directional adverb constructions justify the claim that post-verbal particles are prepositions. We have also seen that idiomatic post-verbal particles and other idiomatic prepositional phrases do not differ in syntactic behavior. However, there is one construction in which non-idiomatic directional particles do not act completely like prepositional phrases. As noted by Fraser (1968), prepositional phrases but not particles can appear in focus position in cleft sentences: (22)

It was into the house that John ran. *It was in that John ran. It was down the street that he pushed the cart. *It was away that he pushed the cart.

Intransitive prepositions which do not undergo particle movement, such as those in (16), can be focused in clefts: (23)

It was inside that John put the cat. It's downstairs that I want you to glance. It was overhead that he heard something. It was right afterwards that she took the job.

Given the distinction between (22) and (23), an initial characterization of this limitation in focused constructions is something like (24): (24)

Some head of an N P or P P in a cleft focus position must be "lexically specific."

For the moment, lexical specificity can just be thought of as a feature which the intransitive P's that undergo particle movement are lacking; at the descriptive level, it is just this feature which also blocks particle movement in (25): (25)

*John carried upstairs the trunk. *Mary threw outside a box. *Store overhead your suitcase.

One might wonder if the lexically specific P (overhead, upstairs, etc.) are better candidates for status as intransitive P, since their external distribution is closer to that of PP's. But the fact that most post-verbal particles are optionally transitive P, while lexically specific P are not, suggests rather that the particles are the intransitive elements most centrally associated with the category P.

(26) *John did it afterwards the concert. *The planes are cruising overhead the bridge. *Let's take it upstairs the escalator. *They keep it hereabouts the pool. The planes are cruising over (the bridge). Let's take it up (the escalator). They carried the sofa in (the room).

When we observe NP's in focus position, we see that some of them also can lack "specificity," which similarly gives rise to ill-formedness: (27)

*It's somebody that John's behavior might bother. (≠ John's behavior might bother somebody.) *It was from someone that they received a letter. *It was nobody that described the event. *It was some stuff that I received in the mail. (≠ I received some stuff in the mail, where some stuff is not drug slang and is not specific.)

So ill-formedness as a cleft focus is not a property of all intransitive prepositions, nor is it a property limited to them. It is a property which directional post-verbal particles, as a subcategory of intransitive P, share with certain N (non-slang stuff) and with certain SP(N) (non-specific some, no). Especially in light of the array of arguments presented above that post-verbal particles are P, the contrast in (22) does not justify taking them out of this category. We will, in fact, return to the stipulation in (24) later, in Ch. 7, where the "lexical specificity" of another class of P will be under discussion.⁸

In this section, it has been demonstrated that the phrasal variable X^j in the base expansion for PP can be absent, in accord with the general optionality of phrases in statements of the base component. When this happens the "intransitive P" that are generated include the English "postverbal particles" as archetypical members.

8. It is tempting to identify the "lexically specific" P like outside and upstairs with the open class members of N, A, V, and the more grammatical post-verbal particle P with the closed classes of N, A, V, whose existence and significance has been argued for in Ch. 4. The only objections to this move are that lexically specific P do not seem to be a fully productive open class, and that the partitioning of subordinating conjunctions effected by (24), to be discussed in Ch. 7, places words like before and because in the lexically specific class and hence in the "open class" of prepositions.

It may be appropriate to address here the hesitation sometimes expressed about putting idiomatic and literal directional post-verbal particles in the same syntactic class (cf. Fraser, 1976, Ch. 1). This contrasts with the general acceptance of the notion that idiomatic nouns are nonetheless nouns, and similarly for verbs and adjectives. As with lexical categories, differences between idiomatic and literal P can be found, but they do not justify completely separate categorial identities. In a comprehensive study of post-verbal particles, Fraser (1976) lists five such differences.

One is that idiomatic particles rarely allow a range of SP(P): The player drew {the number/*his opponent} part of the way out. The car {tumbled/*slowed} all the way down. But this is typical of idioms across values of X⁰, and in no way justifies bifurcating categories beyond the appropriate co-occurrence restriction; idioms of all types include fixed or restricted SP(X): *John is less clean than a whistle. *The soldiers kicked some buckets. (= The soldiers were killed.) *The truth has rarely outed. (cf. The truth will always out.)

Fraser's other distinctions between idiomatic and literal particles:

(i) In two constructions, the verb must be phonologically present in the same clause as the particle, in order to yield an idiomatic reading: *He put his friends up, not down. *John pulled the deal off, and Peters the money in. *The paint wore down, and the undercoat away. But it is again typical of many idioms that empty or deleted elements do not suffice to provide an acceptable idiomatic reading, so (i) is not a property in any way peculiar to particles (P). Examples: *We should give chase, not ground. *John took my parents into account, and Mary my sister to task. *The bucket was kicked. (= [NP the bucket] was [VP kicked [NP ∅]]) *My brother is a jack of, and I have a mild interest in, all trades.

(ii) In two other constructions, an idiomatic particle can be separated from its verbal partner by at most an intervening direct object NP; PP's or adverbs or even predicate attribute NP's are not allowed. *The mine caved quickly in. *His throwing of the presents out disturbed the company. *John grew a Catholic up. Again, other discontinuous idioms besides those involving particles exhibit this type of restriction: John brought several new facts to light in front of the jury. *John brought several new facts in front of the jury to light. You will drive me crazy over this. *You will drive me over this crazy. *Their putting of the strikers to flight was a setback. With respect to (ii), it seems that many idioms include the specification of where a free phrase of a certain type may be generated, and not respecting this leads to unacceptable usage. Thus, the cat has NP's tongue, throw NP out, bring NP to light, grow up (but not grow NP up), etc. Particles have no special status with respect to this property, and so there is again no reason not to include them in the class of intransitive P.

6.3. The Prepositional Copula as

I now wish to argue that, analogously to verbs, prepositions not only may be transitive and intransitive, but also may take predicate attribute complements. That is, I wish to demonstrate that there are P's which are counterparts to both be and become, in that they require an NP or an AP which has the properties of a predicate attribute rather than of a direct object. The P which I will argue in detail to be the counterpart to be is what can be called "non-comparative as"; the counterpart to become is into. These uses are italicized in (28)-(29):

(28) He came to the party as a monkey. Sue stayed on as a doctor. John would be a poor choice as Hamlet. I think of him as a gorilla. We introduced him as John's brother. This house is famous as a rendez-vous. Any clothes unsuitable as cold weather garments must be sold. Women as engineers still surprises some people. The use of gasoline as a food additive was criticized as a waste of energy.

(29) John turned into an ogre. The children made what we gave them into a toy village. Can Big Culture transform a suburb into a city?

Since the construction "P + predicate attribute" has received little if any attention in the literature, a fair amount of purely descriptive work will be integrated into this section. By way of establishing that there is a well-delineated construction to be studied here, I will begin by listing and differentiating a number of uses of the grammatical morpheme as. Pretheoretically, the regular grammatical uses of as fall into three main categories: (i) the use of as in comparative constructions both as a modifier of degree on the compared constituent and as the corresponding introductory word of the comparative clause, exemplified in (30) and (31); (ii) the use of as to introduce subordinate clauses with a range of meanings, exemplified in (32); and (iii) the use of as to introduce a phrase that expresses a property or role of some preceding noun phrase, exemplified in (28) and (33). It is of interest to note that almost every use of as which can be distinguished in this way has a different translation in French.

(i) Comparative as: (as) - ADJ - as; French aussi - ADJ - que (where as much = autant).

(30) As often as he does that, he regrets it. Is John as tall a man as Harry? John, (as) tall as he is, couldn't reach it. Mary was (as) pleased as she could be. They seem as eager to please as you are.

Translations: Aussi souvent qu'il le fait, il le regrette. Jean, est-il aussi grand que Harry? Jean, aussi grand qu'il soit, ne pouvait pas l'atteindre. Marie était aussi heureuse que possible. Ils semblent aussi disposés à plaire que vous.

Similar to the comparatives in (30) are those in which as - ADJ is replaced by (the) same (way); this whole phrase can sometimes be deleted, leaving us with an as which appears to be a simple subordinating conjunction:

(31) He no longer drives (the same (way)) as he used to. Just (the same) as we predicted, Mary won the race.

The comparative as has been the subject of many interesting studies in the generative literature. The classic treatments are Bresnan (1973) for English and Milner (1973 and 1978, Ch. 8) for French; Milner's work includes identification and analysis of comparatives with a fixed standard and of a "non-restrictive" type, and these have not been much studied in English. I will not go into the extensive debate between Chomsky (1977) and Bresnan (1977), subsequently taken up by other authors, as to whether WH-movement is involved in the description of comparatives.

(ii) Subordinate conjunction as.

Subordinate clause of cause or circumstance: as = French comme or étant donné que.

(32) (a) As I was late, I missed the ferry. Étant donné que j'étais en retard, j'ai manqué le ferryboat.

Subordinate clause of measure: as = French dans la mesure où.

(b) As we look into adverbs more closely, the questions become more difficult. Dans la mesure où nous étudions les adverbes de plus près, les questions deviennent plus difficiles.

Sentential Relative: as = French comme.

(c) Someone seemed sick, as you mentioned. Quelqu'un semblait malade, comme tu as mentionné.

Subordinate clause of cotemporality: as = French pendant que.

(d) As you were working, Mary came in. Pendant que tu travaillais, Marie est arrivée.


These uses of as as a subordinate conjunction are taken here to be instances of a preposition P with an S complement; thus, they should be listed lexically as as, +PREP, +S. As such, they fit nicely into the paradigm of other subordinate conjunctions, discussed in section 6.2. However, I believe that the use of as in sentential relatives, where there is also a "gap" in the clause it introduces, is best assimilated to the comparative as, as suggested in (31), by way of the deletion of the same. Thus, it is only in comparative clauses that as-clauses exhibit gaps or ellipted elements.

(iii) Non-comparative as: French comme. Several uses of as which introduce complements to N, A, and V are presented in (28); where V is involved, the as-phrases seem to be sisters to V. When a phrase with non-comparative as is outside the X̄, is preposable, and necessarily modifies the subject, the preferred French translation is en tant que:

(33)

As your professor, I condemn you to writing three term papers. She was criticizing John as his supervisor. Bill seems happy as a building inspector. My feeling as a linguist is that a solution will be found. I think of you often as your lover. Cf. As your lover, I... Cf. I think of you often as my lover. Cf. *As my lover, I... Translations: En tant que votre professeur, je vous condamne à écrire trois devoirs. Elle critiquait Jean, en tant que patronne. Bill semble content en tant qu'inspecteur de bâtiment. Ma réaction en tant que linguiste est qu'une solution sera trouvée. Je pense à toi souvent, en tant qu'amant.

Before turning to a detailed study of non-comparative as, it can be remarked that as can furthermore appear as part of a number of fixed compound morphemes or lexical expressions: as for NP, so as to VP, as opposed to NP, as regards NP, such as NP, in as much as S, as if S, as though S, do as S, etc. For the most part, I think these compound lexical items are of no more synchronic interest than are prepositions and subordinate conjunctions like because of, in case (of), in that, due to, etc., but of course, some interesting problems might emerge from a closer study.

In order to argue that non-comparative as is a prepositional copula, three points have to be established: the NP following as has the properties of a predicate attribute; the combination as + NP is a PP; and any behavior of as which seemingly differentiates it from other prepositions should be accounted for in a way that is consistent with or positively supports the first two points.

6.3.1. The Predicate Nominal after Non-comparative as⁹

NP's of the sort italicized in (28) closely resemble predicate attributes semantically, in that they indicate a role or property of the subject NP (with intransitive verbs), or of the object NP (with transitive verbs). In addition to this rather clear semantic property, a number of strictly syntactic arguments support my claim that a predicate nominal follows as.

(i) In German, non-comparative as is translated as als (while comparative as is translated as so ... wie). It is a commonplace of German pedagogical grammar that the NP after this als must agree in morphological case with the NP modified by it, as must all predicate attribute NP's. This pattern and its significance have been pointed out by R. Janda as evidence that the NP complement of als is a predicate nominal. Van Riemsdijk (personal communication) further points out that German ausser "except" displays ambivalent behavior, either assigning dative or permitting case agreement like als. In section 1.8, I discuss how the general principle of case assignment assigns an agreeing abstract case to a predicate nominal after be, become, as, etc.; see especially (74b) of Ch. 1. The realization of abstract case as morphological case has been discussed in section 5.7.

(ii) Predicate attribute NP's, italicized in (34), do not undergo what are known as simple "NP movements" (as opposed to "WH movements"), such as preposing in the passive (34a), "object shift" (34b), and indirect object interchange (34c).¹⁰ (34)

(a) John will {be/become} your lawyer. *Your lawyer will be {been/become} by John.

(b) They called a conscientious objector a moron. Let's paint the whole house that plain white. They elected Susan the chairwoman. I will consider this house my legal residence. *Your lawyer has always been difficult to {be/become}. *A moron is easy to call a conscientious objector. *That plain white would be a bore to paint the whole house. *The chairwoman was tough to elect Susan. *My legal residence is difficult to consider this house.

(c) You will probably just {be/become} Mary's husband to Sue. *You will probably just {be/become} Sue Mary's husband.

9. I have profited from discussion with Michael Brame on this topic, including a University of Washington colloquium in which he presented arguments that the comparative as is a preposition. In earlier preliminary discussions on non-comparative as, he also suggested that its complement is a predicate attribute. The arguments for this thesis that I present here I have arrived at independently.

10. See Emonds (1972) for justification of the idea that both NP's move in the English indirect object construction. Jackendoff (1977) analyzes this construction as involving a structure-preserving movement of the indirect object, but not of the direct object.


For contrast, the reader can verify that all the predicate attribute NP's in (a,b) may be questioned by the form what. The descriptive generalization (35) subsumes the behavior in (34).

(35) Predicate attribute NP's do not move into surface NP positions other than COMP.

When we examine NP's introduced by non-comparative as, they are also subject to the restriction (35), as my proposal predicts.¹¹

(36) (a) Preposing in the passive: In the evening, John doubles as the house-photographer. Mary acted as my lawyer. A strong lamp can serve as a dehumidifier. *In the evening, the house-photographer is doubled as by John. *My lawyer was acted as by Mary. *A dehumidifier can be served as by a strong lamp.

(b) Object shift: They treat an officer as a normal person. I thought of you as my friend. Will they hire me as the building inspector? *A normal person is difficult to treat an officer as. *My friend was easy to think of you as. *The building inspector would be a bore to be hired as.

Like predicate attributes after be, the NP's after as in (36) can be questioned by the NP what:

(37) What does John double as in the evening? What can this strong lamp serve as? What should I think of you as? What will they hire me as?

11. There is possibly a transformational relation between the sentences of (i):

(i) Bill still has to visit Egypt. Bill still has Egypt to visit.

If so, I assume that the derivation is similar to that of the object shift construction Egypt would be easy to visit. One plausible derivation for such constructions is that the moved NP first preposes to COMP, and subsequently moves into an NP position in the higher sentence: Bill still has [NP Egypt] [S [COMP ∅] to visit [NP ∅]]. It is of interest to note that, in accordance with my proposal, predicate nominals after be and as are excluded in sentences like (iii):

(ii) John had to become a trial lawyer. The judge had to pronounce a friend the winner. She has to consider this house her legal residence. They had to paint a whole house that plain white. John has to introduce several artists as his friends. I still have to work as a trial lawyer.

(iii) *John had a trial lawyer to become. The judge had a friend to pronounce the winner. *The judge had the winner to pronounce a friend. She has this house to consider her legal residence. *She has her legal residence to consider this house. They had a whole house to paint that plain white. *They had that plain white to paint a whole house. John has several artists to introduce as his friends. *John has his friends to introduce several artists as. *I still have a trial lawyer to work as.

(iii) Predicate nominal NP's do not appear in the focus position in cleft sentences, which is a special case of the descriptive generalization (35). I am not concerned here with the fact that as + NP can so appear, since this sequence is a PP, not an NP; what is significant is that the NP complement to as cannot appear in focus position alone:¹²

(38) *It's a teacher that he has always been. *It's that plain white that we should paint the house. *It was my legal residence that I considered this house. *It is chairwoman that they elected Susan. *It's my lawyer that Mary has become. *It was a doctor that he remained in Denver. *It was the manager that John seemed happiest as. *It's insulation that they will use this cardboard as. *It was John's brother that we introduced Sam as. *It has been my lawyer that Mary has been acting as. *It's a toy village that they made the boxes into.

12. It can be observed that predicate adjectives are also barred in the focus position of cleft sentences; this may be due to a general prohibition against the category AP in this position: *It was serious that John tried to be. *It's as serious as Bill that John seemed to Mary. *It was carefully that John tried to sew. *It's more seriously that Bill is talking to Mary.

Nothing prevents a predicate nominal NP after be or as from being the focus constituent in a pseudo-cleft construction, provided that it can be questioned by means of the NP what: What he has always been is a teacher. What we should paint the house is that plain white. What they elected Susan was the chairwoman. What Sue wants to work as is a lighting technician. What he seemed happiest as was the manager. What they will use this cardboard as is insulation. What we introduced Sam as was John's brother. What Mary has become is a lawyer. What they made the boxes into was a toy village.

The contrast between the above sentences and those in (38) is exactly the contrast noted in Emonds (1976, Ch. 4); the focus position in cleft sentences must be a non-attribute NP or PP, while any major phrasal constituent can be a focus constituent in the pseudo-cleft construction.

(iv) The predicate nominal with be alternates with the progressive construction, which is structurally a non-NP V² complement introduced by V + ing, as argued in Ch. 2. The expected parallel construction with as would be an as + VP structure introduced by V + ing. Such a construction indeed exists, and is exemplified in (39).¹³ (39)

You described him as being fussy. John regards me as knowing too much. He likes to masquerade as owning a lot of property.

The construction in (39) actually fills another gap in the range of PP expansions. The P + VP gerund complements of subordinating conjunctions discussed in Chs. 1 and 2 are always sisters to V¹ in the base, while the as + VP of (39) are sisters of V⁰. They thus help satisfy the expectation that all types of complement PP's can be found equally well both inside and outside V¹, as discussed in Section 6.1. Also as expected, the V-ing complements with progressive be and with non-comparative as act alike; neither acts as an NP: (40)

*It is being fussy that he might be. *It is being fussy that you described him as. *It's knowing too much that he regards me as. *It was owning a lot of property that he liked to masquerade as.

(v) Predicate nominals, whether after be or after as, cannot contain certain determiners, such as each, every, and any:

(41) *Did your friend(s) become {every/each/any} doctor(s) in that town? *Mary doesn't consider {me/them} each fool. *My friend(s) will work as {every/each/any} doctor(s) in that town. *Mary described me as each fool.

13. Peter Culicover has observed that gerunds in this construction can appear only with a stative verb: *His reputation as cooking asparagus preceded him. *He likes to masquerade as driving a fancy car.

(vi) Predicate nominals, whether after be or after as, cannot be the antecedent for a personal pronoun. To show this, examples must be constructed with care, for nothing prevents a pronoun from referring back to the NP modified by the predicate nominal. In the latter case, it may appear that a predicate nominal is an antecedent: John became a doctor so he would be needed. (42)

*John didn't want to {become/work as} a woman, because she would have been discriminated against. *The teacher describes him as a rock, because it never stands up. *The teacher considers him a rock, because it never stands up. *Her mother {was/doubled as} a father to Sue, and Sue was always grateful to him.

(vii) Predicate nominals may not be "relativized" freely. The restriction appears to be that when a relativized predicate nominal modifies a lexical NP, the antecedent NP is itself a predicate nominal.¹⁴ (The restriction is stated in Chiba (1973), and attributed to Kuno (1970).) As the following examples show, the phrases following non-comparative as pattern with predicate nominals, as expected.

(43) John finally {became/*saw} the doctor that we had wanted Sue to be. Mary will probably describe {you as/*him to} the character (that) you were ten years ago. They thought I {was/*met} the person that she introduced you as. We choose her {as/*with} the programmer (that) you used to work as.

Furthermore, in my speech, counter to some examples which appear in the literature and are cited by Chiba (his examples 13), a WH-word may not stand for a relativized predicate nominal. I imagine this is related to (v) and (vi) above, and to Kuno's characterization of these NP's as "nonreferential"; that is, if relative WH-words are simply a certain kind of pronoun, then a generalization of (vi) would be that "pronouns cannot refer back to or replace a predicate nominal." The examples unacceptable for me can be constructed by uniformly replacing that with who or which in (43). My argument here, as throughout this section, does not depend on exact characterization of the properties of predicate nominals, but rather insists that the nominals after be and those after as share a wide range of properties that "referential" NP's (subjects, objects, etc.) do not.

14. Actually, Kuno's proposal is that the antecedent must be "non-referential," and certain NP's other than predicate nominals qualify. Thus, an example from Robbins (1968) cited by Chiba is Everyone hated the tyrant which their king had become. (For me to accept this sentence, which must be replaced by that.) When we examine the other tests for predicate nominals given here, we find that such NP's also tend to satisfy them, thus: ?The tyrant that their king had become was hated by everyone. *The tyrant that the king has become is easy to hate. ?It's the tyrant that the king has become that everyone hates. *Everyone hates {each/every/any} tyrant that the king has become. *Everyone at court hated the old woman that the king had become, because she abolished capital punishment for tardy chambermaids. Similarly: I haven't yet detected the adult {that/*who/*which} Tommy is bound to turn into. *The adult that Tommy is bound to turn into hasn't yet been detected. *It's the adult that Tommy is bound to turn into that we haven't detected. *I'd like to detect {each/every/any} adult that Tommy will turn into.

6.3.2. The PP Status of Non-Comparative as with NP

The previous argument leads well into a consideration of whether, given the predicate nominal status of the complement of non-comparative as, we can further justify an encompassing PP structure in (44)-(47).

[Tree diagrams (44)-(47), given here as labelled bracketings of the as-phrase only; the attachment site of each PP within the clause is not recoverable from the source layout.]
(44) brought them [PP [P as] [NP extras]]
(45) struck them [PP [P as] [VP [V knowing] [NP algebra]]]
(46) could marry you [PP [P as] [NP captain]]
(47) be a poor choice [PP [P as] [NP Hamlet]]

The PP structures in (44)-(47) are supported if they undergo the PP movement rules. While in many cases they clearly do, there is nonetheless the fact that WH-fronting itself, perhaps the most widely studied PP movement rule, rarely results in the fronting of non-comparative as. The explanation for this discrepancy rests on a closer examination of the several uses of WH-fronting, and will be taken up after we discuss other tests for PP behavior. These other tests leave little room for doubt that as is a P.

(i) A familiar rule of PP-fronting applies to PP's that are sisters to V¹ and preposes them, inducing comma intonation. In the case of as-phrases, these PP's necessarily qualify the subject, those that qualify the object being sisters of V.

(48) As a supervisor, she was criticizing John. As your lover, I think of you often. As a legal authority, the captain can marry them. As a New York resident, Sue produced many plays. As the building inspector, John seemed happy.

(ii) As seen in the previous section (the examples in (34)-(38)), predicate attribute NP's do not move to NP positions other than COMP, and hence they do not appear as the focus in cleft sentences. But as + predicate attribute is, according to my proposal, a PP, and the focus position of a cleft generally accepts PP's. Therefore, an as-phrase should appear in focus position. The following examples confirm these hypotheses.

(49) It's as his supervisor that she was criticizing John. It was as a New York resident that Sue produced many plays. It might be as a building inspector that John would be happiest. It's as a doctor that he remained in Denver. It's as insulation that they will use this cardboard. It's as John's brother that we introduced Sam. It's as cold weather garments that these clothes are most suitable.

(iii) A third test that confirms that as is a preposition is the fact that when it takes a non-finite complement, this complement is gerundive (Ving). In Emonds (1976, Ch. 4), I argue that gerunds but not infinitives are NP's in English, so that it follows that a P which is subcategorized as + ___ NP and can take a non-finite complement will, without further stipulation, take a gerund rather than an infinitive. Thus, since there is abundant evidence that as takes a (predicate nominal) NP complement, it should accept a gerund rather than an infinitival complement.¹⁵

(50) I would describe what happened as you(r) not being considerate of me. Picking up the mail counts as doing your job. *Picking up the mail counts as to do your job.

The examples in (50) can be contrasted with the use of so as, which is not subcategorized as + ___ NP, and so does not take a gerund complement.

(51) John sails so as {*a pleasurable experience/*pleasing his wife/to please his wife}.

(iv) An as-phrase is a PP because it can be coordinated with a PP or an AP, as can a range of other PP's. Certain syntactic parallelisms are required in coordinate structures (although they are not sufficient to allow coordination freely), and for PP's, a coordinate sister must be an AP or another PP.

(52) John arrived promptly and in a new suit. We considered him lazy and without prospects. They wanted us to work either gratuitously or for a very low wage.

15. I am arguing that as + gerund is a PP, whether there is no "obligatory control" and the gerund is an NP, as in (50), or whether it is a VP, as in (39); it is therefore expected to appear as the focus in clefts. This prediction is not borne out: ?It's as being fussy that they described him. *It's as knowing too much that he regards me. *It's as your not being considerate that I would describe that. *It's as doing your job that picking up the mail counts. However, this is not a peculiarity limited to PP's headed by as: ?It was to breaking a strike that they compared visiting Marineland. *It's to taking public transport that we're limited. *It's of taking a long vacation that I think constantly. Therefore, the absence of as + gerund as a focus does not undermine the hypothesis that as is a preposition.

A non-comparative as-phrase can be coordinated with a PP or with an AP in some instances, but it cannot be a conjunct of an NP or an S, and is thus like other PP's.

(53) Bill works for the vice-president and as the receptionist. Bill works for the vice-president and next to the treasurer. *Bill works the copying machine and as the receptionist. *Bill works the copying machine and next to the treasurer. Mary advised Sam as his friend and without being paid. Mary advised Sam during work hours and without being paid. *Mary advised Sam as his friend and to find a job. *Mary advised Sam during work hours and to find a new job. Harriet used to write for the New Left Review and as a free-lance journalist. Harriet used to write for the New Left Review and over at Time-Life. *Harriet used to write short poems and as a free-lance journalist. *Harriet used to write short poems and over at Time-Life. John was hired as a gardener and at low wages. John was hired in the summer and at low wages. *John was hired as a gardener and to comb the cat's fur. *John was hired in the summer and to comb the cat's fur.

Reversing the order of the coordinate phrases in (53) often worsens the unacceptable examples, but does not affect the acceptable ones. Not only can PP's with non-comparative as be coordinated with PP, but as itself can be coordinated with P, again indicating the correctness of the structures in (44)-(47):

(54) All were built as and for grandiloquent religious rituals. (Frank Lloyd Wright, Writings and Buildings, Horizon, New York, 1960). I like to visit famous cities as or with a tourist. I don't care if they select me as or instead of a vice-president; I just want to get to the top.


(v) A fifth syntactic confirmation that the copular as-phrase is a PP is that it can be a sister to a head noun; some of the examples here are repeated from (28):

(55) John as Hamlet would be a poor choice. Women as engineers still surprises some people. The use of gasoline as a food additive might be the next step. My feeling as a linguist is that an elegant theory will be found.

As argued in Ch. 1, a deep structure N can have only a PP complement; an NP or AP complement is not permitted. In fact, it was noted there that certain predicate nominal complements to verbs must be mediated by as when they complement a corresponding derived nominal:

(56) Her election as treasurer surprised some people. *Her election treasurer surprised some people.

Thus, the ability to modify nouns is another PP characteristic of as + NP.

(vi) Given the confirmations of (i)-(v) that as + NP = PP, we can now approach the question of why WH-fronting rarely appears to allow as to "pied-pipe" like other P's. For example, if we look at direct questions, it seems clear that non-comparative as does not pied-pipe:

(57) *As {what/who} does he think of you? {What/who} does he think of you as? ?As what would John be a poor choice? What would John be a poor choice as? ?As what did they select him? What did they select him as? *As what is this house famous? What is this house famous as? ?As what did Sue stay on? What did Sue stay on as? *As what are those clothes unsuitable? What are those clothes unsuitable as? *As what did he seem happiest? What did he seem happiest as? *As {what/who} did Sue introduce you? {What/who} did Sue introduce you as?

However, note that precisely the sentences above which are strongly unacceptable are redeemed and retain the same meaning if how replaces as + NP.

(58) How does he think of you? *How would John be a poor choice? (where how = in what capacity) *How did they select him? (where how = in what capacity) How is this house famous? *How did Sue stay on? (where how = in what capacity) How are those clothes unsuitable? How did he seem happiest? How did she introduce you?

This suggests a simple replacement of certain of these as + NP combinations with a single WH word how, analogous to other such replacements involving where, when, and why.

(59) {Where/*at what/in what/with whom} did I leave those papers? {Why/*for what} does she work so hard? {How/*with what} does John speak? (where with indicates manner) {How/*as what} do you want to be described?

In the instances in (58) where replacement by how is not allowed, the corresponding examples in (57) are doubtfully rather than firmly unacceptable, and are marked with "?". For the moment, I assume that a replacement of as what by how accounts for the starred examples of (57), while the examples with "?" are grammatical but require further comment below on their low acceptability.¹⁶

In indirect questions and exclamations, there is a solid explanation for why as does not pied-pipe: other P's do not felicitously front either. Thus, these constructions provide a positive argument that as = P.

(60) *With how many people he talks in a day! *About what stupid subjects he writes! *As what a famous person you described me! ?Ann told me with how many people Sue talks in a day. ?Ann told me as whose son Sue wants to introduce me. The question of about what to write a talk hasn't come up. The question of as what to use this hasn't come up.

When we turn to relative clauses, we find that principles of grammar already established explain why as is not fronted like other P's. In section 6.3.1, it has been pointed out that relativization of predicate nominals requires that or Ø, and not a WH word. But certain patterns of relativization in English are known to be compatible only with WH words: (a) the fronting of P-NP, and (b) non-restrictive relative clauses. The contradictory requirements of a WH-word in (a) or (b) and of that/Ø as a relative form after predicate nominals explain why there is no fronting ("pied-piping") of as in (61).

(61) (a) *Bill is the man as who(m) I want to be introduced. (b) *John wants to stop being the manager, as which they hired him a year ago.

16. When the as-phrases modify the subject NP and seem to be outside the VP, even though how is not allowed in WH-fronting, unacceptability judgments tend to be firmer: ?As whose doctor do you claim the right to be here? *As what was she criticizing John? *As what could John marry you? I have no explanation for the stronger unacceptability in these cases. However, I think the arguments (i)-(v) and what follows in the text will allay doubts as to whether non-comparative as is a preposition.

The following might be taken as acceptable:

(62) (a) ?Bill is the man who I want to be introduced as. (b) ?John wants to stop being a manager, which they hired him as a year ago.

The grammar I am proposing can generate the sentences of (62) only by generating the corresponding examples of (61) as well (if (61) are acceptable, there is no problem; this indicates simply that relativized predicate nominals may have the form of WH words), or by adding an ad hoc stipulation that as may not appear in a fronted position in COMP.¹⁷ In preference to this second position, I claim that (61a) and (62a) have the same status, either both grammatical or both excluded, and likewise for (61b) and (62b): both are grammatical or both are excluded. The incompatibility of a fronted non-comparative as with a relative clause structure is thus directly due to the predicate nominal status of the NP introduced by as, and is explained by the hypothesis of the preceding section that as is the prepositional counterpart to be.

The only remaining uncertainty about the prepositional status of non-comparative as concerns the less than total acceptability of the sentences with "?" in (57) and the lower acceptability of the examples in (61) relative to their counterparts in (62). I attribute the marginal status of these fronted as to the combined effect on performance of (i) the increasing "stiffness" associated with pied-piping of all prepositions in American English and (ii) the general rarity of contexts when as may be pied-piped (i.e., the restriction against WH as a relativized predicate nominal and the obligatory use of how in (58)). I therefore conclude that no special prohibition against WH-fronting of non-comparative as exists, and that overall the construction in fact supports the status of this morpheme as a preposition (cf. especially the examples of (60) and of note 17).

(vii) A final argument for analyzing non-comparative as as a preposition is based on the fact that the morpheme as in its role as a subordinate conjunction is a P (cf. section 6.1) and also that as is a complementizer-like element in comparative constructions. In section 7.1, I will argue that all complementizer-like elements (that, if, for, than, and as) are also members of the category P. I have already claimed, in the discussion of post-verbal particles (section 6.2), that "in general nonrelated, non-productive syntactic categories do not have members in common." This general heuristic suggests, independently of the specific behavior of non-comparative as, that this morpheme should fall into the same category as do the subordinating conjunction as and the comparative as, namely P.¹⁸

In this section, I have presented numerous arguments that the phrases introduced by non-comparative as contain predicate nominal NP's and are PP's whose head is as, as shown in (44)-(47). The import of these arguments is that the category P is demonstrated to take the same range of complements that V does, within the limits of the base composition rule (1) of this chapter. That is, P may be intransitive, transitive, copular (as, into), or introduce a V² or V³ complement. As this thesis has been developed, all these possibilities have been exemplified both for the complements inside the V¹ phrase (the sisters to V), and for those outside it (the sisters to V¹).

17. A fact suggesting no special prohibition against the pied-piping of as even in relatives is its appearance when the predicate nominal otherwise escapes the prohibition against being replaced by WH: John just saw the man as whose son he wants to introduce me. The woman as whose husband he hoped to become famous in the end scorned him. If even sentences like these are excluded in some dialect, as well as those marked "?" in (57), then the dialect in question simply excludes non-comparative as as a fronted P in COMP position; this would have to be stipulated as a local filter in a particular grammar.
It is precisely this last point, the fact that any expansion of (1) can be used outside or inside V¹, that leads us to the next topic. That is, what constructions exemplify the structure P-V³ inside V¹?

18. This heuristic is of course pre-theoretical, and in no way commits me to the position that it always holds. While it has been a fruitful research strategy (outside the lexical categories, where it has been misleading), I do not entertain the notion that the COMP that and the DET that are of the same syntactic category, nor the hypothesis that Spanish si "yes" and si "if" are likewise. Similarly, for the moment I see no syntactic relation between the SP(A) as and the P as.

Chapter 7

S as P and COMP as P

In this chapter, I undertake to show that all subordinate clauses S are deep structure sisters to V or to P. These include clausal complements to N, infinitive and finite clauses with empty N heads (Emonds, 1976, Ch. 4), restrictive relative clauses, comparative clauses, and all other clauses which have been analyzed as sisters to "COMP."¹ In terms of generative studies subsequent to Bresnan's (1970) proposal that there is a general subordinating node COMP, my claim is that the COMP morphemes are a subset of the P which appear in the frame ___ S and that S = P. In particular, the COMP morphemes are those P which are subject to the late (post-transformational) insertion discussed in section 4.7, and whose insertion frames include + ___ S. Since this hypothesis effectively eliminates two categories and at least one primitive category from the theory of universal grammar, it is clearly methodologically desirable, if empirically supported.

It immediately follows from the hypothesis that COMP = P and from the bar notation that COMP (P) is obligatory in the expansion of S (P), since it is the head of S; thus a stipulation to this effect is eliminated. (This was pointed out by Judith McA'Nulty.) By the general base composition rule (20) of Ch. 1, repeated here, all Xʲ should be able to appear with P-S complements, as sisters to both X° and X.

(1)

X>-*X*, P m a x

But on the basis of the structure [PP P-S] for adverbial subordinate clauses, whose validity has been established in section 6.1, there is an apparent asymmetry in the distribution of P-S and P-NP. P-S seems to appear only outside V, while P-NP occurs both outside and inside V, as well as in the system of deep structure complements to Nᵐ within NP, as demonstrated at length in Chomsky (1970) and Jackendoff (1977).

(2) (a) P-NP inside V: John put his meeting before lunch. (b) P-NP outside V: John had several meetings before lunch.
(3) (a) P-S inside V: *John put his meeting before he left to eat. (b) P-S outside V: John had several meetings before he left to eat.
(4) (a) P-NP in NP: [A criticism of the bank during the lecture] would be out of place. Ann wrote about [her irritation at the procedures]. [The visibility at the airport] was poor. [The man with a coat] was singled out. (b) P-S in NP: *[A departure before we ate] would surprise the guests. *Ann wrote about [her irritation because John was late]. *[Any precautions in case fire broke out] should have been posted.

1. In Ch. 2, I have shown that gerunds - i.e., verb phrases introduced by V + ing - are not S's. The English gerunds with lexical subject NP's have been shown to have the following deep structure: [NP [DET lexical NP + 's] [N̄ [N Ø]] lexical VP],

Should not the structure P-S appear in these latter positions as well? That is, if P-S is an expansion of PP, it should occur in all PP positions. In fact, these positions where P-S is "missing", the positions of sister to X° and of sister to N, are exactly those where an S has been postulated. Therefore, a fundamental argument for my claim is that it regularizes the distribution of P-S, extending it to all base PP positions.

An objection might be made that the distribution of one or another value of COMP or of some class of lexical subordinate conjunctions is skewed and less general than that predicted by rule (20) of Ch. 1. To forestall such an objection, I will go through each type of P and COMP with an S sister, and show that apparent asymmetrical distributions either do not exist or are due to characteristics of universal grammar independent of putative COMP/P or S/PP category distinctions. As each value of COMP originally postulated by Bresnan is examined, we will want to see if they are really limited to being sisters of X°, or whether they each have a wider distribution outside X which confirms the hypothesis that S and PP can be identified. Conversely, we will want to know if or why adverbial subordinate clauses are limited to being outside V. The values of P in the order that I will examine them here are: (i) P = that, (ii) P = WH (if), (iii) lexical P, and (iv) P = for.²

2. It may be recalled that Bresnan's proposals for COMP are mainly concerned with arguing against the transformational insertion of this category. Throughout, I agree that P (her COMP) is a base category, though, as her (1972) work suggests, it may be empty early in the transformational derivation. Her article is not concerned with the question of whether one may identify COMP and some other category.

7.1. [P, -WH] = that/P

This configuration translates the "unmarked" complementizer that into the present system. The finite complement clauses and perhaps present subjunctive complements which are sisters to L (L = N, V, A) are introduced by this P. Moreover, we find that-clauses outside N as well, in restrictive relative clauses. (This construction will be discussed in more detail in section 7.3.) Thus, it remains only to ascertain whether we find that-clauses outside V; if so, the unmarked instance of P in the context S has the full distributional range predicted by my hypothesis.

Throughout this chapter, I will be using several well-established tests that determine whether a PP or S is inside or outside V. If a PP can prepose with comma intonation, it is outside V (Chomsky, 1965, Ch. 2):

(5) Before lunch, John had several meetings. *Before lunch, John put several meetings.

Strictly subcategorized complements, including obligatory complements and those whose heads are idiomatically selected by the governing verb, are inside V (also Chomsky, 1965, Ch. 2). A PP can follow the "pro-verb" do so if and only if it is outside the lowest V. This idea is originally in Lakoff and Ross (1966); the difficulties with the two-way implication are, I believe, cleared away in Emonds (1976, Ch. 5). There are also imperfect but interesting correlations between a PP being inside V and a P being "strandable" in passive and WH-fronting constructions, but since P which take an S complement never strand (another property they share, incidentally, with COMP), I will not have occasion to use such constructions with this material.

The construction which realizes the unmarked complementizer outside V is what traditional grammar calls the "result clause." It has been studied in detail by Rouveret (1978), with special emphasis on French. It is exemplified in (6)-(7):

(6) (a) John went downtown by taxi, so that he would arrive on time. (b) So that he would arrive on time, John went downtown by taxi. (c) Many corporations gave money to the campaign, so that rent control started to seem complicated. (d) *So that rent control started to seem complicated, many corporations gave money to the campaign.
(7) (a) So many corporations gave money to the campaign that rent control started to seem complicated. (b) Many corporations gave so much money to the campaign that rent control started to seem complicated. (c) The corporations made such a campaign that rent control started to seem complicated.


There are in fact two sub-types of so that clauses; in the first type (6a), so that is nearly synonymous with in order that, and introduces a result planned by the agent of the main clause. In this case, the subordinate clause can be preposed (6b), like other PP's. In the second type (6c), so that introduces an observed result and this clause cannot be fronted (6d). It is also the case that the second usage of so that is possibly discontinuous, with the so being able to appear in the first clause with quantifiers or in the form such, as in (7). In discontinuous constructions, even stronger unacceptability results from preposing the result clause.

(8) **That rent control started to seem like a complicated issue, so many corporations gave money to the campaign.

These "observed result" clauses linked with so/such in (7) are the V-external constituents that exemplify [PP that-S]. It might be suggested that this second type of result clause ("observed result") is a true case of an S, rather than PP, and that this explains the restriction on preposing in (6d) and (8). However, such a proposal would fail to correlate a putative S property with any other useful distinction between this node and PP.³

There are three reasons for generating an observed result clause as a PP outside the V (cf. Rouveret, 1978); and its non-preposability in (6d) and (8) can in fact be attributed to an independent factor which itself turns out to be a fourth argument for V-external status. First, observed result clauses are not subcategorized by individual verbs, and can appear, subject to pragmatic felicity, with any predicate (So many months are rainy that I feel like moving). Second, result clauses must follow all V-internal material: *John asked so many people that I was embarrassed whether they liked his paper. Third, like other PP clauses external to N̄, result clauses are allowed only if interpretable as a relative.

(9) Could a city such that none of its inhabitants were poor ever exist? He gave a talk such that critical remarks about its assumptions were difficult to make.

A result clause without a relative clause sense is at best marginal inside N P (we will return to why this is so):

3. Thus, suppose one claimed that subordinate clauses which don't prepose with comma intonation were S, separate from PP. As is being shown in the text, such S are distributed both inside and outside of both V and N, so there would be no correlation between deep structure distribution and preposability. We will see below that certain subordinate clauses (e.g., if-clauses, that-clauses, and infinitives) cannot be focused in cleft sentences; but again, many other adverbial subordinate clauses cannot be focused. Thus, the ability to appear in focus position doesn't correlate with putative S (as opposed to PP) status either.

(10) *So many trips that he gets homesick are expected of him. *She had put a talk of such length that I had complained on the agenda.

Finally, the very inability of observed result clauses to prepose is paralleled by the fact that comparative clauses cannot either. Moreover, just as preposing a discontinuous result clause (8) is worse than preposing a simple one (6d), so a preposed comparative clause where more/less is separated from than (12b) is worse than one in which more than is preposed (11b).

(11) (a) Mary contributes to campaigns, more than John gives to charity. (b) *More than John gives to charity, Mary contributes to campaigns.
(12) (a) Mary contributes more of her time to campaigns than John gives of his to charity. (b) **Than John gives of his (time) to charity, Mary contributes more of her time to campaigns.

These unacceptable frontings in both comparative and observed result clauses should, it seems, be linked to the fact that these subordinate constructions are dependent on the specifier systems of nouns, adjectives, and quantifiers. That is, such dependencies, presumably expressed in logical form representations, are incompatible with PP preposing. Baltin (1978, Ch. 2) independently justifies a general restriction on logical form which explains the restrictions under discussion.

(13) Head-Modifier Constraint (Baltin). A modifier which is outside its head's maximal projection in s-structure must follow the head.
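The constraint in (13) can be read as a check on linear order. The following Python sketch is only an illustration under simplifying assumptions: a sentence is reduced to word-index spans, and whether the modifier lies outside the head's maximal projection is supplied by hand rather than computed from a tree. The names `Span` and `violates_constraint` are hypothetical conveniences and belong neither to Baltin's formulation nor to the present system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """A contiguous stretch of words in s-structure (hypothetical helper)."""
    label: str
    start: int  # index of the first word
    end: int    # index one past the last word


def violates_constraint(head: Span, modifier: Span, outside_projection: bool) -> bool:
    """Return True if the ordering requirement of (13) is violated.

    Assumption: `outside_projection` is decided by the analyst; nothing here
    derives maximal projections from a phrase marker.
    """
    if not outside_projection:
        return False  # (13) is silent about modifiers inside the projection
    return modifier.start < head.end  # the modifier fails to follow the head


# (7a): "So many corporations gave money to the campaign that rent control ..."
ok_head = Span("so many corporations", 0, 3)
ok_clause = Span("that rent control started to seem complicated", 8, 15)

# (8): "That rent control started to seem like a complicated issue, so many ..."
bad_clause = Span("that rent control started to seem like a complicated issue", 0, 10)
bad_head = Span("so many corporations", 10, 13)

print(violates_constraint(ok_head, ok_clause, True))    # False
print(violates_constraint(bad_head, bad_clause, True))  # True
```

On this reading, the preposed result clause of (8) is ruled out by linear order alone, without any appeal to a categorial difference between S and PP.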

By (13), specifier-linked comparative and result clauses, which can be considered parts of complex modifying units, must follow the phrase they modify in s-structure. Since we have every reason to believe that comparative clauses are outside V, it is plausible that this common syntactic and semantic behavior between observed result and comparative clauses (dependencies with SP(X) and forbidden PP preposing) reflects a common syntactic position, outside V. The link between result clauses and SP(X), rather than status as an S and not PP, is what explains the unacceptability of preposing a result clause.

Given this explanation for the peculiarity of result clauses (nonpreposability), nothing stands in the way of analyzing observed result clauses introduced by that as P + S structures generated outside V with an unmarked P. They are a structural parallel to the restrictive relative clauses introduced by that that are generated outside N̄. Thus, [P, -WH] = that appears in all the positions predicted by the hypothesis that S is a special case of PP.

7.2. [P, +WH] = if/whether

In the present system, the possibility that COMP can be expanded as ±WH is expressed by allowing the syntactic feature WH to co-occur with P, as in (14):

(14)

P → WH / ___ S (optional)

We can consider (14) to be part of the base. In the appendix to this chapter, I argue that (14) follows from the more general nature of WH as a feature that can co-occur with any closed category. It suffices for the exposition here to take WH as the feature necessary for characterizing the English morphemes if and whether.

(15) if: +P, +WH, + ___ S
(16) whether: +P, +WH, + ___ S⁴

Using (14)-(16), it follows that (17) must be the s-structure for the indirect question complements of sentences like those in (18).

(17) [ [P, WH if/whether] S ]
(18) You should inquire {if/whether} his friend put those books on the train. Bill asked Mary {if/whether} a friend had put some books on a train.

We ask, if the expansion of PP by (14)-(15) is general, what construction realizes the structure of (17) when PP is outside X, as in (19)?

(19) [Tree diagram: PP, expanded as [P, WH] - S, is the sister of X under a node labelled "X or X"; the projection levels are not recoverable from the source layout.]

4. Possibly, whether is a fronted sentence adverb (the WH form of either/so/too), and is not strictly speaking a COMP. French and some English dialects lack a phonetic counterpart to whether. It may be noted that whether is possible in a number of contexts where if and that are both excluded; an interesting account of these distinctions in terms of a theory in which clauses as well as NP's receive abstract case is developed in Yim (1984). Did he know {whether/*that/*if} or not she won the award? I am uncertain as to {whether/*that/*if} you should attend the party. They enjoy discussing the problem of {whether/*that/*if} circles can be squared. Debates about {whether/*that/*if} the weather is changing are futile. You should tell me {whether/*that/*if} to go to the meeting.

If there are well-motivated instances of (19), my claim that P-S occurs in all PP positions is confirmed for P = WH. We will see in the next section that when X = N in (19), we have a relative clause, such as those introduced by WH words in English.⁵ When X = V and no WH-fronting of a phrase applies, the structure (19) is simply that of a conditional introduced by if:

(20) You should relax if his friend has put those books on the train. Bill would have told Mary about it if a friend had put some books on a train.

For the first time in a generative framework, we can uniformly assign if and its French translation si to a single category: [P, +WH]; if no longer has a schizophrenic COMP/P status. Moreover, the fact that if-clauses in Modern English (both indirect questions and conditionals) and the corresponding si-clauses in French always require non-subjunctive finite complements can be stated in a unified way. As expected, the PP inside V in (18) may not be fronted, whereas those outside V in (20) may be:

(21) *If his friend put those books on the train, you should inquire. *Whether a friend had put some books on a train, Bill asked Mary. If his friend has put those books on a train, you should relax. If a friend had put some books on a train, Bill would have told Mary about it.
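The pattern in (18), (20) and (21) can be summarized in a small sketch. It assumes, purely for illustration, that a clause headed by [P, +WH] is fully characterized by its head and by whether it is generated inside or outside V; the class `WhClause` and the functions `reading` and `can_prepose` are hypothetical names, not part of the analysis itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WhClause:
    """A clause headed by [P, +WH], i.e. if or whether as in (15)-(16)."""
    head: str       # "if" or "whether"
    inside_v: bool  # True: sister to V, as in (18); False: outside V, as in (20)


def reading(clause: WhClause) -> str:
    """Interpretation follows from position, not from a separate category."""
    return "indirect question" if clause.inside_v else "adverbial (e.g. conditional)"


def can_prepose(clause: WhClause) -> bool:
    """Only PP's outside V prepose with comma intonation, as in (21)."""
    return not clause.inside_v


inquire_if = WhClause("if", inside_v=True)   # You should inquire if ...
relax_if = WhClause("if", inside_v=False)    # You should relax if ...

print(reading(inquire_if), can_prepose(inquire_if))
print(reading(relax_if), can_prepose(relax_if))
```

The point of the sketch is that a single lexical entry for if suffices: the two uses differ only in the structural position of the PP, which in turn predicts the fronting contrast.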

5. Unlike Chomsky (1973) and much work based on that article, I make no distinction between WH and wh. These two categories are never crucially in contrast, and so should be coalesced. This implies that WH-fronting (specifically in relatives) can only apply in a clause whose COMP is +WH. Such a structure-preserving analysis is not allowed in Chomsky (1973), apparently so that a rule of surface interpretation can "tell" whether the clause involved is an indirect question ("+WH") or a relative ("-WH", accompanied by a fronted wh-phrase). But this non-syntactic bookkeeping role for the feature WH on COMP is unnecessary, since interpretation as an indirect question can and should be made to depend on the structure [PP(=S) WH or WH-phrase - S] being a sister to X°. This rule of obligatory interpretation allows WH to be generated freely on COMP in all S positions (and so to play the role of a structure-preserving landing site in relatives). It also explains why in (a)-(b) below, NP_i must be an empty NP, and not an overt WH-phrase; an overt WH-phrase heading a clause which is a sister to V° can only be interpreted as an indirect question.
(a) John may buy a book_i [NP_i ∅] to read.
(b) *John may buy a book which to read.


In a construction little noted in the generative literature, we can observe that WH-fronting, precisely as expected under the present analysis, can use the head P with the feature WH of the conditional structure in (19) as a landing site, just as it can use what is called COMP:

(22) You should relax, whichever books his friend has put on the train.
Bill would have told Mary about it, whatever train his friend had put the books on.
Whether or not his friend has put the books on the train, you should relax.
Whichever train you take, you will be late.
However long the lecture lasts, I can come pick you up.

These examples transparently show the need for extending COMP (i.e., S) to positions outside V, and for considering such S to be PP (because of their preposability). This is done here in the simplest possible way, by identifying PP and S and also P and COMP.⁶

An objection might be made at this point that while I have shown that S and [PP P - S] occur both inside and outside V and N, a distinction of some type between these constructions is still justified by the ability of PP but not of S to serve as the focus constituent of a cleft sentence. This observation dates from Akmajian (1971), and is developed in some detail in Emonds (1976, Ch. 4). Upon reflection, however, we find that a number of adverbial subordinate clauses pattern with S in this instance, so that an unmodified distinction between S and PP sheds no light on the problem:

(23) It's {before/until/while/because/when/only if/just in case} Greg starts to sing that I tune the piano.
(24) *It's {if/even if/although/unless/so that/lest} Greg start(s) to sing that I tune the piano.

6. Conditional clauses, whether or not they exhibit WH-fronting, seem to be the only adverbial subordinate clauses which can freely serve as the antecedent for an impersonal argument it of certain verbs. In this they are parallel to the right and left dislocated NP's studied in Ross (1967). See also Haiman (1978).
[If John sings arias]_i, it_i bothers me/I dislike it_i a lot/I don't talk about it_i to anyone.
[Whatever arias John sings]_i, it_i bothers me/I dislike it_i a lot/I don't talk about it_i to anyone.
Those arias_i, they_i bother me/I dislike them_i a lot/I don't talk about them_i to anyone.
*[Although/Because/After John sings arias]_i, it_i bothers me/I dislike it_i a lot/I don't talk about it_i to anyone.
To say that this argues for S rather than PP status for if-clauses would solve nothing, since other instances of S (in the case that a distinction between the categories is maintained) do not have the distribution of the above paradigm.

What is useful here, however, is a restriction on cleft focus constituents brought out earlier in connection with a distinction between post-verbal particles (such as English in, out, etc.) and other more contentful adverbials (such as outside, upstairs, etc.). I repeat (24) of section 6.2 for convenience:

(25) Some head of an NP or PP in a cleft focus position must be "lexically specific."

The discussion of adverbial subordinate conjunctions up to this point suggests that if and that are grammatical morphemes without intrinsic content, while the opposite is true of the conjunctions in (23). Thus, it is reasonable to conclude that (25) is playing a role in determining the judgments in (23)-(24), although further study is needed to explain exactly what "lexically specific" is with respect to semantic interpretation. But even should this tack not prove particularly fruitful, it is clear in any case that the contrasts in (23)-(24) do not support any otherwise motivated distinction between S and PP.

7.3. P = Lexical Subordinating Conjunction

One example of a gap in the distribution of P + S, attributed to M. Kajita in Chomsky (1970), is that the P + S structures introduced by conjunctions such as before, because, if, although, in that, etc. are unacceptable as sisters to N̄ (this is discussed in more detail in Emonds, 1976, Ch. 5), even though they are fine as sisters to V (section 6.1 here). Thus, we have:

(26) *A criticism of the book before you read it would be foolish.
*John's departure because he was sick was a surprise.
*Sue described the danger if I travelled by train to my friends.
*Ann wrote about her irritation in that the procedures were irregular in a letter to the management.

In a system which distinguishes S and PP (= P + S here), the absence of these complements to N̄, in some idiolects at least, can only be stipulated without explanation. However, once S and PP are identified, as they are here, an account based on a universal principle of semantic interpretation becomes possible. The principle is:

(27) Clausal sisters to noun phrase heads are universally reserved as a structure which is primarily interpreted as a restrictive relative clause.

In many idiolects at least, this is the only fully grammatical interpretation of a clausal sister to N^m, m > 0. We can think of (27) as a kind of Projection Principle (cf. Chomsky, 1981) for relative clause structures, which imposes a representational uniformity on relative clauses at deep structure, s-structure, and logical form. Besides encompassing relative clauses with a gap, as we find them in English, it also applies to relative clauses with resumptive pronouns, such as the "result clause" relative structures exemplified above in (9). A more formal version of (27) is (28):

(28) Relative Clause Projection Principle. In a syntactic structure of the form [NP NP_k - α^max], if α is not empty, then either (i) α assigns a θ-role to NP_k (e.g., α = V or A), or (ii) α contains NP_k (e.g., via WH-movement to an empty P). Here NP may be N or N.
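As a reading aid, the disjunction in (28) can be written out as a small predicate. This is only a sketch: the function name and its three boolean arguments are illustrative assumptions, not part of the theory's own formalism, and the judgments about particular values of α are supplied by hand rather than computed from a structure.

```python
def satisfies_rcpp(alpha_is_empty: bool,
                   alpha_theta_marks_np_k: bool,
                   alpha_contains_np_k: bool) -> bool:
    """Return True if [NP NP_k - alpha^max] is consistent with (28)."""
    # (28) only constrains structures in which alpha is not empty.
    if alpha_is_empty:
        return True
    # Clause (i): alpha assigns a theta-role to NP_k (e.g. alpha = V or A).
    # Clause (ii): alpha contains NP_k (e.g. WH-movement into an empty P).
    return alpha_theta_marks_np_k or alpha_contains_np_k


# A contentful conjunction such as "before" in (26) neither theta-marks the
# head NP nor contains it, so the adverbial reading is excluded.
before_clause_ok = satisfies_rcpp(False, False, False)

# An empty deep-structure P later filled by a WH-phrase coindexed with NP_k
# falls under clause (ii): an ordinary restrictive relative.
wh_relative_ok = satisfies_rcpp(False, False, True)
```

The predicate deliberately says nothing about how one decides whether α θ-marks or contains NP_k; those decisions are made elsewhere in the grammar.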

Let us now see how (28) applies to any clausal complement outside N to guarantee that it is a relative clause, and not an adverbial clause as in (26). If, as established in Ch. 1, a phrasal sister to an N^m can't be an NP, then α in (28ii) is required in deep structure to be a P. By (28i), it moreover follows that α cannot be a subordinating conjunction of intrinsic semantic content, of the type discussed in section 6.1. That is, the obligatory interpretation of P + S in the context N^m (m > 0) is that obtained by (28ii), so that the examples of (26) are excluded. (I agree with the argument in Chomsky (1970) that speakers who accept such sentences are derivatively generating them.) Given (28), we only expect outside N those subordinators P which result from empty P in deep structure: the pure grammatical formatives that, WH, and for. It is well-known that exactly these complementizers may introduce English restrictive relatives.

A second position where adverbial subordinate clauses seem excluded is as sisters to X°, where X = N, V, and A.⁷ If S as a sister to V instantiates P + S, as I claim, we want to know why the vast majority of semantically contentful subordinating conjunctions P do not appear inside the smallest verb phrase, since a phrase structure distinction no longer performs this function. The answer here is that these contentful conjunctions mostly refer to time and to the system expressing causality and implication. More generally, such P, whether they have S or NP complements, are typically excluded from constituency in the smallest V. It therefore follows that most P inside the smallest V are grammaticalized elements; this holds equally well for P - NP and for P - S. There is a class of semantically contentful P which do occur inside the smallest V: the directional prepositions (into, below, etc.) and some locational ones (with certain verbs). A number of the syntactic tests for V establish this (do so, PP fronting properties, obligatory subcategorization, etc.). The only subordinating conjunction which can indicate direction or location is where, and this is, as we expect, perfectly possible inside V.

(29) (a) Obligatory subcategorization by put, reside, go, etc.:
John put a table where I wanted one.
Don't you want to reside where there is sun?
The typewriter should go where we can't hear it.
(b) Do so can be followed only by material external to V:
*Bill put a table where I wanted one, but Mary did so where it was inconvenient.
(c) Fronting without a comma:
Where there is sun I'd love to reside.

7. Hendrick (1978) argues that PP or S cannot appear inside a deep structure AP; nothing in what follows crucially interacts with this claim. That is, I do not demonstrate here that P + S structures occur inside AP, contra Hendrick, but Hendrick's arguments could not be taken as evidence that S is not a PP.

Likewise, a subordinating conjunction whose semantic content is "similarity" can be subcategorized by a verb and appear inside the smallest V, for such conjunctions can be involved in obligatory subcategorization (30a-b), are excluded after do so (30c), and are not preposable with comma intonation (30d).

(30) (a) They treated me well. They treated me as if I were royalty.
(b) *They treated me. (with the same sense of treat as in (a))
(c) *They treat me as if I were royalty, but you do so as if I were kin.
(d) *As if I were royalty, they treated me.

So in fact, the distribution of the structure [PP P - S] inside V is in no way exceptional; some instances of P of semantic content (where, as if) occur there, as do other more grammaticalized P corresponding to the morphemes labeled COMP. The P of time, causality, and implication are excluded inside V equally well in both S and NP contexts. In summary, we have seen that adverbial subordinate clause expansions of PP occur in all PP positions, except that they are disallowed in N-external position by the universal Relative Clause Projection Principle (28).

7.4. [P, +GOAL] = for/∅

All work on the COMP in English recognizes that it has a third value besides WH which introduces infinitival clauses of the surface form "(for NP) to - VP." When S (that is, [PP P - S] in the present system) is a sister to X°, this value of COMP yields sentences as in (31):

(31) They arranged (for a friend) to deliver the package.
We bought some books (for Sam) to read.
Your proposal (for others) to be invited was turned down.
He was anxious for the tools to be bought.

The feature WH on P does not co-occur with this type of "for-to clause"; there are no infinitival indirect questions with an expressed for-phrase subject. (We will see below that infinitival relatives with overt WH-phrases are not derived from a COMP with a feature +WH, as are finite relative clauses; cf. note 5.)

Bresnan (1972) and Chomsky (1973) present evidence that the subordinating conjunction for is not just an accidental synchronic homonym with the preposition for. They point out that a significant number of verbs and nouns such as be, wait, beg, pray, plan, arrange(ment), proposal, etc. are subcategorized as +___for α (more accurately, as +___[+GOAL] α), where α can be either NP or S, without any clear difference of semantics or grammatical relation. These distributional similarities between the preposition and the complementizer for indicate that they share the same syntactic feature. As in section 5.7, for is to be notated [P, +GOAL]; as seen there, for is an item which can be inserted after syntactic transformations apply.⁸ The hypothesis of this chapter, by eliminating the COMP/P dichotomy, permits us to state the subcategorization of these verbs and nouns elegantly, using the feature +[P, +GOAL].

If at least some for-to infinitival clauses are S complements to deep structure P's with the feature GOAL, we expect them to appear not only in the context V ___, as above, but also in the context V NP ___, where the transitive V takes VP-internal indirect objects. This class of complements, missing from the lists in the appendix of Rosenbaum (1967), is exemplified in (32):

(32) (a) John may buy a book for you. John may buy you a book.
John may buy a book for you to glance through.
(b) Mary has found a house for John. Mary has found John a house.
Mary has found a house for us to store books in.

If a verb does not take a VP-internal indirect object, it does not typically allow this kind of infinitival complement:

(33) (a) John could lick the stamps for you. *John could lick you the stamps.
*John should lick the meat to get a taste of.
(b) Mary burned the gloves for John. *Mary burned John the gloves.
*Mary burned the gloves to destroy before the police arrived.

It may be asked, why do these for-to clauses in VP-final position exhibit an obligatory gap, while for-to subject and object clauses do not (in fact, cannot) have a gap? While I do not have a complete answer, it is instructive to look at a tree for (32) and compare it with the configuration required in a relative clause by (28).

(34) [Tree diagram not recoverable from the source; surviving labels: a book; [P, +GOAL] ∅ (= for); you to glance through [NP_k ∅].]

8. More precisely, for is [+GOAL, -LOCATION]. But since all P which are +___S are -LOCATION, with the possible exception of where, I assume that it is redundant to specify -LOCATION with the dictionary entry of for in its complementizer use.

The structure (34) almost satisfies the conditions of the Relative Clause Projection Principle (28), if we set α = [P, +GOAL]. The only discrepancy is that NP_k and PP do not satisfy the requirement of together forming an NP. In contrast, a for-to clause in subject or object position could not satisfy (28), since it would not be contiguous to a preceding, c-commanding NP_k. If we could find a principled reason for relaxing the NP-constituency requirement, we would have an explanation for why gaps are permitted (and required) in only "oblique" for-to clauses. Unfortunately, simply dropping the NP-constituent requirement in (28) would incorrectly allow that-clauses and indirect questions after transitive verbs such as tell to contain gaps coreferential with the direct object. However, at the descriptive level, it still seems a step forward to drop the NP-constituent requirement of (28) just when α = [P, +GOAL].

Thus, we see that for-to clauses appear in all X-interior positions, which is automatically captured by analyzing them as P + S. The hypothesis collapsing S and PP is further confirmed if [P, +GOAL] (for) introduces clauses which are exterior as well as interior to X. When X = V or A, there are two types of exterior clauses. First, "purpose clauses", as in (35), are exterior to V and may be introduced by for:

(35) John should write a check (in order (for you)) to have a record.
(In order (for you)) to have a record, John should write a check.⁹

In accord with its PP status, and parallel to "planned result" clauses, a purpose clause may prepose. Second, the adjectival specifiers too and enough can be linked to infinitival clauses introduced by for, quite parallel to the situation with "so + a result clause" or with "more/less/as + a comparative clause", as in the preceding discussion of that-clauses. Whether too/enough-linked clauses are inside AP or outside it in the base depends on whether one accepts the arguments of Hendrick (1978) that no deep structure complements of A are interior to AP. In any case, these infinitival clauses are exterior to A, since they can co-occur with and must follow complements strictly subcategorized by A.

(36) I don't think Peter is upset enough about leaving for you to have to visit (him).
Cf. *Peter isn't upset enough for you to have to visit (him) about leaving.

9. In my speech, in order can be deleted only when the infinitive immediately follows it.

Exactly parallel to result clauses (cf. 8), the type of for-clause in (36) that is linked with the adjectival specifier system may not prepose:

(37) *For you to have to visit (him), I don't think Peter is upset enough about leaving.

The clausal type exterior to N introduced by [P, +GOAL] is, as determined uniformly by the Relative Clause Projection Principle (28), the infinitival relative (38) much discussed in recent generative literature (e.g., Emonds, 1976; Chomsky, 1977).

(38) I suggested a subject for Mary to talk on to Sam.
I suggested a subject on which to talk to Sam.

We can conclude that English for-to clauses appear in all PP positions, inside and outside V, and inside and outside N. A strong confirmation that all these for-to clauses are derived from a uniform syntactic source is that they invariably exhibit infinitival verb forms introduced by to. I have argued elsewhere (Emonds, 1976, Ch. 5) that these infinitives are derived by means of a local transformation which I called "for-phrase formation". This transformation brings about the "Exceptional Case Marking" (ECM) required for the subject of an infinitive in the system of government devised in Chomsky (1981). The motivation for this device is the apparent universal tendency for the subject NP and SP(V) (= INFL, in Chomsky's terminology) to "agree" in being empty or not empty. On the one hand, finite SP(V) tend to appear with lexical NP, and on the other, infinitives tend to have null (or "understood") subjects across languages. This is in no way an extrapolation from English, since English is precisely one of the most recalcitrant systems, with its variety of lexical subjects for nonfinite forms in diverse constructions. Now, what traditional descriptions of fairly well-known verbal systems call "nonfinite" forms exhibit very few if any tense-aspect contrasts, compared to the finite forms. In light of this, it seems valid for quite a range of languages to characterize non-finiteness as empty s-structure SP(V), and to note that empty SP(V) seem to go with empty subject NP.¹⁰

While we might envision stipulating more directly this agreement in "lexicality" between subjects and SP(V), consideration of apparent counterexamples has led Chomsky to an analysis in which this agreement is derived, rather than axiomatic. The basic principle is that SP(V) assigns case to the subject NP only if SP(V) is expanded into features such as ±TENSE and ±PAST. An undifferentiated SP(V) does not govern the subject and hence does not assign it case at s-structure; by the Case Filter (Ch. 1, 59), such a caseless NP must be non-lexical. However, since the subject of an infinitive introduced by for ([P, +GOAL]) may be lexical, it must receive case from something other than SP(V), i.e., "exceptionally." I propose to formalize this case-marking of the deep structure subject in (39) by means of the local rule (40).

(39) [PP [P, +GOAL] [S [NP Mary] [SP(V) ∅ (= surface to)] [VP talk on that subject]]]

(40) Exceptional Case Marking. [P, +GOAL] - NP ⇒ 1 + 2 - ∅ ¹¹

This rule is obligatory, possibly by virtue of an extension of the Revised Second Inflection Principle of Ch. 5 (80). The symbol "+", as is conventional, indicates Chomsky-adjunction. But, in the bar notation, the constituent formed by a head-of-phrase category (here P) and a complement phrase is, by definition, a minimal projection of that category.

10. As discussed in Ruwet (1979), a relatively peculiar type of infinitive may arise in French by virtue of WH-fronting of a subject NP out of certain object clauses. He argues conclusively that no raising-to-object process can be responsible for this construction in French. This suggests that a finite SP(V) can delete in French in the left context of the form "V - WH-trace ___." It can be noted that such infinitives cannot arise in all contexts in French, but only when there is an immediately contiguous verb. It would appear that a language-particular rule subject to the Adjacency Hypothesis of the Introduction deletes the finite SP(V) in these instances. For an explanation in terms of abstract case, see Rizzi (1981).

11. If the SP(V) is lexical so that ordinary case-marking of the subject NP applies, we can ask, what prevents such an NP from also undergoing ECM? The trace left by this NP would be case-marked, and would fall under the requirement (whether primitive or derived) in the theory of Chomsky (1981, 322) that case-marked empty phrases be co-indexed with a constituent in a non-argument position, a typical instance being a trace of WH-movement. Since the lexical NP_i in (41) is in an argument position (i.e., the object position for P), SP(V) can't be lexical. Put another way, if SP(V) is finite, its subject is necessarily a lexical NP or a trace of WH-movement, topicalization, etc. (Whitney (1984) shows that non-argument positions must be construed to include many others besides that of COMP.) It therefore follows in the present analysis, without stipulation, that for takes an infinitival.

Thus, assuming this interplay of Chomsky-adjunction and the bar notation, the derived structure produced by (40) is (41).¹²

(41) [Tree diagram not fully recoverable from the source. [P, +GOAL] ∅ and the lexical NP_i Mary form one constituent; its sister S contains the trace NP_i ∅, SP(V) ∅ (= surface to), and the VP talk on that subject.]

In this derived structure, there is one and only one case-marking category (P) available to mark the lexical subject NP when, at s-structure, the Generalized Case Marking Principle of Ch. 1 applies. In this analysis, there is no stipulation that for, unlike other complementizers, has a special case-marking property P. The stipulation is rather that the NP moves into the case-marking domain of the +GOAL complementizer.¹³

A number of independent considerations converge on the formulation (40) of Exceptional Case Marking. Some of these are given in Emonds (1976, Ch. 5), but I repeat them here for completeness.

12. My first formulation of this rule in Emonds (1976, Ch. 5) assumes that the Chomsky-adjunction effected by this rule creates a PP rather than a P̄. But this is incorrect for four reasons.
(i) The constituent created by for-phrase formation does not act like a PP with respect to subsequent movement rules. The fact that it creates a P̄, together with the general restriction against moving a head X or X̄ out of X^max (unless X^max exhaustively dominates the X or X̄ in question), excludes movement of this for-NP.
(ii) The sense of Chomsky-adjunction is to create a minimal unit containing the adjoined and host constituents. An extension of the bar notation to such adjunctions requires that a transformationally created constituent conform to it, and the minimal phrase containing a P and an NP is P̄, not PP.
(iii) An argument can be made, and is in the text below, that the structures created by for-phrase formation do not tolerate SP(P), which is predicted if P̄ but not PP results from this rule.
(iv) In my original formulation, the derived position of NP does not c-command its trace. But the definition of c-command which must be adopted in an adequate theory of binding, as argued in Contreras (1984), and also in stating general conditions on coreference (cf. Reinhart, 1981), must have at least the following as a consequence: P̄ (and X̄) do not count in determining c-command by a phrase. Using any such definition of c-command, the lexical NP_i in (41) correctly c-commands its trace.
The ECM rule (40) is not strictly speaking a movement rule, as it does not reorder terminal elements in a string. It is technically a "readjustment rule," whereby dominance relations but not linear order are affected. I assume that readjustment rules are a type of local transformation, and hence subject to the Adjacency Hypothesis of the Introduction.

13. An objection that might be raised against (40) is that such movement of the subject NP to a position outside S means that the S of a for-to clause no longer has a lexical subject, and hence is not a governing (or binding) category for lexical and NP-trace anaphors, as it must be in the theory of binding of Chomsky (1981, section 3.2.3). However, the S in (39) does have an NP-trace as a subject, and this trace (or any NP-trace) is "accessible" (in Chomsky's sense) to all anaphors inside the clause, so in fact the binding theory is not affected.

(i) If the subject NP in the deep structure (39) were exceptionally case-marked without moving out of S, it would have to be stipulated that for is both +___NP and +___S, where S is non-finite. But in the analysis here, the P for, which is subject to the late lexical insertion discussed in Ch. 4, is uniformly inserted in the surface context +___NP; by principle (22) in Ch. 4, this NP must be lexical. Nothing needs to be said about for taking a non-finite complement; a deep structure finite SP(V) after [P, +GOAL] in (39) will give rise to a case-marked empty NP_i sister in (41), and this NP is excluded since it is not co-indexed with a phrase in COMP (here P), as explained in note 11.

(ii) A language which allows lexical dative subjects for infinitive constructions, such as Russian, can be analyzed in exactly the same way as English. In the light of the analysis of dative case as a realization of [P, +GOAL] ∅ in a language which marks case morphologically (section 5.7), it is possible to say that such dative subjects also result from the rule (40).

(iii) It has been shown by Klima (1964b) and in section 5.8 here that the morphological "case" of English pronouns is not case of the sort that depends on grammatical relations or abstract case; rather, "subjective" pronouns reflect the structural property of surface sisterhood with the SP(V), while "objective" pronouns appear elsewhere. The fact that object pronouns appear after the subordinator for follows directly from the above formulation of Exceptional Case Marking.

(iv) Sentence adverbs, which are typically analyzed as daughters of S, cannot appear between for and the following lexical NP. This contrasts clearly with the behavior of whether, if, and that, even in present subjunctive clauses whose semantic affinity with for-to clauses is well-known.

(42) They suggest that initially he be put on probation.
*They intend for initially him to be put on probation.
(43) They asked whether initially he was put on probation.
They knew that initially he was put on probation.
(44) I find it irritating that usually this street is closed.
*I find it irritating for usually this street to be closed.

The derived structure of for-to clauses given here (41) accounts for why a sentence adverb cannot intervene between for and a subject NP.

(v) Consider now infinitival relatives, as in (45)-(46).

(45) They should list the topics (for students) to write on in the syllabus.
(46) They should list the topics about which to write in the syllabus.

There are two fundamental distributional facts about infinitival relatives to account for, as pointed out in Emonds (1976, Ch. 5). The first is that an overt lexical subject, with or without for, is excluded if there is an overt WH-phrase. The second is that, if the relative is infinitival, an overt WH-phrase can only be a PP. In this section, I treat only the first of these points. If there is an overt WH-phrase, it is in COMP, and in the present framework, this means that it has been substituted for P. (This mechanism will be taken up in section 7.5.) The question is then, what prevents (47a-b), while allowing (46)?

(47) (a) *They should list the topics about which for Mary to write.
(b) *They should list the topics about which Mary to write.

(47a) is ungrammatical because two elements which are generated in the same position, the WH-phrase and the P for, are both present. (47b) is excluded because the maximal phrase which has substituted for the deep structure P is not a possible case-marking category; only X and SP(X), but not X¹ or X², can case-mark at surface structure. That is, nothing prevents rule (40) from applying, but its applying does not lead to case-marking, and so the Case Filter rules out (47b).

The above five considerations provide ample evidence for the formulation of Exceptional Case Marking as in (40). The uniform applicability of this rule in all for-to infinitives, whether in purpose clauses, infinitival relatives, or in complements to X°, demonstrates that the same element of COMP, [P, +GOAL] in the context of an S sister, appears regularly both inside and outside both V and N, and that here again there is no principled reason for distinguishing S from [PP P - S]. Even if the formulation of Exceptional Case Marking were replaced by a somewhat different device for permitting lexical subjects of infinitives, this device would have to call upon the prepositional nature of the complementizer for and would suggest that COMP is a type of P.¹⁴

14. It might be of interest to discuss an important restriction on the NP after this for. Since I think this restriction can plausibly be attributed to other factors in grammar, I have not stated it explicitly in (40). The condition is that NP_i in (41) may not undergo WH-movement (or other movements into a COMP) at any point in a derivation. My proposal for explaining this divides into two parts. First, since the COMP of the present system is P and not P̄, the lexical NP_i in (41) is not in fact "in COMP", and this explains why the subject NP after for cannot serve as the WH-phrase in a relative or as the obligatorily fronted phrase in an indirect question. The relevant examples are:
(a) *The teacher for who(m) to teach second grade is Mr. Jones.
*Bill asked for who(m) to finish the work.
(b) Bill arranged for who to teach second grade? (echo question)
I wonder who's lying in order for who(m) to get this job. (multiple question)
In the present system, no problem arises because the offending who(m) after for in (a) are simply not initial WH-constituents in a P that immediately dominates S, which would be required for their interpretation in logical form. Thus, the situation is no different than in a system without for-phrase formation.
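The case-marking reasoning behind (40) and (47) can be made concrete with a small sketch. It assumes a deliberately reduced model that is not part of the text's formalism: the P position holds exactly one filler (so (47a), with both a WH-phrase and for, cannot even be represented), and the names `Clause`, `case_source`, and `passes_case_filter` are hypothetical. It models only the Case Filter, not the reasoning of note 11 that excludes a finite SP(V) after for.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Clause:
    p_filler: str          # what occupies P: "for", "WH-phrase", or "empty"
    spv_expanded: bool     # True if SP(V) carries features such as TENSE
    lexical_subject: bool  # True if the subject NP is overt


def case_source(clause: Clause) -> Optional[str]:
    """Name the category that case-marks the subject, or None."""
    if clause.spv_expanded:
        return "SP(V)"     # ordinary case-marking by an expanded SP(V)
    if clause.p_filler == "for":
        return "P"         # rule (40) puts the subject in the domain of P
    # A maximal phrase substituted for P (a WH-phrase) cannot case-mark.
    return None


def passes_case_filter(clause: Clause) -> bool:
    """A lexical subject NP must receive case; an empty one need not."""
    return (not clause.lexical_subject) or case_source(clause) is not None


# (45) "... the topics for students to write on"
ex45 = passes_case_filter(Clause("for", False, True))
# (46) "... the topics about which to write"
ex46 = passes_case_filter(Clause("WH-phrase", False, False))
# (47b) "*... the topics about which Mary to write"
ex47b = passes_case_filter(Clause("WH-phrase", False, True))
```

Under these assumptions the sketch accepts (45) and (46) and rejects (47b), matching the judgments reported in the text.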


The point of this chapter so far has been to show that subordinating conjunctions of lexical content, that, WH, and for, all occur in all PP positions, and that no motivated bifurcation of the traditional grammatical category of subordinating conjunction (e.g. according to which subordinate clauses appear in cleft focus position, which ones appear inside X as sisters to X, or which ones undergo PP-preposing) corresponds to a COMP/P distinction. Rather, all such bifurcations in fact lend support to the unity of COMP and P and to the identification of S and PP. Any apparent residual "COMP behavior" reduces to particular behaviors of individual grammatical formatives (WH, for, and the unmarked P that in the context ___S), and does not support setting up a category distinction between them and the P of semantic content.

7.5. P as a Landing Site for WH-movement

As in much other current work, I assume that the rule which fronts WH-marked constituents in questions, relatives, exclamatives, and conditionals (section 7.2) is simply the rule "Move α," where α has the value WH. Since this rule moves phrases, and the constituent on which the feature WH is morphologically realized is SP(X) (e.g., how = SP(A) and which = SP(N)), there must be a "percolation mechanism" by which features of SP(X) also appear on some dominating node(s) in the same projection of S.¹⁵ The strength of the Left Branch Condition in English (Ross, 1967), or of whatever prohibition on extraction subsumes this constraint, then prohibits any but the highest in a series of "percolated" WH categories from moving. Thus, in (48), Move WH applies only to the boxed constituent. Given this percolation mechanism, rules that move phrases (e.g., a WH

Footnote 14 (continued). Second, the NP_i in (41) cannot be a trace of WH-movement or of any movement into a higher "COMP". This restriction may eventually be fully predictable as a characteristic of all NP's which have first moved locally. For example, moved indirect object NP's in English are subject to a similar restriction discussed briefly in Chomsky (1977):
*The man who_i John arranged (for) NP_i to fix the lamp was late.
*The man who_i John brought NP_i the lamp was late.
*Who_i was the lamp ready (for) NP_i to use?
*Who_i did John write NP_i that note?
Whitney (1982) proposes an analysis of the above restriction on indirect objects in terms of a theory of binding of WH-traces; I believe her analysis can be extended to the NP following for, since in both situations there is a chain that includes both an NP-trace and a WH-trace that are in the same clause.

15. Features from SP(X) can apparently percolate up through any number of phrasal nodes which are themselves specifiers. Thus, a noun phrase like whose father's partner's losses is a WH-constituent subject to Move α. Similarly, as originally noted I believe by Jackendoff, the man's partner's losses is +DEFINITE, even though the definiteness feature appears on an embedded SP(X). Van Riemsdijk (1984) argues, on the basis of pied-piped infinitivals in German, that percolation may occur even after movement, in order to satisfy restrictions on the interpretation of relative clauses in logical form.

300

A unified theory

of syntactic

categories

(48)

[SP(A), WH]

I

how

A

I

awful

phrase) can be stated as "Move a", where a is a feature that percolates to the phrase. Further, this percolation of syntactic features from the specifier to the maximal projection is a generalization of the percolation of the lexical category X° through the projections of X to X m a x . That is, N P is simply a percolated N with an index indicating the "height" of the projection (Muysken, 1983). In this way, we can also say that N P movement is in fact "Move N". When transformations are applying on the domain of NP,-, it is of course impossible for a rule stating, for example, Move N, to apply to N P , itself. But when Move N applies on a domain properly containing N P „ then the prohibition on moving a bare head away from other material within an X 2 , as discussed in Ch. 3, requires that N P rather than N alone move. It is a commonplace that NP-movement involves structure-preservation (substitution of one consistuent for another of the same category), and this characteristic is clearly not affected if NP-movement is formulated, using percolation, as Move N. Returning now to WH-movement, it can be observed that the typical landing site for WH-movement is [P, W H ] , Given the percolation of W H to maximal projections and the elimination of the W H / w / i distinction (justified in note 5), it is clear that WH-movement as depicted in (48) is structure-preserving in the same way as " M o v e N " is. This seems to be a good result, since, as argued in Ch. 3, the structure-preserving constraint of Emonds (1976) continues to have a great deal of excess and verified content, in comparison to subsequent proposals in the generative literature that have subsumed parts of it. And of course, nothing in the structure-preserving constraint prohibits one from distinguishing various landing sites, for example, according to whether they are argument positions or nonargument positions, when such distinctions are needed for other principles of grammatical theory. 1 6 16. 
The binding theory of Chomsky (1981) is an archetypical instance of such principles. Whitney (1984) argues that a wide range of morphological processes (such as auxiliary contraction) and semantic processes (such as focus assignment) crucially distinguish between movements to argument and non-argument positions. She claims further that movements to non-argument positions are not structure-preserving; in terms of what is developed in the text, I would reformulate this claim by saying that movements to argument positions never involve percolation in a crucial way, as does WH-fronting.
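The percolation mechanism of this section can be illustrated with a small sketch. It is an illustration under simplifying assumptions, not part of the analysis: the class `Node`, the functions `percolate` and `highest_marked`, and the representation of a phrase as one optional specifier plus its other daughters are hypothetical names chosen here. A feature on SP(X) is copied to the phrase containing that specifier, repeatedly, and the highest phrase carrying the feature is the one that Move WH would affect.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Node:
    """A phrase or head; `specifier` is SP(X), `children` are the other daughters."""
    label: str
    features: Set[str] = field(default_factory=set)
    specifier: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)


def percolate(node: Node, feature: str = "WH") -> bool:
    """Copy `feature` upward, but only from a specifier to the phrase containing it.

    Returns True if `node` carries the feature once percolation is done.
    """
    for child in node.children:
        percolate(child, feature)  # non-specifier daughters do not pass the feature up
    if node.specifier is not None and percolate(node.specifier, feature):
        node.features.add(feature)
    return feature in node.features


def highest_marked(node: Node, feature: str = "WH") -> Optional[Node]:
    """Return the topmost node carrying `feature` (top-down, left-to-right search)."""
    if feature in node.features:
        return node
    daughters = ([node.specifier] if node.specifier is not None else []) + node.children
    for daughter in daughters:
        found = highest_marked(daughter, feature)
        if found is not None:
            return found
    return None


# "whose father's partner's losses" (note 15): WH starts on the innermost specifier.
# Treating "whose" as that innermost specifier is an assumption of this sketch.
whose = Node("whose", {"WH"})
fathers = Node("NP: whose father's", specifier=whose)
partners = Node("NP: whose father's partner's", specifier=fathers)
losses = Node("NP: whose father's partner's losses", specifier=partners)
vp = Node("VP", children=[Node("V"), losses])  # hypothetical containing phrase

percolate(vp)
target = highest_marked(vp)
print(target.label)
```

In this sketch the VP never acquires WH, because the noun phrase is not its specifier; only the highest of the percolated WH categories is returned as the candidate for movement.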


Given this structure-preserving analysis of WH-movement, it now appears quite mysterious that there is an apparently universal tendency to move an introductory P along with a WH-marked NP under WH-movement, at least optionally. (49)

They told me with whom I should talk. The man for whom I worked was arrogant. He never made much money, in whatever field he worked.

No plausible convention of unmarked "feature percolation" will pass features (e.g., case, number, etc.) from within a governed maximal projection (here, an NP) to a higher phrase (here P or Pᵐᵃˣ). Thus, WH-fronting of a PP does not seem to preserve structure. Even if one rejects the idea that this rule preserves structure, its simplest formulation would in any case be Move WH; this would include assuming feature percolation from SP(X) to Xᵐᵃˣ, but again further percolation of WH up to PP is ad hoc. If we again examine the tree in (48), however, we see another feature in the landing site for WH-movement besides WH, namely, P itself. As suggested above, I take it that movement of a phrase Xᵐᵃˣ by Move α in a domain larger than Xᵐᵃˣ consists of α having the value X and of Xᵐᵃˣ moving by virtue of percolation of X. It now becomes clear why a PP can move in WH-fronting constructions. One possible value for α in Move α is α = P. By percolation, PP is a P with a height index for the number of bars, so a PP can move into an empty P (with a proviso to be noted below) in structure-preserving fashion. The principal hypothesis of this chapter (50) then predicts that PP can also move in structure-preserving fashion into what has been notated COMP. (50)

A COMP is a P which dominates no lexical terminal element during the derivation of s-structure.

If COMP were kept separate from P, there would be no plausible explanation for why PP, but not VP, moves into COMP in WH-fronting contexts. A further means for corroborating this explanation of PP-fronting in WH-contexts is to see whether PP can undergo "WH-fronting" when no WH is available in COMP as a structure-preserving landing site. As seen in section 7.4, infinitival relatives have a COMP which is +GOAL (for); moreover, +GOAL, which triggers Exceptional Case Marking (40), and the feature +WH are incompatible expansions of P: (51)

I don't know whether to leave or not. *I don't know whether for John to leave or not. *I don't know whether John to leave or not. *I don't know for John to leave or not.


Since, according to the present analysis, infinitival relatives have a COMP which is [P, +GOAL, −WH], the only overt fronted category they should tolerate is PP.¹⁷ As noted in section 7.4, this is exactly the case. (52)

A good topic (*which) to write on has not been found. *Mary is the person whose teacher to consult about that.
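The reasoning behind (50) to (52) can be sketched as a simple check on landing sites. This is a minimal illustration under stated assumptions, not the text's formalism: the names `Phrase`, `Site`, `is_comp`, and `can_land` are hypothetical, −WH is modeled simply as the absence of WH, the proviso on specifiers of P discussed below is not modeled, and the verb phrase "leave early" is an invented example. A phrase may substitute into an empty site if it shares the site's category (α = P) or shares a percolated WH feature with it (α = WH).

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Phrase:
    category: str                      # category percolated from the head: "P", "N", "V"
    text: str
    features: FrozenSet[str] = frozenset()


@dataclass
class Site:
    category: str
    features: FrozenSet[str] = frozenset()
    filler: Optional[Phrase] = None    # None = dominates no lexical terminal element


def is_comp(site: Site) -> bool:
    """(50), simplified: a COMP is a P that is lexically empty."""
    return site.category == "P" and site.filler is None


def can_land(phrase: Phrase, site: Site) -> bool:
    """Structure-preserving substitution by Move alpha into an empty site.

    alpha may be a category (alpha = P moves a PP into an empty P) or a
    percolated feature (alpha = WH moves a WH-phrase into [P, WH]).
    """
    if site.filler is not None:
        return False
    same_category = phrase.category == site.category
    shared_wh = "WH" in phrase.features and "WH" in site.features
    return same_category or shared_wh


question_comp = Site("P", frozenset({"WH"}))       # [P, WH]
infinitival_comp = Site("P", frozenset({"GOAL"}))  # [P, +GOAL, -WH]

which_wall = Phrase("N", "which wall", frozenset({"WH"}))
behind_which_wall = Phrase("P", "behind which wall")
on_which = Phrase("P", "on which")
leave_early = Phrase("V", "leave early")           # hypothetical VP

pairs = [
    (which_wall, question_comp),
    (behind_which_wall, question_comp),
    (leave_early, question_comp),
    (which_wall, infinitival_comp),
    (on_which, infinitival_comp),
]
for phrase, site in pairs:
    print(phrase.text, "->", can_land(phrase, site))
```

Under these assumptions a PP can land in either kind of COMP, a WH-marked NP only in [P, WH], and a VP in neither, which is the pattern the text attributes to (52).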

Thus, the rule of Exceptional Case Marking postulated in (40) and the derived structure it induces fully explain the restrictions on English infinitival relatives, with no language-particular stipulation beyond what is needed for the general description of all English for-to clauses.¹⁸ In a framework such as that in Emonds (1976), where COMP and P are not identified, the analysis of for-phrase formation must stipulate both (i) that for is a COMP that is +P, and (ii) that for-phrase formation exists (i.e., no comparable rule for WH or that exists). But here, stipulation (i) is eliminated, and is provided by a general theory of subordinating conjunctions, which are all P. Put another way, an accidental confluence of ad hoc properties is eliminated. I return to the proviso about the exact nature of the formal operation by which a projection of P fronts in an embedded WH-fronting context. Examine (53), which includes both a finite and a non-finite value for SP(V). (53)

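[Tree diagram (53), only partly recoverable from the source: PP dominates P¹; P¹ dominates to and [NP, +WH]; [NP, +WH] dominates [SP(N), +WH] (which) and N (boy).]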

17. As explained in section 3.5, structure-preservation is a condition on overt categories in their surface position, and not on traces.

18. It can be asked why a projection of P can move in structure-preserving fashion in WH constructions, but not in *It bothers me about politics to talk. I assume that there is no rule to interpret in logical form a PP which is not a WH-phrase in an embedded COMP.


If Move α substitutes PP for P, the bar notation will be violated in derived structure, since PP will be the head of P. (In Ch. 1, independent justification was given for allowing X to be the head of X, but X cannot be the head of X.) Now, the effect of structure-preserving movement is supposed to be precisely that the bar notation is respected in derived structure. In order to avoid a violation, at most P can move in a structure like (53). Empirically, this is confirmed, since the various specifiers of P, which are exterior to P but inside Pᵐᵃˣ, block WH-fronting in the embedded contexts where structure-preservation holds.¹⁹ (54)

Which wall should we plant the bushes three feet behind? Behind which wall should we plant the bushes? The meal that I don't want to leave right after is supper. The meal after which I don't want to leave is supper.

(55)

I don't know (*three feet) behind which wall we should plant the bushes. The meal (*right) after which I don't want to leave is supper. The wall (*three feet) behind which to plant the bushes is over to the left. The appropriate meal (*right) after which to leave is supper. John described the window (*directly) beneath which to hang the picture. They are pointing out the pillows (*right) on which to sit.

In my view, the predictions of (52) and (55) strongly confirm the proposal that the structure-preserving constraint is restricting the landing site in COMP of WH-movement, as long as we assume as our method that ad hoc stipulations for the constructions just discussed are excluded. Within this framework, the identification of COMP with P is a crucial and necessary step, and in this way the central hypothesis of this chapter, that COMP is a P, is also confirmed.²⁰

19. Examples such as the following are excluded by the prohibition on extracting an X from within an Xᵐᵃˣ which contains other lexical material such as SP(X). *Behind which wall should we plant the bushes three feet? *After which meal should we leave right?

20. This analysis of why PP's can move in WH-fronting contexts allows us to attribute all instances of "pied-piping" to various influences of universal grammar. Ross (1967) discusses three different WH-fronting contexts in which a constituent larger than the WH phrase itself moves. One type is the movement of a PP which contains a WH-marked NP, which has just been explained in the text. A second type is the movement of a containing phrase Yᵐᵃˣ when WH is in a phrase in the SP(Y); this is discussed in note 15. Thirdly, constituents properly containing a WH-phrase can prepose in appositive relative clauses: John, the stories about whom were false, was outraged. Ross gives examples of this third type of pied-piping in some restrictive relative clauses;


7.6. Subcategorization for Elements in COMP

In the preceding sections of this chapter, I have argued that S (= clauses introduced by the grammatical formatives labeled COMP in Bresnan, 1970) and PP (= [PP lexical P – S]) have the same deep structure distributions, as right sisters to V, V, N, and N, and possibly to A and A. I have also pointed out that other putative distributional contrasts between the categories S and PP correlate neither with each other nor with the difference between lexical and grammatical P. The differing distributions of subclasses of P – S in cleft focus position, in fronted PP's, in relative clause positions, and in adverbial clauses of time and causality have all been shown to be due to properties not expressible as a PP/S category distinction. General parsimony then requires us to dispense with a category distinction that is a purely derivative notion; we are forced to consider that COMP's are P's. The particular characteristic of COMP's is that they are P which are empty in deep structure; they are subject to the late lexical insertion of section 4.7 in the context S. Their counterparts in the context NP are of, to, and for, whose post-transformational insertion has been treated here in several places, in particular, in the first appendix to Ch. 2. It is not surprising, then, that there are only three COMP in Bresnan's original system; roughly the same number of P's are inserted post-transformationally in both S and NP contexts. As we have been discussing, the three COMP are +WH (if, whether), +GOAL (for), and the unmarked COMP which is realized as that in the s-structure of finite clauses.²¹

Footnote 20 (continued): however, acceptability of this type of construction varies greatly, and I think it is derivatively generated on the model of appositive relatives. The report the cover on which was soiled was rejected. ?We should visit only the city a favorable report on which Jack received. *Most students are interested in any professor a security file on whom the government won't release. Cf. Most students are interested in Prof. Rotestern, the security file on whom the government won't release. In Emonds (1979), I propose that pied-piping in appositive relatives results from topicalization of the larger NP (e.g., the stories about whom, etc.) in a root clause, and is not related to WH-movement at all. This root topicalization brings a constituent containing WH into COMP, and thereby permits a relative clause interpretation rule to assign this WH the same reference as a maximal projection that immediately precedes COMP. The claim that pied-piping in appositive relatives is a root operation independent of WH-fronting is further confirmed by noting that appositive relatives violate an otherwise general restriction to the effect that WH-fronting moves only P, and not Pᵐᵃˣ: That is the sunny wall, three feet behind which we should plant the bushes. The lunch at Moissonier, right before which I had arrived by plane, I'll never forget.

21. The freedom of occurrence of a zero allomorph of that is sometimes exaggerated.


A possible justification, though a weak one, for setting up a separate category S would be if some classes of verbs were subcategorized to take just S complements. However, most lexical classes of verbs are subcategorized to take various values of COMP, and do not freely allow all values of COMP. For example, believe takes neither a for-to clause nor an indirect question, prefer takes no indirect question, persuade does not take a for-to clause, ready takes only a for-to clause, hope takes only an indicative clause or an obligatory control infinitive, wonder takes only an indirect question, etc. (Cf. the appendices of Rosenbaum, 1967.) Such verbs and verb classes are (respectively) subcategorized for complements by features like +S, +(GOAL) S, +NP (WH) Vᵏ, +GOAL S, +Vᵏ, +WH Vᵏ; the existence of a category COMP is irrelevant to stating these facts.²²
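The selectional facts just listed can be tabulated in a small lookup, which makes it easy to see that no predicate here needs a notion "any COMP". This is only a sketch of the prose statements, not of the subcategorization features themselves: the names `CLAUSE_TYPES`, `EXCLUDES`, `ONLY`, and `may_take` are hypothetical, and wherever the text is silent about a combination the function returns `None` rather than guessing.

```python
from typing import Dict, FrozenSet, Optional

CLAUSE_TYPES: FrozenSet[str] = frozenset(
    {"indicative", "for-to", "indirect question", "control infinitive"}
)

# Facts the text states as exclusions ("takes neither ... nor ...").
EXCLUDES: Dict[str, FrozenSet[str]] = {
    "believe": frozenset({"for-to", "indirect question"}),
    "prefer": frozenset({"indirect question"}),
    "persuade": frozenset({"for-to"}),
}

# Facts the text states exhaustively ("takes only ...").
ONLY: Dict[str, FrozenSet[str]] = {
    "ready": frozenset({"for-to"}),
    "hope": frozenset({"indicative", "control infinitive"}),
    "wonder": frozenset({"indirect question"}),
}


def may_take(predicate: str, clause_type: str) -> Optional[bool]:
    """True/False when the text settles the question, None when it does not."""
    if clause_type not in CLAUSE_TYPES:
        raise ValueError(f"unknown clause type: {clause_type}")
    if predicate in ONLY:
        return clause_type in ONLY[predicate]
    if predicate in EXCLUDES and clause_type in EXCLUDES[predicate]:
        return False
    return None  # not stated in the passage


for predicate in ("believe", "ready", "hope", "wonder"):
    print(predicate, {t: may_take(predicate, t) for t in sorted(CLAUSE_TYPES)})
```

The three-valued result is a deliberate design choice: the passage gives only partial information for believe, prefer, and persuade, so the sketch avoids asserting anything beyond it.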

Footnote 21 (continued): That deletes only in certain contexts stateable in terms of adjacent constituents, as required by the Adjacency Hypothesis on language-particular rules. These contexts are:
(i) that ⇒ Ø / V (Xᵐᵃˣ) ___
(ii) that ⇒ Ø / N ___ NP
As in all structural descriptions of local rules, one of the categories corresponding to a term must minimally c-command the others in order for the rule to apply. In accordance with (i), the italicized that in (iii) cannot be deleted, due to a failure either of c-command or of adjacency.
(iii) Is what you believe that John dislikes you? That he would pay for this I can't believe. It will amuse John that you have dropped in. Harry claims (that) the train isn't running, and Sam that it is. I tried to convince a gambler last night that he should leave town. Mary seemed convinced to most of us that her house had burned down. So many customers demonstrated that PG&E lowered their rates. Bill told Mary, I've heard, that his apartment was free.
In accordance with (ii), the italicized that in (iv) cannot be deleted.
(iv) He finally found a doctor, after a long search, that he has confidence in. A guest arrived that John refused to introduce.
Since a local rule can apply only if one of the terms affected in the rule minimally c-commands the others (Emonds, 1976, Ch. 6), rule (ii) cannot apply in the examples in (v), because the head X (woman, machines) does not minimally c-command that.
(v) I already know the woman passing out exams that Mary is watching. Some machines you can buy here that they agree to service actually last.

22. Grimshaw (1979) argues that verbs which take clausal complements are subcategorized as +S or +(S), and that the choice of individual elements of COMP is accomplished by an autonomous semantic selection for complement types at the level of logical form. To the extent she is correct, the basic syntactic distinction she draws between


Some few classes of verbs do appear to allow all values of COMP in their complement clause: (56)

Verbs such as mumble, yell, murmur, whine, stutter, etc. John murmured for her to leave him alone. John murmured that it was going to burn down. John murmured how often his friends had deserted him.

(57)

"Psychological" verbs with animate objects and clausal subjects: amaze, bother, disturb, frighten, please, etc. For her to leave him alone would disturb me. That it was going to burn down greatly disturbed me. ?How often his friends had deserted him disturbed me.

If either or both of these verb classes select complements which turn out to be best described as what has been notated S, the frame [P Ø] S is available in the present system, so there is no need for a separate category COMP. For example, the "manner of speaking verbs" (Zwicky, 1971) in (56) can be subcategorized as +[P Ø] S. Similarly, a feature for subject clauses of psychological verbs such as +[P Ø] S will induce an empty N under the subject NP, in accord with (60) of Ch. 2, and allow for all three types of subject clauses. The existence of such features for a few verb classes does not justify introducing a fifth phrasal type COMP (= S) into the bar notation. However, it is doubtful that the classes of verbs in (56)–(57) are even accurately described by the features +S and +S NP, respectively. Concerning the manner of speaking verbs, it has already been noted in section 7.3 that semantics essentially excludes any P within the smallest V which refer to time and causality. Other than the traditional COMP, the only subordinating conjunctions left to consider are a few such as as if. This P is certainly allowed with manner of speaking verbs: (58)

John murmured as if he were badly injured. The boy yelled as if he couldn't be heard. He is whining as if someone is depriving him.

Footnote 22 (continued): +NP and +S can be replaced here by the contrast between +NP and +PP. Her semantic selection of, say, an indirect question by a verb like wonder (+Q) rules out a PP of the form in the house or because John left just as easily as it can rule out a for-to or a that-clause. Thus, her discussion provides no rationale for retaining an S/PP distinction. While Grimshaw is certainly right in her claim that WH-introduced exclamations do not have the distribution of direct and indirect questions, the fact remains that she focuses entirely on the contrasts between that-clauses and WH-clauses, and says nothing about how various sorts of infinitives are to be selected. As a result, I remain skeptical of her claim that selection of complement types can be completely separated from (syntactic) subcategorization.


Thus, the feature +P S is just as appropriate for these verbs as is +S. While the psychological verbs do not take as if-clause subjects, it is not clear that they take indirect question subjects either. It is well-known that English headless relative NP's begin with what, where, and when, but less felicitously with who, which, whether, how, and why. The pattern in (59) then suggests that the psychological verbs cannot take true indirect question subjects: (59)

*Which person John brought irritated me. ?Who you talked to didn't frighten Mary. *Whether John left amazed his parents. ?How sick the dog was upset the neighbor. *Why his friends had visited him pleased my father.

Along the same lines, indirect questions but not headless relatives allow infinitives (H. Yamada, pers. comm.): (60)

*What to buy now would irritate me. *Where to talk to Bill might frighten Mary. *How often to help Mary disturbs her brother.

Thus, it seems that the feature +S is really of no help in describing the psychological predicates, and that a more appropriate feature is +(GOAL) S. This allows for for-to clause subjects, that-clause subjects, and headless relative clauses, which are typically allowed as arguments to verbs in any NP position. In conclusion, it seems that the facts about verbs being subcategorized for propositional complements of various sorts do not justify introducing a phrasal category S which is independent of PP. Thus, even Bresnan's original defining characteristic of COMP, that its elements appear in the subcategorization frames of verbs, does not turn out to justify its separate existence.

It is perhaps appropriate here to discuss whether there is a fourth lexical choice for a complementizer (i.e., a grammatical P before S); in particular, one which is always Ø and which appears with infinitives, as in Rouveret and Vergnaud (1980). Typically, a "zero-complementizer" or rather no complementizer appears with infinitives that exemplify obligatory control (61a), subject-raising (61b), an accusative-marked lexical subject (61c), or an understood subject in a for-to clause (61d). (61)

(a) John's attempt [PP [P Ø] [S [NP Ø] to leave]].
(b) John seemed to me [PP [P Ø] [S [NP Ø] to be unhappy]].
(c) John believes [PP [P Ø] [S [NP Sue] to be clever]].
(d) John arranged [PP [P, +GOAL Ø] [S [NP Ø] to leave]].


In my view, there is no fourth lexical choice for a complementizer which is realized as empty. All the [P Ø] in (61) result from the interplay of the Empty Head Principle (first appendix to Ch. 2) and independently justified subcategorizations such as +VP, +S, and +GOAL S. The EHP was developed precisely to account for other structures where empty P's induced by subcategorization remain empty throughout a derivation. A typical instance is repeated here as (62).

The empty P in (62) is licensed by (63). (63)

Empty Head Principle. If an empty head X⁰ induced by subcategorization c-commands an adjacent empty and caseless Yᵐᵃˣ, X⁰ has no phonetic realization.
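The conditions of (63) can be restated as a single conjunction, which may help in checking the cases discussed next. This is a minimal sketch, not the text's formalism: the names `Head`, `MaximalProjection`, and `stays_unrealized` are hypothetical, c-command and adjacency are supplied as plain booleans rather than computed from a tree, and taking the subject NP of the infinitive as the relevant Yᵐᵃˣ follows the remark on (61d) but is otherwise an assumption of the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Head:
    category: str
    is_empty: bool
    induced_by_subcategorization: bool


@dataclass(frozen=True)
class MaximalProjection:
    category: str
    is_empty: bool
    has_case: bool


def stays_unrealized(
    head: Head,
    neighbor: MaximalProjection,
    c_commands: bool = True,
    adjacent: bool = True,
) -> bool:
    """Conditions of (63): the empty head has no phonetic realization only if all hold."""
    return (
        head.is_empty
        and head.induced_by_subcategorization
        and c_commands
        and adjacent
        and neighbor.is_empty
        and not neighbor.has_case
    )


goal_p = Head("P", is_empty=True, induced_by_subcategorization=True)  # [P, +GOAL]
empty_subject = MaximalProjection("NP", is_empty=True, has_case=False)
lexical_subject = MaximalProjection("NP", is_empty=False, has_case=True)

print(stays_unrealized(goal_p, empty_subject))    # as in (61d)
print(stays_unrealized(goal_p, lexical_subject))  # lexical subject: not licensed as empty
```

On this reading, the empty [P, +GOAL] of (61d) is licensed as unrealized when the subject of the complement is empty and caseless, and is not licensed when that subject is lexical.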

As seen in Chs. 2 and 3, complements of obligatory control result from the frame +VP. In (61a), an S is required by the Revised θ-Criterion of Ch. 2 and a PP by the indirect θ-role assignment that always accompanies N heads. The P remains empty, however, by virtue of (63). In (61b-c), the complements of seem and believe result from the subcategorization +S; it is not clear that a PP is required over the S, but if so, (63) also licenses its P as empty. In (61d), as discussed in section 7.4, [P, +GOAL] is empty prior to s-structure, and remains so in accord with (63) if the subject NP of the complement is not lexical. In the present framework, then, there is no need to postulate a special empty lexical complementizer for any type of infinitive. The non-lexical nature of a P in these constructions follows from the interplay of subcategorization features and the Empty Head Principle (63).

One apparent problem that results from this conclusion deserves mention. Chomsky (1981, 172), citing Rizzi, points out that if infinitival clauses lack a COMP (P) to serve as an intermediate landing site for successive cyclic WH-movement, the hypothesis that S is a bound for subjacency in English (Rizzi, 1982b; cf. section 7.8 here for discussion) wrongly predicts that (64a) is ungrammatical. (64)

(a) Whoᵢ did [S you believe [S John to have seen tᵢ]]?


It may be that an empty COMP (P) in all the constructions of (61) is at least optional, licensed by the frames +S and +VP and the EHP. Alternatively, the putative conflict with subjacency may merit being reconsidered; it can be observed that successive embeddings of NP's without intervening S's also fail to produce a subjacency effect, in a way that seems parallel to (64a): (64)

(b) Whoᵢ did [S you consult [NP a book on [NP the election of tᵢ]]]?

Thus, it may be that successively embedded S's and successively embedded NP's count only as a single bound for subjacency, and that an example like (64a) does not necessitate postulating an empty embedded COMP.

7.7. Explanations of Transformations Using COMP and S

Within the generative literature, various base and transformational rules have been proposed which utilize the categories COMP and S. In this section, I will show that these rules can be stated in perfectly adequate fashion using P and PP, or that they should be eliminated. These rules include S-extraposition, a rule that allows S to appear in the position of topicalized NP's, COMP-deletion in main clauses, and a rule involving S for appositive relatives. I will discuss each of these in turn.

7.7.1. Extraposition of S

Since the pioneering study of Ross (1967), two varieties of extraposition from within an NP to the end of an S have been recognized. When there is no lexical head N, as in the subcategorized [NP S] arguments discussed in the previous section, a "free" extraposition of an S sister to an empty N head may always occur. As argued in Emonds (1976, Ch. 4), the empty N in fact forces extraposition to occur, from both subject and object positions, unless some alternative transformational process offers a means for eliminating the empty N (e.g., such as S-topicalization, discussed below in 7.7.2, extraposition, or gerund formation, discussed in Ch. 2, note 11). The result of free extraposition in s-structure is exemplified in (65). (65)

(a) (It) amuses no one for John to do that.


(b) No one has explained to me how to build a better mousetrap.

[Tree diagram for (65), only partly recoverable from the source: an N dominating Ø and a PPᵢ dominating Ø.]

Free extraposition arises from Move α, with α = PP. As shown in Terazu (1979), the arguments of various authors, notably Baltin (1981, 1983), converge on the claim that the landing site for free extraposition of a clause is one which mutually c-commands the deep structure NP which contains it.²³ If free extraposition did not apply in (65), there would be no rule which would fill or otherwise license the empty N, and so the structures would be ruled out. But when free extraposition applies, the empty N's in (65a-b) fall under the Empty Head Principle (63), which licenses them as phonetically unrealized. Thus, free extraposition derives the surface form in (65b); in (65a), however, a pleonastic it must still be inserted in surface structure, as in several English constructions whose subject NP would otherwise be empty. A pleonastic it in subject position, including the one that appears with extraposed subject clauses, may result from a requirement that a subject NP in English must have a DET as well as an N, a pattern that seems to be present in other languages (e.g., in Spanish; cf. Torrego, 1984b).²⁴

23. For this same reason, mutual c-command between the extraposed S and the NP from which it moves, extraposition out of a PP is blocked. *John talked about (itᵢ) to Mary [S that Bill was late]. *Bill was convinced of my innocence by (itᵢ) [S that Mary confessed]. These clauses cannot appear in their deep structure position inside PP either, since in this position there is nothing which licenses the empty head N of the object of P. When a P is idiomatically re-analyzed as part of a preceding verb, then mutual c-command and extraposition is possible: Don't count on it very much that Bill will be on time.

The question now arises, does free extraposition apply only to S, but not to PP? To answer this I first recall that the empty N in (65a-b) are permitted only by virtue of deep structure subcategorization for S, as required by the statement on induced empty nodes (59) of Ch. 2. (66)

*For John to do that chased no one. *It chased no one for John to do that. *No one has brought to me how to build a better mousetrap.

Now, the contrast in (65) vs. (66) can just as well be seen with PP's of location in subject N P position: (67)

Near the furnace was stifling. It was stifling near the furnace. *Behind the stove was clever. *[It [PPᵢ Ø]] was clever [PPᵢ behind the stove]. Behind the stove warms up fast. It warms up fast behind the stove. *Near the furnace changed my mind. *[It [PPᵢ Ø]] changed my mind [PPᵢ near the furnace]. Under the porch is pretty dusty. It is pretty dusty under the porch. *Under the porch was almost imperceptible. *[It [PPᵢ Ø]] was almost imperceptible [PPᵢ under the porch]. I don't like it very much inside the machine. *I don't like it very much for the machine. *He suggested [NP (it) [PPᵢ Ø]] to me [PPᵢ inside the machine].

24. There are good reasons to suppose that it is not an N, and that if it has a category besides NP, it is DET, analogously to that of pleonastic there in existential clauses (Emonds, 1976, Ch. 4). The requirement of a DET in subject position in English is a structural one, and is moreover a sort of "elsewhere" case; various other devices may license an empty DET. In section 5.7, it was seen how the English plural marking on N allows the DET of a count noun to be empty. Similarly, mass nouns in English allow a DET to be empty, in subject or object positions. Sometimes an apparently lexical property of a verb permits the empty object NP in structures like (65b) to also be spelled out as it. Such an it appears to be compatible only with passivization or idiomatic cliticization onto the verb: He likes it that Mary works outside. *He likes that Mary works outside. He thinks (*it) that Mary works outside. He suggested (it) to me that Mary work outside. He really caught it last night.


Thus, provided that a predicate subcategorizes for a class of PP subjects or objects (whether they are locational PP, indirect questions, for-to clauses, or whatever), free extraposition accounts for the appearance of this PP in clause-final position. This process makes no distinction between PP and S; the distinction between locational PP's and clausal PP's is a matter of subcategorization, and not of transformational behavior. The greater awareness in traditional grammar of clausal subjects does not justify assigning them a special category in their non-extraposed form.

The logic of the preceding argument for collapsing P and S does not depend on the Empty Head Principle. The latter is a proposal for licensing certain phrases whose surface position seems to violate the bar notation. But the similarity between the extraposition pattern in (67) and the extraposition pattern for clauses is independent of the formal device which is chosen to explain when they can occur. In Dutch and German, where a dependent clause V is VP-final, it is quite clear that the post-verbal position is restricted to PP and S, but cannot be held by an NP, AP, or bare VP. This pattern, brought out in Groos and van Riemsdijk (1981) and discussed further in the second appendix to Ch. 2, also supports the identification of PP and S.

A second type of "limited" extraposition first studied extensively in Ross (1967) is that of various PP and relative clause S which modify a lexical subject or object N. It is widely recognized that these limited extrapositions are dependent on the deep structure VP not containing the "focus" of the clause (cf. Gueron, 1980). They are exemplified in (68): (68)

Some preachers knocked on the door from Utah. *Some preachers knocked down the door from Utah. Any guest is welcome who can pay. *Any guest must leave who paid.

Gueron has argued, I think persuasively, that these extrapositions are not, and cannot be, best described by attempting to characterize the input structures that permit them. Rather, limited extrapositions are also to be subsumed under the general rule of Move α. Restrictions on limited extraposition are then to be stated in terms of the output, developing statements like (69).

(69)

Extraposition of PP and S from within an NP is blocked when the lexical head of the NP fails to stand in a certain relation to the focus of the clause.

Thus, limited extraposition not only applies to both PP and S, as expected under the hypothesis of this chapter, but in fact, as a syntactic process,

S as P and COMP as P


appears to treat them identically. Thus, patterns as in (68) provide another basis for uniting PP and S.25 It can now be appreciated that extraposition, whether free or limited (i.e., from an empty or lexical head N), does not depend on the presence of S, in putative contrast to PP. That is, all extraposition is just an instance of Move PP.

7.7.2. Topicalization of S

I have previously argued that infinitival and finite clausal subjects, as in (70), are not in subject position, but have rather been substituted into the pre-subject COMP position.

(70)

For John to smoke cigars bothers me. That the house needs cleaning is doubtful.

The arguments for this conclusion in Emonds (1976, Ch. 4) are of two types. First, it is shown that in embedded (= non-root) clauses, where COMP is not available as a landing site other than for WH-movement, infinitive and finite clauses do not appear as subjects.26

(71)

*Is the possibility that for John to smoke cigars bothers me a consideration? Is the possibility that it bothers me for John to smoke cigars a consideration? *Mary believed that the house needed cleaning to be doubtful. Mary believed that it was doubtful that the house needed cleaning.

Second, when some non-subject constituent is substituted for COMP in either a root or embedded clause, infinitive and finite clause subjects are excluded.

(72)

*Those health consultants for John to smoke cigars bothers. *How doubtful is that the house needs cleaning?

really

25. There may be some principle of modification in logical form which distinguishes between PP's with lexical heads and relative clause structures (= [PP WH-constituent - S]), and hence leads to differences in their distribution. These two structures are clearly quite different from an interpretive point of view, since one head contains a phrase with reference and the other does not, so it wouldn't be surprising if logical form distinguished them.

26. As noted in my first work on this topic, and by many others since, embedded clauses of indirect discourse sometimes exhibit certain root phenomena. When they do, they are islands for any further syntactic operation involving an element of a higher clause, which indicates that they have some kind of derivatively generated status as a root clause.


For an extensive range of data involving many constructions in which the above two arguments apply, see Emonds (1976, Ch. 4). When we turn to the subcategorized PP's in subject NP position discussed in the previous section, we find that the same two tests which establish that clausal subjects are in COMP lead to the same conclusion for locational PP's. First, in embedded subject positions, the acceptability of PP subjects deteriorates rapidly:

(73)

In that back room is quite chilly. *Your complaint that in the back room was chilly was ignored. *A day that in that back room is quite chilly must be rare. *Is the possibility that in the attic is dusty a consideration? *Mary believed under the porch to be dusty. *It was so cold that near the furnace didn't warm up fast. *We left the house although in the kitchen was cozy.

Second, when another constituent is in COMP, a PP subject is excluded, just as is a clausal subject.

(74)

*John groaned about how dusty under the sink probably was. *How chilly was in that back room? *Safer than being downstairs would be under a doorway. Cf. Under a doorway would be safer than being downstairs. *In no season should in the back room be chilly. *Warming up fast is behind the stove. Cf. Warming up fast is our ace reliefer.

From the patterns in (73)-(74) I conclude that PP subjects are permitted only in "topicalized" COMP position. In this way, PP's are again analogous to S̄, confirming my hypothesis that S is a subcase of P. This "topicalization" of PP and of S is brought about by the rule Move α, where α = PP. However, it appears that α can be S̄ only when this S̄ is an NP in the structural position from which it moves. This descriptive generalization is established in Higgins (1973) by virtue of examples like the following.27

27. One possible account of this restriction is based on the converse of the principle discussed in note 11, namely: (i) If a constituent is co-indexed with an element in COMP (= P), then the constituent must be a case-marked phrase in s-structure. (i) is too strong, since it apparently excludes PP traces of WH-movement in s-structure. However, if we consider the trace of a WH-moved PP (= P + α) to contain at least the structure [PP [j ∅]], then (i) seems fairly accurate. It correctly excludes any topicalization via movement into COMP of a "bare" S (Higgins's generalization), of an adverbial clause, of an adverbial AP (assuming these are not marked for case), or of a VP.

(75)

That the movie would be exciting it seemed to Bill.
That bankers are boring Bill grumbled.
*That California had no mosquitoes I convinced him.
*That you chew loudly it bothers me.
That the house be painted I suggested it to Mary.
That I had been on time they refused to believe it.

Higgins's generalization suggests that Move α derives S̄-topicalization from the following input structures.

(76)

[Tree diagrams; recoverable labels: foolish, too warm, For John to smoke cigars, Near the fence, Move α]

It can now be asked what value of α is required for S̄-topicalization. If α = NP, no independently motivated mechanism will license the empty N's in (76), and the s-structures will be ill-formed. On the other hand, if α = PP, the Empty Head Principle (63) will permit these N to be phonetically null, just as in the case of free extraposition discussed in the preceding section.


Both S-topicalization and free S-extraposition, then, move PP out of the structure [NP [N ∅] PP], and the EHP (63) allows the N to remain empty. A difference between the two processes is that the NP, if it is a subject, must be realized as it under extraposition, but must remain ∅ under topicalization. This difference is due to the more general fact that whenever subjects move to COMP in English, their trace in subject position is null; that is, this trace is exempt from the requirement discussed in the preceding section that empty subject NP's in English are typically spelled out as it/there. So, just as *I bought the book which_i it_i amused her and *Who_i did he_i complete the task? are excluded, so is *[S̄ For John to leave]_i it_i bothers me. This account in terms of the EHP of what licenses empty N's when S̄ (= PP) moves is supported by the diversity of the constructions in which this principle seems to work; however, I emphasize again that the mode of licensing these empty N is independent of the fact that topicalization treats S and PP alike, as shown in (71)-(74).

7.7.3. Main Clause COMP-Deletion

We may ask, if S = PP, does this imply that PP is the initial symbol for the base rules, i.e., the "root" (highest category) of a tree containing a main clause? That is, something must be said about one asymmetry of S/PP distribution not yet mentioned; namely, only the former are main clauses. In fact, there is little reason to consider that S̄ rather than S is always required for expanding a proposition. In her initial study of COMP, Bresnan (1970) provides for the lack of overt COMP in main clause S by an obligatory deletion.

(77)

*That the books have burned already. *Whether the books have burned already. *For the books to have burned already.

She further observes that [COMP for] + S does not occur freely as a main clause, even assuming COMP is deleted: *The books to have burned. Banfield (1973) proposes a more nuanced theory of initial symbol, in which the root of every well-formed tree is an E ("Expression"), where E cannot be subordinate (although E's can be coordinated). More precisely, except perhaps in not strictly grammatical, derivatively generated structures, E can be dominated only by E.28 In turn, E dominates a range of non-embeddable constructions (see Banfield, 1973, and works cited there for discussions of particular constructions). E can also be expanded as any maximal projection, giving rise to structures such as (78):

(78)

If you please, another glass of wine. (Non-embeddable construction + NP)
Oh darling, to be in Paris again! (Non-embeddable construction + subjectless S)
Yes, more intricate than I would have thought. (Non-embeddable construction + AP)
Long live the Revolution, since we are in Cuba. (Non-embeddable construction + PP)

28. This is the difference between Banfield's E and the S of Chomsky (1977).

Within this framework, constituents like (77) are generable as E's, with the structure (79):

(79)

[Tree diagram; recoverable labels: E, whether/that/for/if]

As such, however, they are no better as independent clauses than are other PP's in isolation under E; like many PP's generated in this fashion, many examples as in (79) are uninterpretable (except as answers to questions in connected discourse):

(80)

(a) *If you please, of my house! *Oh darling, from the riverbank. *God, in a systematic fashion. *Yes, since the moon came up. *Long live the Revolution, whenever the police disappeared. *Good grief, that the electricity was shut off. *Yes, for the car to start.

In contrast, some PP structures, including some like (79), can be interpreted in isolation (i.e., purely pragmatically) when they are immediately dominated by E:

(80)

(b) If you please, off my lawn! Oh darling, to the riverbank. God, for a cigarette. Yes, if only I could camp out more often. Long live the revolution, because the police will disappear.

Thus, assuming a uniform initial symbol E, the "S" structure (79) has no special status, and is simply parallel to other "incomplete sentences," as in (80).


Since E can immediately dominate any maximal projection, it can in particular dominate V^max, giving rise to (81).

(81)

[Tree diagram: E dominating S]

It is this structure which has a special status; like the original initial symbol S in early phrase structure grammars, it represents, semantically, an independent logical judgment. It is not, however, related to (79) by any syntactic process (for a similar view, see Brame, 1979). Therefore, in the present system, obligatory COMP-deletion in main clauses is dispensed with.29 According to the definition of root node given in Ch. 3, slightly revised as (82), an S as in (81) is a root. In (79) and (80), the S is not a root, but the PP is; the S is governed by the P and is hence not a possible root.

(82)

Root nodes: A node C is a root node if no node in the tree path including C and the initial symbol E as end points is governed, and if C ≠ L^j (L = N, V, A; j < 2).

This definition of root makes no special mention of COMP or S, even though COMP and S can qualify as root nodes, just as P and PP can. In (83a), for example, the P is a landing site for subject-auxiliary inversion, and in (83b), the though-clause PP is a root to which an AP adjoins.

(83)

(a) If Mary were in a nicer town, she would be happy. Were Mary in a nicer town, she would be happy.
(b) Though Mary is happy, she still wants to move. Happy though Mary is, she still wants to move.

29. Main clause COMP-deletion has several other undesirable properties, which make its elimination all the more attractive. More in French than in English, there are perfectly interpretable structures of the form (79) which must be exempted from COMP-deletion: Qu'il arrive à temps! "May he arrive on time!" Si on allait au cinéma? "What about going to the movies?" Secondly, while most deletions of specific morphemes in French and English are language-particular, we find that in the main cases, COMP-deletion is the same. This indicates a characteristic of universal grammar, not a typical role for a rule deleting a morpheme class. It is more plausible that universal rules of semantic interpretation pick out (81) and (79) as independent logical judgments and questions respectively. Thirdly, other familiar root deletions, for English for example, are optional. Thus, analyses of English imperatives postulate optional, not obligatory rules deleting you and will/can; should one is optionally deleted in the context why VP; some forms of V moved into AUX are optionally deleted in (WH) row. (Cf. Emonds, 1976, Ch. 6 and Hendrick, 1982b.)


So it turns out that the possibility of being a root is just one more property which P shares with COMP and PP with S, and hence one more argument for conflating them. The fact that declaratives are not S's does not imply that questions and other main clause structures which contain a motivated COMP position cannot be S's. The discussion of AUX-fronting in sections 3.6 and 3.7 in fact tends to confirm the status of WH-fronting constructions as root S's. But there is no reason to believe that the special symbols "S" and "COMP" play some crucial role in the description of the deep structures, s-structures, or logical forms of these constructions which cannot just as easily be satisfied by the symbols "PP" and/or "empty P." I elaborate somewhat on this last point below in section 7.8.

If root sentences are not necessarily S's, then root transformational applications of Move α can be adjunctions to S or E, as well as the substitutions for COMP utilized in Emonds (1976), following a proposal of Higgins (1973). The general pattern that only one phrase is allowed to be fronted in a root S is accounted for in Emonds (1976) by stipulating that all movements of phrases over variables are substitutions, whether they are structure-preserving or not; the singularity of the COMP node then insured that only one such operation can take place on each sentential domain. The possible absence of COMP in main clauses here implies that the effect of one fronted phrase per S must be achieved in some other way. Rather than requiring that phrasal movement over variables be substitutions, I propose the following:

(84)

On a given cyclic domain, Move α may attach only one phrase at each landing site made available by the structure-preserving constraint.
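
Read procedurally, (84) amounts to a simple bookkeeping condition on landing sites. The following Python sketch is only an illustration under stated assumptions: the class name `CyclicDomain`, the method `move`, and the site labels are hypothetical and form no part of the formalism of this chapter, and the set of landing sites is simply supplied as input rather than derived from the structure-preserving constraint.

```python
from typing import Dict, Iterable, Optional


class CyclicDomain:
    """Hypothetical model of one cyclic domain and its landing sites."""

    def __init__(self, landing_sites: Iterable[str]) -> None:
        # Every landing site starts out empty (None).
        self.sites: Dict[str, Optional[str]] = {s: None for s in landing_sites}

    def move(self, phrase: str, site: str) -> bool:
        """Try to attach `phrase` at `site`; return True on success."""
        if site not in self.sites:
            return False  # no such landing site is available in this domain
        if self.sites[site] is not None:
            return False  # (84): only one phrase per landing site
        self.sites[site] = phrase
        return True


# Embedded domain: the only landing site assumed here is the empty P (= COMP).
embedded = CyclicDomain(["P"])
first = embedded.move("who", "P")
second = embedded.move("where", "P")

# Root domain: one adjunction site on each side is assumed to be available.
root = CyclicDomain(["left-adjunct", "right-adjunct"])
left_once = root.move("those books", "left-adjunct")
left_twice = root.move("to Mary", "left-adjunct")
right_once = root.move("to Mary", "right-adjunct")
```

In this toy model the second attempt to fill an occupied site fails, which is all that (84) requires; nothing here decides which sites a given structure actually makes available.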

For substitutions, the effect of (84) is essentially definitional. However, when adjunctions can provide landing sites (in root domains), (84) limits movements to one adjunction of a phrase per S on each side, as desired.

7.7.4. Appositive Relative Structures

Another possible discrepancy in the distribution of S and PP in the base might be that S but not PP appears in appositive relatives. For example, Jackendoff (1977) proposes that S can appear as a sister to any X² in his system and form with this X² an X³. I argue against this in Emonds (1979), on grounds independent of what is being developed here; instead I justify an analysis in which an appositive relative is derived from a coordinate main clause with a ∅ CONJ(unction) in deep structure. Now, in accord with the discussion of the preceding section, when main clause coordinate structures are declaratives, there are no S or COMP in the deep structure. Thus, for an example like (85), the deep structure is (86) (combining Emonds, 1979, and the previous subsection on main clauses):

(85)

The excavators found the lost scrolls of Eternal Philanthropic Capitalism, in order to find out about which Morgan traveled to Egypt, in the foundation of a Holiday Inn.

(86)

[Tree diagram; recoverable terminal strings: "The excavators found the lost scrolls of [NP_i EPC] in the foundation of a Holiday Inn" and "Morgan traveled to Egypt in order to find out about which_i"]

The fronting of the WH-phrase in the appositive relative S2 is not effected by a movement of WH (structure-preserving or root); that is, it is not a substitution for COMP. Rather, "pied-piping" of the sort involved here consists of a topicalization (root-fronting) of a maximal projection containing WH. In (85), this maximal projection is an entire purpose clause. As is explained in more detail in Emonds (1979), the only requirements involving WH in non-restrictive relatives are (i) that its antecedent be adjacent to S2 in surface structure, and (ii) that WH be inside a preposed constituent; cf. (87) below. Neither of these requirements need be stipulated specifically for appositive relatives. In any relative structure, at the point where interpretation applies, the antecedent must be adjacent to the relative clause (i). Moreover, outside of question structures, WH can be interpreted only in preposed positions; this holds for both restrictive and non-restrictive relatives, as well as for exclamatives and for conditionals (*John will be late, if he takes whichever train). This accounts for requirement (ii), again without mentioning appositives. These requirements should follow from a complete theory of pronominalization (cf. Hendrick, 1982a, and Jackendoff, 1977, Ch. 7).30

The surface structure of (85) is derived from (86) by fronting of the purpose clause via Move α (a root operation here), by relative clause WH-interpretation, and by two further processes which apply generally in parenthetical structures: Chomsky-adjunction of S2 to S1 with accompanying deletion of null CONJ, and syntactically optional postposing of a maximal projection around the parenthetical clause to the right (again, for more details, see Emonds, 1979).31 The resulting structure is (87):

30. The WH cannot be too deeply embedded in the topicalized phrase, or else it becomes inaccessible to interpretation. A relevant study with a similar point is Nanni and Stillings (1978). The rule which is restricted here is, I believe, the rule of interpretation for WH in (all) relatives, rather than a movement process.

31. We might take CONJ to be the head of E. For exposition, let CONJ = C and E = C. Then (86) might have a form something like (i). (Let a root S be any S dominated only by S's and projections of C.)

In all parentheticals, S2 is Chomsky-adjoined to S1 (Emonds, 1976, Ch. 2; cf. Banfield, 1982, Ch. 1). This results in the structure (ii).

The Empty Head Principle (63) now licenses the empty C automatically, with no stipulated deletion. In this way, parenthetical formation is reduced to a single operation (Chomsky-adjunction), and nothing need be said about the missing conjunction.


Concluding, since S plays no role in either the deep or the surface structure of appositive relatives, this construction cannot be invoked to justify an S/PP distinction.

7.8. The Role of S and COMP in a Theory Constraining Movements

In assimilating S to PP, it must be asked if a theory which properly prevents transformational operations from moving elements too far or too freely, as developed for example in Chomsky (1981, Ch. 3), is affected in some non-terminological way. From this point of view, is the hypothesis of this chapter disadvantageous, neutral, or superior?

In Chomsky's framework, constituents are prevented from moving too far or too freely by one of three devices that might interact with the hypothesis that COMP = P. (i) In a language such as English, there is a unique non-argument position "COMP" in [S̄ S] to which a phrase marked in a particular way, with WH, may be moved. (ii) An extraction cannot take place out of two distinct "bound domains" simultaneously, where S (in some languages S̄) and NP constitute such domains; this restriction is called subjacency. (iii) Phrases can move in two steps across the boundaries of two subjacency domains only via an intermediate landing site in COMP. In the present framework, these restrictions actually become better understood, because they can be related to other restrictions on movement more or less unintegrated into Chomsky's theory. I will discuss (i)-(iii) in turn.

(i) According to Chomsky's conception of Move α, a WH-marked phrase can be adjoined to COMP, even though such a movement does not preserve structure (that is, it is not a movement to an argument position). Rules of logical form then interpret phrases in s-structure COMP positions which bind a variable (variables being identified with traces of WH-movement in the simplest cases).
Now, any rules of logical form which interpret an X^max in COMP that binds a variable inside a following S can as well pick out an X^max in [PP S] as in [S̄ S], so my system has no disadvantage. As I understand Chomsky's system, it is these rules of logical form, e.g., for interpreting questions, that determine that phrases marked with WH appear in COMP in s-structure rather than, say, sentence-finally. Alternatively, there may be some syntactic restriction as to an appropriate landing site stipulated for the element WH, as in Baltin (1982).

At this point, a difference between my conception and Chomsky's emerges, which I think shows the superiority of analyzing WH-movement as a structure-preserving operation, as in section 7.5. Since a fully structure-preserving analysis of this rule requires that COMP = P (to explain the pied-piping of PP), any advantage of the structure-preserving account indirectly supports the hypothesis that COMP = P.

If, as I argued earlier, WH-fronting is allowed only because the deep structure categories P and WH in "COMP" provide structure-preserving landing sites for Move α, then WH-fronted constituents are licensed syntactically, rather than semantically. For reasons I will now go into, I think this is a preferable position.

The typical relation between syntactic and semantic constructions observed in other areas of grammar is that a syntactic constructional type, even of a transformationally derived and/or language-particular sort, gives rise to a multiplicity of interpretations. To a large extent, but rarely exactly, these interpretations are mutually exclusive. The impression that results is that one interpretation of a given construction is basic, and that the others are semantic "utilizations" of a generalized syntactic operation which are licensed when some aspect of the syntactic structure typically required for the basic interpretation is not present. For example, the English imperative is used "basically" for commands. However, when it is conjoined with a declarative sentence, the imperative can be interpreted as a conditional:

(88)

Be careless with your money and you'll end up in trouble. Don't like chocolate too much or you'll end up fat.

If both uses of the imperative are not assigned the same syntactic analysis, one essentially ends up with different construction-specific, meaning-based generative semantic transformations, which are not able to explain the surface "conspiracies" of syntax. Similarly, English tag questions, studied in Culicover (1971) from the point of view adopted here, have at least three distinct semantic interpretations.

(89)

John will leave Boston, won't he? (confirmation requested) John will leave early, will he? (irony) Leave the room now, will you please? (degree of politeness in commands)

It appears that the "confirmational" use is basic, but depends on the "misaccord" of the main clause and the tag with respect to negation. Since the presence or absence of not does not, I assume, license the tags syntactically, tags without misaccord or which follow an imperative clause can be semantically utilized in the other ways exemplified in (89).

The same reasoning can be applied to WH-fronting constructions. Under my conception, the particularity of this construction is that a WH phrase can move to the clause-introductory position P by virtue of the interplay of two independently justified properties of the syntax, structure-preservation and the characteristically wide distribution of WH as a feature on several different closed categories. (For a predictive account of why WH can appear as a feature with both P and SP(X), see the appendix to this chapter.) Since the deep syntactic structures [α WH - S] and [α P - S] and the s-structures [α WH-phrase - S] are then available independently of the semantics, it is expected that more than one, and again for the most part mutually exclusive, interpretive rules can use them. That is, the autonomous syntax can here also explain the mysterious conspiracies of semantics. This "syntactically driven" model for interpreting relative clauses, information questions, echo questions, comparative clauses, and exclamations off of a uniform syntactic input seems implicit in Chomsky (1977), and is exactly the type of system developed and justified in great detail in Milner (1978, Ch. 6-8).

But in my view, the syntactic explanation for the diversity of semantic structures utilizing WH-fronting syntax is greatly undermined if it rests on an implicit or explicit ad hoc landing site statement about where WH-phrases are to be adjoined. Such a statement is avoided if Move α is uniformly subject to structure-preservation in embedded contexts, which is how WH-fronting is viewed in section 7.5 above.

As mentioned above, my claim that COMP is a P which is empty at deep structure is not inconsistent with the adjunction account of WH-fronting. But since the structure-preserving account seems explanatorily superior, as just argued, a necessary condition for this account (that COMP = P is the landing site for pied-piped PP's) is thereby further supported.

(ii) According to the principle of subjacency first proposed by Chomsky (1973, 1977) and refined by Rizzi (1982b), Move α cannot extract a constituent out of two "bound domains" simultaneously. Under this view, languages may vary along a subjacency parameter, according to which choice of "bounding nodes" counts for subjacency. It goes without saying that, since the status of subjacency is controversial, it bears on my claim that S = PP only if it exists. In languages like English, where the bounding nodes are NP and S, the proposal that S = PP is not directly involved, although the role of COMP is, and will be discussed below in (iii).
For languages like Italian, an appropriate generalization of Rizzi's claim that S̄ is a bounding node is the following:

(90)

A PP that immediately dominates S may be a bounding node for subjacency.

This formulation predicts that an adverbial subordinate clause introduced by a lexical P (e.g., Italian prima che/prima di "before", anche se "although", etc.) will be a bounding node that restricts extraction just like S. Since extraction from such clauses is in fact prohibited, as discussed in Belletti and Rizzi (1981), there is no doubt that (90) can be maintained, even though it may not in itself suffice to explain all the relevant extraction effects. Given that (90) is empirically supported, another way to express the parameter that either S or S̄ may be chosen as a bounding node for subjacency is as follows:

(91)

S or the node immediately dominating S may be a bound for subjacency.

A somewhat different direction in which the S/S̄ subjacency parameter might be generalized is suggested by the work of Baltin (1978b) and van Riemsdijk (1978), who argue (albeit for sometimes conflicting reasons) that PP is a bound for subjacency even in English. This approach suggests that in some languages, PP, including S, is always a subjacency bound (thus justifying further conflating the two categories), while in others (e.g. Dutch, following van Riemsdijk) PP may not be a bound under certain marked conditions. Indeed, Sportiche (1981) concludes that both PP and S are bounding nodes for subjacency in French; this result clearly confirms the hypothesis that S should be identified with PP.

(iii) According to Chomsky's theory of subjacency, a phrase can escape from two bound domains by means of two separate movements, one into COMP and one out of COMP, as in (92). Under his proposal, S but not S̄ is a bounding node for English.

(92)

[Tree diagram; recoverable labels: COMP; John should tell Bill; S (= P); Sam saw [NP Who]; step 1 (OK); step 2 (OK); unbounded movement (excluded)]

Who should John tell Bill Sam saw?

As observed by Chomsky, a non-empty COMP blocks the possibility of a two-step ("successive cyclic") movement; this is known as the "WH-island" effect.32

32. The inability of two phrases to co-exist in COMP follows from my view that an embedded instance of WH-fronting is, like all applications of Move α in embedded contexts, a substitution. In my view, the various proposals in the literature that formulate and motivate a "doubly-filled COMP filter" are by-products of the less explanatory view in which WH-fronting is an adjunction. Research may have been directed away from a substitution analysis by the existence of COMP morphemes adjacent to WH-fronted phrases. As discussed in section 1.7 with respect to French que "that", such morphemes can be inserted by local rule in the context S, and are not necessarily under P in all languages.

(93)

(a) *Who should John tell Bill where Sam saw? *Who should John tell Bill when to help?

In this framework, the hypothesis that COMP is a P and that S = P immediately explains the fact that adverbial subordinate clauses introduced by deep structure lexical prepositions prohibit successive cyclic movement. In (93b), the subordinating conjunctions play exactly the same role as do the lower WH-phrases in the WH-islands. That is, either a WH-phrase or a lexical element under the lower COMP in (92) blocks successive cyclic movement.33

(93)

(b) *Who should John describe Bill before Sam meets? *What students did Mary help John after she counseled? *What students did Mary help John since questioning? *What students did Mary help John while thinking about? *What students did Mary help John although she disliked?

Chomsky and Lasnik (1977) have observed that not all adverbial subordinate clauses strictly forbid extraction. The ones that allow extraction at least marginally are those that I have independently hypothesized to have empty introductory P in deep structure. This empty P can act as an intermediate landing site for WH-fronting, while the filled P in (93b) cannot.34

(93)

(c) P = that: ?What was it that he got so mad that he lost? P = if: ?That is a problem that I'd be surprised if he solved. P = +GOAL: Who did John learn about algebra to help?
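
The reasoning behind (92) and (93a-c) can be restated as a small counting procedure. The Python sketch below is a deliberately simplified illustration, not an implementation of anyone's theory: the function name `can_extract` and the state labels "empty", "wh", and "lexical" are hypothetical; only S is counted as a bounding node (the English setting), NP bounds are ignored, and the graded judgments of (93c) and the Italian facts of note 34 are not modeled.

```python
from typing import List


def can_extract(comps: List[str]) -> bool:
    """Return True if a WH-phrase can reach the highest COMP (= P).

    `comps` lists the state of the P introducing each clause, from the
    highest clause down to the clause containing the extraction site.
    Assumed states: "empty" (a possible landing site), "wh" (already
    holds a WH-phrase), "lexical" (holds a deep structure preposition).
    """
    crossed = 0  # bounding nodes (S) crossed since the last landing site
    for depth in range(len(comps) - 1, -1, -1):
        crossed += 1  # leaving the S at this depth crosses one bound
        if crossed > 1:
            return False  # two bounds in a single step: subjacency violation
        if comps[depth] == "empty":
            crossed = 0  # the phrase lands in the empty P; the count restarts
    return crossed == 0  # the phrase must end up in a landing site


successive_cyclic = can_extract(["empty", "empty"])   # as in (92)
wh_island = can_extract(["empty", "wh"])              # as in (93a)
lexical_p = can_extract(["empty", "lexical"])         # as in (93b)
```

On these assumptions an intermediate P filled by a WH-phrase and one filled by a lexical preposition block extraction in exactly the same way, which is the parallel between (93a) and (93b); the clauses in (93c), whose introductory P is empty in deep structure, pattern with the first call.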

Footnote 32 (continued). Languages such as Polish and Russian which permit multiple WH-movement to sentence-initial position in embedded clauses must make use of a device other than substitution. They may be using stylistic reorderings whose outputs are not accessible to the strictly grammatical processes which can extract elements from embedded clauses (Banfield, 1973b).

33. In section 2.5, I concluded that the participial V + ing forms after prepositions of time are in S's in the syntax when the rule Move α applies; therefore, the lower bounding node S in (92) is present in all the examples of (93b).

34. The patterns of (93b-c), taken as facts about English, seem to strongly confirm both subjacency and the status of COMP as P. However, as noted earlier, Belletti and Rizzi (1981) have noted that extractions from adverbial clauses are prohibited in Italian, where S is not a bounding node, as well as in English. They suggest that if subjacency is responsible for these effects, then crossing even one bounding node C is prohibited if C is outside X. The apparent marginality of the examples in (93c) suggests that movement through the empty P (COMP) in positions outside V varies across speakers. For some speakers, the grammatical formatives introducing result clauses, conditionals, and purpose clauses may be inserted in deep structure, making them into lexical subordinate conjunctions which block successive cyclic movement. I think there is also some stigmatization of examples as in (93c) by school grammar.


The contrast in (93b-c) thus confirms my proposal that a COMP is an empty P.

A final point regarding subjacency is that successive cyclic application of Move α through COMP contrasts with the impossibility of this kind of operation on extraposed phrases. These latter never move successive cyclically into higher domains, but are rather subject to the "Right Roof Constraint" of Ross (1967). It seems to me plausible that phrases in COMP can move into a higher domain because they are in the head position of S̄ (i.e., of P̄). Typically, the head in a phrase α is accessible to material in the larger domain in which α is governed, while complements within α are not (Chomsky, 1981, 300). This type of explanation for why phrases in COMP but not extraposed phrases can move a second time requires that COMP be the head of S̄, and this in turn, as noted earlier, follows from my claim that COMP = P.

In conclusion, the roles that S̄ and COMP play in the theory of subjacency are both better understood in terms of their being P̄ and P. We come to see why PP, especially in [PP P - S], is a bound on extraction, and why COMP, as an empty landing site of category P in the syntax, allows extraction through it.

7.9. The Status of Comparative than/as Clauses

The morphemes that introduce comparative clauses and phrases, as in (94), have been argued by various authors to be prepositions, complementizers, or "of no category."

(94) (a) John seems less interested in reading books than Mary is ∅ capable of writing them.
         As many women as the company says they will hire ∅ men should show up.
         Fred can read newspapers as quickly as Jim can letters ∅.
         A different person than the one I know ∅ answered the phone.
         I find Boston the same as it has always been ∅.
     (b) Sue has more than a million dollars.
         A person taller than six feet couldn't sleep here.

In the clausal comparatives (94a), there is a minimal and obligatory syntactic gap consisting of a Specifier of an AP or a "QP" (quantifier phrase); I have notated these with ∅. Throughout this section, I will assume the conclusions of E. Klein (1980), who argues that QP's are a type of AP, so the minimal gap in the comparative clause is then always a SP(A).

Chomsky and Lasnik (1977) offer several considerations against the contention of Bresnan (1976) that than and as can be subsumed under the category COMP. However, the contrast between these two points of view assumes a category distinction between COMP and P, which I claim now to have shown to be erroneous. Thus, in my terms, Chomsky and Lasnik (C & L) are arguing that the comparative morphemes do not have the typical properties of certain P, namely those of the grammatical formatives (that, for, whether) which introduce S rather than NP.³⁵

One of their arguments is that than and as have an essentially unique property that other COMP don't have, and a second is that they lack a property that some COMP do have. However, since grammatical formatives characteristically have unique properties within their class, as argued in section 4.4, this cannot be taken as evidence for anything; it is only evidence of the lack of a complete analysis. Thus, C & L note that that but not than and as delete in some contexts (cf. note 21 above).³⁶ But since, for example, to and of but not with delete in some contexts without affecting their status as P's, such a contrast in behavior does not, in itself, indicate a completely separate category status for than and as.

A similar response can be made to C & L's remark that than and as require extractions, or at least "gaps", in the clauses they introduce. While this does indicate the existence of some transformational operation in comparative clauses (i.e., for WH-movement, as argued in Chomsky, 1977), it does not imply that than and as are not in the "COMP" position (here the P position) in surface structure. We could not analogously argue, upon observing that a fronted which always introduces a clause with a gap, that which is not in the surface position of COMP; that is, which can be argued to have substituted for COMP. Thus, the two preceding "non-COMP" properties observed by C & L are simply characteristic properties of than and as, and don't prejudice arguments about their category status. The more general considerations developed so far here obviously suggest that than and as are grammatical formative P, or transformational substitutes for P, in the context S; this is consistent both with their non-deletability and their requiring an extraction.
35. One of these arguments is based on what seems to me to be an erroneous analysis. C & L mention that than what and similar sequences appear in certain English dialects (He's taller than what I thought), suggesting that than is not in the COMP position. However, den Besten (1978) I think shows clearly enough that such constructions are "than + headless relatives," a special case of the familiar "than + NP" construction, and hence irrelevant for deciding on the category of the than and as which introduce clauses. One of C & L's points really challenges the possibility that comparative than and as can be uniformly analyzed in the framework being developed here as members of P; this is the point they make about similarities between comparatives and coordinate conjoined structures. This will be discussed later in this section.

36. A case of as-deletion is discussed in Emonds (1976, Ch. 5); but it is undeniable that that, being the unmarked P in the context S, deletes more freely.

A third property pointed out by C & L is precisely that some other uses of comparative than and as (cf. 94b) are arguably P (cf. Hankamer, 1973). This suggests independently that all comparative subordinators may be P, which is quite consistent with the framework advanced here.

One argument of C & L's against the COMP status of than/as does, however, suggest a different line of inquiry; if fruitful, it would simply remove comparative than and as from the category P altogether. They observe that gapping and other sorts of ellipses occur both in comparative clauses and in coordinately conjoined clauses; cf. (94a). Gapping is not as free in other types of subordinate clauses, and the same is true for other sorts of elliptical processes. (A detailed study of the similar generalizations in comparatives and in coordinate clauses can be found in Banfield (1976).) The affinity of comparatives for gapping suggests that than/as may be themselves a type of coordinate conjunction, and that the notions of semantic and syntactic coordination current up to the present are simply too rudimentary to properly encompass the clausal comparative construction. Some steps to remedy this are taken in Hendrick (1978), who argues that comparative clauses can be generated only outside N̄ and V̄ (under Nmax and Vmax respectively). Another way of looking at his results is to say that comparative clauses do not embed easily, as do other subordinate constructions.

When we investigate comparative structures more closely, we find in fact numerous indications that they instantiate syntactic coordination, and are not a subordinate structure. However, it is not my intention to work out the details of these properties here, so what follows is sketchy in many respects.

(i) As just stated, comparative clauses and conjoined clauses introduced by and or or exhibit gapping and various other elliptical patterns not generally available in subordinate clauses.

(ii) We do not find the sequence CONJ - than/as (without some intonational stress on CONJ, indicating intervening deleted material). This lack of co-occurrence suggests that than/as are themselves in the category CONJ.

(95)

(a) *The bookcase is wider than the doorway (is) and than the window (is).
    *I find him the same now as in his twenties and as in his forties.
    *Bill liked him more than Sue and than Sam.
    *We have offices in more cities than Sears does and than we can afford.

In a similar vein, H. van Riemsdijk suggests that the lack of "stacked" instances of than/as, as in (95b), may be related to the unacceptable sequences in (95a).

(95) (b) *Mary is more likely as tall as John than as Bill.

(iii) H. van Riemsdijk (pers. comm.) has observed that in German, the single NP's after than/as (cf. 94) are "transparent" with respect to case.


That is, these NP's are not assigned some case by than/as (as they would be after a typical preposition), but receive the same case as the NP they are being contrasted or compared with in the first clause. This is the same phenomenon we find in phrasal conjunction. (Thus, in I gave Bill money, and Sam too/but not Sam, Sam and Bill will have the same morphological case.) By assigning than and as to CONJ, we can treat this transparency as a unified phenomenon in the context CONJ.

(iv) Conjoined clauses and comparative clauses are never preposed.

(v) For comparative phrases (than/as - NP), there is neither the possibility of fronting the entire phrase, nor that of fronting the NP and "stranding" the comparative morpheme itself. The same is true for coordinate phrases.

(96)

*Than six feet, John is taller.
*But not Sam, I gave Bill money.
*Or Sam, Bill likes Sue.
*As Sam, Bill likes Sue as much.
*It's Sam that Bill likes Sue as much as.
*It's Ann that Bill likes Sue or.

This property can be made to follow from an appropriate generalization of the coordinate structure constraint, whereby maximal projections linked by an element of CONJ and forming with CONJ a larger constituent are immune to extraction.

(vi) Comparative clauses that are not themselves embedded show some root properties, as do main clause coordinate S's. For example, comparative clauses tolerate, under restricted conditions, certain auxiliary inversions.

(97)

Bill will eat more fish than will John.
?Bill will eat more fish than will the rest of us.
*More fish than can most of us eat has already been cooked.
Is he accomplishing as much as did his sister?
?Is he accomplishing as much as did his sister during her stay?
*Accomplishing as much as did my sister during her stay will be difficult.

(vii) The comparative morphemes themselves can become enclitics on the preceding phrase, reducing to their final consonant ('n and 's) phonologically. The same is true for the coordinate conjunctions and and or, which may become 'n or 'r. The same does not hold (in informal standard English) for subordinating conjunctions or prepositions more generally, such as that, for, when, while, to, with, etc.

(98)

John's more interesting to me 'n Mary is.
He's the same one 's you met before.
John's interesting, 'n Mary is too.
Either a new one 'r a used one would be fine.
*So many people came 't Mary felt sick. ('t = that)
*She'd be an interesting person 'r you to meet. ('r = for)
*John talks only 'n Mary does. ('n = when)
*You should have coffee 'th your dessert. ('th = with)

By subsuming comparative morphemes into the class of conjunctions (CONJ), this possibility of enclisis can be treated in a uniform fashion.

(viii) It has often been noted that coordinate conjoined structures are in some contexts subject to semantic "parallelism" requirements; the same is true for comparative structures:

(99)

?John will be a success and the sink has just stopped up.
?John writes longer articles than Mary takes walks.
*Does John write many books and Bill took a walk.
*John writes more intelligent articles than he considered Bill.

While such requirements have never been even informally stated in adequate fashion, it is nonetheless true that they are brought up principally with regard to these two types of constructions. This indicates some sort of intuition that the juxtaposition of sentences by means of coordinate conjunctions and comparative conjunctions is allowed under somewhat similar semantic conditions, and it is expressed by the ordinary language usage of the descriptive term "parallelism".

On the basis of (i)-(viii) above, I tentatively conclude that comparative clauses, and perhaps even comparative phrases, are to be considered as introduced by morphemes than/as which are not in the category P - i.e., they are not subordinators, but coordinators. This does not mean, however, that than/as cannot be surface realizations that result from a movement rule. For example, it might well be that the minimal gap of SP(A) in comparative clauses has no realization in English in its deep structure position inside the AP. Rather, such SP(A) might be subject to fronting, and be morphologically realized only in the feature complex [CONJ, SP(A)].³⁷ That is, besides "movement to COMP", there could be "movement to CONJ", and, subsequent to movement, insertion of the grammatical formatives than and as under the resulting feature complex.

If this direction is to prove fruitful for the analysis of comparative clauses, it is clearly necessary to propose a more sophisticated and general definition of "syntactic coordination" than has been used heretofore. Semantic coordination would then be only one interpretation of conjoined syntactic structures, and not the only possible one.

37. Thus, this process might be subsumed under the general Move α scheme in English, as developed in Ch. 3, except that α could be any derived bar notation category (Xʲ, j > 0, or SP(X)).


Whatever the success of extending coordination to comparative structures, the principal hypothesis of this chapter, the identification of COMP with a subset of P, remains intact. If a more satisfactory and enlightening analysis of comparatives in terms of coordinate conjunction emerges, the status of than/as will be completely orthogonal to the issue of COMP. But if it should turn out that comparative clausal than/as are not elements of CONJ, there is really no candidate for their syntactic category other than the now unified COMP/P. Indeed, both phrasal than/as (as argued by Hankamer) and non-comparative as (Ch. 6 here) are independently justified as members of P. Their lack of deletability they share with most other P, and their affinity for a following gap they share with WH-words like which. Under this view, than/as are analogous to WH-words in the COMP/P position.

7.10. Conclusion

In this chapter, I have surveyed the numerous roles that COMP and S have played in grammatical descriptions: their deep structure distributions, their role in transformational analyses, their apparent unique status in both root clauses and as a landing site and "escape hatch" for WH-movement. I have shown, I believe, that in all cases a COMP is just a P which is empty prior to s-structure and which is followed by an S sister. This hypothesis has opened the door to a wide variety of theoretical and descriptive clarifications. To cite some: an account of why nouns reject adverbial clause modification, a unified analysis of several types of if and if-clauses, an advantageous formulation of Exceptional Case Marking, a non-stipulative explanation of the two principal restrictions on infinitival relatives, an explanation for why PP's "pied-pipe" with WH-movement, a conflation of PP/S extrapositions and topicalizations, the elimination of "COMP-deletion," and an overall regularization of PP and S distribution both inside and outside X̄, for all values of X.
In the light of these arguments, the result of this chapter can be unified with that of Ch. 1, in which the central role of the category P in assigning θ-roles was established. The category P emerges as a central subordinating category in universal grammar. Recalling also the results of Ch. 2 and 3, that every S contains a VP, but not every VP is necessarily immediately dominated by S, the following overall proposal about subordination in natural language appears justified:

(100)

Recursion Hypothesis. Outside of Specifiers, all "A-over-A" recursion in language passes through the relational phrases VP and PP, whose heads assign θ-roles directly.

APPENDIX TO CHAPTER SEVEN

THE GENERALIZED DISTRIBUTION OF WH

The result of this chapter - that COMP is a subcategory of P - suggests a hypothesis about a necessary condition for the appearance of WH that would be difficult to formulate in a system where COMP ≠ P⁰. This is the first attempt, to my knowledge, to derive the specific distribution of WH from a more general principle.

(i)

WH is a syntactic feature which indicates that a bar notation morpheme category on which it appears is closed.

This expresses in a natural way that WH may occur on SP(N) (which, what), SP(A) (how), and P (if, whether) but not on N, A, or V. We could easily retain the essence of (i) and at the same time allow WH to occur as a feature on bar notation maximal projections, in line with work such as Chomsky (1977), by rewording (i) so that upwards percolation from SP(X) to Xmax is either allowed or required. Milner (1978b) suggests that such percolation is obligatory for English and optional for French. Cf. van Riemsdijk (1984) for further restrictions on pied-piping.

Given (i), we should ask if WH can appear on SP(V) or SP(P); whether or not it does might be taken as a language-specific parameter. Thus, van Riemsdijk (1978, Ch. 6) argues that WH can occur on SP(P) in English, and, within his constrained theory of movement rules, this marked option for the English base is precisely what gives rise to the language-particular phenomenon of English preposition stranding.

In another elegant and detailed study of WH-movement, Horvath (1981) investigates the possibility that the landing site for this rule in Hungarian and Basque direct and indirect questions is a pre-verbal node which forms a single constituent with V. While her study makes no particular mention of the AUX in Hungarian, it does not seem implausible that the immediately pre-verbal node in question, which can be marked with WH, may be SP(V). As shown in Ch. 5, it is not unusual that all the tense and mood markers of a deep structure SP(V) are spelled out as verbal inflection, as in Romance languages. When this happens, only empty SP(V) are left in s-structure, and they could well be the landing site for WH-movement, provided they carry the feature WH. One construction which appears crucially in Horvath's argumentation is the interpretation in Hungarian of the empty pre-verbal landing site, under certain circumstances, as progressive aspect; this strengthens my impression that this landing site may be SP(V). Koopman (1984, 103) expresses the same view.

I cannot attempt to integrate in any detail the fully articulated analyses of van Riemsdijk and Horvath into the somewhat different overall theory proposed here. For example, in van Riemsdijk's theory, SP(P) is a landing site for WH, while in mine, P plays this role. This contrast might seem conceptually at odds, but in fact what I have said here is compatible with both positions being landing sites, provided certain further restrictions are imposed; e.g., no "doubly-filled COMP's" can be allowed, but this is typically a problem in theories in which COMP ≠ P. It is beyond the scope of this work to effect a combining of these theories with what has been presented in this chapter. But overall the possibilities just mentioned suggest to me that (i) is not only a necessary condition for the appearance of WH, but, in terms of all languages considered together, a sufficient one.

Bibliography

Akmajian, A. (1970) On deriving cleft sentences from pseudo-cleft sentences. Linguistic Inquiry 1, 149-168.
Akmajian, A. (1977) The complement structure of perception verbs in an autonomous syntactic framework. In P. Culicover, T. Wasow, and A. Akmajian (eds) Formal syntax. Academic Press, 427-460.
Akmajian, A., S. Steele, and T. Wasow (1979) The category Aux in universal grammar. Linguistic Inquiry 10, 1-64.
Anderson, S. (1976) On the notion of subject in ergative languages. In C. Li (ed) Subject and topic. Academic Press, 1-24.
Anderson, S. (1982) Where's morphology? Linguistic Inquiry 13, 571-612.
Anderson, S. and S. Chung (1977) On grammatical relations and clause structure in verb-initial languages. Syntax and Semantics 8. Academic Press, 1-25.
Aronoff, M. (1976) Word formation in generative grammar. M.I.T. Press.
Baker, C.L. (1971) Stress level and auxiliary behavior in English. Linguistic Inquiry 2, 167-181.
Baltin, M. (1978) Toward a theory of movement rules. M.I.T. doctoral dissertation.
Baltin, M. (1978b) PP as a bounding node. In M. Stein (ed) Proceedings of NELS VIII. University of Massachusetts at Amherst.
Baltin, M. (1981) Strict bounding. In C.L. Baker and J. McCarthy (eds) The logical problem of language acquisition. M.I.T. Press, 257-295.
Baltin, M. (1982) A landing site theory of movement rules. Linguistic Inquiry 13, 1-38.
Baltin, M. (1983) Extraposition: bounding versus government-binding. Linguistic Inquiry 14, 155-161.
Banfield, A. (1973) Narrative style and the grammar of direct and indirect speech. Foundations of Language 10, 1-39.
Banfield, A. (1973b) Stylistic transformations in "Paradise Lost". University of Wisconsin doctoral dissertation.
Banfield, A. (1976) Stylistic deletion in coordinate structures. Unpublished paper.
Banfield, A. (1982) Unspeakable sentences. Routledge and Kegan Paul.
Bates, D. (1983) θ-role assignment in derived nominals. Talk delivered at the meeting of the Western Conference on Linguistics.
Bech, G. (1955) Studien über das Deutsche Verbum Infinitum, Band I. Copenhagen University.
Belletti, A. and L. Rizzi (1981) The syntax of "ne": some theoretical implications. Linguistic Review 1, 117-154.
Besten, H. den (1977) On the interaction of root transformations and lexical deletive rules. Groninger Arbeiten zur germanistischen Linguistik 20, 1-78.
Besten, H. den (1978) On the presence and absence of WH-elements in Dutch comparatives. Linguistic Inquiry 9, 641-671.
Bickerton, D. (1982) Learning without experience the Creole way. In L. Obler and L. Menn (eds) Exceptional language and linguistics. Academic Press, 15-29.
Bok-Bennema, R. (1981) Clitics and binding in Spanish. In R. May and J. Koster (eds) Levels of syntactic representation. Foris Publications, 9-32.


Bordelois, I. (1974) The grammar of Spanish causative complements. M.I.T. doctoral dissertation.
Borer, H. (1984) Parametric syntax. Foris Publications.
Bouchard, D. (1983) On the content of empty categories. Foris Publications.
Brame, M. (1976) Conjectures and refutations in syntax and semantics. North Holland.
Brame, M. (1979) A note on COMP S: grammar vs. sentence grammar. Linguistic Analysis 5, 383-386.
Brame, M. (1979b) Alternatives to the tensed S and specified subject conditions. Essays toward realistic syntax. Noit Amrofer Publishing, 183-213.
Bresnan, J. (1970) On complementizers: towards a syntactic theory of complement types. Foundations of Language 6, 297-321.
Bresnan, J. (1971) Sentence stress and syntactic transformations. Language 47, 257-281.
Bresnan, J. (1972) Theory of complementation in English syntax. M.I.T. doctoral dissertation.
Bresnan, J. (1972b) Stress and syntax: a reply. Language 48, 326-342.
Bresnan, J. (1973) Syntax of the comparative clause. Linguistic Inquiry 4, 275-345.
Bresnan, J. (1974) The position of certain clause particles in phrase structure. Linguistic Inquiry 5, 614-619.
Bresnan, J. (1976) On the form and interpretation of syntactic transformations. Linguistic Inquiry 7, 3-40.
Bresnan, J. (1977) Variables in the theory of transformations. In P. Culicover, T. Wasow, and A. Akmajian (eds) Formal syntax. Academic Press, 157-196.
Bresnan, J., R. Kaplan, S. Peters, and A. Zaenen (1982) Cross-serial dependencies in Dutch. Linguistic Inquiry 13, 613-635.
Burzio, L. (1981) Intransitive verbs and Italian auxiliaries. M.I.T. doctoral dissertation.
Chiba, S. (1974) On the movement of post-copular NP's in English. Studies in English Linguistics 2, 1-17.
Chomsky, N. (1957) Syntactic structures. Mouton.
Chomsky, N. (1964) A transformational approach to syntax. In J. Fodor and J. Katz (eds) The structure of language. Prentice-Hall, 211-245.
Chomsky, N. (1965) Aspects of the theory of syntax. M.I.T. Press.
Chomsky, N. (1970) Remarks on nominalization. In R. Jacobs and P. Rosenbaum (eds) Readings in English transformational grammar. Ginn, 184-221.
Chomsky, N. (1972) Deep structure, surface structure, and semantic interpretation. Studies in semantics in generative grammar. Mouton, 62-119.
Chomsky, N. (1973) Conditions on transformations. In S. Anderson and P. Kiparsky (eds) A festschrift for Morris Halle. Holt, Rinehart and Winston, 232-286.
Chomsky, N. (1976) Conditions on rules of grammar. Linguistic Analysis 2, 303-352.
Chomsky, N. (1977) On Wh-movement. In P. Culicover, T. Wasow, and A. Akmajian (eds) Formal syntax. Academic Press, 71-132.
Chomsky, N. (1980) On binding. Linguistic Inquiry 11, 1-46.
Chomsky, N. (1981) Lectures on government and binding. Foris Publications.
Chomsky, N. (1982) Some concepts and consequences of the theory of government and binding. M.I.T. Press.
Chomsky, N. and M. Halle (1968) The sound pattern of English. Harper and Row.
Chomsky, N. and H. Lasnik (1977) Filters and control. Linguistic Inquiry 8, 425-504.
Chung, S. (1978) Case marking and grammatical relations in Polynesian. University of Texas Press.
Conroy, P. (1973) The water is wide. Dell Publishing Co.
Contreras, H. (1984) A note on parasitic gaps. Linguistic Inquiry 15, 698-701.
Culicover, P. (1971) Syntactic and semantic investigations. M.I.T. doctoral dissertation.
Culicover, P. (1977) Syntax. Academic Press.
Culicover, P. and W. Wilkins (1984) Locality in linguistic theory. Academic Press.


Czepluch, H. (1983) Case theory and the dative construction. Linguistic Review 2, 1-38.
Derbyshire, D. (1977) Word order universals and the existence of OVS languages. Linguistic Inquiry 8, 590-599.
Dougherty, R. (1970) A grammar of coordinate conjoined structures: I. Language 46, 850-898.
Emonds, J. (1970) Root and structure-preserving transformations. M.I.T. doctoral dissertation.
Emonds, J. (1971) The derived nominals, gerunds, and participles in Chaucer's English. In B. Kachru et al. (eds) Issues in linguistics. University of Illinois Press, 185-198.
Emonds, J. (1972) Evidence that indirect object movement is a structure-preserving rule. Foundations of Language 8, 546-561.
Emonds, J. (1973) Alternatives to global derivational constraints. Glossa 7, 39-62.
Emonds, J. (1975) A transformational analysis of French clitics without positive output constraints. Linguistic Analysis 1, 3-24.
Emonds, J. (1976) A transformational approach to English syntax. Academic Press.
Emonds, J. (1978) The verbal complex V'-V in French. Linguistic Inquiry 9, 151-175.
Emonds, J. (1979) Appositive relatives have no properties. Linguistic Inquiry 10, 211-243.
Emonds, J. (1980a) Inversion généralisée NP-α: marque distinctive de l'anglais. In A. Rouveret (ed) Langages 60, 13-45.
Emonds, J. (1980b) Word order in generative grammar. Journal of Linguistic Research 1, 33-54.
Emonds, J. (1984) The necessity of three-cornered comparative syntax. In L. King and C. Maley (eds) Selected papers from the 13th Linguistic Symposium on Romance Languages. John Benjamins, 51-75.
Emonds-Banfield, P. (1981) English syntactic competence. Mental Representations.
Engdahl, E. (1983) Parasitic gaps. Linguistics and Philosophy 6, 5-34.
Evers, A. (1975) The transformational cycle in Dutch and German. Indiana University Linguistics Club.
Fente, R., J. Fernández, and L. Feijóo (1972) Perífrasis Verbales. Sociedad General Española de Librería.
Flynn, M. (1983) A categorial theory of structure building. In G. Gazdar, E. Klein, and G. Pullum (eds) Order, concord, and constituency. Foris Publications, 139-174.
Fowler, H. (1965) Dictionary of modern English usage, 2nd edition. Oxford University Press.
Fraser, B. (1965) An examination of the verb-particle construction in English. M.I.T. doctoral dissertation.
Fraser, B. (1976) The verb-particle combination in English. Academic Press.
Freidin, R. and L. Babby (1983) On the interaction of lexical and syntactic properties: case structure in Russian. Unpublished paper.
Garcia, E. (1967) Auxiliaries and the criterion of simplicity. Language 43, 853-870.
Gazdar, G., G. Pullum, I. Sag, and T. Wasow (1982) Coordination and transformational grammar. Linguistic Inquiry 13, 663-676.
Geis, M. (1970) Adverbial subordinate clauses in English. M.I.T. doctoral dissertation.
George, L. and J. Kornfilt (1981) Finiteness and boundedness in Turkish. In F. Heny (ed) Binding and filtering. M.I.T. Press, 105-127.
Goldsmith, J. (1981) Complementizers and root sentences. Linguistic Inquiry 12, 541-574.
Greenberg, J. (1963) Some universals of grammar with particular reference to the order of meaningful elements. In J. Greenberg (ed) Universals of language. M.I.T. Press, 73-113.
Greenough, J. et al. (1975) New Latin grammar. Caratzas Brothers.
Grimshaw, J. (1979) Complement selection and the lexicon. Linguistic Inquiry 10, 279-326.
Groos, A. (1978) Towards an inflectional theory of clitics. Unpublished paper.
Groos, A. and H. van Riemsdijk (1981) Matching effects in free relatives. In A. Belletti, L. Brandi, and L. Rizzi (eds) A theory of markedness in generative grammar. Scuola Normale Superiore.
Gross, M. (1968) Grammaire transformationnelle du français: syntaxe du verbe. Larousse.


Gruber, J. (1965) Studies in lexical relations. M.I.T. doctoral dissertation.
Gruber, J. (1969) Topicalization in child language. In D. Reibel and S. Schane (eds) Modern studies in English. Prentice-Hall, 422-447.
Guéron, J. (1980) On the syntax and semantics of PP extraposition. Linguistic Inquiry 11, 637-678.
Haiman, J. (1978) Conditionals are topics. Language 54, 564-589.
Hankamer, J. (1973) Why there are two than's in English. In C. Corum et al. (eds) Papers from the Ninth Regional Meeting of the Chicago Linguistic Society. University of Chicago, 179-191.
Harlow, S. (1981) Government and relativisation in Celtic. In F. Heny (ed) Binding and filtering. M.I.T. Press, 213-254.
Harris, Z. (1946) From morpheme to utterance. Language 22, 161-183.
Hasegawa, N. (1981) A lexical interpretive theory with emphasis on the role of subject. University of Washington doctoral dissertation.
Helke, M. (1973) On reflexives in English. Linguistics 106, 5-23.
Hendrick, R. (1976) Prepositions and the X' theory. In J. Emonds (ed) Proposals for semantic and syntactic theory: UCLA papers in syntax 7, 95-122.
Hendrick, R. (1978) The phrase structure of adjectives and comparatives. Linguistic Analysis 4, 255-300.
Hendrick, R. (1979) On nesting and indexical conditions in linguistic theory. U.C.L.A. doctoral dissertation.
Hendrick, R. (1982a) Construing relative pronouns. Linguistic Analysis 9, 205-224.
Hendrick, R. (1982b) Reduced questions and their theoretical implications. Language 58, 800-819.
Hess, T. and V. Hilbert (1976) Lushootseed: the language of the Skagit, Nisqually, and other tribes of Puget Sound. University of Washington American Indian Studies.
Higgins, F.R. (1973) On J. Emonds' analysis of extraposition. In J. Kimball (ed) Syntax and semantics 2. Academic Press, 149-195.
Hornstein, N. (1977) S and X' convention. Linguistic Analysis 3, 137-176.
Hornstein, N. and D. Lightfoot (1981) Introduction. Explanation in linguistics. Longman.
Horvath, J. (1976) Focus in Hungarian and the X̄ notation. Linguistic Analysis 2, 175-197.
Horvath, J. (1980) Core grammar and a stylistic rule in Hungarian syntax. In J. Jensen (ed) NELS X: Cahiers linguistiques d'Ottawa. University of Ottawa, 237-256.
Horvath, J. (1981) Aspects of Hungarian syntax and the theory of grammar. U.C.L.A. doctoral dissertation.
Huang, J. (1982) Logical relations in Chinese and the theory of grammar. M.I.T. doctoral dissertation.
Huddleston, R. (1978) On the constituent structure of VP and AUX. Linguistic Analysis 4, 31-59.
Huot, H. (1981) Constructions infinitives du français. Librairie Droz.
Hurtado, A. (1981) Le contrôle par les clitiques. Revue québécoise de linguistique 11, 9-67.
Huybregts, M. (1976) Overlapping dependencies in Dutch. Utrecht Working Papers in Linguistics 1, 24-65.
Ishihara, R. (1982) A study of absolute phrases in English within the government-binding framework. U.C.S.D. doctoral dissertation.
Iwakura, K. (1977) The auxiliary system in English. Linguistic Analysis 3, 101-136.
Jackendoff, R. (1972) Semantic interpretation in generative grammar. M.I.T. Press.
Jackendoff, R. (1973) The base rules for prepositional phrases. In S. Anderson and P. Kiparsky (eds) A festschrift for Morris Halle. Holt, Rinehart and Winston, 345-356.
Jackendoff, R. (1977) X̄ syntax: a study of phrase structure. M.I.T. Press.
Jaeggli, O. (1982) Topics in Romance syntax. Foris Publications.
Jakobson, R. (1984) Russian and Slavic grammar studies 1931-1981. Mouton.
Jensen, J. and M. Strong-Jensen (1984) Morphology is in the lexicon! Linguistic Inquiry 15, 474-498.


Jo, M.-J. (in preparation) Determinants of free and fixed word order in Korean. University of Washington doctoral dissertation. Katz, J. (1964) Semantic theory and the meaning of "good". Journal of Philosophy 61, 739-766. Kayne, R. (1972) Subject inversion in French interrogatives. In J. Casagrande and B. Saciuk (eds) Generative studies in Romance languages. Newbury House, 70-126. Kayne, R. (1975) French syntax: the transformational cycle. M.I.T. Press. Kayne, R. (1981) ECP extensions. Linguistic Inquiry 12, 93-133. Kayne, R. (1983) Le datif en anglais et en français. Revue Romane. Kean, M.-L. (1975) The theory of markedness in generative grammar. M.I.T. doctoral dissertation. Keenan, E. (1976) Towards a universal definition of "subject". In C. Li (ed) Subject and topic. Academic Press, 303-334. Keenan, E. (1976b) Remarkable subjects in Malagasy. In C. Li (ed) Subject and topic. Academic Press, 247-302. Keniston, H. (1937) Spanish syntax list. Henry Holt and Company. Keyser, S.J. and T. Roeper (1984) On the middle and ergative constructions in English. Linguistic Inquiry 15, 381-416. Kimball, J. (1973) Get. In J. Kimball (ed) Syntax and semantics 2. Seminar Press, 205-215. Klein, E. (1980) Determiners and the category Q. Unpublished paper. Klein, S. (1982) Syntactic theory and the developing grammar. U.C.L.A. doctoral dissertation. Klima, E. (1964a) Negation in English. In J. Fodor and J. Katz (eds) The structure of language. Prentice-Hall, 246-323. Klima, E. (1964b) Relatedness between grammatical systems. Language 40, 1-20. Koopman, H. (1984) The syntax of verbs. Foris Publications. Koster, J. (1975) Dutch as an SOV language. Linguistic Analysis 1, 111-136. Koster, J. (1978) Conditions, empty nodes, and markedness. Linguistic Inquiry 9, 551-594. Koster, J. and R. May (1982) On the constituency of infinitives. Language 58, 116-143. Kuno, S. (1970) Some properties of non-referential noun phrases. In R. Jakobson and S.
Kawamoto (eds) Studies in general and oriental linguistics. TEC Corporation, 348-373. Kuroda, S.-Y. (1975) Subject. In M. Shibatani (ed) Syntax and semantics 5: Japanese generative grammar. Academic Press, 1-16. Kuroda, S.-Y. (1983) What can Japanese say about government and binding? Unpublished paper. Lakoff, G. and J. Ross (1966) A criterion for verb phrase constituency. Report NSF-17. Harvard University Computation Laboratory. Lamiroy, B. (1983) Les verbes de mouvement en français et en espagnol. John Benjamins. Lapointe, S. (1981) The representation of inflectional morphology within the lexicon. In V. Burke and J. Pustejovsky (eds) Proceedings of the Eleventh Annual Meeting of the North Eastern Linguistic Society, 190-204. Leben, W. (1982) Metrical or autosegmental. In H. van der Hulst and N. Smith (eds) The structure of phonological representations, Part I. Foris Publications, 177-190. Levin, B. (1983) On the nature of ergativity. M.I.T. doctoral dissertation. Li, C. (1976) Subject and topic. Academic Press. Li, C. and S. Thompson (1976) Subject and topic: a new typology of language. In C. Li (ed) Subject and topic. Academic Press, 457-490. Lieber, R. (1980) On the organization of the lexicon. M.I.T. doctoral dissertation. Lieber, R. (1983) Argument linking and compounds in English. Linguistic Inquiry 14, 251-286. Lightfoot, D. (1974) The diachronic analysis of English modals. In J. Anderson and C. Jones (eds) Proceedings of the First International Conference on Historical Linguistics. Reidel, 219-249. Lightfoot, D. (1979) Principles of diachronic syntax. Cambridge University Press. Maling, J. (1983) Transitive adjectives: a case of categorial reanalysis. In F. Heny and B. Richards (eds) Linguistic categories: auxiliaries and related puzzles. Reidel.


Marantz, A. (1980) English S is the maximal projection of V. In J. Jensen (ed) NELS X: Cahiers linguistiques d'Ottawa. University of Ottawa, 303-314. McA'Nulty, J. (1980) Binding without case. In J. Jensen (ed) NELS X: Cahiers linguistiques d'Ottawa. University of Ottawa, 315-328. McA'Nulty, J. (1983) Clitics and c-command: three reasons for applying finite verb raising on the left side of the grammar. Unpublished paper. McCloskey, J. (1983) A VP in a VSO language? In G. Gazdar, E. Klein, and G. Pullum (eds) Order, concord, and constituency. Foris Publications, 9-55. Milner, J.-C. (1973) Arguments linguistiques. Éditions Mame. Milner, J.-C. (1978) De la syntaxe à l'interprétation. Éditions du Seuil. Milner, J.-C. (1978b) Cyclicité successive, comparatives, et cross-over en français. Linguistic Inquiry 9, 673-693. Milner, J.-C. (1982) Théorie des fonctions grammaticales. In Ordres et raisons de langue. Éditions du Seuil, 67-223. Monaghan, M. (1981) Negation in English revisited. University of Washington master's degree thesis. Morikawa, M. (1982) A structure-preserving approach to the Japanese passive construction. Papers in Japanese linguistics 8, 129-168. Muysken, P. (1983) Parameterizing the notion head. Journal of Linguistic Research 2, 57-76. Nanni, D. and J. Stillings (1978) Three remarks on pied-piping. Linguistic Inquiry 9, 310-318. Newmeyer, F. (1980) Linguistic theory in America. Academic Press. Oehrle, R. (1976) The grammatical status of the English dative alternation. M.I.T. doctoral dissertation. Ostler, N. (1979) Case-linking: a theory of case and verb diathesis applied to Classical Sanskrit. M.I.T. doctoral dissertation. Otero, C. (1976) The dictionary in generative grammar. Unpublished paper. Pesetsky, D. (1982) Paths and categories. M.I.T. doctoral dissertation. Piera, C. (1984) On the representation of higher order complex words. In L. King and C. Maley (eds) Selected papers from the 13th Linguistic Symposium on Romance Languages. John Benjamins.
Plann, S. (1981) The two el + infinitive constructions in Spanish. Linguistic Analysis 7, 203-240. Postal, P. (1969) On so-called "pronouns" in English. In D. Reibel and S. Schane (eds) Modern studies in English: Readings in transformational grammar. Prentice-Hall, 201-224. Pranka, P. (1983) Syntax and word formation. M.I.T. doctoral dissertation. Quicoli, C. (1980) Clitic movement in French causatives. Linguistic Analysis 6, 131-186. Quicoli, C. (1982) The structure of complementation. E. Story-Scientia. Reinhart, T. (1981) Definite NP anaphora and c-command domains. Linguistic Inquiry 12, 605-635. Reuland, E. (1981) On extraposition of complement clauses. In V. Burke and J. Pustejovsky (eds) Proceedings of the Eleventh Annual Meeting of the North Eastern Linguistic Society, 296-318. Reuland, E. (1981b) Empty subjects, case, and agreement in the grammar of Dutch. In F. Heny (ed) Binding and filtering. M.I.T. Press, 159-190. Riemsdijk, H. van (1978) A case study in syntactic markedness. Foris Publications. Riemsdijk, H. van (1983) The case of German adjectives. In F. Heny and B. Richards (eds) Linguistic categories: auxiliaries and related puzzles. Reidel. Riemsdijk, H. van (1984) On pied-piped infinitives in German relative clauses. In J. Toman (ed) Issues in the grammar of German. Foris Publications, 165-192. Riemsdijk, H. van and E. Williams (1981) NP-structure. Linguistic Review 1, 171-218. Rizzi, L. (1981) Nominative marking in Italian infinitives and the nominative island constraint. In F. Heny (ed) Binding and filtering. M.I.T. Press, 129-157.


Rizzi, L. (1982) Lexical subjects in infinitives: government, case and binding. Issues in Italian syntax. Foris Publications, 77-116. Rizzi, L. (1982b) Violations of the Wh island constraint and the subjacency condition. Issues in Italian syntax. Foris Publications, 49-76. Robbins, B. (1968) The definite article in English transformations. Mouton. Rochemont, M. (1978) A theory of stylistic rules in English. University of Massachusetts doctoral dissertation. Roeper, T. and M. Siegel (1978) A lexical transformation for verbal compounds. Linguistic Inquiry 9, 199-260. Ronat, M. (1973) Échelles de base et mutation en syntaxe française. Université de Paris VIII doctoral dissertation. Rosenbaum, P. (1967) The grammar of English predicate complement constructions. M.I.T. Press. Ross, J. (1967) Constraints on variables in syntax. M.I.T. doctoral dissertation. Rouveret, A. (1978) Result clauses and conditions on rules. In S.J. Keyser (ed) Recent transformational studies in European languages. M.I.T. Press, 159-187. Rouveret, A. and J.-R. Vergnaud (1980) Specifying reference to the subject. Linguistic Inquiry 11, 97-202. Ruck, C. (1968) Ancient Greek, a new approach. M.I.T. Press. Ruwet, N. (1967) Introduction à la grammaire générative. Plon. Ruwet, N. (1969) À propos de prépositions de lieu en français. In C. Hyart (ed) Mélanges Fohalle. Duculot, 115-135. Ruwet, N. (1978) Une construction absolue en français. Linguisticae Investigationes 2, 165-210. Ruwet, N. (1979) On a verbless predicate construction in French. Papers in Japanese Linguistics: Memorial Volume S.I. Harada 6, 255-285. Ruwet, N. (1982) Grammaire des insultes et autres études. Éditions du Seuil. This volume contains Ruwet (1969, 1978, 1979). Safir, K. (1982) Inflection-government and inversion. Linguistic Review 1, 417-467. Sag, I. (1978) Floated quantifiers, adverbs, and extraction sites. Linguistic Inquiry 9, 146-150. Samiian, V. (1983) Structure of phrasal categories in Persian: an X-bar analysis. U.C.L.A.
doctoral dissertation. Schachter, P. and F. Otanes (1972) Tagalog reference grammar. University of California Press. Schein, B. (1981) The S'-deletion parameter and small clauses. GLOW Newsletter 6, 49-51. Schwartz, A. (1972) Constraints on transformations. Journal of Linguistics 8, 35-86. Selkirk, E. (1982) The syntax of words. M.I.T. Press. Siegel, D. (1974) Topics in English morphology. M.I.T. doctoral dissertation. Sommerstein, A. (1977) Modern phonology. University Park Press. Sportiche, D. (1981) Bounding nodes in French. Linguistic Review 1, 219-246. Steele, S., A. Akmajian, R. Demers, E. Jelinek, C. Kitagawa, R. Oehrle, and T. Wasow (1981) An encyclopedia of AUX: a study in cross-linguistic equivalence. M.I.T. Press. Stowell, T. (1981) Origins of phrase structure. M.I.T. doctoral dissertation. Strozer, J. (1976) Clitics in Spanish. U.C.L.A. doctoral dissertation. Talmy, L. (1975) Semantics and syntax of motion. In J. Kimball (ed) Syntax and semantics 4. Academic Press, 181-238. Talmy, L. (1978) Figure and ground in complex sentences. In J. Greenberg, C. Ferguson, and E. Moravcsik (eds) Universals of human language, vol IV. Stanford University, 625-649. Talmy, L. (1983) How language structures space. In H. Pick and L. Acredolo (eds) Spatial orientation: theory, research, and application. Plenum Press. Taraldsen, K.T. (1981) The theoretical interpretation of a class of marked extractions. In A. Belletti, L. Brandi and L. Rizzi (eds) A theory of markedness in generative grammar. Scuola Normale Superiore, 475-516. Taraldsen, K.T. (1983) Parametric variation in phrase structure. Tromsø University doctoral dissertation.


Terazu, N. (1979) A note on the derived structure of extraposition rules. Studies in English Linguistics 7, 86-99. Torrego, E. (1984a) On inversion in Spanish and some of its effects. Linguistic Inquiry 15, 103-129. Torrego, E. (1984b) Determinerless NP's. Unpublished paper. Travis, L. (1984) Parameters and effects on word order variation. M.I.T. doctoral dissertation. Travis, L. and E. Williams (1983) Externalization of arguments in Malayo-Polynesian languages. Linguistic Review 2, 57-78. Walinska de Hackbeil, H. (1983) X categories in morphology. In J. Richardson, M. Marks, and A. Chukerman (eds) Papers from the parasession on the interplay of phonology, morphology, and syntax. Chicago Linguistic Society, 301-313. Walinska de Hackbeil, H. (1984) On two types of derived nominals. In D. Testen, V. Mishra, and J. Drogo (eds) Papers from the parasession on lexical semantics. Chicago Linguistic Society, 308-332. Wasow, T. (1975) Anaphoric pronouns and bound variables. Language 51, 368-383. Wasow, T. (1977) Transformations and the lexicon. In P. Culicover, T. Wasow, and A. Akmajian (eds) Formal syntax. Academic Press, 327-360. Wasow, T. and T. Roeper (1972) On the subject of gerunds. Foundations of Language 8, 44-61. Whitney, R. (1982) The syntactic unity of complex NP shift and wh-movement. Linguistic Analysis 10, 299-319. Whitney, R. (1983) The place of dative movement in a generative theory. Linguistic Analysis 12, 315-322. Whitney, R. (1984) The syntax and interpretation of A-adjunctions. University of Washington doctoral dissertation. Whitney, W.D. (1889) Sanskrit grammar, 2nd edition. Harvard University Press. Wilkins, W. (1980) Adjacency and variables in syntactic transformations. Linguistic Inquiry 11, 709-758. Williams, E. (1978) X-features. Unpublished paper. Williams, E. (1980) Predication. Linguistic Inquiry 11, 203-238. Williams, E. (1981) On the notions "lexically related" and "head of a word." Linguistic Inquiry 12, 245-274. Williams, E.
(1982) The NP cycle. Linguistic Inquiry 13, 277-295. Williams, E. (1983) Against small clauses. Linguistic Inquiry 14, 287-308. Williams, E. (1984) There-insertion. Linguistic Inquiry 15, 131-153. Yim, Y.-J. (1984) Case-tropism: the nature of phrasal and clausal case. University of Washington doctoral dissertation. Zagona, K. (1982) Government and proper government of verbal projections. University of Washington doctoral dissertation. Zubizarreta, M.-L. (1982) On the relationship of the lexicon to syntax. M.I.T. doctoral dissertation. Zwicky, A. (1971) In a manner of speaking. Linguistic Inquiry 2, 223-233.

Index

ablative case, 4, 53, 221, 224, 229-33, 247 absolute construction, 34, 41, 52, 73, 81-2, 91, 97, 104, 233, 236, 248, 252 abstract case, 22, 52-65, 103, 112-13, 130, 134, 222-4, 239-43, 286, 297 accusative case, 30, 42, 52-4, 57, 60, 164, 186, 221, 224-6, 229-34, 237, 248, 256, 267 accusative languages, 136 accusative-marked lexical subject, 307 action nominalization, 45 activity verb, 172, 187, 189 Adjacency (Hypothesis), 8-10, 18, 116, 139, 199, 205-9, 213-14, 219-20, 239, 242, 244, 295-6, 305; see also Head Adjacency adjective (A), 54, 156, 196-7, 203-4, 208-9, 220-3, 226-9, 232, 245; disguised, 162-3; grammatical, 185; predicate, 35, 43, 269; verbal, 74; weak/strong, 223 adjective' (A j ), 209 adjective phrase (AP), 27, 42-4, 58, 163, 185, 208, 222, 269, 274-5, 293, 331; specifier, see SP(A) adjunct θ-role, 68 adjunction, 127, 138-46, 319, 324-5; local, 142; root, 127, 142-4, 146; transformational, 201; see also Chomsky-adjunction adverb, 13, 15, 26, 140, 161-2, 213; as caseless AP, 185; directional, 252-3, 256-60; manner, 216; preposing, 260 adverb-forming -ly, 201 adverbial AP, 58, 134, 185, 314

adverbial case NP, 224-9, 232-7 adverbial gerund, 72, 79, 85, 88-91, 95, 97 adverbial NP, 121 adverbial participle, 104, 106 adverbial subordinate clause, 45, 90, 281-4, 288-91, 304, 314, 324-6, 332 affirmative imperative, 143 affix, see derivational morphology and inflection affix, verbal, 56, 161, 213 affix movement, 109, 142, 146, 148, 175, 196, 199, 205, 214-18, 227 agent, 133; postposing, 106 agentive, 154, 188 AGR, 125 agreement, 7, 53, 76, 109, 200-3, 206 American English, 190, 237 anaphor, 28, 92, 100, 129 anaphora, N, 182 and, 329-30 ANIMATE, 224 antecedent, 271 anti-transitivity, 107-8 A-over-A recursion, 332 AP, see adjective phrase appositive: modifier, 96; NP, 28, 229, 239; relative clause, 239, 252, 278, 303-4, 319-22 Arabic, 52-3, 62, 193, 248 archi-category, 54 argument: internal and external, 24, 28-31, 34, 41, 56, 68, 82, 84, 94, 102, 133-7, 141, 154, 178, 221; position, 137, 141, 295, 301, 322; subcategorized, 37


as, 61, 328-32; clause, 266, 327-32; deletion, 328; introductory, 57; (non)comparative, 63, 249, 264-79, 328-32; phrase, 273-7; prepositional copula, 264-6 as if clause, 307 aspect, temporal, 46, 71, 102-3, 116, 251, 294 aspectual gerund, 71, 79, 103 asymmetric case-marking, 46, 51 asymmetric semantics, 19 asymmetry, 2-5, 25-6, 29, 42-52, 95, 281 Autonomy from the Lexicon, 2, 3, 5 autonomy of syntax, 19, 42, 137, 154 autosegmental phonology, 8-9 AUX, 19, 22, 86-7, 90, 103, 122, 142, 160, 163, 167, 174, 199, 210-19, 333; deletion, 90-1; fronting, 319; see also INFL and SP(V) auxiliary verb, 159, 163, 172-6, 191, 212; conjugated, 174-6; contraction, 301; do, 214, 217; and inflection, 210; inversion, see subject-auxiliary inversion; passive, 106-7 AUX-to-COMP, 146 Balto-Slavic, 236 bar notation, 1-7, 14, 23-4, 28, 53-5, 96, 121-4, 156-60, 245; category, 53, 139, 156; feature, 55; phrase, 157; projection, 24 Bar Notation Uniformity, 157-8 bare infinitive, 46-7, 74, 98, 100, 116-7 bare VP, 69-70, 75-98, 110 barrier to government, 59, 128, 131 base component, 3-4, 19; see also deep structure base composition rule, 20-31, 155, 213, 248; see also phrase structure rule Base Generation of Categories, 164 base-dependent case assignment, 5 Basque, 333 be, 46, 71, 146, 150, 212, 270, 333; raising, 150 Berber, 150 binary branching, 28 binding theory, 85-6, 185, 296, 300 Black English, 109 bound morpheme, 142, 179-80, 193-7, 200, 205, 246

bounding node, 129, 324-6; see also subjacency branching, 28, 160, 191 Breton, 148-9, 219 canonical clausal word order pattern, 123 case: adverbial, 224-9, 232-7; inflectional, see morphological case; inherent, 224; morphological, 4, 22, 52-6, 62, 99, 199, 220-37, 240-3, 267, 297; semantic, 224; see also ablative, accusative, dative, etc. case assignment, 5, 42, 54-61, 164 case category, 52-4, 164 case ending, 194, 201, 206, 222, 243 case feature, 53-5, 60, 164, 180, 243, 247 case filter, 54-8, 295, 298 case names, 40 case resistance, 23 case theory, 4-5, 52, 65; see also other headings for case case-inflecting languages, 225, 227 case-marking, 30-2, 46-7, 54-65, 114, 164, 220-1, 225-9, 234, 237, 242-3, 295-7; of AP, 58; asymmetry, 46, 51; domain, 128; ergative, 136-7; Exceptional, 170, 294-302, 332; generalized, 58-62, 65, 224, 228, 296; languages, 59; middle, 136; principle, 58; see also case assignment and morphological case categorial component, 53; see also base component and base composition rule Categorial Grammar, 15 Categorial Uniformity, 1-2, 7, 20 category: bar notation, 53, 139, 156; deep structure, 245; empty, 39-40, 98-100, 111, 139, 143, 179, 183-4, 190, 244, see also PRO and trace; governing, 75, 85; grammatical, 5, 13, 19; head-of-phrase, 32, 39, 162, 207, see also head; morpheme, 20, 155-9; non-phrasal, 20, 155-9; relational, 29, 31; syntactic, 155-9; see also closed category, lexical category, open category category-neutral, 22, 31, 55, 58 Category-neutral Syntax, 19, 26, 29, 58 category-specific semantics, 24 causative, 53, 62, 147-8, 164, 188-90, 194-7, 248

c-command, 68, 75-9, 85, 296, 305; minimal, 68, 76-7, 85, 209; mutual, 310 Celtic, 122, 145, 148-50, 154 chain, 68 chemistry, 244 Chinese, 4, 40, 142, 206, 219 Chomsky-adjunction, 295-6, 320-1 Classical Greek, 52-3, 60-2, 193, 220-1, 224-5, 232-3, 235-6, 243, 247 clausal complement, 48-51, 148-9, 249, 305; see also gerund and infinitive clausal subject, 22, 30, 40, 306, 314 clause: finite, 15, 48-50, 142, 164, 199, 283, 304, 307; main, 144, 316-17, see also root clause; non-finite, 70, 110, see also gerund and infinitive; observed result, 284-5; planned result, 293; purpose, 293, 298, 320; relative, see relative clause; result, 283-5, 293; root, 127, 304, 313, 332, see also main clause; small, 34, 36, 76, 84; subordinate, 141, 144, 264, 281-4, 288-91, 324-6, 329; temporal, 90 cleft sentence, focus in, 249-51, 261-2, 269, 273-4, 288-9, 299, 304 clitic, 10, 62, 99, 129, 138, 143, 178, 184, 193-4, 198-9, 205, 220-2, 240-1, 330 clitic-climbing, 198 cliticization, 202, 205-6 clitic-verb sequence, 193 closed category, 6-7, 99, 128, 143, 146, 159-72, 176-9, 182, 187, 191, 195-7, 203, 206, 227, 243, 286 closed class, 159, 165-72, 177-9, 184, 188-91; Unique Syntactic Behavior of, 165-70 closed head category P, 6 coining, 13, 159, 169, 191 comma intonation, 77, 283-4, 291 command, see imperative COMP, see complementizer comparative: clause, 249, 252, 264, 279, 285, 293, 324, 327-32; inflection, 161, 185, 193-4, 201, 206, 213; specifier, 180, 199, 203 comparative as, 249, 264-7, 328-32 competence, 224 complement, 4-6, 26, 29; clausal, 48-51, 148-9, 251, 305; directional, 30, 40, 57, 224; infinitival, 50-1, 56, 115; obligatory, 283; S, 48, 50, 292; S, 23,

305, see also clausal complement and infinitival complement; VP, 46-8, 96, 120 complementary distribution, 7, 180 complementizer (COMP), 48-51, 88-95, 113, 139, 142-7, 150-3, 160, 180-1, 184-5, 249-52, 268, 273, 278, 281-334; branching, 160; deletion in main clauses, 309, 316-19, 332; empty, 113, 119, 307; filter, 325, 334; morpheme, 184, 249-52, 291; morphological, 163; No Complementizer Constraint, 144-5; as P, 281-334; that, 283; zero, 113, 119, 307 complex NP shift, 139-40, 186 component: base, 3, 4, 19; morphological, 242-6; semantic, 166; transformational, 9, 53 composite pronoun, see compound pronoun compositional θ-role assignment, 37 compound, 20, 198, 202; P, 251; pronoun, 162, 204, 207; V, 164 conditional, 206, 211-12, 287-8, 299, 320, 323 conditional/future suffix, 201 conjoined: NP, 240-1; phrase, 222; structure, 57, 328-31; see also coordinate conjugated auxiliary, 174-6 conjunction (CONJ), 166, 319-21, 329-32; coordinate, 162, 238, 252, 329-32; subordinating, see subordinating conjunction; temporal, 72; see also coordinate connective, 163 consonant, 54 consonant cluster, 171 conspiracies of syntax/of semantics, 323-4 constitute, 38-9, 42, 64, 99, 103-5 construction-specific transformation, 138 context predicate, 115 contraction, 167, 211, 243 control: obligatory, 96-106, 110-11, 118, 178, 274, 305-8; subject, 80 control infinitive, 48, 51 coordinate/coordination, 88, 158, 180, 238-41, 252, 328-31; see also conjoined and conjunction coordinate: clause, 143, 329; conjunction, 162, 238, 252, 329-32; phrase, 330;


structure, 14, 108, 275, 319; structure constraint, 240, 330; VP, 108 coreference, 24, 123 count noun, 21, 200, 227, 245, 311 Creole, 158 cross-classification, 25, 54, 164, 237 cycle, 100-1, 146, 177, 218; transformational, 98, 107, 146 cyclic checking, 101 cyclic domain, 177, 186, 206, 319 cyclic lexical insertion, 98-101, 110, 177 dative, 4, 32, 53-4, 112, 183, 221-36, 247-8, 267, 297; movement, 188-9; of possession, 232; subject, 297 daughter, 20, 22 declension, 53 deep structure, 1-6, 9, 13, 15, 18, 26, 39, 157, 195-200, 212, 246; category, 245; subject, 23, 25, 133; verbless clause, 24; see also base component and base composition rule defective paradigm (in inflectional morphology), 197 DEFINITE, 299 degree modifier, see comparative and SP(A) deletion: AUX, 90-1; COMP, 309, 316-19, 332; Equi-NP, 100-1; S, 51, 90; VP, 86, 111, 211 deletion rule, 173 demonstrative, 156, 158, 239 derivational morpheme, 197-8, 201 derivational morphology, 16-17, 193-8, 201-2, 245-6 derivatively generated, 2, 45 derived bar notation category, 139 derived nominal, 42-52, 64-5, 276 derived structure, 127, 138 designated element, 172-3, 185 Designation Convention, 7, 159, 172-9, 184, 187-8, 191, 202, 217, 244 determiner (DET), 5, 16, 18, 156, 161, 222-3, 227, 243, 253, 270, 311; case inflection on, 220-1; limit on number of, 168, 182; pronoun as, 239; subject, 193; see also SP(N) deverbalization rule, 94 dictionary, see lexicon diminutive, 194, 196

direct object, 64-5, 136, 229, 264; see also accusative case and direct θ-role assignment direct question, see question direct θ-role assignment, 31-7, 42, 47-51, 247 direction toward (accusative), 229, 234 directional, 40, 146; adverb, 252-3, 256-60; complement, 30, 40, 57, 224; preposition, 225, 229, 234, 290 discontinuous construction, 263, 284 disguised lexical category, 162-4, 176 disjoint reference, 85-6 do, 217-18; auxiliary verb, 214, 217; contrastive, 218; insertion, 86, 214, 217; do so, 28, 283 dominance relations, 22, 156-8 doubling NP, 228 doubly-filled COMP filter, 325, 334 dual, 158, 234-6 Dutch, 115-20, 141, 143, 145, 149-51, 154, 169-70, 203, 206, 219, 312, 325 E, see Expression echo question, 324 elision, 207 ellipsis, 163; see also deletion embedded clause, see subordinate clause, gerund, infinitive, subjunctive emphasis formative, 214, 218 empty category, 39-40, 98-100, 111, 139, 143, 179, 183-4, 190, 244; see also PRO and trace empty category principle, 114 empty COMP, 113, 119, 307 Empty Head Principle, 111-14, 184, 189, 231, 308-12, 315, 321 empty N, 23, 113 empty node, see empty category empty NP, 55, 76, 231 empty operator, 88 empty P, 49-50, 53, 111-13, 183-4, 189, 225-31, 236-7, 248 empty subject, 27, 75, 78, 98, 100, 103, 110-11, 295 empty VP, 86, 111, 129, 211, 239 enclitic, 330; see also clitic endocentric, 40 English, 9, 10, 15-17, 21, 40, 46-7, 50-2, 55-6, 62, 70-5, 84-6, 89-91, 93, 97,

103, 109, 110, 123, 128-9, 141-2, 144, 146, 148, 150-1, 154, 157-8, 161, 166, 170-1, 173-6, 180, 182, 185, 193-4, 196-202, 206, 208, 210-22, 225, 230, 232, 236-9, 241, 243, 245-6, 250, 252, 259, 265, 274, 278, 286-7, 294, 299, 310-11, 318, 322-30, 333 English Pronoun Rule, 239, 243-4 Equi-NP deletion, 100-1 erasure, 39, 107; see also deletion ergative, 136-7 escape hatch, 332 European languages, 122 evolution, 30, 138 Exceptional Case Marking (ECM), 170, 294-8, 301-2, 332 exclamation, 277, 306, 324 exclamative, 131, 277, 299, 306, 320, 324 existential, 166, 311 exocentric, 24 expansion of X k , see base composition rule and phrase structure rule Expression (E), 13, 81, 97, 143, 146-8, 316-21 Extended Projection Principle, 134 extended standard theory, 178 external argument, 24, 34, 41, 56, 68, 82, 84, 94, 133-7, 154; see also subject extraposition, 94, 118-20, 138, 309-16, 327; as instance of Move PP, 309-13; PP, 116; S, 23, 309-13, 332 false case of English pronouns, 237-42 feature: case, 53-5, 60, 164, 180, 243, 247; cross-classifying, 237; morphosyntactic, 158, 173; phonological, 168; pronominal, 159; secondary, 54; semantic, 165-76, 184, 187-8; SUBJ, 3, 54; syntactic, 165-77, 199, 212, 333; system, 28; vowel, 54; see also subcategorization feature feminine, 223 filter, 119-20, 182, 197, 208, 229, 255, 278; case, 54-8, 295, 298; COMP, 325, 334; V-V, 119 finite: clause, 48-50, 142, 164, 199, 283, 304, 307; verb, 15, 147, 150, 199, 216-19; verb formation, 216-19; verb raising, 147, 150, 218 first and second person pronouns in

Spanish, 241 First Inflection Principle, 203-6, 219, 243-6 flat structure, 74, 121 Flemish, 260 floating quantifier, 92, 182, 215 focus, 129, 284, 301, 312; in cleft sentences, 249-51, 261, 269, 273-4, 288-9, 299, 304 for: phrase formation, 294, 296, 302; preposition, 292; subordinating conjunction, 292 formative, see grammatical formative for-to clause, as PP, 48, 291-99, 305-7 free extraposition, 309-15 free variation, 165, 168 French, 10, 15-17, 21, 27, 40, 47, 49-52, 54, 62, 109, 111, 129, 131, 139, 142-3, 146-8, 150, 158, 174-6, 180-1, 183-4, 193-4, 199-202, 205-6, 211-22, 240-1, 245, 250, 252, 264-6, 283, 286-7, 295, 318, 325, 333. French: a and de, 50; exclamative que, 131; le-la-les rule, 241 fronting, 129, 138-41; of PP, 273, 299-301; stylistic, 152; V, 146-9, 170, 219; V, 214; see also inversion, NP movement, WH movement future tense, 211-12 gapping, 250, 328-32 gender, 163, 194, 201, 206, 223 Generalized Case Marking, 58-62, 65, 224, 228, 296 Generalized Head Restriction, 140-2, 146-50; see also Head Movement Restriction generative semantics, 2, 108 genitive, 53, 63-4, 183, 221-6, 232-7, 248; absolute, 233; of comparison, 233; of separation, 232 German, 4, 42, 52-3, 56, 60-2, 115-20, 143, 145, 149-51, 154, 169-70, 193-4, 200-1, 203, 206, 218-26, 228-30, 233, 237, 243, 247, 267, 299, 312, 329; ein-word, 223 gerund, 46-9, 69-97, 103-6, 109-10, 149, 206, 270, 274; adverbial, 72-97; non-NP, 71, 92, 106; NP, 70, 77, 84, 93, 105, 110; perception, 71-2, 80-1, 93


GOAL, 230-3, 292-8, 302, 307 Gothic, 236 governing category, 75, 85 government, 38, 57-9, 128, 131, 140, 156; of nouns, 233; proper, 95 government-binding framework, 37, 113; see also binding theory governor, 38, 58 grammatical adjective, 185 grammatical category, 5, 13, 14, 19 grammatical formative, 5, 17, 39, 40, 156-9, 162, 165, 173, 177, 179, 213-14, 218, 249 grammatical noun, 162, 184-5 grammatical relation, 4, 6, 61-5, 79; interpretive principles for, 157 grammatical universal, see universal grammar and universals on word order grammatical verb, 40, 162, 169, 172, 175; do, 217; post-transformational insertion of, 184-91 grammatical X, 169, 176-7, 188 Greek, see Classical Greek grossest constituent analysis, 59, 63, 207 head, 14-17, 80, 156, 228; grammatical, 162; lexical, 1, 68, 85, 98, 124, 129, 243; multiple, 14; of phrase category, 32, 39, 162, 207; of S, 124-5 Head Adjacency, 207-10, 214-19, 223, 244 Head Constraint, 156 Head Modifier Constraint, 285 Head Movement Restriction, 143, 145, 154; see also Generalized Head Restriction Head Placement (Parameter), 15-20, 26-7, 125-6, 131, 146, 199, 200, 243, 245, 254 Head Uniqueness Principle, 149 head-final languages, 125-6; see also verb-final languages head-initial languages, 26, 126, 130-1, 135, 140; see also verb-initial languages headless relative NP, 307, 328 Hierarchical Universality, 1, 3, 7, 20 historical change, 126 Hixkaryama, 126 honorifics, 194 Hungarian, 129, 141, 333

I, see INFL I inversion, see subject-auxiliary inversion idiom, 160, 202, 255, 260, 262-3 idiomatic: particle, 262-3; prepositional phrase, 258 if clause, as PP, 286-9 immediate constituent, 14, 15, 20 imperative, 143, 318, 323; tag, 167, 212 imperfect tense, 211 impersonal verb, 99 INCHOATIVE, 169 independent clause, see main clause and root clause index of control, 111 indices, 54 indirect discourse, 313 indirect object, 30, 40, 49, 53, 57, 61-4, 112, 188-9, 224-32, 245, 267, 292, 299 indirect question, 30, 40, 48, 101, 104, 277, 286-7, 291, 298, 305-7, 333 indirect θ-role assignment, 31, 37-41, 47-9, 61-2, 99-101, 105, 136, 226-7, 308 Indo-European, 13, 22, 71, 134, 221, 224, 232-6 induced empty node, 98, 111-13 infinitival relative, 94, 294, 297, 302, 332 infinitive, 50-1, 56, 69-71, 75-8, 84-9, 92, 96-106, 110, 115, 161, 206, 216, 274; bare, 46-7, 74, 98, 100, 116-17; control, 48, 51; for-to, 48, 291-99, 305-7; to, 8, 47, 125 INFL, 3-7, 55, 124-6, 132, 142-50, 154, 218, 227; see also AUX, SP(V), tense, verbal inflection INFL inversion, see subject-auxiliary inversion inflection, 123, 179-80, 193-246; and the Adjacency Hypothesis, 205-9; tense, 210-3; verbal, 142, 210, 213-20, 227, 333; see also INFL Inflection Principles, First and Second, 203-6, 213, 219, 223, 243-6, 295 inflectional case, 221-2, 242 inflectional morphology, 16, 193-246 inflectional variant, 171-2 inherent case, 224 inherently reflexive, 116 insertion: cyclic lexical, 98-101, 110, 177; lexical, 97, 178, 196; of, 183; P, 63;

post-transformational lexical, 113, 132, 159, 176-81, 184-91, 197, 204, 304; pre-transformational, 177 insertion transformation, 255, 304 instrumental case, 53, 224, 235 INTENSIFIED, see SP(A) internal argument, 29, 31, 68, 102, 137 interpretive principles for grammatical relations, 157 intransitive preposition, 32, 163, 252-63, 289 intransitive verb, 57, 254 intrinsic content, 289 inversion, 82, 148, 151; root, 141-5, 154; stylistic, 146; subject-auxiliary, 143-5, 214, 318, 330; verb-subject, 146-9, 170, 219 Invisible Category Principle, 7, 61, 99, 195, 227-8, 244 Irish, 74, 149 irregularity, 170-2, 202 isolating languages, 225 Italian, 139, 146-7, 218, 324, 326 Japanese, 9, 10, 52-5, 62, 64, 99, 142, 149, 151, 164, 193-4, 218, 221 judgment, 13, 14, 123-4, 128, 318 Kapingamarangi, 135 Korean, 4, 142, 145, 149, 154, 164 labialization, 54 landing site, 10, 106, 140-1, 147, 205, 299-301, 310, 318, 322-7, 332-4 language-particular nature of inflection, 250-9 language-particular rule, 7-8, 200 last cycle, 146 late lexical insertion, 113, 132, 159, 176-81, 184-91, 197, 204, 304 Latin, 4, 17, 42, 52-3, 57, 60, 62, 170-1, 176, 193, 220-1, 223-5, 229-33, 235-6, 247 Left Branch Condition, 96, 131, 299 left dislocated NP, 288 left-right order, 26-7, 243; see also Head Placement (Parameter) and word order leftward movement, see fronting, inversion, NP movement, WH movement

le-la-les rule for French, 241 lexical category, 1-6, 13-14, 19, 25, 30, 39, 128, 145, 156-64, 168-9, 172, 184-5, 190, 193; disguised, 162-4, 176; major, 13, 157-9, 203; see also head, open category lexical entry, 9, 255 lexical head, 1, 68-9, 85, 98, 124, 129, 243 lexical insertion, 97, 178, 196; cyclic, 98-101, 110, 177; see also post-transformational lexical insertion lexical P, 92, 120, 225-6, 250 lexical subject, 51, 86, 307 lexically specific, 261-2, 289 lexicon, 2, 157, 159, 178, 213, 246 LG (predecessor to UG), 30, 55 limited extraposition, 312 linear order, 20; see also left-right order and word order linking verb, 43, 60-1; see also be local rule, 7, 127, 148, 205-6, 209, 238, 241, 305, 325 local transformation, 38, 115, 120, 139-50, 175, 210, 213, 216-17, 220, 235, 240, 242 locality principle, 8 location, 133, 233 locative, 53, 235 logical form, 19 logical semantics, 19 main clause, 144, 316-17; COMP-deletion in, 309, 316-19, 332; see also root clause main verb, 173-4 major transformational movement, 139-40; see also Move α Malagasy, 126 manner adverb, 216 manner of speaking verb, 306 markedness, 60-1 max, value of, 20 maximal projection, 5, 14, 18, 76, 96, 128, 132, 156, 221, 300-3, 316-18, 333 measure phrase, 21, 23, 26, 29, 53, 55, 64, 121, 133-4 merger, 145, 244 metrical phonology, 8, 9 Middle English, 94, 110 minimal c-command, 68, 76-7, 85, 209


minimal projection, 14, 26-7, 96, 156, 207-9, 222, 295 modal, 103, 142, 161, 167, 174, 210-13, 217, 253; analysis, 210-12; force, 103-4; and tense inflection, 210-13 modifier, 15, 26, 44 morpheme: bound, 142, 179, 193-7, 200, 205, 246; derivational, 197-8, 201; inflectional, 196-8; phrasal, 130-1 morpheme category, 20, 155-9 morphological case, 4, 22, 52-6, 62, 99, 199, 220-37, 240-3, 267, 297 morphological component, 242-6 Morphological Transparency, 222-7, 239-43 morphology, 8, 16, 109, 235; derivational, 16-17, 193-8, 201-2, 245-6; inflectional, 16, 193-246; verbal, see verbal inflection morphosyntactic feature, 158, 173 motion toward (preposition), 229, 234 motion verb, 47, 60, 170, 173 Move α, 10, 107, 112-13, 118, 127-8, 138-41, 147, 206, 231, 299-303, 310-15, 319-27, 331 Move I, see subject-auxiliary inversion Move N, 300 Move PP, 116, 309-13 movement: affix, see affix movement; dative, 188-9; leftward, see fronting, inversion, NP movement, WH movement; local, see local rule and local transformation; major transformational, 139-40, see also Move α; NP, 46, 52, 61, 97, 106, 184, 187, 267, 300; particle, 255-61; PP, see PP extraposition and PP fronting; rightward, 73, 129, 140, 186, 320-1, see also extraposition; root, 141-5, 154; SP(V), 141, see also subject-auxiliary inversion; verb, 115-20, see also verb fronting and verb raising; WH, see WH movement N, 208-9; anaphora, 182; dislocation, 139 NECESSITY, 212 NEG: and AUX, 211; movement, 181-2 negation, 15, 180, 323 negative morpheme, 181, 214, 245 negative polarity modal, 167 No Complementizer Constraint, 144-5

node: bounding, 129, 324-6, see also subjacency; empty, 39, 98-100, 111, 139, 143, 179, 183-4, 190, 244; nonphrasal, 20, 155-9; preverbal, 333; root, 81, 127, 139, 143-4, 196, 304, 313, 316-21, 330-2 nominal, derived, 42-52, 64-5, 276 nominative absolute, 236 nominative case, 53, 58-9, 180, 185, 221, 226, 229, 234-40, 243 non-argument position, 28-9, 295, 301, 322 non-comparative as, 63, 264-79; before predicate nominal, 267-72; P status of, 272-79, 332 non-finite clause, 70, 110; see also gerund and infinitive non-maximal projection, 14, 96, 209, 222, 295 non-NP gerund, 71, 92, 106 non-phrasal category, 20, 155-9 non-phrasal modifier, 15 non-restrictive relative clause, 239, 252, 278, 303-4, 319-22 noun (N), 6, 25, 156, 221-3; count, 21, 200, 227, 245, 311; disguised, 162; ellipsis, 163; empty, 23, 113; government of, 233; grammatical, 162, 184-5; Move N, 300; N-V asymmetry, 52; verbal, 74 noun phrase (NP): appositive, 28, 229, 239; conjoined, 240; doubling, 228; empty, 55, 76, 231; headless relative, 307, 328; left dislocated, 288; minimal, 222; possessive, 21, 26, 64, 94, 125, 130, 183-4, 199, 201, 206, 220-1, 228-9, 241; referential, 272; subject, 25, 68, 76-9, 85, 92, 123-5, 131, 135, 295; topicalized, 93; vocative, 81, 234 NP complement to A, 4 NP gerund, 70-1, 77, 84, 93-4, 105, 110 NP movement, 46, 52, 61, 97, 106, 184, 187, 267, 300 NP postposing, 139, 141 NP structure, 53 NP trace, 221, 231; see also empty NP and NP movement NP-a Inversion 170 null: anaphor, 129, 182; preposition, see empty P; VP, 129, 139 number agreement, 76, 200-1

numeral, 16, 168 object shift, 268-9 object-controlled gerund, 82-4 objective pronoun, 297 obligatory, 134, 156, 204, 213, 244; constituent, 21, 25, 283, 291 obligatory control, 96-106, 110-11, 118, 178, 274, 305-8 oblique case, 4, 32, 42, 112, 229 observed result clause, 284-5 of: insertion, 183; phrase, 208-9 one, 28 open category, 6, 99, 146, 159, 167-73, 177-8, 187, 190-1, 196-7, 201-3, 262; see also lexical category operator, 88-91, 103 order, 26-7, 243; see also Head Placement (Parameter) and word order over-correction, 238 P, see preposition P insertion rule, 63 P shift rule, 255-61 P j, 157, 303 palatalization, 54 parallelism, 4, 19, 331 parameter, 333; Head Placement, 15-20, 26-7, 125-6, 131, 146, 199, 200, 243, 245, 254; pro-drop, 147, 218; word order, see Head Placement (Parameter) parametric variation, 122 parasitic gap, 88-92 parentheses in structural description, 217-18 parenthetical, 138, 143, 146, 163, 239, 321; formation, 73, 320-1 participle, 52, 70, 122; adverbial, 104, 106; passive, 75, 113, 194, 199; past (perfect), 174-6, 199, 201, 216; postnominal, 208; present, 109, 116, 175-6, 199, 216; see also reduced relative particle, 14, 16, 56, 202, 252-63; emphatic, 214; postverbal, 32-3, 163, 252-63, 289; question, 153 particle movement, 255-61 partitive, 62 passive, 98, 106, 113, 164, 176, 182-4, 187, 195, 199, 267

passive auxiliary, 106-7 passive participle, 75, 113, 194, 199 past participle, 174-6, 199, 201, 216 past tense, 161, 196, 198 path of motion in verbs, 157; see also directional perception gerund, 71-2, 80-1, 93 perception verb, 46-7, 52, 116, 118 percolation, 299-301, 333 perfect participle, see past participle Persian, 4, 200, 207 personal pronoun, see pronoun phonological: feature, 168; foot, 180; theory, 136; word, 205 phonology, 8-9, 17, 245 phrasal conjunction, 330 phrasal modifier, 130-1 phrase category, 1, 122, 156; see also maximal projection, minimal projection, X2 phrase structure rule, 2, 15, 18, 22, 27, 202, 254; see also base composition rule phrases outside X, 27 physics, 8 pied-piped infinitival, 299 pied-piping, 276-9, 303-4, 320, 333 planned result clause, 293 pleonastic it/there, 310-11 plural, 158, 161, 193-4, 200-1, 206, 223-4, 231-2, 236 polarity item, 181-2 Polish, 17, 42, 57, 62, 326 Polynesian, 135-6, 138, 150 possessive NP, 21, 26, 64, 94, 125, 130, 183-4, 199, 201, 206, 220-1, 228-9, 241 post-cyclic principle, 220 post-cyclic rule, 178, 219 postpositional languages, 126 post-transformational lexical insertion, 113, 132, 159, 176-81, 184-91, 197, 204, 304 postverbal clitic, 205 postverbal particle, 32-3, 163, 252-63, 289 predicate adjective, 35, 43, 269 predicate attribute, 44, 57-9, 65, 75, 83, 92, 241, 264-9, 273 predicate nominal, 36, 43, 239, 267-79; after as, 267-72


predication, 82, 121, 138 prefix, separable verb, 56 prefix, verb-forming, 17 preposition (P), 6, 13-14, 29, 32, 39-41, 61, 183, 203, 225-37, 242, 245, 247-66, 277-309, 318-34; compound, 251; directional, 225, 229, 234, 290; disguised, 163; empty, 49-50, 53, 111-13, 183-4, 189, 225-31, 236-7, 248; as grammatical category, 156-7; intransitive, 32-3, 163, 252-63, 289; as landing site for WH movement, 299-303; lexical, 92, 120, 225, 250; and N, 54; of place, 236 prepositional case, 232 prepositional copula as, 264-6 prepositional phrase (PP), 14, 27-8, 42, 62, 140, 163, 225, 232, 247-63, 269-79, 282-327; directional, 30, 57; extraposition, 116, 309-13; fronting, 273, 299-301; subject, 314 prescriptivists, 238 present participle, 109, 116, 175-6, 199, 216

present subjunctive, 48, 297 prestige usage of subject pronouns in English, 238-9 pre-subject position, 146, 152 pre-transformational insertion, 177 preverbal clitic, see entries under clitic preverbal node, 333 principal governor, 38 Principle A, 185; see also anaphor and reflexive Principle B, 85-6 PRO, 27, 75, 100, 103, 111 pro-drop parameter, 147, 218 productivity, 195; see also open category progressive, 46, 71, 270, 333 projection, 1, 3, 122, 156; maximal, see maximal projection; minimal, see minimal projection; Relative Clause Projection Principle, 290-4; second, 121, 319; third, 5, 7, 26, 28, 37, 96, 121, 137 Projection Adjacency, 207 Projection Principle, 134, 138, 140, 146, 178, 289 pronominal: anaphor, 100; feature, 159 pronominalization, 320

pronoun, 161, 184, 271-2; compound, 162, 204, 207; false case of in English, 237-42; in the lexicon, 239; objective, 297; subject, 180, 238-44, 297 Pronoun Rule for English, 238-44 Pronoun Rule for Spanish, 242 proper government, 95 Proper Inclusion Precedence, 17 propositional θ-role, 111 pro-verb do so, 283 PRT, see postverbal particle pseudo-cleft, 87, 269 psychological verb, 306-7 Pukapukan, 136 purpose clause, 293, 298, 320 quantifier, 24, 32, 156, 199, 204, 207 quantifier floating, 92, 131, 182-3, 215 quantifier phrase (QP), 163, 327 question, 152, 174, 210, 299, 324, 330; indirect, see indirect question; tag, 142, 146, 210, 214, 219, 323 question particle, 153 quirky case, 60, 224-6 quirky dative, 231 quirky genitive, 229, 233 raising: right node, 250; subject, 118, 307; verb, 115-20, 147, 150, 218 raising-to-object, see Exceptional Case Marking readjustment rule, 296 reciprocal, 75 recoverability, 106-110; of NP, 90 Recoverability Condition, 173, 176 recursion, 7, 26, 96, 130-1; A-over-A, 332 Recursion Hypothesis, 332 recursion restriction, 130-1 reduced relative, 71, 76-80, 85, 93-7, 104, 208 reference, 24, 121, 123, 128; disjoint, 85-6 referential NP, 272 reflexive, 75, 92, 162, 185; inherently, 116 regular inflectional variant, 171-2 relational category P, 29, 31 relative clause, 181, 252, 277-8, 284-5, 289-94, 298-9, 304, 312-13, 324; appositive, 239, 303-4, 319-22; headless, 307, 328; infinitival, 94, 294, 297, 302, 332; non-restrictive, 252, 278;

reduced, 71, 76-80, 85, 93-7, 104, 208; restrictive, 303; sentential, 265-6 Relative Clause Projection Principle, 290-4 relativized predicate nominal, 271-2 result clause, 283-5, 293 Revised Second Inflection Principle, 295 Revised θ-criterion, 69, 70, 78-85, 98-110, 116, 308 right node raising, 250 Right Roof Constraint, 327 rightward movement, 73, 129, 140, 186, 320-1; see also extraposition Romance languages, 142, 145-9, 154, 157, 170, 198, 203, 205-6, 211, 218, 333 root, 81, 139, 143-4, 196, 316-21, 330 root clause, 127, 304, 313, 332 root transformation, 127, 141-6, 154, 214, 219, 318, 330 Russian, 42, 84, 221, 224-5, 297, 326 -'s, see possessive NP S, 13, 14, 69, 96, 121-2, 128, 154, 157, 308, 318; complement, 48, 50, 292; deletion, 51, 90; topicalization, 313-6; see also clause S, 69, 92-4, 157, 250, 281-332; complement, 23, 305; extraposition of, 23, 309-13, 332; subject, 22, 30, 40, 306, 314; topicalization, 309, 313-16; see also clause Sanskrit, 193, 224-5, 233-6, 247 science, 2, 8 Second Inflection Principle, 204, 206, 213, 223, 243-4; Revised, 295 secondary feature, 54 selection restriction, 24, 79, 125, 156, 305-6 selectionally dominant, 156 semantic case, 224 semantic component, 166 semantic feature, 165-9, 172-6, 184, 187-8 semantic interpretation, 9, 19, 168-9 semantic selection, 305-6 semantics: asymmetric, 19; category-specific, 24; generative, 2, 108; logical, 19 sentence adverb, 286, 297 sentence-initial adverbial, 26, 260

sentential relative, 265-6 separable verb prefix, 56 serial verb construction, 40 Slavic, 53 small clause, 34, 36, 76, 84 so that clause, 283-5, 293 some/any alternation, 181 SOURCE, 133 SOV languages, 2, 126, 138-41, 149-52 SP(A), 5, 18, 29, 53, 156, 161-4, 167, 179, 203, 208-9, 245, 253, 293, 327, 331, 333; see also comparative SP(INFL), 7 SP(N), 53-7, 156, 163, 166, 181-2, 204, 222-3, 245, 333; see also determiner SP(P), 19, 29, 32, 53, 140, 161, 164, 258, 296, 333-4 SP(V), 22, 53-7, 122-4, 140-7, 154-7, 161, 167, 176, 210-15, 219, 227, 237-9, 245-6, 294-7, 333; see also AUX and INFL SP(X), 1, 16, 19-21, 159-60, 166-8, 179, 200-3, 213, 221, 237, 243, 263, 285, 299-303, 333; see also specifier Spanish, 62, 70-5, 82, 84-7, 89, 91, 93, 97, 99, 103, 109-10, 116, 120, 139, 146-8, 163, 176, 194-6, 205, 220-2, 233, 241-2, 279; el, 71 SPC, see Structure-Preserving Constraint specified subject condition, 123 specifier, 5, 18-21, 121, 200, 203, 210, 219, 285, 293; see also SP(X), SP(A), etc. split antecedent, 92 spreading, 108 s-structure, 1, 2, 53-64, 77, 128, 218-19; see also post-cyclic and post-transformational stative verb, 170, 172, 187, 189, 270 stem, 109, 193-4, 197 stranding, 283, 330, 333 Stratified Cycle Hypothesis, 100-1 stress, 201, 218 structural description, 218 structuralist grammar, 206, 210, 245 structure preservation, 7, 139-41, 300-2, 322-4 structure-building transformation, 108 Structure-Preserving Constraint, 115, 127-8, 132, 138-50, 154, 303, 319


structure-preserving substitution, 127, 139-40 stylistic fronting, 152 Stylistic Inversion, 146 stylistic transformation, 127, 145, 239 subcategorization, 5, 9, 17, 37-42, 45, 60-2, 70, 82-4, 98-106, 110-12, 122, 170, 189, 226-7, 292, 304, 308, 311-12; for elements in COMP, 304-9; for S, 311-12; obligatory, 291 subcategorization feature, 17, 38-40, 45, 60-2, 99, 112, 178, 183, 186-7, 254-6, 292 subcategorization frame, 61, 84, 227, 244, 307 subcategory, 158, 172 SUBJ feature, 3, 54 subjacency, 38, 90, 94-6, 116, 129, 178-9, 239-40, 244, 251, 308-9, 322-6 subject, 3-6, 21-6, 34, 68, 75-9, 85, 91, 121-5, 131-5, 154, 295; clausal, 22, 30, 40, 306, 314; dative, 297; deep structure, 23, 25, 133; empty, 27, 75, 78, 98, 100, 103, 110-11, 295; lexical, 51, 86, 307; PP, 314; understood, 27, 67-70, 75, 78, 85, 97-8, 103, 110-11, 294; unique, 134 SUBJECT, 76 subject clitic, 143, 240; rule for French, 240 subject control, 80 Subject Principle, 23-30, 78 subject prominence, 132-8 subject pronoun, 238-44, 297; context for insertion of, 180; prestige usage of, 238-9 subject-auxiliary inversion, 143-5, 214, 318, 330 subject-final languages, 138 subject-predicate structure, 138 subject-raising verb, 118, 307 subjects across categories, 22 subjunctive, 96, 109, 211-12; present, 48, 297 subordinate clause, 141, 144, 264, 281-4, 288-91, 324-6, 329 subordinating conjunction, 49, 120, 248-52, 258, 262, 265-6, 279, 282, 289-92, 299, 302, 306, 326, 330 subordination, 7, 158, 248, 332 substitution, 127, 139, 144, 319, 325-6;

structure-preserving, 127, 139-40 subtheories of grammar, 246 successive cyclic WH movement, 308, 325-6 suffix, 16, 17; see also inflection superlative, 161, 185, 193-4, 199, 201, 206, 213 suppletion, 170-3, 176, 191 surface structure, see s-structure SVO languages, 18, 27, 126, 135-41, 149-52, 219 syncretism, 110, 237 Syntactic Asymmetry, 2, 3 syntactic category, 155-9; see also category syntactic feature, 165-77, 199, 212, 333; see also feature syntactic units, relation with words, 201-5 tag question, 142, 146, 210, 214, 219, 323 Tagalog, 137 temporal: aspect, 46, 71, 102-3, 116, 251, 294; clause, 90; conjunction, 72 tense, 19, 161, 196-8, 210-19, 227 tense-aspect contrast, 294 terminal element, 186 Thai, 151 than, 328-32 that clause, 15, 48-50, 142, 164, 199, 283, 304, 307; as PP, 283-6 that-trace filter, 114 thematic relations, see θ-role theme, 133 there, 186, 311 θ-criterion, 29, 41, 63-5, 67-70, 78, 84-5, 127, 141; revised, 69, 78-85, 98-110, 116, 308 θ-dependent case theory, 65 θ-relatedness, 78-83, 94, 100, 104 θ-role, 29, 34-5, 78-9, 82-4, 100-2, 107, 133-7; adjunct, 68; fixed, 137; multiple, 68; propositional, 111; theory, 46, 48 θ-role assignment, 5-6, 29-52, 58, 61-2, 65, 67-8, 98-101, 105, 136-7, 154, 194, 226, 245, 247, 308; compositional, 37; direct, 31-7, 42, 47-51, 247; indirect, 31, 37-41, 47-9, 61-2, 99-101, 105, 136, 226-7, 308 θ-role association principle, 137 third projection, 5, 7, 26-8, 37, 57, 95-6, 121, 137, 318

though clause, 318 T-model, 146 topic prominence, 132-8, 150 topicalization, 93, 126, 138, 149, 304, 313-16, 320, 332; S, S, 309, 313-16 topic-comment, 135, 137 trace, 23, 77, 97, 113, 178, 187; NP, 221, 231, see also empty NP, NP movement; WH, 295, 299, see also WH movement transformation: construction-specific, 138; insertion, 255, 304; language-particular, 7-8, 200; local, 38, 115, 139-50, 175, 210, 213, 216, 220, 235, 240, 242; root, 127, 141-6, 154, 214, 219, 318, 330; structure-building, 108; structure-preserving, 7, 127, 139-41, 300-2, 322-4; stylistic, 127, 145, 239; see also movement and Move α transformational component, 9, 53 transformational cycle, 98, 107, 146 transitive verb, 60, 254; see also accusative case Transparency, Morphological, 222-7, 239-43 Turkish, 71, 141 typology, 138, 141, 151 understood subject, 27, 75, 78, 98, 100, 103, 110-11, 295 understood subject property, 67-70, 75, 78, 85, 98 uniform three-level hypothesis, 14, 25 unique subject, 134 Unique Syntactic Behavior: of closed class items, 165-70, 191; definition, 165 Universal Base Hypothesis, 158 universal grammar, 8, 15, 18, 20, 30-1, 52, 65 universal VP, 150 Universality, Hierarchical, 1, 3, 7, 20 universals on word order, Greenberg's, 125, 132, 135-6, 150-3; Universal 6, 135-6; Universals 9 through 12, 151-3 USP, see understood subject property V fronting, 146-9, 170, 219 V movement, 115-120; see also verb raising V2 complement, see VP complement V3, see third projection

Vk complement, 46, 52 Vmax, see third projection and maximal projection Y, 217 V deletion, V fronting, 214 variable, 8-9, 139, 322 Variable Interpretation Convention, 207 verb (V), 3, 6, 24-5, 29, 35, 54, 109, 121, 141-50, 156, 168-9, 196-8, 212-19, 254; of activity, 172, 187, 189; be, 46, 71, 146, 150, 212, 270, 333; of belief and desire, 51; compound, 164; disguised, 163; finite, 15, 147, 150, 199, 216-19; fronting, 146-9, 170, 219; grammatical, 40, 162, 169, 172, 175, 184-91; impersonal, 99; intransitive, 57, 254; linking, 43, 60-1; main, 173-4; of motion, 47, 60, 170; passive, 75, 106-7, 113, 194, 199; of perception, 46-7, 52, 116, 118; psychological, 306-7; raising, 115-20, 147, 150, 218; serial construction, 40; stative, 170-2, 187-9, 270; subject-raising, 118, 307; of temporal aspect, 46, 71, 102-3, 116, 251; transitive, 60, 254 verb phrase (VP), 3, 5, 59, 69, 74-5, 96, 121-2, 128-9, 132, 148-54; bare, 69-70, 75-98, 110; coordinate, 108; empty, 129, 239, see also VP deletion; gerundive, see gerund; participial, see participle; universal, 150 VP complement, 46-8, 96, 120 VP deletion, 86, 111, 211 VP-final, 115, 118, 145, 249, 312 VP-initial, 146 verb raising, 115-20, 147, 150, 218 verbal adjective, 74 verbal complex, 147 verbal inflection, 142, 148, 161, 204, 210, 213-20, 227, 333; in English and French, 213-20; see also tense verbal noun, 74 verb-final languages, 2, 126, 138-41, 149-52 verb-forming prefix, 17 verb-initial languages, 2, 18, 26, 74, 82, 123, 135-41, 148-54 verbless deep structure clause, 24 verb-particle combination, 252, 255-6 verb-second languages, 18, 27, 126,


135-41, 149-52, 219 verb-subject inversion, 146-9, 170, 219 vocative, 81, 234 vowel feature, 54 vowel shortening, 197 VSO languages, 2, 18, 26, 74, 82, 123, 135-41, 148-54 V-V filter, 119 Welsh, 149, 219 WH, 9, 104, 286-8, 291, 299, 304, 320-6, 333-4; distribution of, 333-4 WH interpretation, 320 WH island, 325-6 WH movement, 10, 139, 141, 144, 151, 181, 186, 265-7, 273, 276-9, 287-8, 295, 298-304, 313-14, 322-8, 332-3; P as landing site for, 299-303; successive cyclic, 308, 325-6 WH phrase, 88, 291, 298-302 WH trace, 295, 299 WH word, 181, 272, 277-8, 332 wide scope, 9 word, 109, 195, 198, 201, 204, 219; phonological, 205; relation between words and syntactic units, 201-5, 246

word boundary, 202-5, 242-5 Word Division, Principle of, 202-4, 242-4 word formation, 194 word order, 20, 123-8, 135-41, 145, 148-54; fixed, 18; free, 127; s-structure, 128; stylistic reordering, 127; subject-final, 138; universals, 125, 135-6, 150-3; verb-final, 2, 126, 138-41, 149-52; verb-initial, 2, 18, 26, 74, 82, 123, 135-41, 148-54; verb-second, 18, 27, 126, 135-41, 149-52, 219 word order parameter, see Head Placement (Parameter) word template, 201 X, 159-60, 170, 176, 196, 201-3; see also head and lexical category X2, 121, 319 X', see minimal projection Xmax, see maximal projection yes-no question, 152 zero complementizer, 113, 119, 307

A UNIFIED THEORY OF SYNTACTIC CATEGORIES, by J. EMONDS

Future, present, and past readers: the corrigenda assembled here include only those that will lead to misunderstanding if ignored.

p. 18, 2nd line before 1.2: Placement as in (2) and to examine the consequences
p. 18, lines 10 and 11: (2), not (5).
p. 22, line 7: replace certain with certain X.
p. 91, line 11: replace gerunds with clauses.
p. 92, line 3 of note 10: replace Xj+1 with Lj+1.
p. 108, ex. (78): replace ed under A with en.
p. 139, line 5 from bottom: replace a most by at most.
p. 144, line 13: delete an empty.
p. 211, ex. (29): SP(V) [± TENSE] ([± PAST] optional)
p. 228, 3rd last line: replace it by the latter.
p. 242, line 1 of ex. (37): replace a deep with at deep.
p. 248, line 17 from bottom, and p. 251, line 4 from bottom: replace (1978) with (1976).
p. 249, lines 9 and 32: the symbols are NP and (that) S.
p. 256, line 14 from bottom, and p. 261, line 8: replace (1968) with (1965).
p. 290, line 9: replace N or N with N or N.
p. 295, line 8: replace (59) with (69).
p. 299, line 20: replace S by X.