New Approaches to Chinese Word Formation: Morphology, Phonology and the Lexicon in Modern and Ancient Chinese 9783110809084, 9783110151091


English Pages 400 Year 1997



Table of contents :
Foreword
Preface
Introduction
Word formation in Old Chinese
V-V Compounds in Mandarin Chinese: Argument structure and semantics
Syntactic, phonological, and morphological words in Chinese
Wordhood in Chinese
Prosodic structure and compound words in Classical Chinese
Chinese as a headless language in compounding morphology
Chinese resultative constructions and the Uniformity of Theta Assignment Hypothesis
A Lexical Phonology of Mandarin Chinese
Cognate objects and the realization of thematic structure in Mandarin Chinese
On defining the Chinese compound word: Headedness in Chinese compounding and Chinese VR compounds
Index


New Approaches to Chinese Word Formation


Trends in Linguistics Studies and Monographs 105

Editor

Werner Winter

Mouton de Gruyter Berlin · New York

New Approaches to Chinese Word Formation Morphology, Phonology and the Lexicon in Modern and Ancient Chinese edited by

Jerome L. Packard


1998

Mouton de Gruyter (formerly Mouton, The Hague) is a Division of Walter de Gruyter & Co., Berlin.

Printed on acid-free paper which falls within the guidelines of the ANSI to ensure permanence and durability.

Library of Congress Cataloging-in-Publication-Data

New approaches to Chinese word formation : morphology, phonology and the lexicon in modern and ancient Chinese / edited by Jerome L. Packard. p. cm. - (Trends in linguistics. Studies and monographs ; 105) Includes bibliographical references and index. ISBN 3-11-015109-X (alk. paper) 1. Chinese language - Word formation. 2. Chinese language - Lexicology. I. Packard, Jerome Lee, 1951- . II. Series. PL1231.N48 1997 495.1-dc21 97-33206 CIP

Die Deutsche Bibliothek — CIP-Einheitsaufnahme New approaches to Chinese word formation : morphology, phonology and the lexicon in modern and ancient Chinese / ed. by Jerome L. Packard. - Berlin ; New York : Mouton de Gruyter, 1998 (Trends in linguistics : Studies and monographs ; 105) ISBN 3-11-015109-X

© Copyright 1997 by Walter de Gruyter & Co., D-10785 Berlin All rights reserved, including those of translation into foreign languages. No part of this book may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording or any information storage and retrieval system, without permission in writing from the publisher. Typesetting and printing: Arthur Collignon GmbH, Berlin. Binding: Lüderitz & Bauer, Berlin. Printed in Germany.

For Carol, Errol, Sam and Eric "... I hope you don't mind that I put down in words, how wonderful life is while you're in the world..." Elton John

I was struck suddenly by the joke of it all. We social scientists are trying hard to be conscientious, using the methodologies and thought patterns of seventeenth-century science, while the scientists, traveling away from us at the speed of light, are moving into a universe that suggests entirely new ways of understanding. Just when social scientists seem to have gotten the science down and can construct strings of variables in impressive formulae, the scientists have left, plunging ahead into the vast "porridge of being" that describes a new reality. Margaret J. Wheatley Leadership and the New Science

Foreword

The question of wordhood in Chinese has been prominent from the inception of grammatical studies of the language. Two significant features of Chinese make this question pertinent. First, it has long been observed that, as compared to other languages, Chinese has very little obligatory inflectional morphology of the kind which figures as a major defining feature of such languages as those spoken indigenously in North and South America, for example. The relative lack of "morphology", particularly inflectional morphological processes, has been featured in typological studies at least since Sapir's Language (1921), and has long been cited as a central distinguishing property of Chinese. The second feature of Chinese that makes the issue of wordhood intriguing is its writing system: which characters and combination of characters should be thought of as "words"? As noted in the Introduction to this volume, there was no term in Chinese for "word" as distinct from "character" until the beginning of the twentieth century. For example, if a combination like xuexiao 'school' refers to a single "concept", shouldn't this be considered a word in spite of the fact that it contains two characters, each with its own semantic content? And related to this issue are two other issues: one is the status of such grammatical forms as le, zhe, and so on, which have properties of both affixes and clitics in other languages, and the other is the relative abundance of compounding and derivational, as opposed to inflectional, processes in Chinese. Partially in reaction to the view of Chinese as having "no morphology", several modern studies, including important contributions by Professor Packard, have addressed these questions from a variety of theoretical viewpoints. It is very much to Professor Packard's credit that he has brought this collection of articles together to provide a cross-section of research on word-formation in Chinese. 
This book gives us an in-depth look at a variety of morphological issues for Chinese, both diachronic and synchronic, including not only derivational processes, but also other types of word formation processes such as compounding and reduplication, the dramatic change in the history of Chinese from monosyllabic to bisyllabic lexical units, and analytic issues made visible by recent theoretical issues in linguistics, such as argument structure, transitivity, lexicalization, and the relationship between phonology, prosody, and morphology. By bringing to our attention this complex range of specifically word-oriented problems, this book not only fills a gap in existing descriptions of Chinese, but sets the pace for future studies in this area.

Sandra A. Thompson

Preface

In trying to understand the nature of a phenomenon, it helps if the vision can be articulated from multiple perspectives. This sort of "triangulation" - or "multiangulation" - helps provide a broader picture of the phenomenon in question, but it is also the hallmark of a discipline at a certain stage of development. Thomas Kuhn has said that a sharp increase in the number of theoretical approaches within a given area of inquiry is characteristic of the period just prior to a "paradigm shift" within that area. If that is so, then the diverse character of the papers in this volume stands as evidence that a shift within the paradigm of Chinese word formation cannot be far off. It is interesting to consider the differing contemporary views of linguistics and the language faculty as cognitive science, and how the subject of this volume — the analysis of Chinese words and their formation - fits in. At one end of the spectrum is the view that the linguistic ability we are born with that enables us to acquire language is merely a specific application of the generalized psychological principles of mental operation that govern the way we cognitively parse our world. From this perspective, how we build words is simply a particular instance of our general ability to build larger from smaller meaningful units. At the other end of the spectrum is the view that our ability to learn and use language constitutes a set of abilities or "algorithms" specifically dedicated to the language faculty, and that the linguistic subsystems (such as phonology, morphology, syntax and semantics) also represent unique, dedicated "modular" systems that share algorithms neither with each other nor with other cognitive abilities. My own bias is that within human cognition, language must surely be an instance of a specialized higher-level process involving kinds of rule abstraction and inferencing that differ from those that characterize, for example, visual perception. 
It may also be that as human linguistic ability developed phylogenetically, it not only allowed the species to achieve more complex modes of communication, but it also enabled successively more complex realms of ideation, so that in a sense the "language" of thought may have its origins in the "language" of language. It is difficult to conceive of anything other than language that has served so exclusively to make us human.


Within this context of language as the great "cognitive enabler", it takes no great leap of imagination to entertain the hypothesis that both the phylogenetic development and contemporary use of language have crucially depended upon the word as a fundamental cognitive construct. The lexicon and the words it contains arguably constitutes the only modular linguistic system via which all other linguistic systems interface: words are nexus of specified sound sequences and stable configurations of meaning, from which the creative trajectories of phrases and sentences spring forth. If linguists wish to assert that linguistic principles constitute uniquely specialized cognitive principles, then we must be willing to look for evidence of those principles across different, apparently unrelated, languages. In that spirit the present volume looks at Chinese words and their formation, with an eye toward determining if in a language with no grammatical agreement, little morphophonemic alternation and no inflection, it might still be possible to read off a plan in the human mind for the design and use of words. Our perspective is necessarily both synchronic and diachronic, for it is in both the present-day properties of complex words and the evolution of their structures over time that we can best observe the workings of a putatively universal device. My thanks to all the contributors for their forbearance in the face of unspeakable delays in compilation and production. Thanks also to Sandra Thompson for writing the Foreword, to Werner Winter for his congenial correspondence and for editing the manuscript, to Barbara E. Cohen for making the index and to my colleague C. C. Cheng for his continued support. 
The contributors to this volume have been drawn together not by geographic proximity or even agreement on what constitutes the foundations of Chinese morphology, but rather by the shared conviction that the structure and formation of Chinese words is interesting and worth investigating.

Jerome L. Packard

Contents

Foreword
Sandra A. Thompson

Preface
Jerome L. Packard

Introduction
Jerome L. Packard

Word formation in Old Chinese
William H. Baxter and Laurent Sagart

V-V Compounds in Mandarin Chinese: Argument structure and semantics
Claire Hsun-huei Chang

Syntactic, phonological, and morphological words in Chinese
John Xiang-Ling Dai

Wordhood in Chinese
San Duanmu

Prosodic structure and compound words in Classical Chinese
Shengli Feng

Chinese as a headless language in compounding morphology
Shuanfan Huang

Chinese resultative constructions and the Uniformity of Theta Assignment Hypothesis
Yafei Li

A Lexical Phonology of Mandarin Chinese
Jerome L. Packard

Cognate objects and the realization of thematic structure in Mandarin Chinese
Claudia Ross

On defining the Chinese compound word: Headedness in Chinese compounding and Chinese VR compounds
Stanley Starosta, Koenraad Kuiper, Siew-ai Ng, and Zhi-qian Wu

Index

Introduction1 Jerome L. Packard

1. Introduction

The subject of this volume is the formation and structure of complex (i. e., multimorphemic)2 words in Chinese. Within this collection is represented a range of articles that is broad both in theoretical approach and in time period, from word derivation processes in Old Chinese, to the thematic structure of modern Mandarin compounds. The collection of articles offered here serves to demonstrate the wealth and diversity of scholarship within the field of Chinese word formation. Throughout its history, the Chinese language has manifested word formation processes as disparate as phonological derivation, word splitting, contraction, overt marking of case relations and reduplication. At present many complex issues in Chinese morphology are topics of discussion, including theta assignment principles affecting words and their constituents, determination of wordhood, transitivity and the nature of word-internal arguments and even diachronic shifts in syllable and foot structure.

In this introductory chapter, I present some background information on complex word formation in Chinese, followed by a brief introduction to the papers of the contributors.

I will use chronological terminology in this chapter as follows: The period from the appearance of writing (around 1200 BC or so; Boltz 1994: 43) to the end of the Han dynasty (around 220 AD) will be termed Old Chinese (also called "Archaic Chinese" in earlier work). The term "Proto-Chinese" is used to refer to a prehistoric hypothesized language ancestral to Old Chinese, perhaps a precursor of both Tibetan and Chinese. The period starting around the beginning of the Sui dynasty (600 AD) until the end of the Song dynasty (1279 AD) is called Middle Chinese (termed "Ancient Chinese" in earlier work). The term "classical Chinese" is used generally to refer to premodern Chinese language written in the classical versus vernacular style. Modern Chinese refers to the vernacular language used since 1900. Otherwise, dates or the names of specific dynasties are used.

2. Complex words in Old Chinese

The Chinese language has produced complex words in many ways throughout its history.3 The most common word formation method (extending even to the present) has been the lexicalization of two juxtaposed words to form a single bisyllabic word. The first bisyllabic words 4 are thought to have been formed by the addition of a second syllable by some sort of phonological reduplication of all or part of a base monosyllable (see 2.2.2). But the earliest type of complex word formation in Old Chinese probably involved derivational processes of either affixation or morphophonological alternation ("derivation by phonological change") that operated on monosyllabic word bases. It is to this topic that we turn first.

2.1. One-syllable words

2.1.1. Derivation by phonological change

At the earliest stages of the Chinese language for which we have a written record, 5 words appear to have consisted mainly of one syllable, with each syllable generally corresponding to one Chinese character and one morpheme 6 (Wang 1980: 343; Norman 1988: 112; and many others). At that time, related words are thought to have been derivable by changing the consonant, vowel or tone of a base word (Karlgren 1956; Pulleyblank 1995: 10-11; Baxter and Sagart, this volume). This is comparable to treating the English verb "teethe" as derived from the noun "teeth" by changing the consonant ([tiθ] to [tið]), the verb "bleed" as derived from the noun "blood" by changing the vowel ([blʌd] to [blid]), or the verb "record" [recórd] as derived from the noun "record" [récord] by changing the tone (here, stress placement) of the word.

The most clearly documented phonological derivation process in Old Chinese is derivation by tone change, specifically derivation involving the 'going' tone category (Downer 1959; Chou 1972: 15-22; Wang 1980: 213-217; Schussler 1985; Norman 1988: 84-85; Baxter 1992: 315-317; Baxter and Sagart, this volume). In Old Chinese, there were four tone categories: píng (平) 'level', shǎng (上) 'rising', qù (去) 'going' and rù (入) 'entering'. Many nouns with level, rising or entering tones could be changed to verbs by changing the base tone to a going tone. Such noun-to-verb derivation appears to have been the most common derivation process, but verb-to-noun derivation occurred as well.7 As seen in Table 1, the tonal derivation process also involved grammatical functions other than noun and verb derivation, such as the derived form indicating a causative or transitive meaning. The examples in Table 1 are mostly from Chou (1972: 19), with some from Baxter (1992: 316), both of whom cite Downer (1959) as their source. The phonetic forms in the examples are modified from the sources cited for typographical convenience. 8

Table 1. Derivation in 'going tone'

Base word      Gloss       'Going tone' derived word   Gloss
冠 °kwan       'cap'       冠 kwan°                    'to cap'
王 °hjwang     'king'      王 hjwang°                  'be king'
飯 °biwan      'eat'       飯 biwan°                   'food'
上 °ziang      'ascend'    上 ziang°                   'above, top'
好 °xao        'pretty'    好 xao°                     'to love'
遠 °jiwan      'distant'   遠 jiwan°                   'keep distant'
受 °zieu       'receive'   授 zieu°                    'give'
買 °mai        'buy'       賣 mai°                     'sell'

In Table 1 it can also be seen that characters representing the base and derived forms usually were the same or differed only minimally, sharing the component (i.e., the "phonetic" portion of the character) that reflected their cognate relation. Although going tone derivation is the best known of the Old Chinese phonological derivation processes, there is good evidence that other feature-based phonological processes, such as changes in voicing or vowel quality, were also used to derive words (Karlgren 1956; Norman 1988: 85; Baxter 1992: 176, 218-219; Baxter and Sagart, this volume).

2.1.2. Sub-syllabic affixation

It seems likely that Old Chinese retained significant portions of a Proto-Chinese sub-syllabic morphological affixation system, 9 cognates of which may be identified in classical and modern Tibetan (see, e. g., Pulleyblank 1973 a; Schussler 1976: 61-62, 115; Bodman 1980; Baxter 1992: 176, 218-222, 324; Baxter and Sagart, this volume). According to this theory, morphologically-related single-syllable words (i.e., lexemes; perhaps in some cases the members of word families, see Karlgren 1933) were derived by the addition of prefixes, infixes and suffixes, the exact meanings of which are difficult to determine with certainty (Schussler 1976: 51-59, 75-76; Baxter and Sagart, this volume).

2.1.2.1. The written representation of sub-syllabic affixes

If the theory that Old Chinese possessed sub-syllabic affixes is correct, then such affixes would still have existed at the time when the Chinese writing system was invented, and during its early stages of development (from around 1200 to 500 BC). Under such conditions, there were various ways that the characters could have been used to represent a derived word that had been generated from a base word using a sub-syllabic affix. One way was simply to use the same character for both the base and derived words. In this situation, a single character had different pronunciations corresponding to the different (albeit related) meanings, with the reader relying on context to provide the proper reading - a common occurrence in classical Chinese (Pulleyblank 1995: 10; Baxter and Sagart, this volume; see also, e. g., Table 1). A second way was to use a character that was different from (although often graphically related to) the base character, also a common practice in classical texts (see Karlgren 1933; Boodberg 1934; Lu and Wang 1983: 80-81; Baxter and Sagart, this volume). A third way, generally overlooked but explained in detail by Boodberg as a way of writing consonant clusters (Boodberg 1937: 354-360 [1979: 388-394]), was to use two characters to represent the single-syllable derived word, in violation of the "one-character-one-syllable" principle. For example, in the case of a derived word generated from a base word by prefixation, the initial consonant of the first character would represent (or "spell") only the prefix, and the second character would represent the pronunciation of the base word. The two characters representing the prefix and base would have rhymed, in the ideal case differing only in the pronunciation of the initial consonant.
Boodberg argued that such two-character combinations (he called them "binoms") were used precisely in this way, to provide the correct reading of consonant clusters still extant in the phonologically more conservative southern dialects (Boodberg 1937 [1979] footnote 53), or in obsolescent words. 10 Quoting Boodberg (with minor changes made for typographical convenience):

Thus a *geu (descended from an original **GLeu) could be "reconstructed" into its primitive reading by affixing to it a graph read -*leu while a -*leu could serve as a basis for "reconstruction" with a prefixed graph read *geu. In both cases, we would have a binom consisting of two independent graphs *geu-*leu, the purpose of which would be not so much to represent two words *geu and *leu as to render an obsolescent *gleu. (1937: 356 [1979: 390])

Boodberg also explicitly mentions the possibility that certain members of such consonant clusters may have functioned as grammatical affixes (Boodberg 1937: 359 [1979: 393] and footnote 61). The deterioration of the Proto-Chinese sub-syllabic affixal system was presumably already well under way by the time Chinese character writing was invented. Since characters probably were not used extensively to represent such affixes (if indeed they were used at all), the lack of a way to record them in writing undoubtedly would have hastened their demise.
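Boodberg's reading of a rhyming binom as the spelling of a single cluster-initial syllable can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function names `split_syllable` and `read_binom` are hypothetical, syllables are plain romanized strings, and the rhyme is taken to begin at the first of the letters a, e, i, o, u.

```python
from typing import Optional, Tuple

VOWELS = set("aeiou")  # assumption: the rhyme starts at the first of these letters


def split_syllable(syllable: str) -> Tuple[str, str]:
    """Split a romanized syllable into (initial, rhyme) at its first vowel letter."""
    for i, ch in enumerate(syllable):
        if ch in VOWELS:
            return syllable[:i], syllable[i:]
    return syllable, ""


def read_binom(first: str, second: str) -> Optional[str]:
    """Return the cluster-initial monosyllable spelled by a rhyming binom.

    Returns None when the two graphs do not rhyme, since the binom then
    cannot be read as the spelling of one syllable under this sketch.
    """
    initial1, rhyme1 = split_syllable(first)
    initial2, rhyme2 = split_syllable(second)
    if rhyme1 != rhyme2:
        return None
    return initial1 + initial2 + rhyme1


print(read_binom("geu", "leu"))  # gleu
```

With Boodberg's example, the two graphs *geu and *leu share the rhyme -eu, so the sketch combines their initials into the single syllable *gleu.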

2.1.3. Fusion words

Another type of complex monosyllabic word in Old Chinese is what have been termed "fusion" words (Dobson 1959: 167-168; Serruys 1959: 113; Kennedy 1940 [1964 a]: 62-77; Pulleyblank 1995: 9).11 These words resulted from the contraction of two syllables, the second of which is usually an unstressed pronoun or demonstrative (Norman 1988: 85-86). Since these words came to be spoken with one syllable, they were written with a single character whose pronunciation reflected the contracted form. Table 2 provides some examples, based on Kennedy (1940 [1964 a]). Sound values have been modified for typographical convenience.

Table 2. Fusion words (contractions) in Old Chinese

First word  Gloss   Sound   Second word  Gloss   Sound   Fusion  Gloss          Sound
之          'it'    tsi     於           'at'    iwo     諸      'it at'        tsiwo
而          'and'   nzi     已           'end'   i       耳      'that's all'   nzi
不          'not'   pieu    之           'it'    ti      弗      'not it'       piuet
何          'how'   g'a     不           'not'   pu      盍      'how not'      g'ap


These words are thought to have undergone lexicalization and then contraction to form monosyllables, implying an intermediate stage of existence as bisyllabic words. This fact, together with the posited existence of sub-syllabic affixes (see 2.1.2 and note 7), implies the presence of greater numbers of multimorphemic words than generally has been recognized for Old Chinese.

2.2. Two-syllable words

With the possible exceptions of the progenitors of fusion words discussed in the previous section, and proper names (see note 4), the earliest productive creation of two-syllable words for which we have direct evidence involves the operation of phonological processes upon monosyllabic word roots. Those phonological processes were the complete or partial reduplication of monosyllabic words, and possibly the splitting of monosyllabic words into words of two syllables. The linguistic function of these phonological word formation processes may have been to produce lengthened word forms, in order to distinguish words in danger of losing their lexical identity through homophony.

2.2.1. Why did Chinese words become bisyllabic?

If we assume a fairly close correspondence between the spoken and written language in Old Chinese (but see note 5), then one of the clearest developmental changes in the history of the Chinese language has been a shift from monosyllabic to bisyllabic words, begun in earnest probably at some time during the Zhou dynasty, around 1000-700 BC (Cheng 1981 b: 44; Boltz 1994: 171; Feng, this volume). A general simplification of the Chinese phonological system12 is believed to have occurred at about the same time as the incipient shift to bisyllabism. The co-occurrence of these two phenomena in time suggests a cause-effect relationship. The question is, if there was a causal relation, in which direction did it occur? Cheng (1981 b: 57-58) argues that bisyllabism occurred first, leading to the simplification of the phonological system. According to Cheng, societal forces resulted in pressure to enlarge the lexicon, and two-syllable words were created as a means of rapidly increasing the number of words.


Cheng states that the phonological method of word formation (i. e., reduplication; see note 3, and also 2.2.2 below) was sufficient to handle the rate of vocabulary increase that occurred during an earlier, relatively moderate period of societal development. Then, as China moved through the bronze and iron ages, there was pressure to develop greater numbers of new words appropriate to an increasingly sophisticated civilization. Under these conditions, the phonological method of word formation proved inadequate to keep pace with the need to expand the lexicon. 13 In order to accommodate this need for rapid vocabulary growth, bisyllabic words were created, primarily through lexicalization of juxtaposed forms. This increase in bisyllabic words caused the phonological system to simplify, since the myriad phonological distinctions that originally served to distinguish single-syllable words (see note 12) were no longer necessary. The more traditional view (e. g., Karlgren 1923 b [1971]: 22-23; Wang 1980: 342; Li and Thompson 1981: 14; Norman 1988: 86-87, 112) is that the simplification of the phonological system initiated, rather than resulted from, the development of bisyllabic words. On this view, syllable structure simplification occurred as a natural linguistic process of phonetic attrition. This simplification caused syllables which had been phonologically distinct to become homophones. Bisyllabic words consequently arose within the language (via, e. g., lexicalization processes mentioned in 2.2.3 or reduplication processes mentioned in 2.2.2) as a means of overcoming problems in communication caused by this proliferation of homophonous monosyllabic words. Feng (this volume) offers an insightful adaptation of the latter view, arguing that phonological simplification first occurred, leaving the Chinese syllable with insufficient phonological "weight" to serve as a prosodic foot. 
This requirement for a "heavier" prosodic foot caused foot structure to shift from being monosyllabic to bisyllabic, eventually leading to the development of bisyllabic words in the language. 14

2.2.2. Reduplication of monosyllabic words

The complete or partial reduplication of monosyllabic words (Dobson 1959: 7-11; Serruys 1959: 105-164; Chou 1972: 97-201; Wang 1980: 45-48, 212, 260, 326; Cheng 1981 a: 82-89; Baxter and Sagart, this volume) is thought to have occurred as a means of distinguishing monosyllabic words which were in danger of losing their identity due to homophony.

2.2.2.1. Complete reduplication

In complete reduplication, an entire monosyllabic word (usually an adverbial, Chou 1972: 117; Norman 1988: 87) was reduplicated. The semantic function of complete reduplication was to enhance the meaning of the base word in some way. This was manifested as intensification in the case of adverbials, plural or distributive in the case of nouns, and iteration in the case of verbs (Kallgren 1958: 31). Serruys, however, notes that while complete reduplication is thought to have originated for such purposes of enhancement, it may have been functionally adopted by the language as a means of reducing homophony. Serruys (1959: 111) states that complete reduplication "is not always and not only used because of the mere need for expressiveness, but may point to a tendency toward lengthening a word form; this tendency may have been in the first instances aided by the reduplication device for expressiveness, which at the same time formed a pattern of word formation". Examples of complete reduplication are seen in Table 3, based on Dobson (1959: 7-11) and Kallgren (1958: 31), with pronunciations modified for typographical convenience.

Table 3. Complete Reduplicates in Old Chinese

Base word (pronunciation)   Meaning    Derived form (pronunciation)   Derived meaning
chyr                        'slow'     chyr-chyr                      'by and by'
siao                        'little'   siao-siao                      'very little'
ch'u                        'place'    ch'u-ch'u                      'everywhere'
tah                         'talk'     tah-tah                        'ramble'

2.2.2.2. Partial reduplication

In partial reduplication (雙聲疊韻 'double consonant, duplicate rhyme'), the initial consonant, or all or part of the syllable rhyme, is reduplicated. The reduplication was sometimes phonologically strict, but could also be quite loose, with many intermediate values apparently permitted. Partial reduplication is thought to have been much more productive than complete reduplication as a word formation process (Wang 1980: 47; Hu 1923: 30-31). Hu, for example, states that partial reduplication of a base syllable was a primary method of word formation, expanding one syllable into two for purposes of clarity. Hu notes that "[s]o far as we can imagine, partial duplication of syllables was a procedure used at the earliest times of our Chinese language. In the beginning, whether for clarification or for repetition, bisyllables formed by partial duplication were used to express one word" (my translation of Hu 1923: 30-31, cited in Wang 1972: 47). Examples of phonologically strict partial reduplication follow in Table 4, based on Chou (1972: 133-137; original classical sources cited therein). Some translations are based on Mathews (1945). Pronunciations are modified for typographical convenience.

Table 4. Partial reduplicates in Old Chinese

Reduplicated segment   Base word (pronunciation)   Gloss       Derived form (pronunciation)   Gloss              Meaning
Initial                kiet                        'toil'      kiet-kjag                      toil-disabled      'hampered'
Initial                ljet                        'cold'      ljet-ljat                      cold-fierce        'cold wind'
Initial                sjet                        'cricket'   sjet-sjuet                     cricket-cricket    'cricket'
Vowel or ending        sjog                        'wander'    sjog-gjog                      wander-distant     'leisure'
Vowel or ending        buan                        'reject'    buan-huan                      reject-help        'follow one's fancy'
Vowel or ending        bjuag                       'support'   bjuag-suag                     support-plant      'mulberry'

The reader will note that sometimes the individual members of the reduplicated word are glossed as having different, unrelated meanings and sometimes they are glossed as synonyms. The difference depends (at least in part) upon whether a homophonous (but unrelated) character was borrowed to write the new, reduplicated syllable, or whether a new character was created expressly for that purpose (using the "phonetic compound" principle). This is discussed further in 2.2.4 below.
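The three reduplication patterns discussed above (complete, initial-sharing and rhyme-sharing) can be told apart mechanically on the simplified transcriptions used in Tables 3 and 4. The following Python sketch is a rough heuristic, not a phonological analysis: the function names are hypothetical, the rhyme is assumed to begin at the first of the letters a, e, i, o, u, and the initial is approximated by the first letter only.

```python
VOWELS = set("aeiou")  # assumption: the rhyme starts at the first of these letters


def rhyme_of(syllable: str) -> str:
    """Return the part of a romanized syllable from its first vowel letter onward."""
    for i, ch in enumerate(syllable):
        if ch in VOWELS:
            return syllable[i:]
    return ""


def classify_reduplication(form: str) -> str:
    """Classify a hyphenated two-syllable form by what its syllables share."""
    first, second = form.split("-")
    if first == second:
        return "complete"
    rhyme1, rhyme2 = rhyme_of(first), rhyme_of(second)
    if rhyme1 and rhyme1 == rhyme2:
        return "partial (rhyme)"
    if first[0] == second[0]:
        return "partial (initial)"
    return "unclassified"


# Forms taken from the pronunciations given in Tables 3 and 4.
for form in ["chyr-chyr", "kiet-kjag", "ljet-ljat", "sjog-gjog", "buan-huan"]:
    print(form, "->", classify_reduplication(form))
```

Because the text notes that reduplication could be quite loose, with many intermediate values permitted, a strict equality test of this kind would only capture the phonologically strict cases.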


2.2.2.2.1. Divided monosyllabic words

Serruys (1959; see also Boodberg 1937 [1979]) has argued15 that many apparent partial reduplications actually represent the splitting (Boodberg's term is "dimidiation") of one-syllable words into two syllables in order to compensate for the loss of phonological contrast that occurred when Old Chinese consonant clusters (see note 12) underwent phonological simplification. On this view, the initial of each syllable in the new two-syllable form contained a part of the original consonant cluster, along with the original syllable rhyme.16 As seen in the two examples in Table 5 (adapted respectively from Boodberg and Serruys), the first consonant of the cluster was represented by the initial consonant of the first member of the pair, and the second consonant of the cluster was represented by the initial consonant of the other member, with both members possessing the original rhyme.

Table 5. Word splitting in Old Chinese

One-syllable form (pronunciation)   Two-syllable form (pronunciation)   Gloss
gleu                                geu-leu                             'hunch-backed'
dzlied                              dz'ied-lied                         'Tribulus'

Serruys states:

While in the process of continuous development of the sounds toward more reduced and simplified word forms, the loss of some of the clusters caused these words to be less easily identified or separated from others, [sic] a natural reaction was started against this ensuing homophony ... But as this reaction against the extreme reduction of word forms must have set in slowly, at the same slow tempo as in the disappearance of the elements in the initial clusters which caused the homophony, it is clear that the knowledge about the more archaic pronunciation was not lost at once. It is therefore natural that some enlargements of words should be based on the knowledge of the lost elements of the older cluster in the initial. The process of word enlargement then goes together with an attempt to reconstruct the lost elements of the cluster, but in a new pattern of word structure, by means of a binomial form. (Serruys 1959: 105-106)
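The splitting pattern described for Table 5 can be stated as a one-line rule: each consonant of the original cluster becomes the initial of its own syllable, and both syllables keep the original rhyme. A minimal Python sketch follows, assuming the cluster and rhyme are already segmented; the function name `split_word` and its parameters are hypothetical.

```python
def split_word(first_consonant: str, second_consonant: str, rhyme: str) -> str:
    """Split a cluster-initial monosyllable into a two-syllable form.

    Each consonant of the cluster heads its own syllable; both syllables
    carry the original rhyme.
    """
    return f"{first_consonant}{rhyme}-{second_consonant}{rhyme}"


# Boodberg's example from Table 5: gleu is segmented as g + l + eu.
print(split_word("g", "l", "eu"))  # geu-leu
```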

We observe that word splitting is a form of partial reduplication, since both processes involve repeating part of a base syllable. The difference between the two hinges upon whether the repetition process represents an attempt to retain the specific phonological contrast provided by the original consonant clusters (as has been argued in the case of word splitting).
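The splitting pattern described for Table 5 can be restated as a small procedure. The Python sketch below is only an illustration under simplifying assumptions: the function name `split_word`, the representation of a syllable as a two-consonant initial cluster plus a rhyme, and the plain-letter transcription are hypothetical conveniences, not part of Boodberg's or Serruys's analyses.

```python
def split_word(cluster, rhyme):
    """Split a one-syllable form into a two-syllable form.

    cluster: the two consonants of the original initial cluster (assumed
             to be exactly two, as in the Table 5 examples)
    rhyme:   the original syllable rhyme, shared by both new syllables
    """
    if len(cluster) != 2:
        raise ValueError("this sketch only handles two-consonant clusters")
    first, second = cluster
    # Each new syllable takes one consonant of the cluster as its initial
    # and keeps the original rhyme.
    return f"{first}{rhyme}-{second}{rhyme}"


print(split_word(("g", "l"), "eu"))  # geu-leu
```

The sketch captures only the segmental pattern; it says nothing about why speakers would have split some words and not others, which is the substance of Serruys's argument.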

2.2.3. Lexicalized forms Lexicalization here refers to the combining of two forms (at least one of which is a free word) that occur together in context to create a larger word. Lexicalization had become a primary word formation device certainly by the Han Dynasty (Cheng 1981 a; Feng, this volume), and must be considered the most enduring means of word formation in the history of the Chinese language. Lexicalization may be broken down into two types, syllabic affixation and compounding, with the distinction mostly based upon whether the word constituents are bound or free.

2.2.3.1. Syllabic affixes For Old Chinese, a syllabic affix may be defined as a syllabic morpheme that: (a) is bound, (b) is productive in forming words, and (c) has a "grammatical" or "functional" rather than "content" meaning. I follow Cheng (1981 a: 89-94) in assuming that this type of affixation arose in Chinese through the lexicalization of function words. 17 In Old Chinese, syllabic affixation involved various grammatical form class categories. Specific to nouns, yǒu- (有) and yú- (於) (meaning respectively 'have, exist' and 'vis-a-vis' elsewhere in classical Chinese) were prefixes used to indicate the name of a person, place, country or tribe (Wang 1980: 219; Cheng 1981 a: 89-90). By the time of the Warring States period (403-221 BC), this class of prefixes had all but disappeared (Wang 1980: 221). At the time of Old Chinese, the well-known nominalizing suffixes -zhě (者) and -zǐ (子) had not yet become productive (Cheng 1981 a: 93). Some examples of verbal prefixes are the words yuán- (爰; alternate meaning: 'whereupon'), yuē- (曰; alternate meaning: 'say') and yán- (言; alternate meaning: 'speech'), which have been posited to be allomorphic variants of a single grammatical prefix that indicated verbalization (Wang 1980: 299-300; Cheng 1981 a: 91). Examples of verbal suffixes are -zhǐ (止; alternate meaning: 'stop'), which indicated past or completed action, and -dé (得; alternate meaning: 'obtain'), which indicated result, especially a result involving attainment (Wang 1980: 301-302). Regarding adjectives, according to Cheng (1981 a: 91), the prefixes sī- (斯), sī- (思), yǒu- (有), qí- (其), and suffixes such as -sī (斯), -qí (其), -rú (如) and -rán (然) were all affixes that turned adjectives into adverbial modifiers, thus performing the same function as adjectival reduplication (see section 2.2.2).

2.2.3.2. Compounds Compounds are words formed by joining two free words. Although they are often subject to vague definition in modern Chinese (see, e. g., Starosta et al., and Duanmu, this volume), they are relatively easy to define in classical Chinese. This is because the large class of word formatives that complicates defining compound in the modern language (viz., bound roots: those morphemes that represent "content" rather than "grammatical" information, and do not occur as free words; see discussion of affixation and bound roots on page 17 in section 3.1, and in note 30) is relatively small in the classical language. In general, a compound in classical Chinese has one of three types of structure. It may have a parallel structure, in which neither member dominates the other; it may have a non-parallel structure, in which one member modifies and is subordinate to the other; or it may be what can be called a "one-sided" compound, 18 in which the entire compound takes on the meaning of only one of its members, effectively losing the meaning of the other member. In a parallel compound, neither member is subordinate to the other. Rather, the members are "sisters", occurring at the same level of grammatical structure. The two members usually have similar, related or opposite meanings. When the meanings are similar, the formed word usually stands for that portion of meaning which is common to both members (as in Table 6, A—C). When the meanings are related, the formed word often has an extended meaning associated with the related aspect of the two members (as in Table 6, D—F). When the meanings are opposite, the word formed by the members usually stands for a superordinate category, with the two members representing the polar values of that category (as in Table 6, G - I ) .

Table 6. Parallel compounds (from Zhou 1981 [1983]: 244-245; Cheng 1981 a: 72)

     Similar, related  Compound  Pronunciation  Gloss         Meaning
     or opposite                 (modern)
A)   Similar           道路      dào-lù         'road'        way–path
B)   Similar           呼吸      hū-xī          'breathe'     inhale–exhale
C)   Similar           艱難      jiān-nán       'difficulty'  difficult–hard
D)   Related           骨肉      gǔ-ròu         'kin'         bone–flesh
E)   Related           宇宙      yǔ-zhòu        'universe'    space–time
F)   Related           婚姻      hūn-yīn        'marriage'    female home–male home
G)   Opposite          禍福      huò-fú         'fate'        calamity–blessing
H)   Opposite          好壞      hǎo-huài       'quality'     good–bad
I)   Opposite          輕重      qīng-zhòng     'weight'      light–heavy

In subordinating compounds, one member (the modifier) is subordinate to and modifies the other (the head). The subordination relation may be of the modifier-head type, in which the subordinate, modifying element precedes the element that is modified (as in A-C in Table 7). Or it may be of a head-modifier structure as in the case of verb-object compounds (as in D-F in Table 7), assuming that in syntactic structure a verb acts as head and an object is subordinate (to my knowledge, there are no instances of other types of subordinating compounds in the ancient language in which the head precedes the modifier). 19 The following examples are from Cheng (1981 a) and Zhou (1981 [1983]).

Table 7. Subordinating compounds

     Subordination   Compound  Pronunciation  Gloss         Meaning
     relation                  (modern)
A)   Modifier-head   天子      tiān-zǐ        'emperor'     heaven–son
B)   Modifier-head   百姓      bǎi-xìng       'the masses'  hundred–surname
C)   Modifier-head   少年      shào-nián      'youth'       few–year
D)   Verb-object     將軍      jiāng-jūn      'general'     lead–troops
E)   Verb-object     執事      zhí-shì        'manager'     manage–affair
F)   Verb-object     司南      sī-nán         'compass'     control–south

In a one-sided compound, the entire compound takes on the meaning of only one of the members, and the meaning of the other member is lost. As an example, during the Western Zhōu (1066-771 BC) dynasty the words guó 國 ('kingdom') and jiā 家 ('home') were often used together as guójiā 國家, a general term referring to areas under the control of various officials (see Feng, this volume, example 23). During the Period of Warring States (403-221 BC), the words guó 國 and jiā 家 retained the basic meanings of 'country' and 'home' respectively, but were used together to mean 'country'. This may be seen in numerous examples from the classical works Mèngzǐ, Zuǒzhuàn and Hán Fēizǐ (Cheng 1981 a: 69). By that time, the meaning of jiā as 'home' had disappeared from the meaning of the complex word guójiā, with jiā at that point contributing only its phonological form to the gestalt word. 20 As another example, during the Hàn dynasty the individual members of the word lìhài 利害 meant 'benefit' and 'harm' respectively, but were nonetheless often used together as a word meaning only 'harm' (Zhou 1981 [1983]: 245).

2.2.4. Problems in analyzing Old Chinese complex words Although I have presented clear examples of phenomena such as affixation, and have characterized bisyllabic words as neatly belonging to discrete categories such as "reduplicates" and "compounds", in reality the data are often less than entirely straightforward. Consider, for example, the problem of "unanalyzable" bisyllabic words. In ancient texts there is a type of bisyllabic word that is considered not to be further analyzable into independent single-syllable constituents, that is, it can be considered neither a reduplication, an affixed form nor a compound. The word cannot be considered an affixed form or a compound because it cannot be divided into two clearly isolable morphemes (i. e., the independent occurrence of neither member of the bisyllable has been attested), and it cannot be considered a reduplicated form because there is no evidence of phonological similarity between the two syllables. With evidence neither of independent occurrence of one of the members elsewhere, nor of phonological similarity between the two syllables,21 we are left with the assumption that the form in question is a two-syllable, monomorphemic word. An example of just such a word is the word for 'butterfly' (modern Mandarin: húdié). Neither individual member of the word has ever been attested in isolation in classical texts (Kennedy 1955 [1964 b]: 289), and the two syllables have no apparent phonological similarity: *yuo and *d'iep respectively, according to Karlgren's reconstruction of Old Chinese

(Karlgren 1923 a [1974]: 59, 93). Given the nature of our evidence, the best explanation seems to be that the word for 'butterfly', whatever its origin, 22 is simply a single morpheme that consists of two syllables (Kennedy 1955 [1964 b]).23 Another anomaly is the general nature of partial reduplication discussed in section 2.2.2.2. There is no problem if the character representing the reduplicated syllable was borrowed (an unrelated, non-synonymous character, borrowed for its sound alone) or created (using, e. g., the "phonetic compound" method). If a borrowed character had been used to represent the reduplicated syllable, this would be evident from the fact that the character would have occurred previously in a different context and with a different meaning. If the character had been created, this would be evident from the structure of the character (it would be a "phonetic compound") and because it would never have been seen before. The anomaly arises when the characters used to represent the reduplicated form have been attested as independent, synonymous words prior to the appearance of the reduplicated form. Since a bisyllabic word whose individual members have already occurred freely elsewhere is a compound (see 2.2.3.1), the question is how it is possible for a word to appear to be at once both a compound and a reduplication. In fact, there are long lists of such words (e.g., Chou 1972: 163-179). This is somewhat of a puzzle, because it implies that speakers of Old Chinese formed new words by combining free words that were both phonologically and semantically similar, a possibility that seems rather unlikely. There are several possible explanations for this phenomenon. For example, we could consider the putative prior occurrence of the individual members of the compound to be a fallacy. In other words, we could presume that the reduplicated form was actually created first, but that it has not been attested in extant textual sources. 
Another possible explanation for some of these forms is that the two characters were used to "spell" a consonant cluster (possibly representing sub-syllabic affixation, as explained in 2.1.2 above). In such a situation, Boodberg notes, the characters would have been specifically selected to be similar in their pronunciation, graphic form and meaning. 24 Finally, it may well be that, however unlikely it may seem, speakers of Old Chinese did indeed form words productively by joining free words that shared both semantic and phonological properties (as can be argued, for example, in the case of English, the nouns tip-top, chit-chat and teeter-totter; the verbs flip-flop, jingle-jangle and tittle-tattle; and the adjectives itty-bitty, teeny-weeny, slam-bang and slapdash).
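The reasoning in this section amounts to a two-question test applied to each bisyllabic word: has a member been attested independently, and are the two syllables phonologically similar? The Python sketch below lays the four outcomes side by side. It is a hypothetical illustration only; the function name, the boolean inputs and the category labels are assumptions introduced here, and in practice both questions require philological judgment rather than a yes/no flag.

```python
def classify_bisyllable(member_attested_elsewhere, syllables_similar):
    """Sort a bisyllabic word into the categories discussed in section 2.2.4."""
    if member_attested_elsewhere and syllables_similar:
        # The anomalous case: the word looks like both at once.
        return "compound or reduplication (ambiguous)"
    if member_attested_elsewhere:
        return "compound or affixed form"
    if syllables_similar:
        return "reduplication"
    # Neither kind of evidence: a two-syllable, single-morpheme word.
    return "monomorphemic"


# The 'butterfly' word: no member attested alone, no phonological similarity.
print(classify_bisyllable(False, False))  # monomorphemic
```

The first branch is the puzzle raised above: the evidence that normally separates compounds from reduplications points in both directions, which is why additional explanations (unattested earlier forms, cluster "spelling", or genuinely productive rhyming compounds) have to be considered.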

3. Modern views of complex words Let us now take a look at how scholars of Chinese language have analyzed complex Chinese words during the modern era. 25 Discussion of Chinese word formation in this century has involved issues such as the nature of the relationship between gestalt complex words and the properties of the constituents that form them, whether or not Modern Chinese has a system of affixation, and the connection between word-formation principles and the rules of sentence grammar.

3.1. The character-affix period When linguists think of word formation, concepts such as "morpheme", "affixation", "inflection" and "compounding" typically come to mind, with the concept of the word itself usually tacitly assumed. The existence in Chinese of a term equivalent to word (i. e., cí 詞) is, however, a relatively recent phenomenon. The first use of the term to refer unambiguously to words as distinct from characters or phrases 26 was by Zhang (1907 [1928]: 1; Pan-Yip-Han 1993: 100). Since there was no term for "word" at the turn of the century, the written character (zì 字) served as the basis for discussing virtually all linguistic aspects of Chinese at that time. In fact, even though Zhang had used the term cí in 1907 to distinguish written from spoken language, written characters continued to be considered the basic units out of which complex Chinese words were formed until the 1940s. 27 A primary example of this focus on characters is the first analysis of Chinese grammar based on Western linguistic principles, Ma's Grammar of Language (Ma 1898 [1983]; Mǎshì Wéntōng, 馬氏文通), which classified characters according to their form class identity, using terms such as 'noun-characters' (míngzì 名字), 'verb-characters' (dòngzì 動字) and 'adverbial-characters' (zhuàngzì 狀字). The role of character orthography aside, at the turn of the century the analysis of Modern Chinese words tended toward affix-based analyses, following Western works on affix-laden languages such as Sanskrit and Latin. This influence was apparent in the works of Chinese linguists of the early 1920s such as Hu Shih (Hu 1920), who first translated the term suffix, 28 calling it 'language tail' (yǔwěi 語尾; Pan-Yip-Han 1993: 61). Li Jinxi's New Grammar of the National Language (Li 1924) heralded a definitive move away from use of the concept "affix" for Chinese. This concept, however, arguably had only limited relevance for Modern Chinese to begin with. In Western linguistics, affixes are seen as bound morphemes that are productive in forming words, are "grammatical" (i. e., functional) rather than "lexical" (i. e., content) in nature, are members of a substitution paradigm, and are usually classified as either inflectional (if they mark grammatical relations or agreement and do not change the form class of a word) or derivational (if they derive words with a new form class). In Modern Chinese, there are no true inflectional affixes as defined above. That is, Chinese has no word components that vary as members of a paradigm and mark grammatical values such as case, number, tense, and gender. 29 Also, unlike "typical" affixing languages, Chinese has a large class of morphemes (which we may call "bound roots") that possess certain affixal properties (namely, they are bound and productive in forming words), but encode lexical rather than grammatical information, and furthermore may occur as either the left- or right-hand component of a word. 30 For example, the morpheme lì 力 'strength, power' can be used as either the first morpheme (e. g., lìliàng 力量 power-capacity 'physical strength'), or the second morpheme (e. g., quánlì 權力 authority-power 'rights') of a bisyllabic word, but cannot occur in isolation. 31 The class of word formatives with these characteristics, in, e. g., English, is virtually nonexistent. 32

3.2. The sentence-grammar period Thus, the affix-based approach soon gave way to word order and other sentential principles as a way of explaining Chinese word structure, as Chinese linguistics began a focus on sentence grammar. This began with Li's New Grammar (Li 1924), which was the first grammar of Chinese to deal with the sentence as a basic unit. That work was the advent of a long period (culminating with the increasing interest in affixation in the 1950s) of focus on sentential word relations, 33 during which Chinese word formation, to the extent that it was dealt with at all, was seen as deriving from properties of phrases and sentences. Chinese linguists generally avoided the term "affix" from the 1920s through the 1940s, with several scholars explicitly rejecting its application to Chinese. There were, however, two major exceptions to this "reject-affixes" view. Qu Qiubai (Qu 1931) based an entire Chinese morphological system on affixation in his work Research in Common Chinese Words. Qu used the term 'character-root' (zìgēn 字根) for characters which formed new words (his term for word was zìyǎn 'character-eye') in combination with other characters, and used the term 'character-tail' (zìwěi 字尾) for characters that preserved the basic identity of the word to which they attached (reminiscent of derivational and inflectional affixation, respectively). The second exception was Chen Gang (Chen 1946), who not only asserted the existence of affixes in Chinese, but also offered explicit proof that Chinese had both inflectional and derivational affixes (Pan-Yip-Han 1993: 63). Two other general trends in Chinese word formation during this period were the deemphasis of the constituted word's gestalt identity (even though it was implicit in virtually all analyses), and the use of "relational" notions (rather than syntactic form class or structural notions) to characterize the internal properties of words. Liao Shuqian (Liao 1946) was a clear exception to the former trend, explicitly stating that the syntactic distributional properties (cíxìng 詞性) of the constituted word are more important than those of the elements that make it up (Pan-Yip-Han 1993: 39). Xia's (1946) Methods of Composing Two-character Words is a good example of the latter trend, using terms such as 'cause and effect' (yīn guǒ 因果), 'meaning delimiting' (xiàn yì), and 'modifying' (fù zhuàng) to describe word constituents. Chao (1948 [1972]) was an exception to the latter trend, being the first to extensively apply syntactic structural concepts (such as subject-predicate, verb-object, and verb-complement) to word formation.

3.3. The inflection period In China, the 1950s ushered in a thirty-year period during which the climate favored applying the concept of inflectional morphology to Chinese words (Pan-Yip-Han 1993: 70-77). This was due in part to close cooperation between PRC and Soviet linguists, who felt that considering Chinese to be a monosyllabic language without inflection was tantamount to considering it to be a "simple" or "low-level" language (Pan-Yip-Han 1993: 71; Wang 1980: 343). Ironically, the major work on Chinese word formation in China during that period, Lu Zhiwei's Chinese Word Formation (Lu 1964), was different from the affix-based morphology works that were prevalent in China at that time. Lu's analysis included a listing of virtually every combination of form class elements 34 that could form bimorphemic words, classified by the form class identity of the constituted word. Lu's work can also be considered the first transformational-generative analysis of Chinese word formation, since he described word structures using generative phrase structure expansion rules. Lu's treatment was a milestone in the study of Chinese word formation because of his use of expansion rules, and because he emphasized the form class identity of both the constituted word and that of its constituents.

3.4. The syntax period The focus on inflectional morphology that occurred during the period of the 1950s through the end of the 1970s in the PRC went largely unnoticed in the West, partly because of China's extreme isolation from the West during that period, and partly because of the focus on syntax in linguistics which occurred from the 1960s (with the advent of Chomsky's transformational-generative grammar) through the 1980s, notably in the works of William S-Y. Wang (Wang 1963, 1965, 1967), Anne Y. Hashimoto (Hashimoto 1965), Robert Cheng (Cheng 1966), Shuanfan Huang (Huang 1966), James Tai (Tai 1969), Chauncey Chu (Chu 1970), Ying-che Li (Li 1972), Shou-hsin Teng (Teng 1973), Sandra Thompson (Thompson 1972 a), Charles T-C. Tang (Tang 1977), Feng-Fu Tsao (1978) and James C-T. Huang (1982). During that time there was a heightened awareness of the syntactic properties of the gestalt word, with a cloak veritably draped over the identities and properties of word constituents. Linguists working on Chinese did not pay much attention to the internal structure of words, and what analyses of words that were offered were generally couched in the framework of syntactic analysis. There were three major exceptions to the general lack of attention given to word structure during that time: Chao (1968), Kratochvil (1968) and Thompson (1973 b). Y. R. Chao's A Grammar of Spoken Chinese (Chao 1968) devoted several chapters to the properties of words and their constituents, including a chapter on the combinatory possibilities of various types of morphemes, and another 120-page chapter on compound formation. In that work, Chao discussed different ways that words may be defined in Chinese (Chao 1968: 136—147), and introduced the concept of a morpheme being free or bound in a specified position within the word (Chao 1968: 136-138, 146-147). 35 Chao also introduced the concept "versatile/restricted" in describing the range of word-internal environments within

which a morpheme may occur (Chao 1968: 155-156). This monumental work must be considered the best general descriptive grammar of Mandarin produced to date, and is still one of the most comprehensive treatments of Chinese word formation. Kratochvil (1968) contained a chapter on word formation, including a detailed description of word constituents and word types. Kratochvil posited root morphemes and affixes as word constituents, and for word types he posited simple words and compounds. Root morphemes were morphemes with lexical content that could be either free or bound (the bound ones correspond to my "bound roots"; see discussion in section 3.1, and also in note 30), while affixes were bound and grammatical. Kratochvil defined simple words as words that contained one root morpheme (possibly also containing an affix), and compounds as words with two or more root morphemes. Thompson (1973 b) is an important work during this period because it proposed deriving a large and important class of complex verbs (resultative verb compounds) using lexical rather than syntactic rules. In that article, Thompson provided a systematic analysis of resultative compounds, including the different properties of such words which are dependent upon their internal structure. Li and Thompson (1981) also devoted a large section of their general Mandarin grammar to word formation. Their treatment was analytical as well as descriptive, providing a comprehensive account of the possible complex words types occurring in each of the major form class categories.
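Kratochvil's two word types, as summarized above, depend only on how many root morphemes a word contains. A minimal Python sketch of that definition follows; the function name and the encoding of constituents as the strings "root" and "affix" are assumptions made for illustration, not Kratochvil's notation.

```python
def kratochvil_word_type(constituents):
    """Classify a word from a list of constituent kinds ("root" or "affix")."""
    unknown = [c for c in constituents if c not in ("root", "affix")]
    if unknown:
        raise ValueError(f"unknown constituent kinds: {unknown}")
    roots = constituents.count("root")
    if roots == 0:
        raise ValueError("a word needs at least one root morpheme")
    # One root (with or without an affix) is a simple word;
    # two or more roots make a compound.
    return "simple word" if roots == 1 else "compound"


print(kratochvil_word_type(["root", "affix"]))  # simple word
print(kratochvil_word_type(["root", "root"]))   # compound
```

Note that the free or bound status of a root plays no role in this classification, which is why words built from bound roots still count as compounds on this definition.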

3.5. The contemporary period Over the past several years, following general trends in the field of linguistics, there has been an upsurge of interest in Chinese morphology. In general, there have been two primary topics of discussion in Chinese word formation. The first is how the syntactic and other distributional properties of complex Chinese words may be seen as a function of the properties of their individual components, and the second is the relationship between rules of word structure and those of syntactic structure. As an example of the first topic, Claudia Ross (1990) has shown how properties of Chinese words may be viewed as a convergence of information involving universal semantic and syntactic features of word components. In so doing, Ross addresses questions such as which simple verbs can act as resultative verb heads and which can act as endings, what

semantic properties determine the compatibility of heads and also how the argument structures of complex resultative verbs are related to the argument structures of the individual constituents. In a precise and complete analysis, Yafei Li (Li 1990) also shows how the thematic structures of individual complex verb components directly determine the thematic properties of the complex verbs themselves. The second major topic of discussion in Chinese morphology has been the relationship between the rules of word formation and the rules of syntax. Some scholars argue that Chinese has no independent system of word formation, and that all word formation is accomplished using rules of syntax. Others maintain that Chinese words are formed using independent morphological principles that are specific to the word formation component of grammar. 36 In this regard it would appear that Chinese, as a premier example of an isolating language, would be a natural candidate language to offer evidence of the unity of morphology and syntax (Dai 1992: 55, 285). This is because both sentence components and word components are clearly isolable in Chinese, thus potentially providing an optimal view into the nature of their interrelationship. In discussing the relation between morphological and syntactic rules, there are two subtopics. The first is whether complex words in Chinese are "listed" in the lexicon or whether they are generated in an "on-line" manner relying on grammatical rules, as is held to be the case for sentences. The second issue is, if complex words are in fact generated, whether they are generated using "lexical" or "syntactic" rules. The question of lexical listing versus on-line generation of complex Chinese words is discussed in the work of Dai (1992), Sproat and Shih (Sproat—Shih 1996) and Packard (1998). 
According to Dai, words in Chinese are constructed using productive rules only when the word formatives are, in essence, either free words or affixes (Dai 1992: 59—63). According to Dai, the vast majority of complex words (viz., those composed of the bound roots discussed in section 3.1 and in note 30), are "listed" in the lexicon rather than generated by rule (Dai 1992: 63). Sproat and Shih, on the other hand, propose that complex words37 in Chinese are constructed using a fully productive rule system. In Packard (1998), I propose that only grammatical words (i. e., words containing a grammatical affix) are generated by rule. The issue of lexical versus syntactic rules in the generation of words is exemplified in the controversy regarding the formation of verb-object structures. Here we have a rather blurry distinction between verb—object "compounds" (formed by lexical rules) and verb-object "phrases"

(formed by syntactic rules). The best treatments of this problem are by Chao (1968: 359-367), Huang (1984) and Chi (1985), each of whom offers criteria to distinguish "compounds" from "phrases". The criteria differ somewhat among the three investigators, but in general rely upon compositionality of meaning, whether the components can be manipulated or separated, and whether the verb-object structure itself can take an external object. Duanmu (this volume) provides some interesting insights into this issue. Perhaps the strongest and best-supported arguments in favor of the "Chinese-morphology-as-syntax" position have been presented by Charles T-C. Tang (e. g., Tang 1993, 1995), who argues that word formation rules in Chinese consist simply of a subset of syntactic rules.

4. Papers in this volume Many of the issues discussed in this introduction are revisited in greater detail by the individual contributors to this volume, along with some new issues and approaches in the analysis of Chinese words.38 William Baxter and Laurent Sagart begin the volume with a synopsis of morphological processes in Old Chinese, for which they provide wonderfully detailed phonological and textual evidence. Baxter and Sagart also present some of their new research in this area, and offer a critical and enlightening discussion of the very notion word itself. In her contribution, Claire Hsun-huei Chang argues that, contra Lieber (1983), compound verbs do not have a uniform argument structure. Using data from Mandarin resultative verbs, Chang shows that it is necessary to assume multiple argument structures for such verbs that are satisfied both within and outside the compound. Chang argues that the multiplicity of thematic relations possible in Mandarin resultative verb compounds necessitates a richer thematic structure than the simple universal argument structure proposed by Lieber. John Dai's paper deals with the different notions of word, and how the concept of words defined in phonological, morphological and syntactic terms applies to words in Chinese. In so doing, Dai demonstrates that these various concepts of word have validity for Chinese, just as they have been found to be valid for other languages. In discussing the notion of wordhood in Chinese, San Duanmu provides converging metrical and tonal evidence to demonstrate a clear distinction in Chinese between words and phrases. Duanmu revisits much

of the original evidence for word versus phrase, accepting certain of the previously-proposed criteria supported by his own tone and prosody evidence from the Shanghai and Mandarin dialects. Duanmu states that phonological, morpho-syntactic and semantic evidence can all be used to draw the word-phrase distinction, with one type of evidence filling in gaps when other types come up short. Shengli Feng's paper on the origin and development of bisyllabic words in Old Chinese argues that disyllabicity developed in Chinese as a result of prosodic factors. Feng argues that the phonological simplification of the syllable in Old Chinese (see note 12) resulted in the syllable having insufficient weight for it to bear a monosyllabic foot structure. When that happened, according to Feng, the language compensated by evolving a bisyllabic prosodic foot structure, leading, in concert with syntactic factors, to the development of bisyllabic words. Shuanfan Huang argues in his paper that the nature of Chinese morphology stands as proof that words in the languages of the world need not be considered obligatorily headed. Huang argues that the assumption that every natural language makes use of the concept of lexical headedness in constructing morphologically complex entities is highly questionable. Huang provides a statistical breakdown of the distribution of putative lexical heads for Hakka, Mandarin and Taiwanese as evidence that the issue of headedness in Chinese compounds is at best a moot question. Yafei Li examines two semantically identical resultative verb constructions - one syntactic and one morphological - and demonstrates how their semantic identity is not matched by a structural identity at D-structure, as would be expected under the Uniformity of Theta Assignment Hypothesis (UTAH). Based on this evidence, Li argues that the morphological resultative must be formed in the lexicon, and therefore that morphology exists as an autonomous component independent of syntax. 
My own contribution to this volume is the phonological component of a stratum-ordered morphological analysis of the Mandarin lexicon. The original paper, a Lexical Phonology and Morphology treatment of Mandarin complex word formation (Packard 1990), dealt solely with the cooccurrence restrictions and disjunctive orderings of morphemes in the different classes of Mandarin words. The present paper accounts for the distribution of neutral tone and tone sandhi rules in complex Mandarin words by the judicious distribution of phonological information among the four lexical strata I have posited for the Mandarin lexicon. Claudia Ross examines the properties of activity verbs, which are simplex, monomorphemic verbs in many languages, but are morphologically


complex in Mandarin, requiring semantically weak or empty "cognate" objects as part of their compound-like lexical composition. Using Jackendoff's (1987) framework of thematic structure, Ross argues that the required specification for cognate objects by these verbs is due to a thematic role assignment hierarchy that gives special status to the role of Theme in Mandarin, and that inherent aspectual properties of the verb determine the assignment of its thematic roles. According to Ross, the semantic content of cognate objects is entirely predictable, inhering in the semantic specification of the verb. The paper by Stanley Starosta, Koenraad Kuiper, Siew-ai Ng and Zhiqian Wu argues that traditional definitions of compound in Chinese make it difficult to provide satisfactory explanations for certain word formation phenomena. They caution against overreliance on etymological factors in defining terms such as word and compound, and advocate the use of standard linguistic criteria to make such definitions more generally applicable. The authors rely on evidence from word headedness and resultative verb compounds to offer a precise definition of the term "compound" as it applies to Chinese. The contributors to this volume have been motivated by the conviction that Chinese, although a language with no apparent system of morphological agreement or inflection, nonetheless evinces word formation properties that contribute a good deal to our understanding of linguistic morphology. We hope that after sampling some of the works offered here, the reader will come to the same conclusion.

Notes

1. I would like to thank Martha Wang Gallagher, Xiangqing Cheng, Shengli Feng, Juan Cheng, Wang Jiahui, Bill Baxter, Jia Liu, Robert Xingchen Ye, Yu Shen, Yu-chiao Peng Longenecker, Xiaolin Hu, Shu-Fen Chen, Sandra Thompson, Patricia Ebrey and C. C. Cheng for helpful discussion or feedback in writing this introduction. I do not wish to imply that any of the aforementioned scholars necessarily agrees with what I have said here; I alone bear responsibility for the content.
2. Some of the words considered here consist of a single morpheme, but are considered complex nonetheless since they demonstrate productive morphological processes of derivation from a base form (see section 2.1).
3. Cheng (1981b: 81-109) posits three "stages" in the development of Chinese words. The first two stages involve complete or partial reduplication (see section 2.2.2) of a base word. In the first, or "phonological" stage, two words

of similar sound and meaning combine to form bisyllabic, monomorphemic words. In the second stage (a phonological-to-syntactic transition stage), two words with similar sound and meaning combine to form bimorphemic words. The third stage Cheng terms the "syntactic" stage, in which monosyllabic words form bisyllabic words on analogy with rules of syntactic combination. Although proposing stages suggests chronological development, Cheng cautions that the proposed stages are unlikely to correspond to clear-cut time periods within which the posited types of word formation occurred, to the exclusion of other types. Cheng focuses on the process of development rather than the existence of discrete historical stages, stating that at any given point in time, one or more of the posited stages are likely to have coexisted (Cheng 1981b: 110).
4. The earliest two-syllable words in Chinese were probably not formed using grammatically productive word formation processes. According to Cheng (1981b: 110), the earliest bisyllabic words to appear in Chinese were proper names (place names, personal names, names of tribes or kingdoms). These occurred in writing as early as on Shang dynasty (sixteenth to tenth century BC) 'oracle bones' (甲骨文), the earliest extant samples of written Chinese. Two syllables were used for those names, according to Cheng, in order to make them easier to distinguish, remember and communicate. Also, multisyllabic transliterations of foreign words occurred in early times and clearly were the result of asemantic phonological rather than morphological processes.
5. How closely written Chinese reflects the way the language was spoken at that time is a matter of debate. See DeFrancis (1984), who questions the extent to which the written record reflects the way the language was actually spoken, and also Mair (1994), who suggests that classical written Chinese may have been almost completely unrelated to any variety of spoken Chinese. Some relationship must be posited, however, due to the wealth of evidence of the phonetic basis of written works (such as poetic rhyme and meter), the existence of sound glosses for characters, and the clearly phonetic basis for construction of many of the written character forms.
6. But see discussion of subsyllabic morphemes (affixes) in 2.1.2 and bisyllabic morphemes in 2.2.4.
7. Derivation via going tone is thought to be the reflex of an Old Chinese or Proto-Chinese *-s suffix, which also had derivational properties (Pulleyblank 1973b; Norman 1988: 55-57, 84-85; Baxter 1992: 313-317; Baxter and Sagart, this volume). Other reconstructed Proto-Chinese affixes (e.g., Schussler 1976; Baxter 1992: 176, 218-222, 324; Baxter and Sagart, this volume) are discussed in 2.1.2.
8. A traditional tone marking convention is used in Table 1, whereby the small circles at the bottom left, top left, top right and bottom right indicate level, rising, going and entering tones respectively.


9. Schussler (1976: 129) states that "... word derivation by prefixes and suffixes constituted one of the most outstanding characteristics of Proto-Chinese."
10. If this theory is correct, then it would mean that, as Boodberg explicitly claims (Boodberg 1937: 354 [1979: 388]), the 'fǎnqiè' system of using two characters to 'spell' the pronunciation of a different character is likely to have been of native Chinese origin rather than having been imported from South Asia by Buddhist missionaries, as is generally believed.
11. Dobson (1959: 168) provides a list of such forms. He calls them "allegro forms".
12. Linguists posit the existence of a complex consonantal system in Old Chinese which disappeared before the time of Middle Chinese (see, e.g., Karlgren 1970: 280-281; Dong 1972: 269, 298; Baxter 1992: 218-234, 308-324; Feng this volume, and especially Chu 1981, 1995; cf. Wang 1980: 68, note 2). Old Chinese is reconstructed as having consonant cluster initials (with l, r, d, t, z, s and p following many of the initial stops; Dong 1972: 298). Also, Old Chinese is usually reconstructed as possessing voiced [b, d, g] as well as voiceless [p, t, k] stop endings (Dong 1972: 269). The consonant cluster initials (which, as we saw in 2.1.2 and in note 7, may have had a morphological function) and voiced stop endings are thought to have disappeared by the end of the Han (200 AD), and were almost certainly not present by the time of Middle Chinese (600 AD).
13. Cheng argues that this was due to limits on Chinese syllable structure complexity.
14. See also Gallagher (1994) for evidence from reduplication that two-syllable forms were created to fit the needs of prosodic meter.
15. See Boltz (1994: 171-172) for the most recent version of this argument.
16. This specific claim is due to Serruys. Boodberg's claim was not (as is sometimes said) that the two-character binoms were originally single-syllable words that had become bisyllabic, but that such binoms were used to "spell" the pronunciations of single-syllable words that possessed initial consonant clusters. See 2.1.2.1.
17. Grammatical affixes are thought to originate in most cases from the grammaticalization of free function words. The process of function words becoming affixes by grammaticalization is explained in detail in Heine-Claudi-Hünnemeyer (1991), Hopper-Traugott (1993), and Bybee-Perkins-Pagliuca (1994).
18. The traditional Chinese name for this type of word is 'partial-meaning compound' piānyì fùcí 偏義複詞 (see Cheng 1981a: 68-70; Zhou 1983: 245; and Feng, this volume).
19. C. C. Cheng points out that 年輕 niánqīng year-young 'young' may be considered a classical example of a subordinating compound in which the head precedes the modifier.
20. This remains true for the word guójiā 'country' today in modern Mandarin.


21. It is possible, as Serruys (1959: 111-112) and others have pointed out, that due to the effects of sound change and because great variation was apparently allowed in the phonology of reduplicated forms, the phonological similarity of the reduplicated syllables is no longer apparent.
22. Many of these unanalyzable forms undoubtedly represent multisyllabic single-morpheme words borrowed from other languages. Thus, the bisyllabic, monomorphemic word for "grape" (pútáo 葡萄), for example, is thought to have been borrowed from the multisyllabic, monomorphemic hypothetical Iranian prototype *bädäwa (Laufer 1919: 225; Norman 1988: 19).
23. Of course it is possible that the individual members of the word previously had been used in isolation but that no textual evidence to that effect has been found.
24. Quoting Boodberg: "The determinative now was used as an independent graph prefixed or affixed to the graph determined and ... the GSP (i.e., character, JLP) selected for the role of a determinative inevitably would be one related in its GSP to the graph determined." (1937: 356 [1979: 390]).
25. Much of this section is based on Pan-Yip-Han (1993).
26. We should note that the boundary between word and phrase in Chinese remains somewhat unclear even today.
27. The first scholar to describe word components in a way that distinguished them from the characters used to write them was Chen Wangdao 陳望道 (Chen 1940: 425; as cited in Pan-Yip-Han 1993: 100, 113-114), who used a term equivalent to morpheme ('word-essence', císù 詞素).
28. The term 'affix' had been translated by Xue (1919) in his discussion of agglutinating, isolating and inflecting languages, using the terms yǔgēn and yǔxì for 'root' and 'affix' respectively (Pan-Yip-Han 1993: 61). Interestingly, in keeping with contemporary views, Xue applies those concepts to the composition of Chinese characters rather than to the composition of words.
29. See Packard (1996) for a discussion of inflectional and derivational affixation in Mandarin.
30. We call these bound roots since they serve as the lexical base for many words and are not further decomposable (and are therefore "roots"), but they require additional morphemic information before they can occur as words. Chao (1968: 145) calls them "root words", and Dai (1992: 40) calls them "bound stems". Examples of bound roots from English are -ceive (receive, conceive) and tele- (telephone, teleconference).
31. Instances of bound roots occurring as free morphemes may often be found in the classical language or in proverb-type expressions.
32. One possible example of such a word formative from English is '-log-' (meaning 'word' or 'text'), since it is reasonably productive, represents "content" rather than "function" information, may occur in either word-initial (logogram, logograph, logotype, logorrhea) or word-final (monolog(ue), dialog(ue), travelogue) position, and may not occur as a free word.


33. Ironically, it was Li who also produced the first complete theory of Chinese word structure (Li 1930). In that work, Li explicitly mentions both the form class of the word's constituent morphemes and the form class of the gestalt word (Pan-Yip-Han 1993: 35-36). Li (1930) is the first systematic work on Chinese word structure to be offered under the rubric "word formation" (gòucífǎ 構詞法; Pan-Yip-Han 1993: 34).
34. The greater part of Lu's work did, however, deal primarily with nouns.
35. For example, a morpheme that is bound and can only occur as the first member of a word would be 'end bound', since it requires a morpheme on its right. Such an 'end-bound' morpheme is similar to a prefix, except that its meaning is lexical rather than grammatical (see discussion on page 17).
36. Spencer (1994) suggests that morphology may be viewed in three different ways: as a module completely autonomous of syntax; as an "interface" phenomenon, relating the autonomous components of phonology and syntax but having no independent existence itself; and as an entity that is wholly derivable from principles of phonology and syntax.
37. Sproat and Shih actually limit their discussion to complex nouns in Chinese. However, their arguments can be extended to all complex Chinese words.
38. For new issues in cognitive as well as linguistic approaches to word formation in Chinese, see Packard (1998).

References

Baxter, William H. III
    1992 A Handbook of Old Chinese Phonology. Berlin-New York: Mouton de Gruyter.
Bodman, Nicholas C.
    1980 "Proto-Chinese and Sino-Tibetan: data towards establishing the nature of the relationship", in: van Coetsem, F. and Waugh (Eds.), 34-199.
Boltz, William
    1994 The Origin and Early Development of the Chinese Writing System. American Oriental Series, Vol. 78. New Haven, CT: American Oriental Society.
Boodberg, P. A.
    1934 "Notes on Chinese Morphology and Syntax III", in: Cohen (ed.), 432-434.
    1937 "Some Proleptical Remarks on the Evolution of Archaic Chinese", Harvard Journal of Asiatic Studies 2, 329-372. [1979] [Reprinted in Cohen (ed.), 363-406.]
Bybee, Joan-Revere Perkins-William Pagliuca
    1994 The Evolution of Grammar. Chicago and London: University of Chicago Press.


Chao, Yuen Ren
    1948 Mandarin Primer. Cambridge, MA: Harvard University Press. [1972] [Sixth printing.]
    1968 A Grammar of Spoken Chinese. Berkeley: University of California Press.
Chen, Gang
    1946 Basic Problems in Establishing a System of Chinese Grammar (Jiànlì Zhōngguó wénfǎ tǐxì de jīběn wèntí). [Cited in Pan-Yip-Han 1993: 63, no further citation given.]
Chen, Wangdao
    1940 On in-depth language research from the perspective of 'syllable connection' (Cóng 'cír liánxiě' shuōdào yǔwén shēnrù yánjiū). [Cited in Pan-Yip-Han 1993: 100, 113-114. No further citation given.]
Cheng, Robert L.
    1966 Some Aspects of Mandarin Syntax. [Indiana University Ph. D. dissertation.]
Cheng, Xiangqing
    1981a Research in Pre-Qin Chinese Language (Xiānqín Hànyǔ yánjiū). Jinan, China: Shandong Educational Publishing Co.
    1981b "Research in Pre-Qin Bisyllabic Words (Xiānqín Shuāngyīncí yánjiū)", in: Cheng 1981a, 44-112.
Chi, Teelee R.
    1985 A Lexical Analysis of Verb-noun Compounds in Mandarin Chinese. Taipei: Literary Crane Publishers.
Chou, Fa-kao
    1972 Ancient Chinese Grammar. Morphology Volume (Zhōngguó Gǔdài Yǔfǎ: Gòucí Biān). Taipei, Taiwan: Tailian Guofeng Publishing Co.
Chu, Chauncey C-H.
    1970 The Structure of shr and you in Mandarin. [University of Texas Ph. D. dissertation.]
Chu, Chia-ning
    1981 Research in Old Chinese Consonant Clusters (Gǔ Hànyǔ Fùshēngmǔ Yánjiū). Taipei: Chinese Culture University Ph. D. dissertation.
    1995 Phonological Investigation (Yīnyùn Tànsuǒ). Taipei: Student Book Co.
Cohen, Alvin P. (ed.)
    1979 Selected Works of Peter A. Boodberg. Berkeley: University of California Press.
Dai, John Xiang-ling
    1992 Chinese Morphology and its Interface with Syntax. [The Ohio State University Ph. D. dissertation.]


Dobson, W.A.C.H.
    1959 Late Archaic Chinese. Toronto: University of Toronto Press.
Dong, Tonghe
    1944 An Outline of Ancient Chinese Rhymes (Shànggǔ Yīnyùnbiǎo Gǎo). [1975] [Third edition reprint. Taipei, Taiwan: Tailian Guofeng Publishing Co.]
    1972 Chinese Phonology (Hànyǔ Yīnyùnxué), third edition. Taipei, Taiwan: Student Book Co.
Downer, G. B.
    1959 "Derivation by Tone-Change in Classical Chinese", Bulletin of the School of Oriental and African Studies, Vol. XXII, part 2, 258-290.
Gallagher, Martha Wang
    1994 A Study of the Reduplicatives in the Chǔ Cí. [Paper presented to the Third International Conference on Chinese Linguistics. City Polytechnic of Hong Kong.]
Hashimoto, Anne Y.
    1965 "A Condensed Account of Syntactic Analysis of Mandarin", Project on Linguistic Analysis 10, 10-27. Columbus: The Ohio State University Research Foundation.
Hopper, Paul-Elizabeth Traugott
    1993 Grammaticalization. Cambridge: Cambridge University Press.
Hu, Shih
    1920 The Evolution of Mandarin (Guóyǔ de Jìnhuà). [Cited in Pan-Yip-Han 1993: 29. No further citation given.]
Hu, Yilu
    1923 Beginning Mandarin (Guóyǔxué Cǎochuàng). Shanghai: Commercial Press.
Huang, James C-T.
    1982 Logical Relations in Chinese and the Theory of Grammar. [MIT Ph. D. dissertation.]
    1984 "Phrase Structure, Lexical Integrity and Chinese Compounds", Journal of the Chinese Language Teachers Association 19, 53-78.
Huang, Shuanfan
    1966 "Subject and Object in Mandarin", Project on Linguistic Analysis 13, 25-103. Columbus: The Ohio State University Research Foundation.
Jackendoff, Ray
    1987 "The Status of thematic relations in linguistic theory", Linguistic Inquiry 18, 369-412.
Kallgren, Gerty
    1958 "Studies in Sung Time Colloquial Chinese as Revealed in Chu Hsi's Tsuanshu", Bulletin of the Museum of Far Eastern Antiquities 30, 1-165.


Karlgren, Bernhard
    1923a Analytic Dictionary of Chinese and Sino-Japanese. Paris: Librairie Orientaliste Paul Geuthner. [1974] [Republished in unaltered form. New York: Dover Publications.]
    1923b Sound and Symbol in Chinese. Oxford: The Clarendon Press. [1971] [Revised edition. Hong Kong: Hong Kong University Press.]
    1933 "Word Families in Chinese", Bulletin of the Museum of Far Eastern Antiquities 5, 1-112.
    1954 "Compendium of phonetics in Ancient and Archaic Chinese", Bulletin of the Museum of Far Eastern Antiquities 26, 211-367. [1970] [Reprinted as Compendium of Phonetics in Ancient and Archaic Chinese. Göteborg, Sweden: Elanders Boktryckeri Aktiebolag.]
    1956 "Cognate Words in the Chinese Phonetic Series", Bulletin of the Museum of Far Eastern Antiquities 28, 1-18.
Kennedy, George A.
    1940 "A Study of the Particle Yen", Journal of the American Oriental Society 60, 1-22, 193-207. [1964a] [Reprinted in Li (ed.), 27-78.]
    1955 "The Butterfly Case", Wennti 8. [1964b] [Reprinted in Li (ed.), 274-322.]
Kratochvil, Paul
    1968 The Chinese Language Today: Features of an Emerging Standard. London: Hutchinson University Library.
Laufer, Berthold
    1919 Sino-Iranica: Chinese contributions to the history of civilization in ancient Iran. Chicago: Field Museum of Natural History, Publication 201, Anthropological Series, vol. 15, no. 3.
Li, Charles-Sandra A. Thompson
    1981 Mandarin Chinese: A Functional Reference Grammar. Berkeley: University of California Press.
Li, Jinxi
    1924 New Grammar of the National Language (Xīn Zhù Guóyǔ Wénfǎ). Shanghai. [Cited in Pan-Yip-Han 1993: 33, and in Chao 1968: xxix; name of publisher not given.]
    1930 Variable and Fixed Meaning in Mandarin Compound Words (Guóyǔ Zhōng Fùhécí de Qíyì hé Piānyì). [Cited in Pan-Yip-Han 1993: 33. No further citation given.]
Li, Tien-yi (Ed.)
    1964 Selected Works of George A. Kennedy. New Haven, CT: Far Eastern Publications.
Li, Yafei
    1990 "On V-V Compounds in Chinese", Natural Language and Linguistic Theory 8, 177-207.


Li, Ying-che
    1972 "Sentences with Be, Exist, and Have in Chinese", Language 48.3, 573-583.
Liao, Shuqian
    1946 Oral Grammar (Kǒuyǔ wénfǎ). [Cited in Pan-Yip-Han 1993: 39, no further citation given.]
Lieber, Rochelle
    1983 "Argument linking and Compounds in English", Linguistic Inquiry 14, 251-285.
Lu, Zhiwei
    1964 Chinese Word Formation (Hànyǔ de Gòucífǎ). Beijing: Scientific Publishing Co. First Revised Edition.
Lu, Zongda-Ning Wang
    1983 Methodology in Glossing the Classics (Xùngǔ Fāngfǎ Lùn). Beijing: China Social Science Publishing Co.
Ma, Jianzhong
    1898 Ma's Grammar of Language (Mǎshì Wéntōng). Publisher and place of publication unknown. [1983] [Beijing: Commercial Press.]
Mair, Victor H.
    1994 "Buddhism and the Rise of the Written Vernacular in East Asia: The Making of National Languages", Journal of Asian Studies 53.3, 707-751.
Mathews, R. H.
    1945 A Chinese-English Dictionary. Cambridge, MA: Harvard University Press (revised American edition).
Norman, Jerry
    1988 Chinese. Cambridge, England: Cambridge University Press.
Packard, Jerome
    1990 "A lexical morphology approach to word formation in Mandarin", Yearbook of Morphology 3, 21-37.
    1996 "Chinese Evidence Against Inflection-Derivation as a Universal Distinction", in: Cheng, T.-F., Y. Li and H. Zhang (Eds.) Proceedings of ICCL-4/NACCL-7, Vol. 2. Los Angeles: GSIL Publications, University of Southern California.
    1998 Words in Chinese: A Linguistic and Cognitive Approach. In progress.
Pan, Wenguo-Po-Ching Yip-Saxena Yang Han
    1993 Studies of Chinese Word Formation, 1898-1990 (Hànyǔ de Gòucífǎ Yánjiū 1898-1990). Taipei, Taiwan: Student Book Co., Ltd.
Pulleyblank, Edwin G.
    1973a "Some New Hypotheses Concerning Word Families in Chinese", Journal of Chinese Linguistics 1, 111-125.

    1973b "Some Further Evidence Regarding Old Chinese -s and its Time of Disappearance", Bulletin of the School of Oriental and African Studies 36, 368-373.
    1995 Outline of Classical Chinese Grammar. Vancouver, British Columbia: University of British Columbia Press.
Qu, Qiubai
    1931 Research in Common Chinese Words (Pǔtōng de Zhōngguóhuà Zìyǎn de Yánjiū). [Cited in Pan-Yip-Han 1993: 62, no further citation given.]
Ross, Claudia
    1990 "Resultative Verb Compounds", Journal of the Chinese Language Teachers Association 25.3, 61-83.
Schussler, Axel
    1976 Affixes in Proto-Chinese. Wiesbaden, Germany: Franz Steiner Verlag GmbH.
    1985 "The function of qusheng in early Zhou Chinese", in: Thurgood-Matisoff-Bradley (eds.), 344-362.
Serruys, Paul L-M.
    1959 The Chinese Dialects of Han Time according to Fang Yen. University of California Publications in East Asian Philology, Vol. 2. Berkeley: University of California Press.
Spencer, Andrew
    1994 Review of Aronoff, M. (1994), Morphology by itself: Stems and Inflectional Classes. Language 70.4, 811-817.
Sproat, Richard-Chilin Shih
    1996 "A Corpus-Based Analysis of Mandarin Nominal Root Compound", Journal of East Asian Linguistics 5.1, 49-71.
Tai, James H-Y.
    1969 Coordination Reduction. [Indiana University Ph. D. dissertation.]
Tang, Charles T-C.
    1977 Studies in Transformational Grammar of Chinese (Guóyǔ Biànxíng Yǔfǎ Yánjiū). Taipei: Student Book Co.
    1993 The Relation between Word-Syntax and Sentence-Syntax in Chinese: A Case Study in Compound Verbs. [Paper presented to the Second International Conference on Chinese Linguistics, Paris, June 1993.]
    1995 "More on the Relationship between Word-Syntax and Sentence-Syntax: A Case Study in Chinese Compound Nouns", in: Camacho, J. and Choueiri, L. (Eds.) Proceedings of the Sixth North American Conference on Chinese Linguistics. Los Angeles: GSIL Publications, University of Southern California, 195-248.
Teng, Shou-hsin
    1973 "Negation and Aspects in Chinese", Journal of Chinese Linguistics 1.1, 14-37.

Thompson, Sandra A.
    1973a "Transitivity and the ba Construction in Mandarin", Journal of Chinese Linguistics 1.2, 208-221.
    1973b "Resultative verb compounds in Mandarin Chinese", Language 49.2, 361-379.
Thurgood, Graham-James A. Matisoff-David Bradley (eds.)
    1985 Linguistics of the Sino-Tibetan Area: the state of the art. Pacific Linguistics Series C, No. 87. Department of Linguistics, The Australian National University.
Tsao, Feng-Fu
    1978 A Functional Study of Topic in Chinese. Taipei: Student Book Co.
Van Coetsem, Frans-Linda R. Waugh (Eds.)
    1980 Contributions to Historical Linguistics: Issues and Materials. Leiden: E. J. Brill.
Wang, Li
    1972 Chinese Phonology (Hànyǔ Yīnyùnxué). Hong Kong: China Book Company.
    1980 An Outline of Chinese Language History (Hànyǔ Shǐgǎo). Beijing: China Book Company.
Wang, William S-Y.
    1963 "Some Syntactic Rules in Mandarin", Project on Linguistic Analysis 3, 32-53. Columbus: The Ohio State University Research Foundation.
    1965 "Two Aspect Markers in Mandarin", Language 41.3, 457-470.
    1967 "Conjoining and Deletion in Mandarin Syntax", Monumenta Serica 26, 224-236.
Xia, Mianzun
    1946 Methods of Composing Two-character Words (Shuāngzì Cíyǔ de Gòuchéng Fāngshì). [Cited in Pan-Yip-Han 1993: 37, no further citation given.]
Xue, Xiangsui
    1919 "On Chinese Language and Script (Zhōngguó Yányǔ Wénzì Shuōlüè)", National Antiquities 4.
Zhang, Shizhao
    1907 Intermediate Chinese Grammar (Zhōng Děng Guó Wén Diǎn). Shanghai: The Commercial Press, Limited. [1928] [Fifteenth edition.]
Zhou, Bingjun
    1981 An Outline of Ancient Chinese (Gǔ Hànyǔ Gāngyào). Hunan, China: Hunan Educational Publishing Company. [1983] [Fourth printing.]

Word formation in Old Chinese
William H. Baxter and Laurent Sagart

Chī fàn guóyànhuà! Chuān yī xīzhuānghuà! Quán guó tíngzihuà! Zǒu lù jiàochēhuà!
'State-banquet-ization of meals! Western-suit-ization of clothing! Pavilion-ization of the whole country! Sedan-ization of transportation!'
- the Four (Modern-)izations according to Meng Qinglin, Chinese mental patient (Montagnon 1988)

1. Preliminaries

The present paper illustrates some of the major known morphological processes of Old Chinese: roughly, the language of the Chinese classical texts of the Zhōu 周 dynasty (11th-3rd centuries B.C.E.). To speak of morphological processes in Old Chinese may surprise some readers, for there is a widespread belief that early Chinese had only impoverished morphology if it had any at all. Even such an insightful scholar as A. C. Graham, though refuting some of the more outlandish claims which have been made about the Chinese language, speaks of the "uniform and unchanging monosyllables" of Classical Chinese, "organized by syntax alone" (Graham 1989: 390, 403).1 For the Classical Chinese written language as used in recent centuries, this argument perhaps has some validity: how could a dead language (dead in its spoken form, at any rate) have a productive morphology anyway? But the Old Chinese we speak of here was not dead, and there is ample evidence, as we shall see, that it had various prefixes, suffixes, and other morphological processes, some still productive at the time the


classical texts were composed, others perhaps represented only by traces inherited from an earlier period. In fact, the extensive morphological processes of Old Chinese have been known and extensively studied for some time (see, for example, Maspero 1930, Karlgren 1933, 1956, and Downer 1959). It would be well beyond the scope of this paper to offer a comprehensive account of Old Chinese morphology. Our chief aims are: (1) to clarify the discussion of "word formation" by critically examining the notion of "word" itself; (2) to survey some of the better understood morphological processes of Old Chinese, with enough examples to enable readers to evaluate for themselves what Old Chinese morphology was like; and (3) to present some of the results of recent and ongoing research in this area.2 We will see that Old Chinese was typologically rather different from the varieties of Chinese spoken today (though overt morphological processes are by no means absent from modern Chinese either). There should be nothing surprising in this, given what we know of the history of other languages. Old Chinese and Modern Chinese do in fact share some typological features, but the existence of such features is a matter of fact, not necessity; we reject the assumption that Chinese, at every stage of its history, must have had some essential characteristics which set it apart from other human languages. The only features whose presence could be assumed a priori would be features necessarily shared by all human languages.

1.1. What is a word?

A discussion of Old Chinese "word formation" presupposes an understanding of what a word is. This is not a simple question. As is often the case in natural language, the meaning of the English expression word, in ordinary usage, is probably best characterized not by a strict definition but by a set of prototypical features. As the term is used by speakers of English, a prototypical English word has a pronunciation, a spelling, a meaning, and a "part of speech", which one can find in "the dictionary"; in written texts, it is bounded by spaces or punctuation on either side, and has no spaces or punctuation internally. Speaking or writing is viewed as a process of "choosing one's words" and placing them one after another in linear sequence, like beads on a string. It seems to be implicitly assumed that the words of a given text do not overlap, and that everything in an utterance (apart from some auxiliary devices such as "tone of voice"


and punctuation) belongs to exactly one word. The length of a text can therefore be measured by counting its words. In the English-speaking world (where there is a language written alphabetically, an elaborate set of doctrines used in teaching it, a printing industry, and people who are paid to write and speak), this bundle of prototypical features is evidently a useful one, with parallels in other similar language communities.

It is worth noticing that the Chinese situation is somewhat different, as Y. R. Chao pointed out. The writing system is nonalphabetic, written syllable by syllable, with no spaces. Though there is an expression for "word", namely cí 詞, this is a learned word, uncommon outside linguistic discussions; and there is little agreement on how to define it. Texts and utterances are thought of as sequences of zì 字, not of cí; it is zì which have sociolinguistic reality, and for which authors are paid (Chao 1968: 136-138).

It should not be taken for granted, then, that the English expression word coincides exactly with any single theoretical concept which is useful in linguistic analysis, any more than the Chinese zì does. In fact, we find it useful to distinguish at least three distinct linguistic concepts, which do not exactly coincide in meaning with word or with each other: (1) the phonological word, (2) the "basic expression" of the formal semantics tradition, and (3) the minimal or zero-level unit of syntax.3

For some languages, probably not all, it is possible to define a wordlike unit in purely phonological terms. In languages like Swahili or Polish, where there is a regular rule for stressing a particular syllable of the word, utterances can be segmented unambiguously into wordlike phonological units. In ancient Greek, each word normally has exactly one pitch accent, though its position and identity are not predictable by general rule; in the first line of the Iliad,

(1)  Mênin  áeide,  theá,    Pēlēiádeō      Akhilêos
     wrath  sing    goddess  son of Peleus  Achilles
     'Sing, O goddess, the wrath of Peleus' son Achilles'

we can tell that there must be five phonological words because there are five pitch accents, though the actual word boundaries cannot be drawn on phonological grounds alone. In other languages even this much information may be lacking. In the tradition of formal semantics, the term expression is used for any well-formed form in a language which has a semantic value.4 Expressions can be combined recursively to form larger expressions according to


rules: each rule specifies how the relevant expressions are to be combined syntactically and how their semantic values are combined to give a semantic value to the new expression. Unlike the ordinary English "word", then, expressions are not like beads on a string: they are nested, with smaller ones embedded in larger ones. A basic expression is a minimal expression, one whose semantic value cannot be derived by syntactic rule from its component subexpressions; its value must therefore be specified as part of the formal system. In natural-language terms, a basic expression is an expression whose meaning must be learned separately because it is not predictable from the expression's parts. This includes prototypical words like cat, compounds like bláckbird 'Turdus merula' (as opposed to the phrase bláck bírd 'bird which is black'), and idiomatic expressions like kick the bucket 'die'. Finally, we may define syntactic words as forms at the zero-bar level of X-bar theory. Depending on the language, this level may be characterized by the possibility of adding certain modifiers or specifiers. For example, consider the use of shuì-jiào 睡覺 'to sleep' in the modern Chinese sentence (2)

他睡了一會兒覺。
tā shuì-le yi-huir jiào
he sleep-PF a-while
'He slept for a little while.'

Since jiào 覺 is combined with the determiner yi-huir 'a while' to make the noun phrase yi-huir jiào, jiào 覺 itself should be considered an N° (perhaps also an N¹) from a syntactic point of view. However, from a semantic point of view, jiào 覺 is not a basic expression: it cannot be used independently as a linguistic sign, and is simply part of the basic expression shuì-jiào 睡覺 'to sleep'. The examples cited show that although the concepts phonological word, basic expression, and zero-level syntactic category coincide in prototypical words like cat, they do not coincide in every case. The three concepts coincide also in English bláckbird 'Turdus merula': from a semantic point of view, it is a basic expression (since its meaning is not constructible from its parts); from a phonological point of view, its stress pattern marks it as a phonological word (since the first syllable has primary stress, even when no contrast is involved); and syntactically it is a word rather than an adjective-noun phrase (since bláck can take no modifiers: *a very bláckbird is ungrammatical).

Word formation in Old Chinese


With other items, there is variation within the speech community as to whether basic expressions are also phonological words. French fries is semantically a basic expression for all varieties of American English (since its meaning is not predictable from its parts). Syntactically it should probably be regarded as an N° also (since one cannot expand either French or fries: *[very French] fries and *French [crisp fries] both lack the idiomatic sense). But phonologically, there is variation: many speakers from the southern U. S. (including the first author) say French fries, with the same stress pattern as bláckbird; but Northern speakers often use phrasal stress: French fries, so that there is a mismatch between phonology on the one hand and semantics and syntax on the other. In our view, Modern Chinese idiomatic verb-object compounds like shuì-jiào 睡覺 'sleep' are just mismatches between semantics and syntax. Semantically, they are basic expressions (because their meaning must be learned as a unit), but syntactically, they are phrases. There is no need to worry about whether or not jiào 覺 is a "word" in any general sense. Semantically, it is part of the basic expression shuì-jiào 睡覺, but it is not a basic expression itself (and will therefore not be included in utterances independent of shuì-jiào 睡覺). Syntactically, it is an N°. As for its phonology, we cannot tell whether or not it is a phonological word, because there is no general way to distinguish two-syllable phrases from two-syllable words in Modern Chinese in any case. Finally, enclitic particles like the possessive de of standard Mandarin illustrate mismatches between phonology and syntax (like the possessive /s/ ~ /z/ ~ /iz/ of English). Phonological words cannot begin with a neutral-tone syllable, so phonologically, de must be part of the preceding word.
But syntactically, it is a marker in construction with the whole preceding phrase: adapting the notation of Sadock (1991), we can represent the syntactic and phonological structures approximately as follows: (3)

Syntax: [[tā shuō] de] huà
Phonology: [tā]W [shuō de]W [huà]W

The search for consistent criteria for what constitutes a "word" in Chinese, then, is a quixotic one, for though there are several theoretical notions which overlap with the popular concept of "word", none coincides with it exactly.


1.2. Word formation

What, then, do we mean by "word formation"? For purposes of this paper, we use this term to describe processes by which a language allows the construction of new expressions at the lexical level, rather than the phrase level, from existing morphemes. Creating expressions at the phrase level is the job of syntax; word-formation processes, on the other hand, extend the lexicon itself by providing additional elements for possible input to syntactic processes. Consider, for example, the modern Chinese suffix -huà 化 '-ize, -ify'. Generally, this suffix is added to nominal or adjectival elements; the result is a lexical-level verb or V° whose meaning is related to the nominal or adjectival root. Some of the examples cited by Chao (1968: 225-226) are: (4)

a. 科學化 kēxuéhuà 'make scientific' (from kēxué 'science, scientific')
b. 機械化 jīxièhuà 'mechanize' (from jīxiè 'machinery, mechanical')
c. 美化 měihuà 'beautify' (from měi 'beautiful')
d. 美國化 Měiguóhuà 'Americanize' (from Měiguó 'America')

The verbs thus formed can also be used nominally; the examples above can also be nouns meaning 'scientification', 'mechanization', 'beautification', and 'Americanization'. This change from verb to noun may also be thought of as a process of word formation which makes an N° out of a V°. Word-formation processes serve at least two functions: they allow speakers to expand the lexicon of their language on the fly as needed, and they make the lexicon easier to learn by giving it more internal structure. Consider the following utterance of a Chinese mental patient, quoted above: (5)

吃飯國宴化！穿衣西裝化！全國亭子化！走路轎車化！
Chī fàn guóyànhuà! Chuān yī xīzhuānghuà! Quán guó tíngzihuà! Zǒu lù jiàochēhuà!
'State-banquet-ization of meals! Western-suit-ization of clothing! Pavilion-ization of the whole country! Sedan-ization of transportation!'

As far as we know, the word jiàochēhuà 'sedan-ize; sedan-ization' never occurs elsewhere; the speaker has invented a new zero-bar-level expression. Though other native speakers will never have heard the word, their knowledge of the word-formation process used allows them to infer, at least in general terms, what its semantic value is: to 'sedan-ize' China's transportation must be, somehow, to change it by using sedans more. Such a word may be used only once, but if other speakers find it a useful addition to the lexicon, it may spread through the speech community. Such processes give a language the flexibility to change its lexicon in response to changing circumstances, without the speakers having to construct and learn entirely new morphemes. They provide a large collection of potential words, only a fraction of which will actually be used. In this way they resemble the antibodies of the immune system: a very large number of potential antibodies exist, which can be activated if circumstances require; those antibodies which are needed will multiply and spread through the body, just as a useful new lexical item spreads within a speech community. Word-formation processes also make the lexical stock of a language easier to learn and remember by adding to its inner structure. If all the words semantically connected with science were made from distinct, unrelated morphemes, the ensemble might impose an excessive burden on the memory of speakers and hearers. The phonological system might be overloaded as well (in that it might not be able to distinguish enough different morphemes).
Even though derived words, once they enter the lexicon, may have idiosyncratic features not predictable by rule, they still make the representation of information more efficient than would otherwise be the case. The noun-forming suffix -zi 子, for example, may have a wide variety of semantic values (Chao 1968: 238-242 identifies fourteen general patterns), but its presence is a sure sign that an item is a noun (or, occasionally, a measure word).
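The productivity of -huà described in this section can be pictured with a small sketch. It is only an illustration under stated assumptions: the class `LexicalItem`, the functions `add_hua` and `nominalize`, and the mechanical way the English glosses are built are all hypothetical and are not part of the authors' analysis. The sketch shows a rule that takes an N or A base and returns a V°, followed by a form-preserving change from V° to N°.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LexicalItem:
    form: str      # romanized form
    category: str  # zero-level category: "N", "A", or "V"
    gloss: str     # rough English gloss (hypothetical, for illustration)


def add_hua(base: LexicalItem) -> LexicalItem:
    """Suffix -huà to a nominal or adjectival base, giving a V0."""
    if base.category not in ("N", "A"):
        raise ValueError(f"-huà expects an N or A base, got {base.category}")
    return LexicalItem(base.form + "huà", "V", f"{base.gloss}-ize")


def nominalize(verb: LexicalItem) -> LexicalItem:
    """Derive an N0 from a -huà verb without changing its form."""
    if verb.category != "V" or not verb.form.endswith("huà"):
        raise ValueError("expected a verb formed with -huà")
    return LexicalItem(verb.form, "N", verb.gloss[: -len("ize")] + "ization")


# The novel word from example (5): the base is attested, the output is new.
sedan = LexicalItem("jiàochē", "N", "sedan")
sedanize = add_hua(sedan)
sedanization = nominalize(sedanize)
print(sedanize)
print(sedanization)
```

A hearer who knows the rule can interpret the output even on first encounter, which is the point made about jiàochēhuà above; the sketch deliberately ignores the idiosyncratic meanings that derived words may acquire once they are listed in the lexicon.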


1.3. Evidence for Old Chinese word formation processes

Although the Chinese writing system gives little direct information about pronunciation, the Chinese classical texts are accompanied by a rich commentarial and lexicographical tradition from which much can be learned about early Chinese word formation processes. For example, the Jīngdiǎn shìwén by Lù Démíng 陸德明 ([583] 1985) includes phonological notes on the major classical texts. Consider the following lines from stanza 2 of ode 76 of the Shījīng 詩經 (Zhèng fēng 鄭風: Qiāng Zhòng zǐ 將仲子):

(6)

將仲子兮 Qiāng Zhòng zǐ xī
無踰我牆 wú yú wǒ qiáng
無折我樹桑 wú zhé wǒ shù sāng

'I beg of you, Zhòngzǐ, Do not climb over our wall, Do not break the mulberry-trees we have planted.'5 The Jīngdiǎn shìwén (Lù Démíng [583] 1985: 247) gives a note on the pronunciation and meaning of 折 zhé in this passage: (7)

「無折」：之舌反，傷害也，下同。
"Wú zhé": zhī shé fǎn, shāng hài yě, xià tóng.
'"Do not break": [pronounced as] 之 plus 舌 [i.e., Middle Chinese tsy(i) + (zy)et = tsyet], meaning "to harm"; similarly below.'

As an aid in reciting the text aloud, Lù Démíng gives the Middle Chinese pronunciation of 折 zhé by using the fǎnqiè 反切 spelling 之舌反, meaning that the initial consonant of 折 zhé here is the same as that of 之 zhī 'it' (i.e., MC tsy-), and the rest of the syllable is the same as in 舌 shé 'tongue' (i.e., MC -(j)et). On the other hand, the same character 折 zhé also appears in the expression 短折 duǎn zhé 'to die prematurely' in section 24 of the Shàngshū 尚書 (Hóng fàn 洪範). Here, Lù Démíng's annotation is as follows ([583] 1985: 178): (8)

「折」：時設反，一音之舌反。
"Zhé": shí shè fǎn, yī yīn zhī shé fǎn.
'"Die young": [pronounced as] 時 plus 設 [i.e., Middle Chinese dzy(i) + (sy)et = dzyet]; another pronunciation is 之 plus 舌 [i.e., Middle Chinese tsy(i) + (zy)et = tsyet].'
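The fǎnqiè spellings glossed in (7) and (8) as tsy(i) + (zy)et and dzy(i) + (sy)et work like a small algorithm: take the initial of the first speller and the final of the second. The sketch below makes that step explicit. The `Syllable` type, the `fanqie` function, and the way each Middle Chinese syllable is split into initial and final are assumptions made for illustration; the split simply follows the parentheses in the glosses above.

```python
from typing import NamedTuple


class Syllable(NamedTuple):
    initial: str  # the part contributed by the first (upper) speller
    final: str    # the part contributed by the second (lower) speller


def fanqie(upper: Syllable, lower: Syllable) -> str:
    """Read a fanqie spelling: initial of `upper` plus final of `lower`."""
    return upper.initial + lower.final


# Spellers from annotation (7): MC tsyi and zyet.
tsyi = Syllable("tsy", "i")
zyet = Syllable("zy", "et")

# Spellers from annotation (8): MC dzyi and syet.
dzyi = Syllable("dzy", "i")
syet = Syllable("sy", "et")

print(fanqie(tsyi, zyet))  # reading for 'to break'
print(fanqie(dzyi, syet))  # reading for 'to die prematurely'
```

The two calls differ only in the voicing of the initial they return, which is exactly the contrast the following discussion builds on.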

Thus Lù Démíng records a tradition that the character 折 may be recited in two ways: as Middle Chinese tsyet, with the voiceless initial consonant written here as tsy-, in the passage "Do not break the mulberry-trees we have planted"; but when it means 'to die prematurely', as in the Shàngshū passage, it is to be read as Middle Chinese dzyet, with the voiced initial dzy- (though it was sometimes read as tsyet even here). On the basis of this and other similar examples, we conjecture that the two pronunciations tsyet and dzyet of the Middle Chinese reading tradition reflect two forms of a single Old Chinese root: tsyet, an active/transitive verb meaning 'to cut off, break off', and dzyet, a passive/intransitive form 'to be cut off, to be broken off', with the extended meaning 'to die prematurely'. To account for this alternation, we assume an Old Chinese prefix *N- (discussed in more detail below), which could be added to a transitive verb to make an intransitive or passive form. Our reconstructions are thus as follows:6 (9)

a. 折 zhé < tsyet < *tjet 'to cut off, break off; decide'
b. 折 shé < dzyet < *N-tjet 'to be cut off, be broken off'
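The relationship between the two readings in (9) can be stated as a tiny rule: the Middle Chinese reflex of an *N-prefixed form has the voiced counterpart of the initial found in the unprefixed form. The sketch below encodes only the one pair attested in this passage (tsy- and dzy-); the table `VOICED_COUNTERPART` and the function `mc_reflex` are hypothetical names, and no claim is made here about other initials.

```python
# Only the pair attested in example (9) is listed; extending the table to
# other initials would be an assumption beyond this passage.
VOICED_COUNTERPART = {"tsy": "dzy"}


def mc_reflex(initial: str, final: str, n_prefixed: bool) -> str:
    """Middle Chinese reading of a root, with or without the prefix *N-."""
    if n_prefixed:
        initial = VOICED_COUNTERPART[initial]
    return initial + final


transitive = mc_reflex("tsy", "et", n_prefixed=False)   # 'to cut off'
intransitive = mc_reflex("tsy", "et", n_prefixed=True)  # 'to be cut off'
print(transitive, intransitive)
```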

The Chinese scholars of the Qīng 清 dynasty (1644-1911) were inclined to doubt the reality of multiple character readings such as these, suspecting that they were simply made up by classical scholars. It is quite likely that the reading tradition includes artificial pronunciations created analogically, but the reality of the underlying process hardly seems doubtful. Our confidence is increased when we find exact parallels, involving cognate elements, in related languages such as Tibetan. Consider the following written Tibetan forms from Jäschke ([1881] 1975):

gcod-pa, pf. bcad, fut. gcad, imp. chod: '1. to cut, ... to cut asunder, ... to cut off, chop off, the hands; to cut down, to fell, trees' ... (Jäschke [1881] 1975: 145)

'chad-pa, pf. chad, vb. n. to gcod-pa ... 'to be cut into pieces, to be cut off, to decay, ... to cease, end, stop, ... to die away, to become extinct' (Jäschke [1881] 1975: 168)

The root of both Tibetan verbs is *cad, which may represent pre-Tibetan *tyat.8 The apostrophe in the intransitive form 'chad-pa represents the letter called 'a-chung, which (whatever its phonetic nature) is probably cognate to the Old Chinese prefix *N- (Pulleyblank 1973). Since Tibetan -ya- corresponds regularly to OC *-e-,9 the phonological agreement between the Chinese pair and the Tibetan pair is very good. Moreover, the semantic agreement is virtually exact: in both languages, the root is used of cutting down trees and cutting off body parts (折首 zhé shǒu 'cut off heads' is a common expression in early Chinese texts). In addition to the extended intransitive meaning '(*to be cut off:) to die' in both languages, there is a parallel extension of the transitive form meaning 'to decide (e.g., a criminal case)'. Schuessler (1987: 821) cites the following line from the Shàngshū (Lǚ xíng 呂刑): (10)

非佞折獄
fēi nìng zhé yù
'It is not the specious who should decide criminal cases.'

Similarly, Jäschke cites the use of the transitive form gcod-pa with the object zhal-che 'to pass sentence or judgement; to judge, condemn'.10 This example illustrates the richness of the classical Chinese reading tradition, preserved not only in the Jīngdiǎn shìwén and similar commentaries, but also in lexicographical works such as the rhyming dictionary Qièyùn 切韻, compiled in 601 C. E. by Lù Fǎyán 陸法言 and others (see Baxter 1992: 13-14, 32-44 for further discussion). In addition, modern varieties of Chinese sometimes preserve traces of earlier morphological processes which are no longer productive, or in some cases fully productive processes which may be inherited from Old Chinese. The following forms, for example, are part of the residue of the early process of *-s suffixation (discussed in section 2.2.1 below):

(11) a. 傳 chuán < drjwen < *drjon 'pass on, transmit'
b. 傳 zhuàn < drjwenH < *drjon-s '(that which is transmitted:) account, biography'
(12) a. 度 duó < dak < *dak 'to measure out'
b. 度 dù < duH < *dak-s 'standard, measure'


(13) a.

also b.

'to lodge for the night' (colloquial pronunciation) < sjuwk
laip chuang

Like xip in (22), two syntactic words qu bu or lai pu are pronounced as one phonological word qup and laip in (23), and the default one-to-one correspondence between syntactic word and phonological word is overridden (cf. the English case in (1)). More interestingly, these phonological words do not form single syntactic constituents. The phonological words in these two cases may be called clitic groups, if one takes a clitic to be a syntactic word prosodically attaching to, and phonologically interacting with, its adjacent syntactic word to form a phonological word (Dai 1992 b: 267). Note that my choice of definition is purely terminological, as different theoretical frameworks may characterize and label clitics in different ways. The grammatical significance of such clitics, though, lies in the mismatch between syntactic words and phonological words, rather than in the label "clitic" per se. If one assumes that a phonological word in Chinese must have a lexical tone and stress, and that an atonic and unstressed syntactic word must attach to an adjacent syntactic word to form a phonological word, then Chinese sentential


John Xiang-Ling Dai

particles could be clitics, as assumed in Chao (1968: 149) and C-R. Huang (1985). The sentential particles in question include, but are not restricted to, ma (for yes-no questions), ne (for contrastive/expecting addressee's expectation), ba (for conjectural and advisative remarks), de (presuppositional), and le (inchoative/new state). They are located at the sentence-final position. Here I adopt C-R. Huang's notation "=" for the putative clitic attachments.

(24) a. Zhangsan xihuan Lisi=ma?
Zhangsan like Lisi=ma
'Does Zhangsan like Lisi?'
b. Zhangsan xihuan Lisi=ne.
Zhangsan like Lisi=ne
'(You might know, but) Zhangsan (really) likes Lisi.'
c. Zhangsan xihuan Lisi=ba?
Zhangsan like Lisi=ba
'(I suppose that) Zhangsan likes Lisi, (does he?)'
d. Zhangsan xihuan Lisi=de.
Zhangsan like Lisi=de
'Surely, Zhangsan likes Lisi.'
e. Zhangsan xihuan Lisi=le.
Zhangsan like Lisi=le
'Nowadays, Zhangsan has begun to like Lisi.'

In Dai (1992 b: 276), I explicitly argue against the clitic analysis of the above sentential particles, because even though the condition for the word-internal sandhi, the Final Elision for ma and ba, is met, it cannot optionally delete the rime of a second syllable and resyllabify its bilabial stop onset as the coda of the first syllable as it can in wo men → wom 'we' and ba-ba → bap 'dad'.

(25) a. Zhangsan xihuan wo=ma/ba?
Zhangsan like I=ma/ba
'Does Zhangsan like me?' (ma)
'(I suppose that) Zhangsan likes me, (does he?)' (ba)
b. *Zhangsan xihuan Lisim?
c. *Zhangsan xihuan Lisip?
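The Final Elision pattern appealed to here (wo men → wom, ba-ba → bap, but no elided form for a sentence-final ma or ba) can be sketched as a rule that is sensitive to a word-internal domain. The function name `final_elision`, the `same_word` flag, and the coda table are hypothetical; the table lists only the onset-to-coda pairings suggested by the forms cited in this chapter, and the additional conditions on the rule mentioned in the notes are not modeled.

```python
# Bilabial onsets of the second syllable and the coda they surface as.
# m -> m follows "wo men -> wom"; b -> p follows "ba-ba -> bap";
# p -> p is an assumption added by analogy.
BILABIAL_CODA = {"m": "m", "b": "p", "p": "p"}


def final_elision(first: str, second: str, same_word: bool) -> str:
    """Return the elided variant if the rule can apply, else the full form.

    The rule is optional in the source; this sketch returns the elided
    variant whenever it is available, and the unreduced sequence otherwise.
    """
    onset = second[0]
    if same_word and onset in BILABIAL_CODA:
        # Delete the rime of the second syllable and resyllabify its onset.
        return first + BILABIAL_CODA[onset]
    return f"{first} {second}"


print(final_elision("wo", "men", same_word=True))
print(final_elision("ba", "ba", same_word=True))
print(final_elision("Lisi", "ma", same_word=False))
```

Treating the particle as outside the host's phonological word (`same_word=False`) is what blocks the starred forms in (25 b/c) under this sketch; that setting encodes the argument of Dai (1992 b) rather than an independent fact.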

Morphological words in Chinese


But the argument based on the applicability of the Final Elision may not be relevant here, since one may instead insist on phonological words being prosodically strong in Chinese. In traditional grammar, the term "morphological word" would be an irrelevant notion to Chinese, since the language has been assumed to have little or no inflection. This view of Chinese I refute in Dai (1992 b: 152). Let us take for granted the conclusion offered there that aspect markers such as -le or -zhe in Chinese are inflectional morphemes which close a word (verb), although they are not obligatory in every construction. Now the notion of morphological word may become relevant in analyzing the lai-construction. The data below indicate that -le can independently attach to the first verb (V1) lai or the second verb (V2) chang when either occurs alone, as in (26 a/b) respectively. When V1 and V2 are in series, however, only V2, but not V1, can be so suffixed, as in (26 c/d).

(26) a. Ta lai-le liangci.
he come-perf. twice
'He came twice.'
b. Ta chang-le liangci.
he sing-perf. twice
'He sang twice.'
c. Ta lai chang-le liangci.
he come sing-perf. twice
'He came and sang twice.'
d. *Ta lai-le chang liangci.
he come-perf. sing twice

Here one may assume that there is no morphological word boundary between V1 and V2 in serialization and that V1 + V2 forms a single morphological word. This morphological word instantiates two syntactic words.

8.2. Compounds

"Compound" (or "compounding") is used as a cover term for a collection of related, but not necessarily identical, phenomena in the literature, ranging from a word composed of two or more bound stems to a word consisting of two or more existing words. Following Dai (1992 b: 64), I


reserve the term compound/compounding for a particular class of compound-like phenomena. A compound is a syntactic word (abbreviated to W in case of need) composed of two (or more) W's, and correspondingly, it is a morphological word (abbreviated to w) composed of two (or more) w's. This is illustrated as [W/w W/w + W/w] by the English compounds blackboard and television table. Similarly, a class of combinations of N with N (ertong-leyuan [child-paradise] 'children's playground') and A with N (xinxian-sucai [fresh-vegetable] 'fresh vegetable') which are prima facie phrases are actually compounds in Chinese, as will be shown shortly. In these compounds, the first element modifies the second, and the result is an N. More precisely, a compound is composite in both syntax and morphology. Syntactic composition means that the internal structure of the matrix compound W is describable or licensed by a syntactic rule in a language, in terms of (sub)categories, headedness, syntactic features for inflected forms, ordering, grammatical relations, and so forth. Morphological composition means that the immediate constituents of a matrix compound morphological word are licensed by a morphological rule concatenating nonaffixal items and referring to their (sub)categories, forms, and ordering.14 Thus the two Chinese nominal compoundings can be described by one syntactic W-rule as in (27 a), in which XW[+N] is the modifier of the head NW. This syntactic W-rule is matched by the morphological w-rule in (27 b).15

(27) a. Syntactic compounding (W-rule): NW → XW[+N] NW (ertong-leyuan 'children's playground', xinxian-sucai 'fresh vegetable' ...)
b. Morphological compounding (w-rule): concatenation of Xw[+N] + Nw

The syntax-morphology co-licensing of a compound predicts at least the following syntactic properties of compounds, if other conditions are met. First, because of the internal syntactic concatenations of syntactic words, syntactic operations referring to rank W may refer to the internal Ws of a compound (Dai 1992 b: 109). For instance, since coordination in Chinese may refer to W besides phrase and sentence, coordination is expected to be possible in compounds16 ([N [N [N sucai] he [N shuiguo]] [N guantou]] [vegetable and fruit can] 'cans of vegetables and cans of fruit').


Second, because syntactic concatenations elsewhere are usually recursive and productive, a syntactic compounding rule predicts or licenses an open-ended list of compounds in that category. For example, the N → N + N syntactic compounding rule is not only recursive (Sichuan-hongmu-zhuozi [[Sichuan-padauk]-table] 'a Sichuan padauk table'), but also productive: (28)

zhongguo-renmin [China-people] 'Chinese people', ertong-leyuan [child-paradise] 'children's playground', naiyou-binggan [butter-cracker] 'butter crackers', chengren-xuexiao [adult-school] 'an adult school', wanju-gongchang [toy-factory] 'a factory making toys', tudi-geming [land-revolution] 'agrarian revolution', shuiguo-guantou [fruit-can] 'can of fruit', suanpan-naodai [abacus-brain] 'a math whiz', hongmu-zhuozi [padauk-table] 'a padauk table', etc.
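The recursive character of the N → N + N rule can be made concrete with a short sketch in which a compound noun is itself allowed to serve as the modifier or head of a larger compound. The names `Compound`, `spell`, `bracket`, and `ultimate_head` are hypothetical; the sketch models only the syntactic side of the rule (modifier before head, output of category N) and says nothing about lexical gaps or noncompositional meanings.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Compound:
    modifier: "Noun"  # first element, modifies the head
    head: "Noun"      # second element, determines the category N


# A noun is either a simple word or a compound built by the rule.
Noun = Union[str, Compound]


def spell(n: Noun) -> str:
    """Hyphenated surface form, as in the examples in (28)."""
    if isinstance(n, str):
        return n
    return f"{spell(n.modifier)}-{spell(n.head)}"


def bracket(n: Noun) -> str:
    """Bracketed structure showing how the rule applied."""
    if isinstance(n, str):
        return n
    return f"[{bracket(n.modifier)}-{bracket(n.head)}]"


def ultimate_head(n: Noun) -> str:
    """Follow the head branch down to a simple noun."""
    return n if isinstance(n, str) else ultimate_head(n.head)


# Recursion: the output of one application feeds the next.
sichuan_padauk_table = Compound(Compound("Sichuan", "hongmu"), "zhuozi")
print(spell(sichuan_padauk_table))
print(bracket(sichuan_padauk_table))
print(ultimate_head(sichuan_padauk_table))
```

Because `Compound` accepts any `Noun` in either slot, the set of licensed forms is open-ended, which is the sense of "productive" intended in the text.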

Third, there may be grammatical relations (subject, object, modifier, etc.) between the component W/w's of a compound, since grammatical relations are constructs of syntax. For example, N1 modifies N2 in the rule in question. In addition to those predictions made in syntax, the following morphological predictions can be made for compounding, since a compound as a whole is a morphological unit w. First, compounding as a morphological rule allows for idiosyncratic properties, such as lexical gaps (?daren-leyuan [adult-paradise] 'adult playground') and semantic noncompositionality of its components (e. g., ertong-leyuan [child-paradise] is not necessarily a 'children's paradise' but may mean 'children's playground'). More importantly, all components of a compound observe the lexical integrity of a morphological word. Since movement, modification, or pronominalization operate universally at phrasal levels, syntactic words within a compound are not accessible to them according to the Lexical Integrity Hypothesis, similar to those illustrated in section 7. Therefore, an external adjectival phrase with de may not modify N1 in the expression, as in (29 b) ("#" designates an impossible reading). The modification of N2 is apparent, since N2 is the ultimate modified head of the expression. Consequently, an external modifier pragmatically compatible with N1 but not N2 yields an anomalous reading, represented as "?" in (29 c). With de inserted in the compound, the anomaly is gone, as in (29 d). This means that only when N1 is made phrasal (via being followed


by de) can it take a modifier. Also, an external modifier incompatible with N1 but compatible with N2 will not yield an anomaly, as in (29 e).

(29) a. zhongguo-renmin
[China-people]
'Chinese people'
b. weida de zhongguo-renmin
great de China-people
'great Chinese people'
#'people in the great country China'
c. liaokuo de zhongguo-renmin
broad de China-people
?'broad Chinese people'
#'people in the broad country China'
d. liaokuo de zhongguo de renmin
broad de China de people
'people in the broad country China'
#?'broad Chinese people'
e. qinlao de zhongguo-renmin
hard-working de China-people
'hard-working Chinese people'
#?'people in the hard-working country China'

A similar set of examples is given below:

(30) a. wanju-gongchang
toy-factory
'a factory making toys'
b. Shanghai de wanju-gongchang
Shanghai de toy-factory
'a toy factory in Shanghai'
#'a factory making toys of Shanghai-style'
c. ertong de wanju-gongchang
child de toy-factory
?'a children's play factory'
#'a factory making toys for children'
d. ertong de wanju de gongchang
child de toy de factory
'a factory making toys for children'
#?'a toy factory played by children'
e. juda de wanju-gongchang
large de toy-factory
'a large factory making toys'
#'a factory making big toys'

The blocking of an external modifier would remain unexplained by a phrasal account, since either of N1 and N2 can receive the corresponding modifier with de in isolation, cf. liaokuo de zhongguo 'the broad country China', Shanghai de wanju 'toys made in Shanghai'. By the morphological-compounding account, however, the solution follows from the Lexical Integrity Hypothesis. Since such modification works at the phrasal level, it cannot penetrate into a syntactic/morphological word, and thus break the integrity of a morphological word. Expectedly, a phrasal modifier to N2 cannot be placed between N1 and N2, as in (31 a/b), but if N1 is made phrasal by de, then grammaticality returns, as in (31 c).

(31) a. hongmu-zhuozi
padauk-table
'padauk table'
b. *hongmu- yuan de zhuozi
padauk- round de table
'round padauk table'
c. hongmu de yuan de zhuozi
padauk de round de table
'round padauk table'

Summarizing, the assumption that a compound syntactic/morphological word is both a syntactic and morphological composite of words gives rise to syntax-morphology co-licensing of compounding. It follows that components of compounds may behave like components of both phrases and words. The set of syntactic properties predicted for compounding differentiate compounding rules from typical affixation rules in morphology on the one hand; on the other hand, the morphological properties of compounding distinguish compounding from phrasal rules in syntax. For details, see Dai (1992 b: 64-151).


9. Conclusion

This paper has developed notions of the syntactic word, the phonological word, and the morphological word in Chinese. These do not always converge; each is a domain to which the rules of an independent grammatical component refer. Based on the Lexical Integrity Hypothesis, syntactic tests resorting to phrasal operations may be used for differentiating words in a phrase from morphemes in a word. The autonomy of words in syntax, phonology, and morphology accounts for cliticization by which two syntactic words are pronounced as one word-like prosodic unit, and for compounding by which two syntactic words occur within one morphological unit.

Notes

1. This paper is developed from an earlier paper of mine, "Fundamental concepts in Chinese morphology", in International Symposium on Chinese Languages and Linguistics 3, 411-430, 1992, which is in turn based on chapter 2 of my Ph. D. dissertation Chinese morphology and its interface with the syntax (henceforth, Dai (1992 b)). I am greatly indebted to the Editor for his advice on the content and organization of this paper. Thanks also go to Catherine Callaghan, Brian Joseph, Carl Pollard, James Tai, and Arnold Zwicky for their comments on, and criticism of, the early presentations of this paper. As always, I alone am responsible for the remaining errors. Correspondence address: University Libraries, The Ohio State University, 1858 Neil Avenue Mall, R 140, Columbus, OH 43210-1286, USA.
2. They apply within prosodic (phonological) domains, subject only to phonological conditions (Nespor and Vogel 1986), e.g., flapping alveolar stops between a stressed and an unstressed vowel, or assimilation of nasals in place of articulation with the following consonant in English. By contrast, nonautomatic rules are morphologically or syntactically conditioned (e.g., the /k/-/s/ alternation in electric and electricity, or the alternation of the indefinite article a-an in English). To me, nonautomatic rules are morphological rules resorting to phonological operations (cf. Janda 1987, Joseph and Janda 1988).
3. Thus the flapping rule is an internal sandhi, and the nasal place assimilation an external sandhi. For nonautomatic phonology, the /k/-/s/ alternation (electric vs. electricity) applies word-internally, and the article a-an alternation externally.


4. This approach is employed by Dai (1990 a) for the description of the historical morphologization of syntactic phrases in the language.
5. There are still lexical gaps: *ajarly, *bluely, *smally, etc.
6. As a reminder, in English, the slot for VW in VP → VW NP may be filled by a VP or phrase-word, as in [gave Mary] a book and [sent away] the money (Zwicky 1990: 206). The same is true of the corresponding slot in Chinese, since a V + measure-P (xiu san tian 'repair three days') can fill the slot. The phenomenon requires an adequate grammar to allow rank shift (Zwicky 1990: 206), i. e., a phrase may function as a word, but it does not entail that VP → VW NP is not a rule in both languages.
7. Cf. Chao (1968: 254).
8. Additional conditions must be placed on the application of the Final Elision besides those observed in the traditional literature and in Dai (1990 c). For example, the rule does not seem to work if the vowel of the second syllable is a front vowel, or if the second verb in the lai-construction (to be discussed shortly) is disyllabic. I won't explore these conditions in detail, since they do not affect my argument here.
9. Chao (1968: 141) thinks that the vowel in question is not missing but is pronounced as voiceless; thus the formative is still syllabic. He also suspects that the rule is caused by the falling tone in the preceding (root) syllable. However, I observe that tonality is irrelevant to many speakers. At this point I am not sure whether resyllabification or devoicing is closer to being representative and leave the burden of proof to the phoneticians. Fortunately, neither choice will affect the main concern here, namely, that the rule is a word-internal one and cannot apply across word boundaries.
10. Certain syntactic operations such as coordination and anaphora may additionally refer to syntactic words in compounding (see section 8).
11. As a reminder, a free form cannot necessarily occur alone. In addition, I will follow a traditional principle in this paper, i. e., an item is considered free if it is sometimes free. For instance, go and black are free words in They go to school and The board is black respectively, but bound morphemes in They are going to school and The blackboard is there. But if an item is said to be bound, it is always bound, e. g., -ing and -ness are always bound suffixes in English.
12. Language- and structure-specific restrictions on expansion can still be found. For instance, in English, no word may intervene between the verb see and its pronominal object NP it in see it.
13. The A-not-A question is one of the hotly-debated issues in Chinese syntax (for more discussion, see C-T. Huang 1989, Dai 1990 b: 285, 1992 b: 231).
14. Such constructs find independent motivation in phrasal syntax and affixal morphology anyway. As an example of the latter, the morphological rules licensing ir-regular-iti-es specify, among other things, that the base regular is an adjective (category), that the internal morphemes are arranged in terms of prefixation or suffixation (order), and that the result is a plural (inflectional form).
15. As W represents a syntactic word, NW contrasts with NP in syntactic rank. It ought not to be strange that an NW may contain Ws and call (27 a) a syntactic rule, since the immediate constituents of a phrase may contain phrases, e. g., VP → VP AdvP, giving [VP [VP studied English] [AdvP at home]]. The redundancies between (27 a) and (27 b) in terms of (sub)category, order, etc., are not of concern here.
16. This is the reason why using coordination is not an effective method to distinguish phrases from words and morphemes (cf. section 7.8).

References

Anderson, Stephen R.
1985 "Inflectional morphology", in: Timothy Shopen (ed.), 150–201.
1992 A-morphous morphology. Cambridge: Cambridge University Press.
Bloomfield, Leonard
1933 Language. New York: Henry Holt and Co.
Chan, Marjorie–Tom Ernst (eds.)
1989 Proceedings of the Third Ohio State University Conference on Chinese Linguistics. Bloomington: Indiana University Linguistics Club.
Chao, Yuen Ren
1968 A grammar of spoken Chinese. Berkeley: University of California Press.
Cheng, Chin-Chuan
1973 A synchronic phonology of Mandarin Chinese. The Hague–Paris: Mouton.
Di Sciullo, Anna M.–Edwin Williams
1987 On the definition of word. Cambridge: MIT Press.
Dai, John Xiang-Ling
1990 a "Historical morphologization of syntactic structures: evidence from derived verbs in Chinese", Diachronica 7: 9–46.
1990 b "Some issues on A-not-A questions in Chinese", Journal of Chinese Linguistics 18: 285–316.
1990 c "Syntactic constructions in serial verb expressions in Chinese", The Ohio State University Working Papers in Linguistics 39: 316–339.
1992 a "The head in wo pao de kuai", Journal of Chinese Linguistics 20: 84–119.
1992 b Chinese morphology and its interface with the syntax. [Unpublished Ph. D. dissertation, The Ohio State University.]
Hammond, Michael–Michael Noonan (eds.)
1988 Theoretical morphology: approaches in modern linguistics. Orlando, FL: Academic Press.


Huang, Chu-Ren
1985 Chinese sentential particles: a study of cliticization. [Unpublished paper presented at the Linguistic Society of America Annual Meeting, Seattle, WA, Dec. 29, 1985.]
Huang, James C-T.
1989 "Modularity and explanation: the case of Chinese A-not-A questions", in: Marjorie Chan–Tom Ernst (eds.), 141–169.
Janda, Richard
1987 On motivation for an evolutionary typology of sound-structure rules. [Unpublished Ph. D. dissertation, University of California, Los Angeles.]
Joseph, Brian D.–Richard D. Janda
1988 "The how and why of diachronic morphologization and demorphologization", in: Michael Hammond–Michael Noonan (eds.), 193–210.
Kratochvil, Paul
1968 The Chinese language today: features of an emerging standard. London: Hutchinson University Library.
Levine, Robert (ed.)
1992 Formal grammar: theory and implementation. Oxford: Oxford University Press.
Li, Charles N.–Sandra A. Thompson
1981 Mandarin Chinese: a functional grammar. Berkeley: University of California Press.
Lu, Zhiwei (ed.)
1964 Hanyu goucifa [A grammar of Chinese word-formation.]. Beijing: Kexue Chubanshe [Science Publication Co.].
Lü, Shu-Xiang
1963 "Xiandai Hanyu danshuangyinjie wenti chutan [A preliminary investigation on the mono- and disyllabification of Chinese.]", Zhongguo Yuwen [Chinese Language and Writing] 1963: 10–22.
Lyons, John
1968 Introduction to theoretical linguistics. Cambridge: Cambridge University Press.
Nespor, Marina–Irene Vogel
1986 Prosodic theory. Dordrecht: Foris.
Nevis, Joel
1988 Finnish particle clitics and general clitic theory. New York: Garland Publishing Co.
Packard, Jerome L.
1990 "A lexical morphology approach to word formation in Mandarin", Yearbook of Morphology 3: 21–37.


Sadock, Jerrold M.
1985 "Autolexical syntax: a theory of noun incorporation and similar phenomena", Natural Language and Linguistic Theory 3: 379–439.
Selkirk, Elizabeth
1982 The syntax of words. Cambridge: MIT Press.
Sheu, Ying-Yu
1990 Topics on a categorial theory of Chinese syntax. [Unpublished Ph. D. dissertation, The Ohio State University.]
Shopen, Timothy (ed.)
1985 Language typology and syntactic description. Vol. 3. Cambridge: Cambridge University Press.
Zwicky, Arnold M.
1989 "Idioms and constructions", ESCOL '88: Proceedings of the Fifth Annual Meeting of the Eastern States Conference on Linguistics, 547–558.
1990 "Syntactic words and morphological words, simple and composite", Yearbook of Morphology 3: 201–216.
1992 "Some choices in the theory of morphology", in: Robert Levine (ed.), 327–371.

Wordhood in Chinese* San Duanmu

1. Introduction

While zi 'character' has figured prominently throughout the long history of Chinese linguistics, ci 'word' was hardly a topic prior to the twentieth century. According to Lü (1990: 367, note 3), the first Chinese scholar to talk about ci 'word', in contrast to zi 'character', was Shizhao Zhang (1907). Real discussion did not occur until the 1950s, when, prompted by the desire to introduce an alphabetic writing system, wordhood became an issue of urgency and many studies ensued. It was soon realized, however, that the task at hand was harder than one had thought, since testing criteria often conflicted with each other (see, for example, Lu [1964], Ling 1956, Fan 1958, Chao 1968, Lü 1979, Huang 1984, H. Zhang 1992, Dai 1992). This has made some leading scholars doubt whether defining "word" in Chinese is a meaningful thing to do. For example, in his classic work on Chinese grammar, Y. R. Chao (1968: 136) states that "Not every language has a kind of unit which behaves in most (not to speak all) respects as does the unit called 'word' ... It is therefore a matter of fiat and not a question of fact whether to apply the word 'word' to a type of subunit in the Chinese sentence". Similarly, Shuxiang Lü (1981: 45) says, "the reason why one cannot find a satisfactory definition for the Chinese 'word' is that there is nothing as such in the first place. As a matter of fact, one does not need the notion 'word' in order to discuss Chinese grammar."

The distinction between words and phrases, however, is of vital importance to both morphology and phonology. Without knowing what a word is, one cannot meaningfully talk about morphology. Similarly, some phonological rules, such as stress assignment and the determination of tonal domains, apply differently at the word level from the phrase level (cf. for example, Selkirk and Shen 1990; Duanmu 1992, 1993; H. Zhang 1992; and section 5 below). Without a distinction between words and phrases, such rules would appear ad hoc.

In this paper I discuss the distinctions between word and phrase in Chinese with regard to their morpho-syntactic, semantic, and phonological properties. The term "word" as used here refers to an X⁰ in the X-bar theory, and the term "phrase" refers to an XP in the X-bar theory. A "phrase" therefore can be either a phrase or a clause in the ordinary sense, which I will not distinguish. For example, da de shu¹ 'big DE tree' will simply be called a phrase, whether one analyzes it as 'a big tree' or 'a tree that is big' (cf. Sproat–Shih 1991 for the latter position). Note also that for a word that contains two or more morphemes, such as gao-xing 'glad' (literally 'high-mood'), I will not be concerned with whether it should better be called a compound or something else, although I will use the term "compound" when there is no possibility of confusion.² I will also assume that a word can be made of words, departing from Lu's [1964] position that a word can only be made of morphemes. Finally, for reasons of space, I will not discuss all forms of word structures. Instead, I will focus on nominal structures, and even here the discussion will not be exhaustive.

In section 2 I review previous morpho-syntactic and semantic criteria for testing wordhood and show the conflicts among them. In section 3 I suggest which criteria should be abandoned, which modified, and which adopted. Unlike the popular view, represented by Chao (1968) and Lü (1981), I conclude that wordhood in Chinese is clearly definable. In particular, a modifier-noun [M N] nominal without the particle de is a compound; so are its derivatives, such as [M [M N]], [[M N] N], [[M N] [M N]], etc., as proposed by Fan (1958) and Dai (1992). In section 4 I discuss some background in metrical and tonal phonology, as a preamble to section 5, where I give phonological evidence for wordhood in Chinese, which has been discussed very little previously. I show that phonological evidence and morpho-syntactic and semantic evidence support each other, and when one is missing the other can often fill the gap. In section 6, I discuss some remaining problems.

2. Previous criteria

In this section I review morpho-syntactic and semantic criteria that have previously been proposed for testing wordhood in Chinese. The list is probably not exhaustive but contains the important criteria.

2.1. The Lexical Integrity Hypothesis (LIH)

Huang (1984) suggests that most differences between a word and a phrase in Chinese can be attributed to the Lexical Integrity Hypothesis, which "is the single most important hypothesis underlying much work on Chinese compounds" (1984: 64). Similarly, in a comprehensive study on Chinese morphology, Dai (1992: 80) suggests that "the LIH is a theoretical universal, slight variants of which underlie most current linguistic theories." Following Jackendoff (1972) and Selkirk (1984), Huang (1984: 60) states the Lexical Integrity Hypothesis as follows:

(1) The Lexical Integrity Hypothesis
    No phrase-level rule may affect a proper subpart of a word.

Intuitively, the Lexical Integrity Hypothesis makes good sense. For phrasal rules, words are usually the minimal units whose internal structures are no longer accessible. In practice, however, it is not always easy to decide which operation is a phrasal rule, and different test criteria often give conflicting results. In the following, therefore, I will review various test criteria separately.

2.2. Conjunction Reduction

Huang (1984) suggests that in both Chinese and English Conjunction Reduction can be applied to coordinated phrases but not to coordinated words. For example, consider the following (the latter two are taken from Huang 1984: 61):

(2) a. [jiu de shu] gen [xin de shu]
       old DE book and new DE book
       'old books and new books'
    b. [jiu de gen xin de] shu
       old DE and new DE book
       'old and new books'

(3) a. [huo-che] gen [qi-che]
       fire-car and gas-car
       'train and automobile'
    b. *[huo gen qi] che
       fire and gas car

(4) a. [New York] and [New Orleans]
    b. *New [York and Orleans]


(2 a) is a conjunction of two phrases, so Conjunction Reduction can apply to delete the first shu 'book', giving (2 b). In contrast, (3 a) is a conjunction of two compounds, so Conjunction Reduction cannot apply, as shown by the ill-formed (3 b). The same is true in English. In (4 a) there is a conjunction of two proper names, which behave like compounds rather than phrases, therefore Conjunction Reduction cannot apply, as shown by the badness of (4 b). As Huang suggests, Conjunction Reduction is a phrase-level rule. By the Lexical Integrity Hypothesis, Conjunction Reduction cannot be applied to coordinated words. In other words, the Conjunction Reduction effect is a reflex of the Lexical Integrity Hypothesis.

The Conjunction Reduction effect has been observed before. For example, Fan (1958) suggests that there are two kinds of nominals in Chinese, as shown below (M = modifier, N = noun, de = a particle):

(5) a. [M de N]
    b. [M N]

As Fan shows, these nominals behave quite differently in a number of ways. Among the differences, Fan notes (1958: 215) that Conjunction Reduction may apply to (5 a) but not to (5 b), as we saw in (2)–(3). Further examples are shown below:

(6) a. [xin de yi-fu] he [xin de xie]
       new DE clothes and new DE shoe
       'new clothes and new shoes'
    b. xin de [yi-fu he xie]
       new DE clothes and shoe
       'new [clothes and shoes]'

(7) a. [yang mao] he [yang rou]
       sheep wool and sheep meat
       'sheep wool and sheep meat'
    b. *yang [mao he rou]
       sheep wool and meat
       'sheep [wool and meat]'

(6 a) is a conjunction of two [M de N] structures, so Conjunction Reduction may apply to give (6 b). In contrast, (7 a) is a conjunction of two [M N] structures, so Conjunction Reduction cannot apply to give the intended (7 b).

By the Conjunction Reduction criterion, all [M de N] nominals are phrases, and all de-less [M N] nominals are words. This is what Fan (1958: 216) suggests. In addition, the Conjunction Reduction criterion can be applied iteratively, so that [M [M N]], [[M N] N], [[M N] [M N]], etc., are also words. For example, not only is xin shu 'new book' a compound, but [xiao [xin shu]] 'small new book', [[da yan-jing] gu-niang] 'big-eyed girl', [[chang mao] [xiao gou]] 'long-haired small dog' are also compounds, and so on.

The Conjunction Reduction test is challenged by Dai (1992, Chapter 3), who argues that coordination may appear inside a compound, therefore Conjunction Reduction is not a phrase-level rule. For evidence, Dai cites compounds like television and VCR table (1992: 65) and anti- and pro-democracy (1992: 123), which contain compound-internal coordination. We will return to Dai's criticism in section 3. We will also see below that the Conjunction Reduction test is in conflict with several other tests.
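
The iterative application of the Conjunction Reduction criterion described above can be sketched as a small recursive check. This is only an illustration, not part of any cited analysis: the `Nominal` class, its field names, and the function `is_word_by_cr` are hypothetical, and the sketch assumes that bare strings stand for word-level items and that any nominal built with de is a phrase.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Nominal:
    """A hypothetical binary modifier-noun structure (names are illustrative)."""
    modifier: Union[str, "Nominal"]
    head: Union[str, "Nominal"]
    has_de: bool = False  # True for [M de N], False for de-less [M N]


def is_word_by_cr(node: Union[str, Nominal]) -> bool:
    """Return True if the node counts as a word under the criterion as sketched.

    Assumption: a bare string is a word-level item; [M de N] is a phrase;
    a de-less [M N] is a word when both of its parts are themselves words.
    """
    if isinstance(node, str):
        return True
    if node.has_de:
        return False
    return is_word_by_cr(node.modifier) and is_word_by_cr(node.head)


xin_shu = Nominal("xin", "shu")                      # 'new book'
xiao_xin_shu = Nominal("xiao", xin_shu)              # 'small new book'
da_yanjing_guniang = Nominal(Nominal("da", "yan-jing"), "gu-niang")
xin_de_shu = Nominal("xin", "shu", has_de=True)      # 'new DE book'

print(is_word_by_cr(xin_shu))             # True
print(is_word_by_cr(xiao_xin_shu))        # True
print(is_word_by_cr(da_yanjing_guniang))  # True
print(is_word_by_cr(xin_de_shu))          # False
```

The sketch returns False for a de-less nominal that contains a de-phrase; that choice is a simplification made here, since the passage above does not address such mixed structures.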

2.3. Freedom of Parts

The Freedom of Parts criterion, termed after Chao (1968: 361), says that if an immediate component of an expression is a "bound" form, such as an affix, then this expression is a word. The Freedom of Parts criterion has been proposed by earlier researchers such as Lu [1964] and Ling (1956). By Freedom of Parts, jin-zi 'gold' and gao-xing 'glad' (literally 'high-mood') are both words, since in the former both parts are bound forms, and in the latter the second part is a bound form. Huang (1984: 63) suggests that Freedom of Parts is derivable from the Lexical Integrity Hypothesis, presumably because a phrase consists of words, and all words are free; if an expression contains a bound form, then it cannot be a phrase.

Lü (1979: 21) points out that the Freedom of Parts test may lead to wrong results, a problem that was also noted by Lu [1964]. For example, the Chinese question marker ma is not a free form, but it does not make sense to consider a whole question sentence to which ma is attached a compound. In addition, it will be noted that the reverse of Freedom of Parts does not hold, that is, one cannot assume that if all parts of an expression are free forms, then the expression is a phrase. In English, for example, both black and bird are free forms, yet blackbird is a compound. Similarly, consider the Chinese examples below:

(8) a. Free-Free
       ji dan
       'chicken egg'
    b. Bound-Free
       ya dan
       'duck egg'

It happens that ji 'chicken' is a free form, but ya 'duck' usually has to be used with a meaningless suffix zi. As Lü points out, if the reverse of Freedom of Parts is true, one arrives at the rather absurd conclusion that ji dan 'chicken egg' is a phrase but ya dan 'duck egg' a word. It is generally true, however, that if one part of an expression is bound, and if the other part is not a phrase, then the expression must be a word. If both parts are free, then one has to use additional criteria. This is the approach of Lü (1979) and Chao (1968), among others.
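
The restricted version of Freedom of Parts stated at the end of the paragraph above can be written out as a one-way decision rule. The following is a minimal sketch; the function name and its parameters are hypothetical, and the judgments about which forms are bound are simply those reported in the text.

```python
from typing import Optional


def freedom_of_parts(first_is_bound: bool,
                     second_is_bound: bool,
                     other_part_is_phrase: bool = False) -> Optional[str]:
    """Apply the restricted Freedom of Parts test to a two-part expression.

    Returns "word" when at least one part is bound and the remaining part
    is not a phrase; returns None when the test is inconclusive and
    additional criteria are needed.
    """
    if (first_is_bound or second_is_bound) and not other_part_is_phrase:
        return "word"
    return None


print(freedom_of_parts(True, True))    # jin-zi 'gold': both parts bound
print(freedom_of_parts(False, True))   # gao-xing 'glad': second part bound
print(freedom_of_parts(True, False))   # ya dan 'duck egg': first part bound
print(freedom_of_parts(False, False))  # ji dan 'chicken egg': inconclusive
# A question sentence plus bound ma: the other part is a phrase, so no verdict.
print(freedom_of_parts(False, True, other_part_is_phrase=True))
```

Returning None rather than "phrase" reflects the point that the reverse of Freedom of Parts does not hold: two free parts do not by themselves make a phrase.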

2.4. Semantic Composition

Chao (1968: 363) proposes that for an expression whose parts are free, we can check whether the meaning of the expression is compositional from its parts. If the meaning is not compositional, then the expression is usually a word. If the meaning is compositional, then the expression is usually a phrase. Let us call this criterion Semantic Composition. For example, consider an example from Chao (1968: 363):

(9) da yi
    big garment
    'overcoat' (*'big garment')

Since the meaning of (9) is not a composition of its parts, (9) is a compound. Similarly, consider the following:

(10) a. da che
        big car
        'cart'
     b. huang jiu
        yellow wine
        '(yellow) rice-wine'

(11) a. da shu
        big tree
        'big tree'
     b. bai zhi
        white paper
        'white paper'

The meanings of (10 a, b) are not compositional, so they are words. The meanings of (11 a, b) are compositional, so they are phrases.


Since the meaning of a compound need not be compositional, an [A N] (adjective–noun) compound can take an additional A whose meaning may otherwise contradict that of the original A, as noted by Huang (1984: 61) and Dai (1992: 108), among others. This is shown below:

(12) a. *bai de hei de ban
        white DE black DE board
        'white black board'
     b. bai de hei-ban
        white DE black-board
        'white blackboard'
     c. bai hei-ban
        white black-board
        'white blackboard'

In (12 a), hei de ban 'black board' is a phrase, so it cannot take the additional adjective bai 'white', whose meaning contradicts that of the original adjective hei 'black'. In contrast, in (12 b, c) hei-ban 'blackboard' is a compound, so adding the additional bai 'white' (with or without the particle de) is possible, even though bai 'white' contradicts hei 'black'. Huang (1984: 61) suggests that semantic interpretation rules are phrasal rules, which cannot see the internal semantics of a word. Therefore Semantic Composition follows from the Lexical Integrity Hypothesis.

The Semantic Composition test has limitations, however. First, as noted by Chao (1968: 364) and Huang (1984: 63), the meaning of an idiomatic expression is not compositional, yet many idioms are not compounds. For example, neither kick the bucket nor let the cat out of the bag is a compound. Secondly, even when idioms are excluded, and when Semantic Composition is used together with the Freedom of Parts criterion, ji dan 'chicken egg' will still be seen as a phrase while ya dan 'duck egg' will be seen as a word, which is a rather strange conclusion. Finally, the results of Semantic Composition conflict with those of the Conjunction Reduction criterion; the latter considers both ji dan and ya dan as well as (10 a, b) and (11 a, b) to be compounds.

2.5. Syllable Count

Lü (1979: 21-22) suggests that in deciding whether an expression is a word or a phrase, one should consider the length of the expression. As Lü puts it (1979: 21), "The word in the mind of the average speaker is a sound-meaning unit that is not too long and not too complicated, about the size of a word in the dictionary entry." Specifically, Lü suggests that disyllabic [M N] nominals should be considered words, while quadrisyllabic or longer nominals should be considered phrases. In this analysis, ji dan 'chicken egg', ya dan 'duck egg', (10 a, b) and (11 a, b) are all compounds. On the other hand, all the following are phrases ((13 c) from Chao 1968: 481; (13 e) from Chao 1968: 365):

(13) a. ren-zao xian-wei
        man-make fiber
        'man-made fiber'
     b. xiu-zhen ci-dian
        pocket dictionary
        'pocket dictionary'
     c. luo-xuan tui-jin-qi
        snail-turn push-advance-instrument
        'screw propeller'
     d. Beijing shi-fan da-xue
        Peking Normal University
        'Peking Normal University'
     e. lian-he guo jiao-yu ke-xue wen-hua zu-zhi
        united nation education science culture organization
        'United Nations Education Science Culture Organization'

It will be noted that in each of (13 a–c), the first immediate component is not a free form. In (13 d, e), the expressions are proper names. According to Chao (1968), all these expressions are compounds. According to Lü, however, they are too long to be compounds.

It is not hard to see that the Syllable Count criterion is in conflict with most other criteria. We have already seen in (13) that it conflicts with Freedom of Parts, as well as with the general assumption that a proper name is not a phrase. On the other hand, for disyllabic [M N] nominals, the Syllable Count criterion gives the same results as the Conjunction Reduction criterion in that both consider ji dan 'chicken egg', ya dan 'duck egg', (10 a, b) and (11 a, b) as compounds. Yet for longer expressions, the Syllable Count criterion again gives different results from the Conjunction Reduction criterion, as the following examples show (taken from Fan 1958: 215; judgments are Fan's):

(14) a. [zheng-que yi-jian] he [zheng-que tai-du]
        correct opinion and correct attitude
        'correct opinion and correct attitude'
     b. *zheng-que [yi-jian he tai-du]
        correct opinion and attitude
        'correct [opinion and attitude]'

(15) a. [zheng-que si-xiang] he [cuo-wu si-xiang]
        correct thought and wrong thought
        'correct thought and wrong thought'
     b. *[zheng-que he cuo-wu] si-xiang
        correct and wrong thought
        '[correct and wrong] thoughts'

(14 a) is a conjunction of two [M N]s, therefore Conjunction Reduction cannot apply, as shown by the badness of (14 b). Similarly, (15 a) is a conjunction of two [M N]s and Conjunction Reduction again cannot apply. By the Conjunction Reduction criterion, therefore, a quadrisyllabic [M N] is a compound, but by the Syllable Count criterion, it is a phrase. If Syllable Count is accepted, a range of facts will remain unexplained.

A different version of Syllable Count was proposed earlier by Lu ([1964]: 22-27), who suggests that whether an [N N] nominal is a word or a phrase depends on the length of each N. In particular, [1 1], [1 2], [1 3], [2 1] and [3 1] (where the digits indicate the number of syllables in each N) are words regardless of other criteria (such as the Insertion test discussed immediately below), while [2 2] could be a word or a phrase depending on other criteria.
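
Lu's length-based version of Syllable Count amounts to a small lookup over syllable patterns. The sketch below is illustrative only: the names `WORD_PATTERNS` and `lu_syllable_count` are hypothetical, syllable counts are supplied by hand, and patterns the passage does not mention are deliberately left undecided.

```python
from typing import Optional

# Syllable patterns [n1 n2] that count as words regardless of other criteria,
# as listed in the passage above.
WORD_PATTERNS = {(1, 1), (1, 2), (1, 3), (2, 1), (3, 1)}


def lu_syllable_count(first_n: int, second_n: int) -> Optional[str]:
    """Classify an [N N] nominal by the syllable count of each N.

    Returns "word" for the listed patterns, "word or phrase" for [2 2]
    (other criteria must decide), and None for patterns the passage
    does not cover.
    """
    pattern = (first_n, second_n)
    if pattern in WORD_PATTERNS:
        return "word"
    if pattern == (2, 2):
        return "word or phrase"
    return None


print(lu_syllable_count(1, 1))  # e.g. ji dan 'chicken egg' -> word
print(lu_syllable_count(2, 2))  # e.g. ren-zao xian-wei -> word or phrase
print(lu_syllable_count(2, 3))  # e.g. luo-xuan tui-jin-qi -> None (not covered)
```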

2.6. Insertion

The Insertion test was proposed as early as Wang (1944: 16). Lu [1964] considers Insertion (what he calls "expansion") to be the most important test for wordhood. The Insertion test says that if an expression allows an item to be inserted between its parts, then it is a phrase; otherwise it is a word. The Insertion test is adopted by many others. For nominals, the typical item to be inserted is the particle de, so that [M N] is converted to [M de N]; in fact, according to Lu ([1964]: 21), de-insertion is the only workable test for [M N] nominals. For illustration, let us consider two cases from H. Zhang (1992: 33).

(16) a. bai zhi
        white paper
        'white paper'
     b. bai de zhi
        white DE paper
        'white paper'

(17) a. xin zhi
        letter paper
        'letter paper'
     b. *xin de zhi
        letter DE paper
        'letter paper'

(16 a) allows de-insertion, but (17 a) does not (for the intended meaning). Therefore, following Lu, H. Zhang considers bai zhi 'white paper' a phrase and xin zhi 'letter paper' a compound.

Lu ([1964]: 8) points out that for the Insertion test to work, it is necessary that the inserted material should not change the structure of the original expression. To what extent two expressions have the same structure is not explained in detail, but a few illustrations are given. For example, Lu considers pairs like (16 a, b) to have the same structures, both being [modifier noun], and the particle de apparently having no significance. On the other hand, the expressions below, from Lu ([1964]: 8), do not have the same structures:

(18) a. yang rou
        sheep meat
        'mutton'
     b. yang DE SHEN-SHANG YOU rou
        sheep DE body have meat
        'The sheep's body has meat.'

Although (18 a) can be converted into (18 b) by inserting the capitalized materials, the original structure has changed from a nominal in (18 a) to a sentence in (18 b). Therefore (18) should not be considered a genuine case of insertion.

A further restriction on the Insertion test is that the inserted material should not change the meaning of the original expression (cf. Lu [1964]: 32 and Chao 1968: 362). For example, consider the following:


(19) a. you zui
        oil mouth
        'glib talker'
     b. you de zui
        oil DE mouth
        'greasy mouth' (*'glib talker')

Although de can be inserted into (19 a) to give (19 b), the meaning has changed substantially. Therefore we should consider (19 a) to have failed the Insertion test. Let us then state the two conditions on the Insertion test below:

(20) Conditions on the Insertion Test
     a. The resulting expression should have the same structure as the original.
     b. The resulting expression should have the same meaning as the original.

Proponents of de-insertion must have assumed that it is possible, at least in some cases, that de-insertion will not change either the meaning or the structure of the original expression. But this assumption is not shared by others, since significant semantic and structural differences between [M N] and [M de N] have been well documented (e. g., Zhu [1980], Fan 1958, Lü 1979, Sproat–Shih 1991, Dai 1992). We will return to this point.

Lu ([1964]: 8) notes a further problem with the Insertion test. Sometimes the results of de-insertion conflict with each other depending on whether the host expression occurs alone or in a larger structure. Consider the following examples (from Lu [1964]: 8):

(21) a. yang rou
        sheep meat
        'mutton'
     b. yang de rou
        sheep DE meat
        'sheep's meat (mutton)'


(22) a. mai yi-jin yang rou
        buy one-jin sheep meat
        'to buy a jin of mutton' (a jin is 500 grams)
     b. ?mai yi-jin yang de rou
        buy one-jin sheep DE meat
        'to buy a jin of sheep's meat (mutton)'

In (21 a, b), both expressions are good (although one may argue whether the meanings are really the same). In (22), however, de-insertion makes (22 b) odd. (21)–(22) show that passing de-insertion in one environment does not guarantee passing it in another environment. To solve the problem, H. Zhang (1992: 52) suggests the following condition on where to apply de-insertion:

(23) An [M N] nominal is a phrase if it can be changed into [M de N] in the accusative position.

In other words, the proper place to apply de-insertion is a situation like (22), but not a situation like (21). According to (23), therefore, yang rou 'mutton' fails the de-insertion test, so it is a compound. Similarly, both ji dan 'chicken egg' and ya dan 'duck egg' are compounds. On the other hand, xin shu 'new book' is a phrase, as the following sentences show (H. Zhang 1992: 52):

(24) a. wo mai-le yi-ben xin shu
        I bought one-copy new book
        'I bought a new book.'
     b. wo mai-le yi-ben xin de shu
        I bought one-copy new DE book
        'I bought a new book.'

For H. Zhang, the meanings of (24 a, b) are identical, therefore xin shu 'new book' passes the de-insertion test and is a phrase. Similarly, da shu 'big tree', bai zhi 'white paper', hei mao 'black cat', etc., are considered phrases.

But why is the accusative position, and the accusative position alone, selected for de-insertion? Although H. Zhang does not explain, it is probably because it is hardest to apply de-insertion in that position. In any case, some problems remain. For example, whether the meanings of (21 a, b) and (24 a, b), and pairs like them, are really identical is perhaps not so easy to tell. Zhu [1980], for example, argues that xin shu and xin de shu have different meanings, as do [M N] and [M de N] in general. Similarly, according to Sproat–Shih (1991), xin shu means 'new book' but xin de shu means '(a) book which is new'. Moreover, the Insertion test is in conflict with the Conjunction Reduction test, as the following example shows:

(25) xin shu he jiu shu
     new book and old book
     'new book and old book'

(26) *[xin he jiu] shu
     new and old book
     'new and old books'

Since (25) cannot be reduced to (26), the Conjunction Reduction criterion regards both xin shu and jiu shu as compounds. Hence, the Conjunction Reduction criterion and the de-insertion criterion provide conflicting results.

There is another problem with the de-insertion test. Recall that for Lu and H. Zhang, both da shu 'big tree' and da de shu 'big tree' are phrases; presumably the presence or absence of de does not matter. Now consider the following examples:

(27) *da [tie de shi-zi]
     big iron DE lion
     'big iron lion'

(28) da de [tie de shi-zi]
     big DE iron DE lion
     'big iron lion'

Like da shu 'big tree', da [tie de shi-zi] 'big iron lion' must also be a phrase. Yet (27) is bad. For (27) to be good, there must be a de after da, as shown in (28). This effect has been noted by Fan (1958), Chao (1968: 288), Sproat–Shih (1992), and Dai (1992). For proponents of de-insertion there is no explanation for why (27) is bad. For others, however, the reason is simple. According to Zhu [1980], Fan (1958), Lü (1979), Sproat–Shih (1992), and Dai (1992), among others, [M de N] and [M N] are very different structures; [M de N] is a phrase and [M N] a compound. Even if there is no apparent semantic difference, de-insertion will change the structure of an [M N] nominal. In particular, in (27), [tie de shi-zi] is a phrase but it occurs inside a compound structure [A N], making the expression ill-formed.³

A further problem with the Insertion test is that even if insertion applies, an inserted item does not necessarily change a compound into a phrase. For example, according to Chao (1968: 362), a compound verb may take de while still remaining a compound, as shown below:⁴

(29) wo neng kan-jian ta.
     I can look-see him
     'I can see him'

(30) wo neng kan-de-jian ta.
     I can look-DE-see him
     'I can see him'

In (29), kan-jian is a compound verb, which can take an object. In (30), kan-de-jian is also a compound verb, even though de is added. One can also cite apparent cases in English, for example evening class is a compound and evening chemistry class is also a compound (Halle—Vergnaud 1987). In summary, the Insertion test can at most be used in a limited way. If the Insertion test cannot apply to an expression, then the expression is probably a word. If the test does apply, nothing can be inferred, and one has to turn to other evidence.

2.7. Exocentric Structure

Another test suggested by Chao (1968: 362) is whether the structure of an expression is exocentric. If it is, then the expression is a compound. Below are some examples:

(31) SV → N
     huo shao
     fire burn
     'baked wheaten cake'

(32) VO → N
     tian fang
     fill room
     'second wife (to a widower)'

(33) VV → N
     kai guan
     open close
     'switch'

Huang (1984: 63) attributes this criterion to the Lexical Integrity Hypothesis. This is because general principles require that all well-formed phrase structures be endocentric. In order for exocentric expressions to appear, they must be converted to compounds so that their internal structures are no longer visible to phrasal rules. Further examples of exocentric compounds include the following:

(34) a. SV → A
        shou-ti shi
        hand-carry style
        'portable style'
     b. VO → A
        wa-tu ji
        dig-soil machine
        'soil-digging machine (excavator)'

Note that VO and SV in (34) cannot be analyzed as relative clauses, since in Chinese, the particle de is required between a relative clause and the noun:

(35) a. [VO de N]
        dai mao-zi de ren (*dai mao-zi ren)
        wear hat DE person
        'the person who wears a hat'
     b. [SV de N]
        wo mai de shu (*wo mai shu)
        I buy DE book
        'the book I bought'

As far as I can see, the Exocentric Structure criterion works very well.

2.8. Adverbial Modification

Fan (1958: 214) notes that [A de N] may take an adverb (typically an adverb of degree) that modifies A, but [A N] cannot take such an adverb. Let us call this Adverbial Modification (although I leave it open whether all such modifiers are adverbials). The contrast between [A de N] and [A N] under Adverbial Modification is shown below:

(36) a. xin de shu
        new DE book
        'a new book'
     b. hen xin de shu
        very new DE book
        'a very new book'
     c. geng xin de shu
        more new DE book
        'a newer book'
     d. zui xin de shu
        most new DE book
        'the newest book'
     e. zheme xin de shu
        so new DE book
        'such a new book'
     f. bu xin de shu
        not new DE book
        'a book that is not new'

(37) a. xin shu
        new book
        'a new book'
     b. *hen xin shu
        very new book
        'a very new book'
     c. *geng xin shu
        more new book
        'a newer book'
     d. *zui xin shu
        most new book
        'the newest book'
     e. *zheme xin shu
        so new book
        'such a new book'
     f. *bu xin shu
        not new book
        'a book that is not new'

(36)–(37) show that [A de N] can take any adverbial that modifies A but [A N] cannot take such adverbials. Dai (1992: 108) suggests that the badness of (37 b–f) is due to the Lexical Integrity Hypothesis, in that A in [A N] is protected by Lexical Integrity and is not accessible to an external modifier. In contrast, A in [A de N] is not protected by Lexical Integrity and is accessible to an external modifier. But why should the adverbial be considered external rather than internal? We know, for example, that (37 b) hen xin shu is not 'a [very [new book]]' but 'a [[very new] book]', that is, hen is an internal modifier. Perhaps M in [M N] cannot be expanded? But this is not true either. The examples below show that [M N] can expand into [[X M] N]:

(38) [N N]            →  [[A N] N]
     bu shou-tao         [lan bu] shou-tao
     cloth glove         blue cloth glove
     'cloth glove'       'blue-cloth glove'

(39) [A N]            →  [[N A] N]
     hong shou-tao       [tao hong] shou-tao
     red glove           peach red glove
     'red glove'         'peach-red glove'

(38)-(39) show that M in [M N] is expandable at least in some cases. Why, then, can M not be expanded in (37)? The reason, I suggest, is that the [adverb adjective] structure is always a phrase,5 and because it is a phrase, it cannot occur inside a compound. In contrast, the [A N] in (38) and the [N A] in (39) are compounds, so they can occur inside a compound to give [[X M] N]. The same seems to be true in English, too. For example, most A, more A, so A, best A, very A, etc., where A is an adjective, are always phrases. On the other hand, there are numerous [A N] compounds, such as blackbird, redwood, and White House, and numerous [N A] compounds, such as peach-red, pitch-dark and snow-white.

2.9. XP Substitution

Fan (1958: 214) notes that N in [M de N] can be replaced by [X N], where X is a numeral-classifier unit or a demonstrative unit, but N in [M N] cannot be replaced in this way. Since there is in my view little question that both [Numeral-Classifier N] and [Demonstrative N] are phrases (or XPs), I will call this process XP Substitution. (40) gives the schematic forms of XP Substitution and (41)-(42) show some examples:

(40) a. [M de N] → [M de XP]
     b. [M N] → *[M XP]

(41) [M de XP]
     a. xin de [san ben shu]
        new DE three copy book
        'three books that are new'
     b. xin de [nei ben shu]
        new DE that copy book
        'that book which is new'

(42) *[M XP]
     a. *xin [san ben shu]
        new three copy book
     b. *xin [nei ben shu]
        new that copy book

(41) shows that N in [M de N] can be substituted by an XP. (42) shows that N in [M N] cannot be replaced by an XP. Recall that in section 2.8 we have seen that M in [M de N] can be substituted by a phrase but M in [M N] cannot. In other words, both M and N in [M de N] can be replaced by a phrase, while neither M nor N in [M N] can. Similar effects are observed by Sproat-Shih (1991, 1992) and Dai (1992), who note that a de-phrase cannot occur inside a compound. For example, consider the following:6

(43) a. [[M de N] de N]
        [xin-xian de dou-sha] de yue-bing
        fresh DE bean-paste DE moon-cake
        'mooncake with fresh bean-paste filling'
     b. [M de [M de N]]
        xiao de [xin de shu]
        small DE new DE book
        'small new book'

(44) a. *[[M de N] N]
        *[xin-xian de dou-sha] yue-bing
        fresh DE bean-paste moon-cake
        'mooncake with fresh bean-paste filling'
     b. *[M [M de N]]
        xiao [xin de shu]
        small new DE book
        'small new book'

(43) shows that both M and N in [M de N] can be substituted by a de-phrase. (44) shows that neither M nor N in [M N] can be replaced by a de-phrase. The contrast between [M de N] and [M N] under XP Substitution is compatible with the assumption that [M N] is a compound and [M de N] is a phrase. Since a phrase cannot occur inside a compound, the badness of (42 a, b) and (44 a, b) is expected. If [M de N] and [M N] had the same structures, as proponents of de-insertion assume, then the contrast between (41) and (42) would need an explanation.

2.10. Productivity

It is reasonable to assume that phrasal rules are productive. For example, if a language has the rule NP → [A N], by which a noun phrase can be made of an adjective plus a noun,7 one expects most [A N] combinations to be possible. On the other hand, if most [A N] combinations are not possible, one would conclude that [A N] is not a phrase.


In English, [A N] is productive. In Chinese, many adjectives, such as da 'big', xiao 'small', xin 'new', jiu 'old', bai 'white', hong 'red', chang 'long', duan 'short', etc., are quite productive in that they can form [A N] with many nouns. If all [A N] structures are compounds in Chinese, as proposed by Fan (1958) and Dai (1992), one would wonder whether the criteria have been too loose. Why, for example, are all the expressions below (mostly from Zhu [1980]: 9-10) compounds in Chinese, while their structures, their meanings, and their English translations seem patently phrasal?

(45) a. gui dong-xi
        expensive article
        'expensive article'
     b. bao zhi
        thin paper
        'thin paper'
     c. cong-ming hai-zi
        clever child
        'clever child'
     d. hua-ji dian-ying
        funny movie
        'funny movie'
     e. huang zhi-fu
        yellow uniform
        'yellow uniform'
     f. shen shui
        deep water
        'deep water'
     g. duan xiu-zi
        short sleeve
        'short sleeve'
     h. bai zhi
        white paper
        'white paper'

The picture in (45) is deceptive, however. In his insightful study on Chinese adjectives, Zhu [1980] points out that Chinese [A N] is not fully productive and many gaps remain. For example, all the expressions in (46) are unnatural, even though they are exactly parallel in structure to those in (45) and their English translations are perfectly well-formed (from Zhu [1980]: 9-10; judgments are Zhu's):

(46) a. *gui shou-juar
        expensive handkerchief
        'expensive handkerchief'
     b. *bao hui-chen
        thin dust
        'thin dust'
     c. *cong-ming dong-wu
        clever animal
        'clever animal'
     d. *hua-ji ren
        funny person
        'funny person'
     e. *huang qi-chuan
        yellow steam-boat
        'yellow steam-boat'
     f. *shen shu
        deep book
        'difficult book'
     g. *duan cheng-mo
        short silence
        'short silence'
     h. *bai shou
        white hand
        'white hand'

One may wonder if Chinese has language-particular constraints on the collocation between certain adjectives and nouns, such as those in (46). But this is not the case. All of (46 a-h) will become good if de is added between the adjective and the noun, as shown below:8

(47) a. gui de shou-juar
        expensive DE handkerchief
        'expensive handkerchief'

     b. bao de hui-chen
        thin DE dust
        'thin dust'
     c. cong-ming de dong-wu
        clever DE animal
        'clever animal'
     d. hua-ji de ren
        funny DE person
        'funny person'
     e. huang de qi-chuan
        yellow DE steam-boat
        'yellow steam-boat'
     f. shen de shu
        deep DE book
        'difficult book'
     g. duan de cheng-mo
        short DE silence
        'short silence'
     h. bai de shou
        white DE hand
        'white hand'

Examples like the above strongly indicate that while [A de N] is fully productive in Chinese, [A N] is not. It should be pointed out that the distributional gaps in (46) are not exceptions but the norm. To appreciate how defective the [A N] distribution is, consider the following:

(48) a. gao shan
        tall mountain
        'tall mountain'
     b. gao lou
        tall building
        'tall building'

(49) a. *gao shu
        tall tree
        'tall tree'

     b. *gao ren
        tall person
        'tall person'

In (48), gao 'tall' appears productive. But (49 a, b), perfectly normal [A N] structures from an English point of view, are simply bad. One may suspect that, in parallel to English, which has two words for 'highness', high (which goes with standard, speed, and mountain) and tall (which goes with building, tree, and person), perhaps there is another Chinese word for 'highness' which can go with shu 'tree' and ren 'person'? Unfortunately, this is not the case; gao is the only word in Chinese for 'highness' and covers the meanings of both high and tall in English. To express 'tall tree' and 'tall person' in Chinese, gao must be followed by de.

(50) a. gao de shu
        tall DE tree
        'tall tree'
     b. gao de ren
        tall DE person
        'tall person'

In other words, there is simply no way of forming plain daily expressions like 'tall tree' and 'tall person' in Chinese with an [A N] structure. If [A N] is a productive Chinese construction, such gaps are very striking indeed. The following words from Zhu ([1980]: 11) nicely summarize the facts we examined in this section: "Evidence shows that ([A N]) is a structure that tends to be tightly frozen. Its structure is not determined by productive phrasal rules. When compared with other languages, this property is especially striking. When foreigners learn Chinese, they often cannot understand why expressions like bai shou 'white hand' and gui shou-juar 'expensive handkerchief' are not natural." Since [N N] is less productive than [A N] (cf. Lu [1964]), by similar arguments [N N] cannot be a phrase either. In short, productivity evidence supports the view that [M de N] is a phrase but [M N] a word.

2.11. Intuition

A number of researchers have assumed that Chinese speakers, or educated linguists at least, have an intuition of what a word is and that the predictions of one's theory should agree with it. For example, Lü (1979: 21-22) suggests that in the mind of the average speaker a "word" is something that is "not too long", and Lü proposes an upper limit of four syllables, beyond which an expression should be considered a phrase regardless of other criteria. Similarly, intuition is often appealed to when one is faced with conflicting criteria. For example, H. Zhang (1992: 39) notes that by the de-insertion test da shu 'big tree' and xiao shu 'small tree' are phrases, but by the Conjunction Reduction test (cf. section 2.2 above) they are compounds; since "(a)lmost all Chinese linguists are of the same view" that da shu 'big tree' and xiao shu 'small tree' are phrases, H. Zhang rejects the Conjunction Reduction test in favor of the de-insertion test.

Intuition is certainly an important factor to consider, and in many cases people's intuitions do agree. On the other hand, the fact that there is still no consensus on where to draw the line between word and phrase in Chinese, even though the discussions started at least as early as the 1950s, indicates that there are areas where people's intuitions either are not clear or do not agree. Specifically, while it is relatively easy to determine the wordhood of an expression that contains an affix, it is harder to analyze [M N] nominals that do not contain an affix. As Lu ([1964]: 5) puts it,

    When popular grammar books discuss (Chinese) word structures, they rarely focus on simple expressions like tie lu 'iron road (railroad)' and cai chuang 'vegetable beds' that are made of content forms only, apparently in order to avoid the greatest difficulty in Chinese morphology. In Chinese, expressions without functional forms are the hardest to analyze, because here one cannot rely on inflection to recognize wordhood, as one can in Indo-European languages.

Intuition, therefore, should be used with caution, especially with [Μ N] nominals. As far as possible, intuition should not be used alone to argue for one or another among conflicting criteria.

2.12. Summary

In this section I have reviewed a number of tests for wordhood in Chinese. I have focused on tests for nominals only, in particular [M N] nominals. The results are summarized below:

(51)  Test                       word or phrase
      Conjunction Reduction      both
      Freedom of Parts           both
      Semantic Composition       both
      Syllable Count             both
      Insertion                  both
      Exocentric Structure       ?
      Adverbial Modification     word
      XP Substitution            word
      Productivity               word
      Intuition                  ??

There is no question that [M de N] is always a phrase. For [M N], results differ. Three tests (Adverbial Modification, XP Substitution, and Productivity) consider all [M N]s as words. The Intuition test has no fixed answer, since people's intuitions do not always agree. The Exocentric Structure test considers exocentric [M N]s as words but says nothing about other [M N]s. The remaining five tests (Conjunction Reduction, Freedom of Parts, Semantic Composition, Syllable Count, and Insertion) consider some [M N]s as words and some as phrases; however, they differ on which [M N]s are words and which are phrases. I will now offer my view on which tests should be adopted and which abandoned.

3. The present analysis

In this section I offer my view of which criteria should be rejected and which adopted.

3.1. Rejecting Syllable Count, Insertion, and Intuition

Consider Intuition first. There are two reasons for rejecting it. First, as noted by Lu [1964], people's intuitions do not always agree, especially with [M N] nominals; therefore it is hard to decide whose intuition to follow. Second, when intuitions do agree in certain cases, one can usually interpret these intuitions in concrete terms. For example, all people agree that you zui 'oil mouth → glib talker' and tian fang 'fill room → second wife (to a widower)' are compounds; the former can be explained by Semantic Composition and the latter by Exocentric Structure. Therefore, it is better to rely on concrete evidence than intuition.

Next, consider Syllable Count. The shortcoming of this criterion is its arbitrary nature and lack of motivation. Why, for example, should the threshold for phrasehood be set at four syllables, instead of three or five? And why is there no such condition in other languages?

Finally, consider Insertion. As discussed in section 2.6, the Insertion criterion crucially requires that the following conditions both be met:

(52) Conditions on the Insertion Test
     a. The resulting expression should have the same structure as the original.
     b. The resulting expression should have the same meaning as the original.

But the first condition is unlikely to be satisfiable in de-insertion. This is because inserting de definitely makes a nominal into a phrase, whereas without de a nominal could be a word. Besides, as Fan (1958) has extensively shown, [Μ N] and [M de Ν] have very different syntactic behaviors, therefore they cannot be of the same structure. As for the second condition, it is often hard to tell when two expressions have the same meaning. For example, does 'a big tree' have the same meaning as 'a tree that is big'? The semantic judgment required here must be very refined. The same is true in Chinese. For some, such as Zhu [1980], [Μ N] and [M de Ν] never have exactly the same meanings; for others, [Μ N] and [M de Ν] can have the same meanings. But even if 'a big tree' and 'a tree that is big' have the same meaning, it does not follow that they have the same structure. Why then should one assume that da shu 'big tree' and da de shu 'big DE tree' have the same structure just because they have similar meanings?9 A further reason to reject the above three criteria (Intuition, Syllable Count, and Insertion) is that not only do they conflict with each other, but they conflict with other criteria as well (cf. section 2). As we will see below, once these three criteria are rejected, all the remaining criteria give converging results.

3.2. Adopting Conjunction Reduction, Freedom of Parts, Semantic Composition, and Exocentric Structure with limitations

Let us now consider Conjunction Reduction, Freedom of Parts, Semantic Composition, and Exocentric Structure. The assumption here is that phrases should have regular syntactic and semantic behavior; they should allow conjunction reduction, be made of free parts, be semantically compositional, and be structurally endocentric. If an expression fails any of these tests, it is not a phrase. This assumption is held by all analysts and will not be disputed here.

But what if an expression passes all these tests? Apparently one cannot conclude that the expression must be a phrase. If one does, one is assuming that no compound can allow conjunction reduction, be made of free parts, be semantically compositional, and be structurally endocentric. But this assumption is incorrect. As Dai (1992) points out, while some compounds may have peculiar syntactic and semantic behaviors, others have regular syntactic and semantic behaviors. Therefore, if an expression has peculiar behavior, it is likely a word, but if an expression does not have peculiar behavior, it may still be a word. Consider the following:

(53) a. meat-and-potato eater
     b. apple pie
     c. blackboard

(53 a) is a compound which contains an internal conjunction (from Dai 1992: 112, citing Bates 1988: 228).10 (53 b) is a compound that is made of two free parts. (53 c) is a compound whose structure is endocentric. Finally, the semantics of (53 a, b) are quite compositional. (53) shows that the syntactic and semantic structures of a compound can be regular. Thus, even if an expression passes all of Conjunction Reduction, Freedom of Parts, Semantic Composition, and Exocentric Structure, it still could be a compound. In other words, Conjunction Reduction, Freedom of Parts, Semantic Composition, and Exocentric Structure can only be used to spot expressions that have peculiar syntactic and semantic behavior, hence marking them as compounds, but they cannot be used for expressions with regular syntactic and semantic behavior, which may or may not be compounds.
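The one-directional logic of these four tests can be made explicit with a small sketch. It is only an illustration of the reasoning above, not part of the analysis itself; the test names, the `diagnose` function, and the sample judgments are hypothetical labels introduced here.

```python
from typing import Dict

# Hypothetical encoding: True means the expression behaves like a regular
# phrase on that test (allows conjunction reduction, has free parts, etc.).
DIAGNOSTIC_TESTS = (
    "conjunction_reduction",
    "freedom_of_parts",
    "semantic_composition",
    "endocentric_structure",
)


def diagnose(results: Dict[str, bool]) -> str:
    """Failing any test marks a word; passing all of them decides nothing."""
    for test in DIAGNOSTIC_TESTS:
        if not results[test]:
            return "word"
    return "undetermined"


# An exocentric expression such as (32) fails at least the structural test.
# The remaining values are placeholders chosen for illustration only.
exocentric_expression = {
    "conjunction_reduction": True,
    "freedom_of_parts": True,
    "semantic_composition": False,
    "endocentric_structure": False,
}

# An expression with fully regular behavior, like the compounds in (53).
regular_expression = {test: True for test in DIAGNOSTIC_TESTS}

print(diagnose(exocentric_expression))
print(diagnose(regular_expression))
```

The point of the sketch is that the function never returns 'phrase': these tests can only identify words, so another criterion is needed for expressions that pass all of them.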

3.3. Adopting Adverbial Modification, XP Substitution, and Productivity

Let us now consider the remaining three criteria, Adverbial Modification, XP Substitution, and Productivity. Adverbial Modification can probably be subsumed under XP Substitution, but it will not be our concern. These three criteria are based on reasonable assumptions, namely, that the A in an [A N] phrase (but not the A in an [A N] compound) should be modifiable by an adverb, that in a phrase made of two parts at least one should be an XP and so substitutable by an XP, and that all phrasal constructions should be productive. As we have seen in section 2, by these three criteria one arrives at the same conclusion that in Chinese, all [M de N]s are phrases and all [M N]s are words.

It will be noted that in this respect Chinese differs from English in an important way. In English, an [A N] nominal can be either a compound (e.g., black market) or a phrase (e.g., black dogs). This has led many people to assume that the same is true for Chinese [A N] structures. But in English, [A N] is a fully productive construction, where the A readily accepts adverbial modifications (e.g., difficult discussions, more difficult discussions, most difficult discussions, very difficult discussions). In Chinese, however, [A N] is unproductive for most adjectives (e.g., *jian-ku tao-lun 'difficult discussions'), and the small number of adjectives, such as da 'big' and xiao 'small', which seem to be quite productive in [A N] structures, cannot take an adverbial modifier (e.g., *geng da gou 'more big dog → a bigger dog', *zui da gou 'most big dog → the biggest dog', *hen da gou 'very big dog → a very big dog'; cf. geng da de gou 'more big DE dog → a bigger dog', zui da de gou 'most big DE dog → the biggest dog', hen da de gou 'very big DE dog → a very big dog'). If the Chinese [A N] is equated with the English [A N], such facts will be very hard to explain.

The same applies to longer de-less nominals, such as da-xing min-yong pen-qi-shi fei-ji 'large civilian jet liner'. For many people, such nominals are too long to be a compound. But then in *xiang-dang da-xing min-yong pen-qi-shi fei-ji 'fairly large civilian jet liner', *bi-jiao da-xing min-yong pen-qi-shi fei-ji 'relatively large civilian jet liner', and *geng da-xing min-yong pen-qi-shi fei-ji '(more large) larger civilian jet liner', why are the Chinese expressions bad yet the English ones good? In the present analysis, this contrast is just what one expects.

3.4. Summary

I have shown that there are good reasons for rejecting Syllable Count, Insertion, and Intuition as tests for wordhood in Chinese. Once this is done, all the remaining criteria provide converging results. In particular, all [M N] nominals, as well as their iterative derivatives (e.g., [M [M N]], [[M N] N], etc.) are words. This conclusion differs from most previous analyses of Chinese wordhood, but is similar to the one proposed by Dai (1992) and in part to the one proposed by Fan (1958). I will now present independent phonological evidence that supports the present analysis.


4. Background in metrical phonology and tonal phonology

To facilitate our discussion of phonological evidence for wordhood in Chinese, let me first review relevant findings in metrical and tonal phonology.

4.1. Metrical phonology

Metrical phonology determines which speech elements are more prominent than others. Metrical rules are often called stress rules for the reason that in many languages metrically prominent elements surface as stressed elements. It is important to remember, however, that phonetic stress (greater duration and/or intensity) is not the only possible manifestation of metrical prominence. In Japanese, for example, a metrically prominent syllable is assigned an H tone (which may spread to preceding syllables), without being necessarily longer or louder than other syllables. A metrically prominent element need not always bear a high pitch, either. For example, while a stressed syllable in English usually bears a high pitch, in certain speech styles it may bear a low pitch, while unstressed syllables bear high pitches. Having mentioned the above precaution, I will continue to refer to metrical prominence as stress, with the understanding that stress may be realized in different ways. The manifestations of stress in Chinese will be discussed below. For more discussion on metrical phonology, see Halle-Vergnaud (1987) and Hayes (1995), among many others.

4.1.1. Foot, head, and degenerate foot

Metrical elements (moras, syllables, etc.) are grouped into constituents, or "feet". In each foot one member is more prominent than others, and this member is called the head. The head element is also called the stressed element. The process of constructing feet and determining heads is therefore the process of stress assignment. As an example, consider the stress pattern in a language with syllabic trochee ($ = syllable, ( ) = foot):

(54)  x     x     x
     ($ $) ($ $) ($ $)

A trochaic foot is one that has two elements, with the first being the head. (54) shows a word with six syllables, which form three feet. In each foot, the head is marked by the symbol "x" on top of it.


A foot usually consists of two or more elements. A foot that consists of just one element is called a "degenerate foot". A degenerate foot is disfavored in all languages. One solution is to merge it with a neighboring foot, as shown below:

(55)  x     x     x         x     x
     ($ $) ($ $) ($)   →   ($ $) ($ $ $)

(55) shows what may happen to a word with five syllables under trochaic stress. The first four syllables form two full feet. The last syllable does not form a full foot, so it merges with the second foot. An alternative way to avoid a degenerate foot is to create a new member for it, such as lengthening the vowel (for moraic feet) or reduplicating the syllable (for syllabic feet), so that it is no longer a degenerate foot.
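The foot construction in (54) and the merging in (55) can be sketched in a few lines of code. This is only an illustrative sketch under simplifying assumptions: feet are built from left to right, syllables are represented by their indices, and the function names are hypothetical.

```python
from typing import List


def parse_trochees(n_syllables: int) -> List[List[int]]:
    """Group syllable indices into binary feet, left to right.

    The first member of each foot is its head. A leftover syllable
    forms a one-member (degenerate) foot at the right edge.
    """
    return [
        list(range(start, min(start + 2, n_syllables)))
        for start in range(0, n_syllables, 2)
    ]


def merge_degenerate(feet: List[List[int]]) -> List[List[int]]:
    """Merge a final degenerate foot into the preceding foot, as in (55)."""
    if len(feet) > 1 and len(feet[-1]) == 1:
        return feet[:-2] + [feet[-2] + feet[-1]]
    return feet


def render(feet: List[List[int]]) -> str:
    """Draw the feet with an 'x' above the head (first syllable) of each."""
    foot_strs = ["(" + " ".join("$" for _ in foot) + ")" for foot in feet]
    grid_strs = [" x" + " " * (len(s) - 2) for s in foot_strs]
    return " ".join(grid_strs).rstrip() + "\n" + " ".join(foot_strs)


print(render(parse_trochees(6)))                    # the pattern in (54)
print(render(merge_degenerate(parse_trochees(5))))  # the output of (55)
```

The alternative repair mentioned above (adding a member to the degenerate foot by lengthening or reduplication) is not modeled here.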

4.1.2. Word stress, compound stress, and phrasal stress

Stress assignment may be applied at several levels.11 In general, it starts at the word level, and then moves on to higher levels. Different levels may have different ways of stress assignment. In English, for example, stress is assigned at the word level, the compound level, and the phrase level. At the compound level, stress is usually (though not always) left-headed, i.e., it is assigned to the first word of the compound. In contrast, at the phrase level, stress is right-headed, i.e., it is assigned to the last word in a phrase. In addition, both compound stress and phrasal stress are assigned cyclically.

4.1.3. Stress Clash

A very important finding in metrical phonology is that stresses should not occur too close to each other. This is referred to as Stress Clash. When it happens, a number of things may be triggered. Usually one of the stresses will be removed, so that there is no longer a clash, or an unstressed element may be inserted between stressed elements, so that they are farther apart. Below is an example of Stress Clash and the subsequent results:

(56)  x                x                x
      x   x            x                x
     ($) ($ $)   →    ($) ($ $)   →    ($ $ $)

First, Stress Clash leads to the removal of the stress on the second syllable. This in turn leads to the loss of the second foot, since by a standard metrical assumption every foot must have a head. The result is therefore a single foot with three syllables. It will be noted that there is certain overlap between resolving Stress Clash and avoiding a degenerate foot. The Stress Clash in (56) is in part due to the fact that the first foot is degenerate. Removing the second stress not only resolves Stress Clash, but the output no longer has a degenerate foot. Whether resolving Stress Clash and avoiding degenerate feet are reducible to some higher principle will not be our concern here, however.
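The two steps described for (56) can be sketched as follows. This is a minimal illustration, assuming that each foot is a list of per-syllable stress flags and that the second of two clashing stresses is always the one removed, as in (56); other resolution strategies (such as inserting an unstressed element) are not modeled, and the function name is hypothetical.

```python
from typing import List


def resolve_clash(feet: List[List[bool]]) -> List[List[bool]]:
    """Remove the second of two adjacent stresses, then merge headless feet.

    Each foot is a list of booleans, one per syllable, True = stressed.
    """
    result: List[List[bool]] = []
    for foot in feet:
        foot = list(foot)
        # Clash: the last syllable of the previous foot and the first
        # syllable of this foot are both stressed.
        if result and result[-1][-1] and foot[0]:
            foot[0] = False
        # A foot must have a head; a headless foot joins the preceding foot.
        if result and not any(foot):
            result[-1].extend(foot)
        else:
            result.append(foot)
    return result


# (56): a degenerate foot followed by a binary trochee.
print(resolve_clash([[True], [True, False]]))
# Two full trochees show no clash and are left unchanged.
print(resolve_clash([[True, False], [True, False]]))
```

Note that the output for (56) no longer contains a degenerate foot, which is the overlap between clash resolution and degenerate-foot avoidance mentioned above.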

4.1.4. Stress Reduction

Stress Reduction is another important rule in metrical phonology. Its effect is to reduce the number of stress levels in a language through deleting a line of stress marks. Stress Reduction is also called "conflation". For example, in Khalkha Mongolian (cited in Hayes 1980: 63), stress falls on the first long vowel. The standard metrical analysis is as follows (v = short vowel, V = long vowel):

(57) Line 2:     x
     Line 1:     x     x   x           x
               v V v v V v V   →   v V v v V v V

The hypothetical word in (57) has three long vowels. First, all the long vowels are assigned a stress mark on Line 1. Then on Line 2, left-headed stress is assigned, which falls on the first long vowel. Next, Stress Reduction is applied, deleting the Line 1 stresses. This leaves the second and third long vowels with no stress. For the first long vowel, its Line 2 stress falls onto Line 1, and this is the only surface stress in the word.
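The derivation in (57) can be traced step by step in code. The sketch below assumes a word is given as a string of 'v' (short vowel) and 'V' (long vowel); the function name is hypothetical, and words without any long vowel are left without stress because the passage above does not discuss them.

```python
from typing import List, Tuple


def khalkha_stress(vowels: str) -> Tuple[List[bool], List[bool], List[bool]]:
    """Return (line1, line2, surface) stress marks for a string of v/V."""
    # Line 1: every long vowel receives a stress mark.
    line1 = [ch == "V" for ch in vowels]
    # Line 2: left-headed stress picks out the first Line 1 mark.
    line2 = [False] * len(vowels)
    if any(line1):
        line2[line1.index(True)] = True
    # Stress Reduction (conflation): the Line 1 marks are deleted and the
    # Line 2 mark falls onto Line 1, leaving one surface stress.
    surface = list(line2)
    return line1, line2, surface


line1, line2, surface = khalkha_stress("vVvvVvV")
print(line1)    # marks on all three long vowels
print(surface)  # a single mark, on the first long vowel
```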

4.2. Tonal phonology

We now review relevant points in tonal phonology, in particular the treatment of contour tones, association rules, and association domains.

4.2.1. Contour tone, tiers, and mapping rules

In African languages, contour tones (e.g., rise and fall) can usually be analyzed as a sequence of level tones (e.g., H and L). For example, a fall can be seen as HL and a rise as LH. For illustration, consider an example from Margi (Williams 1976; tone markings: ˇ = rise, ` = low, ´ = high):

(58) a. věl            b. ani             c. vèlání
        'to jump'         causative          'to make jump'

The verb for 'to jump' has a rising tone when said alone, and the causative suffix has no underlying tones. When the verb and the suffix are put together in (58 c), the verb appears with a low tone, and the suffix vowels with high tones. Williams suggests that a rise should be analyzed as the sequence LH, and that tones be represented on a tier separate from segments. In addition, he suggests that tones be mapped to syllables by the Mapping Rules (or "association rules"), given in (59), along with the analysis of (58) in (60).

(59) The Mapping Rules
     a. Associate tones to syllables one-to-one, left to right.
     b. If there are more syllables, spread the last tone to excess syllables.
     c. If there are more tones, link excess tones to the last syllable.

(60) a. vel     59a     vel     59c     vel       : segmental tier
                 →      |        →      |\
        LH              L H             L H       : tonal tier

     b. vel + ani     59a     ve la ni     59b     ve la ni     : segmental tier
                       →      |  |          →      |  | /
        LH                    L  H                 L  H         : tonal tier
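The Mapping Rules in (59) are explicit enough to be stated as a short procedure. The following is a sketch only, assuming tones are given as a list of level tones and the segmental tier is reduced to a syllable count; the function name `map_tones` is hypothetical.

```python
from typing import List


def map_tones(tones: List[str], n_syllables: int) -> List[List[str]]:
    """Apply (59 a-c) and return the tones linked to each syllable."""
    linked: List[List[str]] = [[] for _ in range(n_syllables)]
    # (59 a) Associate tones to syllables one-to-one, left to right.
    for i in range(min(len(tones), n_syllables)):
        linked[i].append(tones[i])
    # (59 b) More syllables than tones: spread the last tone rightward.
    if tones and n_syllables > len(tones):
        for i in range(len(tones), n_syllables):
            linked[i].append(tones[-1])
    # (59 c) More tones than syllables: link the excess to the last syllable.
    if n_syllables and len(tones) > n_syllables:
        linked[-1].extend(tones[n_syllables:])
    return linked


print(map_tones(["L", "H"], 1))  # (60 a): one syllable bearing L and H, a rise
print(map_tones(["L", "H"], 3))  # (60 b): low verb, high suffix vowels
```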

While the essence of Williams' proposal has now become the standard practice in multi-tiered (or autosegmental) phonology, many people remain doubtful whether contour tones in Chinese are analyzable in exactly the same way as in Margi (cf. Yip 1989 and Bao 1990). There is no question, however, that in several Chinese dialects of the Wu family, contour tones behave exactly the same way as in Margi, although the mapping rules vary in some ways. For illustration, consider the following examples from Shanghai:12

(61) a. du 'big'      b. ŋ 'fish'      c. ɕi 'fresh'
        rise             rise             fall

(62) a. du  ŋ
        low high
        'big fish'
     b. ɕi   ŋ
        high low
        'fresh fish'

(61) shows that when said alone, du 'big' and ŋ 'fish' have a rising tone, and ɕi 'fresh' has a falling tone. (62) shows the tones when these words are put together, where no syllable bears the same tone as in isolation. The above data can be analyzed as follows (cf. Selkirk-Shen 1990, Duanmu 1993):13

(63) Mapping Rules for Shanghai
     a. Delete tones from non-initial syllables.
     b. Associate tones to syllables one-to-one, left to right.
     c. If there are more tones, link excess tones to the last syllable.

(64)

     du     63b     du     63c     du
             →      |       →      |\
     LH             L H            L H

(65) a. du ŋ      63a     du ŋ     63b     du ŋ
                   →                →      |  |
        LH LH             LH               L  H      'big fish'

     b. ɕi ŋ      63a     ɕi ŋ     63b     ɕi ŋ
                   →                →      |  |
        HL LH             HL               H  L      'fresh fish'
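The Shanghai rules in (63) differ from the Margi rules only in deleting non-initial tones first and in lacking a spreading step, which a short sketch makes easy to see. The representation (one underlying tone string per syllable) and the function name are assumptions made for illustration.

```python
from typing import List


def shanghai_domain(underlying: List[str]) -> List[str]:
    """Apply (63 a-c) within one domain; input is one tone string per syllable."""
    if not underlying:
        return []
    n = len(underlying)
    # (63 a) Delete tones from non-initial syllables.
    tones = list(underlying[0])
    linked: List[List[str]] = [[] for _ in range(n)]
    # (63 b) Associate tones to syllables one-to-one, left to right.
    for i in range(min(len(tones), n)):
        linked[i].append(tones[i])
    # (63 c) Link excess tones to the last syllable.
    if len(tones) > n:
        linked[-1].extend(tones[n:])
    # There is no spreading rule, so a syllable may surface with no tone.
    return ["".join(t) for t in linked]


print(shanghai_domain(["LH"]))        # (64): 'big' in isolation keeps its rise
print(shanghai_domain(["LH", "LH"]))  # (65 a): 'big fish' surfaces as low high
print(shanghai_domain(["HL", "LH"]))  # (65 b): 'fresh fish' surfaces as high low
```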

Unlike Margi, Shanghai has an extra rule (63 a) which deletes tones from non-initial syllables. We will return to the reason for this later. In addition, Shanghai does not spread the last tone to excess syllables; this, however, should not be of concern, since some African languages do not spread tones either (cf. Pulleyblank 1986). The remaining aspects of Shanghai are largely similar to those of Margi.

4.2.2. Association domains

Our discussions of Margi and Shanghai have focused on what happens to tones in a given domain. Let us call such a domain an "association domain", in which tones may shift from one syllable to another. It is important to bear in mind that an association domain may differ from other kinds of tone sandhi domains, such as those for the Mandarin Third-Tone sandhi or the tone sandhi in Min dialects, in which tones do not shift from one syllable to another (cf. Duanmu 1993). A given expression may form one or more association domains. For example, the Shanghai sentence below forms four association domains ([ ] = association domain boundaries):

(66)

     [lo wa]    [hø-ɕi]   [tɕʰiʔ]   [haʔ ŋ]
     old Wang   like      eat       black fish
     'Old Wang likes to eat black fish.'

Within each domain, the same tonal rules in (63) apply. The tonal derivation of (66) is therefore as follows:

     Underlying:         [lo wa]    [hø-ɕi]    [tɕʰiʔ]    [haʔ ŋ]
                          LH LH      HL LH      LH         LH LH

     After (63 a):       [lo wa]    [hø-ɕi]    [tɕʰiʔ]    [haʔ ŋ]
                          LH         HL         LH         LH

     After (63 b, c):    [lo wa]    [hø-ɕi]    [tɕʰiʔ]    [haʔ ŋ]
                          |  |       |  |        |\        |   |
                          L  H       H  L        L H       L   H
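The derivation of (66) amounts to applying the rules in (63) separately inside each association domain. The sketch below shows this with the four domains identified by their English glosses, so that nothing depends on the transcription; the helper and variable names are hypothetical, and the domain boundaries are simply taken as given, since how they are determined is the question addressed in section 5.

```python
from typing import List, Tuple


def apply_63(underlying: List[str]) -> List[str]:
    """Rules (63 a-c) within a single domain (one tone string per syllable)."""
    tones = list(underlying[0])                  # (63 a) keep initial tones only
    linked: List[List[str]] = [[] for _ in underlying]
    for i in range(min(len(tones), len(underlying))):
        linked[i].append(tones[i])               # (63 b) one-to-one, left to right
    if len(tones) > len(underlying):
        linked[-1].extend(tones[len(underlying):])  # (63 c) excess to last syllable
    return ["".join(t) for t in linked]


# The four domains of (66), with the underlying tones given in the derivation.
sentence: List[Tuple[str, List[str]]] = [
    ("old Wang", ["LH", "LH"]),
    ("like", ["HL", "LH"]),
    ("eat", ["LH"]),
    ("black fish", ["LH", "LH"]),
]

surface = [(gloss, apply_63(tones)) for gloss, tones in sentence]
for gloss, tones in surface:
    print(gloss, tones)
```

Because each domain is processed independently, the tones of one domain never shift onto a syllable in another.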

While the tonal processes within each association domain are now quite clear, the determination of association domains has been a thorny problem. As we will see below, the distinction between word and phrase plays a crucial role in this regard.


5. Phonological evidence for wordhood in Chinese

In English, phrases and compounds can often (though not always) be distinguished by stress. For example, the primary stress in the compound blackbird is on the first word, whereas the primary stress in the phrase black bird is on the second word. In Mandarin Chinese, however, such a cue is not readily available. Indeed, apart from the fact that some compounds may contain a syllable with a "neutral tone", many people do not think there is any other phonological distinction between words and phrases in Chinese (cf., however, Packard 1992, and this volume, who proposes some phonological arguments for Mandarin morphology). For example, Chao (1968: 360-361) states that when the neutral tone is excluded, Mandarin "compounds do not differ from phrases" in regard to stress.

In this section I will argue that there is a rich body of phonological evidence, especially metrical and tonal evidence, for the distinction between words and phrases in both Mandarin and other Chinese dialects. In particular, I will draw heavily on evidence from Shanghai, although similar evidence is available in many other Wu dialects. In order to use evidence from beyond Mandarin, I will assume that, in general, if an expression is a compound in one dialect, it is also a compound in other dialects. This assumption is shared by other researchers, among them Chao (1968: viii), who says that "in terms of grammar, most of what is said ... about Mandarin is true of all Chinese".

5.1. Association domains as stress domains

We first ask what makes an association domain in Shanghai. In an extensive study, Selkirk-Shen (1990) propose that every lexical word forms an association domain. Higher syntactic levels have additional effects but do not concern us here. Duanmu (1992) points out that an association domain can be smaller than a word. Consider the following data:

(68) (pa-li)            (zä-he)            (lu-mo)
     'Paris'            'Shanghai'         'Rome'

(69) (tsz-ka-ku)        (ko-r-fu)          (ka-na-da)
     'Chicago'          'golf'             'Canada'

(70) (ya-lu) (-sa-le)
     'Jerusalem'

     (tsho) (ve)
     LH LH LH LH
     fry rice
     'to fry rice'

(81) is a nominal and (82) is a verb phrase. Both expressions have the same input words, but their surface tone patterns differ. (81) forms one association domain and (82) forms two association domains. For a long time, the contrast between (81) and (82) has remained without explanation. Under a metrical analysis of association domains, however, there is a simple answer. Recall that in English, compound stress is left-headed, while phrasal stress is right-headed. If this is also true in Chinese, then (81) and (82) follow. Let us look at the process in detail:

(83) a. [A N]  tsho ve  'fried rice'

        Word Stress (trochaic)    Compound Stress           Clash Resolution          Foot Merging
                                   x                         x                         x
         x   x                     x   x                     x                         x
        ($) ($)                   ($) ($)                   ($) ($)                   ($   $)

     b. [V N]  tsho ve  'to fry rice'

        Word Stress (trochaic)    Phrasal Stress            Clash Resolution          Exhaustive parsing
                                       x                         x                         x
         x   x                     x   x                         x                     x   x
        ($) ($)                   ($) ($)                   ($) ($)                   ($) ($)

First, each word is assigned word stress, which, as discussed earlier, is trochaic in Shanghai; although it does not show on a monosyllable, its effect will become clear immediately. Next, Compound Stress (left-headed) applies. Since (83 a) is a compound, an additional stress is assigned to its first syllable. (83 b) is not a compound, so it is unchanged. Next, Phrasal Stress (right-headed) applies. This time a stress is put on the second syllable in (83 b). (83 a) is not a phrase and nothing happens to it. Now both (83 a, b) show Stress Clash. As discussed earlier, there are several ways to resolve Stress Clash. Here I will assume that the weaker stress is removed. Now stress removal leaves two headless feet, which should merge with a neighboring foot if they can. This occurs in (83 a), but cannot occur in (83 b). The reason is that the foot in Shanghai is trochaic, whose head is the first syllable. Had the stressless first syllable in (83 b) merged with the stressed second syllable, the foot would no longer be trochaic. In the final step, the unstressed first syllable in (83 b) is re-assigned a stress; this is based on an assumption that all elements must be parsed into one domain or another (cf. Halle-Vergnaud 1987).14 A subsequent process will lengthen both the syllables in (83 b) to make them bi-moraic feet.15

Let us now consider nominals. We discussed in sections 2 and 3 that [M N]s are compounds and [M de N]s are phrases. Association domain formation supports this. Consider the following examples:

(84) (ɕi sz)
     new book
     'new book'

      x             x
      x   x    →    x
     ($) ($)       ($   $)

(85) (ɕi ge') (sz)
     new DE book
     'new book'

             x
      x      x
     ($ $)  ($)

(84) shows that a disyllabic [M N] will always form just one association domain. In (85), since the functional word de does not bear stress, it can always form a disyllabic foot with the first syllable. The third syllable forms an association domain by itself.16

Since association domains are determined by stress, they will be sensitive to word length. This is exactly the case. Consider [M N] nominals with the length patterns [1 2] and [2 2] (1 = monosyllabic, 2 = disyllabic):

(86)  x                   x
      x   x               x
     ($) ($ $)     →     ($ $ $)

(87)  x                   x
      x    x              x
     ($ $)($ $)    →     ($ $ $ $)

For [1 2], there is always Stress Clash, and so it will always form just one association domain. For [2 2], there is no Stress Clash. Thus it may form two association domains. On the other hand, Stress Reduction may optionally apply, therefore [2 2] may also form just one association domain. The data below show that the prediction is correct:

(88) a. (sä ɦo'-yü)             b. *(sä) (ɦo'-yü)
        business school
        'business school'

(89) a. (ɲü-yi) (ɦo'-yü)        b. (ɲü-yi ɦo'-yü)
        language school            language school
        'language school'          'language school'
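The derivations in (83) and (86)-(89) can be summarized procedurally. The following Python sketch is only an illustration under simplifying assumptions that are not part of the analysis itself: each word is at most disyllabic and forms a single trochaic foot, optional Stress Reduction is modelled by a flag and restricted to compounds, and the names `Foot` and `association_domains` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Foot:
    size: int    # number of syllables in the foot
    stress: int  # grid height on the first (head) syllable; 0 = headless


def association_domains(len1, len2, kind, reduce_stress=False):
    """Return the sizes (in syllables) of the association domains
    of a two-word expression, given word lengths and its status."""
    if kind not in ("compound", "phrase"):
        raise ValueError("kind must be 'compound' or 'phrase'")
    if not (1 <= len1 <= 2 and 1 <= len2 <= 2):
        raise ValueError("this sketch only covers words of one or two syllables")

    # Word Stress (trochaic): one foot per word, head on its first syllable.
    feet = [Foot(len1, 1), Foot(len2, 1)]

    # Compound Stress is left-headed, Phrasal Stress is right-headed.
    strong = 0 if kind == "compound" else 1
    weak = 1 - strong
    feet[strong].stress += 1

    # Stress Clash: the two heads are adjacent when the first word is monosyllabic.
    clash = feet[0].size == 1

    # Clash Resolution removes the weaker stress; optional Stress Reduction
    # (assumed here to apply to compounds only) has the same effect.
    if clash or (reduce_stress and kind == "compound"):
        feet[weak].stress = 0

    # Foot Merging: a headless second foot joins the foot on its left,
    # which keeps the resulting foot trochaic.
    if feet[1].stress == 0:
        return [feet[0].size + feet[1].size]

    # Exhaustive parsing: a headless first foot cannot merge rightward
    # (the result would not be trochaic), so it is re-assigned a stress.
    if feet[0].stress == 0:
        feet[0].stress = 1
    return [foot.size for foot in feet]
```

With these assumptions, `association_domains(1, 1, "compound")` gives one domain, as in (83 a), `association_domains(1, 1, "phrase")` gives two, as in (83 b), and a [2 2] compound gives two domains, or one when `reduce_stress=True`, as in (89).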

Sometimes, the association domains of an expression may seem to contradict its morphological structure, as shown in the case below.

(90) ɕo ka-li-fo'-ɲi-ya → (ɕo ka-li-) (fo'-ɲi-ya)
     'small California'

In (90), the first word forms an association domain with part of the second word. This fact is a mystery in previous analyses. But in a metrical analysis, it is again what we expect. Below are the metrical derivations.

(91) Word Stress (trochaic)
      x   x      x
     ($) ($-$-) ($-$-$)

     Compound Stress
      x
      x   x      x
     ($) ($-$-) ($-$-$)

     Clash Resolution
      x
      x        x
     ($ $-$-) ($-$-$)

First, each word is assigned trochaic stress. Then Compound Stress adds another stress to the first word. Next, Clash Resolution removes the stress from the second foot, which then merges with the first foot. The result is what we saw in (90).

The association domains in expanded [M N]s, such as [M [M N]], [[N A] N], and [[N A] [M N]], can be derived in the same way. Let us consider two more examples:

(92) a. [M [M N]]                 b. [M [M N]]
        (la tsho ve)                 (ɕi-ɕi) (tsho ve)
        cold fry rice                fresh fry rice
        'cold fried rice'            'fresh fried rice'

(93) a.
                                         x               x
                             x           x  x            x
         x  x x          x  x x          x  x x          x
        [$ [$ $]]       [$ [$ $]]       [$ [$ $]]       [$ [$ $]]       ($ $ $)
        Word            first cycle     second cycle    Clash           association domain

     b.
                                              x                 x
                                  x           x     x           x     x
          x     x x         x     x x         x     x x         x     x
        [[$-$] [$ $]]     [[$-$] [$ $]]     [[$-$] [$ $]]     [[$-$] [$ $]]     ($-$) ($ $)
        Word              first cycle       second cycle      Clash             association domain

Both (92 a, b) are [M [M N]], with the difference that the first M in (92 a) is monosyllabic, while that in (92 b) is disyllabic.17 This difference leads to their tonal differences: (92 a) forms one association domain, but (92 b) forms two. The metrical structures of (92 a, b) are shown in (93 a, b) respectively. In (93 a), each word is first assigned a word stress. Then Compound Stress applies. Since Compound Stress is cyclic, it will first apply to the inner brackets (which include the last two syllables) and assign greater stress to the left member (which is the middle syllable). On the second cycle, Compound Stress assigns greater stress to the first syllable. Next Clash Resolution applies, removing stresses from the last two syllables. The output is one association domain. The metrical derivation in (93 b) is similar, except that the first word is disyllabic, so the stress on the third syllable does not clash with the stress on the first. Thus, both stresses survive, giving two association domains.

Let us now look at non-nominal structures. First, consider [Adv A] structures, in which an adjective is modified by an adverb, such as very beautiful, more beautiful, most beautiful, so beautiful, etc. These expressions are phrases in English. In section 2, I suggested that they are also phrases in Chinese; that is why expressions like [[hen xin] shu] 'a [[very new] book]' are bad, since the phrase hen xin occurs inside an [M N] compound. If I am correct, then [Adv A] should behave like [V O] and form two association domains, even if the first word is monosyllabic. This prediction is borne out.

(94) (ka) (lo)            (lo) (lo)            (tsø) (lo)
     'so old'             'very old'           'most old → oldest'

     (ka) (phyo-lyä)      (lo) (phyo-lyä)      (tsø) (phyo-lyä)
     'so pretty'          'very pretty'        'most pretty'

In previous studies of tonal domains, the fact that [Adv A] does not form a single association domain remains a stipulation. In a metrical analysis, it follows from the fact that [Adv A] is a phrase, so that A has greater stress, which cannot be removed by Clash Resolution.

In section 2, we also discussed that, in contrast to [Adv A], which is a phrase, [N A] and [Adj A] can be a compound; that is why [[N A] N] and [[Adj A] N] are possible, as shown in the Mandarin examples [[tao hong] se] 'peach-red color', [[zi hong] se] 'purple-red color', and [[da hong] se] 'big-red color → bright-red color'. According to this analysis, [N A] and [Adj A] should be assigned Compound Stress, which is left-headed. When N, Adj and A are monosyllabic, [N A] and [Adj A] should form just one association domain. This is again correct, as the following Shanghai data show:

(95) a. (ka) (ɦõ)         (lo) (ɦõ)          (tsø) (ɦõ)
        'so red'          'very red'         'most red → reddest'

     b. (do ɦõ)           (tsz ɦõ)           (du ɦõ)
        'peach red'       'purple red'       'big red → bright red'

In (95 a), [Adv A] forms two association domains. In (95 b), [N A] and [Adj A] form one association domain. This supports the analysis that [Adv A] is a phrase and [N A] and [Adj A] are compounds.

Let us now consider more [V O] structures. It has been noted that some [V O]s behave like compounds, such as dan xin 'to carry heart → to worry', which may take another object. If dan xin is a compound, it should be assigned left-headed stress at the compound level, and so form one association domain. This prediction is borne out in Shanghai.

(96) (no) (de-ɕi) (sa)
     you carry-heart what
     'What do you worry about?'

Huang (1984) suggests that whether a [V O] is a compound or a phrase can be determined by whether it can take another object. But H. Zhang (1992) points out, correctly, that object-taking is an inadequate test. For example, an intransitive [V O] compound, such as zou lu 'walk road → to walk', cannot take another object and so will be wrongly seen as a phrase in Huang's analysis. H. Zhang suggests instead that [V O] be tested on whether the O can be fronted in a ba-construction. But ba-fronting is itself an inadequate test, since not all objects accept ba-fronting, even if the object is a phrase. For example, *wo ba ta xi-huan le 'I BA her like LE → I liked her' is bad, but we cannot conclude that xi-huan ta 'like her' is a compound, just because ta cannot be ba-fronted.

Where syntactic tests fall short, phonological evidence turns out to be useful. In our analysis, if [V O] (with a monosyllabic V) is a compound, it should form one association domain; if it is a phrase, it should form two association domains. Consider the following [V O] expressions in Shanghai:

(97) a. (tsy lu)                   (ga se-wu)
        walk road                  saw mountain-river
        'to walk'                  'to chat (on unimportant matters)'

     b. (tsy) (kä-sz)              (ga) (mo'-dy)
        walk steel-wire            saw wood
        'to walk on tight-rope'    'to saw wood'

In (97 a), both expressions form one association domain, indicating that they are compounds. In (97 b), both expressions form two association domains, indicating that they are phrases. Huang (1984) raised the possibility that a given [V O] may sometimes be a compound and sometimes a phrase. This again is confirmed by association domain formation:

(98) a. ŋu le'-le' (khø sz)
        I Asp read book
        'I am reading.'

     b. ŋu le'-le' (khø) (sz)
        I Asp read book
        'I am reading a book.'

When khø sz is used as a compound, it means 'reading' and forms one association domain; when it is used as a phrase, it means 'to read a book' and forms two association domains. A similar contrast can be obtained in a [S V] expression:

(99) a. no (dy tho va)
        you head ache Q
        'Do you have a headache?'

     b. no (dy) (tho va)
        you head ache Q
        'Does your head ache?'

When dy tho is used as a compound, it forms one association domain (together with the following unstressed particle); when it is used as a phrase, it forms two domains.

5.3. Compound stress and phrasal stress in Mandarin

Unlike in Shanghai, where stress is detectable by Stress Clash, Stress Reduction, and association domain formation, stress in Mandarin and most other Chinese dialects is less obvious. One may wonder why. Yue-Hashimoto (1987) suggests that different dialects may have different prominence patterns: for example, the Wu family has left-dominance and the Min family has right-dominance. The Mandarin family presumably has no special dominance. A rather different proposal is made in Duanmu (1993), who argues that the difference between Shanghai and Mandarin lies in their syllabic weight. We have seen that in Shanghai, foot formations apply to syllables, and Stress Clash also occurs on neighboring syllables. This is because Shanghai syllables are generally mono-moraic, and what appears to be a syllabic trochee is in fact a moraic trochee. In contrast, Mandarin syllables are generally bi-moraic, therefore every full syllable will form a bi-moraic trochee and be inherently stressed by what is called the Weight to Stress Principle (Prince 1990). Higher levels of stress, such as Compound Stress and Phrasal Stress, will then no longer have obvious effects, nor will Stress Clash. To see why, let us assume that Mandarin has the same stress rules as Shanghai, and consider the metrical representations of two Mandarin expressions (m = mora):

(100) a. Compound                 b. Phrase
          x                              x
          x     x                  x     x
         (m m) (m m)              (m m) (m m)
         chao  fan                chao  fan
         fry   rice               fry   rice
         'fried rice'             'to fry rice'

Since every full syllable is bi-moraic, Stress Clash does not directly occur between two syllables. In addition, since every foot is bi-moraic, syllable lengthening is rarely necessary. Therefore, Compound Stress and Phrasal Stress cannot be detected easily, and both expressions in (100) sound similar in Mandarin. In particular, every full syllable is an association domain, thus it does not usually lose its tones or shift them to another syllable. Nevertheless, I would like to point out two facts that suggest that Mandarin may have the same Compound Stress and Phrasal Stress as in English and Shanghai.

First, consider disyllabic compounds. There are three stress patterns: both syllables are stressed, or the first syllable is stressed and the second unstressed, or the first syllable is stressed while the second is optionally unstressed. There is no Chinese compound in which the first syllable is unstressed while the second is stressed, even if the first syllable is a bound form. This fact is in agreement with the assumption that Mandarin Compound Stress is left-headed, as in English and Shanghai.18

Second, it has been noted that many Chinese words have elastic length, namely, they can be either disyllabic or monosyllabic, such as suan vs. da-suan 'garlic' and zhong vs. zhong-zhi 'to plant'. The popular explanation for this apparent redundancy is that modern Chinese lost many syllabic contrasts, giving too many monosyllabic homophones; consequently, disyllabic forms are created to avoid ambiguities and help understanding (e.g., Lü 1963: 21, Wang 1944: 15). However, the popular view cannot explain why the elasticity of word length already existed in early history, as discussed by Guo (1938), or why Chinese is a monosyllabic language in the first place. A more serious problem with the popular view is that the short and the long forms are not always interchangeable, even when they are completely synonymous. Consider the following examples:

(101)     [V            O]
      a.   zhong-zhi    da-suan
      b.  *zhong-zhi    suan
      c.   zhong        da-suan
      d.   zhong        suan
           plant        garlic
           'to plant garlic'

(102)     [M            N]
      a.   mei-tan      shang-dian
      b.   mei-tan      dian
      c.  *mei          shang-dian
      d.   mei          dian
           coal         store
           'coal store'

In (101), [2 1] (disyllabic monosyllabic) is bad, while in (102), [1 2] is bad. (101) and (102) represent a general tendency, namely, [2 1] is disfavored in [V O], and [1 2] is disfavored in [M N]. This asymmetry has been noted by Lü (1963) and Li (1990), among others, but has remained without good explanation. In particular, if the disyllabic form is created to avoid ambiguity, why should (101 b) and (102 c) be bad?

Guo (1938) offers a different view on this matter. He suggests that word length elasticity is due to the tempo of speech. The short forms are used at points where one speaks faster, and the long forms are used at points where one speaks slower. But exactly where should one speak faster and where slower? In particular, why is it possible for one to speak the V faster and the O slower in (101), but not the V slower and the O faster? And why is the pattern of tempo reversed in [M N]? These questions are left unanswered.

A more specific proposal is made by Duanmu-Lu (1990) and Lu-Duanmu (1991). They suggest that word length elasticity is due to metrical prominence. In particular, they suggest that a word with greater stress should not be shorter than a word with less stress. In addition, greater stress is assigned to the syntactic "non-head", namely, M in [M N] and O in [V O]. While a full account of the elasticity problem cannot be given here, the proposal of Duanmu-Lu is in agreement with the present analysis that [M N] is a word and [V O] a phrase, and that in Mandarin, Compound Stress is left-headed and Phrasal Stress is right-headed, just as they are in English and Shanghai.
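The Duanmu-Lu proposal can be stated as a simple check. The Python sketch below is an illustration only, with hypothetical names; it encodes just the two assumptions given above (greater stress falls on the syntactic non-head, and the more stressed word must not be shorter than the less stressed one) and is not meant as a full account of word length elasticity.

```python
# Position (0 = first word, 1 = second word) of the member with greater
# stress, i.e. the syntactic "non-head": M in [M N], O in [V O].
STRONGER_MEMBER = {"MN": 0, "VO": 1}


def length_pattern_ok(structure, lengths):
    """Check a two-word length pattern against the constraint that the
    word with greater stress must not be shorter than the other word.

    structure: "MN" or "VO"
    lengths:   (syllables in first word, syllables in second word)
    """
    if structure not in STRONGER_MEMBER:
        raise ValueError("structure must be 'MN' or 'VO'")
    strong = STRONGER_MEMBER[structure]
    weak = 1 - strong
    return lengths[strong] >= lengths[weak]
```

On this formulation, `length_pattern_ok("VO", (2, 1))` is false, matching the starred (101 b), and `length_pattern_ok("MN", (1, 2))` is false, matching the starred (102 c); the remaining patterns in (101) and (102) pass.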

6. Further issues

In the preceding discussions, I have shown that there is a convergence between phonological evidence and structural evidence in regard to wordhood in Chinese. In this section I explore some implications of our analysis.

6.1. Personal names and titles

It has not been clear whether personal names and titles should be considered phrases or compounds. In English, for example, personal names and titles behave like phrases in regard to stress. For example, the stress patterns of John Smith and Mr. Smith are similar to black bird rather than to blackbird. In Chinese, however, the reverse seems to be true. Consider the following examples from Shanghai:

(103) (wä ɕi-sä)          (lo wä)            (wä li')
      Wang Mr             old Wang           Wang Li
      'Mr. Wang'          'Old Wang'         'Wang Li'

       x                   x                  x
       x   x               x   x              x   x
      ($) ($ $)           ($) ($)            ($) ($)

The association domain patterns of the above are similar to those of compounds. In particular, the main stress is assigned to the first syllable, and the stress on the second syllable is removed due to Stress Clash. The result is one association domain. A more interesting case is as follows:

(104) (de ɕo) (bi) or (de ɕo bi)
      Deng Xiao Ping
      'Deng Xiao-ping'

In (104), de is the family name, and ɕo bi the given names. As shown above, this expression may form either two association domains or just one. In the former case, the first two syllables form one association domain, and the third forms another. This means that in the speaker's mind the three syllables have no internal structure, instead of being [$ [$-$]] as the spelling suggests. To see why, consider the metrical structures below:

(105) a.  x     x
         ($ $) ($)

      b.  x                   x
          x   x               x
         ($) ($-$)     →     ($ $-$)

In (105 a), the three syllables are treated as having no internal structure. Trochaic stress then groups the first two into one foot, and the third into another. Since the second foot is degenerate, it may merge with the first foot, giving one association domain. Alternatively, the third syllable may undergo lengthening to become a bimoraic foot, giving two association domains. In (105 b), the three syllables are structured as [$ [$-$]], namely, the given names form an internal unit. In this case, the last two syllables will form a foot. At the compound level, another stress is added to the first syllable. Next, Stress Clash leads to the removal of the stress from the second syllable. This gives just one possible pattern for (105 b), namely, ($ $-$). The fact that (104) has two association domain patterns shows that its metrical structure must be (105 a) and not (105 b).

Why should personal names behave differently in English and Chinese? The answer is not clear. We know, however, that not all compounds in English have left-headed stress. For example, in Madison Street the primary stress is on the first word, but in Madison Avenue it is on the second (Halle-Vergnaud 1987: 271-272). Perhaps personal names are also compounds in English but happen to have right-headed stress? An additional puzzle to note is that in English, the title (such as Mr or Professor) comes before the name, whereas in Chinese it comes after the name. But in both languages the title has less stress than the name.


6.2. A missing structure?

In English, there are three kinds of nominal structures: compound, phrase, and relative clause. In Chinese, however, there are just two nominal forms, [M N] and [M de N]. These are shown below:

(106) a. blackbirds        b. black birds         c. birds that are black

(107) a. hei niao          b. hei de niao
         black bird           black DE bird

It is not clear which English structure relates to which Chinese structure. Many people assume that (107 a) relates to both (106 a) and (106 b), and (107 b) relates to both (106 b) and (106 c). On the other hand, Sproat-Shih (1991) argue that (107 b) exclusively relates to (106 c), but whether (107 a) relates to (106 a) or (106 b) or both is left unanswered. In the present analysis, all [M N] nominals are compounds, therefore (107 a) exclusively relates to (106 a). This again leaves an unanswered question as to how (107 b) is related to (106 b) and (106 c). If Sproat-Shih (1991) are right, then (107 b) exclusively corresponds to (106 c). And if the present analysis is also right, then (107 a) exclusively corresponds to (106 a). This leads to the unexpected conclusion that Chinese nominals are either words or relative clauses, with no "nominal phrases" in the traditional sense. Another possibility, suggested by Bingfu Lu (personal communication), is that (107 a) corresponds to (106 a), both being compounds. (107 b), however, can either be a phrase, as in san zhi hei de niao 'three black birds', or a relative clause, as in hei de san zhi niao 'three birds that are black'. At this point, none of the analyses seems conclusive, and I leave the issue open.

6.3.

      *wo ba qi sheng le
       I BA gas bear LE
       'I am angry.'

According to H. Zhang, (i) shows that du shu is a phrase, and (ii) shows that sheng qi is a compound. Two questions remain. First, does the transformed expression have the same meaning as the original? Second, does the transformed expression have the same structure as the original? Neither question can be given a positive answer. Besides, the transformability test has other limitations. It is applicable only to the so-called "disposable" verbs and not to other verbs. Consider:

(iii) ta pa gui → *ta ba gui pa le
      he fear ghost   he BA ghost fear LE
      'He fears ghosts.'

(iv)  ta pa zhe-ge gui → *ta ba zhe-ge gui pa le
      he fear this ghost   he BA this ghost fear LE
      'He fears this ghost.'

In (iii) and (iv), the object cannot be fronted in a ba-construction. One cannot say that the VO part is a compound, since the demonstrative zhe-ge 'this' in (iv) shows that the object can be a phrase. Thus, the transformability test tells us nothing about whether pa gui is a compound or not. In fact, it is not clear whether the verb in (ii) is a "disposable" verb.

10. While (53 a) contains a conjunction, it need not have come from Conjunction Reduction, since 'potato-and-meat eaters' does not mean 'potato eaters and meat eaters'. If this is the general case with compound internal conjunctions, then perhaps no compounds allow Conjunction Reduction. Thus, in addition to saying that if Conjunction Reduction fails to apply, the conjoined parts must be words, we may further say that if Conjunction Reduction applies, the conjoined parts must be phrases.

11. See Gussenhoven (1991) for the view that stress is only assigned at the word level. At higher levels, stress deletion takes place, instead of further stress assignment.

12. This variety is called New Shanghai by Xu et al. (1988). As in other Wu dialects, onset voicing has certain effects on tone, but they are ignored here. The transcription is in phonetic symbols. According to Duanmu (1993), all Shanghai syllables are underlyingly CV, with one onset position and one rime position. In particular, ['] is used for glottalization of the vowel, replacing the traditional [?] coda, and traditional [VN] (vowel-nasal) is replaced with a nasalized vowel. In addition, a syllabic consonant, such as [ŋ] 'fish', should strictly speaking be written as [ŋŋ] or [ŋ:], or perhaps represented as follows:

    (i)  C   V
          \ /
           ŋ

With this cautionary note, and for typographic convenience, I continue to use a single symbol for a syllabic consonant.

13. Strictly speaking, tones are not linked to syllables but to moraic segments. However, in the Shanghai data we look at, all syllables are mono-moraic, so there is no confusion. For more discussions, see Duanmu (1993).

14. Instead of deleting the stress on V in (83 b) and re-assigning a stress to it later, one may regard avoiding Stress Clash and exhaustive parsing as constraints on the output, and different means can be used to satisfy them. For example, when Stress Clash occurs, one may delete the weaker stress if the headless foot can merge with another foot, otherwise one lengthens a foot so that the stresses are no longer adjacent.

15. I suspect, however, that sometimes the monosyllabic verb before an object noun may remain unfooted, in which case it will not surface with its full tones, nor will it be lengthened. This agrees with the description of Xu et al. (1988) that the tone of the verb before an object noun is much reduced.

16. Again, two things may happen to this monosyllabic foot. Either it will lengthen to a bi-moraic foot, as is the usual case, or it may cease to be a foot and merge with the preceding foot. The latter situation may happen when the first syllable ɕi has emphatic stress, as suggested by Selkirk-Shen (1990).

17. It can be argued that ɕi-ɕi is a compound; even so, its output is the same.

18. Chao (1968: 35) says that in a compound with no unstressed syllables, the last syllable has greater stress than the rest. The phonetic study of Lin et al. (1984) shows that in disyllabic compounds read in isolation where both syllables are stressed, the second syllable is indeed longer (265 ms vs. 301 ms for one speaker, and 317 ms vs. 346 ms for another). This lengthening may have been due to the phrase-final effect, however.

References

Bao, Zhiming
  1990  On the nature of tone. [Unpublished Ph. D. dissertation, MIT.]
Bates, Dawn
  1988  Prominence relations and structures in English compound morphology. [Unpublished Ph. D. dissertation, University of Washington.]
Chao, Yuen-Ren
  1968  A grammar of spoken Chinese. Berkeley: University of California Press.
Dai, John X. L.
  1992  Chinese morphology and its interface with the syntax. [Unpublished Ph. D. dissertation in progress, Ohio State University.]
Duanmu, San
  1992  "End-based theory, cyclic stress, and tonal domains", paper presented at CLS 28 Parasession on the Cycle, April 23-25, Chicago.
  1993  "Rime length, stress, and association domains", Journal of East Asian Linguistics 2.1: 1-44.
Duanmu, San-Bingfu Lu
  1990  Word length variations in Chinese. [Unpublished MS.]
Fan, Jiyan
  1958  "Xing-ming zuhe jian 'de' zi de yufa zuoyong" [The grammatical function of de in adjective-noun constructions], Zhongguo Yuwen 1958.5: 213-217.
Georgopoulos, Carol-Roberta Ishihara (eds.)
  1991  Interdisciplinary approaches to language: Essays in honour of S.-Y. Kuroda. Dordrecht: Kluwer Academic Publishers.
Guo, Shao-yu
  1938  "Zhongguo yuci zhi tanxing zuoyong" [The elastic property of Chinese word length], Yen Ching Hsueh Pao 24.
  [1963]  [Reprinted in Guo 1963: 1-40.]
  1963  Yuwen Tonglun [Collected Essays on Chinese Language and Literature]. (2nd edition.) Hong Kong: Taiping Shuju.
Gussenhoven, Carlos
  1991  "The English rhythm rule as an accent deletion rule", Phonology 8: 1-35.
Halle, Morris-Jean-Roger Vergnaud
  1987  An essay on stress. Cambridge, Mass.: MIT Press.
Hashimoto, Anne Yue
  1969  "The verb 'to be' in Modern Chinese", Foundations of Language Supplementary Series 9.4: 72-111. Dordrecht: Reidel.
  1987  "Tone sandhi across Chinese dialects", in: Chinese Language Society of Hongkong (ed.), Wang Li memorial volumes: English volume. Hongkong: Joint Publishing Co., 445-474.
Hayes, Bruce
  1980  A metrical theory of stress rules. [Unpublished Ph. D. dissertation, MIT.]
  1995  Metrical stress theory: Principles and case studies. Chicago: University of Chicago Press.
Huang, James C.-T.
  1984  "Phrase structure, lexical integrity, and Chinese compounds", Journal of the Chinese Language Teachers Association 19.2: 53-78.
Inkelas, Sharon-Draga Zec (eds.)
  1990  The phonology-syntax connection. CSLI monograph. Chicago: University of Chicago Press.
Jackendoff, Ray
  1972  Semantic interpretation in generative grammar. Cambridge, Mass.: MIT Press.
Kennedy, George A.
  1953  "Two tone patterns in Tangsic", Language 29: 367-373.
Li, Ning-ding
  1990  "Dongci fenlei yanjiu shuolue" [A brief discussion of verb classification], Zhongguo Yuwen 1990.4: 248-257.
Lin, Maochan-Jing-zhu Yan-Guo-hua Sun
  1984  "Beijing hua lian zi zu zhengchang zhong yin de chubu shiyan" [Preliminary experiments on normal stress in disyllabic phrases in Beijing dialect], Fangyan 1984.1: 57-73.
Ling, Qixiang
  1956  "Guanyu Hanyu goucifa de jige wenti" [Some problems concerning Chinese word structure], Zhongguo Yuwen 1956.12: 12-14.
Lu, Zhiwei
  1957  Hanyu de goucifa [Chinese Morphology]. Beijing: Kexue Chubanshe.
  [1964]  [Revised edition. Beijing: Kexue Chubanshe.]
Lu, Bingfu-San Duanmu
  1991  "A case study of the relation between rhythm and syntax in Chinese", paper presented at the Third North America Conference on Chinese Linguistics, May 3-5, Ithaca, N. Y.
Lü, Shuxiang
  1963  "Xiandai Hanyu dan shuang yinjie wenti chu tan" [A preliminary study of the problem of monosyllabism and disyllabism in modern Chinese], Zhongguo Yuwen 1963.1: 11-23.
  1979  Hanyu Yufa Fenxi Wenti [Problems in the analysis of Chinese grammar]. Beijing: Shangwu Yinshuguan.
  1981  Yuwen chang tan [Talking about language]. Beijing: San Lian Shudian.
  1990  Lü Shu-Xiang wen ji 2 [Collected papers by Lü Shu-Xiang, volume 2]. Beijing: Shangwu Yinshuguan.
Packard, Jerome L.
  1992  "Why Mandarin morphology is stratum-ordered", paper presented at the Fourth North America Conference on Chinese Linguistics, Ann Arbor, May 8-10.
Prince, Alan
  1990  "Quantitative consequences of rhythmic organization", CLS 26. Papers from the 26th Regional Meeting of the Chicago Linguistic Society Volume 2: The Parasession on the Syllable in Phonetics and Phonology, Chicago Linguistic Society, 355-398.
Pulleyblank, Douglas
  1986  Tone in lexical phonology. Dordrecht: Reidel.
Selkirk, Elizabeth
  1984  Phonology and syntax: The relation between sound and structure. Cambridge, Massachusetts: MIT Press.
Selkirk, Elizabeth-Tong Shen
  1990  "Prosodic domains in Shanghai Chinese", in: Sharon Inkelas-Draga Zec (eds.), 313-337.
Sproat, Richard-Chilin Shih
  1991  "The cross-linguistic distribution of adjective ordering restrictions", in: Carol Georgopoulos-Roberta Ishihara (eds.), 565-593.
  1992  "Why Mandarin morphology is not stratum-ordered", Yearbook of Morphology 1992: 185-217.
Wang, Li
  1944  Zhongguo yufa lilun [Grammatical theory of Chinese]. Shanghai: Shangwu Yinshuguan.
Williams, Edwin
  1976  "Underlying tone in Margi and Igbo", Linguistic Inquiry 7.3: 436-468.
Xu, Baohua-Zhenzhu Tang-Rujie You-Nairong Qian-Rujie Shi-Yaming Shen
  1988  Shanghai shiqü fangyan zhi [Urban Shanghai dialects]. Shanghai: Shanghai Jiaoyu Chubanshe.
Yip, Moira
  1989  "Contour tones", Phonology 6: 149-174.
Zhang, Hongming
  1992  Topics in Chinese phrasal phonology. [Unpublished Ph. D. dissertation, University of California, San Diego.]
Zhang, Shizhao
  1907  Zhongdeng guowen dian [Intermediate Chinese Grammar]. Shanghai: Shangwu Yinshuguan.
Zhu, Dexi
  1956  "Xiandai Hanyu xingrongci de yanjiu" [A study on adjectives in modern Chinese], Yuyan Yanjiu 1956.1: 83-112.
  [1980]  [Reprinted with revisions in Zhu 1980: 3-41.]
  1961  "Shuo 'de'", Zhongguo Yuwen 1961.12.
  [1980]  [Reprinted in Zhu 1980: 67-103.]
  1980  Xiandai Hanyu yufa yanjiu [Studies on modern Chinese grammar]. Beijing: Shangwu Yinshuguan.

Prosodic structure and compound words in Classical Chinese* Shengli Feng

1. Introduction

The purpose of this paper is to investigate the nature of compound words in Classical Chinese. I use the term Classical Chinese to cover the language from the Warring States Period (500 BC-200 BC) to the Han dynasty (206 BC-220 AD). My study concentrates mainly on the Han dynasty and the Pre-Qin period (before 221 BC), because compound words in Classical Chinese, as I will show below, developed to a large extent during the Han dynasty. I will discuss the properties of these compounds, the criteria used to identify them, and the reasons for their development.

Three major points are proposed in this paper. First, I argue that compound words did indeed exist in Classical Chinese and that their number increased sharply during the Han dynasty. Second, this development of compounding is chiefly due to disyllabic foot formation, which was newly established around the Han dynasty as a result of the loss of bimoraic feet in Old Chinese (c. 1000 BC). Third, I argue that compounds in Classical Chinese are not only syntactic words, but also prosodic words. The former is shown by the syntactic relations among the parts of the compounds, and the latter is derived from the Prosodic Hierarchy and Foot Binarity in the theory of Prosodic Morphology.

The paper is organized as follows. Section two examines criteria for identifying compounds in Classical Chinese. Section three presents a comparative study of Mencius (c. 372-289 BC) and the commentary on Mencius by Zhao Qi (c. 107-201 AD). Section four discusses previous accounts of the development of compounding. Section five investigates the development of disyllabicity and proposes that the development of disyllabicity is independent of compounding. Section six discusses the phonological changes of Old Chinese (OC) and proposes that the change from the CVC basic (minimal) syllable structure of Old Chinese to the CV basic (minimal) syllable structure of Middle Chinese (MC) inevitably resulted in a loss of bimoraic foot formation. The loss of bimoraic feet was compensated for by the introduction of disyllabic feet, and disyllabic combinations were therefore produced in sharply increased quantity during or after the phonological change. Given this historical development and the monosyllabic nature of the language, I further propose a Word Formation Rule, incorporated with a Foot Formation Rule based on the recent theory of Prosodic Morphology. Section seven discusses some theoretical implications and empirical consequences of the theory developed in this paper. Section eight provides a summary of this study.

2. Criteria for identifying compounds in Classical Chinese

Before we discuss compound words in Classical Chinese, we must first answer the question of what a compound word in Classical Chinese is. For example, the combination of two words tian-zi (The Son of the Heaven, 'Emperor') in Classical Chinese is generally considered a compound, while jun-chen ('monarch and official') is not.1 What is the difference between these two? Are they differentiated syntactically, morphologically, or semantically? Obviously, we need a set of criteria to identify what can be called a compound in Classical Chinese. However, the problem with the criteria proposed to date is that they are not entirely satisfactory for use with Classical Chinese compounds. For example, let us look at the criteria given by Chao (1968):2

(1) a. Part of the item is neutral-toned.
    b. Part of the item is a bound form.
    c. The parts are inseparable from each other.
    d. The internal structure is exocentric.
    e. The meaning of the whole is not compositional of its parts.

If a combination of two morphemes meets one of these criteria, according to Chao, it is considered a compound in Modern Chinese.


Let us consider (1a) first. The "neutral-tone" test is quite reliable for identifying compounds in Modern Chinese. For example, shao.bing (burn-cake, 'pancake') is a VO-compound because the object of the verb has been neutralized (indicated by a dot "." before the syllable). However, this diagnostic is not valid in Classical Chinese, simply because Classical Chinese is an extinct literary language: we do not know whether any part of the two combined forms was neutral-toned. Therefore, (1a) cannot serve as a criterion for Classical Chinese compounds.

According to (1b), if part of the item is a bound form, the item is a compound. However, it is well known that morphemes in Classical Chinese are nearly always free forms, since each part of a compound can be used independently. For instance:

(2) a. xiaoren shao er  junzi     duo  ... guo-jia        jiu  an.
       villain few  but gentleman more ... country-family long safe
       'If there are fewer villains but more gentlemen ... the country will be safe forever.' (Hanfeizi.Anwei)

    b. Jin guo     er  ze   zi  zhi jia    huai.
       Jin country two then you 's  family break
       'If Jin country is broken up, your family will be destroyed as well.' (Zuozhuan.Xiang.24)

In these examples, although guo (country) and jia (family) can form a compound in (2a), they can also be used independently in other sentences, as in (2b). This shows that although two elements are sometimes closely knit together and used as a compound, there is hardly any evidence to show that one of the parts is a bound form in Classical Chinese.3 As a result, criterion (1b) does not work for Classical Chinese compounding either.

Let us consider criterion (1c), that is, the inseparability of the parts of a compound from each other. (3) is an example showing a fairly well-known compound in Chunqiu Fanlu by Dong Zhongshu (179 BC-104 BC):


(3) Tian-zi,    tian   zhi zi  ye.
    heaven-son, heaven 's  son prt
    'An Emperor is the Son of Heaven.'

Tian-zi is a compound, but this does not mean that its two parts cannot be separated. Since Classical compounds are usually composed of free forms, even if the two forms are bound together to form a compound in one circumstance, they may also be used as a phrase of two separate words in other contexts. In other words, the inseparability criterion cannot apply without regard to specific contexts, as examples (2a) and (2b) show. Therefore, criterion (1c) may not be ideal for use with the Classical language.

The two remaining criteria for determining compoundhood are (1d) and (1e). These two criteria seem to work for identifying classical compounds. For example:

(4) a. qi-zi
       wife-children
       'wife'

       qi-zi hao  he
       wife  good marry
       'good marriage with a nice wife.' (Shijing.Tangdi)

    b. dong-jing
       active-quiescent
       'activity'

       cha   qi  dong-jing
       scout his activity
       'To scout his activity.' (Hanshu.Jimichanzhuan)

    c. ju-ma
       carriage-horse
       'carriage'

       daifu       bu  de  zao  ju-ma
       officialdom not can make carriage
       'The officialdom cannot make carriages themselves.' (Liji.Yuzao)

    d. si-ma
       charge-military
       'General (a title in army)'

First, let us consider the criterion of exocentricity indicated in (1d): the internal structure (of a compound) is exocentric, that is, the syntactic form class of the head of the compound is not the same as that of a phrase in which the compound occurs. In other words, syntactic phrase structure rules cannot apply to the internal structure of a compound, which has been considered a corollary of the Lexical Integrity Hypothesis (LIH, Huang 1984). According to the criterion of exocentricity, the example given in (4d) must be a compound, because the verb si (to control) cannot serve as the head of a phrase when si-ma is used as a compound (since si-ma is a noun). However, tian-zi 'Emperor', as we have seen before, should be considered a compound, since it has become a proper noun. Yet in (3), tian zhi zi ye 'The Son of Heaven', zhi, a possessive marker in Classical Chinese, can be inserted into it, which means that a phrasal rule can actually apply to it. Is tian-zi a compound? By (1d) it should not be, but in fact it is. Obviously, (1d) is not a sufficient criterion.

Consider next (1e), i.e., the criterion of semantic noncompositionality. This criterion can be rendered as the following equation ("||...||" indicates the meaning of "..."):

(5)

||AB|| ≠ a + b

Let AB be a combination of two forms A and B, and let the meaning of A be "a" and that of B be "b". If the meaning of AB is compositional, i.e., "a+b", then AB must be a phrase, rather than a compound, given the criterion that the meaning of the whole is not merely a composition of its parts. On the other hand, if the meaning of AB is not "a+b", we will have the following possibilities:

(6) a. ||AB|| = a (left part of AB)
    b. ||AB|| = b (right part of AB)
    c. ||AB|| = c (other)


Accordingly, if a combination of two forms meets one of the three possibilities in (6), it will be considered a compound. Based on the extended formula given in (6), the examples in (4a-4c) must all be analyzed as compound words, because in all of them the meaning of the whole (i.e., AB) is not simply a composition of its parts (i.e., ||AB|| ≠ a+b).

While the semantic criterion seems to work for identifying compounds in Classical Chinese, it is not perfect. For example, in (4c), ju-ma (carriage-horse) meets the condition of the semantic criterion: ||AB|| = a. That is, ju-ma means only "carriage", and the other part of the combination, ma (horse), has no semantic value at all; hence it is considered a compound. (4c) represents a special type of compound traditionally known as "pianyi fuci", a combination using only one meaning of the two.4 At first glance, it would make perfect sense to identify this type of combination as a compound, because if one part of the combination has no meaning, the combination is more like a word than a phrase. However, the problem with this treatment is that, without the sentence given in (4c), ju-ma will not mean "carriage" but "carriages and horses"; that is, the meaning "carriage" for ju-ma is totally dependent on the verb zao (to build/make), and there is no evidence to show that ju-ma (carriage) has been used anywhere else. If ju-ma does not occur freely as a compound, it is difficult to consider it an independent lexical entry.

There is an additional problem. If we treat ju-ma as a compound, what is the function of ma in ju-ma? Although the semantic criterion has identified ju-ma as a compound, it creates a problem for further analysis of the internal structure of the compound. If ju-ma is formed by a syntactic coordination rule, that is, if the structure of ju-ma is syntactically "carriage and horse", how do we explain the fact that half of the structure has no semantic value? Since ma is a noun proper, and not a functional element or a suffix, how can ma be ignored totally within the structure if ju-ma is a compound?

As we have seen above, none of these five criteria works completely for Classical Chinese compounds. However, each of them, except for the neutralization of tones, works to a certain degree for certain types of compounds. For example, compounds created by what is known as the reduplication process (Dobson 1959) are easily identified by the criteria given in (1):

(7) pu-fu 'to creep, to crawl, to toddle'

    chizi pu-fu    jiang ru    jing.
    baby  crawling will  enter well
    'A baby crawling is about to fall into a well.' (Mencius)

It has been observed (Dobson 1959) that compounds which are derived by reduplication may have the meaning "actions or states in a repetitive pattern, succeeding each other". Obviously, this type of compound can easily be identified by either (1b) "part of the item is a bound form" or (1c) "the parts are inseparable from each other" or even (1e) "the meaning of the whole is not compositional of its parts". However, the easiest cases, such as reduplicatives, are in the minority, while the most difficult cases, those that have been called syntactic words (Chao 1968),5 are in the majority, such as the examples given in (2a) and (4). The following statistics (taken from Cheng 1981) show the proportion between these two categories ("Der" refers to derivative compounds and "SynW" refers to syntactic words):

Table 1. Proportion between derivative and syntactic compounds in Confucius and Mencius

Chronology   Texts       Total   Der   %      SynW   %
c. 550 BC    Confucius   180     24    13.3   138    76.7
c. 300 BC    Mencius     333     44    13.2   249    74.8

There are only 13.3% derivatives in Confucius, and 13.2% in Mencius, but 76.7% syntactic words in Confucius and 74.8% in Mencius. If a criterion can only handle 13% of the data in the language, it should not be considered valid. If we consider the development of compound formation through time, what we can see from Cheng's statistical data is that by Han times (c. 100 AD) the derivatives have decreased to only 8.22% among all the compounds in the following table:

Table 2. Proportion between derivative and syntactic compounds in Lunheng (c. 100 AD)

          Total   Der   %      SynW   %
Lunheng   462     38    8.22   424    91.78
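As a quick arithmetic check, the percentages in Tables 1 and 2 can be recomputed from the raw counts. The following is only an illustrative sketch: the dictionary layout and the names `counts` and `share` are assumptions introduced here, while the counts themselves are the ones printed in the two tables.

```python
# Counts as printed in Tables 1 and 2 (after Cheng 1981).
# "der" = derivative compounds, "synw" = syntactic words.
counts = {
    "Confucius": {"total": 180, "der": 24, "synw": 138},
    "Mencius": {"total": 333, "der": 44, "synw": 249},
    "Lunheng": {"total": 462, "der": 38, "synw": 424},
}


def share(part, total):
    """Return part/total as a percentage rounded to one decimal place."""
    return round(100 * part / total, 1)


# Map each text to (derivative share, syntactic-word share).
shares = {
    text: (share(c["der"], c["total"]), share(c["synw"], c["total"]))
    for text, c in counts.items()
}

for text, (der, synw) in shares.items():
    print(f"{text}: Der {der}%, SynW {synw}%")
```

Rounded to one decimal place, the recomputed shares agree with the printed figures, and they make the trend easy to see: the derivative share falls from about 13% in the two earlier texts to about 8% in Lunheng.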


Given that 91.78% of the compounds in the language are "syntactic", we conclude that, in practice, the most effective criterion for identifying compounds in Classical Chinese is the semantic one, that is, the one given by Chao in (1e), formulated in (6) and modified here as (8):

(8) Semantic Criterion:
    If A and B are two independent forms, and the semantic interpretation of A is "a" and that of B is "b"; and if in context X, either
    a. ||AB|| = a (left part of AB), or
    b. ||AB|| = b (right part of AB), or
    c. ||AB|| = c (other)6
    then the combination of AB is a compound in context X.
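For readers who find a procedural statement helpful, the criterion can be sketched as a small decision function. This is an illustration under stated assumptions, not part of the analysis itself: meanings are encoded as sets of gloss labels, "a + b" is modelled as set union, and the function name `classify_combination` is hypothetical.

```python
def classify_combination(meaning_ab, meaning_a, meaning_b):
    """Apply the semantic criterion to one combination AB in one context.

    Each meaning is a set of gloss labels. This set encoding is an
    illustrative assumption, not part of the criterion itself.
    """
    a = frozenset(meaning_a)
    b = frozenset(meaning_b)
    ab = frozenset(meaning_ab)
    if ab == a | b:
        return "phrase"        # ||AB|| = a + b: compositional
    if ab == a:
        return "compound (a)"  # only the left part contributes
    if ab == b:
        return "compound (b)"  # only the right part contributes
    return "compound (c)"      # some other, non-compositional meaning


# Glosses taken from examples discussed in the text.
print(classify_combination({"carriage"}, {"carriage"}, {"horse"}))   # ju-ma in (4c)
print(classify_combination({"emperor"}, {"heaven"}, {"son"}))        # tian-zi
print(classify_combination({"monarch", "official"}, {"monarch"}, {"official"}))  # jun-chen
```

Because the criterion is relative to a context X, the same pair of forms can receive different classifications when the meaning observed in another context differs, which is exactly the difficulty noted above for ju-ma.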

I will adopt the semantic principle as a working criterion to embark upon the following study of Classical Chinese compounds. However, a more theoretical and formal constraint, i.e., the Word Formation Rule (WFR), and the notion of Idiomatized Prosodic Word defined by the Foot Formation Rule, as developed in section six, will be taken to characterize the idiomatic property of compound words in Classical Chinese.

3. Compounding in Zhao Qi's Mencius Zhangju

In order to examine the development of Classical Chinese compounds, I have compared Mencius (c. 372-289 BC) with Zhao Qi's commentary on Mencius, i.e., the Mencius Zhangju (c. 200 AD). The reasons for selecting Zhao Qi's work as a body of comparative data are the following. First, the Han dynasty (206 BC to 220 AD), in which Zhao Qi lived (107-201 AD), was an important transition period from Old Chinese (c. 1000 BC, i.e., the Shijing [The Book of Poetry] period) to Middle Chinese (7th century AD, i.e., the Qieyun [rhyme dictionary] period). It is well known that from Old Chinese to Middle Chinese the language changed a good deal with respect not only to its phonology and morphology, but also to its syntax (Chou 1959; Wang 1980; Mei 1980; Norman 1988; Baxter 1992; and many others). Therefore Zhao Qi's work is a good place to look at the development of Classical Chinese compounds. Second, we can deduce that the language used by Zhao Qi is close to the vernacular of that time. This can be seen from Zhao Qi's preface to the Mencius Zhangju, which I translate as follows:

"... When I took refuge in Haidai (i.e., Shandong province), I had nothing to do except read classical books. Often, I gained new insight from reviewing the classics. During this period, a noble man (i.e., Sun Song) admired my hard work and old age. He often came to me and discussed classical texts with interpretations of those texts ... Under these circumstances, I narrated what I know, and wrote this book ..." (Preface to Mencius Zhangju)

From this, we know that (i) Mencius Zhangju was written during the special time that Zhao Qi had discussions with (or probably gave lectures to) Sun Song, and that (ii) the language used in Mencius Zhangju was based on those discussions or lectures. Thus, we might conclude that Mencius Zhangju is closer to the Han vernacular than most other documents found in this period. Third and most importantly, in Mencius Zhangju, probably because it is close to the vernacular language, Zhao Qi often uses two-character combinations to interpret one-character words in Mencius. I will call this the "one-to-two" interpretation in the following discussion. The "one-to-two" annotations allow us to determine when one character has been replaced by two between the Warring States and the Han periods (300 BC-200 AD).7

The procedures of the investigation for classical compounding in Mencius Zhangju are as follows. First, I list all the tokens that consist of two characters in Zhao Qi that are one-character words in Mencius. For example:

(9) Mencius: shengren    qie  you  guo
             sage-person also have mistake
             'Even sages make mistakes.'

    Zhao Qi: shengren    qie  you  miu-wu
             sage-person also have false-mistake
             'Even sages make mistakes.'

In Mencius, the one-character monosyllabic word guo was used for the concept "mistake". In Zhao Qi's exegesis, the two characters miu and wu are combined to gloss the one-character guo.


In addition to all of the instances of one-to-two translations, I also list annotational materials which contain two-character combinations in Zhao Qi. For example:

(10) Mencius: guan         guo          wu du
              inner-coffin outer-coffin no rule
              'The inner and outer coffins have no rules.'

     Zhao Qi: guan         guo          hou-bao    wu chi-cun    zhi du
              inner-coffin outer-coffin thin-thick no meter-inch 's  rule
              'The thickness of inner and outer coffins have no rules for their size.'

In this example, we have three combinations in Zhao Qi: guan-guo, which is repeated from Mencius, and hou-bao and chi-cun, which have no corresponding words in Mencius but are used by Zhao Qi. I will call this type of annotation "none-to-two". Although this type of data is not a word-to-word annotation like the ones given above, these combinations are nevertheless annotations of meanings implied in that sentence. They provide us with an opportunity to see how meanings are expressed by two-character combinations in the Han period language. Therefore, such examples are also included in the percentage study of this section. As indicated in (10), I also take into consideration the two-character combinations that Zhao Qi repeated from the Mencius, such as guan-guo. I will call this type of annotation "two-to-two".

(11) Mencius: guan         guo          wu du
              inner-coffin outer-coffin no rule
              'The inner and outer coffins have no rules.'

     Zhao Qi: guan         guo          hou-bao    wu chi-cun    zhi du
              inner-coffin outer-coffin thin-thick no meter-inch 's  rule
              'The thickness of inner and outer coffins have no rules for their size.'
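The three annotation types introduced so far can be restated as a simple classification over aligned pairs. The sketch below is illustrative only: it assumes that each two-character combination in the commentary has already been aligned by hand with its source in the base text (or with nothing), and the names `annotation_type` and `pairs` are hypothetical.

```python
from collections import Counter


def annotation_type(source):
    """Classify one two-character combination in the commentary.

    `source` is the corresponding item in the base text, given as a tuple
    of syllables, or None when the base text has no corresponding word.
    """
    if source is None:
        return "none-to-two"
    if len(source) == 1:
        return "one-to-two"
    if len(source) == 2:
        return "two-to-two"
    raise ValueError("expected None or a one- or two-syllable source")


# Hand-aligned pairs from examples (9) and (10); the alignment is assumed given.
pairs = [
    ("miu-wu", ("guo",)),           # glosses the single word guo
    ("guan-guo", ("guan", "guo")),  # repeated from the base text
    ("hou-bao", None),              # no counterpart in the base text
    ("chi-cun", None),              # no counterpart in the base text
]

tally = Counter(annotation_type(source) for _, source in pairs)
print(tally)
```

The hard part of the procedure is the alignment itself, which requires reading the commentary against the text; the classification that follows from it is mechanical.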


Thus we have three types of combinations that we will examine in this study. These are: (a) combinations used to gloss a monosyllabic word (1-to-2); (b) combinations used to explain the meanings or implications of the sentences (0-to-2); and (c) combinations repeated from the original text (2-to-2). Putting all these combined forms together, I then evaluate them according to the semantic criterion for compounding given in the previous section. Since the use of two characters by Zhao Qi to gloss the one character given in Mencius provides an excellent illustration for the study of the development of compounding, we are able to see where and how a monosyllabic word was replaced by a disyllabic compound. The questions we seek to answer are:

i) How many two-syllable combinations used by Zhao Qi can be identified as compounds?
ii) How many one-character words in Mencius have been glossed by compounds in Zhao Qi's annotation?
iii) How many compounds have been used by Zhao Qi in his explanations of meanings and ideas within sentences?
iv) How many compounds used by Zhao Qi have survived into present-day Mandarin Chinese?

As we can see from Table 3, there are a total of 169 two-character combinations in my data: in the Liang Huiwang Shang section of the Mencius Zhangju, there are 113 tokens; in the Gongsun Chou Xia section of Mencius Zhangju, there are 56. Among these 169 cases, 73 belong to the "one-to-two" category, 60 are "none-to-two", and 36 are "two-to-two".

Table 3. Combinations of two characters in Zhao Qi's Mengzi Zhangju and Mencius

         Total   %     Han Compound   %    Modern Compound   %
1-to-2   73      43    34             47   31                42
0-to-2   60      36    39             65   25                42
2-to-2   36      21    29             80   18                50
Total    169     100   102            60   74                44
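The percentages in Table 3 are of two kinds: the first percentage column gives each annotation type's share of all 169 combinations, while the other two give the share of compounds within each row. A short sketch makes the distinction explicit. The counts are those printed in the table; the variable names and the use of whole-number rounding are assumptions of this illustration.

```python
# Counts as printed in Table 3.
rows = {
    "1-to-2": {"total": 73, "han": 34, "modern": 31},
    "0-to-2": {"total": 60, "han": 39, "modern": 25},
    "2-to-2": {"total": 36, "han": 29, "modern": 18},
}


def pct(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


# Column sums, corresponding to the "Total" row of the table.
grand = {key: sum(r[key] for r in rows.values()) for key in ("total", "han", "modern")}

for label, r in rows.items():
    share_of_all = pct(r["total"], grand["total"])  # share of all combinations
    han_rate = pct(r["han"], r["total"])            # Han compounds within the row
    modern_rate = pct(r["modern"], r["total"])      # modern compounds within the row
    print(label, share_of_all, han_rate, modern_rate)

print("Total", pct(grand["han"], grand["total"]), pct(grand["modern"], grand["total"]))
```

With rounding to the nearest whole number, every recomputed figure matches the table except the Han-compound rate for the 2-to-2 row, where 29 out of 36 comes to 80.6%, which rounds to 81 rather than the printed 80.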


From the data given in Table 3, we can see that 73 monosyllabic words in Mencius have been replaced by two-syllable combinations in Zhao Qi, and that 47% of these 73 two-syllable combinations are compounds. In addition to the replacement of one by two, there are 60 cases of "none-to-two". Among these 60 cases, 65% are compounds.

From these data we may conclude the following. First, an ever greater number of compound words were formed during this period. This can be seen clearly from Table 3. Among all of the 169 cases, only 21% of the tokens were disyllabic combinations in the Warring States Period, while 79% of them occurred in the Han dynasty. Although the use of compounds can be traced back to the Shang dynasty (sixteenth to eleventh centuries BC, see Cheng 1981) and a further development can be found during the Warring States Period, it is evident that a sharp increase in compounding occurred during the Han. The 73 one-to-two cases show that 43% of the time the Han people used disyllabic forms where the people who lived in the Warring States Period used monosyllabic forms in the same linguistic contexts. All of these pieces of evidence suggest that the people of the Han period used more two-syllable combinations or compounds to express the same concepts which were expressed using monosyllabic words during the Warring States Period.

Secondly, the data also suggest that the development of compounds correlates with the appearance of disyllabic combinations. There are 169 disyllabic combinations in Zhao Qi, and by the semantic criterion, only 43% of compounds have appeared. The fact that there are more two-syllable combinations appearing in the language but that fewer compounds can be identified indicates that the appearance of two-syllable combinations may be the fundamental basis of the development of compounds in Classical Chinese.

The data from Zhao Qi comport with the general observation that Classical Chinese compounds are structurally formed using rules from syntax. The following syntactic relations between the two parts of compounds are observed in my data:

(12) Coordinating Compounds
     a. NN   chi-cun    meter-inch, 'size'
             yi-shi     cloth-food, 'daily use'
     b. VV   cun-duo    think-measure, 'ponder'
             zeng-kui   send-give, 'make a present of'
     c. AA   xian-zu    dangerous-blocking, 'difficult'
             chun-cui   pure-best, 'unadulterated'

     Subordinate Compounds
     d. AN   gua-ren    single-person, 'I' (1st person pronoun for Emperor)
     e. NN   guo-ren    country-person, 'aristocrat'

There are no S(ubject)-P(redicate), V(erb)-R(esultative complement) or V(erb)-O(bject) compounds in my data. This indicates that coordinative and subordinative relations are the most favored structures for compound formation, and that VR-structures, VO-structures and SP-structures are disfavored. The comparison between Mencius and Zhao Qi shows that a) an ever greater number of compounds developed during the Han dynasty; b) the development of compounds is based on the development of disyllabic combinations; c) compounds must be formed structurally from syntax; and d) coordination and subordination are favored structures for compounding while Verb-Object, Subject-Predicate, and Verb-Resultative are disfavored. All these facts about Classical Chinese compounds call for a theory to explain why they exhibit such properties during the course of their development.

3.1. Questions regarding the development of compounds

In this section I will address questions arising from the study of Zhao Qi's data and studies of compounds in general. First, if, as indicated in Zhao Qi's data, compounds are derived from two-syllable combinations or phrases, then why are coordinative and subordinative compounds very common, but Verb-Object, Verb-Resultative, and Subject-Predicate compounds extremely rare?


Second, if coordinative structures such as cao-mu (grass-tree) and lin-mu (woods-tree) can develop into compounds by specialization of meaning ("grass and trees" → 'vegetation', "woods and trees" → 'woods', respectively), then why do we not find three-character coordinative structures such as cao-mu-shu (grass-woods-tree) as a result of the same process: "grass, woods, and trees" → 'vegetation'?

Third, if syntax determines the internal structure of compounds, then coordinative compounds such as dong-jing (active-quiescent, 'activity') and ju-ma (carriage-horse, 'carriage') must be structurally interpreted as "active and quiescent" and "carriages and horses" respectively. However, the semantics of these compounds does not allow us to give a full interpretation of the meanings conveyed by each part of the compounds in these structures: ju-ma is not interpreted as 'carriage and horse', but as 'carriage'; dong-jing is not interpreted as 'active and quiescent', but as 'active'. The interpretation requires the other part of the compound to be semantically empty, and the syntax of such coordinating structures must thus be interpreted as "dong and ∅", "ju and ∅". How can a syntactic rule allow a coordinative structure with the second part semantically empty? If the second part of a coordinative structure has no semantic value, what does "coordination" mean structurally and semantically?

Fourth, why was an ever greater number of compounds produced specifically around the period of the Han dynasty? Why did the Chinese language suddenly have such a strong tendency toward the formation of compounds?

Finally, if coordinative and subordinative structures are both the most productive types of compounding, why were there more subordinative compounds than coordinated words at the beginning of their development (the Spring and Autumn Period, c. 550 BC; see Cheng 1981)? Also, why did coordinative compounds become more and more dominant after the Warring States Period (c. 221 BC), while the number of subordinative compounds declined (Cheng 1981)?

As a response to these questions, I propose that the sharp increase in the number of compound words is a consequence of the development of disyllabic feet resulting from the syllable-structure simplification that occurred from Old Chinese to Middle Chinese. In what follows, I will first review some previous accounts of the development of compounds in section four, and then propose in section five that the development of disyllabicity is independent of compounding. Section six shows how the syllable-structure simplification resulted in disyllabic feet. Some theoretical and empirical implications will be discussed in section seven.


4. Previous accounts for the development of compounding

There have been a number of hypotheses to explain the development of compounding in Chinese. To date, the answers that have been provided are mostly functionally oriented.

4.1. Loss of phonological contrast

Norman (1988), for example, has suggested that it was "... chiefly due to phonological attrition, which greatly decreased the number of phonologically distinct syllables in the language" (1988: 86). If phonologically distinct syllables were merged into phonologically nondistinct syllables, it would result in a great increase in the number of homophones in the language. It seems quite reasonable to assume that the increase in the number of compounds around the Han dynasty is a result of the phonological changes in the language.

Let us first consider the argument that compounding was caused by phonological attrition. Among the facts known about phonological attrition from Old Chinese to Middle Chinese, two changes have been posited in Sino-Tibetan studies: consonant cluster simplification and the loss of morphological affixation. Haudricourt (1954 [1972]) proposed that the departing tone in Middle Chinese originated from a suffix *-s in Old Chinese, a hypothesis that has been widely accepted in the literature (Mei 1994; Baxter 1992; and many others). Following this hypothesis, all departing tones of Middle Chinese originally ended in *-s in Old Chinese, from which we may infer a final consonant cluster in CVC8 roots: *CVC-s. These clusters were lost in the transition to Middle Chinese. Not only does the suffix *-s allow us to reconstruct final consonant clusters, but sets of characters which shared a common phonetic element (Xiesheng) also lead to the reconstruction of initial consonant clusters in Old Chinese. For example, a cluster *sm- has been reconstructed for Old Chinese in examples such as the following (see Baxter 1992: 175):

(13) Modern Chinese       Middle Chinese
     sang             <   sang
     wang             <   mjang

<
Vn > V > V

According to a sociolinguistic study of the Beijing dialect by Barale (1982), the final-nasal consonant attrition noted by Chen follows a process of nasalization of the preceding vowel as seen in (29). Furthermore, Wang (1993) suggests that Mandarin Chinese syllables can all be analyzed as open syllables, that is, the maximal syllable structure in Mandarin Chinese is arguably CV.16 Juxtaposing the syllable endings of different periods gives us a clear picture of the process of syllable-structure simplification throughout Chinese history:

(30) OC: CCVC(C)(C)
     MC: CV(-m, -n, -ng, -p, -t, -k)
     OM: CV(-m, -n, -ng)
     MM: CV(-n, -ng)
     MM: CV (on-going)

That is, the first step is to drop the "post-coda" (Baxter 1992), and the second step is to drop the coda. There is clearly a strong tendency to simplify Chinese syllable structures by dropping final consonants.
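The coda inventories in (30) can be compared stage by stage to confirm that each stage only removes final consonants and never adds one. The sketch below is an illustration under stated assumptions: the stage labels are kept as abbreviated in (30), the last row is read as the on-going change, and Old Chinese is left out because its final clusters do not fit a simple set of single codas.

```python
# Final consonants permitted at each stage, as listed in (30).
# Reading the last row as the on-going change is an assumption about
# the layout of the diagram.
stages = [
    ("MC", {"m", "n", "ng", "p", "t", "k"}),
    ("OM", {"m", "n", "ng"}),
    ("MM", {"n", "ng"}),
    ("MM (on-going)", set()),
]


def losses(stage_list):
    """Return (earlier, later, lost codas) for each pair of adjacent stages."""
    return [
        (name1, name2, sorted(codas1 - codas2))
        for (name1, codas1), (name2, codas2) in zip(stage_list, stage_list[1:])
    ]


# True if no stage allows a coda that the preceding stage lacked.
only_shrinks = all(
    later <= earlier
    for (_, earlier), (_, later) in zip(stages, stages[1:])
)

for earlier, later, lost in losses(stages):
    print(f"{earlier} -> {later}: lost {lost}")
print(only_shrinks)
```

On this encoding the stop codas are lost first, then -m, and finally the remaining nasals, which is the one-directional simplification described in the text.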

6.2. Metrical theory and Old Chinese syllable structure

In metrical theory, syllables with a CVCC structure are heavier than syllables with a CVC structure, and CVC syllables are heavier than CV syllables (see Goldsmith 1990, and references cited there). The process of syllable simplification outlined above clearly shows that syllable weight has continuously declined throughout Chinese history. Given this fact, the phonological change from Old Chinese to Middle Chinese can be characterized in terms of syllable-weight reduction. An important consequence of the syllable-weight reduction within the new system, I propose, is that a single syllable was not "heavy" enough to form a minimal independent prosodic unit, that is, a foot. In other words, the new system requires the minimal prosodic unit (the foot) to be formed not by one, but by two syllables. This hypothesis implies that a one-syllable foot was permissible before the final clusters disappeared, but not afterward.

Theoretically, this may be justified as follows. In prosodic phonology, the structure of a foot can in general be characterized as consisting of one relatively strong and any number of relatively weak syllables dominated by a single node (see, among others, Liberman and Prince 1977; Kiparsky 1979; Nespor and Vogel 1986). Therefore, the structure of a binary foot would be as follows ("f" stands for a foot, and "σ" for a syllable):

(31)     f
        / \
       σ   σ
       s   w

However, based on an analysis of a large number of languages, Hayes (1980) concludes that there are fairly strong restrictions on the grouping of syllables into feet in any given language. That is, a language may have either binary feet, consisting of two syllables each, or unbounded feet, consisting of (theoretically) any number of syllables. In addition to these types of foot structures, one-syllable feet are also found, although they are highly marked. Since Old Chinese was basically a monosyllabic language, it is reasonable to assume that while a foot in Old Chinese may have consisted of more than one syllable, a one-syllable foot would also have been allowed, because the maximal syllable structure in Old Chinese was CCCMVCCC (Ting 1979; Yu 1985), which is, in prosodic terms, not only a heavy, but a "super-heavy" syllable structure. Heavy syllables with complex structures may independently form feet, while light or weak syllables with simple structures may require another syllable to form a foot (see McCarthy—Prince 1993). Since the Chinese syllable went from heavy to weak, it lost the ability to independently form a foot.

Compound words in Classical Chinese


6.3. Syllable structure simplification as a cause of possible disyllabic feet

Within the framework of prosodic phonology, whether a syllable is heavy or not depends on whether the rhyme constituent of the syllable is geometrically branching. A heavy syllable is defined as one having a branching rhyme, and a light syllable is defined as one without a branching rhyme. The "weak-nodes-don't-branch" principle of metrical theory would allow a CVC syllable to have the following structure:17

(32) [onset C] [rhyme [nucleus V] [coda C]]

Thus, aside from the obvious increase in length, a CVC syllable structure must also be considered theoretically "heavier" than a CV structure, because it has a branching rhyme. Note that this is exactly the difference between Old Chinese and Middle Chinese with respect to their basic (minimal) syllable structures, as proposed by Ting (1979), Li (1980), and Yu (1985). Furthermore, the syllable structure of Old Chinese is not only minimally CVC, but also maximally CCCMVCCC, i.e., a consonant cluster is allowed in word-final position. The hypothesis that final clusters created super-heavy syllables in Old Chinese is supported by looking at other languages where final clusters also form super-heavy syllables. For example, in Arabic, word-final consonant clusters are permitted and syllables that contain such final clusters are super-heavy. The interesting fact about Arabic is that it is the syllable-final consonant that "creates" the super-heavy syllables. McCarthy (1979 and elsewhere) suggests a structure like (33) for super-heavy syllables (see also Goldsmith 1990: 198):

(33) [onset C] [rhyme V C C]


Shengli Feng

What is crucial to note here is that the final CC cluster in a syllable, metrically speaking, behaves differently from a single consonant. This is not to say, of course, that Old Chinese was necessarily exactly like Arabic in terms of prosodic structure.18 Nonetheless the Arabic case provides us with evidence that final consonant clusters may create super-heavy syllables, allowing such syllables to independently form a foot. Thus the complex syllabic structure in Old Chinese may hypothetically be organized in terms of "Foot", according to (33), as follows. (34)

[Diagram (34): the onset and rhyme of the Old Chinese syllable grouped under a single foot node.]

The hypothesis that Old Chinese had a heavy syllable structure and hence permits one-syllable feet is also supported by the moraic theory of syllable structure, in which a mora (μ) is dominated by a syllable node (σ) and syllables are dominated by feet (f). The syllable node (σ) may dominate one or two mora nodes, with each mora dominating at most one segmental element. Consequently, consonants are daughters of σ (see McCarthy—Prince 1993: 21). The following structures illustrate this analysis:

the Foot Binarity Principle: Foot Binarity (McCarthy-Prince, 1993: 43): Feet must be binary under syllabic or moraic analysis. Based on the moraic theory of syllable structure and the Foot Binarity Principle, the structure (35 a) cannot form a foot because there is only one mora, which violates the Foot Binarity Principle. Structure (35 b),


however, will form a perfect foot because there are two moras, thus meeting the Foot Binarity Principle requirement that a foot must be at least bimoraic. Based on this theory, we may reasonably propose that the basic syllable structure of CVC was able to serve as an independent foot in Old Chinese,19 as shown in (36). (36)

Note that this theory also predicts that if final consonant clusters are dropped from the language, the structure will change from (a) to (b), as illustrated in (37):

(37) a. [f [onset C] [rhyme V C C]]   (CVCC; the rhyme segments are linked to moras, μ)
     b. [f [onset C] [rhyme V C]]     (CVC; V and C are each linked to a mora, μ)

The prosodic weight of the CVCC foot is reduced. If we assume that the loss of coda reduces the minimal syllable structure to CV in Middle Chinese, we lose the phonological basis for bimoraic feet: (38) a.


The loss of the post-coda results in the loss of the super-heavy syllable structure, and the loss of the coda results in the loss of the moraic branching structure. Since both apparently occurred in the language, the resulting structure would no longer be able to serve as an independent foot.20 Furthermore, if the language changed its syllable structure systematically from (a) to (b) in (37) and (38), the result would have been that one-syllable words (since Old Chinese is basically a monosyllabic language) would no longer constitute independent feet. If this is so, two-syllable combinations would come to play a major role in foot formation in the language. To restate, the bimoraic foot disappeared in Old Chinese due to the loss of final consonants and consonant clusters. This, in turn, led to the loss of heavy and super-heavy syllables. Since the foot is an obligatory level of prosodic structure, according to the theory presented above (see also Selkirk 1980b; McCarthy-Prince 1991, 1993; Kager 1992, and many others), the language made up for the loss of bimoraic feet by replacing them with disyllabic feet. Therefore, the change of syllable structure from Old Chinese to Middle Chinese may be prosodically characterized as a change from bimoraic to disyllabic feet, resulting in the tendency to form two-syllable combinations.21
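The weight argument of this section can be illustrated with a small sketch. It is not part of the original analysis: the function names are hypothetical, and the weight measure is a deliberate simplification that treats everything before the vowel as a weightless onset and counts the vowel and each following consonant as one unit of weight. Under these assumptions, Foot Binarity is satisfied either by two syllables or by a single syllable with a weight of at least two.

```python
# Illustrative sketch only. The weight measure is an assumption made for
# this example, not the chapter's formal definition of the mora.

def rhyme_weight(template: str) -> int:
    """Return the number of rhyme segments in a template such as "CVC".

    Segments before the vowel "V" are treated as onset and carry no weight;
    the vowel and every segment after it count as one unit each.
    """
    if template.count("V") != 1:
        raise ValueError("template must contain exactly one V")
    return len(template) - template.index("V")


def can_form_foot(syllables: list[str]) -> bool:
    """Foot Binarity: a foot is binary under syllabic or moraic analysis."""
    if len(syllables) >= 2:
        return True  # binary at the syllable level (disyllabic foot)
    # a single syllable must be binary at the level of weight units
    return len(syllables) == 1 and rhyme_weight(syllables[0]) >= 2


# Successive simplification of the syllable: CVCC, then CVC, then CV
for shape in ["CVCC", "CVC", "CV"]:
    print(shape, rhyme_weight(shape), can_form_foot([shape]))

# A light syllable needs a partner to form a foot
print("CV + CV", can_form_foot(["CV", "CV"]))
```

On this simplified view, only the CV shape fails to form a foot on its own, which is the situation the text attributes to the language after the loss of final consonants.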

6.4. Grammatical evidence for the disyllabic foot

I have argued elsewhere (Feng 1994) that a monosyllabic word was unable to form an independent foot during the Warring States Period (475-221 BC), while a two-syllable unit was able to form a standard foot in positions where an independent foot was required, as shown in the following analysis.

(39) a. zi he yan?
        you what say
        'What do you say?' (Shangshu. Yiji, 1000 BC)

     b. Shi du zun he zai?
        it only follow what prt
        'What does it expressly follow?' (Lunheng.Huoxu, 100 AD)

Although Classical Chinese of the Pre-Qin period (221 BC) is basically an SVO language, different types of SOV order are clearly observed. For


example, if the object of a verb is a wh-expression, it must occur directly to the left of the verb, as shown in (39a). This type of SOV word order (i.e., wh-V) changed after the Han dynasty: in example (39b), from the Han text Lunheng, the wh-object follows the verb. However, when an object wh-expression is formed by two constituents, e.g., he zui 'what guilt', it does not appear to the left of the verb before the change from [wh-V] to [V-wh]: *[what-N V].

(40) *Song he-zui you?
      Song what-guilt have
      'What guilt does Song have?'

Rather, the structures that are allowed are [what-N pro-V] or [V what-N]. For example:

(41) a. Song he-zui zhi you.
        Song what-guilt it have
        'What guilt does Song have?' (Mozi. Gongshu)

     b. ... you he jiu yan?
        ... have what old complain
        'What grievance do (you) have?' (Jinyu. 4, Wei Zhao Zhu)

Either a pronoun zhi 'it' is inserted between the wh-expression and the verb in earlier documents, or the wh-expression appears to the right of the verb after the Han dynasty. The question, then, is why *[he-zui you] (what-guilt have) is not well-formed while [he you] (what have) is. In Feng (1994), I propose that (i) Proto-Archaic Chinese was an SOV language, and it changed into an SVO language, which is what we see as Old Chinese (1000 BC). Based on this, SOV orders such as the [wh-V] structure are considered remnants of the change from SOV to SVO. In order to account for the survival of SOV phenomena, I propose that since Classical Chinese was basically an SVO language, the primary sentential stress falls on the right side of the sentence.22 A Sentential Prosodic Rule is thus formulated as follows:

(42) Sentential Prosodic Rule
     For [X Y]P, if X and Y are constituents of P, and if P is the last phrase of a sentence, then Y must be stressed.

According to the Sentential Prosodic Rule, a sentence is acceptable if the last element of the last phrase is properly assigned a stress, otherwise it will be ill-formed prosodically. Following this analysis, the non-existence of (40) is accounted for by saying that you is the last element of the sentence, and the last phrase that contains you is the VP structure *[he zui you], therefore, he zui will be the X and you is the Y of the Sentential Prosodic Rule. However, the monosyllabic word you 'have' is not heavy enough to act as an independent foot to realize the primary stress in the following structure: (43)

   *[ X          Y ]
      w          s
     / \         |
    σ   σ        σ
    he  zui      you

That is, within the prosodic domain of [X Y]VP, X consists of a branching node, while Y consists of only a non-branching node; therefore, Y cannot realize the sentence-final stress. Technically speaking, according to Liberman and Prince's relative-prominence principle (1977), a strong node must be licensed by a weak node. This implies that the stress cannot be realized on Y itself, because it is a single node, and as I have argued before (see 6.3), one syllable cannot serve as a branching node in a prosodic structure. In the branching node VP, Y still cannot realize the stress, because X, the sister node of Y, is a branching node, and is prosodically stronger than a single node, i.e., than Y. As a result, (43) must be ruled out. This argument predicts that if another syllable is attached to the node Y in (43), or the elements under the X node reduce to a monosyllabic wh-expression, then the sentence-final stress can be realized (on a disyllabic foot), and the sentence will be grammatical. This is exactly what happened, as we can see below. He you 'what have' is grammatical, because he you is not only the last phrase but also a minimal prosodic unit, namely, a foot. Therefore the primary stress can be assigned to the right element you.


(44) [ X        Y ]
       w        s
       |        |
       σ        σ
       he       you
       'what'   'have'

You he-zui 'have what guilt' is also grammatical because he-zui is the last phrase (NP) and these two words form an independent foot with the stress on the right, satisfying the requirement of the Sentential Prosodic Rule.

The structure of he-zui zhi you 'what guilt it-have' is also acceptable, because he-zui zhi you forms the last phrase (VP) in the sentence, where he-zui is still the object of the verb and zhi is cliticized onto the verb you23 (forming a prosodic foot with you); hence zhi-you would be interpreted as the Y and he-zui as the X, as illustrated in (46).

(46) [ X          Y ]
       w          s
      / \        / \
     σ   σ      σ   σ
     he  zui    zhi you


Since there are two syllables under the Y position, they can form a standard foot so that the stress can be assigned to it, satisfying the Sentential Prosodic Rule. Note that if he you is grammatical, there is no reason to rule out he-zui you either syntactically or semantically. The only difference between these two structures, I argue, is their prosodic structure. Thus the best way to explain the non-existence of *he-zui you is to assume that you is a monosyllabic word, and one syllable is not heavy enough to act as a standard foot.24 This is further confirmed by examples of the following kind, in which an extra, meaningless syllable is used in order to form a disyllabic foot.

(47) Huo-yi, She zhi wei wang tantan zhe.
     Great-yi, She Nom.prt being King magnificent prt
     'Great, the way that She became a king is magnificent!' (Shiji. Chenshe Shijia)

The sentence is traditionally taken to be closest to the vernacular given by Sima Qian (145-? BC). Probably because the word huo-yi used in the Chu dialect is relatively uncommon, Fu Qian (c. 184?-? AD) glosses it:

(48) Chu ren wei duo wei huo, you yan yi zhe,
     Chu people call great is huo, again say yi N.prt.
     zhu-sheng zhi ci ye.
     support sound 's word prt.
     'In Chu dialect, the word for "great" is "huo". However, "yi" is added to make the sound better.' (Fu Qian, Shiji.Suoyin)

According to Fu Qian, the meaning of the exclamation expression huo-yi in (47) is interpreted as the same as the monosyllabic left-hand constituent huo, thus making yi semantically empty. Here, the addition of yi to huo occurs to lend metrical support to huo as Fu Qian notes. The fact that a monosyllable needs extra "sound support", while a disyllabic unit does not (see Guo 1985), indicates that a monosyllable is not heavy enough prosodically to act as an independent foot needed to realize the stress on an exclamation or a focus expression. Therefore, the use of "sound support" on a monosyllable provides further evidence for the argument that a disyllabic unit constitutes a standard foot.
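The licensing pattern in (40)-(46) can be summarized as a small decision procedure. The sketch below is only an illustration under stated assumptions: the function name is hypothetical, X and Y are represented as lists of syllables, and the condition encodes the reading that Y can realize the final stress either when it is itself disyllabic or when a monosyllabic X and a monosyllabic Y together make up one foot.

```python
# Illustrative sketch of the Sentential Prosodic Rule (42) combined with the
# disyllabic-foot requirement. Names and representation are assumptions.

def final_stress_ok(x: list[str], y: list[str]) -> bool:
    """Return True if Y in the final phrase [X Y] can realize the stress."""
    if len(y) >= 2:
        return True  # Y is a disyllabic foot by itself
    # a monosyllabic Y is licensed only if X is also a single syllable,
    # so that X and Y together form one foot
    return len(x) == 1 and len(y) == 1


examples = {
    "*he-zui you": (["he", "zui"], ["you"]),            # (43)
    "he you": (["he"], ["you"]),                        # (44)
    "(you) he-zui": (["he"], ["zui"]),                  # last phrase is the NP he-zui
    "he-zui zhi-you": (["he", "zui"], ["zhi", "you"]),  # (46)
}

for label, (x, y) in examples.items():
    print(label, final_stress_ok(x, y))
```

Only the starred pattern is rejected, matching the distribution described in the text.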


6.5. Disyllabic feet, prosodic words, and phrase structure

Given the prosodic arguments in section 6.3 and the grammatical evidence in 6.4, a Foot Formation Rule for Classical Chinese is formulated as follows:

(49) Foot Formation Rule in Classical Chinese
          f
         / \
        σ   σ
     A standard foot must be formed by at least two syllables.

As we have seen, disyllabic feet resulted from syllable reduction; therefore the Foot Formation Rule must apply chronologically after the loss of final consonant clusters in Old Chinese. As shown before, there was a sharp increase in disyllabicity during the Han dynasty, and it is well known that by the Han dynasty, final consonants and consonant clusters had almost completely disappeared (Mei 1980, Baxter 1992). The fact that the development of disyllabicity followed the loss of the final consonants and consonant clusters is chronological evidence corroborating the Foot Formation Rule given in (49). If the above analyses are correct, we have answered the question of where the tendency to disyllabify originated. Recall that I have also argued (e.g., 5.3) that the development of disyllabicity was theoretically independent of compounding. The Foot Formation Rule in (49) may be considered system-internal evidence supporting this hypothesis.

Now, if disyllabicity did not directly result in compounding, why did the Classical morphology proceed in the direction of compounding, and what is the relationship between the development of disyllabicity and that of compounding? I argue that although the development of disyllabicity is inherently independent of the development of compounding, the Foot Formation Rule played a crucial role in word formation in Classical Chinese. This is not because disyllabicity is inherently related to compounding as a means of word formation, but because Classical Chinese was basically a monosyllabic language. Once the monosyllabic nature of the language is assumed, disyllabicity can then be considered a "cause" for the development of compounding.


6.5.1. Minimal Prosodic Word

The relationship between disyllabicity and compounds can be naturally derived from the recent developments in the theory of Prosodic Morphology (see McCarthy-Prince 1993). In prosodic morphology, prosodic restrictions are defined in terms of prosodic units such as mora, syllable, foot, and prosodic word (PrWd), which are hierarchically organized (see Selkirk 1980a, 1980b; McCarthy-Prince 1993):

(50) Prosodic Hierarchy
         PrWd
          |
         Foot
          |
        Syllable
          |
         Mora

In this theory, any instance of the category Prosodic Word (PrWd) must contain at least one foot. According to Foot Binarity, every foot must be bimoraic or disyllabic. Thus a PrWd must contain at least two moras or syllables. The "at-least" requirement automatically leads to a notion about what would be the smallest Prosodic Word: a minimal Prosodic Word is a metrical Foot. As argued by McCarthy—Prince (1993), the Prosodic Hierarchy and Foot Binarity, taken together, derive a notion "Minimal Word". We shall see below how the notion "minimal word" interacts with the disyllabic foot and phrase structure rules.

6.5.2. Phrase Structure Correspondence and Idiomatized PrWd

As mentioned before, Classical Chinese was a monosyllabic language. If a foot must be formed by two syllables, and each syllable is a word in the language, the only way to make disyllabic feet in the language would have been to group two words together as shown below ("W" stands for Word):

(51)    f
       / \
      σ   σ
      |   |
      W   W

In other words, a disyllabic foot inevitably results in a two-word prosodic combination. That is,

(52) Foot = σ + σ = W + W

a disyllabic foot must be based on a two-word combination in the "monosyllabic" system. However, such combinations are also constrained by phrase-structure rules in the language.25 It follows that feet that are realized on two words would often happen to correspond to phrases (XP):

(53) Foot = σ + σ = W + W = XP

That is, the equation "σ = W" inevitably leads to the equation "F = XP". Once a foot corresponds to a phrase, the prosodic foot will merge with the phrase, due to structural isomorphism. When this happens, the following situation results:

(54)       F
          / \
     σ = W   σ = W
          \ /
           XP

(54) illustrates that a correspondence between a prosodic foot and a syntactic phrase will eventually lead to a merging of these two structures. Since, by the Prosodic Hierarchy in Prosodic Morphology, a foot is dominated directly by the Prosodic Word, and the minimal prosodic requirement for a word is the presence of one foot, the merging of a prosodic category (a foot, the minimal prosodic requirement for "word") with a syntactic category (a phrase) would readily satisfy the Prosodic Word requirement. Therefore, the merged structures all have the potential to form PrWds in the prosodic morphological system. Note that the prosodic integrity of Foot always forces two elements in a phrase to be closely


knitted together; hence one element cannot occur without the other, otherwise it will violate the minimality requirement for being a prosodic word. However, when a prosodic word is used repeatedly in the language, the two elements in that phrase will be fixed, resulting in what I will call an Idiomatized Prosodic Word. This analysis proceeds from the assumption that idioms are phrasal categories. Note that with only one further step, the Idiomatized PrWds can be lexicalized as compounds. That is, compounds are lexicalized idiomatic phrases. If the above analysis is correct, given the Prosodic Hierarchy in (50), the structure (54), which is derived from the Foot Formation Rule together with the monosyllabicity of the language, would be considered the Word Formation Rule for Classical Chinese, formulated as (55):

(55) Word Formation Rule in Classical Chinese
         PrWd
          |
          F
         / \
        X   Y
         \ /
          XP
     X and Y form a prosodic word iff the combination of X with Y simultaneously satisfies the syntactic and prosodic conditions of being a phrase and a foot, respectively.

Note that compound words in Classical Chinese are syntactic words because they historically originated from disyllabic phrases. Compound words are prosodic words also, because they are lexicalized idiomatic PrWds. This entails that not every phrase can develop into a compound, but only those which meet the prosodic requirements. Neither can any foot be identified as being a compound, but only those that represent an independent syntactic unit, i.e., a phrase. By prosody, only phrases that fit the description of being one foot are eligible to become compounds. By syntax, only feet that represent independent phrases are qualified to be compounds. Given all the analyses above, the origin of compounding can now be described as follows: the phonological change of Old Chinese resulted in a disyllabic foot; the disyllabic foot, in turn, resulted in disyllabic PrWds; disyllabic PrWds are formed by two-syllable phrases, given the monosyllabic property of the language; and the two-syllable phrases are idiomatized in usage, becoming Idiomatized PrWds. When Idiomatized PrWds


are lexicalized, they become an X° level category item, i.e., a compound word in the lexicon, as illustrated in (56):

(56) f [A B]XP → PrWd [A B]XP → Idiomatized PrWd [A B]XP → Compound Word [A B]X°

This, I argue, is how disyllabic phrases, compounds and the prosodic morphological system came about. Since the disyllabic foot became standard, and a foot is the minimal unit for a PrWd, forming a standard foot in the language will eventually lead to idiomatized PrWds, and the ensuing compounds in the language. Compounds are therefore the result of foot formation. This is why disyllabic compounds increased in number after the establishment of foot formation. Given the theory presented here, the semantic criterion in (8) can therefore be replaced26 and most separable disyllabic combinations will be treated as Idiomatized PrWds listed in the dictionary. Compounds are only those that have clearly undergone a process of Lexicalization (or a category changing rule, see Feng 1995: 141) such as si-ma 'general' of (4d). The theory presented here requires that the prosodic requirement of being one foot and the syntactic relation of being a phrase interact to determine PrWds and compounds in Classical Chinese: the syntax determines the structural relation between the elements of a compound, and the prosodic template of a foot determines the metrical shape of that compound. Compounds are identified only by a process of lexicalization. Any two-syllable combination that is closely knitted together and listed in the dictionary, but exhibits some phrasal properties, will belong to the category of Idiomatized PrWds. Under the treatment of Idiomatized PrWd, it is little surprise that the two forms A and B in coordinating structures, such as tu-shu 'picture and book' given in (25), can be formed as either tu-shu or shu-tu. This is because they are idiomatized phrases, and both orders, AB and BA, satisfy the requirements for an Idiomatized PrWd in a coordinating structure. This also explains why ju-ma 'carriage and horses' can be formed by two words, but without the surface meaning of the second word, as seen in (18). Because the Foot Formation Rule demands that a minimal prosodic word be formed by at least two syllables, ju must take another word (here, ma, in the same semantic field with ju) to meet this


requirement. The PrWd licenses ju-ma to function as an independent prosodic unit, even though the actual meaning of ju-ma is focused on only ju.

7. Empirical consequences and theoretical implications

If, as I have argued, the bimoraic foot lost its phonological basis, and the two-syllable unit came to constitute the standard foot in Classical Chinese, what we would expect empirically is for two-syllable combinations to become more and more common during the course of the change. Given the fact that Classical Chinese was basically a monosyllabic language, and given the Foot Formation Rule requirement that a standard foot must be formed by a unit at least two syllables long, the only way to make a disyllabic foot in the language would have been to group two words together. As shown above, a disyllabic foot would often result in two-word prosodic combinations, and such combinations would also be constrained by phrase-structure rules in the language. It follows that the prosodic foot would, in turn, often result in Idiomatized Prosodic Words. If two-word combinations were the only way to realize disyllabic feet, we would expect that, in the early stages, naturally-occurring syntactic two-word phrases would be highly preferred candidates to act as two-syllable feet. More explicitly, it is more likely that naturally-occurring phrases would bear two-syllable feet than it is that entities (two-syllable words) would be created expressly for that purpose. If disyllabic feet are originally realized on naturally-occurring phrases, the result of these developments would be the following (">" means "results in"):

(57) Phonological change > disyllabic feet > disyllabic phrases > idiomatized PrWd > compounds
                         1                 2                    3                  4

Since disyllabic feet are mostly realized on syntactic two-word phrases, it is likely that naturally-occurring two-word phrases would be the first candidates for disyllabic feet at the beginning of the development of disyllabicity. Also compounds would originate from these naturally-occurring disyllabic phrases. This hypothesis receives support if we find that disyllabic combinations (phrases or compounds) in Classical Chinese did indeed originate


from naturally-occurring disyllabic phrases, rather than from those expressly created for prosodic requirements. How can we distinguish the naturally-occurring phrases from those created expressly for prosodic requirements? Furthermore, how do we distinguish disyllabic combinations that originated from naturally-occurring phrases from those that were created expressly for the prosody? Considering the first question, we have seen that there are two structures which are very productive for compounding, namely, coordinating and subordinating structures (see section 3). We also know that each of these structures can be formed by different types of syntactic relations between the two elements they contain. For example, the coordinating structure can be formed by a noun plus a noun, or a verb plus a verb, etc., and the subordinating structure can be constructed by a noun modifying a noun, or an adjective modifying a noun, etc. According to Cheng (1981), there are 6 types of coordinating and 9 types of subordinating structures, as shown in Table 6 (N = noun, A = adjective, V = verb, P = pronoun, Num = number).

Table 6. Types of coordinating vs. subordinating structures

I. Coordinate structures

   Types            Examples     Gloss
1. NN > N           jia-bing     'armor-weapon; war, military'
2. VV > V           gong-ji      'attack-assault; to attack'
3. AA > A           kong-ju      'fear-dread; frightened'
4. AA > N           xian-liang   'able-virtuous; worthy man'
5. VV > N           xue-wen      'study-inquire; knowledge'
6. Num + Num > A    san-wu       'three-five; a few'

II. Subordinate structures

   Types            Examples     Gloss
1. NN > N           tian-zi      'Heaven-son; Emperor'
2. AN > N           xiao-ren     'small-person; a person of low position'
3. VN > N           qi-ren       'beg-person; beggar'
4. VV > N           fu-xing      'help-travel; entourage'
5. NV > V           cao-chuang   'grass-create; to initiate'
6. AV > V           yan-ju       'comfortable-live; to relax'
7. AV > N           xian-sheng   'early-born; sir, teacher'
8. PN > P           wu-zi        'my-sir; you'
9. Num + N > N      bai-xing     'hundred-names; people'


Given the different coordinating and subordinating structure types, the argument for naturally-occurring phrases can be tested by assuming that if there are more types of subordinating structures, there would be more occurrences of subordinating compounds, and if there are fewer types of coordinating structures, there would be fewer occurrences of coordinating compounds. This is because, everything else being equal, more structure types will produce more total occurrences of that structure, and vice versa. This prediction is borne out, as seen in Cheng's (1981: 112) statistical data given in Table 4, repeated here as Table 7 ("Total Comp" = Total compound words, "CC" stands for Coordinating Compound words, "MH" stands for Modifier Head compound words):

Table 7. Percentage of CC and MH compounds in Confucius (c. 550 BC)

Total Comp   Total CC   %      Total MH   %
180          48         26.7   67         37.2

In Table 6, we have seen that there were more structure types of the subordinating than of the coordinating variety. From Table 7, we see that there are more instances of subordinating than of coordinating structures. The correlation between the number of structure types and the number of instances of that structure can be seen clearly in Table 8 ("CC" stands for Coordinating structure and "MH" for Modifier Head structure).

Table 8. Number of structure types vs. number of structure instances for CC and MH compounds

      Structure types    Structure instances
CC    6   (40%)          48   (42%)
MH    9   (60%)          67   (58%)
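As a quick check, the percentages in Table 8 can be recomputed from the raw counts. The snippet below is only an arithmetic illustration; the dictionary layout and the function name are assumptions introduced here, and the counts are those printed in the table.

```python
# Recompute the shares in Table 8 from the printed counts.
counts = {
    "CC": {"types": 6, "instances": 48},
    "MH": {"types": 9, "instances": 67},
}


def shares(column: str) -> dict[str, int]:
    """Return each structure's share of the column total, in whole percent."""
    total = sum(row[column] for row in counts.values())
    return {label: round(100 * row[column] / total) for label, row in counts.items()}


print(shares("types"))      # share of structure types
print(shares("instances"))  # share of structure instances
```

The type shares come from 6 and 9 out of 15 types, and the instance shares from 48 and 67 out of 115 instances.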

The 40% versus 60% of structure types closely correlates with the 42% versus 58% of instances of coordinating and subordinating structures, respectively. The correlation supports our contention that if there are more types of subordinating structures, there would be more instances of


disyllabic forms (phrases or compounds) formed by subordinating structures, and vice versa. Given this, a reasonable explanation for the correlation is to assume that disyllabic feet originated by making use of naturally-occurring phrases, and that compounding started from natural phrases as well. Although the correlation between the number of structure types and the frequency of their occurrence supports the claim that disyllabic feet in Classical Chinese were realized on naturally-occurring phrases, this does not necessarily mean that disyllabic forms did not originate from phrases that were created for prosodic purposes, because it is not yet clear what the structure of phrases created expressly for the prosody would be. Since both structures, the subordinating and the coordinating, can form two-word phrases equally well, both structures can serve the need for disyllabicity. As a result, if the notion of "phrases created for prosody" is not specified, there can be no judgement on the second part of the hypothesis, that disyllabic combinations (phrases or compounds) in Classical Chinese originated from naturally-occurring disyllabic phrases, rather than from those expressly created for the prosodic requirement. Considering this question, I suggest that coordinating structures can be considered structures which are created expressly for the purposes of prosody. This is because the coordinating structure exhibits special syntactic and semantic properties which the subordinating structure lacks, that is, with or without part B, the semantic interpretation of A in an [A+B] coordinating structure would always be approximately the same. Compare:

(58) a. Subordinating
        tian-zi (Heaven's son) ≠ zi (son)
        qi-ren (beg-person, beggar) ≠ ren (person)

     b. Coordinating
        kong-ju (fear-dread; frightened) = kong (fear, frightened)
        kong-ju (fear-dread, frightened) = ju (dread, frightened)
        gong-ji (attack-assault, attack) = gong (attack) = ji (attack)
        zhan-dou (warring-tussle) = zhan (fight) = dou (fight)
        sha-lu (kill-kill) = sha (kill) = lu (kill)

Subordinating structures are not as flexible as coordinating structures in their ability to form disyllabic combinations out of monosyllabic words without affecting the semantic interpretations of the phrase. In other


words, the subordinating structure cannot be freely used without changing the original meaning of the phrase in which it occurs. However, the coordinating structure can do this easily by simply adding a synonym to the original monosyllabic verb, noun, or adjective in any position of a sentence without changing the basic syntactic structure and meaning of that sentence. This, as we have seen before, is what Zhao Qi did in his Mencius zhangju (e.g., (17)). Given this analysis, it follows that the coordinating structure has an advantage over subordinating structures in creating disyllabic phrases. If the coordinating structure is the structure by which phrases could be created expressly for prosodic purposes, and if, as I argued before, it is more likely that naturally-occurring phrases would bear two-syllable feet than that coordinating structures would be created expressly for that purpose, we would then predict that there must be statistically more naturally-occurring disyllabic phrases (i.e., more subordinating phrases) than coordinating disyllabic forms in the earlier stages, because it requires less effort to make use of naturally-occurring phrases than to create new ones. This is also borne out, as seen in Table 7: there were 67 tokens of subordinating, but only 48 tokens of coordinating structures. If the coordinating structure is used to create disyllabic phrases, and if the creation of disyllabic forms is required only when the disyllabic foot became stronger, we would further expect that a reverse situation would occur in the language, i.e., there would eventually be more disyllabic combinations that were formed by coordinating structures than by subordinating structures, because when the prosodic requirement becomes stronger and stronger, making use of naturally-occurring phrases would not be efficient and productive, so the phrases created for prosody would come to dominate in later stages.
This analysis receives support from Cheng's (1981: 112; 1985: 337) statistical data given in Table 9 ("Total Comp" = total compound words; "CC" stands for Coordinating Compounds, and "MH" for Modifier-Head Compounds).

Table 9. Percentage of CC and MH compounds in Confucius, Mencius and Lunheng

Chronology   Texts       Total Comp   Total CC   %       Total MH   %
c. 550 BC    Confucius   180          48         26.7    67         37.2
c. 300 BC    Mencius     333          115        34.5    100        30
c. 100 AD    Lunheng     2088         1401       67.24   517        24.76
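The shares in Table 9 can be rechecked from the raw counts. The following is only an illustrative sketch, not part of Cheng's or the present analysis: the names `TABLE_9`, `share`, and `summarize` are hypothetical, and the counts are simply transcribed from the table. It recomputes the CC and MH percentages and reports which type dominates in each text. Recomputing from the printed counts gives about 67.1% for CC in Lunheng, slightly below the printed 67.24, so the figures should be read as approximate.

```python
# Raw counts transcribed from Table 9 (Cheng 1981, 1985).
# Each row: (text, total compounds, coordinating (CC), modifier-head (MH)).
TABLE_9 = [
    ("Confucius", 180, 48, 67),
    ("Mencius", 333, 115, 100),
    ("Lunheng", 2088, 1401, 517),
]


def share(part, total):
    """Return part/total as a percentage rounded to one decimal place."""
    return round(100 * part / total, 1)


def summarize(rows):
    """Return (text, CC %, MH %, dominant type) for each row of the table."""
    summary = []
    for text, total, cc, mh in rows:
        dominant = "CC" if cc > mh else "MH"
        summary.append((text, share(cc, total), share(mh, total), dominant))
    return summary


if __name__ == "__main__":
    for text, cc_pct, mh_pct, dominant in summarize(TABLE_9):
        print(f"{text:<10} CC {cc_pct:5.1f}%  MH {mh_pct:5.1f}%  dominant: {dominant}")
```

On these counts, MH compounds outnumber CC compounds only in the earliest text, which is the reversal the prosodic account predicts.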


Table 9 shows that making use of naturally-occurring phrases was replaced by coordination as a way to meet the prosodic requirement. Since coordinating structures have certain productivity advantages over subordinating structures in creating disyllabic forms, coordinating word structures came to dominate in the later stages. The theory presented here explains why some compounds undergo a process of dephrasalization (making use of naturally-occurring phrases), while others (created for the prosodic requirement) do not. It also explains why there were more subordinating compounds in earlier stages than later on, and why compounds created for disyllabicity were mostly found at the later stage (most examples of this type given by Cheng (1981) are from Han Feizi, c. 230 BC).

Secondly, the theory presented here also explains why SP, VO, and VR structures are disfavored structures for forming compounds, as seen in section 3. Let us consider the VO construction first. The reason why VO compounds were very rare has to do with sentence prosody. It is claimed that the normal sentential stress in SVO languages such as English (Liberman-Prince 1977) and Chinese (Chao 1968) generally falls on the right-most element of a sentence (e.g., (42), and note 22). Since VO phrases in Classical Chinese frequently appear at the ends of sentences (see note 27), the object of the verb in a sentence will often be the target of the normal sentence-final stress. As seen in section 6.4, according to Liberman and Prince's relative-prominence principle (1977), a strong node must be licensed by a weak node. Therefore, a single node alone cannot realize the stress. Since one syllable cannot serve as a branching node in a prosodic structure, another syllable must be attached to it to form a disyllabic foot in order to realize stress. Thus if the object is a monosyllabic word, that word must attach to the preceding verb to become a part of a foot in order to realize the stress.
However, when the VO predicate becomes a foot and the primary stress has been realized upon it, the VO foot must fulfill the requirement assigned by sentential stress. As a result, a VO structure is bound to sentential stress in a sentence. In other words, the normal stress on VO structures will always require the VO to be the verb and the object of that sentence (see note 28); hence it is difficult for them not to serve as the main predicate of the sentence, and it is hard for dephrasalization or lexicalization to take place. This explains why there are hardly any VO verb compounds in Classical Chinese. Under this analysis, the way for a VO combination to become a compound is for it to avoid acting as a predicate of the sentence. This is achieved by changing its part of speech, i.e., acting as a noun, such as si-ma (control-army, general), which is precisely what has been observed in the literature.


As for VR, since the verb-resultative complement structures are a later development in the language (starting from the Han dynasty), it is no surprise that VR compounds are rare before the Han. In addition, SP compounds are even rarer, simply because there are hardly any SP phrases in the language (see note 29), partly due to the fact that the subject in Classical Chinese is often dropped.

8. Summary

I have argued in this paper that two-syllable (compound) words in Classical Chinese appeared in large numbers during the Han dynasty because of the advent of a disyllabic prosodic foot structure during that period. I argued that an earlier, bimoraic, monosyllabic foot could no longer be supported by a syllable structure that had undergone simplification following the loss of consonant clusters and syllable-final consonants. Based on the moraic theory of syllable structure, I argued that a Foot Formation Rule (49) follows naturally from the loss of final consonants and consonant clusters in Old Chinese. Furthermore, given the fact that Classical Chinese is basically a monosyllabic language, the Word Formation Rule (55) is thus derived from the Prosodic Hierarchy and the Foot Binarity principles adopted in this paper. The theory presented here requires that the prosodic integrity of being a single foot and the syntactic relation of being a phrase interact to cause PrWd, Idiomatized PrWd, and compounding in Classical Chinese: syntax determines the structural relations between the elements of a compound, and prosody determines the metrical size of that compound. Under this analysis, I have also argued against the hypothesis that the increase in compounds around the time of the Han dynasty was due to a decrease in the number of phonologically distinct syllables. I argue against this functional hypothesis because it cannot account for the fact that the compounds given in (18) and footnote (15) are highly counter-functional, and also because it cannot explain the structural mechanism of the morphological development of compounding.
Using this prosody-based analysis to account for the development of classical compounding, we have explained a wide range of phenomena, such as why there are more Modifier-Head compounds than coordinate compounds at earlier stages of compounding, and why the reverse situation occurs later on, i.e., more coordinate compounds than Modifier-Head compounds. Questions such as these are answered naturally under the unified theory developed here.


The arguments made here are quite different from the traditional analysis in many respects. First, in the traditional analysis, the only connection between phonological change and compounding is that phonological change resulted in more homophones, causing the development of compounds (e.g., 4.1). The present study took a new look at phonology and compounding from a prosodic point of view. By taking prosody into account, we reached a new understanding of the role of phonological change in the development of compounding.

Secondly, the importance of "Foot" has been recognized in the literature for quite a long time (e.g., Guo 1932 [1985]; Chen 1979; Shih 1986). However, no connection had been made between the Foot Formation Rule (Chen 1979; Shih 1986) and the development of disyllabicity. On the contrary, linguists (for example, Guo 1932 [1985]) had believed that disyllabicity is merely a stylistic device, and that the disyllabic foot occurred throughout the history of the Chinese language. The present study has made a first attempt to motivate a Foot Formation Rule based on the phonological system of Classical Chinese. It is argued that the Foot Formation Rule was established during the Han dynasty based on characteristics of syllable structure.

Third, compound words are traditionally known as syntactic words in Chinese (e.g., Chao 1968). The present study argues that compound words are not merely syntactically structured, but also prosodically motivated. As a result, the so-called compounds in Classical Chinese can naturally be divided into two categories: a word category and a phrasal category, and both are listed in the dictionary. The former are compounds based on lexicalization (or a category changing rule, cf. si-ma 'charge-military' → 'general' as in (4d)). The latter are Idiomatized PrWds based on their frequency of usage (cf. yi-shang 'shirt-skirt', 'clothes'; jia-bing 'armor-weapon', 'military').
It is also possible that some items can be listed twice in the dictionary, once as a lexical word (cf. tian-zi 'Emperor'), once as an idiomatic item (cf. tian-zi 'Heaven-Son', tian zhi zi ye 'Son of the Heaven'). Strictly speaking, Idiomatized PrWds are neither (free) phrases nor words, but are idioms created by the prosodic system and fixed in usage, exhibiting special properties: they are listed in the dictionary, used as lexical items, bear the same metrical shape as a compound word, and yet still retain some phrasal properties. Therefore, Idiomatized PrWds constitute an intermediate category between free phrases and words in the morphological system of Classical (as well as Modern) Chinese.


Notes

* The present study is based on part of my 1994 University of Pennsylvania Ph. D. dissertation. I would first like to thank Jerry Packard for carefully reading the entire manuscript in a short period of time and for having made many important comments and suggestions. Thanks go especially to Mark Liberman for his valuable comments and suggestions, which not only affected this paper, but also contributed ideas for future studies on this topic. I am also grateful to my thesis supervisor Anthony Kroch, without whose inspiration on the study of the interaction between syntax and prosody, the research would not have taken this direction. Thanks also to Mei Tsu-lin for his encouragement, without which I would not have been able to carry on this research. I would like to thank F. W. Mote for his valuable comments and suggestions, especially regarding classical texts cited in this paper. The present work also owes much to Duanmu San, Lin Hua, Shih Chi-lin, and Wang Zhijie. In discussions with them, I have learned a great deal about issues involved in this paper. I would also like to thank Ao Xiaoping, Huang Shizhe, and Li Yafei for spending time discussing some important questions in this paper, and for their valuable suggestions. All errors, of course, are mine.

1. jun-chen is not a compound by the semantic criterion given in (8) in contexts such as the following:

(i) jun yi ji xu chen, chen yi ji shi jun,
    monarch use trick treat official official use trick serve monarch,
    jun-chen zhi jiao ji ye.
    monarch and official 's relation trick prt.
    'The monarch uses tricks to gain officials and officials use tricks to serve the monarch; the relations between them are nothing but tricks.' (Hanfeizi. Shixie)

2. Huang (1984) also proposed criteria for modern Chinese compounds. His criteria are based on the Lexical Integrity Hypothesis, which says, roughly, that no phrasal structure rule may apply to a lexical item, and the Phrase Structure Constraint, which requires, roughly, that no two constituents appear after the last verb. The Lexical Integrity Hypothesis works (though not completely for Classical Chinese, as we can see below), but the Phrase Structure Constraint will not apply to Classical Chinese, simply because two constituents are allowed to appear after the verb. Therefore, although Huang's criteria are important for modern Mandarin Chinese, the Phrase Structure Constraint is not relevant to Classical Chinese.

3. Derivative compounds are different; see below.

4. In traditional philology, this has sometimes been called lian lei er ji ('bring two words of the same kind together'), which means that A is added to B because it belongs to the same semantic category. In this case, usually one part of AB functions as a dummy place holder, which has no semantic interpretation at all.


5. The term Syntactic Word refers to compounds which are formed according to syntactic relations such as Subject + Predicate (SP), Modifier + Head (MH), Verb + Object (VO), Verb + Resultative complement (VR), and Coordinate Constructions (CC).

6. The term "others" refers to meaning specializations, such as tian-xia (sky-below, 'the Emperor'): "below the sky" → "all below the skies" → "the world of men" → "society" → "the Emperor".

7. Of course, the best way to study compounding in Mencius zhangju may be to list all of the monosyllabic words in Mencius that have been translated into two-syllable combinations in Zhao Qi's commentary, i.e., to provide an exhaustive listing of the "one-to-two" notes. However, since time does not allow for such an investigation, I will analyze two chapters of Mencius Zhangju, namely, the Liang Huiwang Shang and the Gongsun Chou Xia. These two chapters (c. 300 BC) constitute nearly 15 percent of the entire book. This 15% sample size is sufficient to postulate (1) the different proportions of compounds in the Pre-Qin period and the Han dynasty; (2) the basic linguistic properties of compounds in these two periods.

8. As shown below, the minimal syllable structure in Old Chinese is CVC as proposed by Li 1980 and Ting 1979.

9. Although the no-open-syllable hypothesis for Old Chinese has been questioned by scholars (see Norman 1988 and Baxter 1992), there are scholars who accept this hypothesis, such as Lu Zhiwei (1947), Li Fang-kuei (1980), Ting Pang-Hsin (1979), and Yu Naiyong (1985). Most importantly, as argued by Ting (1979) and illustrated by Yu (1985), syllable structure was clearly more complex in Old Chinese than in Middle Chinese. In this paper, I adopt Ting's hypothesis that the basic (minimal) syllable structure of Old Chinese is CVC. Note that even though not all syllables in Old Chinese are CVC, most scholars agree that the majority of syllables in Old Chinese had a minimal CVC structure. If this is so, the theory developed in this paper can still be held without assuming the strong form of the "no-open-syllable" hypothesis, as we will see below.

10. This is why Baxter introduces the term "pre-initial" for the first segment of initial clusters (*s- of *sk-) and "post-coda" for the final segment of syllable-final clusters (*-* of …) for Old, but not Middle, Chinese (see Baxter 1992: 7).

11. For example, yu (foolish, stupid) and yu (anxiety, worry) are phonologically different in Old Chinese, but they became homophones in Middle Chinese, as did jing (city) and jing (surprise); see Wang 1980.

12. The term "coda" refers to segments immediately following the main vowel; and "post-coda" refers to the final segment of syllable-final clusters. See also note 10.

13. The distinctive function of the new tone system can be seen clearly from the fact that the number of etymologically related words which are distinguished by tonal differences (for example, Level Tone nouns cognate with Departing Tone verbs, i.e., a change of category from noun to verb [Mei 1980]) dramatically increased during the late Han dynasty. For example (taken from Chou 1962: 54):

(i) Noun (Level Tone)    Verb (Departing Tone)
    kuan 'cap'           kuan 'to cap'
    jei 'clothing'       jei 'to wear (clothes)'

14. In Mencius, shi-chao can also be used to mean only shi (market) but not shi-chao (market and imperial court):

(i) Da zhi yu shi-chao.
    whip him at market-court
    'Whip him at the market.'

15. There are more examples of this type (see Gu Yanwu [1613–1682 AD], Rizhi-lu, Juan. 27):

(i) shan bing er bie, duo ta li-hai
    take army and leave, more other benefit-harm
    'Take the army and leave, there will be more harm (to us).' (Shiji. Wuwang Bi Zhuan)

(ii) sheng nü bu sheng nan, huan-ji wu ke shi zhe
     born female not born male, unhurried-hurried no can use prt.
     'If one has only girls but no boys, there is no help for urgency.' (Shiji. Canggong Zhuan)

(iii) Xian di chang yu Taihou you bukuai, ji zhi cheng-bai
      late Emperor before with Queen have unhappy, almost cause success-failure.
      'The late Emperor often had a fight with the Queen, it almost causes a failure.' (Houhanshu. Douhe Liezhuan)

In modern Chinese, there are also compounds of this type. For example:

(iv) Ta yaoshi you ge hao-dai, haizi zenme ban?
     She if have one good-bad, children how do
     'If she has a disaster, what about her children?'

16. Along the lines of Chen's nasal attrition (1975), Wang (1993) proposed a reduction of the postnuclear consonant (n, ng) from a consonant into an approximant, forming a part of the V. That is, the nasal endings of the syllable rime are all [-consonantal], and can be viewed as part of a diphthong. As a result of her analysis, Beijing Mandarin syllables are all arguably CV, where the V covers both single vowels and diphthongs.

17. Regarding the medial segment in Old Chinese, here I would like to claim that the medial is part of the onset, based on recent analyses by Hsueh (1986), Duanmu (1990), Bao (1990), and Wang (1993), in which the prenucleus (i.e., y, w, y" medials) in Mandarin Chinese is not analyzed as part of the rhyme. Therefore, whether the syllable contains a medial or not, the syllable weight remains the same, since the syllable onset has no bearing on this matter.

18. One may argue that although Arabic is one of the languages that is sensitive to the prosodic weight of syllables, it does not mean that (Old) Chinese is also sensitive to prosodic weight. However, as I will show in 6.4, the well-formed [σ-σσ] and ill-formed *[σσ-σ] prosodic structures in Classical Chinese indicate that Classical Chinese was indeed a prosodic-weight-sensitive language. See also notes 19–20 for more support of this argument.

19. At this point, I should point out that the assumption that the CVC syllable structure of OC is capable of forming an independent foot does not mean that the replacement of bimoraic one-syllable feet by two-syllable feet was an all-or-none operation; i.e., it is unlikely that one-syllable feet suddenly were all considered ill-formed and two-syllable feet were immediately dominant. What seems natural is that the phonological basis for the monosyllabic foot was lost step by step and monosyllabic feet became more and more disfavored, while disyllabic feet became more and more common and dominant, the result of a decrease in disfavored elements and a corresponding increase in favored ones (Kroch 1989). This follows because the syllable structure reduction in OC and the ensuing four-tone system in MC actually took quite a long time to be completed (probably by the late Han; see Xu 1996: 269). Nevertheless, the unacceptability of the monosyllabic foot can be seen clearly from both Classical Chinese (e.g., (43)) and Modern Chinese, as follows:

A: Jintian ji hao?
   today what date
   'What date is today?'
B: a. *Wu.
      'Five.'
   b. Wu hao.
      five-number
      'Five.'
   c. Chu wu.
      beginning five
      'Five.'
   d. Shi wu.
      ten five
      'Fifteen.'


20. One may argue that since diphthongs in Chinese (Middle Chinese and Modern Mandarin) can also be analyzed as consisting of two moras, a syllable that contains a diphthong can still be a bimoraic foot even if the coda is lost. However, I will not consider diphthongs in Chinese to be able to form a standard foot using long vowels as they can in other languages, even though diphthongs in other languages are sometimes analyzed as two moras. The reasons are as follows: first, there is no evidence of a phonological contrast between long vowels and short vowels in Chinese, therefore there is no evidence to show that diphthongs are necessarily distinctively longer (or heavier) than monophthongs. Secondly, it is well known that Mandarin syllables are of the same length for single rhymes (monophthongs) and compound rhymes (diphthongs) (see Duanmu 1990, and Wang 1993), therefore, if diphthongs are considered as long vowels so that they can form a bimoraic foot, then monophthongs must also be considered as being able to form a foot, because there is no length difference between these two types of syllables. However, it has been widely recognized in the literature (Chen 1979, Shih 1986; also see examples given in note 19), that there are clear prosodic contrasts between two-syllable and one-syllable units in poetic prosody (Chen 1979). Also, a monosyllabic word must be grouped with another foot in the Tone Sandhi domain defined by Foot Formation which normally contains at least two syllables (Shih, 1986). This contrast is also observed in syntactic structures as discussed in 6.4 below. On the other hand, there is no prosodic contrast between diphthongs and monophthongs in the language. Therefore, if we consider a minimal foot as being formed by two syllables, the prosodic and syntactic properties of one-syllable and two-syllable units can be captured. 
If, on the other hand, a monosyllable is considered a normal foot based on an analysis that monophthongs consist of two moras, one cannot explain why diphthongs do not differ from monophthongs. In addition, a significant generalization about the prosodic properties of one-syllable and two-syllable units is lost. Therefore, no matter how one analyzes diphthongs, prosodically speaking, diphthongs must be considered equivalent to monophthongs, and both lack the ability to form a foot (for more arguments on this and related questions, see Feng 1995: 246–252).

21. There may be an alternative account of how to motivate disyllabicity (or the Foot Formation Rule [49] given below) from the phonological changes (e.g., 4.1 and 6.1) in Old Chinese. San Duanmu has suggested to me that the inability of Mandarin to form an independent foot with only one syllable is due to the tonal system of the language (Duanmu 1994, personal communication through e-mail). If this is so, according to the hypothesis that disyllabic feet were newly developed in Classical Chinese and the fact that the tonal system followed the loss of final consonants, the development of disyllabicity could also be attributable to the development of the tonal system in Classical Chinese. As mentioned above (4.1), the tones of Middle Chinese were developed from Old Chinese codas and post-codas: *-s > Departing Tone; and *-ʔ > High-rising Tone. By the time of the Han period, the tonal system was partially (if not completely) established (see note 13). Given this fact, if one syllable cannot form an independent foot in a tonal language in general, then the development of the tonal system would be another factor to motivate the Foot Formation Rule given in (49). Nevertheless, the tonal-based account also supports my analysis for the establishment of the Foot Formation Rule.

22. See either Duanmu's Non-head Stress Rule (1990), or Cinque's (1993) hypothesis that phrasal stress is assigned universally as follows: in [XP Y] or [Y XP], stress goes to XP, or the syntactic complement. The Sentence Prosodic Rule in (49) given below can be derived from Duanmu's and Cinque's hypotheses: that is, within a VP, if the language is SVO, the Sentence Normal Stress falls to the right of the verb, i.e., on the complement of the head of the VP.

23. Note that in the surviving SOV structure he-zui zhi-you, zhi you 'it-have' can never be separated. This indicates that zhi in he zui zhi you must be a pronominal clitic form cliticized onto the verb (e.g., zhu, a fusion form of zhi-yu 'it at').

24. At this point, one may argue that the nonexistent structure of *[[he-zui] you] is not due to whether a monosyllable can be a foot or not, but to the contrast between two syllables he-zui versus one syllable you. In other words, it might be argued that a foot that consists of fewer syllables cannot compete with a foot that contains more syllables. However, note that a disyllabic foot is able to compete with a trisyllabic foot, as seen in (i a) and (i b) below:

(i) a. Wu he er-feng zhi you?
       I what near-fiefdom it have
       'What near fiefdom do I have?' (Zuo. Zhao 9)

    b. Wuzi xiang zhi, laofu bao zhi, he you-jun zhi you
       You assist him, I carry him, what young prince it have
       'You assist him; I carry him; what kind of young prince do we have?' (Gongyang. Cheng 15)

In (i a), he er-feng zhi-you is the last phrase, and in (i b) he you-jun zhi-you forms the last phrase. According to the SPR (Sentence Prosodic Rule), in both cases the left node X contains three syllables, he er-feng (what near-fiefdom) or he you-jun (what young-prince), while the right node Y contains only two syllables, zhi-you. Yet, unlike (40), (i a) and (i b) are grammatical. The contrast between (40) and (i a–b) is illustrated as follows:


(40) a. *[VP [X σ σ] [Y σ]]          (i a–b) [VP [X σ σ σ] [Y σ σ]]

This strongly suggests that two-syllable units behave differently from one-syllable units. Given the different prosodic behaviors of monosyllabic units and disyllabic units, the argument for the one-syllable foot can no longer be held. The fact is that a standard foot can always stand alone, but one syllable is incapable of doing so, as exemplified in (40). It follows that a one-syllable unit, unlike a two-syllable unit, cannot form a standard prosodic foot.

25. In Classical Chinese, word order was the fundamental means for indicating grammatical relations between the elements of a sentence. Therefore, combinations of words must be constrained by the phrase-structure rules of the language.

26. Note that there is no theoretical reason to expect that all X°-level constituents would be semantically non-compositional, nor any reason to expect that all X'- or X"-level constituents would be semantically compositional (see Liberman-Sproat 1992). The semantic criterion (8) is unsatisfactory in this connection. The Foot Formation Rule, on the other hand, encourages the development of disyllabic lexical units given the theory presented here. The Word Formation Rule, a formal constraint for prosodic words, is theoretically motivated. Therefore compounds in Chinese can be formally derived by (55) alone. I would like to thank Mark Liberman for pointing this out to me.

27. For example, of 158 sentences I collected from Qin Jin Xiao Zhi Zhan [The War between Qin and Jin in Xiao] in Zuo. Xi 32–33 (c. 200 BC), there are 120 sentences in which a verb with or without its complement appears at the end of the sentence, and only 14 sentences in which VO combinations do not appear at the end of the sentence, constituting 9% of the total. Among the 120 final-VP structures, 44% are VO structures, 19% are single verbs (intransitive or transitive without object) and 8% are [V–PP] structures, as illustrated below:

Table i. Verbs with their complements in Zuozhuan. Xi 32–33 (c. 200 BC)

Total        [...VO XP]   [...V PP]   [...V]     [...VO]    Final-[VP]   other
158 (100%)   14 (9%)      13 (8%)     38 (19%)   69 (44%)   120 (76%)    33 (21%)

Note that the only structure that would allow a VO combination to escape from the sentential stress is the structure [...VO XP]. However, only 9% of the cases are of this type. On the other hand, in 44% of the cases VO appears at the end of the sentence. If we compare the non-final VOs and the final VOs, we see a large difference between the two:

Table ii. Final and non-final VOs in Zuozhuan. Xi 32–33 (c. 200 BC)

Total       Non-final VO   Final-VO
83 (100%)   14 (17%)       69 (83%)

That is, only 17% of VO combinations do not fall under sentential stress position, while 83% occur in sentential stress position. So the object of the verb in these sentences will often be the target of the normal sentence-final stress.

28. The iambic stress on VO compounds in Modern Chinese sometimes causes speakers to treat them as phrases (Chao 1968). For example:

A. Wo hen dan-xin ta de jianglai.
   I very bear-heart he/she future
   'I worry about his/her future.'

B. Ni dan shenme xin a!
   you bear what heart prt.
   'What on earth are you worrying about!'

It has been suggested (Chao 1968: 431; Feng 1995: 107) that sentence stress can ionize an iambic compound into a phrase in modern Chinese in certain contexts. This analysis supports the assumption that sentence stress on VO structures causes them to be construed as phrases.

29. It is also possible, as Feng (1993) has argued, that in Classical Chinese there was a pause between the subject and the predicate in declarative sentences. If this is so, the pause may block an SP structure from naturally combining into a foot; hence it is harder for the SP structure to become a compound than for other structures, given the hypothesis that compounds must be constrained by the prosodic integrity of being one foot.

References

Aronoff, Mark–Mary-Louise Kean (eds.)
  1980 Juncture. Saratoga, CA: Anma Libri.
Bao, Zhiming
  1990 "Fanqie Languages and Reduplication", Linguistic Inquiry 21: 317–350.


Barale, Catherine
  1982 A Quantitative analysis of the loss of Final Consonants in Beijing Mandarin. [Doctoral Dissertation, University of Pennsylvania.]
Baxter, William H.
  1992 A handbook of Old Chinese phonology. Berlin/New York: Mouton de Gruyter.
Benedict, Paul K.
  1972 Sino-Tibetan: A conspectus (Contributing editor: James A. Matisoff). Cambridge: Cambridge University Press.
Bodman, Nicholas C.
  1978 "Old Chinese reflexes of Sino-Tibetan *-?, -k and Related Problems", Paper Presented to 11th International Conference on Sino-Tibetan Language and Linguistics.
  1980 "Proto-Chinese and Sino-Tibetan: Data towards establishing the nature of the relationship", in: F. Van Coetsem and Linda R. Waugh (eds.), 34–199.
Borer, Hagit–Youssef Aoun (eds.)
  1981 Theoretical issues in the grammar of Semitic languages. MIT working papers in linguistics, vol. 3. Cambridge, MA: The MIT Press.
Chao, Yuan Ren
  1968 A grammar of Spoken Chinese. Berkeley, California: University of California Press.
Chen, Matthew Y.
  1975 "An areal study of nasalization in Chinese", Journal of Chinese Linguistics 3: 16–59.
  1979 "Metrical Structure: Evidence from Chinese Poetry", Linguistic Inquiry 10: 371–420.
Cheng, Xiangqing
  1981 "Xianqin Shuangyinci Yanjiu" [A study of disyllabic words in pre-Qin], in: Cheng, Xiangqing (ed.), 45–113.
  1985 "Lunheng Fuyinci Yanjiu" [A study of polysyllabic words in Lunheng], in: Cheng, Xiangqing (ed.), 262–340.
Cheng, Xiangqing (ed.)
  1981 Xianqin Hanyu Yanjiu [Studies of pre-Qin Chinese]. Shandong: Shandong Jiaoyu Chubanshe [Shandong Educational Press].
  1985 Liang Han Hanyu Yanjiu [Studies of Han Chinese]. Shandong: Shandong Jiaoyu Chubanshe [Shandong Educational Press].
Chomsky, Noam
  1993 "A minimalist program for linguistic theory", in: Kenneth Hale–Samuel J. Keyser (eds.), 1–52.
Chou, Fa-kao
  1962 A historical grammar of Ancient Chinese. Part II: Morphology. Taipei: Academia Sinica, Institute of History and Philology, monograph No. 39.


Cinque, Guglielmo
  1993 "A null theory of phrase and compound stress", Linguistic Inquiry 24: 239–297.
Dirven, René–Vilém Fried (eds.)
  1987 Functionalism in linguistics. Linguistic and literary studies in Eastern Europe 20. Amsterdam: John Benjamins.
Dobson, W.A.C.H.
  1959 Late Archaic Chinese. Toronto: University of Toronto Press.
Dong, Tonghe
  1948 "Shang Gu Yinyun Biaogao" [Draft phonological tables for Old Chinese], Bulletin of the Institute of History and Philology, Academia Sinica 18: 1–249.
  1954 Zhongguo Yuyin Shi [The history of Chinese phonology]. Taipei: Zhongguo Wenhua Chuban Shiye She.
Duanmu, San
  1990 A formal study of syllable, Tone, Stress and Domain in Chinese languages. [Doctoral Dissertation, MIT.]
Feng, Shengli
  1991 "Prosodic structure and word order Change in Chinese", The Penn Review of Linguistics 15: 15–21.
  1993 "The copula in Classical Chinese declarative sentences", Journal of Chinese Linguistics 21, 2: 211–311.
  1994 "Stress shift and object post-posing in Early Archaic Chinese", Yuyan Yanjiu 26: 79–93.
  1995 Prosodic structure and prosodically constrained syntax in Chinese. [Doctoral Dissertation, University of Pennsylvania.]
Goldsmith, John A.
  1990 Autosegmental and metrical phonology. Oxford: Basil Blackwell Ltd.
Guo, Shaoyu
  [1938] Zhaoyushi Yuyan Wenzi Lunji [Collection of linguistic and philological works]. Shanghai: Guji Chubanshe [Shanghai Classics Press].
Hale, Kenneth–Samuel J. Keyser (eds.)
  1993 View from building 20. Cambridge, MA: The MIT Press.
Haudricourt, André
  1954 "De l'origine des tons en vietnamien", Journal Asiatique 242: 68–82.
  1972 Problèmes de phonologie diachronique. Langues et civilisations à tradition orale, 1. Paris: Société pour l'Étude des Langues Africaines.
Hayes, Bruce
  1980 A metrical theory of stress rules. [Doctoral Dissertation, MIT.]
  1989 "Compensatory lengthening in moraic phonology", Linguistic Inquiry 20: 253–306.
Hsueh, Frank F. S.
  1986 Beijing Yinxi Jiexi [Analysis of the Beijing dialect sound system]. Beijing: Beijing Yuyan Xueyuan Chubanshe [Beijing Language Institute Press].
Huang, C.-T. James
  1984 "Phrase structure, lexical integrity, and Chinese compounds", Journal of Chinese Linguistics Teacher's Association 19.2: 53–78.
Kager, René
  1992 "Alternatives to the iambic-trochaic law", Natural Language and Linguistic Theory 11: 381–432.
Karlgren, Bernhard
  1940 "Grammata Serica: script and phonetics in Chinese and Sino-Japanese", Bulletin of the Museum of Far Eastern Antiquities 12: 1–471.
Kiparsky, Paul
  1979 "Metrical structure assignment is cyclic", Linguistic Inquiry 10: 421–442.
Kroch, Anthony
  1989 "Reflexes of grammar in patterns of language change", Journal of Language Variation and Change 1: 133–172.
Labov, William
  1984 "The interpretation of zeroes", Phonologica 6: 135–156.
  1987 "The overestimation of functionalism", in: René Dirven and Vilém Fried (eds.), 311–332.
Li, Fang-Kuei
  1980 Shang Gu Yin Yanjiu [A study of Old Chinese phonology]. Beijing: Shangwu Chubanshe [Commercial Press].
Liberman, Mark–Alan Prince
  1977 "On stress and linguistic rhythm", Linguistic Inquiry 8: 249–336.
Liberman, Mark–Richard Sproat
  1992 "The stress and structure of modified noun phrases in English", in: Ivan A. Sag and Anna Szabolcsi (eds.), 131–181.
Lu, Zhiwei
  1947 "Gu yin Lue Shuo" [A summary discussion of Old Chinese pronunciation], Yanjing Xuebao [Yanjing Journal] Monograph no. 20. Taipei: Xuesheng Shuju [Student Book Co.].
McCarthy, John
  1979 "On stress and syllabification", Linguistic Inquiry 10: 443–466.
McCarthy, John–Alan Prince
  1991 "Prosodic minimality." Lecture presented at The University of Illinois Conference "The Organization of Phonology".
  1993 Prosodic morphology I: Constraint interaction and satisfaction. [Unpublished manuscript, University of Massachusetts and Rutgers University.]
Mei, Tsu-lin
  1970 "Tones and prosody in Middle Chinese and the origin of the rising tone", Harvard Journal of Asiatic Studies 30: 86–110.

Compound words in Classical Chinese

259

1980

"Chronological strata in derivation by tone-change", Zhongguo Yuwen 6: 427-443. 1990 "The origin of the disposal construction during Tang and Song Dynasties", Zhongguo Yuwen 3: 191—206. 1994 "Notes on the morphology of ideas in Ancient China", in: Willard J. Peterson, et al. (eds.), 37—46. Nespor, Marina-Irene Vogel 1986 Prosodic phonology. Dordrecht: Foris 86. Norman, Jerry 1988 Chinese. Cambridge: Cambridge University Press. Peterson, Willard J. et al. (eds.) 1994 The power of culture: studies in Chinese cultural history. Hong Kong: The Chinese University Press. Prince, Alan 1980 "A metrical theory for Estonian quantity", Linguistic Inquiry, 11: 511-526. Pulleyblank, Edwin G. 1962 "The consonantal system of Old Chinese", Asia Major 9: 58-144, 206-265. 1977—1978 "The final consonants of Old Chinese", Monumenta Serica 33: 180-206.

Sag, Ivan A . - A n n a Szabolcsi (eds.) 1992 Lexical matters. Stanford: Stanford University Press. Selkirk, Elisabeth 1980 a "Prosodic domains in phonology: Sanskrit revisited", in: Mark Aronoff and Mary-Louise Kean (eds.), 107-129. 1980 b "The role of prosodic categories in English word stress", Linguistic Inquiry, 11: 563 — 605. 1981 "Epenthesis and degenerate syllables in Cairene Arabic", in: Hagit Borer and Youssef Aoun (eds.), 209—232. Shih, Chi-lin 1986 The prosodic domain of tone sandhi in Chinese. [Doctoral Dissertation, University of California San Diego.] Stimson, Hugh M. 1966 The Jongyuan In Yunn: A guide to Old Mandarin pronunciation. Sinological Series, No. 12. New Haven: Yale University Far Eastern Publications. Ting, Pang-Hsin 1979 "Shanggu Hanyu de Yinjie Jiegou" [The syllable structure in Archaic Chinese], Bulletin of the Institute of History and Philology, 50 (Taipei: Institute of History and Philology, Academia Sinica: 717-739. 1975 "Lunyu, Mengzi, ji Shijing zhong Binglieyu Chengfen Zhijian de Shengdiao Guanxi", (Tonal relationships between the two constituents of coordinating structures in the Analects, the Meng-tze and the

260

Shengli Feng

Book of Odes), Bulletin of the Institute of History and Philology) 47 (Taipei: Institute of History and Philology, Academia Sinica): 17—51. van Coetsem, Frans—Linda R. Waugh (eds.) 1980 Contributions to Historical Linguistics. Leiden: E. J. Brill, van der Hulst, Harry 1984 Syllable structure and stress in Dutch. Dordrecht: Foris. Wang, Li 1980 Hanyu Shigao [A historical Grammar of Chinese], Beijing: Zhonghua Shuju. Wang, Zhijie 1993 The geometry of segmental features in Beijing Mandarin. [Doctoral Dissertation, University of Delaware.] Xu, De'an 1981 "Cong Xungu Ziliao zhong Fanying Chulai de Hanyu Zaoqi Goucifa" [Early Chinese morphology shown in commentary materials], Xinan Shifan Xueyan Xuebao [Journal of the Xinan Normal Institute], 3: 31-39. Xu, Tongqiang 1996 Lishi Yuyanxue [Historical Linguistics], Beijing: Shangwu Press. Yu, Nai-yong 1985 Shang Gu Yinxi Yanjiu, [Study of the Old Chinese sound system]. Hong Kong: The Chinese University Press. Zhu, Qingzhi 1992 Fo-tien yü chung-ku Han-yü tz'u-hui yen-chiu [A study of the relationship between Buddhist scriptures and the vocabulary of Middle Sinitic]. Taipei: Wen-chin ch'u-pan-she.

Chinese as a headless language in compounding morphology*

Shuanfan Huang

1. The problem

Since the early eighties the weight of opinion in morphological research has been to accept the credo that every natural language makes essential use of the notion of headedness in constructing morphologically complex entities, whether they are compounds, derivations, or inflected words. Thus Tagalog is taken to have a left-headed morphology, and in English the head is taken by practically all of the morphologists who have addressed the question, Lieber (1992) being the lone exception, to be the rightmost member of a morphological construct (e.g., Lieber 1980, Williams 1981, Selkirk 1982, Trommelen-Zonneveld 1986, DiSciullo-Williams 1987). In section 5, however, I will show that that hypothesis is simply not supported by the data.

The present study was undertaken to examine in detail the morphological structure of Chinese, a language whose word structure defies much current dogma about headedness in theories of morphology, since it is essentially headless. To this end, an entire dictionary, Guoyu Ribao Cidian (Mandarin Daily Dictionary, henceforth abbreviated as GRC), a 1001-page work containing nearly 24,000 disyllabic compound entries alone, was subjected to a detailed analysis in order to ascertain the possible role of head in Chinese morphology. The answer that has emerged from this inquiry is clearly that Chinese is essentially a headless language. Neither the rightmost member nor the leftmost member of a compound can claim to monopolize the privileged status of determining the category of a compound. Indeed, as a consequence of that, either the right-hand member or the left-hand member of a compound predicts, up to a mean of 70% accuracy, the category of a compound, rendering the notion of headedness in Chinese at best a moot question.

Before proceeding, it is important to be clear that there are no unequivocal structural criteria for distinguishing compounds from phrases (see the exchanges in Bates, Chen et al. 1991; Bates, Chen et al. 1993; and Zhou et al. 1993 for the latest discussion on this issue). First, inseparability of elements of compounds, a condition known as "phrase-structure condition" in the literature and variously attributed to Li and Thompson (1981) or James Huang (1984), frequently serves to pick out compounds, but a great majority of compounds allow limited syntactic movements of various sorts. Second, some appeal to referential specificity of nominal elements has been made to distinguish compounds from phrases; chifan 'eat (rice)' differs from fan chi (le) 'the rice has been eaten'. But this criterion fails to acknowledge the possibility that even in an expression such as fan chi le, jiao shui le, hai gan shemme?, fan chi (le) and jiao shui (le) are just as much compounds as are chifan and shuijiao. Although there are no foolproof criteria for identifying compounds, discourse-functional considerations are often robust enough to pick out compounds. Thus native speakers will intuitively feel that fan chi le and jiao shui le are indeed compounds in the expression above, given suitable contextualization. In this paper I take disyllabic expressions found in GRC to be bona fide compounds and use them as a data base to launch my inquiry into Chinese morphology.

Table 1. Possible types of compounds

Structure  N                                 V                               A
NN         mugong 'carpenter'                wuse 'hunt for'                 maodun 'contradictory'; renwurenliu 'mamly'
AN         shengqi 'vitality'                ?                               henxin 'cruel'
NA         fengshi 'rheumatism'              ?                               nianqing 'young'
AA         kongbai 'blank'                   ?                               qiguai 'strange'
VN         lingshi 'consul'                  ViN: naogui 'haunt'             deyi 'elated'
                                             VN: kaidao 'operate'
VA         ?                                 tigao 'increase'                ?
NV         a. yashua 'toothbrush'            tianliang 'day break'           guoyou 'state-owned'
           b. waiyu 'extra marital affair'   neiying 'respond from within'
VV         dongzuo 'activity'                fenxi 'analyze'                 baoshou 'conservative'
AV         xiaoshuo 'fiction'                gongbu 'announce'               haokan 'pretty'
ΦΦ         mada 'motor'; youmo 'humor'       cuotuo 'dawdle'                 cenci 'in disarray'

Note: ? indicates gaps in word formation


It is well-known that Chinese exhibits a high degree of compounding possibility. Practically any compound can be forged out of any combination of stems, prepositions excepted. Table 1 illustrates all of the possible combinations of stems drawn from the three major lexical categories Ν (noun), V (verb), and A (adjective).1 The compounding process has gone far enough to make morphemes the entities which have a high degree of stability of meaning and of categorial status. Morphemes become more and more sharply defined such that we tend to take them as freely combinable units. It is true that a number of "morphemes" have never disengaged from their compounds and we go on using them exactly as we have heard them used, as if they were deeply frozen forms. Compounds of the (ΦΦ) category in Table 1 are of this type - they are either unmorphemicizable loan-words or native compounds whose elements native speakers find impossible to categorize. These are lexicalized idioms in that each compound is an indivisible construct and each has a set meaning which cannot be computed by adding up the separate meanings of the elements of the compounds. Loan-words aside, the sources of these native idiomatic compounds lie in the etymological past and are at best meaningless to the modern speaker. Lexical idioms are found even among recently coined words, as with liuhai 'straight hair cut across the forehead; bang'. Most lexicalized idioms persist as such; a handful become remorphemicized in the course of the words' history and lose part of their idiomaticity.

2. Chinese as a headless language

Given that there is such a wide variety of compounding possibilities, the question naturally arises whether Chinese makes essential use of the notion of headedness in constructing compounds. Table 2 gives the distribution of disyllabic compounds in the entire dictionary of GRC and Table 3 gives the distribution of disyllabic compounds in the first hundred pages of GRC. Looking across Table 2 and Table 3, the intuitive answer must be negative, since, for instance, if both (VN)n and (NV)n are possible compounds, the head claimed, on whatever grounds, for one of the constituents in one type of compound, say (VN)n, would be annulled by the existence of the other type of compound, (NV)n. A glance at the compound types given in Table 1 suggests that the existence of a noun constituent does not guarantee that a VN compound or an NV compound or even an NN compound as a whole is a noun; each of these could belong in any of the three lexical categories. Similarly, the existence of an adjective element does not guarantee that an NA compound or an AN compound or even an AA compound as a whole is an adjective. Again, each of these could be assigned to any of the three lexical categories.

There is then prima facie evidence that Chinese compounds seem to be headless. But a full account of which word-formation processes are productive and which unproductive will give us a better idea of the nature of headedness in Chinese compounding morphology. With this in mind, I turn now to a statistical analysis of compounds.

3. Some statistics

A count of the total number of disyllabic compounds was performed on GRC, the results of which are shown in Table 2. A separate count of the distribution of various types of compounds in the first hundred pages of GRC was also performed in order to find confidence limits for proportions, the results of which are tabulated in Table 3.

Table 2. Number of disyllabic compounds in GRC

Structure  N                 V               A               Total
NN         6,910             21              90              7,021
NV         306               446             72              824
NA         168               ?               209             377
VV         276               3,730           103             4,071
VN         1,581             2,940           378             4,881
VA         ?                 434             ?               560
AN         2,961             ?               198             3,177
AV         116               707             173             996
AA         163               ?               1,609           1,684
ΦΦ         257               72              66              395
Total      12,738 (12,481)*  8,350 (8,278)*  2,898 (2,832)*  23,986 (23,591)*

* Figures within parentheses are obtained by subtracting the number of compounds of the (ΦΦ) type.


Table 3. Number of disyllabic compounds in pages 1-100 of GRC

Structure  N      V    A    Total
NN         721    0    10   731
NV         20     29   15   64
NA         25     0    16   41
VV         101    305  18   424
VN         39     403  11   453
VA         0      30   11   41
AN         254    0    30   284
AV         10     75   22   107
AA         16     5    101  122
ΦΦ         38     10   5    53
Total      1,224  857  239  2,320

A total of 1,224 compound nouns were found in the first hundred pages of GRC. Of these, 87.9% are XN in structure, where X is any major lexical category. But to what extent is this percentage figure representative of the language as a whole? How can we set confidence limits for the proportion of XN compound nouns in the whole dictionary, the population from which the sample is taken? The standard error of a proportion is given by

$$\text{standard error} = \sqrt{\frac{P(1-P)}{N}}$$

where P is the proportion in the sample and N is the sample size. Thus for our sample of 1,224 compound nouns, we have a standard error of 0.93%. We therefore have 95% confidence limits = 0.88 ± (1.96 × 0.0093) = 0.90 or 0.86. We can thus be 95% sure that the proportion of XN compound nouns in the population lies between 86% and 90%.

We apply the same procedure to compound verbs. 59% of the 857 compound verbs in the hundred pages analyzed are XV in structure. The standard error is thus 0.017 and the 95% confidence limits = 0.59 ± (1.96 × 0.017) = 0.62 or 0.565. We can thus be confident that the proportion of XV compound verbs in the population lies between 56% and 62%. With compound adjectives the 95% confidence limits are calculated to be 0.60 and 0.47. We can thus be confident that the proportion of XA compound adjectives in the population lies between 47% and 60%.
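The confidence-limit arithmetic above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `proportion_confidence_limits` and the `samples` dictionary are invented for the sketch, and the inputs are the sample proportions and sample sizes quoted in the text (0.879 of 1,224 compound nouns, 0.59 of 857 compound verbs).

```python
import math


def proportion_confidence_limits(p, n, z=1.96):
    """Return (standard error, lower limit, upper limit) for a sample proportion.

    p is the proportion observed in the sample, n the sample size, and
    z the normal deviate for the chosen confidence level (1.96 for 95%).
    """
    se = math.sqrt(p * (1.0 - p) / n)
    return se, p - z * se, p + z * se


# Hypothetical container for the two samples discussed in the text.
samples = {
    "XN compound nouns": (0.879, 1224),
    "XV compound verbs": (0.59, 857),
}

for label, (p, n) in samples.items():
    se, lower, upper = proportion_confidence_limits(p, n)
    print(f"{label}: SE = {se:.4f}, 95% limits = {lower:.3f} to {upper:.3f}")
```

For the noun sample the standard error comes out at roughly 0.0093, in line with the 0.93% quoted above; small differences in the last digit of the limits depend on where the intermediate values are rounded.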


The respective confidence limits are in fact beautifully corroborated by our investigation into the frequency distributions of (XN), (XV), and (XA) compounds in the population, GRC. Our statistics show that of the 12,738 compound nouns in GRC, fully 89.7% are in fact (XN) in structure, a percentage falling within the range predicted by the confidence limits for our chosen confidence level of 95%. Our statistics also show that of the 8,350 compound verbs in GRC, 57.8% are (XV) in structure, a percentage again within the range predicted by the confidence limits for our chosen confidence level of 95%. Only the percentage for compound adjectives is slightly off. The statistics in Table 2 show that 62.7% of the compound adjectives have the (XA) structure, which still compares favorably with the 95% confidence limits of 47% to 60%.

Parallel investigations into the frequency distributions of compounds in two other Chinese dialects, Hakka and Taiwanese, have yielded essentially similar patterns of distribution, suggesting that beyond their common ancestral origin in the (largely) monomorphemic past, there has been much inter-dialectal symbiosis, perhaps with Mandarin playing more of the role of a host, leading to the current state of morphological structure we see exhibited in Table 4. Since there is little substantive difference in morphological structure among the three Chinese dialects, either in their compounding possibilities or in the way various types of compounds are distributed, we can characterize the morphological structure of the Chinese language in general as one in which there is much more noun compounding (53.1% in Table 2) than verb (34.8%) or adjective (12%) compounding, and in which noun compounding is more strongly right-headed, while verb or adjective compounding shows little propensity towards right-headedness. Here the head of a compound is formally defined as that element of a compound, if any, which determines the category of the compound as a whole.

4. Chi-square value and head in Chinese morphology

Table 4 indicates that if the category type of a Mandarin compound is known, then the probability of locating its head on the rightmost member of the compound is 0.7, which is a bit greater than the probability of 0.664 for Hakka and 0.677 for Taiwanese.


Table 4. Cross-dialectal comparison of frequency distributions of compounds

Dialect    (XN)n (%)  (XV)v (%)  (XA)a (%)  Mean (%)
Mandarin   89.7       57.8       62.7       70
Hakka      93         52.1       54.3       66.4
Taiwanese  91.8       54.2       57.3       67.7
Mean       91.5       54.6       58.1       68.1

Table 5. Observed and expected frequencies of structure and category under the null hypothesis

Structure  +N           -N            Total
+(NN)      721 (385.7)  12 (347.3)    733
-(NN)      503 (838.3)  1084 (748.7)  1587
Total      1224         1096          2320

There appears, then, to be a significant association between the structure of a morphological construct and its category type. Table 5 shows that 721 out of the total sample of 1,224 compound nouns are NN and that, of the total of 2,320 compounds in the first hundred pages of GRC, only 12 compounds that are not nouns have the (NN) structure. On the null hypothesis, there is no association between the two features +(NN)/-(NN) and +N/-N. We know that 733/2,320 of the compounds have the (NN) structure. If there is no connection between structure and category, the same proportion of the compound nouns should be +(NN). Since the total number of compound nouns is 1,224, the expected number of compound nouns with (NN) structure is

$$E = \frac{731}{2{,}320} \times 1{,}224 = 385.7$$

We can calculate the expected frequencies in the other three cells in the same way, and they are given in Table 5. The value of χ² is calculated to be 898.9. The critical value for the 5 per cent level and one degree of freedom is 3.84. Since the value of χ² is much greater than this, we conclude that the null hypothesis can be rejected at this level. Indeed the value obtained is significant even at the 0.1 per cent level. There is thus a significant association between structure and category.

The relationships between structure and category among compound verbs and compound adjectives are similarly calculated. The value of χ² in each case (681.1 for compound verbs and 730.1 for compound adjectives) is significantly greater than the critical value at the 0.1 per cent level. There is thus a significant association between structure and category among all of the types of compounds in the first hundred pages of GRC.

Returning to Table 4, it is important to note that the percentage figures given there represent, for each dialect, only the proportion of XN noun compounds in relation to the totality of compound nouns, the proportion of XV verb compounds relative to the totality of compound verbs and, finally, the proportion of XA adjective compounds relative to the totality of adjective compounds. In other words, the percentage figures do not represent the proportion of XN noun compounds (or of XV verb compounds, XA adjective compounds) relative to the totality of XN compounds (or of XV, XA compounds) in GRC, which would give us a truer measure of the role of the rightmost member of a compound in the morphology of the language. To this end, appropriate proportions were calculated for each of the three types of compounds, XN noun compounds, XV verb compounds, and XA adjective compounds, and the results are given in Table 6. Note that "Part 2" in the leftmost column of Table 6 refers to the rightmost member of a compound.

Table 6. Proportion of (XN)n, (XV)v, (XA)a compounds

Lexical    Second element of a compound               Total
category   N              V             A
N          11430 (91.6)   698 (5.6)     353 (2.6)     12481 (52.9)
V          2961 (35.8)    4833 (58.4)   484 (5.8)     8287 (35.1)
A          666 (23.5)     348 (12.3)    1818 (64.2)   2832 (12.0)
Total      15057 (63.8)   5879 (24.9)   2655 (11.3)   23591

To interpret Table 6, let us observe that if the category type of a Mandarin compound (XN) is not known, then the probability of locating its head on the rightmost member of the compound, namely a noun element, is 0.759; if the category type of a compound (XV) is not known, then the probability of correctly locating its head on the rightmost member of the compound, namely a verb element, is 0.822. If the category type of a compound (XA) is not known, then the probability of locating its head on the rightmost member of the compound, namely an adjective, is 0.685. On balance, then, the probability of locating the head of a Mandarin compound on its rightmost member is 0.755.

Before we rush to embrace the conclusion that Chinese is thus a right-headed language and forestall the possibility that it might well be left-headed, let us calculate the proportions of NX noun compounds, VX verb compounds, and AX adjective compounds respectively. The results are shown in Table 7, where "part 1" in the leftmost column refers to the first member of a compound.

Table 7. Proportion of (NX)n, (VX)v, (AX)a compounds

Lexical    First element of a compound                Total
category   N              V             A
N          7384 (59.2)    1859 (14.9)   3240 (26.0)   12481 (52.9)
V          467 (5.6)      7048 (85.1)   763 (9.2)     8287 (35.1)
A          371 (13.1)     607 (21.4)    1854 (65.5)   2832 (12.0)
Total      8222 (34.9)    9512 (40.3)   5857 (24.8)   23591

Table 7 indicates that if the category type of a Mandarin compound NX is not known, then the probability of locating the head on the leftmost member of the compound is 0.898; if the category type of a compound VX is not known, then the probability of locating its head on the leftmost member of the compound is 0.741; if the category type of a compound AX is unknown, the probability of locating its head on the left member of the compound is 0.317. Thus the overall probability of locating the head of a Mandarin compound on its leftmost member is 0.652. Table 8 summarizes the results of the computation. Similar analyses performed on Cantonese and Hakka data yield the results shown in Table 9.2

Table 8. Degree of right-headedness and left-headedness for Mandarin

N              V              A              Mean
(XN)n: 0.759   (XV)v: 0.822   (XA)a: 0.685   0.755
(NX)n: 0.898   (VX)v: 0.741   (AX)a: 0.317   0.652

Table 9. Degree of headedness for Cantonese, Hakka, and Taiwanese

         Cantonese  Hakka  Taiwanese
(XN)n    0.88       0.627  0.927
(NX)n    0.60       0.945  0.813
(XV)v    0.44       0.865  0.776
(VX)v    0.88       0.831  0.833
(XA)a    0.42       0.573  0.413
(AX)a    0.61       0.359  0.812
RH       0.58       0.688  0.705
LH       0.695      0.712  0.819

We now extend the Chi-square test to cover cases with three intersecting features shown in Table 6 and Table 7. The expected frequencies are found by exactly the same kind of reasoning as for the 2 × 2 table in Table 5. Thus the expected value for (XN) compounds in compound nouns is

$$E = \frac{15{,}057 \times 12{,}481}{23{,}591} = 7{,}966$$

We now calculate χ² by summing the values of $(O-E)^2/E$ and obtain a value of χ² = 17,075.2. For the 3 × 3 table the number of degrees of freedom is (3 - 1) × (3 - 1) = 4, and the critical value of χ² at the 0.1 per cent level is 18.47. The calculated value of χ² is far greater than this, and is significant at the .000 level. We therefore conclude that there is a significant association between the second member of a compound and its category type.

Analogous calculations performed on Table 7 yield a chi-square value of 13,607.6, which is significant at the .0000 level. We conclude that there is also a significant association between the first member of a compound and the category type of the compound. These two results taken together provide evidence that Chinese morphology does not exploit the use of head, since neither the first nor the second member of a compound prevails in the determination of the category type of a compound.

The comparison given in Tables 8 and 9 shows that all of the dialects other than Mandarin are consistently more left-headed than right-headed. Exactly why Mandarin should be an exception in this regard is a mystery.
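The two calculations described in this section, the chi-square statistic for a contingency table and the probability of locating the head on a given member, can be sketched as follows. This is a minimal illustration, not the procedure actually used for the study: the function names `chi_square` and `head_probabilities` are invented here, and the only data used are the nine cell counts reported in Table 6 (rows are the category of the compound, columns the category of its second element).

```python
def chi_square(observed):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            # Expected count under the null hypothesis of no association:
            # row total x column total / grand total.
            e = row_totals[i] * col_totals[j] / grand_total
            statistic += (o - e) ** 2 / e
    dof = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, dof


def head_probabilities(observed):
    """For each column, the share of compounds whose category matches that column."""
    col_totals = [sum(col) for col in zip(*observed)]
    return [observed[k][k] / col_totals[k] for k in range(len(observed))]


# Cell counts from Table 6: rows = category of the compound (N, V, A),
# columns = category of the second element (N, V, A).
table6 = [
    [11430, 698, 353],
    [2961, 4833, 484],
    [666, 348, 1818],
]

chi2, dof = chi_square(table6)
right_headed = head_probabilities(table6)
mean_right = sum(right_headed) / len(right_headed)

print(f"chi-square = {chi2:.1f} on {dof} degrees of freedom")
print("P(head on rightmost member) for XN, XV, XA:", [round(p, 3) for p in right_headed])
print(f"mean = {mean_right:.3f}")
```

Run on the Table 6 counts, the sketch gives a statistic of about 17,075 on 4 degrees of freedom and right-headedness values of 0.759, 0.822 and 0.685, matching the figures reported above. Passing the nine cell counts of Table 7 to the same two functions gives the corresponding left-headedness values.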


One possible reason might be that the database used for Mandarin in this comparative study was much larger than that used for the other three dialects: over 24,400 compounds for Mandarin, 5,123 compounds for Cantonese, 2,895 compounds for Hakka and 6,450 compounds for Taiwanese. Another possibility is that Mandarin Chinese has been for the last century or two the major conduit through which loan-words from English, a language known to be predominantly right-headed, were imported (but see section 5 for some caveats).

We have established in the preceding pages that Chinese is headless in its compounding morphology, where "head" is defined in structural terms as that element of a compound that determines the category of the compound as a whole. But a formal account of the morphological structure of a language is only half of the story. A true appreciation of the workings of the morphology of natural languages can be had only by considering the head as a semantic notion. To this we now turn.

In semantic terms, a headed compound is one which is endocentric (i.e., one which has a semantic center) and a headless compound is one which is exocentric. Headless compounds, then, are those that meet any of the following conditions:

(i) The category of a compound differs from those of its constituent stems, e.g., [AV]n, [AN]v, [VN]a, [AA]n, [VV]a, [NN]v, ...
(ii) The category of a compound is identical with those of its component stems, e.g., [VV]v, [AA]a.
(iii) A noun compound whose two stems stand in a coordinate relationship, e.g., jiangshan 'river and mountain'.
(iv) Compounds that are lexicalized idioms, e.g., liuhai 'bang'; dongxi 'thing, object'.

Analyses of compounds for their extent of headedness on this semantic criterion have been performed for both Hakka and Cantonese. Table 10 summarizes the results of the analyses. Table 10 clearly shows that both Hakka and Cantonese are, again, headless in their compounding morphology, with head understood as the semantic center of compounds, since neither RH nor LH predominates as the overriding pattern of compounding. A solid majority (58.5%) of the Cantonese compounds are in fact semantically headless.


Table 10. Headedness on semantic criterion for Cantonese and Hakka compounds (%)

Head              N                  V                  A                  Mean
                  Cantonese  Hakka   Cantonese  Hakka   Cantonese  Hakka   Cantonese  Hakka
Rightheaded (RH)  59.8       66.31   8.1        9.77    6.1        2.61    24.7       36.37
Leftheaded (LH)   1.7        5.57    38.8       65.96   10         21.82   16.8       32.06
Headless (-H)     38.5       28.02   53.1       24.26   83.9       75.57   58.5       31.57

To sum up, we have established that the head of a Chinese compound, be it in Mandarin, Cantonese, Hakka, or Taiwanese, where the head is defined variously as the semantic center or as that constituent which determines the category of the compound as a whole, cannot be located on either its leftmost member or its rightmost member. The calculated values of χ² have shown that there is a significant association between the category type of either the leftmost member of a compound or the rightmost member of a compound and the category type of the compound as a whole. Since neither member of a compound can uniquely determine the category of a compound, Chinese is, strictly, a headless language.

5. Is English right-headed? An excursus

Before concluding the present inquiry into the nature of head in Chinese morphology, it is important to turn, at least briefly, to the question of the role of head in English morphology. Morphologists of the lexicalist-morphology stripe have consistently maintained that English words are headed and that the head of an English word is the rightmost member of a word, where words refer to all cases of word-formation. In derivational morphology, for instance, the head in a suffixal derivative happiness is the suffix -ness, because it is this suffix which determines the lexical category of the derivative as a whole.

Table 11. Compounding possibilities for English (LDCE, pages 1050-1250)

Compound  N               V           A            Total
NN        803             1           7            810
AN        205             1           15           221
VN        15              1           2            18
PN        14              0           1            15
PV        0               7           0            7
VP        34              2           8            44
NA        0               0           37           37
VA        0               0           0            0
NP        1               0           1            2
PP        0               0           0            0
NV        3               12          1            16
VV        0               1           1            2
AV        0               0           0            0
AA        1               0           37           38
PA        0               0           3            3
AP        0               0           5            5
Total     1,075 (88.26%)  25 (2.05%)  118 (9.69%)  1,218

In most cases of prefixal derivation in English the root is the head, because the prefix does not carry a categorial feature and thus cannot determine the lexical category of the derivative as a whole. Compounds, however, present a serious problem. A separate check through pages 1050-1250 and through pages 750-900 of the Longman Dictionary of Contemporary English (LDCE) turns up 1,221 and 1,340 compounds respectively, whose distributions are given in Table 11 and Table 12.

Gaps in Table 11 and Table 12 suggest that there are greater restrictions on forming verbal compounds in English. Conspicuous by their absence are VV verbal compounds, a pattern found very often in Chinese (44% of verbal compounds are of this type). It is this greater restriction on verbal compound formation that explains why English relies heavily on noun-to-verb conversions to make up for the structural gaps in its lexicon. It should be noted, however, that the direction of conversion is not necessarily tied to the presence of gaps. In Chinese, the most common type of conversion goes from the verb to the noun and yet there are also greater restrictions on verb-compound formation, as in English.


Table 12. Compounding possibilities for English (based on LDCE, pages 750-900)

Structure  N                   V                A                  Total
NN         pint-size 539                        storybook 2
AN         old flame 195       two-time 1       old hat 25
VN         pasteboard 28
PN         on-looker 43                         off-color 21
PV         output 29           outshine 102
VP         pay-off 30          pickup 172       throwaway 6
NA         papier-mache 4                       threadbare 36
VA         speak-easy 1                         type-written 2
NP         passerby 2                           odds-on 3
PP         outback 2
NV                             typecast 5
VV         top-go 1            touch-type 3
AV                             soft-pedal 1
AA                                              old-fashioned 44
PA                                              on-going 37
AP                                              paid-up 5
Total      874 (65.2%)         284 (21.2%)      182 (13.5%)        1,340

Note: Blanks indicate gaps. P = preposition or particle.

Verbal compounds in these two tables show high variability from one sampling to another, depending principally on the presence or absence of P (prepositions or particles) within that sample, as can be seen by noting that they account for 21.2% in Table 12, but just 2.3% in Table 11. The higher percentage of verb compounds in Table 12 stems from two major sources: PV compounds and VP compounds. These two types of word formation, among others, are features that distinguish English from Chinese, since prepositions (or particles) never participate in word formation in Chinese. Indeed, compounds formed with a P element in English are highly productive, as they account for 36.4% of all compounds in Table 12 (and 9.7% in Table 11). Prepositions (or particles) in these compounds have been assumed in the morphological literature to carry no relevant features that determine the category of a word as a whole, and VP compounds are therefore assumed to have their heads located on the verb itself.

A proper analysis of Ps in PV or VP compounds is of course a vexing question. If we consider the Ps as specifiers, as seems reasonable, it would run afoul of Lieber's (1992) licensing condition for English that heads are final with respect to specifiers, since both VP and PV compounds do occur, giving us both left-headed and right-headed verbal compounds, a conclusion unacceptable to either Lieber or morphologists generally. Lieber's (1992) strategy appears to be to dismiss VP and PV compounds as genuine compounds altogether. Addressing the issue of whether there are words in English with the head occurring on the left and acting as a Theta-assigner, she claims that "there is some sort of constraint in English which prevents verbal compounds of any sort from being created. The exact nature of that constraint need not concern us here. For our purposes, it is sufficient to note that this constraint would rule out all verbal compounds, left- as well as right-headed" (1992: 58). Given the productivity of VP and PV compounds, Lieber's dismissal seems premature; its sole purpose seems to be to make these two types of compounds the sacrificial lamb on the altar, in order to make her licensing conditions look good.

The only alternative left is to take head to be literally that element that determines the category of a word as a whole, and PV verbal compounds would be, on this analysis, right-headed, and VP compounds left-headed. If we accept this line of thinking, English compounds would turn out to be much less right-headed than generally assumed. We subjected the 1,340 compounds in Table 12 to the kind of analysis carried out in sections 3 - 5 for Chinese and came up with the following result:

Table 13. Direction of head in English compounds

Category

RH

LH



Total

Ν V A

748 108 99

6 172 5

120 4 78

874 284 182

Total

955 (71.2%)

183 (13.6%)

202 (15.1%)

1340

Thus compounding morphology in English is only 71.2% right-headed. Like Chinese, English compound nouns exhibit a marked preference for right-headedness and compound verbs are more left-headed than right-headed, while compound adjectives are strongly headless. Such congruence between the two otherwise structurally different languages strongly suggests that something deep is going on in the organization of the lexicon. We have here arrived at an important insight which has so far escaped the attention of the morphologists who have disavowed the possibility of a headless language. The notion of headedness may explain the majority of cases in Chinese and English morphological processes, but headless compounds still jeopardize it as a general explanation of the facts. The actual compounding patterns attested in Chinese and English do not support simple adherence to the head hypothesis. If this were merely a research strategy, acknowledged as an overgeneralization, this might not matter. But if it is taken as a fixed point of reference, it would be methodologically unsound and a distortion of the basic data.
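The head-direction shares discussed above can be recomputed from the counts in Table 13. The following is a minimal sketch, assuming that the third count in each row of the table is the headless class; the names `TABLE_13`, `head_shares` and `column_totals` are illustrative only and do not come from the original study.

```python
# Counts copied from Table 13, in the order (right-headed, left-headed, headless).
# Assumption: the third count in each row is the headless class.
TABLE_13 = {
    "N": (748, 6, 120),
    "V": (108, 172, 4),
    "A": (99, 5, 78),
}


def head_shares(counts):
    """Return the RH, LH and headless shares of one row as percentages."""
    total = sum(counts)
    return tuple(100 * c / total for c in counts)


def column_totals(table):
    """Sum each head-direction column over all compound categories."""
    return tuple(sum(column) for column in zip(*table.values()))


if __name__ == "__main__":
    for category, counts in TABLE_13.items():
        rh, lh, headless = head_shares(counts)
        print(f"{category}: RH {rh:.1f}%  LH {lh:.1f}%  headless {headless:.1f}%")
    overall = column_totals(TABLE_13)
    print("all categories:", overall, "of", sum(overall))
```

Computing the shares row by row keeps the per-category picture separate from the overall figure, which is dominated by the large number of compound nouns.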

6. Direction of head and phrasal syntax

We have established that Chinese is essentially a headless language in its compounding morphology. A major source of this headlessness resides in the fact that, in striking contrast to English word formation, coordinate compounds predominate over all other types of compounds (VV verbal compounds account for 44.6% of all verbal compounds and AA adjectival compounds account for 55.5% of all adjectival compounds) and that compounds of the forms {XY}Z, {XX}Y are also significantly large in number. Since it is generally assumed that there is an intimate connection between the position of the head in words and its position in phrases, we would expect phrasal syntax in Chinese, as in other languages, to shed light on the headedness of words. If compounds are essentially headless, we would also expect phrases to be essentially headless. This, however, is not the case with phrasal syntax in Chinese, although it is true that heads are ambiguous in phrasal syntax in ways that have not been sufficiently appreciated. The facts about phrasal syntax in Chinese are as follows:

(i) Heads are final with respect to modifiers.

(ii) Heads are final with respect to specifiers.

(iii) Heads are initial or final with respect to complements, depending on the transitivity of the predicates in question.

I assume that (i) and (ii) are uncontroversial. (iii) rests on observations like the following:

(1) VP
    a. Wo meiyou lianluo ta
       I have-not contact him
       'I didn't contact him.'
    b. Wo meiyou gen ta lianluo
       I have-not with him contact
       'I didn't get in touch with him.'

(2) AP
    a. Wo hen gaoxing ni neng lai
       I very glad you can come
       'I am glad you could come.'
    b. Wo dui ta hen buman
       I with him very disgusted
       'I am disgusted with him.'

(3) NP
    a. (dui) lunwen de taolun
       about paper DE discussion
       'discussion of the paper'
    b. Wo dui lunwen meiyou yijian
       I about paper have-no opinion
       'I didn't have any opinion about the paper.'

The objects appear to the right of the verbs and adjectives and argument PPs appear to the left of the verbs and adjectives. Nouns are syntactically intransitive and take argument PPs to their left. It is immediately obvious that the positions of heads in relation to their complements will not predict headless word structures in addition to phrase-level constituents. In other words, word syntax in Chinese is not wholly predictable from phrasal syntax.


The message of this paper has now been stated. Chinese is essentially a headless language in its morphology. The predominance of coordinate compounds suggests that Chinese compounding is not constructed on the principle of headhood, but rather on the principle of syntactic concatenation: any two constituents concatenable by syntactic rules are ipso facto well-formed compounds. Let us call this the Principle of Syntactic Compounding. In effect, this means that compounding in Chinese operates not on the head of a phrase, but on a phrase. The fact that phrases, not heads, are the input to compound formation is shown by the existence of such compounds as (NA)n,v,a, (NV)n,v,a, (VN)n,v,a, (VV)n,v,a, or (AA)n,v,a. Formally we have the following (not exhaustive) morphological rules for Chinese:

(4)

a. Ν

*——

NA

b. Y


A Lexical Phonology of Mandarin Chinese


triggering the stratum IV 3 => 2 sandhi rule, yielding däohüi. Exocentric verbs which have two third tones (such as dänxiäo gall-small 'timid') have stress assigned to the right, thus triggering the stratum IV sandhi rule, yielding dänxiäo. Finally, nouns which are composed of two third tones but do not undergo sandhi (such as erduo 'ear' and liji 'tenderloin') are formed at this stratum, and therefore receive stress on the left. The absence of stress on the right accounts both for the fact that these words are neutral-toned and also for the fact that they do not undergo sandhi even though they are composed of two third tones. These words are phonologically marked (i.e., this tonological pattern is relatively uncommon) in Mandarin, and this special phonological status is indicated by their "deep" lexical placement at stratum I. Intuitively speaking, this class of words has been lexicalized to the point where the right-hand morpheme is not analyzed by the native speaker as a third-toned word, and so does not trigger the 3 => 2 stratum IV sandhi operation.

5. Mandarin stratum II phonology

Words formed at stratum II (see Table 2) include all compound nouns (e.g., "garden variety" nominals like qiche steam-vehicle 'car', or xiäojie small-sister 'young lady'; antonymous nominals such as häodäi good-bad 'good/bad-ness'; and exocentric nominals10 like gänxiäng feel-think 'thoughts, impressions'), resultative verb compounds (such as känwän read-finish 'to finish reading'), reduplicated classifiers such as benben (volume-volume 'every volume'), and complex classifiers such as zheiben (this-C:volume 'this (volume)') or zheiliängben (this-two-C:volume 'these two (volumes)').11 Also included at this stratum is the potential marker insertion operation (i.e., känwän read-finish => kändewän read-can-finish 'able to finish reading'). Compound nouns and resultative verbs receive stress on the right and left respectively. Right-hand stress for noun compounds accounts for the normal sandhi behavior of these words, with optional neutral tone marked by stratum IV removal of the stress mark. The placement of stress on the left for resultatives explains why these words are usually neutral-toned when they occur without the potential marker de or bu (Xiändäi Hänyü Cidiän [Modern Chinese dictionary], Chinese Academy of Social Sciences 1983: 4). If de or bu are affixed, then


Jerome L. Packard


fact that these morphemes normally occur with neutral tone in this context, and are listed as having neutral tone in major dictionaries such as Xiändäi Hänyü Cidiän [Modern Chinese dictionary] (Chinese Academy of Social Sciences 1983). However, they do trigger stratum IV 3 => 2 sandhi, so they must be specially assigned stress by rule in the phonology of stratum III. Accordingly, "locative stress assignment" follows nonhead stress in the phonological component of this stratum. This accounts for the fact that the third tone li 'inside' triggers global 3 => 2 sandhi as in shouli hand-in 'in the hand'. Optional neutral tone may then be derived by stratum IV removal of the stress mark. For nouns formed with the suffix -zi (also -tou and -r), -zi acts as the head of the word. Stress assignment therefore occurs on the left, and so the underlyingly third-toned -zi never triggers the 3 => 2 tone sandhi rule. Thus, e.g., for 'chair' we get yǐzi, but never *yízi. After the application of the V-not-V question inflection at this stratum, stress occurs on the non-head (i.e., the li of chubuchuli), which will appropriately trigger the stratum IV sandhi rule. The phenomenon of disyllabic adjective reduplication (Yip 1980: 43-44; Li-Thompson 1981: 32-34) also occurs at stratum III. In disyllabic adjective reduplication, the syllables are reduplicated at the morpheme rather than the word level (AB => AABB; e.g., qingchu 'clear' => qingqingchuchu 'very clear'), unlike 'normal' disyllabic verb reduplication at Stratum IV. This operation applies to only a restricted set of adjectives, and like many other morphological rules, operates on the head of the word. This rule has access to the internal structure of these words because the formation of complex adjectives and their reduplication occur within the same lexical stratum, prior to the "bracket erasure" that occurs at the transition between strata. This rule first reduplicates the word head, then the complement.
Note that the reduplication operation is restricted within this stratum to disyllabic adjectives. The process of "regular" disyllabic verb reduplication occurs at the next stratum (stratum IV), where the internal structure of the word cannot be seen and so the entire word is reduplicated (AB => ABAB; e.g., liäojie 'understand' => liäojieliäojie 'understand a little').
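The contrast between the two reduplication patterns can be pictured as a difference in what the rule is able to see. The following is a minimal sketch, assuming a toy `Word` record whose `parts` field stands in for word-internal structure and an `erase_brackets` function standing in for the transition between strata; these names, and the toneless pinyin forms, are illustrative assumptions rather than part of the analysis itself.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Word:
    form: str                          # toneless pinyin, for illustration only
    parts: Optional[Tuple[str, str]]   # internal structure; None once erased


def erase_brackets(word: Word) -> Word:
    """Model the stratum transition: internal structure becomes invisible."""
    return Word(word.form, None)


def reduplicate(word: Word) -> str:
    """AB => AABB while the parts are visible, AB => ABAB once they are not."""
    if word.parts is not None:
        a, b = word.parts
        return a + a + b + b
    return word.form + word.form


if __name__ == "__main__":
    adjective = Word("qingchu", ("qing", "chu"))      # still inside stratum III
    verb = erase_brackets(Word("liaojie", ("liao", "jie")))  # seen at stratum IV
    print(reduplicate(adjective))
    print(reduplicate(verb))
```

The sketch uses a single reduplication function on purpose: the two surface patterns then follow from whether bracket erasure has applied, not from two unrelated rules.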

7. Mandarin stratum IV phonology

The stratum IV morphological component contains the affixation of inflectional processes and compound verb reduplication. The phonology of stratum IV contains no stress assignment, but contains two non-cyclic rules. The first is the 3-3 => 2-3 tone sandhi rule, which applies to all tone 3-3 bisyllabic words whose rightmost syllable is marked for stress. The second is optional neutral tone marking, which removes the asterisk (i.e., the stress mark) from the rightmost syllable of words that are optionally neutral-toned. Words that have no stress mark on the right by the time they reach stratum IV (viz., words with "obligatory" neutral tone) automatically receive neutral tone. In all cases, the post-lexical phonetic interpretation rules will realize the rightmost syllable of a word as full-toned if it is marked for stress, and neutral-toned if it is not.
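The interaction of the two stratum IV rules with the post-lexical realization of tone can be traced step by step. The following is a minimal sketch, assuming a toy representation in which a disyllabic word is a pair of underlying tones plus a flag for the stress mark on the rightmost syllable, with 0 standing for neutral tone; the class `Disyllable`, the function `stratum_iv` and the `choose_neutral` switch are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Disyllable:
    tones: Tuple[int, int]          # underlying tones, e.g. (3, 3)
    right_stressed: bool            # stress mark on the rightmost syllable
    optional_neutral: bool = False  # word allows optional neutral tone


def stratum_iv(word: Disyllable, choose_neutral: bool = False) -> Tuple[int, int]:
    """Return the surface tones of a disyllabic word (0 = neutral tone)."""
    first, second = word.tones
    stressed = word.right_stressed

    # Rule 1: 3-3 => 2-3 sandhi, only if the rightmost syllable bears stress.
    if (first, second) == (3, 3) and stressed:
        first = 2

    # Rule 2: optional neutral tone marking removes the stress mark.
    if word.optional_neutral and choose_neutral:
        stressed = False

    # Post-lexical realization: full tone if stressed, neutral tone if not.
    return (first, second if stressed else 0)


if __name__ == "__main__":
    young_lady = Disyllable((3, 3), right_stressed=True, optional_neutral=True)
    ear = Disyllable((3, 3), right_stressed=False)
    print(stratum_iv(young_lady))                       # full-toned variant
    print(stratum_iv(young_lady, choose_neutral=True))  # neutral-toned variant
    print(stratum_iv(ear))                              # no sandhi, neutral tone
```

Because sandhi is applied before the stress mark is removed, a word of the xiäojie type surfaces with a second tone on its first syllable even when its second syllable is neutral, while a word of the erduo type, which never has right-hand stress, keeps its third tone.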

8. Implications and concluding remarks

The analysis presented in this paper accounts for all cases of neutral tone and 3 => 2 tone sandhi in the Mandarin lexicon, using a preexisting Lexical Phonology and Morphology framework which already had accounted for morphological cooccurrence restrictions in Mandarin complex word formation (Packard 1990).15 In considering this model, we should keep in mind that the lexical system must also include a list of stored items used as the "building blocks" of word formation. Following Caramazza–Laudanna–Romani (1988), we may hypothesize that those stored items are listed in both morphologically composed and decomposed form. This means that the stored list of lexical primitives includes individual word-forming morphemes, as well as common partially-derived and fully-derived words (see, e.g., the Augmented Addressed Morphology model of the lexicon; Caramazza–Laudanna–Romani 1988). My goal in presenting this model has not been to focus attention on the fact that the Mandarin word-formation component is divided into hierarchically-ordered strata, and that words are somehow "built up" in an on-line fashion by mechanically "passing through" the strata in some rigidly computational manner. Instead, my goal has been to illustrate that Mandarin complex words fall neatly into virtually discrete classes that reflect not only their syntactic identity, but also characteristics such as their order of composition, boundness of constituents, and susceptibility to morphological and phonological operations. Thus, this model is intended to describe the implicit knowledge of these complex word classes that exists in the mind of a Mandarin native speaker. The fact that the model also accounts for many cross-linguistic generalizations (such as derivation virtually always preceding inflection, and phonological operations being strictly ordered with respect to morphological ones) raises the possibility that the model also represents a universal property of language, and that the knowledge represented in this model is a priori knowledge that children bring with them to the task of L1 acquisition.

Notes

1. This paper is an extensive revision of part of a paper presented to the Fourth North American Conference on Chinese Linguistics held at the University of Michigan, Ann Arbor, Michigan in May 1992. Several scholars have pointed out factual and bibliographic citation errors in that paper, and have made suggestions contributing significantly to the version presented here. In particular, I would like to thank Richard Sproat, Chilin Shih, Matthew Chen, and San Duanmu, who undoubtedly would not agree with everything I have said here. Thanks also to Tom Ernst for helpful comments on this latest version.

2. Mandarin has four lexical tones. The first is a high-level tone, the second is a high rising tone, the third is a low fall-rise, and the fourth is a high falling tone.

3. The tone of certain syllables is sometimes reduced or "neutralized" as a result of metrical prominence relations in the syntax. We will not discuss that phenomenon here.

4. Tone marking conventions in this paper are as follows. "First tone" is represented as [ā], "second tone" is [á], "third tone" is [ǎ], and "fourth tone" is [à]. Neutral tone is unmarked.

5. The syllable and the morpheme are virtually isomorphic in Mandarin.

6. For the purposes of this paper, in all cases a 2-3 (e.g., diänli) tone combination is considered to be a 3-3 which has undergone the 3-3 => 2-3 tone sandhi rule, so the second tone is actually an underlying third tone. A second-neutral (e.g., xiäojie) combination is also considered to have undergone the 3-3 => 2-3 rule, so both the second and neutral tones are underlying third tones. A third-neutral (e.g., jiejie) combination is in all cases an underlying third tone preceded by a third tone which has not undergone the 3-3 => 2-3 rule.

7. For some speakers of Mandarin, there is also the class "neutral tone obligatory; sandhi triggered". For these speakers, some words (e.g., xiäojie) have an obligatory neutral tone on the second syllable, yet they undergo sandhi in the proper context. I presume that diachronically speaking, the obligatoriness of the neutral tone for these words is a relatively recent phenomenon, and therefore that the underlying tones of such words are fully "transparent" to these speakers. The evidence is that these speakers will produce the full (third) tone on the second syllable when asked for the "citation form". This clearly does not happen for the class "neutral tone obligatory; sandhi not triggered". For example, if asked for the citation form of 'ear' erduo, no native speaker of Mandarin would produce the second syllable with a third tone (*-duö).

8. Some investigators (e.g., Sproat–Shih 1993) have criticized my use of the term "abstract" to characterize stress marking in Mandarin. In using the term "abstract", my intention is to convey the notion that, as with the "accent" of pitch-accent languages, the syllable marked with the abstract diacritic is not always the syllable upon which the phonological effect may be directly observed, but rather is the syllable in terms of which the phonological phenomenon is best explained.

9. There are two versions of the 3 => 2 sandhi rule, one lexical and one post-lexical. The lexical version is needed to explain the fact that the 3 => 2 sandhi rule must precede the marking of neutral tone (because 3 => 2 sandhi always feeds the neutral tone realization rule, except for words such as jiejie, laolao, erduo, and liji, to be discussed in section IV), and neutral tone is necessarily marked in the lexicon. The post-lexical version is needed to account for the more general application of 3 => 2 sandhi (i.e., the word-sequence application; see Halle–Vergnaud 1987: 79) throughout the grammar. We will not discuss the post-lexical version here.

10. Exocentric noun compounds are those that do not have a nominal morpheme in canonical head (i.e., right-hand) position (Packard 1990). Note that the placement of exocentric nominals at Stratum II represents a change from the analysis in Packard (1990, 1992, 1993), where it was held that all exocentrics are formed at level I. The reasons for this change are as follows. First, compound nouns differ from compound verbs in that they are not generally subject to head operations. This being the case, there is no compelling reason to place them at level I where they would be exempt from the head operations which occur at the later lexical levels. Compound verbs which are exocentric avoid certain head operations (such as V-not-V question reduplication and resultative compound formation) by being formed at level I (see Packard 1990). Second, as pointed out in Sproat–Shih (1993), exocentric noun compounds do undergo the third tone sandhi rule, which is not predicted if they are formed at level I as originally proposed in Packard (1990, 1992, 1993). Third, the creation of exocentric "antonymous" nominal compounds by juxtaposing two adjectives (stative verbs) which represent opposite qualities (such as häohuäi good-bad 'quality', or däxiäo big-small 'size') is a rather productive morphological process in Mandarin (see Chao 1968: 375-376; Li–Thompson 1981: 81). Since the assignment of morphological processes to lexical level depends in part on the productivity of the process (generally with less productive rules at lower levels and more productive rules at higher levels), the formation of exocentric nominals at a higher level comports with the relative productivity of the process.


11. Complex classifier words were listed as being formed at Stratum III in an earlier work of mine analyzing Chinese aphasic speech (Packard 1993). The reason for changing them to Stratum II in this revision of the system is that they clearly pattern syntactically like nouns. This change and the one in the preceding footnote do not affect the major critical results in the analysis of aphasic Chinese speech presented in Packard (1993).

12. For an analysis that views such complex classifier words as syntactic phrases with different dominance relations than those given here, see Tang (1990: 402-406).

13. Strictly speaking, these are also exocentric nominals.

14. Also suffixed at this level are nominals formed with -tou and -r, though these are not relevant to the phonological analysis presented here.

15. But see Sproat–Shih (1993) for counterarguments.

References

Caramazza, Alfonso–Alessandro Laudanna–Cristina Romani
  1988 "Lexical access and inflectional morphology", Cognition 28: 297-332.
Chan, Marjorie
  1984 Word formation in Mandarin: A preliminary sketch. [Paper presented to WECOL, Vancouver, Canada.]
Chao, Yuen Ren
  1968 A grammar of spoken Chinese. Berkeley: University of California Press.
Chinese Academy of Social Sciences Linguistics Institute
  1983 Xiändäi Hänyü Cidiän [Modern Chinese Dictionary]. Beijing: Shängye Yinshüguän [Commercial Press].
Halle, Morris–Jean-Roger Vergnaud
  1987 An essay on stress. Cambridge, MA: MIT Press.
Li, Audrey Yen-Hui
  1990 Order and constituency in Mandarin Chinese. Dordrecht: Kluwer.
Li, Charles–Sandra Thompson
  1981 Mandarin Chinese: A functional reference grammar. Berkeley: University of California Press.
Packard, Jerome
  1990 "A lexical morphology approach to word formation in Mandarin", Yearbook of Morphology 3: 21-37.
  1992 Why Mandarin morphology is stratum-ordered. [Paper presented to the Fourth North American Conference on Chinese Linguistics, May 8-10, Ann Arbor, Michigan.]
  1993 A linguistic investigation of aphasic Chinese speech. Dordrecht: Kluwer.
Ross, Claudia
  1990 "Resultative verb compounds", Journal of the Chinese Language Teachers Association 25.3: 61-83.
Sproat, Richard–Chilin Shih
  1992a Mandarin morphology is not stratum-ordered. [Paper presented at the 66th annual meeting of the Linguistic Society of America, Philadelphia, PA.]
  1992b "On the sources of some constraints in Mandarin Morphology", in: Proceedings of the Third International Symposium on Chinese Language and Linguistics: 20-37. Hsinchu, Taiwan: National Tsing Hua University.
  1993 "Why Mandarin morphology is not stratum-ordered", Yearbook of Morphology: 185-217.
Tang, Chih-Chen Jane
  1990 Chinese phrase structure and the extended X'-Theory. [Cornell University Ph.D. dissertation.]
Yip, Moira
  1980 The tonal phonology of Chinese. [MIT Ph.D. dissertation.]

Cognate objects and the realization of thematic structure in Mandarin Chinese

Claudia Ross

1. Introduction

In Mandarin Chinese, verbs which denote open-ended activities (cf. Grice 1975) such as chi 'to eat', xie 'to write', shuo 'to speak', and hua 'to paint' have several unusual properties. First, they are obligatorily transitive, even in contexts which do not involve an affected object. As (1) and (2) illustrate, English uses intransitive verbs in such contexts.1

(1) Don't speak.

(2) When do we eat?

Second, as illustrated in (3)-(6), these verbs have cognate objects,2 which are semantically weak or empty. These cognate objects are used in precisely those contexts in which English selects intransitive activity verbs, contexts which lack an implied or entailed goal.

(3) bie shuo hua.
    don't speak speech
    'Don't speak.'

(4) women shenmo shihou chi fan?
    we what time eat rice/food
    'When do we eat?'

(5) ta bu hui xie zi.
    he not able write character
    'He can't write.'

(6) ta meitian hua huar.
    he every-day paint picture
    'He paints pictures every day.'


The existence of cognate object verbs and the properties of these verbs have largely been ignored in the literature. Chao (1968), for example, does not go beyond the observation that verbs such as chi and shuo are transitive, and he makes no note of the existence or properties of the cognate objects of these verbs. Li—Thompson (1981: 158) correctly observe that the relatively widespread use of zero anaphora in Mandarin often gives the surface appearance of intransitivity to transitive verbs like chi. But the properties of open-ended activity verbs such as chi remain unexplored, relegated to the status of arbitrary and unexplainable facts. The present study examines these properties within the framework of thematic structure especially as it has been developed in Jackendoff (1987). It argues that the obligatory transitivity of cognate object verbs and the concomitant existence of cognate objects are determined by a hierarchy of thematic role assignment in Mandarin that gives special status to the role of Theme. Furthermore, it demonstrates that aspectual properties of verbs interact with thematic role assignment in a regular way, such that the assignment of thematic roles inherent in a verb is determined in part by the aspectual properties of the verb. Finally, it accounts for differences in the syntactic properties of activity verbs and other verbs in terms of their thematic and aspectual structure. The paper is organized as follows. Jackendoff's 1987 framework of thematic structure on which this analysis is based is discussed in section 2. An explanation of the significance and status of cognate objects is presented in section 3. Section 4 examines a special case of cognate-object activity verbs and explores the interaction of theta structure, verb classification, and argument structure. Section 5 examines other syntactic phenomena in Mandarin that can be explained in terms of thematic structures.

2. Thematic structure

The notion of thematic roles developed in Gruber (1965) and subsequent works has been incorporated in many recent theories of grammar as a way of explaining the semantic relationship of the verb to its arguments (cf., e.g., "Case Grammar" as developed by Fillmore [1968] and "Theta theory" in the G-B framework developed by Chomsky [1981]). It is generally agreed that some semantic and relational information is conveyed by the verb to its arguments, and most works that incorporate thematic roles recognize at least the roles presented in (7).

(7) Theta roles:
    a. Theme: the argument that is set in motion by the action of the verb or the NP whose location is predicated by the verb;
    b. Source: the point of origin of the theme;
    c. Goal: the endpoint of the theme;
    d. Agent: the causer, the initiator of the action of the verb;
    e. Patient: the affected object, the recipient of the action of the verb;
    f. Location: the place where the verb occurs.

Chomsky (1981) assumes a distinct principle of grammar, Theta Theory, and a universal Theta Criterion which functions to determine and constrain the assignment of Theta roles.

(8) The Theta Criterion: Each argument bears one and only one Theta role, and each Theta-role is assigned to one and only one argument. (Chomsky 1981: 36)

Jackendoff (1987) argues that thematic roles themselves are not primitives but are bundles of semantic features whose values are determined by the conceptual structure of language. These bundles of features are defined in terms of the roles that arguments assume with respect to such primitives as Place, Path, Event, and State that are associated with the meanings of verbs. Jackendoff argues that the labels Source, Goal, Theme, etc., are shorthand notations for these bundles of features representing conceptual relationships encoded by arguments vis-à-vis each other and the verb. Jackendoff (1987) cites the sentences in (9) to demonstrate problems with the Theta Criterion, especially with the stipulation that a single NP may be assigned only a single theta role.

(9) a. The ball rolled down the hill.
    b. John rolled down the hill.

The problem with (9) concerns the theta roles that are needed to interpret the subject of the verb roll. As Jackendoff notes, the subject of roll in (9a) and (9b) is a Theme, the object in motion. In (9a) Theme is the only role associated with the subject, and thus (9a) conforms to the predictions of the Theta Criterion. (9b) is ambiguous. In one reading, John inadvertently rolls down the hill. In this reading, John, like the ball in (9a), has the single role of Theme. But in the other reading of (9b) John deliberately causes himself to roll down the hill. In this reading John is not merely a Theme. He is also an Agent, the initiator of the action of rolling. That is, in this reading, John has two theta roles, Theme and Agent. This interpretation of (9b) is neither marginal nor exceptional in English. But it cannot be accommodated under the traditional framework of Theta roles because it violates the Theta Criterion. The Theta Criterion cannot be salvaged by assuming that the role of Theme includes that of Agent. For Agents need not be Themes and Themes need not be Agents. The notion of Agent is not associated with the Theme in (9a) or in the "accidental" reading of (9b), for example. And as (10) and (11) illustrate, the role of Agent may be associated with roles other than Theme. In (10) it is associated with the Source. In (11) it is associated with the Goal.

(10) John     hit   Bill.
     Source         Goal
     Agent

(11) John     got the money from   Bill.
     Goal                          Source
     Agent

In short, the role of Agent is often assigned simultaneously with other Theta roles to a single NP. Jackendoff accounts for this property of Agents by proposing that the role of Agent is assigned independently of the roles of Source, Theme, and Goal. He argues that Theta roles are grouped in several tiers and that the role of Agent (together with the role of Patient) belongs to a different tier than the roles of Source, Goal, and Theme.3 Jackendoff identifies three tiers, the Thematic tier, the Action tier and the Temporal tier, presented in (12).4

(12) Theta tiers:
     Thematic tier: deals with motion and location [theta roles assigned: Source, Theme, Goal].
     Action tier: deals with Agent-Patient relations [theta roles assigned: Agent (initiator of action), Patient (affected object)].
     Temporal tier: specifies aspectual features of the event [Temporal distinctions assigned: P (point of time), R (region of time)].
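The difference between the flat Theta Criterion in (8) and a tiered view of role assignment can be made concrete with the role assignments discussed for (9)-(11). The following is a minimal sketch; the function names are hypothetical, and the check "at most one role per argument on each tier" is an assumption adopted here only to illustrate how tiers keep Agent apart from Source, Theme, and Goal, not a principle quoted from Jackendoff (1987).

```python
# Role inventories for two of the tiers in (12).
THEMATIC_TIER = {"Source", "Theme", "Goal"}
ACTION_TIER = {"Agent", "Patient"}


def satisfies_theta_criterion(assignment):
    """Flat check of (8): one role per argument, one argument per role."""
    all_roles = [role for roles in assignment.values() for role in roles]
    one_role_each = all(len(roles) == 1 for roles in assignment.values())
    no_shared_role = len(all_roles) == len(set(all_roles))
    return one_role_each and no_shared_role


def roles_on_tier(assignment, tier):
    """Restrict an assignment to the roles that belong to one tier."""
    return {arg: roles & tier for arg, roles in assignment.items()}


def at_most_one_role_per_tier(assignment):
    """Assumed tiered check: no argument has two roles on the same tier."""
    return all(
        len(roles) <= 1
        for tier in (THEMATIC_TIER, ACTION_TIER)
        for roles in roles_on_tier(assignment, tier).values()
    )


# Assignments taken from the discussion of (9a), (9b) and (10).
BALL_ROLLED = {"the ball": {"Theme"}}
JOHN_ROLLED_DELIBERATELY = {"John": {"Theme", "Agent"}}
JOHN_HIT_BILL = {"John": {"Source", "Agent"}, "Bill": {"Goal"}}

if __name__ == "__main__":
    for name, example in [
        ("(9a)", BALL_ROLLED),
        ("(9b), deliberate reading", JOHN_ROLLED_DELIBERATELY),
        ("(10)", JOHN_HIT_BILL),
    ]:
        print(name, satisfies_theta_criterion(example),
              at_most_one_role_per_tier(example))
```

On this sketch the deliberate reading of (9b) fails the flat criterion, because John carries two roles, but passes the tiered check, because Theme and Agent sit on different tiers.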


The distinction in Theta tiers accounts for the interaction and conflation of roles associated with verbs and arguments. As we have just seen, the identification of distinct thematic and action tiers, for example, accounts for the independence of the roles of Source, Goal, and Theme. The temporal tier incorporates features that are relevant in the aspectual classification of the verb as identified, for example, in Vendler (1967). Vendler distinguishes four verb classes: activities (also referred to as processes), achievements, accomplishments, and states. In Jackendoff's framework, activities, achievements, and accomplishments are distinguished by the temporal features of R (region) and P (point). Activities, actions for which initial and terminal points are not specified, have the feature R in the temporal tier, indicating that they take place over a region of time. Achievements such as 'reach the top' and 'remember' are P entities, since for achievements it is not the process but the endpoint that is salient. Accomplishments such as 'write a letter' are R-P entities, since for accomplishments both the region of time over which an action occurs and also the point at which it is completed are salient. States, like activities, have the feature R in their temporal tier, though states are distinguished in Jackendoff's framework by the presence of the concept of BE in their conceptual structure. We might consider BE to represent the feature +Stative. We will see that it is useful to distinguish stativity as a feature within the Theta structure of verbs in Mandarin.

This paper will focus on the interaction of theta roles and theta tiers in determining phrase structure configurations in Mandarin. In particular, it will show that certain properties of Mandarin phrase structure can best be understood with reference to theta roles, and that the system of theta roles that determine phrase structure must incorporate a distinction of theta roles in tiers.
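The way the temporal features R and P, together with BE, pick out the four Vendler classes can be summarized as a small lookup. The following is a minimal sketch, assuming a boolean encoding of the three features; the `Temporal` record and the function `vendler_class` are illustrative names and not part of Jackendoff's or Vendler's own formalism.

```python
from typing import NamedTuple


class Temporal(NamedTuple):
    region: bool   # R: the event occupies a region of time
    point: bool    # P: the event has a salient point of time
    stative: bool  # BE present in conceptual structure (+Stative)


def vendler_class(features: Temporal) -> str:
    """Map a feature bundle onto one of the four Vendler classes."""
    if features.stative and features.region:
        return "state"
    if features.region and features.point:
        return "accomplishment"
    if features.region:
        return "activity"
    if features.point:
        return "achievement"
    raise ValueError("no temporal feature specified")


if __name__ == "__main__":
    examples = {
        "reach the top": Temporal(region=False, point=True, stative=False),
        "write a letter": Temporal(region=True, point=True, stative=False),
    }
    for phrase, features in examples.items():
        print(phrase, "->", vendler_class(features))
```

The stative check comes first because states and activities share the feature R and differ only in the presence of BE.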

3. Mandarin cognate object verbs

Mandarin verbs with cognate objects include the following.

(13) chi fan       eat rice          'eat'
     hua huar      paint painting    'paint'
     jiao shu      teach book        'teach'
     kan shu       read book         'read'
     nian shu      study book        'study'
     mai dongxi    buy thing         'buy/shop'
     xie zi        write character   'write'
     shuo hua      speak talk        'speak'

As noted in Chu (1976), Tai (1984), and Smith (1989), the verbs in (13) are unbounded activity verbs and the presence of either a cognate object or a semantically more informative object does not function to bound these verbs and turn them into accomplishments. This fact is not always apparent in the English translations, for in English, the presence of an object often entails an endpoint to the activity and turns the activity into an accomplishment. But it becomes apparent when an attempt is made to cancel the sense of boundedness or completion from the verb-object string. As (14) and (15) illustrate, seemingly equivalent verb-object strings in English and Mandarin behave very differently when an attempt is made to cancel the sense of completion (cf. Grice 1975).

(14) a. Wo mai-le yiben shu, mai-le ban tian, mei maidao.
        I buy-le one-cl book buy-le half-day not succeed-in-buying


     b. ?I bought a book, bought it for half a day, but didn't succeed in buying it.

(15) a. wo kan-le naben shu, keshi mei kanwan.
        I read-le that-cl book but not read-finish
     b. ?I read that book but I didn't finish it.

In English, strings with activity verbs in past tense with specified or quantified objects typically entail completion of the activity. This entailment is a function of the boundedness contributed to the activity by the specified or quantified object plus the boundedness contributed by the property of perfective aspect associated with the simple past tense in English (Smith 1991).5 As (14 b) and (15 b) illustrate, cancellation of this entailment in English results in semantic contradiction. But in Mandarin, strings with activity verbs which are suffixed with the perfective aspect marker le and which have specified or quantified objects do not entail completion. As (14 a) and (15 a) illustrate, completion may be cancelled from Mandarin sentences with these properties. With or without objects and regardless of the aspectual markers with which they occur, Mandarin activity verbs may only imply, but do not entail, completion.6

This difference between Mandarin and English helps to explain why Mandarin activity verbs, unlike English activity verbs, can be transitive. But it does not explain why Mandarin activity verbs must be transitive. For this we must examine the thematic structure of these verbs.

The thematic structures of the activity verbs in (13) are identical. In each case the subject has the thematic role of Source and the action role of Agent, and the object has the thematic role of Theme. This is represented in (16).

(16) [ T: S, T ]
     [ A: A    ]

Many of the verbs in (13), including jiao 'teach', mai (3) 'to buy', xie 'to write', and shuo 'to say' also imply a Goal, although Goal is not obligatorily assigned by the verb. If it is assigned, it is assigned by a "Coverb" in a "Coverb Phrase" as in (17).7

(17) a. wo gei ni jiao shu.
        I to you teach book
        'I teach you.'


     b. wo gei ta mai dongxi.
        I to/for him buy thing
        'I buy things for him.'
     c. wo gei ta xie xin.
        I to/for him write letter
        'I write letters to/for him.'
     d. wo dui ta shuo hua.
        I to him speak talk
        'I speak to him.'

In English, verbs like teach, buy, and write subcategorize for direct and indirect objects and thus permit the roles of Source, Theme, and Goal to be assigned by the verb, directly or compositionally. But Mandarin has very few verbs that subcategorize for two objects, and thus very few verbs that can be assigned three theta roles from the thematic tier.8 Thus, for most verbs whose conceptual structures incorporate three roles, only two can be assigned and one must be suppressed. Significantly, it is never the role of Theme that is suppressed, but only that of Source or Goal. In other words, it appears that Themes must be assigned to an argument position in Mandarin and cannot be suppressed or assigned to an optional adjunct position. This is, I believe, the reason why activity verbs must have objects in Mandarin: the theta structure of activity verbs entails a Theme, and Theme must be assigned.

To illustrate this, let us consider the verbs jiao 'to teach', mai (3) 'to buy', mai (4) 'to sell', and jie 'to borrow/to loan'.9 All of these verbs entail, in their conceptual structures, the three theta roles of Source, Goal, and Theme, and all permit only a single object in their phrase structure configuration. For the verbs jiao and mai (4) the role of Source is assigned to the subject and the role of Theme is assigned to the object. The Goal may be suppressed or expressed in a Coverb Phrase.

(18) Laoshi (gei xuesheng) jiao Hanyu.
     teacher to/for student teach Chinese
     'The teacher teaches (the students) Chinese.'

(19) Lao Li (gei Xiao Wang) mai(4)-le yiliang che.
     old Li to/for little Wang sell-le one-cl car
     'Old Li sold (Little Wang) a car.'


The verb mai (3) assigns Goal to subject and Theme to object. The Source may be suppressed or expressed as the object of a coverb.

(20) wo (gen ta) mai(3)-le che.
     I from him buy-le car
     'I bought a car (from him).'

The verb jie 'borrow/loan' is particularly interesting. As we have seen, the obligatory assignment of Theme contrasts with the optional suppression of the role of Source or Goal associated with a non-subject. jie entails a Source, a Goal, and a Theme, but as the following sentences illustrate, only the role of Theme is strictly associated with a grammatical function/argument position. Theme is always assigned to the object of jie. In contrast, thematic assignment to the subject of jie is not fixed. The subject of jie may be either Source or Goal. If it is Source, the role of Goal is either suppressed or expressed in an adjunct phrase and the English translation of jie is 'loan'. If it is Goal, the role of Source is suppressed or assigned in an adjunct phrase and the English translation of jie is 'borrow'. This is illustrated in (21).10

(21) a. ta jie-le qian gei wo.
        he JIE-le money to me
        'He loaned money to me.'
     b. ta gen wo jie-le qian.
        he from me JIE-le money
        'He borrowed money from me.'

In order to account for theta-role assignment in Mandarin we need two rules. One rule obligatorily assigns a Theme to an argument position and requires an argument position to accommodate the Theme. But it is clear that Mandarin also assigns theta roles according to a theta hierarchy in which Source and Goal take precedence over Theme. For if the conceptual structure of a verb includes a Source and/or a Goal in addition to a Theme, it is Source or Goal and not Theme that is assigned to the subject. But it is not the case that both Source and Goal take precedence over the Theme. For if the conceptual structure of a verb includes both Source and Goal in addition to Theme, only one of Source or Goal is assigned to an argument position. The second role assigned is Theme. This is captured in the hierarchy in (22).

(22) Mandarin Hierarchy of Theta Assignment: Source/Goal > Theme

Finally, Mandarin differs from languages like English in that the Theme must be assigned. Verbs whose conceptual structures contain Source and/or Goal and also Theme must be transitive, for Theme must be assigned to an argument position and cannot be suppressed.
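The two rules of this section (Theme must reach an argument position; Source/Goal outranks Theme for the subject position) can be sketched procedurally. The Python below is an illustrative restatement and not a formalism proposed in the text: the function name assign_theta_roles, the dictionary keys, and the subject_role parameter (standing in for the lexical choice between Source and Goal, as with mai (3) versus mai (4)) are all hypothetical.

```python
from typing import Dict, Optional, Set

HIGHER_ROLES = {"Source", "Goal"}  # roles that outrank Theme in hierarchy (22)


def assign_theta_roles(roles: Set[str],
                       subject_role: Optional[str] = None) -> Dict[str, object]:
    """Distribute thematic-tier roles over subject and object positions."""
    if "Theme" not in roles:
        raise ValueError("this sketch only covers verbs whose thematic tier has a Theme")
    higher = roles & HIGHER_ROLES
    if not higher:
        # Nothing outranks Theme, so Theme itself takes the subject position.
        return {"subject": "Theme", "object": None, "suppressed_or_coverb": []}
    if subject_role is None:
        if len(higher) > 1:
            raise ValueError("the verb must specify whether Source or Goal is the subject")
        subject_role = next(iter(higher))
    if subject_role not in higher:
        raise ValueError("the subject must be Source or Goal when either is present")
    # Theme is never suppressed: it fills the one remaining argument position.
    leftover = sorted(higher - {subject_role})
    return {"subject": subject_role, "object": "Theme",
            "suppressed_or_coverb": leftover}


if __name__ == "__main__":
    three_roles = {"Source", "Goal", "Theme"}
    print(assign_theta_roles(three_roles, subject_role="Source"))  # jiao, mai (4)
    print(assign_theta_roles(three_roles, subject_role="Goal"))    # mai (3)
```

Calling the function twice with the same role set but a different subject_role mirrors the behaviour of jie, whose subject may be either Source ('loan') or Goal ('borrow') while Theme stays in object position.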

4. Activity and transitivity: the interaction of Theta tiers

We have seen that activity verbs whose theta structures include Theme and also Source and/or Goal must be transitive, and we have explained this in terms of a Mandarin rule of theta assignment that requires that the role of Theme be assigned to an argument position. However, the transitive structure of a small class of activity verbs cannot be explained in this way. These verbs include verbs of motion such as pao 'run' and zou 'walk' and are illustrated in (23) with their cognate object lu 'road'.

(23) wo zou/pao lu.
     I walk/run road
     'I walk/run.'

zou and pao, like the verbs in (13) above, are open-ended activities. And like the verbs in (13), zou and pao have obligatory cognate objects. But the thematic structure of zou and pao differs from that of the verbs in (13) in an important way. zou and pao include in their thematic structures only the role of Theme and not Source or Goal. Theme is assigned to the subject of the verb, thus satisfying both the thematic hierarchy and also the requirement that Theme be assigned to an argument position. The explanation for the obligatory cognate object of the activity verbs in (13) can thus not be extended to zou and pao.

In the remainder of this section I will examine the obligatory transitivity of activity verbs of motion like zou and pao by comparing them to other verbs with overlapping but distinct theta structures. I will argue that the obligatory transitivity of activity verbs aligns these verbs with other activity verbs, verbs with identical temporal structures. At the same time the transitivity of these verbs distinguishes them from other verbs with which they share thematic structures but not temporal structures.


In other words, the obligatory transitivity of activity verbs of motion serves to highlight the temporal structure of these verbs and to identify the primary classification of these verbs in terms of temporal structure rather than thematic structure.

Besides activity verbs of motion like zou and pao, there are two other verb classes whose thematic tiers contain only Theme and not Source or Goal: adjectival stative verbs and achievement verbs of motion. Adjectival stative verbs include words which translate into adjectives in English but which in Mandarin belong to the grammatical category of Verb (Ross 1983), including gao 'to be tall', hao 'to be well', congming 'to be intelligent', etc. These verbs are intransitive and their subjects, as the entity about whom a state is predicated, are Themes (cf. (7) above). Achievement verbs of motion include the verbs lai 'to come' and qu 'to go'. Unlike zou and pao, lai and qu are not open-ended activities. Each involves an endpoint. It is the endpoint that distinguishes lai and qu from each other: qu entails an endpoint that is distinct from the location of the speaker, and lai entails an endpoint that is identical with the location of the speaker.

(24) a. Lao Li dao gongyuan qu le.
        Old Li towards park go le
        'Old Li went to the park.' (a location distinct from that of speaker)
     b. Lao Li dao gongyuan lai le.
        Old Li towards park come le
        'Old Li came to the park.' (a location containing the speaker)

Furthermore, it is the endpoint that is salient for lai and qu and not motion towards the endpoint. Thus, in Vendler's classification they are achievement verbs. In Jackendoff's framework they have the feature P in their temporal tier, distinguishing them from pao and zou which have only the feature R in their temporal tier.

zou and pao differ minimally from adjectival stative verbs like gao 'to be tall' and hao 'to be well' and also from achievement verbs of motion like lai 'to come' and qu 'to go'. They differ from lai and qu by a single feature in the temporal tier. They share with stative verbs the same thematic tier (they all contain only Theme) and the same temporal tier (they are all R verbs) and differ from adjectival stative verbs by stativity alone. As I noted above, Jackendoff assumes that stative verbs are identified by the presence of the concept of BE in their conceptual structures. The fact that stativity works like the features R and P in distinguishing verbs with identical thematic structures suggests that BE is a temporal feature.11 The locus of BE in the conceptual structure is not crucial here. What is important is that the transitive phrase structure configuration of activity verbs of motion like zou and pao serves to clearly distinguish them from stative verbs and achievement verbs of motion and to identify them with other activity verbs. I suggest that activity verbs of motion are transitive precisely for this reason. The transitive structure of activity verbs of motion identifies their status as activity verbs and in this way identifies the temporal tier as the primary classification for these verbs.
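The minimal contrasts discussed in this section can be made explicit by listing each verb class as a bundle of features and comparing the bundles pairwise. The Python sketch below is only an illustration of that comparison: the class labels, the feature names, and the decision to store stativity as a separate boolean (rather than as a temporal feature, a question the text leaves open) are assumptions made for the example.

```python
from typing import Dict, List

# Hypothetical feature bundles for the three Theme-only verb classes.
VERB_CLASSES: Dict[str, Dict[str, object]] = {
    # zou 'walk', pao 'run'
    "activity_of_motion": {"thematic": "Theme", "temporal": "R", "stative": False},
    # lai 'come', qu 'go'
    "achievement_of_motion": {"thematic": "Theme", "temporal": "P", "stative": False},
    # gao 'be tall', hao 'be well'
    "adjectival_stative": {"thematic": "Theme", "temporal": "R", "stative": True},
}


def differing_features(class_a: str, class_b: str) -> List[str]:
    """Return the names of the features on which two verb classes differ."""
    a, b = VERB_CLASSES[class_a], VERB_CLASSES[class_b]
    return sorted(name for name in a if a[name] != b[name])


if __name__ == "__main__":
    print(differing_features("activity_of_motion", "achievement_of_motion"))
    print(differing_features("activity_of_motion", "adjectival_stative"))
```

Each comparison involving the activity verbs of motion yields exactly one differing feature, and in neither case is it the thematic tier, which is why the thematic tier alone cannot separate these classes.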

5. Subject postposing

In this section I use the framework of theta structure to examine the property of subject postposing in Mandarin. I will show that the acceptability of subject postposing is determined by the features of the thematic and action tiers. As noted in many works including Chao (1968) and Li-Thompson (1981), the subjects of certain intransitive verbs may be postposed. This is illustrated in (25) (from Chao 1968: 76) and (26) [(26 b) from Li-Thompson 1981: 518].

(25) a. ke lai le.
        guests arrive le
        'The guests have come.'
     b. lai ke le.
        arrive guests le
        'There have come some guests.'

(26) a. san zhi yang tao le.
        three cl. sheep escape le
        'Three sheep escaped.'
     b. tao le san zhi yang.
        escape le three cl. sheep
        'There escaped three sheep.'


Studies of postposed subjects in Chinese have typically focussed on the meaning shift associated with postposing or on the structural conditions under which postposing occurs. Chao (1968), for example, notes that the post-verbal position in the sentence is associated with new information, and that postposed subjects are understood as new and unexpected information, a situation which is reflected in Chao's translation of (25 b). Li-Thompson (1981) note that subject postposing in Mandarin is restricted to a small set of verbs of motion. We are now in a position to explain why postposing is restricted to these verbs, and why there is a difference of meaning between sentences with preposed and postposed subjects.

Consider the theta structure of the verbs of motion lai and tao. These verbs assert the willful movement of the subject, and thus assign the theta role of Theme in the thematic tier and the role of Agent in the action tier. Since these verbs assign only a single role in each tier and since these roles apply to the same entity, the roles are realized on a single argument. The role of Agent is generally associated in Mandarin with the grammatical function of subject. Thus yige ren lai-le 'one person came' and san zhi yang tao-le 'three sheep escaped' are perfectly acceptable. But when a verb has both Agent and Theme, Theme is generally realized as object. Thus, the NP associated with Theme can also occur as object.12 In this way, the argument of verbs of motion, as simultaneous Agent and Theme, can occur in either position.

Finally, the association of preverbal position with agency provides a possible explanation for the obligatory post-verbal position of the thematic arguments of the verb xia 'to fall' when used to predicate rain or snow. This is illustrated in (27).

(27) a. xia yu/xue le.
        fall rain/snow le
        'It rained/snowed.'
     b. *yu/xue xia le.
        rain/snow fall le

While yu 'rain' and xue 'snow' are theta assigned by xia as its Theme, they are assigned no role on the action tier, since the action of raining or snowing involves no volition on the part of the rain or snow (or any other entity). As noted above, theta assignment on the action tier distinguishes states from non-states, and thus the absence of an agentive role associated with a non-stative verb like xia is highly unusual. I suggest that Mandarin restricts the preverbal position, the position associated with the Agent, from being filled in the case of the weather use of xia as a way of marking the absence of theta assignment in the action tier. In this way, the weather use of xia contrasts with other uses of xia which do not require postposed subjects and also with non-agentive preverbal subjects of stative verbs, for which the unmarked status involves the absence of an action tier. The postposing requirement does not prevent sentences like (28), in which the weather argument is topicalized.

(28) yu xia de hen da.
     rain fall de very big
     'The rain fell heavily.'
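The account of postposing given in this section amounts to a prediction from two tiers to surface position. The following Python sketch restates that prediction for the single-argument verbs discussed here and for nothing else; the function name argument_positions and its boolean parameters are hypothetical, and topicalized arguments such as the one in (28) fall outside what it models.

```python
from typing import Set


def argument_positions(has_agent: bool, has_theme: bool,
                       stative: bool = False) -> Set[str]:
    """Predict where the sole argument of an intransitive verb may surface."""
    if stative:
        # Stative verbs lack an action tier; a preverbal subject is unmarked.
        return {"preverbal"}
    positions: Set[str] = set()
    if has_agent:
        positions.add("preverbal")   # Agent is associated with subject position
    if has_theme:
        positions.add("postverbal")  # Theme is associated with object position
    return positions


if __name__ == "__main__":
    print(sorted(argument_positions(has_agent=True, has_theme=True)))   # lai, tao
    print(sorted(argument_positions(has_agent=False, has_theme=True)))  # weather xia
```

The argument of lai or tao is at once Agent and Theme and so receives both positions, while the argument of weather xia has no action-tier role and is confined to post-verbal position.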

6. Concluding remarks

This study has demonstrated the complex interaction of phrase-structure rules, theta structure, and theta-role assignment. It has shown that phrase-structure configuration is not a direct realization of theta structure. Instead, language-specific rules of phrase structure restrict theta-role assignment, while at the same time language-specific rules of theta assignment influence phrase-structure configurations. Specifically, it has shown that in Mandarin, the obligatory transitivity of activity verbs is determined by a binary branching restriction in phrase structure (Huang 1982) that limits the realization of theta roles, and by rules of theta assignment that give special status to the Theme. The binary-branching restriction limits realization of theta roles in argument positions to two. While this rule often requires the suppression of theta roles, it also makes possible the assignment of a theta role to an argument position even when the NP to which the role is assigned is semantically weak and referentially unspecified. That is, it provides a syntactic slot for cognate objects. This phrase-structure rule interacts with the rules of theta assignment that require the assignment of Theme. Finally, it has shown how the individual tiers within the theta structure of verbs play a role in the phrase-structure configurations of sentences, influencing the argument structure and the position of arguments in surface strings.


Notes

1. The intransitive status of eat and speak in these sentences can be demonstrated by comparing them to (i) and (ii) in which a lexical object occurs.

   (i) Don't speak it./Don't speak English.
   (ii) When do we eat it?

   The sense of (i) and (ii) is quite different from that of (1) and (2) in a way which we would not expect if (1) and (2) were instances of strings with empty objects. In (i) and (ii) the verbs are directed towards some goal while a goal is entirely absent in (1) and (2). I will not be concerned here with the nature of the relationship of the transitive and intransitive lexical entries for verbs like speak and eat. For the purpose of this paper it is sufficient to acknowledge that the subcategorization of an object is optional in English but obligatory in Mandarin.

2. I use the term 'cognate object' to refer to an NP which functions as the syntactic object of the verb, but whose semantic content is entirely predictable from the verb. In most cases, the meaning of the cognate object is coterminous with that of the verb itself. Examples include shuo-hua 'speak-talk', hua-huar 'paint-pictures', and shui-jiao 'sleep-sleep'. For some NPs that function as cognate objects such as fan in the VP chi-fan 'eat', the NP has a specific reference (fan = 'rice') in some contexts, but when it is used as a cognate object it refers in a non-specific way to food in general. I am using the term "cognate object" in a different way from that of Chao 1968: 312, in which cognate objects are defined as "... an expression for (a) the number of times of an action, (b) its duration, (c) its extent, (d) its course of locomotion, or, less often, its destination ..."

3. In this way it is possible to preserve the Theta Criterion by the stipulation that it applies at the level of the tier. Jackendoff notes other problems with the Theta Criterion that cannot be resolved by the addition of tiers. For example, he notes that many verbs imply theta roles that they do not assign. This is a property of many Chinese verbs and I will discuss it in detail below.

4. In Jackendoff's framework, the roles associated with these tiers are not primitives but are composed of features from the conceptual structure of verbs. Labels such as "Source" and "Agent" are shorthand notations for bundles of conceptual features.

5. That the tense of the verb plays a role in conveying completion in English can be seen by comparing (15 b) with (i).

   (i) I was reading that book but I didn't finish it.

   When the past progressive is used as in (i), completion is not entailed, and it can be cancelled without any resulting incongruity. The point here is that in English, the properties of direct objects of activity verbs contribute to the entailment of completion in strings in which the verbs occur. In Mandarin, the properties of objects play no role in conveying completion. This may be one reason why there is no alternation between transitive and intransitive uses of activity verbs in Mandarin, and why there is such an alternation in English. In English, this distinction contributes to the interpretation of a string as completed or not. Intransitive uses of activity verbs are generally interpreted as imperfect and unbounded, while strings with transitive activity verbs may be interpreted as bounded if other features (involving tense and object properties) are present. In contrast, Mandarin activity verbs never entail completion. Therefore, there is no functional reason for a transitive-intransitive contrast in Mandarin. The unbounded meaning of the Mandarin example can sometimes be made clearer with careful selection of the English translation. Thus, when mai (3) dongxi is translated as 'to shop for things' rather than 'to buy things' the unbounded meaning of the Chinese VP is preserved.

6. The completion of activities in Mandarin can be conveyed, but not through aspect markers or objects. In Mandarin, activities are completed when they are compounded with achievement verbs. Thus, for example, while kan 'to read' is an activity, kan-wan, in which kan is compounded with wan 'to finish', means 'to read to the point of finishing'. For a more detailed discussion of resultative verb compounding in Mandarin see Ross (1990).

7. "Coverb" is a descriptive label, not a categorial one. The Coverb Phrase consists of a Coverb and an NP object and the Coverb Phrase occurs before the V + NP object constituent. The categorial status of Coverbs and the constituent structure of VPs containing a Coverb Phrase is not resolved. See Ross (1991) for a more complete discussion of the problem.

8. See Li (1990) for a discussion of Double-Object Verbs.

9. Parenthesized numbers refer to tone contours. The verbs mai (3) and mai (4) in Mandarin differ only in tone.

10. (21 a) probably contains serial VPs rather than subordination of one phrase to another.

11. Stativity is also reflected in the action tier. Stative verbs do not have an action tier, since the notions of volition and affected object are incompatible with stativity.

12. Pre- and post-verbal position is also associated with a difference between given and new information.

References

Bach, Emmon-Robert Harms (eds.)
  1968 Universals in linguistic theory. New York: Holt, Rinehart and Winston.
Chan, Marjorie-Thomas Ernst (eds.)
  1989 Proceedings of the third Ohio State University conference on Chinese linguistics. Bloomington, IN: Indiana University Linguistics Club.
Chao, Yuen Ren
  1968 A grammar of spoken Chinese. Berkeley: University of California Press.
Chomsky, Noam
  1981 Lectures on government and binding. Dordrecht: Foris Publications.
Chu, Chauncey
  1976 "Some semantic aspects of action verbs", Lingua 40: 43-54.
Cole, Peter-Jerry Morgan (eds.)
  1975 Syntax and semantics. New York: Academic Press.
Fillmore, Charles
  1968 "The case for Case", in: Emmon Bach-Robert Harms (eds.), 1-90.
Grice, Paul
  1975 "Logic and conversation", in: P. Cole-J. Morgan (eds.), 41-58.
Gruber, Jeffrey
  1965 Studies in lexical relations. MIT doctoral dissertation. Bloomington: Indiana University Linguistics Club.
Huang, C-T. James
  1982 Logical relations in Chinese and the theory of grammar. [MIT doctoral dissertation.]
Jackendoff, Ray
  1987 "The status of thematic relations in linguistic theory", Linguistic Inquiry 18: 369-412.
Li, Audrey Yen-hui
  1990 Order and Constituency in Mandarin Chinese. Dordrecht: Kluwer Academic Publishers.
Li, Charles-Sandra Thompson
  1981 Mandarin Chinese: A functional reference grammar. Berkeley: University of California Press.
Ross, Claudia
  1983 "On the function of Mandarin de", Journal of Chinese Linguistics 11: 214-246.
  1990 "Resultative verb compounds", Journal of the Chinese Language Teachers' Association 25.3: 61-83.
Smith, Carlota
  1989 "Event types in Mandarin", in: M. Chan-T. Ernst (eds.), 215-243.
  1991 The parameter of aspect. Dordrecht: Kluwer Academic Press.
Tai, James H.-Y.
  1984 "Verbs and times in Chinese: Vendler's four categories", Papers from the parasession on lexical semantics. Chicago: Chicago Linguistic Society, 289-296.
Vendler, Zeno
  1967 "Verbs and times", in: Linguistics in philosophy. Ithaca: Cornell University Press, 97-121.

On defining the Chinese compound word: Headedness in Chinese compounding and Chinese VR compounds1

Stanley Starosta, Koenraad Kuiper, Siew-ai Ng, and Zhi-qian Wu

1. Introduction

The question of what constitutes a compound word in Chinese languages can be approached in two ways. The first is to adopt the traditional Chinese philological definition, which supposes a compound word (in the great majority of cases) to be a word made up of two characters, just as the Chinese philological tradition supposes that an idiom consists of four characters. The second is to take the hypothesis of contemporary linguistics, which is to suppose that a compound word is a word consisting of two or more words. We take it that Chinese languages are natural languages subject to the same constraints as other natural languages, and to be analyzed in terms of the same procedures and terminology used by linguists in the analysis of other languages.

It is our contention that although the traditional Chinese definition of a compound word is used by many scholars working on Chinese word formation, the definition makes it difficult or impossible in some cases to give significant explanations of the facts of Chinese word formation because the definition obscures the distinction between various kinds of morphemes on the one hand and words on the other. We shall illustrate this contention by looking at two phenomena: headedness in Chinese compounds and so-called VR or resultative compounds in Mandarin. In both cases we will compare analyses which take the traditional Chinese view of compounds and views which utilize contemporary linguistic theories to see which definition of compounds provides the more compelling explanation of the phenomena. On the basis of our findings we will suggest that in the study of Chinese word formation care needs to be taken with the extent to which linguists rely on traditional analyses.

Why do this? In physics there are a number of abstract terms which have been developed over the last couple of millennia to account for the nature of the physical universe. They include, in the field of mechanics, terms such as "mass", "velocity", "acceleration". The definition of such terms has been a matter of evolution within physics and the terms do not now mean what they did even a hundred years ago. Their evolution has been the result of evolving theories about the physical universe and, no doubt, when physical theories again change, the terminology will change. This happens when better explanations result from newer theories. It is therefore necessary in linguistics no less than in physics to look constantly at terminology (and the theories which give rise to it) to assess whether that terminology arises from theories which provide the best explanation of linguistic phenomena. A realist interpretation of scientific theory implies that each change in the meaning of the theoretical constructs of a science genuinely changes what is known about the phenomena themselves.

It is our contention that traditional Chinese terminology in the field of word formation is militating against better, that is, more universal, explanations of the phenomena of Chinese word formation. It also makes for confusion in the way linguists communicate their findings to each other. Western linguists may believe that linguists who are working on Chinese languages are using the terms "word", "compound word", and "morpheme" with the same meaning that they have in western linguistics. But if this is not so, significant misunderstandings are likely to result.

Of course people are at liberty to define terms in any way they wish. The advantage of using traditional Chinese definitions of the linguistic properties of Chinese is that Chinese languages have the capacity thereby to appear to be very different from other languages. For example, it is often said that Chinese is unusual in that a single word is in many ways ambiguous, with one basic sense but many subsidiary ones. This is not surprising.
A character is a symbol in a writing system which has evolved over a very long period. There are a large number of characters and in the process of memorizing them, students are taught mnemonics to recall them, mnemonics which depend on the recognition of particular patterns and the association of each of these with one meaning which is then taken to be fundamental. Viewed from a western linguistic vantage point, characters may be ambiguous in many ways, both syntactically and semantically, because they can represent more than one word, i. e., they are potentially homographs. For a western linguist this is no surprise since homographs are relatively common even in languages with a phoneme-based writing system. Given the impossibility of learning to write in a language where every sense of every word in the language is represented by a unique ideograph, homographs must be very common in languages which use ideographic writing systems. The disadvantage of the Chinese view is therefore that it has the potential to obscure the properties of Chinese languages.

2. What is a compound word?

Structuralist theories in western linguistics do not always give clear answers to the question of what constitutes a compound word. Bloomfield (1933: 227) states that "compound words have two (or more) free forms among their immediate constituents". But Bloomfield (1933: 227) also states that "The gradations between word and phrase may be many: often enough no rigid distinction can be maintained". The reason for this is that both compound words and phrases are concatenations of free forms. However, compound words "exhibit some feature which, in their language, characterises single words in contradistinction to phrases" (Bloomfield 1933: 227). Such features include semantic noncompositionality (although Bloomfield points out that this property is also shared with idiomatic phrases), a stress pattern characteristic of words rather than phrases, sandhi characteristic of words rather than phrases, fixed word order, and grammatical features of selection. None of these, unfortunately, provides a totally clear way to distinguish compound words from phrases. But for our purposes it will be sufficient to suppose that compounds consist of two or more free forms, i.e., two constituents which are themselves able to function as words and therefore have syntactic categories. The resulting pairing must itself also be a word.

For comparison let us look in more detail at how a scholar influenced by the traditions of Chinese philological scholarship deals with the question of what is a compound. According to Chao (1968: 143-144), "A morpheme which can be uttered alone is free (F), and one which always occurs without pause with another morpheme in an utterance is bound (B). Therefore li [li2] 'pear' is free. ... Therefore taur [tao2 'peach'] is bound."2 The question then becomes, is the constituent of a word a form like li2 'pear' or a form like tao2 'peach'? Only if it is a li2 form, that is, a word, can we raise the question as to its syntactic category because syntactic word classes are established on the basis of syntactic distribution. A form which appears only inside a word has no syntactic distribution


Stanley Starosta, Koenraad Kuiper, Siew-ai Ng, and Zhi-qian Wu

or function, and therefore has no syntactic category. (This is not, of course, to say that bound forms such as suffixes cannot donate syntactic categories to the words of which they are constituents, as in the case of the English suffix -ation which donates to the words on which it is the last suffix the category noun.)

The difficulty in deciding whether a constituent of a word is itself a word is exacerbated by the traditional Chinese definition of "word". It is well-known that there is no good Chinese translation for the English word word. About the closest we can come is ci2 'An expression. Words; phrases; a part of speech. Tales; stories. A form of poetry' (Mathews 1968: 1031, no. 6971), and zi4 'A letter; a written character; a word' (Mathews 1968: 1025, no. 6942). In fact, Chinese characters encode monosyllabic morphemes, not words (DeFrancis 1989), but in practice, those who are influenced by traditional Chinese scholarly practice often regard a word as whatever corresponds to a single written character,3 and rarely raise the question of whether it is free or bound.4 As Charles Hockett puts it: "The early western students of Chinese, and the Chinese themselves until quite recently, perceiving the language through a haze of characters, saw utterances as rows of bricks, of uniform size and shape, each a single syllable and a single 'word', immutable, subject to no influence (or almost none) from the preceding and following bricks." (Hockett 1950: 70)

The problem with this assumption from a linguistic point of view is that, while Mandarin Chinese does have a number of true monosyllabic words, the stretch of speech corresponding to a written character is very often not a minimum free form. As Chao notes (1968: 145-146), the linguistic unit corresponding to a character is typically a morpheme bound to a preceding or following morpheme: "(3) (-)B(-): Start-Free or End-Free, but Not Both. The great majority of morphemes entered in a dictionary of single characters belong to this category. All numerals are of this type, ... most measures, ..., the cardinal directions, ..., the seasons of the year, ..., monosyllabic names." But notwithstanding this realisation that many morphemes are bound, Chao (1968: 145-146) allows for compound words to consist of two bound stems. "Root words. All these (-)B(-) forms are bound forms, since they are bound at least at one end. It is useful to distinguish here between those bound forms which will become free by the addition of an affix and those which occur only in combination with other root morphemes. The first kind consists of roots to form primary derived words or root words. Examples are: yiitz [yi3zi] 'chair', wahtz [wa4zi] 'sock,

The Chinese compound word


stocking' ... The other is a much larger class consisting of bound morphemes occurring in compounds. Examples are nan [nan2] 'man, male', neu [nü3] 'woman, female', lih [li4] 'strength', liueh [lüe4] 'outline, approximate', yau [yao2] 'rumor', jye [jie2] 'to knot, to conclude'."

Li and Thompson (1981) adopt a very similar view of compounding. They say that there is "a great deal of disagreement over the definition of compound. The reason is that, no matter what criteria one picks, there is no clear demarcation between compounds and noncompounds." (Li-Thompson 1981: 45) They nevertheless adopt the traditional Chinese definition of compounds: "we may consider as compounds all polysyllabic units that have certain properties of single words and that can be analysed into two or more meaningful elements, or morphemes, even if these morphemes cannot occur independently [i.e., as words] in modern Mandarin." (Li-Thompson 1981: 46)

There is an alternative to adopting traditional Chinese practice, and that is to allow the definition of terminology to arise from theory. Thus what constitutes a word or a compound becomes the outcome of theories about the nature of words and compounds. We will follow this methodological procedure.

3. Headedness in Chinese compounds

Many generative theories of compounding (Kuiper 1972, Williams 1981 a, Di Sciullo-Williams 1987, Lieber 1992) propose that words, including compound words, have heads. The head of a word has a number of attributes; the only one which will be of significance for what follows is that heads determine the syntactic category features of the word as a whole. For example, in the compound noun bluebird, it is the fact that bird is a noun which is responsible for the fact that bluebird is also a noun. There is a small class of exceptions, the so-called exocentric compounds such as redcap, where neither constituent is responsible for heading the compound. There are other odd cases such as the word outcome which do not seem to be headed. However, in general, words are headed, and they are headed in only one way. In English, for example, it is claimed, for example by Williams (1981 a) and Lieber (1981), that all the compounds which are headed are right-headed, that is, the right-hand constituent determines the syntactic category of the compound.


Simple compounds are binary compounds consisting of two words (Kuiper 1972).5 The structure of English simple compounds is as follows (Selkirk 1982: 14-15):

Nouns:          NN    millwheel, firetruck
                AN    high school, poor house
                PN    uprising, afterbirth
                VN    scrubwoman, pickpocket

Adjectives:     NA    heartbroken, color-blind
                AA    icy cold, deaf-mute
                PA    above-mentioned, underripe
                VA    diehard

Verbs:          NV    hand-made, spoon-feed
                AV    double-coat, sweet-talk
                PV    overdo, outlive
                VV    freeze-dry, drop-kick

Prepositions:   PP    into, onto

These examples clearly show the right-headed property of English simple compounds. Complex compounds, compounds which have other compounds as constituents such as a novel creation like riceflour bagel, are also right-headed in that the whole compound is a noun because bagel, which is a noun, is its head and, in turn, riceflour is a noun because flour, which is its head, is also a noun. They also show that English compounds have a binary structure.

In the general case, if compounds in all languages are binary, it is possible for them to be either left-headed or right-headed. In the case of English they appear to be, for the most part, right-headed. We might then hypothesize that compounds in all languages have binary structure, that they are headed, and that they are either uniformly left- or right-headed. This then becomes a set of hypotheses about universal grammar, the grammar which all languages have in common. It is conceivable that Chinese compounds are unheaded or, if headed, that their headedness is not uniform. This is worse than a null hypothesis because it predicts that Chinese is fundamentally unlike other natural languages. If we examine those words which would be termed compounds in traditional Chinese scholarship then, in the case of simple compounds,


i.e., of those compounds consisting of two and no more characters, it does appear that Chinese compounds are not headed. Their categorial composition seems for the most part random even using traditional Chinese syntactic categories (see, e.g., Huang, this volume).

(2)
Dominating category    Constituents

N                      NN, AN, VN, nN, MN, NA, AA, VA, HA, nA, NV, AV, VV, HV, nV, nn, NM, MM.
A                      NN, AN, VN, HN, NA, AA, VA, HA, nA, NV, AV, VV, HV.
V                      NN, AN, VN, AA, VA, HA, NV, AV, VV, HV, nV, NH, Vn, LL, KK.
H                      AN, VN, HN, DN, nN, NA, AA, VA, HA, JA, DA, NV, AV, VV, HV, DV, JV, NH, AH, VH, HH, nH, DH, HD, nn, LL, AJ, HJ.
n                      nn
L                      NL, DL, FPL, LL.
M                      NM, AM, nM, DM.
K                      NK, AK, KK, KN.
J                      NJ, AJ, VJ, HJ, nJ, JJ, DJ, VA, HA, NV, AV, VV, HV, DV, JV, JH, DH, HH, ND.

Abbreviations of categorial features which have been used in the structural descriptions of Chinese compounds are as follows (following Chao 1968): N - noun; A - adjective; H - adverb; V - verb; J - conjunction; L - localizer; M - measure; K - preposition; n - numeral; D - determiner; FPL - free place localizer; P - particle.

Notice, however, that not all categories appear wholly random; n, L, and M appear to be right-headed. So it is only the major lexical categories of noun, verb, adjective, and adverb which appear to be wholly unheaded while only conjunction among the other syntactic categories appears to be unheaded. If we look at complex compounds the situation becomes clearer. Using Chao's (1968) classification we find the following complex compounds:


(3)
a. Three-morpheme compounds
This is the most productive class of complex compounds. Chao (1968: 481) divides this group into eight sub-types:

i)    [N [N A N] [N]]    xiao3 shu4 dian3 "small number point, i.e., decimal point"
ii)   [N [A] [N N N]]    xian2 ya1 dan4 "salty duck egg, i.e., salted duck egg"
iii)  [N [V N] [N]]      lu4 yin1 ji1 "record-sound machine, i.e., tape recorder"
iv)   [N [N N N] [N]]    fan1 bu4 xie2 "sail cloth shoes, i.e., canvas shoes"
v)    [N [V V V] [N]]    jiang4 luo4 san3 "descending-dropping umbrella, i.e., parachute"
vi)   [N [A A A] [N]]    suan1 la4 tang1 "sour hot soup, i.e., soup with vinegar and pepper"
vii)  [N [V V A] [N]]    fang4 da4 jing4 "let large optical instrument, i.e., magnifying glass"
viii) Other compounds
      [N [V] [N N N]]    chao2 niu2 rou4 "stir fried beef"
      [N [V] [N A N]]    ban4 huang2 gua1 "mixed yellow melon, i.e., cucumber salad"

b. Four-morpheme compounds

i)    The greatest number of four-morpheme compounds is of the type 2 + 2.
      [N [N A N] [N A N]]    bai3 huo4 gong1 si1 "department store"
      [N [A A A] [N N N]]    gong1 gong4 shi4 ye4 "public career"
ii)   Of the 3 + 1 types, the 3 is more likely to be 2 + 1 than 1 + 2.
      [N [N [N n N] [N]] [N]]    jiu3 long2 shan1 ren2 "Nine-dragon Mountain Man, i.e., pen name of a painter"
      [N [N [A] [N n N]] [N]]    hong2 shi2 zi4 hui4 "the Red Cross"
iii)  Type 1 + 3 is rare except with certain listable versatile morphemes, often called prefixes, occurring in titles, terms of address, and so on.
      [N [A] [N [V V V] [N]]]    fu4 yan2 jiu1 yuan2 "assistant research fellow"

c. Longer compounds

i)    Five-morpheme compounds
      [N [N [A A A] [N N N]] [N]]    gong1 gong4 qi4 che1 zhan4 "bus station, bus stop"
ii)   Six-morpheme compounds
      [N [N [N A N] [N]] [N [A] [N N N]]]    zhong1 guo2 yu3 xin1 zi4 dian3 "Chinese Language New Dictionary"

d. Telescoped compounds
      [N [N A N] [N [N [N A N] [N N N]] [N [N V N] [N A N]]]]    zhong1 guo2 cheng2 tao4 she4 bei4 chu1 kou3 gong1 si1 "China National Complete Plant Export Corporation"

These are all right-headed. It would appear, then, that where Chinese compounds are headed, they are right-headed. So what is to be done with all of the apparently unheaded and left-headed simple-compound cases? Let us look again at the traditional definition of a compound. First, many characters in Chinese are syntactically ambiguous, as we said earlier. If Chinese compounds are right-headed then in the case of nouns, verbs, adjectives, and adverbs, it may be that the right-hand constituent is itself ambiguous, thus making it impossible to say which of its syntactic categories is responsible for the category of the whole compound word being what it is. Second, it may be the case that not all two-character words are compound words. One or both of the characters may represent a bound stem, as Li and Thompson (1981) suggest. In the first case this would make the word like the English word cranberry, where the first morpheme is not a possible word. The second would make the word like the English word conceive, in which neither of the two syllables is a possible English word but where we might still wish to say that the word consisted of two morphemes. In other words, it may be that many of the words which are not right-headed are also not compound words.
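
The right-headedness hypothesis discussed above can be illustrated with a small sketch. It is only an illustration under stated assumptions: the class names `Word` and `Compound` and the function `category` are hypothetical, introduced here for exposition, and the category labels assigned to each morpheme simply follow the bracketings cited in the text.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Word:
    """A free form with a syntactic category (hypothetical helper type)."""
    form: str
    category: str  # e.g. "N", "V", "A"


@dataclass(frozen=True)
class Compound:
    """A binary compound; either constituent may itself be a compound."""
    left: "Node"
    right: "Node"


Node = Union[Word, Compound]


def category(node: Node) -> str:
    """Right-headed rule: a compound takes the category of its right-hand constituent."""
    if isinstance(node, Word):
        return node.category
    return category(node.right)


# English examples mentioned in the text.
bluebird = Compound(Word("blue", "A"), Word("bird", "N"))
sweet_talk = Compound(Word("sweet", "A"), Word("talk", "V"))
riceflour_bagel = Compound(
    Compound(Word("rice", "N"), Word("flour", "N")), Word("bagel", "N")
)

# Three-morpheme example (3a-i): [N [N A N] [N]] xiao3 shu4 dian3.
xiao3_shu4_dian3 = Compound(
    Compound(Word("xiao3", "A"), Word("shu4", "N")), Word("dian3", "N")
)

for item in (bluebird, sweet_talk, riceflour_bagel, xiao3_shu4_dian3):
    print(category(item))
```

The sketch covers only the headed, right-headed cases; exocentric forms such as redcap, and two-character words containing bound stems, are exactly the cases it does not model.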


There are further possibilities. The number of available Chinese characters to represent words is large but finite. It is as likely in Chinese as it is in English that homonyms develop over time so that the same character can do service for more than one morpheme or word. For example, one of the two characters may be a bound stem which is homonymous with a free morpheme represented by the same character. For example, the adjectives which end in the suffix -able such as comfortable, would if they were written in characters have the suffix written with the same character as the independently existing adjective able. However, there is a clear synchronic distinction in current English between the suffix and the independently occurring adjective whatever their historical relationship may have been. More radically, some words represented by two characters may be homonyms of other words represented by the same two characters but the word may be a single morpheme. A partly parallel case in English is the word window which was a thousand years ago a compound word consisting of the word wind and the word eye. If it were written in characters these two elements would still be apparent in the written form of the word, whereas to current English speakers even the wind part of window appears to be only accidentally related to wind as an independent word. Let us therefore suppose that all or at least the great majority of compounds in Chinese are right-headed; how then are the individual nonheaded or not right-headed words to be analysed? Since the mnemonics used to teach characters tend to present a single underlying sense as the basic sense of each character, however implausible the semantic route to the homonymous senses represented by the same character may be, it is likely to be difficult for scholars influenced by traditional Chinese scholarship to see homonymy where it exists. 
A conventional dictionary is useless for this purpose because Chinese dictionaries are dictionaries of characters, and the glosses they give for characters do not distinguish glosses for the character when it is used alone (if it ever is), glosses obtained by subtracting the meaning of its fellow morpheme from words in which it occurs, and glosses for the word as it was used in Classical Chinese hundreds of years ago. In that sense, Chinese dictionaries are etymological morpheme dictionaries that do not distinguish the modern morpheme from its etymon.6 Because of the importance of orthography in Chinese education, it is also a frustrating task to attempt to determine whether a form is free or bound by asking someone who has been socialized in such a system. In our experience, such a person will normally claim that almost any given


character can be used as a free form, frequently citing Classical Chinese forms to prove their point. The only Chinese dictionary we know of which seems to show any understanding of the difference between free and bound forms and between synchronic and diachronic stages of a language is the Dictionary of Spoken Chinese published by the U.S. War Department in 1945. What a linguist working in Chinese needs is the linguistic native-speaker intuitions of a Chinese speaker, but of one who is also illiterate in Chinese. Failing the presence of such a person, we may examine other areas of Chinese morphology to see if our hypothesis, that not all of what are claimed to be compounds are in fact compounds, can find further support.

4. VR forms

Mandarin VRs ("result compounds") (Thompson 1973, Lu 1977, Gebauer 1980, Huang 1984) give further evidence that traditional analyses and modern ones based on them may be providing incorrect analyses of Chinese word formation. In analyzing VR forms, three questions need to be answered. First, is a VR form a word? Second, what lexical class does it belong to? Third, what kinds of elements is it composed of?

There is general agreement that VR forms are themselves words. VR forms are free in the sense that they are not limited in the choice of elements which can precede or follow them, and they are minimal in the sense that they cannot be further decomposed into parts which are minimal and free. VRs also satisfy further Bloomfieldian criteria in having distributions like other morphologically simple words, and in being indivisible (Bloomfield 1933: 180-184).7 There is also general agreement regarding de2 and bu4, elements which appear to be exceptions to the indivisibility criterion since they can appear between the V and the R. The consensus, going back at least as far as Y. R. Chao, is that these elements are infixes (Chao 1968: 159; cf. Thompson 1973: 364, Her 1990: 200, Lin 1990: 1).8

With regard to the second question, there seems to be no disagreement about the fact that a VR form is a verb, and in terms of its external distribution a verb occurring in the same kinds of environments as morphologically simple verbs, including, for example, the ability to be negated by mei2 and perfected by le.


There is also general agreement on the final question, namely that the first element of a VR form is a V. The remaining question is that of the nature of the R element of a VR form. If the R is a verb, as assumed by traditionally-based analyses, then a VR form is composed of two free words and should be regarded as a compound. We will show that, in a number of cases, the R is not a verb, and not a word. That being so, a VR form of this type is not composed of two free forms, and thus not a compound.9 To do this we will provide the same kind of analysis as in the previous section, contrasting traditionally-based analyses of VR, particularly those of Lin (1990) and Li (1992), with one which does not make the traditional assumption that the R element is a verb but instead assumes that it is a derivational suffix.

Lin (1990) is a careful, thorough, and competent Lexical Functional Grammar analysis of VR forms. However, it adopts the traditional Chinese position on what constitutes a compound, as the following quotation illustrates (Lin 1990: 1). "Among the traditionally categorized classes of compounds in Mandarin, there is a group of compounds often referred to as the Verb-Complement (V-R) compounds which is structurally [V1-V2]V in general. Semantically, both concatenated members of the compounds have predicative functions; and, their meanings are generally that the second verb describes the state of the subject, the object, or the event as a result of the action or process described by the first verb". Lin assumes without question that VR forms must be compounds composed of two verbs, and her study explains how the properties of the VR verb can be derived from the predicate-argument structures of its two constituent verbs. However, if the R is not a verb but rather a suffix, then there is no second verb and no second predicate-argument structure to work from. Therefore the properties of VRs cannot be a function of the predicate-argument structure of a second verb.

Lin's assumptions are shared by others, for example, One-soon Her (Her 1990: 125-126): "This semantic classification may also be relevant to the description of the morpholexical process of resultative compounding, where an action verb, [ACTIVE + PROCESS -], which may be either transitive or intransitive, is joined by an [ACTIVE -] verb, i.e., either a state verb or a process verb, to form an action-process verb, [ACTIVE + PROCESS +]." Once again, the author never questions the assumption that the R element is a verb and that a VR form is a compound. So, what evidence is there that the R is in fact a suffix and not a verb or even a bound verb stem? The evidence comes from problems and


anomalies that appear when the assumption that it is a verb is pursued in a generative framework such as Lexical Functional Grammar. In particular, even though Lin, for example, defines V-R compounds as "structurally [V1-V2]V" (Lin 1990: 1), in many cases the second element of the "compound" is either a) different in meaning from the homophonous/homographic verb occurring alone, or b) not a verb at all synchronically, or c) not even an identifiable word synchronically (cf. Thompson 1973: 363). Consequently, the forms are not compounds by any standard linguistic definition, and analyses which attempt to derive them by the conflation of two free forms are untenable.

To cite some specific examples: Lin derives yao3zhu4 'bite and hold' (Lin 1990: 84, 140) from two input verbs, yao3 'bite' and zhu4 'reside'. However, it takes a lot of semantic ingenuity to see any connection between the meaning of the verb 'reside' and the meaning of 'persisting resulting state' that is shared by verbs suffixed by zhu4.10 In this situation, we might take refuge in the fact that compounds are typically not semantically compositional (cf. Chao 1968: 278), and state that the -zhu4 of yao3zhu4 really is the verb 'reside', but that the compound has undergone a semantic shift. The contrary view, that -zhu4 is a suffix in resultative verbs, is supported by the fact that its meaning in these verbs is perfectly regular; the R suffix -zhu4 derives VR verbs with the meaning 'persisting state resulting from the action V' from their verb stems, and speakers derive perfectly regular new forms by this same pattern. Thus the compounding analysis may preserve etymological information, but the suffixation analysis captures information about the speaker's synchronic competence, the regularity of the sense of the suffix, and its productivity. If this is a relatively simple conclusion to reach, why have linguists working with Chinese classifications of word formation not drawn it?
Mainly, we believe, because of preconceptions about the nature of Chinese word formation. In the cases where the suffixal nature of a form is so obvious that it becomes very difficult to maintain the equation of a character with a word or bound stem, suffixes are allowed, but only with a new definition that limits the range of application of this term. For example, Chao restricts suffixes in Chinese as follows (Chao 1968: 219): "A suffix in Chinese is an empty morpheme, mostly in the neutral tone, which occurs at the end of a word and characterizes its grammatical function." Why should a suffix be defined as "semantically empty"? Certainly it is not generally true that suffixes in other languages are "semantically empty". If a suffix such as -zhu4 is linguistically distinct from the verb written with the same character, and never occurs as a free form, how is one to


decide its properties as a word? One common method is to look up the meaning the character represented in the Chinese language of a thousand years ago. The other procedure is a kind of subtractive inference grounded in the equation, WORD = CHARACTER. In a VR form, V and R are both words, since they are written with characters. Some of the Rs correspond to free verbs (or adverbs), so all Rs are verbs (or adverbs). The meaning of the posited R verb is derived by subtracting the meaning of V from the overall meaning of VR, and the predicate-argument structure of this hypothetical verb is then extrapolated from the meaning so derived. This is, for example, how Lin arrives at the meaning of R elements and at their contribution to the meaning of the whole VR form. "It seems to us that the predicative functions in V-R compounds must be established on the SEMANTIC properties of the two verbal items involved". Even the argument structures of V-R compounds in which both components are intransitive seem to be determined in this way. "In the following sections, I will introduce the notion of lexical semantic defaults (Dowty 1988) and classify verbs into a small set of semantic classes according to the lexical semantic defaults. Then, based on the semantic classes, the predicative functions of the R members in V-R compounds will be predicted and the possible thematic structures of the compounds will be constructed." (Lin 1990: 46, 80) A study such as Lin's, then, is circular, first performing the operation VR - V = R, and then commuting the operation: VR = V + R to "predict" the meaning of its constituents.

"I will discuss the compounds with the following assumptions: 1) V-R compounding involves combinations of two lexical entries including the two thematic structures and the semantic information contained, 2) the first member of the compounds is HEAD and the second one COMPLEMENT, [1] and 3) the thematic structure of the HEAD member will unify with that of the COMPLEMENT member to produce a new thematic structure.
1. The morphological operation: V1 → V1-V2 ...
2. The combination of thematic structures: a. V1 + V2 → V1-V2 ..." (Lin 1990: 80, 113-114)

Given the questionable ontological status of the R verbs identified in this procedure, it is not surprising that there are sometimes problems in setting them up, e. g., "While it is a simple matter to determine the form


class of a compound as a whole, the form classes of its constituents are not always clear or determinate and consequently sometimes their syntactical relation is unclear or ambiguous. This is a reservation we have to make in all the headings and subheadings of the syntactic compounds detailed below." (Chao 1968: 367, 366) Lin notes the difficulty in predicting the meanings of VRs from their components. "Based on past works, it is quite difficult to establish a rule-governed interpretation process for V-R compounds, because the subcategorization frames of the compounds are not formed by straightforwardly concatenating the frames of the V member and the R member. The [chou2]V1-[bai2]V2 in (20) a and the [chou2]V1-[si3]V2 in (20) b share the same V member; in another pair, the [chou2]V1-[si3]V2 in (20) b and the [nan2]V1-[si3]V2 in (20) c share the same R member. The argument structures of these compounds are not decidable by the V member or the R member alone." (Lin 1990: 1-2, 47)

This unpredictability is again partly due to the assumption that both components are verbs, so that it is hard to predict the meaning of nan2si3 'extremely difficult' from nan2 'difficult' and si3 'die'. However, when the second element is regarded as the separate derivational suffix -si3 which synchronically has nothing at all to do with dying and which adds the meaning 'extremely', the process turns out to be quite regular. At several points Lin appears close to recognizing the fact assumed in the Dictionary of Spoken Chinese analysis that a single character may represent two quite different linguistic elements, a free verb with one meaning and a suffix, indicated by a preceding hyphen, with another not necessarily related meaning.12 At least this seems to be an obvious interpretation of her tabulation (Lin 1990: 9, slightly modified):

shang4    'ascend'     '-up'
xia4      'descend'    '-down'
jin4      'enter'      '-in'
chu1      'exit'       '-out'
qi3       'rise'       '-up'
hui2      'return'     '-back'
guo4      'pass'       '-over'
kai1      'open'       '-away, apart'
long3     'gather'     '-together'

Lin's example following the table, zou3 'walk', kai1 'open', zou3kai1 'walk away', is an excellent illustration of our point. zou3kai1 has nothing


to do with opening anything, but all VRs ending in the suffix -kai1 seem to refer regularly to a separation between objects resulting from the action of the V element of the VR form. Admittedly, not all of Lin's examples are of this type. Thus VRs with -jin4 do probably share a meaning of entering something. However, a unified analysis attempting to capture the synchronic semantic regularities would have to treat all of these forms as suffixes, resorting to etymology to account for any semantic relation between the R and a corresponding free verb which is written with the same character. Note that this is not to claim that the meanings of VR words derived by suffixation are 100% predictable, since derivation, like compounding, is typically not fully productive or compositional. It claims rather that not formally tying the meaning of suffixes to their etymological sources results in a far higher degree of generalization than is gained in more traditionally based analyses.

In support of such an analysis the following examples show how it might be carried further. The verbs and suffixes in this list, which are written with the same character in Chinese orthography, must be distinguished according to linguistic criteria. They are, in our view, homonyms. Glosses for the meanings as verbs and as suffixes are based partly on the Dictionary of Spoken Chinese.

          Verb gloss               Derivational suffix gloss
cha2      'look up'                'examine'
dao3      'topple'                 'down, over'
dao4      'arrive at'              'to, until; succeed'
diao4     'fall, drop'             'off'
fu2       'be convinced'           'over'
guang1    'bare'                   'completed, exhausted'
hao3      'good; finished'         'finished'
ji2       'be urgent'              'extremely'
jin4      'enter'                  'into; ahead; to the end'
zhu4      'reside'                 'persisting result'
kai1      'open'                   'apart, away'
liang2    -                        'better; improved'
liao3     ('enough')               'be possible'
ni4       'bother, bore'           'bored from doing'
qi3       'rise'                   'begin'
shang4    'ascend; go to'          'up; together'
si3       'die'                    'dead; extremely'
tou4      'air passes'             'thoroughly, completely'
tong1     'pass through'           'thorough effect'
wai1      'askew'                  'askew, lopsided, awry'
ya3       'low, quiet, mute'       'hoarse, husky'
zou3      'walk, go'               'away, through, from'

Our derivational analysis is an extension of the one proposed by Thompson (1973), and thus might be thought to be open to Lin's criticism of this approach (Lin 1990: 37): "Thompson's approach leaves the correlation between the predicative functions of complement members and the behaviours of their corresponding compounds unexplained." However, in our view, the R element is not a complement and has no predicative function. We claim, instead, that a derivational analysis, embedded in the framework of a universal theory and divorcing itself from the vagaries of etymology, is able to capture more and better generalizations about the relations between VRs and their components than one which resorts to etymology.

The word-and-paradigm analysis we assume again might seem to be open to another of Lin's criticisms (Lin 1990: 33): "Thus, if treating them as the compound verbs without internal structure, then how to explain the relation between the syntactic behaviours of this composition type and the predicative functions of the complement members?" The answer to Lin's rhetorical question is that derivational rules relate the semantic, syntactic, and phonological properties of source words to those of derived words directly via Derivational Rules (DR), analogical patterns of word formation (Starosta 1988: 90-96), without ever having to refer to internal structure. As an example, the following Derivational Rule would account for Lin's zou3kai1 and other -kai1-suffixed forms:

(6)   DR-1   [+V, -telc]  →  [+V, +telc, +sprt, kai1] ]

This rule can be read as "corresponding to any atelic (non-result) verb, there may be a telic verb identical to the source verb but ending in kai1 and differing in meaning by the addition of a component of 'separation' ". Since this is a lexical derivation rule, it has the conventional properties of derivation, allowing some semantic deviation and lexical gaps.
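
The reading of DR-1 just given can be sketched as a simple mapping between lexical entries. This is a minimal illustration, not the lexicase formalism itself: the `Verb` class, the `dr1_kai1` function, and the encoding of features as a boolean plus a set of strings are all assumptions made here for exposition. The sketch also ignores the semantic deviation and lexical gaps that a real derivational rule allows.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Verb:
    """A verb entry: phonological form plus a few features (hypothetical encoding)."""
    form: str
    telic: bool                      # stands in for [+telc] / [-telc]
    features: frozenset = frozenset()


def dr1_kai1(source: Verb) -> Optional[Verb]:
    """Sketch of DR-1: an atelic verb may correspond to a telic verb ending in kai1.

    The derived entry is identical to the source except that it is telic,
    ends in kai1, and carries an added 'separation' component (sprt).
    No internal structure is recorded on the output.
    """
    if source.telic:
        return None  # the rule's input condition [-telc] is not met
    return Verb(
        form=source.form + "kai1",
        telic=True,
        features=source.features | {"sprt"},
    )


zou3 = Verb("zou3", telic=False)
zou3kai1 = dr1_kai1(zou3)
print(zou3kai1)
print(dr1_kai1(zou3kai1))
```

Because the output of the rule is telic and the rule accepts only atelic inputs, a second application to zou3kai1 yields nothing in this sketch.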


Note that there is no internal structure referred to here at all, and none is needed. The rule is a statement of an analogical pattern of correspondence between two sets of verbs, a pattern which can be used in recognizing or creating new VR forms. Note, too, that this formalization automatically accounts for the non-recursivity of the process, something that Lin's approach does not seem to handle. The rule applies only to nonresult verbs, but the forms it produces are result verbs, and thus not eligible to undergo the rule.

Derivational Rules are established for a language by first independently classifying all verbs syntactically, without regard for whether they are morphologically simple or complex. The linguist then looks for subsets of words under one branch of the classification that have a fairly regular one-to-one syntactic, semantic, and phonological correspondence to one another, then writes rules that will associate those branches. The result for Chinese will be Derivational Rules such as DR-1. As in the case of Lexical Functional Grammar these rules typically refer to argument roles, though the case relations of lexicase dependency grammar are grammatically based, unlike the thematic relations used in Lexical Functional Grammar and Government and Binding theories.

There are further advantages in the analysis which we propose, having to do with the argument-linking properties of VR forms. In addition to Lin (Lin 1990: 109-111), Chang (Chang 1991: Chapter 2) and Li (Li 1990: 178, 190) also claim that complex internal structures must be posited for VR forms. These structures typically require head-feature percolation and may include, for example, multiple occurrences of the same theta-role which are marked as identical to each other ("fusion") or which do not get assigned externally at all ("suppression"), or different theta-roles which get assigned to the same external actant (cf. Baker 1989).
For example Chang observes (Chang 1991: Chapter 2, Section 2.5) that "complex verbs, on surface and in terms of grammatical functions, do not differ from simple verbs — most of them take a subject and an object. What makes them distinct from simple verbs is their complex internal thematic structures." She has no real explanation as to why her complex verbs fit into the same pre-existing classes that can be established for simplex verbs. Given the power of her eclectic descriptive system, there is certainly nothing that would formally require this result. In lieu of an explanation, she gives a list of language-specific ad-hoc stipulations, which she refers to as "principles", in order to reconcile her output to her observations. Powerful and complex analyses such as the ones proposed by Lin, Chang, and Li are only valid until such time as an alternative analysis



which accounts for the same patterns without assuming such power and complexity is produced. A lexicase analysis such as that of Ng (1992) shows that the same phenomena which Lin and Chang can only account for with complex internal structures fall out naturally from a constrained dependency analysis which posits only five case relations.

If we compare, for example, Li's (1990) treatment of thematic relations in VV compounds with our and Ng's treatment of these as suffixed verbs, we will see that much of the additional machinery required to account for VV verbs (on the assumption that the R element is a V) is not necessary on the assumption that Rs are suffixes. The assumption that the second element is a verb leads to analyses which are in breach of major constraints on linguistic theory such as the Theta Criterion, which requires that "Each argument bears one and only one theta role and each theta role is assigned to one and only one argument" (Chomsky 1982: 36). Since it is not manifest that R elements are verbs, the breach of the Theta Criterion provides theory-internal support for the analysis of R as a suffix.

Li (1990), for example, makes claims following from the R element being analyzed as a verb as follows: "A remarkable property of these compounds is that there seem to be more theta-roles available than overt NP arguments" (Li 1990: 178). However, this is only remarkable if the R element of VR forms is in fact a verb. Our analysis claims that it is not, so there are no extra arguments left over to account for. Naturally, if Li is correct, then "The question is how the Theta Criterion is satisfied, which requires that each theta-role is assigned to an argument" (Li 1990: 178). In our analysis, the number of nominal arguments required by a VR form is specified by the derivation rule itself.
The number could be the same as the number of arguments allowed by the V-element of VR, or more or less than that number, but never more than five (Starosta 1988: 126).¹³ Each nominal argument is projected from one and only one of the maximum of five possible theta-role slots available in the case frame of the derived VR verb, so no violation of the Theta Criterion is formally possible.

Li's problems with the Theta Criterion, which arise from supposing the R element to be a verb, lead to the introduction of a number of "independently justified assumptions" (Li 1990: 178). In our analysis, no such special assumptions are required. The analysis we propose is consistent with the strict constraints of the lexicase theory without any additional stipulations.

Li's assumptions include "three particular assumptions which will be used in the rest of the paper. Of the three, theta-identification is the mechanism for 'reducing' the number of theta-roles a V-V


compound actually assigns to its arguments. The other two, a structured theta-grid and the head-feature percolation, interact so as to correctly restrict the patterns in which the theta-roles of the two components of such a compound are identified and assigned" (Li 1990: 178). By contrast, in a lexicase dependency grammar, the inventory of case relations ("theta-roles") is limited to five grammatically determined universal roles, and there are only two possible subject case roles, agent for transitive verbs and patient for intransitives (Starosta 1988: 181). Li's observations about the prominence of "Theme" in object-incorporation verb compounding follow from the other axiomatic assumption of the lexicase case-relation system, Patient centrality: every verb takes a Patient in its case frame, and Patient is the central pivotal case relation, the one which is the scope for other complement-case relations (Starosta 1988: 128). Thus there is no need for a "structured theta-grid".

Li (1990: 181) also claims "That the relevant features of the head will be maintained throughout its derivation projection is widely assumed in linguistic practice. Therefore, I will follow this convention by assuming that the head of a compound word determines the fundamental properties of the compound. In fact, I will assume a (probably) strong version, which requires that the theta-role prominency of the head must be strictly maintained in the theta-grid of the compound." If, however, the R element of VR compounds is a suffix, then the V-constituent is the only constituent which is a lexical item with its own case frame ("theta-grid"), and the properties of the derived V-R form are a function of the properties of the V-. There is therefore no need for a mechanism to subordinate the lexical properties of the -R component, since it has no lexical properties to suppress.

There is further support for supposing that R elements are suffixes, and it comes from the first section of this paper.
If Chinese compounds are right-headed, then VR compounds constitute a class of counter-examples to this claim, since most scholars who suppose them to be compounds suppose that it is their left-hand V constituent which is the head of the compound. If, however, the R element is a suffix, then in many generative treatments such a suffix should be the head of the derived word of which it is the right-hand constituent, since word structure must be uniformly either left-headed or right-headed. This again provides theory-internal evidence that what are often termed, following traditional Chinese analyses, "VR compounds" are not compounds but derived words.
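The formal constraints appealed to in this section can be made concrete in a small sketch. The following Python fragment is only an illustration under stated assumptions, not the authors' formalization: the content of DR-1 is not reproduced here, the names `Case`, `Verb`, and `derive_result_verb` are hypothetical, and the three case-relation labels other than Agent and Patient are given as recalled from Starosta (1988) and should be checked against that source. The example form da3po4 is a common Mandarin VR form chosen for illustration. The sketch models a case frame as a set drawn from five case relations (so no relation can occur twice), requires a Patient in every frame, picks the subject case relation by transitivity, and lets the derivation apply only to nonresult verbs.

```python
from dataclasses import dataclass
from enum import Enum


class Case(Enum):
    # Patient and Agent are named in the text; the other three labels
    # are an assumption recalled from Starosta (1988).
    PAT = "Patient"
    AGT = "Agent"
    LOC = "Locus"
    COR = "Correspondent"
    MNS = "Means"


@dataclass(frozen=True)
class Verb:
    form: str
    frame: frozenset          # a set of Case members: each at most once
    is_result: bool = False   # True for derived VR (result) verbs

    def __post_init__(self):
        # Only the five case relations may appear, so a frame can never
        # hold more than five slots.
        if not all(isinstance(c, Case) for c in self.frame):
            raise ValueError("case frame may contain only Case members")
        # Patient centrality: every verb takes a Patient.
        if Case.PAT not in self.frame:
            raise ValueError("every case frame must contain a Patient")

    @property
    def subject_case(self):
        # Agent is the subject of transitives, Patient of intransitives.
        return Case.AGT if Case.AGT in self.frame else Case.PAT


def derive_result_verb(base, suffix, derived_frame):
    """Hypothetical stand-in for a rule like DR-1: nonresult -> result."""
    if base.is_result:
        # The output class is not in the input class, so the rule
        # cannot reapply to its own output (non-recursivity).
        raise ValueError("rule applies only to nonresult verbs")
    # The rule itself states the case frame of the derived verb.
    return Verb(base.form + suffix, frozenset(derived_frame), is_result=True)


# Illustrative use with da3 'hit' and the result element -po4 'broken'.
da3 = Verb("da3", frozenset({Case.AGT, Case.PAT}))
da3po4 = derive_result_verb(da3, "po4", {Case.AGT, Case.PAT})
```

Because the frame is a set, a doubled theta-role cannot even be represented, which is the sense in which no violation of the Theta Criterion is formally possible on this kind of analysis; and because the derived verb is marked as a result verb, a second application of the rule is rejected without any extra stipulation.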


5. Conclusion

We have shown that traditional Chinese definitions of compounds and words, and contemporary analyses which take those definitions for granted, can be a hindrance to gaining an understanding of Chinese compounds in general and of VR compounds in particular. We have also shown that applying standard linguistic criteria to these constructions results in a significantly different analysis, one which distinguishes between synchronic linguistic competence and etymology, and which conforms to a radically constrained grammatical framework, thus making a credible claim to explanatory adequacy.

Notes

1. An earlier version of this paper was presented at the First International Conference on Chinese Linguistics, University of Singapore, in June 1992. We would like to thank Lillian Meei-chin Huang for careful reading and comments, Tsai-fa Cheng for helpful discussions and criticism in connection with our views on the impact of traditional analyses on current Chinese linguistics, and Ke-sheng Wang for checking some of the pinyin orthography.
2. All romanized forms cited or quoted in the paper have been converted to pinyin orthography, and tone numbers have been added.
3. Since a character represents a single syllable, this supports the commonly accepted characterization that Chinese is a monosyllabic language. For an informed and linguistically well-founded description of the Chinese writing system, see DeFrancis (1988).
4. There are a few recognized exceptions to this, such as the noun suffix -zi and the two syllables of such disyllabic forms as hu2die2 'butterfly', xi1shuai4 'cricket', and dong1xi 'thing', but the general assumption is that a character represents a word.
5. The distinction between simple and complex compounds is made by Kuiper (1972) on the grounds that the structural properties of complex compounds in English are more constrained than those of simple compounds.
6. Regrettably, this tradition continues in some modern dictionary projects. Thus the computerized dictionary project at the Academia Sinica in Taiwan employs well-trained linguistic scholars and sophisticated computer expertise to make yet another version of traditionally-based dictionaries. However, the situation is apparently about to change. According to Chu-ren Huang (personal communication), the CKIP dictionary now being used and developed at the Academia Sinica in Taiwan is word-based rather than character-based. It distinguishes free from bound forms and has morphological rules

for deriving suffixed forms in -xing and -du as well as determiner-measure compounds. Because of technical considerations, it is not consistent in providing distinct lexical entries for homophonous words, but they are given separate entries when they belong to different grammatical categories.
7. Note, however, Bloomfield's caveat that "None of these criteria can be strictly applied: many forms lie on the border-line between bound forms and words, or between words and phrases" (Bloomfield 1933: 180-184).
8. Chao has an example from Cantonese showing that the rule is conditioned by the phonological shape of the word rather than by some particular kind of internal lexical structure (Chao 1968: 163): "In Cantonese the first syllable can be repeated alone, even though bound. ... This is possible in Cantonese even if it does not make good sense, as Nee dhoo-mu-dhoongoh? 'Are you hungry?', where dhoongoh means literally 'stomach [dhoo] hungry [ngoh]'."
9. This definition has been loosened somewhat to accommodate cases in which compounds are composed of stem forms, words with their inflectional affixes missing. This situation could also conceivably arise in Chinese if Starosta is correct in his claim, which has so far gone uncontested, that Mandarin Chinese has case inflection (Starosta 1985). However, we will not further consider this possibility here.
10. The Dictionary of Spoken Chinese does cite one example which may represent the more direct etymological source of this verb: 'j < (I) to stop, stay, live. Sometimes causatively: j