Frequency Dictionary of Chinese Words
 9783110815177, 9789027926272

Table of contents :
Foreword
TABLE OF CONTENTS
Introduction
PART ONE: ALPHABETICAL ORDER
A
B
C
D
E
F
G
H
J
K
L
M
N
O
P
R
S
T
W
Y
PART TWO: DECREASING ORDER
First Three Hundred
Second Three Hundred
Third Three Hundred
Fourth Three Hundred
Fifth Three Hundred
Sixth Three Hundred
Seventh Three Hundred
Eighth Three Hundred
Ninth Three Hundred
Tenth Three Hundred
APPENDIX


FREQUENCY DICTIONARY OF CHINESE WORDS

LINGUISTIC STRUCTURES
Directed by Alphonse Juilland

First Series: Chinese

FREQUENCY DICTIONARY OF CHINESE WORDS

by ERIC SHEN LIU
National Chung-Hsing University

1973
MOUTON
THE HAGUE • PARIS

Copyright 1973 in The Netherlands. Mouton & Co N.V., Publishers, The Hague. No part of this book may be translated or reproduced in any form by print, photoprint, microfilm, or any other means, without written permission from the publishers.

Printed in The Netherlands by Mouton & Co., Printers, The Hague.

FOREWORD

The Collection Linguistic Structures inaugurated by the present volume parallels the Collection The Romance Languages (Mouton and Co.), which aims to provide structural studies in depth of the phonology, grammar, and vocabulary of the French, Italian, Portuguese, Rumanian and Spanish languages. Exploiting methods and techniques described in some detail in the Introduction to our Frequency Dictionary of Spanish Words (The Hague 1964), these studies aim to reduce, by means of scientific sampling, the textual manifestations of each language to a representative "corpus" of some 25,000 sentences totalling about 500,000 words, a "lexical universe" divided into five "lexical worlds" consisting respectively of dramatic, fictional, essayistic, periodical, and technical literature. Punched in coded form on IBM cards, such representative samples are then transferred onto electromagnetic tapes and further "compressed" by means of exhaustive scanning and weighting into smaller inventories and hierarchies of phrases, words, morphemes, syllables, etc. These weighted inventories serve as a basis for extensive phonologic, morphologic, syntactic, and lexical studies of the languages sampled. In turn, such structural studies are to provide sound foundations for further comparative, historical, and typological investigations, whose validity and significance are secured by the identity of the methods of sampling, scanning, and weighting, and by the similarity of the conceptual frameworks in terms of which the structures are grasped and described. For additional information on these studies, the reader is referred to the frequency dictionaries of Spanish (1964), Rumanian (1965), and French (1969) published by Mouton. The Collection Linguistic Structures has set for itself similar objectives which it aspires to attain by similar means.
Chiefly designed to receive parallel studies devoted to such major languages as English, German, Indonesian, Japanese, Russian, Tibetan, etc., the Collection is eventually to accommodate similar publications devoted to other languages, e.g., Czech, Dutch, Finnish, Hungarian, Quechua, Turkish, Malay, etc. Although the new Collection strives to parallel its Romance forerunner, certain differences ought to be mentioned. The representative samples, for instance, had to be reduced by about one half, to some 12,000 sentences approximating 250,000 words. Whatever statistical arguments may be invoked in its favor, the main reason for this decision was financial, a considerable reduction in costs being achieved at the expense of a relatively small increase in the margin of error for the structural statements based on weighted inventories extracted from samples half the size. As a consequence, "basic vocabularies" had to be proportionally reduced from 5,000 words to 3,000 words: indeed, variations in the membership and structural properties of the lower statistical classes are too considerable in hierarchies extracted from samples of no more than 250,000 words. This reduction is partly compensated for by our decision to append to each dictionary a list containing all words in the sample which failed to qualify for membership in the basic vocabulary. Thus, the body of the present dictionary, which consists of the 3,000 Chinese words with coefficients of at least 1.15, is followed by an Appendix listing the 6,335 words with coefficients lower than 1.15. Dr. Eric Liu's Frequency Dictionary of Chinese Words is to be accompanied shortly by Dr. Hiroshi Miyaji's Frequency Dictionary of Japanese Words. Should circumstances prove favorable, studies now in progress devoted to English and Russian will follow, it is hoped, without too much delay.

ALPHONSE JUILLAND

TABLE OF CONTENTS

Foreword

INTRODUCTION

1. THE SAMPLE
   1.1 Dramatic Literature
   1.2 Fictional Literature
   1.3 Essayistic Literature
   1.4 Periodical Literature
   1.5 Technical Literature

2. THE CODES
   2.1 Textual Problems
   2.2 Systematic Problems
       2.21 Nominals
       2.22 Predicatives
       2.23 Modifiers
       2.24 Nominal definers
       2.25 Adverbials
       2.26 Verbal Auxiliaries
       2.27 Prepositions
       2.28 Conjunctions
       2.29 Markers
       2.210 Particles

3. THE COEFFICIENTS
   3.1 Frequency
   3.2 Dispersion
   3.3 Usage

4. THE RESULTS
   4.1 Frequency
       4.11 Frequency Classes
       4.12 Frequency Halves
       4.13 Frequency Median
       4.14 Frequency Average
   4.2 Dispersion
       4.21 Dispersion Classes
       4.22 Dispersion Halves
       4.23 Dispersion Median
       4.24 Dispersion Midpoint
       4.25 Dispersion Average
       4.26 Dispersion Ranges
   4.3 Usage
       4.31 Usage Classes
       4.32 Usage Halves
       4.33 Usage Median
       4.34 Usage Average

5. THE DICTIONARY

6. THE APPENDIX

PART ONE: Alphabetical Order
   A, B, C, D, E, F, G, H, J, K, L, M, N, O, P, R, S, T, W, Y

PART TWO: Decreasing Order
   First Three Hundred
   Second Three Hundred
   Third Three Hundred
   Fourth Three Hundred
   Fifth Three Hundred
   Sixth Three Hundred
   Seventh Three Hundred
   Eighth Three Hundred
   Ninth Three Hundred
   Tenth Three Hundred

APPENDIX

INTRODUCTION

The present Dictionary contains the 3,000 most frequently used words extracted with the help of electronic and mechanical devices from a representative sample of Modern Chinese, consisting of about 20,000 sentences totalling close to 250,000 words, by means of techniques devised by Alphonse Juilland and described in more detail in the Introduction to the Frequency Dictionary of Spanish Words.[1]

1 THE SAMPLE

In establishing the "lexical universe" from which sentences were selected at random, punched on IBM 5081 cards, transferred to electromagnetic tapes, and scanned and counted by means of a Burroughs 5000 computer, we were guided by the principles outlined in the Introduction to the above-mentioned dictionary. Our first purpose was to determine a sample that would strike the right balance between unity (homogeneity) and diversity (heterogeneity): on the one hand, the sample had to be homogeneous in the sense that it represents more than a mechanical agglomeration of data borrowed from different periods, from different dialects, or from mutually incompatible levels of style; on the other hand, the sample had to be sufficiently diversified to be truly representative of the language as a whole, not only of a particular genre or style.

The task of securing satisfactory samples of Chinese literature was not facilitated by the relative scarcity of holdings available in American libraries, which compelled us to broaden the perspective so as to encompass a longer period than originally planned, covering about half a century, roughly from 1910 to 1960. However, such limitations did not affect our decision to limit the lexical universe to about 250,000 words instead of 500,000, as is the case for the dictionaries published in the Collection The Romance Languages. Juilland's own estimates suggest that the margin of error is increased only by a small fraction if the size of the samples is halved to a quarter million words.[2] Consequently, to ensure its representative character, we divided the universe into five equal worlds of about 50,000 instead of 100,000 words, each representing a particular "genre" as follows:

(1) Fictional literature, consisting of novels and short stories;
(2) Dramatic literature, consisting exclusively of plays;
(3) Essayistic literature, consisting of essays, letters, memoirs, etc.;
(4) Periodical literature, consisting of news, editorials, advertising, etc. in dailies, weeklies, and monthlies;
(5) Technical literature, consisting of writings on agriculture, engineering, medicine, etc.

[1] Cf. A. Juilland and E. Chang-Rodriguez, Frequency Dictionary of Spanish Words (The Hague 1964).
[2] Cf. A. Juilland, P. M. H. Edwards, and I. Juilland, Frequency Dictionary of Rumanian Words (The Hague 1965).

1.1 Dramatic Literature

1. Chinese New Literature Collections: Collections of Drama, Hu Shr, Tyan Han, and others (Lyang You Book Co., Shanghai, 1935).
2. Chwen Gwei He Chu: Yu Jwen-Jr (Committee of Chinese Cultural Publications, Taipei, 1958).
3. Brothers: Syu Yu (Ye Chwang Book Co., Shanghai, 1947).
4. Hwang Tyan Hou Tu: Yu Chu (Jang Jung Book Co., Taipei, 1958).
5. Wei-Syin Bridge: Yu Chu (Jang Jung Book Co., Taipei, 1958).
6. Works of Tyan Han: Tyan Han (Central Book Co., Shanghai, 1933).
7. Dai Di Hwei Chwen: Chen Bai-Chen (Cultural Book Co., Shanghai, 1948).
8. Works of Tsau Yu: Tsau Yu (Kai Ming Book Co., Shanghai, 1951).
9. Works of Sya Yan: Sya Yan (People's Publishing Co., Peking, 1942).
10. Dang Di Jr Hwa: Gwo Mwo-Rwo (Chwen Ye Book Co., Shanghai, 1949).

1.2 Fictional Literature

1. Chinese New Literature Collections: Mau Dwen, Bing Syin, and others (Lyang You Book Co., Shanghai, 1935).
2. Modern Novels: Ge Syan-Ning (Committee of Chinese Cultural Publications, Taipei, 1952).
3. The Complete Works of Lu Shwen (Peking People's Publisher, Peiking, 1958).
4. Collections of Far-Eastern Literary Works: Jou Shu-ren and others (Commercial Press, Shanghai, 1923).
5. Feng Syan Syan: Syu Yu (Ye Chwang Book Co., Shanghai, 1948), 2 volumes.
6. Lu Hwo: Syu Yu (Ye Chwang Book Co., Shanghai, 1950).
7. Kung Shan Ling Yu: Syu Di-Shau (Commercial Press, Shanghai, 1926).
8. Eight Years in China: Chen Ji-Ying (Chung Gwang Literary Publishing Co., Taipei, 1949).
9. Rickshaw Boy: Lau She (Cultural Life Publishing Co., Shanghai, 1941).
10. Midnight: Mau Dwan (People's Literature Publishing Co., Peiking, 1957).
11. The Family: Ba Jin (People's Literature Publishing Co., Peiking, 1957).
12. Stars, Moon and Sun: Syu Su (Gau-Ywan Book Co., Shanghai, 1953).
13. Spring: Syu Su (Kai Ming Book Co., Shanghai, 1929).
14. Autumn: Syu Su (Kai Ming Book Co., Shanghai, 1931).
15. Selections from Lau She (Chwan Chou Book Co., Shanghai, 1937).
16. Hai Yan: Jeng Jen-Dwo (New China Book Co., Shanghai, 1932).
17. Selections from Yu Da-Fu's Novels (Wan Li Book Co., Shanghai, 1921).
18. Selections from Ding Ling (People's Literature Publishing Co., Peiking, 1932).
19. In the City: Jang Tyan-Yi (Lyang You Book Co., Shanghai, 1935).

1.3 Essayistic Literature

1. Chinese New Literature Collection: Collection of Essays (Lyang You Publishing Co., Shanghai, 1935), 4 volumes.
2. New View of Life: Lwo Jya-Lwen (Commercial Press, Shanghai, 1942).
3. The Mainland After Its Fall: Yau Gwo-Shwei (Committee of Chinese Cultural Publications, Taipei, 1958).
4. My Trip through the Taiwanese Mountains: Cheng Jau-Syung (Committee of Chinese Cultural Publications, Taipei, 1958).
5. Great Figures in Chinese History: Jang Chi-Ywen (Committee of Chinese Cultural Publications, Taipei, 1955), 2 volumes.
6. Post-War Japan: Chen Gwo-Ting (Committee of Chinese Cultural Publications, Taipei, 1955).

1.4 Periodical Literature

1. Central Daily (Taipei, 1958-1960).
2. Syin-Sheng Daily (Taipei, 1958-1960).
3. Democratic Forum (Taipei, June, 1957-December, 1958).
4. The Rambler (Taipei, October, 1955-December, 1957).

1.5 Technical Literature

1. Scientific Gazette of the Republic of China: First Collections, 4 volumes; Second Collections, 2 volumes, Li Syi-Wwo (ed.) (Committee of Chinese Cultural Publications, Taipei, 1955, 1958).
2. Modern Medicine: Wu Jing (ed.) (Committee of Chinese Cultural Publications, Taipei, 1952), 2 volumes.
3. Modern Engineering: Line Hung-Sywen (ed.) (Committee of Chinese Cultural Publications, Taipei, 1952), 2 volumes.
4. Modern Agriculture: Jau Lyan-Fang (ed.) (Committee of Chinese Cultural Publications, Taipei, 1953).

From the above sources, sentences were sampled in such varying numbers as to approximate as closely as possible the desired amount of 50,000 words per world. This was achieved by taking into consideration the number of volumes per genre and by estimating the average number of pages per volume, the average number of sentences per page, and the average number of words per sentence.[3] The sampled sentences were punched on IBM cards by postscribing to each word the appropriate functional code, then transferred onto electromagnetic tapes. Frequency dictionaries were compiled for each genre by scanning the five samples separately, with the help of programming routines devised for this purpose. The five separate dictionaries were then merged on an alphabetical matrix and the coefficients of frequency, dispersion, and usage were mechanically calculated. The 3,059 words with a coefficient of usage higher than 2.00 were selected for the Basic Chinese Vocabulary and appear in the two parts of this volume; the remaining words, whose coefficient of usage is lower than 2.00, are listed in the Appendix.
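The sample-size estimate described in this section can be illustrated with a short sketch. It is not the procedure actually programmed for the Dictionary: the class name, the method names, and all figures in the example are hypothetical, and only the target of 50,000 words per world and the four estimated quantities come from the text.

```python
import math
from dataclasses import dataclass

TARGET_WORDS_PER_WORLD = 50_000  # target per genre, as stated in the text


@dataclass
class GenreEstimate:
    """Estimated size of one genre's source material (all values are estimates)."""
    volumes: int               # number of volumes available for the genre
    pages_per_volume: float    # average number of pages per volume
    sentences_per_page: float  # average number of sentences per page
    words_per_sentence: float  # average number of words per sentence

    def available_sentences(self) -> float:
        # Estimated number of sentences in all volumes of the genre.
        return self.volumes * self.pages_per_volume * self.sentences_per_page

    def sentences_needed(self, target_words: int = TARGET_WORDS_PER_WORLD) -> int:
        # Sentences required to reach the target word count, rounded up.
        return math.ceil(target_words / self.words_per_sentence)

    def sampling_fraction(self, target_words: int = TARGET_WORDS_PER_WORLD) -> float:
        # Share of the available sentences that must be drawn at random.
        return min(1.0, self.sentences_needed(target_words) / self.available_sentences())


# Hypothetical figures, chosen only to make the arithmetic easy to follow.
example = GenreEstimate(volumes=10, pages_per_volume=200,
                        sentences_per_page=15, words_per_sentence=12.5)
print(example.sentences_needed(), round(example.sampling_fraction(), 4))
```

A genre with few volumes or short sentences yields a larger sampling fraction, which is why the sources were sampled "in such varying numbers".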

2 THE CODES

A word count aims to establish which words occur in a representative sample; how many times each word occurs in the sample (frequency); and how evenly or unevenly each word's occurrences are dispersed in the sub-samples (dispersion). To this end, one must know which forms are occurrences of the same word and which are occurrences of different words, a task which raises difficulties at both textual and systematic levels: in the text, they involve decisions designed to resolve instances of ambiguous or indeterminate segmentation; in the system, they focus on instances of ambiguous or indeterminate identification.

[3] Cf. Juilland and Chang-Rodriguez, § 1.123.
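As a rough illustration of what such a word count records, the sketch below counts already segmented words separately in each of the five sub-samples and merges the results into an alphabetical matrix, one row per word and one column per genre. The function names and the toy sentences are hypothetical; the coefficients of dispersion and usage are deliberately not computed here, since their definitions belong to a later section.

```python
from collections import Counter

# The five "lexical worlds" named in the text.
GENRES = ("dramatic", "fictional", "essayistic", "periodical", "technical")


def count_words(sentences):
    """Count occurrences of each word in one sub-sample.

    `sentences` is an iterable of sentences, each a list of segmented words.
    """
    counts = Counter()
    for sentence in sentences:
        counts.update(sentence)
    return counts


def merge_counts(per_genre):
    """Merge per-genre counts into {word: [f_dramatic, ..., f_technical]}.

    Rows are in alphabetical order; a word absent from a genre gets 0.
    """
    vocabulary = sorted(set().union(*(per_genre.get(g, Counter()) for g in GENRES)))
    return {w: [per_genre.get(g, Counter())[w] for g in GENRES] for w in vocabulary}


# Toy sub-samples (hypothetical), using transcriptions that appear in the text.
samples = {
    "dramatic": [["shan1", "hen3", "gau1"], ["hen3", "hau3"]],
    "fictional": [["hen3", "mang2"]],
}
matrix = merge_counts({g: count_words(samples.get(g, [])) for g in GENRES})
for word, row in matrix.items():
    print(word, row, "total frequency:", sum(row))
```

The row for each word shows at a glance whether its occurrences are concentrated in one genre or spread across several, which is the information a dispersion coefficient summarizes.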

2.1 Textual Problems

In the text or chain, the question is whether certain sequences of adjacent morphemes must be interpreted conjunctively or disjunctively, as parts of one and the same word or as constituents of different words. As in other languages, most instances of ambiguous segmentation in Chinese involve compound words and functional words: should we interpret conjunctively mu4-tou2, shu4-lin2-dz3, chr2-tang2, kan4-jyan4, kan4-de-jyan4, etc., or disjunctively, mu4 tou2, shu4 lin2 dz3, chr2 tang2, kan4 jyan4, kan4 de jyan4, etc.? Correct solutions for such problems presuppose a theory of the word in general and a theory of the Chinese word in particular. Unfortunately, despite valuable contributions by Chao[4] and Halliday,[5] no such theory is yet available.[6] Although it would hardly be appropriate to discuss such a general question in so particular a context, we must at least indicate the framework within which instances of ambiguous segmentation were solved in compiling the present Dictionary.

Problems of word segmentation arise in languages in which the status of the word is indefinite: formal or grammatical criteria are either inapplicable (indeterminacy) or in conflict (ambiguity), whereas substantial or phonetic criteria are weak and do not physically mark the syntagmatic discreteness of intermediate grammatical units. By intermediate units, roughly words, we understand minimal syntactic constituents which are generally but not necessarily larger than morphs and smaller than phrases or clauses.

Chinese belongs to the end of the traditional typological scale where one finds languages analytic in nature and isolating in tendency, which rely predominantly on morphosyntactic and syntactic rather than morphologic means in order to express the relations which hold between the terms of linguistic propositions; that is, languages which preferably express grammatical meaning through particles or functional words (morphosyntactic) and through word order (syntactic) rather than through affixes and flexives (morphologic). As for creating new means for expressing the terms of grammatical propositions (lexical meanings), analytic and isolating languages mostly resort to composition, a syntactic mode, rather than derivation, a morphologic mode.

In many languages, the minimum syntactic constituents or morphemes are not clearly marked by physical, sensorially verifiable prosodic or suprasegmental features: word stress is weak and the expected one-to-one correlation between stresses and lexical words is low, with fewer stresses than words per construction. Furthermore, syllabic ictus may be hesitant and the syllable a weakly defined unit, characterized by a low measure of correlation with words. Moreover, various phenomena of syntactic phonetics such as sandhi (liaison, elision, or linking) contribute to blurring the physical boundaries of the grammatical word. Because of the heavy reliance on syntactical modes of grammatical expression, word order is not free in such languages and even though this does not necessarily lead to agglutination, it restricts permutation as well as the substitutional and distributional versatility which contributes to the autonomy of the word and facilitates the testing of its syntagmatic discreteness on formal grounds.

Fortunately, the syllable is a clearly defined unit in Mandarin Chinese, where it coextends with the morpheme[7] and tends to support word segmentation in the sense that every word boundary is a syllable boundary (though not vice-versa, of course), with few or no word boundaries cutting across syllables to dissociate nuclei from margins by agglutinating codas to the following onsets or onsets to preceding codas.

[4] Cf. Y. R. Chao, "The Logical Structure of Chinese Words", Language 22.4-13 (1946).
[5] Cf. M. A. K. Halliday, "Grammatical categories in Chinese", Transactions of the Philological Society (1956).
[6] Cf. A. Juilland, Structural Relations (The Hague 1961). We have consulted, however, the unpublished manuscript The Concept of Word, by the same author.
[7] Cf. Y. R. Chao, Mandarin Primer (Cambridge 1948), p. 33.

Instances of ambiguous word segmentation are raised in Chinese by particles, the functional words which represent the morphosyntactic mode of grammatical expression or function as substitutes for certain lexical words. The term "character" has been chosen to reflect the identity in the Chinese word dz4, which is the name both of the graphic unit in the script and of the linguistic unit of which this is a written symbol. If the character is to be recognized as a grammatical unit, the word must exhibit internal structure. The problem of word structure is complex and several theories have been proposed. Halliday pointed out that "character is what is talked about; the word is what is talked with".[8] He suggested that Chinese compounds are composed of two elements, S and G (specific and general). Thus kou3 "opening" (G) and men2 "door" (S) constitute the word men2-kou3 "doorway". In applying this method, the analyst presupposes at least some knowledge of the lexical meaning of each element to determine which is S and which is G. Unfortunately, many compound nominals cannot be analyzed this way, e.g. words like ji2 chi4 "machine" and dz4 ran2 "nature", or words which take suffixes, e.g., mai4-dz3 "wheat" and mu4-tou2 "wood". A classification of characters into "free" and "bound", such as the one suggested by Chao,[9] has certain pedagogical advantages, but it is impossible to proceed directly from the free/bound system to the elements of word structure, since, in establishing the class by manipulation of these elements, the question as to whether the form is simple or compound and the classification of constituents as free or bound are irrelevant.
Both Chao and Halliday advocate the "inner approach", which defines words synthetically, relative to the properties of their constituent morphs. For us, words are stretches of morphs between word boundaries,[10] and the problem is how to determine which morph boundaries are also word boundaries and which are not. Chinese morphs are of three kinds: free, free/bound, and bound: free morphs are called roots; free/bound morphs are called particles; bound morphs are called partials. Whether a morph is free, bound, or free/bound can be established in terms of primary and/or secondary criteria.

Primary or structural criteria are of two kinds: functional, which test the privileges of occurrence of morphs relative to sentences, whether restricted or unrestricted; and distributional, which test the degree of adhesion of morphs relative to their immediate neighbors, whether disjunctive or conjunctive. Restrictions on the morphemes' privileges of occurrence relative to sentence positions, i.e., initial, medial, and final, signify synthetic concreteness, hence conjunction; lack of such restrictions signifies analytic discreteness, hence disjunction. Separability of morphs by means of an insert signifies analytic discreteness, hence disjunction; inseparability signifies synthetic concreteness, hence conjunction.

Secondary or prosodic criteria are of two kinds: accentual, which test whether a given morph can bear the stress or not; and tonal, which test whether the characteristic tone of a given morph can be neutralized or not. A morph's capacity of bearing the stress signifies analytic discreteness, hence disjunction; lack of such a capacity signifies synthetic concreteness, hence conjunction. Capacity of neutralizing a morph's tone signifies synthetic concreteness, hence conjunction; lack of such a capacity signifies analytic discreteness, hence disjunction.

[8] Cf. Halliday, op. cit.
[9] Cf. Chao, The Logical Structure.
[10] Cf. J. H. Greenberg, Essays in Linguistics (Chicago 1957).

Relative to these criteria, three categories of morphs can be recognized:

1) Roots, which are characterized by a maximal degree of independence (or by a minimal degree of dependence) inasmuch as the four criteria signal analytic discreteness.

2) Partials, which are characterized by a maximal degree of dependence (or by a minimal degree of independence) inasmuch as the four criteria signal synthetic concreteness. Partials are assigned to two sub-classes: prefixes, whose privileges of occurrence are prohibited sentence-finally; and suffixes, whose privileges of occurrence are prohibited sentence-initially.

3) Particles, whose intermediate status is characterized by a limited degree of dependence/independence, insofar as the primary criteria are in conflict: the functional criterion signals synthetic concreteness (restricted occurrence), i.e. inseparability, whereas the distributional criterion (insertion) signals analytic discreteness, i.e. separability. Particles are assigned to two sub-classes: proclitics, whose privileges of occurrence are restricted sentence-finally and whose function is to introduce or modify a following morph or construction; and enclitics, whose privileges of occurrence are prohibited sentence-initially and which govern or modify a preceding morph or construction.

These classifications can be summarized as follows:

                                       Primary                   Secondary
                                Functional  Distributional   Accentual  Tonal
Free morphs (roots)                 +             +              +        +
Free/Bound morphs (particles)       -             +
Bound morphs (partials)             -             -              -        -
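The classification can be restated as a small decision procedure. The sketch below is only an illustration under stated assumptions: each criterion is encoded as a boolean in which True stands for "signals analytic discreteness" (+) and False for "signals synthetic concreteness" (-); the names `Criteria` and `classify_morph` are hypothetical; and, since the text characterizes particles by the conflict of the primary criteria alone, the secondary criteria are ignored for that category.

```python
from typing import NamedTuple


class Criteria(NamedTuple):
    """Outcome of the four tests; True = analytic discreteness (+)."""
    functional: bool      # occurrence unrestricted as to sentence position
    distributional: bool  # separable from its neighbor by an insert
    accentual: bool       # able to bear the stress
    tonal: bool           # tone cannot be neutralized


def classify_morph(c: Criteria) -> str:
    if all(c):
        return "root"       # all four criteria signal discreteness
    if not any(c):
        return "partial"    # all four criteria signal concreteness
    if not c.functional and c.distributional:
        # Assumption: only the conflict of the primary criteria is decisive.
        return "particle"
    return "unclassified"   # a combination the text does not discuss


print(classify_morph(Criteria(True, True, True, True)))      # root
print(classify_morph(Criteria(False, False, False, False)))  # partial
print(classify_morph(Criteria(False, True, False, False)))   # particle
```

Combinations that the text does not mention, such as a morph that is functionally free but inseparable, are reported as "unclassified" rather than forced into one of the three categories.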

Constructions, or sequences of morphs, can be segmented into words by distinguishing morph boundaries that are also word boundaries from those that are not, relative to the structural properties of the morphemes preceding and/or following each boundary. Boundaries which link roots with other roots, roots with proclitics, enclitics with roots, and enclitics with proclitics are interpreted disjunctively, the adjacent morphemes being assigned to different words. Boundaries which link roots with suffixes and prefixes with roots are interpreted conjunctively, the adjacent morphs being assigned to the same word: prefixes are conjoined with the immediately following morpheme and suffixes are conjoined with the preceding morpheme, as parts of the same word. This technique gives satisfactory results, except for the boundaries which link the constituents of compounds. On the basis of functional criteria, these ought to be interpreted disjunctively and assigned to different words, the constituents' privileges of occurrence relative to the sentence being unrestricted; however, the conjunctive interpretation is imposed by the distributional criterion, as the constituents linked by such a boundary cannot be disjoined by insertion; furthermore, secondary criteria confirm the conjunctive interpretation, as only one of the adjacent constituents can be stressed or neutralized.
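The boundary rules stated in this paragraph can be sketched as follows. This is a minimal illustration, not the routine used in compiling the Dictionary. The function names and category labels are hypothetical; boundary types that the text does not list explicitly (for example, proclitic followed by root) are assumed here to be disjunctive; and compounds, which the text treats as an exception, are not handled.

```python
def is_word_boundary(left_kind: str, right_kind: str) -> bool:
    """Return True if the morph boundary is also a word boundary.

    Kinds: "root", "prefix", "suffix", "proclitic", "enclitic".
    """
    # Conjunctive: a prefix joins the following morph,
    # and a suffix joins the preceding morph.
    if left_kind == "prefix" or right_kind == "suffix":
        return False
    # Assumption: every other boundary is disjunctive.
    return True


def segment(morphs):
    """Group (form, kind) pairs into words; joined morphs are hyphenated."""
    words, current, previous_kind = [], [], None
    for form, kind in morphs:
        if current and is_word_boundary(previous_kind, kind):
            words.append("-".join(current))
            current = []
        current.append(form)
        previous_kind = kind
    if current:
        words.append("-".join(current))
    return words


# mai4-dz3 "wheat" is cited in the text as a word that takes a suffix.
print(segment([("mai4", "root"), ("dz3", "suffix"), ("hau3", "root")]))
```

Treating compounds correctly would require the distributional test (separability by insertion) for each pair of adjacent roots, which cannot be read off the category labels alone.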


2.2 Systematic Problems

In the system, the problem is to assign word variants to invariants and invariants to classes, which presupposes that instances of lexical, morphologic, and syntactic homonymy are correctly interpreted. In view of its isolating character, Chinese offers no instances of morphologic homonymy, and most instances of lexical homonymy are resolved by the difference of the tonal superscripts which follow the transcription of each root. The very few instances of total homonymy (both segmental and suprasegmental) were resolved by means of ad hoc symbols postscribed to each word's transcript. The major problem is raised in Chinese by the numerous instances of syntactic homonymy conditioned by the isolating character of a language which does not mark the function of words morphologically, by means of appropriate affixes and/or flexives, as is generally the case in inflexional languages. These can be resolved by distributional means, a task not facilitated by previous analyses of the Chinese word system, which do not explicate with sufficient clarity the formal criteria underlying the various word classes they postulate.

Our interpretation postulates 10 classes of Chinese words: nominals, predicatives, modifiers, nominal definers, verbal modifiers, verbal auxiliaries, prepositions, conjunctions, markers, and particles, which compares with 9 classes recognized by Chao and 8 by Halliday. In the following paragraphs, Chinese word classes are distributionally characterized and, whenever appropriate, compared with the corresponding classes in Chao's and Halliday's interpretations. Each class is exemplified by a number of characteristic members.

2.21 Nominals

A nominal is a word that can occur in the environment yi1 x ___, where x is any single word. Words which satisfy this test correspond to nouns in Chao's and Halliday's classifications, e.g. ping2 gwo3 "apple", hai2 dz3 "child", etc. Halliday gives no specific definition for this class, whereas Chao defines a noun as "a syntactic word which can be placed in apposition with a D-AN compound",[11] where D is a determinative, and A an auxiliary noun.

2.22 Predicatives

Predicatives are words which can occur in the environment hen3 ___ or neng2 ___, e.g. hau3 "to be good", mang2 "to be busy", mei3 li4 "to be beautiful", etc. Chao considers members of this class as verbs or as adjectives; he defines a verb as "a syntactic word which can be modified by the adverb bu and can be followed by the phrase suffix le".[12] He also specifies that "since Chinese adjectives are verbs, they form predicates without requiring a verb to be".[13] Halliday reckons with the concept of "verbal group", for which he gives no definition, but distinguishes three kinds of verbs: free, auxiliary, and postpositive verbs. Halliday's free verbs correspond to our auxiliaries, his postpositives to our suffixes.

[11] Cf. Chao, Mandarin Primer, p. 46.
[12] Op. cit., p. 47.
[13] Op. cit., p. 52.

2.23 Modifiers

Modifiers occur in the environment ___ N. They modify nominals and, in contrast to nominal definers, have verbal homonyms (i.e., the same shape can occur in verbal function). For example, gau1 in gau1 shan1 "high mountain" has a homonym in constructions such as shan1 gau1 "the mountain is high", in which the same shape is determined by the postnominal position; also shan1 hen3 gau1 "the mountain is very high", but never *hen3 gau1 shan1. Chao classifies these words as adjectives and stipulates that ". . . since Chinese adjectives are verbs, they form predicates without requiring a verb to be . . .".[14]

2.24 Nominal definers

Members of this class constitute a small set: jei4 "this", nei4 "that", nei3 "which", mei3 "each", ge4 "various", bye4 "other", etc. Characterized by the same environments as modifiers, nominal definers differ from modifiers proper insofar as they have no verbal homonyms. Chao calls them determinatives and specifies that they ". . . consist of numerals, demonstratives, interrogatives, and a few other bound words".[15]

2.25 Adverbials

Words which occur in the environment ___ V1 are classified as adverbials, if they are not nouns in the distributional sense established above. Notice that V1 symbolizes verbs which occur in the environment hen3 ___, e.g., hen3 hau3 "very good". Both Chao and Halliday assign members of our class of adverbials to a broader class of "adverbs" which also contains auxiliaries.

2.26 Verbal Auxiliaries

We classify as verbal auxiliaries all words which occur in the environment ___ V2 and are not nouns in the sense defined above. Notice that V2 symbolizes words which occur in the environment neng2 ___. Examples are jyou4 "right away", neng2 "can", ke3 yi3 "may", yau4 "about to", kwai4 "about to", etc. Our auxiliaries correspond to Chao's adverbs, most of which are bound forms: "Monosyllabic adverbs are bound words in Chinese . . ,".1