Phonetic Variation and Acoustic Distinctive Features: A study of four general American fricatives [Reprint 2021 ed.] 3112414551, 9783112414552, 9783112414569


English Pages 166 [164] Year 1964


PHONETIC VARIATION AND ACOUSTIC DISTINCTIVE FEATURES

JANUA LINGUARUM
STUDIA MEMORIAE NICOLAI VAN WIJK DEDICATA
edenda curat
CORNELIS H. VAN SCHOONEVELD
STANFORD UNIVERSITY

SERIES PRACTICA XII

1964
MOUTON & CO.
LONDON • THE HAGUE • PARIS

PHONETIC VARIATION AND ACOUSTIC DISTINCTIVE FEATURES
A Study of Four General American Fricatives

by
CLARA N. BUSH
STANFORD UNIVERSITY

1964
MOUTON & CO.
LONDON • THE HAGUE • PARIS

© Copyright 1964 Mouton & Co., Publishers, The Hague, The Netherlands. No part of this book may be translated or reproduced in any form, by print, photoprint, microfilm, or any other means, without written permission from the publishers.

Printed in The Netherlands by Mouton & Co., Publishers, The Hague.

ACKNOWLEDGMENTS

This study is a direct result of the interests and questions fostered by the teaching of Professor Dorothy A. Huntington. Its theoretical development, its design, its instrumentation, and its final form all owe much to her vital interest in her field, her insatiable search for interrelationships among concepts, and her conspicuous willingness to give freely of her time and energy to her students. Grateful appreciation is due Professor Virgil A. Anderson for his unfailing support and for his extremely helpful criticisms, and Professor Ruth H. Weir for her continued interest, for her valuable suggestions, and for her critical appraisal from the linguistic point of view. Finally, the encouragement and support so wholeheartedly given by Dr. Joseph A. Miksak and by Griffith Richards proved to be fully as important as the willing and valuable assistance they repeatedly provided.

TABLE OF CONTENTS

Acknowledgments  5

I. INTRODUCTION  11
   A. Genesis of the Problem  11
   B. Statement of the Problem  12
   C. Plan of the Study  13

II. BACKGROUND OF THE PROBLEM  15
   A. The Distinctive Feature Theory  15
   B. The Specification of Phonemes in Terms of Distinctive Features  19
   C. The Specification of the Distinctive Features  19

III. PROCEDURES  27
   A. Selection of Stimuli  27
   B. Formulation of Stimulus Items  30
   C. Speakers  31
   D. Instrumentation  32
   E. Preparation of Stimulus Items  33
   F. Acoustic Analysis  34
      1. Procedure in graphical recording  35
         Sound spectrograph  35
         Preparation of stimuli for intensity level recorder  37
         Intensity level recorder  38
   G. Preparation of Graphical Records for Measurement  39
   H. Acoustic Measures from Sonagrams  43
   I. Acoustic Measures from Intensity Level Records  43
   J. Treatment of Raw Data  44

IV. PRESENTATION AND DISCUSSION OF RESULTS  45
   A. Introduction  45
   B. The Distinctive Feature Tense/Lax  47
      1. Amount of energy  47
         Obtained measures (for the consonants and consonant oppositions as a function of the adjacent vowel, vowel category, and pooled vowels)  47
         Comparison measures (for the consonant cognates and consonant opposition as a function of the adjacent vowel, vowel category, and pooled vowels)  55
      2. Spread in spectrum  61
         Obtained measures (for the consonants and consonant opposition as a function of the adjacent vowel, vowel category, and pooled vowels)  62
         Comparison measures (for the consonant cognates and consonant opposition as a function of the adjacent vowel, vowel category, and pooled vowels)  70
      3. Duration  78
         Obtained measures (for the consonants and consonant opposition as a function of the adjacent vowel, vowel category, and pooled vowels)  79
         Comparison measures (for the consonant cognates and consonant opposition as a function of the adjacent vowel, vowel category, and pooled vowels)  86
      4. Duration of the preceding vowel  91
         Obtained measures (for the stressed vowels as a function of the following consonant, the vowel category, and the following consonant category)  92
         Comparison measures (for the stressed vowels as a function of the following consonant cognates, the vowel category, and the following consonant opposition)  99
         Obtained measures (for the unstressed vowel as function of the following consonant, the subsequent stressed vowel opposition, and the intervening consonant opposition)  104
         Comparison measures (for the unstressed vowel as a function of the following consonant cognates, the subsequent stressed vowel opposition, the intervening consonant opposition, and the subsequent stressed vowels)  108
   C. The Distinctive Feature Grave/Acute  113
      1. Percentage of energy (200c-1kc)  114
         Obtained measures (for the consonants and consonant opposition as a function of the adjacent vowel, vowel opposition, and pooled vowels)  115
         Comparison measures (for the consonant cognates and consonant opposition as a function of the adjacent vowel, vowel opposition, and pooled vowels)  126
   D. Summary  132
      1. Tense/Lax: /f/ vs. /v/ and /θ/ vs. /ð/  132
         Amount of energy  132
         Spread in spectrum  133
         Duration  133
         Duration of the preceding vowel  134
      2. Grave/Acute: /f/ vs. /θ/ and /v/ vs. /ð/  135
         Percentage of energy (200c-1kc)  135

V. SUMMARY AND CONCLUSIONS  137
   A. The Tense/Lax Feature  138
      1. Duration of the preceding vowel  139
   B. The Grave/Acute Feature  140
   C. Implications for the Distinctive Feature Theory  141
   D. Incidental Findings  142
   E. Summary  143

APPENDIXES
   A. Note on the Transcription System  144
      Comparative Transcription Systems  145
   B. Forms  146
      Instructions to Speakers  146
      Sample Recording Form  147
   C. Supplementary Tables  148

BIBLIOGRAPHY  154

I INTRODUCTION

A. GENESIS OF THE PROBLEM

The science of phonetics has, since its inception, concerned itself with the specification of the sounds of speech. Necessarily, the earliest attempts employed subjective methods and focused on the physiological relationships and processes which produced recognizable units of the speech code. The work of the early phoneticians culminated in the classical phonetic formulation of vowels and consonants classified by place and manner of articulation and represented by the symbols of the International Phonetic Alphabet (88).¹

¹ Numbers between brackets refer to the Bibliography, pp. 154-161.

Since the early 1940's, the development of modern acoustical instrumentation and methods of experimental research has resulted in an ever-increasing number of reports directed toward the specification of speech sounds in terms of their traditional physical parameters - frequency, intensity, and wave composition. At the same time, there has been a growing conviction that the physical parameter of time is also relevant to the acoustic specification of speech sounds.

Attempts to specify the various speech sounds acoustically have met with varied success. Some sounds, notably the vowels in steady-state, lent themselves particularly to such acoustical instrumentation and methodology as was available, and early yielded valuable information. Other sounds were harder to specify. The class of voiceless fricatives, for example, is characterized by a wide range of intensities, by some of the highest frequency components of human speech, and by aperiodic vibration. Each of these characteristics posed special problems for investigators using instruments and methods appropriate to vowel analysis. Consequently, while the acoustic specification of speech has progressed with increasing rapidity, our knowledge of the acoustic characteristics of the several sounds of speech is extremely uneven.

The phonetic specification of speech sounds, both physiological and acoustical, has been of continued interest to those involved in phonemics, the branch of linguistics concerned with cataloguing the essential sound segments of a given language. Traditionally, the phonemicists have held divergent opinions concerning the relevancy of phonetic specifications to phonemic analysis. One of the most basic controversies has concerned the nature of the phoneme itself, as well as the nature of its representation in the actual stream of speech. Among the theories which have been propounded on that subject is the distinctive feature theory, which holds that each phoneme is an unique bundle of concurrent distinctive features, or sound characteristics, that are specifiable in the terms of physiological and acoustical phonetics. The Preliminaries to Speech Analysis (97) by Jakobson, Fant, and Halle (hereinafter referred to as the Preliminaries), the basic exposition of this theory, specified certain phonemes of English according to their distinctive features, and presented an initial formulation of the perceptual, physiological, and acoustical characteristics of the distinctive features.

As a basic construct relating certain aspects of phonemic and phonetic analysis, the distinctive feature theory has been widely accepted as extremely promising. The necessary work of elaborating and refining its application, however, is exceedingly complex. Those responsible for its formulation, both in its theoretical aspects and in the specification of the distinctive features, have encouraged the critical consideration of those interested in both phonemics and phonetics. The logic and appropriateness of the theoretical construct must be scrutinized and appraised in terms of phonemic theory, and the relevance of its application to the actual speech event must be tested in terms of experimental phonetics.

B. STATEMENT OF THE PROBLEM

The statement of the distinctive feature analysis in terms of binary oppositions makes it amenable to experimental test phonetically. For example, each distinctive feature has been defined acoustically. The theory states that a given distinctive feature, if relevant to the acoustic specification of a certain phoneme, is relevant on a dichotomous basis, that is, it is either present or absent as evidenced by the acoustic record. If present, it contributes to the unique bundle of distinctive features which characterizes that phoneme acoustically. If absent, it contributes to the unique bundle of distinctive features which characterizes that phoneme's cognate (or opposition pair) on the distinctive feature parameter involved. Thus, according to this theory, through a series of dichotomous judgments in terms of the acoustic records, any phoneme can be distinguished from all other phonemes in the language.

In establishing the acoustic specifications for the distinctive feature opposition, the traditional linguistic method of analysis by minimal pairs was used, i.e., the method of testing for sameness or difference by commutation in identical phonetic context. In spite of this limitation, the results for both phoneme analysis and distinctive feature specification have been presented in the form of generalizations, and not as specific to phonetic environment. That is to say, a given phoneme is said to be characterized by a certain set of distinctive features, the implication being that this holds true whenever that phoneme is recognized. Similarly, the distinctive features, in turn, are characterized as having certain specific acoustic attributes. Thus, the specification of both phonemes and distinctive features must be assumed to be broad enough to embrace each identifiable member of the phoneme class, whatever the circumstances of its utterance, e.g., regardless of speaker, phonetic context, or position in utterance.

On the other hand, there is evidence from research in experimental phonetics that such specifications may not be possible. No one-to-one relationship has been established between sound recognition and acoustic specification. Identical acoustical characteristics have resulted in the recognition of a given phoneme in one phonetic context, and in the recognition of a completely different phoneme when the phonetic environment was changed (53, 71). Conversely, certain sounds identified as representative of the same phoneme have been found to have acoustically divergent characteristics, so divergent as to resemble other phonemes more than they resemble each other (119, 175).

Under these circumstances, it seemed worthwhile to investigate the effect of changing phonetic environment and position in utterance upon the acoustic specification of phonemes in terms of the distinctive features. The hypothesis to be tested stated that the differences which exist acoustically among the allophones of a phoneme are sufficient to alter the distinctive feature specification of that phoneme. To test that hypothesis, the present study was designed.

C. PLAN OF THE STUDY

The general objective of the present study was to take appropriate acoustical measures of certain English sounds in various phonetic contexts and to relate these measures to the distinctive feature specification for the phonemes represented by those sounds. The phonemes selected for investigation were four English fricatives, /f/, /v/, /θ/, /ð/, representing two distinctive feature oppositions: Tense vs. Lax (/f/ and /θ/ vs. /v/ and /ð/) and Grave vs. Acute (/f/ and /v/ vs. /θ/ and /ð/).

The consonants selected for test were used in two positions in utterance, medial and final. Phonetic context was varied systematically employing each of the following vowels: /i/, /æ/, /a/, /u/.² For the medial condition, each of the consonants preceded the stressed vowel, for example, [ha'fit]. For the final condition, the consonant followed the stressed vowel, for example, [ha'tif]. The four consonants, appearing in two positions with each of the four vowels, resulted in thirty-two stimulus items, which were recorded on tape by eight speakers, four men and four women.

² See Appendix A (p. 144) for a note on the transcription system used in the present study, and a chart of the relevant phonemic and phonetic symbols.

Stimulus items judged to contain acceptable representations of the consonant phonemes intended were recorded graphically, using a sound spectrograph and a high speed level recorder. The acoustic records made were analogous to those specified for the acoustic representation of the distinctive feature oppositions Tense/Lax and Grave/Acute (see Table 2 on p. 22). From these acoustic records, the appropriate measures were taken (see Chapter III, p. 33). Results were tabulated in terms of acoustic representation for both the phoneme and the relevant distinctive feature opposition. Through comparisons of these acoustical measures, taken from the same phoneme in phonetic contexts varied systematically by both adjacent stressed vowel and by position in utterance, it was possible to observe the effects which change in environment exerted upon the phoneme's acoustical representation, as well as upon the acoustical representation of the distinctive feature oppositions under investigation.
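The design described in this section can be sketched in a few lines of Python. This is only an illustration, not part of the study: the names `FEATURES`, `stimulus_items`, and `cognates` are hypothetical, and the vowels are written with plain symbols. The feature values are those stated in the plan above (Tense: /f/, /θ/; Lax: /v/, /ð/; Grave: /f/, /v/; Acute: /θ/, /ð/).

```python
from itertools import product

# Feature values for the four fricatives as stated in the text.
# True = first member of the opposition (Tense, Grave),
# False = second member (Lax, Acute).
FEATURES = {
    "f": {"tense": True, "grave": True},
    "v": {"tense": False, "grave": True},
    "θ": {"tense": True, "grave": False},
    "ð": {"tense": False, "grave": False},
}
VOWELS = ["i", "æ", "a", "u"]
POSITIONS = ["medial", "final"]
N_SPEAKERS = 8  # four men and four women


def stimulus_items():
    """Every consonant in every position with every vowel."""
    return [
        {"consonant": c, "position": p, "vowel": v}
        for c, p, v in product(FEATURES, POSITIONS, VOWELS)
    ]


def cognates(feature):
    """Pairs of consonants that differ in the named feature only."""
    names = list(FEATURES)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            differing = [k for k in FEATURES[a] if FEATURES[a][k] != FEATURES[b][k]]
            if differing == [feature]:
                pairs.append((a, b))
    return pairs


if __name__ == "__main__":
    print(len(stimulus_items()))               # stimulus items per speaker
    print(len(stimulus_items()) * N_SPEAKERS)  # recordings across all speakers
    print(cognates("tense"), cognates("grave"))
```

Under these assumptions the cognate pairs come out as /f/-/v/ and /θ/-/ð/ for Tense/Lax, and /f/-/θ/ and /v/-/ð/ for Grave/Acute, which are the comparisons the study sets out to make.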

II BACKGROUND OF THE PROBLEM

A. THE DISTINCTIVE FEATURE THEORY

The publication of Preliminaries to Speech Analysis (97) by Jakobson, Fant, and Halle in 1952 presented one of the most interesting and significant constructs in the field of phonemics. Essentially an analysis of the sound features of human language, this theoretical construct specifies twelve attributes (inherent distinctive features) as characteristic of the sounds of speech, i.e., in linguistic terms, the segmental phonemes as represented by their allophones. The theory proposes the analysis of any phoneme in a given language into an unique bundle of certain of these concurrent features. The distinctive features, rather than the phonemes themselves, are said to be specifiable in the stream of actual speech on all levels of the speech event: articulatory, acoustical, and perceptual. For each sound, a series of dichotomous judgments in terms of relevant binary oppositions serves to specify the phoneme. "According to the theory of distinctive features... a phoneme is regarded as the sum of the relevant sound features which preserve its identity versus other phonemes of the language" (36, p. 87; cf. 61, p. 511; 89, p. 34; and 174, pp. 118-119).

The distinctive features presented in the Preliminaries are said to be universal and common to all languages and are presumed to be "independent of one another, that is, no one of them can be expressed as combinations [sic] of the others" (18, p. 63). The twelve oppositions are stated: Vocalic/Non-vocalic, Consonantal/Non-consonantal, Interrupted/Continuant, Checked/Unchecked, Strident/Mellow, Voiced/Voiceless, Compact/Diffuse, Grave/Acute, Flat/Plain, Sharp/Plain, Tense/Lax, and Nasal/Oral (97, pp. 18-40). While these twelve binary oppositions theoretically permit the specification of 4096 unique phonemes (19, p. 93), not all of the distinctive features are relevant to a given language. Of those which are, only a small number of the possible combinations are utilized for the phonemes of that language.
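The figure of 4096 follows from treating each of the twelve oppositions as an independent two-way choice, so that the number of distinct bundles is 2 to the twelfth power. A minimal sketch of that count, assuming nothing beyond the list of oppositions given above (the function names are hypothetical):

```python
from itertools import product

# The twelve oppositions as listed in the text.
OPPOSITIONS = [
    "Vocalic/Non-vocalic", "Consonantal/Non-consonantal",
    "Interrupted/Continuant", "Checked/Unchecked", "Strident/Mellow",
    "Voiced/Voiceless", "Compact/Diffuse", "Grave/Acute",
    "Flat/Plain", "Sharp/Plain", "Tense/Lax", "Nasal/Oral",
]


def bundle_count(n_oppositions):
    """Number of distinct bundles when every opposition is a binary choice."""
    return 2 ** n_oppositions


def all_bundles(oppositions):
    """Enumerate every assignment of '+' or '-' to the given oppositions."""
    return [
        dict(zip(oppositions, signs))
        for signs in product("+-", repeat=len(oppositions))
    ]


if __name__ == "__main__":
    print(bundle_count(len(OPPOSITIONS)))
    print(len(all_bundles(OPPOSITIONS)))
```

The enumeration and the closed-form count agree, which is all the sketch is meant to show; it says nothing about which bundles a particular language actually uses.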
Thus, since the number of segmental phonemes in any language is relatively small, the information conveyed in the phoneme has a high degree of redundancy, i.e., there are multiple cues among the relevant distinctive features which tend to ensure phoneme identification under less than ideal listening conditions.

The concept of the phoneme as a concurrent bundle of distinctive sound features was not new in 1952. It had been foreshadowed early in the twentieth century by de Saussure's interest in the simultaneous as well as the successive character of structural linguistic entities (174). Saussure is also credited with the notion that the primary element in the sound system of a language is not the phoneme, but is, rather, the opposition, the differential quality among phonemes (93). Jakobson has stated: "Since 1932 in my papers I have defined the phoneme as a bundle of differentiating properties" (93, p. 328). In 1933, Bloomfield wrote:

Among the gross acoustical features of any utterance, then, certain ones are distinctive, recurring in recognizable and relatively constant shape in successive utterances. These distinctive features occur in lumps or bundles, each one of which we call a phoneme. The speaker has been trained to make sound-producing movements in such a way that the phoneme features will be present in the sound waves, and he has been trained to respond only to those features and to ignore the rest of the gross acoustical mass that reaches his ears (11, p. 79).

Bloomfield, himself, did not propose the specification of phonemes in terms of their inherent features. Rather, he held, with Sapir (172), that phonemes should be grouped into categories according to their possibilities of combination with other phonemes in the speech chain. While both linguists employed phonetic criteria in pre-linguistic analysis (172, p. 45), they held that phonemic analysis must be based primarily on distributional patterns in order to have relevance to structural linguistics (11, pp. 129-130). The extreme of this viewpoint was stated later by Hjelmslev. "As phonemes are linguistic elements, it follows that no phoneme can be correctly defined except by linguistic criteria, i.e., by means of its function in the language. No extra-lingual [sic] criteria can be relevant, i.e., neither physical nor physiological nor psychological criteria" (77, p. 49; cf. 39).
Other linguists currently active have emphasized that phonemic analysis must be primarily distributional rather than phonetic (73, 167). In recent years, the proposal has been made that linguists either develop rigorous rules for phonetic similarity or drop the criterion entirely. To that point, both Austin (3) and Belasco (7) have proposed formulations for phonetic similarity which they believe to have universal scope. Most linguists support the use of both distributional and phonetic criteria as providing complementary information important to the cataloging of phonemes, although many express reservations about the use of the distinctive feature analysis as the basic approach to phonemic analysis (29, 40, 41, 79, 136, 159, 189). However, Trubetzkoy (196), in 1939, supported the inherent distinctive features (phonetic characteristics) as the preferred basis for phonemic analysis. In his view, not every phoneme in each language could be uniquely specified on a solely distributional basis.

Jakobson, Fant, and Halle have pronounced the distinctive feature analysis fundamental to the distributional analysis. They point out in the Preliminaries (97, p. 12) that distributional classifications are based on innumerable assumptions that two sounds may be classed as the "same". Information concerning the patterning of "same" sounds in the language is said to define the phonemes. Judgments of sameness or difference are bound to the informant's response to the sounds as they occur in spoken language, i.e., to his differential response to the acoustical output of native speakers. Consequently, these authors, among others, hold that a catalog of the phonemes of a language can never be actually based on purely distributional criteria.

The problem of relating phonetics, a "natural" science, and phonemics, a "linguistic" (social) science, has long occupied the linguists (139). In relating phonetics and phonemics on the level of the distinctive features rather than on the level of the phonemes, the Preliminaries bypasses one of the most persistent problems of twentieth century structural linguistics, that of defining the phonemic concept and the relationship of phonemes to sounds which occur in actual speech (199). The distinctive feature theory is not antithetical to the concept of the phoneme as an "abstractional fictitious unit" of form (199, p. 37) in the functional system of language. At the same time, this theory does provide for the necessary link between phonemics and phonetics. The phoneme is related to the actual sound in the stream of speech through the specification of the "ultimate components", the distinctive features (97).

The distinctive feature analysis utilizes, as a part of its working hypothesis, the theory of binary oppositions. Its authors term the dichotomous scale "the pivotal principle of the linguistic structure. The code imposes it upon the sound" (97, p. 9). While this scale has been accepted as extremely promising for all levels of linguistic analysis (54, p. 60), many linguists reject it as an "ultimate truth" of language structure (107, p. 708; cf. 16), and consider it merely a method of analysis imposed by the analyzer. "The theory that primary recognition is the result of a series of binary choices is convenient from the point of view of information theory, though it is not in any sense a sine qua non" (47, p. 170).

The belief that binary classifications are inherent in our recognition of phonemes is defended in the Preliminaries on both theoretical and empirical grounds. Research in multidimensional auditory displays (162) is cited, showing that five different values of six different variables can be accurately recognized by listeners when presented to them one at a time. However, when all six variables are presented simultaneously, listeners are able to do no better than make an accurate binary judgment for each. Such multidimensional stimuli are presumed to bear certain similarities to the complex speech signal.

Empirically, also, there is evidence that perceptual judgments are made on the basis of a series of two-choice decisions. For example, nasality, like many other sound features, is recognized as extending along a continuum from extreme nasal resonance to extreme de-nasal resonance. Yet any given English consonant is conventionally classified as either nasal or not nasal (oral). Thus, whenever such a decision is relevant, the listener is presumed to consign a specific sound to one of the opposing categories. For many English phonemes, e.g., all the vowels, this decision is not relevant. For French vowels, however, it is relevant and here, too, it is said to be made on the basis of the dichotomous choice.

The binary choice in phoneme identification is said to be universal. Lotz (131, p. 716) has reported, for example, that "although the voicing could form a continuum from the unvoiced to the completely voiced, there are no known cases in which more than two degrees of voicing are distinguished."

There are other linguists who believe that the binary classification, rather than being an ultimate truth concerning the intrinsic structure of all language, may be as language-bound as the phoneme, i.e., appropriate to certain languages, but not to others. Even in the Preliminaries, notice was taken of one exception. "The opposition compact vs. diffuse in the vowel pattern is the sole feature capable of presenting a middle term in addition to the two polar terms" (97, p. 28). The authors cited two languages, Hungarian and Roumanian, in which the Compact/Diffuse opposition had three values (97, pp. 9, 28). Halle (62) later absorbed this ternary opposition into the binary system by restating it as two different features, present or absent, i.e., Compact/Non-compact and Diffuse/Non-diffuse. This permits the specification of a particular phoneme as characterized by both features, a provision not permitted by a binary choice between polar opposites. (Cf. 83, p. 243 for similar analysis of English vowels.) Halle did not, at that time, extend this emendation to the other distinctive features which are stated as polar opposites, e.g., Grave/Acute, Tense/Lax, etc. However, in a later analysis of Russian consonants, he reaffirmed the binary rather than ternary opposition for Compact/Diffuse, and further resolved the Sharp/Plain and Flat/Plain oppositions of the Preliminaries into Sharp/Plain and Flat/Natural (63, p. 53). This set of oppositions had also received earlier criticism as a ternary rather than a binary opposition (201).
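The arithmetic behind Halle's restatement can be made concrete with a short sketch. It assumes only what the paragraph above states: a single polar opposition offers two values, while two independent present/absent features offer four combinations, one of which has both features present. Which combination corresponds to the middle term in any given language is not specified here, and the function names are hypothetical.

```python
from itertools import product


def polar_values():
    """One polar opposition: a sound is either compact or diffuse."""
    return ["compact", "diffuse"]


def two_feature_values():
    """Compact/Non-compact and Diffuse/Non-diffuse as independent features.

    Each tuple is (compact_present, diffuse_present).
    """
    return list(product((True, False), repeat=2))


if __name__ == "__main__":
    # Two values cannot carry a three-way contrast; four combinations can.
    print(len(polar_values()), len(two_feature_values()))
    # The combination with both features present is available only here.
    print((True, True) in two_feature_values())
```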
A number of linguists have objected to the statement of two different kinds of oppositions among the elements of the distinctive feature analysis. Some are stated as features which negate each other (a "contradictory" relationship), and others as features which exclude each other (a "contrary" relationship) (178, p. 255). The logic of the inconsistent statements has been defended by Jakobson (89, p. 35), and certainly, stating all the features consistently would extend the list, a contingency avoided in the interests of economy, one of the four basic principles of phonemic analysis (59, 80). However, as Hockett has pointed out, greater economy, or a "reduction of inventory" for basic units (80, p. 110) (in this case, a shorter list of distinctive feature oppositions) is of doubtful advantage if it results in further complication at some later stage of linguistic analysis. This may be the case with the economy achieved through inconsistent statement of the binary oppositions. The difficulty of applying the dichotomous scale to the phonemes of all languages may be apparent rather than real - an artifact of too much economy in the statement of the feature oppositions.


B. THE SPECIFICATION OF PHONEMES IN TERMS OF DISTINCTIVE FEATURES

While the distinctive features are considered to be universal (cf. 44, pp. 473 ff. for disagreement on this point), the unique bundle of features which characterizes a given phoneme is specific to a given language. The specification of phonemes according to their inherent distinctive features was made gradually more explicit in Jakobson's writings during the period from 1938 through 1941 (cf. especially 89, 91). With the publication of the Preliminaries in 1952, a fuller development of the theory was presented, together with a component analysis of some of the phonemes of English in terms of their distinctive features. Since that time similar specifications for certain phonemes of a number of other languages have been published, among them German (59), French (100), Mandarin Chinese (211), Spanish (1), Russian (63), Swedish (36, 135), Arabic (96), and Polish (184). In addition, the distinctive feature theory has been applied to other areas of linguistic analysis in these and other languages, among them Amerindian languages, e.g., Chontal (110).

TABLE 1

Distinctive feature pattern of English consonant phonemes*

Feature oppositions (rows):
1. Vocalic/Non-vocalic
2. Consonantal/Non-consonantal
3. Compact/Diffuse
4. Grave/Acute
5. Nasal/Oral
6. Tense/Lax
7. Continuant/Interrupted
8. Strident/Mellow

Phonemes (columns, as far as legible): ʃ tʃ k ʒ dʒ g m f p v b n s θ t z ð d h

[The plus and minus entries of the table are not legible in this copy.]

* After Jakobson, Fant, and Halle (97, p. 43). See Appendix A, p. 144.

Table 1 presents the relevant distinctive features for each of the English consonant phonemes which were specified in the Preliminaries (97, p. 43). A plus indicates that the first member of the opposition listed is relevant to the identity of that phoneme, a minus, that the second member of the opposition is relevant, and a blank, that both members are irrelevant. In the last case, either sound characteristic represented in the opposition might be present or absent without affecting the identity of that phoneme. It will be noted that only eight out of the twelve universal binary oppositions are considered relevant to the description of English consonant phonemes.

C. THE SPECIFICATION OF THE DISTINCTIVE FEATURES

The distinctive feature oppositions are theoretically specifiable at the present time on at least three levels of the actual speech event - in terms of the articulatory processes, the acoustical signal, and perception. The problem lies in specifying the articulatory adjustments, the acoustical characteristics, and the perceptual effects which are invariant for each distinctive feature. This can be done only through recourse to specific utterances. Thus, this particular phonemic hypothesis can be tested only in the field of experimental phonetics. The Preliminaries attempted only to define each feature generally, illustrating those generalities with selected examples of the specification of various features on one or more of the levels indicated. No attempt was made to present a specific or exhaustive formulation in terms of any one aspect of speech.

Linguists have differed regarding the optimum frame of reference for the specification of the distinctive features. Some have felt that "the attempt to keep one foot on either side of the physical-psychological transformation" is one "apparent weakness" in the distinctive feature theory as presented (47, p. 170). Most linguists agree that, in terms of our present knowledge and instrumentation, either the articulatory or the acoustic aspect is more promising than the perceptual. Further, there is agreement on the point that "any description of phonetic features ... requires to be correlated with the listener's perceptual response. Physiology or acoustics can provide us with the tools for description, but only auditory perception can tell us what to describe" (56, p. 131). This is in accord with the recommendation in the Preliminaries that "the variables of any antecedent stage be selected and correlated in terms of the subsequent stages, given the evident fact that we speak to be heard in order to be understood" (97, p. 13). As to the choice between articulatory and acoustic representation, some linguists favor the former, some the latter (200, 56).
Fischer-Jørgensen represented a widely held view with the statement: "Acoustic description should [not] replace articulatory and auditory descriptions. ... A complete and satisfactory description of speech sounds must take all three aspects into account" (44, p. 479).

The authors of the Preliminaries aspire to the defining of the distinctive features on all three levels, but at the same time, they give certain reasons for favoring acoustic specification. Each of the consecutive stages of speech from production to perception, they say, can be predicted from the previous stage; the auditory has been predicted from the acoustic and the acoustic from the articulatory. Halle has written:

Given a certain geometrical configuration and excitation of a resonator, its acoustic output is entirely predictable. Hence, all other things being equal, any uniformity on the articulatory side must have a statable acoustical counterpart (59, p. 206).

Wendahl's recent vowel study (204) appears to corroborate this point, for he found it feasible to go from physiological measures to electrical analog-adjustments with resultant perceptual confirmation of the phonemes intended. This predictability, however, is said to be irreversible, since some variables from each of the previous stages have proved to be irrelevant for the following stage (97, pp. 12-13).

BACKGROUND OF THE PROBLEM


Halle has emphasized the essential dependence of the articulatory specification upon the perception of the acoustic output. The phoneticians who picked a particular articulatory uniformity as a distinctive feature over a whole series of others which they might have chosen were guided by the observation that these functioned as perceptually distinctive marks, and hence must also have existence on the acoustical level (59, p. 206). The definitive statement of preference for the acoustical specification was made by Jakobson. By an exact measurement of the resonating cavities and of the phonation we can predict the acoustic effect; but this predictability is irreversible: the same acoustic effect can be obtained by different means, e.g., by a parakeet, or by the electrical "Voder" of the Bell Telephone Laboratories, or by Haskins' hand-drawn sound tracks. Articulatory compensations permit individuals with intact hearing but without teeth to pronounce correctly all "dentals", or without a tongue to produce "lingual" speech sounds. And even normally, as experimental phonetics detected, a sound such as the Russian y is articulated by natives in diverse ways with no acoustic difference ... not the articulatory prerequisite, but the acoustic stimulus carries the whole information in the message from the sender to the receiver .... [This is] why the Prague phonemicists have assigned to the motor criteria an important but nevertheless subordinate role, as compared to the acoustic (93, p. 331). Following the preference of the authors of the Preliminaries, the acoustic specification of the distinctive features has been adopted for the present study. Since only general statements were made in the Preliminaries concerning acoustic values for the various features, it was necessary to supplement that information with more precise formulations from subsequent studies.¹
The information concerning the acoustic specification of the distinctive features under investigation in the present study has been collated and is shown in Table 2, together with the source of information for each specification. The formulation of the distinctive features in physical (acoustical) terms has followed a traditional scientific procedure, the search for invariant properties. The essence of an effective rule for a game or a useful law of physics is that it be stateable in advance, and that it apply to more than one case. Ideally, it should represent a property of the system discussed which remains the same under the flux of particular circumstance. In the simplest case, it is a property which is invariant to a set of transformations to which the system is subject (206, p. 63). Thus, Fundamentals of Language (98, p. 13) declares phonemic analysis to be "the study of properties invariant under certain transformations", and asserts that "the sameness of a distinctive feature throughout all its variable implementations is now

¹ These include Jakobson and Halle, Fundamentals of Language (98), Jakobson and Halle's chapter, "Phonology in Relation to Phonetics", in Kaiser's Manual of Phonetics (99), Halle's dissertation, The Russian Consonants: A Phonemic and Acoustical Investigation (58), and L. G. Jones' articles, "The Vowels of English and Russian: An Acoustic Comparison" (102) and "English Consonantal Distribution" (103), the former based upon his dissertation. Both these dissertations were done at Harvard under Jakobson and were subsequently incorporated into Halle's book, The Sound Pattern of Russian (63).


TABLE 2

Acoustic specifications of two distinctive feature oppositions

Distinctive Feature
Oppositions         Acoustic Characteristics                          Phonemes

TENSE vs. LAX       Tense:  Higher total amount of energy             /f/  /θ/
                            Greater spread of energy in spectrum
                            Longer duration (97, 98, 99)

                    Lax:    Lower total amount of energy              /v/  /ð/
                            Smaller spread of energy in spectrum
                            Shorter duration

GRAVE vs. ACUTE     Grave:  Concentration of energy in lower          /f/  /v/
                            frequencies of the spectrum* (97, 98, 99)

                    Acute:  Concentration of energy in upper          /θ/  /ð/
                            frequencies of the spectrum

* Comparison measurement for identification of feature of Gravity from acoustic records:
1. Pass sample through 200 cps high pass filter with attenuation of 24 db/octave.
2. Measure energy below 1000 cps and the total energy of the sound. For voiced phonemes, subtract 15 db from the first value.
3. Subtract the 1000 cps low pass value from the total energy. For diffuse continuants the difference for grave phonemes is below 20 db; for acute phonemes, above (58, 63).

objectively demonstrable". Similarly, the phonemes specified by combinations of these acoustical "constant correlates" (98, p. 14; cf. 63, p. 6 and 93, p. 329) are themselves held to be linguistic invariants. Jakobson has written: "In the combination of distinctive features into phonemes, the freedom of the individual speaker is zero" (95, p. 158). These "invariant" properties of both distinctive features and phonemes, however, have been at the same time defined as "relational properties" such that the "minimum same" of a feature may only be specified in terms of its relationship to the opposite alternative, i.e., in terms of the phoneme cognate on a particular distinctive feature parameter. "However the stops in tot may differ from each other genetically and acoustically, they are both high-pitched in opposition to the two labials in pop, and both display a diffusion of energy as compared to a greater concentration of energy in the two stops of cock" (98, p. 14). This traditional method of phonemic analysis, commutation in minimally different sets (41, 43, 200), is said to "avoid the difficulties connected with allophones and to eliminate the need for phonetic 'identity'. ... Since the members of minimally different sets are by definition identical contexts, we are always comparing things which are otherwise the same" (59, pp. 207, 209). The phonemic identification in terms of distinctive features, therefore, is said to "concentrate on the differences existing between the phonemes and not upon the properties common to all utterances of a given phoneme" (59, p. 203).
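The comparison measurement given in the note to Table 2 may be illustrated by a minimal sketch. It assumes that the two levels (total energy and energy below 1000 cps) have already been measured in db after the 200 cps high pass filtering; the function names are hypothetical, and the treatment of a difference of exactly 20 db, which the note does not specify, is an assumption.

```python
def gravity_difference_db(total_db, low_band_db, voiced=False):
    """Return total level minus the level below 1000 cps, in db.

    Both inputs are assumed to be measured after the 200 cps
    high pass filter (step 1 of the note to Table 2).
    """
    if voiced:
        # Step 2: for voiced phonemes, subtract 15 db from the low-band value.
        low_band_db = low_band_db - 15.0
    # Step 3: subtract the 1000 cps low pass value from the total energy.
    return total_db - low_band_db


def classify_gravity(total_db, low_band_db, voiced=False, threshold_db=20.0):
    """Label a diffuse continuant Grave (difference below 20 db) or Acute."""
    diff = gravity_difference_db(total_db, low_band_db, voiced)
    # Assumption: a difference of exactly 20 db is treated as Acute.
    return "grave" if diff < threshold_db else "acute"
```

The classification is relational in the sense discussed above: only the difference between the two levels enters the decision, not the absolute level of either.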


Yet the specification arrived at by commutation methods, in which phonemes are contrasted on only one relevant parameter and in identical contexts, is presented in the form of "a matrix in which each phoneme is given with the distinctive features which are necessary for its identification" (59, 208; cf. Table 1 of the present study, p. 19). There is no stipulation as to context. "These distinctive cues . . . are . . . the invariants of speech" (61, p. 511). "If we state that in English the phoneme /k/ occurs before /u/, it is not at all the whole family of its various submembers, but only the bundle of distinctive features common to all of them [italics mine], that appears in this position" (98, p. 13). Since that is true, the statement of the distinctive features of the English phoneme /k/ must apply with equal validity to the allophones which appear before and after various vowels and which cluster in permissible consonant combinations - all in various positions of utterance (cf. 87, p. 38). The relating of phonemics to phonetics in this manner may not be so simple as it appears from the distinctive feature analysis. The varying transitional effects upon speech sounds in different environments have been traditionally recognized by phoneticians (33, 39, 190, 177, 101). "In detailed phonetic analysis there are as many combinatory variants or allophones of a phoneme as the number of possible variations of a phonetic environment" (36, p. 87). Truby mentions a collection of spectrograms which totaled over one thousand CV situations and over one thousand VC situations, "all from a single speaker of General American. From the standpoint of possible phonetic combinations, even this collection is incomplete" (197, p. 396). In a later publication, Truby elaborated still further on the potential variability among phonemes, concluding that "allophone calculations are astronomical!" (198, p. 127). 
Peterson has stated that the individual class members of each phoneme, i.e., its allophones, possess "acoustical target relationships which are environmentally conditioned" (155, p. 275). Hockett has summarized the problem. Acoustic allophones are numerous, diverse, intersecting, and overlapping. By "numerous" we mean that a single phoneme, instead of being represented by at most a few allophones ... is represented by dozens of clearly different ones. By "diverse", we mean that the allophones which represent a single phoneme do not necessarily appear as minor variations around some single measurable constant core, but instead may seem to have virtually nothing in common. By "intersecting" we mean that a given acoustic allophone of phoneme A may resemble some allophone of phoneme B much more closely than it does some of the other allophones of phoneme A By "overlapping" we mean that the representation of one phoneme does not necessarily end before the representation of the next begins. Indeed, the total representation of a given phoneme in a given environment may be spread or scattered through the portion of the spectrogram which also represents several preceding and several following phonemes (80, pp. 116-117). It is the recognition of this complexity in the acoustic representation of the phoneme which has led a number of linguists to decide that "it is the allophones, not the phonemes, which are the acoustic constants in speech" (200, p. 608; cf. 41, p. 612;


44, p. 475; and 131, p. 713). Jakobson, however, has stated that "a linguist knows that speech-sounds present, besides phonemes, contextual and optional, situational variants (or, under other labels, 'allophones' and 'metaphones') But variations cannot be acknowledged without the existence of invariants" (94, p. 19). Since the distinctive feature analysis for the phonemes makes no provision for allophonic variation, it must be presumed that the bundle of distinctive features indicated is common to the phoneme, whatever its environment. Therefore, the acoustic specifications of all relevant distinctive features must be appropriate, in terms of dichotomous scale, to every allophone. It is at this point that this phonemic theory is difficult to reconcile with the results of experimental phonetic research (24, 119). Acoustic instrumentation has provided visible evidence of the physical modifications of sounds as a function of phonetic environment. Speech analysis and speech synthesis have both demonstrated unequivocally that, in terms of their acoustic characteristics, consonants modify adjacent vowels (82, 106, 166), vowels modify adjacent consonants (119, 175, 180), and consonants modify adjacent consonants in the clusters which are permissible in the language (103, 173). Further, it has been shown that position within utterance affects the acoustic character of sounds (123, 180, 182). As would be expected, the authors of the distinctive feature theory have taken cognizance of sound modification in the stream of speech. "We see acoustic signals succeeding each other in time, with adjacent segments exerting an influence on each other." "Nobody would expect events which succeed each other as rapidly as do phonemes not to affect one another" (61, p. 511). Their assumption is that all such modifications lie well within the acoustic specification of the distinctive features and the distinctive feature specification of the phoneme (98, 131). 
This assumption may not be tenable in terms of these specifications as now stated. The possibility that contextual variants among the phonemes may alter specifications for both phonemes and distinctive features is recognized by those most intimately concerned with the original specifications. Fant has written: We would like to have a more profound knowledge of how a distinction is realized quantitatively within all possible minimally distinct pairs and in all positional variants, including those due to differences among individual speakers as well as to the influence of superimposed denotative and emphatic features. Implementation of this program will require research of great complexity and difficulty, and we may thus be excused for not waiting for a precise mapping of all phonetic details before proceeding to a structural ordering of our knowledge. As research progresses it will be possible to supplement, modify and reformulate more concisely the statements we made in Preliminaries (37, p. 109). Halle, too, has indicated an awareness of the probability that changes may need to be made in the specifications. "The practical application of a theory to a large body of data always brings with it more or less minor modifications in the theory. Certain concepts may have to be redefined in a manner differing somewhat from the original theory; special terminology may have to be created, etc." (63, p. 52).


In point of fact, the physical modification of a sound as a function of its phonetic environment has been shown to be extensive and has produced complications in the relating of acoustical representations to phoneme recognition (22, 154). Sounds with widely disparate acoustic patterns are recognized by listeners as the "same" sound (53, 71). Sounds with identical acoustic patterns have been identified by listeners as "different" sounds (175). In fine, "the same acoustic effect may be obtained in various ways, and the same perceptual effect may be due to different acoustic stimuli" (44, p. 466). In all such cases, the acoustic phenomena have been attributed to phonetic context, and the perceptual phenomena to phonemic conditioning (53, 71). The process of language learning is presumed to permit the listener to identify phonemes by ignoring irrelevant differences and recognizing relevant similarities and differences. However the disparity between acoustical and perceptual representations is rationalized, ... this freedom between the number and arrangement of physical clues and the sound which is recognized seems to be one of the basic facts about speech, a fact which cannot be disregarded by any valid theory of speech recognition. The theory of distinctive features may appear at first sight to allow for this freedom since a number of features are relevant to the recognition of one phoneme. But if each feature is correlated with an invariant property of the speech wave motions and if the recognition of a given phoneme requires always the same pattern of distinctive features, then the theory does imply one-to-one correlation between wave-motions and linguistic units in a sense which is contradicted by a number of experimental results (47, pp. 170-171). The definitive test of this point lies in a consideration of the phonemes and the distinctive features as they are optimally represented in highly intelligible speech.
However, changing phonetic context as permitted by the language is not the only relevant variable. The phoneme specifications must be presumed appropriate to every sound identified on the basis of its acoustic cues alone, even if the conditions of representation are less than ideal, e.g., with speakers of both sexes, and of varying ages, dialects, and idiolects, with changes in attitude, physical condition, or mode of speech for a given speaker, with distortions in transmission channels, with singing, with whispering, etc. (37, 93, 131, 160). As Fischer-Jørgensen has suggested, "there may be a bundle of differences which are all present in optimal cases, but which need not all be there" (42, p. 58). Even for the specification of distinctive features under optimal conditions, more information is needed. K. N. Stevens wrote in 1953: Many of the features are at present defined in the acoustic domain in rather general terms, and statistical data for a group of speakers is not available. Further detailed experimental analysis is required to obtain a more precise definition of those features. For example, the acoustic properties of turbulent and transient excitation associated with the fricative and stop consonants are not thoroughly understood at present (187, p. 163). The years since 1953 have produced more information on all these points, but because of the complexity of the problem and the difficulties of research in this area, Stevens' comment remains a timely one.


In summary, the acoustic properties of the distinctive features have been specified as they apply to all languages. The distinctive features of certain English phonemes have been specified acoustically on the basis of constant phonetic environment as represented in minimal pairs. Phonetic research indicates that the acoustic characteristics of a given sound vary as a function of phonetic environment and that phonemic recognition of the same acoustic pattern varies as a function of its phonetic context. The purpose of this study was to investigate the effect of changing phonetic environment upon the acoustic specification of certain consonant phonemes in terms of the distinctive feature analysis. The hypothesis is stated as follows: that the differences which exist acoustically among the allophones of a phoneme are sufficient to alter the distinctive feature specification of that phoneme.

III PROCEDURES

A. SELECTION OF STIMULI

The purpose of the present study was to determine the effect of changing phonetic environment upon the specific acoustic characteristics of certain consonants. The sounds selected to be studied were to be produced in two positions of utterance (medial and final) by as many speakers as might be feasible, both men and women. "There are at least as many allophones of a given phoneme as there are positional (phonemic) environments in which the given phoneme may be found" (198, p. 126; cf. 123, 180, 182, 200). "There are as many diaphones of a given allophone as there are speakers who employ that allophone" (198, p. 128; cf. 78, 131, 160). Keeping in mind the number of positions and speakers to be employed, the initial problem was the selection of the vowels, consonants, and distinctive features to be investigated. It seemed desirable for the purposes of the study to vary the consonantal environments by using adjacent stressed vowels which would represent the range of assimilative influence characteristic of the vowel system of the language. Certain vowels can be supported on theoretical grounds as satisfying this criterion. The traditional physiological vowel diagram established the cardinal vowels for General American English on the basis of place of articulation. In terms of that diagram, the widest variety of tongue, jaw, and lip adjustments, and hence of resultant cavity modifications, is represented by four vowels: the highest front vowel, /i/, the lowest front vowel, /æ/, the lowest back vowel, /a/, and the highest back vowel, /u/ (111, p. xiii). Corroborative evidence of the contrasts among these four vowels has been supplied by a number of acoustic studies, notably those of Peterson and Barney (156), Jones (102), Grubb (55), Tiffany (193), and Wendahl (204). These vowels are characterized acoustically by large differences in the interrelationships of their first three, or identifying, formants (cf. Table 3).
Perhaps the most helpful illustration of the acoustical differences among these vowels as shown in Table 3 is seen in the column of ratios on the right depicting the relationships between the various formants for each vowel. In addition to their physiological and physical differences, these four vowels represent phonemes classified as contrasting in terms of two distinctive feature oppositions (Grave/Acute and Diffuse/Compact) which are basic to vowel analysis:


TABLE 3

Comparative formant values and ratios for /i/, /æ/, /a/, and /u/ - male subjects.

              Peterson &
              Barney      Grubb    Wendahl   Tiffany   Jones    Jones'
              (156)       (55)     (204)     (193)     (102)    Ratios*

/i/    F1      270         263      279       395       287     R1   7.6
       F2     2290        2378     2116      2355      2195     R2   1.3
       F3     3010        3099       -         -       2895     R3  10.0

/æ/    F1      660         733      679       725       650     R1   2.6
       F2     1720        1654     1724      1865      1700     R2   1.4
       F3     2410        2510       -         -       2405     R3   3.8

/a/    F1      730         775       -        750       625     R1   1.6
       F2     1090        1064       -       1235      1042     R2   2.2
       F3     2440        2614       -         -       2300     R3   3.7

/u/    F1      300         279      325       440       350     R1   2.2
       F2      870         825      986       930       777     R2   2.7
       F3     2240        2496       -         -       2117     R3   6.0

All values for formants cited in cycles per second.
* R1 = F2/F1, R2 = F3/F2, R3 = F3/F1. The measure for Diffuse/Compact is the magnitude of R3. The less the R3, the more Compact the vowel; the greater the R3, the more Diffuse the vowel. (/i/ and /u/ are the most Diffuse; /a/ the most Compact; /æ/ is also Compact.) The measure for Grave/Acute is the difference between R1 and R2. If R1 is less than R2, the vowel is Grave; if R1 is greater than R2, the vowel is Acute. (/i/ and /æ/ are Acute, /a/ and /u/ are Grave.) (102).
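The ratio measures defined in the note to Table 3 can be recomputed from the formant values in the Jones column. The following is a minimal sketch only; the dictionary and function names are hypothetical, and the case of R1 exactly equal to R2, which the note does not address, is arbitrarily assigned to Acute.

```python
# Formant values (cps) from the Jones (102) column of Table 3.
# The key "ae" stands for the vowel /ae/ (ash).
JONES_FORMANTS = {
    "i": (287, 2195, 2895),
    "ae": (650, 1700, 2405),
    "a": (625, 1042, 2300),
    "u": (350, 777, 2117),
}


def formant_ratios(f1, f2, f3):
    """Return (R1, R2, R3) with R1 = F2/F1, R2 = F3/F2, R3 = F3/F1."""
    return f2 / f1, f3 / f2, f3 / f1


def grave_or_acute(f1, f2, f3):
    """Grave if R1 is less than R2; otherwise Acute (equality is an assumption)."""
    r1, r2, _ = formant_ratios(f1, f2, f3)
    return "Grave" if r1 < r2 else "Acute"


def rank_by_diffuseness(formants):
    """Order vowels from most Diffuse (largest R3) to most Compact (smallest R3)."""
    return sorted(formants, key=lambda v: formant_ratios(*formants[v])[2], reverse=True)
```

Because Diffuse/Compact is stated only as a relative magnitude of R3, the sketch ranks the vowels rather than applying a fixed boundary, which the note does not supply.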

/i/, Diffuse and Acute; /æ/, Compact and Acute; /a/, Compact and Grave; and /u/, Diffuse and Grave. These mutually supportive items of information seemed sufficient to justify the choice of /i/, /æ/, /a/, and /u/ on theoretical grounds. For a number of reasons, selecting the consonants to be investigated was more difficult. The literature provides less help on this problem; acoustically, a good deal more is known about the vowels than about the consonants. The range of possible choice is greater for consonants, for the list of English phonemes contains more consonants than vowels. Yet it was essential to restrict the number of consonants to be studied, since every consonant must be produced twice with every vowel by each speaker. Further, the choice of consonants could not be made independently of the choice of the distinctive feature oppositions to be studied. The distinctive features


might be expected to vary in their degree of acoustic stability in changing environments. However, not every distinctive feature opposition is relevant to every phoneme (see Table 1, p. 19). Thus, the choice of consonant could dictate the distinctive feature to be investigated, or the distinctive feature to be studied could narrow the range of possible choices among the consonants. This latter possibility appeared most promising. To help provide some relevant information on this point, a pilot study was conducted. This preliminary investigation employed only one speaker, but used nine vowels, /i, ɪ, ɛ, æ, ʌ, a, ɔ, ʊ, u/, and twenty consonants, /p, t, k, b, d, g, m, n, ŋ, f, v, θ, ð, s, z, ʃ, ʒ, tʃ, dʒ, l/, i.e., all the consonants specified in terms of the distinctive feature analysis. The five additional vowels were used as a check on the appropriateness of the four vowels selected on theoretical grounds. The vowels /e/ and /o/ were not used since they tend to be unstable and frequently are diphthongized in the General American dialect. Both these characteristics would present problems in the interpretation of the assimilative influence of the vowel. The pilot study followed the procedures to be used in the main investigation, and which are specified later in this chapter. The sounds selected for study were set in appropriate phonetic frames, randomized, recorded, presented for recognition judgments, and analyzed acoustically. Information provided by the pilot study was collated with other available information and formed the basis for the final selection of stimulus sounds. The decision to use the four vowels originally selected was confirmed on empirical grounds. These vowels did appear to exert the anticipated range of assimilative influence upon the acoustic character of the adjacent consonant. Moreover, those effects seemed to pattern consistently according to the Compact/Diffuse nature of the vowel.
However, it also appeared that the acoustic influence of vowel upon consonant could be expected to operate across distinctive feature lines, e.g., the Compact/Diffuse nature of the vowel, rather than changing the Compact/Diffuse nature of an adjacent consonant, might instead affect its Grave/Acute and Tense/Lax characteristics. This information dictated the choice of a class of consonants which included both Grave/Acute and Tense/Lax pairs. Only two English consonant classes satisfy this criterion: the plosives and the fricatives. Acoustically, more is known about the plosives than about the fricatives. Speech synthesis experiments at Haskins Laboratory (119, 120, 121, 122), supplemented by later tape-cutting experiments on actual speech (175), have yielded relevant data. The plosive pair upon which vowel environment is reported to effect the greatest change, both physically and perceptually, is the velar plosive set, /k/ - /g/. This pair, however, does not meet the criterion specified by the pilot study, i.e., that Grave/ Acute and Tense/Lax contrasts both be relevant (cf. Table 1, p. 19). Consequently, the fricative class seemed the obvious choice. The literature contains relatively little information on the acoustic character of the fricatives. The information which is available, however, would appear to support their use in this investigation. Tarnoczy (191) has reported that at least one fricative,


HI, shows significant acoustic modification as a result of its proximity to high or low vowels. Harris found that the acoustic synthesis of the fricatives /f/ and /θ/ depended "primarily on the basis of cues contained in the vocalic part of the syllable" (67, p. 952). These findings seem to support the notion that the acoustic character of the fricatives may be significantly modified by changes in vowel environment. In addition to providing a basis for consonant selection, the pilot study also served to focus interest on two particular distinctive features, Grave/Acute and Tense/Lax. The investigation of both in relationship to Compact/Diffuse vowels promised to be of interest. Theoretically, the Grave/Acute and Compact/Diffuse consonant-vowel interrelationship is of particular interest, for it is said that "no language lacks the oppositions Grave/Acute and Compact/Diffuse whereas any other opposition may be absent" (98, p. 40). On the basis of the above information, the decision was made to investigate the Grave/Acute and Tense/Lax oppositions in fricative consonants adjacent to Compact/Diffuse stressed vowels. For a number of reasons, the fricative pairs /f/ /v/ and /θ/ /ð/ were chosen as the focus of the investigation. These four consonants present all the possible combinations of the two distinctive feature contrasts under investigation. There is relatively little information in the literature on the acoustic specification of these sounds and some of the available information is contradictory. These last two points are conspicuously true in the case of /θ/ and /ð/. Studies relevant to these four consonants will be discussed in conjunction with the presentation of results for the present study.

B. FORMULATION OF STIMULUS ITEMS

It was important for the purposes of this study to have recognition judgments based as completely as possible on acoustic cues alone. There is evidence that perceptual judgments are strongly influenced by a number of language factors which are nonacoustic. Among these are the meaningfulness of test materials (13), the effects of context (13, 143), the number of syllables in the word (76, 84), and word probability in the language (84, 147). Consequently, the sounds to be studied were embedded in nonsense syllables with stable phonetic frames so that their identification would depend upon "primary recognition", termed "the only kind of recognition that is based directly on the physical input" (47, p. 170). Since position of the consonant in relation to the vowel has been cited as a possible variable in consonant recognition (180, 182), it was decided that two positions of utterance would be used, medial and final. By analogy with the nonsense word format first used by House and Fairbanks (82), each stimulus item began with the unstressed syllable /ha/. This served to minimize the meaningfulness of the material and at the same time ensured a relatively stable


onset for the consonants in medial position. For true vowel quality, the contrasting vowels had to appear in stressed syllables. A stable structure for the stressed syllables was provided by the use of a constant consonant, the voiceless plosive /t/, for initiation or termination of the syllable. The constant /t/ served yet another function. "Stops and fricatives are especially useful topographical references for mapping the sequential segments of speech" (38, p. 339). The spike of the /t/ burst (see Figure 1) provided a clear reference point on all spectrographic records, and this spike could be reliably related to the location of the /t/ burst on the magnetic tape recordings. Furthermore, the /t/ always stood in clear contrast to the adjacent stressed vowel, by virtue of its burst when it preceded the vowel and its implosion period when it followed the vowel. This circumstance facilitated the establishment of the boundaries of the sound segments to be measured. The consonant and vowel combinations to be studied, set into appropriate phonetic frames, resulted in a total of thirty-two stimulus items for each speaker. These thirty-two stimulus items, unrandomized, are shown in Table 4.

TABLE 4

Stimulus items

ha f i t     ha t i f     ha θ i t     ha t i θ
ha f æ t     ha t æ f     ha θ æ t     ha t æ θ
ha f a t     ha t a f     ha θ a t     ha t a θ
ha f u t     ha t u f     ha θ u t     ha t u θ

ha v i t     ha t i v     ha ð i t     ha t i ð
ha v æ t     ha t æ v     ha ð æ t     ha t æ ð
ha v a t     ha t a v     ha ð a t     ha t a ð
ha v u t     ha t u v     ha ð u t     ha t u ð
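The construction of the thirty-two items in Table 4 can be expressed as a short sketch: each of the four fricatives is combined with each of the four vowels, once in medial position (before the stressed vowel, with final /t/) and once in final position (after the stressed vowel, with initial /t/). The function name and the string spelling of the frames are illustrative assumptions, not part of the original procedure.

```python
from itertools import product

CONSONANTS = ["f", "θ", "v", "ð"]
VOWELS = ["i", "æ", "a", "u"]


def stimulus_items(consonants, vowels):
    """Return (position, item) pairs for every consonant-vowel combination."""
    items = []
    for c, v in product(consonants, vowels):
        # Medial position: unstressed "ha", test consonant, stressed vowel, /t/.
        items.append(("medial", f"ha{c}{v}t"))
        # Final position: unstressed "ha", /t/, stressed vowel, test consonant.
        items.append(("final", f"hat{v}{c}"))
    return items


if __name__ == "__main__":
    for position, item in stimulus_items(CONSONANTS, VOWELS):
        print(position, item)
```

Four consonants, four vowels, and two positions yield the thirty-two items reported for each speaker.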

C. SPEAKERS

The stimulus items were recorded by eight adult native speakers of the General American dialect, none of whom had any history of speech or hearing impairment. These included four men and four women, all professionally trained in speech and phonetics. Both men and women were used to provide information concerning the phonemes as realized in the speech of two major groups of language users. Systematic differences have been shown to exist in the speech of men and women, e.g., the vowel


formants for women average 17% higher in frequency than those cited for men in Table 3 (156). Facility in phonetics was essential for both the reading of the phonetically transcribed nonsense words and their subsequent transcription in self-judgment. Trained speakers could be expected to prove more stable in their utterances, both individually and as a group, and to produce stimulus items with the highest potential recognition scores for nonsense syllables (55, 193). Jakobson and Halle (98) have indicated that optimal speech signals are presumed to supply optimal representation of the distinctive features.

D. INSTRUMENTATION

The stimulus items were recorded using an Electro-Voice model 630 microphone and a Magnecorder model PT6A tape recorder. For the subsequent phonemic recognition test the Magnecorder model PT6J amplifier was connected to a monitor headset of Permoflux PDR-8 phones with NAF 48490-1 type cushions. For the production of all visual records, the tape recordings were reproduced on an Ampex 351 full track tape recorder. From this recorder the signal was passed into a Hewlett-Packard model 350 B attenuator and, with appropriate impedance matching, into a Kay Electric sound spectrograph, Sonagraph Model-R (38, 109, 112, 113, 114, 163, 164, 185). This instrument analyzes a complex speech signal as a function of frequency and time. The visual record which results displays frequency along the vertical axis and time along the horizontal. The instrument scans energy present in the frequency regions from 85 cycles to 12,000 cycles, and records it in two separate bands, each covering a range of 6,000 cycles. The horizontal distance covered by one Sonagram is 318.5 millimeters, representing 2.4 seconds in time. The band selector for the scanning filter may be set for either a narrow band (45 cycles) or a wide band (300 cycles). Spectrographic records for the present study were made with the wide band filter setting. In addition, it is possible to modify this instrument's operation so as to make a record of a preselected portion of the speech signal at some point in time. For this record, called a section, the instrument is said to integrate the filter output energy over a five millisecond interval (109, 113). The result is a visual record displaying amplitude in the horizontal direction versus frequency in the vertical. Ordinarily, such sections are inverted when they appear on the Sonagram, i.e., the low frequencies are represented at the top of the visual record and the high frequencies at the bottom. However, through the use of the pattern switch on the Scale Magnifier-Sr.
- Model R unit of the Sonagraph, it is possible to reverse the frequency display produced by the sectioner. For all necessary re-recordings the Ampex was used to reproduce the signal and the Magnecorder to record it. Throughout the investigation, both recorders were operated at a tape speed of fifteen inches per second. On re-recorded tapes of the

PROCEDURES

33

stimulus items a signal tone for each item was recorded by the Magnecorder from a Hewlett-Packard model 201C audio-oscillator. These dubbed tapes, reproduced by the Ampex and employing appropriate impedence matching, were passed through two banks of Khronhite Variable Band Pass Filters model 310-AB, cascaded to 48 db per octave. From the filters, the signal went into a Bruel and Kjaer High Speed Level Recorder, type 2304. This latter instrument, a graphic recording voltmeter, responds to energy throughout the frequency range from 20 cycles to 20,000 cycles and produces a visual record which plots intensity in db along the vertical axis and time along the horizontal. The records used in this study were produced using a 50 db potentiometer. On the intensity level records obtained from this instrument, the appropriate measures were made by means of a Dietzgen Ott-Compensating Polar Planimeter, Type 16, 1803 D. This instrument provides a measure of the area under a curve. "The energy of any segment of the speech wave, e.g., a syllable, is apparently proportional to the area under the intensity curve within the time interval under consideration" (38, p. 334).

E. PREPARATION OF STIMULUS ITEMS

In presenting stimulus items for phonemic recognition, it is important to provide freedom of choice in the perceptual response (143, 198). It has been shown, however, that consonant confusions occur within class for the most part (66, 144). Thus, presenting all the sounds apt to be confused with the four fricatives under study could be expected to minimize the effect of expectancy in phoneme recognition (13, 126, 144). In order to provide an essentially open matrix for phoneme recognition of the four fricatives under investigation, therefore, they were recorded in a test which included all the other English sounds with which they are consistently confused (66, 144). These included the other fricatives, /s/, /z/, /ʃ/, /ʒ/, and the lateral /l/. These appeared in similar phonetic contexts, increasing the number of stimulus items to seventy-two for each speaker, thirty-six in each position. Since intermixing speakers also affects recognition scores (117), the items for each speaker were recorded as a unit, randomized according to the position of the test consonant in the nonsense word. Each position test group was preceded by a short practice session group of ten representative items, recorded for that purpose by the speaker. Thus, a total of ninety-two stimulus items was recorded by each speaker, thirty-six items with medial position test consonants, thirty-six items with final test consonants, and ten sample items for each position group. (See Appendix B, p. 146, for a sample of the recording for one speaker.) To help the judges keep their places on the answer forms, the speaker was instructed to precede each stimulus item by reading aloud the word "number" and the number of the item. This served as a brief carrier phrase, which previous studies have shown to be an aid to correct recognition of subsequent test material (117).

The same dual purpose was served by the simple directions read by each speaker within his test material, e.g., "Here are ten trial samples selected at random from the second list. Please transcribe them in the trial sample blanks on your paper. Number one."

For the recording of the nonsense syllables, the speaker was seated in a sound-treated room with the microphone approximately fifteen inches from his lips. Each speaker was provided with a form sheet containing the randomized stimulus items in phonetic script and presented in the format in which they were to be recorded. A sample of the recording form used appears in Appendix B together with the instructions to the speakers. The speakers were instructed to read the test materials in a normal speaking manner, using a relatively constant time interval between items (approximately four seconds) and maintaining a relatively uniform intensity level. Each speaker was given sufficient practice time to satisfy himself and the experimenter that he could exercise this control without monitoring himself by the VU meter, and without receiving a visual timing cue. Throughout the actual recording, the experimenter monitored with the VU meter and a stopwatch. Whenever the stimulus items failed to satisfy the above criteria, or proved phonetically unsatisfactory to either speaker or experimenter, the items in question were re-recorded until both were satisfied. These replacements were later spliced into the appropriate places in the recorded tapes by the experimenter.

Acceptance of the recorded items involved the agreement of speaker and experimenter, before listening to the recording, that each item had indeed been produced as specified. Subsequently, the speaker, the experimenter, and one other experienced phonetician transcribed the completed test phonetically as the recording was played back. Items which were thus approved and correctly transcribed were accepted for analysis. Items which failed this test were re-recorded until the above criteria were met and were later spliced into the tape where appropriate. Upon the completion of the recording and acceptance of all stimulus items for all eight speakers, the tapes were edited and prepared for acoustic analysis.
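The item counts given in this section follow directly from the design and can be checked with a short sketch. It is illustrative only: the vowel labels are placeholders, since the four stressed vowels are not listed in this section, and the five added sounds are taken here to be /s/, /z/, /ʃ/, /ʒ/, and /l/. The fixed random seed is likewise an assumption made only so that the sketch is repeatable; it says nothing about how the original randomization was done.

```python
import random

# Four fricatives under study plus the five added sounds (assumed symbols).
CONSONANTS = ["f", "v", "θ", "ð", "s", "z", "ʃ", "ʒ", "l"]
# Placeholder labels: the four stressed vowels are not named in this section.
VOWELS = ["V1", "V2", "V3", "V4"]
POSITIONS = ["medial", "final"]
PRACTICE_PER_POSITION = 10  # sample items preceding each position group


def build_position_group(position, seed):
    """Return the randomized (position, consonant, vowel) items for one position."""
    items = [(position, c, v) for c in CONSONANTS for v in VOWELS]
    random.Random(seed).shuffle(items)  # seed is hypothetical, for repeatability
    return items


def items_per_speaker():
    """Return (test items, practice items, total) recorded by one speaker."""
    test = sum(len(build_position_group(p, seed=i)) for i, p in enumerate(POSITIONS))
    practice = PRACTICE_PER_POSITION * len(POSITIONS)
    return test, practice, test + practice
```

Under these assumptions each position group holds thirty-six items, and one speaker records seventy-two test items and twenty practice items, ninety-two in all.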

F. ACOUSTIC ANALYSIS

The specification of the acoustic characteristics of the two distinctive feature oppositions Grave/Acute and Tense/Lax as represented in the four fricatives, /f/, /v/, /θ/, and /ð/, necessitated a series of five measures for each consonant under study. These included measures of 1) the duration of the consonant, 2) the spread of consonant energy in the spectrum, 3) the amount of energy present in the consonant above 200 cycles (high pass), 4) the amount of energy present in the consonant from 200 cycles to 1,000 cycles (band pass), and 5) the measure of difference between the two energy measures for each consonant (cf. Table 2, p. 22). In addition, a measure of the duration of the vowel preceding each of the consonants under study was taken for ancillary information. This total set of measures for each consonant and preceding vowel necessitated a set of five different visual records from the sound spectrograph and a set of two different intensity level records for each stimulus item. The thirty-two items processed for each of the eight speakers (a total of 256 items) yielded 1280 different spectrographic records and 512 different intensity level records, a total of 1792 different graphical displays.

1. Procedure in Graphical Recording

Sound Spectrograph

Each stimulus item was reproduced separately on the Ampex and recorded onto the oxide-coated plastic disc of the Sonagraph with the appropriate HS setting and record-reproduce level adjustments. From this record a wide band Sonagram was made of the complete stimulus item from 85c-6kc, using a mark level which ensured the resolution of the weakest consonants without rendering the strongest sounds unrecognizable. Since almost all of the stimulus items were of durations well under 1.2 seconds, this wide band Sonagram occupied only half of the length of the Sonagram paper. Consequently, the paper was slipped half way around on the recording drum, and a wide band short Sonagram, or stub, was made of the same complete stimulus item. This stub was continued only past the third formant of the stressed vowel, and in making it the mark level was reduced in order to permit better resolution of vowel formants. Directly above this full stimulus item stub, a section stub of the consonant under investigation was recorded. The location for the consonant section was decided upon by comparing the visual record of the Sonagram with the auditory record on the Sonagraph disc. Observing both records, the experimenter rotated the disc slowly back and forth by hand with the monitor gain high. The section was located at a point clearly outside the transition movements of the vowel formants where consonant energy was discernible on both the visual and the auditory record. The consonant section stub was recorded directly above the point in the stimulus item at which it was taken. All sections were recorded with the same high mark level to ensure resolution of the weaker consonants. The inverter pattern switch was used for certain consonant section stubs when appropriate. Finally, two full consonant sections were recorded using the same section setting. These covered the full frequency range of the instrument (85c-12kc). The inverter pattern switch was used to record all full consonant sections, placing the low frequencies at the bottom of the Sonagram. One hundred and sixty Sonagraph records were made from each of the eight speakers, a total of 1280 spectrographic representations. A sample set of the five visual records from the sound spectrograph is shown in Figure 1, p. 36.

Preparation of Stimuli for Intensity Level Recorder

The reliable identification of speech sounds on the intensity level records demanded a point of reference, a signal mark, by which the intensity level configuration could be related to the stimulus item as recorded on the magnetic tape. Further, the need to inter-relate measures on the spectrographic records and the intensity level records made it necessary that the signal mark be specifiable in terms of the spectrogram as well as in terms of the speech signal. The burst of the /t/ satisfied the need for a reference point common to spectrogram and magnetic tape recording, for on both records its abrupt onset was readily identified. A magnetic tape recording of the re-recorded stimulus item was passed back and forth by hand across the playback head of the Magnecorder with the monitor at high gain until the onset of the /t/ burst was located. This point was marked and from it an appropriate constant distance was measured off, six inches following the burst of each final /t/ or eight inches preceding the burst of each medial /t/. These distances represented durations of approximately .40 second and .53 second, respectively, on tape moving at a speed of fifteen inches per second. Each of these respective distances provided a point at which a signal cue could be recorded without obscuring any portion of the speech signal.

The signal cue had to be characterized by a broad frequency spectrum in order to retain its identity under the different frequency filtering conditions required for the two intensity level records. To ensure a broad frequency spread, a 400 cycle tone was generated by an audio-oscillator and recorded in a manner which resulted in a noise of abrupt onset with wide spread in the frequency spectrum. Perceptually, when the tape was played back, the effect was that of a signal bleep of unspecifiable pitch. Visually, on the intensity level records, the signal mark appeared at an appropriate distance from the speech signal as a single bleep of sufficient magnitude to be distinct from the grass, which represents the noise level, at the base of the record (see Figure 4). On Sonagrams made to verify its frequency spread, it appeared as a spike of noise which spread throughout the spectrum from 85c-6kc, very like the visual record of an intense click. A sample Sonagram of a stimulus item with a recorded signal mark is shown in Figure 2. Fifteen Sonagrams redone later with a higher mark level for resolution of weak /t/ bursts were made from the stimulus items which had recorded signal cues. This provided an informal reliability check on the process by which the signal cue was recorded. The distances of the signal mark onset from the /t/ burst onset all proved to be within a range of two millimeters on the Sonagram. This means that the assumed durations of .40 second and .53 second could be considered accurate to within ± .0075 second.

The final step in preparing the items for analysis by the Bruel and Kjaer High Speed Level Recorder was to establish a compensatory playback level for each item so that the stressed vowels all reached approximately the same peak on the Ampex VU. This ensured that all items could be recorded with the same instrumental settings on the Bruel and Kjaer level recorder, and hence would be comparable, without either overdriving the instrument at the one extreme or failing to resolve the weaker fricatives at the other. However, since differences in the relative power of the vowels have been established as one of their secondary characteristics (35, 171), and further, have been shown to vary as a function of consonant environment (82), the consequences of this procedure must be considered in the interpretation of results.

Intensity Level Recorder

Two intensity level records were made of each complete stimulus item. For the first, the frequency passband was 200 cycles to 20,000 cycles; for the second, 200 cycles to 1,000 cycles. These filter settings provided records which may be compared to those used in specifying the distinctive features opposition Grave/Acute (cf. Table 2, p. 22). The stimulus items were recorded separately with appropriate Ampex playback level adjustments and constant settings on the Bruel and Kjaer. The lower limiting frequency switch was set at 200 cycles to coincide with the filter setting cut-off. The potentiometer db range setting was 50, the stylus writing speed was 700 millimeters per second, and the paper speed was 100 millimeters per second. In order to reduce the possibility for operator error, all stimulus items in the medial position group for one speaker were recorded without change in filter setting. The same group was then recorded with the second filter setting. The procedure was then repeated with the final position group. The process of graphing intensity level records resulted in sixty-four records for each of the eight speakers, or a total of 512 intensity level records.
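The record totals reported for the two instruments can be verified with simple arithmetic. The sketch below uses only figures stated in the text (eight speakers, thirty-two processed items per speaker, five Sonagraph records and two intensity level records per item); the function and constant names are illustrative.

```python
SPEAKERS = 8
ITEMS_PER_SPEAKER = 32          # items processed for acoustic analysis
SONAGRAPH_RECORDS_PER_ITEM = 5  # visual records from the sound spectrograph
LEVEL_RECORDS_PER_ITEM = 2      # 200c-20kc and 200c-1kc pass bands


def record_counts():
    """Return the item and record totals implied by the design."""
    items = SPEAKERS * ITEMS_PER_SPEAKER
    spectrographic = items * SONAGRAPH_RECORDS_PER_ITEM
    intensity = items * LEVEL_RECORDS_PER_ITEM
    return {
        "items": items,
        "spectrographic": spectrographic,
        "intensity": intensity,
        "total": spectrographic + intensity,
    }


def records_per_speaker():
    """Return (Sonagraph records, intensity level records) for one speaker."""
    return (
        ITEMS_PER_SPEAKER * SONAGRAPH_RECORDS_PER_ITEM,
        ITEMS_PER_SPEAKER * LEVEL_RECORDS_PER_ITEM,
    )
```

The per-speaker figures of 160 and 64 records and the overall totals of 1280, 512, and 1792 all follow from these four constants.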

G. PREPARATION OF GRAPHICAL RECORDS FOR MEASUREMENT

Obtaining measures comparable to those by which the distinctive features have been acoustically specified necessitated delimiting the particular sound in time. Such segmentation of speech into discrete units has been criticized as an arbitrary and artificial violation of the continuous stream of speech (44, 155, 197; cf. 157, 203). Nevertheless, the technique has been widely accepted for acoustic phonetic analysis (38, 75, 82, 158, 180). It is considered to be empirically meaningful and to have produced a good deal of valuable information (43, p. 148 and 59, p. 199). There is evidence that fairly sharp discontinuities do exist in the excitation function or in the spectral composition between two successive sounds. The natural unit as seen in the spectrogram may differ slightly from the linguistic phonemes, but they are more or less synonymous, especially in the major grouping of sound units such as vowels, stops, fricatives, etc. (14, p. 566). Further, such segmentation is a requisite to the relating of phonetic evidence to phonemic concepts, the essential purpose of the present investigation. The criteria used in establishing the sound segments to be measured were based upon those established in the literature (38, 106, 164). Vowel segments were identified by evidence of the presence of voicing, i.e., the voicing bar along the baseline of the Sonagram and the vertical striations which indicate the beginning of a voice fundamental period, as well as by the presence of the horizontal resonance bars (formants) associated with the vowel in question. Voiceless fricative segments were identified through multiple cues also. Among these were irregular vertical striations, conspicuously less intensity than adjacent portions of the acoustic pattern, the absence of strong and well-defined formants, and the absence of regularly spaced vertical striations. 
These characteristics also pertain to the spectral appearance of the /h/ and to the burst of the /t/, from one of which, depending upon position in the stimulus item, the vowels preceding the fricative were to be separated. Voiced fricative segments were similarly established from multiple cues. These included irregular vertical striations and conspicuously less intensity than adjacent portions of the Sonagram, especially in the frequency region above the third vowel

formants. In addition, the voiced fricative segments, unlike their voiceless cognates, showed the voicing bar along the baseline and the regularly spaced vertical striations associated with periodic vibration. They might also show one or more horizontal resonance bars. When these appeared, however, they were either discontinuous with those of the adjacent vowel or they stabilized in frequency regions not characteristic of that vowel. On the basis of the above criteria, carets were placed on the baseline of each Sonagram at points bounding the sound segments to be measured, i.e., the fricative and the vowel which preceded it in each item. For medial position fricatives, the preceding vowel was always /a/; for final position consonants, the preceding vowel was one of the four stressed vowels used in the study. Samples of Sonagrams so marked for a fricative cognate pair in each position appear in Figure 3. It will be noted that in all cases the transitional formant movement was included in the vowel segment. This was essential in order to obtain comparable measures for voiced and voiceless fricatives and for their adjacent vowels. The disposition of transitions in segmentation is a problem not yet resolved in acoustic research (157, 198, 203). For the purposes of this study, it was essential to exclude the transitions from the consonant segments. Any other course of action would have yielded measures not comparable to those which specified the distinctive features. As a reliability check on the segmentation procedures, all Sonagrams were arranged by fricative position groups in a simultaneous display. These were rechecked by the experimenter and by one other experienced phonetician for consistency in the application of segmentation criteria. Particular attention was given to consistency of segmentation internally within one consonant in both positions across all speakers and, similarly, between voiced and voiceless cognates. 
Appropriate conversions were worked out interrelating the time values on the magnetic tape recording, the Sonagram, and the intensity level record. By means of these conversions, the point at which the signal cue was recorded on the tape was established in respect to the /t/ burst on each Sonagram. The distance from this point to the boundary carets for each sound was tabulated. On each 200c-20kc intensity level record a baseline was drawn which averaged the peaks and troughs of the grass. This baseline was transferred to the 200c-1kc record of the same item. A caret was placed on each baseline at the point where the signal mark clearly emerged from the grass, i.e., the point where the left slope of the signal spike crossed the baseline. From this point the distance measures from the Sonagram were plotted on the baseline of the intensity level record. Perpendiculars were erected so that the points at which they intersected the intensity level curve completed the enclosure of the fricative segment for the intensity measure. Samples of intensity level records for the fricative cognate pairs of Figure 3 are shown in Figure 4 with the fricative segments delimited.
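The conversions among the three time scales are not spelled out, but they can be sketched from the instrument settings reported earlier: a tape speed of fifteen inches per second, a Sonagram length of 318.5 millimeters for 2.4 seconds, and a level recorder paper speed of 100 millimeters per second. The following is a minimal sketch under the assumption that all three scales are linear; the function names are illustrative and are not taken from the study.

```python
TAPE_SPEED_IN_PER_S = 15.0      # tape speed of both recorders
SONAGRAM_LENGTH_MM = 318.5      # horizontal extent of one Sonagram
SONAGRAM_DURATION_S = 2.4       # time represented by that extent
LEVEL_PAPER_MM_PER_S = 100.0    # paper speed of the level recorder


def tape_inches_to_seconds(inches):
    """Duration represented by a length of tape."""
    return inches / TAPE_SPEED_IN_PER_S


def sonagram_mm_to_seconds(mm):
    """Duration represented by a horizontal distance on the Sonagram."""
    return mm * SONAGRAM_DURATION_S / SONAGRAM_LENGTH_MM


def seconds_to_level_record_mm(seconds):
    """Horizontal distance on the intensity level record for a duration."""
    return seconds * LEVEL_PAPER_MM_PER_S


def sonagram_mm_to_level_record_mm(mm):
    """Transfer a Sonagram distance onto the intensity level record."""
    return seconds_to_level_record_mm(sonagram_mm_to_seconds(mm))
```

On these assumptions, six and eight inches of tape correspond to .40 and about .53 second, and a two millimeter range on the Sonagram corresponds to roughly .015 second, i.e., about ± .0075 second, in agreement with the figures reported for the signal mark.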

Fig. 4. Intensity level records for the four stimulus items shown in Figure 3. Fricative boundaries delimited.

H. ACOUSTIC MEASURES FROM SONAGRAMS

Duration values for the fricatives and for their preceding vowels were converted from measures of segment length on the Sonagrams. Similar measures were taken from the intensity level records as a reliability check on the transfer of segment boundaries from Sonagram to intensity level records. With negligible exceptions, the two sets of figures were found to be identical through the second decimal place, indicating a high level of reliability in measurement. Consonant duration measures were tabulated directly from the Bruel and Kjaer records. Both consonant spectrographic sections and intensity level records indicated energy present in all final consonants past the location at which energy was resolved on the full Sonagrams. Consequently, for all final consonants, the termination caret was located at the point where the descending intensity level curve crossed a line five db above the grass line. Spread-in-spectrum measures were taken from the Sonagram sections. The total distance covered vertically on the sections was recorded in millimeters and converted to cycles per second. Calibration records for the Sonagraph used indicated that it recorded energy present in a frequency band from approximately 100 cycles to 13,200 cycles, or an average frequency band of approximately 65 cycles per millimeter of vertical distance on the section records. For the fricatives in which energy was not continuously present throughout the entire range of the spectrum spread, the presence and width of hiatuses was recorded. For the purposes of this study, however, spread-in-spectrum measures indicate the extent of the frequency range between the lowest and highest frequencies at which energy was recorded on the section records. Additional information was recorded in terms of the frequency equivalents of the extreme limits of the spectral spread for each consonant to show the comparative location of a given spread in spectrum.
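The conversion from vertical distance on a section to frequency can be illustrated as follows. This is a sketch only: it assumes a strictly linear frequency scale, uses the approximate calibration figures given above (about 65 cycles per millimeter, with the record beginning at approximately 100 cycles), and the function names are hypothetical.

```python
CYCLES_PER_MM = 65.0        # approximate average from the calibration records
LOWEST_RECORDED_CPS = 100.0  # approximate lower limit of the calibrated band


def section_mm_to_cycles(mm):
    """Convert a vertical extent on a section (mm) to a frequency extent (cycles)."""
    return mm * CYCLES_PER_MM


def spread_in_spectrum(low_mm, high_mm):
    """Return (low limit, high limit, spread) in cycles for energy recorded
    between two heights measured in mm from the bottom of the section."""
    if high_mm < low_mm:
        raise ValueError("high_mm must not be smaller than low_mm")
    low_cps = LOWEST_RECORDED_CPS + section_mm_to_cycles(low_mm)
    high_cps = LOWEST_RECORDED_CPS + section_mm_to_cycles(high_mm)
    return low_cps, high_cps, high_cps - low_cps
```

With these values the full calibrated band of 100 to 13,200 cycles corresponds to a little over 200 millimeters of vertical distance.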

I. ACOUSTIC MEASURES FROM INTENSITY LEVEL RECORDS

Amount-of-energy measures were taken from the intensity level records made under filtering conditions of 200c-20kc. Measurements were made tracing the outline of the fricative segment by means of a compensating polar planimeter with a fixed tracing arm. The area under the intensity curve was recorded as energy per unit time, db times seconds, expressed as db seconds and reported in this study in db milliseconds. This procedure was repeated on the intensity level records obtained under the second set of filtering conditions, i.e., a pass band of 200c-1kc. For each stimulus item, the second intensity measure was subtracted from the first to obtain a difference measure. Since time was identical for the two samples, such an operation resulted in a measure of the difference in the intensity of the samples.
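A numerical analogue of the planimeter measure may make the units clearer. The sketch below is not the procedure used in the study, which was mechanical; it assumes instead that the intensity curve has been read off as a series of points (time in milliseconds, level in db above the baseline) and approximates the enclosed area by the trapezoidal rule.

```python
def area_db_ms(times_ms, levels_db):
    """Approximate the area under an intensity curve in db milliseconds.

    times_ms  : increasing time points within the fricative segment
    levels_db : level above the baseline at each time point
    """
    if len(times_ms) != len(levels_db) or len(times_ms) < 2:
        raise ValueError("need two or more paired time/level points")
    total = 0.0
    for i in range(1, len(times_ms)):
        width = times_ms[i] - times_ms[i - 1]
        if width < 0:
            raise ValueError("time points must be increasing")
        # Trapezoid between two successive readings of the curve.
        total += 0.5 * (levels_db[i] + levels_db[i - 1]) * width
    return total


def difference_measure(high_pass_area, band_pass_area):
    """Subtract the 200c-1kc measure from the 200c-20kc measure."""
    return high_pass_area - band_pass_area
```

Because both areas are taken over the same time interval, the difference measure reflects a difference in level alone.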

Finally, the energy present in the two pass bands was reduced to a proportional measure for each observation, i.e., the percent of the amount of energy in the fricative which was present in the 200c-1kc band.

J. TREATMENT OF RAW DATA

All tabulated measures were summed and the means and standard deviations computed. These data were put into tabular form, together with the minimum and maximum observations for each consonant in a changed phonetic environment. For each consonant, mean values were averaged across speakers in terms of 1) position, 2) vowel, 3) pooled vowels, and 4) the Compact/Diffuse nature of the adjacent stressed vowel. For each distinctive feature opposition, the means were averaged across speakers and consonants as a function of the same variables cited above, 1) position, 2) vowel, 3) pooled vowels, and 4) the Compact/Diffuse nature of the adjacent stressed vowel. In addition to the above direct measures, a set of comparison measures was formulated for each distinctive feature opposition. These measures involved the subtraction of values obtained from opposition pairs according to the acoustic specification of the distinctive feature involved. For example, the Tense/Lax opposition specifications state that Tense consonants have greater duration than do their Lax cognates. Consequently, the value obtained for a /v/ (Lax), when subtracted from the value obtained for its Tense cognate, /f/, in the same context and position produced by the same speaker, yielded an appropriate comparison measure. Comparison measures for each minimal pair served to indicate the extent and direction of the difference between them on each relevant acoustical parameter. For these comparison measures, means and standard deviations were computed. These data were presented in tabular form, together with the minimum and maximum observation, as well as the number of observations for which the difference between the pairs was equal to, or less than, zero. For each minimal pair, i.e., distinctive feature opposition, mean values were averaged across speakers in terms of 1) position, 2) vowel, 3) pooled vowels, and 4) the Compact/Diffuse nature of the adjacent stressed vowel. 
The mean values obtained for all measures were represented in graphical form, the direct measures in bar graphs and the comparison measures in deviation bar graphs.
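The comparison measures and their summary statistics can be sketched briefly. The example assumes paired observations (same speaker, context, and position) for a Tense consonant and its Lax cognate, and it uses the sample standard deviation; whether the study used the sample or the population formula is not stated, so that choice is an assumption.

```python
from statistics import mean, stdev


def comparison_measures(tense_values, lax_values):
    """Subtract each Lax value from its paired Tense value.

    Positive results lie in the predicted direction; zero or negative
    results run counter to the specification.
    """
    if len(tense_values) != len(lax_values):
        raise ValueError("observations must be paired")
    return [t - l for t, l in zip(tense_values, lax_values)]


def summarize(values):
    """Return the statistics tabulated for a set of comparison measures."""
    return {
        "mean": mean(values),
        "sd": stdev(values),  # sample standard deviation (assumed)
        "min": min(values),
        "max": max(values),
        "n_zero_or_less": sum(1 for v in values if v <= 0),
    }
```

The same two functions apply to any parameter on which a direction of difference is predicted, provided the subtraction is ordered according to the specification of the feature in question.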

IV PRESENTATION AND DISCUSSION OF RESULTS

A. INTRODUCTION

In general, the findings of the present study support the hypothesis under test, i.e., that the acoustic variability among the allophones of certain phonemes is sufficient to necessitate modifications in the distinctive feature analysis. Results were examined in an effort to determine 1) whether or not each phoneme was, in general, represented by the acoustic distinctive features specified as essential to its character, and 2) whether or not the members of each phoneme cognate pair 1 were characterized in all speech utterances by the acoustic distinctive features said to be essential to their differential recognition. It was found that, in the phonemes under study, the Tense/Lax distinctive feature was represented, on the average, by the acoustic properties specified, while the Grave/Acute distinctive feature opposition was not. Further, it was found that neither the Tense/Lax nor the Grave/Acute distinctive feature opposition was consistently represented as specified in every sample of a phonemic cognate pair, i.e., as the members of the pair were compared in every phonetic context, in each position, uttered by every speaker. The deviations from specified relationships were found to pattern consistently as a function of several variables in phonetic context: 1) the position of the consonant in the test word, 2) the particular consonant opposition pair under study, and 3) the Diffuse/Compact character of the adjacent vowel. These findings are primarily based upon measures specified by the authors of the distinctive feature analysis for the consonantal acoustic character of the distinctive features Tense/Lax and Grave/Acute (see Table 2, p. 22). The appropriate measures are presented and discussed in terms of the feature for which each was specified. For the Tense/Lax feature, measures of the amount of consonant energy, spread in spectrum, and duration are presented and discussed in that order. 
An ancillary measure of the duration of the preceding vowel is included as an appropriate further specification of the Tense/Lax character of the phoneme cognates. For the Grave/Acute feature, the measure presented is that of the proportion of the total consonant energy to be found in the lower portion of the frequency spectrum. On each of the measures specified by the distinctive feature analysis, results are reported in terms of the obtained values under the relevant conditions of varied phonetic context, i.e., 1) for the consonant as a function of the adjacent stressed vowel, 2) for the consonant as a function of the Compact/Diffuse nature of the adjacent stressed vowel, 3) for the distinctive feature opposition as a function of the Compact/Diffuse nature of the adjacent stressed vowel, 4) for the consonant with pooled vowels, and 5) for the distinctive feature opposition with pooled vowels. On each parameter, a second set of values is presented in the form of comparison measures. Each comparison measure indicates the difference between consonant cognates on the acoustic parameter which specifies the distinctive feature under investigation. When the difference is in the predicted direction, the difference values are positive numbers; when the difference is in the direction opposite to that predicted, the numbers are negative. For all comparison measures, results are presented in the manner detailed above for the obtained measures. For both the obtained and the comparison measures, the mean values, the standard deviations, and the minimum and maximum observations are reported. More elaborate statistical treatment appears to be unjustified in the absence of information relating auditory perception and speech data sufficiently to make possible an adequate interpretation of such statistical treatment. For the purposes of this study, all results are reported for all eight speakers pooled on each parameter. Here the concern is with the acoustic specification of distinctive features as they relate to the identity of the phonemes under study. Such formulation must be broad enough to embrace all users of the language, whatever may be the consistent differences among them or within their own usages. Results for medial and final position consonants are reported separately throughout. This makes it possible to follow a technique used in the original specification of the distinctive features - comparison measures for minimal pairs. Presenting results separately by position also points up the fact that, for the same phoneme in the two positions investigated, there are consistent differences on a number of the obtained measures. The only studies to report acoustic values for fricatives in different positions (9, 194, 195) have indicated consistent differences as a function of position. The current Haskins Laboratories' minimal rules for synthesizing speech contain twelve "position modifiers" (125). Certain linguists have suggested that a phoneme may demand different acoustic specifications as a function of position - that the acoustic data which characterize an initial phoneme may be entirely unavailable for characterizing the same phoneme in final position (200, p. 608). Certainly the differences in obtained values as a function of position which are cited throughout the present chapter lend support to Fischer-Jørgensen's statement: "Only when distinctive features have been phonetically defined for various positions separately can we attempt to find a common denominator" (44, p. 476).

1 The term "cognate pair", as used in the present study, refers to those phonemes which are in opposition according to the distinctive feature analysis, e.g., the Tense phoneme /f/ is the cognate of the Lax phoneme /v/ on the Tense/Lax parameter, and the Grave phoneme /f/ is the cognate of the Acute phoneme /θ/ on the Grave/Acute parameter (see Table 1, p. 19).

Throughout the chapter, measurements made on voiced and voiceless cognates are treated as directly comparable. As has been pointed out by Harris (69), segmentation which involves the voiced fricatives presents a greater problem than that which involves voiceless fricatives. Consequently, absolute values may be affected by errors in judgment concerning the boundaries of the voiced fricative segments. However, since particular care was taken to ensure that criteria for segmentation were consistent (cf. Chapter III, p. 39), both across the Tense/Lax opposition cognates /f/-/v/ and /θ/-/ð/, and across the Grave/Acute opposition cognates /f/-/θ/ and /v/-/ð/, it is difficult to see how judgmental error in segmentation could affect the results differentially. Throughout the presentation of the results, each table is accompanied by a figure which presents the mean values in graphical form. For the comparison measures, the figures are deviation bar graphs.

B. THE DISTINCTIVE FEATURE TENSE/LAX

The distinctive feature Tense/Lax is specified in terms of three physical parameters (see Table 2, p. 22). A Tense phoneme is said to be characterized by a "higher (vs. lower) total amount of energy in conjunction with a greater (vs. smaller) spread of the energy in the spectrum and in time" (98, p. 30). A testing of this predicted relationship between Tense consonants and their Lax cognates required consonantal measures of: 1) amount of energy, 2) spread in spectrum, and 3) duration. Of the phonemes under investigation in the present study, /f/ and /θ/ are classified as Tense, and /v/ and /ð/ as their Lax cognates, respectively.

1. Amount of Energy

Obtained Measures

The amount of energy for a given consonant in each consonant and vowel combination is tabulated in Table 5. The values of column one, presented in units of db milliseconds (db times milliseconds), indicate the mean energy present in each consonant. These measures were obtained from full consonant duration, within the frequency range from 200 cycles to 20,000 cycles.² Since it has been demonstrated that duration is relevant to consonant identity (28, 125, 129, 140, 198), measures taken over the full consonantal durations are presented as being more representative of the phoneme in question than are measures drawn from an arbitrary sample taken from some portion of the phoneme (cf. 58, 63, 85). Further, a constant sampling interval may include most of the consonant in question when that consonant is typically shorter, as are the Lax phonemes, but may include only a small portion of the consonant when that consonant is typically longer, as are the Tense phonemes. As a consequence, there is the danger that a sample of fixed duration may not represent the members of opposition pairs with equal validity.

² As reported earlier in this study, the 200 cycle high pass filter was used by analogy with previous measures in order "to eliminate tape noise, hum, etc." (58, p. 125).

TABLE 5

Amount of energy* for the consonant as a function of the adjacent stressed vowel

MEDIAL

Consonant   Vowel   Mean Value   Min. Value   Max. Value   SD
/f/         i       491.10       322.58       658.06       20.40
/f/         æ       450.00       309.68       625.81       17.83
/f/         a       472.58       348.39       645.16       19.37
/f/         u       556.45       335.48       709.68       19.31
/v/         i       414.52       245.16       567.74       16.73
/v/         æ       341.94       245.16       503.23       13.00
/v/         a       331.42       232.26       529.03       14.21
/v/         u       412.90       258.06       677.42       21.05
/θ/         i       503.23       367.74       638.71       15.94
/θ/         æ       408.06       309.68       529.03       12.85
/θ/         a       430.65       322.58       554.84       13.96
/θ/         u       466.90       387.10       587.10       12.81
/ð/         i       395.16       296.77       554.84       13.38
/ð/         æ       362.06       212.90       587.10       18.36
/ð/         a       379.03       290.32       522.58       12.00
/ð/         u       450.00       367.74       606.45       12.37

FINAL

Consonant   Vowel   Mean Value   Min. Value   Max. Value   SD
/f/         i       945.94       612.90       1677.42      56.04
/f/         æ       941.94       354.84       1858.06      69.50
/f/         a       833.03       432.26       1819.35      63.56
/f/         u       938.71       638.71       1393.55      43.82
/v/         i       846.77       683.87       1051.62      18.68
/v/         æ       620.13       374.19       909.68       27.35
/v/         a       745.94       380.65       1380.65      47.01
/v/         u       874.19       516.13       1303.23      40.50
/θ/         i       754.00       522.58       1200.00      34.21
/θ/         æ       642.71       264.52       961.29       39.88
/θ/         a       787.87       354.84       1329.03      44.83
/θ/         u       855.61       638.71       1200.00      28.16
/ð/         i       808.84       658.06       1187.10      26.48
/ð/         æ       937.87       483.87       1522.58      52.35
/ð/         a       888.71       451.61       1651.61      53.01
/ð/         u       845.16       425.81       1225.81      38.08

* Values in db milliseconds (db times milliseconds).

Columns two, three, and four in Table 5, citing the minimum value observed, the maximum value observed, and the standard deviation from the mean, give some indication of the variability among the speakers. As is traditional for measurements of human speech, the variability is large. Of the acoustical studies which have investigated more than single utterances of one or two speakers, it has been said, "As might have been expected, these studies showed great variability in the data" (63, p. 107). Three of the most recent of the few studies on the fricative sounds have cited similar results (9, 194, 195). Hughes and Halle wrote, "The discrepancies among the spectra of a given fricative as spoken by different speakers in different contexts are so great as to make the procedure of plotting those spectra on one set of axes a not very illuminating one" (85, p. 305). Tarnóczy (191) reported relative pressure levels for the voiceless fricatives on a scale from zero to -18 db with variations of ± 5 db on repeated observations with a single speaker. This variability, he felt, had implications concerning potential variability among speakers. "Ergebnisse auf Grund verschiedener Serien derselben Versuchsperson weisen eine Streuung von ± 5 db auf, wobei die mögliche Streuung der Ergebnisse bei Ausspracheabweichungen verschiedener Versuchspersonen noch höher sein kann und hierbei sogar eine Änderung der Gestalt des Lautspektrums nicht ausgeschlossen ist" (191, p. 335). [Results based on different series from the same subject show a scatter of ± 5 db, while the possible scatter of the results with pronunciation differences among different subjects may be still higher, and in that case even a change in the shape of the sound spectrum is not ruled out.]

Fig. 5. Amount of energy for the consonant as a function of the adjacent stressed vowel. Tense/Lax cognate pairs contrasted. Values in db milliseconds.

Figure 5 presents a graphical representation of the mean values from Table 5 for each consonant as a function of the adjacent stressed vowel. The amount of energy in db milliseconds is displayed along the abscissa for each of the consonant allophones indicated along the ordinate. These combinations are presented in minimal pairs, i.e., in paired values for the Tense/Lax opposition consonants in identical phonetic environments. Each bar represents the mean of eight observations. Figure 5 indicates that, on the average, in both medial and final positions, the amount of energy in the Tense consonant /f/ is greater than that in its Lax cognate, /v/. Further, in medial position, both Tense consonants, /f/ and /θ/, average consistently greater amounts of energy than do their Lax cognates, /v/ and /ð/. However,

the difference appears to be negligible with /θ/ and /ð/ appearing medially before the vowel /u/. In final position, while the relationship between /f/ and /v/ remains unchanged, the /θ/-/ð/ opposition does not follow the predicted pattern. In this position, the amount of energy for the Lax consonant /ð/ exceeds that for its Tense cognate /θ/ in three out of the four vowel-consonant combinations. The single exception is for final position after the vowel /u/. Here, as in medial position with /u/, the difference is in the predicted direction, but negligible.

There is little information in the literature to which these data may be related. In 1926, Sacia and Beck (171) published data on intensity relationships among the sounds of speech. Somewhat later, in 1929, Fletcher published a table which collated the different sets of data and represented the relative amounts of power in the different speech sounds, using the power of the faintest sound as the basis for comparison. The ratio of powers was 680 to one, from the strongest vowel, /o/, to the weakest consonant, /θ/. On this scale (45, p. 74), the four fricatives of interest in this study were rated as follows: /v/ 12, /ð/ 11, /f/ 5, and /θ/ 1. The Sacia and Beck study has been criticized for "insufficient sampling" (46, p. 83), and for using methods which were "not unexceptionable" (63, p. 100). Nevertheless, that study remains unique in the literature. It has never been superseded, has been extensively quoted, and reappears unchanged in the 1953 edition of Fletcher's book (46, p. 86). Yet, despite the fact that these data, practically the only data available on relative consonant intensities, cite the voiced consonants as stronger than their voiceless counterparts, the literature has consistently reiterated the idea expressed in Halle's generalization, "It is well known that voiced consonants are of lower intensity than unvoiced consonants" (63, p. 149).

TABLE 6

Amount of energy* for the consonant as a function of adjacent Diffuse/Compact stressed vowels

Consonant   Adjacent vowel   MEDIAL Mean   FINAL Mean
/f/         Diffuse          523.81        942.32
/f/         Compact          461.29        887.48
/v/         Diffuse          413.74        860.52
/v/         Compact          336.71        683.10
/θ/         Diffuse          485.10        804.84
/θ/         Compact          419.35        715.35
/ð/         Diffuse          422.58        827.03
/ð/         Compact          370.58        913.29

* Values in db milliseconds.

The data obtained under the conditions of this study, presented in Table 5 and Figure 5, might be interpreted as lending some support to the generalization that unvoiced consonants are stronger than their voiced cognates. At the same time, there is some support for the Sacia and Beck findings in regard to the comparative intensities of /θ/ and /ð/. The variable here appears to be that of position in utterance: the energy of /θ/ relative to that of /ð/ varies as a function of position.

Table 6 and Figure 6 show the mean values for the amount of energy for each consonant as a function of the Diffuse/Compact nature of the adjacent stressed vowel. For the same consonant, in either position, mean energy values are greater when that consonant appears with a Diffuse stressed vowel (either /i/ or /u/) than when it appears with a Compact stressed vowel (either /æ/ or /a/).³ Table 7 and Figure 7 present means for Tense and Lax cognates as a function of the adjacent stressed Compact or Diffuse vowel. These data indicate values for the Tense phonemes which are consistently greater than those for the corresponding Lax phonemes. Further, without exception, means for amount of consonant energy are greater when the adjacent stressed vowel is Diffuse than when it is Compact.

Fig. 6. Amount of energy for the consonant as a function of adjacent Diffuse/Compact stressed vowels. Cognate pairs contrasted. Values in db milliseconds. [Bar graph with Medial and Final panels; one bar for each consonant with Diffuse (D) and with Compact (C) vowels.]

³ A single exception, again, is the Lax consonant /ð/ in final position. This time the exception has a dual nature. Not only does the Lax phoneme /ð/ average a greater amount of energy than its Tense cognate /θ/, but also that increased amount of energy is greater with Compact vowels than with Diffuse vowels. This exception, however, is not strong enough to reverse the general trends shown in Figure 7.

TABLE 7

Amount of energy* for Tense/Lax consonants as a function of adjacent Diffuse/Compact stressed vowels

Consonants   Adjacent vowel   MEDIAL Mean   FINAL Mean
Tense        Diffuse          504.45        873.61
Tense        Compact          440.32        801.42
Lax          Diffuse          418.13        843.74
Lax          Compact          353.61        798.19

* Values in db milliseconds.

Fig. 7. Amount of energy for Tense/Lax consonants as a function of adjacent Diffuse/Compact stressed vowels. Values in db milliseconds. [Bar graph with Medial and Final panels; abscissa from 0 to 1200 db milliseconds.]
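The Diffuse/Compact grouping summarized in Tables 6 and 7 can be retraced from the cell means of Table 5. The sketch below is an illustration only and is not part of the original analysis. It averages the medial-position means over the Diffuse vowels (/i/, /u/) and the Compact vowels (/æ/, /a/). The names MEDIAL_MEANS and diffuse_compact_means, and the ASCII keys "ae", "th", and "dh", are hypothetical. Because the sketch averages rounded cell means rather than the original observations, its results may differ from the tabled values in the second decimal place.

```python
# Mean amount of energy (db milliseconds), medial position, copied from Table 5.
# Keys are ASCII stand-ins: "ae" for /æ/, "th" for Tense /θ/, "dh" for Lax /ð/.
MEDIAL_MEANS = {
    "f":  {"i": 491.10, "ae": 450.00, "a": 472.58, "u": 556.45},
    "v":  {"i": 414.52, "ae": 341.94, "a": 331.42, "u": 412.90},
    "th": {"i": 503.23, "ae": 408.06, "a": 430.65, "u": 466.90},
    "dh": {"i": 395.16, "ae": 362.06, "a": 379.03, "u": 450.00},
}
DIFFUSE = ("i", "u")    # Diffuse stressed vowels
COMPACT = ("ae", "a")   # Compact stressed vowels


def diffuse_compact_means(by_vowel):
    """Return (Diffuse mean, Compact mean) for one consonant's vowel means."""
    diffuse = sum(by_vowel[v] for v in DIFFUSE) / len(DIFFUSE)
    compact = sum(by_vowel[v] for v in COMPACT) / len(COMPACT)
    return diffuse, compact


if __name__ == "__main__":
    for consonant, by_vowel in MEDIAL_MEANS.items():
        d, c = diffuse_compact_means(by_vowel)
        print(f"{consonant}: Diffuse {d:.2f}, Compact {c:.2f}, Diffuse greater: {d > c}")
```

For every medial consonant the Diffuse mean computed this way exceeds the Compact mean, in line with the tendency described for Table 6.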

This consistent difference is interesting in the light of information concerning the relative intensities of these four vowels in terms of their classification as Compact or Diffuse. Jakobson (93, p. 333) has commented that "typical by-products of compactness" are greater loudness and longer duration. Fant has been more specific:

Two vowels differing in the frequency of the first formant, F1, i.e., in terms of compactness, are known to differ also in intensity. The greater intensity found with the higher F1 is not an independent variable. It must occur because of the higher F1, everything else being equal (37, p. 110).

Physiologically, the Compact vowels /æ/ and /a/ are characterized by greater incisor separation than are the Diffuse vowels /i/ and /u/. The amount of incisor separation has been shown to have a very high correlation with the amount of phonetic power in the vowel (34, p. 393). Black (8) reported relative vowel intensities in db for /i/ 0.00, for /æ/ 3.44, for /a/ 3.69, and for /u/ 2.56. These values would indicate that the two Compact vowels /æ/ and /a/ average considerably greater intensity than the Diffuse vowels /i/ and /u/. Fairbanks, House, and Stevens (35) have assigned to these four vowels the following relative positions in a descending order of vowel intensities: /æ/, /a/, /u/, and /i/. In the light of these established comparative vowel intensities, it appears that within both stronger (Tense) and weaker (Lax) fricative groups, there is a conjunction of lesser consonantal intensity with greater vowel intensity, and of greater consonantal intensity with lesser vowel intensity. The reciprocity which appears consistently in these intensity relationships may not be dissimilar to that established for the relative durations of Tense/Lax consonants and their preceding vowels. However, while the vowel-consonant duration reciprocity concerns only the preceding vowel, this reciprocity in terms of intensity appears to operate for a consonant with either a preceding or following stressed vowel.

TABLE 8

Amount of energy* for the consonant with pooled vowels

Consonant   MEDIAL Mean   FINAL Mean
/f/         492.52        914.90
/v/         375.23        771.81
/θ/         452.19        760.06
/ð/         396.58        870.19

* Values in db milliseconds.

Fig. 8. Amount of energy for the consonant with pooled vowels. Values in db milliseconds. [Bar graph with Medial and Final panels; abscissa from 0 to 1200 db milliseconds.]

Table 8 and Figure 8, which present mean values for each of the consonants with vowels pooled, show that /f/ is consistently stronger than its Lax cognate, /v/, and that medial /θ/ is stronger than its Lax cognate, /ð/, but that final /θ/ is weaker, on the average, than its Lax cognate, /ð/. It is clear that the greater strength of /f/ as compared with /v/ is more pronounced in both positions than is the greater strength of /θ/ as compared with /ð/ in medial position. This, coupled with the reversal of /θ/-/ð/ relative strengths in final position, might suggest that the amount of energy contrast may play a more definitive role in the /f/-/v/ phonemic opposition, and that its part in the /θ/-/ð/ opposition may be rather equivocal. In the latter case, the difference is typically less pronounced, and hence more subject to being eliminated or reversed under conditions of changed phonetic environment. In terms of Figure 8 it is also interesting to observe that, while the Tense phoneme /f/ is consistently stronger than the Tense phoneme /θ/, that condition is reversed in the case of their voiced cognates. Both medially and finally, /ð/ averages a greater amount of energy than does /v/. This relationship is partially confirmed by the information available in the literature. The Sacia and Beck data cited earlier rate /f/ above /θ/ in intensity and /v/ above /ð/, but not by comparable amounts. The relative values assigned to these fricatives would indicate that the two Tense and the

two Lax fricatives do not stand in similar intensity relationships to each other, i.e., the energy differential between /f/ and /θ/ is five times greater than that between /v/ and /ð/. The latter phonemes are assigned values only one digit apart (12 and 11, respectively) on a scale which runs from 680 to one. In the most recent data on the intensity of voiceless fricatives, Tarnóczy (191, p. 334) ranked them in terms of negative db values, assigning to the strongest voiceless fricative, /s/, the arbitrary value of zero db. On his scale, /θ/ is rated at -11.5 db, and /f/ at -12.5 db. They are given comparable relative positions in respect to their relationships to the average speech power, i.e., /θ/ -13.5, and /f/ -14.5. Since no other source in the literature attributes greater strength to /θ/ than to /f/, one must consider that the sounds which were investigated by Tarnóczy were produced by native Hungarian speakers. For them, the /f/ is a native phoneme; the /θ/ is not. It may be that natural acoustical relationships are not preserved when the test stimuli contain both native and non-native phonemes. In any event, it would seem advisable to view comparisons of such phonemes with some reservations, particularly when they contradict the available evidence.

Final summaries of the obtained values are presented in Table 9 and Figure 9. Here it is apparent that Tense consonants in either position with vowels pooled are stronger than are their Lax cognates. This is in accord with the distinctive feature specification. The general tendency for Tense consonants to show greater strength than Lax cognates proves sufficient, on the average, to override certain reversals in terms of specific consonants and positions.

TABLE 9

Amount of energy* for Tense/Lax consonants with pooled vowels

Consonants   MEDIAL Mean   FINAL Mean
Tense        472.39        837.48
Lax          385.87        820.97

* Values in db milliseconds.

Fig. 9. Amount of energy for Tense/Lax consonants with pooled vowels. Values in db milliseconds. [Bar graph with Medial and Final panels; abscissa from 0 to 1200 db milliseconds.]
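The way in which the pooled Tense and Lax means absorb the final-position reversal can be retraced from the per-consonant means of Table 8. The following is a minimal sketch under stated assumptions, not a reconstruction of the original computation: the names POOLED, tense_lax_means, and reversed_pairs are hypothetical, "th" and "dh" are ASCII stand-ins for /θ/ and /ð/, and the sketch works from rounded table entries, so it agrees with Table 9 only to within rounding.

```python
# Pooled-vowel means (db milliseconds) copied from Table 8.
# "th" stands in for Tense /θ/ and "dh" for Lax /ð/.
POOLED = {
    "medial": {"f": 492.52, "v": 375.23, "th": 452.19, "dh": 396.58},
    "final":  {"f": 914.90, "v": 771.81, "th": 760.06, "dh": 870.19},
}
TENSE = ("f", "th")
LAX = ("v", "dh")   # same order as TENSE, so zip() yields cognate pairs


def tense_lax_means(position):
    """Average the Tense and the Lax consonant means for one position."""
    values = POOLED[position]
    tense = sum(values[c] for c in TENSE) / len(TENSE)
    lax = sum(values[c] for c in LAX) / len(LAX)
    return tense, lax


def reversed_pairs(position):
    """Cognate pairs in which the Lax member has the greater pooled mean."""
    values = POOLED[position]
    return [(t, l) for t, l in zip(TENSE, LAX) if values[l] > values[t]]


if __name__ == "__main__":
    for position in POOLED:
        tense, lax = tense_lax_means(position)
        print(position, round(tense, 2), round(lax, 2), reversed_pairs(position))
```

On these figures only the final-position /θ/-/ð/ pair is reversed, and the Tense mean still exceeds the Lax mean in both positions, though by a small margin in final position.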

Comparison Measures

Table 10 presents comparison measures showing the difference in the strengths of the Tense/Lax cognates. To obtain these values, the amount of energy present in the Lax consonant was subtracted from the amount of energy present in the Tense cognate. Since the distinctive feature analysis specifies the amount of energy in the Tense cognates to be greater than that in their Lax cognates, this comparison measure should reveal both the direction and extent of the difference. The comparison measure was expressed as a negative value whenever the obtained value for the Lax consonant exceeded that for the Tense cognate. Column five in Table 10 shows, for each consonant opposition, the number of observations for which the difference was equal to, or less than, zero. Thus, out of a total of 128 observations there were forty-eight instances in which the Tense and Lax cognates either showed no difference in amount of energy, or reversed the direction of the expected difference. Fourteen of those failures to meet the specification were found in medial position, thirty-four in final; sixteen of the forty-eight failures were for the /f/-/v/ opposition and thirty-two for the /θ/-/ð/ opposition. It appears that the predicted relationship of Tense and Lax cognates in terms of amount of energy may not be a uniformly stable cue. These data indicate that it shows less stability with final position consonants than with medial position consonants, and less stability for the /θ/-/ð/ opposition than for the /f/-/v/ opposition.

TABLE 10

Comparison measure: Difference in amount of energy* for Tense/Lax consonant cognates as a function of the adjacent stressed vowel

MEDIAL

Opposition   Vowel   Mean Value   Min. Value   Max. Value   SD      Nx ≤ 0
/f/-/v/      i       76.65        0.00         232.26       14.53   2
/f/-/v/      æ       108.06       -12.90       316.13       18.39   2
/f/-/v/      a       141.16       45.16        329.03       13.78   0
/f/-/v/      u       143.55       -6.45        367.74       21.95   1
/θ/-/ð/      i       108.06       45.16        187.10       8.66    0
/θ/-/ð/      æ       46.00        -141.94      251.61       19.05   3
/θ/-/ð/      a       51.61        -58.06       161.29       11.66   2
/θ/-/ð/      u       16.97        -70.97       148.39       10.95   4

FINAL

Opposition   Vowel   Mean Value   Min. Value   Max. Value   SD      Nx ≤ 0
/f/-/v/      i       99.23        -161.29      625.81       51.10   5
/f/-/v/      æ       314.52       -290.32      948.39       56.57   1
/f/-/v/      a       87.10        -412.90      438.71       40.13   3
/f/-/v/      u       64.52        -316.13      309.68       32.71   2
/θ/-/ð/      i       -54.84       -193.55      206.45       19.72   6
/θ/-/ð/      æ       -295.16      -890.32      141.94       52.35   7
/θ/-/ð/      a       -100.84      -322.58      70.97        19.03   6
/θ/-/ð/      u       10.52        -322.58      548.39       43.93   4

* Values in db milliseconds.

Fig. 10. Comparison measure: Difference in amount of energy for Tense/Lax consonant cognates as a function of the adjacent stressed vowel. Values in db milliseconds. [Deviation bar graph with Medial and Final panels.]

Figure 10 presents the mean values from Table 10 in the form of a deviation bar graph. Bars representing positive values extend to the right of the center, or zero, line; bars representing negative values extend to the left. As would be expected from the obtained values reported earlier, the comparison measures for the /θ/-/ð/ opposition in final position with the vowels /i/, /æ/, and /a/ are represented here as negative means. Clearly, they show a reversal of the predicted trend. Only with the vowel /u/ is the expected intensity relationship maintained between /θ/ and /ð/, and here the difference, while in the expected direction, is negligible. The most conspicuous feature of Figure 10 is the extent of the difference found between Tense and Lax consonants in final position with /æ/. Although the differences for the two cognate pairs are of almost equal magnitude, they are in opposite directions. In both cases, they represent an intensification of the already established trend.⁴ A last minor point of some interest in connection with Table 10 is that, with the single exception of /f/ minus /v/ in medial position, once again the smallest differences between consonant cognates are to be found in conjunction with the vowel /u/.

⁴ Each of these two mean values is maximized by one extreme observation. These two observations, however, involve two different speakers, two different consonant pairs, and two different directions of difference. In each case, the extreme value was in the direction taken by seven of the eight comparison measures in that set of observations. In column four of Table 10 the elimination of each of these extreme measures would substitute a +496.77 for the +948.39 as the maximum value obtained for final /f/ minus final /v/ with /æ/, and would substitute a -561.29 for the -890.32 as the maximum value obtained for final /θ/ minus final /ð/ with /æ/. This would, of course, bring a commensurate reduction in the means and standard deviations for each, without reversing the direction of difference in either case. In terms of their acceptability as phonemic representations there appears to be no reason to exclude the observations which yield these extreme comparison values. They should be considered in assessing the magnitude, but not the direction, of their respective means as shown in Figure 10. It may prove of some interest that these extreme values, though differing on all other parameters (speaker, consonant pair, and direction of difference), did both occur in final position with the vowel /æ/.

TABLE 11

Comparison measure: Difference in amount of energy* for Tense/Lax consonant cognates as a function of adjacent Diffuse/Compact stressed vowels

Comparison      Adjacent vowel   MEDIAL Mean   FINAL Mean
/f/ minus /v/   Diffuse          110.06        81.87
/f/ minus /v/   Compact          124.58        200.84
/θ/ minus /ð/   Diffuse          62.52         -22.19
/θ/ minus /ð/   Compact          48.77         -198.00

* Values in db milliseconds.

Fig. 11. Comparison measure: Difference in amount of energy for Tense/Lax consonant cognates as a function of adjacent Diffuse/Compact stressed vowels. Values in db milliseconds. [Deviation bar graph with Medial and Final panels.]

Table 11 and Figure 11 present comparison means for consonant oppositions as

a function of the Diffuse/Compact nature of the adjacent stressed vowel. With the exception of the /θ/-/ð/ opposition in final position, the differences are all in the expected direction. While earlier obtained measures showed that a given consonant, either Tense or Lax, tended to show a greater amount of energy in conjunction with a Diffuse vowel, and a lesser amount in conjunction with a Compact vowel, it appears in Figure 11 that no such consistent relationship holds for the comparison measures.

TABLE 12

Comparison measure: Difference in amount of energy* for the Tense/Lax opposition as a function of adjacent Diffuse/Compact stressed vowels

Comparison         Adjacent vowel   MEDIAL Mean   FINAL Mean
Tense minus Lax    Diffuse          86.32         29.87
Tense minus Lax    Compact          86.71         1.42

* Values in db milliseconds.

Fig. 12. Comparison measure: Difference in amount of energy for the Tense/Lax opposition as a function of adjacent Diffuse/Compact stressed vowels. Values in db milliseconds. [Deviation bar graph with Medial and Final panels.]

It would seem at first glance that the differences in the amount of energy between Tense and Lax cognates tend to be less in conjunction with Diffuse vowels and greater in conjunction with Compact, with the exception of the /θ/-/ð/ opposition in medial position. This apparent tendency, however, is not confirmed by the summary presented in Table 12 and Figure 12, where Tense minus Lax values are displayed as a function of the Diffuse/Compact nature of the adjacent stressed vowel. Here the exception noted above is strong enough to result in virtually identical medial-position values (86.32 and 86.71) for the differences between Tense and Lax cognates with Diffuse and with Compact vowels. Further, the negative and positive values for final position, when summed, yield

greater means for differences between Tense and Lax cognates with Diffuse vowels than with Compact. Thus, if the comparison measures can be said to show any trend in this matter, it is that of a slight tendency in the direction of the decided trend of the obtained measures, i.e., greater averages with Diffuse vowels than with Compact. In terms of the effect of Diffuse/Compact vowels upon failures in the direction of expected differences, data from Table 10 indicate that among the forty-eight instances in which differences in the strength of Tense/Lax cognates were either non-existent or in the direction opposite to that predicted, twenty-four occurred with Diffuse vowels and twenty-four with Compact vowels. It appears that such deviations in terms of the comparison measures are not a consequence of the Diffuse/Compact nature of the adjacent stressed vowel.

TABLE 13

Comparison measure: Difference in amount of energy* for the Tense/Lax opposition as a function of the adjacent stressed vowel

Comparison         Vowel   MEDIAL Mean   FINAL Mean
Tense minus Lax    i       92.32         22.19
Tense minus Lax    æ       77.03         9.68
Tense minus Lax    a       96.39         -6.84
Tense minus Lax    u       80.26         37.48

* Values in db milliseconds.

Fig. 13. Comparison measure: Difference in amount of energy for the Tense/Lax opposition as a function of the adjacent stressed vowel. Values in db milliseconds. [Deviation bar graph with Medial and Final panels.]

Table 13 and Figure 13 show differences between Tense and Lax consonant cognates as a function of individual vowels. These differences appear to be consistently greater for medial position cognates than for final position cognates. This tendency is confirmed by the final summaries, Table 14 and Figure 14. In both positions, the difference is in the direction predicted, i.e., the amount of energy contained in Tense consonants is greater than that in their corresponding Lax cognates. In medial position, that difference is greater than it is in final position.

TABLE 14

Comparison measure: Difference in amount of energy* for the Tense/Lax opposition with pooled vowels

Comparison         MEDIAL Mean   FINAL Mean
Tense minus Lax    86.52         15.61

* Values in db milliseconds.

Fig. 14. Comparison measure: Difference in amount of energy for the Tense/Lax opposition with pooled vowels. Values in db milliseconds. [Deviation bar graph with Medial and Final panels.]

In summary, amount of energy measures, both in obtained values and in comparison values, indicate that, on the average, Tense consonants are stronger than their Lax cognates. This is in accord with the specification of the distinctive feature analysis. It is important to note, however, that of the consonant and vowel combinations studied in this investigation, a considerable number (forty-eight out of 128 observations, or 37.5%) yielded comparison measures contrary to the specification. These exceptions pattern in terms of both the consonant opposition and the position involved. The consonant opposition /f/-/v/ appears to show greater stability in the amount of energy relationships between cognates than does the /θ/-/ð/ opposition. Similarly, the medial position consonants appear to show greater stability in the amount of energy relationships between cognates than do final position consonants.
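The counts of failures quoted in this summary follow directly from column five of Table 10. The sketch below is offered as an illustrative tally only, with hypothetical names (FAILURES, total_failures, failures_by_vowel_class) and ASCII stand-ins "th-dh" for the /θ/-/ð/ opposition and "ae" for /æ/. It assumes eight observations per cell, as stated for the figures, and reads the four counts in each row in the vowel order i, æ, a, u.

```python
# Comparison measures equal to or less than zero, per cell of eight observations,
# copied from column five of Table 10. Vowel order in each list: i, ae, a, u.
FAILURES = {
    ("medial", "f-v"):   [2, 2, 0, 1],
    ("medial", "th-dh"): [0, 3, 2, 4],
    ("final", "f-v"):    [5, 1, 3, 2],
    ("final", "th-dh"):  [6, 7, 6, 4],
}
OBSERVATIONS_PER_CELL = 8
DIFFUSE_INDEXES = (0, 3)   # positions of /i/ and /u/ in each list
COMPACT_INDEXES = (1, 2)   # positions of /æ/ and /a/ in each list


def total_failures(position=None, opposition=None):
    """Sum the failures, optionally restricted to one position or opposition."""
    return sum(
        sum(counts)
        for (pos, opp), counts in FAILURES.items()
        if position in (None, pos) and opposition in (None, opp)
    )


def failures_by_vowel_class(indexes):
    """Sum the failures over the vowels at the given list positions."""
    return sum(counts[i] for counts in FAILURES.values() for i in indexes)


total_observations = OBSERVATIONS_PER_CELL * sum(len(c) for c in FAILURES.values())
failure_rate = 100 * total_failures() / total_observations

if __name__ == "__main__":
    print(total_failures(), "of", total_observations, f"({failure_rate}%)")
    print("medial:", total_failures(position="medial"),
          "final:", total_failures(position="final"))
    print("f-v:", total_failures(opposition="f-v"),
          "th-dh:", total_failures(opposition="th-dh"))
    print("Diffuse:", failures_by_vowel_class(DIFFUSE_INDEXES),
          "Compact:", failures_by_vowel_class(COMPACT_INDEXES))
```

Keeping the counts in per-cell lists, rather than as pre-summed totals, allows the same data to be regrouped by position, by opposition, or by vowel class without re-entering anything.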

2. Spread in Spectrum

The Tense phonemes are said to be characterized by a "greater (vs. smaller) spread of the energy in the spectrum" (98, p. 30). The relatively smaller spread in spectrum typifies their Lax cognates, according to the distinctive feature analysis. For the purposes of this study, it was decided that a single complete section taken at a comparable location for each consonant would provide useful information which could be interpreted within the limitations imposed by both instrumentation and methodology. It is recognized that the spread of energy in the spectrum of a sound varies in time. Since the Sonagraph sectioner, set at a point in time to scan a selected portion of the speech signal, integrates the filter output energy over a five millisecond interval, the section it produces represents the spread of energy in the spectrum of the signal during that period alone. For a sound of 150 milliseconds duration thirty contiguous sections would be necessary in order to specify the spectral spread of energy for the complete sound. For continuous consonants, as for vowels, "the heavy investment of time required" for such a procedure (63) must be weighed against the information obtained. A further complication resulted from the frequency limits of the instrument on which the sections were made, an instrument calibrated as recording energy throughout the frequency band from 100 cycles to 13,200 cycles, an extension of roughly 5000 cycles beyond the capacities of otherwise similar instruments from which most of the data to be found in the literature were obtained. Among the four phonemes investigated in the present study, three are represented by at least two sounds apiece for which the upper limits of spectral spread exceeded 13,200 cycles. It is apparent that, while the Sonagraph can be expected to yield useful information, it is not the definitive instrument for specifying the spread of energy in the spectra of these particular consonants.
On the other hand, with a feasible amount of extra time and effort, single consonant sections for each observation could be added to the data for the present study. Considering the dearth of information concerning these consonants, it seemed advisable to include such records, interpreting them with care. The four consonants under study are all classified by the distinctive feature analysis as Diffuse Continuants, i.e., sounds characterized by their lack of predominant central resonant regions and by their gradual onsets. Care was taken in the location of the section within the consonant so that the sections might be comparable. (See Chapter III, p. 35, for criteria.) These values are to be construed as comparable representative samples from the four fricatives under study, and not as statements of their respective ultimate limits in spectral spread.
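The cost of a complete spectral specification, as against the single section adopted here, can be put in concrete terms. The following is a small sketch, assuming only the five millisecond integration interval cited above; the function name sections_needed is hypothetical and the calculation is an illustration, not part of the original procedure.

```python
import math


def sections_needed(duration_ms, window_ms=5.0):
    """Contiguous sections required to cover a sound of the given duration.

    window_ms is the integration interval of one section (five milliseconds
    for the sectioner described in the text).
    """
    return math.ceil(duration_ms / window_ms)


if __name__ == "__main__":
    # A consonant of 150 milliseconds, the case cited in the text.
    print(sections_needed(150))
```

For the 150 millisecond sound cited above the function returns thirty sections, against the single section per observation actually taken.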

62

PRESENTATION AND DISCUSSION OF RESULTS

Obtained Measures

TABLE 15

Spread in spectrum* for the consonant as a function of the adjacent stressed vowel

MEDIAL
Consonant  Vowel  Mean Value  Min. Value  Max. Value  SD     N ≥ 13,200 cps
f          i      7955        3060        13,200      66.56  2
f          æ      6492        390         11,100      45.39  0
f          a      7240        1820        13,200      67.31  1
f          u      8401        3060        13,200      62.45  3
v          i      7183        3380        13,200      42.43  1
v          æ      4030        520         7,020       38.21  0
v          a      5899        460         13,000      58.05  0
v          u      6078        520         13,200      78.61  1
θ          i      9051        4490        13,200      55.05  2
θ          æ      7142        2150        13,200      54.22  1
θ          a      7240        1950        13,200      63.88  1
θ          u      9742        6630        13,200      48.27  3
ð          i      5980        1170        8,000       31.59  0
ð          æ      5355        590         12,480      56.13  0
ð          a      3811        260         6,830       36.61  0
ð          u      6256        3060        8,190       19.85  0

FINAL
Consonant  Vowel  Mean Value  Min. Value  Max. Value  SD     N ≥ 13,200 cps
f          i      7418        4030        11,100      36.74  0
f          æ      8995        3840        13,200      56.75  2
f          a      7020        4550        13,200      42.43  1
f          u      9165        9220        13,200      50.99  2
v          i      5143        2990        7,670       27.91  0
v          æ      2551        1890        3,510       9.11   0
v          a      3486        1950        6,830       22.18  0
v          u      3242        850         6,180       23.81  0
θ          i      6614        3380        13,200      56.75  1
θ          æ      6061        390         13,200      57.88  1
θ          a      5241        2860        7,410       27.22  0
θ          u      7670        650         13,200      64.73  2
ð          i      4672        2540        6,630       25.59  0
ð          æ      3372        850         6,630       32.86  0
ð          a      3786        1890        6,050       27.24  0
ð          u      5550        1110        7,220       31.43  0

* In cycles per second.

The mean spread of energy shown in the spectrum of sections taken for a given consonant with each of the vowels is presented in column one of Table 15. The figures represent frequency in cycles per second. The minimum and maximum values obtained for each combination are indicated in columns two and three, and the standard deviations from the means in column four. For this measure, as for the amount-of-energy measure, large standard deviations indicate a high degree of variability among the speakers. The measures indicate, however, somewhat less variability in spectral spread among voiced consonants than among unvoiced consonants, particularly in final position. Column five shows the number of observations in each set which were equal to, or could be presumed to exceed, the instrument's upper frequency limit. As would be expected, the voiceless fricatives, /f/ and /θ/, provided most of these instances. Each of the voiceless consonants provided the same number of such observations, eleven out of the sixty-four observations for each consonant, or 17%. For both voiceless consonants an appreciable percentage of these excessive spectral spreads occurred with Diffuse vowels, for /f/ 64% and for /θ/ 73%. The voiced consonants,

THE DISTINCTIVE FEATURE TENSE/LAX


on the other hand, provided only two instances in which the spectral spread exceeded instrumental capacity; /v/ provided both instances, /ð/ none. The two excessive spectral spreads for /v/ both occurred with Diffuse vowels.

Figure 15 presents the means from Table 15 in graphical form. Spread in spectrum in cycles per second is displayed along the abscissa as a function of each of the consonant and vowel combinations indicated along the ordinate. The Tense/Lax opposition consonants are presented in pairs, with each bar representing the mean value of eight observations. Figure 15 indicates clearly that the spectral spread recorded from the Tense consonants in both positions and with all vowels is greater than that recorded from their Lax cognates. In addition to being in the predicted direction, the differences are generally of considerable magnitude, greater in final position than in medial position for the /f/-/v/ opposition but greater in medial position than in final position for the /θ/-/ð/ opposition.

The literature in acoustic phonetics presents virtually no direct information about characteristic spectral spread for these four fricatives. For reasons cited earlier, the fricatives have proved exceedingly difficult to specify acoustically. Frequency measures for them have been particularly subject to the limitations of instrumentation. Further, such acoustic studies as have been done on the fricatives have been concerned with a specification of characteristic frequency regions (in the sense of frequency bands of energy maxima) rather than with their total spread of energy in the spectrum. Certainly a contributory factor here has been the difficulty of resolving the relatively weak fricative energy. The combination of problems has been such that no single study since the Crandall study of 1925 has published spectral data on all four fricatives of interest here.
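The ceiling counts discussed for Table 15 can be tallied in a short sketch. The per-vowel counts below are transcribed from column five of Table 15; the dictionary layout, the ASCII labels (`ae`, `theta`, `eth`) and the function name `summarize` are assumptions made for illustration only.

```python
DIFFUSE = {"i", "u"}          # Diffuse vowels; "ae" and "a" are the Compact vowels
OBS_PER_CONSONANT = 64        # sixty-four observations per consonant, as stated in the text

# Observations at or above the 13,200 cps limit, per consonant, position and vowel.
CEILING = {
    "f":     {"medial": {"i": 2, "ae": 0, "a": 1, "u": 3},
              "final":  {"i": 0, "ae": 2, "a": 1, "u": 2}},
    "v":     {"medial": {"i": 1, "ae": 0, "a": 0, "u": 1},
              "final":  {"i": 0, "ae": 0, "a": 0, "u": 0}},
    "theta": {"medial": {"i": 2, "ae": 1, "a": 1, "u": 3},
              "final":  {"i": 1, "ae": 1, "a": 0, "u": 2}},
    "eth":   {"medial": {"i": 0, "ae": 0, "a": 0, "u": 0},
              "final":  {"i": 0, "ae": 0, "a": 0, "u": 0}},
}


def summarize(consonant):
    """Return (count, percent of all observations, percent with Diffuse vowels)."""
    total = 0
    diffuse = 0
    for counts in CEILING[consonant].values():
        for vowel, n in counts.items():
            total += n
            if vowel in DIFFUSE:
                diffuse += n
    pct_of_obs = round(100 * total / OBS_PER_CONSONANT)
    pct_diffuse = round(100 * diffuse / total) if total else None
    return total, pct_of_obs, pct_diffuse


if __name__ == "__main__":
    for name in CEILING:
        print(name, summarize(name))
```

Under these assumptions the tallies reproduce the figures given in the text: eleven observations (17%) for each voiceless consonant, with 64% and 73% of them adjacent to Diffuse vowels, and two observations for /v/.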
Some studies have concerned only the voiceless fricatives (67, 68, 191), others only certain fricative cognate pairs (85). European studies have not concerned themselves with the English phonemes /θ/ and /ð/ (see 191 as an exception). The most recent study to direct attention to all four of these fricatives was a perceptual and tape-splicing experiment which did not employ acoustic analysis (69). Since instrumentation and methodology have varied widely in these studies, this selectivity in the stimuli used makes it difficult to interrelate the findings of the few studies which have involved any of these four fricatives. However, it is possible to extract information from which certain inferences can be drawn regarding characteristic spectral spread for at least certain of these consonants. Since there is so little information available, even this much might serve to further the understanding of these four sounds which have proved so hard to specify.

Certain of Crandall's 1925 data on the characteristics of the several speech sounds have been re-published unaltered in the 1953 Fletcher (46). These data have not been superseded, even though Crandall's instrumentation was obviously limited, considered in terms of that available today. Crandall sharply curtailed variation in speaker and phonetic environment, so that all the sounds of English might be considered to some extent. Values established for the fricatives, for example, were based on measures obtained for a given consonant in only one position (medial) with only one vowel (/a/), spoken by only two subjects (both male). Crandall's statement that /v/ "shows a less prominent high-frequency component than its partner f, or any of the other fricative consonants" [italics mine], may be explicable in the light of the above-noted limitations in methodology. The present study would seem to indicate that such a generalization concerning /v/ might be too broad. It appears that the frequency spectrum of /v/ may vary considerably as a function of vowel environment, regardless of position.

Of these four fricatives, Crandall stated that, except for /v/, "the high frequencies are persistent and in many cases of large amplitude, both at the start and during the course of the consonant sound" (23, p. 622). Crandall speaks of 2600, 3000 and 3200 cycles as "the high frequencies". In the light of the percentage of stimulus items in the present study for which energy was still being recorded at the upper limit of instrumentation (13,200 cycles), it is apparent that Crandall's data yield no information concerning the total spread of energy in the spectrum for these sounds. /ð/ was characterized by Crandall as possessing "persistent ... high frequencies (2600, 3000, 3200)" and an "upper sibilant" at approximately 4000 cycles (23, p. 622). /θ/ was said to possess the same "high frequencies," but with essential differences in the relative strengths of certain frequency bands within the regions which /θ/ and /ð/ shared. Crandall took note of the "steeper wave front" of /θ/ as a characteristic which differentiated it from /ð/ (23, p. 622).
His general summary of these four fricatives states: "Comparing v/f with dth/th [/ð/-/θ/] it seems from the records that the former pair are of higher frequency (particularly f) and that for v/f as a unit the higher frequency characteristic is more pronounced; just the opposite conclusion to that reached by Paget" (23, p. 622). While the Crandall-Paget differences may have been simply an artifact of instrumentation and methodology (Paget used isolated sounds produced by an analog, cf. 149), Figure 15 of the present study indicates that similar differences may result from investigations of various positions in utterance. It appears that spectral spreads for these four fricatives pattern differently as a function of position.

For the voiced consonants /v/ and /ð/ no other acoustic data have appeared from which characteristic spectral spread may even be derived by inference. More attention has been given, however, to the frequency characteristics of the several voiceless fricatives. In 1954, Fischer-Jorgensen's report of an extensive acoustical study of Danish stop consonants mentioned 1412 spectrograms of three voiceless fricatives with which the stops under investigation were compared. Of the fricatives considered, only /f/ is of interest to the present study. The statement was made that /f/ is "characterized by a rather diffuse noise from 2000 upwards, but with predominance of frequencies above 4000" (42, p. 52). Later, in 1958, she summarized the information available in the statement that /f/ "has a rather equal distribution of noise in the whole spectrum, although more above 2000, and there is generally a peak at very high frequencies (8-9000)" (44, p. 475). Since there are no definitive statements in the literature as to total spread of energy in the spectrum for these consonants, such statements as the above will at least serve to indicate certain minimal limits of characteristic frequency range.

In 1954, Tarnoczy's extensive study (191) on voiceless fricatives provided new information, as well as a summary of related findings from earlier sources, among them Herman, Stumpf, Gemelli, and Exner. Again, most of these investigators were concerned with establishing characteristic frequency regions for the several sounds rather than their total spread in spectrum. Stumpf's early report of the Vollpegel in Hertz for /f/ as being 435-7400 (191, p. 320), however, may be interpreted as a statement of average minimal spread in spectrum for that consonant. Tarnoczy reported a frequency range for /f/ of 260-14,000 cps, a spread in spectrum of 13,740 cycles. The range for /θ/ Tarnoczy reported as 400-16,000 cps, a spread in spectrum of 15,600 cycles. As has been stated previously, Tarnoczy, in investigating /θ/, used speakers for whom it was a non-native phoneme. Further, the report of his results is tied to loudness perception, in that spread in spectrum for each fricative is presented in terms of the extremes of frequency between which consonantal energy was greater than zero phons. It is interesting to note, however, that Exner, working independently at the same time, and using the technique of auto-correlation rather than graphical-perceptual analytic methods, established essentially similar spectral spread values for /f/. His study did not include /θ/. In such comparisons as are possible with their data, the present study appears to present similar findings. There can be no doubt that the means shown in Figure 15 for the voiceless fricatives would be considerably higher, given equipment with a higher frequency response.
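The spread-in-spectrum figures quoted from the earlier literature follow from simple subtraction of the lower from the upper frequency limit, as the sketch below shows. The ranges are those cited in the text; the labels, the dictionary and the helper names are hypothetical, and the comparison against the 13,200 cps limit of the present instrument is added only for illustration.

```python
INSTRUMENT_UPPER_CPS = 13_200  # upper frequency limit of the instrument used in this study

# (lower, upper) frequency limits in cycles per second, as cited in the text.
REPORTED_RANGES_CPS = {
    "Stumpf /f/": (435, 7_400),
    "Tarnoczy /f/": (260, 14_000),
    "Tarnoczy theta": (400, 16_000),
}


def spread(lower_cps: int, upper_cps: int) -> int:
    """Spread in spectrum taken as upper limit minus lower limit."""
    if upper_cps < lower_cps:
        raise ValueError("upper limit must not be below lower limit")
    return upper_cps - lower_cps


def exceeds_instrument(upper_cps: int) -> bool:
    """True when a reported upper limit lies above the instrument's range."""
    return upper_cps > INSTRUMENT_UPPER_CPS


if __name__ == "__main__":
    for label, (lo, hi) in REPORTED_RANGES_CPS.items():
        print(label, spread(lo, hi), exceeds_instrument(hi))
```

Both of Tarnoczy's upper limits lie above 13,200 cps, which is consistent with the remark that the present means would be higher given equipment with a higher frequency response.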
It should be noted that while Tarnoczy's data indicate greater spectral spread for /θ/ than for /f/, Tarnoczy investigated only medial position consonants. The present study indicates a similar relationship between the relative spreads of /θ/ and /f/ in medial position, but shows a reversal of that relationship in final position. The possibility exists that relationships among consonants in terms of spread in spectrum may vary as a function of position.

Table 16 and Figure 16 indicate that a consistent difference exists in the spectral spread of a given consonant as a function of the Diffuse/Compact nature of the adjacent stressed vowel. All four consonants show the same trend, i.e., in both medial and final positions and whether the stressed vowel precedes or follows the consonant, spectral spreads for allophones adjacent to stressed Diffuse vowels are greater than those for allophones adjacent to stressed Compact vowels.

Summaries for Tense and Lax consonants, presented in Table 17 and Figure 17, show that Tense consonants in all contexts average greater spread in spectrum than do their Lax cognates. Both Tense and Lax consonants show slightly greater spectral spread in medial position, and a similar magnitude of increased spectral spread with Diffuse vowels in both positions.

TABLE 16

Spread in spectrum* for the consonant as a function of adjacent Diffuse/Compact stressed vowels

Consonant  Adjacent stressed vowel  MEDIAL Mean Value  FINAL Mean Value
f          Diffuse                  8178               8291
f          Compact                  6866               8007
v          Diffuse                  6630               4193
v          Compact                  4965               3019
θ          Diffuse                  9396               7142
θ          Compact                  7191               5651
ð          Diffuse                  6118               5111
ð          Compact                  4583               3579

* In cycles per second.

TABLE 17

Spread in spectrum* for Tense/Lax consonants as a function of adjacent Diffuse/Compact stressed vowels

Consonants  Adjacent stressed vowel  MEDIAL Mean  FINAL Mean
Tense       Diffuse                  8787         7717
Tense       Compact                  7028         5513
Lax         Diffuse                  6374         4651
Lax         Compact                  4774         3299

* In cycles per second.
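The Diffuse/Compact trend reported for Table 16 can be checked mechanically. The sketch below transcribes the Table 16 means (in cycles per second) and computes, for each consonant and position, how far the Diffuse mean exceeds the Compact mean; the data layout, the ASCII consonant labels and the function name are assumptions for illustration.

```python
# (Diffuse mean, Compact mean) in cycles per second, keyed by (consonant, position).
TABLE_16 = {
    ("f", "medial"): (8178, 6866),
    ("f", "final"): (8291, 8007),
    ("v", "medial"): (6630, 4965),
    ("v", "final"): (4193, 3019),
    ("theta", "medial"): (9396, 7191),
    ("theta", "final"): (7142, 5651),
    ("eth", "medial"): (6118, 4583),
    ("eth", "final"): (5111, 3579),
}


def diffuse_excess(table):
    """Return Diffuse mean minus Compact mean for every table entry."""
    return {key: diffuse - compact for key, (diffuse, compact) in table.items()}


if __name__ == "__main__":
    excess = diffuse_excess(TABLE_16)
    for key, value in excess.items():
        print(key, value)
    # The trend holds if every difference is positive.
    print(all(value > 0 for value in excess.values()))
```

Every difference is positive, in line with the statement that all four consonants show greater spread adjacent to Diffuse vowels; the smallest margin is for /f/ in final position.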

Table 18 and Figure 18 reveal an interesting relationship among these four fricatives in terms of their relative spreads in spectrum as a function of position. As mentioned earlier, findings here support Tarnoczy's data in terms of medial position, i.e., spread in spectrum for /θ/ is greater than that for /f/. However, this circumstance is reversed in final position, where /f/ shows greater spectral spread than does /θ/. According to Figure 18, the relationship between their Lax cognates, however, is the opposite. In medial position, /v/ shows greater spectral spread than does /ð/, while in final position, the /ð/ spectral spread exceeds that for /v/. This combination of inverse and reversed relationships means that whichever Tense consonant shows greater spread in spectrum as a function of position also shows a greater increase in spectral spread over its Lax cognate in that position. For example, in final position, where /f/ exceeds /θ/ in spectral spread, /f/ exceeds its Lax cognate /v/ by twice the extent to which medial /f/ exceeds medial /v/. A similar relationship, of somewhat lesser magnitude, pertains for /θ/ and /ð/ in medial position.

TABLE 18

Spread in spectrum* for the consonant with pooled vowels

Consonant  MEDIAL Mean  FINAL Mean
f          7522         8150
v          5797         3606
θ          8293         6397
ð          5350         4345

* In cycles per second.

In a final summary of the obtained spread-in-spectrum values, Table 19 and Figure 19 confirm that, with pooled vowels and in both positions, Tense consonants are characterized by greater spread in spectrum than are their Lax cognates. It would appear that the magnitude of the spectral spread for both Tense and Lax consonants is greater, on the average, in medial position than in final position.

TABLE 19

Spread in spectrum* for Tense/Lax consonants with pooled vowels

Consonants  MEDIAL Mean  FINAL Mean
Tense       7908         7273
Lax         5574         3975

* In cycles per second.

Comparison Measures

Table 20 presents comparison measures indicating the difference in the spread in spectrum shown by the Tense/Lax cognates. For each observation, the spread in spectrum value of the Lax consonant was subtracted from that of its Tense cognate. Observations in which the spectral spread for the Tense consonant exceeded that for its Lax cognate appear as the predicted positive values; observations for which the difference indicated greater spectral spread for the Lax consonant appear as negative values.

TABLE 20

Comparison measure: Difference in spread in spectrum* for Tense/Lax consonant cognates as a function of the adjacent stressed vowel

MEDIAL
Opposition  Vowel  Mean Value  Min. Value  Max. Value  SD     N ≤ 0
f minus v   i      772         -3580       5920        54.31  4
f minus v   æ      2056        -130        5400        40.50  2
f minus v   a      1341        -3380       4810        39.62  1
f minus v   u      2324        -2210       8320        61.32  3
θ minus ð   i      2161        -8190       7280        84.32  2
θ minus ð   æ      1788        -5200       6570        68.56  3
θ minus ð   a      3429        -1950       7150        46.90  1
θ minus ð   u      3486        -1040       7020        47.65  1

FINAL
Opposition  Vowel  Mean Value  Min. Value  Max. Value  SD     N ≤ 0
f minus v   i      2275        -650        4680        29.85  2
f minus v   æ      6443        330         10,990      63.95  0
f minus v   a      3535        1240        8580        34.93  0
f minus v   u      5923        2730        11,000      47.75  0
θ minus ð   i      1942        -2210       7350        50.10  1
θ minus ð   æ      2690        -1690       6570        42.66  1
θ minus ð   a      1455        -3190       3900        35.36  2
θ minus ð   u      2121        -5980       8970        76.81  2

* In cycles per second.

Column five in Table 20 presents, for each combination, the number of observations out of eight in which the difference was either equal to, or less than zero, i.e., in which there was no difference between Tense and Lax cognates in spectral spread, or in which the difference was in the opposite direction to that predicted. Out of a total of 128 observations, there were twenty-five such instances (20%). Seventeen of these occurred with medial position cognates; eight with final. Twelve of the total occurred with the /f/-/v/ opposition, thirteen with the /θ/-/ð/ opposition. This might seem to imply that the spread-in-spectrum relationship for the Tense/Lax distinction is equally stable for both the /f/-/v/ and the /θ/-/ð/ oppositions, but that it varies in stability for both pairs as a function of position. This is not actually the case. The two Tense/Lax oppositions pattern quite differently. For the /f/-/v/ opposition, all but two of the twelve departures from the predicted spread-in-spectrum relationship occurred in medial position. For the /θ/-/ð/ opposition, on the other hand, the thirteen negative instances pattern almost equally, seven in medial position and six in final.

It would appear that spread-in-spectrum differences between cognate pairs may be comparatively stable cues for the /f/-/v/ opposition in final position. Only 8% of the total number of negative instances occurred there. The differences appear to be somewhat less stable for the /θ/-/ð/ opposition generally, but appear not to vary as a function of position. The number of reversals for this opposition amounted to 28% of the total in medial position and 24% in final position. For the /f/-/v/ opposition in medial position, the differences between cognates in terms of spread in spectrum would appear to be least stable as distinctive cues. Of the total number of reversals, 40% occurred with the /f/-/v/ opposition in medial position. Figure 20 presents the mean values from Table 20 in a deviation bar graph.
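The comparison measure described above (Lax value subtracted from Tense value for each observation, with a count of differences at or below zero) can be sketched as follows. The eight paired values are hypothetical and are not data from the study; the function name `comparison_measure` is likewise an assumption.

```python
from statistics import mean


def comparison_measure(tense, lax):
    """Per-observation Tense minus Lax differences with summary values.

    n_nonpositive counts observations where the difference is zero or
    negative, i.e. departures from the predicted Tense > Lax relationship.
    """
    if len(tense) != len(lax):
        raise ValueError("tense and lax must be paired observation for observation")
    diffs = [t - l for t, l in zip(tense, lax)]
    return {
        "diffs": diffs,
        "mean": mean(diffs),
        "min": min(diffs),
        "max": max(diffs),
        "n_nonpositive": sum(1 for d in diffs if d <= 0),
    }


if __name__ == "__main__":
    # Hypothetical spreads in cycles per second for eight speakers.
    tense = [7400, 6500, 9100, 5200, 8800, 13200, 6100, 7000]
    lax = [5100, 6500, 4300, 6000, 5900, 13200, 3900, 4400]
    print(comparison_measure(tense, lax))
    # Share of departures reported in the text: 25 of 128 observations.
    print(round(100 * 25 / 128))
```

In this invented set the mean difference stays positive although three of the eight observations are zero or negative, which is the kind of situation the text describes for the pooled data.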
All means are represented as positive values, indicating that the departures shown in Table 20 are neither numerous enough nor strong enough to reverse the predicted relationship for the means. Comparison values for /f/-/v/ tend to be greater for final position cognates, while for /θ/-/ð/ they tend to be greater for medial position cognates, except for those obtained with /æ/. The comparison measures for the /θ/-/ð/ opposition seem to follow no particular pattern in respect to the adjacent vowel. The opposition /f/-/v/, on the other hand, shows obvious similarity in pattern for both positions; that is, whether these consonants precede or follow the vowel, the difference between the spectral spreads of /f/ and /v/ is greater with the vowels /æ/ and /u/ and less with the vowels /i/ and /a/.

Fig. 20. Comparison measure: Difference in spread in spectrum for Tense/Lax consonant cognates as a function of the adjacent stressed vowel. Frequency in kilocycles.

Table 21 and Figure 21 show that the difference in the spectral spreads of the /f/ and /v/ cognates tends to be greater with Compact vowels than with Diffuse vowels. It would seem reasonable to expect that the Diffuse vowels might have a differential effect upon the spread in spectrum of the Tense and Lax cognates and hence, increase the difference between them in this respect. A vowel characterized by dispersion of its first three formants over a relatively wide frequency range would still not attain sufficiently high frequency values to exceed those of consonants like /f/ and /θ/, themselves characterized by even higher frequencies. On the other hand, such Diffuse vowels might well influence the consonants with predominantly lower frequencies, like /v/ and /ð/, toward greater spread of energy in the spectrum. This appears to be the case with /f/ and /v/, as indicated by the comparison measures presented in Figure 21 for differences between /f/ and /v/ with Diffuse and with Compact vowels. The same effect does not appear to operate consistently with the /θ/-/ð/ opposition, for in final position the difference is essentially the same with Compact and with Diffuse vowels, while in medial position it is only slightly greater with Diffuse vowels.

TABLE 21

Comparison measure: Difference in spread in spectrum* for Tense/Lax consonant cognates as a function of adjacent Diffuse/Compact stressed vowels

Opposition  Adjacent stressed vowel  MEDIAL Mean  FINAL Mean
f minus v   Diffuse                  1548         4099
f minus v   Compact                  1698         4989
θ minus ð   Diffuse                  2824         2031
θ minus ð   Compact                  2608         2072

* In cycles per second.

Fig. 21. Comparison measure: Difference in spread in spectrum for Tense/Lax consonant cognates as a function of adjacent Diffuse/Compact stressed vowels. Frequency in kilocycles.

TABLE 22

Comparison measure: Difference in spread in spectrum* for the Tense/Lax opposition as a function of adjacent Diffuse/Compact stressed vowels

Opposition       Adjacent stressed vowel  MEDIAL Mean  FINAL Mean
Tense minus Lax  Diffuse                  2186         3065
Tense minus Lax  Compact                  2153         3530

* In cycles per second.
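The values in Table 22 agree, to within rounding, with simple averages of the two oppositions in Table 21. The sketch below shows that arithmetic; treating Table 22 as an unweighted average of the /f/-/v/ and /θ/-/ð/ rows is an assumption made here for illustration, not a procedure stated in the text, and the names are hypothetical.

```python
# Mean Tense minus Lax differences from Table 21, in cycles per second.
TABLE_21 = {
    "medial": {
        "f-v": {"Diffuse": 1548, "Compact": 1698},
        "theta-eth": {"Diffuse": 2824, "Compact": 2608},
    },
    "final": {
        "f-v": {"Diffuse": 4099, "Compact": 4989},
        "theta-eth": {"Diffuse": 2031, "Compact": 2072},
    },
}

# Values printed in Table 22, keyed by (position, vowel class).
TABLE_22 = {
    ("medial", "Diffuse"): 2186,
    ("medial", "Compact"): 2153,
    ("final", "Diffuse"): 3065,
    ("final", "Compact"): 3530,
}


def pooled_difference(position: str, vowel_class: str) -> float:
    """Unweighted average of the two oppositions for one position and vowel class."""
    values = [rows[vowel_class] for rows in TABLE_21[position].values()]
    return sum(values) / len(values)


if __name__ == "__main__":
    for (position, vowel_class), printed in TABLE_22.items():
        print(position, vowel_class, pooled_difference(position, vowel_class), printed)
```

The averages make the pattern in the text easy to see: the medial values differ by only about 30 cycles, while in final position the Compact value exceeds the Diffuse value by several hundred cycles.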

Table 22 and Figure 22 indicate that differences between the Tense/Lax cognates with Diffuse and Compact vowels in medial position tend to be essentially similar, while in final position the differences between Tense and Lax cognates tend to be greater with Compact vowels. Medial position consonants in this study are all preceded by the unstressed vowel /ə/, while the stressed vowels preceding final consonants vary in character. The trend shown in Figure 22 would seem to suggest that a preceding vowel has a stronger influence upon the spectral spread of an adjacent consonant than does the vowel which follows that consonant. Since Miksak (140, p. 130) has indicated that the preceding consonant has little, if any, effect upon the duration of the following vowel, this might be interpreted as one more indication of the possible reciprocal influence within a VC syllabic unit.

Fig. 22. Comparison measure: Difference in spread in spectrum for the Tense/Lax opposition as a function of adjacent Diffuse/Compact stressed vowels. Frequency in kilocycles.

Table 23 and Figure 23 show the values for each consonant as a function of the contiguous stressed vowel. The conspicuous departure from the obtained pattern is, once again, the extreme value for the difference between Tense and Lax consonants in final position with /æ/.⁶ Except for this extreme value with /æ/, the difference in spread in spectrum between Tense and Lax cognates in both medial and final positions appears to grow progressively greater with vowels in the order of their placement on the physiological vowel diagram, i.e., the least difference with the highest front vowel, /i/, and the greatest difference with the highest back vowel, /u/.

The gradual progression of these opposition differences as a function of the adjacent vowel suggests some support for Malmberg's opinion that among certain allophones it may be more meaningful to speak of "gradual distinctive differences" (135, p. 319), rather than of dichotomies in regard to the relevant distinctive features.

TABLE 23

Comparison measure: Difference in spread in spectrum* for the Tense/Lax opposition as a function of the adjacent stressed vowel

Opposition       Vowel  MEDIAL Mean  FINAL Mean
Tense minus Lax  i      1466         2109
Tense minus Lax  æ      1921         4566
Tense minus Lax  a      2384         2495
Tense minus Lax  u      2905         4022

* In cycles per second.

⁶ Inspection of the raw data for this measure indicates that rather large values were obtained from samples for four of the eight speakers, two men and two women, for the difference between Tense and Lax cognates in final position with /æ/. Whatever the cause, this departure of a measure from the pattern appears to be a real departure.

Fig. 23. Comparison measure: Difference in spread in spectrum for the Tense/Lax opposition as a function of the adjacent stressed vowel. Frequency in kilocycles.

The final summaries of comparison measures, Table 24 and Figure 24, confirm the distinctive feature specification of greater spread in spectrum for Tense consonants than for their Lax cognates. It appears that Tense consonants exceed their Lax cognates by a greater margin in final position than in medial position.

TABLE 24

Comparison measure: Difference in spread in spectrum* for the Tense/Lax opposition with pooled vowels

Opposition       MEDIAL Mean  FINAL Mean
Tense minus Lax  2170         3297

* In cycles per second.
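The vowel-by-vowel progression described for Table 23 can be checked with a short sketch. The means are transcribed from Table 23; the vowel order follows the text (from /i/ to /u/), and the helper names and the ASCII label `ae` for /æ/ are hypothetical.

```python
VOWEL_ORDER = ["i", "ae", "a", "u"]  # order of the vowels as listed in Table 23

# Mean Tense minus Lax differences from Table 23, in cycles per second.
TABLE_23 = {
    "medial": {"i": 1466, "ae": 1921, "a": 2384, "u": 2905},
    "final": {"i": 2109, "ae": 4566, "a": 2495, "u": 4022},
}


def ordered_values(position, skip=()):
    """Table 23 means for one position in vowel order, omitting any skipped vowels."""
    return [TABLE_23[position][v] for v in VOWEL_ORDER if v not in skip]


def is_increasing(values):
    """True when each value is strictly greater than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))


if __name__ == "__main__":
    print(is_increasing(ordered_values("medial")))
    print(is_increasing(ordered_values("final")))
    # Setting aside the extreme final-position value with "ae".
    print(is_increasing(ordered_values("final", skip=("ae",))))
```

The medial means increase steadily across the four vowels, and the final means do so once the extreme value with /æ/ is set aside, which matches the qualification made in the text.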

Fig. 24. Comparison measure: Difference in spread in spectrum for the Tense/Lax opposition with pooled vowels. Frequency in kilocycles.

In summary, spread-in-spectrum measures, both obtained values and comparison values, indicate that, on the average, Tense consonants show greater spread in spectrum than do their Lax cognates in comparable samples. This is in agreement with the distinctive feature specifications for Tense/Lax cognates in terms of their spread in spectrum. However, it should be noted that for twenty-five out of 128 observations of cognate pairs, or 20%, there was either no difference in the spectral spread of the cognates, or the spread in spectrum was greater for the Lax consonants than for their Tense cognates. Furthermore, it appears that the magnitude of the difference varies as a function both of cognate pair and of position, being of greater magnitude for the /θ/-/ð/ opposition in medial position and for the /f/-/v/ opposition in final position.

3. Duration

Tense consonants, according to the distinctive feature analysis, are characterized by longer durations, i.e., greater "spread ... in time" (98, p. 30) than are their Lax cognates. "The prolonged duration of the sound is an accessory effect of the tension" (100, p. 153).

Obtained Measures

Column one of Table 25 shows the mean duration for each consonant as a function of the adjacent stressed vowel. Values are presented in milliseconds and each mean represents the average of eight observations. Columns two and three present the minimum and maximum values, respectively, for each set of eight observations. The standard deviations shown in column four indicate far less variability among speakers on the duration measure than on the two measures previously reported, i.e., amount of energy in the consonant and spread of energy in the consonant spectrum. It does appear from Table 25, however, that considerably more variability among speakers is evidenced for duration values on final position consonants. It seems reasonable that consonants surrounded by other phonemes might exhibit less durational variability than those produced as terminal sounds.⁷

TABLE 25

Consonant duration* as a function of the adjacent stressed vowel

MEDIAL
Consonant  Vowel  Mean Value  Min. Value  Max. Value  SD
f          i      208.8       140         310         5.57
f          æ      198.8       130         260         3.87
f          a      202.5       135         250         3.61
f          u      216.3       130         270         4.90
v          i      146.3       95          200         3.61
v          æ      136.3       90          175         2.83
v          a      135.0       100         170         3.00
v          u      153.8       95          215         4.36
θ          i      208.8       145         240         2.83
θ          æ      177.5       130         245         4.12
θ          a      196.3       130         255         4.69
θ          u      200.0       150         250         4.36
ð          i      158.8       130         185         2.00
ð          æ      140.0       90          185         3.32
ð          a      156.3       120         190         2.24
ð          u      161.3       115         185         2.45

FINAL
Consonant  Vowel  Mean Value  Min. Value  Max. Value  SD
f          i      423.8       280         695         13.96
f          æ      392.5       145         660         14.35
f          a      377.5       235         775         17.35
f          u      377.5       250         620         12.00
v          i      342.5       275         425         5.39
v          æ      277.5       185         380         6.00
v          a      330.0       195         530         9.70
v          u      360.0       250         490         9.06
θ          i      346.3       260         565         9.80
θ          æ      295.0       120         465         11.05
θ          a      393.8       205         630         12.37
θ          u      372.5       280         470         6.63
ð          i      345.0       245         495         7.62
ð          æ      382.5       205         595         12.96
ð          a      353.8       185         650         13.15
ð          u      341.3       200         440         8.19

* Values in milliseconds.

⁷ Within the given positions, it is interesting to note a slight, but consistent, tendency for variability to be greater in medial position with Diffuse vowels and in final position with Compact vowels.

Figure 25 presents the mean values from Table 25 in graphical form. Duration in milliseconds is displayed along the abscissa as a function of each consonant and
