Strength and Weakness at the Interface: Positional Neutralization in Phonetics and Phonology 3110185210, 9783110185218

This thorough study of the expression of contrast in the world's vowel systems examines phonetic and phonological d

247 86 5MB

English Pages 304 [302] Year 2006

Report DMCA / Copyright

DOWNLOAD FILE

Polecaj historie

Strength and Weakness at the Interface: Positional Neutralization in Phonetics and Phonology
 3110185210, 9783110185218

Table of contents :
Frontmatter
Contents
Chapter 1 Introduction
Chapter 2 Stressed syllables and unstressed vowel reduction
Chapter 3 Final syllables
Chapter 4 Initial syllables
Chapter 5 Conclusions
Backmatter

Citation preview

Strength and Weakness at the Interface



Phonology and Phonetics 10

Editor

Aditi Lahiri

Mouton de Gruyter Berlin · New York

Strength and Weakness at the Interface Positional Neutralization in Phonetics and Phonology by

Jonathan Barnes

Mouton de Gruyter Berlin · New York

Mouton de Gruyter (formerly Mouton, The Hague) is a Division of Walter de Gruyter GmbH & Co. KG, Berlin.

앝 Printed on acid-free paper which falls within the guidelines 앪 of the ANSI to ensure permanence and durability.

Library of Congress Cataloging-in-Publication Data Barnes, Jonathan, 1970⫺ Strength and weakness at the interface : positional neutralization in phonetics and phonology / by Jonathan Barnes. p. cm. ⫺ (Phonology and phonetics ; 10) Includes bibliographical references and index. ISBN-13: 978-3-11-018521-8 (cloth : alk. paper) ISBN-10: 3-11-018521-0 (cloth : alk. paper) 1. Grammar, Comparative and general ⫺ Phonology. 2. Neutralization (Linguistics) 3. Phonetics. I. Title. II. Series. P217.3.B376 2006 414⫺dc22 2005035417

Bibliographic information published by Die Deutsche Bibliothek Die Deutsche Bibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data is available in the Internet at ⬍http://dnb.ddb.de⬎.

ISBN-13: 978-3-11-018521-8 ISBN-10: 3-11-018521-0 ISSN 1861-4191 쑔 Copyright 2006 by Walter de Gruyter GmbH & Co. KG, D-10785 Berlin. All rights reserved, including those of translation into foreign languages. No part of this book may be reproduced in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher. Cover design: Christopher Schneider, Berlin. Printed in Germany.

Acknowledgments

This book grew out of my 2002 UC Berkeley doctoral dissertation on the subject of positional neutralization and phonological typology. This being the case, all those thanked in that prior work may hereby consider themselves thanked again (even, perhaps especially, the cats). To that earlier foundation have been added new data, new discussion, even some new experimental results. Gone, I hope, are most of the errors, oversights, and imperspicuities. Since then a great many people have helped me move this work forward. First among them is Sharon Inkelas, my one-time advisor and current colleague, whose advice, help, and encouragement was indispensible in making all this happen. The debt of gratitude I owe to her is vast. Alan Yu was a great help to me with his Mandarin language skills back when I was trying to figure out whether Seediq vowel reduction really works that way. It does, but I callously forgot to thank him in my dissertation. Sorry Alan, and thanks. Here on the east coast, Donca Steriade and Edward Flemming have been a wonderful source of inspiration at all times, and good neighbors too. The beauty of their work in phoneticsphonology is a large part of what keeps me interested in these issues. Others who have provided valuable discussion, advice, and encouragement in one way or another over the last few years include Juliette Blevins, Alejna Brugos, Andrew Garrett, Larry Hyman, John Kingston, Dasha Kavitskaya, Shigeto Kawahara, Michael Kenstowicz, Aditi Lahiri, Teresa McFarland, David Mortensen, Orhan Orgun, Jaye Padgett, Janet Pierrehumbert, Anne Pycha, Stefanie Shattuck-Hufnagel, Jen Smith, Colin Wilson, and Cheryl Zoll. Audiences at the UMass Amherst and MIT phonology circles, the MIT Speech Group, and the LSA annual meeting in Boston all heard bits and pieces of this work and provided helpful feedback. The various instructors and students at the 2005 LSA Summer Institute in Cambridge provided an endlessly stimulating setting for the final days of this book’s preparation. I am grateful too to all my students and my colleagues at Boston University for the questions, challenges, support, and advice they bring to me in the day-to-day, and generally just for making this a great place to be a linguist. Carol Neidle and Paul Hagstrom are owed particularly profuse thanks for their roles in the preparation of this volume. The editorial staff at Mouton made everything much easier than it might otherwise have been. Anke Beck and Wolfgang Konwitschny were especially helpful on the production end, while Aditi Lahiri provided critical structural guidance. Here in Massachusetts, Julia Munemo was a wonderful copy editor with a keen eye for my punctuation-

vi Acknowledgments

related foibles. Finally, my family was as patient and supportive through the run-up to this book as they were during my years as a graduate student. Thanks Mom, Dad, Alison, and Grandma Carla. This work is dedicated to my beloved wife Jen.

Contents

Acknowledgments .......................................................................................

v

Chapter 1. Introduction............................................................................

1

1.1.

Phonetics and phonology in recent approaches to positional neutralization.................................................................................. 3 1.1.1. Pure Prominence ............................................................................ 3 1.1.1.1. Phonetic arbitrariness in Pure Prominence models....................... 5 1.1.2. Phonetically-driven Phonology ..................................................... 6 1.1.3. Integrated Phonetics and Phonology ............................................. 7 1.1.4. Neo-Grounded Phonology ............................................................. 8 1.2. Goals............................................................................................... 8 1.3. Phonetics and phonology in the Phonologization Approach........ 9 1.4. Categories, neutralization and the phonologization process ........ 11 1.5. Organization................................................................................... 16 Chapter 2. Stressed syllables and unstressed vowel reduction............ 19 2.1. 2.1.1. 2.1.2. 2.1.3. 2.2. 2.3. 2.3.1. 2.3.2. 2.3.3. 2.4. 2.5. 2.5.1. 2.5.2. 2.5.2.1. 2.5.2.2. 2.5.2.3. 2.5.2.4. 2.5.2.5.

Unstressed vowel reduction: patterns of neutralization................ Vowel height contrasts................................................................... Nasalization.................................................................................... Quantity.......................................................................................... The phonetics of vowel reduction ................................................. Case studies.................................................................................... The phonologization of phonetic vowel reduction: Bulgarian..... Categorical vowel reduction: Belarusian ...................................... A mixed system: Brazilian Portuguese ......................................... UG-based approaches to UVR typology....................................... Vowel reduction in Russian........................................................... Facts................................................................................................ Experimental investigation of Russian UVR ................................ Subjects .......................................................................................... Materials......................................................................................... Methods.......................................................................................... Results ............................................................................................ Discussion ......................................................................................

20 20 27 28 29 32 32 36 37 39 47 48 52 52 52 53 55 63

viii Table of Contents

2.6. 2.6.1. 2.7.

Phonetic determinism in UVR systems......................................... 67 Vowel reduction in Seediq............................................................. 68 Summary ........................................................................................ 71

Chapter 3. Final syllables ......................................................................... 73 3.1. 3.1.1. 3.1.2. 3.2. 3.2.1. 3.2.2. 3.2.2.1. 3.2.2.2. 3.2.3. 3.2.3.1. 3.2.3.2. 3.2.3.3. 3.3. 3.4. 3.5.

Final syllable prominence.............................................................. Phonetic prominence...................................................................... Psycholinguistic prominence......................................................... Phonological strength effects in final position.............................. Tone in final position..................................................................... Vocalic strength effects in final position ...................................... Hausa .............................................................................................. Pasiego Spanish.............................................................................. Final syllable resistance to assimilation and reduction................. The phonetic roots of final resistance ........................................... Syllable structure effects on final resistance................................. Final strength and psycholinguistic prominence........................... Why final strength is different from stressed syllable strength.... Categorical and gradient effects in final strength systems ........... On assumptions concerning licensing capacity and phonetic prominence ..................................................................................... 3.6. Final Weakening ............................................................................ 3.6.1. Devoicing ....................................................................................... 3.6.2. Glottalization.................................................................................. 3.6.3. Mixed systems................................................................................ 3.6.4. Nasalization.................................................................................... 3.6.5. Final vowel reduction and loss of contrasts .................................. 3.6.5.1. Lowering ........................................................................................ 3.6.5.2. Final reduction ............................................................................... 3.6.5.3. The phonetics of final syllable reduction ...................................... 3.7. Vowel quantity in final syllables................................................... 3.7.1. Neutralization of quantity contrasts in final position.................... 3.7.2. Final short vowel avoidance .......................................................... 3.8. Summary ........................................................................................

74 74 77 78 78 79 79 82 86 93 97 98 100 103 108 114 115 125 132 134 141 141 143 146 151 151 157 158

Chapter 4. Initial syllables........................................................................ 161 4.1. 4.1.1.

Recent work on initial syllable phonology.................................... 162 Functional motivations for PN in stressed and initial syllables ... 163

Table of Contents

4.1.2. 4.2. 4.3. 4.4. 4.4.1. 4.4.2. 4.5. 4.5.1. 4.5.1.1. 4.5.1.2. 4.5.1.3. 4.5.1.4. 4.5.2. 4.5.2.1. 4.5.2.2. 4.5.2.3. 4.5.2.4. 4.5.3. 4.5.3.1. 4.5.4. 4.6.

Agenda for this chapter.................................................................. Licensing asymmetries in initial position ..................................... The phonetics of initial position .................................................... Initial-syllable strength effects ...................................................... Initial syllables in languages with fixed initial stress ................... Other sources of duration in initial syllables ................................ Vowel harmony and initial syllables............................................. Initial syllables in English ............................................................. Subjects .......................................................................................... Materials......................................................................................... Methods.......................................................................................... Results ............................................................................................ Initial syllables in Turkish ............................................................. Subjects .......................................................................................... Materials......................................................................................... Methods.......................................................................................... Results ............................................................................................ Initial strengthening and Turkic Palatal Harmony........................ Does vowel harmony come from vowel reduction? ..................... Bantu............................................................................................... Summary ........................................................................................

ix

165 165 168 173 174 177 179 179 180 180 181 182 184 186 186 187 188 193 193 200 203

Chapter 5. Conclusions ............................................................................. 206 5.1. 5.2. 5.3. 5.4.

Phonologization and Universal Grammar in typology ................. On the integration of phonetics and phonology............................ Does “anything go” in phonology? ............................................... Further challenges: incomplete neutralization ..............................

206 209 220 223

Notes ............................................................................................................ 239 References.................................................................................................... 256 Index............................................................................................................. 285

Chapter 1 Introduction

This study investigates the typology and implementation of the positional neutralization of contrasts in vowel inventories. Its purpose is to account in a meaningful way for regularities in that typology, and to elucidate, on the basis of these phenomena, the nature of the relationship between phonetics and phonology more generally. Most current approaches to positional neutralization (PN) attribute the emergence of crosslinguistic regularities in PN patterns to specific restrictions imposed on possible phonological systems by Universal Grammar. I argue here that the role of the phonological grammar in generating the typology of PN systems, particularly as concerns matters of phonetic naturalness or grounding, is far less central than has been previously assumed. Rather, a diachronically oriented phonologization-based approach to PN yields far more accurate typological predictions as to which systems should and should not be attested, while at the same time providing realistic phonetic explanations for the existence of those patterns. The phonologization approach thus liberates the categorical phonology from the need to reproduce in synchrony the phonetic naturalness sound patterns will acquire in diachrony one way or another. The term “positional neutralization” comes to us from Steriade’s influential 1994 paper “Positional neutralization and the expression of contrast”, and refers to the class of neutralizations termed “structurally conditioned” by Trubetzkoy (1969: 235–241), meaning that the contrast in question is neutralized whenever one of the terms involved comes to occupy a particular structural position in the word or phrase. This Trubetzkoy opposed to “contextually conditioned” neutralization, in which the collapse of contrasts is triggered, for example, by segmental environment through processes such as assimilation, dissimilation and the like. I will use the term positional neutralization throughout this work in its maximally generic sense, referring to any instance of an asymmetrical capacity of two positions (or sets of positions) in the representation to license phonological contrasts, such that one set of structural positions licenses a larger array of contrasts than another. Specifically, one set of positions, termed “weak”, allows realization of only a subset of the range of contrasts available in another set of positions, termed “strong”. In this sense positional neutralization refers not only to instances in which

2

Introduction

contrasts are lost through mergers in weak positions, but also to cases in which licensing asymmetries arise through, for example, the emergence of new contrasts in strong positions. In my treatment of PN systems, I defend an approach to the phoneticsphonology interface recognizing a crucial distinction between a gradient phonetics and an abstract categorical phonology in a manner similar to that advocated in Keating (1996, inter alia). Functioning as they do with different sets of primitives, the only link between these two systems comes in diachrony, as naturally arising language-specific phonetic patterns are divorced from their phonetic origins and made phonological.1 Apparent restrictions on possible systems or principles of “phonetic grounding” operative in the phonology itself prove to be chimerical and unnecessary. Phonological patterns are grounded in phonetics by virtue of the fact that they originate in the phonetics in the vast majority of cases. Once phonologized, however, the relationship between two or more neutralizing entities is functionally arbitrary and no longer responsive to the phonetic forces which gave rise to the pattern of neutralization in the first place. Patterns appearing “unnatural” or phonetically unexpected in the phonology are typically the result of additional layers of change in the system, driven by anything from morphological factors such as paradigm uniformity, to subsequent sound change. Such additional changes may obscure the phonetic origins of a process, or the appropriateness of a process to its environment. This view of the role of phonologization in determining the phonetic shape of phonological patterns has clear roots in the listener-oriented approach to sound change developed by John Ohala (1981, 1993a), as well as in the work on phonologization of Larry Hyman (e.g., 1977). Other contemporary descendents of those approaches can be found in Blevins and Garrett (1998), Blevins and Garrett (2004), Hale and Reiss (2000), Kavitskaya (2001) and the work of Blevins (2004) under the rubric of Evolutionary Phonology. All these works share the basic observation of their progenitors that a primary route from phonetics to phonology runs through sound change, and in particular through the mechanism of phonologization. They are importantly different, however, in other respects, as will become apparent. The alienation described here of a pattern of neutralization from its phonetic roots creates a problem for the phonologist seeking to understand the nature of the processes involved, but in no way makes the pattern of neutralization any less viable from the point of view of synchronic implementation in the phonology. This is to be expected under the approach I advance in this study, according to which the phoneticsphonology interface literally is phonologization, with the former influencing the latter exclusively through this medium.

Recent approaches to positional neutralization

3

1.1. Phonetics and phonology in recent approaches to positional neutralization Recent debate concerning the appropriate treatment of positional neutralization in phonological theory (e.g., Alderete 1995; Beckman 1998; Bosch 1991; Casali 1997; Crosswhite 2001; Dyck 1995; Flemming passim; Harris 1998; Hyman 1998; Kirchner 1998; Lavoie 2001; Lombardi 1991; Majors 1998; Smith 2000, 2002; Steriade 1994, 1997, 2001; Walker 2001a, 2001b; Zhang 2001; Zoll 1998, inter alia) has been centered around questions concerning the place of phonetic information in the grammar that generates attested patterns of neutralization. A felicitous analysis of positional neutralization phenomena has become a primary desideratum for any potential model of the phonetics-phonology interface. A well-known example of positional neutralization clarifying the centrality of this issue to the development of our understanding of the relationship between phonetics and phonology is the following from Steriade (1997): In a great many languages, laryngeal features for consonants are subject to general patterns of positional neutralization such that some combination of these features (i.e. voicing, aspiration, glottalization) can be freely contrasted in positions traditionally identified with syllable onsets, but see their ability to contrast neutralized in syllable codas (Steriade [1997] for details and analysis). Phonetic factors are implicated more or less uncontroversially in the explanation of the diachronic development of such systems. In coda position, for example, the lack of a stop burst or CV transitions following the consonant may make the features in question more difficult to perceive, and hence more prone to effacement over time. C-place features are known to suffer similarly under the same conditions. Less obvious, however, is the extent to which this phonetic information is necessary or desirable in a synchronic model of positional neutralization. 1.1.1. Pure Prominence Some approaches to positional neutralization are largely unconcerned with the phonetic motivations for the alternations they model (e.g., Beckman 1998; Zoll 1998). While these Pure Prominence models differ from one another in substantive and principled ways, they share the basic assumption that positional licensing restrictions are best expressed in the grammar through constraints which have reference to a fixed set of phonological features and positions. Strong and weak positions are essentially listed as such, and are freely combinable with phonological features to produce constraints generating the necessary alternations or regularities. The Pure

4

Introduction

Prominence approach sees strong or weak licensing capacity as an inherent abstract property of a given position supplied by Universal Grammar, irrespective of language-specific phonetic details. In this sense it finds an ancestor in approaches to certain tonal and segmental phenomena in Bantu languages advocated by, e.g., Goldsmith (1982) or Hyman (1987), in which the designation of a particular syllable or mora with a grid mark or accent meant not the presence of metrical structure or anything interpretable as phonetic stress, but rather an abstract higher degree of relative phonological prominence, a prominence which might be expressed in increased licensing capacity for consonants and vowels or special behavior of some kind in the system of tone assignment, or some combination of all these. This approach, of course, need not necessarily bear the Universalist baggage encumbering later OT-approaches to positional prominence. The fact that the first mora of the stem-initial syllable in Kukuya bears an accent (reflecting the fact that the verb-stem-initial syllable in Kukuya has, among other things, greater licensing capacity for both onset consonants and vowels than other syllables – Hyman 1987; Paulian 1974) may be a fact about Kukuya alone, leaving open the possibility that some other language might single out this very position as the locus of particular relative weakness of licensing capacity. Still though, the fact that the same set of positions (initial syllable, stressed syllable, root) appear again and again as strong licensers throughout the languages of the world has led Pure Prominence advocates to assume that the “strength” or “weakness” of particular structural positions are properties of those positions supplied in Universal Grammar. A constraint system constructed on the principles of the Pure Prominence approach might look as follows in (1) below. This system has the effect of excluding mid vowels from unstressed syllables, a pattern typical of many languages with contrast-neutralizing vowel reduction. (1)

Neutralization of a height contrast in unstressed syllables a. b.

Ident[hi]/σ@ *MidV/unstressed σ

>> >>

*MidV Ident[hi]

>> >>

Ident[hi] *Mid

The constraint ranking in (a) accomplishes the vowel height neutralization using Positional Faithfulness constraints as proposed by Beckman (1998). Here, a general constraint banning mid vowels (*MidV) is ranked higher than a general faithfulness constraint mandating faithful output realization of the input feature [hi]. Were this the extent of the constraint system, the grammar would generate a language with no surface mid vowels at all. The reverse ranking, by contrast, would brook no

Recent approaches to positional neutralization

5

alteration of underlying [hi] specifications despite a general dispreference for mid vowels, this being evidenced only, for example, in cases of epenthesis, where the lack of underlying feature specifications makes the Ident constraint irrelevant, and allows markedness free rein in determining the identity of the new vowel. In the case of positional neutralization, however, the presence of the higher-ranked Positional Faithfulness constraint (Ident[hi]/σ) mandates faithful realization of the feature [hi] specifically in stressed syllables, with the effect that input mid vowels surface intact there alone, while in all other cases they are barred from the output representation. This is illustrated below in the tableau in (2).2 (2)

No mid vowels in unstressed syllables (Positional Faithfulness) /CeCé/ 

Ident[hi]/σ@ CiCé CeCé CiCí

*Mid

Ident[hi]

* **!

*

*!

**

The second ranking in (1) above, on the other hand, is a system of Positional Markedness constraints of the type advocated in Zoll (1998), which accomplishes the same banishment of mid vowels from unstressed syllables. In this system, a general markedness constraint against mid vowels is outranked by a general faithfulness constraint for the feature [hi]. The higher-ranking Positional Markedness constraint *MidV/unstressed σ, however, has the effect of leaving intact only input mid vowels located in stressed syllables. 1.1.1.1. Phonetic arbitrariness in Pure Prominence models Otherwise significant differences between these two approaches will not be of concern here.3 What is noteworthy in this context rather is the arbitrary relationship between the positions “stressed syllable” or “unstressed syllable” and the features [hi] and [lo] defining mid vowels. As it happens, precisely this combination of positions and features is necessary with great frequency crosslinguistically, and thus raises few eyebrows in its formalization as above. But in this model as far as the phonology is concerned, there is no reason why these statements should be preferred over the combination of the same positions with any other sets of features, e.g., *[+anterior]/unstressed σ. Models not assuming universality of positional strength or weakness, such as the accent approach to stem-initial

6

Introduction

prominence in Kukuya, are in principle free to designate any position strong or weak in this manner, regardless of crosslinguistic concerns of phonetic plausibility; any one combination of feature with position is just as well-formed from the point of view of the grammar as any other combination of feature with position, whether or not there is any reason to suppose that that feature is in any way related to that position phonologically in any language. For the case of laryngeal neutralization cited above, the phonology would simply combine the relevant feature (e.g., [+constricted glottis]) with the relevant position (here, coda) in either a faithfulness or a markedness constraint to generate the desired neutralization pattern. It is not clear in these approaches what we should make of the fact that the grammar would be just as happy formally with the same constraint, only with “unstressed syllable” substituted for “coda”, both being weak positions. For better or worse, then, there is a fairly obvious way in which this approach is missing certain generalizations. Specifically, it is manifestly not the case that all features are equally relevant or active phonologically in all positions. For example, languages do not typically ban all laryngeal contrasts from, e.g., all and only unstressed syllables, and a theory which allows for the ready expression of such unattested and phonetically unexpected systems is held to be deficient in its ability to account for the typology of positional neutralization patterns. Put simply, such theories overgenerate wildly. 1.1.2. Phonetically-driven Phonology Observations like these form the basis of the influential Licensing-by-Cue theory advanced by Steriade (1997) and implemented in numerous works of other authors adopting versions of this approach, such as Bradley (2001), Côté (2000), Crosswhite (2001), and Zhang (2001). Steriade observes that in positional neutralization patterns, the same features appear correlated again and again crosslinguistically with the same positions, and argues furthermore that this correlation is not accidental, but rather follows from the specific phonetic characteristics of each position. More precisely, Steriade’s claim is that features are licensed preferentially in positions in which phonetic conditions make them maximally robust perceptually, and are likewise eschewed in positions where they would be less perceptually robust, and hence easily overlooked. It is not then the position itself which licenses or bans features, but rather the concrete phonetic cues which are important for those features’ perception. For example, potentially hard-toperceive laryngeal contrasts, such as those involving voice, aspiration or

Recent approaches to positional neutralization

7

glottality could be licensed on stops only in the presence of release bursts and following CV transitions, wherever those may happen to occur in a given language. Likewise certain vowel height contrasts, such as mid vs. high or low, could be permitted exclusively on vowels with sufficient phonetic duration for their accurate perception. That voiced, aspirated, or glottalized consonants happen to contrast in one or another structural position within the syllable, or that vowels with inadequate durations are often phonologically unstressed is on Steriade’s view irrelevant. This is the point of greatest contrast with the Pure Prominence approach to PN. If it turns out that not only stressed vowels but, for example, phrase-final vowels as well have enough duration to license mid vowels in some languages, then Steriade’s approach is doubly vindicated, in that it avoids the disjunctive specification of environment which would otherwise be necessary in the phonetics-free models described above.4 1.1.3. Integrated Phonetics and Phonology The integrated models of phonetics and phonology of Flemming (passim) and Kirchner (e.g., 1998) share this goal of deriving phonological patterns of neutralization directly from phonetic factors, though they are not both committed specifically to perceptual cues as the basis of all licensing statements. In these models the boundary between phonetics and phonology is erased entirely, allowing any phonological process to refer directly to an unlimited array of phonetic factors, whether acoustic or articulatory in nature. Flemming (2001b), for example, derives patterns of vowel reduction from the decreased duration of unstressed syllables resulting in an unacceptable amount of effort necessary to reach articulatory targets for the production of non-high vowels. This class of approaches shares with the Licensing-by-Cue model the ability to achieve improved accuracy of typological predictions in comparison with the Pure Prominence models detailed above, since it too brings the phonetic forces which shape phonological patterns diachronically directly into their implementations in synchrony. By linking the licensing patterns directly to the phonetic effects which cause them, these approaches avoid the vast overgeneration characteristic of the phonetics-free, abstract phonological model.

8

Introduction

1.1.4. Neo-Grounded Phonology Attempts to chart a third path in accounting for positional neutralization patterns include the approaches of Hayes (1999) and Smith (2002), both of which owe a great deal to the Grounded Phonology approach of Archangeli and Pulleyblank (1994). According to Smith, for example, phonological positions and features are combined as before to form PN-inducing constraints; only now, before incorporation into the grammar, those constraints are subject to screening by a set of phonetically sensitive substantive filters which endorse constraints reflecting articulatory or perceptual concerns in some way and reject combinations of features and positions not grounded in this manner. These filters are said to constitute a sort of “meta-grammar” of constraint construction. This approach holds the specter of phonetic detail in phonology at bay while incorporating some of the restrictiveness of the Licensing-by-Cue model into the theory. 1.2. Goals The present study probes the controversies concerning both the implementation of positional neutralization and the role of phonetics in phonology more generally. It is devoted specifically to positional neutralization patterns affecting contrasts in vowel inventories, though in some areas, where relevant, consonants will be treated as well. I have selected for investigation three commonly cited pairs of strong and weak positions: stressed vs. unstressed syllables, initial vs. non-initial syllables, and final vs. non-final syllables.5 For each pair of positions, a typological survey was conducted of the licensing asymmetries found in several hundred languages, the results of which I report in each chapter, using some of the clearer cases to illustrate general patterns. The question to be answered again is why the neutralization of certain contrasts should be common in a given position while other neutralization patterns are rare or unattested. Put more generally, how can we best account for the existence and nature of typological regularities in the patterning of sounds? In each case I show how phonetic characteristics commonly associated with the position in question can explain why one pattern of neutralizations should be attested regularly while others seem not to occur. In the case of stressed syllables, where the phonetic factors involved in the development of licensing asymmetries are relatively unambiguous (dramatic durational asymmetries between stressed and unstressed syllables in the languages in question), phonological neutralization patterns are likewise fairly homogenous. In the case of final

Phonetics and phonology

9

syllables, however, where the phonetic situation is more complicated, the phonological status of the final syllable with respect to the licensing of contrast is just as complex, making final syllables strong for some contrasts in some languages, and weak for the same or other contrasts in other languages. The ambiguous status of the final syllable presents a strong argument against the notion of strength or weakness of licensing potential as inherently determined primitives in Universal Grammar. Whether phonological strong or weak licensing effects are ultimately phonologized in a given position for a given feature will depend on the phonetic characteristics of that position in the relevant language at a given moment in time. The recurrence of certain patterns crosslinguistically in phonology is seen here to be entirely a function of the recurrence in the phonetics of the input patterns to the phonologization process. It is at this phonetic level that the task of explaining the origins of potentially universal patterns becomes meaningful.6 The recurrence of certain phonetic patterns together with the reinterpretative mechanism of phonologization ensures that certain patterns will be attested in phonologies in certain positions to the exclusion of other logically possible patterns, whether a preference for these patterns is specifically expressed in the phonological grammar or not. This activity of the phonetics in delimiting through constraints on possible sound changes the range of attested phonological patterns, a theme recurring throughout the work of John Ohala, has been referred to recently by Mark Hale (2003) as a “diachronic filter” on the substance of phonology. That phonetics plays a role in determining the typology of positional neutralization patterns is clear enough. I argue, however, that the link between phonetic reality and phonological patterns is severed at the moment of phonologization in a manner which is not compatible with the direct connection between these two assumed in the Licensing-by-Cue and other “Direct Phonetics” approaches described above. The sporadic surfacing of phonetically unmotivated neutralization patterns through the agency of layered sound change or the operation of analogy underscores the indifference of the phonological grammar to the phonetic characteristics of the entities it manipulates in patterns of positional neutralization. 1.3. Phonetics and phonology in the Phonologization Approach The phonologization approach to phonological typology I advance in this study rests on the following assumptions concerning the relationship between phonetics and phonology: Phonetics and phonology are segregated by a formal split of essentially the type proposed by Keating (1988b, 1996,

10

Introduction

inter alia), and assumed with some variations by Cohn (1990), Zsiga (1993) and many since. While these models differ in some respects, crucial and common to all is the view that the phonetic component operates on gradient, physical representations containing quantitative specifications for each dimension, while the phonology is categorical, operating on abstract symbolic representations, interpreted by the phonetics, but not necessarily having phonetic content themselves. This interpretation process provides contextually appropriate phonetic targets for each feature specification in the phonology, where a one-to-one relationship between elements of these representations is of course not assumed (a given phonological feature may receive target specifications along a series of phonetic dimensions). In many ways this view of the phonetics-phonology interface resembles more traditional approaches to phonetics and phonology, which distinguish phonological processes from processes which are “just phonetic implementation”. It differs from traditional approaches, however, in removing phonetic content from phonology altogether, freeing the phonology from the responsibility of reflecting naturalness in any fashion, allowing it to be truly abstract and symbolic. It also removes from the purview of phonology the implementation of a great number of gradient, phonetic processes previously thought to apply within that component. The phonetics thus has increased responsibilities, generating as it now must not just universal aspects of implementation, but a broad array of language-specific patterns which cannot be considered categorical (phonological). I adopt, for the purpose of modeling the activity of the phonetic component in creating the patterns of realization for vowels which precede phonologization Keating’s (1996) view of phonetic target windows, the central component of her window model of coarticulation. Here Keating uses target windows of varying widths to model phonetic underspecification as a matter of degree, a refinement of her earlier all-ornothing approach to the subject (Keating 1988a). Rather than assuming either that there is a target specified for a given phonetic feature or that there is not (in which case values must be interpolated as a path between specifications for surrounding segments), specifications of varying degrees of strictness can be modeled by assuming narrower or broader target windows as the case warrants. Keating (1996: 274) describes these target windows as “ranges of permitted values”, where a narrow window represents a precise target with little room for contextual variation, while a wide window indicates a more loosely specified target, such that more contextual variation is tolerated. The widest possible window corresponds to phonetic underspecification, where targets must simply be interpolated on the basis of neighboring target specifications. Thus, in cases where the F1 value for the low vowel /a/ is seen not to vary significantly with the

Categories and neutralization

11

duration of the vowel in question (despite the increased effort required to reach the more open target articulations of non-high vowels within periods of shorter duration), we can assume a narrow target window for F1 values. Where F1 lowers significantly as /a/ raises toward [] in contexts of reduced vowel duration, such as unstressed syllables in languages with unstressed vowel reduction, a wide target window for the vowel height dimension models the increased variability of the specification, such that contextual factors like duration can have a stronger influence over the outcome of the articulation. I will assume the existence of both acoustic and articulatory target specifications, though in some approaches, such as Articulatory Phonology (e.g., Browman and Goldstein passim; Zsiga 1993; Byrd 1996), targets are exclusively gestural in nature. I will also assume that target windows may be resized according to structural position or prosodic environment. The width of the target window itself is meant to encode variability by context, such that this may seem unnecessary. On the other hand, it may be necessary to assume in certain cases of positional allophony target windows of differing widths in different circumstances, such that while a vowel under stress, for example, shows little variation regardless of contextual changes in speech rate or segmental environment, an unstressed vowel may be allowed to vary more under identical circumstances. Such a possibility is mentioned by Keating (1996: 275–276) and assumed by Guenther (1995). The case of vowel reduction in Russian described in Chapter 2 represents a case in which contextual resizing of target windows may be crucially important. 1.4. Categories, neutralization and the phonologization process As outlined above, while the influence of phonology on phonetics takes place in the synchronic interpretation of categorical phonological representations implemented as sets of phonetic targets, in the approach assumed here the influence of phonetics on phonology is diachronic in nature. The vehicle of its influence is the process of phonologization, which alters phonological representations as a consequence of the reinterpretation of existing phonetic regularities. Before outlining the phonologization process, however, it is necessary first to provide further details concerning the relationship between phonological categories and phonetic targets in the phonologization model assumed here. The first possibility for the relationship between the realizations of a single phonological element in weak and strong positions involves a single wide target specification for the phonetic feature in question. Thus, in a

12

Introduction

system where a single phonological entity /e/ is realized as [e] under stress, but in unstressed syllables raises toward [i] as a function of the duration accorded it, we can imagine a single wide target window for all realizations of /e/.7 In stressed syllables, assuming duration al targets to be consistently high in the relevant languages, the pressure from shortened durations necessary to condition raising would never arise, allowing values for the specification of the height of /e/ to be realized toward the center of the target window for that dimension (whether conceived as a set of articulatory parameters corresponding to openness of the upper vocal tract or as acoustic targets for F1). In environments such as unstressed syllables, however, where duration is characteristically low, pressure to undershoot the target would increase, resulting in values closer to the lower end of the F1 target window. This situation would imply, however, that any instance of reduced duration would result in raising, while any instance of longer duration would see none. We would predict then that in fast speech stressed /e/ too would raise toward [i], while in slower speech unstressed /e/ should fail to raise. This gradience of application of reduction is certainly characteristic of the vowels of unstressed syllables in what I will argue are unphonologized, gradient systems of unstressed vowel reduction, such as the reduction of [a] to schwa in Russian non-first-pretonic unstressed syllables. It is less clear, however, that the relationship between duration and F1 for stressed vowels is so straightforward. If it is the case in the UVR language in question that stressed /e/ never raises significantly toward /i/, even in cases where increased speech rate lowers its duration substantially, the above picture must be revised somewhat. In particular it would be necessary to assume a narrow target window for parameters connected with the realization of vowel height in stressed syllables, thereby allowing less contextual variation. In unstressed syllables, however, the target window for the same phonological /e/ would be expanded, such that durational changes (among other things) would cause significant variation in the realization of the vowel along the height dimension. Shorter durations would mandate strong raising toward [i], while longer durations would necessitate less, possibly no reduction.8 Note that at no point in this discussion so far has the concept of contrast neutralization played any role. It is nonetheless commonplace in the literature to see discussion of vowels nearly neutralizing, or sometimes neutralizing, or gradiently neutralizing. For the purposes of this study such characterizations are held to be meaningless (i.e. uninterpretable). Neutralization here refers solely to a phonological process whereby two abstract categories are merged into one under certain conditions, resulting in a loss of contrast between the two. In a situation such as that described above, in which we find duration-dependent reduction of /e/ toward [i] in

Categories and neutralization

13

unstressed syllables, it is quite possible, and indeed often the case in systems of positional neutralization (such as the Bulgarian reduction of /o/ toward [u] described in Chapter 2), that due to constant contextual pressures realizations of two independent phonological entities such as /i/ and /e/ will overlap significantly along some, perhaps all, dimensions. This would mean in effect that some percentage, perhaps the majority of realizations of both phonological entities in the relevant context would be phonetically ambiguous, the relevant tokens having an equal probability of belonging to the distribution either of the one or of the other phonological category. Two categories may overlap in this way in a given set of positions (as Bulgarian /o/ and /u/ in unstressed syllables), however, without implying that the contrast between the two has been neutralized. Indeed, we would speak of neutralization of the contrast only in such case as the distributions of the two entities could be shown to be identical, so that no significant differences emerge in statistical comparisons of the two. Assuming that identical distributions imply identity of the relevant phonetic target windows, and assuming that a single array of phonetic targets for all dimensions in a given context must correspond to a single array of feature specifications for the segment in question, identity of realizational distributions for two underlying segments implies the phonological neutralization of the contrast between those two segments in the position in question.9 If it can be determined, thus, that /e/ and /i/ in the unstressed syllables of a given language have identical phonetic distributions, the contrast between the two must be assumed to be neutralized, implying the operation within the categorical phonology of some process changing underlying feature specifications of /i/, /e/, or both, as the pattern would have it. The realization of /e/ and /i/ identically within a distribution characteristic of [i] implies change of /e/ to /i/ in the phonological grammar.10 We could equally imagine, however, a situation in which a single phonological entity, /a/ for example, had a window of realizations in stressed syllables varying narrowly around [a], and a window of realizations in unstressed syllables varying equally narrowly around [], such that in neither structural position would we find significant variation in the realization of the vowel /a/, but in the two positions the narrow target windows for the relevant phonetic dimensions would be completely distinct. In such a system stressed /a/ would be [a], while unstressed /a/ would be [] regardless of perturbations in the durations of the vowel in question. It is common in the literature to find allophonic processes such as this one, in which the two allophones vary little and have little or no overlap, referred to as “categorical” processes, as opposed to gradient processes in which a continuum of realizations is attested between the two

14

Introduction

extremes, most likely as function of contextual factors such as duration. This study will avoid this usage completely. Allophonic processes in which allophones have distinct distributions with little overlap will be assumed to have one narrowly specified phonetic target in one structural position and another narrowly specified phonetic target in the other. Gradient processes describing continua of realizations between two points will be assumed to be the result of loose specification of a single target for both positions, or potentially even partially overlapping but separate target windows in the positions in question. The term “categorical” will be reserved solely for processes affecting the phonological category membership of the entities in question. At this point, however, the distinction between the phonetic and the phonological becomes somewhat tenuous. What exactly is the difference between a phonetic process applying to a single phonological entity, the output of which is two distinct, narrowly specified phonetic distributions, and a phonological process changing the categorical feature specification of a given phonological entity in a given context such that the two resulting phonological representations are realized as two distinct, narrowly specified phonetic distributions? In other words, how can we distinguish /a/  /a/ – [a] in stressed syllables and /a/  /a/ – [] in unstressed from /a/  /a/ – [a] in stressed syllables and /a/  // – [] in unstressed? If /a/ and // contrast underlying and have identical distributions in the syllables in question, the process must be assumed to be phonologically neutralizing. If there is no underlying //, however, the situation becomes ambiguous, and is not, I argue, resolvable through an appeal to phonetic distributions. In such situations we can be truly certain a given process involves changes in phonological feature specifications only if output feature specifications can be shown to be phonologically relevant to some other process. Where there is no evidence that the features in question are phonologically relevant (for the linguist and the learner alike), there is no compelling reason to assume that any change in the category membership of the phonological entity in question has taken place. If /a/, for example, is realized as something like [] in unstressed syllables and the process is non-neutralizing, we can assume either that the phonetic targets for /a/ in unstressed syllables are such that it is realized phonetically as [], or it might be that a phonological change to the input /a/ takes place such that it goes from [+low], [–high] to [–low], [+high], and is realized as []. Evidence that the latter is the correct analysis might be susceptibility of the vowel in question to some phonological process targeting segments specified [+high], or equally failure to undergo processes targeting all segments specified [+low]. In the absence of such phonological evidence, however, there is no reason for the linguist or the native speaker/learner to

Categories and neutralization

15

assume that any categorical phonological process has taken place. A single phonological entity can then correspond to one set of phonetic targets or two distinct sets. Two phonological entities corresponding to the same set of phonetic targets, however, implies neutralization. A situation in which two phonological entities overlap in a significant percentage of their realizations may not constitute phonological neutralization, then, but it is certainly a situation ripe for reinterpretation as such. Where the phonetic distinction between two categories becomes ambiguous in a large enough percentage of the tokens of each in a particular structural position, that distinction becomes unstable, and the likelihood of reinterpretation of the phonetic pattern as phonological positional neutralization increases. The notion of sound change as reinterpretation based on the listener’s misperception of the speaker’s intentions has its source in the work of John Ohala (passim, but especially 1981 and 1993a). On Ohala’s conception of sound change, ambiguous phonetic realizations carry with them the possibility of changes in phonological representations. Briefly, a speaker utters a string /AXC/ – [AXC]. If the phonetic realization here is ambiguous, such that the distinction between it and a realization [AYC] of string /AYC/ is not perceptually robust, a listener may hear the token of [X], but misperceive it as [Y] due to their perceptual similarity. Believing the speaker to have intended to utter [Y] as a realization of /Y/, the listener then sets up a potential alternate realization of the string in question as /AYC/. In this particular context then, the contrast between /X/ and /Y/ has been neutralized. A simple example would be the development of umlautlike vowel fronting such that /uCi/ becomes /yCi/ (or /yC/). In Ohala’s schema, the speaker, with a UR of /uCi/, produces the coarticulated sequence [yCi] with significant fronting of the /u/. The listener then fails to attribute the fronting he or she hears to the influence of the following front vowel, hearing instead [y], and believing this to be the speaker’s intended pronunciation, constructs a new representation for the form in question: /y/. If the language already contrasts /u/ and /y/ underlyingly, this results in the neutralization of the contrast between these in the relevant position. Hyman (1976) presents a model of phonologization in which he distinguishes three distinct stages of development. Phonologization in this view is conceived of as the intrinsic becoming extrinsic, such that automatic contextual variation is reinterpreted as intentional on the part of the speaker. Hyman distinguishes two distinct processes, the first of which he calls phonologization, and the second phonemicization. His example is consonantally conditioned tone split, whereby naturally occurring perturbations of F0 after voiced and voiceless obstruents are reinterpreted as intentional, such that speakers begin to use tone contours as cues to

16

Introduction

voicing, exaggerating the range of the perturbations in the process such that they can no longer be attributed to universals of phonetic implementation. In Keating’s terms, however, this is the development of language-specific phonetic patterns. According to the model laid out above, while the exaggerated F0 perturbations are clearly intentional (targeted), there is still no reason to believe that they are phonological (meaning themselves encoded as distinct phonological categories. That they function as cues to the distinction between other phonological categories is clear enough). Hyman’s next stage, phonemicization, comes when the distinction between the voiced and voiceless consonants which originally conditioned the pitch differences is lost, yielding a contrast between segmentally identical strings differing only in their tonal contours. I will use the term phonologization throughout to mean specifically the innovation of changes to phonological representations, whether these result in the neutralization of contrasts or not. Hyman’s first stage of phonologization I will treat as the alteration or addition of phonetic targets to existing phonological representations. In this study I advance the view that typological regularities involving the content of phonological patterns such as positional neutralization are best accounted for through an understanding of the phonologization process by which those patterns arise from their phonetic antecedents. I also argue that both the phonologization process and the synchronic implementation of the phonetic and phonological patterns at either end of that process are best understood in a split model of phonetics/phonology such as that described above. 1.5. Organization The rest of this study is organized as follows: Chapter 2 presents the results of a survey of licensing asymmetries found between stressed and unstressed syllables. These effects are well-known in the literature as unstressed vowel reduction (UVR). The typological profile of UVR is quite unambiguous. In virtually all cases neutralization of contrasts occurs along the dimension of vowel height. While other features, such as frontness/backness or rounding are often involved as well, cases in which a system of unstressed vowel reduction is unambiguously based on the neutralization of palatality, rounding or ATR contrasts are exceedingly rare, if not utterly unattested.11 The neutralization of vowel height in UVR systems has commonly been attributed to the difficulty of reaching articulatory targets for non-high vowels in the shortened durations allowed vowels in unstressed syllables in most of the languages in question (Flemming 2001b, 2004). Compression of the vowel space along this

Organization

17

dimension due to undershoot of targets, together with a shorter window within which to correctly identify the unstressed vowel leads to the possibility of misperceptions by listeners leading to reinterpretation of speakers’ intended pronunciations. In this way the neutralization of height contrasts is phonologized. Nearly identical patterns of vowel reduction are presented in their phonologized and unphonologized states, in some cases in different positions within a single language. In contrast to the relatively simple typological patterning found in stressed and unstressed syllables, Chapter 3 shows the array of effects found in final syllables to be substantially less uniform. There is robust crosslinguistic attestation of patterns of final-syllable strong licensing, and particularly final vowel resistance to neutralization processes for which that vowel would otherwise be a legitimate target. The existence of these final resistance effects is attributed to the well-known phonetic pattern of domain-final lengthening, together potentially with articulatory strengthening of the type documented in pre-boundary elements in English by Cho (2001). The phonetics of domain-final position, however, unlike that of the stressed syllable, is not uniformly prominence-enhancing. Together with final lengthening and strengthening, final syllables, especially above the level of the word, see dramatic drops in subglottal pressure. The significantly lower intensity and pitch of domain-final material, often accompanied by non-modal phonation, can lead to phonologization of, among other things, a pattern of weak licensing of vowel contrasts for the final syllable, such that in some languages neutralizations take place in final position alone. The range of final syllable neutralizations is catalogued and phonetic explanations for each are discussed. The diversity of patterns found in domain-final syllables is a problem for an approach assuming the inherent strength or weakness of structural positions to be encoded in UG. An approach deriving the typology of positional neutralization from the phonologization of common phonetic patterns, on the other hand, predicts precisely this ambiguous behavior on the part of final syllables. Chapter 4 presents an analysis of licensing asymmetries between initial and non-initial syllables, the first regularity of which is the surprising rarity, perhaps non-existence, of unambiguous cases of the positional neutralization of vowel contrasts involving initial syllables. The majority of cases presented as such in the literature turn out to be crucially confounded, either by bearing fixed primary stress (currently or at the relevant moment in history), or by being coextensive with the morphological root, another commonly attested domain for strong licensing potential. It is therefore not possible in these cases to say with certainty whether the licensing asymmetries in question are the result of the initialness of the syllable per se, or rather just the stressedness or morphological affiliation of that (inci-

18

Introduction

dentally) initial vowel instead. The relatively small durational asymmetries characteristic of many fixed stress systems crosslinguistically would be unlikely to result in the types of dramatic reduction processes discussed in Chapter 2. I argue, however, that smaller durational asymmetries of this sort, together with vowel-to-vowel coarticulation in languages with a particular morphological structure could result in the phonologization of precisely the types of harmony processes so frequently proceeding (synchronically or diachronically) from initial syllables. In this context I present the results of an experimental study showing that under certain circumstances even in the absence of initial stress a small durational asymmetry between the vowels of initial syllables and those of following syllables may be produced by the process of initial strengthening welldocumented already for domain-initial consonants. Here too then the typology of positional neutralization effects falls out directly from the phonologization approach. In Chapter 5, I present my conclusions together with a discussion of the enterprise of phonological typology more generally. I first compare the phonologization approach to typological patterning with approaches invoking restrictions on possible systems encoded in UG. In particular detail are discussed predictions made by the Licensing-by-Cue and Direct Phonetics approaches to PN in light of the empirical findings presented in Chapters 2 through 4. I conclude that the synchronic connection these draw between the phonological regularities and the phonetic patterns which give rise to them is too immediate to account for certain aspects of the behavior of these systems both synchronically and diachronically. While typological patterns of PN are thus best accounted for with reference to the phonetics, ironically enough it is probably the phonetics-free phonological approaches, overgeneration and all, which draw the most accurate picture of categorical PN in synchrony. I conclude with some discussion of several crucial-yet-unresolved questions for all theories of the phonetics-phonology interface, most importantly the analysis of cases of so-called “incomplete neutralization”.

Chapter 2 Stressed syllables and unstressed vowel reduction

This chapter is an examination of the licensing asymmetries attested crosslinguistically between stressed and unstressed syllables. Unstressed vowel reduction (hereafter UVR), the topic of inquiry for most of this chapter, is perhaps the best known of the licensing asymmetries to be treated in this book. Section 2.1 presents an overview of the typology of unstressed vowel reduction, identifying the main patterns a theory of vowel reduction must be able to account for. Specifically, it is shown that the overwhelming majority of UVR patterns are based on the elimination of height contrasts from unstressed syllables. Nasalization and quantity contrasts can also be involved. Systems based on the elimination of other contrasts, however, such as palatality, roundness, and ATR, RTR, or pharyngealization, features commonly operative in systems of vowel harmony, appear never to arise. This asymmetry, unremarked in previous literature on the topic, is striking and demands an explanation. Section 2.2 introduces the primary phonetic tendencies which give rise to vowel reduction patterns and shows how the phonologization model uses these to account straightforwardly for the typological patterns identified here. Most treatments of vowel reduction make a strict division between “phonological vowel reduction”, the neutralization of phonological contrasts outside of stressed syllables, and “phonetic vowel reduction”, usually held to be a less dramatic process of centralization or raising of non-high vowels as a result of articulatory undershoot of the sort demonstrated in Lindblom (1963) and discussed below. I maintain the distinction between the phonological and the phonetic here, arguing in fact that a number of UVR patterns commonly identified as phonological in the literature are in fact phonetic in nature. I demonstrate that these phonetic patterns are no less dramatic in their effect than their phonological counterparts. Furthermore, I argue that it is these phonetic patterns, through the medium of phonologization as described in Chapter 1, which account for the fact that most (but not all) categorical patterns appear phonetically motivated. The phonologization approach to typological regularities, I argue, gives us a broader and deeper understanding of the reasons for the existence of those regularities than do synchronic models with phonetic motivations built into them. Section 2.3 presents three case studies of unstressed vowel reduction systems which illustrate both the phonetic and phonological

20

Stressed syllables and unstressed vowel reduction

patterns of UVR, as well as the process of phonologization whereby the one is converted into the other. Section 2.4 introduces an alternative account of vowel reduction, the Licensing-by-Cue approach of Steriade (1994), exemplified here by Crosswhite (2001). This model is evaluated in light of the typological and phonetic findings presented in preceding sections. Section 2.5 presents a case study of vowel reduction in Contemporary Standard Russian, a system which poses an interesting test case for the comparison of the Licensingby-Cue and phonologization approaches. On the basis of experimental findings, I propose a new interpretation of Russian vowel reduction based on the distinction between phonetic and phonological processes outlined above. Section 2.6 discusses the ramifications of the phonologization model of UVR typology for the question of phonetic determinism in phonology. 2.1. Unstressed vowel reduction: patterns of neutralization Unstressed vowel reduction is a relatively familiar phenomenon in the linguistic literature, in part surely due to its robust attestation in such thoroughly studied language families as Romance and Slavic. A typological investigation of vowel reduction systems yields the following clear and striking results: The vast majority of licensing asymmetries between stressed and unstressed syllables in the languages of the world involve the neutralization of contrasts of vowel height, nasalization, or quantity. Neutralizations of features along any other dimensions, to the extent that they exist at all, are spottily attested and usually secondary to the neutralization of height contrasts in the same system. The enormous typological gaps this generalization leaves, including the absence of any systems of UVR based on neutralizations of palatality, ATR, or rounding contrasts, precisely the features most commonly involved in vowel harmony systems, must be accounted for in any theory of vowel reduction. As we shall see below, these gaps are predictable according to the phonologization approach, while in other theories they remain mysterious. 2.1.1. Vowel height contrasts Turning first to the neutralization of height contrasts, these are far and away the most commonly effaced contrasts in systems of unstressed vowel reduction, forming the basis of virtually all the well-known Romance and

Patterns of unstressed vowel reduction

21

Slavic systems noted above. A commonly cited example comes from Central Eastern Catalan, shown in (1). (1)

Central Eastern Catalan (Beckman 1998; Prieto 1992; Recasens 1986) a.

Stressed i e, , a u, o, 

b.

riw new ml pal r  mon kur

   

Unstressed i  u

‘river’ ‘snow’ ‘honey’ ‘shovel’ ‘wheel’ ‘monkey-FEM’ ‘cure’

riwt nwt mlt plt ru et munt kurt

‘river-DIM’ ‘snow-DIM’ ‘honey-DIM’ ‘shovel-DIM’ ‘wheel-DIM’ ‘monkey-FEM-DIM’ ‘cure-DIM’

In Central Eastern Catalan, stressed syllables license contrasts within a seven-vowel inventory /i, e, , a, , o, u/, while unstressed syllables license contrasts among only three vowel qualities, [i, , u]. Systems with these particular inventories in stressed and unstressed positions are fairly common (particularly within Romance). We find the same seven-to-three inventory reduction in stressed and unstressed syllables in Berguener Romansh (Kamprath 1987: 52–54), posttonic syllables in Brazilian Portuguese (Major 1981, 1985; Nobre and Ingemann 1987; Simões 1991; Wetzels 1992), and unstressed syllables in South Russian dialects (Flier 1978). Equally common are reductions from seven vowels to five as in Standard Italian (Vincent 1988b), or Loniu (Austronesian, Admiralty Islands, Hamel 1994), and of five vowels to three, as in Eastern Bulgarian12 (Stojkov [1962] 1993), Standard Belarusian (Czekman and SmuΩkowa 1988), or posttonic syllables in Seediq (Austronesian, Atayalic, Asai 1953; Li 1977; Li 1980; Holmer 1996). The fact that these same canonical inventories appear again and again in the strong and weak positions of systems of unstressed vowel reduction is good news for dispersion-based theories of vowel inventory construction (Liljencrants and Lindblom 1972; Flemming 1995, 2001b). The same perceptual demands for maximal dispersion of vowels within the vowel space appear to be operative in constructing inventories of unstressed vocalism as have been argued to govern selection of full inventories (Flemming 2001b). Interestingly, however, while stressed and unstressed inventories are remarkably steady

22

Stressed syllables and unstressed vowel reduction

across UVR systems, the mapping relationships between the vowels of the two inventories vary wildly from language to language. Thus, while in the Central Eastern Catalan seven-to-five reduction, /e, , a/ merge as [], /u, o, / merge as [u], and /i/ is realized as [i], in Berguener Romansh /i, e/ merge as [i], /u, o/ merge as /u/, and /, , a/ are realized as [], and in Brazilian Portuguese posttonic syllables /i, e, / merge as [i], /u, o, / merge as [u], and /a/ is realized as [ ]. By the same token, while in the Standard Italian seven-to-five reduction /e, / merge as [e], /o, / merge as [o], and [a] is raised slightly toward schwa, in Loniu unstressed syllables /, , a/ are realized as [] while /i, e, o, u/ stay more or less in place. Finally, in the five/six-to-three systems cited above, in Eastern Bulgarian /i, e/ merge as [i], /u, o/ merge as [u], and /, a/ merge as [ ], while in Belarusian /e, a, o/ merge as [a~ ] leaving /i, u/ in place, and in Seediq post tonic /i, a/ are not neutralized, while /e, o, u/ merge as [u]. Partial, sometimes asymmetrical variants of these basic system types are also well-attested, all of which indicates the central role played by language-specific phonetic patterns of unstressed vowel realization in determining which vowels will ultimately undergo phonologized merger. Crucially, however, in all cases systems are oriented around the elimination of height contrasts in unstressed syllables. Conspicuous in their absence are the unstressed vowel reduction analogues of the common vowel harmony systems of the world. Specifically, UVR systems based on the elimination of contrasts in frontness or backness, rounding, or ATR/RTR/pharyngealization are spottily attested at best, if not utterly non-existent. There are, of course, plentiful instances in which a front/back contrast is neutralized along with a height contrast, such as in mergers of /a/ and /e/ as [] (as Steriade [1994: 11] notes for Catalan where the resulting system is [i, u, ]), but this could hardly be interpreted as a manifestation of the need to eliminate front/back contrasts from weak positions. Aside from this, we do find neutralization of front/back contrasts among low vowels, as in Chamorro (Topping 1969; Steriade 1994; Crosswhite 2001), where in unstressed syllables /æ/ and / / merge as [a]. Given the relatively small acoustic distance between front and back low vowels, and the corresponding dispersion-based preference against this contrast in smaller vowel inventories, it is unsurprising that specifically this front/back pair should neutralize before others. The system of unstressed vocalism in Chamorro is [i, u, a]. For further reductions of front/back contrasts to occur, it is necessary for the unstressed inventory to fall below three vowels. Thus, in pretonic syllables in certain southern Italian dialects such as that of Bari (Loporcaro 1997: 340–341), the only vowel contrast remaining is between unstressed

Patterns of unstressed vowel reduction

23

[] and [a]. This pattern is also in accord with the crosslinguistic typology of vowel inventories, in which, alongside the famed vertical vowel inventories of Marshallese or Kabardian, we find no instances of “horizontal inventories”, the only other option for an inventory smaller than /i, u, a/ but retaining some contrast still.13 The final option of course is the neutralization of all vowel contrasts, as in Saanich (Straits Salish, Montler 1986; Urbanczyk 1999), where nearly all unstressed vowels are realized as schwa. In Section 2.2 below I will argue that the predominance of heightcontrast neutralizations in UVR systems follows directly from the phonetic characteristics of the systems in question. The virtual non-existence of systems based on the neutralization of front/back, round, or ATR contrasts is furthermore predicted under the phonologization approach, since except in the most extreme cases of durational contraction of unstressed syllables, these contrasts are not threatened phonetically to the extent that height contrasts are. Importantly, while the phonologization approach predicts that the UVR analogues of palatal, rounding, or ATR harmonies should be rare, by no means does it preclude their existence entirely. This is because while the phonologization approach accounts for typological regularities in the patterning of PN systems through an understanding of the phonetic circumstances which give rise to them, a central tenet of the approach is that once phonologized, the developmental logic of the systems in question ceases to be phonetic in nature. Instead, it is constrained by other factors, what we might call grammatical, cognitive, or systemic. Thus, UVR systems based on the neutralization of, e.g., palatality contrasts should not arise naturally through phonologization, since the phonetic patterns which would create such systems seem not to arise. Other diachronic developments, however, not constrained by phonetic factors, could ultimately conspire to produce inventories of this sort, just as I will argue below that other phonetically unmotivated correspondences can arise through analogical extensions or layered sound changes. In contrast to “Direct Phonetics” models of positional neutralization, in which phonetic naturalness is derived online in the synchronic phonological grammar, the phonologization model predicts only the improbability of the genesis of unnatural patterns. Should they ultimately come to exist by whatever circuitous path, the abstract, categorical phonology assumed here, devoid as it is of restrictions on the phonetic content of its patterns, should have no trouble implementing the typologically disfavored patterns. The case of the vowel systems of the Ob-Ugrian languages (Finno-Ugric) Khanty and Mansi provide evidence that this is so. In the Sosva dialect of Mansi, for example, in the non-initial syllable only /i, e, , a/ are contrasted (Honti

24

Stressed syllables and unstressed vowel reduction

1988: 335), meaning that front and back vowels are contrasted only in the initial, stressed syllable. This system, predicted not to arise as the direct result of the phonologization of phonetic patterns, developed from an earlier system in which non-initial vowels were subject to palatal harmony, such that while both front and back vowels occurred, their appearance was conditioned by the frontness or backness of the vowel of the initial syllable. In many (but not all) Ob-Ugrian dialects it is the case that the ancestral harmony system has broken down in some way. In Sosva Mansi, then, even prior to the merger of the front and back non-initial syllable vowels which created the appearance of a system of UVR, the contrast between the front and back was already neutralized. This preexisting lack of phonological contrast along the front/back dimension may have predisposed the language to undergo the reductions it did further along in its history. I know of no instance of the attestation of a system similar to this one in which a healthy contrast between front and back vowels was simply lost to unstressed vowel reduction in the same manner height contrasts are lost in the development of canonical UVR systems. All this makes Sosva Mansi a typological rarity, but there is no reason to think that this crosslinguistic fact in any way impedes the functioning of the system in synchrony. This is unproblematic for the phonologization approach to typology. Theories seeking to derive the phonetic “motivation” of typological patterns through the existence of phonetic restrictions in phonological Universal Grammar, on the other hand, in order to derive the near exceptionless height-first rule for UVR systems must assume that UG itself has something against systems like Sosva Mansi. But while systems of this sort may be disadvantaged in evolutionary terms, for the Mansi, presumably, they are acceptable enough. Something similar to the Sosva Mansi system occurs in some dialects of Khanty, such as Tromagan (Abondolo 1998b). Additionally, in most Khanty dialects (as noted by Steriade [1994]), no round vowels occur outside initial syllables, and vowel harmony has been lost, creating a system in which UVR neutralizes specifically round contrasts, another rare configuration.14 The Proto-Ob-Ugrian vowel inventory for non-initial syllables contained front and back harmonic variants of ii, aa, and  (Honti 1988: 331). Proto-Uralic, in fact, is reconstructed with palatal harmony affecting non-initial-syllable vowels of only two heights, written as *[i, , ä, å] (Sammallahti 1988: 481). Discounting the harmonization, this system is identical to the two vowel system attested in cases of UVR such as the pretonic syllables of Bari Italian (given as [ , a] by Loporcaro), in which remaining front/back contrasts are neutralized while a single height contrast remains (the next step being the reduction of that last height contrast, yielding schwa). The lack of round vowels in Khanty non-initial syllables

Patterns of unstressed vowel reduction

25

is thus a continuation from this earlier state, one which is both attested and expected in the typology of UVR systems. It does not, therefore represent a legitimate instance of a (phonetically-driven) UVR system based on the elimination of roundness contrasts in unstressed syllables. Again, though, this is no reason not to consider it such in synchrony. Ob-Ugrian vowel systems are fascinating and richly deserving of increased attention by phonologists. Another putative instance of a UVR system based on the elimination of palatality contrasts are the Upper and Lower Carniolan dialects of Slovenian as described by Lencek (1982: 146–151) and cited in Crosswhite (2001). In these dialects, which contrast short and long vowels, the short high vowels /i/ and /u/ fall together with underlying // (in Upper Carniolan underlying schwa has a tendency to delete altogether). Crucially, though, these mergers are not related to the placement of stress, but are rather unconditioned mergers of these three short vowels in both stressed and unstressed positions. Among Lencek’s examples are even such stressed monosyllables as Upper Carniolan [nt] (cf. Standard Slovene [nit]) and Lower Carniolan [st] (cf. Standard Slovene [sit]). It is true, of course, that unstressed syllables share with short vowels the property that they are the shorter of two sets of vowels in the system, and hence both may be subject to similar pressures. But they differ in important respects as well, perhaps most importantly in that unconditioned mergers in the short (or long) vowel inventories are not examples of positional neutralization. They take place across the board, regardless of the structural position in which they are located. In positional neutralization systems such as UVR, the two opposing sets of vowels, stressed and unstressed, are by definition in complementary distribution. Long and short vowels in, e.g., Slovene, are a different matter. They may both occur in stressed and unstressed syllables, and because of the potential for paradigmatic contrast of this sort (unavailable in UVR systems), they may exert pressure on one another in the manner that high vowels and mid vowels of the same quantity may also, potentially occasioning chain shifts and the like. That such a merger should take place in the short vowel inventory of Slovene dialects is interesting,15 and the fact that similar mergers seem absent from UVR systems points out a potential difference in the relationship between long and short and stressed and unstressed vowel inventories, but this is just one of many differences we do not yet fully understand (see Steriade [1994: 14, 22] for some notes here) in the development of these systems. One clear difference here is that long vowel inventories are routinely both larger and smaller than short vowel inventories, while in UVR systems stressed inventories are rarely if ever smaller than unstressed. Languages with contradictory patterns from this point of view listed in

26

Stressed syllables and unstressed vowel reduction

Maddieson (1984) include Telefol (long /i, , a, , u/, short /i, a, u/) and Chipewyan (long /i:, a:, u:/, short /i, e, a, o, u/. Obviously in each instance radically different factors may have contributed to the development of the systems in question. The point is that the relationship between long and short inventories is of a different nature altogether, and that phenomena affecting this relationship and that of stressed and unstressed vowels in UVR systems may be similar, but are not directly comparable. That stress shifts in Slovene may result in categorical alternations between contrastively long high vowels and [] (formerly a short high vowel) indicates the disconnect between potential synchronic alternations and the phonetic patterns commonly giving rise to vowel reduction, but does not necessarily add to the list of patterns that phonetic motivations need account for. A more convincing instance of high vowel front/back neutralization in unstressed syllables comes from the history of Welsh (Bosch 1996; Williams 1989). In Northern Welsh there is a synchronic pattern whereby final [, u] alternate with penultimate [ ]. Synchronically this case is complicated by the fact that stress in Northern Welsh is penultimate, making this an instance of non-augmenting reduction in strong position.16 Additionally, there are lexical exceptions to the reduction of []. It appears that non-reducing [] is actually original, while reducing [] is derived from an earlier /y/, which later merged with //. The original reduction, then, was of /y/ and /u/ to []. The case is complicated, and in any event not an instance of the elimination of all front/back contrasts even among the high vowels (since original /i/ and // remain contrastive in non-final syllables). Tamil (Beckman 1998) presents a case in which no round vowels occur outside the stressed initial syllable (see Chapter 4 for a discussion of stress in Tamil and Dravidian at large. Why Beckman chooses to treat Tamil as an instance of initial prominence rather than UVR is not obvious), contrary to the generalization made here. In the dialect Beckman treats, of the five underlying short vowels /i, e, a, o, u/, only /i, a, u/ appear contrastively outside the initial syllable (Beckman 1998: 87). Additionally, in non-initial syllables, the three peripheral vowels receive the allophonic realizations [, , ]. It is thus not so much the case that the system avoids round/unround contrasts in unstressed syllables, but rather that the allophonic unrounding of short /u/ and the predictable UVR elimination of the mid vowels conspire to produce a system in which no round vowels appear. It is potentially relevant here too that it is short /u/ in particular which is affected in this way. Even in the initial syllable the short peripheral vowels undergo some qualitative phonetic reduction, being realized as [, , ] respectively.

Patterns of unstressed vowel reduction

27

The special vulnerability of the short high vowels in systems with long/short contrasts deserves serious further investigation. One potential avenue of exploration would focus on the durational characteristics of these, particularly in connection with the well-known crosslinguistic tendency for short high vowels to be subject to devoicing. Another direction might explore the qualitative issues, in particular the perceptual effects of the laxing short high vowels often undergo, presumably to keep them distinct from the long high vowels in the event of shorter phonetic realizations for these latter. The vulnerability of –ATR high vowels in ATR systems may be of relevance here as well. The generalization, in any event, is clear: While UVR systems eliminating height contrasts from unstressed syllables are abundant throughout the languages of the world, systems of UVR eliminating contrasts in palatality, rounding, ATR/RTR or pharyngealization in a reduction equivalent to the commonly attested harmony systems appear to be completely unattested. While the list of height-neutralizing systems is obviously in no way meant to be exhaustive, the list of potential exceptions is. A central claim of this chapter is that this hole in the typology is no accidental gap, but rather follows from the phonetic characteristics of UVR systems in a manner consistent with the phonologization approach to typological patterns. Details follow in Section 2.2 below. 2.1.2. Nasalization Another contrast found neutralized in unstressed syllables is vowel nasalization. In Copala Trique, for example, nasal and oral vowels contrast only in the stressed syllable (Hollenbach 1977; Steriade 1994). The same is true in Guarani (Beckman 1998: 158). Reduced durations in unstressed syllables may go some way toward explaining the neutralization of the contrast between nasal and oral vowels outside stressed syllables. It is wellknown that, crosslinguistically, nasalized vowels tend to be significantly longer than their oral counterparts (Whalen and Beddor 1989). If longer duration is needed for accurate perception of the contrast, its neutralization in shorter unstressed syllables is understandable. Stressed vowels may also be subject to less coarticulation than unstressed vowels, increasing the likelihood that assimilations would flow from the former to the latter, rather than vice versa, as argued by Majors (1998) in her treatment of stress-based harmony systems.

28

Stressed syllables and unstressed vowel reduction

2.1.3. Quantity Quantity contrasts behave somewhat ambiguously with respect to stressed syllables in the languages of the world. It is a commonplace that certain languages require stressed vowels to be bimoraic, lengthening them when necessary, and thus neutralizing the contrast. This type of process is what Smith (2002) calls augmentation, and cites as a characteristic of stressed syllables. The idea here is that it is more important to enhance a stressed syllable’s phonetic prominence, ensuring perception of the placement of stress than it is to maintain segmental contrasts within the stressed syllable. It does seem clear that some languages value the heaviness of a stressed syllable over the preservation of the contrast between monomoraic and bimoraic vowels. From the phonologization perspective, we would expect these to be languages in which a vowel quantity contrast was forced to coexist with strongly duration-cued stress. The phonetic lengthening of short vowels under stress (common in many languages without long/short contrasts, such as Russian) would ultimately make the contrast difficult to perceive, eventually resulting in misperceptions and concomitant reanalysis of the lengthened short vowel as an intended distinctively long vowel. This phonologization, of course, neutralizes the contrast. It is not surprising, given that stressed syllables are, if anything, a lengthening environment, that the output of this neutralization should be a long vowel. See the discussion of vowel length contrasts in final syllables in Chapter 3, however, for a more complicated case. In addition to languages neutralizing quantity contrasts in stressed syllables, however, there are also attested a fair variety of languages which neutralize length contrasts in all syllables except the stressed syllable. Kolami (Emeneau 1961: 6–7) and several other Dravidian languages are examples. Here the initial syllable is stressed, and long and short vowels do not contrast in non-initial syllables. In Warumungu (Bosch 1991), as in Dravidian, stress is fixed initial, and contrastively long vowels occur only here. In Jamul Tiipay (Miller 2001), vowels contrast long and short only in the stressed and immediately pre-stress syllables, with all vowels outside these positions being short. Also in Burushaski (Anderson 1997), vowel length contrasts exist only in stressed syllables, with all unstressed syllables licensing only short vowels (though this case is not without complications). What these patterns suggest is an origin in languages in which unstressed syllables were at least somewhat restricted durationally, enough to cause sufficient shortening of unstressed long vowels that the contrast should be lost. Stressed syllables, however, cannot have been expanded durationally to a great extent for the quantity contrast to remain intact. This prosodic profile actually fits the Dravidian languages remarkably well. Many

The phonetics of vowel reduction

29

Dravidian languages are described as having some degree of initial stress, or some immediate derivative thereof, but this stress, as in Tamil (Schiffman 1999; Bosch 1991: 183), is not reported to be marked by a great durational asymmetry. Durational curtailment of non-initial syllables, however, is demonstrated in the frequent patterns of syncope of non-initial vowels in various patterns throughout the histories of various Dravidian languages, e.g., Kannada (Schiffman 1983), Kurux (Pfeiffer 1972; Gordon 1976) and Malto (Gordon 1976). In Toda, in fact, a language which also underwent massive syncope in non-initial syllables, Emeneau (1984) describes a length contrast with a durational ratio of about 1:2 in initial syllables. Toda also has strong initial-syllable stress. In non-initial syllables, by contrast, Emeneau characterizes the long – short distinction as more between “short and less short”, with the distinction being variable and often difficult to perceive. This leads Emeneau to speculate that the contrast may be in the process of disappearing. The preceding sections have shown that the only clearly attested patterns of neutralization involving stressed and unstressed syllables are those affecting primarily height, nasalization, or quantity. For each of these features, duration is either a well-known cue for the contrast in question, or in the case of quantity, potentially the only cue for the contrast in question. Patterns of UVR conspicuously not attested include those based on contrasts of palatality, roundness, and ATR, all commonly active in harmony systems (see Chapter 4), though not known to be strongly cued by duration. The following section explores the phonetic basis of UVR in more detail. 2.2. The phonetics of vowel reduction The fact that unstressed vowel reduction should so definitively prefer targeting height contrasts is perfectly comprehensible in light of the phonetic environment in which it generally arises. The correlation between the presence of UVR in a language and duration as a primary correlate of lexical stress has been widely recognized for decades (see, e.g., Lehiste 1970). Generally speaking, UVR appears in languages with a large durational asymmetry between stressed and unstressed syllables, such that unstressed syllables undergo significant durational contraction relative to a substantially longer stressed syllable, particularly under increased rate of speech. Flemming (2001b, 2004) follows Lindblom (1963) and others in attributing neutralizations of height contrasts in durationally impoverished syllables to the comparatively longer duration needed to achieve the open

30

Stressed syllables and unstressed vowel reduction

upper vocal tract required to produce a higher F1. One way of adapting to the durational limitations of unstressed syllables would be to speed up articulatory movements sufficiently so that the more open target articulations of non-high vowels could be reached in the time allotted them. Flemming attributes the failure of a language to implement this alternative to a desire to minimize articulatory effort where possible. The other option, short of expending that additional effort or violating restrictions on the durations of unstressed vowels is to articulate the vowel in such a way that it falls short of its articulatory targets. In this new, more crowded vowel space, of course, perceptual problems produced by spectral overlap in the realizations of unstressed vowels are compounded by the shorter time the listener has to correctly identify a comparatively indistinct vowel. This situation sets the stage for routine misperceptions, which can ultimately result in the loss of the contrasts in question. This reinterpretation is the phonologization process which produces systems of unstressed vowel reduction. While Section 2.1.1 demonstrated the enormous crosslinguistic variation found in the merger patterns between the stressed and unstressed vowels of otherwise identical inventories, the merger patterns attested are not completely arbitrary. Rather, each set of mergers represents plausible groupings of vowels according to perceptual proximity to one another caused by one or another language-specific pattern of shrinking of the vowel space due to durational pressures on articulatory targets. Which is to say, patterns of merger reflect the directions in which vowels were being shifted under durational pressure in unstressed syllables.17 This can be seen in action in the case studies of Brazilian Portuguese and Bulgarian below, in which different dialects show phonologized and unphonologized versions of the same patterns. Since it is chiefly the implementation of vowel height contrasts which is hindered by the decrease in duration of the unstressed syllable, it is to be expected that unstressed vowel reduction would primarily impact contrasts of vowel height, leaving other contrasts, such as those not found to be active in systems of UVR for the most part unscathed. The phonologization approach thus accounts for the fact that the contrasts most often involved in vowel reduction systems are precisely those which would be most severely impacted by serious durational impoverishment. Recent UG-based approaches to vowel reduction, on the other hand, fail to speak to this striking typological generalization. Another way in which the phonetics can create the input to phonologization, resulting in stress-based licensing asymmetries is through the durational and articulatory enhancement of the stressed syllables. Cho (2001) presents articulatory evidence from English that the presence of pitch accents on vowels results in their strengthening not only in duration

The phonetics of vowel reduction

31

and resistance to vowel-to-vowel coarticulation, but also in what Cho calls sonority enhancement, whereby lip and jaw position are both more open, and in many cases tongue position is lower as well (but see de Jong [1995] for somewhat contradictory results). These enhancements, taken together, can account for lengthenings, lowerings, and diphthongizations of stressed vowels, potentially resulting in the emergence of new contrasts in those positions (assuming that some comparable stressed vowels fail to undergo the process, or enter the language only after the process has taken place). That it is specifically duration or lack thereof, and not “stressedness” per se that is responsible for the licensing asymmetries found in stressed and unstressed syllables is underscored by cases in which vowels of one or another degree of stress are nonetheless subject to reduction due to the failure of that stress to impart any additional duration to the vowel in the syllable it is realized on. Thus, in Italian, only vowels under primary stress are immune to vowel reduction, while secondarily-stressed vowels undergo the neutralization of unstressed high and low mid vowels to [e] and [o] just as would an unstressed vowel (Crosswhite 2001: 206). Experimental evidence from Farnetani and Kori (1990) provides an explanation for this fact, in that vowels with secondary stress are seen not to differ significantly from unstressed vowels in duration, while the vowels of stressed syllables are regularly longer than both. It should be noted also, though, that while durational impoverishment is strongly correlated with vowel reduction, it is not necessarily the case that additional duration alone will rescue an otherwise-reducing unstressed vowel from its fate. This is the conclusion reached by Johnson and Martin (2001) in their experimental study of phonetic vowel reduction in Creek, where the additional duration supplied to final vowels by phrase-final lengthening does not cause them to resist the centralization characteristic of unstressed syllables in the language. Nord (1987) reaches similar conclusions for Swedish. The role of final lengthening in inhibiting UVR in many languages (along with its failure to do so in others) is treated in detail below in Chapter 3. The phonologization approach advanced in this study derives the typological characteristics of UVR systems from language-specific patterns of phonetic undershoot developing in unstressed syllables under articulatory pressure caused by durational impoverishment. We should therefore expect to find in the languages of the world plentiful instances of patterns of phonetic vowel reduction in which vowels of different heights come to approximate one another’s positions in the vowel space, but have yet to undergo phonologization as categorical UVR. Such systems are indeed well-attested. In fact, in addition to clear cases of phonetic vowel reduction and phonological vowel reduction, we also find systems in which both

32

Stressed syllables and unstressed vowel reduction

patterns are attested together. The following sections present case studies of each type. 2.3. Case studies 2.3.1. The phonologization of phonetic vowel reduction: Bulgarian The first case study I present is that of unstressed vowel reduction in Bulgarian. Bulgarian is an especially valuable example of UVR in that the continuum of its dialects from west to east illustrates eloquently the way in which gradient phonetic patterns over time become phonologized, yielding categorical patterns of neutralization. Some of the westernmost dialects of Bulgarian lack qualitative vowel reduction more or less completely. Unstressed non-high vowels are realized with qualities not overly different from those of stressed non-high vowels. Other western dialects, such as that of the Sofia region, have varying degrees of phonetic raising of non-high vowels, without, however, neutralizing the relevant contrasts (at least in the case of the mid and high vowels). Further east, the phonetic raising pattern has undergone phonologization, such that Eastern Bulgarian has a fully developed system of categorical UVR. Details are as follows. Bulgarian has six vowels in stressed syllables: /i, e, a, â, o, u/. I n unstressed syllables in Eastern Bulgarian, the mid vowels raise to neutralize with the high vowels as [i] and [u]. The low vowel /a/ raises to merge with /â/ as []. I use the standard symbol ‘â’ to transcribe the non-low central vowel of Bulgarian. Many scholars transcribe it as //, which unfortunately gives the impression that this vowel is qualitatively identical to the unstressed vowel representing the merger of /a/ and /â/ in most dialects. Crosswhite (2001), for example, assumes on this basis that stressed /â/ is featureless, with its realization interpolated between the gestural targets of adjacent segments. In fact, however, the stressed realization of this vowel and the unstressed vowel representing the merger of /a/ and / / are not identical in quality. The phonetician and dialectologist Vladimir Zhobov of Kliment Oxridski University in Sofia describes it (personal communication) as being more in the direction of cardinal vowel 15 ([]). The Bulgarian Academy of Sciences grammar edited by Tilkov (1982: 50) shows a clear degree of constriction in the velar region present in the case of stressed /â/ and absent for its unstressed counterpart. Of the unstressed vowel relative to the stressed, Tilkov says that the unstressed version “is characterized by a greater openness of the oral cavity, such that the tongue has a lower position, and thus a greater degree of closure in the pharyngeal cavity” (Tilkov 1982: 50). Of its acoustic characteristics, he continues,

Case studies

33

“this change in articulatory configuration is reflected spectrally through a concentration of acoustic energy through raising of the frequencies of the first formant, and a corresponding lowering of the frequencies of the second formant – the characteristics of a more open vocalization.”18 The stressed vowel is very explicitly lowered to merge with unstressed /a/. This merger makes sense as the result of centralization, whereby less jaw opening for /a/ and less velar constriction for /â/ brought the two close enough together that misinterpretations occurred, resulting in loss of the contrast, and neutralization as actual [ ]. The rise of the Eastern Bulgarian system becomes clear when viewed in conjunction with material from other Bulgarian dialects, such as those underlying the standard language. In Eastern Bulgarian UVR, there are three categorical neutralizations, yielding a phonological three-vowel unstressed inventory. In the standard language, and most of the western and southern part of the country, however, this is not the case. Wood and Pettersson (1988) characterized the situation in general terms as follows: The actual occurrence of reduction in everyday speech is subject to normative, social, dialect and morphological constraints. Unstressed /a/ is reduced in all dialects. Complete reduction of both /e/ and /o/ is limited to eastern dialects, but they are reduced to a varying extent elsewhere, depending on the informality of the situation. The reduction of /o/ is common in the Sofia dialect, but reduction of /e/ is less usual there. (Wood and Pettersson 1988: 240)

In Pettersson and Wood (1987), they provide more details, but decide to describe all the systems in terms of a single phonetic tendency: “However, we prefer to see the regional distribution as a case of varying generalization of a single process, a reflection of how the Bulgarian speaker’s willingness or unwillingness to make certain reductions is subject to a complex of stylistic, sociolinguistic and dialect forces.” (Pettersson and Wood 1987: 268) This holistic approach to the dialect continuum and social fabric is fine of course, for certain aims, but is unacceptable as the basis for a model of the grammar of a single speaker of a single dialect, if that speaker’s dialect simply lacks certain of the categorical reductions represented in the holistic take. Crosswhite (2004), who describes “Bulgarian” with the three neutralizing reductions characteristic only of Eastern Bulgarian dialects, notes in a footnote that In the pronunciation norm of Sofia and other western areas of Bulgaria, vowel reduction is weaker than in the eastern areas, or even entirely absent. Since Sofia pronunciation defines the standard, many vowel-reducing speakers

34

Stressed syllables and unstressed vowel reduction attempt to suppress vowel neutralizations when speaking in formal registers. In particular, suppression of /e/ > [i] is quite common, and the lack of this suppression is rather stigmatized. (Crosswhite 2004: 51)19

The general point concerning greater degrees of reduction in informal speech and lesser degrees in formal speech is correct, as is the suppression of the reduction of /e/ by speakers of eastern dialects wishing to avoid detection. But speakers of dialects such as that of Sofia and virtually the entire west and south of the country are not “suppressing” reduction of /e/ or categorical reduction of /o/ under some sort of social pressure to conform to the literary standard. They are speaking their native dialects, which simply do not include these features in their grammars. A speaker whose dialect is close to the standard in this respect (e.g., a Sofian) will at least in informal speech merge /a/ completely with /â/. The same speaker may even realize some percentage of utterances of unstressed /o/ in informal speech as overlapping more or less completely with unstressed /u/. But that speaker will not, no matter how informal, lazy, drunk, or socially maladjusted, suddenly adopt categorical merger of /e/ and /i/ in unstressed syllables, because no register of this dialect contains such a process in its phonological or phonetic grammar. Crosswhite’s description of neutralizing reduction in Bulgarian is attributed to the results of an experimental study on the speech of two Bulgarians from villages near Sofia presented in Wood and Pettersson (1988). Describing these results, she writes: Pettersson and Wood started their investigation by first verifying the existence of acoustically neutralizing vowel reduction using spectrographic evidence – the formant frequencies measured for unstressed /e,o,a/ were found to coincide with those measured for the vowels /i,u,/, respectively. Since there was no acoustic difference in vowel quality reflecting these underlying vowel contrasts, we can state that Bulgarian vowel reduction is acoustically neutralizing. (Crosswhite 2001: 42)

But this is simply false, both as a characterization of vowel reduction in the Sofia dialect, and as a characterization of the experimental conclusions of Pettersson and Wood. What Pettersson and Wood actually report is the very opposite.20 On pages 244–247 they show scatterplots of their results for unstressed vowels spoken in isolation, and also in frame sentences. Unsurprisingly, there is a tendency toward more reduction in the frame sentences than in forms spoken in isolation. For one speaker in the isolated words even unstressed /a/ and /â/ form clearly distinct, if overlapping, distributions. Since they provide no statistical analysis, it is hard to see

Case studies

35

whether otherwise /a/ and /â/ form one distribution or two. /o/ and /u/ overlap for both speakers, but in both contexts many of the realizations of /o/ fall well outside the distribution for /u/. Again, statistical analysis would say for certain. For /e/ and /i/, however, there is very little overlap for either speaker at either rate, to say nothing of a single distribution. Most importantly, to the question of phonological neutralization and categorical merger the authors say directly, evaluating their results (in the context of an earlier study by Lehiste and Popov): “Lehiste and Popov summarize the vowel spectra as means, which gives the impression that vowel reduction in Bulgarian is discrete [Though the means of Lehiste and Popov also reflect almost no reduction of /e/ – JB]. But the raw data recorded in Figs 1–4 indicate that reduction is gradual.” (Wood and Pettersson 1988: 251) Also on pages 258–259: The two informants were typical of the reported tendency in their dialect not to reduce unstressed /e/ as much as /a, o/. One informant tended to be more formal when reading isolated words. ... Reduction was gradual rather than discrete, suggesting that an underlying articulatory plan for a complete rendering is executed with varying degrees of attention, depending on which components are neutralized and how far.

I take the above to indicate unambiguously that vowel reduction in Bulgarian, except in its eastern dialects (specifically the Balkan dialects, so-called, meaning the mountains, not the peninsula), is not neutralizing. It is gradient and apparently rate-dependent. Intriguingly, Tilkov (1982: 46), characterizing the likelihood and degree of application of this reduction in the standard language, states that unstressed vowels reduce the least in the first pretonic syllable (what he calls the first degree of reduction) and moreso in non-first pretonic syllables and posttonic syllables. This is the same phonetic trend as is found in East Slavic and discussed below in the context of Russian and Belarusian, only apparently far less dramatic. Tilkov provides no quantitative demonstration of this trend, unfortunately.21 Seen from a dialectological perspective, of course, the different Bulgarian dialects do represent a single whole with respect to UVR: They preserve a number of stages along a possible progression of phonologizations of gradient phonetic patterns. In the west, nearer to the borders with the Former Yugoslavia, dialects have little if any unstressed vowel reduction (missing entirely in standard Macedonian and SerboCroatian, present to some extent in certain dialects nearer to Bulgaria). In Sofia there is gradient phonetic raising of /a/ and /o/, possibly with the former phonologized as a merger for some speakers. Little or no raising of /e/, however, is attested. Variants on this situation are found across the

36

Stressed syllables and unstressed vowel reduction

country (as well as pockets of interesting innovations and/or archaisms which are beyond the scope of this study). Finally, in the eastern dialects, the phonetic reduction of /e, a, o/ has become dramatic enough and consistent enough that it has been phonologized, resulting in categorical phonological mergers of those vowels with /i, u, /. 2.3.2. Categorical vowel reduction: Belarusian An example of phonologized, categorical unstressed vowel reduction comes from Belarusian (Czekman and SmuΩkowa 1988). In Belarusian, five vowels, /i, e, a, o, u/, in stressed syllables are reduced to three vowels, [i, a, u], in unstressed syllables. Specifically, in unstressed syllables /e, a, o/ merge, all being realized as [a] or [ ], depending on the position of the vowel relative to the stress. Crucially, in Belarusian, as in Eastern Bulgarian, vowel reduction is neutralizing. It takes place whenever the nonhigh vowels occur in an unstressed syllable, regardless of the phonetic duration of the nucleus vowel. There is also no question of gradience. The mid vowels are simply realized as the contextually appropriate allophones of /a/. Reduced tokens in which, for example, /o/ is more [a]-like or more [o]-like depending on its duration are not found. The diachronic situation in East Slavic is complex and will not be resolved definitively here. It is possible, however, that the mergers of /e/, /o/ and /a/ never took the route of gradual phonetic approximation to eventual phonologization of merger in the first place. To dramatically distill the opposing points of view on the matter, one account holds that southern East Slavic dialects lost the distinction between /a/ and /o/ in unstressed syllables as a result of vowel reduction. Evidence of this includes the relatively late appearance of orthographic evidence of the merger in East Slavic documents (fourteenth century – Vlasto 1985: 316–317). Another view holds that this merger (known as ‘akanje’, ‘[a]saying’, in the Slavic literature), is actually of more ancient provenance, being not a reduction of unstressed vowels at all, but rather a failure of the contrast between /a/ and /o/ ever to emerge outside stressed syllables. Stressed [a] and [o] in the relevant dialects are ultimately derived from Late Common Slavic long /a:/ and short /a/ respectively. The rephonologization of this earlier quantity distinction as one of quality took place at a period in which similar changes were affecting the rest of the vowel system as well, essentially the breakdown of the Common Slavic prosodic system. While in most areas the split of the long and short low vowels took place in all syllables, irrespective of the placement of stress, it has been argued, most famously perhaps by Shevelov (1964), that in the relevant dialects of East

Case studies

37

Slavic the split occurred only under stress, while in unstressed syllables the quantity difference collapsed to yield a single [a]-like vowel of one or another degree of reduction.22 Whatever the case diachronically, however, the synchronic status of the system as one of categorical UVR eliminating contrastive mid vowels from unstressed syllables is clear enough. While the neutralization is not conditioned in synchrony by the phonetic properties of the host syllable, incidentally, the realization of the allophone of /a/ resulting from the merger is subject to some duration-dependent variation. In the first pretonic syllable, the vowels of which are characteristically longer than other unstressed syllables in Russian and Belarusian, /a/ is realized as [a]. In nonfirst-pretonic unstressed syllables, however, contrary to the claims of many descriptions, it is realized somewhat differently. Czekman and SmuΩkowa (1988: 226) symbolize the vowel in question as Greek ‘α’, describing it as weaker and higher than stressed [a], and coming about via the reduction of [a]. 2.3.3. A mixed system: Brazilian Portuguese Brazilian Portuguese is a system in which both phonologized and unphonologized vowel reduction patterns are in evidence, though for some speakers the phonetic pattern appears already to have become categorical. The facts of Brazilian Portuguese vowel reduction are generally described as follows (Major 1981, 1985; Nobre and Ingemann 1987; Simões 1991; Wetzels 1992): The stressed vowel inventory is /i, e, , a, , o, u/. In pretonic syllables, however, the contrast between high mid and low mid (or tense and lax mid) is neutralized, yielding [e, o]. In posttonic syllables, the mid vowels are eliminated entirely and /a/ is raised to schwa, yielding the inventory [i, u, ]. The neutralization occurring as a result of what is often called the first degree of reduction, the prohibition on realization of a contrast between /e, o/ and /, / in any unstressed syllable, is categorical (Major 1985). No amount of additional duration, emphasis, or formality of register will cause it to reemerge in syllables where it is neutralized. It must, therefore, be represented by the collapse of the two independent entities into a single category in the phonology. The second degree of reduction (occurring in posttonic syllables), however, is more ambiguous. Major notes that for many speakers Degree 2 reduction of /e, o/ to [i, u] does not take place in phrase-final open syllables. He demonstrates that these syllables are also subject to substantial phrase-final lengthening, which exempts the Degree 2 vowels from reduction, making them durationally (and spectrally) equivalent to the vowels of pretonic syllables.

38

Stressed syllables and unstressed vowel reduction

Furthermore, the realization of the pretonic syllables as [e, o], which Nobre and Ingemann show to be somewhat reduced qualitatively in comparison with their stressed counterparts in any circumstance, can in certain speech styles undergo further phonetic reduction. Major notes that while in the “normal” register pretonic mid vowels are realized as (slightly reduced) [e, o], in “casual” speech these vowels are reduced further still, being realized instead as [i, u] also.23 Major hypothesizes that the reason for the difference in register is one of speech rate, which is to say, the vowels pronounced at the faster rate have shorter durations. The reduction of [e, o] to [i, u], being duration-dependent, applies pretonically only when those vowels are shortened sufficiently to condition it. Posttonic vowels, on the other hand, are routinely short enough, unless altered by phrase-final lengthening. Other processes are also sensitive to speech rate in this way: In normal speech, nasalized mid vowels are never reduced, but in fast speech in posttonic position only they are raised to merge with the high nasalized vowels. Major attributes their failure to raise in normal speech to the fact that, as is typical crosslinguistically, the nasalized vowels are phonetically longer than their oral counterparts all things being equal, such that only a substantially faster speech rate would contract them sufficiently to begin to challenge the articulation of their target heights. While Degree 1 reduction is irreversible and categorical, Degree 2 reduction, the raising of mid vowels, clearly demonstrates gradient properties. The duration-dependence of the process indicates that it is not yet independent of phonetic factors in the way that Degree 1 reduction appears to be. Approaches treating both reduction processes in Brazilian Portuguese as equivalently phonological have no means of characterizing the failure of reduction in phrase-final vowels and the fast-speech reduction of pretonic vowels as a single process. By assuming the gradient, durationdependent raising of Brazilian Portuguese mid vowels to be carried out entirely in the phonetics, we gain a more accurate, unified picture of the processes at work here. We need only assume that mid vowel raising is a function of phonetic vowel duration, and that the language-specific phonetics assigns different durations to syllables according to their position in the word relative to the primary stress, their position within the phrase, and rate of speech, none of which should be controversial. As a final note, Major points out that not all speakers allow for the nonreduction of mid vowels in phrase-final syllables despite the additional duration supplied by phrase-final lengthening. He hypothesizes that for these speakers posttonic raising has become “lexicalized”, or phonologized in the terms applied here. For speakers who have phonologized this merger, the raising becomes categorical, sensitive now only to the structural

UG-based approaches to UVR typology

39

description of its host syllable, rather than to factors like the additional duration supplied by phrase-final lengthening. The previous sections have presented examples of similar UVR systems on both ends of the phonologization process, demonstrating how patterns of vowel realization produced by phonetic durational pressures are converted into categorical neutralization in the phonology through phonologization. Since durational pressures impact the realizations of some features more severely than others, and vowel height in particular, vowels come to approximate one another’s positions creating perceptual ambiguity only along those dimensions. Since phonetic ambiguities arise primarily in the dimension of vowel height, it is precisely height contrasts which will be lost to the phonologization of UVR. Since contrasts in frontness/backness, rounding, or ATR are not dependent on longer durations articulatorily to the same extent as height contrasts are, we would not expect vowels exhibiting these contrasts to develop ambiguous realizations to the same extent, except under more extreme durational pressures. In such situations though, even those contrasts will ultimately be abandoned, yielding the typological picture presented above in 2.1 whereby, for example, front or back contrasts may be lost in addition to height contrasts in unstressed syllables, but will not be lost instead of them. That other duration-sensitive contrasts such as those of nasalization and quantity should also be sensitive to the stressedness of the syllable in which they appear is predicted by the phonologization model in the same way. As the cases of Sosva Mansi and Tromagan Khanty illustrate, however, the phonologization approach predicts only the evolutionary improbability of the disfavored UVR patterns. Because of its diachronic take on typological patterning, should patterns of this sort nevertheless come to exist, through channels other than straight phonologization of duration-dependent phonetic patterns, the phonologization model in no way prejudices the phonology against their implementation. 2.4. UG-based approaches to UVR typology Taken as a model of the UG-based approach to phonological typology, and of the Licensing-by-Cue (Steriade 1994) approach in particular, Crosswhite (2001) presents the most comprehensive account of vowel reduction patterns to date in the generative tradition. An Optimality-Theoretic approach to unstressed vowel reduction, that study seeks to account both for the synchronic implementation of UVR systems in the constraint-based phonology, but also through the reranking of constraints or constrainttypes, to produce a factorial typology generating all and only attested

40

Stressed syllables and unstressed vowel reduction

patterns of vowel reduction. The assumption that the phonological grammar must be responsible both for modeling individual competence and for producing a full accounting of crosslinguistic phonological typology is central to the program of Optimality Theory (Prince and Smolensky 1993). It is also, I would argue, a serious mistake. In the following sections I introduce the account of phonological UVR presented in Crosswhite (2001), and compare its typological predictions to those of the phonologization model. In addition to contributing a comprehensive typological survey of vowel reduction patterns in the languages of the world, Crosswhite (2001) seeks, on the basis of evidence accrued therein, to divide vowel reduction systems into two major types. These types are said to differ both in their phonetic motivations and formal implementations, and the distinction between them for Crosswhite represents the primary typological generalization to be made for UVR systems. The two types are named contrast-enhancing and prominence-reducing UVR. Contrast-enhancing vowel reduction is a neutralization process whereby “undesirable or perceptually challenging vowel contrasts are limited to stressed position.” (Crosswhite 2001: 22) In this way it is a direct descendent of Steriade’s Licensing-by-Cue approach to Positional Neutralization. The idea is that contrasts are deployed by speakers only in positions in which the phonetic cues which make those contrasts perceptually robust are present and strong. For vowel-quality contrasts, Steriade (1994) identifies duration as the phonetic cue most indispensable for accurate perception. In positions of lesser duration, then, only the most easily perceptible vowel contrasts will be licensed, while in, e.g., stressed syllables, which are in many languages associated with additional duration, the entire inventory of contrasts may be realized. Crosswhite follows Steriade in this, presenting arguments showing why specifically mid vowels are targeted by so many reduction systems, and why the resulting systems of unstressed vocalism are so often /i,u,a/. The arguments for this are firmly grounded in the phonetic literature on crosslinguistic regularities in the shape of vowel inventories: optimal dispersion in the vowel space (Liljencrants and Lindblom 1972; Flemming 1995), the quantal stability of the corner vowels, meaning the tendency to keep stable acoustic realizations even with a degree of articulatory variation (Stevens 1986), and the tendency for certain pairs of formants to approach and enhance one another in each of the corner vowels (Stevens 1986), increasing their perceptual salience. Because of this perceptual salience, the vowels most favored by contrast-enhancing UVR will be peripheral [i, u, a].24 Crosswhite formalizes these observations in OT using licensing constraints à la Steriade, formulated as follows:

UG-based approaches to UVR typology

(2)

LIC-Q/β: Where

41

The vowel quality Q is only licensed in context β. Q = any vowel quality or a natural vowel class β = any context that enhances the perception of Q

In the case of UVR, a typical contrast-enhancing constraint is (3): (3)

LIC-Noncorner/Stress: stress.

Noncorner vowels are licensed only under

The effect of this constraint is to ban the realization of mid vowels categorically from unstressed syllables in a manner quite like that illustrated in Chapter 1 for Positional Markedness constraints. An example of a system produced by the interaction of this constraint with general faithfulness constraints on the relevant features is Belarusian. As described above, Belarusian contrasts five vowels /i, e, a, o, u/ under stress, but in unstressed syllables, /e, o/ lower and merge with /a/, yielding [a]. Another example Crosswhite cites of contrast-enhancing UVR comes from Standard Italian, in which a seven-vowel stressed system /i, e, , a, , o, u/ is reduced to five vowels [i, e, a, o, u] in unstressed syllables. Most important here is the fact that contrast-enhancing UVR is argued specifically not to be a product of duration-dependent undershoot of the sort described above. It is rather a strictly perceptual phenomenon whereby naturally difficult contrasts are eliminated from positions of lesser phonetic prominence. The main evidence that this is so is the putative appearance of true corner vowels in contrast-enhancing systems. Thus, in Belarusian and Italian, unstressed /a/ is not raised as dramatically toward schwa as it is in, for example, Bulgarian. The immediate problem with the notion of contrast-enhancing UVR lies in the typological predictions it makes. Contrast-enhancing reduction is said to take place as a response to an imperative to eliminate “perceptually difficult” contrasts from positions in which a lack of the necessary cues (here, duration) would lessen the likelihood of their correct parsing by the listener. For this reason, it prefers peripheral vowels, explaining why Belarusian leaves /a/ as [a], while Bulgarian raises it to schwa. The problem comes in the definition of what constitutes a perceptually difficult vowel contrast. Crosswhite argues that mid vowels are less robust than corner vowels perceptually, and hence prone to effacement, and this is surely true. But previous accounts of vowel neutralization in phoneticallydriven phonology have given a very different answer to this question. Steriade in particular, in her seminal 1994 paper on positional neutralization, cites the work of Kaun (1995) and Suomi (1983, 1984) on the typology and phonetic underpinnings of vowel harmony systems.

42

Stressed syllables and unstressed vowel reduction

Steriade divides harmony systems into those which are articulatorilydriven, and those which are perceptually-driven. Perceptually-driven harmony systems include those based on front/back, round, and ATR distinctions. It is this which motivates the dictum “bad vowels spread” (Steriade 1994: 23). Kaun (1995: 73–80) discusses the relative perceptibility of F2-based distinctions vs. F1-based, concluding that F1-based distinctions are perceptually more robust (probably due to the higher intensity of F1). Kaun cites as evidence for this the existence of vowel systems allowing contrasts only along the F1 dimension (the famous “vertical” vowel systems of Kabardian, Marshallese, and others), and the corresponding non-existence of “horizontal systems”. Steriade supports this view, noting an insight of Gorecka’s that “all cases of wide-spread harmony (round, ATR, front) involve featural contrasts that are relatively hard to identify; on the other hand, height, the most salient vocalic distinction, rarely, if ever, leads to across-the-board harmony.” (Steriade 1994: 23) If this is correct, then, and height truly is the most salient vocalic distinction, and if it is correct that “contrast-enhancing” vowel reduction is motivated by the need to remove perceptually difficult contrasts from weak positions, then it is not clear why all vowel reduction systems, contrastenhancing or otherwise, should focus so consistently on height. Even worse is the fact that, as we have seen, instances of UVR systems based on precisely the contrasts Steriade identifies as most difficult perceptually (front/back, round, ATR) are all but completely unattested in the languages of the world. Steriade (1994) actually notes the absence of UVR systems based on the elimination of roundness or ATR contrasts, but judges it to be an accidental gap. As argued above, however, given the phonetic factors underlying vowel reduction, there is nothing accidental about this. In sum, either Steriade, Kaun, and the phoneticians are wrong, and height contrasts are actually less robust perceptually than contrasts of palatality, roundness, and ATR, or there is more to vowel reduction than just the elimination of difficult contrasts from contexts where duration is insufficient for their accurate perception. The pure “bad vowels go” approach to vowel reduction described above does not account for the fact that the virtually all UVR systems deal in neutralizations of either height, nasalization or quantity. By contrast, this striking pattern is a primary prediction of the phonologization approach to the typology of UVR systems. The second type of vowel reduction system Crosswhite introduces, prominence-reducing UVR, is of a different character altogether. While the purpose of contrast-enhancing reduction is to remove difficult-to-perceive contrasts from positions in which they are likely to be misperceived (such that there is little point troubling oneself to deploy them in the first place),

UG-based approaches to UVR typology

43

the point of prominence reduction is the removal of “particularly loud and lengthy vowel qualities” from unstressed syllables. The basic idea behind prominence reduction is that prominent segments should be aligned to prominent positions, while non-prominent segments should be aligned to non-prominent positions. Crosswhite conceives of prominent segments as those which are more sonorous, where sonority is a complex property of segments derived from a combination of phonetic factors, including at least duration and low-frequency amplitude (2001: 37). For the purposes of vowel reduction, this measure characterizes differences in vowel height, with the caveat that unstressed schwa, conceived of as a placeless, targetless vowel, is less sonorous even than the high vowels. Prominent positions for our purposes here are stressed syllables. Crosswhite is careful in all this, incidentally, to distinguish prominence reduction from the gestural undershoot of Lindblom (1963), discussed above. Briefly, her reasons are this: “…in vowel undershoot, decreased articulation time leads to a change in vowel quality, and can be traced to a desire to avoid effortful articulations (i.e. ones in which articulator movement must be fast). In contrast, in prominence-reducing vowel reduction, a change in vowel quality leads to a decrease in articulation time.” (Crosswhite 2001: 46) Issue is taken here with the employment of articulatory undershoot to derive categorical UVR patterns synchronically. In OT terms, prominence reduction is implemented in a manner similar to that which Prince and Smolensky (1993) use to derive the role of sonority in syllabification. Crosswhite assumes two “phonetic” scales, one of accentual prominence, the other of vowel prominence, as shown in (4):25 (4)

Phonetic scales for prominence-reducing UVR a. Accentual Prominence: stressed prom> unstressed b. Vowel Prominence a prom> ,  prom> e, o

prom>

i,u prom> 

The first scale encodes the fact that stressed syllables are more prominent than unstressed, while the second encodes the familiar Sonority Hierarchy. These scales are then “crossed” to produce the relevant constraint families, resulting in fixed rankings of markedness constraints such as that shown in (5):

44

(5)

Stressed syllables and unstressed vowel reduction

Prominence reduction family of constraints *Unstressed/a >> *Unstressed/, >> *Unstressed/e,o >> *Unstressed/i,u >> *Unstressed/

The result is a set of constraints banning high sonority vowels from unstressed positions in a ranking from most sonorous to least. Crossing the scales in the other direction (producing, e.g., *stressed/) would create constraints that could be used to attract stress to sonorous vowels, or to raise the sonority of stressed vowels, à la Kenstowicz (1994). Crosswhite exemplifies prominence-reducing UVR with Bulgarian, presented, as noted above, in the form of its eastern dialects, in which neutralizations are in fact categorical. See Section 2.3.1 for discussion of the crucial differences between systems of UVR in Bulgarian dialects. Bulgarian, in this description, has six contrastive vowels under stress: /i, e, , a, u/. In unstressed syllables, however, the mid vowels are raised to merge with the high vowels as [i, u], and the low vowel /a/ is raised to merge with the mid central vowel as //.26 In other words, constraints against prominent (non-high) vowels in unstressed syllables interact with faithfulness constraints to cause the raising of /e, a, o/ to merge with /i,  , u/. Crucially, the way in which this system differs from that of Belarusian (aside from the obvious raising vs. lowering of the mid vowels), is in the realization of the low vowel as schwa. The fact that the high sonority vowel [a] is not allowed in non-prominent positions is what marks Bulgarian as prominence-reducing rather than contrast-enhancing. Crosswhite notes, importantly, that it is not always possible to tell a prominence-reducing UVR system from a contrast-enhancing system on the surface. Many of their effects can be the same; for instance, both systems disfavor, and have a tendency to eliminate, mid vowels. Contrast enhancement does so because they are non-peripheral, and hence less perceptually robust, and prominence reduction does so because they are fairly sonorous. As noted, comparing Belarusian with Bulgarian, the only outward mark of which underlying phonetic motivation (and formal encoding) of vowel reduction we find in each is the presence or absence of schwa. Schwa for Crosswhite is a shibboleth; it is extremely low in sonority, making it good for prominence reduction. That Bulgarian UVR produces a schwa means that it is prominence-reducing. On the other hand, schwa is no good at all for contrast enhancement, which favors peripheral vowels. The fact that Belarusian has [a] in unstressed syllables therefore means that it is contrast-enhancing.

UG-based approaches to UVR typology

45

That the one system raises mid vowels and the other lowers them is irrelevant, since in both UVR types either pattern can be generated by simple reranking of faithfulness constraints. A system with raising of mid vowels and no raising of the low vowel, however, would be completely ambiguous, since reduction of /a/ to schwa can actually be blocked in a prominence-reducing system by ranking the relevant faithfulness constraints (Max [+low]) above even *Unstressed/a. The possibility of such cases in which it is a priori impossible to determine, presumably for linguist and learner alike, which grammatical mechanism is responsible for generating such a pattern is troubling. This is not, however, the only instance in which the line between the two types of reduction systems becomes blurred. First, contrast-enhancing UVR is not so simple in its implementation as it might seem. In the cases of Standard Italian, Belarusian, the pretonic syllables of Brazilian Portuguese and the immediate pretonic syllable of Russian (introduced below), the low vowel, our sole diagnostic for system type, is often said to be peripheral, and is transcribed [a]. This, however, is an idealization of the phonetic facts. In Standard Italian, for example, where all unstressed syllables witness a merger of the tense and lax mid vowels, the vowel transcribed [a] is in fact substantially raised in most unstressed syllables. Farnetani and Vayra (1991) provide experimental evidence showing that F1 values for the unstressed vowels of two (northern) speakers of Standard Italian are on average decreased by 28% (974 Hz vs. 700 Hz) for one speaker and 18% (669 Hz vs. 548 Hz) for another. Furthermore, this experiment found a high degree of correlation between vowel duration and F1 values for the two speakers, r(133) = 0.783 and r(130) = 0.774 respectively. Rather than implementing a maximally (or even steadily) peripheral vowel, then, here the realization of /a/ is a function of its duration. The reason it is not as high as Bulgarian schwa (for example) might therefore be not that it is specified [+low], but rather just that the pressure of the duration decrease is usually not extreme enough in Italian to force raising the full distance to schwa. In a case such as this, it is not even clearly meaningful to draw a distinction between “schwa” and “[a]”, unless we wish to define an arbitrarily selected F1 value as the border, beyond which a system ceases to be contrastenhancing, and becomes prominence-reducing instead. The fact that the realization of /a/ is gradiently determined by duration is even worse, since in this case even such an arbitrarily selected F1 boundary has little meaning. The same problem arises for the pretonic syllables of Brazilian Portuguese, which Crosswhite argues undergo contrast-enhancing reduction. Nobre and Ingemann (1987) give experimental results showing the

46

Stressed syllables and unstressed vowel reduction

mean F1 of stressed and pretonic /a/ for four Brazilian Portuguese speakers to differ by nearly 100 Hz (stressed /a/ – 684 Hz, pretonic /a/ – 585 Hz). Flemming (2004) notes of the tense and lax mid vowels that the neutralized realizations are centralized in the F2 dimension, while in F1 falling in between the stressed realizations of the tense and lax phones. To this may be added the facts observed by Major concerning the tendency for the second degree of reduction (raising of non-high vowels) to apply in pretonic syllables at faster rates of speech. Again, since the low vowel of pretonic syllables is realized somewhere between [a] and [] as a function of duration (and the mid vowels apparently behave similarly with respect to the high vowels), it is unclear how to draw a sharp distinction between this type of system and one like Bulgarian. I submit that the difference between these systems is not one of formal implementation or phonetic motivation at all, but rather merely one of degree. Durational pressures in unstressed syllables in Italian and pretonic syllables in Brazilian Portuguese have simply not been severe enough to condition consistent drastic raising of the non-high vowels.27 In the posttonic syllables of Brazilian Portuguese, in principle prominencereducing because of the raising of the non-high vowels and the presence of schwa, for some speakers this pattern is simply not phonological in the first place, while for others durations have been low enough to cause significant raising consistently enough that the phonetic pattern has become phonologized. If there is a serious dichotomy to be found here, it is between the phonologized and the unphonologized versions of a variety of processes all ultimately arising from the same phonetic pressures. The Belarusian situation is somewhat more complex, due to the unresolved questions discussed above surrounding the diachronic source of the mergers of the mid and low vowels. To review, the problem was that it is not clear whether mid and low vowels once contrasted in unstressed syllables in East Slavic, but later fell together, or whether the contrast between the mid and the low vowels only ever emerged in stressed syllables to begin with, such that diachronically at least, no vowel reduction per se ever took place. Recall that in Belarusian immediate pretonic syllables /e, a, o/ are realized as [a], while in other unstressed syllables the neutralization is the same, though the resulting [a] undergoes some qualitative reduction, causing Czekman and SmuΩkowa (1988) to symbolize it as [α], perhaps IPA [ ]. Here again the ambiguity in the system arises: Since some raising toward schwa does in fact occur, should this be considered contrastenhancing or prominence-reducing vowel reduction? Has non-first pretonic /a/–/o/–/e/ been raised enough to count as raised, or is it still peripheral, despite the raising?

Vowel reduction in Russian

47

Of course, the problem still remains that the Belarusian first pretonic syllable does, according to Czekman and SmuΩkowa, actually contain phonetic [a], so at least there the contrast-enhancement analysis seems solid. For the phonologization approach, this could be a problem, since if the low vowel is not forced by durational pressures to raise, becoming perceptually ambiguous with the reduced variants of the mid vowels, then it is not clear why neutralization takes place to begin with. If it is correct, however, that there never was a contrast between mid and low vowels in the unstressed syllables of East Slavic, then this issue disappears. Even under the opposite hypothesis, though, it is still possible that the unreduced [a] of the Belarusian pretonic syllable is a secondary development. Note that the realization of [a]~[ ] in the first pretonic syllable varies substantially in both Russian and Belarusian dialects, in some being more like a stressed /a/, and in others being reduced to one extent or another. Again there is much dispute as to the original quality of the system, but most scholars assume that either the one (uniform pretonic [a]) or the other (uniform pretonic reduced [a ] ) was original (see Chekmonas 1987 for argumentation). Timberlake (1993: 435) presents a scenario in which other, more complicated dialectal reduction patterns can be plausibly derived if we assume the original first pretonic syllable to have contained an ‘a’ with some degree of reduction and shortening (he writes [α]). If this is the case, such that lengthening and lowering of the outcome of the merger of the mid and low vowels in all first pretonic syllables is a secondary development in Standard Belarusian, then the neutralization makes sense even as a result of ordinary duration-driven articulatory UVR. 2.5. Vowel reduction in Russian The system of UVR in Contemporary Standard Russian is an excellent test case for the two approaches. Crosswhite treats Russian (like Brazilian Portuguese) as displaying two degrees of vowel reduction. The first and less extreme of these is analyzed as contrast-enhancing, while the second, more extreme degree of reduction is treated as prominence-reducing. The analysis of Russian as possessing two distinct degrees of vowel reduction is in fact the standard treatment in the Slavic grammatical tradition. Generally, however, the difference is treated as one of degree, rather than one of kind. Crosswhite’s innovation is to treat the two degrees of vowel reduction as differing not merely in the vowels they allow to be expressed, but in the very nature, both in terms of formal implementation and phonetic motivation, of the reduction process itself.

48

Stressed syllables and unstressed vowel reduction

In what follows, I will lay out the facts of the Russian case. After brief discussion of Crosswhite’s approach to it, I will present a different account of Russian vowel reduction. Specifically, on the basis of experimental results to be presented below, I will argue that there is only one phonological vowel reduction process in Russian: Degree 1 reduction. Degree 2 reduction (as in Brazilian Portuguese for some speakers) is an unphonologized process of duration-dependent raising applying to all unstressed syllables, affecting those with shorter durational targets more dramatically than those with longer targets. 2.5.1. Facts Contemporary Standard Russian (CSR) is usually described as having two “degrees” of vowel reduction. Degree 1 reduction applies to the first pretonic syllable (but see below for details), while Degree 2 applies elsewhere. The generalization usually cited is this: /a/ and /o/ are categorically neutralized in all unstressed syllables. In the first (immediate) pretonic syllable, both vowels are realized as [a]. In other pretonic syllables farther from the stress, and in all posttonics, they are realized as []. In all unstressed syllables, /e/ is raised to merge with /i/. /a/ and /o/ following a palatalized consonant reduce according to the same pattern as /e/. There are further details concerning /e/, palatalization, and // which will not concern us here. This is exemplified in (6), where (a) shows stressed vowels, (b) shows Degree 1 reduction, and (c) shows Degree 2 reduction. (6)

UVR in Contemporary Standard Russian a. moldst bol starj razum

‘youth’ ‘pain’ ‘old’ ‘reason’

b. malodnkj balet starik razumn

‘young-DIM’ ‘to hurt’ ‘old man’ ‘wisely’

c. mladoj blvoj strna rzumet

‘young’ ‘pain-ADJ’ ‘old times’ ‘to understand’

Vowel reduction in Russian

49

Under Crosswhite’s analysis, the Degree 1 reduction occurring in the first pretonic syllable is contrast-enhancing reduction, since mid vowels are eliminated, but the low vowel is realized peripherally, rather than as schwa. Degree 2, exemplified above in the second pretonic syllable does reduce /a/ and /o/ to [ ], and therefore is analyzed as prominence-reducing. Note that the reduction of the result of the merger of /a/ and /o/ from [a] to [ ] is nonneutralizing. The merger takes place in all unstressed syllables, with different realizations depending on the position of the vowel relative to the stressed syllable. No additional mergers take place in Degree 2 reduction. The details of the case, however, are more complicated than the picture sketched above would lead one to believe. The first complication concerns the realization of what is transcribed [a] in the first pretonic syllable. While speakers of some dialects, at least in certain circumstances, realize a pretonic [a] with little if any reduction,28 standard handbooks describe a first pretonic realization of this /a/ (from /a/, /o/), which seems to have undergone some degree of qualitative phonetic reduction. Though not reduced as much as they would be in non-first-pretonic syllables, it is not clear that the reduction they do undergo is completely different in kind from that more dramatic raising pattern. Sources are vague as to mean formant frequencies.29 Matusevic  transcribes first pretonic /a/–/o/ as [ ], which is not, however, meant as the IPA value of the same symbol. Articulatorily, compared to /a/, he writes that “the tongue ... is pulled slightly back, while (importantly) the tension of the organs of speech is reduced.” (Matusevic 1976: 100) It is substantially shorter than stressed /a/, with a mean duration given as 90 ms. to the 196 ms. of its stressed counterpart. It is said to be perceptually “further back and more closed than a stressed ‘a’, less tense and shorter.” Avanesov (1968: 51) says pretonic /a, o/ is “not always identical, but generally produced with a bit less lowering of the lower jaw, and therefore not as wide opening of the mouth, as for a stressed [a].” He cautions, however, that it must nonetheless remain “alike” in character, i.e. not raised all the way to schwa. What is clear from this is that the vowel in question is subject to some phonetic reduction, a fact which, as in the case of Italian and Brazilian Portuguese, questions the extent to which a clear distinction can be drawn between the reduction in question and the reduction found in, e.g., Eastern Bulgarian. There are further complicating factors in the realization of Degree 2 reduction as well. Firstly, in absolute word-initial position /a/ and /o/ are not in fact reduced to schwa, but rather are realized as the same sort of [a] as is found in the first pretonic syllable, which I will transcribe from here on as [ ]. This is not unexpected, given the correlation between reduction and duration we have seen in action elsewhere. Vowels in onsetless initial

50

Stressed syllables and unstressed vowel reduction

syllables are seen in a variety of languages to be longer than their counterparts in initial syllables which have onsets. As discussed in 3.2.3 absolute word-initial vowels in Nawuri (Casali 1995; Kirchner 1998) fail to undergo an assimilation process involving roundness of which they would otherwise be targets. Long vowels and vowels in phrase-final position likewise avoid this assimilation, all three being associated with additional duration. In Tamil, likewise, absolute word-initial vowels have been shown experimentally to be longer than vowels in word-initial syllables with onsets (Balasubramanian 1981). In Luganda, the contrast between long and short vowels is neutralized in onsetless word-initial syllables. Additionally, only three vowels contrast in this position (in fact, initially in any morpheme): /e, o, a/ (Hubbard 1994: 161–165). While Hubbard’s durational measurements for vowels in this position are equivocal as to the relative durations of these vowels, in at least certain cases they are longer than they would be in internal positions. A similar example operating at the phrase-level comes from Runyambo, in which initial /i, u/ are lowered to [e, o] after pause (Larry Hyman, personal communication). This lowering of the high vowels at the expense of the high-mid quality distinction can be interpreted as the result of boundary-adjacent strengthening of the sort documented by Cho (2001), which is discussed at length in Chapter 4.30 It is possible that this effect is a result of the tendency for domain-initial segments to be strengthened both in duration and magnitude of gestures. See Chapter 4 for discussion. Unsurprisingly in light of the reduction data, vowels in absolute initial position in Russian are durationally enhanced as well. Phonetically, then, there is plenty of crosslinguistic evidence for the durational enhancement of absolute word-initial vowels, and the failure of these to reduce to schwa in Russian makes perfect sense if reduction to schwa is a function of vowel duration. Crosswhite, for whom reduction to schwa in Russian represents the inability of non-first-pretonic vowels to bear moras, derives this nonreduction with a constraint mandating the alignment of a mora to the wordedge. Reduction to schwa also fails to apply in situations of hiatus before [a]. Thus, we find [saatnaenije] (*[satnaenije]), ‘relationship’. This too makes sense under the assumption that reduction to schwa in Russian is not carried out in the categorical phonology at all, but rather is an instance of gradient, duration-dependent phonetic raising, just like the less dramatic effect described above in, e.g., Italian. If the problem with producing [a] in an unstressed syllable with consonants on either side is lack of sufficient time for the jaw-lowering gesture to reach target, the realization of two identical vowels with no intervening consonant would solve this problem handily, giving ample time for the gestural target of the low vowel to be

Vowel reduction in Russian

51

reached. This exception to the reduction of /a/~/o/ to schwa thus follows from the same duration-based phonetic principles as failure to reduce in absolute initial position: Reduction occurs where phonetic durations are substantially impoverished, and fails where additional duration is realized. Crosswhite derives the failure of /a/~/o/ to reduce to schwa in hiatus to a constraint *Hiatus([a], []), which states that the featureless vowel [] may not occur in hiatus with [a]. In addition to all this, however, there is one more position in which nonfirst-pretonic /a/ and /o/ fail to reduce to schwa: In phrase-final open syllables, /a/ and /o/ are realized as [ ], just as they are in the first pretonic syllable. This realization is said to depend on style, speech rate, and to be gradient, sometimes producing a vowel in between Degree 1 and Degree 2 reduction (Matusevic 1976: 102; Zlatoustova 1962: 109–139). Nonreduction only occurs when the vowel is final in the phrase, with phrase-internal word-final vowels reduced to schwa along with other non-first-pretonic unstressed vowels. In Section 3.2.3 I attribute this rate-dependent failure to reduce to the additional duration provided by phrase-final lengthening, robustly attested in Russian. Again, failure to reduce to schwa is predictable according to the phonetic duration of the vowel. It is not clear how (if at all) other approaches to Russian VR would treat this last seemingly exceptional context. Each instance of the failure of reduction to schwa in non-first-pretonic syllables is explicable as a function of the phonetic duration assigned to the vowels in those syllables. The additional duration in each of the above cases is furthermore either part of a well-documented crosslinguistic trend (initial strengthening, final lengthening), or simply falls out from the nature of the sequence being realized ([a.a]). There is nothing unique to Russian phonetics or phonology about any of these cases. The phonetic account of Degree 2 reduction presents a unified picture of all the cases of the failure of reduction, deriving them all from additional duration assigned in the phonetics. The following can be said thus far: Degree 1 reduction, the merger of /o/ and /a/ in unstressed syllables is phonological. It is neither durationdependent nor gradient in its application. Degree 2 reduction, the realization of /a/ as [], however, is as yet unphonologized. The facts cited above concerning the failure of reduction to schwa suggest that it is duration-dependent. There is also no independent phonological evidence that reduction to schwa must take place in the phonology rather than in the phonetics (i.e. there are no other phonological processes which interact with reduction to demonstrate that schwa must be featureless, or even that it must not be specified [+low]. The next section presents experimental results which confirm this picture of Russian UVR.

52

Stressed syllables and unstressed vowel reduction

2.5.2. Experimental investigation of Russian UVR The failure of reduction to schwa in certain positions with characteristically longer vowel durations, however, turns out to be just part of the story; in what follows, I present the results of an experimental study of Russian vowel reduction demonstrating the inherently gradient nature of the reduction of /a/~/o/ to schwa, and showing the extent of this process’s application to be essentially a function of the physical duration assigned to the vowel in question. In virtually all normal speech, /a/~/o/ in positions subject to Degree 2 reduction is indeed realized closer to schwa than it would be if it were located in the first pretonic syllable. It reduces far less, though, in slower speech than in faster, and in deliberate, hyperarticulated speech it is virtually identical to the corresponding first pretonic. Crucially, however, while reduction to schwa can be undone through hyperarticulation, as noted above, no amount of hyperarticulation or emphasis can restore the contrast between /a/ and /o/, the neutralization of which is truly categorical.31 Details of the study are as follows. 2.5.2.1. Subjects Subjects in the study were four native speakers of Contemporary Standard Russian, three females and one male, between the ages of twenty and forty. All had resided in the United States for several years prior to participation in the experiment, but were born, raised and educated at least through secondary school in Russia. Two were originally from Moscow, one was from St. Petersburg, and another was from Ufa. All reported continued regular use of Russian with family and friends. Three of four subjects had some training in linguistics, but were naive as to the purpose of the experiment. 2.5.2.2. Materials Target words for the comparison of unstressed vowels in distinct prosodic contexts were real Russian trisyllables with either medial or final stress. 42 of these contained either /a/ or /o/ in the first pretonic syllable (21 each), while 29 contained /a/ or /o/ in the second pretonic syllable (11 and 18 respectively). 20 instances of stressed /a/ and /o/ were also selected as a baseline against which to gauge reduction. That instances of unstressed /o/ were in fact underlying /o/ (and not /a/, despite realization as [a]-[] in the unstressed syllables of these target words) was supported both by Russian

Vowel reduction in Russian

53

orthography and by the existence of morphologically related forms with the same vowel under stress and surfacing in its unreduced variant.32 Examples of each word type are given in (7). (7)

a.

First pretonic /a/: /napadat/

[npadat] ‘attack-INF’ (cf. [napatki] ‘attack-NOM.PL’)

b. First pretonic /o/: /olodat/

[ladat] ‘starve-INF’ (cf. [alodnj] ‘hungry-MASC.SG.’)

c. Second pretonic /a/: /takovoj/

[tkavoj] (cf. [tak]

‘such-MASC.SG’ ‘thus, so’)

d. Second pretonic /o/: /odovoj/

[davoj] (cf. [got]

‘annual’ ‘year’)

All target vowels in this study were located in open syllables. They were followed either by unpalatalized voiced or voiceless stops, and preceded by unpalatalized voiced stops, voiceless stops, or laterals. These limitations on consonantal context were chosen primarily to facilitate segmentation in subsequent analysis. Nasals were excluded from target contexts. Target words were embedded in frame sentences of the form Mas¬ka X skazala, 'Mashka said X', such that target syllables were some distance from both utterance boundaries. Each target word appeared twice for a total of 182 sentences. These sentences were then arranged in pseudo-random order and broken into blocks of 12 (the final block containing two additional sentences). 2.5.2.3. Methods Each speaker was given the list of sentences, presented in Russian orthography, and asked to read each block of sentences twice; the first pass

54

Stressed syllables and unstressed vowel reduction

through each block was to be read in a formal style, described to subjects as “like a newscaster might sound”. The second reading of each block, on the other hand, was to be done at a faster tempo, “as fast as you comfortably can”. This manipulation of speech rate was intended to cause enough variation in the durations of target vowels that the relationship between duration and vowel reduction could be adequately assessed. Blocks of sentences were deliberately kept short enough that speakers were able to keep a more or less constant speech rate throughout each one, and speakers were reminded at the beginning of each block which tempo they were meant to be producing there. Subjects were recorded in a sound-attenuated room at a sampling rate of 44.1 KHz using a Marantz CDR-300 portable CD-recorder and a Shure SM10A cardioid headworn dynamic microphone run through a Rolls MP13 Mini-Mic preamplifier. Recordings were then downsampled to 22.050 KHz and saved as .wav files using the Praat 4.0.2 speech analysis software (Copyright©1992–2001 by Paul Boersma and David Weenink). Measures of duration for target vowels were taken from linked spectrographic and waveform displays. Vowels were measured as beginning, after stops, at the onset of reliable periodicity following the stop burst. (Russian voiceless stops are typically realized with low voice onset time.) Following laterals, vowel measurements began at visible spectral discontinuities associated with the lateral’s release, or failing this, at the onset of significant energy in the second formant and above. In certain extreme cases, where no sufficient landmark could be detected between the lateral and the vowel, tokens were simply discarded. Measurements of the first formant, taken to be indicative of vowel height, and hence position on a potential cline of values between [a] and [], were made at vowels’ midpoints using LPC analysis (10 poles, with downsampling of original files to 11 KHz for female subjects and 10 KHz for male, analysis window 25 ms., shortened to 10 ms. where vowel durations necessitated). Since subjects were instructed to read half their productions at as fast a rate of speech as they could comfortably maintain, a substantial number of the total tokens recorded from each speaker ended up containing vowels, particularly in second pretonic syllables, which had been deleted, devoiced, or otherwise overlapped to the point where reliable formant analysis became impossible. These productions were excluded from the final analysis of results. Thus, of 364 tokens produced by each speaker (182 sentences x 2 repetitions), the total numbers of tokens ultimately analyzed were 311 for Speaker RR, 317 for Speaker TM, 312 for Speaker DT, and 257 for Speaker M, for a total of 1197 tokens analyzed. These tokens represented our categories of interest in the following numbers: 152 tokens of stressed /a/, 103 of stressed /o/; 290 tokens of first pretonic /a/, 295 of

Vowel reduction in Russian

55

first pretonic /o/; 136 tokens of second pretonic /a/, 221 of second pretonic /o/. Results of these measurements were as follows. 2.5.2.4. Results Mean durations and F1 values for each speaker are shown in Table 1 and Table 2 below. Table 1. Speaker RR TM DT M All

Table 2. Speaker RR TM DT M All

Mean first formant values in Hz and standard deviations 2nd pretonic /a/ /o/ 482 490 (54) (49) 443 449 (36) (42) 492 506 (42) (63) 484 483 (59) (50) 475 481 (52) (55)

1st pretonic /a/ /o/ 734 720 (53) (45) 713 695 (79) (90) 641 640 (34) (32) 734 742 (52) (70) 705 694 (68.5) (73)

/a/ 723 (53) 691 (88) 637 (32) 784 (42) 700 (78)

stressed /o/ 450 (25) 444 (38) 455 (30) 503 (34) 461 (39)

/a/ 85.9 (14) 91.7 (25.7) 72.6 (13.2) 75.2 (11.4) 81.8 (19.3)

stressed /o/ 77.6 (18.1) 81.4 (32) 69.3 (12.6) 68.7 (15.4) 74.5 (21.5)

Mean vowel durations in ms. and standard deviations 2nd pretonic /a/ /o/ 21.7 26.2 (7.2) (7.6) 24.2 26.6 (7.5) (9.4) 26.3 32.6 (10) (10.6) 28.1 31.6 (10.4) (9.9) 25.2 28.9 (9.2) (9.7)

1st pretonic /a/ /o/ 65.9 69.1 (9.9) (11.6) 81.1 80.4 (19.3) (18.5) 58.2 60.7 (13.6) (13.5) 70.9 70.9 (14.5) (13.7) 69 69.9 (16.9) (16.2)

Repeated measures analysis of variance performed on the data with underlying vowel and position relative to stress (first pretonic or second pretonic) as within-subjects factors yielded the following results. For F1,

56

Stressed syllables and unstressed vowel reduction

unsurprisingly, there was a strong main effect of syllable relative to stress, reflecting the distinction traditionally reported between an [a]-like vowel in the first pretonic syllable, and a schwa-like vowel in the second pretonic: F(1, 3) = 65.434 (p = .004). For underlying vowel, on the other hand, no effect at all was found: F(1,3) = .008 (p = .936). No significant interaction was recorded between vowel and syllable either: F(1, 3) = 2.939 (p = .185). In other words, /a/ and /o/ are indistinguishable in their F1 values in unstressed syllables. This lack of difference is unsurprising given all previous descriptions of Russian vowel reduction, as well as other recent experimental results such as those of of Padgett and Tabain (to appear), in which neutralization of /a/ and /o/ is shown to be complete in a number of phonetic dimensions; no trace of this distinction has ever been reported in unstressed syllables. For the duration figures given above, the repeated measures ANOVA again shows a main effect of syllable: F(1, 3) = 66.74 (p = .004). Oddly, however, it also does record a difference between /a/ and /o/ which at least nears significance: F(1, 3) = 10.342 (p = .049), and does show a significant interaction between vowel and syllable relative to stress: F(1, 3) = 27.251 (p = .014). This reflects the fact that, while in the first pretonic syllable, mean durations appear nearly even (69 ms. vs. 69.9 ms.), there is a slightly larger difference between /a/ and /o/ in second pretonic syllables: 25.2 ms. for /a/, compared with 28.9 for /o/. Before rushing to conclude that this is evidence for the incomplete neutralization of /a/ and /o/ in the second pretonic syllable, however, several points must be addressed which, taken together, suggest that the difference recorded is in fact accidental, and would most likely vanish in a larger study. First, the failure of any previous researchers (such as Padgett and Tabain, whose study investigates this question in detail) to obtain this result should inspire skepticism from the outset. Second, the weakness of the significance of the effect, together with the extremely small difference in durational means recorded, fails to provide the kind of basis one would want for the unexpected claim that these vowels remain durationally distinct in the second pretonic syllable. More importantly, however, the difference surfaces precisely in the more reduced second pretonic syllables, where mean durations are less than half what they are in first pretonic syllables. If /a/ and /o/ in fact remained phonetically distinct in any unstressed context, one would expect this to occur in the first pretonic syllable, where the durational pressure toward reduction is far weaker than it is in the second pretonic. For the distinction between /a/ and /o/ to vanish in the first pretonic, which it indisputably has, only to reemerge in the weaker second pretonic makes little sense. Lastly, the mean duration of second pretonic /o/ was actually slightly longer than that of /a/, rather than

Vowel reduction in Russian

57

shorter, as one might expect given the ordinary durational relationships between mid and low vowels (cf. the relative values in stressed syllables given above). This makes the likelihood of the difference in means recorded here reflecting a real distinction seem remote indeed. Given all the above, I will adopt in what follows the standard assumption that neutralization of /a/ and /o/ in unstressed syllables in Russian is complete and categorical. A less expected result of the present study, however, is this: at least for vowels in second pretonic syllables, there turns out to be a strong correlation between vowel duration and F1, suggesting that the height of the unstressed vowels in question is to some extent dependent on their length.33 This can be seen in the scatterplots shown below:34 Second Pretonic /a/ and /o/: Speaker RR

Second Pretonic /a/ and /o/: Speaker TM

700

600 /o/: R2 = 0.3949

2

/o/: R = 0.1385 500

/a/: R2 = 0.4957

500

/a/ /o/

400

Linear (/a/) 300

Linear (/o/)

200

First formant in Hz

First formant in Hz

600

2

/a/: R = 0.4341 400

/a/ /o/

300

Linear (/a/) Linear (/o/)

200 100

100 0

0 0

10

20

30

40

50

0

10

20

Vowel durations in ms.

30

40

50

60

Vowel durations in ms.

Second pretonic /a/ and /o/: Speaker DT

Second pretonic /a/ and /o/: Speaker M 800

700

2

2

/o/: R = 0.2282 /a/: R2 = 0.2718

500

/a/ /o/

400

Linear (/a/) 300

Linear (/o/)

200 100

/o/: R = 0.2529

700 First formant in Hz

First formant in Hz

600

2

/a/: R = 0.4618

600

/a/

500

/o/ Linear (/a/)

400

Linear (/o/)

300 200 100

0

0 0

10

20

30

40

Vowel durations in ms.

50

60

70

0

10

20

30

40

50

60

70

Vowel durations in ms.

Figure 1. Russian second pretonic /a/, /o/: F1 vs. Duration

In these scatterplots, two regression lines, along with two R-squared values are displayed: one for /a/ and another for /o/. Given the assumption that /a/ and /o/ in fact represent a single category in unstressed syllables, together with the similarity of the regression lines and correlation values

58

Stressed syllables and unstressed vowel reduction

exhibited by the two underlying vowels, all further statistical tests will merge the two, analyzing the pooled results as a single category. This is simply the consequence of the conclusion that neutralization of the contrast between /a/ and /o/ is complete in unstressed syllables. Pearson correlations conducted for the relationship between vowel duration and F1 for each speaker confirms the strength of the dependency shown graphically above: For Speaker RR, r(94) = .647, p < .001; for Speaker TM, r(97) = .453, p < .001; for Speaker DT, r(81) = .495, p < .001; for Speaker M, r(85) = .582, p < .001. Results pooled for all speakers can be seen in Figure 2. Second pretonic /a/ and /o/: all speakers 800 /o/: R2 = 0.2468

First formant in Hz

700

/a/: R2 = 0.3584

600

/a/

500

/o/ Linear (/a/)

400

Linear (/o/)

300 200 100 0 0

10

20

30

40

50

60

70

Vowel durations in ms.

Figure 2. Russian second pretonic /a/, /o/, all speakers: F1 vs. Duration

The Pearson correlation for the pooled results shown above continue to confirm the strong dependence of vowel height as measured by F1 on vowel duration for non-high vowels in Russian second pretonic syllables: r(357) = .532, p < .001. Note, however, that difficulty in comparing formant values across speakers, particularly those of different genders, obscures to some extent the strength of the relationship between the variables in question. To correct for this, duration and F1 results for all speakers were transformed into z-scores based on the mean and standard deviation of each vowel in each position for each speaker.35 Normalizing in

Vowel reduction in Russian

59

this manner gives the results shown below in Figure 3. The correlation between F1 and duration is now somewhat stronger: r(357) = .552, p < .001, with an R-squared of .304. The conclusion here is apparent: the shorter a vowel gets in a second pretonic syllable, the more it raises toward schwa, and conversely, the longer that vowel becomes, the closer it remains to the [a]-like F1 territory associated, for example, with the first pretonic syllable. This duration-dependence is precisely what we would expect given the phonetic account of vowel reduction sketched above in Section 2.2. Normalized 2 pretonic /a/ and /o/: all speakers 4 /o/: R2 = 0.2442

3

2

/a/: R = 0.417

First formant

2

/a/ /o/

1

Linear (/a/) 0 -3

-2

-1

-1

Linear (/o/) 0

1

2

3

-2 -3 -4 Vowel durations

Figure 3. Russian 2nd pretonic /a/, /o/ normalized, all speakers: F1 vs. Duration

While the results for second pretonic syllables show a clear and strong dependence of F1 on duration, results for first pretonic syllables are somewhat more complex. Results for individual speakers are given in Figure 4. The results of Pearson correlations for duration and F1 for each speaker in this position are as follows: For Speakers TM and M, a strong correlation again emerges (r[149] = .517, p < .001, and r[123] = .525, p < .001 respectively). For Speaker DT though, the correlation is quite a bit weaker, with r(159) = .283, p < .001, and an R-squared of just .08. For Speaker RR the relationship is weaker still, with r(154) = .172, significant only at p < .05, and with an R-squared of .03, meaning that for this speaker

60

Stressed syllables and unstressed vowel reduction

little if any of the variation in F1 values in first pretonic syllables is explained by differences in duration. First Pretonic /a/ and /o/: Speaker RR

First Pretonic /a/ and /o/: Speaker TM

1000

/a/

700

/o/

600

Linear (/a/)

500

Linear (/o/)

400 300 200

/o/: R2 = 0.3549

900

2

/a/: R = 0.0337

800

First formant in Hz

First formant in Hz

1000

/o/: R2 = 0.0475

900

100

800

/a/: R2 = 0.1892

700

/a/ /o/

600

Linear (/a/)

500

Linear (/o/)

400 300 200 100

0

0

0

20

40

60

80

100

0

20

40

Vowel durations in ms.

First Pretonic /a/ and /o/: Speaker DT

80

100

120

140

First Pretonic /a/ and /o/: Speaker M

800

1000

2

/o/: R = 0.0742

700

/a/

500

/o/ Linear (/a/)

400

Linear (/o/)

300 200 100

First formant in Hz

/a/: R = 0.0902

600

/o/: R2 = 0.2971

900

2

First formant in Hz

60

Vowel durations in ms.

/a/: R2 = 0.2738

800

/a/

700

/o/

600

Linear (/a/)

500

Linear (/o/)

400 300 200 100

0

0 0

20

40

60

80

100

0

20

Vowel durations in ms.

40

60

80

100

120

Vowel durations in ms.

Figure 4. Russian first pretonic /a/, /o/: F1 vs. Duration

While the strong correlation numbers at least for Speaker TM may be due to a few outlier values at the low end of the duration scale, this seems less likely in the case of the slightly higher correlation for Speaker M. In fact, removal of all observations with dependent or independent variable measures above just 2.5 standard deviations from the mean does not remove the differences observed between speakers here: For Speaker M, r(117) = .521, and for Speaker TM, r(143) = .385, still well above what was observed for Speakers RR and DT above. It appears, then, that for two of the speakers in this study, F1 in first pretonic syllables was not dependent on duration, while for the other two, such a dependency was clearly observed. Before any larger conclusions can be drawn from this, a similar study involving more speakers is clearly necessary. One possible conclusion from the above might be that speakers differ in the width of their target windows for [a] in first pretonic syllables: Speakers RR and DT

Vowel reduction in Russian

61

maintain narrow windows, requiring the expenditure of as much articulatory effort as necessary to achieve target F1 values, even in the face of diminished duration within which to do so, while in second pretonic syllables these windows are resized to allow greater variability under pressure from duration. Speakers TM and M, on the other hand, might have a single width for the target windows for F1 of [a] in both positions, such that as durational pressures increase on its realization, undershoot of articulatory targets increases as well, yielding the gradient raising toward schwa we see in the scatterplots above. Another conceivable interpretation of the difference between speakers in this study, though, involves the distributions of durational values produced by each speaker. Looking in particular at the skewness statistic, while none of the speakers have distributions departing dramatically from symmetry (in each case the skewness statistic is less than twice its standard error), it is curious that Speakers M and TM have positive skewness values of .433 and .310 respectively, indicating long right tails, or a preponderance of duration values produced toward the lower end of each’s range. Speaker DT has a slight positive skewness statistic of .098, indicating a more symmetrical distribution, while Speaker RR has a negative skewness of –.179, indicating a longer left tail on that distribution, and hence more productions tending toward the upper part of the range of durations she produced in the study. Note that this ranking according to skewness positive to negative directly recapitulates the relative degree of correlation between duration and F1 for the speakers given above, with M and TM showing strong correlations (the former moreso in adjusted results), DT with a weak correlation, and RR with an essentially negligible relationship between the two values. It is important to observe in this connection too that, while the results of the linear regressions given above show strong positive correlations between duration and F1 (at least in second pretonic syllables, and for two of the speakers in first pretonics as well) such that the longer the vowel is, the higher its F1 is predicted to be, it is clear that this relationship cannot be truly linear as duration values increase: at a certain point, a maximum amount of jaw opening will be reached, and with it a maximum F1 value for that speaker, such that no amount of additional duration could cause that value to increase beyond this speaker’s maximum value. In other words, as duration values grow, we expect F1 values to approach a horizontal asymptote. The curve that would then in fact describe the relationship between duration and F1 for a given speaker would be such that the effect of duration on F1 would be stronger in the lower part of that speaker’s durational range, and weaker in the higher part. If Speaker RR was consistently producing first pretonic vowels in the upper part of her

62

Stressed syllables and unstressed vowel reduction

potential durational range for this position, as suggested by her negative skewness value, it stands to reason that we would see a weaker relationship between duration and F1 in her data than in that of speaker M, whose duration values were consistently in the lower part of her own range, as suggested by her relatively high positive skewness statistic. While this may be suggestive of an explanation, however, the issue of interspeaker variation in the duration dependence of F1 for [a] in the first pretonic syllable clearly requires further investigation. Combining the first pretonic and second pretonic vowels for each speaker into a single scatterplot gives a sense for the overall shape of the curve describing the relationship between vowel height and duration for Russian unstressed syllables described in the preceding discussion. This is seen in Figure 5. Unstressed /a, o/: Speaker M

Unstressed /a, o/: Speaker TM

1000

1000

900

900

800

800 700 1 pretonic /a, o/

600

2 pretonic /a, o/

500 400

F1 in Hz

F1 in Hz

700

2 pretonic /a, o/

500 400

300

300

200

200

100

1 pretonic /a, o/

600

100

0

0 0

20

40

60

80

100

120

0

20

40

Duration in ms.

Unstressed /a, o/: Speaker DT

80

100

120

140

Unstressed /a, o/: Speaker RR

800

1000

700

900 800

600 1 Pretonic /a, o/

500

2 pretonic /a, o/ 400 300

700 F1 in Hz

F1 in Hz

60

Duration in ms.

1 pretonic /a, o/

600

2 pretonic /a, o/

500 400 300

200

200

100

100

0

0 0

20

40

60

80

100

Duration in ms.

0

20

40

60

80

100

Duration in mx.

Figure 5. Unstressed /a, o/, contexts combined

In the scatterplots above, Speakers M and TM (the two speakers for whom a strong correlation between F1 and duration was observed in both unstressed positions) both clearly exhibit a convergence of the lower ends of their ranges for first pretonic syllables with the upper ends of their

Vowel reduction in Russian

63

ranges of values for second pretonic syllables. The upper end of the first pretonic range also clearly levels off in F1 as duration increases. The extent to which these two clusters fit tidily on a single curve, however, is somewhat obscured by the paucity of tokens occurring with durations in the crucial mid range between 40 and 60 ms. This reflects a failure of the experimental manipulation of speech rate to provide as much durational variability as was hoped for. The same lack of intermediate durational values is clear for Speaker RR above as well, and again makes it unclear whether second pretonic and first pretonic syllables in fact sit on a single curve describing the relationship between F1 and duration, or whether the realizations of F1 for the vowels in these two positions must be considered distinct above and beyond the differences caused purely by duration. Results for Speaker DT, on the other hand, seem telling in this regard. Where DT has produced a reasonable number of vowels from both positions in the 40 to 60 ms. range, apart from a few outliers, it seems clear that the first pretonic vowels maintain a consistently higher F1 than the second pretonic vowels. This suggests that, for DT at least, it may be the case that resizing (specifically narrowing) of the target window for unstressed /a, o/ in the first pretonic syllable prevents significant reduction to schwa even in the face of strong durational pressure. Whether such resizing takes place for all speakers is difficult to determine from the data this experiment has provided. 2.5.2.5. Discussion In second pretonic syllables, then, vowel height is seen to be largely a function of the phonetic duration assigned to vowels in that position, and thus it is not surprising that degree of reduction toward schwa should be reported to vary in accord with such factors as speech rate and degree of final lengthening, as discussed above in Section 2.5.1. There is little reason, in light of this, to imagine that the phonological specification of this vowel is any different from that of any other [a] (< /a/, /o/). /a/ and /o/ in non-firstpretonic syllables are realized as close to pretonic [a] as durational restrictions allow them to be. In cases where more extreme degrees of lengthening are found, such as absolute word-initial position, hiatus with another low vowel, and phrase-final position, these otherwise schwa-like vowels approach realizations identical to those of first pretonic [a], as reported in the descriptive literature. Under conditions of extreme durational pressure, such as in word-internal pre-consonantal second pretonic syllables at faster rates of speech, these same vowels raise

64

Stressed syllables and unstressed vowel reduction

dramatically toward schwa, the last step before devoicing or total overlap by neighboring consonants. While the situation was less clear for first pretonic /a, o/, one possibility may be that the target window for [a] in this position is resized, as discussed in Chapter 1, such that F1 values here resist durational influence more strongly than would the analogous vowels in other positions.36 Another interpretation of the results above, though, is simply that the durations of first pretonic vowels rarely get low enough to make their influence felt consistently on F1. Under this interpretation, then, all unstressed syllables share a single F1 target without any positional resizing. First pretonic syllables could then be seen as just one position among several (absolute initial, phrase-final, hiatus) in which sufficient duration is present that pressure to raise [a] to [] is not significant. All instances of unstressed [a] would thus share a single target window, and this target window would be wide enough that substantial variation could occur when durational pressures became great enough.37 In the data presented above, this pressure seems to become significant at around 60 ms., at which point F1 values begin to lower dramatically. It is equally possible, however, that the data presented in this experiment reflects individual variation, with two (or more) speakers exhibiting the single window width pattern, and two (or more) exhibiting the resizing strategy. The answer remains to be determined. Insufficient productions for most speakers in the crucial 40-60 ms. range were the primary problem in this regard. Again, though, with or without target window resizing in the first pretonic syllable, the important result of the study presented here is that reduction of “Degree 2” vowels in Russian to schwa was shown clearly to be gradient. Vowel height in second pretonic syllables was shown to be largely a function of the duration accorded those vowels, which were perfectly amenable, given adequate duration, to realization in the range typically associated with first pretonic vowels. Phonologically then, it is not clear why “Degree 2” of reduction of /a, o/ in Russian should be considered representationally distinct from “Degree 1” in any way. Rather, “Degree 2” reduction, along with “Degree 1”, should be considered the result of single phonological process: neutralization of the distinction between /a/ and /o/ in unstressed syllables to a single featural specification (whatever we assume it to be for an /a/ in this language). The phonetic realization of the resulting vowel is then determined by the amount of duration accorded to it: when durations are longer, it is lower and more [a]like. When durations are shorter, it is raised toward schwa. The phonology of Russian simply needs no hand in this latter part of this process. If, on the other hand, the two levels of reduction of unstressed /a/ and /o/ were actually to be considered phonologically distinct categories with

Vowel reduction in Russian

65

different constellations of features (or rather, in this case, the features of [a] vs. complete underspecification), we would expect two completely independent sets of spectral realizations, one corresponding to an [a], and the other functioning something like schwa in English, truly phonetically targetless with formant values interpolated from the action of coarticulation with surrounding segments superimposed on an otherwise neutral vocal tract. (See Browman and Goldstein [1992] for discussion of the need for a target articulation for schwa). In such a situation, the rise of second pretonic F1 values toward those characteristic of [a]/[ ] as permitted by adequate duration is entirely unexpected and inexplicable. That they should actually reach those values when given the opportunity is stranger still. While there is ample evidence demonstrating the difficulty of articulating a low vowel under a certain durational threshold, which explains a raising trend in shorter syllables, there is no reason whatsoever to expect a mechanical link between lowering of schwa toward [a] and small increases of duration. In this experiment the highest recorded durations for a second pretonic /a, o/ were in the range of 60 ms. and yet nonetheless the vowels in the upper durational range have F1 values more characteristic of the range found for first pretonic /a,o/.38 Had the experiment included phrase-final or absolute word-initial /a, o/, in which the spectral shift is so conspicuous that it is noted in impressionistic descriptions, the asymptoting effect for F1 of non-first-pretonics as duration increased would have been all the more dramatic. I will conclude from the above, then, that Russian has one phonological reduction process (formerly identified solely with Degree 1), and one phonetic reduction process (formerly identified solely with Degree 2). The phonological merger of /a/ and /o/ takes place categorically whenever the vowels in question appear outside the stressed syllable. Gradient raising of unstressed /a/~/o/ toward [ ] occurs in all unstressed syllables as a function of the duration allotted to the vowels in question by the phonetics. Where this duration is above a certain threshold (apparently somewhere just over 60 ms.), no reduction is seen. The upshot of this analysis is that // is simply not an available category in the phonology of Russian. A vowel with the features (or lack thereof) characteristic of schwa is neither licensed in unstressed syllables, nor specifically prohibited in stressed syllables. Indeed, even assuming the Richness of the Base hypothesis common in most current Optimality Theory (Prince and Smolensky 1993), a hypothesis which proscribes any limitation on possible inputs to the OT-grammar, this analysis simply means that Russian ranks a presumably universal markedness constraint * so high that the phonology deems its inclusion in any representation at all ill-formed. That the phonetics at times causes other

66

Stressed syllables and unstressed vowel reduction

vowels such as /a/ or /o/ to be realized in a manner reminiscent of what we might transcribe as // is irrelevant to phonological grammar of Russian. For some speakers, both the phonetic and the phonological reduction processes described above seem to apply equally in all unstressed syllables. For other speakers, particularly Speaker DT in the data presented here, it seems possible that the target window for the realization of [a] in first pretonic syllables is resized relative to that applying in other unstressed syllables. This narrowing of the target window causes Speaker DT’s first pretonic vowels to resist the phonetic reduction process which applies unhindered in other unstressed syllables. Such resizing of target windows for the realization of a single phonological entity in different prosodic positions is not unusual, and should evoke no surprise here in Russian either. As reviewed above in Section 2.2 and below in Chapter 4’s discussion of the phonetics of initial strengthening, studies such as Cho (2001, 2004) make clear the extent to which presence in a syllable bearing nuclear pitch accent or adjacency to a prosodic boundary can influence such presumably low-level phonetic factors as degree of susceptibility to coarticulation or degree of peripheralization within a given vowel’s normal distributional range, effects which no one would argue constitute changes in the phonological identity of the segment in question: a IP-initial /i/ in English is no more or less a phonological /i/ than the analogous /i/ phraseinternally, despite clear differences in realization which could be argued to reflect precisely target window size. An /a/ bearing a nuclear pitch accent is featurally no different than an /a/ in the postnuclear preboundary trough, one would hope. The only catch in all of this is that for the vowels of Degree 2 reduction to be featurally identical to the analogous vowels of Degree 1, the phonetics must be able to assign different durations to different unstressed syllables on a language-specific basis; while absolute initial vowels, phrase-final vowels, and vowels in hiatus before identical vowels receive additional duration for reasons not specific to the phonology of Russian, significant lengthening of vowels in first pretonic syllables is a property of certain Slavic languages in particular.39 Language-specific apportionment of duration by the phonetics is nothing controversial, however. In Russian, as in other languages, we already need to have a phonetics capable of assigning radically different durations (at least for some speakers here, and for those described by Matusevic¬ above) to equally moraic vowels stressed and unstressed (first pretonic). We also need it to assign different durations to monomoraic phrase-final vowels than it does to monomoraic phrase-internal word-final vowels. We need it to alter durations of monomoraic vowels according to the complexity or presence of onsets and codas in their syllables, to

Phonetic determinism in UVR systems

67

lengthen and shorten monomoraic vowels according to the voicing specifications (and other sonority-scale differences as well – nasal, liquid, glide) of following consonants, according to the aspiration of preceding consonants, to the number of syllables in the word, to the position of the word in the utterance, to the position within the word of the monomoraic vowel regardless of placement of stress, and so on. It is difficult in light of this to see what should prevent the phonetics from assigning different durations to phonologically identical vowels as a function of their position relative to the stressed syllable, as is the case in East Slavic. 2.6. Phonetic determinism in UVR systems I have argued in this chapter so far that the typological characteristics of systems of unstressed vowel reduction are best accounted for using a phonologization-based model of the phonetics-phonology interface. This approach sees regularities in phonological patterns of UVR as a result of the fact that most UVR systems arise from the phonologization of phonetic patterns of vowel approximation and overlap brought about ultimately by durational pressures on the articulation of vowels in unstressed syllables. I have also illustrated in case studies instances of both the phonetic and phonological ends of the phonologization process, including cases in which the two are operative in a single language (Brazilian Portuguese, Russian) or at opposite ends of a dialect continuum (Bulgarian). The patterns start off as phonetic consequences of the apportionment of duration in unstressed syllables, but once phonologized, the neutralizations take place regardless of the phonetic characteristics of the syllables in which they occur. In this sense, then, phonologized patterns of UVR are to a large extent synchronically arbitrary. This phonetic arbitrariness is underscored by the existence of patterns of phonological UVR not resulting from phonetically-conditioned mergers in unstressed syllables to begin with. Section 2.3.2 above raised the possibility that the East Slavic neutralization of /a/ and /o/ in unstressed syllables is actually the result of a failure of earlier short and long /a/ to split into qualitatively distinct vowels in those positions. If this is ultimately borne out, a story of articulatory approximation followed by misperception and phonologization of the merger is obviously entirely incorrect. A clear case of such a set of developments is found in some northern Russian dialects where etymological /o/ under specific accentual circumstances (See Timberlake [1983a] and [1993] for details) splits into /o/ and what is traditionally symbolized /ô/ in the Slavic literature (and in which the contrast between /a/ and /o/ is realized in unstressed syllables). This, in combination with a subsequent

68

Stressed syllables and unstressed vowel reduction

merger of */u/ with original /o/, produces a situation in which /ô/ (higher and often diphthongized) contrasts with /o/ only under stress. This is so not because unstressed syllables were not durationally strong enough to license the contrast (in these dialects there is apparently little reduction of unstressed vowels to begin with – hence the persistence of the /a/ – /o/ contrast), but because /ô/ arose only in syllables which bore particular types of accent. The resulting licensing asymmetry is clearly not “phonetically determined” in a synchronic sense, though it is phonetically explicable as an instance of phonologization. Indeed, Timberlake (1993) suggests a phonetic explanation for the split of original */o/ in North Russian dialects involving the pitch curves and attendant durational patterns produced by the earlier pitch accent system, rephonologized upon the breakdown of that system in the relevant dialects. Approaching the typology of UVR systems from a phonetically-motivated phonologization model, we can account equally well for the existence of these synchronically similar licensing asymmetries by simply identifying more than one phonetic source for a given synchronic pattern (both durational reduction of unstressed vowels and durational and articulatory enhancement of accented vowels40 can result in differing sets of contrasts available in stressed and unstressed syllables). Indeed, in many cases synchronic reduction patterns will not be phonetically motivated even in diachrony, but rather, will result from the layering of several not-necessarily-related sound changes, each phonetically motivated in its own right, but in aggregate creating a pattern which is not wholly phonetically explicable. Such patterns will never be found in gradient versions such as those discussed above for phonetically motivated UVR patterns, since they are not in fact the result of the phonologization of a phonetically conditioned gradient shift from one articulation to another. Such a case is found in Seediq, an Austronesian language of Taiwan, belonging to the Atayalic family. 2.6.1. Vowel reduction in Seediq Holmer (1996) presents a description of the Paran dialect of Seediq (also called Sedik in the literature – Asai 1953; Li 1977, 1980). Seediq is described as having five underlying vowels /i, e, a, o, u/, all of which are contrastive in stressed syllables. In pretonic syllables, however, only one vowel is realized: /u/.41 In unstressed syllables /u/ has the allophonic realizations [u], [], and [], according to Li (1977). Li analyzes the pretonic /u/ as in fact epenthetic. Unstressed vowels reduced to /u/ in this

Phonetic determinism in UVR systems

69

position are frequently deleted, their realization largely predictable from sonority restrictions on consonant clusters. Li therefore analyzes forms liked ‘mupubulebin’, ‘will pull’, as having underlying clusters before the tonic syllable: /mpblebin/. Be this as it may, the reduced vowels of the posttonic syllable cannot be analyzed as such. In these, the following neutralizations take place. (8)

Posttonic vowel realizations in Seediq   

/i/ /a/ /e, o, u/

[i] [a]/[] [u]

Alternations caused by these neutralizations can be observed under affixation, where stress shifts onto the former posttonic final syllable, as shown in (9), where for clarity’s sake Holmer’s intermediate stages are included as a derivation. (9)

Vowel reduction in Seediq /haed/ ‘cook’  h-m-aed  hmaets  [humauts] (Actor Focus) infixation d ts epenthesis and UVR  haed-an  hedan suffixation a  ø /heyeg/

 [huedan] (Locative Focus) epenthesis

‘stand’

Actor Focus  m-heyeg  mheyew  mheyuw42  mheyu  [meheyu] prefix gw VR wø epenthesis, translar. harmony Locative Focus, Preterite  h-n-eyeg  hneyeg-an  hnyegan  [hunuyean] infix suffix eø epenthesis There are lexical exceptions to the pattern, but it is otherwise said to be quite general. The problem, of course, is that in an unstressed syllable which licenses the faithful realization of [i], it is difficult to imagine what would make [u] an optimal realization for unstressed /e/. /e/  [i] involves

70

Stressed syllables and unstressed vowel reduction

the change of [–high] to [+high], involved in any case in /e/  [u] as well. But /e/  [u] involves changes to [back] and [round] as well (or just [back], or just [round], depending on one’s view of underspecification). The answer here is that [u] is not in fact an optimal realization for an /e/ submitted to vowel reduction under any circumstances (which is to say, in comparison with [i] or []. [u] is perfectly fine from the point of view of dispersion). Still though, this is the realization that a layering of sound changes in Seediq ultimately provides. There was no gradual drift of unstressed /e/ toward [u], finally resulting in reanalysis (an implausible tendency at best). The authors describing these facts offer no explanation of the rise of this pattern, and all the facts are still not clear to me concerning precisely what did happen, but the crucial pieces of the puzzle appear to be these: Stressed /e/ in Seediq comes from *Proto-Austronesian // (Asai 1953: 13). Asai likewise attests that when deprived of stress, /e/ is realized as [, , ]. He also claims that /i/ has the same realizations in unstressed syllables, but gives examples of this only in pretonic syllables. By contrast, there are plenty of instances of posttonic [i] throughout his grammar (e.g., [mari] ‘buy’, [mhe i] ‘bearing fruits’), but apparently none of [e]. It is plausible to assume, in fact, given available information, that [e] never arose outside stressed syllables in the first place in Seediq. In any case, the second crucial piece of information is that in the Paran dialect at the time of Asai’s writing (1953), *PAN /u/, realized as such even in other Atayalic dialects, is being realized as [] under stress in Seediq. About this transcription, too, Asai is quite specific, describing it as a high, back, nonround vowel. At least some instances of earlier */u/ becoming Seediq rounded [] are attested in Asai, particularly in final syllables:43 [kkh] < *kuku ‘claw’, and others. It is not clear to me whether this holds of all and/or only final syllables, as examples in Asai are few. What is clear, however, is that earlier /u/ is being realized regularly as [] in stressed syllables in Asai’s description, but in Li’s and Holmer’s later work, also based on direct work with speakers, all these stressed reflexes of */u/ are transcribed as [u], with no indication that they are pronounced any other way (though unstressed /u/ is said to vary from [u] to [] to [], note that all three variants imply rounding). Asai reports far more variation in the realization of unstressed vowels than do the later researchers, perhaps indicating relatively recent transformation of the processes from gradient to categorical, which they now seem to be. Whether or not this is the case, what must have happened in Seediq is as follows: Stressed /e/ alternates with unstressed [, , ], which must be contextual variants of one another. Unstressed [] must have been realized in this same range, such that the two vowels became phonologically neutralized in unstressed syllables. Then at some point be-

Summary

71

fore the stage described by Li and Holmer, the pronunciation of [] became rounded across the board, including the unstressed allophones of /e/. This is reflected in the current situation, where no mention of a high back unrounded vowel is made in either descriptive account, and /e/ alternates with /u/ in unstressed syllables. The unstressed inventory created by these processes is, it must be noted, perfectly well dispersed. (It is even tempting to attribute the rounding of [] to the enhancement of the contrast between high vowels, increasing the goodness of the dispersion.) The problem is the relationship between /e/ and /u/, curious from both phonetic and phonological perspectives. While Seediq UVR is typologically odd, the fact that the phonologization approach to typological regularities locates the source of their phonetic naturalness in diachronic development means that the phonological grammar is in no way prevented from implementing such a pattern, should it somehow, through whatever channels, come into existence. 2.7. Summary I have argued in this chapter that there is no reason to believe that categorical patterns of vowel reduction have any more than a single formal implementation synchronically. The formal system, whether based on Licensing constraints such as those employed by Crosswhite for some UVR processes, or on some other variant of Positional Faithfulness or Markedness should in any event be chosen according to its ability to implement the attested range of phonological patterns, rather than according to its ability to justify the existence of that pattern from a functionalist perspective. If it is true that we must accept more than one distinct formal mechanism for the synchronic implementation of UVR patterns, it should be possible to demonstrate that those two patterns actually behave differently from a synchronic point of view (rather than merely being attributable in a non-specific way to two different sets of general functional principles).44 I have identified two (not unexpected) patterns of this sort in the crosslinguistic patterning of UVR systems: those which are categorical or phonological in nature, and those which are gradient or phonetic. The former are the consequence of a true neutralization, a merger of phonological entities, while the latter are cases of phonetic approximation or overlap of realizations conditioned by durational pressure on articulation. I have also shown that far more UVR patterns than were previously thought in fact belong to that gradient, phonetic type, including that of Standard Bulgarian and Degree 2 of reduction in Brazilian Portuguese and Russian, Furthermore, I have presented the phonologization

72

Stressed syllables and unstressed vowel reduction

model as the link between the two categories, the force which transforms the latter into the former. That synchronic, categorical patterns should frequently be explicable with reference to phonetic tendencies or perceptual limitations is a consequence of the fact that more often than not those patterns are the direct result in a diachronic sense of those very tendencies. Some patterns, such as Seediq, clearly have their origins elsewhere, but it is in no way obvious that this unnatural conception ultimately causes them to behave synchronically any differently than patterns with unimpeachable phonetic pedigrees. Unless it can be shown otherwise, therefore, there is no reason to believe that the synchronic phonological implementation of these unnatural patterns should be in any way formally distinct from any others. The phonologization approach to typological regularity, based as it is on the diachronic development of real phonetic patterns, makes predictions about which types of UVR systems should be commonly attested and which types should not, and those predictions are largely correct. That UVR patterns should center overwhelmingly on neutralizations of vowel height contrasts, with few if any instances of systems based on the neutralization of front/back, round or ATR contrasts in unstressed syllables is predicted in the phonologization model, which identifies as the primary source of UVR patterns durational pressures on the articulatory implementation of height contrasts. No such pressure, on the other hand, is known to exist on front/back, rounding, or ATR contrasts, meaning that UVR involving these should be less likely to arise. That a contrast of vowel nasalization should be neutralized outside stressed syllables, as it is in Copala Trique, for example, is also predictable, since it is well-known that in many languages (as noted above for Brazilian Portuguese, where maintaining the contrast nonetheless seems more important to the categorical phonology than reducing the durations of all unstressed syllables) nasalized vowels are significantly longer than their oral counterparts. Other theories of vowel reduction fail to predict these typological regularities, in fact even predicting the reverse in the case of contrastenhancing UVR. The phonologization model both accounts for the typology of UVR systems, and in doing so creates the possibility for a much simpler treatment of UVR in the phonology, since the latter is no longer compelled to duplicate the work of phonologization in generating the phonetic naturalness of typological patterns.

Chapter 3 Final syllables

This chapter looks at the range of positional neutralization effects found involving the final syllables of a prosodic domain. Final syllables present an interesting test case for theories of positional neutralization for a number of reasons. First and foremost, they commonly share with stressed syllables the phonetic cue taken by phonologists such as Steriade to be among the most crucial in licensing vowel contrasts: duration. The phonetic durations of domain-final syllables are known to be enhanced in many languages by a process known as final lengthening. For this reason, we might legitimately expect to find the same types of licensing asymmetries in final syllables as we found in stressed syllables in the previous chapter. Secondly, final syllables are also known to bear a particular psycholinguistic significance during acquisition, a fact which we might expect to make the existence of licensing asymmetries here all the more likely (Kehoe and Stoel-Gammon 1997). Indeed, final syllables have been claimed to be strong licensers for vowel quality contrasts, but not unambiguously so. Whatever phonological strength effects may be found in final syllables, the right edge of word and phrase alike are also famously host to segmental weakening effects such as devoicing, reduction and deletion. This latter class of effects, which has no counterpart in stressed syllables, introduces a contradiction into the status of final position as a licenser of contrasts, and demands an explanation from any theory seeking to account for the typology of positional neutralization effects found in this position. In this chapter I offer an explanation of these contradictory phonological patterns which attributes their emergence to the effects of a diverse constellation of phonetic properties associated with domain-final position. These effects, while perfectly compatible with one another from a production standpoint, have consequences for the perception of contrasts which frequently prove mutually contradictory. Whether the phonetic patterns tending to enhance perceptual robustness or those tending to diminish it are ultimately phonologized is determined by the relative magnitudes of the languagespecific instantiations of each phonetic tendency. While the phonological effects of this uneasy coexistence in final position among perceptually antagonistic phonetic trends are comprehensible according to the phonologization approach to typological patterning, final syllables present a problem for universalist Pure Prominence

74

Final syllables

approaches to PN. The equally robust attestation of final syllables as strong and weak licensers of vowel contrasts precludes the specification in UG of final position as either inherently strong or inherently weak. While the licensing capabilities of final syllables in a given language will be motivated from a diachronic-phonetic standpoint, synchronically phonological strength or weakness for this position appears to be a parameter which must be set arbitrarily on a language-specific basis. Insofar, too, as the strong or weak licensing capacity of final position must be divined from empirical evidence during the acquisition process, it is reasonable to question the extent to which it is necessary or realistic to assume restriction in UG of the status vis-a-vis potential neutralization patterns of any other specific position either. Section 3.1 reviews arguments for the phonetic and psycholinguistic prominence of final position, and Section 3.2 catalogues and analyzes the phonological strength effects which result from this prominence. Section 3.3 and 3.4 compare the strength effects found in final position to those discussed in Chapter 2 for stressed syllables. The problem of reconciling these effects with tendencies to phonetic and phonological weakness in final syllables is taken up in 3.6 and Section 3.7 is devoted to the licensing of vowel quantity in final position and its comparison with similar effects in stressed syllables. 3.1. Final syllable prominence 3.1.1. Phonetic prominence Domain-final syllables have been shown by many researchers to be the locus of a certain degree of vowel lengthening (Oller 1973; Klatt 1975, 1976; Beckman and Edwards 1987; Wightman et al. 1992; Keating, Wright and Zhang 1999, inter alia on English). Lengthening effects of any kind are of immediate relevance to the present study insofar as we have seen how positional durational asymmetries can give rise to neutralization patterns in other positions, such as stressed and unstressed syllables. Languages in which final lengthening has been identified experimentally include American Sign Language (Brentari 1995), Beijing Chinese (Zhang 2001), Brazilian Portuguese (Major 1985; Simões 1991), Bulgarian (Savitska and Bojadzhiev 1988), Creek (Johnson and Martin 2001), Czech (Dankovicˇová 1997), Dutch (Cambier-Langeveld 1997, 1999), English (see above), French (Delattre 1966; Fletcher 1991), German (Kohler 1983), Hausa (Newman and Van Heuven 1981), Israeli Hebrew (Berkovits 1984, 1993a, 1993b, 1994,), Italian (Farnetani and Kori 1990), Japanese (Ueyama 1999),

Final syllable prominence

75

Jicarilla Apache (Tuttle 2000), Jordanian Arabic (de Jong and Zawaydeh 1999), Korean (Cho 2000), Luganda (Zhang 2001), Russian (Zlatoustova 1981: 13–17), Spanish (Delattre 1966), and Swedish (Nord 1975, 1987). It has been described or implied in numerous descriptive grammars as well, as shall be seen below. It is difficult to assess the universality of the phenomenon due to the tendency for experimental studies to focus on European languages (and primarily English). Most grammar writers, to compound the problem, for fairly obvious reasons make no reference, positive or negative, to the phenomenon. Nonetheless, I know of no study experimental or impressionistic making the explicit claim that a language lacks final lengthening in any form. The existence of phrase-final lengthening in such pre- or non-linguistic behaviors as infant babbling, musical performance, and bird song and insect chirps lead Johnson and Martin (2001) (among others) to speculate that final lengthening is in fact a property of motor performance in general. However widespread, it is clear that domain-final lengthening is nonetheless subject to significant language-specific variation. Johnson and Martin (2001) cite Delattre’s 1966 study of syllable durations in four languages, in which French, English, Spanish and German are shown to differ in the magnitude of final lengthening observed in comparable domains, and in Johnson and Martin’s study, Creek is seen to differ from all four of these in degree of lengthening. To this we may add the comparative study of accentual and final lengthening in English and Dutch of Cambier-Langeveld (1999), which shows that while English words with phrasal accent in phrase-final position are longer than unaccented phrasefinal words, in Dutch items in phrase-final position undergo the same amount of lengthening whether accented or not. Unfortunately, most studies of domain-final lengthening to date have focused only on a single language, such that there is a pressing need for additional comparative studies to determine the extent of crosslinguistic variation in the implementation of this possibly otherwise universal phenomenon. A number of studies of final lengthening have been devoted to determining the extent of its operation in prosodic domains of different levels. These are especially relevant to the present study for their help in understanding how the phonologization of final syllable effects proceed from the level of the phrase to the level of the word. It is difficult in some respects to compare the results of these studies directly. For an experiment to show conclusively how two languages differ in their implementation of final lengthening at a given prosodic level, extreme care would have to be taken not only to replicate general aspects of experimental design and presentation, but in particular to make certain that the domains correspond to comparable levels of the Prosodic Hierarchy in both languages, that

76

Final syllables

target phrases and words were of the same length, that syllabic and segmental environment in the vicinity of the boundaries were identical, and so forth. Studies of this type, preferably of languages with otherwise dissimilar prosodic systems, are sorely needed for further progress in understanding the extent of possible variation in the implementation of final lengthening. In addition, while many studies or descriptions report only utterance-final lengthening, it is often not clear whether other phrasal domains were tested or observed at all. Be this as it may, we can draw a number of important generalizations from the studies on this question: A number of studies (e.g., Beckman and Edwards 1990; Cho 2001; de Jong and Zawaydeh 1999; Ueyama 1999) have shown the magnitude of preboundary lengthening effects to be greater, the higher the level of the prosodic domain in which the relevant syllables are final. Thus, de Jong and Zawaydeh find for Jordanian Arabic the least lengthening before boundaries of non-utterance final phrases with no F0 contour marking the end of the phrase, more lengthening in non-utterance final phrases with boundaries marked by a clear F0 contour, and the most lengthening of all utterance-finally. Ueyama finds more pre-boundary lengthening in Japanese Intonational Phrases than in lower-level Accentual Phrases. While Beckman and Edwards (1990) report a certain degree of (phrase-internal) word-final lengthening in English, this lengthening is clearly less robust both in degree and consistency of application than is the pre-boundary lengthening they detect at the ends of higher-level phrases. The implicational statement following from this observation seems valid for all languages in which final lengthening has been measured or described: the occurrence of lengthening before a lower-level domain boundary implies the occurrence of lengthening at all higher levels. As will be shown below, this pattern is mirrored in the crosslinguistic distribution of positional neutralization effects involving final position as well: If a domain-final effect applies before a phrase-internal word boundary, it will apply when that word boundary coincides with higher-level phrase boundaries as well. The reverse is not the case. Additional phonetic characteristics of final lengthening, in particular issues concerning its articulatory implementation and domain of application, will be discussed later in this chapter in connection with the phonological patterns to which they are relevant. For now it is sufficient to characterize domain-final lengthening as a phonetic effect which is potentially universal in its crosslinguistic distribution, subject however to language-specific variation in its implementation, and most robust in higher-level prosodic constituents. To the extent that final lengthening is universal, we might expect final strength effects to be equally widespread.

Final syllable prominence

77

3.1.2. Psycholinguistic prominence In addition to their well-documented phonetic enhancement, final syllables are also known to exhibit special salience in a psycholinguistic sense. Kehoe and Stoel-Gammon’s 1997 study of prosodic acquisition demonstrates this well. In this study of the truncation patterns of children acquiring English, the authors show that in truncated productions of polysyllabic words, children are most likely to retain internal stressed syllables and final syllables, stressed and unstressed. They characterize their results as follows (Kehoe and Stoel-Gammon 1997: 120–121):45 In WS words, children produced the stressed syllable; in SS, SWS, and SWS words, children produced both stressed syllables; in WSW, SWW, and WSWW words, children produced the stressed syllable and the unstressed syllable in word-final position; and in SWSW, SWSW, and SSWW words, children preserved both stressed syllables and the unstressed syllable in word-final position. Children preserved a medial rather than a final unstressed syllable in their truncations only infrequently.

Kehoe and Stoel-Gammon suggest that the increased perceptual salience of final syllables due to final lengthening is a possible source for this pattern, but that perceptual difficulty with non-prominent material cannot ultimately tell the full story. Firstly, it has been shown that children do in fact accurately perceive the material they otherwise omit from truncated productions. Segmental material from deleted syllables (e.g., their onset C’s) indeed often appears in truncated forms, reinforcing this point. Furthermore, in children’s earliest truncations, it is the rightmost stressed syllable in a word which is most often the one to surface. When the adult form has final stress, the final syllable alone surfaces, and when stress is penultimate, children’s most frequent early truncations preserve both the stressed and the final unstressed syllables (1997: 123). All this suggests that there is some additional importance attached to the right-edge adjacent material in the word which goes beyond simple relative ease of perception of the segmental material located there. Kehoe and Stoel-Gammon argue that children’s phonological representations are relatively intact (compared to those of adults), but that production constraints give rise to the omission of all but a few privileged portions of the word. This enhanced psycholinguistic salience accorded to prosodic heads and edge-adjacent material can be modeled using the parametrized faithfulness constraints now familiar from a variety of OT approaches to positional neutralization. Kehoe and Stoel-Gammon are not specific, however, as to the precise nature of the psycholinguistic salience of word-final syllables. Still, the

78

Final syllables

preferential retention of final syllables in children’s truncations does suggest the possibility that the privileged psycholinguistic status of these syllables could lead to analogous phonological effects in adult grammars. 3.2. Phonological strength effects in final position The foregoing makes the prediction that phonological strength effects should be attested in final position, and this is indeed the case. The following sections provide an account of those effects. I begin with a brief discussion of previous work on tone in final position. I then move on to segmental licensing asymmetries, beginning with two case studies illustrating the complexity of the issue from the standpoint of the relative roles of phonetics and phonology in determining the shape of PN patterns. I then present a catalogue of attested final strength patterns, from which I extract a typological generalization concerning the role of syllable structure in determining the distribution of final strength effects. This generalization allows us to clearly disambiguate the relative roles of phonetic and psycholinguistic prominence in producing final-position PN patterns. 3.2.1. Tone in final position One type of phonological strength effect that has been thoroughly documented in final position is the tendency toward preferential licensing of contour tones on final syllables. Zhang (2001), building on Gordon’s (1999) crosslinguistic study of the phonetic properties associated with tonebearing segments, provides evidence for the strength of prosodic-final syllables with respect to contour tones. Zhang cites the behavior of contour tones in Kikuyu, shown in (1), as an example of this pattern. (1)

Contour tones in Kikuyu: phrase-final vs. non-final a. kariok! 3g6 L b. kariok! 3g6 L

moe"a  kariok! moe"a 56 3g # 56 H L L H L

H

 kariok! 3g6 # L H

‘good Kiriuki’

‘Kiriuki’

Phonological strength effects in final position

79

In the non-prosodic-final context in (a), the floating high tone alone occupies the final syllable of the noun, while the low tone is realized only on the first two syllables. In the prosodic-final context in (b), by contrast, the high associates to the final syllable, forming a contour with the preceding low. Zhang explains this pattern as a consequence of the additional phonetic duration associated with domain-final syllables. Contour tones, for obvious reasons requiring more duration for their realization than other tones, are shown to be restricted in many languages to syllables characterized by a longer sonorous rhyme duration. Among these are syllables with long vowels, syllables with short vowels closed by sonorants, stressed syllables, domain-final syllables, and those located in shorter words.46 3.2.2. Vocalic strength effects in final position Zhang cites Steriade (1994) as claiming that final syllables can sometimes behave as strong positions for other contrasts as well. Steriade lists four such cases, in which either [round] and/or [back] (Hausa, Timugon Murut, and Eastern Cheremis) or [lax] are preferentially licensed in final syllables.47 Zhang takes this to mean that final syllables may be strong positions specifically for vowel contrasts which are difficult to perceive (see discussion of “bad vowels” and harmony systems in my Chapter 2). Such contrasts profit from realization in positions associated with additional duration, since, in Steriade’s words, “extra duration means extra exposure to a dubious vowel quality and thus a better chance to identify it correctly” (Steriade 1994: 20). Of Steriade’s examples of languages with final-syllable strength, the most straightforward is Hausa, which is exemplified in the following section. 3.2.2.1. Hausa Hausa has a vowel inventory consisting of five vowels, long and short, /i, i:, e, e:, a, a:, o, o:, u, u:/. This complete inventory, however, is reliably distinguished only in final position. While long vowels are faithfully realized in all positions, the situation is markedly different for the short vowels. Steriade claims that the feature combination [round] – [back] among short high vowels is distinctive in Hausa only phrase-finally (Steriade 1994: 13). As noted above, Hausa has been shown to have robust (though non-neutralizing) phrase-final lengthening. In phrase-final position, long and short vowels are realized with identical quality. Short

80

Final syllables

vowels, however, are “cut off by a glottal closure”, while a long vowel “has longer duration and just dies away”. (Carnochan 1988)48 Carnochan (1951) notes additionally that the long vowels are often realized “with breathy release”, which I will interpret as partial devoicing. Non-phrase-final short /i/ and /u/ are clearly subject to a great deal of variation in their realizations, determined by both consonantal and vocalic environment (Schuh and Yalwa 1999). Whether or not this variation is ultimately neutralizing as per Steriade (1994) is another matter. The possibly conservative description of Carnochan (1988) makes no mention of such a neutralization, transcribing these vowels as somewhat centralized [] and []. Steriade cites Schuh (1971) for the neutralization data. Schuh and Yalwa (1999: 90), however, say the following: Short /i/ may thus range across [i –  – ], and short /u/ may range across [u –  – ]. In normal conversational speech, medial short high vowels are probably frequently neutralized to a high, centralized vowel, with rounding or lack of rounding determined by environment.

Newman (2000: 398–399), noting the variation and in some cases overlap in the realizations of these vowels, nonetheless argues that they are still phonologically distinct, if perhaps on the way to an ultimate merger. As argued in previous sections, of course, there is a serious difference between segments whose realizations are frequently overlapping to some extent and those which are actually phonologically neutralized. What we may be dealing with in Hausa is a gradient process akin, for example, to the gradient neutralization of back mid and high vowels in the unstressed syllables of Standard Bulgarian discussed in the previous chapter. The merger would thus be ill-characterized as the neutralization of the categorical phonological features [back] and [round]. Hausa short mid vowels behave differently. While these are distinguished consistently in word-final position, there are no underlying short mid vowels in medial position in non-ideophonic native vocabulary. Medial short mid vowels could in principle be created through the process of closed syllable shortening effective in Hausa. This outcome is in fact described by Carnochan 1988, who gives derived closed-syllable forms with the lowered and centralized short mid vowels shown in (2): (2)

Hausa medial short mid vowels à la Carnochan 1988 a.

kare mace goro

[kre] [mte#] [oro#]

‘dog’ ‘woman’ ‘kolanut’

Phonological strength effects in final position

b.

karen Audu macen Audu goron Audu

[kr#audu#] [mt#audu#] [or#audu#]

81

‘Audu’s dog’ ‘Audu’s woman’ ‘Audu’s kolanut’

Both Schuh and Yalwa (1999) and Newman (2000), however, are in agreement that at least for most speakers in closed syllables the contrast between short /e/, /o/ and /a/ is neutralized. The resulting vowel is transcribed [], but in fact varies greatly in realization depending on its environment. This is shown in (3), with forms taken from both sources: (3)

Hausa medial short vowels a.

[zobe] [ree] [tona] [sabo]

‘ring’ b. [z$bba] ‘branch’ [r$ssa] ‘dig up’ [tntona] ‘new’ [sabn sod%a]

‘rings’ ‘branches’ ‘dig up, pluract.’ ‘new soldier’

The merger of the non-high medial short vowels as schwa thus seems to be a better candidate for description as a categorical neutralization process.49 Both merger patterns clearly have as their ultimate source the durational asymmetry between short vowels in phrase-final and nonphrase-final position. As with the mixed vowel reduction systems discussed in Chapter 2, however, in the gradient cases (medial short high vowels and potentially phrase-medial word-final short non-high vowels as well), the likelihood or degree of phonetic overlap will be determined by the interaction of the duration of the vowel with the coarticulatory influence of surrounding segments, while in the categorical case merger will occur or fail to occur based solely on the position of the vowel within the word or phrase, actual durational characteristics of any particular instantiation of the vowel in that position notwithstanding. In Hausa, then, the domain-final syllable is actually the sole position licensing the full array of contrasts available in the language. All other positions allow the realization of only a subset of those contrasts, making this essentially the final-syllable analogue of stress-based vowel reduction. Yet, interestingly, such cases are far from the norm. In fact, uncontroversial instances in which this is the case irrespective of the placement of stress, and in which the phonetic source of the pattern is clearly the additional duration supplied by final lengthening are few and far between. Steriade’s inclusion of Pasiego Spanish in her list of such systems is illustrative in this connection.

82

Final syllables

3.2.2.2. Pasiego Spanish Pasiego Spanish (Penny 1969; McCarthy 1984; Flemming 1993; Sanders 1994; Dyck 1995, inter alia), as Steriade notes, does indeed allow a contrast between tense and lax vowels only in final syllables. More specifically, [tense] and [lax] are contrastive only for high back vowels in absolute word-final position.50 Whenever the lax -U suffix appears, however, all preceding vowels in the word are realized as lax as well, a process usually analyzed as a right-to-left laxing harmony.51 Penny (1969: 148–149) gives the vowel inventory below in (4) for the tonic and non-final atonic syllables of Pasiego Spanish. He describes the lax variants of the vowels in the following way: The high front -I- is centralized and lowered, as is -U-. -A- is “fronted and slightly raised, almost []”, and -O- is similar to the front rounded vowels of French, but “without any great protrusion of the lips”. In final unstressed syllables, only the [e, a, u, U] appear. Examples of the spreading of the lax feature are given in (5) below (McCarthy 1984: 293–294; Dyck 1995: 28). (4)

Pasiego Spanish vowel inventory a.

i e

u o

b. I

a (5)

a. PLURALS tense abi&anus soldáus pu&úkus kantárus

U O A

MASC. SG. COUNT

‘hazels’ ‘soldiers’ ‘young chickens’ ‘5 gal jugs’

lax AvI&ÁnU sOldÁU pU&ÚkU kAntÁrU

b. SG. MASS málu límpju sú'ju

SG. COUNT

c. AUGMENTATIVE bur(ikon ‘donkey’ ar(uyón ‘big arroyo’ kantón

COUNT

mÁlU lÍmpjU sÚ'jU bUr(IkÚcU Ar(ÚyU kÁntU

‘hazel’ ‘soldier’ ‘young chicken’ ‘5 gal. jug’

‘evil’ ‘clean’ ‘dirty’ ‘donkey’ (pej.) ‘arroyo’ ‘rock that has been rounded and worn by water’

Phonological strength effects in final position

83

Again, Steriade attributes the strong licensing capacity of the final syllable to the extra duration found in that position, duration which (Zhang adds) assists in the perception of difficult contrasts. But is this really the case in Pasiego Spanish? A number of facts cast suspicion on this hypothesis. First among these is the description Penny (1969: 149) gives of the realization of final syllables in Pasiego. According to him, in both closed and open final syllables, “the vowel sounds of pasiego speech are very relaxed (although not as much as in Portuguese, for example) and are almost entirely devoiced when a voiceless consonant or [rr] precedes”. In final syllables, even the tense /u/ is realized lower (at “similar” aperture as /u/) and laxer, the difference between /U/ and /u/ now being only that the former is “centralized” while the latter is a “clear velar”. /e/ in final position is raised (“almost an open ‘i’”), which suggests why there is no contrast between front mid and high vowels in these syllables. Recall also from note 50 above that the masculine mass noun suffix /-u/ comes from an earlier -o. All this suggests vowel weakening in the final syllable, perhaps if anything less duration than in other unstressed syllables (recall the common raising of mid vowels and/or centralization of peripherals in UVR systems from Chapter 2). To this we can add facts from neighboring Tudanca Spanish (Hualde 1989; Flemming 1993; Dyck 1995). In Tudanca Spanish word-final high vowels are phonetically laxed.52 This laxness spreads left to the stress ed vowel, laxing any intervening vowels as well in the process. Additionally, in word-final unstressed syllables the contrast between underlying /i/ and /e/ is neutralized, both vowels being realized as lax [i] or [].53 Again, this neutralization suggests a position susceptible to vowel reduction, i.e. actually a rather poor place to perceive fine contrasts in vowel quality. A third piece of evidence comes from Sanders (1994), who presents an extensive study of a similar (but unrelated) laxing harmony in the dialects of Eastern Andalusia. Sanders provides data from an experimental study of the phonetics of vowels in this dialect, showing that word-final (phraseinternal) vowels in this dialect (both tense and lax) are shorter than the vowels of stressed syllables, but longer than the vowels of pretonic unstressed syllables. For example, mean duration from three speakers for tonic tense /a/ was 86.98 ms., for word-initial tense /a/ 60.28 ms., and for word-final tense /a/ 76.85 ms. Values for lax /A/ (not Sanders’s notation) were 90.18 ms., 58.24 ms., and 73.50 ms. respectively. Ratios for other vowels were similar (Sanders 1994: 168). Sanders had hypothesized that his lax vowels might be phonetically longer than his tense vowels. The reason he chose to measure final vowels in non-prepausal position was, in his words, “to prevent vowel reduction due to suprasegmental processes”.

84

Final syllables

(1994: 159) Specifically, “In eastern Andalusian the phrase-final position is subject to considerable reduction.” (1994: 118) From the above it is clear that in Pasiego and other laxing harmony languages of Spain we are not dealing with the type of system in which certain features occur contrastively only in positions where they are perceptually most robust (such as in some of the vowel reduction systems discussed in Chapter 2, or the strong-to-weak harmony systems dealt with below in Chapter 4). In fact, the very opposite appears to be the case. In Pasiego Spanish tense and lax contrast underlyingly only in a position, whose phonetic characteristics would actually hinder accurate perception of fine differences in vowel quality (i.e. a position characterized by shorter durations, laxing or centralization of vowels, in some systems even raising of mid vowels leading to merger with highs, and increased susceptibility to devoicing). In these systems the direction of spreading then is not from strong position to weak position, but from weak (phonetically nonprominent) to strong. This characterization places laxing harmony in a class with other weak-to-strong assimilation processes such as those traditionally labeled umlauts or metaphonies.54 The notion of umlaut or metaphony being motivated primarily by the perceptual weakness of the trigger (as opposed to its phonetic salience, as Steriade suggests for Pasiego) is central to Ohala’s theory of sound change (as in e.g., Ohala [1993a], discussed below). A crucial distinction between harmonies proceeding from nonprominent to prominent positions (perception-based) from those operating from prominent to non-prominent (articulation-based) is the central thesis of McCormick (1982), and is explored more recently in Majors (1998: 160–167) and Walker (2001b). From a diachronic point of view such patterns have a relatively straightforward explanation. Specifically, they are examples of hypocor-rective change, in the Ohalian parlance (Ohala passim, but especially 1993a). As discussed in Chapter 1, the term hypocorrective change refers to instances in which the listener hears, for example, the coarticulatory effect on a vowel of a neighboring segment, but rather than correctly attributing it to its actual source and disregarding it, or taking it as a remote cue for the identity of the source segment, the speaker instead takes the extended feature to be a property of the vowel in question, and constructs a new underlying representation for the form as result. A simple example would be the development of a contrastively nasal vowel from an earlier vowelnasal sequence. In Ohala’s schema, the speaker, with a UR of /VN/, produces the coarticulated sequence [v(N]. The listener then fails to attribute the nasalization he or she hears to the following nasal consonant, hearing instead [v(]. Now, believing this to be the speaker’s intended pronunciation, he or she constructs a new representation for the form in question: /v(/. In

Phonological strength effects in final position

85

the metaphony case, then, a sequence /éCi#/ would perhaps include some raising of the tonic mid vowel as a result of vowel-to-vowel coarticulation. A correct interpretation of the speaker’s intentions would attribute the raising effect to the following high vowel and parse the tonic vowel as mid regardless. The final high vowel, however, occurs in a position in which accurate perception is impeded by the phonetic weakening effects discussed above. It is possible, then, that a listener could misperceive the /éCi/ sequence such that he or she failed to attribute the coarticulatory raising to a following high vowel, reconstructing the speaker’s intended production instead to be an actual stressed high vowel in the penultimate syllable.55 Another, synchronically-oriented approach to weak-to-strong harmony patterns is that of Walker (2001b). In treating, inter alia, the metaphony patterns found in Italian dialects, Walker notes the differences between these patterns and, for example, stressed-syllable triggered harmonies (also present in Romance dialects). She characterizes the two patterns as based on different phonetic motivations and deserving of different synchronic treatments. Walker too sees the metaphony patterns as resulting from the perceptual weakness of the triggering segment (inherent or positionally acquired), and attributes the feature spreading not to misperception, but to the purposeful extension of perceptually weak features to phonetically prominent positions such as the stressed syllable. Both the realization of the weak(ened) features in a perceptually-more-hospitable environment and the mere fact that the extended feature now occurs over a span of longer duration would contribute to increased perceptual robustness. Here Walker is invoking the principle central to the thesis of Kaun’s treatment of rounding harmony (which Kaun attributes in part to Suomi 1983): Bad vowels spread (perceptually difficult contrasts are more easily perceived if they are realized over longer spans within the word). To which it seems necessary to add, in the case of metaphonies involving major distinctions in vowel height, that in fact even good vowels will spread if they wind up in bad enough neighborhoods. The notion behind Walker’s story then is essentially a functionalist one, in which the spreading of vowel features takes place in order to preserve vowel contrasts which are otherwise in danger of collapse due to features of their environments. Walker notes in this connection a generalization made by Dyck (1995) about metaphony in Spanish and Italian dialects: Metaphony does not occur (high vowels do not raise preceding vowels) from positions in which there is no contrast between high and mid vowels.56 If metaphony were not at least in part concerned with the preservation of contrasts in the triggering syllable, such a generalization would go unexplained. It also seems relevant in this regard that in the languages in question the triggering vowels are generally the

86

Final syllables

sole exponents of one or another morphological category, meaning that neutralization of contrasts here would result in the loss of grammatical information. Recall that in the case of the only vowel triggering laxing harmony in Pasiego, neutralization of the tense-lax opposition would result in the collapse of the mass vs. count opposition for masculine nouns as well.57 It is clear from the above that in systems such as Pasiego Spanish, the privileged licensing of features is not due to the beneficial phonetic characteristics of a particular position in the way that, e.g., retroflexion contrasts are preferentially licensed in postvocalic position due to the salience of cues for this distinction in that position (i.e. VC transitions [Steriade 1997]). In many cases contrasts for which final position is phonologically strong are not “licensed” in any sense by the phonetic characteristics of that position.58 In some cases, such as Pasiego Spanish, those characteristics may even be detrimental to the perception of the contrast (a fact which one way or another provides the ultimate motivation for the harmony patterns which may develop there). Further examples of cases of final syllable phonological strength owing explicitly to final syllable phonetic weakness are discussed in the Section 3.6 below. More often than not this turns out to be the case in systems where final position alone is the sole licenser of the full range of contrasts in the inventory. In this sense Hausa, described above, is the exception. None of the other putative instances of (non-tonal) final syllable strength has this characteristic. Rather, where final strength is observed at all, it is most often in the form of what I will call final resistance, introduced in the next section. 3.2.3. Final syllable resistance to assimilation and reduction By far the most common vocalic strength effect that can be observed in final syllables is some degree of resistance to reduction or assimilation processes which would otherwise target vowels of similar prosodic status (unstressed, short, non-initial, etc.). In such cases the final syllable is not the strongest licenser in the word (as it is in Hausa), but rather shares this distinction with some other prosodically prominent syllable. In many, perhaps most of these systems, the strength effect is observed only at the phrase or utterance level, with phrase-internal word-final syllables submitting to reduction (often described as gradient or optional), in contrast with the obligatory categorical reduction or assimilation of other nonprominent vowels. Very often the strength effect itself bears some hallmark of gradient application: It is described as “optional”, rate- or styledependent, or even only partial. This section presents a catalogue of

Phonological strength effects in final position

87

languages in which it has been possible to identify relatively clear cases of final-syllable resistance.59 As this pattern of final strength is largely overlooked in existing literature,60 a few words on each system are in order as well. This catalogue in examples (6) through (19) below is meant to be illustrative, but by no means exhaustive. It is followed by discussion of the common characteristics of final resistance patterns and the ramifications of these systems for the place of phonetic and psycholinguistic factors in an account of final strength effects. (6)

Russian (Slavic – Matusevic 1976: 96–120; Zlatoustova 1962: 109–139) Maximal inventory: [i, e, a, o, u, ] Neutralization:

Basic facts (See Chapter 2 for some additional details) In first pretonic syllables, /a/, /o/  [ ] In first pretonic syllables after C, /a/, /e/, /i/  [i] Other unstressed syllables: /a/, /o/  [] Other syllables following C, /a/, /e/, /i/  []

Resistance: In phrase-final open syllables, /a/, /o/  [ ], as in first pretonic syllables (also associated with more duration than other unstressed syllables). Following palatalized consonants, /i/, /e/  /i/ but /a/  approx. [æ]. Conditions: Phrase-final only. Phrase-internal final vowels are often extremely weak, prone to devoicing or deletion. Reduction to [] is gradient in all cases (see Chapter 2). Phrase-final /a/ and /o/ often described as somewhere between the first and second degrees of reduction. Rate-dependent, greater reduction likely in fast speech or less “conscientious” style.

(7)

Belarusian (Slavic – Czekman and SmuΩkowa 1988: 223–234) Maximal inventory: [i, e, a, o, u] Neutralization:

Basic facts (See Chapter 2 for additional discussion) In first pretonic syllables, /a/, /o/ /e/  [a] In other pretonic syllables, /a/, /o/, /e/  [ ]61

Resistance: In final open syllables62, /a/, /o/, /e/ pretonic syllables.

 [ a], as in first

88

Final syllables Conditions: As in Russian, resistance does not reverse categorical neutralizations, only realization of height of resulting low vowel. Not clear from this source whether resistance occurs in all word-final syllables or only in phrase-final position.

(8)

Ukrainian dialects (Slavic – Shevelov 1979: 218–240) Maximal inventory: [i, , , a, , u] Neutralization:

//, /e/  [] (/u/, /o/  [u])

Details: The dialect geography of vowel reduction patterns in Ukrainian is complex, and cannot be treated at length here. The processes most commonly involved are the raising of /e/ and /o/ toward // and /u/ to one extent or another and in various combinations. The pattern in which the distinction between /e/ and // is categorically neutralized in unstressed syllables is the most common,63 being characteristic of all Ukrainian dialects save the eastern and central Polissian and the Lemkian (northern periphery adj. to Belarusian and far western periphery in Slovakia respectively). Standard Ukrainian has only this reduction (Shevelov 1979, Pugh and Press 1999). Disregarding here less robust instantiations of the change, across-the-board neutralization of /o/ and /u/ is much less general, obtaining primarily in a North-South strip from Bukovyna to Pidljasia (on the western periphery, but excluding some far western dialects [Shevelov 1979: 519]). Resistance: In most dialects reduction does not apply to word-final vowels (sources make no mention of phrase vs. word distinction). In the dialects of the Berestja region (northwestern), by contrast, reduction is apparently resisted in word-final open and closed syllables. Conditions: Shevelov describes reduction as categorical in the dialects where it occurs regularly. Other sources speak of a gradient-sounding process, e.g., the neutralizing vowels having only a “tendency to approximate one another” (Pugh and Press 1999), or of reduction being optional (Janiak 1995).

Phonological strength effects in final position

(9)

89

Brazilian Portuguese (Romance – Major 1981, 1985; Nobre and Ingemann 1987; Simões 1991; Wetzels 1992) Maximal (oral vowel) inventory: Neutralization:

/i, e, , a, , o, u/

Pretonic syllables – /e, /  [e], /o, /  [o] Posttonic syllables – /i, e/  [i], /u, o/  [u]

Resistance: For some speakers the second degree of reduction (posttonic high-mid neutralization) does not take place in prepausal word-final vowels (Major 1985: 266–267). Conditions: The pretonic mid-vowel neutralization is categorical (nonrate dependent, not reversible in careful speech. The post-tonic reduction is gradient (rate dependent, reversible in careful speech). Major 1985 suggests that for those speakers whose lengthened prepausal final vowels reduce nonetheless, posttonic reduction too has become “lexical”. In casual speech, pretonic high and mid vowels are optionally subject to the same (gradient) reduction as posttonic vowels (Major 1985: 266)

(10)

Eastern Mari (Finno-Ugric – Sebeok and Ingemann 1966; Flemming 1993; Kangasmaa-Minn 1998; Majors 1998) Inventory:

Full: /i, y, e, ø, a, o, u/

Reduced: //

Reduction: Stress falls on the rightmost (underived) full vowel of the word. Words containing only reduced vowels take the default initial stress. Only the reduced vowel occurs in non-final posttonic syllables. Neutralization: Posttonic vowels harmonize with the stressed vowel in backness and roundness. Final (non-low) vowels do so in a categorical, neutralizing fashion. Non-final reduced vowels have variants with allophonic fronting and rounding in agreement with the stressed vowel. Resistance: No short (reduced) vowels surface word-finally (Flemming 1993). Results in alternations, whereby root-final or suffix mid vowels surface as reduced when appearing non-finally. I class this with resistance to reduction effects in that all posttonic vowels save those in word-final open syllables are realized as schwa. The disallowing of short word-final vowels, on the other hand, results in the neutralization of potential full vs. reduced contrasts.64

90

Final syllables tlze ludo pørttø

‘moon’ ‘reader’ ‘in the house’

 tlzste  ludn  pørtt%ø

‘in the moon’ (inessive) ‘reader’s’ (genitive) ‘in his house’

Conditions: Described as categorical, affecting all word-final vowels. Varies greatly across dialects.

(11)

Uyghur (Turkic – Hahn 1991) Inventory: /i, y, e, œ, æ, , a, o, u/65 Neutralization: Low, short vowels in non-initial open syllables are raised to [i]. Resistance: Applies only to non-prepausal word-final vowels (Hahn 1991: 52). Conditions: Applies categorically within words but is rate-dependent across word boundaries.

(12)

Central Eastern Catalan (Romance – Recasens 1986) Inventory: /i, e, , a, , o, u/ Neutralization:

Unstressed syllables

/e, , a/ /u, o / /i/

  

[] [u] [i]

Resistance: In a number of dialects in this area, e.g., Oden (transitional between Northeastern and Central eastern), word-final vowels are resistant to the elimination of oppositions associated with unstressed syllables (Recasens 1986: 111).

(13)

English (Germanic – Hammond 1997) Inventory: /i, , e, , æ, , , , o, , u/ Neutralization: Unstressed word-internal preconsonantal syllables contain only [].66 Unstressed word-internal prevocalic syllables contain [i, u, o, e]. Resistance: Word-final open syllables show [i, u, o, ].

Phonological strength effects in final position

(14)

91

Yakan (Austronesian, Philippines, SW Mindanao, Basilan Island – Behrens 1975; Flemming 1993) Inventory: /i, e, a, o, u/ Neutralization: Primary stress is penultimate, with secondaries on alternating syllables preceding. In unstressed syllables /a/  [e]. Resistance: In word-final open and closed syllables reduction does not take place (nor does it in pretonic syllables when the following syllable is also [a]).

(15)

Maltese (Semitic – Puech 1978) Neutralization: In Standard Maltese and closely related dialects, there is a left-to-right rounding harmony which targets all short non-low vowels triggered by [o]. In Siggieha and Gozitan dialect, harmony targets vowels of all heights, affecting specifications for [back]. Resistance: While in some dialects suffix vowels only harmonize if they are in a final closed syllable (with no harmony taking place if additional syllables intervene between the stem and the word-final syllable, even if intervening syllables contain otherwise-eligible vowels), in no dialect do vowels in final open syllables ever harmonize.67

(16) 

Nawuri (Kwa – Casali 1995) Inventory: /i, , e, , a, , o, , u/ Reduction: Underlying short, front vowels centralize: /i, , e, /  [, , , ]. Resistance: Does not apply to absolute word-initial vowels (presumably due to increased duration, see Chapter 2 discussion of Russian vowel reduction). Does not apply to vowels in phrase-final position. Details: Casali notes that in word-internal syllables, centralization is obligatory, even in very deliberate speech. To (phrase-medial) wordfinal vowels centralization applies “postlexically”. Kirchner (1998) uses Nawuri centralization as an argument for the need to treat even universally non-contrastive phonetic properties such as the additional duration of final vowels as phonological. His reasons are as follows:

92

Final syllables There is a lexical process of rounding harmony targeting high non-front vowels which applies from root-to-prefix and holds as a static generalization within roots as well. This rounding harmony does apply to prefixes which undergo centralization (i.e. become non-front), but (of course) does not apply to absolute word-initial prefix vowels (which fail to undergo centralization). Kirchner argues from this that since the output of centralization conditions phonological rounding harmony, centralization (which is sensitive to subphonemic durational variations) cannot be “relegated” to the phonetic component, and hence subphonemic durational features must be available to the lexical phonology. This, however, cannot be right for the following reason: As noted, according to Casali, centralization applies to word-final vowels only postlexically, while its internal application is lexical and obligatory/non-rate-dependent. The lexical application, then, is clearly not conditioned by subphonemic variations in duration, while the postlexical application seems to be. This, then, is another instance of the categorical application of a process word-internally, with gradient application word-finally. Additional support for this view comes from the application of the otherwise-clearly-categorical rounding harmony. Casali notes (1995: 653, note 4) that in addition to its lexical application, rounding harmony also applies postlexically across a wordboundary to (postlexically) centralized phrase-internal word-final vowels. This postlexical application, however, is “optional”, and produces final vowels of only “an intermediate degree of rounding” (viz. applies gradiently). Clearly then centralization and rounding harmony apply both categorically in the lexical phonology and gradiently in the postlexical phonology. Further proof of the dual nature of these processes comes from the absolute initial vowels. The exceptionality of these is clearly due historically to increased duration. Synchronically, however, absolute initial vowels never centralize even postlexically (1995: 655), and hence are never subject to rounding harmony lexical or postlexical (which is to say, their exceptionality, previously a function of their phonetic duration, has since been phonologized, such that it is now derived from their structural description alone. Final syllables retain the earlier state, in which resistance to alterations is still directly contingent on physical duration).

(17)

Shimakonde (Bantu, Mozambique – Liphola 2001) Inventory: /i, e, a, o, u/

Phonological strength effects in final position

93

Neutralization: Optional reduction of pretonic mid vowels /e/ and /o/ to [a]. Stress is fixed penultimate. Resistance: Final vowels never reduce.68

(18)

Bonggi (Austronesian, NE Borneo, Banggi and Balambangan Islands – Boutin and Pekkanen 1993; Kroeger 1992) Inventory: /i, , a, , u/ underlying in native roots (Kroeger 1992: 291) Neutralization:

Pretonic – Non-high vowels reduce to // Posttonic (final syllable) – /a/  []69

Resistance: Does not apply to word-final vowels

(19)

Timugon Murut (Austronesian, NE Borneo – Prentice 1972; Kroeger 1992) Inventory: /i, a, o, u/ Neutralization: /o/ is not contrastive in pretonic syllables. All occurrences are derivable synchronically from right-to-left spreading of [round].70 Resistance: /o/ does occur contrastively in the stressed (penultimate) syllable and final syllables (closed or open) (see note 68 above on Shimakonde, etc.). Furthermore, [o] only surfaces in stressed syllables if the following syllable also contains [o], meaning that the distribution of [o] is actually least restricted in the (posttonic) final. See Section 3.5 below for a cautionary tale, however, on the dangers of extrapolating claims of special final “prominence” from distributional evidence alone.

3.2.3.1. The phonetic roots of final resistance This pattern of final syllable resistance to vowel reduction or assimilation processes is very much expected given what is known about the phonetics of domain-final syllables crosslinguistically. If these processes target vowels which are for whatever reason durationally deficient, final lengthening would naturally serve to exempt the vowels of final syllables from the effects of the duration-sensitive processes. Qualitative unstressed vowel reduction as in Russian or Brazilian Portuguese occurs primarily in

94

Final syllables

languages with strongly duration-cued stress (see Chapter 2). In Nawuri, centralization exempts specifically long vowels, absolute-initial vowels, and phrase-final vowels, as noted, all three positions characteristically associated with increased duration. In Uyghur, raising applies only to noninitial low vowels in open syllables. This may seem odd at first, particularly since Uyghur, like most Turkic languages, has fixed final (not initial) stress. It makes more sense in light of phonetic facts documented in other Turkic languages: Turkic final stress is not strongly correlated with increased vowel duration (see Konrot 1981 for experimental evidence from Turkish).71 In Turkish (Barnes 2002, and Chapter 4 below) and the closelyrelated Turkmen (Mollaev 1980), it has been shown that vowels in wordinitial syllables, regardless of the placement of stress, are realized with longer phonetic durations than the vowels of comparable word-internal syllables. Additionally, in Anatolian Turkish, contrary to near-universal expectations, all things being equal, vowels in closed syllables are longer than vowels in comparable open syllables (Lahiri and Hankamer 1988; Jannedy 1995; Kopkallı-Yavuz 2000; Barnes 2001). The shorter duration characteristic of open syllables in Turkish has the result of conditioning frequent reduction of /a/ to [] in ordinary speech. If these two durational regularities are found also in Uyghur,72 then raising can be seen to fail specifically in initial syllables, closed syllables, syllables with (underlying) long vowels,73 and phrase-final open syllables; these are all positions in which vowels would have characteristic additional phonetic duration. Phonetic studies of final lengthening add empirical weight to this hypothesis. In addition to the temporal augmentation of vowels in domainfinal syllables, several researchers have identified a process of (spatial) articulatory strengthening as well. Edwards, Beckman, and Fletcher (1991) in a study of the gestural implementation of final lengthening in English concluded that while phrase-final stressed /a/ was longer and slower than its non-final counterpart (indicating a local change in gestural stiffness), a phrase-final reduced vowel [] was not only longer, but also enhanced by greater gestural displacement (in their measure of jaw height). Vayra and Fowler (1992) find a strikingly similar pattern in their investigation of supralaryngeal articulatory declination in Tuscan Italian.74 While for stressed vowels (/a/ and /i/) measures of F0, amplitude and jaw opening decreased monotonically from beginning to end in trisyllables uttered in isolation, for unstressed vowels (again /a/ and /i/) the same declination was observed for F0 and amplitude, but measures of F1 and jaw opening showed minima in the second syllable, while unstressed vowels in final syllables showed higher measures, suggesting supralaryngeal strengthening of final unstressed vowels of the same type observed by Edwards, Beckman and Fletcher.

Phonological strength effects in final position

95

Cho (2001), in a detailed study of gestural strengthening in accented and boundary-adjacent syllables in English found articulatory strengthening reminiscent of that found in pitch accented syllables for preboundary vowels as well. Specifically, Cho noted in domain-final vowels what he refers to as sonority enhancement, measured as a lower tongue position and greater lip opening gesture for both phrase-final [o] and [a]. He did not, however, record the lower jaw position found by Edwards, Beckman, and Fletcher and Vayra and Fowler, and which he himself notes for vowels under phrasal accent. Note, however, that Cho measured only vowels carrying at least lexical stress, and then varied their prosodic position within the phrase. Since the lower jaw position in final vowels recorded by both Vayra and Fowler and Edwards, Beckman, and Fletcher was found only for lexically unstressed vowels, it is unsurprising that Cho saw no such effect.75 In addition to this, Cho also demonstrated increased resistance to vowel-to-vowel coarticulation for both accented and phrase-final vowels, another result that provides insight into the phonetic origins of the phonological final strength patterns described above. The fact that so many of the phonetic studies of final lengthening have detected the phenomenon in a robust and consistent manifestation only before higher phrase boundaries likewise makes clear the reason so many of the final resistance patterns do not operate in word-final syllables internal to the phrase. Though studies of the articulatory properties of final syllable vowels have so far dealt with only a very few languages, the frequency of occurrence of final resistance effects (in conjunction with the understandable reticence of most grammar writers on this matter whether the effect is present in the language or not) might suggest that the articulatory characteristics documented for English and Tuscan Italian final lengthening might in fact be properties of the phenomenon in general, and hence that most vowel reduction systems should show at the very least a low-level phonetic version of final resistance. There exist at least two studies of the spectral characteristics of final syllable vowels in languages with unstressed vowel reduction, however, which show this not to be the case: Nord’s studies of vowel reduction and duration in Swedish (1975, 1987), and Johnson and Martin’s (2001) study of vowel reduction in Muskogee Creek. Nord’s studies showed that despite clear application of final lengthening (with final-syllable unstressed vowels approximating initial-syllable stressed vowels), all vowels tested (/e/ and /a/ in 1975, /e/, /a/, /i/ and // in 1987) still underwent reduction. It is, however, worth noting several things about these studies: Firstly, Nord measures only vowels in closed final syllables, which, as will be shown in the following section, in many languages at least undergo less lengthening than vowels in final open

96

Final syllables

syllables (immediately adjacent to the word- or phrase-boundary). Also, while the final-syllable unstressed vowels do reduce in comparison with their stressed counterparts, it is clear from Nord’s data that for vowels other than //, there is a statistically significant difference between formant measurements for final-syllable unstressed vowels and unstressed vowels in initial syllables. Strikingly, for /a/, /i/ and /e/, the final syllable unstressed vowel has a higher F1. Nord seems to take this to be the “coarticulation with the rest position” he claims is responsible for the movement toward a schwa-like vowel particularly characteristic of final unstressed /e/ in Swedish. But the higher F1 for /a/ as well suggests that this is not necessarily the case. In fact, it is more reminiscent of the final sonority-enhancement documented by Cho (2001) for English. If this is true, then final lengthening does inhibit reduction somewhat in unstressed final closed syllables in Swedish.76 A comparable study of open syllables would most likely clarify this matter. Johnson and Martin (2001) show more conclusively that in Muskogee Creek final vowels are generally more centralized than non-final, though final lengthening is operative. This suggests that at least in Creek final lengthening is carried out without concomitant articulatory strengthening. Johnson and Martin wish to attribute this to a “final fade” in supralaryngeal articulations similar to that found in F0 and amplitude at the ends of phrases. They hypothesize that, along with lengthening, decreased gestural stiffness could also cause articulatory reduction, citing Beckman, Edwards, and Fletcher (1992), Vayra and Fowler (1992), as well as work by Krakow (1993) and Vaissiere (1986) showing declination of velum height. As described above, however, the results of Edwards, Beckman, and Fletcher’s later study, and in fact the results of this very study of Vayra and Fowler show just the opposite of final fade when it comes to unstressed vowels, any declination being found only for the stressed counterparts thereof. Also, later work by Vaissiere (1988) demonstrates stronger velic closure in phrase-final syllables than in phrase-medial, again arguing against a general phenomenon of final supralaryngeal fade. To this can be added the work of Keating, Wright, and Zhang (1999), who found increased linguopalatal contact for phrase-final consonants in English, and of course Cho (2001), again demonstrating the link between phrase-final lengthening and articulatory strengthening, the latter in fact specifically for lexically stressed syllables. In sum, at least for unstressed syllables there is very little (if any) direct articulatory evidence for supralaryngeal “final fade”, while at the same time there is an abundance of evidence for the opposite effect: phrase-final strengthening of unstressed syllables, and in some studies stressed syllables as well. While this perhaps casts some doubt on Johnson and Martin’s more

Phonological strength effects in final position

97

general interpretation of their results, it still leaves the facts of Creek unexplained. Were this all simply a matter of phrase-final vowels lengthening while still reducing to the same degree as phrase-internal vowels, it would be possible to account for the application of reduction in the presence of lengthening through phonologization, as the generalization of spectral targets from the phrase-internal variants of word-final vowels to all tokens of those vowels regardless of phrasal position or duration. As noted above, Major (1985) argues precisely this to have taken place for those speakers of Brazilian Portuguese who have generalized reduction of the mid vowels to [i] and [u] even in phrase-final lengthening contexts.77 Even this, however, fails to resolve the entire problem. A number of languages show reduction of vowels only or especially in phrase-final position. Here it is clear that no such generalization has taken place. The phrase-final syllable is simply weaker phonetically than the non-phrasefinal syllable. Two examples of languages apparently displaying this pattern of phrase-final reduction have already been cited.78 I will return to this problem below in my discussion of final weakening effects. For now it is enough to note that if a process of supralaryngeal final fade exists at all, it is far from universal, given the abundant attestation of the very opposite pattern in both phonetic studies and in patterns of resistance to processes which should otherwise have targeted the final vowel. As for final resistance, the foregoing shows clear phonetic motivations for the existence of such a pattern. Studies like that of Johnson and Martin, though, provoke questions about the universality of the implementational aspects of final lengthening which give rise to it. The topic of reduction in final position is taken up again in Section 3.6 below. 3.2.3.2. Syllable structure effects on final resistance In discussion of the phonetics and phonology of final lengthening effects I have thus far alluded to, but not addressed directly the role of syllable structure in the implementation of lengthening. Looking at the typology of final strength effects in the phonology of vowel systems, a clear pattern emerges. In the overwhelming majority of cases in which final syllables behave as strong licensers in one way or another, final strength occurs only in final open syllables (which is to say, affects only absolute word-final vowels). There are very few uncontroversial cases of final strength effects in both open and closed final syllables, and none to my knowledge operating solely in final closed syllables, with final open syllables behaving as weak. Of the first type listed above are Hausa, Russian and Belarusian, nearly all dialects of Ukrainian where reduction exists at all, Brazilian

98

Final syllables

Portuguese, Eastern Mari, Uyghur, Catalan, English, Maltese, Nawuri, Shimakonde, and Bonggi. Showing strength of both closed and open syllables were only Yakan, Murut (to the extent that this is an instance of final strength at all), and apparently the Berestja area dialect of Ukrainian (Shevelov 1979). To this can be added the Polish dialect of the city of Niemirowa nad Bugiem, in which pretonic syllables contain only [i, a, u], while (fixed penultimate) stressed syllables and final syllables (open and closed) contain [i, e, a, o, u] (Janiak 1995). Note however that three of four of these examples have fixed penultimate stress, making it ambiguous whether we have resistance to reduction of final syllables specifically due to final lengthening, or rather failure to reduce posttonic syllables in general (all of which happen to be final), or even failure of reduction in vowels belonging to the head foot of the word, as suggested above. This cooccurrence of fixed penultimate stress and syllable-structure-independent final resistance may well not be accidental. In contrast to the Polish spoken in the city of Niemirowa nad Bugiem, for example, the local Ukrainian dialect has mobile stress and final resistance to vowel reduction in open syllables only. Niemirowa is located in the Berestja dialect region to which Shevelov attributes resistance to vowel reduction in all final syllables. At least this dialect, then, is an exception. It is tempting to ascribe any final resistance of the type Shevelov describes in Berestja Ukrainian to the influence of the Polish of the area which apparently also shows final resistance of this type, and has penultimate stress. In the absence of further information, however, this is purely speculative. With respect to the syllable-structure generalization about final resistance patterns, it is necessary to note that certain of the languages listed above conform to the generalization only vacuously, insofar as they simply lack final closed syllables of any description. Still though, the pattern is clear. It is worth noting finally that this syllable-structure asymmetry could well be at least in part responsible for the fact that the final lengthening study of Edwards, Beckman, and Fletcher (1991) found strengthening of the vowels of final unstressed syllables only. As it happened (and they in fact note), all the unstressed final syllables they measured were open, while all the stressed finals in the study were closed. 3.2.3.3. Final strength and psycholinguistic prominence Recall now the hypothesis sketched in 3.1.2 that final strength effects could be due not to phonetic realization, but to the psycholinguistic prominence of final syllables in general, a prominence which might mandate faithful realization of vowels in those syllables over that of vowels in less

Phonological strength effects in final position

99

prominent positions (just as it did in the study of Kehoe and Stoel-Gammon on children’s truncations in English). The asymmetry between closed and open syllables observed above for final strength effects, however, is deeply problematic for the psycholinguistic-prominence-only theory. Were it psycholinguistic status that gives rise to strength effects, we would naturally expect these effects to be distributed evenly among syllable types (as is the case in the acquisition study), since all final syllables would be equally prominent in this respect. The fact that this is not the case should suggest that the proximal cause of final strength effects is to be found elsewhere. From a phonetic point of view, on the other hand, the syllable-structure asymmetry is readily explicable. A number of studies, including de Jong and Zawaydeh 1999 on Jordanian Arabic; Edwards, Beckman, and Fletcher 1991; Byrd 2000; and Cho 2001 remark on the propensity for vowels in phrase-final open syllables to undergo more lengthening than vowels in phrase-final closed syllables. This effect is generally attributed to the fact that degree of lengthening of a given segment seems to decrease as distance from the boundary increases. When the final syllable is closed, it is the final consonant that undergoes more lengthening, with the vowel lengthening somewhat less than it would have were it in absolute domainfinal position. Byrd (2000) draws a connection between this effect and the results of numerous studies (see Chapter 4), including her own, in which domain-initial consonant articulations are strengthened. Specifically, within a task-dynamic model of speech production, Byrd posits a non-tract variable she calls a -gesture, which straddles the boundary between phrases and causes a local decrease in gestural stiffness, with the effect of lengthening, and conceivably strengthening (but see Cho 2001 for discussion) the segments to either side of that boundary. The effect would be strongest on segments closest to the boundary, which would account for why domain-final vowels and domain-initial consonants are strengthened significantly, while vowels in domain-final closed syllables and vowels in domain-initial syllables with onsets undergo lengthening to a much lesser degree (or in the case of the latter, in fact, at least in English, do not do so at all. Again, see Chapter 4). All this is not to say, of course, that final strength effects in both closed and open syllables could never arise in a system. Indeed, the fact that vowels in final closed syllables do lengthen, if perhaps somewhat less, suggests that the phonologization of non-reductions owing to the phonetic length in both syllable types should nonetheless be possible. From the phonetic point of view, we would merely expect such systems to be relatively less common, which is clearly the case. Ultimately, the durational asymmetry itself may be found only in certain languages, with other languages implementing final lengthening differently. One study

100

Final syllables

by Berkovits (1993b), in fact, of Israeli Hebrew, suggests precisely this. In this study Berkovits demonstrates that vowels in Hebrew undergo lengthening of equal magnitude in both closed and open final syllables. In both final and non-final contexts, however, the vowels in the open syllables are longer than the vowels in closed syllables by a mean 11 ms.. While the phonetic features associated with final lengthening are clearly better suited to account for the patterning of final strength effects than an appeal to the psycholinguistic status of final syllables, I should emphasize that this does not necessarily mean that psycholinguistic status plays no role at all here. It is conceivable, for example, that psycholinguistic status is in some way responsible for the fact that final syllables are phonetically prominent in the first place (though of course given that final lengthening appears to be primarily a phrasal rather than a lexical process, the psycholinguistic prominence in question would have to be related to cuing of boundary position, rather than, e.g., lexical identity – see Chapter 4 for an argument that the same is true of initial strength effects.). Be this as it may, I claim here that it is precisely the specific phonetic characteristics of segments realized in final position which give rise to crosslinguistic regularities in the forms and distributions of final strength effects. It is therefore these phonetic characteristics to which we must turn for an explanation of those regularities. 3.3. Why final strength is different from stressed syllable strength At this point it is possible to return to the general approaches to Positional Neutralization sketched and contrasted in Chapters 1 and 2. For stressed syllables a phonetically motivated approach like Licensing-by-Cue, despite the serious failings it was shown to have in Chapter 2, does at least make the prediction that vowel reduction should occur most readily in languages in which duration is a primary cue for stress, a prediction which a purely structural approach of the type I have labelled Pure Prominence obviously fails to make. For final syllables too these two approaches differ in the degree of success they enjoy in accounting for the typological regularities observed in the patterns and distributions of phonological strength effects. The Pure Prominence approach again does not predict the existence of regular restrictions on the nature of strength effects in final position, nor does it predict the differences observed between the strength effects found in stressed and final syllables.79 In an approach making no explicit connection to phonetic factors, one in which the strength or weakness of a given position is merely the possibility, stipulated in the grammar, for Faithfulness or Markedness constraints to be parametrized to that position,

Why final strength is different

101

a given neutralization pattern should be just as likely to surface in one strong position as in any other. The typological facts tell us otherwise, of course, making the abstract grammatical approach alone a poor predictor of crosslinguistic typology. An approach incorporating phonetic information into the grammar, on the other hand, such as Steriade’s Licensing-by-Cue or other Direct Phonetics approaches such as that of Flemming (2001a) do far better in this respect. In the sections above I have demonstrated the striking isomorphism between the distribution of phonetic duration and the occurrence and patterning of strength effects in final position. Now we can take up the differences between the strength effects in stressed and final syllables and examine the extent to which those too follow from the phonetics of those positions. As noted, while strength effects in stressed syllables (generally but not always the result of unstressed vowel reduction) are both common and robust in languages with duration-cued stress, final strength effects seem both less common and, when found, often not so unequivocal as their stressed-syllable equivalents. Specifically, stressed-syllable strength effects commonly make the stressed syllable the sole licenser of the entire vowel inventory in a given word. In the examples of final strength given above, the final syllable is rarely the unique strongest licenser in the word. The strength effects found there are in fact often best characterized as exceptions to a more general pattern of neutralization (what I have termed final resistance above). Of course, this distinction may not hold in all cases, and is not meant as a formal or definitional statement. Indeed, it is often difficult if not meaningless to designate one strong position “primary” and another “exceptional” in this way.80 In Uyghur , for example, is low-vowel raising primarily an instance of initial-syllable strength, closed-syllable strength, or even bimoraic vowel strength (despite the phonetic manifestation of the vowel length distinction in Uyghur being somewhat ambivalent)? There are a number of phonetically-based reasons why stressed syllables and final syllables might manifest different types of strength effects, and why the final-syllable equivalent of unstressed vowel reduction seems not to exist. The first comes from Cho’s characterization of the gestural correlates of English final strengthening. While Cho does find sonority enhancement (essentially, slight lowering) of vowels in final position, he does not note any of the featural enhancement effects that he describes for accented syllables, and which were reported by de Jong in his 1995 study characterizing accent as local hyperarticulation (affecting all dimensions of vowel articulation, such that [i] would be raised rather than lowered, while [a] would be lowered rather than raised). This would

102

Final syllables

account for the resistance of final syllables to processes decreasing vowel height, but would not necessarily make all vowels perceptually more distinct (in fact, see below for the suggestion that this sonority expansion may in some cases contribute to the development of the phonological weak licensing capacity of final syllables as well), and certainly would not make them as able licensers as stressed syllables. High vowels in these syllables, for example, would be realized lower than their peripheral stressed counterparts. The resistance to coarticulation Cho describes for final vowels is clearly relevant here as well in the development of resistance to assimilatory processes as in, for example, Maltese. The second reason why final syllables are less robust and consistent strong licensers than stressed syllables has to do with the level at which the relevant phonetic strengthening effects take place. In vowel reduction systems, there is a substantial durational asymmetry between lexically stressed and unstressed vowels, even if the stressed syllable does not ultimately carry a nuclear pitch accent. This asymmetry is thus present in all potential tokens of a given lexical form. Final lengthening, on the other hand, the phonetic source of final strength effects, has been shown in many if not all languages to be strongest and most consistent before higher-level phrase boundaries (such of those of IP or Utterance). At the word level, by contrast, final lengthening is often realized inconsistently and to a much weaker degree than in phrase-final syllables. In Italian, in fact, Farnetani and Kori (1990) claim that consistent, robust final lengthening takes place only utterance finally, with the phrase-final effect being somewhat ephemeral and less consistent (a pattern which they attribute to the “syllable-timed” rhythm of the language). This means ultimately that relatively few tokens of a given lexical form will be pronounced with substantial final lengthening. While the consistency of the durational asymmetry between lexically stressed and unstressed vowels then makes this pattern an extremely likely candidate for eventual phonologization (as categorical vowel reduction), the opposite pattern for final lengthening makes phonologization less likely, with patterns often remaining gradient and phrase-level indefinitely (as in Uyghur). If this inconsistency in the durational asymmetry plays a crucial role in determining the typology of final syllable effects, another difference between stressed and final syllables may contribute as well. With final lengthening, the duration of the domain-final syllable increases in some cases quite significantly; this does not mean though that the domaininternal syllables in the word are made in any way durationally deficient. On the contrary, they are as long (or short) as they would be in any other token of the same word at the same rate of speech, such that their non-final status alone does not cause them to be realized with a duration which

Catergorical and gradient effects

103

would in any way compromise them articulatorily or perceptually. There is even evidence from a number of studies (e.g., Berkovits 1993b; CambierLangeveld 1999) that phrase-final lengthening often increases to some extent at least the durations of segments in the word prior to the final syllables as well, though such lengthening decreases in magnitude as distance from the boundary increases. Nonetheless, this failure of the process to single out the final syllable as the unique locus of additional durational prominence would serve to decrease the likelihood that the effects of such an asymmetry would ultimately become categorical through phonologization. The asymmetry in the unstressed vowel reduction case, on the other hand, is quite different. In most systems of phonological vowel reduction, it is not merely that the stressed syllable is longer than the unstressed syllables, but rather that the unstressed syllables (or some subset thereof) are actually durationally contracted, or realized with durations sufficiently short regardless to cause articulatory undershoot (see Chapter 2 for details), which in turn causes certain contrasts to become difficult to perceive with accuracy. Thus, while the additional duration of final syllables may allow them to resist processes targeting non-prominent vowels, by itself it is unlikely to lead to a situation in which the final syllable alone licenses the system’s full vowel inventory, while all non-final syllables are reduced to some subset thereof. Indeed, even Hausa (Section 3.2.2.1 above) is not unambiguously such a system, with the phrase-internal neutralizations targeting specifically the short vowels, which by all accounts are by nature short and variable enough to overlap significantly in casual speech if not subjected to some additional lengthening. We might then expect other such systems to arise in languages with non-duration-cued accentual systems and a vowel quantity distinction such as that found frequently in IndoIranian languages like Kurdish or Punjabi, in which the short vowels are strongly centralized and variable.81 3.4. Categorical and gradient effects in final strength systems The preceding sections have presented arguments for viewing the typology of final strength effects as a consequence of the peculiarities of the implementation of final lengthening. The phonetics of final lengthening were shown to account for the fact that final strength effects are often found at the level of the phrase rather than the word, that final syllables are rarely if ever the sole strong licensers of vowel contrasts in the word domain, and that final resistance in the vast majority of cases targets only the vowels of final open syllables, rather than final syllables of any shape.

104

Final syllables

A Pure Prominence approach to PN effects in final position predicts none of this. In this section I present a sketch of how the phonetic details treated above are transformed into phonological patterns through the medium of phonologization. In many ways the process is similar to that described for stressed syllables in Chapter 2, though here additional details (including interaction with the phonologization of UVR) make this process worth additional attention. Just as with UVR, final strength effects start off gradient, and become categorical through the process of phonologization. This is even clearer, in fact, in the case of final syllables, where the phonetic roots of the effect are most robustly observed in the gradient application of phrase-final lengthening. A final resistance pattern might thus emerge in the following way (taking unstressed vowel reduction to be the operative neutralization pattern. See Chapter 2 for background). The first stage is the development of a gradient vowel reduction process, yielding a system like that of Bulgarian: Unstressed vowels are realized with shorter durations, compromising the achievement of certain articulatory targets to some extent. For expository purposes, we will refer to the phonological features whose implementation is so compromised as [F], though as Chapter 2 makes clear, we are typically dealing with the jaw opening gestures necessary to realize vowel height contrasts in UVR systems. In the system at hand, in any case, the degree to which reduction of [F]’s implementation takes place is a function of vowel duration (as Russian non-first-pretonic syllables were shown to be). Assuming robust enough phrase-final lengthening, gradient phrase-final resistance à la Russian or Brazilian Portuguese is present from the outset of the development of vowel reduction, emerging automatically as a consequence of the duration-dependent reduction of vowels.82 Note that, in terms of Keating’s window model of phonetic targets laid out in Chapter 1, we must assume that at this point phonetic target windows for the realization of [F] are relatively wide, allowing for variability as a response to durational pressure. The second stage in this process is marked by the phonologization of part or all of the word-internal reduction pattern. In Uyghur, for example, word-internal vowel raising is clearly categorical. It occurs whenever its structural description is met, regardless of the physical duration of the segments involved (as evidenced by the raising of new long vowels produced by compensatory lengthening). In word-final position, however, no categorical neutralization is implemented in the phonology. While the consistently shorter durations of word-internal vowels has led to the phonologization of reduction there, the more variable durations of final

Catergorical and gradient effects

105

syllable vowels has prevented this phonologization from occurring. There are now two “reduction” processes in the language in question, one categorical, applying word-internally, and one gradient, applying wordfinally. In Pure Prominence terms, we could generate the categorical end of this situation with the following constraint ranking, where again, this is UVR, and [F] stands in for some set of disprefered vowel features. (20)

Ident[F]/σ@, Ident[F]/σ]PW

>>

*[F]

>>

Ident[F]

The ranking of the general markedness constraint against [F] above the analogous general faithfulness constraint will remove instances of [F] from any part of an input string where it is not protected by a higher ranking Positional Faithfulness constraint; in this case, the two high-ranking constraints mandating the faithful realization of [F] in stressed syllables and final syllables respectively makes certain that only in those two locations will [F] ever occur in the output of the phonology, and thus have a chance at phonetic realization. But this is still only a chance. The resulting output forms with [F] present in stressed and final syllables are now passed along to the phonetics, where short vowel durations still threaten the realization of [F]’s articulatory targets as much as they ever did. Stressed syllables are never subject to such shortening, meaning that [F]-targets are typically reached in that context. In final syllables though, as before, phrase-final vowels are long, allowing for the canonical productions of [F], but phrase-internally vowels are short, yielding undershoot of [F]targets, or phonetic (but not phonological) vowel reduction. The process is still gradient and rate-dependent, taking place when the vowel is shorter (phrase-internally or in fast speech), and meeting resistance when the vowel is longer (phrase-finally or in careful speech). From the point of view of the window model again, the phonologization of reduction wordinternally means that the original [F] specification with a wide target window permitting gradient reduction has been replaced by the phonology with a [–F] specification, the target for which can be assumed to be narrow. Instead of, as before, trying to realize [F] but failing, the speaker is now actively trying to realize [–F]. In word-final position, however, the old wide-target [F] remains in place, unaltered by the phonology. In such a system, then, the internal categorical reduction pattern is an innovation, while the final-syllable gradient pattern is a relic from a previous system. It is appropriate at this point to underscore once again why it should be that the reduction effect is phonologized only wordinternally, leaving a gradient effect in the final syllable, while the reverse seems not to occur. As noted above, phonologization of an effect is most likely where that effect is robust and consistent. This is the case for internal

106

Final syllables

unstressed syllables in the immediate predecessor to a system like Uyghur. Within a given speech style or tempo (to the extent that such an entity exists monolithically over time in connected speech), word-internal unstressed vowels will be subject to relatively little durational perturbation, comparatively free as they are from the effects of processes like boundaryadjacent strengthening. Word-final vowels, on the other hand, will undergo substantial durational variation depending on the position of the word within the phrase, meaning essentially on the degree of application of the final lengthening effect. The resulting inconsistency of application of the gradient vowel reduction process to vowels in word-final position will hinder the process of phonologization of reduction in that position, giving it a crosslinguistic tendency to remain gradient there long after reduction is phonologized in other positions within the word. Note now that, as sketched above in (20), for systems like Uyghur or Nawuri, the fact that the final syllable undergoes the reduction or assimilation process only gradiently means that the failure of the categorical version of that process to apply must be specifically mentioned in the phonology (via a Positional Faithfulness constraint in the example above). This explicit mention of certain structural positions as apparent arbitrary exceptions to a process is something that proponents of a direct approach to phonetics such as Kirchner argue is undesirable, and a primary reason for allowing the phonology to refer directly to the phonetic features that unify all exceptional positions as a natural class of sorts. In Uyghur then, if we could just say that raising applies to all vowels under X ms. of duration (formalized as e.g., Kirchner’s [+ partially long]), we would avoid having to arbitrarily stipulate that long vowels, initial vowels, closed syllable vowels and phrase-final vowels all fail to undergo the process, since all these could be marked [+partially long]. This approach does simplify the description of the process, and brings its original phonetic motivation directly into its description, but in so doing it oversimplifies the specific characteristics of the implementation of the process, and thus obscures its true nature. As noted above, in Uyghur all underlying long vowels fail to undergo the raising process, despite the fact that the durational distinction between these and underlying short vowels has been largely effaced with time. Furthermore, the process does apply to new long vowels created by compensatory lengthening, despite the fact that they must clearly be over whatever subphonemic durational threshold we take [+partially long] to encode. In other words, the process has been phonologized. It applies categorically word-internally, without regard for matters of phonetic duration. The exceptional final syllables, on the other hand, do not share this property. They are said to optionally undergo raising non-prepausally, which I take to mean that raising applies to them

Catergorical and gradient effects

107

only in such case as they actually are under some minimum durational threshold, which is to say, it applies gradiently. We are dealing, in other words, with two distinct processes, one of which is responsive to the realization of phonetic duration, the other of which is not. That the latter includes a list of arbitrary-seeming exceptions to its application (in whatever form) is not a shortcoming. Rather, it is a virtue, in that synchronically those exceptions really are arbitrary. The relevant structures are exceptional regardless of their precise phonetic realizations. At this point further developments in our UVR system can go one of two ways. The first possibility is for the final syllable to follow its wordinternal counterparts implementing phonologized reduction. This amounts to the generalization of exemplars from the gradient system with the greatest amount of reduction (e.g., the phrase-internal or casual speech realizations). Generalization of these variants has the effect of eliminating the final strength effect entirely from the system. This is the situation for those speakers of Brazilian Portuguese whom Major (1985) describes as having lexicalized the reduction of word-final mid vowels such that it applies even under conditions of final lengthening. For such speakers presumably all posttonic vowels are categorically reduced in this fashion, with finals simply ceasing to be an exception. In the Pure Prominence model of UVR sketched above, this amounts to the demotion of the Positional Faithfulness constraint referring to final position to a place in the constraint ranking somewhere below the markedness constraint against [F], resulting in the effacement of [F] from all syllables but those bearing stress. This development is shown in (21). (21)

Ident[F]/σ@

>>

*[F]

>>

Ident[F]/σ]PW, Ident[F]

In the window model we are developing, final syllables now come to share with other unstressed syllables the narrowly targeted [–F] specification. The other logical possibility is more difficult to locate with any certainty in the languages of the world. In principle, in place of the gradient reduction pattern persisting in final syllables, a language could come to generalize not the more reduced, but rather the less reduced variant of the final syllable vowel. This would yield a system in which no reduction, gradient or categorical, would be found in this position. Of course without experimental evidence (of which there is a limited supply on this question for languages which are neither Romance nor Germanic), it is not possible to be certain that this has taken place in a given system. Descriptive studies are typically silent on this matter altogether, and those that aren’t could always simply be mistaken. Nonetheless, were such a system to arise, it would amount to the final expiration of the vestigial gradient reduction

108

Final syllables

pattern altogether, even in its last refuge, the domain-final syllable. Phonologically speaking of course, this generalization of the unreduced form amounts to no change at all in the system – representations and constraint rankings remain unchanged. In terms of phonetic target windows, this development would involve resizing of the target window for the realization of [F]; an earlier wide window allowing reduction is replaced by a narrow window preventing it; in this system the speaker decides to realize a good [F] in final position at all costs, even if that means increasing articulatory effort substantially. Taking English as a possible example of such a system, and assuming an earlier stage of the English system in which final /i/, /o/ and /u/ were left unscathed by the categorical phonology (i.e. they did not become phonological schwa as did their wordinternal preconsonantal counterparts), but were subject to gradient qualitative reduction in the phonetics, the modern situation in which /i/, /o/ and /u/ remain relatively peripheral in final open syllables is the result of the suspension of that gradient reduction pattern, which at a still-earlier point must have been operating generally throughout the word.83 Eastern Mari, described above, might be another system of this sort, though experimental verification would be necessary to be confident of it. The sections above provide a phonetic explanation for the observed differences between phonological strength effects found in stressed and final syllables, and provide a unified phonologization scenario accounting for the development of a range of systems of final strength effects, relating them one to another in the process. But we have yet to discuss what is perhaps the most crucial factor in accounting for the ambiguous status of final position within the typology of Positional Neutralization. This is the alarming (in terms of the discussion so far) propensity for phonetic and phonological weakening to occur in those very same syllables whose propensity for strengthening we have just reviewed in some detail. 3.5. On assumptions concerning licensing capacity and phonetic prominence Nearly all the instances of phonological final resistance seen in the catalogue in Section 3.2.3 above are cases in which more than one position within the word fails to undergo some reduction or assimilation process, and in each case the exceptional positions also share the characteristic of having relatively longer phonetic durations than the reducing or assimilating positions. The case of Brazilian Portuguese, for speakers for whom both pre- and posttonic reduction is phonologized, even has three distinct levels of licensing capacity (stressed, pretonic, and posttonic),

On assumptions concerning licensing capacity

109

which correspond directly to three characteristic degrees of phonetic length. The case of the Pasiego Spanish tense/lax contrast was presented above as evidence of the danger present in assuming automatically that the ability to license additional contrasts in a system must be correlated with an increased amount of phonetic duration characteristically realized in the relevant position. Another example, that of Timugon Murut, illustrates a similar danger in the assumption, even when duration is involved in determining licensing capacity in a system, that the relationship between these two is a simple direct proportion. As outlined above, Timugon Murut, an Austronesian language of Sabah in Borneo (Prentice 1971; Kroeger 1992), has four contrastive vowel qualities: /i, a, o, u/. These all contrast, however, only in the (fixed penultimate) stressed syllable and the posttonic final. Steriade notes further of Murut that “the contrast of rounding among non-high vowels occurs also – in a more limited fashion – in the stressed syllable, which occurs in penult position. Thus final [o] contrasts with [a] in words of any shape, whereas penult [o] contrasts with [a] only in words in which the final is an [o]”. (Steriade 1994: 13) This is illustrated in (22): (22)

Distribution of non-high vowels in Timugon Murut a. tanom baloy limo ilo

‘plant’ ‘house’ ‘dew’ ‘look!’

b. bolos onto lopot

‘voice’ ‘smell of burnt rice’ ‘wrap up’

*bolas *onta *lopat

As for the non-high vowels in pretonic position, here there is no contrast between /a/ and /o/. Their realizations can be summarized as follows: Where the tonic and posttonic are /o/, all contiguous pretonic non-highs vowels are realized as [o]. Where the tonic and posttonic are not [o] (or the pretonic non-high vowel is separated from these by a high vowel), pretonic high vowels are realized as /a/. Kroeger (1992) follows Prentice in seeing restrictions on the realization of [o] as a right-to-left harmony pattern triggered by the combined presence of [o] in the tonic and posttonic syllable, a rounding harmony with a bisyllabic trigger rather like that treated by Walker (2001a) for Classical Manchu and Tungusic. As seen in (22) above, [round] does not spread from the posttonic to the tonic. The harmony pattern is shown in (23), where in (a) and (b), addition of a suffix

110

Final syllables

without [o] causes preceding non-high vowels to become [a], whereas in (c) presence of [o] in both stressed and final syllables conditions the leftward spread of [round] to contiguous non-high vowels. (23)

Rounding harmony with a bisyllabic trigger in Timugon Murut a. orop-an ongoy-an in-abot-an

  

arapan angayan inabatan

‘perch’ (Referent Focus) ‘go’ (RF) ‘belt’ (RF, past tense)

b. tanom-in ongoy-in sigo-in

  

tanamin angayin sigain

‘plant’ (RF) ‘go’ (Locative Focus) ‘spy on’

c. tanom-on patoy-on pa-sakoy-on mapa-ongoy

   

tonomom potoyon posokoyon mopoongoy

‘plant’ (Object Focus) ‘kill’ (OF) ‘cause to mount’ ‘cause to go’

Since the final syllable is the most unrestricted licenser in the word in Murut, it is tempting to imagine that this perhaps owes to some particularly outstanding phonetic prominence, a prominence which allows it to excel even the stressed syllable in licensing capacity. This, however, would in all likelihood be incorrect. I have no instrumental data on Timugon Murut from which to determine the precise prosodic characteristics of the final syllable, but a diachronic approach to the problem makes clear both the source of the current distribution of vowels in the language and the prominence relationships which must have led to them. Two preliminary facts are necessary to understanding the situation. First, the vowel transcribed ‘o’ in Murut (and many related neighboring languages) is generally not realized as IPA [o]. In Timugon Murut this vowel is [o] only before [w] (Prentice 1971: 19). The base symbol used by Prentice in phonetic transcription, [], is also somewhat misleading, as the vowel is realized with this quality only before velar consonants in closed syllables. The elsewhere realization of ‘o’ in this language is apparently something more like [)], what Prentice describes as a “voiced lower-mid central half-rounded vocoid” (1971: 19). Realization of this vowel also varies significantly among the Murutic, Dusunic, and Paitanic languages of the area (Boutin and Pekkanen 1993). In Kimaragang (Dusunic, Kroeger 1992), it is realized as [] under stress, while other Dusunic languages have it as “a back unrounded or only slightly rounded vowel, roughly [], with considerable tensing of the tongue back” (Kroeger

On assumptions concerning licensing capacity

111

1992: 280). Kroeger even refers to it as “the neutral vowel” in Kimaragang. None of this is unexpected, though, in light of a second consideration. The historical source of /o/ in most cases in the languages in question is *PAN // (Robert Blust, personal communication). While the system of realizations of the pretonic vowels differ significantly among some of the languages in question, Timugon Murut shares with Dusunic Kimaragang its distribution of [o] in the penultimate and final syllables (Kroeger 1992). This is the older part of the system, the pretonic harmony being secondary. It is the development of this part of the system, additionally, which gives the lie to the assumption that the final is in some way prosodically (or perhaps better, phonetically) stronger than the tonic. Facts of the emergence of the distribution of [o] in these positions are the following (*PAN and *Proto-Malayo-Polynesian reconstructions from Blust 1999, 2000, Dusunic Tatana’ forms from Pekkanen 1993, Kadazan/Dusun from Miller 1993): *PAN // becomes /o/ regularly in final syllables. (24)

*PAN // in the Murutic and Dusunic final syllables *PMP *PAN *PAN *PMP

tanm qaNb Sips nipn

> > > >

TM Tatana’ Tatana’ Tatana’

tanom aob lipos dipon

‘plant’ ‘door’ ‘cockroach’ ‘tooth’

Given that the vowel in question is derived from the overshort *PAN // along with what we know about the realizations of this vowel in the modern languages, this should occasion no surprise, and certainly does not require the assumption of special prosodic prominence in the final. The opposite, in fact, may be more plausible, in light of the fact that in the stressed penult, *PAN // becomes TM [a]. This development, central to the formation of the licensing asymmetry described above, should actually be understood as a strengthening of schwa under stress, making its realization lower and longer than in the weaker final syllable. Several examples of changes of this sort, in Cho’s (2001) phonetic sonority enhancement and the Positional Augmentation of Smith (2002), have already been exemplified in the previous chapter, and more come in the discussion of final lowering in 3.6. The heart of the strong licensing capacity of the final, then, originates in the fact that it is prosodically weaker than the tonic, not stronger. The dependent licensing of [o] in the tonic is the result of an exception to this pattern, whereby *PAN // > TM [o] if the final also contains [o]. While such “support” phenomena commonly account for the failure of certain vowels to reduce (see

112

Final syllables

discussion of harmony and reduction in Chapter 4), it is less usual that a vowel should resist strengthening due to the presence of an adjacent vowel. Nonetheless, it must be concluded that the presence of adjacent [o] in the penult is enhancing the perception of stressed [o], preventing it from being realized low and unround. Indeed, the combination of the fact that the final syllable is comparatively weaker prosodically than the tonic, and the fact that whatever rounding was present on the reflex of schwa cannot have been very great (given its reflexes in the daughter languages) makes it tempting to see the emergence of stressed [o] as a metaphony-like change. In such a scenario, the already-weak rounding, realized in a prosodically weak position, is heard instead on the stressed non-high vowel, yielding reinterpretation. See the discussion of metaphony above in the section on Pasiego Spanish for various takes on this phenomenon. If this is the case for Murutic/Dusunic, though, then the limited licensing of [o] in the stressed syllable is the result of the prosodic weakness of the final, and cannot be considered a final strength effect at all. Examples follow in (25): (25)

*PAN // in Murutic/Dusunic stressed syllables a. *PAN tlu *PAN Spat

> >

b. *PMP nm *PMP dmdm *PAN gem84 *PMP depa

> > > >

talu Kadazan/Dusun apat Tatana’

onom rondom gomgo lopo

‘three’ ‘four’ ‘six’ ‘dark’ ‘fist’ ‘fathom’

The realization of original */a/ as [a] in the stressed syllable regardless of the presence of [o] in the final can be seen above in (22) (e.g., tanom – to plant). Adding to the asymmetry, however, */a/ is also realized as TM [o] in final syllables of a certain shape. Since the most regularly attested of these seems to be word-final /a/ (and since a > [o] here makes little sense as a final strength effect), this appears to be the result of a final reduction process such as those discussed in 3.6.5, a hypothesis strengthened by Kroeger’s (1993: 38) observation that in Kimaragang, there are few vowelfinal words, and what final non-high vowels ([a], [o]) exist are realized with a “slight breathiness or aspiration”. The regularity of the change in some of these environments (e.g., /_y#) is still uncertain, and /_h# could simply be the result of an earlier loss of final [h], leaving /a/ final:

On assumptions concerning licensing capacity

(26)

113

*PAN /a/ > [o] in Timugon Murut posttonic syllables a. /_#

*lima *duSa *kita *tuba *mata

b. /_j#

*PMP m-atay ‘to die’ > patoy ‘die' *PMP sakay-an ‘vehicle, ride in’ > sakoy ‘mount, ride’

c. /_h# *qumah

> > > > >

>

limo duo kito tuo mato

umo

‘five’ ‘two’ ‘see’ ‘fish poison’ ‘eye’

‘cultivated field’

As for the pretonic syllables, the right-to-left harmony exhibited in Timugon Murut makes most sense in light of facts from other related languages, such as Kimaragang, in which Kroeger notes that pretonic /a/ and /o/ both “tend to reduce to schwa in normal speech”. (1993: 38) In Timugon Murut, then, this reduction must also have been present. Where the stressed vowel was [o], preceding [] would have assimilated in rounding to the tonic vowel, while in the presence of other tonic vowels, it remained []. At some point in later development, phonetic reduction of pretonics must have become less extreme, such that earlier reduced pretonic non-high vowels were replaced with the fuller variants found there today. Where the schwa was rounded, the result was [o], and where it was not, the result was [a]. In this account, rounding simply spreads from a round tonic vowel to preceding reduced vowels, an unremarkable scenario from the point of view of vowel-to-vowel coarticulation. The appearance of a “bisyllabic trigger” is purely accidental. Only a stressed [o] is necessary, but this stressed [o] happens only to be found as the product of the metaphonic assimilation from final [o], as described above. Note that with respect to the behavior of pretonic syllables, the licensing of [o] in finals d o e s depend on some additional phonetic prominence. Despite the weakness of this position relative to the stressed syllable described above, the final must nonetheless have some additional duration resulting from final lengthening such that it resists the reduction to schwa which takes place in pretonic syllables. There is, therefore, a final strength effect in Timugon Murut, but only relative to these reduced pretonics. The remainder of the distributional restrictions on [o] follow entirely from other factors. The following illustrates the development of some of the alternations discussed above:

114

(27)

Final syllables

Rounding and Unrounding in Timugon Murut *tanm > tanom (stressed /a/ > [a], final schwa > [o]) /tanom-in/ > tanamin ‘plant’ (Referent Focus) (stressed *// > [a], pretonic non-high > *[] > [a]) /tanom-on/ > tonomon ‘plant’ (Object Focus) (stressed *// > [o]/_Co(C)#, pretonic non-high > *[] > [o])

3.6. Final Weakening The weakening, often leading to deletion, of word-final segmental material is well-known to historical linguists and synchronic phonologists alike, far moreso no doubt than the final strengthening trends discussed above. One sees repeated without qualification in study after study, in fact, the claim that final position is “weak”, usually in the sense that the speaker deigns to spend less “articulatory effort” in this position. The Russian phonetician Shcherba, writing in 1912, refers to “a general flaccidity85 of the articulatory organs... coming on toward the end of the word”. Gauthiot’s 1913 monograph, La Fin de Mot en Indo-Européen, is an encyclopedic treatment of the phonological history of final syllables in Indo-European; a narrative of what fell off where and when. More recently Hock (1999), in an article on phonological processes in phrase-final position concludes that final position is a monolithically “weakening” environment, deriving a number of distinct phenomena found there from this general tendency (i.e. final devoicing, stress retraction, vowel loss).86 This conclusion is seriously at odds with the findings catalogued above. For consonants, previous work suggests a solution to this contradiction. Certainly the study of Keating, Wright, and Zhang (1999) shows a marked strengthening effect for consonants in phrase-final position in English, such that their linguopalatal contact becomes in some cases equal to that of comparable word-initial consonants in the same phrasal position,87 a pattern standing in direct opposition to the idea of phrase-final articulatory weakness. On the other hand, Steriade (1997) analyzes the tendency for a variety of consonantal contrasts to be neutralized in word-final position (and elsewhere) as stemming from a lack of robust cues to consonant identity in that environment. Specifically, lack of CV transitions and release burst obscures cues to consonant place and laryngeal specification. We may add to this the sharp amplitude drop extremely common if not

Final weakening

115

universal in phrase- or utterance-final syllables, and along with that the tendency to phrase-final voicelessness cited by Hock, and what emerges is a picture of the extreme perceptual weakness of consonants in final position, despite whatever articulatory strength they may exhibit simultaneously. Regardless of degree of linguopalatal contact or strengthening of any other supralaryngeal articulations, the perceptual difficulties associated with final consonants would suffice to engender such changes as neutralization of laryngeal features (voicing, aspiration, etc.), neutralization of place features, or even ultimately debuccalization and loss. In many instances, perceptual factors could ultimately win out in determining the fate of the final consonant even if the type of articulatory strengthening documented by Keating, Wright, and Zhang were operative. Turning now to vowels, I will propose a similar explanation of the contradictory phonological trends encountered in final position. Before turning to the relevant vowel-quality neutralization patterns, I will review a number of other phonetic tendencies commonly found in domain-final position. These are relevant here in that, while they are natural concomitants of the phonetic final lengthening described above, their effects on the perceptual robustness of vowel contrasts realized in final syllables are the opposite of those of final lengthening and strengthening. In other words, these phonetic patterns have the effect ultimately of obscuring the realizations of vowel contrasts in final position. This phonetic anti-prominence, in turn, can not only counteract the prominence enhancing effects of final lengthening, but can even result in the phonologization of neutralizations taking place specifically in domain-final syllables. It is these phonetic factors which are responsible for the phonological ambiguity of final syllables with respect to patterns of positional neutralization. Because final syllables have received little attention in the context of PN in previous literature, and because even the phonetic literature is rather one-dimensional, focused primarily on phrasefinal lengthening, the following sections have as an ancillary purpose the documentation in as much detail as possible of the crosslinguistic range of these patterns. Each section will therefore be quite extensive. 3.6.1. Devoicing The first such effect to be discussed, touched upon already above with reference to Hock’s weakening theory, is the tendency to phrase-final voicelessness. The significance of this effect for consonants is commonly known, so I shall limit the discussion here to the consequences of this effect for the realization of vowels. Domain-final devoicing of vowels is

116

Final syllables

extremely common crosslinguistically. It ranges from the radically lowlevel, sporadic effect rarely even mentioned in descriptions of languages where one finds it such as, for example, Russian, to the heavily morphologized devoicing patterns in the “utterance final” forms of Oneida analyzed by Michelson (1999). It often figures in descriptions as the addition of aspiration, or the epenthesis of final /h/ (as in Sapir and Hoijer’s description of Navaho [1967]). It is also a common phonetic source of final vowel deletion, as can be seen in the Austronesian languages discussed in Blevins (1997) and Blevins and Garrett (1998). Final vowel devoicing is generally assumed to have its phonetic roots in the steep drop in subglottal pressure associated in most languages with the ends of phrases or utterances along with final lowering of F0 (Dauer 1980; Gordon 1998). This decreased subglottal pressure can result (particularly with a close supralaryngeal constriction downstream) in the elimination of the pressure drop across the glottis necessary for voicing to be maintained. In this original form we may conceive of it as a form of passive devoicing (as discussed in, e.g., Ladefoged and Maddieson 1996: 49), whereby devoicing occurs as a consequence of some other phonetic process, rather than as a result of the planned implementation of, e.g., glottal abduction. Gordon (1998) presents a crosslinguistic survey of vowel devoicing in both internal and final positions, in which he makes a number of generalizations of relevance to this discussion. The first is that in nearly all cases the presence in a language of word-internal vowel devoicing implies the presence of final vowel devoicing, though the opposite is not true. Gordon cites four apparent exceptions to this generalization, noting that in three of them (Turkish, Azerbaijani, and Montreal French) stress is fixed on the final syllable.88 We will return to the topic of the devoicing of stressed vowels later in this section). A second generalization which emerges from Gordon’s survey and is mirrored in my own is that the devoicing of wordfinal vowels in a language implies the devoicing of phrase-final vowels, while the reverse is not true. There are many languages in which only phrase-final vowels are subject to devoicing (and deletion, most likely a later consequence). This pattern falls out directly from the phonetic facts discussed above concerning the regular extreme lowering of subglottal pressure in phrase- (but not word-) final syllables. From a diachronic point of view, this almost certainly means (as it does for word-final consonant devoicing), that the pattern gets its start in phrase-final position, and is extended only subsequently to phrase-internal word-final vowels. In this respect devoicing mirrors the final strength effects discussed above in following what is the usual view of the development of word-edge phonology in traditional historical linguistics.

Final weakening

117

From this diachronic scenario an important prediction concerning both the strengthening and devoicing patterns emerges. Gordon notes in his description of final devoicing that just as with final strengthening above, in a great number of cases the effect is optional or gradient, with degree of devoicing occurring along a continuum from fully voiced to fully devoiced. Any single instance of devoicing would then be a function of other phonetic factors, such as emphasis, preceding consonant, speech rate, and the like. On the other hand, many instances of vowel devoicing are clearly phonological, due either to surface contrast with voiced vowels or interaction with other processes in the system which are believed to be phonological (Gordon 1998: 96). The prediction then is this: Where final vowel devoicing has been generalized from the position in which it is phonetically-motivated (phrase-final) to a position in which it is not (or perhaps less) phonetically motivated (word-final), it should no longer retain the gradient character it may have had in the beginning, which is to say, it must be phonologized. Here the term “phonologization” is used in the specific sense developed in this work; namely, to refer to a formerly gradient, phonetic process which has come to be reflected in the categorical phonology. The phonologization of devoicing in this light means that phonetic voicelessness, previously a consequence of the less-than-ideal implementation of a (presumably wide-target) [+voice] specification in the phonology, is now the product of the replacement of that feature with an actual (possibly narrow target) [–voice] specification. It is worth noting though that in the parlance of Hyman (1976), this process must be taken in the more limited sense of phonologization, the transition of a given phonetic feature from intrinsic to extrinsic implementation in a given environment, as opposed to phonemicization, essentially just becoming contrastive. As Gordon notes, in the case of voiceless vowels, phonemicization of the contrast is certainly a logical possibility, but one of which, interestingly, no completely uncontroversial cases have been brought to light.89 The prediction that for final vowel devoicing to generalize beyond the phrase-final domain, it must first (or simultaneously) have been phonologized is more difficult to verify, since the vast majority of the languages involved have not been subjected to the kind of phonetic analysis necessary to determine such a thing (e.g., laryngoscopy, airflow measurement, etc.). The majority of descriptive sources for their part mention presence or absence of devoicing in a single sentence, often without qualifications as to degree or domain of application. In addition, the situation is complicated somewhat by the variety of phonetic circumstances which can lead to vowel devoicing. For instance, while word-level final vowel devoicing can be considered phonetically unnatural or unmotivated

118

Final syllables

in that this position is not generally associated with the degree of subglottal pressure decrease found finally in larger domains, other phonetic properties specific to word-final position might actually enhance the likelihood or degree of final vowel devoicing. It is well-known, for example, that shorter vowels are more likely to devoice than longer vowels (presumably because the duration and robustness of the necessary vocal-fold adduction gesture is sufficient to prevent longer vowels being overlapped by neighboring voiceless segments or falling victim to complete devoicing as a result of decreased subglottal pressure). Word-final vowels, though perhaps produced without the phrase-final drop in subglottal pressure, are also produced without phrase-final lengthening. Without this lengthening and a following phrase boundary, in many languages word-final vowels can actually be among the shortest possible in the system, making them susceptible to elision in hiatus contexts or interconsonantal devoicing of the same sort discussed by Gordon for word-internal vowels. In Major’s study of Brazilian Portuguese timing patterns (1981), di- and trisyllabic nonsense forms of the segmental shape [lala] or [lalala] situated in phrase-medial contexts were realized (for a single speaker) with the following durational patterns: Table 3.

Mean vowel durations in Brazilian Portuguese di- and trisyllables (Major 1981)

token a. b.

/lalá/ /lála/ /lalála/ /lálala/

V1 151 217 138 208

vowel durations in ms. V2 222 114 202 122

V3 --108 99

The forms in (a) demonstrate that posttonic vowels are shorter than pretonic in disyllables, while the forms in (b) show both this asymmetry, and that the vowel of the second posttonic (word-final open syllable) is substantially shorter even than the vowel of first posttonic. In Bulgarian, similarly, all posttonic syllables are characterized by durational reduction of up to 70% in comparison with the corresponding stressed vowels (Savitska and Bojadzhiev 1988: 52), with unstressed final vowels (especially, it seems, [u] < /o/, /u/) often barely perceptible if articulated at all, even in forms such as the complementizer /kato/ (frequently realized in colloquial speech and even probably lexicalized dialectally as proclitic [kt]), which due to syntactic restrictions are excluded from ever appearing in phrase-final contexts in the first place. It would not, therefore, be

Final weakening

119

surprisingly if these vowels were seen to undergo a gradient devoicing process completely independent of any concerns regarding domain-final subglottal pressure, in which, for example, unstressed word-final high vowels following a voiceless consonant phrase-medially before a voicelessobstruent-initial word were to devoice completely, while variations of these parameters would induce lesser degrees of devoicing. The existence of gradient devoicing patterns phrase-internally is thus not in and of itself evidence that the phonologization scenario and prediction given above are incorrect. That said, I know of no clear instances of counterexamples to the prediction. Certain devoicing patterns not mentioned by Gordon, such as that of the Cushitic language Dasenech (Sasse 1976), would be crucial test cases in this regard. In Dasenech non-prepausal final vowels are said to be often voiceless or even to delete in fast speech, while in prepausal position the same vowels are retained in all cases, though realized with final “aspiration”. If the internal process is not gradient, this could be an instance of a case analogous to some of the “final clipping” systems described below, in which short final vowels devoice completely and long vowels are devoiced partially, potentially phonologizing devoicing of a more or less set vocalic span, regardless of the total duration of the vowel in question, a mora’s worth, in Hayward’s terms (1998). If the process is gradient, however (as it may be, though the conditions for its occurrence are not elaborated in the description), we may perhaps suspect the circumstances described above, though without additional data this is impossible to know. Returning now to Gordon’s observation concerning stress and final vowel devoicing, findings from my survey support the claim that in general stress impedes vowel devoicing (Turkish notwithstanding), and that no language devoices stressed vowels but not unstressed vowels. In fact, there are cases in which final vowels which would otherwise devoice are realized fully voiced when they receive some sort of secondary stress or emphasis. Blevins and Garrett (1998) discuss the cases of Rotuman and Kwara’ae, in which secondary stresses placed on final vowels in marked speech styles save them from devoicing or deletion. Additionally, Sapir notes for Southern Paiute a type of “rhetorical emphasis” sometimes placed on words which allows final vowels to avoid devoicing, to be realized instead “lengthened and generally followed by glottal stop” (see below for discussion of domain-final glottalization). This does not mean that stressed vowels cannot be realized with devoicing. Gordon mentions one such language, Papago. To this we may add as an example Goajiro (Arawakan) as described by Mansen (1967), in which unstressed final short vowels are devoiced completely, but stressed final short vowels are lengthened and only partially devoiced (as in the

120

Final syllables

clipping patterns mentioned above). At least one case, however, stands out as potentially problematic. In Afar, as described by Bliese (1976), utterance-final stressed open syllables are realized with “final aspiration”, while utterance-internally they are not: (28)

Utterance-final stressed vowel devoicing in Afar aba na-h li yo-h

‘they do’ ‘I have’

cf. internal aba na

Bliese does not mention here the fate of utterance-final unstressed vowels (though he does note some optional preboundary glottalization [1976: 164]), which allows several interpretations. If this is not an oversight, it is conceivable that final unstressed vowels are never in fact devoiced at all, which would be difficult to explain from the point of view of the generalizations and phonetic explanations detailed above. Another possibility is that devoicing of unstressed final vowels is partial, as it is with stressed vowels, only due to the lower subglottal pressure we can assume is present in unstressed finals, the devoicing does not have the turbulence to be considered “aspiration”, and in this way perhaps escaped description. But if this is the case, there is an obvious problem: Why is utterance-final devoicing realized more robustly (or even at all) on a syllable with high enough subglottal pressure to give the impression of frication when it applies, given that the putative phonetic root of phrase-final devoicing is actually dramatically lowered subglottal pressure? And why at the same time are the final syllables in this language that might actually realize this lower subglottal pressure (unstressed finals) being realized with devoicing of a far less salient degree? The answer must lie in phonologization. Specifically, if devoicing has ceased to be an automatic consequence of its phonetic surroundings (i.e. has been phonologized), such that the target pronunciation of utterancefinal syllables includes partial devoicing of the final vowel (i.e. is phonetically specified to include devoicing), then all utterance-final vowels, regardless of the degree of subglottal pressure associated with any given token, will be realized with some amount of devoicing. Furthermore, and perhaps most importantly, the implementation of that planned devoicing would now potentially vary according to its phonetic environment, meaning that in syllables with increased subglottal pressure it would be realized with greater turbulence and hence be perceptually more salient. Again, the distinction is essentially that drawn by Ladefoged and Maddieson (1996: 49) between passive devoicing and active devoicing. In the former case, that involved in gradient final devoicing, subglottal

Final weakening

121

pressure drops at the end of the phrase to the point where voicing can no longer be maintained. The vocal folds thus cease to vibrate, but not because they have been actively configured to do so. In the case of active devoicing, on the other hand, what is by hypothesis occurring in the phonologized cases is a purposeful devoicing of all or part of the relevant vowel, most likely through some degree of intentional glottal abduction. With an active abduction gesture, devoicing is no longer a function of low subglottal pressure. An increase in subglottal pressure, such as that associated with a stressed vowel in some languages, would now serve actually to enhance the perceptual robustness of the devoicing, rather than to impede its implementation, as in the unphonologized cases. A similar account is presented below of the phonologization process in analogous cases of final glottalization. Further evidence that final devoicing is phonologized in this way in Afar comes from what Bliese describes as “h-epenthesis” (1976: 160). As Bliese describes it, a stressed word-final vowel may be followed by an [h] phrase-internally when the following word is vowel-initial. He characterizes the process as an optional means of keeping the two words separate to avoid the confusion which the application of assimilation rules in external sandhi would cause. In the case of certain morphological contexts, however, h-epenthesis is obligatory. All this suggests a process now reasonably distant from its original phonetic roots, subject to, among other things, morphological conditioning. Before moving on, it is necessary to address a final observation made by Gordon. Gordon notes that the lowered subglottal pressure that causes final devoicing is “in direct competition with another common crosslinguistic property of final position: final lengthening”. (1998: 101) This is clearly true with respect to the potential phonologization patterns that might result from these phonetic properties, insofar as completely devoiced final vowels are unlikely candidates for the phonologization of final resistance to qualitative vowel reduction, and vice versa. But there is another sense in which the claim is problematic. Gordon notes (correctly) that despite the evidence that the increased duration associated with final long vowels or final stressed vowels impedes final devoicing, duration contributed by final lengthening generally does not rescue final vowels from devoicing (though the existence of cases such as Dasenech above probably count as exceptions to this otherwise valid generalization). Gordon attributes this pattern to the fact that final lengthening, as opposed to stressed vowel lengthening, does not involve any increase in the magnitude of associated gestures, but rather is just a consequence of local gestural slowing (citing Beckman, Edwards, and Fletcher 1992). If anything, in fact, Gordon notes that since devoiced vowels are actually typically shorter than their

122

Final syllables

crosslinguistic voiced counterparts, the subglottal pressure drop “not only inhibits final lengthening, it also appears to induce final shortening.” Given the evidence of the numerous phonetic studies and phonological patterns cited above, though, we must accept that at least in many languages, final lengthening does indeed involve some sort of enhancement of gestural magnitude as well. Though the phonetic studies deal only with a few languages, were this claim true more generally, the final resistance patterns discussed above would by all rights not exist. I would argue, rather, that the two phonetic trends (final lengthening and a phrase-final drop in subglottal pressure), are not in fact contradictory in synchronic phonetic implementation in any way.90 If the defining characteristics of final lengthening are a reduction in gestural stiffness and an increase in the magnitude of certain supralaryngeal gestures (e.g., Cho’s sonority enhancement), then it is not clear what possible detriment to the integrity of this pattern’s implementation could be caused by simultaneous low subglottal pressure (except over time through reinterpretation). That the perceptual consequences of this cooccurrence could result in diachronic implementational changes owing to the lowered perceptual salience of the supralaryngeal strengthening in this environment is clear enough, but it is not what is at issue here. The two phenomena can and regularly do coexist within a system, as demonstrated by the phrase-final vowels of Russian, exemplified in Figure 6 below. 5000 4000 3000 2000 1000 0 0

0.808934 Time (s)

–0.25

0

g

r

a

n

a

t

a

0.25 0

0.808934 Time (s)

Figure 6. Russian phrase-final /granata/, [r nat ], ‘grenade’91

Final weakening

123

In this phrase-final token of the word /ranata/ ‘grenade’, it is clear that final lengthening occurs:92 The posttonic vowel, ordinarily among the shortest in a given word, is 121 ms. long (measured conservatively). The stressed vowel has a duration of 119 ms and the pretonic 106 ms. (though of course given the presence of a trilled [r] in the onset of that syllable, this measurement must be considered only approximate). The (relatively lateoccurring) F1 maximum for the final vowel is 568 Hz, with an F2 of 1239 Hz. A short study of formant values for unstressed /a/ in 63 open first pretonic syllables produced by this speaker (in non-nasal contexts followed by voiced or voiceless obstruents) yielded a mean F1 of 555 Hz and a mean F2 of 1216 Hz, placing the values for the final vowel of this token well within the speaker’s range for Degree 1 reduction (see Chapter 2). The unreduced stressed /a/ in this token, as expected, has a higher F1: 608 Hz, with an F2 of 1308 (a bit high, probably due to the surrounding coronals). The final is then clearly resistant to reduction in the sense outlined above. It also has a low amplitude, dropping off drastically toward the end. The irregular wide spacing of the glottal pulses together with the variations in their amplitudes indicates (audible) creaky phonation,93 and the very final portion of the vowel appears to devoice slightly before the pause. All this suggests the low subglottal pressure we expect to find in phrase-final syllables. (See below in the section on final glottalization for the less-thanobvious reason this should be the case for creaky voice.) As noted, the coexistence of low subglottal pressure (with concomitant devoicing) and final lengthening (with concomitant supralaryngeal gestural strengthening), while not inherently problematic from a production point of view, is nonetheless potentially vulnerable to perceptually driven reanalysis, due to the conflicting acoustic results of these patterns (essentially the obscuring of strengthening by low amplitude, non-modal phonation and devoicing). This vulnerability can ultimately lead to phonologization of a situation with very different articulatory characteristics. Leaving aside the Russian pattern which has only minor devoicing toward the end of the lengthened vowel, suppose we have a language in which the laryngeal specifications and subglottal pressure in phrase-final position are such that they frequently result in the devoicing of a significant portion of an otherwise lengthened short vowel (meaning, a vowel the supralaryngeal gestures of which are slowed and enhanced in some respects, and which perhaps is realized as fully or almost fully voiced in some percentage of its instantiations). Again, there is no particular reason that the implementation of final lengthening should not coexist with the severe drop in subglottal pressure or even with the devoicing pattern. Of course, if devoicing becomes frequent enough and lengthy enough, there will be little acoustic evidence left for the lengthening or strengthening gestures of phrase-final

124

Final syllables

vowels (or for accurate identification of any kind for that matter). This situation could easily lead to reanalysis by the listener, with phonologization of the devoicing, such that the target articulation of a phrase-final vowel should now contain the active devoicing plan discussed above. At this point voicelessness is no longer a function of subglottal pressure, and as a targeted articulation becomes subject to the phonetically induced variation common to the realization of segments of its type. Which is to say, in circumstances of raised subglottal pressure, the voicelessness, rather than being impeded, would actually be accompanied by more turbulent airflow through the glottis. Gordon notes that voiceless vowels are generally shorter than comparable voiced vowels, and indeed an active glottal abduction gesture in the absence of lowered subglottal pressure would certainly increase airflow through the glottis, perhaps with the result of a substantially briefer phrase-final vowel. But is this really an instance of “phrase-final shortening”, in the sense of cancellation or even reversal of the otherwise-common phonetic tendency toward phrase-final lengthening? The answer here must be no. Phrase-final lengthening was implemented as normal up until the moment of the phonologization of the final vowel’s voicelessness, at which point the vowel’s durational properties would be changed completely to correspond to the phonetic characteristics of the new target articulation. So what happens to the final lengthening in the process? Certainly at this point phrase-final vowels are now being realized without such lengthening (insofar as they are short and voiceless by design). But this does not exclude the possibility that there is still final lengthening in the system: While the conclusive data relevant to the example here would come from, e.g., Turkana, an extremely interesting finding in an experiment on final lengthening in Dutch by Cambier-Langeveld suggests what the answer may be. Cambier-Langeveld (1997) found that in Dutch, words with final syllables containing full vowels undergo phrase-final lengthening implemented primarily on the rhyme of the word-final syllable. Previous syllables are not significantly effected. On the other hand, when a wordfinal open syllable has a nucleus with the reduced vowel schwa, final lengthening reaches further back into the word, lengthening the preceding syllable as well. Words with a final closed syllable containing schwa also saw some lengthening of the penult, though not as much. CambierLangeveld’s analysis is that the overshort vowel schwa may not be capable of lengthening to the degree that other final vowels are. If this is so, and the domain of application of final lengthening (the period of activation of the -gesture in Byrd’s terms) has a relatively fixed duration, the failure of the schwa to occupy that entire period would result in the extension of the lengthening past the final syllable deeper into the word. In Dutch then, final

Final weakening

125

reduced vowels may undergo little final lengthening themselves, but this is a completely different claim than one which asserts that final lengthening does not operate in words with final schwa, or that the presence of schwa in the final syllable induces a process of final shortening. If final lengthening as a phenomenon is as general and universal as the evidence (linguistic and non-linguistic) suggests that it is, the very same pattern may be manifested even in languages with complete devoicing of phrase-final vowels. If these devoiced vowels are short, it means that final lengthening does not significantly affect phrase-final devoiced vowels. It does not mean that the language has no phrase-final lengthening, or that any lengthening it has differs significantly in kind (rather than in domain of application) from that found in other languages. Lowered subglottal pressure in this case has resulted in the phonologization of phrase-final vowel devoicing, with concomitant (probably predictable) changes to the segmental domain of final lengthening in the relevant phrases. It has not, by contrast, resulted in the effacement (let alone reversal) of the phrase-final lengthening pattern as such. This argument is important for analogous situations in the following two sections. Final vowel devoicing can thus be phonologized directly as one of a variety of patterns of final devoicing, aspiration, clipping or shortening of domain-final material. As I will argue below, it can also contribute to phonologization in a more subtle way, by contributing to the perceptual obscuring of vowel quality contrasts which would otherwise be robust in a lengthened final syllable. 3.6.2. Glottalization As seen above, along with the devoicing caused by the characteristic phrase-final subglottal pressure drop (and the equally-characteristic and related phrase-final lowering of F0, another common salient feature of final syllables crosslinguistically is glottalization or laryngealization. I will use the term glottalization throughout (despite the contradictions this entails described below) to highlight the connection between the allophonic creaky phonation characteristic of phrase-final syllables in many languages (e.g., English, as treated in Henton and Bladon (1988), or in Russian as manifest in the spectrogram above) and phonological epenthesis of final /#/, often ultimately in heavily grammaticalized circumstances, as Hyman (1988) illustrates for Dagbaani. I will assume, along with Hyman (1988), that at least in many cases the epenthetic final glottal stop so common in the languages of the world is ultimately the phonologization of allophonic phrase-final creak. That there is an intimate enough connection both

126

Final syllables

articulatorily and perceptually for such a scenario to be plausible is clear from the discussion of glottal stops in Ladefoged and Maddieson (1996: 73–77) as well as from the analysis of compensatory lengthening triggered by loss of glottal stop in Kavitskaya (2001). Specifically, while glottal stop is definitionally at least produced by a complete adduction of the vocal folds resulting in the momentary cessation of glottal airflow, it is in fact often realized allophonically in many (if not most) languages whose inventories contain such a segment without full closure of the glottis. According to Ladefoged and Maddieson (1996: 75), “In place of a true stop, a very compressed form of creaky voice or some less extreme form of stiff phonation may be superimposed on the vocalic stream”. I will depart from this aspect of Hyman’s analysis, however, in one respect: Hyman claims that though final allophonic creak is surely the ultimate diachronic source of final glottal stop insertion in many cases, this may not be historically true of all cases. While contradicting this claim outright would be rash, I would nonetheless cast doubt on the case for a second source proposed by Hyman to account for comparative evidence from Akan dialects and from the Guang subgroup of Volta-Comoe. The pattern is as follows. For certain cognate forms, one language or set of languages shows a vowel-final disyllable, while another language or set of languages shows a monosyllable ending in a glottalized sonorant. Thus, in data from Snider (1986): (29)

Final vowel loss in Guang a. Gonja ka-wl# k-fl# e-*in#

b. Chumburung wr ‘skin’ k-furi ‘moon’ -*ar ‘man’

Hyman suggests then that rather than having the “phonologization of pause” found in some languages, in this instance we are dealing with the reduction of a final segment, with glottalization of the consonant representing a trace of the lost final vowel. It is certainly common enough that the loss of a final vowel leaves traces on preceding segments (umlauts of various sorts on vowels, distinctive palatalization or affrication of preceding consonants, etc.). It seems unlikely, however, that a vowel would leave of all things glottal constriction on a preceding consonant unless it were itself glottalized. We might hypothesize then that the direct ancestor to the Gonja forms above was one with a short, low amplitude, creakyvoiced final vowel, whose laryngealization may either have affected the sonorant itself in implementation, or at least have obscured the already

Final weakening

127

lower-salience acoustic modulation of the transition from the sonorant to the short vowel,94 such that the sequence could be reinterpreted as merely a laryngealized sonorant.95 This would mean Hyman is correct to see this correspondence as the product of reduction of a final segment, with the sole caveat that the pre-deletion phonetic realization of the final segment in question was quite probably the result of the phonologization of prepausal creakiness. This may seem a pedantic distinction, but it is an important one if the question we are posing is etiological in nature.96 Before looking more closely at the crosslinguistic typology of final glottalization, it is necessary to address the phonetic issues surrounding the development of creaky voice in phrase-final syllables. Most sources treating creaky phonation characterize its production and acoustic and aerodynamic consequences similarly: Gordon and Ladefoged (2001: 386) describe it as owing to “vocal folds tightly adducted but open enough along a portion of the length to allow for voicing”. Ladefoged and Maddieson (1996: 53) speak of “a great deal of tension in the intrinsic laryngeal musculature”, and note that for a given amount of subglottal pressure, the tight adduction of the vocal folds yields substantially lower rates of airflow through the glottis than would be the case for modal voicing (1996: 50). From this we can extrapolate that for a given rate of airflow through the glottis, creaky phonation should require greater subglottal pressure than modal voice. The facts concerning pitch are ambiguous (and tellingly so, as will be shown): Creaky phonation is most commonly associated with a low F0 relative to modal voice, but in many languages it is in fact correlated with high F0 tonal contours (Gordon and Ladefoged 2001: 400). This latter is odd, particularly in light of Silverman’s (1997: 149–152) discussion of the interaction of creaky voice and pitch. Silverman notes that in many ways high F0 and creaky phonation are actually far more compatible than low F0 and creaky phonation, given the articulatory properties attributed to both (1997: 150–151): “First, low pitch involves decreased vocal fold tension, while creakiness is enhanced by increasing vocal fold tension. Also, creakiness is enhanced by increasing subglottal pressure, while lowness is enhanced by reducing subglottal pressure. Finally, low tones involve pronounced larynx lowering, while creakiness may be accompanied by larynx raising.” Gordon and Ladefoged mention allophonic phrase-final and phraseinitial glottalization as cues to prosodic constituency, but make no mention of any potential contradiction in these two phonetic environments. Comparing the two, the higher pitch and subglottal pressure associated with the beginnings of phrases is well-suited to the tightly-adducted-vocal-folds implementation of creak. In initial position, Gordon and Ladefoged also

128

Final syllables

note that glottalization is greater on accented syllables than on unaccented (Pierrehumbert 1995; Dilley et al. 1996). As concerns our phrase-final syllables, however, the relevant phonetic properties seen so far have been the exact opposite of this. All things being equal, phrase-final position is associated with lowered pitch, lowered subglottal pressure, and the syllables that most concern us are unstressed. Compare the two Russian phrase-final words in Figure 7 and Figure 8, one with final stress, and one with medial.

5000 4000 3000 2000 1000 0 0

0.719229 Time (s)

–0.25

0

k

a

b

a

l

a

0.25 0

0.719229 Time (s)

Figure 7. Russian final-stressed /kabala/, [kb la], ‘kabbalah’

The unstressed final vowel in the figure below for /dakota//, [d kot ], ‘Dakota’ is clearly creaky throughout, as the uneven glottal pulse pattern suggests. The stressed final vowel in the figure above for /kabala/, [kb la], ‘kabbalah’, on the other hand, is modal for at least 120 ms., becoming increasingly breathy thereafter, with no creak to be found. In other words, the creak appears on the vowel by all accounts least likely to host it, given the lowered subglottal pressure associated with phrase-final

Final weakening

129

unstressed vowels and evident in this vowel’s lowered amplitude. No creak appears, by contrast, on its far more appropriate stressed counterpart. It is also not at all clear in light of the foregoing how to explain the properties of the final vowel in Figure 6, [r nat ], above, in which amplitude declines gradually to the end of the vowel, first with creakiness, apparently easing into voicelessness at the end (with no evidence of an actual glottal occlusion in the area). 5000 4000 3000 2000 1000 0 0

0.889342 Time (s)

–0.25

0

d

a

k

o

t

a

0.25 0

0.889342 Time (s)

Figure 8. Russian medial-stressed /dakota/, [d kot ], ‘Dakota’

Silverman (1997) recognizes these contradictions in the context of the appearance of creakiness on low tones in Chong and Mandarin. As a solution, he proposes that the creakiness occurring in the low tone environment actually has a different source and mode of production than the creaky voice described articulatorily in most studies. Specifically, he hypothesizes that low-tone creakiness is in fact a consequence of t h e lowering of subglottal pressure (and hence airflow across the glottis) needed for the F0 drop, such that “the vocal folds may more readily seal the subglottal chamber, and thus are only intermittently blown apart by

130

Final syllables

eventual subglottal pressure increases. This slow and irregular glottal pulse may give rise to creakiness”. (1997: 151) He hypothesizes further that this may be the case with phrase-final glottalization as well. Silverman’s hypothesis is largely confirmed by the work of Slifka, reported in Hanson et al. (2001). Slifka shows that in periods of phrasefinal “glottalization”, subglottal pressure is in fact gradually falling, while airflow is increasing, and thus that the vocal folds are in fact being gradually abducted during the production of this vowel. She concludes that the irregular phonation in question cannot possibly be glottalization (as described above), but rather that the irregular glottal pulsing arises from a combination of falling subglottal pressure, vocal fold slackening and vocal fold abduction. She continues to suggest that glottalization in the environment of low tone has the same origin, resulting from the reduction in subglottal pressure and vocal fold slackening necessary to produce the desired pitch level. This explanation is in perfect accord with the observed facts of phrasefinal allophonic glottalization (a term I will continue to employ despite its demonstrated inaccuracy) in Russian. It is likewise immediately clear how the smooth transition from creak to breathiness and voicelessness can occur in the course of the final vowel discussed above. Both final devoicing and final glottalization thus have essentially the same phonetic source (subglottal pressure drop and gradual laryngeal-gesture “fade”, to borrow the term). They can also both be characterized as resulting from phrase-final phonetic weakening and prominence-reduction, though of course, in their phonologized states, there may be nothing weak about them, from an articulatory point of view. The crosslinguistic patterning of final glottalization is parallel to that described above for final devoicing. Gordon’s generalization that wordfinal devoicing implies phrase-final devoicing seems to be continued here. I was unable to locate any clear instance of a language that glottalizes wordfinal vowels only when they are phrase-internal, which is fairly unsurprising given the above. And as above, it is primarily unstressed vowels that are subject to phrase-final glottalization. Again, however, there are some crucial counterexamples. Hyman quotes Timothy Vance writing that in Tokyo Japanese, which inserts glottal stop after short vowels before pause, “...the glottal stop after a short vowel is more salient when a speaker is excited and emphatic.” (Hyman 1988: 114) From what we now know of phrase-final glottalization, however, “excitement” or emphasis should if anything prevent glottalization of the final short vowel. Likewise, recall from the discussion of devoicing above Sapir’s report that in Southern Paiute the “rhetorical emphasis” which rescues final vowels from devoicing actually results in

Final weakening

131

their lengthening, sometimes with the addition of a glottal stop. Finally, in the Isthmus Veracruz Nahuat dialect of Nahuatl, a “junctural glottal stop” is inserted before phrase boundaries (Wolgemuth 1969). This glottal stop after short vowels is distinguishable from the phonemic glottal stop in being very brief and lacking complete closure. When the final short vowel is stressed, however, this junctural glottal is “lengthened and strengthened”, so that it comes close to being indistinguishable from its underlying counterpart. Obviously, all of this is a problem for the above account of phrase-final glottalization as a consequence of lowering subglottal pressure and gradual vocal cord abduction at the end of the prosodic domain. It would make perfect sense, however, if the glottalization in question on these final vowels were actually true glottalization, the tight glottal adduction and tense laryngeal musculature generally described for distinctive creaky voice or glottal stop. Here as with devoicing, the solution to the quandary is in the nature of phonologization. In its embryonic form, phrase-final glottalization is a phonetically contingent gradient effect, dependent entirely on the convergence of the laryngeal and alveolar states listed above. Glottalization is not an independently targeted feature of the articulation. The acoustic features produced by the irregular pulsing of the glottis this “automatic glottalization” causes, however, are extremely similar to those associated with true creaky phonation or glottalization (i.e. irregular pulsing, relative F0 lowering, lower amplitude, and characteristic spectral tilt).97 If gradient glottalization becomes regular and salient enough on final syllables, the possibility for its reinterpretation by the listener as intended or targeted rather than contingent becomes quite real. As a consequence of this reinterpretation, glottalization becomes extrinsic, independently targeted, and hence realized as glottalization proper, with the tightly adducted vocal folds we would expect of it. At this point, a rise in subglottal pressure such as that associated with stress or emphasis would come to increase the perceptual robustness of the creakiness (and glottal occlusion if there is one), yielding the patterns described above. It is important to bear in mind at this point that, as before, “phonologized” does not necessarily mean “phonemicized”. The emphatic glottal stop in Tokyo Japanese, for example is “non-distinctive” according to Vance. This seems to be the case quite frequently here.98 The most illustrative sort of system as regards the phonologization scenario detailed here would be one in which glottalization had been phonologized (as, say, insertion of contrastive glottal stop) either utterance or Intonational Phrase-finally, while before lower level phrase boundaries gradient irregular phonation was still being realized. To show this conclusively for any given language, however, would require an aerodynamic study of the type Slifka performed to obtain her results.

132

Final syllables

3.6.3. Mixed systems It has been demonstrated above that both allophonic devoicing and glottalization can coexist within a given system in measured doses. In Russian, however, glottalization clearly has the upper hand in terms of salience, with any devoicing that takes place phrase-finally being durationally small and sporadic in realization. Other systems, however, have developed differing distributions of the two. Some languages, for example, appear to use one or the other in free variation. Gordon (1998) cites Apinaye as having both lengthened, devoiced and creaky allophones of phrase-final vowels, but mentions no specific distribution. Afar (Bliese 1976) in addition to its final stressed-vowel devoicing, is said to have glottalization before certain boundaries, though the distribution is again not clear. Isthmus Veracruz Nahuat has a free variant of its junctural glottal stop called “fricative”, which is a glottal stop, after which “an ‘h’-like release is heard.”99 The distribution of the variants can be phonologically conditioned, as in Koyra (Hayward 1982: 217–218), an Omotic language in which utterancefinal vowels (all of which are short – see below) are realized voiced but “terminating with a glottal closure” when following voiced consonants or phonemic glottal stop, while following voiceless consonants they are “extremely brief and more or less devoiced”, especially after sibilants, where they are all but inaudible. Varying laryngeal settings can even be distributed according to sociolinguistic criteria, as in the famous case of Koasati (Muskogean, Kimball 1991), in which, in the speech of men, final vowels are all “prolonged”, while in the speech of women a final glottal stop is inserted after the vowel (with laxing of final high vowels).100 Often the distribution of devoicing and glottalization serves to distinguish final long vowels from final short vowels (a contrast which, as described below, is otherwise prone to collapse in final position). Hyman notes that in most languages with final glottal stop insertion, the process only applies after a short vowel (assuming a contrast, though in collapsed systems the quantity-neutralized glottalized vowel is generally short phonetically as well). The long vowel in these systems usually remains untouched, as in Hausa, where according to Carnochan it “just dies away” (Carnochan 1988),101 or in Luiseño (Hyde 1971), where short stressed vowels have a “clipped” quality, while longs and unstressed vowels do not.102 I have not found any system in which the short vowel is glottalized and the long vowel partially (but significantly) devoiced. Such a system could presumably come into being, though it would likely be quite unstable, due to the resulting durational similarity of the voiced portions of the vowels, which could lead to the collapse of the quantity distinction. The

Final weakening

133

reverse, on the other hand, is clearly attested. Hyman (1988) cites Oromo and Tepehua as instances of such a system. In these languages, where the short vowels are completely devoiced and the long vowels are glottalized, the presence of any significant voiced period, even if somewhat abbreviated by the glottal adduction gesture, would be sufficient to distinguish long from short, making a perceptually more stable system. Devoicing of both long and short final vowels is reasonably common (with the proviso that the long vowel is only partially devoiced). This is in fact the system described by Sapir for Southern Paiute (Sapir 1930). In Southern Paiute final long vowels are partially devoiced or shortened, and short vowels are realized voiceless or completely elided. This is essentially the pattern termed “Final Mora-Clipping” by Hayward (1998). Hayward is referring here to any system in which a final long vowel is shortened or partially weakened in some way (a bimoraic vowel becomes monomoraic), and a short vowel is severely reduced or deleted (a monomoraic vowel goes away). Southern Paiute had conserved the pattern of “Final MoraDevoicing”, which could obviously give rise to the alternate pattern of actual “clipping” or deletion. Many languages exhibit this latter pattern, which I assume is a reflex of a devoicing pattern in most cases. Languages such as Latvian (Mathiessen 1997), Maltese (Brame 1972), and D’irayta (Hayward 1998) exhibit this pattern.103 Gilbertese represents a compromise variant, in which all final long high vowels shorten, but only short high vowels after nasals are deleted (Blevins 1997). The remaining logical possibility, glottalization of both long and short vowels (with retention of the contrast) is attested, but in only one instance I have found so far (Isthmus Veracruz Zapotec, Wolgemuth 1969). Again, it is possible that such a system would be inherently unstable, if the glottal adduction gesture were to shorten the long vowels enough that the durational difference became less robust.104 In the above sections I have detailed aspects of the phonetics of phrasefinal position which can be characterized as “weakening” or “prominencereducing” in nature, at least in their unphonologized state. I have presented a typology and developmental analysis of the phonological patterns which arise from these phonetic trends. I have also argued that the presence of these phonetic weakening patterns is in no way implementationally antagonistic to the presence in the same system of phonetic patterns of phrase-final lengthening and gestural enhancement along some dimensions. The acoustic result of the overlay of these patterns within a system, however, may indeed produce instability, resulting in phonologization patterns which ultimately efface the one set of patterns or the other (at least to the extent that in a system in which phrase-final devoicing has led to final vowel deletion, or final clipping, there is an obvious sense in which

134

Final syllables

the effect of the devoicing pattern has in the end won out over the final lengthening pattern). I have also shown, however, that this in no way necessarily means that in the resulting synchronic system there is no process of final lengthening or strengthening. Rather, synchronic final lengthening could simply be active over a wider span of segments than was the case in the ancestral system. 3.6.4. Nasalization Before returning to the question posed at the beginning of this section on final weakness, that of how to reconcile evidence for supralaryngeal strengthening in final position with the idea of “final fade” and descriptions of “reduced” phrase-final vowels, it is necessary to outline one additional phonetic tendency associated with final position crosslinguistically: vowel nasalization. This tendency is not as well-known as those discussed above, and, due to the subtlety of the effect in some cases, I suspect underreported as well. One recent treatment of final nasalization is that of Hock, in his 1999 paper, mentioned above as advancing the hypothesis that final position is a monolithic weakening environment. Hock mentions only one case: Vedic Sanskrit. In Vedic Sanskrit, there is a sporadic tendency for prepausal vowels to be written nasalized. Hock attributes this to a weakening of the velic closure at the end of a phrase, where the articulatory organs are returning to the rest position. Hock does not discuss conditions on the distribution of this nasalization, however. It is still not easy, unfortunately, to assess the claim that phrase-final position is generally correlated with weakening of the velic closure gesture from experimental evidence. Krakow, Bell-Berti, and Wang (1991) (reported in Krakow 1993) present evidence of a gradual decline, primarily in stressed syllables, in the height of the velum for oral consonants over the course of the utterance. Recall, however, that this is roughly parallel to the findings of Vayra and Fowler (1992) for supralaryngeal articulations, findings which also showed strengthening of these same articulations specifically in word-final unstressed syllables. If this is truly a positionspecific “weakening” pattern for the articulation of vowels as Hock claims (rather than what could be seen instead as a gradual decline in the degree of strengthening of gestures in strong positions such as stressed syllables), we would expect it to be especially obvious on unstressed vowels, all the other articulations of which are well known to be weakened or reduced. What is more, the Krakow, Bell-Berti, and Wang study concerns specifically the position of the velum in the production of oral onset obstruents. What is measured is velum height, which of course can vary quite a bit without

Final weakening

135

allowing actual opening of the velum, and in which there are intrinsic differences in the articulation of non-nasal segments. What was found was a declination such as that reported in Vayra and Fowler (1992). Here, the velum height was lower at its last peak in the sentence than it was at its first peak. The final syllable of all the test sentences was a stressed monosyllable, meaning that, were there a pattern even of local strengthening of velic closure on phrase-final unstressed syllables (analogous to that detected in numerous studies for other supralaryngeal articulations), it would not have been detected. A pattern of declination throughout the phrase is, of course, very different from a localized jump or drop on final unstressed syllables. Were this declination trend part of a general weakening tendency for phrase-final material, we might reasonably expect it to result ultimately in phonologization of the pattern in some instances. Recall that the tendency to phrase-final devoicing is phonologized quite commonly for both final consonants and vowels in the languages of the world. I am not aware of any phonological system, by contrast, in which the oral/nasal contrast for consonants is neutralized word or phrase finally in favor of the nasal.105 This suggests that prepausal vowel nasalization may well be the product of something other than velic closure weakening. Another study, in fact, that of Vassiere (1988), finds the opposite pattern for strength of velic closure in phrase-final position. It should be noted, however, that while Keating et al. (1999) interpret this study as showing that phrase-final position is “strong” for velum height, what the study is actually measuring may be somewhat more ambiguous. The claim in Vassiere (1988) is that an oral consonant before a major syntactic boundary undergoes less anticipatory velum lowering from a following VN sequence (in other words, in a -C##VN- string) than it does in CVN strings not interrupted by a major syntactic boundary. The claim, then, is that there is less coproduction involving the velum across such a boundary, rather than that, all things being equal, oral consonants have a higher velum position than they do, e.g., in phrase-medial word-final position. The phonetic evidence is thus somewhat equivocal on phrase-final nasalization as a natural articulatory trend, with a need for further experimental studies to show whether or not such a thing truly exists. The case for final nasalization as a weakening process in Sanskrit itself is also not a particularly strong one. Wackernagel (1896) is not overly specific as to the conditions of phrase-final nasalization of vowels, but a study by Lubotsky examines all such forms in the Rg-Veda (Lubotsky 1993). The first and most obvious generalization is that such examples are quite few in number and pattern in no exceptionless fashion. Over the years scholars have explained it, in fact, as an editorial scheme to avoid certain confusions potentially arising in hiatus contexts. It has also been deemed a

136

Final syllables

“Prakritism”, apparently being a more common phenomenon there and in Pa+li (1993: 197–198). That said, Vedic final nasalization does appear in one particular context more than any other: It affects only /a/, and then most often unaccented /a/ at the end of odd numbered padas (a metrical unit corresponding more or less to a line of verse) before accented vowels (most often /e/ or /o/, also before syllabic /r/). There are other cases as well, though Lubotsky considers the original case to be of an unaccented /a/ nasalizing in hiatus across a metrical boundary before a rising accent on the following vowel. He considers it a consequence of the lengthening of this final vowel he believes would have occurred due to an anticipatory pitch rise before the accented vowel (1993: 206). One other context for nasalization of /a/, interestingly, is in the so-called pluti forms. These are forms with lengthened final vowels used in certain interrogative and vocative contexts, which Lubotsky notes is thought to be a rising intonational contour. Whatever the case ultimately here, it seems clear that lengthening is involved, and in the case of the pluti perhaps even some emphatic element. Whether we can truly speak of “weakening” in this case is simply not clear. One thing that is interesting about the Sanskrit example, however, is that it affects only /a/. This is also the case in some of the examples below. Now, it is a well-known crosslinguistic generalization that velum height tends to vary directly with vowel height, to the extend that low vowels are often actually produced with a substantial degree of velic opening. There has been much dispute as to why exactly this should be so. Whalen and Beddor (1989) support a production-based hypothesis, whereby velum lowering is an automatic consequence of the articulation of low vowels, and not due to active control of velum height by the speaker. It is also wellknown that as systems develop distinctive nasalization, it is generally the low vowels which nasalize first (Chen 1973; Whalen and Beddor 1989), suggesting that this inherent nasalization also makes them more vulnerable to reinterpretation as intentionally nasalized. In order to explain this fact (and particularly its instantiation in Eastern Algonquian spontaneous nasalization), Whalen and Beddor cite experimental data from Henderson (1984) showing that the difference in velum height between oral and nasal /a/ in Hindi is less than the difference for other vowels. Their interpretation of these facts is that /a/, already produced with a lowered velum, will achieve nasalization with a relatively smaller additional lowering of the velum than would other vowels. Thus, if all vowels were produced with the same slight amount of velum lowering (for whatever reason), /a/ might achieve a more salient degree of nasalization than other vowels. If this is the right explanation, then we could imagine that, were there in fact a slight trend toward velum lowering on all vowels phrase-finally, the same amount of

Final weakening

137

lowering might cause /a/ to be perceived as intentionally nasalized sooner than any other vowel in the system. Whalen and Beddor also show, however, that lengthening of a vowel which is already nasalized to some degree will increase the perceptual salience of that nasalization significantly, which is to say that a slightly nasalized vowel will be perceived as nasal more readily if it is longer. Whalen and Beddor note that this alone cannot explain the susceptibility of /a/ to nasalization, since although low vowels do generally have longer inherent durations than other vowels, these differences are not of the magnitude necessary in their study to increase the percept of nasalization. Consider this evidence, however, in light of nasalization in phrase-final position. Phrase-finally lengthening routinely increases the durations of final vowels by the amount shown to be necessary by Whalen and Beddor for an increased perception of nasalization on vowels which are already partly nasalized. The facts of Vedic Sanskrit, associating nasalization with positions of particularly dramatic vowel lengthening, agree very neatly with this line of reasoning. Furthermore, since low vowels are the only ones which would have had any degree of nasalization on them to begin with given the crosslinguistic characteristics of their production, it is possible to understand why only they would be affected in these circumstances. This account of the development of phrase-final nasalization in Sanskrit does not require any phrase-final velic weakening to occur, though of course, neither does it exclude it. The same pattern is found in another case as well. In Neo-Aramaic there is good evidence for phrase-final lengthening. Garbell (1965) provides intriguing information on the realizations of word-final /a/ in the various Jewish dialects of Neo-Aramaic (once) spoken in Persian Azerbaijan (including parts of what is now SE Turkey). In this language /a/ is most commonly realized as [a]. In its Northern dialects, this /a/ is backed in word-final position to [ ]. In the speech of females in the Urmi dialect (NW Iran), it is pronounced word-finally with rounding as []. In the Southern dialects, however, it is lowered, backed, and nasalized: [-(].106 It is possible to understand this development in just the same way as was proposed for Sanskrit above, with the additional possibility that to the extent that velum lowering is an automatic consequence of the production of low vowels, perhaps additional lowering here creates additional nasalization as well, again enhanced perceptually by the lengthening found in final position. In many systems, however, phrase-final nasalization affects all the vowels of the system. This is the case in Russian, for example (Zlatoustova 1962: 44–45), where an original assertion by Shcherba (1912) has been subjected to extensive experimental testing and confirmed. In phrase-final

138

Final syllables

position Russian vowels are nasalized for at least part of their duration.107 This is true regardless of the preceding consonant’s identity, though after nasals it is stronger. It is also stronger, for whatever reason, following palatalized consonants. Stressed vowels for the most part nasalize over less of their duration than unstressed vowels. (Krakow [1993: 102–105] reports also experimental studies on the effect of stress on velum position, in which for one speaker at least low and high vowels had a lower velum position when stressed than when unstressed. For another speaker, however, high vowels had a higher velum under stress, while for low vowels the velum was lower – suggesting local hyperarticulation of sort documented by de Jong 1995.) The effect is non-neutralizing, as Russian lacks contrastive vowel nasalization. This pattern is also attested in Cherokee (which has an underlying nasal /(/), in which all vowels are nasalized prepausally (Walker 1975; Whalen and Beddor 1989). Likewise Aikhenvald (1996: 512) lists the following languages as having the areally-common feature of “nasal pause” (word- or phrase-final vowel nasalization): Pirahã (Mura-Pirahã), Irantxe (affiliation uncertain), Rikbaktsa (Macro-Gê), Xeta (Tupí-Guaraní), Assuriní (TupíGuaraní), Tapirape (Tupí-Guaraní). She notes phonetic nasalization utterance-finally in Jarawara (Arawá). These latter systems may provide evidence supporting the phrase-final phonetic weakening hypothesis, in that they affect all the vowels of the relevant system, while the lengthening-plus-intrinsic-nasalization hypothesis proposed above should logically work only for low vowels. There is, however, another possible explanation for the development of phrase-final nasalization which again needs no assumption of weakening: rhinoglottophilia (Matisoff 1975). Rhinoglottophilia, simply stated, is the propensity for spontaneous nasalization in the environment of laryngeal (or high airflow) segments. This is discussed by Ohala (1975), who attributes nasalization in the context of voiceless glottal and pharyngeal consonants to the fact that an open velum would have very few acoustic consequences for either segment type (meaning that in the absence of the requirement of velic closure for accurate perception, none may ultimately be produced), while for the cases of /h/ and potentially other high airflow segments such as /s/, Ohala implicates the acoustic similarities between nasalization and the breathiness that may accompany these segments. Specifically relevant are increased formant bandwidths, the presence of anti-resonances in the spectrum, and a general lowering of amplitude (Ohala 1975: 303). In the previous two sections it was demonstrated at some length that phrase-final syllables are commonly host to both breathiness/devoicing and glottalization, and hence a prime location for spontaneous nasalization to arise. In such a scenario an effect like breathiness on a phrase-final vowel,

Final weakening

139

itself there as an automatic consequence of other factors (such as partial devoicing), could be perceived as nasalization by the listener and reanalyzed as a pattern of intentional nasalization in final position. In this scenario the rise of nasalization would not depend on the state of the velum in the original utterance (though again, nor would it exclude the presence of some degree of velic lowering there). Evidence for rhinoglottophilia as a possible source for phrase-final nasalization is found in the following: Isthmus Veracruz Nahuat (Wolgemuth 1969) has already been shown in the preceding section to have two variant realizations of its prepausal glottal stop insertion process. The first is simply a glottal constriction of some magnitude, as in (a) below. The second, in (b), is described as glottal closure with an ‘h’-like release. There is, however, a third variant, as in (c), which Wolgemuth describes as voicing continuing past a glottal closure, after which a brief nasal re-articulation of the vowel is heard. This variant is apparently especially common in emphatic circumstances, such as repetition of something a linguistic fieldworker fails to hear correctly the first time. (30)

Variants of prepausal glottal stop in Isthmus Veracruz Nahuat (Wolgemuth 1969) a. [kisa#] b. [kisa#h] c. [kisa#a(]

‘he leaves’

[mo:to#] [mo:to#h] [mo:to#o(]

‘squirrel’

Even stronger evidence for the rhinoglottophiliac origin of at least some instances of phrase-final nasalization comes from Yucuna (Arawakan, Schauer and Schauer 1967). In this language final vowels are nasalized only following /h/ or hiatus (a situation commonly enough resolved with glottalization crosslinguistically, though its realization in Yucuna is not described). In this instance, then, it seems the acoustic environment conditioning reanalysis must be present both before and somewhere during/after the vowel in order for phrase-final nasalization to take place. One further case comes from Warekena (Northern Arawak, Aikhenvald 1996), which exhibits prepausal (and optional word-final non-prepausal) insertion of a sequence /-hV (/, the vowel is a nasalized copy of the preceding word-final vowel. Aikhenvald describes this as the result of three rules: ‘h’-insertion, translaryngeal harmony, and vowel nasalization. Diachronically though it seems more likely to be a consequence of phonologization of a combination of final lengthening, breathy phonation and spontaneous nasalization.

140

Final syllables

The correct explanation for phrase-final nasalization could thus be a consequence of a weakening tendency in velic closure toward the end of a phrase, or a perceptually motivated result of another phonetic characteristic such as lengthening or devoicing or glottalization. It could, in fact, be some combination of all of these, or perhaps differ from case to case. Whatever its source, it has the potential to neutralize the distinction between oral and nasal vowels word- or phrase-finally in some languages, thus making final position phonologically weak for contrasts of nasality. This is the case in South Gujarati dialects (Mistry 1997), in which vowel nasalization is contrastive everywhere but on word-final vowels (with denasalization of etymologically nasalized vowels). An environment with phonetic nasalization of all vowels would clearly be a less-than-ideal location for the perpetuation of a nasality contrast. There also exist, however, systems in which word-final position is phonologically strong for a nasalization contrast on vowels. In these, however, the source is usually something altogether different. Interestingly, in what is a complication for a theory of phonological strength effects which seeks to derive the licensing of contrasts from the cues that make a particular position optimal for the perception of the phonetic features in question, these strength effects have their origins in the lack of perceptual robustness characteristic of final syllables. It has just been demonstrated that phrase-final position may actually be quite a bad place for the realization of a vowel nasalization contrast. Nonetheless, in Nanai (Tungusic, Avrorin 1959: 29–32), nasalized vowels appear only in word-final position. The opposition is contrastive. The source of the nasalized vowels is the deletion of word-final /n/ with concomitant nasalization of the preceding vowel. In unsuffixed forms there are thus abundant minimal pairs. Upon the addition of most suffixes, the [n] is restored and the vowel becomes oral. This is not, however, the case with all suffixes. The diminutive [-ka( n /-k( n ], for example, never conditions the return of a formerly-preceding nasal consonant, and the vowel remains nasal. Several other suffixes have already come to restore the nasal only optionally. In Bare (Northern Arawak, Aikhenvald 1996), nasalization again only occurs on final vowels. Nasalized vowels are underlying in only two forms of Aikhenvald’s corpus, while all the others are derived from the truncation of forms ending in any of several suffixes of the general shape /-nV/. The deletion leads to nasalization of the wordfinal vowel. Interestingly, this truncation takes place only non-prepausally, where before pause, in another instance of domain-final resistance to deletion, the suffix remains intact. In these examples, nothing about the phonetics of final position is “licensing” the appearance of a contrast between nasal and oral vowels. The phonetic characteristics (here,

Final weakening

141

detrimental) have clearly caused the emergence of the contrast, but can hardly be imagined to facilitate it in any active way in the present. The contrast exists here, then, only because it came to exist here, and because it has so far avoided being effaced by the perceptual weakness of the position. 3.6.5. Final vowel reduction and loss of contrasts 3.6.5.1. Lowering I turn back now to the question which began this section on final weakness, the matter of cases in which domain-final position is singled out in particular as a weak licenser of vowel contrasts. The first neutralization phenomenon I will take up here is final vowel lowering, a process most likely not conditioned by any phonetic weakening in final position. In fact, final lowering is more likely to be a consequence of phonetic strengthening of the type Cho refers to as Sonority Enhancement in English stressed and final syllables (See Section 3.2.3.1). As such it would be classed among PN patterns with what Smith (2002) calls Positional Augmentation. (See Chapter 4 for details.) Both neutralizing and non-neutralizing vowel lowering is attested in final position in a variety of languages. Some caution is indicated here of course, as a word-final inventory of /e, o, a/ out of a word-internal fivevowel system could just as easily be the result of devoicing and loss of final high vowels as it could of lowering.108 Some examples of final lowering (or high vowel exclusion) follow in (31) through (34): (31)

Bare (Northern Arawak – Aikhenvald 1996) Utterance-final diphthongization of the high-vowel /i/ to [ie], particularly under emphasis. Perhaps a weak example.

(32)

Ongota (affiliation uncertain, Cushitic/Omotic – Savà and Tosco 2000) Word-final position: high vowels /i, u/ optionally lowered to [e, o], such that only three vowels contrast in this position. Neutralizing.

142

(33)

Final syllables

Dasenech (Cushitic – Sasse 1976) Word-final position: high vowels /i,u/ realized as [e, o]. Underlying /e, o/ are realized as [, ], so the process is non-neutralizing. As noted above, Dasenech final vowels are also subject to devoicing or deletion in phrase-internal final syllables and to partial devoicing phrase-finally.

(34)

Castillian dialects (Penny 1986) In certain Castillian dialects of Spanish, only [e, o, a] are found in final syllables. In others all five vowels still appear.

There are several possible factors which could give rise to final lowering, some or all of which are likely responsible for the emergence of the cases above. First, one thinks of the “sonority enhancement” found by Cho (2001) on English final unstressed vowels, manifesting itself in part as lowering. The common pattern, discussed below, in which languages avoid shorter vowels in final position also comes to mind as a source of potential extra length through lowering. Under this interpretation then final lowering is a Positional Augmentation of the sort Smith (2002) shows to be characteristic of certain otherwise phonologically strong positions. Another possibility is that other phonetic features found in final position are lowering the vowels here, specifically, the presence of creakiness or breathiness/devoicing. Creaky phonation is certainly commonly enough associated with raising of F1 crosslinguistically (Gordon and Ladefoged 2001: 400), though breathiness seems if anything to have the reverse effect. In general the effect of /h/ on neighboring vowels is somewhat ambiguous. In many languages it patterns with other gutturals in lowering neighboring vowels, while in other languages in which gutturals lower or retract vowels, /h/ has no such effect. These patterns are surveyed and analyzed by Rose (1996). Very briefly put, Rose advances a phonological analysis of these systems in which vowel-lowering /h/ has a Pharyngeal node in its featuregeometric representation, while non-lowering /h/ is placeless. The pharyngeal specification is a consequence of the presence in the vowel inventories of certain other gutturals (uvular and pharyngeal continuants), with which the laryngeals must contrast. Cases appearing to be exceptional are argued to involve the spread of [RTR], which /h/ lacks. If lengthening and sonority enhancement or strengthening are responsible for vowel lowering, then this potential instance of phonological weakness may in fact be a consequence of phonetic prominence. If creakiness or devoicing are involved, though, then it is phonetic weakening which is responsible. Again, a combination of these may ultimately be

Final weakening

143

more likely, in which case phonological weakness emerges from a combination of phonetic strengthening and weakening affecting different aspects of the production of final vowels. 3.6.5.2. Final reduction Laxing or reduction of word- or phrase-final vowels is a commonly encountered pattern crosslinguistically. While the laxing of high or mid vowels could be considered an instantiation of the final lowering described above, many of these systems explicitly raise /a/ to schwa in the same environment, precluding any such explanation. Sometimes descriptions are not specific, as in Biyagó (Atlantic, Bissagos archipelago, Wilson 2000), where word-final vowels are said to be “somewhat lax and can be hard to distinguish”. As noted above, if final reduction is identical in both phraseinternal and phrase-final contexts, it might be possible to attribute it to the phonologization for all tokens of the relevant forms of the realizations previously arising only phrase-medially, where the final was indeed short and heavily reduced. Recall the Bulgarian and Brazilian Portuguese durational facts showing the relative weakness specifically of phrasemedial final vowels. This would be difficult to prove, however, without fairly detailed historical documentation of the language, and in any case, it fails to resolve all our problems in this regard, as instances of the weakening specifically of phrase-final vowels, either exclusively or at least to a greater degree than phrase-medial finals are in fact attested. Mentioned already were reports of an especially large amount of reduction of phrasefinal vowels in Andalusian dialects (Sanders 1994: 118). In Kawaiisu (Zigmond, Booth and Munro, 1990: 13), word-final short vowels are often dropped, especially when they are utterance-final. Patterns like these are obviously inconsistent with a view of final open syllables as environments for vowel lengthening and strengthening such as that suggested by the articulatory studies and description of final strength detailed above. Before addressing possible explanations for these patterns, I present a short catalogue of reduction processes, neutralizing and non-neutralizing, targeting final vowels in particular. The balance of examples in the literature is heavily skewed toward processes affecting both phrase-medial word-final vowels and phrase-final vowels alike. Though this may indeed be a valid generalization about crosslinguistic patterning of these effects, quirks of the descriptive literature could play some role in creating the appearance of this distribution: There is a strong tendency toward underdescription of effects which are restricted to phrase-final position, which could be the fault of both oversight, conscious restriction to lexical processes, or the reporting of

144

Final syllables

largely phrase-final effects as “sporadic” or “optional” word-final processes instead. Caution is therefore in order here. Instances of word-final devoicing leading to deletion were treated above in the section on devoicing, and in any case word-final vowel deletion is so common an effect crosslinguistically as to need no exemplification here beyond those examples already mentioned. Often deletion is preceded, however, by reduction of contrasts (perhaps with devoicing as well?). Several potential instances of this are discussed in Blevins and Garrett (1998), in the context of metaphony-like quality transfers from reducing and deleting finals to strong pretonics in Austronesian languages. Exemplary in this regard is the development of final vowels in Nyungar (Pama-Nyungan, Blevins and Garrett 1998: 538–539). In the southwestern dialects of Nyungar, all final vowels are neutralized to schwa (with metaphonic transfer of some features to preceding stressed vowels). In the southern dialects of the language, this process has been taken a step further, with final schwa disappearing altogether. Other examples of final vowel reduction are given in (35) through (44) below: (35)

Highland East Cushitic (Hudson 1976: 250) In several East Cushitic languages and dialects, particularly Burji, Hadiyya and Kambata, word-final /i/ and /a/ are subject to laxing and devoicing, apparently leading often to deletion, more often in some dialects than in others. Hudson notes that this is particularly the case when the final vowel follows a “strong” syllable, where the final becomes “almost inaudible”.109

(36)

Kullo (Omotic – Allan 1976) Word-final /a/ is raised to [] (except, apparently, in the infinitive forms of verbs). Non-neutralizing.

(37)

Iban (Sea Dayak – Omar 1981) /u/ has a tendency to unround to [] in word-final syllables, where some free variation with /o/ also occurs. Potentially neutralizing.

(38)

Brok-skad (Dardic – Sharma 1998) /e/ is realized as [] word-finally. /a/ is realized as schwa especially word-finally (though also apparently elsewhere), and often elided in rapid speech, “particularly as a case-marker”. Non-neutralizing.

Final weakening

(39)

145

Maithili (Indo-Iranian – Jha 2001) Unstressed final syllables contain only [i, u, o, ]. Short [, æ, , a] do not appear. Neutralizing.

(40)

Kandahari Pashto (Indo-Iranian – Elfenbein 1997) Final unstressed /e:/, /o:/  [i], [u], except, apparently, when this merger would collapse certain grammatical distinctions, such as that between the verbal suffixes 2sg.pres. -e and 3sg. -i.110 Neutralizing.

(41)

Muinane (Witotoan – Walton and Walton 1967) Word-final /a/ is sometimes realized as schwa by some speakers. Non-neutralizing.

(42)

Goajiro (Arawakan – Mansen 1967) Variable raising of word-final /a/ to schwa. Non-neutralizing.

(43)

Lunda (Bantu – Larry Hyman, personal communication) Three vowels /i, u, a/ word-finally, five elsewhere. Neutralizing.

(44)

Nanai (Tungusic – Avrorin 1959; Li 1996) The Nanai vowel system is not unambiguously described. Li (1996) notes that he may be idealizing to a prior historical stage when he characterizes it as having /i, u, , , , a/, in which the latter three are RTR, produced with a constricted pharynx as in some other Tungusic languages. He assumes that the RTR distinction is still operative due to the fact that the harmony system continues to operate (Li 1996: 147). Avrorin (1959) transcribes Li’s // as ‘o’, calling it high-mid. In general he treats the pharyngealization distinction as one of vowel height, even describing it in terms of jaw aperture in places. That said, in non-initial syllables /, / have a tendency to raise (lessen degree of pharyngeal constriction?) toward /i, u/. In final syllables this frequently results in complete overlap. This is more consistent with // than with //, which in some words resists reduction, particularly in shorter words in which the preceding vowel is not long, and especially if the preceding vowel is also [].111 // is also especially likely to reduce following a long vowel. The prominence of the initial syllable in Nanai is likely a relic of earlier

146

Final syllables Tungusic initial stress (though see Chapter 4 on initial-syllable prominence in Turkish for an alternative).

3.6.5.3. The phonetics of final syllable reduction It is necessary to note once again that in virtually all examples above the domain of the application of reduction is said to be “word-final”. At first consideration, this could mean one of three things: (1) The language has no final lengthening or strengthening, and final syllables are simply all phonetically weak, (2) the pattern began in phrase internal final vowels and was generalized to phrase-final vowels in the manner proposed above for some speakers of Brazilian Portuguese, or (3) the author of the description has noticed word-final reduction phrase-internally and failed to notice or point out failure to reduce phrase- or utterance-finally. This is especially likely in the gradient reduction cases like Nanai above. Neither of the latter two hypotheses will be simple to confirm without an extended research program. The first hypothesis above, of course, is quite plausible, given the extremely small number of detailed articulatory studies of phrase-final effects and the near total concentration of these on one or two West Germanic languages. It may well be that final lengthening, however universal in its existence, nonetheless varies not only in its domain and degree of application crosslinguistically, but in its articulatory implementation as well. Certainly some languages, such as Creek (Johnson and Martin 2001) discussed above, have final lengthening that does not impede normal phonetic vowel reduction. The presence of articulatory strengthening in the implementation of final lengthening may then be a variable set differently from language to language. Of course, to account for reports of degrees of reduction targeting specifically phrase-final syllables, we would have to assume as well that in some languages lengthening (gestural slowing) is implemented with actual gestural weakening, rather than mere absence of strengthening. This is of course also possible, and without further articulatory studies it may be impossible to resolve this question satisfactorily. Certainly if the question is “Do some languages fail to strengthen, and indeed even reduce phrase-final vowels, regardless of whatever effect phrase-final lengthening may or may not have?”, then the answer is clearly yes, to the extent that reports are correct. We should be cautious, however, before we take this to mean necessarily that this language has no final lengthening, or that final lengthening in this language is non-strengthening, in light of the results of CambierLangeveld’s 1997 study of final lengthening in Dutch, where it was shown

Final weakening

147

that when the final vowel of a word was schwa (and hence potentially resistant to lengthening), final lengthening simply took place over a larger domain extending back from the final syllable. A careful study of a language reporting particularly great reduction of phrase-final vowels could well discover the same pattern, that not only did the language have final lengthening (just retracted from the reduced final vowel), but that it was even strengthening the articulation of the material it applied to as well. If the latter were true, however, we would legitimately wonder why the final vowel was reducing in the first place, perhaps even to the point of reducing a five vowel system to the three vowels common in the unstressed vocalism of the languages of the world. The following is a potential answer to this question: It is clear from the discussion of the phonetic properties of final syllables above that, whatever properties their supralaryngeal articulations may take, other properties are contributing to a general lack of perceptual robustness for material realized in this position. It was argued in the discussion of devoicing, above, that if the subglottal pressure drop associated with phrase-final vowels is strong enough that they are routinely being realized as all or partly voiceless, no amount of supralaryngeal “sonority enhancing” tongue-body lowering, lip opening or gestural slowing will be effective in creating the conditions for the accurate perception of the quality of the final vowel. The following example is highly suggestive in this regard. In Standard Malay as described by Teoh (1994), final syllables license no contrast between mid and high vowels, while elsewhere such a contrast exists. Realization of final syllable vowels, however, varies: In final open syllables, the three possibilities are /i, u, /, while in final closed syllables, there are only [e, o, a], where [e] and [o] are actually realized somewhere between mid vowels and laxed high vowels. Certainly closed syllable laxing/lowering of high vowels is common enough in Austronesian languages of the area, but it is usually accompanied by raising of the low vowel as in, e.g., Standard Indonesian (Macdonald and Darjowidjojo 1967). Here, however, in the final closed syllable the /a/ stays low. A possible, though strictly hypothetical solution is this: Contrasts between mid and low vowels were neutralized in all final syllables, but with the results varying depending on syllable structure. In final closed syllables, we have an example of the type of final lowering discussed above, possibly an effect of final lengthening and sonority-enhancement. This may or may not have originally affected final open syllables as well. Final open syllables were then (either subsequently or from the outset) subject to the type of phonetic attrition discussed above in connection with devoicing. That a combination of features affecting final vowels such as a tendency to total or partial devoicing, a drastic amplitude drop, and various

148

Final syllables

instantiations of non-modal phonation types could lead to a reduction in the perception of vowel quality contrasts for final vowels (while vowels in closed final syllables might be shielded somewhat from these effects due to increased distance from the phrase boundary) is not difficult to conceive. Some combination of various of the above should easily be enough to obscure vowel distinctions and cause reanalysis and collapse of contrasts. But why raising? Certainly breathy phonation has been associated with raising in some languages (Gordon and Ladefoged 2001: 400). Another possibility, however, one that would account for both reduction through raising and laxing of phrase-final vowels suggests itself. Take the following spectrogram, of the Russian word /boloto/, [b lot ], ‘swamp’, in phrasefinal position. 5000 4000 3000 2000 1000 0 0

0.789161 Time (s)

–0.25

0

b

o

l

o

t

0.25 0

o 0.789161

Time (s)

Figure 9. Russian phrase-final [b lot ] ‘swamp’

Again, Russian unstressed /a, o/ is realized anywhere from [ ] to [] in phrase-final position, depending on speech rate and register (i.e. on duration). In this particular instance, the vowel is significantly lengthened (particularly in comparison with a non-phrase-final vowel of this sort), measuring nearly 180 ms. from onset of voicing to the last reasonable

Final weakening

149

glottal pulse. The first pretonic vowel (another strong position) is only 115 ms. in duration. The F1 maximum for the final vowel shows that it is in the Degree 1 reduction range, as discussed above: 595 Hz. (The first pretonic vowel here has an F1 maximum of 556 Hz, close to the mean for this speaker – see Section 3.6.) The pretonic vowel, however, does sound impressionistically at least more ‘a’-like than the final. An important reason for this could be that, in addition to the fact that the amplitude of the pretonic and stressed vowel is substantially higher than that of the final vowel throughout, the amplitude of the first pretonic vowel is more or less equally high throughout its duration, including, obviously, at its center where the F1 maximum occurs. For the final vowel, however, this is not the case. The amplitude of the final vowel decreases steadily throughout, becoming extremely low and ultimately fading to nothing by the end. What is more, the peak F1 for the final vowel occurs relatively far into its course, approximately 90 ms. into the vowel. By this point, of course, the amplitude of the vowel has fallen considerably, and only gets lower. With the application of final lengthening, the vowel has become extremely long by comparison, and assuming that gestural stiffness is lowered (meaning slowing), but target displacement is not reduced, this means that the time to peak displacement will have increased significantly. What seems to be occurring, in other words, is that the “sonority enhancement” discussed by Cho (2001) is taking place, but the higher F1 peak it produces is coming quite late relative to the gradually decaying amplitude of the vowel. This has the effect of making the part of the vowel with the highest F1 cooccur with the part in which perceptibility is drastically decreased, as an already lowered amplitude continues to drop. The most perceptible and least distorted (by non-modal phonation) part of the vowel, on the other hand, is at the very beginning, during the lengthened CV transitions, long before the vowel reaches its target articulation. This unfortunate phasing relationship between the supralaryngeal target of the vowel and the point at which the laryngeal and aerodynamic weakening of the final vowel is at its lowest could cause listeners to perceive the height of the earlier part of the vowel as the intended target, with reinterpretation in this instance leading to raising. For high vowels, by contrast, the same effect could lead to centralization or laxing. Compare this with the realization of the first pretonic vowel, which, though ultimately produced with a lower peak F1, nonetheless attains that target articulation relatively quickly, and then maintains it throughout a robust steady state. If this scenario took place in the final open syllables of Standard Malay as described above, it could explain the raising these final vowels, in particular, undergo. If this masking of the “strengthened” target articulation

150

Final syllables

by phonetic weakness of another sort is enough to lead to phonologization, the result could be similar to that described above for Dutch: Once the percept of laxing or raising gave rise to the phonologization of an actual target articulation of the final vowel that was laxed or raised, the durational characteristics appropriate to the resulting vowel would be realized here as well, final lengthening or not (just as described above for newly devoiced finals, or for the Dutch schwa). In this new situation, phrase-final vowels might actually be shorter and more reduced than their phrase-medial counterparts, though in the original situation, of course, they were not. The nature of final lengthening and the potential gestural strengthening of the affected segments, however, would not necessarily have changed. As demonstrated for Dutch, if the vowel in phrase-final position is inherently short to the point of resisting lengthening, lengthening might simply be felt further back in the word as its unaltered domain span came to include more of the word due to the shortness of the final segment. It is possible, then, that reduction patterns targeting specifically phrasefinal syllables are actually the result of a phonologization caused by the perceptual weakness of those syllables, and not necessarily by any original weakening of supralaryngeal gestures in that position. If the perceptual weakening story is true, then, as with devoicing, in the pre-phonologization state of the system, there is no problem with assuming final lengthening to have applied to unstressed final syllables with articulatory strengthening, just as it has been shown to do in numerous examples laid out in Section 3.2.3. Obviously only experimental evidence will resolve these questions. This idea of final reduction patterns originating in the phonologization of the perceptual weakness of the final syllable, rather than in decreased magnitude of supralaryngeal gestures, may account for another trend found in final syllables in some languages: monophthongization. In most dialects of Seediq, for example, word-final diphthongs /aj/ and /aw/ become /e/ and /o/ (Holmer 1996: 25).112 If under final lengthening the latter parts of these diphthongs were realized in such circumstances that their accurate perception were to be hindered, a listener might perceive only the coarticulated first half of the diphthong, and believe it to be the intended production of the speaker, a syllable-internal umlaut of sorts. This phonologization of a raised and fronted, or backed, raised and rounded /a/ could yield the mid vowels attested today.

Vowel quantity in final syllables

151

3.7. Vowel quantity in final syllables 3.7.1. Neutralization of quantity contrasts in final position Several instances of processes affecting vowel quality have already been discussed in this chapter, particularly the instances of mora-clipping, either through devoicing or deletion, and the deployment of features like devoicing and glottalization in patterns that tend to stave off the collapse of quantity contrasts by inhibiting the phonetic processes which work against the persistence of those contrasts. For the sake of completeness, and due to the relevance of certain of these patterns to wider concerns in phonological theory (e.g., in metrical stress theory) I will now present a survey of cases in which quantity contrasts are in fact neutralized word-finally by one means or another. The situation turns out to be more complex than has previously been suggested in the literature. Languages in which there is a contrast between phonologically long and short vowels in all positions save word-final are extremely common crosslinguistically. Final position, indeed, can safely be pronounced “weak” for the licensing of quantity contrasts, all things being equal. Examples include Yukulta (Pama-Nyungan, Keen 1983), Koyra (Omotic, Hayward 1982), Dasenech (Cushitic, Sasse 1976), Biyagó (Atlantic, Wilson 2000), many Dravidian languages, including Gadaba (Bhaskararao 1998), Kolami (Subrahmanyam 1998), Konda (Krishnamurti and Benham 1998) and Dhangar Kurux (Gordon 1976), as well as Gatha Avestan (Testen 1997), and Tigre (Semitic, Raz 1983). In most cases, the lack of contrast concerns phrase-medial and phrase-final ultima alike, and the resulting vowel is said to be short, presumably varying in actual duration according to position like any other vowel. In some instances, however, the vowel is said to be long. In Gatha Avestan (Testen 1997), for example, all final vowels are written as long (though this is obviously no guarantee as to their pronunciation). In Kolami, Subrahmanyam describes final vowels as long uniformly. Still other cases speak of final vowels which are half-long. These include Dhangar Kurux (Gordon 1976) and Maltese (Borg and Azzopardi-Alexander 1997), in which latter short vowels all deleted, and remaining long vowels shortened, but are apparently realized longer than they would be word-internally. Avoidance of length contrasts word-finally takes another form as well: The well-known propensity for vowels in final syllables to evade a process known as Iambic Lengthening (Hayes 1995; Hung 1994; Buckley 1998). Iambic lengthening is a process whereby an underlying short vowel, if parsed into the strong position of an iamb, is lengthened. Choctaw is often cited as exhibiting this pattern. Choctaw (Western Muskogean, as given in

152

Final syllables

Hung [1994], Hayes [1995] and Buckley [1998]) parses words into iambic feet, lengthening the vowels in the strong position of these: (45)

Iambic lengthening in Choctaw habina ti-habina-li pisa-li ti-pisa ti-pisa-ti-li tokwikili-ti-li

(habi:)na ‘s/he receives a present’ (tiha:)(bina:)li ‘I receive a present from you’ (pisa:)li ‘I see’ (tipi:)sa ‘s/he sees you’ (tipi:)(sati:)li ‘I cause you to see’ (tok)(wiki:)(liti:)li ‘I shine a light’

Final vowels, however, fail to lengthen in this position: (46)

habina-li habina-ti ti-habina ti-habina-ti-li

(habi:)(nali) (habi:)(nati) (tiha:)(bina) (tiha:)(bina:)(tili)

‘I receive a present’ ‘s/he gives a present’ ‘you receive a present’ ‘I give you a present’

Choctaw, like the languages cited above, has no long vowels derived or otherwise in final position (Buckley [1998] citing Nicklas [1975]). For Hung, this state of affairs results from the interaction of a constraint she calls Rhythm with a constraint mandating that all heavy syllables be stressed – Weight-to-Stress (after Prince 1990) and an OT formalization of Iambic Lengthening called Iambic Quantity (the ‘s’ side of a (w s) foot should be heavy). Rhythm is a constraint mandating that every stressed syllable be followed by an unstressed syllable.113 Among the situations this constraint can treat are the avoidance of stress clash and the avoidance of final stress. Hung’s analysis can be summarized briefly as follows: Rhythm demands that final syllables not be stressed. Weight-to-Stress demands that heavy syllables be stressed, and Iambic Quantity demands that metrically strong syllables be heavy. Thus, parsing the penult and final of a word into an iambic foot would require lengthening of the vowel (assuming other routes to heaviness to be blocked by Faithfulness constraints) because of Iambic Quantity. Lengthening the vowel would demand that the vowel be stressed (Weight-to-Stress), but doing this would violate high-ranked Rhythm. Assuming a certain ranking of constraints on parsing, the solution is to avoid lengthening by not parsing the last two syllables into a foot. Rules of final shortening and the general absence of long vowels in final position can be treated similarly. Again depending on the rankings of Faithfulness constraints, it will sometimes be optimal to shorten an

Vowel quantity in final syllables

153

underlying long vowel rather than to violate Weight-to-Stress by leaving it unstressed, or Rhythm by stressing it. While the non-finality of stress approach can account for some of the problems found crosslinguistically with vowel length in final syllables, it cannot account for them all, and while it seems clear that independently of these phenomena there is a demonstrable crosslinguistic pattern of avoiding the parsing of final syllables for metrical purposes, this is not the whole story here. First, non-finality refers to a tendency to avoid stressing vowels in final syllables of any kind, open or closed. As Buckley (1998) argues, there is a set of phenomena connected specifically to the vowels of final open syllables, not all of which non-finality approaches can address. In addition to the failure of Iambic Lengthening, Buckley calls attention to other cases in which lengthening processes fail to apply to word-final vowels. One of his examples is Italian, in which the vowels in stressed open syllables lengthen (closed syllables being already bimoraic), unless stress is final. This is seen in (47). (47)

Italian stress and vowel length a. eco papa capitano capitano povero

[ko] [pá:pa] [kapitá:no] [ká:pitano] [pvero]

b. papà così caffè resterà

[papá] [kozí] [kaffé] [resterá)

‘echo’ ‘pope’ ‘captain’ ‘they arrive’ ‘poor’ ‘father’ ‘thus’ ‘coffee’ ‘will stay’

When these stressed open final syllables are followed by a consonantinitial word (subject to prosodic restrictions), the preference for a bimoraic stressed syllable resurfaces through a process known as raddoppiamento sintattico (Buckley cites Nespor and Vogel [1986: 165]), whereby the need for a bimoraic stressed syllable results in gemination of the following initial consonant, closing the final stressed syllable. This is shown in (48). (48)

Raddoppiamento sintattico (Nespor and Vogel 1986: 165) a. così buono b. caffè nero c. resterà con me

[kzibbwno] ‘so good’ [kaffennero] ‘black coffee’ [resterakkomme] ‘will stay with me’

154

Final syllables

This is obviously a problem for the non-finality approach, since the syllable in question is in fact stressed (violating Rhythm, which was previously satisfied by avoiding final length), but nonetheless refuses to lengthen. An example similar to the Italian is that of Kuku-Thaypan (Paman, Rigsby 1976), in which stressed vowels are lengthened in closed syllables and open medial syllables. In word-final open syllables, however, stressed vowels do not lengthen, but rather, a glottal stop is inserted after them. One more similar system is Lithuanian (Blevins 1993; Kenstowicz 1972), in which a rule lengthening /a/ and /e/ under accent fails to apply word-finally. The problem here is the same: Since the final vowel is already stressed, there is no need to avoid lengthening it. Buckley also points out the case of Luganda (Hyman and Katamba 1990, 1993), in which compensatory lengthening caused by glide formation fails to apply where it would result in a word-final long vowel. It is not clear how avoidance of final stress or metrical invisibility of any kind could be invoked here, since Luganda is not generally analyzed as having a stress, final or otherwise (see Hyman and Katamba 1993 specifically for an overview and analysis of matters concerning tone and accent in Luganda). It is Buckley’s thesis, instead, that in addition to constraints mandating avoidance of stress on final syllables, languages have a more general tendency simply to avoid final long vowels. It is this tendency which motivates both the failure of Iambic Lengthening word-finally and a variety of other processes. Buckley attributes this tendency to “final lengthening at the phonetic level”. (1998:183) If a language lengthens vowels finally (in some prosodic constituents at least), this could result in the contrast between underlyingly long and phonetically lengthened short vowels becoming difficult to perceive, and thus potentially subject to reanalysis and the collapse of the contrast. Certainly final lengthening could have the effect of making the contrast between long and short difficult to perceive, especially if other phonetic properties of final syllables identified here were simultaneously weakening the perceptibility of the latter parts of any vowels being realized there, such that both long and short vowels would sound like a lengthened vowel that fades away toward the end. That the vowels merge ending up generally as short could be seen as the result of a hypercorrective change, whereby the listener, hearing additional length in final position, attributes that length to the final lengthening (wrongly, in the case of underlying long vowels) and factors it out in his or her interpretation of the speaker’s intentions, reconstructing a final short vowel in all cases. That more must be involved in the tendency to avoid a long/short contrast in final position than merely phonetic lengthening becomes clear when we compare potential fates for quantity contrasts here with those

Vowel quantity in final syllables

155

found in stressed syllables, as presented in 2.1. As noted, some languages collapse quantity contrasts in stressed syllables through lengthening of short vowels. This durational enhancement, however, invariably results in neutralization to a long vowel. This is of course unsurprising given the phonetic processes commonly associated with stressed syllables, the phonologization of which often results in grammars requiring stressed syllables to be bimoraic, or to avoid low sonority vowels like [] or [] (which latter also occurs in final syllables – see below). In short, there is nothing about stressed syllables per se which would ever make them bad positions in which to realize a long duration vowel. If anything, the additional duration is beneficial, serving to enhance the contrast between stressed and unstressed syllables. There are also languages, however, in which only the stressed syllable licenses the contrast between long and short vowels. The implication here is that while unstressed syllables are apparently required to be too short to contain long vowels, short vowels in stressed syllables are not lengthened so much as to endanger the long/short contrast. In final syllables, the situation is different. That these are never the sole position in which a quantity contrast occurs falls out from the fact that, while unstressed syllables may be durationally compromised or restricted relative to stressed syllables in some systems, it is never the case that along with final lengthening the durations of non-final syllables are actually curtailed or limited. This difference must be due to the lexical nature of the stressed/unstressed contrast, as opposed to the largely phrase-level operation of the final lengthening process. But why, when final lengthening does cause neutralization of long/short contrasts, is the resulting vowel so often phonologized as short or “halflong”, as some descriptions put it? Buckley’s answer is that short is the “unmarked” member of the opposition, to which we therefore default. But this seems rather unsatisfying. By now we recognize clearly that “markedness” of features or segment types is often not absolute, but rather is comprehensible only relative to the environments within which the features in question are being realized (Steriade 1994, 2001). In stressed syllables, obviously, the perceptual confusion caused by lengthening short vowels does not result in default to the “unmarked” short member of the opposition.114 Of course, it should not, since stressed syllables are a lengthening context, in which short vowels may sound long, but nothing about long vowels is likely to sound short. For a listener to simply decide, then, that the phonetically manifestly long stressed vowels were in fact intended to be short because this option is “unmarked” is incomprehensible. But final position is also a lengthening context, and, absent any other conditioning factors, such a decision on the part of the

156

Final syllables

listener seems no less unwarranted. There must, therefore, be something else at work here. Again, while lexically stressed syllables will express their additional duration in all tokens of the string in question, final syllables will generally acquire significant length only prepausally, where they are also characteristically weakened in some respects as well. This weakening, primarily the result of a sharp drop in subglottal pressure in phrase-final position, is not a characteristic of stressed syllables, and may help account for differences in the quantity neutralization patterns found in the two positions. The logic is this: In stressed syllables, phonetic realizations of both long and short vowels prior to neutralization may come to be, let us say for the sake of argument, in the neighborhood of 200 ms. and with steady amplitude throughout. Their similarity in this respect would obscure the contrasts between long and short and cause neutralization in favor of the longs. In final syllables, by contrast, both long and short vowels might come to be 200 ms. or so in duration, but with a significant amplitude drop, possibly together with non-modal phonation or devoicing toward the end. They have thus also come, as it was with stressed syllables, to be so similar that the contrast between them is in danger of collapse. But here there is a problem. If word-internal long vowels are, say, also 200 ms. long, but with steady amplitude and no devoicing or the like, and the listener ordinarily factors out or corrects for some additional duration in phrase-final position as the result of domain-final lengthening, final vowels would sound less like true long vowels (which should be long and steady), and more like short vowels that had some duration added to mark the phrase boundary, but which nonetheless sound weaker in some respects than a real long vowel ought to. The full complexity of the typological picture is thus only comprehensible when the full range of phonetic effects characteristic of final position is taken into consideration. While Buckley’s notion of a general tendency to avoid length in final syllables owing to the phonetic trend of phrase-final lengthening does help to explain the failure of a variety of lengthening processes word-finally, including Iambic Lengthening, it is less clear what we should make of the more general tendency for avoidance of final stress crosslinguistically with or without Iambic Lengthening playing a role. For Hung this is the effect of the constraint Rhythm, demanding an unstressed syllable following every stressed one. A phonetic lengthening trend in final position, on the other hand, should have the opposite effect (if any). Insofar as duration is a frequent crosslinguistic cue for stress, we might imagine that final lengthening would actually attract final stress, rather than the reverse. Hock (1999: 22–24), by contrast, discusses accent retractions from final syllables (in Baltic in particular) as a function of the monolithic final weakening he

Vowel quantity in final syllables

157

proposes. Essentially, he views accent retraction as a consequence of some degree of “segmental attrition” in final syllables, in some cases leading to the deletion of final vowels (as in certain Lithuanian dialects, and Latvian as noted above). Given the above discussion concerning the phonetics of domain-final syllables (assuming the retraction trends to have originated in larger domains, and not phrase-internally), whatever segmental attrition may be giving rise to accent retractions is most likely the result of the final drop in amplitude and F0 (possibly with attendant devoicing or creak), which have been seen to take place even in the presence of substantial final lengthening. A potential prediction arising from this characterization of final position is that languages with strongly duration-cued stress (such as Russian) should be less subject to stress retraction or final stress avoidance, these latter being more characteristic of languages in which stress is cued more saliently by, e.g., F0 contours. Crosslinguistic investigation of the correlation of accent type and non-finality (particularly in Balto-Slavic) may yield telling results in the future. 3.7.2. Final short vowel avoidance As much as Buckley is correct in seeing a crosslinguistic tendency for languages to avoid contrastively long vowels in word-final position, and despite the fact that at least some languages (e.g., Dutch) clearly have the means of implementing final lengthening even in cases where the final vowel itself is resistant to the process, there is nonetheless an equally clear tendency for languages to avoid certain short vowels in word-final position as well. Johnson and Martin (2001) note in passing that Korean, Tigrinya, and Tigre all disallow final [], while Moroccan Arabic prohibits final []. To these examples we may add the following: (49)

Eastern Mari (Finno-Ugric – Sebeok and Ingemann 1966; Flemming 1993, Kangasmaa-Minn 1998, Majors 1998) No reduced [] in word-final position. See above, Section 3.2.3 for details.

(50)

Sulaimania Kurdish (Iranian – Ahmad 1986; McCarus 1997) No short (lax) high /, / in word-final open syllables.

158

(51)

Final syllables

Punjabi (Indo-Aryan – Bahl 1957) No short /, , / before pause, where these lengthen instead.

(52)

Thaadou (Kuki-Chin – Thirumalai 1972) No mid central // in word-final open syllables.

(53)

*Proto-Austronesian (Blust 2000) No schwa in word-final open syllables. Among the many modern Austronesian languages exhibiting this trait are Batak (Dairi dialect, Sumatra, van der Tuuk 1971), Cotabato Manobo (Mindanao, Kerr 1988), Iban (Sarawak, Omar 1981), and Buginese (S. Sulawesi, Sirk 1983).

The pattern here is immediately evident. In all cases the short vowels disallowed are either high or mid central vowels, renowned for their brevity, or the set of “short”, “lax”, or “reduced” vowels in systems where the long/short contrast is one in which long or full vowels are not in fact particularly long, but are peripheral, while the short vowels are centralized and often characterized as “over-short”. Thus, in Mari for example, Crosswhite cites Gruzov’s 1960 instrumental study of Standard Mari (and Volga Mari) in which the average duration for the reduced vowel in standard Mari pretonic syllables is somewhere in the neighborhood of 50–60 ms., depending on context, while full vowels are generally between 95 and 125 in the same environments. What seem to be disallowed in final syllables crosslinguistically, then, are these over-short vowels. In systems where short vowels too are peripheral and long vowels are described as more or less twice the length of short vowels (canonical monomoraicbimoraic oppositions, as in the Dravidian languages noted above), by contrast, the long/short opposition seems generally (though perhaps not exclusively) to be neutralized in favor of the short vowel. 3.8. Summary In this chapter I have presented a survey of the phonetic and phonological patterns common to domain-final syllables and, where possible, I have explicated the relationship between the former and the latter. Final syllables were shown frequently to host patterns both of phonological strength and phonological weakness. Strength effects in final position were of a sharply

Summary

159

limited character; final syllables rarely if ever function as the sole center of prominence in the word domain in the way that, say, stressed syllables, frequently do. When they are strong at all, this strength is generally manifested as a pattern of final resistance to some other reduction or assimilation process (such as UVR) which would otherwise have targeted a syllable of the prosodic status in question. This difference between final strength and UVR was attributed to differences in the consistency and magnitude at the word level of the durational asymmetries giving rise to the patterns. Final resistance was shown to be often phrase-level only and gradient in its operation. It also targets in the vast majority of cases only vowels in open final syllables, a fact following directly from phonetic details concerning the domain of application of final lengthening. A nonphonetically-oriented approach to PN patterns in final position has no way of accounting for any of the typological generalizations drawn here. As for phonological weakness, neutralization patterns targeting specifically domain-final material were shown to exist, a fact which is wholly contradictory from the point of view of the foregoing discussion of final strength and phonetic enhancement. In addition to these trends, however, final position also hosts a number of phonetic patterns, the effect of which is ultimately to decrease the perceptual robustness of domain-final material. These patterns include drops in pitch and intensity correlated with lowered subglottal pressure, devoicing, non-modal phonation, such as breathiness or creak, and even potentially nasalization. Prominent instantiations of various combinations of these tendencies can interfere with the accurate perception of vowel contrasts in domain-final position, leading to the phonologization of neutralizations specific to that position. This combination of factors contributing at once to increases in phonetic prominence and to perceptual obscurity is the source of the phonological ambiguity of final position with respect to the typology of positional neutralization. Because of this, final syllables cannot be considered from a crosslinguistic perspective either monolithically strong or monolithically weak in terms of their potential to license vowel contrasts. Indeed, they may exhibit this variable licensing potential even within a single system, such as in Nanai, where final syllables are strong for the licensing of nasality, but weak for pharyngealization contrasts. This is a serious challenge to theories of positional neutralization which assume that the phonological strength or weakness of structural positions is specified in Universal Grammar. At best strength or weakness in final position is a parameter which must be set on a language-specific, inductivelydetermined basis. This fact about final syllables also gives us reason to question whether it is truly necessary for strength or weakness to be predetermined by UG in any other position either. Given that patterns of

160

Final syllables

strength or weakness will arise in each position through the phonologization of phonetic patterns occurring there independently of phonological licensing restrictions, it is not clear what such restrictions in Universal Grammar are contributing to the situation. Where phonetic patterns are consistently oriented toward a single perceptual profile, robust or obscure, phonological patterns in that position will be equally uniform (as in, e.g., stressed syllables). Where the phonetic picture is more equivocal, however, phonologization will yield a typological range of patterns which is equally ambiguous, as the facts of final syllables have made clear. Furthermore, the relationship of phonetic prominence to phonological strength is not a simple one. Patterns of phonological strength may arise which bear no obvious relationship to the quality of the phonetic cues for that contrast in that position (laxness in Pasiego Spanish, nasalization in Nanai, final dominance in Timugon Murut), and likewise contrasts may collapse due to the addition of properties such as duration, which are otherwise thought to enhance the perceptibility of the contrasts in question (Smith’s Positional Augmentation – here potentially instantiated in final lowering). It is by no means obvious, however, that the synchronic status and implementation of a pattern of phonological strength not supported by the robustness of its characteristic cues (Nanai nasalization, Timugon finalsyllable super-licensing, Pasiego laxing) is in any way different or less authentic than a pattern that is supported by such phonetic cues. I will take up the implications of these complications for theories of positional neutralization again in Chapter 5.

Chapter 4 Initial syllables

Domain-initial syllables are famously associated in the literature with phonological strength effects. For vowels, the best-known of these effects are the harmony systems or licensing asymmetries of the Uralic, Altaic, and Bantu languages. The first striking fact about initial-syllable PN patterns affecting vowels is the wide variety of featural contrasts potentially involved. Chapter 2 demonstrated that PN patterns involving stressed syllables were severely limited in the set of contrasts they potentially restrict: The overwhelming majority of cases involve vowel height, while nasalization and quantity may also be affected. In initial syllables, by contrast, PN patterns are attested involving vowel height (Bantu, Gujarati), frontness/backness (Uralic and Altaic), roundness (Uralic and Altaic), nasality (Gokana, Kurux), pharyngealization (Tuvan, Tungusic), and breathiness (Gujarati). Thus, while in unstressed vowel reduction the driving force behind most attested systems was clearly pressure on the articulation of non-high vowels due to the durational impoverishment of unstressed syllables, in initial syllable the phenomenon leading to the phonologization of PN is something far more general in its effect. Interestingly too, most of these PN systems involve vowel harmony, rather than, for example, the types of surface inventory reductions characteristic of UVR systems. The second, even more striking fact about the positional neutralization of vowel contrasts in initial syllables is the rarity, not to say non-existence, of uncontroversial examples of this pattern in the first place. I show in this chapter, in fact, that the vast majority of cases (indeed perhaps all) cited in the literature as instances of initial syllable PN can in fact be attributed to one of two confounding factors: past or present fixed initial stress in the languages in question (e.g., Uralic and Altaic, Dravidian), or cases in which the initial syllable is or was at the relevant moment coextensive with the morphological root in the grammatical subsystems in question (Tiv, Gokana, Bantu, Yowlumne). In this chapter I contrast two approaches to initial syllable PN effects. One of these, the phonologization approach familiar from the preceding chapters, attributes typological regularities found in PN systems to the phonologization of patterns arising originally in the phonetics. The second approach locates the force generating typological patterns squarely in Universal Grammar. In the case of initial position, furthermore, it is

162

Initial syllables

commonly said to be the psycholinguistic status of initial material which is responsible for the prominence which underwrites the attested patterns (Beckman 1998; Smith 2002). It is initialness itself, in other words, rather than any concrete phonetic attributes of that initial material, which drives the patterns in question. I will argue in this chapter that neither the startling rarity of initialsyllable PN affecting vowels, nor the apparent limitation of true instances of this PN to vowel harmony follow from the “initialness itself” approach. Both these patterns, however, can be understood as consequences of the phonetic characteristics commonly associated with initial syllables crosslinguistically. The chapter is structured as follows: Section 4.2 presents a discussion of the most important recent literature on initial syllable effects. Section 4.3 introduces the phonetic processes commonly associated with initial syllables. Section 4.4 begins an analysis of initial syllable vocalic strength effects in light of what we know of the relevant phonetic facts, while Section 4.5 presents the results of an experimental study of vowel durations in Turkish demonstrating the possibility of a non-accentual origin for additional duration on the vowels of initial syllables, and sets forth a model for how this additional duration could contribute to the phonologization of the harmonic strength effects associated with that position. 4.1. Recent work on initial syllable phonology The most comprehensive recent treatments of phonological licensing patterns involving initial syllables are to be found in Beckman’s 1998 study of Positional Faithfulness and Smith’s 2002 study of Positional Augmentation. These present catalogues and important analyses of initialsyllable strength effects for both consonants and vowels. Beckman treats only Positional Faithfulness or phonological strength effects in initial syllables, while Smith is concerned with both these and phonological weakness effects. Smith refers to restrictions on the licensing of contrasts in otherwise phonologically strong positions as Positional Augmentation, since in all cases these stem from phonetic strengthening processes charged with increasing the perceptual prominence of the position in question. She contrasts Positional Augmentation effects with phonological strength effects in these same positions, which she calls Positional Neutralization effects, though in the now-generic usage derived from Steriade (1994), both patterns are examples of the latter. To avoid confusion with terminology used in earlier chapters here, I will substitute Positional Strength for Smith’s Positional Neutralization. The issue is purely practical

Recent work on initial syllable phonology

163

and should not be taken to bear theoretical significance. One example of a Positional Augmentation pattern in initial syllables is the relatively common demand for low-sonority consonants in initial or stressed syllable onsets, which has the result of causing fewer consonantal contrasts to be licensed in those positions. Another example would be the demand for long or high-sonority vowels under stress, discussed in Chapter 2 and also in Chapter 3 in the context of final vowel lowering. 4.1.1. Functional motivations for PN in stressed and initial syllables Both Beckman (1998) and Smith (2002) divide strong positions into the phonetically strong (e.g., stressed syllables), and the psycholinguistically strong (e.g., initial syllables, roots). For Beckman little follows from this classification. Smith pursues this dichotomy further however and examines the potential effect the source or nature of a position’s phonological strength may have on the typological patterning of the licensing asymmetries found therein. Smith’s prediction is essentially the following: A position which is phonetically strong may show both Positional Strength and Positional Augmentation effects, since phonetic strengthening can have the effect of both exempting a position from lenitions or reductions (as with final resistance to VR) and also itself of effacing contrasts by, for example, lengthening or lowering the sonority of vowels. Psycholinguistically strong positions, on the other hand, will, according to Smith, show only a restricted inventory of Positional Augmentation effects, since the importance of these positions for processing makes it crucial that as many contrasts as possible be realized faithfully there.115 It is Smith’s contention that initial position shows precisely these characteristics crosslinguistically, such that while instances of initial syllables as sole licensers of a language’s full inventory of contrasts are common, both for consonants and vowels, instances of Positional Augmentation are severely restricted. Specifically, Smith claims that the only such processes one finds are related to the presence or sonority of onset consonants in initial syllables. This is expected according to a principle Smith formulates as the Segmental Contrast Condition, which mandates (to oversimplify somewhat) that segmental contrasts not be neutralized in a way that would be detrimental to early-stage word recognition, unless those neutralizations themselves contribute to segmentation of the speech stream (e.g., by providing cues for the presence of word boundaries). Thus, the requirement for low-sonority word-initial onset consonants fulfills this latter goal, and hence is allowed to neutralize, for example, the voicing contrast for obstruents, even though doing so is unhelpful from the point of

164

Initial syllables

view of word recognition. According to Smith, these are the only contrastneutralizing effects found in initial syllables, with the types of Positional Augmentation phenomena affecting, for example, the vowels of stressed syllables being conspicuously absent. The typological regularities Smith seeks to capture for initial position using the psycholinguistic vs. phonetic strength dichotomy are, to summarize, these: Word-initial onset consonants are seen both to undergo neutralizations involving the lowering of their sonority, and also to preferentially license more contrasts than the onsets of other syllables within the word. For vowels, on the other hand, initial syllables frequently license more contrasts than other positions, but no neutralizations resulting from phonetic augmentation for the sake of perceptual prominence are seen to occur. This follows from the classification of initial syllables as psycholinguistically, rather than phonetically, strong. If we assume (incorrectly, as it turns out) that initial syllables are not subject to phonetic strengthening, then any licensing asymmetries observed in initial syllables must owe their existence to something else, such as psycholinguistic prominence. This prominence, with the restrictions on licensing asymmetries it carries, is what Smith uses to account for the typological regularities observed in the licensing potential of this position. By contrast, stressed syllables, which are phonetically strong, are seen by Smith both to behave as strong licensers of vowel contrasts (as in VR systems), and often to undergo Positional Augmentation of the vowel, resulting in the neutralization of contrasts. The onsets of stressed syllables are also seen to undergo Positional Augmentation of the type found in initial syllables (e.g., mandatory low sonority onset), but are said not to function as strong licensers of contrasts for onset consonants in the way initial syllables are. This is said to be comprehensible under the assumption that stressed syllables are not psycholinguistically prominent in the same way the initial syllables are, such that there is no processing-related mandate to conserve contrasts in those syllables. On the contrary, the most important factor is that the stressed syllable be perceptually prominent as such, and often in the service of achieving this additional prominence, both onset and vowel contrasts are lost. In sum, Smith’s account of initial syllable PN processes relies on psycholinguistic properties resulting from the fact that the syllable in question is initial in a lexical entry to derive the typological patterning of PN effects.

Licensing asymmetries in initial position

165

4.1.2. Agenda for this chapter As noted above, one of the main contributions of this chapter is a reexamination of the typology of vocalic licensing asymmetries in initial position, a typology which as it turns out has been characterized only incompletely in all previous work. On the basis of this reexamination, I will argue that all patterns of phonological licensing asymmetries attested in initial position are entirely predictable on the basis of the documented phonetic characteristics of those syllables, and, furthermore, that the predictive fit achieved on this basis is a closer one than that of a model assuming this typology to flow from the psycholinguistic status of the initial syllable. This argument is very specifically devoted to the question of how best to account for the typological regularities observed in patterns of positional licensing asymmetries, and here again I argue for the phonologization model over accounts based on the putative structure of Universal Grammar. In arguing that psycholinguistic status is not a directly operative factor in determining the typological regularities involving initial syllables, I am not in any way disputing that psycholinguistic status itself, which has been clearly demonstrated irrespective of any hypothesized role in constraining patterns of positional neutralization. I am also not disputing the potential need for separate formal mechanisms in the synchronic grammar to deal with Positional Strength and Positional Augmentation, as proposed by Smith (2002). As discussed in Chapter 5, I am also in agreement with Smith that the patterns of asymmetrical phonological licensing in question are best handled in synchrony through the manipulation of abstract categorical symbols in the phonology. These patterns should be held distinct, however, from the phonetic patterns out of which, through phonologization, they arise. It is this latter point which will concern me most here. 4.2. Licensing asymmetries in initial position This section provides an overview of the typological generalizations emerging from the database of several hundred languages I have constructed in connection with this project. In my survey, the licensing asymmetries involving the onsets of initial syllables described by Beckman and Smith are clearly attested. Other regularities, however, not identified in previous studies, emerge as well. For cases in which initial syllable onsets license more contrasts than the onsets of other syllables, Beckman cites as examples, among others attested, Doyayo (implosives, labiovelars), Shilluk (secondary articulations), and !Xóõ (clicks). Among languages with

166

Initial syllables

Positional Augmentation of initial-syllable onsets, Smith cites Arapaho and Guhang Ifugao as requiring initial syllables to have onsets, and Mongolian, Mbabaram and Campidanian Sardinian as requiring that those onset consonants be of lower sonority. Again, that these contrast-neutralizing constraints should be allowed in this psycholinguistically strong position is justified by the fact that the presence of a low-sonority onset aids in the segmentation of the speech stream by marking the beginning of the word with a perceptually robust low-sonority to high-sonority CV-transition. Smith presents evidence that robust modulation of the signal in this way increases the perceptual salience of a syllable. While it is true, however, that this “strengthening” of the onset of the word could assist in demarcating a word boundary, and thus be permitted under Smith’s Segmental Contrast Condition, it is not at all clear how this requirement necessarily limits the Positional Augmentation constraints referencing that position to those affecting the presence or sonority of onsets alone. If it is the case, as Smith argues, that the perceptual robustness of initial material can assist in the segmentation of the speech stream, and if it is likewise true that a steep rise in sonority at the beginning of a syllable increases its perceptual prominence (Smith 2002: 78–80), then there is no reason that raising the sonority of the nucleus of the initial syllable would not be just as effective a measure in the service of boundary demarcation as would lowering the sonority of its onset. The latter is quite well attested, of course, while the former is not known to occur.116 The psycholinguistic explanation of the typology here, however, provides no reason to favor the one strategy over the other (in fact, if the informationbearing status of the contrasts in question should be a factor as to which neutralization would be more detrimental to lexical access, the consonantal effects might well be disfavored, rather than attested to the exclusion of anything else). Implementation of both might conceivably neutralize too many contrasts for the system to bear, though then we need a metric for evaluation of licit numbers of neutralizations in a given system, rather than merely guidelines as to which types are permitted and which are not. For that matter, since it is argued also that a bimoraic vowel increases the perceptual prominence of a stressed syllable, and also that in some systems (such as English) the prominence of the stressed syllable is used to facilitate the segmentation of speech, it is not clear why, in a language without duration-cued stress, all initial-syllables should not be required to be bimoraic. Such a neutralization would contribute enormously to the perceptual prominence of the initial syllable, and hence to the demarcation of the word boundary, while presumably not detracting nearly as much from word recognition as the neutralization of multiple consonantal contrasts would. And yet such a thing seems to be utterly unattested in the

Licensing asymmetries in initial position

167

languages of the world. Smith’s formulation of the Segmental Contrast Condition defines legitimate exceptions to bans on neutralizations in psycholinguistically strong positions as those in which the satisfaction of the Positional Augmentation constraint “serves to demarcate the left edge of σ1” (Smith 2002: 296), which perhaps implies a priori limitation to the onset consonant, yet the sonority rise from C to V implicated in boundary demarcation by Smith is surely as much a property of V as C. Additionally, even if the constraint affected only the vowel, by mandating that it be long, for example, surely the fact that the vowel is not always in absolute-initial position cannot mean it is incapable of assisting the listener in the process of segmentation of the speech stream. Indeed, it is a truism of functionalist speculation that systemic properties like fixed stress (initial or otherwise) and even word-bounded vowel harmony ultimately serve some demarcative purpose, and yet do not necessarily involve directly boundary-adjacent material. The purely psycholinguistic approach to licensing asymmetries in initial syllables is therefore by no means as restrictive as it would need to be to account for the typology of Positional Augmentation effects in that position. Turning to vocalic contrasts, as noted, Positional Augmentation effects appear not to occur. As for Positional Strength, perhaps the most striking thing about initial position is, despite numerous claims in the literature, how very few such licensing asymmetries are in fact attested. Indeed, even of those known and accepted, the majority are either ambiguous in status from the outset, or ultimately attributable to phonetic factors rather than to the “initiality” of the syllable per se, or any psycholinguistic prominence bestowed on the initial syllable thereby. Before turning to the attested cases of licensing asymmetries for vowels in initial syllables, it should be noted that, in contrast to an entirely processing-based account, the typological patterning of licensing asymmetries in initial position described above is entirely predictable from the well-attested phonetic characteristics of initial syllables alone. The idea that initial syllables are not the locus of phonetic prominence enhancement is just not accurate. In fact, a phenomenon known as “initial strengthening” in the phonetic literature is attested in a variety of languages, where it has been seen to affect only the onset consonants of domain-initial syllables (Byrd 2000; Cho and Jun 2000; Dilley, Shattuck-Hufnagel and Ostendorf 1996; Fougeron 1999; Fougeron and Keating 1996; Keating et al. 1999; Oller 1973, inter alia). The following section introduces the details of initial-strengthening phenomena.

168

Initial syllables

4.3. The phonetics of initial position Domain-initial strengthening refers to a set of processes that have been shown to enhance various aspects of the articulations of domain-initial consonants in a fairly wide variety of languages, including to date English, French, Taiwanese, and Korean. Experiments such as those cited just above have documented an increase in the magnitude of the supralaryngeal gestures associated with initial consonants, assessed according to the amount of linguopalatal contact involved in the constrictions (measured using electropalatography). In addition to gestural magnitude, closure durations of the relevant initial consonants were also found to be greater than those of their word-internal counterparts.117 Laryngeal gestures have also shown significant enhancement in domain-initial position. Specifically, glottal opening gestures (and likewise voice onset time of aspirated stops) have been shown to increase in magnitude in English (Pierrehumbert and Talkin 1992), as does VOT in Korean (Keating et al. 1999). Moreover, Cho and Jun (2000) actually demonstrate for Korean increased glottal airflow and VOT for the lenis and aspirated series of stops, and decreased glottal airflow and VOT for the fortis stops, suggesting a general enhancement of articulations producing cues important to the perception of specific contrasts. A number of the initial strengthening studies have also recorded an effect similar to that discussed in Chapter 3 in connection with the implementation of final lengthening, to wit, that the degree of strengthening of domain-initial articulations is greater at the boundary of higher-level prosodic domains (e.g., U, IP, AP) and smaller at the left edge of lower-level constituents. Cho and Jun (2000) go on, intriguingly, to characterize certain patterns of initial strengthening as functioning specifically to enhance the syntagmatic contrast of the relevant segments. Thus, what they characterize as an increase in the “consonantality” (Cho and Jun 2000: 58) of the initial consonant serves to enhance the contrast between it and the following vowel. Additionally, building on work by Hsu and Jun (1998) on Taiwanese, they hypothesize (and demonstrate for Korean stops as noted above) that other strengthening patterns serve to enhance paradigmatic contrasts (in the Korean case, those between lenis, fortis and aspirated stops). Cho and Jun (2000: 67) themselves suggest directly that these phonetic strengthening processes may serve to enhance the segmentation of the uninterrupted speech stream into smaller units, in addition to potentially facilitating lexical access. The parallels here should be abundantly clear between phonetic syntagmatic strengthening of the initial CV contrast and phonological Positional Augmentation of wordinitial onsets on the one hand, and phonetic paradigmatic strengthening of

The phonetics of initial position

169

initial consonants and phonological positional neutralization of consonantal contrasts in non-initial positions on the other hand. Domain-initial strengthening of vowels of initial syllables with onsets, however, has not been documented in any of the studies cited above. Cho and Jun (2000) note that domain-initial strengthening seems to target exclusively that first segment of the word of domain-initial position (in contrast with final lengthening, which targets the entire final rhyme, and often more). Byrd (2000) discusses this pattern together with final lengthening in the context of an experiment assessing the lengthening of gestures on both sides of a phrase boundary. As noted in Chapter 3, she attributes the lengthening of both domain final and initial material to a single articulatory phenomenon, a local decrease in gestural stiffness resulting in increased duration for the segmental material involved. This decrease in gestural stiffness she attributes to the activation of what she calls a - (prosodic-) gesture, a non-constriction-based variable in the task dynamic model of Articulatory Phonology that can alter the stiffness parameter for any gestures taking place during the scope of the -gesture. The -gesture in question here straddles the phrase boundary, affecting the segmental material on either side, and the material closest to the boundary more dramatically than that further in on either side. This accounts nicely for the gradual decrease in the effect of final lengthening as distance from the boundary increases. It also predicts that initial consonants should be more strongly affected than the vowels following them, though perhaps not the total lack of effect seen on this vowels in the languages tested. Byrd speculates that the articulatory strengthening documented on boundaryadjacent segments might also be a function of the presence of this gesture, though the specifics of this interaction are still not entirely clear. Now if the pattern of phonetic initial strengthening described here were ultimately to form the basis of the positional licensing asymmetries found in initial syllables, we would expect exactly the effects described by Beckman and Smith for initial onset consonants, while expecting the absence of either Positional Augmentation or Positional Strength effects involving the vowels of initial syllables. Smith documents the nonexistence of the former, and I will turn momentarily to the documentation, despite all expectations (including my own), of the latter. I should note here, however, an important point about the potential relationship between psycholinguistic prominence and phonetic strengthening. Inasmuch as syntagmatic contrast enhancement involving initial onset consonants can be said to facilitate segmentation of the speech stream, and insofar as paradigmatic contrast enhancement affecting that same can be imagined to enhance perceptibility of the contrasts in initial position so crucial to earlystage word recognition, it is perfectly logical to hypothesize (though I will

170

Initial syllables

refrain from doing so) that these phonetic effects themselves are derived from precisely the concerns invoked by Smith in her characterization of the psycholinguistic status of the initial syllable. This may indeed be so, but it in no way changes the substance of my argument here. For even if psycholinguistic concerns of one sort or another are ultimately the raison d’être of the set of phonetic patterns associated with domain-initial segmental material, it is still the precise characteristics of those phonetic patterns which determine the shape and distribution of the phonological licensing asymmetries to which they may ultimately give rise through phonologization. Psycholinguistic status, in other words, may ultimately explain why anything gets phonologized in initial position, but it is the phonetic patterns alone which determine what will be phonologized where. Smith, by contrast, places the psycholinguistic concerns discussed here directly into the phonology, as filters on the types of position-specific constraints which can be formulated by the grammar. The typological patterns we wish to account for here flow directly from the categorical phonological grammar, and as Smith argues, since the constraints in question have the entire initial syllable as their domain, should in principle hold for all elements within that domain equally. This is not the case, which the psycholinguistically-based phonology model fails to predict. The phonetically-based phonologization model, on the other hand, predicts precisely the attested asymmetry, whether the phonetic patterns in question ultimately have psycholinguistic motivations (as is potentially the case with initial strengthening) or are general properties of motor systems linguistic and non-linguistic (as with final lengthening). Smith discusses domain-initial strengthening in the context of determining the precise domain to be referred to by the phonological constraints producing positional licensing asymmetries. She decides ultimately on the morphological word rather than the prosodic word for a number of reasons. The first and most important of these is that initialstrengthening is phonetic, and hence “cannot be directly related to true initial-syllable augmentation effects, which are phonological.” (Smith 2002: 312) The gradience and interspeaker variability in implementation of initial strengthening distinguish it from, for example, categorical prohibitions on onsets above a given sonority level. Likewise, the fact that initial strengthening is stronger in the higher prosodic domains is troubling, since the majority of phonological initial-syllable effects take place at the lexical level. Smith notes that initial strengthening may ultimately “bolster” the pressure to demarcate the left edge of the word, but that it is not by itself enough to make initial syllables phonologically strong positions. From the point of view of the synchronic modeling of individual phonological systems, this is entirely correct. Synchronically, initial

The phonetics of initial position

171

strengthening is hardly the same thing as phonological strength. Indeed, given the apparent generality of attestation of the former, it is likely that languages with specific phonological constraints on word-initial onset sonority also implement phonetic initial strengthening of prosodic word- or phrase-initial material, such that, for example, a phrase-initial voiceless obstruent satisfying the phonological constraints will have greater linguopalatal contact, closure duration, and VOT than its phonological equivalent phrase-internally. Obviously, the grammar of a speaker of this language must account for the phonological distribution and the phonetic pattern separately, as distinct regularities, and within that grammar there would be little reason to imagine that the implementation of the former should be necessarily contingent upon the presence or particularities of the latter. But this only addresses part of the question. Indeed, where the focus of our inquiry is on how to account for typological regularities in attested patterns of licensing asymmetries in initial position, then, as in the cases of stressed and final syllables surveyed above, it is precisely this phonetic information which should interest us. The phonologization approach to typology makes the relationship between the phonetic pattern of initial strengthening and initial-syllable phonological licensing effects, however hazy it may be in synchrony, both central and clear to our accounting. As in other cases, the gradient phonetic process, despite its variability and stronger implementation in higher prosodic domains nonetheless creates sufficient individual strengthened tokens of the forms in question to bring about the ultimate phonologization of that strength, be it as resistance to neutralizations operating internally, or as neutralization along the lines of Positional Augmentation. In this sense the initial syllable is no different from the stressed syllable. Both positions undergo phonetic strengthening. What is different, however, is the nature of the phonetic strengthening involved. Phonetic initial strengthening targets primarily the first segment of the word or phrase, thus making it perfectly clear why both Positional Augmentation and Positional Strength processes should single out onset consonants. It also explains the increased duration of the absolute word-initial vowels which leads to phonological effects never observed on initialsyllable vowels following onset consonants. Some evidence for such durational increase was presented in Chapter 2; recall that Balasubramanian (1981) demonstrates this longer duration experimentally for Tamil; the same duration was implicated in the resistance of absolute initial /a/ to reduction to schwa in Russian and in the resistance of absolute-initial front vowels to centralization in Nawuri (Casali 1995). Increased syllable duration may also potentially be implicated in the reduction of the five

172

Initial syllables

vowel inventory of Luganda to three vowels ([e, a, o]) morpheme-initially (Hubbard 1994). There is also evidence of this process from the literature on initial strengthening. Absolute word-initial vowels are realized somewhat longer than word-internal vowels in French and English (Fougeron 1999; Turk and Shattuck-Hufnagel 2000). Furthermore, contrary to the expectations engendered by the synchronic psycholinguistic approach, there are indeed instances of the phonologization of initial syllable effects only in higher prosodic domains: In some dialects of Runyambo, for example, word-initial /i, u/ are lowered to [e, o] after pause (Larry Hyman, personal communication). The origin of this pattern in the phonetic durational increase and sonority enhancement characteristic of initial strengthening should be abundantly clear. It is certainly true that phonologized versions of initial strengthening effects tend to apply at the lexical, rather than the phrasal level, making patterns such as that of Runyambo comparatively rare. But is this not also true of other phonologization patterns originating in phrase-level phonetics, such as word-final obstruent devoicing? It is generally accepted that final devoicing has phonetic motivation at the phrase level, while at the word level this motivation is at least weak if not absent (Gordon 1998; Hock 1999). Yet, phonological (neutralizing) final devoicing processes are generally recognized at the level of the word, and not the phrase. None of this is problematic from the point of view of phonologization. The nature and attestation of initial-segment effects (C and V) are attributable to the phonologization of phonetic initial-strengthening patterns. The phonologization approach predicts precisely the effects attested, while making the non-existence of unattested patterns (such as postconsonantal initial-syllable vowel augmentation) comprehensible as well. This result was obtained in Chapters 2 and 3 as well, and it is a central thesis of this study that regularities in phonological typology are in general best accounted for from the point of view of phonologization. It may be that these phonetic effects, sometimes or in part, serve a function related to the psycholinguistic status of initial material, but it is the specific characteristics of the phonetic processes thus engendered, and not the existence of the psycholinguistic status per se, which determine the shape and attestation of the phonologized patterns. An approach relying solely on restrictions embodied in the synchronic grammar fails to make the connections between the phonetic patterns and the phonological typology, and so provides both less comprehensive and less restrictive predictions as to the nature of that typology. Most current work in generative phonology (and in particular with respect to the centrality of the factorial typology to work in Optimality Theory) makes the entirely aprioristic assumption that in providing a complete account of the representation of a specific

Initial-syllable strength effects

173

phonological process in the grammatical competence of a speaker of a given language, we will also necessarily achieve a comprehensive account of the range of typological regularities observed crosslinguistically in the patterning and attestation of all related phonological processes. This postulate, I am arguing, is both fallacious and counterproductive, as it places logically out of bounds precisely the information necessary for a full accounting of the crosslinguistic facts. By contrast, the strong empirical coverage of the phonologization approach (often based on information either irrelevant or unavailable to the individual speaker in synchrony) in accounting for phonological typology underscores this point. As evidenced throughout this study, there is simply no reason to assume that the same body of information necessarily explains both why set X and only set X variants of a phonological process exists, and at the same time how a given speaker of a language acquires and implements a single member of that set X. Put more directly, a speaker certainly needs to know that his or her language contains a given pattern, but the assumption that he or she must also know why his or her language contains that pattern (in functionalist terms) is not at all obviously supported. 4.4. Initial-syllable strength effects The foregoing account of the development of phonological initial strength effects from phonetic strengthening of initial material makes a prediction which flies in the face of apparent typological fact: Insofar as significant phonetic initial strengthening has been detected only on word- or phraseinitial segments, and not on segments deeper inside word-initial syllables, we would expect not to find phonological licensing asymmetries involving those segments. This is indeed the case as concerns Positional Augmentation effects, and yet the literature contains numerous reports of vocalic Positional Strength patterns in which the initial syllable is the strongest licenser. In this section I will argue that in none of the cases reported in the literature is it actually the “initiality” of the syllable per se which is in fact responsible for the licensing asymmetries. Rather, in each case either phonetic factors (various types of additional duration) or morphological factors (the initial syllable is also the root) can account for the phonologization of the patterns in question. Other factors, such as processing concerns, need not be directly invoked to reach an understanding of attested typological regularities. Aside from this, a third group of patterns, to be discussed first, has simply been misanalyzed in earlier accounts of initial syllable typology. Section 4.4.1 analyzes this

174

Initial syllables

group of putative initial-syllable PN effects, better attributed to initial stress. Section 4.4.2 introduces the possibility of durational asymmetries arising between initial and non-initial vowels from sources other than stress. This discussion leads into 4.5, which presents experimental evidence of initial-syllable vowel lengthening from initial strengthening in Turkish, and details a hypothesis as to how this small durational asymmetry could ultimately lead to the phonologization of vowel harmony. 4.4.1. Initial syllables in languages with fixed initial stress Beckman (1998) notes that initial-syllable phonological strength effects are robustly attested in the harmony systems of Turkic, Tungusic, Mongolian, Uralic , and Bantu (Beckman 1998: 54), to which she adds also the surfaceinventory reducing patterns of Tamil and Dhangar-Kurux. I will assume for the sake of argument that the first four of these families showing initialsyllable strength effects do not derive from a common ancestor in which the strength effects were already present. Others, such as Poppe (1960), would disagree, reconstructing some form of harmony in the protolanguage from which all these families are claimed to be derived. I will also assume that their geographic contiguity is irrelevant to the explanation of the phonological patterns. Steriade (1994) also lists asymmetries in initial syllable/non-initial syllable licensing in a variety of Uralic and Altaic languages as well as Tiv and Gokana. The licensing asymmetries in all of these are clear enough in the descriptions. The problem, however, is that in each of these cases (excluding of course the Niger-Congo examples) at the time of development of the relevant licensing asymmetries the languages or language families in question had stress fixed on the initial syllable. Proto-Finno-Ugric is generally thought to have had both initial stress and a form of palatal harmony, and indeed the overwhelming majority of modern Finno-Ugric languages have either initial stress or some obvious derivative synchronically as well (Abondolo 1998; Sammallahti 1988).118 To the extent that we wish to reconstruct Proto-Altaic at all, this is generally done with fixed initial stress. Poppe reconstructs an “expiratory accent” on Altaic word-initial syllables, generalized from the reconstruction of the same in the Mongolian, Tungusic, and Turkic parent languages (Poppe 1960: 143–147). Li (1996: 20–21) notes that Poppe’s characterization of Tungusic stress as fixed initial is too categorical as concerns the modern languages, though the reconstruction seems solid enough. Most Dravidian languages are also characterized by initial stress or some obvious derivative thereof (e.g., default initial with attraction to

Initial-syllable strength effects

175

heavy syllables further right), and fixed initial stress is assumed for the proto-language as well (Zvelebil 1970: 40). This accent, like that assumed for Mongolian, Tungusic, Turkic, and Uralic, was probably not strongly duration-cued like that of Russian or English, but this does not mean that a durational asymmetry was absent altogether. The resistance of Dravidian initial syllables is especially apparent in the massive syncopations of noninitial vowels found throughout the language family, and most dramatically attested in Toda and Kota (Zvelebil 1990: 3).119 Dhangar Kurux , cited by Beckman as contrasting nasal and oral vowels in initial syllables only lost original final vowels, while reducing most other non-initial short vowels to an ultra-short schwa, later restored to “syllabicity” as short copies of the initial (Gordon 1976). Bosch (1991) notes that Tamil, another language with a restricted inventory of vowel contrasts outside the initial syllable, also has initial stress (albeit not strongly cued by duration). Andronov argues that Classical Tamil must have had initial stress (Andronov 1975: 6) (while arguing that stress must have shifted rightward in Colloquial Tamil to account for central vowel deletions from etymological initial syllables). Balasubramanian (1980) reports native speaker intuition of stress on initial syllables of words in isolation and lengthening of various elements thereof when the relevant word “has to be said with special emphasis”. (Balasubramanian 1980: 456) A similar state of affairs may obtain in Gujarati, where the stress pattern is somewhat less clear. Cardona (1965) describes it as a close derivative of the accent system of Classical Sanskrit (stress a heavy penult, else stress the antepenult – Bubenik 1996), which is to say not fixed initial. There are, however, licensing asymmetries in which the initial syllable is strong. Specifically, in Gujarati there is a contrast between plain and breathy-voice vowels, but there is a strong tendency for this opposition to be realized in initial syllables only (Silverman 1997: 115).120 Importantly, Gujarati breathy vowels are distinguished from clear vowels by, among either things, characteristic longer duration (Fischer-Jørgensen 1967: 94). Additionally, while Gujarati contrasts tense and lax mid vowels /e, o/ and /, /, this contrast is only available in the initial syllable (Pandit 1955). The origin of this licensing asymmetry in durational differences between initial and non-initial syllables becomes apparent when we consider the source of this pattern. Initial syllable // and // arise through the monophthongization of the earlier diphthongs /aj/ and /aw/. Prior to this the difference between [e, o] and [, ] in initial syllables was purely allophonic, a function of segmental context and syllable structure. Crucially, monophthongization of /aj/ and /aw/ took place in non-initial syllables as well, only here the outputs of this sound change merged with preexisting /e/ and /o/ (Pandit 1961: 62–63). In other words, the diphthongs

176

Initial syllables

have two monophthongal realizations: a lower, presumably longer vowel in initial syllables, and a higher, presumably shorter vowel in non-initial syllables. The change in initial syllables causes the distribution of the low mid vowels to become unpredictable, and hence contrastive. Clearly though, the different outputs of the monophthongization must be the result of different phonetic characteristics, most likely duration, of the syllables in question. Other accounts of stress in Gujarati suggest a source for that additional duration in initial syllables: Pandit (1958), for example, reports that all syllables have “even stress”, except for those which are postjunctural. Initial syllables after juncture carry a stronger stress than others (Pandit 1958: 216). It is unclear whether this should be understood to imply a phrase-initial stress (as in Bengali), or the type of initial strengthening described below for Turkish. Clearly, however, additional duration in the initial syllable can be implicated in the phonologization of both licensing asymmetries. Dave (1967: 2), in fact, claims that murmured vowels are only possible in stressed syllables, but does not elaborate in that particular study. The precise phonetic characteristics of the domain-initial syllable in Gujarati and the status of these with respect to accentuation awaits experimental verification. In any case, the problem with all the above licensing patterns is clear enough: If we are attempting to distinguish the set of phonological licensing asymmetries found in initial syllables from the set of those found in unstressed syllables, we can gain no insight by enumerating as examples of a particular initial-syllable strength pattern languages in which the wordinitial syllable is also always the stressed syllable as well. This confound in the data makes it impossible to distinguish the effect of the initiality of the syllable from that of its stressedness. Fixed stress languages tend not to have strongly duration-cued stress, which explains why most of the languages in question have not developed UVR systems of the type discussed in Chapter 2. It will be argued below, on the other hand, that the smaller durational asymmetries characteristic of languages with fixed stress can actually give rise to vowel harmony, precisely the form initial-syllable strength effects almost always take (Uralic, Turkic, Tungusic, Mongolian, Niger-Congo). That not all of the relevant languages have initial stress synchronically is irrelevant to the phonologization approach to the typology of positional neutralization, assuming initial stress in the parent language at the point at which the licensing asymmetry first arose. As demonstrated in numerous cases in Chapters 2 and 3, the phonetic effects associated with the syllables in question are relevant only at the very point at which phonologization takes place. That such effects were present then (e.g., that Proto-Uralic or Proto-Turkic had both initial stress and palatal harmony) is enough to account for the existence of the pattern. Once phonologized, of

Initial-syllable strength effects

177

course, the licensing asymmetries cease to be dependent on the phonetics for their existence, and may continue to exist even in the event of radical changes to the phonetics of the language (e.g., shifts of stress away from the initial syllable. This is in fact the standard account of the development of restrictions on vowel licensing outside initial syllables in Latin). Such cases receive their “accounting” as part of the typology of initial-syllable effects from a stage of the language long since ceasing to exist, an accounting which the largely-irrelevant synchronic state serves only to obscure. 4.4.2. Other sources of duration in initial syllables Remarkably, the exclusion of languages with fixed initial stress in synchrony or diachrony immediately removes from our typology the overwhelming majority of vocalic initial-strength effects remarked upon in the literature. Still, a few cases, such as that of the Bantu languages, cannot be explained in this manner, as there is no evidence that anything like an initial stress accent ever existed in these languages. But is lexical stress the only possible source of additional duration in initial syllable vowels? Certainly the literature on initial strengthening to date has found (to the extent that it has sought) additional duration on vowels only when these are absolute initial in the word, not preceded by an onset consonant. Otherwise, no across-the-board lengthening of initial syllable vowels has been identified experimentally. The nearest thing is found in the Korean experiments of Cho and Keating (1999), where two of four speakers show increased durations of vowels in initial #CV sequences, but only after consonants, whose VOT does not increase as a function of initial strengthening ([n] and tense t, which they transcribe as [t*]). Cho and Keating speculate on this basis that there is a compensatory relationship in the implementation of initial strengthening between the duration of the initial-syllable vowel and the VOT of the preceding consonant. As in most of the initial-strengthening studies, unfortunately, here comparisons are only among initial-syllable vowels at the left edge of prosodic domains of differing levels. No comparison is made between initial-syllable vowels (phrase-initially or phrase-medially) and comparable vowels in wordinternal syllables. Certainly for predictions concerning lexical-level licensing asymmetries this would be among the most important measures to be taken. To assess the possibility that in certain languages initial strengthening does in fact affect initial-syllable vowels in a way that could ultimately lead to the phonologization of initial-syllable licensing asymmetries such as

178

Initial syllables

those found throughout Central Eurasia and sub-Saharan Africa, I constructed an experiment investigating the durations of initial syllable vowels in Turkish and English. English was selected as a control for the experiment, as previous work (Fougeron and Keating 1996; Byrd 2000) had found no significant lengthening of initial syllable vowels in that language. Turkish was selected due to persistent reports in the literature and impressionistic observations in my own fieldwork that, despite fixed final stress in most of the native lexicon, something on the order of durational enhancement was nonetheless afoot in the Turkish initial syllable. There is, of course, a potential confound here, since as noted above, Proto-Turkic is generally reconstructed with fixed initial stress, so that any additional duration found there could in principle be a phonetic footprint (as it were) of the earlier prosodic system. I will not dispute the reconstruction of initial stress for Proto-Turkic, though it should be noted that of the putatively Altaic languages discussed by Poppe, by far the weakest case for an earlier initial stress can be made for Turkic. Johanson (1998: 111) argues that an assumption of initial stress would help to explain the development of vowel harmony, early reduction and loss of final vowels, and alliteration patterns in Old Turkic. It is the purpose of the following to demonstrate that even without initial stress, enough additional duration is contributed to initial syllable vowels in Modern Turkish by initial strengthening that, had a similar phonetic enhancement been in effect at the relevant moment in the deep prehistory of Turkic, attested patterns of palatal vowel harmony could have been the result. Substantial initial strengthening of consonants could also have been responsible for giving sufficient prominence to these that alliteration might stand out as an attractive literary device. As for the reduction of final vowels, it was noted in Chapter 3 that stressed final high vowels in Modern Turkish are subject to devoicing, and that /a/ in open syllables is raised toward schwa in Modern Turkish, again irrespective of the placement of stress. Furthermore, Chapter 3 also cites the abundant cases in the diachronic literature of other language families (originally) with pitch-cued accent (e.g., Baltic and Slavic) in which vowels which were indisputably stressed at one point ultimately see stress retraction and deletion. For none of these problems, then, is an initial stress the sole (or even an adequate) explanation. The default initial stress in Chuvash in words containing only reduced vowels does suggest an earlier initial stress, which was later attracted to heavy syllables further to the right (though see Dobrovolsky [1999] and following this Gordon [2000] for phonetic data suggesting that the initial pitch peak in Chuvash words is better analyzed as an intonational rather than a stress prominence). Be this as it may, some scholars continue to maintain, along with Turkologists like Poppe, that Modern Anatolian Turkish continues to

Vowel harmony and initial syllables

179

have both a pitch accent on the final syllable and a “dynamic stress” generally, though not always on the initial syllable (Csató and Johanson 1998: 207). It is my contention that what these scholars are in fact hearing is initial strengthening of onset consonants and vowels in a language which otherwise has a non-duration-cued fixed final stress. The following sections discuss my experimental results and the implications thereof for the development of palatal harmony systems of the Eurasian type. 4.5. Vowel harmony and initial syllables The wide attestation of progressive vowel harmonies proceeding (at least historically) from the initial syllable in many languages without initial stress (such as Turkic or Bantu) demands an explanation. In such systems, only the initial syllable of the word realizes the language’s full set of contrasts, while in non-initial syllables certain features are predictable from the specification of the vowel in the initial syllable. In languages with progressive palatal vowel harmony, such as Turkish, the frontness or backness of non-initial vowels is generally determined by the frontness or backness specification of the initial vowel. This is illustrated in (1) using several monosyllabic roots and suffixes.121 (1)

gøzlerim dostla»rμm eye-PL-1SG friend-PL-1SG ‘my eyes’ ‘my friends’

dZanla»rμm soul-PL-1SG ‘my souls’

dZeple»rim pocket-PL-1SG ‘my pockets’

Given the lack of initial stress in Modern Turkish, if we are to assume that this pattern is ultimately the result of the phonologization of some phonetic characteristics of Turkish initial syllables, we must seek those phonetic characteristics in some other aspect of Turkish prosody, such as the implementation of initial strengthening. 4.5.1. Initial syllables in English The first experiment to be presented here was designed as a control to see that the results of the methods employed here matched those of Fougeron and Keating (1996) and Byrd (2000), who demonstrated a lack of initial strengthening for initial-syllable vowels in English. Fougeron and Keating demonstrated that in English vowel duration is strongly correlated with degree of jaw opening. In my experiments here I analyze only the durations

180

Initial syllables

of initial-syllable vs. second-syllable vowels. No articulatory measures were collected. 4.5.1.1. Subjects Subjects were four university-age, female native speakers of American English. All four had some training in linguistics, but were naive as to the purpose of the study. 4.5.1.2. Materials Target words were 24 real words of English, 12 with an initial open syllable containing the vowel /æ/ under secondary stress, and 12 with an analogous second syllable. The vowel /æ/ was chosen for its long inherent duration, on the assumption that any systematic durational variation would be more readily detectable in a longer stimulus than in a shorter one. Secondary stresses were chosen specifically for the following reasons. Since unstressed vowels in English are heavily reduced and extremely short, they would make poor candidates for the detection of small-scale durational variation. Choosing primary stresses, however, would in effect confound two distinct loci of potential phonetic enhancement, since target words were consistently pitch-accented in the frame sentences constructed for this study. This would be particularly detrimental in comparing vowels initial in a number of different prosodic constituents, since accentual lengthening is thought to exhibit the same gradient enhancement as a function of domain-level that initial strengthening of consonants has been shown to undergo. Secondarily stressed syllables then represent a sort of compromise, insofar as they retain full vowels, but were seen for the most part to escape pitch accenting in the sentences created here. Adjacent segmental context was carefully controlled and balanced across the Syllable 1 and Syllable 2 word classes to avoid perturbations of vowel duration. Following segments were either voiced stops (four in each word class), voiceless stops (six in each word class), or voiceless fricatives (two in each word class). Preceding consonants were either voiced stops, nasals, or laterals. Preceding and following segment types were selected primarily to afford uncontroversial durational measurements. Voiceless obstruents were avoided as preceding segments because of their clear shortening effect on vowels of metrically prominent syllables of English. Open syllables were chosen as a context for the target vowels, again with an eye to reducing non-position-dependent durational variation, and again under

Vowel harmony and initial syllables

181

the assumption that, given the longer durations typically associated with open syllable vowels, any durational variability might be maximally observable precisely in this context. (2) shows two pairs of tokens illustrating the types of stimuli to be compared. (2)

Syllable 1 làbyrínthian làpidárian

vs.

Syllable 2 elàborátion dilàpidátion

All target words were either selected from the Oxford English Dictionary, or derived using familiar and productive morphology. While some target words contained five syllables and others contained six, the distribution of each length was balanced across the Syllable 1 and Syllable 2 word classes. In addition, 24 prosodically-analogous filler words containing vowels other than /æ/ were selected to divert subjects’ attention from patterns within the data. 4.5.1.3. Methods To test for the effect of prosodic constituency on the realization of target vowels, each target word was situated in three different frame sentences selected to place the target word in initial position in a variety of prosodic domains à la Fougeron and Keating (1996). The relevant domains were Utterance, Phonological Phrase, and Phonological Word. These frames are shown in (3). Utterance-initial position (a) was achieved simply by placing the relevant word at the beginning of the frame sentence. Phrase- (but not utterance-) initial position (b) was secured by subordinating a similar frame sentence to a main clause beginning with ‘In my opinion...’. Word- (but not phrase-) initial position (c) was constructed by making the target word the second element in the phrase “The word X”, and placing the result in the phrase-initial position. Note that the predicates of each sentence type were manipulated to keep the number of syllables per frame sentence constant across environments (seven syllables per sentence, excluding the target word. (3)

Prosodic Environments a.

Utterance-initial: U[Phr[W[X

sounds perfectly natural.

182

Initial syllables

b.

Phonological Phrase-initial: U[Phr[In

c.

my opinion,]Phr[W[X sounds fine.

Word-initial: U[Phr[The

word W[X sounds perfectly fine.

A total of 288 sentences (48 words x 3 prosodic contexts x 2 repetitions) were arranged in pseudo-random order and presented to speakers in a printed list. A familiarization session was conducted first, in which speakers read aloud a randomized list of all words to be presented, target and filler, in isolation. Pronunciation suggestions were supplied only at subjects’ request. (Some of the target words, e.g., Maccabeical, were not familiar in advance to all speakers.) Unsolicited corrections were not provided, a policy which ultimately necessitated the exclusion of a small number of tokens from the results. (The word anachronistic, for whatever reason, was particularly prone to variant pronunciations.) Following the familiarization session, speakers were instructed to read the list of sentences aloud at as natural a rate of speed as possible. Sessions were recorded digitally at 44.1 KHz in a sound-attenuated room using a Marantz CDR-300 portable CD-recorder and a Shure SM10A cardioid headworn dynamic microphone run through a Rolls MP13 Mini-Mic preamplifier. Vowel durations were measured from spectrograms and linked waveforms using the Praat 4.2.18 speech analysis software (Copyright 1992–2004 by Paul Boersma and David Weenink). For segmentation purposes, before voiceless obstruents vowel edges were measured at the offset of reliable periodicity (including markedly breathy periods preceding voiceless fricatives). Before voiced obstruents measurements were taken at discontinuities indicating stop closure. Vowel onsets were marked at the beginning of periodicity following voiced stop bursts, and after sonorants, either at visible spectral discontinuities associated with the consonant’s release, or failing this, at the onset of significant energy in the second and higher formants. In certain extreme cases, particularly those following laterals, where no sufficient landmark could be detected between the consonant and the vowel, tokens were simply discarded. 4.5.1.4. Results Mean vowel durations for all classes of stimuli are shown for each speaker in Table 4. A repeated measures ANOVA with phrasal context and syllable

Vowel harmony and initial syllables

183

within the word as within-subjects factors confirms what is apparent from the values given below. A main effect emerges for syllable within the word, F(1, 3) = 95.41 (p = .002), unexpectedly enough with Syllable 2 significantly longer than Syllable 1 across the board. No main effect was found of phrasal context, F(2, 6) = 3.148 (p = .116), nor was the interaction of the two factors found to be significant, F (2, 6) = 2.47 (p = .165). Table 4.

Mean vowel durations in ms. and standard deviations122

Speaker Speaker 1 Speaker 2 Speaker 3 Speaker 4 Mean

Syll 1 109 (15.0) 108 (15.3) 96 (11.5) 114 (13.6) 106.8 (13.9)

Word Syll 2 127 (12.7) 126 (12.2) 114 (10.5) 131 (12.9) 124.5 (12.1)

Phrase Syll 1 Syll 2 106 123 (14.4) (12.4) 95 123 (17.7) (15.2) 99 114 (11.8) (13.6) 113 131 (11.0) (8.0) 103.3 122.8 (13.7) (12.3)

Utterance Syll 1 Syll 2 101 124 (13.2) (12.5) 91 123 (13.2) (17.5) 95 110 (14.0) (12.2) 110 135 (18.6) (10.5) 99.3 123 (14.8) (13.2)

These results are illustrated graphically in Figure 10 below. The general trend is surprising, and frankly beyond my power to explain at this time. What is clear is that this study replicates the results obtained earlier by Fougeron and Keating (1996) and Byrd (2000), at least insofar as in all three studies, position in the initial syllable of a word was found to contribute no additional duration to vowels when compared with vowels in analogous non-initial syllables. English does not lengthen initial-syllable vowels. Why the very opposite seems to have occurred with such consistency and strength in this study is not immediately obvious. The presence of several relatively high frequency words in the Syllable 2 list (collaboration, imagination) should have induced if anything the opposite trend, with Syllable 2 words having overall shorter durations. The only additional prosodic difference that could be found between the two sets of words was the inclusion of several words in the Syllable 1 list exhibiting a two syllable lapse between secondary and primary stresses (nàturalístically, nàtionalístically, làboratórial). All Syllable 2 words had a lapse of only a single syllable between stresses. Assuming the two intervening unstressed syllables in the lapsing words to be footed with the initial, and assuming also some sort of notion of foot isochrony, we might

184

Initial syllables

imagine that the trisyllabic feet would force some phonetic shortening of the vowels of interest to us relative to those found in their lapse-free counterparts. Removal of results from the suspicous Syllable 1 words (together of course with their Syllable 2 segmental analogues to maintain balance across sets), however, had no effect on the durational differences in question. The Syllable 2 vowels remain longer. Leaving further investigation of this result for the future, I will simply conclude for now, along with the studies described above, that English, whatever else it may be doing, manifestly does not strengthen the vowels of initial syllables in the same way that it does onset consonants.

Speaker 1

Duration in ms.

120

127 109

Speaker 2 140 120 101

100

126

124

123 106

80

Syllable 1 Syllable 2

60 40

Duration in ms.

140

20

95

100

91

80

Syllable 1 Syllable 2

60 40

0 Word

Phrase

Utterance

Word

Prosodic context

Speaker 3

96

114 99

120

110 95

80

Syllable 1 Syllable 2

60 40 20

Duration in ms.

114

Utterance

Speaker 4 140

120

Phrase Prosodic context

140

Duration in ms.

123

20

0

100

123

108

131 114

135

131 113

110

100 80

Syllable 1 Syllable 2

60 40 20

0

0 Word

Phrase

Utterance

Prosodic context

Word

Phrase

Utterance

Prosodic context

Figure 10. Mean vowel durations by prosodic context for all speakers

4.5.2. Initial syllables in Turkish Turkish shows clear strengthening of domain-initial consonants. While a comprehensive study remains to be done, my preliminary observations over a large corpus of Turkish words recorded in isolation under laboratory

Vowel harmony and initial syllables

185

conditions by a single speaker (created by the Turkish Electronic Living Lexicon project of Sharon Inkelas, University of California, Berkeley) suggest that at least VOT of voiceless stops, prevoicing of voiced stops and duration and energy of voiceless fricatives are significantly enhanced domain-initially. Such strengthening is clearly visible in the spectrogram in Figure 11 below. 5000 4000 3000 2000 1000 0 0

0.67093 Time (s)

p

a

p

a

t

0

j

a

m 0.67093

Time (s)

Figure 11. [pHa»pHatjam] ‘my chamomile’

Here the strong aspiration of the initial labial stop is clear. It is in fact substantially stronger than that found in the onset of the stressed second syllable.123 The longer duration of the second vowel is due to syllable structure, closed syllables hosting longer vowels than open syllables in Turkish (Kopkallı-Yavuz 2000; Barnes 2001). I suspect that it is this initial-consonant strengthening, combined with the vocalic effect presented below, with is being heard and interpreted as an initial “dynamic stress” in Modern Turkish by some scholars, as detailed above. The fact that it continues to occur in words like [pHa»pHatjam], where the adjacent penult is unequivocally stressed (or receives what Johanson would call the pitch accent and possibly “dynamic stress” as well),124 makes it unlikely that an accent or secondary stress is somehow involved. Given the generality of

186

Initial syllables

initial strengthening in languages in which there can be no question of any fixed initial stress (e.g., English), there is little reason to assume an analysis of an altogether different nature to account for similar (though not identical) facts in Turkish. This experiment investigates the phonetic durations of vowels in Turkish initial and non-initial syllables in a manner analogous to that used in the English experiment described in the previous sections. While both that experiment and two previous ones confirm the absence of strengthening of initial-syllable vowels in English, Turkish, I will show, presents a different picture. 4.5.2.1. Subjects Subjects were four native speakers of Istanbul Turkish between the ages of twenty and forty, two males and two females. Three of the subjects were students at Bosphorus University in Istanbul, where their recordings were made. The fourth currently resides in the United States, but was born and raised in Turkey, where he lived until after university. While several of the speakers had at least some linguistic training, all were naive as to the goals of this investigation. 4.5.2.2. Materials Target words were 85 actual trisyllabic nouns or adjectives of Turkish. The first group of these had an initial closed syllable containing the vowel /a/, while the second group had the same low vowel in a closed second syllable. As in the English study above, the low vowel was chosen for its inherent duration. Turkish stress does not affect vowel duration (Konrot 1981), and unstressed vowels do not undergo reduction,125 making the problems in this connection in English irrelevant here. Instead, all stimuli had final stress, placing target vowels in either an initial or peninitial unstressed syllable. Surrounding consonantal context was again controlled. Codas in target syllables were (in equal numbers for each class) voiceless stops and nasals. Onsets were voiced stops or sonorants. These were chosen both for ease of segmentation, and to avoid significant durational influence on following vowels, as would be expected from, e.g., voiceless stops in this position. An initial version of the experiment included forms with voiceless fricative onsets, which can negatively influence following vowel duration. The majority of these were later replaced. Example target forms are given in (4), with target vowels indicated in boldface.

Vowel harmony and initial syllables

(4)

Syllable 2

vs.

[kaj.maak.»tSμ] vs. cream-AGT ‘cream maker/seller/ enthusiast’

187

Syllable 1 [maak.buz.»dZu] receipt-AGT ‘receipt maker/seller/ enthusiast’

4.5.2.3. Methods As in English, each word was placed in three frame sentences, such that it would appear initially in three different prosodic constituents. The level intermediate here between Utterance and Phonological Word is in all likelihood an Intonational Phrase, though to my knowledge no comprehensive study exists of prosodic phrasing in Turkish. The three prosodic environments selected are shown in (5). In each environment the target word is the initial (non-head) element in a compound forming a hypothetical street name. The utterance-initial context (a) simply places this compound at the beginning of the frame sentence. The phrase-initial context (b) places the subordinate clause bana sorarsan, ‘if you ask me’, before that same frame sentence. The word-initial environment (c) makes that compound the head of an NP with a preceding modifier. These particular constructions are quite natural sounding (if in some cases implausible), many street names in Turkey taking this form. (5)

Turkish Frame Sentences a.

Utterance-initial sokaμ tSok gyzel bir jerdir street very nice one place-PRED ‘X street is a very nice place.’ U[Phr[X

b.

Phrase-initial sorarsan]Phr[X sokaμ tSok gyzel bir jerdir I-DAT ask-COND street very nice one place-PRED ‘If you ask me, X Street is a very nice place.’ U[Phr[bana

188

Initial syllables

c.

Word-initial sokaμ tSok gyzel bir jerdir istanbul-LOC-PCP street very nice one place-PRED ‘The X street in Istanbul is a very nice place.’ U[Phr[istanbuldaki W[X

The combination of 85 target words and three contexts produced a total of 255 sentences, which were arranged in pseudo-random order and presented to subjects in a printed list using Turkish orthography. Speakers were again instructed to read the list aloud at as natural a rate of speed as possible. Sentences were uncovered one at a time by the experimenter to ensure a short pause after each sentence, and to prevent subjects from increasing the rate at which they read progressively throughout the experiment. Test sessions were recorded digitally at 44.1 KHz in quiet indoor environments at Bosphorus University and in Berkeley, California using a Sony MZ-R700 portable minidisc recorder and a Shure SM10A cardioid headworn dynamic microphone run through a Rolls MP13 MiniMic preamplifier.126 These recordings were then downsampled to 22.05 KHz., and vowel durations were measured from spectrograms and linked waveforms using the Praat 3.9.5 speech analysis software (Copyright 1992–2000 by Paul Boersma and David Weenink). For segmentation purposes, in the context of following voiceless obstruents vowel offsets were measured at the end of reliable periodicity, while for nasals spectral discontinuities were taken to represent segment boundaries. Following initial voiced obstruents, vowel measures began immediately following the stop burst, while for sonorants, vowel onsets were taken to begin either at visible spectral discontinuities associated with the consonant’s release, or failing this, at the onset of significant energy in the second formant. In many cases, where no suf-ficient landmark could be detected between the consonant and the vowel, tokens were simply discarded. 4.5.2.4. Results The results of this experiment are shown below in Table 5 and Table 6. It becomes clear immediately that for each speaker, mean durations of initialsyllable vowels are significantly longer than those of the vowels of second syllables. This conclusion is supported by the results of a repeated measures analysis of variance with phrasal context, syllable within the word, and coda type as within-subjects factors.

Vowel harmony and initial syllables Table 5. Speaker Speaker 1 Speaker 2 Speaker 3 Speaker 4 Mean

Table 6. Speaker Speaker 1 Speaker 2 Speaker 3 Speaker 4 Mean

189

Mean vowel durations in ms. and standard deviations for syllables with nasal codas127 Word Syll 1 Syll 2 136.5 123.3 (13.5) (19.5) 83.3 76.4 (10.0) (14.1) 99.4 90.1 (6.8) (9.8) 109.3 95.9 (5.0) (10.7) 107.1 96.4 (8.8) (13.5)

Phrase Syll 1 Syll 2 128.8 117 (14.1) (15.1) 77.9 77.7 (7.0) (13.6) 93.6 87.7 (6.4) (10.2) 107 95.1 (6.5) (5.0) 101.8 94.4 (8.5) (11.0)

Utterance Syll 1 Syll 2 135.6 123.1 (12.7) (14.6) 84.8 76.3 (7.1) (9.7) 97.9 95.7 (10.1) (14.2) 106.5 99.3 (6.5) (7.0) 106.2 98.6 (9.1) (11.4)

Mean vowel durations in ms. and standard deviations for syllables with voiceless obstruent codas Word Syll 1 Syll 2 109.7 100.4 (9.8) (9.9) 68.7 62.6 (9.7) (8.5) 81.3 73.1 (6.0) (8.3) 92.3 81.9 (5.0) (5.2) 88.0 79.5 (7.6) (8.0)

Phrase Syll 1 Syll 2 104.6 96.7 (10.6) (11.8) 62.2 61.1 (9.9) (8.1) 74.2 73.5 (6.5) (8.7) 90.2 82.9 (5.7) (5.7) 82.8 78.6 (8.2) (8.6)

Utterance Syll 1 Syll 2 102.1 96.6 (8.5) (9.1) 67.6 66.7 (9.4) (7.9) 76.8 76.7 (8.1) (10.3) 92.3 86.1 (8.0) (6.6) 84.7 81.5 (8.5) (8.5)

A main effect of coda type was unsurprisingly found, with vowels before nasal codas significantly longer than vowels before voiceless obstruent codas (F [1, 3] = 48.4, p = .006): 100.8 ms. vs. 82.5 ms. overall. There was also a main effect of phrasal context, reflecting the somewhat shorter durations characteristic of vowels in the phrase-initial context (F [2, 6] = 9.78, p = .013): 89.4 ms. compared with 92.8 ms. for both the utterance- and word-initial position. This is most likely a consequence of the unintentionally greater prosodic and syntactic complexity of the phraseinitial context, an unfortunate confound in the construction of this study. Most importantly for our purposes here, there was also a main effect found

190

Initial syllables

of syllable within the word, with mean initial syllable durations exceeding those of second syllables by 91.3 ms. to 84.7 ms. (F [1, 3] = 18.668, p = .026). In addition to these main effects, a marginally significant interaction was found for phrasal context and syllable within the word, with the mean difference between initial and peninitial syllables greater in the word-initial context (9.6 ms.) than in the phrase- or utterance-initial contexts (5.8 and 5.4 ms. respectively): F (2, 6) = 5.791 (p = .04). Lastly, there was a significant interaction in the data between coda type and syllable within the word, with the difference before nasal codas (8.5 ms.) greater than that before voiceless obstruent codas (5.3 ms.): F (1, 3) = 34.386 (p = .01). One possible explanation for this distinction would be that durational variability is simply greater in contexts with typically longer durations. Thus, the longer overall durations of vowels before nasal codas allow more dramatic expression of this initial strengthening effect than do the shorter durations characteristic of vowels before voiceless obstruent codas. Speaker 1

Speaker 2

160

90 133.6

120

82

76.8

80 121.1

66.2

70 105.5

100

97.9 Syllable 1

80

Syllable 2

60 40

Duration in ms.

Duration in ms.

140

50

Syllable 1

40

Syllable 2

30 20

20

10 0

0 N-coda

T-coda

N-coda

T-coda

Prosodic context

Prosodic context

Speaker 3

Speaker 4 120

120 97

100

91.2 77.4

80

74.4 Syllable 1

60

Syllable 2

40 20

Duration in ms.

Duration in ms.

100

63.5

60

107.6 96.8

91.6 83.6

80 Syllable 1

60

Syllable 2

40 20

0

0 N-coda

T-coda Prosodic context

N-coda

T-coda Prosodic context

Figure 12. Mean vowel durations for Turkish syllables 1 and 2

Vowel harmony and initial syllables

191

It is noteworthy here that the strengthening effect detected was nonetheless not seen to increase at the boundaries of higher-level prosodic constituents in the manner documented for other initial strengthening effects in a variety of languages in studies such as those cited in Section 4.3. In fact, degree of strengthening turned out to be greatest in word-initial position, precisely where other studies’ results might have predicted it to be the least. The reason for a smaller strengthening effect in the phrase-initial context could plausibly be the overall shorter durations characteristic of that context in this study, where, as in the English study described above, the length and complexity of this frame sentence seems to have had an unwelcome influence on target vowel durations. The weaker effect in utterance-initial position is not readily explicable, though it should be noted that a number of the earlier studies of initial strengthening cited above also found minor deviations from expected patterns in utterance-initial position. This matter will remain unresolved here, in any case. While it may thus be that initial strengthening in Turkish is not sensitive to distinctions in levels of the Prosodic Hierarchy, it is also worth mentioning that even in the earlier studies which did detect increases at higher-level boundaries in other languages, differences were not always found for all speakers, and speakers differed frequently in their choices as to which boundary levels played a significant role. It is conceivable then that a larger study of Turkish with the confounds mentioned above removed would ultimately uncover the missing effect. It is also possible that the phrase boundary selected for comparison with the word boundary was insufficiently high in the hierarchy to trigger the increased strengthening effect. One final problem may also have been in the choice of realword stimuli, which, although carefully controlled, may nonetheless have introduced sufficient variation into the results to make detection of this level of durational asymmetry difficult. The studies that detected the hierarchical strengthening patterns all used single repeated stimuli and nonsense words representing each phonetic environment under investigation. Such a strategy could prove more effective in Turkish as well. The results of the experiments detailed in the sections above are clear in one crucial respect: We have seen that Turkish, but not English, exhibits a pattern of domain-initial strengthening affecting the vowels of initial syllables. Minimally then, we can say that domain-initial strengthening is implemented to differing degrees in different circumstances on a languagespecific basis. This was shown in the earlier experiments as well (i.e. Keating et al. [1999]), in service of the point that insofar as domain-initial strengthening varies on a language-specific basis, it cannot be relegated to the level of “universal phonetic implementation”, and must receive some representation in the grammars of the languages in question.

192

Initial syllables

Concerning the language-specific nature of domain-initial strengthening, I believe it is not by chance that it is Turkish, and not English, which exhibits initial-syllable vowel lengthening. One of the primary cues for stress placement in English is additional vowel duration in the stressed syllable. It is therefore unsurprising that English would avoid simultaneous implementation of other positionally-determined vowel-lengthening patterns, insofar as doing so would have the potential to seriously confound accurate perception of the placement of stress. This is hardly a novel argument. Indeed, something similar is hypothesized in Keating et al. (1999) concerning the deployment of boundary signals in English, as opposed to French and Korean, which have substantially different types of prosodic systems. They in turn cite Lehiste (1964) making a similar argument concerning vowel length and boundary signals. Continuing this line of reasoning, then, in Turkish stress is not even weakly correlated with vowel duration (Konrot 1981), leaving that phonetic resource available for use in signaling word boundaries, as the experiment presented here shows. The implicational force of this analysis is, to be sure, only negative, predicting which languages should not show initial-syllable vowel lengthening. Only a more extensive experimental survey of languages with non-duration-dependent accentual systems will provide the information necessary to make any further predictions on this matter. Mollaev (1980) demonstrates a strong tendency for the vowels of syllable 1 to be longer in duration than comparable vowels in syllable 2 of disyllables in Turkmen, but the closeness of the relationship of the latter to Modern Turkish does not allow generalizations to be expanded on this point. It might be objected at this point that English does in fact use vowel duration to signal things other than placement of stress. Final lengthening is a well-known example.128 However, English final lengthening is consistent and robust only in the higher prosodic domains. At the word-level, it is sporadic and weak (cf. Beckman and Edwards 1990), which would tend to minimize its effect on perception of lexical stress placement. (The phonetic differences between final lengthening and stress are discussed at length in Chapter 3.) Turkish initial strengthening, unlike the final lengthening found in English, is realized consistently in both lower and higher prosodic domains, and would therefore be expected to be more disruptive to the perception of stress placement than English final lengthening. Final lengthening is also frequently realized alongside a severe amplitude and F0 drop, both of which would minimize the likelihood of additional duration on the final syllable being mistaken for lexical stress. The crosslinguistic propensity for stress to avoid final syllables is also discussed in Chapter 3.

Vowel harmony and initial syllables

193

4.5.3. Initial strengthening and Turkic Palatal Harmony The experimental results given above demonstrate the application of domain-initial strengthening to the vowels of initial syllables in Turkish. This process represents a source of additional phonetic duration in initial syllables independent of the initial stresses discussed above for Uralic, Dravidian, and at least some of the languages grouped as Altaic. Since the fixed initial stresses in those languages are not known to be strongly duration-cued, any additional duration contributed by stress to those initial syllables might well be on the order of that documented here in the initial unstressed syllables of Turkish. If strengthening of initial-syllable vowels ultimately turns out to be more widely attested in languages with nonduration-cued stress, this process may represent a second source for the initial-syllable phonological strength effects attested in the literature. Still, it is not at all clear how a small-scale durational asymmetry of the type documented above could develop into word-bounded phonological vowel harmony of the Turkic variety. The following presents the skeleton of a possible solution to this problem. 4.5.3.1. Does vowel harmony come from vowel reduction? It is often observed that vowel harmony and vowel reduction share many common features, most notably that they both involve the positional neutralization of vowel quality contrasts. Indeed, for Beckman (1998), the difference between the two is essentially just whether the language allows multiple linking of vocalic features. It seems quite plausible that the one type of system might be linked in some way with the other developmentally, and specifically, that the chain of assimilations imagined to give rise to systems of vowel harmony would be phonetically quite a bit more plausible were it to take place across a string of vowels whose quality had already been neutralized by some other process (as was suggested above for the shift from vowel harmony to vowel inventory reduction in Ob-Ugrian dialects). If weak positions (e.g., non-initial syllables) already licensed the appearance of fewer contrasts, and the vowels which did surface there were of significantly diminished duration, we could imagine they would be more susceptible to coarticulatory effects from neighboring strong vowels. Rhodes (1996), for example, observes English reduced vowels undergoing low-level gradient assimilation in roundness to neighboring back rounded vowels. Certain East Slavic dialects with robust systems of vowel reduction also display patterns of dissimilation or assimilation involving the stressed vowel and the first pretonic vowel (see

194

Initial syllables

Crosswhite [2001] for an OT-based analysis of some of these systems). Languages with some sort of harmony system already in place are also known to add new assimilations to pre-existing ones (e.g., the gradual extension of rounding harmony in the Turkic languages). All these cases, while not in fact instances of the development of fullblown word-domain vowel harmony systems from earlier non-harmonizing reduction systems, nonetheless suggest there is something to the reductionthen-assimilation hypothesis. There are a number of reasons, however, why in the cases at hand, such an origin for harmony is implausible. First, ProtoTurkic is generally reconstructed with some system of palatal harmony already in place. Assuming that it is correct to reconstruct initial stress (rather than just phonetic initial strengthening) for Proto-Turkic, it is necessary to bear in mind that fixed-stress systems are not generally strongly duration-cued and non-duration-cued stress systems rarely generate robust patterns of phonological vowel reduction. A glance across the prosodic systems of the Slavic languages illustrates this point well on a smaller scale: Czech, Polish, Serbo-Croatian, Standard Macedonian with no duration-cued lexical stress and no phonological vowel reduction, alongside Russian, Belarusian, Bulgarian, and Macedonian dialects with duration-cued phonemic stress and phonological vowel reduction, to name only the best known cases. For the reduction-first generalization to be correct for Turkic then, Pre-Proto-Turkic must have been a counterexample to this generalization. Specifically, it would have had to develop a vowel reduction system sufficiently sweeping in nature to give rise to the attested harmony system, and to do so in a system with fixed, non-duration-cued stress. Alternatively, one might imagine that Turkic fixed stress was at one point duration-cued and later changed. If the initial-stress reconstruction is wrong, on the other hand, and Proto-Turkic had only some degree of phonetic initial strengthening (like Turkish today), then we are still dealing with a relatively small durational difference between the vowels of initial and non-initial syllables. We would not expect such a system to yield the necessary contrast neutralizations either. There were in fact reductions affecting vowels in non-initial syllables in Turkic. In Proto-Turkic initial syllables, a complete inventory of long and short vowels was contrasted (Róna-Tas 1998: 69–70; Johanson 1998: 90–91). In non-initial syllables of the root short vowels contrast with “reduced” vowels, these latter derived from earlier short vowels and ultimately subject to syncopation or reduction to schwa depending on the syllabic environment. There is no reason, however, to think that unreduced vowels in non-initial syllables were especially durationally impoverished, but they were subject to palatal harmony nonetheless.

Vowel harmony and initial syllables

195

More damning though for the reduction-first hypothesis is the following: Recall that Turkic vowel harmony involves the neutralization of frontness/backness distinctions, which is to say, quality contrasts along the F2 dimension. In order to derive this state of affairs from an earlier system of vowel reduction, then, we must imagine that Pre-Proto-Turkic had a reduction system allowing a full range of contrasts in initial position, but neutralizing all F2 contrasts in non-initial syllables while retaining three contrastive heights (assuming the non-initial syllable inventory of RónaTas 1998: 70). As elaborated in Chapter 2, however, among the unstressed vowel reduction systems of the world, such a system seems completely unattested, and with good reason, given the phonetic motivations behind the development of UVR systems. Were the immediate predecessor to the Turkic palatal harmony system one in which the soon-to-be-alternating qualities did not contrast due to reduction, that reduction system would have had to be virtually unique. Furthermore, while some featural agreement of stressed and unstressed vowels can arise in UVR systems, it tends always to be harmony of the same type: either an unstressed vowel retains its earlier quality when the stressed vowel is of the same quality, or the unstressed vowel takes on the quality of the stressed vowel entirely, showing complete agreement.129 Thus, in Yakan, pretonic /a/ does not reduce to [e] if the stressed vowel (along with any intervening pretonics) is also [a] (Behrens 1975). In Timugon Murut pretonic non-high vowels are [o] if the stressed vowel is [o], and otherwise they are [a] (Prentice 1971). In Russian dialects exhibiting what is known as “assimilative” vowel reduction, there are systems in which, for example, a first pretonic /o/ will fail to reduce if the tonic vowel is also /o/, but otherwise will merge with /a/, or in other dialects, first pretonic /o/ remains [o] unless the stressed vowel is [a], in which case it merges with /a/ (Chekmonas 1987: 342). It is important to note that in Murut and Russian, where unstressed /o/ retains its rounding due to what we might call support for this from the stressed syllable, it is only under conditions of identity with the stressed vowel that this occurs. The rounding of stressed /u/, for example, is insufficient to allow the harmonization that prevents the loss of rounding in the stressed vowel.130 This support provided by the stressed vowel to unstressed vowels of certain qualities could be understood in some cases as a strategy for coping with durational pressures on the realization of certain gestures. Thus, if an unstressed vowel is too short for its target articulation to be reached, one solution is to undershoot or even alter the phonological target. Another solution, possible in cases of identity of the relevant features would be to realize a single token of the problem gesture over two syllables, thus providing the necessary time to reach target. This is essentially Kaun’s

196

Initial syllables

(1995) notion of Uniformity, a phonetic multiple-linking which Boyce (1990) shows to be implemented in Turkish sequences of harmonizing round vowels (but not in similar sequences in English). Supporting this interpretation is an intriguing note on Russian vowel reduction from Avanesov (1968). In this work Avanesov is concerned with the norms of Contemporary Standard Russian. In addition to describing these norms, along the way he provides advice for teachers of non-standard dialect speakers on how best to get their students to conform to that description in, for example, the system of UVR they employ in their speech. In this connection, on the topic of causing speakers of dialects that realize [o] faithfully in unstressed syllables131 to begin merging /o/ and /a/ when unstressed as described in Chapter 2, Avanesov notes that speakers of these dialects usually pass through a number of intermediate stages of pronunciation before completely acquiring the norm: Often the first instances of /o/ reducing to merge with /a/ (as [a] or [ ]) are those in pretonic syllables preceding a stressed [a]. Additionally, often speakers learn to reduce nearly all unstressed /o/’s to [a], but nonetheless retain thereafter unstressed [o] in pretonic syllables immediately preceding stressed /o/ (Avanesov 1968: 52). This discussion, entirely in the context of the linguistic development of the individual speaker of a completely non-/o/-reducing dialect nonetheless describes precisely the characteristics of the two “assimilative” UVR dialects described above, dialects which, not accidentally, are located in the borderlands between the faithfully ‘o-saying’ and the solidly ‘a-saying’. In general this is a phenomenon worthy of serious further phonetic and dialectological study. So while there clearly are systems in which there is a connection between systems of UVR and some kinds of harmonization, what we apparently never find are systems in which one set of contrasts is effaced by UVR, upon which a categorical system of, for example, palatal harmony, springs up opportunistically on the weakened vowels of the unstressed syllables.132 The development of palatal vowel harmony in Turkic will not be understood through an appeal to UVR. If it is not the case that Proto-Turkic palatal harmony arose from an earlier phonological vowel reduction system, to what then can its emergence be attributed? I will argue that it emerges from precisely the type of small-scale durational asymmetry between initial and non-initial syllables as this paper shows is still found in the Anatolian Turkish of today. A widespread and intuitively plausible conception of the development of vowel harmony patterns maintains that harmony arises diachronically through the phonologization of vowel-to-vowel coarticulation. This view is defended in, e.g., Ohala (1993b) and (1994), and Flemming (1997) derives vowel-to-vowel assimilation patterns from coarticulation synchronically as

Vowel harmony and initial syllables

197

well. Majors in her 1998 dissertation shows that in many languages vowelto-vowel coarticulation is most robust from stressed vowels to unstressed, a fact which she argues could lead to the development of harmony patterns with stressed vowels as their triggers. Unfortunately, this information alone does not provide us with an analysis of the emergence of harmony in Turkic. Inkelas et al. (2001), building on work by Beddor and Yavuz (1995), demonstrate experimentally that in Modern Turkish, anticipatory vowel-tovowel coarticulation is stronger than carryover. Furthermore, this is true regardless of the position of stress (again, usually but not always final in Turkish). These findings present a serious challenge for coarticulationbased theories of vowel harmony for the following reason: If vowel harmony is driven synchronically by vowel-to-vowel coarticulation, then stronger anticipatory coarticulation should yield right-to-left harmony patterns, rather than the left-to-right system attested in Turkish. In Turkish, coarticulation and harmony run in opposite directions. For diachrony, of course, it would be possible to conclude, with Beddor and Yavuz, that all this simply means that prosody was radically different in Proto-Turkic times. It is conceivable that at the time of the emergence of palatal harmony in Proto-Turkic, carryover coarticulation was stronger than anticipatory, and that at some point this situation reversed itself. If ProtoTurkic did in fact have fixed initial stress as some scholars hypothesize, we could imagine that, unlike in Modern Turkish, Proto-Turkic coarticulation flowed most strongly from the stressed syllable à la Majors (1998), yielding the desired direction of assimilation. This hypothesis, however, has a number of serious flaws: If Proto-Turkic did in fact have fixed initial stress, then this stress pattern obviously changed somewhere along the way to Modern Turkish, such that the language developed a fixed final stress. Now, if early fixed initial stress was in fact duration- and/or amplitudecued,133 then the assumption of Proto-Turkic carryover coarticulation would acquire a certain phonetic naturalness (again, as per Majors 1998). The idea of a subsequent shift to stronger anticipatory coarticulation, however, in the face of the uninterrupted phonetic prominence of the initial syllable, now becomes difficult to countenance.134 Certainly the innovation of the F0-cued accent of later Turkic is a poor candidate for the driving force behind that change. A shift in the direction of vowel-to-vowel coarticulation in these circumstances seems capricious and unmotivated. On the other hand, if the earlier fixed initial stress had prosodic characteristics similar to those of the stress in Modern Turkish, then even the motivation for assuming an earlier carryover coarticulation in the first place becomes obscure.

198

Initial syllables

The preferable option, of course, would be to produce an analysis that could save the coarticulation theory without recourse to the historical assumptions discussed above. To this end I propose the following: Anticipatory coarticulation, all things being equal, may well be stronger than carryover in Modern Turkish, and lacking any reason to assume otherwise, I take this to have been the case in Proto-Turkic as well. As demonstrated above, however, initial-syllable strengthening in Modern Turkish produces, independent of the position of stress, a durational asymmetry between the vowels of initial and non-initial syllables. Assuming domain-initial strengthening to have been active in the past as it is today, the same would be true of Turkic at the time of the phonologization of vowel harmony. This durational asymmetry seems not to affect the dominant direction of coarticulation in Modern Turkish (Inkelas et al. 2001). It could, however, produce significant changes in the perception thereof. Turkish vowel-to-vowel coarticulation patterns received absolute measurements in Inkelas et al. (2001). These were arrived at through comparison of mean formant values at vowel onsets or offsets in a coarticulated context (adjacent vowel different) with the corresponding values found in a baseline context (adjacent vowels identical). In a relative sense, however, an absolute coarticulatory effect of a given magnitude would nonetheless occupy a smaller portion of the total duration of a longer vowel than it would of a shorter vowel. This coarticulatory effect, however strong in the absolute sense, could then prove perceptually less salient on the longer initial-syllable vowel than on a shorter one. In Turkish this would mean that the overall stronger effect of anticipatory coarticulation might nonetheless fail to be perceptually robust on the lengthened vowels of the initial syllable. By the same token, carryover coarticulation, however weak overall, would receive additional salience perceptually on the shorter vowels of non-initial syllables. This durational skewing effect allows us to understand why, stronger direction of coarticulation aside, Vowel 1 of a Turkic word might still be less likely to assimilate to Vowel 2 than viceversa. All the foregoing, however, buys us no more than a single sound change: Vowel 2 assimilates to Vowel 1 in frontness/backness in PreProto-Turkic. But this alone cannot be the full story. I am also less than sanguine about the plausibility of an analysis in which word-domain harmony is brought about gradually by the methodical creep of palatality across from left margin to right in the word. Rather, the sound change described here must account for only the first step in the rise of Turkic vowel harmony. The remainder of the process would then have to be analogical in nature.

Vowel harmony and initial syllables

199

This idea receives support from the fact that the overwhelming majority of roots reconstructed for Proto-Turkic are either one or two syllables in length. Trisyllabic roots are shadily attested at best (Johanson 1998).135 Assuming also the possibility of the addition of suffixes to the roots in question, we must bear in mind crucially the following word forms in any discussion of the emergence of vowel harmony: (6)

Some Crucial Proto-Turkic Word Shapes136 a. [[CV.CV]root] b [[CV]root[CV]suffix] c. [[CV.CV]root[CV]suffix]

Assuming now that Vowels 1 and 2 in (6) disagree in frontness/backness, the application of a sound change assimilating Vowel 2 to Vowel 1 in this respect yields the results schematized in (7): (7)

Assimilation of V2 to V1 a. b c. d.

[[CVαCV–α]root] > [[CVα]root[CV–α]suffix] > [[CVαCV–α]root[CV–α]suffix] > [[CVαCV–α]root[CVα]suffix] >

[[CVαCVα]root] [[CVα]root[CVα]suffix] [[CVαCVα]root[CV–α]suffix] [[CVαCVα]root[CVα]suffix]

The sound change in (a) produces a disyllabic root with palatal harmony, meaning essentially that the overwhelming majority of roots in the language are now harmonic. The output of the change in (b) is a disyllable in which the suffix takes on the palatality specification of the root, creating another word without disharmonic vowels. Example (d) merely shows that if a suffix added to a disharmonic disyllabic root has the same [back] specification as the first syllable of the root, the output of the sound change will be a completely harmonizing trisyllabic word. The only problem among these examples, in fact, is form (c). Here the output of the sound change is a harmonized disyllabic root with a non-harmonizing suffix. The crucial analogical step occurs here. The addition of the same suffix to [+back] and [–back] monosyllabic roots means that the sound change shown in (7) has the effect of creating an alternation whereby the choice of suffix vowel is dependent on the identity of the root vowel. The fact that no alternation occurs when the root is disyllabic creates irregularity in the choice of suffix vocalism: Sometimes the suffix vowel harmonizes, and sometimes it does not (the addition of two suffixes to a monosyllabic root potentially creates the same

200

Initial syllables

irregularity). This irregularity is removed from the system simply by the generalization of the harmonizing allomorph to all forms, as illustrated in (8) where the plural suffix -lAr is taken as representative. (8)

Generalization of suffix alternation pattern: Proportional Analogy CV[α]C X

=

:

CV[α]ClA[α]r

::

CV[α]CV[α]C

:

X

CV[α]CV[α]ClA[α]r

At this point we have a system with all trisyllables, regardless of morphological structure, displaying palatal harmony. The extension of the alternation pattern to longer strings of suffixes is not difficult to conceive. It is worth noting that just as the generalization of the alternating allomorph is possible, we might equally well expect the reverse, viz., instances of generalization of the invariant allomorph to all forms. The fact that some suffixes are reconstructed for Proto-Turkic as non-harmonizing could be an indication that this in fact took place. This account derives palatal vowel harmony in Turkic from a smallscale durational difference between initial and non-initial syllables, a difference which, while not enough to cause articulatory difficulties of the type implicated in Chapter 2 in the development of UVR, is nonetheless sufficient to cause a skewing in the perception of vowel-to-vowel coarticulation, such that a non-initial vowel might be reinterpreted as agreeing with a preceding initial-syllable vowel in some featural dimension. Note also however the importance of the role played in this account by root-structure and morphology. It is significant indeed in light of this that the Altaic, Uralic and Bantoid languages with initial-syllable strength manifest as vowel harmony all have primarily mono- and disyllabic roots, suffixing morphology,137 and stress (if any) not strongly cued by duration. It is possible that a similar account could be viable for others of these cases as well. 4.5.4. Bantu Turning now to a case of initial syllable strength for which past or present initial stress is clearly not a potential explanation, it is possible, given the lack of duration-cued stress in most Bantu languages (and certainly in the proto-language), that the same pattern of initial strengthening observed in Turkish is or was active in Bantu, and could be implicated in the development of Vowel Height Harmony in those languages. Unfortunately,

Vowel harmony and initial syllables

201

I know of no experimental results providing evidence one way or the other. Researchers such as Hyman (1987) and Goldsmith (1982) have certainly proposed an abstract “accent” on the stem-initial syllables of some Bantu languages. Hyman (1987), on the basis of work by Paulian (1974) proposes an analysis of Kukuya placing an accent on the initial mora of the stem. The accent in question is not in any way meant to indicate that Kukuya is a “stress language”; rather, accent here is a slightly more abstract property affecting the licensing potential of consonants and vowels and affecting tone as well. The licensing restrictions are perhaps suggestive of phonetic strengthening, but without explicit evidence thereof no conclusions can be drawn. One fact about Kukuya does suggest a phonetic strengthening of accented material similar to that typically found in stress languages: There is a postlexical variable rule (which is to say, most likely, a phonetic process) of vowel reduction operative in Kukuya affecting material linked to an immediately postaccentual unaccented mora (providing the following mora is also unaccented), such that /a/ in this position is realized as schwa (Hyman 1987: 331). Phonetic investigation into the nature of this accent could yield important results for an understanding of the development of licensing asymmetries in Bantu. Otherwise, it is certainly the case that initial segments in many Bantu languages are subject to initial strengthening of some kind. I have already discussed domain-initial vowels in Luganda and Runyambo. Hubbard (1994) provides evidence from a number of Bantu languages showing significantly greater durations for stem-initial consonants than for comparable consonants stem-internally.138 Languages showing this effect included CiYao and Kikerewe among others (Hubbard 1994: 95, 102).139 Certain Bantu languages of the southern zones, such as Chichewa, have regular lengthening of vowels in penultimate syllables. In some languages, such as Makonde, this has even been analyzed as a penultimate stress with concomitant UVR in unstressed syllables (Liphola 2001). What is fascinating here are the attendant changes in the results of Hubbard’s durational measures for consonants. Specifically, while a first set of measurements makes it appear that in Chichewa as elsewhere root-initial consonants are significantly longer than root-internal, measures taken comparing situations in which the root-initial consonant belongs to either the penult or the antepenult show that in Chichewa consonant strengthening takes place not in all root-initial syllables, but rather in the onset of the penult, where the vowel also undergoes significant lengthening. When in the antepenult, the root-initial consonant is significantly shorter (Hubbard 1994: 148). Now if strengthening of the penult in the relevant languages, an innovation in the Bantu context, is in fact the reanalysis of an earlier prominence on stem-initial syllables (an attested process in the histories of

202

Initial syllables

some stress languages, e.g., West Slavic), as is suggested by the transfer of consonantal strengthening from the initial to the penult, we could hypothesize that the new vocalic prominence on the penult may have had a correlate on initial-syllable vowels as well. Hyman (1999) proposes an analysis of the origins of Vowel Height Harmony in Bantu which adds some credence to this hypothesis. It has traditionally been thought that, for example, in the verbal system, harmonizing suffixes originally contained high vowels, but that when those suffixes were added to mid vowel roots, they eventually came to assimilate to the root in height, themselves becoming mid, and producing root-tosuffix height harmony. The issues are complex and cannot receive justice in the context of this study, but the essence of Hyman’s proposal is this: There are compelling reasons to believe that the suffixes in question may actually have originated with mid vowels. What followed then was a process of raising of mid vowels in non-initial syllables. Such a process, of course, bears a striking resemblance some systems of UVR discussed in Chapter 2. Even more striking is what would then give rise to the roots of the harmonies to develop later: Mid vowels would fail to raise (reduce) only in case the root vowel was also mid. This support provided by the stem-initial-syllable mid vowels for their non-initial counterparts is clearly reminiscent of those similar processes discussed above in the UVR systems of Yakan, Timugon Murut, and Russian dialects. A phonetic durational asymmetry between vowels in stem-initial syllables and other vowels provides an obvious motivation for such an account of the rise of Bantu VHH. Further evidence comes from a group of Bantu languages, such as Punu, which today lack VHH. In Punu mid vowels are excluded from nonstem-initial syllables altogether (Kwenzi Mickala 1980; Hyman 1999: 240). In these syllables only /i, u, a/ of the five-vowel inventory may be realized. Crucially, in Punu non-stem-initial syllables /a/ is realized as schwa, a phonetic process which can only bespeak a concomitant reduction of phonetic duration. Punu, then, can be seen as the natural endpoint of the chain of developments of which VHH in various incarnations represent middle stages, a process of the imposition of gradually stricter limitations on the realization of non-high vowels outside stem-initial syllables. In the ancestor to phonologized systems of VHH (which of course develop subsequently according to a logic constrained not by phonetic limitations), mid vowels can occur in non-initial syllables only when supported by a mid vowel in the initial. Subsequent weakening then leads to a situation such as that found in Punu, where mid vowels are banned altogether from nonprominent positions, much in the manner of a UVR system (though of course, without assuming an initial stress per se in the Bantu cases).

Summary

203

Another consideration, and probably more to the point in any case in the Benue-Congo cases of initial-syllable strength effects is the morphological status of the initial syllable. In the cases of the Bantoid Tiv (Pulleyblank 1988) and the Ogoni Gokana (Hyman 1982) cited by Steriade (1994), the initial syllable which determines the specifications of [high], [low], and [round] (Tiv) and [nasal] (Gokana) of the vowels of following syllables is the stem-initial syllable of the verb. This syllable is also historically the root in both cases, with following syllables being suffixes etymologically at least, though probably not analyzably so in all cases in synchrony (Larry Hyman, personal communication). It is not clear whether it is the psycholinguistic status of the root which allows it to license more contrasts than do affixes, or if this too can be ascribed to phonetics (through the weakening of segmental material in affixes as they become increasingly bound prosodically to roots or stems). In this case it may well be a combination of both, or the phonetic effect may be a direct consequence of the psycholinguistic facts. Whatever the reason, it is quite possible that the status of the initial syllable in these cases as the etymological verb root has more to do with the development of the relevant licensing asymmetries than does the initialness of syllables in question per se. 4.6. Summary In this chapter I have presented evidence from a broad typological survey that phonological strength patterns involving the vowels of initial syllables are substantially rarer than previous treatments of the topic suggest. The vast majority of cases, in fact, can be attributed to a present or past fixed initial stress in the languages in question. Furthermore, while in this group of languages patterns both of vowel harmony and surface inventory reduction as in canonical UVR patterns are attested, in the small set of languages for which initial stress cannot be documented or reconstructed, only harmony seems to be found. I have argued that these typological facts are readily accounted for by the phonologization approach, in that they follow directly from the phonetic profile of initial syllables crosslinguistically. Robust operation of a process of phonetic initial strengthening is well-attested on the initial segments within domains, but unlike final lengthening, its effects in most cases which have so far been investigated do not extend further into the domain than that initial syllable. Thus, while PN effects involving initial consonants and absolute word-initial vowels are both quite common crosslinguistically, PN patterns affecting all initial-syllable vowels are not at all common.

204

Initial syllables

Toward an explanation of why initial-syllable strength effects should appear at all then, I have demonstrated on the basis of experimental work with Turkish that in a language with the right type of prosodic system some phonetic initial strengthening can have a measurable effect on the duration of all initial syllable vowels. Furthermore, I have presented a scenario explaining how the small-scale durational asymmetry this causes between initial and non-initial syllables can lead to the development of a system of vowel harmony such as those attested in the initial-syllable strength languages left over when the initial-stress group has been removed from consideration. The fact that such patterns are as rare as they seem to be follows from several factors about the developmental process I propose: First, only some subset of the world’s languages (of unclear proportions) have prosodic systems (with non-duration-cued stress) which might allow the expression of postboundary strengthening on the vowels of initial syllables. Second, in addition to the sound change phonologizing patterns of vowel-to-vowel coarticulation which is the first step in my account, the development of a harmony system of the kind found in the Turkic languages relied crucially on subsequent morphological changes rooted in paradigm uniformity to extend harmonization across-the-board. Third, these two steps are likely to take place creating word-bounded vowel harmony only in languages of a particular morphological profile; for the process to work, the language must have almost exclusively mono- or disyllabic roots and robust (if not exclusive) transparently agglutinative suffixing morphology. The need for the convergence of all these factors for a true initial-syllable strength effect to arise for vowels explains both why we find such systems in Turkic and Niger-Congo, and why they are so rare otherwise. The UG-based approach to PN in initial syllables, relying as it does on psycholinguistic properties of initial material arising as a consequence of the very fact of that material’s initialness, makes none of the typological predictions recorded above. In fact, it makes concrete predictions which are seen not to be born out in attested typological patterns. Were it the case that licensing asymmetries spring from the fact that it is equally important to the grammar for processing reasons to conserve all segmental contrasts for every segment in the initial syllable, we would expect equal attestation of positional strength effects involving consonants and vowels. This is not the case. Furthermore, if the need to preserve vowel contrasts in initial syllables were truly enshrined universally in phonological grammar, we might expect, for example, the existence of systems of unstressed vowel reduction in which all unstressed vowels save that of the initial syllable undergo neutralizations of contrasts, while the vowel of the initial syllable, because of its psycholinguistic importance, resists the reduction process.

Summary

205

We know there are systems in which absolute initial vowels resist reduction for phonetic reasons (Russian). We have seen numerous instances of phonetic final lengthening giving rise to patterns of final-vowel resistance to reduction, both phonetic and phonological. Crosswhite (2001) provides an analysis of several systems in which the vowels of certain affixes fail to undergo reduction, presumably due to the need to preserve morphological contrasts. But I know of no system in which vowels of all initial syllables, onset or no, are exempt from a process of unstressed vowel reduction that otherwise targets all unstressed vowels. This gap in the typology is expected given the predictions made above as to which languages should exhibit initial-syllable vowel strengthening and which should not. It was predicted that languages like English, in which duration is cued strongly or primarily by stress, should not implement phonetic initial-syllable vowel strengthening. Since, however, it is precisely these languages in which we expect to find systems of unstressed vowel reduction (and not in those, like Turkish, with small or non-existent durational differences between stressed and unstressed syllables), the fact that the two phenomena should not be found to cooccur is unsurprising. In sum, the phonologization approach to initial-syllable strength effects provides an account which is far more accurate than that provided by other approaches, and which allows us to understand both the differences and similarities between PN effects found in initial syllables and those affecting the other positions surveyed in this study.

Chapter 5 Conclusions 5.1. Phonologization and Universal Grammar in typology I have argued throughout this study that typological regularities observed in patterns of positional neutralization are best accounted for using the phonologization approach presented here: Phonetics influences phonology by providing the inputs out of which categorical patterns are created by phonologization. Once phonologized, however, these patterns cease to be dependent on phonetic factors for their existence. It is this last which represents the central argument of this work. Another class of approaches to typological patterns involving the phonetic content of positional neutralization systems seeks to derive the range of attested patterns from restrictions on possible systems stated directly in Universal Grammar. UGbased approaches of this sort fall into two general classes: The first type, essentially Grounded Phonology and its descendents (Archangeli and Pulleyblank 1994, Hayes 1999, Smith 2002), posits “grounding conditions” or filters on the types of generalizations (constraints or otherwise) that can be expressed by the phonological grammar. Such approaches maintain the abstract, symbolic nature of the phonology, but limit it to expressing patterns which “make sense” phonetically. The second type of approach, including Licensing-by-Cue (Steriade 1994) and its descendents, as well as the Integrated Phonetics and Phonology of Flemming (e.g., 2001), seeks to rule out all but phonetically motivated neutralization patterns by deriving neutralization from generalizations stated over phonetic primitives such as segment duration in milliseconds or distance in acoustic space. In the phonologization approach, by contrast, phonology is natural because the phonetic inputs from which most phonological regularities are derived are natural. The phonetics, in other words, determines the shape of future phonological patterns according to its own logic, one not involving abstract categories or structural positions. This view of phonologization has its roots in the listener-oriented approach to sound change developed by Ohala (1981, 1993a), as well as in Hyman’s (1977) work on phonologization. Recent work sharing certain (though not all) aspects of this approach includes Blevins and Garrett (1998), Blevins and Garrett (2004), and the work of Blevins (2004) under the rubric of Evolutionary Phonology. Chapter 1 of this study sketched how the interaction of phonetics and phonology through phonologization takes place, while Chapters 2 through

Phonologization and Universal Grammar

207

4 demonstrated the operation of phonetic principles in phonologization. Chapter 2 demonstrated that the vast majority of unstressed vowel reduction patterns are based on the neutralization of vowel height contrasts due to perceptual difficulties caused by the shrinking of the vowel space produced by duration-dependent undershoot. This very specific and highly consistent, indeed nearly exceptionless, pattern makes perfect sense in the phonologization model of typological patterning. Current UG-based approaches such as that of Crosswhite (2001) predict a wide variety of completely unattested patterns, while failing to provide sufficient rationale for the restricted nature of the attested patterns. Direct Phonetics approaches, on the other hand, restrict what is possible in phonology too severely, as will be argued below in 5.2. Chapter 3 dealt with the behavior of the final syllable in phonology, showing at once a widespread tendency for final syllables to resist various neutralization processes which might otherwise apply to them, but also in some languages to be the target of neutralization processes not affecting other prosodically and segmentally comparable targets. The ambiguous nature of the final syllable, I argued, is due to the complexity of the phonetic patterns characteristic of vowels in this position, and thus perfectly understandable in the phonologization approach. Final lengthening and supralaryngeal gestural strengthening make final syllables phonetically strong; low subglottal pressure, however, together with an attendant drop in F0, lower intensity, and partial or complete devoicing or the onset of non-modal phonation, can also negatively impact the perceptual robustness of contrasts realized in this position. Which tendency ultimately wins out seems to be a matter decided by the particular configuration of the language-specific phonetics of a given system. Furthermore, even when one of these phonetic tendencies is phonologized, creating either phonological weakness or strength, this by no means entails the effacement of the opposing phonetic tendencies, since these two systems function largely independently. From a Pure Prominence UGbased perspective, on the other hand, there is no particular reason to expect such waffling from the final syllables of the languages of the world. If phonological prominence is a matter of Universal Grammar specifying, for whatever reason, whether a given position is to be strong or weak as a licenser of contrasts, the best we can do is to let final position belong to both lists. The precise nature of the patterns arising in cases of either behavior, however, remains mysterious. Chapter 4 contrasted the phonologization approach with a more nuanced UG-based approach deriving PN in initial syllables from psychological characteristics connected with the initialness of the syllable in question. Contrary to what that approach would predict, however, positional

208

Conclusions

neutralization patterns affecting all vowels in initial syllables were shown to be few and far between in the languages of the world (in contrast to the well-attested patterns involving initial consonants, which the phonologization model derives from the equally well-attested phonetic strengthening of precisely these segments, as well as word-initial vowels). In fact, most of the putative instances of PN involving the vowels of initial syllables take place in languages that either have fixed initial stress today, or had fixed initial stress at the time of the development of the patterns in question. There are a few other cases, such as Gujarati, in which the placement of stress is less clear, but in which the patterns of PN are clearly the result of phonologizations affected by durational asymmetries. Remaining cases of initial-syllable vowel PN, in which there can be no question of initial stress at any stage (such as the Benue-Congo examples and potentially also Turkic, though the accentual facts are more ambiguous there) are remarkable in that all show systems of vowel harmony (or derivations thereof) rather than inventory reductions of the UVR type. This regularity among the cases left over when initial-stress languages are removed from the picture is unexpected in the initialness model, but I presented evidence that in certain languages, small-scale durational asymmetries may arise through the application of initial strengthening to the vowels of initial syllables. I also present a scenario for how such an asymmetry could lead to the development of vowel harmony, as in ProtoTurkic. UVR-like patterns would be unexpected, on the other hand, as the durational asymmetries in question would be too small to produce the articulatory pressures that lead to patterns of this sort. In each case (stressed, final, and initial syllable positions), the phonologization model was shown to account better for the full range of typological regularities in PN systems than competing approaches. I have argued that since attested typologies fall out naturally from the phonologization of phonetic patterns in diachrony, there is little reason to imagine that the phonology needs to cover this same ground again, “justifying” within the competence of the individual speaker the functionalist wholesomeness of phonological patterns, which are already in essence a fait accompli for the language. If it is necessary at all that the phonology place restrictions on the phonetic content of the patterns it implements, it should be possible to show that there is some aspect of these patterns which cannot be accounted for through phonologization. We should not simply assume a priori that the phonological grammar contains such information because it is possible to model it in such a way. It must be clear additionally that it is necessary for us to do so. The central problem here is the aprioristic assumption that the phonological grammar must be capable both of implementing the

On the integration of phonetics and phonology

209

phonological patterns of any given human language and accounting restrictively and exhaustively for the range of variation in those patterns crosslinguistically. This assumption (though prevalent in generative phonology, and especially Optimality Theory) is simply unwarranted. It burdens the phonological grammar unnecessarily with the machinery needed to derive phonetically grounded typological regularities, while failing to account for those regularities as accurately as an approach appealing to the phonetics itself. There is still, of course, a long way to go to the complete resolution of these issues. In addition to the search for evidence of phonetic content in phonology in the behavior of individual speakers, more typological investigation may serve to further elucidate the issues at hand. Smith (2002) treats licensing asymmetries between roots and affixes, arguing that such asymmetries are a result of the psycholinguistic status of the root. Here I think this argument is quite plausible. A competing hypothesis, though, might seek to derive such asymmetries from processes of phonetic weakening visited on affixal material as it suffers prosodic demotion to form a single prosodic word with the root.140 Typological surveys may serve to disambiguate further. Other potential loci for the phonologization of positional neutralization patterns, such as closed and open syllables, could also provide further empirical tests for the phonologization approach. The success of the phonologization model in using phonetics to account for typological regularities in phonological patterns suggests the possibility that a synchronic model of phonetics-phonology incorporating phonetic information directly into the phonological grammar might be equally successful. In the next section I will discuss this option briefly, arguing that there are fatal problems with the Direct Phonetics approach which are avoided by the phonologization model. I conclude with some discussion of a few important challenges facing both the Direct Phonetics and the phonologization model of phonetics in phonology, particularly that of socalled incomplete neutralization phenomena, and I present a number of avenues toward the resolution of those problems. 5.2. On the integration of phonetics and phonology Direct Phonetics approaches to phonological typology seek to derive the naturalness of phonological patterns from an integration of quantitative phonetic primitives into the phonological grammar. By asserting that it is the phonetics itself which drives synchronic neutralization patterns, these models rule out all but phonetically natural neutralization patterns from the phonology, and in this manner achieve much the same typological

210

Conclusions

restrictiveness in their predictions as the phonologization model does. Such models include Steriade’s Licensing-by-Cue approach (Steriade 1994, 1997) and Flemming’s integrated model of phonetics and phonology (passim) introduced in Chapter 1. Direct derivation of categorical neutralization patterns from phonetic cues or pressures, however, puts forward strong predictions concerning the relationship between phonetic causes and phonological effects, making this connection ultimately far more intimate than I will argue the facts allow. The phonologization approach presented here, by contrast, maintains phonetics and phonology as two distinct levels of description, whose generalizations are made over distinct sets of representational primitives. Phonetic patterns and phonetic knowledge drive the phonologization process, but not the phonology itself. The same precise accounting for crosslinguistic regularities is thus achieved without forcing those patterns in synchrony to be something which they are not. To illustrate the differences between the two approaches, we return again to unstressed vowel reduction, since a treatment of it has already been worked out in a Direct Phonetics model of the phonetics-phonology interface (Flemming 2001b). An approach deriving UVR from phonetic factors would need to focus, as described in Chapter 2, on the relationship between the time needed to achieve the relatively more open target articulations for the upper vocal tract characteristic of non-high vowels and the duration allotted in UVR languages to vowels in unstressed syllables. Phonetic UVR such as, say, raising of /a/ toward [] or of mid vowels /e, o/ toward [i, u], can be modeled as the interaction of durational targets for vowels in unstressed syllables and constraints mandating minimization of articulatory effect where possible. Shortening the duration of an unstressed low vowel would increase the amount of articulatory effort necessary to reach the low vowel’s articulatory targets in the time allotted, since without altering either the duration of the vowel or the gestural targets the only fix is to increase the speed of the gestures in question. Assuming a wider target window for the F1 of non-high vowels, however, narrowly targeted durations for unstressed syllables, together with a drive to minimize the expenditure of articulatory effort, will force undershoot of these vowels’ openness targets (raising the non-high vowels phonetically). Gestural maxima are thus reached in a timely fashion with an acceptable expenditure of articulatory effort. This describes gradient vowel raising in UVR systems such as that found for unstressed /o/ ([o]~[u]) in Bulgarian or /a/ ([a]~[]) in Russian. Formalized correctly, this system would generate F1 values for the vowels in question as a function of the durations allotted to those vowels in a given unstressed syllable. Where unstressed syllables were shorter (fast speech,

On the integration of phonetics and phonology

211

immediately posttonic non-final syllables) more reduction would take place, and where durations were longer (absolute word-initial or absolute phrase-final position) less reduction would take place. Both models take as their starting point at least this much. To produce neutralization of contrasts, Flemming’s model (2001a, 2001b) uses the interaction of constraints mandating the maintenance of a certain number of contrasts with other constraints demanding that there be a minimum phonetic distance between entities in contrast.141 Put simply, if constraints mandating that the system retain the contrasts between /o/ and /u/ or /e/ and /i/ outrank constraints specifying the minimum phonetic distance along the height dimension that [o] must be from [u] or [e] must be from [i], then no neutralization will take place, and the contrast will be maintained even in such case as the mid vowels are raised perilously close to the high vowels. If, however, Minimum Distance constraints outrank the relevant Maintain Contrasts constraints, and durational targets and constraints against articulatory effort demand realizations of [e] and [i] or [o] and [u] closer to one another than Minimum Distance would allow, the only viable solution is the neutralization of contrasts. By merging /e/ – /i/ as [i] and /o/ – /u/ as [u], we will violate the Maintain Contrasts constraint, but Minimum Distance will be satisfied, as will the duration and effort constraints. This approach, in essence, sees neutralization as the consequence of two contrasting entities being forced so close together in phonetic space by realizational constraints on vowels of a certain duration that in the end the contrast must be abandoned rather than realized in so perceptually non-robust a fashion. Now here the phonologized versions of UVR processes present a fairly obvious problem common to all Direct Phonetics models. The problem generally is in the variability of vowel durations. If it is necessary to specify a particular durational minimum below which a contrast is best left undeployed, there will always be some subset of all the tokens of a target string which for whatever reason (segmental context, speech rate, phrasal position, etc.) exceed that minimum, and also some subset of all the tokens of non-target (e.g., stressed) strings which fall below it. Turning specifically to Flemming’s model, what the scenario sketched above predicts is a distribution for the realization of mid vowels in which, as duration decreases, the vowels are raised proportionally along with it, up to a certain point (specified by Minimum Distance constraints), after which there is a break in the continuum following which neutralization must occur, and for all durations below a certain boundary, mid vowels would simply be realized as high vowels (for example). We picture an /e/ which becomes gradually more and more [i]-like down to X ms., after which realizations jump from having F1 values of [i] plus Y Hz at X ms. straight

212

Conclusions

to [i] at X + 1 ms. But no such systems seem to exist. Rather, systems are either neutralizing or non-neutralizing, phonologized or unphonologized, as detailed in Chapter 2. Russian /o/ does not become gradually more [a]-like in unstressed syllables until it reaches the threshold of contrastibility and leaps to become identical to /a/. It is simply realized in the same distribution as /a/ in all unstressed syllables regardless of duration. I argued in Chapter 2 that this is because after phonologization, a neutralization process is nothing more than the substitution of one abstract symbol for another in a given structurally-definable position, completely divorced from whatever phonetic trends or pressures may ultimately have given rise to it. Direct Phonetics approaches to neutralization that are truly direct predict that neutralization of contrasts in UVR systems should be sensitive to speech rate (or position in the phrase, etc.). The experiment in Chapter 2, on the other hand, shows clearly that while phonetic vowel reduction is indeed sensitive to speech rate, applying more in faster speech and less in slower, phonological vowel reduction, such as the merger of Russian /a/ and /o/ in unstressed syllables, is not sensitive to factors such as these. No amount of additional phonetic duration will avert this merger and bring back the contrast between /a/ and /o/. What all this means is that when we speak of direct reference to phonetics in the phonology, we cannot mean this literally at the level of the individual token in speech production. Some relativizing or generalizing process must be able to intervene, such that the phonetic representations and values the grammar evaluates are actually more like statistical generalizations over distributions of concrete tokens, rather than values for the individual tokens themselves. This result of this generalization (I will call it for simplicity of exposition the mean realization, though in practice it might not be formalized as such) allows the grammar to judge that, for example, while a given posttonic /e/ in Brazilian Portuguese in phrase-final position might exceed even the stressed vowel in duration, nonetheless for a speaker for whom posttonic raising is phonologized, the fact that the mean realization calculated over all tokens of posttonic /e/ is below a certain threshold means that neutralization must apply, final lengthening or no. In other words, variation due to speech rate or phrasal context must be factored out, so the “phonetic” duration the grammar considers is actually relatively abstract. Even this less direct link between average phonetic duration and neutralization turns out to be too constraining though. Abstracting away from actual phonetic conditions in any given utterance or context may overcome the effects of variation in phonetic implementation, but no abstraction cannot help where duration values are on average simply long, while phonological reduction nonetheless applies. As a first example,

On the integration of phonetics and phonology

213

consider the Russian first pretonic syllable described in Chapter 2. Here /a/ and /o/ are realized as something like [a] or [ ], depending on the dialect, the speaker, the speech style, and so forth. In the speech of many Muscovites, the vowel in question is famously a robust [a], with a long duration frequently exceeding even that of the stressed syllable.142 I have no statistical analyses demonstrating the mean duration for /a/~/o/ in this position, but it will certainly be longer than the 90 ms. given by Matusevic¬ (1976) for this vowel in Standard Russian. We might well wonder at this point just how long a vowel would need to be to realize the contrast between /a/ and /o/ robustly. It would seem that even a mean duration such as the Standard Russian 90 ms. is not terribly impoverished in this respect. Additionally, given that a clear [a] is realized here consistently, there can be no question of durational pressures causing undershoot of any kind, so that it not clear why any minimum distance constraints would be violated in the first place.143 The problem is that whatever phonetic circumstances gave rise to this particular neutralization were relevant and operational only hundreds of years ago, when the neutralization was phonologized.144 Synchronically, the neutralization just takes place when the relevant segments occur in specified structural positions, regardless of the those vowels current phonetic characteristics. A better documented case of the same problem is seen in Belarusian. As detailed in Section 2.3.2, Belarusian contrasts five vowels under stress, [i, e, a, o, u], but in all unstressed syllables, only the three vowels [i, a, u] contrast. Underlying /e, o/ are merged with [a]. The problem here is with the realization of that low vowel in unstressed syllables. Czekman and SmuΩkowa (1988: 226) carefully differentiate between the realization of this vowel in the first pretonic syllable, which is a clear low [a], and its realization in other unstressed syllables, where it is raised slightly to what they symbolize using the Greek letter α, and which I will equate with [ ]. This slightly raised low vowel is what most speakers of Contemporary Standard Russian realize in their first pretonic syllables. The raising is slight enough that we might question the extent to which any but the most sensitive MinDist constraints would object to its proximity to the mid vowels. The [a] of the Belarusian first pretonic syllable is a bigger problem though. Here the acoustic distance between the low vowel and canonical mid vowels is essentially the same as it would be in stressed syllables. Little to no phonetic undershoot has taken place, so the Direct Phonetics account predicts that no contrasts should be in danger of neutralization. But they are neutralized nonetheless. The problem here is that Belarusian unstressed vowel reduction is motivated not by the phonetics of Modern Belarusian, but rather by the phonetics of the East Slavic dialects of however many hundreds of years ago (depending on the diachronic story

214

Conclusions

we accept) in which akanje first arose (see Section 2.3.2 for some details). The phonologization approach, since it maintains a distinction between the phonetic source and the phonological implementation of neutralization patterns, has no problem with this. Another case which makes a similar point is that of Shimakonde. As discussed in Section 3.2.3, Shimakonde (all data from Liphola 2001) is a Bantu language of Mozambique with fixed penultimate stress and a fivevowel system [i, e, a, o, u] in that position. In pretonic syllables, there is an optional process of unstressed vowel reduction which merges /e, o/ with [a]. Final vowels are not affected, a fact which I attributed to final resistance to reduction, one of the central topics of Chapter 3. In (1) below, we can see the effects of optional reduction in strings of various lengths. Note that the long vowels in stressed penultimate syllables are derived through a process of penultimate lengthening active in Shimakonde. (1)

Vowel reduction in Shimakonde /kú-tót-án-a/ /kú-tép-án-a/ /kú-bóvól-a/ /kú-tétékél-a/ /kú-tóngódík-a/

‘to sew each other’ ‘to bend from e.o.’ ‘to punch’ ‘to give up’ ‘to lament’

kútótáána kútépáána kúbóvóóla kútétékééla kútóngódííka

kútátáána kútápáána kúbávóóla kútátákééla kútángádííka

What is interesting about Shimakonde in the present context, however, is the following. Not only is the output unstressed short [a] from reduction not subject to any appreciable phonetic raising or target undershoot (which Liphola demonstrates with acoustic measurements), but this same reduction process actually applies equally to long vowels arising in unstressed syllables through glide formation with compensatory lengthening.145 This is shown in (2). (2)

Reduction of long vowels in Shimakonde a. /kú-ék-a/ /kú-ék-áng-a/ /kú-ék-án-a/

‘to laugh’ kwééka ‘to laugh repeatedly’ kwéékáánga ‘to laugh at e.o.’ kwéékáána

b. /kú-óm-a/ ‘to pierce’ kwóóma /kú-óm-án-a/ ‘to pierce e.o.’ kwóómáána /kú-óm-áng-a/ ‘to pierce repeatedly’ kwóómáánga

(*kwááka) kwáákáánga kwáákáána (*kwááma) kwáámáána kwáámáánga

On the integration of phonetics and phonology

215

These facts present a serious challenge to any theory of vowel reduction in which phonetic duration plays a central role, since the vowels in question are obviously not durationally impoverished, and thus should not undergo reduction. A Direct Phonetics approach must therefore see this system as “unnatural,” or something other than vowel reduction as it is ordinarily implemented, and so must analyze systems like it and like Belarusian, where reduction is also clearly not driven by phonetic duration, as something altogether different. “Natural” UVR systems would then be generated by the interaction of duration, effort, and perceptual distance in the integrated phonetics/phonology, while these unnatural systems would have to be relegated to some other subsystem of the grammar, whatever that would ultimately be. And yet on the face of it, these systems manifest the very same phonological patterns as many other typical UVR languages: They prohibit the realization of mid vowels in unstressed syllables, and they do so regardless of the phonetic durations of those syllables.146 Put simply, Belarusian looks remarkably similar to Contemporary Standard Russian in nearly every respect, and yet the Direct Phonetics approach forces us to conclude that it is actually something altogether different. Direct Phonetics creates the necessity for a radical formal dichotomy between these two system types in the absence of any evidence that they in fact differ in any way, save that one raises its low vowels somewhat more toward schwa than the other does. The phonologization approach predicts phonetically unnatural systems of vowel reduction like these (and the others, such as Tromagan Khanty and Sosva Mansi, surveyed in Chapter 2) to be rare or typologically unusual, which of course they are; the Direct Phonetics approach, on the other hand, makes them grammatical impossibilities. One final system will be considered in this respect, as it illustrates both aspects of the abstraction problem quite well. The facts are those of Uyghur, as described in Hahn (1991). Synchronically, this is perhaps best not described as a case of unstressed vowel reduction per se (though it sometimes is, as in, e.g., Kaidarov [1969]). The issues though are identical in any case. As described in Section 3.2.3, Uyghur has a process of reduction which merges the low vowels / , æ/ with /i/ in non-initial open syllables. Initial syllables and closed syllables fail to undergo the process. I have suggested above also that this resistance is understandable if we assume Uyghur prosody to resemble that of its relative Anatolian Turkish in having extra vowel duration in both initial syllables and closed syllables anywhere in the word. Indeed, in Anatolian Turkish, /a/ in open syllables is frequently realized as schwa, even in stressed syllables, while closedsyllable /a/ never undergoes such reduction. One immediately wonders now what the experiment in 4.5.2 was sadly not designed to test: If it is true that

216

Conclusions

initial closed syllables of Turkish are realized with some additional duration, is this also true of open syllables, and if so, is initial open-syllable /a/ also resistant to reduction in Turkish? If this is the case, then we can see the current situation in Turkish as the gradient, phonetic precursor to the clearly phonologized system in Uyghur exemplified below. (3)

Low vowel raising Uyghur non-initial open syllables a. tøpæ-lær-i → peak-PL-3POSS.

tøpiliri

‘their peaks’

sæpær-im → journey-1SG.POSS

sæpirim

‘my journey’

tøæ-dæ-ki → camel-LOC-REL

tøidiki

‘(that which is) on the camel’

ani0a

‘to a/the mother’

bala-n- → bein-PASS-NOM

balini

‘beginning’

arqa-da-ki → back-LOC-REL

arqidiki

‘(that which is) in the back’

b. ana-0a mother-DAT



Several points must be clarified at the outset here. To begin with, as the first form in (b) shows, and as discussed in Section 3.2.3, phrase-final vowels are also resistant to this reduction. Hahn describes this resistance as being total before pause, while in connected speech word-final, opensyllable low vowels are merged with /i/ in a manner that “depends on the rapidity of speech” (Hahn 1991: 52). I took this above to constitute an example of final resistance to a phonologized, categorical reduction process. At the same time, a gradient, rate-dependent reduction process still applying across word boundaries. The analysis in OT-terms was presented in Section 3.4. Secondly, the choice of /i/ as a target for the merged low vowels in Uyghur merits some discussion. As can be seen in the unreduced form of ‘beginning,’ in (b) above, Hahn analyzes Uyghur as maintaining an underlying contrast between /i/ and //, though on the surface no such contrast is realized; the two underlying vowels are identical on the surface. Hahn maintains the underlying contrast as a device to account for the fact

On the integration of phonetics and phonology

217

that certain surface [i]’s condition front-harmonic variants of alternating suffixes, while other [i]’s condition back-harmonic variants of those same. Historically, of course, this is because there were indeed two distinct vowels in the relevant forms, but their subsequent merger has introduced a level of opacity into the system of palatal harmony in the modern language. This said, phonetic realizations of the resulting /i/ appear to vary quite a bit in Uyghur as a function of context, probably reflecting the full range of realizations of both original vowels. Hahn transcribes allophones of Uyghur /i/ ranging over the following values: [i, , e, , , , ]. Low vowels which have been merged with /i/ due to the raising described here range over the same regions of the vowel space as a function of segmental and prosodic context. In other words, the merger is categorical. Diachronically, I expect what has happened here is this: An original phonetic process raising the low vowels to something like [], analogous to the process we find today in Turkish, ultimately causes the merger of those reduced low vowels with the high, back, unrounded vowel [] in the relevant contexts. Indeed, in Anatolian Turkish the schwa resulting from reduction of /a/ in the same contexts is acoustically perilously close to [], making them fairly difficult distinguish (at least for this admittedly nonnative speaker). A subsequent sound change then merges original /i/ and //, resulting in the phonetically rather dramatic alternations between, e.g., [a] and [i] in the modern language. What is most interesting again for our purposes here however, is the behavior of the long vowels of Uyghur, original and derived, with respect to this raising process. Uyghur has a series of inherited long vowels, which Hahn analyzes as still contrastive in the modern language. The distinction, however, is mostly perceptible (“as far as average aural perception is concerned”) only in lexically stressed syllables receiving primary phrasal stress as well (“…particularly where such a strong syllable is not wordinitial, which is commonly the case among loanwords.”). “Otherwise”, says Hahn, “there is little or no clearly audible phonetic length distinction” (1991: 55). Since stress is usually final in native Turkic vocabulary, this means that vowels in the raising context (non-initial, non-final, open syllables) are typically unstressed, so that long vowels in this context should be phonetically indistinguishable from short vowels in these positions. Still, the long vowels resist reduction. This is shown in (4) below. (4)

Non-reducing “long” vowels in Uyghur /hawa+-da/ air-LOC



hawada

‘in the air’

(*hawida)

218

Conclusions



/bna-da/ building-LOC

binada

‘in the building’

(*binida)

The fact that these non-reducing “long” vowels in Uyghur have average durations apparently little, if at all, longer than the average durations of reducing “short” vowels makes an account of their resistance to reduction driven by phonetic duration, even in the most abstracted sense, implausible. For the forms above, in which the long vowels are root-final, and therefore sometimes lexically (and by extension phrasally) stressed, we could imagine a sort of paradigm uniformity account of their resistance to reduction, but this would do little for non-root-final long vowels, and would in any case make the lack of paradigm uniformity for short vowels seem peculiar. A constraint of the sort IDENT(LOW)-Vμμ mandating faithful realization of the [low] specifications of phonologically bimoraic vowels would of course derive their resistance, but at this point we have taken our abstraction to the point that we are no longer concerned with actual durations even indirectly. If such measures are necessary in order to block reduction from applying, it might well be wondered whether a similarly abstract approach to driving reduction might not be equally appropriate. Interestingly in this connection, the inherited long vowels of Modern Uyghur are not the only long vowels in the language. Uyghur also has an optional process of liquid deletion which results in compensatory lengthening of preceding vowels. This is shown in (5). (5)

Liquid deletion with compensatory lengthening in Uyghur tær bar kælæn bazar

→ → → →

tæ+ ba+ kæ+æn baza+

‘sweat’ ‘go!’ ‘having come’ ‘market’

These long vowels, unlike the inherited long vowels discussed above, are actually long phonetically. In the examples shown above, of course, none of these vowels would be subject to raising long or short, since they are all in word-initial or word-final syllables. The addition of suffixes such as the locative -dA to forms like ‘market’, however, would place the derived long vowel in an environment which for a short vowel at least would condition reduction. If reduction is in fact phonetically driven in Uyghur, we should expect these long vowels to resist it, much as do phrase-final, word-initial, and closed-syllable vowels. This seems not to be the case, however. Unfortunately, Hahn provides no examples positive or negative of derived

On the integration of phonetics and phonology

219

long vowels in the raising environment, so I hesitate to extrapolate a form for the hypothetical ‘in the market’ discussed above. He does however remark that “phonemically short vowels that are lengthened in this fashion are still subject to umlauting and raising where the liquid comes to take the coda position as a result of resyllabization” (Hahn 1991: 56). I take this to mean that raising applies to the output of compensatory lengthening. The problem here is identical to that in Belarusian and Shimakonde: Phonetically long vowels are subject to reduction, suggesting that while historically this reduction process was clearly duration-driven, synchronically, duration is simply beside the point; the process applies wherever its structural description is met, i.e. in word-internal open syllables. Note that a paradigm uniformity account of raising here is unworkable, since no reduction applies in the base form of words like “market” with or without compensatory lengthening. The point in all of the above is that to maintain a “duration-driven” account of many reduction processes, it is necessary ultimately to adopt so abstract a sense of “duration” that the distinction between this phonetically-driven approach and a truly abstract phonological approach becomes more of a terminological quibble than a meaningful distinction. In other reduction cases, mention of duration in any form, however abstract, just ends up making faulty predictions. The phonologization approach provides a phonetically-based account of the rise of reduction systems like that of Uyghur, but rightly stops short of importing phonetic factors into the synchronic account of those systems. Finally, the Direct Phonetics approach makes peculiar diachronic predictions as well. As discussed above, the application of UVR in unstressed syllables in a given language implies that at some level of statistical generalization, the durational value for unstressed vowels is such that they are forced by constraint ranking either to violate Minimum Distance constraints or to submit to neutralizations. Again, this means that in spite of the phonologization of the pattern of neutralization, the connection of the process to phonetic reality is alive and functional. With this in mind, consider a scenario in which the language in question undergoes some change to its prosodic system such that the durational asymmetry between stressed and unstressed syllables evens out (e.g., if the language were to switch from strongly-duration-cued to non-duration-cued or less-duration-cued stress).147 We would now predict that once the mean realization of the durations of unstressed vowels came to exceed whatever threshold would allow Minimum Distance constraints to be satisfied properly again, the contrasts between unstressed vowels which were previously neutralized should emerge again unscathed, as if nothing had ever happened. In other words, were Russian suddenly to adopt the

220

Conclusions

durational targets for unstressed syllables of Polish, akanje should disappear without a trace, unstressed /o/ restored to its fully contrastive realization. I know of no such attested development, nor would I expect one to arise. Indeed, such a possibility seems to contradict the Labovian notion of the impossibility of unmerger (Labov 1994: Ch. 10–14). 5.3. Does “anything go” in phonology? What I am arguing here, then, is that while categorical patterns of positional neutralization are best understood in their typological dimension through the phonologization approach applied throughout this study, in synchrony the best approach to the implementation of phonologized PN is one which accomplishes the substitution of one abstract category for another without reference to or concern for phonetic reality. Phonetic factors in synchrony could of course influence patterns of PN by leading to further phonologizations, perhaps disrupting patterns in some way. It is not, however, necessary for the phonetics to continue to provide formal justification in the synchronic phonology for the existence of the patterns it creates. From this point of view, It is enough for speakers to know that their languages contain a given phonological pattern. They need not also know why. This view prompts an obvious question: Since phonology contains no direct restrictions on the phonetic content of the patterns it implements (whether these be present up front in the constraint system or set back in grounding conditions), is it the case that as far as the phonology is concerned, anything goes? The fact that the vast majority of phonological patterns appear to be phonetically “natural”, or “grounded”, or “motivated” seems to argue otherwise. As I have argued throughout this study, however, this on its own means nothing. In fact, this would be the case with or without phonetic restrictions operating in the phonological grammar. Phonological patterns appear to be “natural” because the primary source of phonological patterns in diachrony literally is natural. Phonological patterns such as those treated in this study arise through the phonologization of phonetic patterns, which are natural by definition. Barring any further developments then, the overwhelming majority of phonological patterns will obey phonetic “restrictions” as an automatic consequence of their diachronic origin. Patterns of morphological change, subsequent sound changes, or even cases such as Seediq UVR in which two unrelated sound changes apply to the same segment in different environments can ultimately disrupt the uniformity of natural patterns by creating correspondences in synchronic PN systems that no longer operate in accord

Anything goes?

221

with their original phonetic logic. But such cases are the exception rather than the norm, involving as they do phonetically motivated sound changes plus some number of additional complicating developments. It is not therefore necessary to assume that the categorical phonology restricts the phonetic content of the patterns it implements in order to account for the fact that the vast majority of sound patterns appear to be phonetically grounded or natural. Such an assumption, in fact, makes the undesirable prediction that the phonology should contain only sound patterns which are phonetically grounded, or at the very least that processes such as those described in the preceding paragraph should be strongly selected against by the phonology. We might expect, for example, that phonetically unmotivated alternations, where they succeeded in arising at all, would be pressured out of the phonology over time. This, however, seems not to be the case. Garrett and Blevins (to appear) demonstrate in several languages the free analogical extension of phonetically unmotivated alternations such that they come actually to play a greater role in the grammars of the languages in question rather than a lesser one. Similar patterns are reported in Anderson (1981) and Bach and Harms (1972). In addition to research into morphophonological change, investigation of the behavior of natural and unnatural sound patterns in child language acquisition might also yield important evidence bearing on these questions. Is it the case, for example, that children’s acquisition of phonological patterns is driven or biased by phonetic knowledge or expectations? Recent work, for example by Wilson (2003), is aimed at discovering just such biases at work in adult learners of toy grammars in an experimental setting, with an eye to demonstrating that knowledge of phonetics guides the types of hypotheses learners of phonology entertain, and ultimately makes certain patterns more easily learned than others. Given ongoing efforts in this line of research, it is important at this point to state explicitly what the phonologization approach presented here does and does not assume on the subject of phonetic knowledge, particularly as this is an area in which there are potential differences between it and other diachronically oriented approaches to phonological typology, such as Evolutionary Phonology (Blevins 2004) or the work of Hale and Reiss (2000). The phonologization approach in no way assumes that speakers, listeners, and learners operate without the benefit of phonetic knowledge, nor does it claim that that phonetic knowledge does not consistently influence the substance of the phonology. On the contrary, for the phonologization approach to make any sense at all, it must assume not only that speakers, listeners, and learners have such knowledge, but in fact that they are awash in it, constantly relying on such biases to construct and maintain the phonological systems they ultimately acquire. Without

222

Conclusions

knowledge of factors like similarity and perceptual distance, basic processes in speech perception and production, such as the categorization of incoming stimuli or the ability to hyperarticulate, become impossible. Indeed, the whole Ohalian notion of listener-driven sound change becomes nonsensical without the assumption that listeners are at every moment making sophisticated judgments predicated on knowledge and expectations of a phonetic nature in order to reconstruct speakers’ intended utterances. It is after all only the tiny fraction of cases in which this process is thought to go awry, either through over- or underapplication of phonetic biases, resulting in sound change. Phonologization is thus predicated on the existence of phonetic knowledge and the ability of that knowledge to influence phonology. What is important here is what does and does not follow from this. It has been argued throughout this work that, given our findings concerning the nature and implementation of positional neutralization patterns in the languages of the world, there is substantial empirical reason to draw a distinction between processes operating at two distinct levels of description, one gradient and sensitive in its application to quantitative phonetic factors, and one categorical and synchronically insensitive to such factors. Furthermore, it is a matter of fact that patterns at the latter level of description typically derive diachronically from patterns at the former level of description in any given language. The question to be asked is now this: Given the typological observation that the overwhelming majority of phonological patterns can be described as “natural,” are we justified in positing that they are in any way restricted to be this way as a consequence of some aspect of their formal encoding? I have argued throughout that the fact that phonology is influenced by aspects of language-specific phonetics does not imply that it is encoded in terms of those phonetic factors, or that it is formally restricted to carry out operations that are sanctioned by the phonetics. Indeed, evidence from a wide variety of phonological patterns was presented in support of a phonological grammar, the workings of which are operationally oblivious to phonetic information. If approaches relying on the importation of phonetic factors into phonology thus succeed in achieving narrower typological predictiveness than the Pure Prominence approaches sketched above, they do so only at the heavy cost of predicting too narrow a dependence of phonological patterns on their phonetic antecedents. Phonology of course places its own restrictions on the implementation of sound patterns, involving perhaps concerns of predictability and symmetry rather than phonetic naturalness. It is the task of the phonologist to determine what these restrictions are and how they function to mold the inputs provided to the phonologization process by the phonetics. It is by

Incomplete neutralization

223

concentrating on concerns such as these that we will ultimately locate the contribution the phonology itself makes to the shapes of the processes it implements. 5.4. Further challenges: incomplete neutralization The foregoing discussion has been predicated to a great extent on the assumption that we, on all sides of the debate, at the very least know what the facts are. Which is to say, we know the data, and we know what it is that we are trying to model in investigating the phonetics-phonology interface. There exists by now, however, a sizable body of experimental literature which calls much of this assurance into question. Beginning in the late 1970’s, and gaining support as time goes on, evidence has accumulated suggesting that many processes that have traditionally been considered neutralizations are in fact not complete in the sense we have been assuming here. Throughout I have relied on the distinction (and therefore on our ability to distinguish) between processes which are categorical and phonological, and those which are gradient and phonetic in nature. Gradient processes may change phonetic targets for a given phonological entity such that its realization comes to approximate the realization of another entity. In extreme cases, the distributions of the two categories may come to overlap to such an extent that many if not most tokens of either become perceptually ambiguous. Still though, they remain distinct, however subtly. Categorical processes, on the other hand, operate by changing category labels in the string to be pronounced. A categorical merger of the type we have been considering takes two distinct categories and merges them in some context such that they become representationally identical. Crucially, representational identity has implied identical realization as well: The formerly distinct entities should share identical distributions in phonetic space. No trace of their original category membership should remain. If, for example, the category label [+voice] is changed to [–voice] on a word-final obstruent, for any measure capable of distinguishing the two, values for the devoiced obstruent should be statistically indistinguishable from values for analogous non-alternating [–voice] obstruents. For final devoicing in particular, however, studies from languages such as German (Port and O’Dell 1985; Port and Crawford 1989), Dutch (Warner et al. 2004; Warner et al., in press; Ernestus and Baayen, in press), Catalan (Dinnsen and Charles-Luce 1984), and Polish (Slowiaczek and Dinnsen 1985; Tieszen 1997) have all been produced showing that, although word-final obstruents are nearly identical with respect to phonetic

224

Conclusions

cues for voicing, a small, but measurable, difference nonetheless remains. In what is perhaps the most carefully controlled and comprehensive study of this phenomenon to date, Warner et al. (2004) demonstrate for Dutch final obstruents significant differences in both duration of the preceding vowel (longer for underlying voiced) and stop burst duration (longer for voiceless), suggesting that these segments retain phonetic characteristics of their underlying counterparts even in the supposedly neutralizing final devoicing environment. These mean differences, however, are exceedingly small, just 3.5 ms. in the case of preceding vowel duration in Warner et al.’s Dutch study. Similar results have been obtained by researchers investigating flapping in English (e.g., Fox and Terbeek 1977, Zue and Laferriere 1979), and tone sandhi in Mandarin (Peng 2000) and Cantonese (Yu 2005), as well as in investigations of apparent neutralizations in child phonology (Scobbie et al. 2000). Incomplete neutralization (hereafter IN) has inspired a great deal of skepticism among phonologists, in part just due to a general uneasiness inspired by the idea that we cannot in fact trust our ears in the description of sound patterns, and that without painstaking phonetic analysis, virtually any of our inherited assumptions about the way sound patterns work could turn out to be wrong (see, e.g., Manaster Ramer 1996 for an expression of this frustration). If trained phoneticians cannot reliably discriminate these incompletely neutralized sounds, goes the reasoning, then how could naive speakers possibly do any better; and if speakers cannot perceive these distinctions, then they cannot learn them, and in any case they would be wasting their time to do so, since no one will be able to tell whether they are deploying them correctly or not in the first place. It has since been demonstrated (Port and Crawford 1989; Warner et al. 2004; Ernestus and Baayen, in press), however, that, remarkable as it might seem, listeners can indeed make use of these small phonetic cues in disambiguating apparently neutralized lexical contrasts. Warner et al. demonstrate this with a perception task in which they present subjects with tokens from their production study of the same phenomenon and ask them to identify which form from a given minimal pair the stimulus represents. Listeners’ performance, while for obvious reasons not particularly accurate, was above chance. Ernestus and Baayen demonstrated that listeners made use of such low-level cues to voicing for word-final obstruents even in nonce words of Dutch. Their task was one in which listeners were required to produce past-tense forms for a set of pseudo-verb stimuli; the shape of the relevant past-tense suffix in Dutch (-te or -de) is sensitive to the underlying voicing of stem-final obstruents. Listeners heard pseudo-verb forms which underwent final devoicing, some of the stimuli ending in truly voiceless obstruents, while others were endowed with what Ernestus and Baayen call

Incomplete neutralization

225

“slight voicing” (the phonetic result of IN). The effect of this slight voicing was to skew listeners’ decisions in a small but significant way toward the voiced-initial version of the past-tense suffix. This experiment is particularly significant because it avoids one of the typical pitfalls of IN studies, the paralinguistic contamination problem. Much of the skepticism about IN has been centered on the fragility of the experimental results, not just the size of the differences reported, or the fact that these differences seemed to emerge only averaged over a reasonably large subject pool, but in particular, that the robustness of the IN finding seemed to diminish, the more naturalistic or ecologically-valid the task became. The closer to natural speech the study approached, the more ephemeral the result became. Indeed, IN seems to emerge most unquestionably in tasks that involve a sort of paralinguistic component, such as dictating potential homophones to a listener (as, e.g., in Port and Crawford 1989). Port and Crawford argue that this demonstrates that IN is under speaker control, and is actively employed in appropriate pragmatic contexts to disambiguate possibly confusable strings. Fourakis and Iverson (1984), on the other hand, argue that this characteristic of IN marks it as a laboratory artifact, a hypercorrection that one can elicit if one chooses, but which has no place in the study of ordinary language use. Indeed, even Warner et al. (2004 and in press) remain skeptical for this reason, and in fact demonstrate that Dutch speakers will also produce IN-like phonetic distinctions in the lab when asked to read pairs of words which differ only orthographically, but not phonemically. The relevant distinction was between orthographic geminate and singleton obstruents in Dutch, a spelling distinction with no phonological significance. This result, combined with the apparent either absence or decreased consistency of incomplete voicing neutralization in languages which do not spell final underlying voicing distinctions (Kopkallı 1993 for Turkish, Charles-Luce 1993 for Catalan), leads Warner et al. to suspect that IN, though real enough, is still somehow unnatural or paralinguistic. The above shows that small subphonemic cues for contrasts like those found in IN can be the result of hypercorrection or spelling pronunciation. It does not, however, show that this is always the case. The study described above by Ernestus and Baayen goes some distance to dispelling worries of ecological soundness in IN experiments, in that it shows listeners making use of IN cues even in a task that, unlike those of previous studies, does not actually require that they do so. Underlying voicing in Dutch is, as Ernestus and Baayen show here and elsewhere, to a large extent recoverable from the structure of the relevant syllable rhyme, assuming of course that the task is not one of discriminating minimal pairs, as IN perception studies often are. More important still in this context, however, is evidence from

226

Conclusions

studies showing IN of contrasts that are not in fact spelled anywhere in the language. The tone studies of Peng for Mandarin and Yu for Cantonese mentioned above are significant in this respect, as tone is not represented in Chinese orthography, meaning that IN of tone in sandhi contexts cannot possibly be due to a mere “spelling pronunciation.” Despite continuing skepticism, then, we must I believe conclude that the bulk of the evidence weighs in favor of the existence of incomplete neutralization as a naturally-occurring phenomenon, and requires that our models of the phonetics-phonology interface have some way of accounting for it directly, rather than merely dismissing it as an artifact or a curiosity. This being the case, there are certain obvious problems to be addressed. For the phonologization approach, the main problem is that IN calls into question the fundamental distinction being made here between phonetic and phonological processes. If phonological neutralization processes operate by changing category labels in symbolic representations, and if we expect the mapping between phonology and phonetics in speech production to be such that a single category label in a given context should be interpreted in a unique and unified fashion, then we have no obvious explanation for cases in which small-scale phonetic differences are found between what are ostensibly instances of the same phonological category. Matters are complicated all the more if we assume the standard feedforward modular approach to the grammar (as characterized and challenged in, e.g., Pierrehumbert 2002).148 In this sort of model, the grammar operates in such a way that abstract, underlying representations of morphemes are retrieved from the lexicon, after which they are submitted to the phonology, the output of which is sent to the phonetic component for interpretation and eventual production. This strict modularity allows the phonetics access only to the representation provided to it by the phonology; derivationally prior forms of that representation, such as the shapes of morphemes as they exist in the lexicon, are no longer available. What this means is that once a categorical neutralization takes place in the phonology, the phonetics has access only to the result of that neutralization. Underlying specifications for [voice], for example, should be unavailable to a phonetic component interpreting the output of a phonological process of word-final devoicing. IN as it is usually interpreted seems to present a counterexample to this prediction, since the phonologically neutralized contrast is nonetheless reflected in the phonetics. Perhaps the simplest response to the challenge posed by incomplete neutralization, and the point of view I adopt in this work, is one which accepts the phonetic facts as they have been described, but questions the level of description at which processes subject to IN in fact apply. It has been assumed, following traditional descriptions, that word-final devoicing

Incomplete neutralization

227

in the languages in question is a process which must take place in the phonology, and yet it is not at all clear that this is so. I take IN, in fact, simply to be evidence to the contrary, evidence that for the relevant dialects of the relevant languages, word-final devoicing is a gradient process operating in the phonetics. On this account, IN is simply yet another instance of the operation of phoneme-to-sound mappings of the type discussed throughout this work, whereby the phonetic targets for two categorically distinct entities come to overlap to such a degree that they are rendered almost, but not entirely, identical. As scandalous as IN has seemed over the years in discussion of consonants, identical facts seem to provoke far less outrage and dismay from phonologists when they affect the realizations of vowels.149 Situations of what can only be described as IN have been presented in numerous places in the preceding chapters as instances of the state in which systems find themselves immediately prior to the phonologization of some merger. Recall the case in particular of the vowels /o/ and /u/ in unstressed syllables of Bulgarian. The distributions of these two vowels overlap to an extreme degree, bringing them so close together in production that certain disambiguation is nearly impossible (in casual speech at least). Nonetheless, the data presented by Pettersson and Wood clearly shows overlapping, but non-identical distributions for these two vowels in unstressed syllables. They conclude on this basis that reduction in Bulgarian is graded, not discrete. Why phonologists continue to ignore these results and insist on the categorical application of vowel reduction in Bulgarian is not at all clear, but if the phonetic results are accurate,150 then reduction of (both) Bulgarian mid vowels is simply incompletely neutralizing: Reduction is gradient, with targets overlapping to such a degree that the two vowels are realized almost, but not quite, identically. From the point of view of acquisition and perception, of course, this situation is a precarious one. If the vowels are indeed so close together that they are difficult, sometimes impossible, to tell apart, then listeners will be prone to confuse them, and to the extent that they do so, this contrast will be in danger of being lost in unstressed positions. When it is, Bulgarian vowel reduction will have been phonologized. Incomplete neutralization, then, is the phonetic precursor to phonologization. Thinking again about voicing, we know that some languages (e.g., Hungarian) maintain a voicing contrast in word-final position which is subject to little pressure from phonetic devoicing. Other languages, such as English, also maintain a voicing contrast in this position, though in practice word-final (nonprevocalic) stops have little actual voicing during closure.151 What I am arguing here then is that languages described as having IN of word-final devoicing are in fact just one step further along on the road to

228

Conclusions

phonologization of this merger: Phonetic targets for voiced and voiceless obstruents overlap to such a degree that listeners can only distinguish the two categories with limited success. Obviously, this account hinges on the existence of languages where word-final devoicing is in fact phonologized, such that neutralization is in fact complete. As already noted, studies like that of Kopkallı (1993) go a long way to bearing out this prediction. Turkish, it would seem, has phonologized devoicing where Dutch has not. Dinnsen and Luce’s (1984) result to the effect that some, but not all Catalan speakers tested display IN of final devoicing makes sense in this light as well; we might understand this as a case of phonologization in progress, with some speakers merging the two obstruent series, where others have yet to do so. The fact that the Dinnsen and Luce study involved only five speakers (two with complete neutralization, three with IN of at least some variables) of course leaves any such conclusions sorely in need of extensive empirical follow-up. More important are the results of Tieszen (1997) regarding the status of final devoicing in Polish. Polish IN has been a matter of some dispute over the years. Slowiaczek and Dinnsen find IN of final devoicing in 1985. Jassem and Richter, however, fail to find it in 1989. Tieszen’s study of final devoicing in Polish finds it in two dialects, though with different phonetic manifestations, and fails to find it in a third dialect. This situation makes final devoicing in Polish look analogous to raising of unstressed /e/ in Bulgarian, a process which is not present even phonetically in the west of the country, which is present phonetically to a greater degree as one moves eastward, and which is (probably) phonologized in the eastern dialects of Bulgarian. Findings of this kind, if they hold up to further empirical scrutiny, would provide strong evidence for the view of IN as a stage in the process of phonologization. Once again, the picture is this: Two distinct phonological categories begin, under pressures such as undershoot, coarticulation or the like, to approximate one another phonetically. Their distributions come to overlap significantly, making many realizations of both perceptually ambiguous. This is the kind of gradual, historical change that is sometimes called drift, and which is sensitive to factors such as token frequency of the words containing the relevant segments, with highfrequency words in the lead, and low-frequency words lagging behind. See Pierrehumbert 2001 for an exemplar-based model of how this should operate. It is this stage that we call IN, or in instances where no morphologically related forms support the underlying identities of segments, “near-merger”. Now, however, at a certain point, the distributions of the two categories comes to overlap to such an extent that they become impossible to disambiguate reliably. Consistent misperception leads to a relabeling of the distributions in question, a merger of the two

Incomplete neutralization

229

categories. Relabeling should be abrupt and not sensitive to frequency differences, since it operates on the labels unifying the entire categories, rather than on some subparts thereof. This type of categorical merger, impossible to reverse, is what is usually called neo-grammarian sound change. This would be what has happened with Turkish final devoicing, as well as in the Polish dialects not exhibiting IN. Counterevidence to the feed-forward, modular implementation of this view at least would come in the form of clear evidence that the neutralization of the features in question must take place in the phonology. If, for example, we could find evidence of rule interaction in which some other clearly phonological process (i.e. one which could not itself be analyzed as yet more gradient, phonetic IN) treated the output of IN as a categorical merger (i.e. if the phonology could “see” the merger), the strong version of this view would have to be abandoned. If, for example, some other part of the phonology of Bulgarian indisputably treated underlying /o/ in unstressed syllables as [+high], treating mid vowel reduction in that language as purely phonetic ceases to be an option. The same is true for final devoicing in Dutch, German, and the relevant dialects of Polish and Catalan. At the time of this writing, I know of no such processes in any of the relevant languages, though in principle of course such evidence could exist. It must be asked how it is, if the distributions of the two categories truly overlap to such an extent, that distinctions such as those found in IN systems manage to remain as stable over time as they apparently can. If the danger of phonologization is as real as we predict it to be, it is perhaps a wonder that languages resist it at all for any length of time, and yet clearly they do, to the extent that final devoicing is not a new phenomenon in Dutch or German. Here we must assume that morphological relatives play a crucial role in supporting the imperiled contrasts. The existence of alternations provides evidence for the correct analysis of the nearly neutralized segments, and presumably helps to stave off their reanalysis. To the extent that orthography can play a role in promoting IN, spelling may contribute some conservatism here too. On the other hand, work on another aspect of what I maintain is ultimately the same phenomenon, near-merger (Labov 1994, ch. 10–14), suggests that even in the absence of alternations, the state of near-merger can persist for extended periods (even reversing itself eventually in some cases, something which is impossible once merger is complete). The view of IN presented here has as a virtue the fact that it allows us to see IN and near-merger as fundamentally the same phenomenon, a state of phonetic approximation which often, but not always or immediately, leads to phonologization. The difference would be that IN is supported by alternations, where near-merger typically is not.

230

Conclusions

It should be noted at this point that if this analysis of IN does prove to be the correct one, a serious problem arises for the Direct Phonetics approach to phonological neutralization in general. Driven as they are by the imperative to avoid expending articulatory effort in the service of perceptually weak contrasts, it is unclear how approaches such as Licensing-by-Cue or Flemming’s Integrated Phonetics and Phonology approach to vowel reduction would derive IN. Clearly languages may differ in how much perceptual approximation they will tolerate before giving up on a contrast entirely. This follows from the fact, for example, that some languages do contrast things like stop place of articulation or laryngeal features in non-prevocalic position, even though such contrasts may be generally disfavored crosslinguistically. We could imagine then that languages with IN are simply extremely tolerant of perceptual weakness in contrasts, or that they rank constraints on the maintenance of contrasts so high, that they are willing to expend articulatory effort to produce contrasts so weak that speakers’ identification of them turns out to be at best only slightly above chance. If this is the case, however, we would predict that such languages would lack neutralization processes altogether, at least in the relevant areas of their inventories. Put concretely, a language with IN for word-final devoicing should strive to preserve voicing contrasts at any cost; such languages should not, for example, exhibit complete neutralization through obstruent voicing assimilation, since the perceptual cues for obstruent voicing in preobstruent position could hardly be worse than they are in word-final IN in those languages. If what is truly at issue in neutralization is perceptual robustness of contrast, we would not expect the language to maintain a contrast so quixotically in word-final position, only to abandon that same contrast unceremoniously elsewhere. Again in the realm of vowel reduction, if the approximation of Bulgarian /o/ and /u/ in unstressed syllables is an example of IN, we would predict that no other vowel height contrast in Bulgarian should be categorically neutralized either. This would mean that the approximation of /a/ and /â/ should also be an instance of IN. The full story here has yet to be determined. Most scholars assume this to be a categorical merger (though the same could be said of /o/ and /u/, which we have seen is probably not the case in most dialects). In another example, Padgett and Tabain (in press) present some evidence suggesting that the merger of [e] and [i] in unstressed syllables in Russian may in fact be an instance of IN. If this is true, it is hard to imagine what could be perceptually so inadequate about the potential contrast between [o] and [a] in unstressed syllables such that it is neutralized categorically, while the perceptually hopeless contrast between [e] and [i] is maintained through IN.

Incomplete neutralization

231

There is, however, another approach to IN advocated in various forms in the literature, which, if correct, would potentially alleviate most of the problems raised by IN for both models of the phonetics-phonology interface under discussion here. This approach has its roots in the original view of IN expressed by Port and colleagues in their 1985 and 1989 studies, and sees more modern expressions with more explicit formalization in recent work by Pierrehumbert (2002), Ernestus and Baayen (in press) and Gafos (in press). These approaches all share the fundamental assumption that IN is not an aberration, a quirk of certain systems, or even a state of affairs distinct from others in the typology of neutralization patterns. IN in these approaches is inherent in the nature of the contextual merger of phonological categories. It is an automatic consequence of the structure of the grammar, and if we are only discovering its pervasiveness now, this is simply because in the past we have lacked the tools to identify it systematically. All three of these approaches locate the source of IN in the connection between IN word forms and morphologically related forms in which the relevant contrast is robustly expressed. Pierrehumbert (2002), for example, gives an explicit formulation to this effect. In this approach, predicated on an exemplar-based view of the lexicon, IN follows from the nature of speech production broadly as follows: The relationship between phonetic variables and phonological categories is essentially that of a “map of perceptual space” on which are plotted stored exemplars of words containing the categories in question, together with a set of labels on that map identifying category boundaries. The exemplar “cloud” for a given vowel, for example, is the region of n-dimensional perceptual space in which the distribution of all exemplars of that category are plotted. The exemplars themselves are stored in fine phonetic detail, together with information for each exemplar about lexical context, individual speakers, and the like. Category labels are just a more abstract level of generalization over this same space. For speech to occur, production targets for each category in a given string must be selected, and this selection process is not random. Rather, certain areas of the exemplar cloud for a given category will be more highly activated at any given moment than others. More recent exemplars, for example, should stand out more and bias target selection in their favor. Lexical context has this effect as well. Target selection is biased in the direction of exemplars of a given category actually occurring in the word in question. In this manner, word-specific phonetic patterns of the type surveyed by Pierrehumbert are predicted to arise. In addition, however, we expect biasing of production targets from morphologically-related forms as well. Pierrehumbert notes the tendency for allophony to be transferred from bases to derived forms, and locates the source of this trend here. This, presumably, is what creates IN. In selecting

232

Conclusions

production targets for aspects of a devoiced final /d/ such as burst duration or duration of preceding vowel, even if neutralization is categorical (meaning that the label on the region of perceptual space from which we are sampling is /t/, or [–voice]), the existence of morphologically-related forms containing /d/ or [+voice] exemplars in the relevant position should bias the selection of production targets in the direction of those related forms. A [t] with [d] relatives (i.e. an underlying /d/, if we allow for such things) would end up with burst duration targets selected from the shorter end of the scale (the end closer to [d]-space), and with targerts for preceding vowel duration selected from the longer end of the scale (again, toward [d]-space). The resulting segment would be a phonological /t/ that was ever so slightly more [d]-like than would have been an analogous /t/ with no /d/-relatives in the exemplar space. Ernestus and Baayen attribute IN to essentially the same factors, though in slightly different terms. Also assuming an exemplar-based view of the lexicon, in which storage includes fine phonetic detail of all forms, inflected and otherwise, of a given morpheme, they derive IN from “stochastic information” concerning the probability of slight voicing in a final obstruent, given the choice to produce a particular word form. Crucially, they also assume a process of “lexical analogy,” whereby the fine phonetic detail of one stored word form may influence the fine phonetic detail of another. Again, this would be the case with or without a change in phonological category membership. Gafos too, again in slightly different terms, presents essentially the same picture. Phonological constraints are viewed as “attractors” in Gafos’ nonlinear-dynamics view of the phonetics-phonology interface. The difference between phonological categorization and fine phonetic detail is again not a modular one, but rather a difference in level of description within the same representational space. Phonological constraints may require realization of some category within a particular region of perceptual space, but the existence of morphologically related forms with different realizational profiles may again bias productions toward one or another side of the space of possible productions for the category in question. For Gafos, the results of Port and Crawford demonstrating that speakers of German produced more dramatic IN as a function of the pragmatics of the speech situation (more in dictation of words in isolation to a waiting transcriber, less embedded within sentences read in non-contrastive contexts) are particularly important. Gafos views IN as a result of speakers’ desire to preserve elements of their message in communicatively-compromised contexts. Some constraint requires production of a given segment in such a way that lexical contrasts are endangered. IN is then the strategy of the speaker to give all possible information to the listener concerning his or her intended

Incomplete neutralization

233

production, while still realizing that production within the boundaries of phonetic space acceptable to the phonological constraint system. The approach to IN shared by all of the above researchers is attractive, in that provides a potential solution to a thorny problem, and particularly does so in a way that makes a variety of clear predictions which are also eminently testable. The relevant experiments, once carried out, should give us a clear idea of whether or not this is ultimately the right view both of IN in particular, and perhaps of the phonetics-phonology interface more generally as well. It should be noted, before considering those predictions further, that while this view of phonetics-phonology-lexicon is not compatible with the traditional modular feed-forward view of the workings of the grammar (as shown by Pierrehumbert 2002), nothing in this conception of IN or of the grammar itself is necessarily incompatible with the phonologization approach to typological patterning as presented in this work. Models such as that described by Pierrehumbert, to the extent that they allow for the expression of generalizations both at the level of fine phonetic detail and at the level of the category label, and to the extent that they allow for the formal independence of those two sets of generalizations, are largely compatible with the fundamental claims I have been advancing here. Clearly, however, these models of IN are incompatible with the view specifically of IN which I advance myself in the immediately preceding pages. Crucial avenues for future research evaluating what we might call this “morphophonetic gravitation” model of IN include the following. Firstly, the idea that IN is in fact sensitive to pragmatics is central, for example, to Gafos’ view of how the system functions. Evidence does exist supporting this notion. (Charles-Luce 1993, for example, finds IN in Catalan only where semantic context contains insufficient evidence of a speaker’s intended meaning. Where disambiguating context was present, little effect was found.) Questions remain however concerning the extent to which this varies in ordinary speech in contexts not involving the disambiguation of minimal pairs. Indeed, the study Gafos relies on as evidence for a model which brings IN under active speaker control is Port and Crawford 1989. This study is extremely suggestive, but unfortunately was based entirely on productions of precisely three minimal pairs of German: Rat ‘advice’ vs. Rad ‘bicycle’, bunt ‘colorful’ vs. Bund ‘group’, and seid ‘be-2PL ’ vs. seit ‘since’. This exceedingly limited sample, together with the relatively small number of participants (five) requires that this result be replicated on a much larger scale before we reconfigure our view of the phoneticsphonology interface to accommodate it. Additionally, while Port and Crawford do report no significant differences in subjects’ productions for the different word pairs, particularly given what we now know about word-

234

Conclusions

specific phonetic patterns, comparisons of forms like seid ‘be-2PL’ vs. seit ‘since’, the first of which is frequently a main verb, while the latter is a typically stressless function word, should be avoided in future replications. Since we know that IN can be induced to signal even orthographic distinctions with no phonological significance, studies seeking to dispel once and for all persistent skepticism concerning the paralinguistic contamination problem for IN will be extremely important. Assuming that IN is indeed the result of phonetic gravitation toward morphological relatives, results already obtained in support of an exemplarbased view of the lexicon make several other important predictions about how IN should function, which, if borne out experimentally, would go a long way toward settling this issue. Pierrehumbert (2002), for example, predicts phonetic gravitation to cause gradient transfer of allophonic properties from bases to derived forms in proportion to the morphological decomposability of the derived forms. Her example is the transfer of wordinitial glottalization from align to realign, to the extent that align is transparently present inside realign. She cites as evidence for such gradient transfer the experimental results of Hay (2000) on [t]-deletion in clusters (more deletion in forms like listless, where subjects were unlikely to parse the form into list and -less, less deletion in forms like swiftly, where the morphological parse was more transparent). Importantly, for Hay morphological decomposability is a function of the relative frequencies of the base and the derived form. A form is more likely to be perceived as morphologically complex (and hence more likely to be subject to allophonic transfer, our phonetic gravitation) when its base is a high frequency item, and the resulting derived form is relatively low frequency. Morphological analysis (and by extension gravitation) is less likely where the derived form is high frequency and its base is lower frequency. The prediction then is this: Given a case of putative IN like English flapping, in which underlying contrasts are present in bases (e.g., grate vs. grade), but neutralized in derived forms (e.g., grating vs. grading), we ought to find stronger IN effects where high-frequency bases are paired with lowerfrequency derived forms, and weaker IN (i.e. more complete neutralization) where derived forms are more frequent than bases. To my knowledge, no IN study to date has attempted to control for frequency in any way, much less to seek effects of the kind apparently predicted by the gravitation approach. The results of this study should be telling indeed. Other evidence, however, suggests that the simple base-derived relationship described above may be insufficient to characterize the true complexity of the picture drawn by the gravitation model. One obvious objection to the narrowest reading of proposals like Pierrehumbert’s is that, to the extent that they rely on real perceptual experience of actually-

Incomplete neutralization

235

existing morphological relatives of the forms in question, they make the strong prediction that IN should not be found, for example, with nonce words. Nonce words will by definition have no lexical listings for morphological relatives, and indeed no pre-existing exemplar clouds of any kind. Morphophonetic gravitation should therefore be impossible. Recall, however, that Ernestus and Baayen elicited and evaluated the perceptual efficacy of IN for Dutch final voicing contrasts using only an experimental corpus of made-up verbs. While it is difficult to exclude the possibility that this is again just a paralinguistic effect, the result of spelling pronunciation in an unnatural experimental context or the like, if this result is genuine, then the domain of gravitation must be expanded considerably to include more than just bases transparently present in derived forms. Ernestus and Baayen in fact propose to account for this effect by allowing gravitation due to the co-activation of (orthographic or) phonological neighbors as well. If phonological similarity can induce IN, however, then we ought to be able to demonstrate, for example, stronger IN for nonce forms in dense non-neutralizing neighborhoods and weaker or zero IN in sparser neighborhoods. To take again the flapping example, we might expect strong IN of nonce pairs like vating vs. vading, due to the large number of phonologically similar unneutralized analogues in the lexicon (e.g., rate/raid, late/laid, mate/made, etc.). We would expect (more) complete neutralization, on the other hand, for nonce pairs like soiting vs. soiding, due to the relatively smaller number of legitimate residents of this monosyllabic –oit/-oid neighborhood (Boyd, Floyd, Lloyd, toyed, void and Coit, Hoyt, Voight being some of the very few, and rather marginal, examples). Measures of neighborhood density in such a study would also need to be frequency-weighted given the preceding discussion of that effect as well. This broader view of the gravitation effect probably requires even the view of gravitation within morphological families to be expanded. If even phonologically similar (but morphologically unrelated) forms can affect the realization of IN, we might expect a measure based just on the relative frequencies of the base and the derived form to be too narrow to be predictive for degree of IN.152 Rather, the total relative frequencies of the neutralizing and non-neutralizing allomorphs of the morpheme in question throughout the lexicon would ultimately be the measure that would concern us. We would want to know, most likely, not just about the relative frequencies in pairs like beat/beating and bead/beading, but rather about the frequencies the relevant allomorphs in other forms as well (such as beats with a [t] vs. beater with a flap and beads or even beadless with a [d] vs. beader with a flap, and so forth).

236

Conclusions

The relative frequencies of flapped and unflapped allomorphs of these morphemes across entire morphological families is thus probably ultimately the measure we would expect to be predictive for the application of IN, given the assumptions of the gravitation model. Note that if this effect of phonetic gravitation or subphonemic analogy could be shown to be “omnidirectional” in this sense throughout a morphological family, it would be interestingly different from the more familiar phenomenon of transfer at the level of the phonological category, the cyclic or output-tooutput effects familiar from the literature on morphophonology. The latter of course seem to be primarily, if not exclusively, inside-out or base-toderived in direction. The next crucial prediction made by the gravitation approach to IN, however, is perhaps also the area in which this approach shows the most vulnerability. Specifically, IN is now a natural consequence of the structure of the lexicon and the interface between phonetics and phonology. Cases like final devoicing in Dutch and German, where IN has been robustly demonstrated to occur, should trouble us no more. Now, however, we have no explanation for cases in which IN somehow fails to apply. Indeed, if the gravitation approach to IN is correct, then the studies of Catalan (for some speakers), Polish (for some dialects), and Turkish, all of which failed to discover IN effects, must be due to methodological errors on the part of the relevant experimenters. Somehow, these particular studies missed an effect which the theory now predicts to be present in all cases of neutralization of contrast. This prediction is particularly worrisome given the results of Tieszen (1997), in which the same experimenter using the same methodology achieved different results in different dialect areas of Poland. If we accept the gravitation account of IN, in other words, it is incumbent on us as experimenters to revisit all putative cases of actual complete neutralization with an eye to discovering what went wrong in previous studies. In the context of the present work, for example, my study, and also that of Padgett and Tabain claiming to find complete neutralization for Russian /o/ and /a/ in unstressed syllables must be investigated further. Could we, given the right set of stimuli and the right pragmatic context in our experimental design, elicit a reemergence of this seemingly vanished contrast? And if we succeeded in doing so, could we feel certain that we had not inadvertently activated some sort of paralinguistic mechanism, such as spelling pronunciation or the like? Along these same lines, we must expand the empirical horizon of our IN investigations to cover a greater number of neutralizing feature contrasts. To date, the most robust attestation of IN has come in measures of vowel duration or consonant burst duration. If my conclusions about, for example, Bulgarian vowel reduction are correct, then vowel height is another feature

Incomplete neutralization

237

subject to gravitation, and the tone sandhi studies of Peng and Yu add this dimension as well to our expanding typology. But where, if anywhere, does this end? Are all aspects of phonetic realization subject to gravitation of this kind? Are stem back vowels in bases that combine with frequent front vowel suffixes systematically fronter than stem back vowels in bases that do not? Perhaps they are, but this remains to be determined. Are long vowels that undergo closed-syllable shortening in related forms in a language like Turkish systematically shorter than analogous non-alternating long vowels? The predictions seem to run in this direction, but again, the question is open. Lastly, if the size of an alternant’s exemplar cloud plays a role in determining the strength of the gravitation it exerts on related forms, we might wonder whether distance in perceptual space is a conditioning factor as well. All the examples discussed so far are of forms which could be considered adjacent along some dimension of our n-dimensional perceptual map. Where one alternant has a long burst duration and the other’s burst is short, the result of gravitation is an intermediate value for this measure. Where one vowel is mid, and the other high, again we can easily imagine speakers shifting, say, an [u]-like reduced version of /o/ slightly back toward [o] on the mid-to-high F1 continuum in order to provide the listener with valuable information concerning lexical identity, in the sense that Gafos suggests. But is this possible in all cases of neutralization? We can imagine what Russian speakers might do to an [a] to make it sound ever so slightly more [o]-like: Perhaps raise and back it, perhaps shorten it a bit and add some lip rounding. But take again the case of the reduction of Seediq unstressed /e/ to a high, back, round [u]. Is there any meaningful sense in which that [u] could be made to sound ever so slightly more [e]-like, as a helpful nudge for the listener toward the disambiguation of a minimal pair? Or recall Uyghur raising, a neutralization across discontinuous phonetic space whereby /æ, / are merged with [i], in a context where /e/ is nonetheless realized faithfully. What would be the result of the application of phonetic gravitation to the output of the Uyghur merger of [æ] with [i], and how exactly would this be helpful? Perhaps there are constraints on the application of gravitation involving the type or proximity of phonetic variables involved. Crucially, however, for the gravitation account to hold up, we would not want to find IN to be constrained, for example, by the degree of phonologization of the merger, where new, transparent, exceptionless mergers showed IN, while older mergers exhibiting opacity and lexical exceptions and the like showed none. This, for example, might well underly the apparent absence of IN in Turkish, where non-devoicing forms (admittedly few) exist alongside devoicing ones. Such a pattern, if discovered in our typology, would

238

Conclusions

seriously undermine the attempt to make IN an ordinary consequence of the structure of the phonological grammar, and start to make the account I advocated at the outset of this discussion, in which IN occurs where mergers have yet to be phonologized, seem once again more attractive. The gravitation approach also has the drawback that it fails to allow us to unify IN with ‘near-merger’ in the same way the phonologization approach to both does. For the gravitation model, since near-merger is not supported by the existence of morphologically related forms demonstrating alternations, it must be treated as something altogether different, just the phonetic approximation of two phonological categories that have yet to merge completely. This of course is the same account the phonologization approach gives to near-merger, but this account does not extend to a gravitationinduced version of IN. There, phonological merger has already taken place, but its outcome is altered in the manner described above. The issues surrounding IN are clearly complex, in any case, but the predictions of the competing approaches are both clear and falsifiable. Whatever the outcome, testing these predictions will be an important step forward in the ultimate resolution of the issues this book has throughout sought to address.

Notes

1.

2.

3.

4. 5.

6.

7.

To say that the sole link between phonetics and phonology is diachronic potentially requires qualification, depending on one's interpretation of the bounds of diachrony: In a performance sense, this interface is obviously active in speech production and perception. To the extent that the gap between speakers’ intentions and listeners’ interpretations is the locus of much of the language change which will interest us (as described below), we are right in characterizing this interaction as in some sense diachronic. For that matter, diachrony might equally be construed in a broad enough sense so as to include language acquisition, another crucial arena for the interaction of phonetics and phonology in driving language change. Additional constraints necessary to ensure raising rather than lowering of unstressed mid vowels are not shown. Specifically, Ident[lo] would have to outrank Ident[hi] to make a change of [–hi] to [+hi] more acceptable than the analogous alteration for [lo]. Zoll’s reason for advocating the Positional Markedness approach for certain cases was due to the existence of patterns of positional neutralization in which material banned from weak positions migrates to the strong position and is realized there. The core concept of Positional Faithfulness, that it is more important not to alter input representations in strong positions than elsewhere cannot accommodate this pattern, since in moving the marked features from weak position to strong it has precisely the effect of altering the input specifications of the strong position. This is exactly the kind of issue which this book argues should concern phonologists modeling positional neutralization in the grammar. Indeed, in the case of consonant place and laryngeal features, Steriade presents compelling evidence that traditional syllable-based accounts of licensing asymmetries are descriptively inadequate in precisely this manner. The first two of these are uncontroversially included in most lists of strong/ weak pairs, while the last is a matter of some disagreement. Steriade (1994) singles final position out as strong, while Zhang (2001) and Smith (2002) note that the status of final syllables is somewhat unclear. Final position is also traditionally cited as a monolithically weak position (Hock 1999). I refer here only to aspects of sound patterns in which issues of naturalness or phonetic content are concerned, of course. I do not mean to suggest that the phonological grammar does not or cannot make demands of its own on the processes it implements. The existence of such restrictions should be demonstrated empirically, however, rather than merely assumed a priori. I use duration here only as an example of a factor which can condition variation in the phonetic realization of a given phonological category. This is in no way meant to suggest either that duration is the sole factor involved in

240

8.

9.

10. 11.

12. 13. 14.

15.

16.

17.

Notes such conditioning, or that duration should be equally active in conditioning realization of analogous categories in all languages or all contexts. Differing window widths for vowel height in stressed and unstressed syllables are in some sense the phonetic analogue of Positional Faithfulness constraints in the phonology: the latter prevent categorical changes to input specifications in certain positions while permitting them in others; the former require in those same positions realization of input specifications closer to category centers or prototypes, while allowing more marginal realizations elsewhere. This means that two phonological entities with non-identical feature specifications in a given context must not have identical distributions of their phonetic realizations. This follows from a view of category formation in acquisition as a generalization made from the analysis of statistical distributions (see e.g., Maye 2000) See Myers (2000) for illuminating discussion of similar issues. In the sense in which a UVR system reducing five vowels [i, e, a, o, u] to three vowels [i, , u] is unambiguously based on the neutralization of height contrasts, or palatal harmonies are unambiguously based on the neutralization of contrasts of frontness or backness. Actually six vowels to three. See below. Another example of this neutralization pattern, in its slightly idealized form outlined in Steriade (1994), is that of word-medial short vowels in Hausa. See Section 3.2.2.1 below for details. Proto-Ob-Ugrian, like Proto-Uralic, is reconstructed with fixed initial stress (Sammallahti 1988). Khanty, however, appears to have at least some instances of stress attraction heavy peninitial syllables in an otherwise fixed initial system (Honti 1988). This may mean that synchronically harmony-free dialects of Khanty should not be considered systems of UVR, though at the relevant historical period at least they may have been. Doubly so in fact for this being the second occurrence of this sound change in the history of the language, the first instance being in common with SerboCroatian the merger of the overshort Late Common Slavic Jer vowels (originally short /i/ and /u/), yielding Slovenian / / and SCr. /a/. This stress, however, is not duration-cued (Williams 1989). In fact, Welsh final syllables are routinely longer than stressed penults. This is a result of the fact that stress was previously final, but upon retraction to the penult, left behind traces of that earlier stage in the form of increased duration. Crosswhite (2001) points out that merger patterns are not predictable from the synchronic positioning of the vowels relative to one another in the stressed syllables of the languages in question. This is certainly true, and underscores the extent to which synchronic patterns of PN are independent from the phonetic factors which created them. To understand the development of a given UVR pattern, though, it is necessary to consider not the phonetic realizations of the stressed counterparts of the reducing vowels (since these by virtue of those very realizations are not in fact subject to mergers), but to the realizations of the unstressed vowels themselves immediately prior to the phonologization of the synchronic neutralizing patterns.

Notes

241

18. Translation mine. The original reads as follows: The unstressed vowel “se xarakterizira s po-goljama otvorenost na ustnata kuxina, tâj kato ezikât ima po-nisko polozhenie, i sâotvetno s po-goljama zatvorenost na gârlenata kuxina”. And further, “V akustichen plan tazi promjana v konfiguratsijata se otrazjava vârxu spektâra v sâsredotochavane na akustichna energija chrez povishavane na chestotite na pârvija formant, i sâotvetno chrez ponizhavane na chestotite na vtorija formant — ukazanie za po-otvoren glasezh” (Tilkov 1982: 50). 19. Neutralizing reduction of all vowels, though /e/ in particular, as well as automatic palatalization of consonants before front vowels are indeed features of eastern dialects subject to parody in other parts of the country, as in the mocking phrase [as sm ud dleku, i uvor meku], “I am from far away, and speak with automatic palatalization of consonants before front vowels” (lit. ‘softly’). 20. It may be that the confusion is in their use of the term 'neutralizing', which they employ in several places in an articulatory sense, meaning “becoming more neutral”. For example, on p. 240, of the failure of mid and low vowels to merge completely in many dialects they say the following: “... selective relaxation of articulatory control and neutralization of components of the unreduced vowel would diminish articulatory precision and lead to a spectral transition between unreduced and fully reduced forms.” Describing as they are a gradient “transition” from one vowel to another, they clearly do not mean neutralization in the sense of Aufhebung. They then continue, “Alternatively (italics mine), switching between complete configurations, by substituting [i] for [e] etc. in the underlying motor programme would lead to discrete shifts between reduced and non-reduced forms”, viz. actual neutralization. 21. Still though, a weak, gradient version of the East Slavic pattern present in Bulgarian, where in most dialects (with certain exceptions in the Rhodopes) the directions of neutralization are completely different from those of East Slavic would be a fascinating testament to the staying power of minute aspects of the phonetic detail of a given prosodic system to survive the complete destruction and transformation of that system and to linger in a context in which they are otherwise wholly alien and phonologically unexpected. 22. Crosswhite (2001) presents a similar account of the mergers of the tense and lax mid vowels in Italian (which if true would hold for all other Romance languages with this distinction, such as Brazilian Portuguese). The distinction between tense and lax mid vowels appears to have been neutralized already in Vulgar Latin, as they fail to show distinct reflexes in any daughter languages (Hall 1976). Whether this merger took place before or after the relevant contrasts shifted from being quantity-based to quality-based (/e, o/ vs. /, / instead of /e:, o:/ vs. /e, o/) is not entirely clear. This shift must have been underway by the time some of the early inscriptions at Pompeii, in which we find attested the merger of [] (< Latin short /i/) and [e] (< Latin long /e:/) as [e] (Vincent 1988: 33). A similar merger of the corresponding back vowels takes place later, attested in all of Western Romance, but not in Romanian. These mergers of original short and long vowels of different (but adjacent)

242

23.

24.

25. 26.

27.

28. 29.

30.

31. 32.

Notes heights makes no sense if the vowel system is still quantity-based, but if durational differences were not as significant by the time of the merger, with distinctions based already primarily on quality, then these sound changes are perfectly comprehensible. On normal and casual, Major writes “NOR is the natural speech used in settings which vary from slightly informal to formal — such as a lecture, a newscast, or consultation with one’s colleagues; it is the style which the layman considers good or correct pronunciation. CAS is used in very informal, casual, or intimate settings, e.g., conversation between good friends and lovers. The layman often considers this to be incorrect and sloppy” (Major 1985: 265). Varying rankings of potential Licensing constraints with Faithfulness constraints will see that this is not always the system contrast-enhancing UVR ultimately generates. Different outputs depend on language-specific constraint rankings. The scales assumed by P & S were Syllabic Prominence, distinguishing the Peak from the Margin, and Segmental Prominence, based as here on sonority. It is interesting to note that most of the dialects which actually have this system of vowel reduction also have automatic palatalization of consonants before front vowels, which may not be an accidental correlation in light of the reduction of /e/ to [i] present here and absent or weak in other dialects. There are, however, dialects with more pronounced palatalization than in these and only weaker, non-neutralizing reduction of mid vowels; the correlation is thus not automatic. As noted above, it is not entirely clear whether the contrast between today’s tense and lax mid vowels was eliminated in Vulgar Latin unstressed syllables at a time when it was still primarily a quantity contrast, or whether it had already taken on the form it has in the stressed syllables of the modern languages. Either way, however, insufficient duration to maintain the contrast is clearly the driving force behind the merger. A “drawled”, strong [a] under a rising pitch curve in certain contexts is a stereotypical feature of Muscovite speech. Matusevic (1976) gives “F1 around 700 Hz and F2 around 1000” for first pretonics and stressed vowels, while claiming that the former is perceptually (‘na vosprijatie’) “more back and more closed” than its stressed counterpart, which seems odd. However, Matusevic also lists F1 of the non-first-pretonic unstressed [] as “around 700”, suggesting that these figures may not be completely reliable. Phonetic lowering substantial enough to lead to reinterpretation of high vowels as mid in particular would be a consequence of what Cho calls “sonority enhancement” and documents as part of the complex of strengthening effects found in English in boundary-adjacent and accented syllables. Unstressed [o] is pronounced only in unassimilated loans in Russian, and even in the vast majority of loans it is excluded. For a very few forms the synchonic morphological link was questionable, but at least etymologically correct. For example, /boatj/, ‘rich’, with first

Notes

33. 34. 35. 36.

37.

38.

39. 40. 41. 42. 43. 44. 45. 46.

243

pretonic /o/, is etymologically related both to /bo/, ‘God’, and /uboij/, ‘wretched, poor’ (Vasmer [1950–1958] 1986); whether all native speakers of modern Russian make this decomposition synchronically is not entirely clear. It should also be noted that in many cases forms with underlying /a/ in this experiment had no morphological support of this kind (e.g., /batalon/, ‘battalion’). Here it was assumed that orthography and surface realization alone were sufficient indicators of underlying form. All statistical tests reported in this section done using SPSS 11 for Mac OSX (Copyright 1989–2002 SPSS, Inc.) Scatterplots created with Microsoft Excel X for Mac (Copyright 1985–2001 Microsoft Corporation). Using the formula zx = ( x – μx ) / σx. Whether that specification is independent of the specification of stressed /a/, as the literature suggests, or is in some way just reflecting conditioned variation (window resizing in the terms discussed in Chapter 1) of the specification for stressed /a/ is a matter to be determined through further empirical inquiry. The assumption of a single target is consistent with the interpretation that reduction to schwa is not phonological, all unstressed instances of /a/ having the same featural representation in the phonology. It is not, however, a necessary consequence of that representation. While some languages do avoid stressing inherently shorter vowels such as schwa, fail to lengthen them under certain circumstances otherwise requiring it, and perhaps even enhance their sonority in positions of strength, in such cases the relevant degree of durational enhancement is far greater than the paltry amount in question here. I know of no reason a true schwa should respond so dramatically to so marginal an increase of duration as that found in this experiment. A trace of an earlier prosodic system with bimoraic pitch accent domains. See de Jong (1995) and Cho (2001: 77–80) for discussion of the nature of this enhancement under accent. Stress in Seediq is fixed penultimate. Final -Vw simplified to -V is independently attested and is not the source of [u] here. Other dialects retain /w/. It is tempting to see the final syllable [] realization as crucial to the UVR pattern, but without a better understanding of the distribution of this reflex this might be premature. Principles which may indeed be active in the phonetics. Where S refers to a syllable with primary stress, S to a syllable with secondary stress, and W to an unstressed syllable. Zhang argues on the basis of this evidence for a “direct” approach to the licensing of tone, his formalization of Steriade’s Licensing-by-Cue theory. Simply put, what all the positions preferentially licensing contour tones have in common is additional phonetic duration of the sonorous portions of their rhymes. If the phonology can refer directly to this phonetic duration, it can characterize contour-tone licensing in a unified and explanatory fashion that

244

47. 48.

49.

50.

51. 52. 53. 54.

55.

Notes would be impossible with traditional phonological representations (i.e. through reference to moraicity). Zhang adds to Steriade’s list final syllables acting as strong positions for nasalization and fine degrees of vowel height, but cites no such examples. The claim is ultimately correct, as will be seen below. I leave aside here what Newman and Van Heuven show to be a morphologically-restricted third, intermediate class of vowels: those which are long non-prepausally, but prepausally are realized with the glottal closure characteristic of short vowels, and a duration in between that of ordinary long and short prepausal vowels. Newman notes additional details: recent foreign loans with short /e/ and /o/ in closed syllables (such as [fnsir] ‘pencil’ and [b$s] ‘boss’) constitute lexical exceptions to this process, tending not to merge them with /a/. Also while the word-internal closed-syllable merger is categorical, often even reflected in orthography, non-prepausal word-final short /e/ and /o/ are said only in fast speech to have a tendency to centralize. This suggests that the word-final version of the merger is actually gradient, another instance such as those of Uyghur and Nawuri discussed below in which a word-internal categorical change operates gradiently in word-final position. Still more specifically, the only underlyingly lax vowel in the language is in the lax /-U/ suffix of masculine singular count nouns. The suffix for masculine mass nouns is a tense /-u/, raised from an earlier /o/ to contrast with the nowlax original /u/ of the count-noun suffix. I follow here the transcription of lax vowels with capital letters used in McCarthy (1984). Penny (1969) marks lax vowels with an acute accent (e.g., -ú vs. -u), disturbingly reminiscent of tone or stress. Questions concerning the actual phonetic realizations of the lax vowels would make use of IPA symbols somewhat arbitrary. In fact, the domain of this harmony is actually the clitic group, which Penny calls the “syntagma”. There is no phonemic contrast between tense and lax vowels in Tudanca. This neutralization does create surface contrasts between laxed vowels preceding underlying [i] and tense vowels preceding underlying [e], as in the pair /abri/ [Ábr] ‘open’ and /abre/ [ábr] (Flemming 1993: 11). In fact, the very northwestern Spanish dialects in question also exhibit vowel height harmonies. While the systems are complex (see Flemming 1993 or Dyck 1995 for details), in Pasiego for example, the /-U/ suffix conditioning laxing harmony also raises tonic mid vowels to lax high vowels in a manner reminiscent of the metaphony systems of Italian dialects. This in itself is perhaps the strongest evidence of the type of process involved here. This scenario is not without complications. It is most convincing, to my mind, in instances where the conditioning environment for the change is actually lost, a result of the perceptual failure preventing correct attribution of the coarticulatory effect. In the metaphony case (and others), however, the conditioning environment is not lost. Ohala (1993a: 246–247) suggests that in such cases the listener might perceive both segments accurately, but lack the experience to attribute the coarticulatory effect to its actual source (e.g., the

Notes

56.

57.

58.

59.

245

listener is a child, still acquiring the phonological patterns of the language). One problem here is that it makes clear predictions about differences between child-initiated and adult-initiated sound changes, which, to the extent that these are distinguishable, are wanting in empirical substantiation. Another problem is that the acquisition-based explanation makes irrelevant the fact that the trigger for such changes is in a perceptually-non-optimal environment (or is actually weakened itself articulatorily), but surely this is not accidental. The seeds of a resolution to this problem may lie in another aspect of this type of change. In Ohala’s conception, the listener’s new UR for a given item will often give rise to lexical doublets (assuming the listener does at least occasionally parse this item correctly). It is possible to imagine in such instances a process of leveling across competing representations leading to forms containing both the misparsed target and the original trigger of the change. One is reminded in this connection of the exemplar-based models of the lexicon defended in, e.g., Pierrehumbert 2001). These problems await resolution. For this generalization to hold, of course, systems in which subsequent neutralizing reductions (of all finals to schwa, of -o- to -u-, etc.) have introduced opacity into the metaphony pattern must be analyzed as having a high-mid contrast at least underlyingly. It is possible too that in some ways Ohala’s view and Walker’s are not so mutually exclusive as they otherwise seem. In the Ohalian scenario, for example, the original coarticulatory effect is generally taken as a given. It is something properly “unintended”, which the listener for whatever reason erroneously parses as “intentional”. While Ohala is clearly referring to the realm of the phonemic here, it is by now well known that coarticulation strategies are both planned and language-specific (which is to say, as “intentional” as anything else in phonology). It is also commonly accepted that segments may be cued by features which are not temporally coextensive with them (one thinks of vowel duration as a cue for the voicing distinction in following obstruents in English). It is perhaps not far-fetched to imagine that cues for a positionally-challenged contrast might become phonetically maximized, such that the coarticulatory effect involved in the hypocorrection is in fact particularly robust in those very instances in which the hypocorrections take place. The input to the change would thus be inherently skewed toward the occurrence of that same, perhaps suggesting as well a way of understanding why the change takes place despite the retention of the conditioning environment. All this is fertile ground for future experimental investigation. Those conditions may in fact be the ultimate diachronic source of the contrast (as in the Eastern Andalusian laxing harmony systems in which the lax final vowels of the plural suffix develop from loss of an earlier word-final /-s/, presumably with the intrinsic periods of breathiness on the preceding vowels becoming extrinsic thereby (Sanders 1994). A major problem for the researcher wishing to determine whether, say, a given vowel reduction language exhibits this effect lies simply in assessing the

246

60. 61.

62. 63.

64.

65. 66. 67.

Notes possible significance of a source’s silence on the matter. In the course of this survey it became clear that for the subtler effects even in generously-described languages native to societies with long-standing local traditions of phonetic experimentation (e.g., Russian), only the most conscientious descriptions were likely to contain a direct assertion on this topic, to say nothing of secondary sources from the international phonological literature. Flemming (1993) notes Yakan and Maltese, though see below on some problems here. Kirchner (1998) gives an analysis of Nawuri, but in a somewhat different context. Some descriptions of Belarusian do not note the raising of the low vowel in the reduced system of oppositions outside the first pretonic syllable, and also claim that /e/ outside the first pretonic is realized as [e], curiously reversing its merger with /a/ and /o/ in a phonetically-less-prominent syllable than the one in which the merger takes place. These descriptions are apparently based on the spelling norms of the Belarusian literary language, and do not reflect phonetic reality (Czekman and SmuΩkowa 1988: 229–230). The term used appears to be the Polish translation of Auslaut. Depending on the dialect, or even the individual speaker in some areas, both vowels are realized as [e], as [], or as something varying between the two, in some dialects assimilating to some degree to the height of the stressed vowel (Shevelov 1979: 520). No contrasts, however, are preferentially preserved in this position, as final unreduced mid-vowels harmonize with the earlier stressed vowel. In light of this, Eastern Mari does not seem to exhibit strong licensing of any kind wordfinally, pace Steriade (1994), and thus perhaps belongs more appropriately to the section on neutralization of quantity contrasts in final position below. // and /i/, analyzed as independent phonemes, in fact share the same surface realizations, differing only in the harmony pattern they condition (front or back). Hammond’s analysis admits only one reduced vowel. A possible explanation for this not involving final lengthening directly is that while synchronically all surface final vowels are short (phonologically), underlyingly they are all long. Maltese has undergone a common sound change (to be discussed below in 3.6 and 3.7 whereby long final vowels shorten and short final vowels are deleted. The original length of the remaining final vowels could have prevented their harmonization. Flemming (1993: 21) does not discuss harmony, but notes that final syllables closed and open are never subject to unstressed vowel deletion in Maltese, and attributes this resistance to final lengthening. This may ultimately be so, though I see two reasons for skepticism. First is the tendency for final closed syllables (indeed stronger than for internal closed syllables in some dialects), but not final open syllables, to undergo harmony, suggesting if anything decreased duration in final closed syllables. Another problem is in the nature of the deletion process itself. Syncope, at least in many Arabic dialects is similar to French schwa deletion in that it is restricted to target only vowels in open syllables. While final open syllable resistance is therefore possibly due to final

Notes

68.

69. 70.

71.

72. 73. 74.

75.

76. 77.

247

phonetic prominence, final closed syllable resistance is more likely a consequence of syllable structure. A number of the systems noted here (Shimakonde, Yakan, Murut) have fixed penultimate stress and final syllable vowels which avoid reduction, Yakan and Murut even in closed syllables (an oddity as will be shown below). It is not outside the realm of possibility that the failure of these final syllable vowels to reduce is due not to duration accorded them by final lengthening, but rather due to their presence in the head foot of the word (see Alderete 1995 for discussion of faithfulness effects in prosodic heads). Zoll (1998) presents an instance of this effect in non-final position in Guugu Yimidhirr, in which a contrast between long and short vowels is permitted only in the head foot of the word (the initial and peninitial syllables). In the cases at hand, the lack of any non-final posttonic syllables makes it impossible to determine whether resistance to reduction is a property of the final alone, or posttonic syllables more generally. All instances of final syllable [] have [] in the preceding stressed syllable, indicating their derivation by rounding spread (independently attested in other contexts). Neighboring related languages (e.g., Dusunic Kimaragang, Kroeger 1992) merge /a/ and /o/ in pretonic syllables as //. This situation is most likely the immediate precursor to the distribution in Timugon Murut, which seems to have restored full vowel quality to pretonic non-highs, with [a] and [o] predictable from the identity of the stressed syllable vowel. Stress in Uyghur is generally final in native vocabulary, and is described by Hahn as relying primarily on a pitch distinction, with durational differences between stressed and unstressed syllables being “less pronounced ... than in most European languages” (Hahn 1991: 27). Or at least were found at the time of the development of the raising process. Long vowels created through compensatory lengthening still undergo raising where applicable (Hahn 1991: 56), highlighting the independence from actual phonetic duration of the categorical word-internal raising process. A non-metaphony dialect in which there can be occasional apocope of phraseinternal word-final vowels following sonorants, but no such process phrasefinally. This is in contrast with certain southern dialects, e.g., Pugliese, which reflect metaphony and in which all posttonic vowels are reduced to schwa and word-final vowels are deleted phrase-finally. (Giannelli 1997: 298; Loporcaro 1997: 340–341). See below for discussion of final vowel weakening processes. While Cho found sonority expansion for domain-final vowels, he found no evidence of “featural enhancement” of those same. Under accent, in agreement to some extent with the finding of de Jong (1995), Cho found that [i] was realized fronter (but not necessarily higher) and that [a] was realized lower (but not necessarily backer). Recall that even in Russian final lengthening does not always reverse reduction of /a/ completely. In Bulgarian, for example, phrase-internal word-final unstressed vowels are generally extremely short, moreso even than other unstressed vowels. In

248

78.

79. 80. 81. 82. 83.

84. 85. 86.

87.

Notes phrase-final position, however, those vowels undergo lengthening. This could then be another such case. I have not located any claim in the literature documenting final resistance in Bulgarian, nor have I noticed it impressionistically in my work with the language. Of course, neither of these facts necessarily means that it does not exist. Eastern Andalusian Spanish, where Sanders (1994: 118) claims that phrasefinal vowels undergo an especially strong degree of reduction, and Pugliese dialects of Italian, where posttonic vowels reduce to schwa but are retained, while phrase-final schwa is deleted (Loporcaro 1997: 340–341). Put more strongly, without further elaboration of some kind (e.g., of the type proposed by Smith 2002), it actually predicts the non-existence of those restrictions and differences. In a standard Positional Faithfulness (Beckman 1998) account, in fact, the two would be formally indistinguishable. See also the Volga region Turkic and Uralic languages – Eastern Mari makes sense here. Referring of course only to the Degree Two gradient reduction patterns of these latter, the Degree One patterns being categorical in both. This account is based on the synchronic description of Hammond (1997), in which unstressed word-internal preconsonantal syllables are said to license only the reduced vowel [], while unstressed open final syllables permit [i, u, o, ]. There is the possibility, however, that this account, and by extension the diachronic scenario sketched above are irretrievably flawed in the following sense: There seems to be some circularity in Hammond’s account in the reckoning of unstressed vowel distributions. These are determined by the fact that the only [o], [u] or [i] found in word-internal preconsonantal positions are in syllables said to bear secondary stress (e.g., [12ot3nd], [1mot3l ], etc.). To the best of my knowledge, however, the fact that these syllables bear lexical secondary stress is determined largely if not solely on the basis of the failure of the vowels in these syllables to undergo vowel reduction. Impressionistically at least, these vowels actually seem to undergo a fair bit of gradient reduction nonetheless in ordinary speech, suggesting that the situation in English is not yet fully phonologized in the first place. From Blust (1988), meaning ‘grasp in the fist’. v’alost’, also perhaps ‘withering’, ‘limpness’ or ‘languour’. An important aspect of Hock’s conclusions, though, is on the need to distinguish phrase-final (in his terms “Finality”) effects from word-final (“Right Edge”), in that in phrase-final position there is often phonetic motivation for weakening (e.g., for final devoicing, see below), while in wordfinal but phrase-internal position that motivation may be lacking. This means that weakening processes common to both environments almost certainly begin in the larger domain and are generalized to the smaller. This parallels what we have seen for final lengthening above. Results are for the consonants /t, d, n, l/. The effect was weaker for /l/ than for the other consonants.

Notes

249

88. Stressed vowels rarely devoice crosslinguistically, presumably due to the various language-specific combinations of increases in F0, amplitude, or duration associated with stressed vowels. It should be noted that Turkish does not actually constitute an exception in this regard. Word-final stressed high vowels in Turkish are in fact frequently realized voiceless, particularly when the preceding consonant is a (phonetically) aspirated obstruent (Orhan Orgun, p.c.). In Turkish, stress is cued by raised F0. Amplitude and duration increases are not correlates of final stress (Konrot 1981). Azerbaijani, one imagines, may be similar in this regard. 89. In Turkana, one of the strongest cases, for example, voiced and voiceless vowels are certainly contrasted on the surface in final syllables. Dimmendaal and Breedveld (1986) ultimately argue, however, that this contrast is a derived effect stemming from underlying lexical tone specifications. The analysis is somewhat abstract, perhaps, but clear enough. 90. Where examples of implementationally contradictory phonetic properties would be something like the relationship between nasalization and oral frication, or voicing and turbulent frication downstream, where the realization of the former in each case actually obliterates the phonetic conditions necessary for the implementation of the latter (i.e. a substantial pressure drop across the oral constriction) (Ohala and Ohala 1993). 91. Or potentially ‘pomegranate’, genitive singular. 92. This spectrogram and all the attendant analysis was made using the Praat 4.0.2 speech analysis software (Copyright©1992–2001 by Paul Boersma and David Weenink). Recording conditions, durational measurements, and LPC formant analysis were all accomplished as detailed above for the experiments described in Section 2.5.2. 93. Itself actually potentially contributing some raising of F1 in this example (Gordon and Ladefoged 2001). 94. As compared with, e.g., that of the transition from obstruent to vowel. 95. The additional duration supplied by the misparsed final vowel could actually help in this regard, making a (probably shorter) onset sonorant sound more like a (probably longer) coda sonorant. 96. In treating these facts, of course, one might wonder about the role of the ATR specification of the lost vowel in this connection. The cognate dialect forms retaining the vowel, however, show both vowel series yielding the same result upon deletion. 97. Though creaky voice can of course be produced with higher amplitude (unlike phrase-final irregular phonation presumably), it has been shown that glottalization can actually be cued in some languages that employ it contrastively with lower relative amplitude alone (Baker and Gerfen 2001). The lowered amplitude of the phrase-final syllables could actually increase the likelihood of reanalysis then. Though as noted above higher F0 may enhance creakiness, the direct effect of the irregular pulsing with longer closed periods in the glottal cycle is an F0 drop relative to neighboring modal-voiced segments. 98. And almost exclusively in the case of devoicing.

250

Notes

99. This wouldn’t necessarily be the phonologization of final vowel devoicing though. 100. Men will also choose glottal stop insertion from time to time, apparently when they wish to be “precise”. 101. A small set of long-vowel-final words in Hausa is also realized with final glottal stop, though they do not neutralize durationally with the short vowels (Newman and Van Heuven 1981). 102. It seems likely that the unstressed shorts must have or have had some allophonic glottalization as well, just not enough to sound “clipped”, as would, e.g., a phonologically inserted glottal stop. 103. In the case of full deletion, of course, and absent comparanda or historical records, such a system would be indistinguishable from a system in which the quantity distinction is neutralized, but reflexes of both vowels are retained, described below. 104. The result of the combinations attested above does seem to be that those patterns which would best preserve the quantity contrast are the ones which are best attested. This does not imply teleology, however, or “intelligent design” as wags have put it. These systems are the best attested simply because the other combinations, however often they may emerge, are less likely to remain observable over the long term, resulting in their comparative rarity. The logic here is evolutionary, in the spirit of Ohala (passim), Blevins (2004), Blevins and Garrett (1998). 105. The reverse, however, is attested, if rare: In certain Mbam-Nkam languages of Cameroon, final nasals in various combinations denasalize to become oral stops (Hyman 1975: 257). 106. Certainly the final vowel is undergoing some sort of enhancement of its characteristic features in final position. The possibility of understanding all the features found here, including rounding and nasalization, as serving the perceptual enhancement of /a/ in an environment in which a severe amplitude drop might otherwise obscure it is intriguing. This is particularly so in light of Kingston’s (1992: 101) observation that a small amount of nasalization will cause vowels to sound lower (though a large amount should do the opposite). I will not, however, pursue this line of reasoning here. 107. As, perplexingly, are vowels realized between two palatalized consonants. 108. High vowels are especially prone to such behavior due to their brevity and the potential for the closeness of their constrictions to impede the development of the pressure drop across the glottis necessary for voicing to begin (Ohala 1993a). 109. Indeed, Blevins and Garrett (1998: 527–528) also note a link between final vowel loss (with “compensatory metathesis”) and strong penultimate stress, implicating in particular “the existence of tonic length in peripheral binary feet, where the peripheral vowel is unstressed”. This is particularly interesting in light of the fairly common attestation discussed above of what is essentially the opposite pattern: the tendency in languages with fixed penultimate stress for the vowel of the weak syllable in the final foot to fail to undergo reduction patterns affecting all other unstressed syllables (as exemplified by Seediq,

Notes

251

Yakan, numerous Murutic and Dusunic languages, and potentially Shimakonde, Niemirova Polish and even perhaps Neo-Aramaic). The prediction then should be that fixed penultimate languages in which final syllables resist reduction should not have strongly duration-cued stress (at least as concerns the relationship between the syllables of the head foot). Obviously some of the languages listed above (e.g., Seediq, with optional deletion of many pretonic vowels) appear to implement a good deal of shortening of pretonic syllables. Empirical testing of this prediction would no doubt yield interesting results. 110. See Crosswhite (2001) for other instances of morphological blocking of reduction processes. 111. See the note above on the propensity for final vowels to reduce following long penults. Provocatively, however, Nanai has fixed final stress, which is not duration-cued (Avrorin 1959, 62–3), apparently being manifest primarily by raised F0. (Note discussion of final stressed vowel devoicing in Turkish under similar circumstances.) On the failure to reduce following an identical vowel, see the discussion in Chapter 4 below. 112. Which are both realized as [u] when word-final today due to the vowel reduction described in Chapter 2. 113. Recalling Hyman's discussion (Hyman 1977) of the crosslinguistic frequency of fixed penultimate stress and the insight that metrical prominence may be in some way more perceptible if it has metrical non-prominence following. 114. Speaking, of course, of phonetic lengthening of short vowels in stressed syllables in a system prior to the phonologization of any constraint requiring that stressed vowels or syllables be bimoraic. 115. Smith also presents a rigorous delineation of the type of psycholinguistic prominence that should be involved in determining aspects of the typology of licensing potential. Specifically at issue are positions which have been shown to be important for early-stage word recognition (the relevant discussion is in Smith 2002: 254–306). This is a first stage hypothesized in most models of word recognition in which, in Smith’s words, “phonetic/phonological information is used to identify a set of candidate lexical entries for further examination” and which is followed by “a later stage, in which the selected set is narrowed down (often on the basis of more than just phonetic or phonological information) until the best-matching lexical entry is identified.” (Smith 2002: 257) There is an abundance of strong evidence, surveyed by Smith, demonstrating the importance of word-initial material in precisely this aspect of speech processing. Smith cites further evidence showing that stressed syllables do not exhibit the same importance in this stage of processing. Recalling the discussion of the psycholinguistic status of the final syllable in Chapter 3, if it is indeed solely significance to early-stage word recognition which has the potential to affect the status of a position as a licenser of contrasts, then we might look to this as an explanation of the seeming irrelevance of such information in determining the typology of licensing asymmetries in final position.

252

Notes

116. There is the case of Luganda, in which absolute-initial vowels are restricted to /e, a, o/ out of a standard five-vowel inventory, and these not contrasting in quantity (they have been described as obligatorily long, though Hubbard’s experimental results are somewhat equivocal on this matter), but this restriction does not concern initial syllables with onsets (Hubbard 1994: 161). See below for further discussion. 117. While both magnitude and duration were found to increase in a number of the relevant experiments, correlation tests in Fougeron and Keating 1996 show that the direction or even existence of a causal relationship between these two parameters is questionable. 118. The initial stress in certain Uralic languages, Ob-Ugrian in particular, in which earlier harmony systems have ceased to operate, creates the appearance of the UVR systems of the otherwise unattested type in which front/back or round contrasts are neutralized in non-initial syllables. See Chapter 2 for discussion. 119. Resulting in forms such as Kota an% r c  c  vd4 k , ‘because of the fact that (someone) will cause (someone) to terrify (someone)’. 120. Derived from the loss of earlier voiced aspirates or following /h/ (Ohala 1991). Its development is probably best seen as a case of perceptual metathesis à la Blevins and Garrett 2004. 121. Again, it is important to recognize that the unique prominence of the initial syllable in Turkic palatal vowel harmony, while generally accepted as the diachronic source of the harmony pattern, is clearly not what drives the pattern in the synchronic grammars of languages like Modern Turkish. In Turkish, while the frontness or backness of the vowels in a given word is by and large propagated outward from root to affixes, non-alternating potentially disharmonic vowels in both roots (e.g., [katil] ‘murderer’) and affixes (e.g., the progressive suffix –Ijor, with a harmonizing first vowel, and an invariant second) block rightward spread of backness from preceding vowels and, where additional alternating vowels follow, initiate harmony spans of their own. 122. All values rounded to one decimal place. 123. In Turkish as in English, both word-initial and stressed-syllable-initial voiceless stops receive strong aspiration. 124. See Konrot (1981) for discussion of the possibility that while Turkish final stress is cued solely by F0, the non-final stress of borrowings and toponyms may actually be cued by both pitch and amplitude (though not duration). 125. Interestingly, a measure of vowel reduction, specifically raising of /a/ to schwa, does take place, in connected speech in both stressed and unstressed open syllables (the vowels of which as noted are typically shorter than those of closed syllables). 126. Though many researchers avoid using minidisc as a recording medium due to questions concerning the nature of the proprietary compression algorithms it uses, in the context of this experiment, in which only segment duration was measured, any losses incurred in the signal due to compression are unlikely to influence the outcome of the analysis. 127. All values rounded to one decimal place.

Notes

253

128. I do not consider durational variations induced by local segmental or syllabic environment in this connection, as these lack the culminativity associated with stress placement. In such cases duration is not being deployed in the language as the primary cue for a uniquely-identifiable prosodic position (such as primary stress or edge). That additional duration supplied by local segmental environment can nonetheless dramatically impact the structure of a stress system is of course clear, as the attraction of otherwise-edge-based stress to internal heavy syllables in many languages clearly demonstrates. 129. This are also, of course, stress-dependent harmony systems, such as those treated by Majors (1998). 130. It is not clear, however, whether this is due to a demand for complete identity of the vowels involved, or only a demand that they be identical in height. The only other rounded vowel in both systems is /u/, and the facts here do correspond to the generalizations about rounding harmony made by Kaun (1995) (specifically even that mid vowels are more likely triggers of rounding harmony than high vowels. Unfortunately in these systems the high vowels do not reduce, so it is unclear what height restrictions might turn out to affect potential targets). 131. The mostly northern, so-called ‘okajushchie’ or ‘o-saying’ dialects. 132. Rounding harmony, though, famously parasitic in any case, appears to do just this, as in Timugon Murut. 133. The alternative potential source for the modern initial syllable vowellengthening mentioned in note 4 above. The relationship between amplitude and stress in Modern Turkish is not straightforward, though there may be a correlation to some extent for non-final stresses (Konrot 1981). Numerous confounding factors in Konrot’s experiment make this difficult to ascertain. As noted before, duration is not correlated with stress at all. 134. Even assuming that only later did stress and coarticulation become dissociated, as they are today in Turkish. 135. Calculations done by Kemal Oflazer in fact suggest that the mean number of syllables per word in running text in Modern Turkish may not be any greater than this (Sharon Inkelas, personal communication). 136. These representations are highly schematic and not meant to imply consideration of open syllables only. 137. Any prefixes being less closely integrated into the stem phonologically than the suffixes. 138. ‘Stem’ here in the Bantuist sense of root plus suffixes. 139. Hubbard notes that the effect is less consistent here. Specifically, while it is strong for voiceless obstruents, it is less so for voiced stops, and actually reversed for sonorants /m/ and /l/. For nasals at least this is the case in Turkish, and also for Korean (Sun-Ah Jun, personal communication), and is probably explicable along with other effects increasing the “consonantality” of initial sonorants. 140. Here again it is conceivable that phonetic weakening is the result of the psycholinguistic status of the entity in question. Still, it is the phonetic patterns

254

Notes

which would be responsible for the shape of the resulting patterns of PN. The role of psycholinguistic factors is indirect. 141. In ongoing development of this approach (Flemming 2005), distinct MinDist and Maintain Contrasts constraints ranked with the usual strict dominance of OT are replaced by a system of weighted, gradient constraints. This does not affect the discussion here, however, which turns on the central role played by the phonetic durations of unstressed syllables in both versions of the analysis. 142. Other dialect speakers often find this feature of Muscovite speech grating, and mock it as a “drawled” pronunciation. 143. It might be possible in the Russian case to derive this result by assuming that the mean realization values for the duration of unstressed /a/ were calculated over all unstressed syllables, rather than just over first pretonics. Given that first pretonic and non-first pretonic syllables must have separate durational targets specified though (the former being substantially longer than the latter), it is not clear why this would be so. 144. Recall that it is still not clear whether the neutralization of /a/ and /o/ in unstressed syllables was actually a merger historically, or if the two (derived originally from long and short /a/ respectively) rather simply failed to split everywhere but the stressed syllable. Either way the conditions in question are not relevant to the synchronic implementation of UVR in Russian. 145. Long vowels resulting from vowel fusion in hiatus across morpheme boundaries are also subject to reduction. The point is the same. 146. Flemming (2005) notes that Shimakonde may in fact be aberrant in other ways as well: Reduction, while generally optional, is subject to the condition that if a given pretonic mid vowel fails to reduce, then all following pretonic mid vowels must resist reduction as well. Thus, /kú-kólómól-a/, ‘to cough’, may be realized as either kúkálámóóla, with reduction of all unstressed mid vowels, or as kúkálómóóla, with the leftmost unstressed mid vowel reducing while the [o] adjacent to the stressed syllable resists reduction. A form like *kúkólámóóla, however, in which a reduced mid vowel intervenes between the resistant unstressed mid vowel and the tonic, is impossible. This is a fascinating twist, and one which awaits a proper formal treatment, but it is hardly aberrant in UVR systems. In particular, it is strongly reminiscent of the so-called “assimilative” vowel reduction dialects of Russian, in which, for example, pretonic mid vowels resist reduction when the adjacent stressed vowel is also mid, or of the reduction systems of Yakan and Timugon Murut, in which serial resistance to reduction proceeds leftward from the stressed syllable. These systems are discussed in more detail above in Section 4.5.3.1. The typology of such “support” or “indirect licensing” (as per Steriade 1995) systems remains to be written. 147. As in fact I argue to have taken place in Timugon Murut in chapter three, where in TM pretonic syllables /a/ is not realized as schwa, while in related neighboring languages it is (and apparently was in TM at an earlier period as well).

Notes

255

148. As discussed further below, while this is one way of interpreting the phonologization approach presented here, none of the major claims of this model hinges on the acceptance of the feed-forward, modular program. 149. I assume that this is because features like vowel heights are readily understood as phonetic continua, in ways that, e.g., obstruent voicing specifications are not. Even so, descriptions of partially voiced obstruents are common (as in English, for example). Additionally, if Stevens’ quantal model of the relationship between articulation and acoustics in the production of voicing is correct, then whatever discreteness there is in the distinction between “voiced” and “not voiced” may only hold at certain levels of description to begin with. 150. And I have no reason to doubt them. Anecdotally, in fact, I personally have had my Bulgarian pronunciation “corrected” on numerous occasions by native speakers who felt that the clear [u] I was producing in unstressed syllables as a realization of /o/ was non-native, and marked me as a foreigner. My initial reaction to this was to dismiss it as perhaps just proof of phonemic awareness on their part, or even as evidence of the influence of orthography on the normative judgments of native speakers. It appears, however, that I was wrong. 151. Though of course in English the voicing contrast in this position receives extra support from an exaggerated distinction in the durations of preceding vowels (Chen 1970). 152. To the extent that an unambiguous non-bound ‘base’ can be identified in the first place for a given derived form. This idea makes simple sense for a language like English or Dutch, on which nearly all this research has concentrated to date, but is less clearly applicable to languages with more complex morphological systems like Russian, or other more conservative Indo-European languages.

References Abondolo, Daniel 1998a Introduction. In The Uralic Languages, Daniel Abondolo (ed.), 1–42. London: Routledge. Abondolo, Daniel 1998b Khanty. In The Uralic Languages, Daniel Abondolo (ed.), 358–386. London: Routledge. Ahmad, Abdul-Majeed Rashid 1986 The phonemic system of Modern Standard Kurdish. Ph.D. diss., University of Michigan. Aikhenvald, Alexandra Y. 1996 Words, phrases, pauses, and boundaries: Evidence from South American Indian languages. Studies in Language 20 (3): 487–517. Alderete, John 1995 Faithfulness to prosodic heads. Unpublished manuscript. Rutgers Optimality Archive #94-0000. Allan, Edward J. 1976 Kullo. In Non-Semitic Languages of Ethiopia, M. Lionel Bender (ed.), 324–350. East Lansing: African Studies Center, Michigan State University. Anderson, Gregory D. S. 1997 Burushaski. In Phonologies of Asia and Africa 2, Alan S. Kaye (ed.), 1021–1041. Winona Lake: Eisenbrauns. Anderson, Steven R. 1981 Why phonology isn’t ‘natural’. Linguistic Inquiry 12: 493–539. Andronov, M. 1975 Observations on accent in Tamil. In Dravidian Phonological Systems, Harold F. Schiffman and Carol M. Eastman (eds.), 3–10. Seattle: The Institute for Comparative and Foreign Area Studies, University of Washington, Seattle. Asai, Erin 1953 The Sedik language of Formosa. Kanazawa, Japan: Cercle Linguistique de Kanazawa, Kanazawa University. Archangeli, Diana, and Douglas Pulleyblank 1994 Grounded Phonology. Cambridge, MA: MIT Press. Avanesov, R. I. 1968 Russkoe Literaturnoe Proiznoshenie [Russian literary pronunciation]. Moscow: Prosveshchenie. Avrorin, V. A. 1959 Grammatika Nanajskogo Jazyka [Grammar of the Nanai language]. Moscow and Leningrad: Izdatel’stvo Akademii Nauk SSSR.

References

257

Bach, Emmon, and Robert T. Harms 1972 How do languages get crazy rules? In Linguistic Change and Generative Theory, Robert P. Stockwell and Ronald K. S. Macaulay (eds.), 1–21. Bloomington: Indiana University Press. Bahl, Kali Charan 1957 Tones in Punjabi. Indian Linguistics 17: 139–147. Baker, Kirk, and Chip Gerfen 2001 Amplitude alone can cue glottalization: crosslinguistic evidence from speech perception. Paper presented at the annual meeting of the LSA, January 6, 2001. Washington, D. C. Balasubramanian, T. 1980 Timing in Tamil. Journal of Phonetics 8 (4): 449–457. Balasubramanian, T. 1981 Duration of vowels in Tamil. Journal of Phonetics 9 (2): 151–161. Barnes, Jonathan A. 2001 Moraicity and speech timing in Turkish and elsewhere. Handout from paper presented at MIT Phonology Circle. November 2, 2001. Barnes, Jonathan A. 2002 Domain-initial strengthening and the phonetics and phonology of positional neutralization. In Proceedings of the Northeast Linguistic Society (NELS) 32, The City University of New York and New York University, Masako Hirotani (ed.). Amherst, MA: University of Massachusetts, Amherst, GLSA. Beckman, Jill N. 1998 Positional faithfulness. Ph.D. diss., University of Massachusetts, Amherst. Beckman, Mary E., and Jan Edwards 1987 The phonological domains of final lengthening. Ohio State University Working Papers in Linguistics 35: 167–176. Beckman, Mary E., and Jan Edwards 1990 Lengthenings and shortenings and the nature of prosodic constituency. In Papers in Laboratory Phonology I: Between the Grammar and Physics of Speech, John Kingston and Mary E. Beckman (eds.), 152–178. Cambridge: Cambridge University Press. Beckman, Mary, Jan Edwards, and Janet Fletcher 1991 The articulatory kinematics of final lengthening. Journal of the Acoustical Society of America 89 (1): 369–382. Beckman, Mary, Jan Edwards, and Janet Fletcher 1992 Prosodic structure and tempo in a sonority model of articulatory dynamics. In Papers in Laboratory Phonology II: Gesture, Segment, Prosody, Gerard J. Docherty and D. Robert Ladd (eds.), 68–86. Cambridge: Cambridge University Press. Beddor, Patrice, and Handan Kopkalli-Yavuz 1995 The relation between vowel-to-vowel coarticulation and vowel harmony in Turkish. Proceedings of the International Conference of the Phonetic Sciences 1995, vol. 2: 44–51.

258

References

Behrens, Dietlinde 1975 Yakan phonemics and morphophonemics. Papers in Phillipine Linguistics No. 7 (Pacific Linguistics Series A) 44: 13–28. Berkovits, Rochele 1984 Duration and fundamental frequency in sentence-final intonation. Journal of Phonetics 12: 255–265. Berkovits, Rochele 1993a Progressive utterance-final lengthening in syllables with final fricatives. Language and Speech 36: 89–98. Berkovits, Rochele 1993b Utterance-final lengthening and the duration of final-stop closures. Journal of Phonetics 21: 479–489. Berkovits, Rochele 1994 Durational effects in final lengthening, gapping, and contrastive stress. Language and Speech 37: 237–250. Bhaskararao, Peri 1998 Gadaba. In The Dravidian Languages, Sanford B. Steever (ed.), 328–355. London, Routledge. Blevins, Juliette 1993 A tonal analysis of Lithuanian nominal accent. Language 69: 237–273. Blevins, Juliette 1997 Rules in Optimality Theory: two case studies. In Derivations and Constraints in Phonology. Iggy Roca (ed.), 227–260. Oxford: Clarendon Press. Blevins, Juliette 2004 Evolutionary Phonology. Cambridge: Cambridge University Press. Blevins, Juliette, and Andrew Garrett 1998 The origins of consonant-vowel metathesis. Language 74: 508–555. Blevins, Juliette, and Andrew Garrett 2004 The evolution of metathesis. In Phonetically Based Phonology, Bruce Hayes, Robert Kirchner, and Donca Steriade (eds.), 117–157. Cambridge: Cambridge University Press. Bliese, Loren 1976 Afar. In Non-Semitic Languages of Ethiopia, M. Lionel Bender (ed.), 133–165. East Lansing: African Studies Center, Michigan State University. Blust, Robert 1988 Austronesian Root Theory: An Essay on the Limits of Morphology. Amsterdam: John Benjamins Publishing Company. Blust, Robert 1999 Notes on Pazeh phonology and morphology. Oceanic Linguistics 38 (2): 321–365. Blust, Robert 2000 Chamorro historical phonology. Oceanic Linguistics 39 (1): 83–122.

References

259

Borg, Albert, and Marie Azzopardi-Alexander 1997 Maltese. London: Routledge. Bosch, Anna R. K. 1991 Phonotactics at the level of the phonological word. Ph.D. diss., University of Chicago. Bosch, Anna R. K. 1996 Prominence at two levels: Stress versus pitch prominence in North Welsh. Journal of Celtic Linguistics 5: 121–165. Boutin, Michael E., and Inka Pekkanen (eds.) 1993 Phonological Descriptions of Sabah Languages. Kota Kinabalu: Sabah Museum. Boyce, Suzanne Elizabeth 1990 The influence of phonological structure on articulatory organization in Turkish and English. Ph.D. diss., Yale University. Bradley, Travis 2001 A Typology of Rhotic Duration Contrast and Neutralization. Proceedings of the North East Linguistics Society (NELS 31). Amherst, MA: GSLA Publications. Brame, Michael K. 1972 On the abstractness of phonology: Maltese. In Contributions to Generative Phonology, Michael K. Brame (ed.), 53–77. Austin: University of Texas Press. Brentari, Diane 1995 Sign language phonology: ASL. In The Handbook of Phonological Theory, John A. Goldsmith (ed.), 615–639. Oxford: Blackwell. Browman, Catherine P., and Louis Goldstein 1986 Toward an Articulatory Phonology. Phonology Yearbook 3: 219–252. Browman, Catherine P., and Louis Goldstein 1990 Tiers in Articulatory Phonology with some implications for casual speech. In Papers in Laboratory Phonology I: Between the Grammar and Physics of Speech, John Kingston and Mary E. Beckman (eds.), 341–376. Cambridge: Cambridge University Press. Browman, Catherine P., and Louis Goldstein 1992 "Targetless" schwa: an articulatory analysis. In Papers in Laboratory Phonology II: Gesture, Segment, Prosody, Gerard J. Docherty and D. Robert Ladd (eds.), 26–56. Cambridge: Cambridge University Press. Bubenik, Vit 1996 The Structure and Development of Middle Indo-Aryan Dialects. Delhi: Motilal Banarsidass Publishers. Buckley, Eugene 1998 Iambic lengthening and final vowels. International Journal of American Linguistics 64: 179–223.

260

References

Bybee, Joan, Paromita Chakraborti, Dagmar Jung, and Joanne Scheibman 1998 Prosody and segmental effect: Some paths of evolution for word stress. Studies in Language 22 (2): 267–314. Byrd, Dani 1996 A phase window framework for articulatory timing. Phonology 13 (2): 139–169. Byrd, Dani 2000 Articulatory vowel lengthening and coordination at phrasal junctures. Phonetica 57: 3–16. Cambier-Langeveld, Tina 1997 The domain of final lengthening in the production of Dutch. Linguistics in the Netherlands 14: 13–24. Cambier-Langeveld, Tina 1999 The interaction between final lengthening and accentual lengthening: Dutch versus English. Linguistics in the Netherlands 16: 13–25. Cardona, George 1965 A Gujarati Reference Grammar. Philadelphia: University of Pennsylvania Press. Carnochan, Jack 1951 A study of quantity in Hausa. Bulletin of the School of Oriental and African Studies 13: 1032–1044. Carnochan, Jack 1988 The vowels of Hausa. In Studies in Hausa language and linguistics in honour of F. W. Parsons, Graham Furniss and Philip J. Jaggar (eds.), 17–32. London: Kegan Paul International. Casali, Roderic 1995 Labial opacity and roundness harmony in Nawuri. Natural Language and Linguistic Theory 13: 649–663. Casali, Roderic 1997 Vowel elision in hiatus contexts: Which vowel goes? Language 73: 493–533. Charles-Luce, Jan 1993 The effects of semantic context on voicing neutralization. Phonetic 50: 28–43. Chekmonas, V. 1987 Territorija zarozhdenija i etapy razvitija vostochnoslavjanskogo akan’ja v svete dannyx lingvogeografii [The territory of emergence and stages of development of East Slavic akan'e in light of the facts of linguistic geography]. Russian Linguistics 11: 335–349. Chen, Matthew 1970 Vowel length variation as a function of the voicing of the consonant environment. Phonetica 22: 129–159. Chen, Matthew 1973 Nasals and nasalization in the history of Chinese. Ph.D. diss., University of California, Berkeley.

References

261

Cho, Taehong 2000 Effects of morpheme boundaries on intergestural timing: Evidence from Korean. UCLA Working Papers in Phonetics 99: 71–108. Cho, Taehong 2001 Effects of prosody on articulation in English. Ph.D. diss., University of California, Los Angeles. Cho, Taehong 2004 Prosodically conditioned strengthening and vowel-to-vowel coarticulation in English. Journal of Phonetics 32 (2): 141–176. Cho, Taehong, and Sun-Ah Jun 2000a Domain-initial strengthening as enhancement of laryngeal features: Aerodynamic evidence from Korean. UCLA Working Papers in Phonetics 99: 57–79. Cho, Taehong, and Sun-Ah Jun 2000b Domain-initial strengthening as enhancement of laryngeal features: Aerodynamic evidence from Korean. Chicago Linguistics Society 36: 31–44. Cho, Taehong, and Patricia Keating 1999 Articulatory and acoustic studies of domain-initial strengthening in Korean. UCLA Working Papers in Phonetics 97: 100–138. Cohn, Abigail 1990 Phonetic and phonological rules of nasalization, Ph.D. diss., University of California, Los Angles. Côté, Marie-Hélène 2000 Consonant cluster phonotactics: A perceptual approach. Ph.D. diss., Massachusetts Institute of Technology. Crosswhite, Katherine 2001 Vowel Reduction in Optimality Theory. New York & London: Routledge. Crosswhite, Katherine 2004 Vowel reduction. In Phonetically Based Phonology, Bruce Hayes, Robert Kirchner, and Donca Steriade (eds.), 191–232. Cambridge: Cambridge University Press. Csató, Éva Á., and Lars Johanson 1998 Turkish. In The Turkic Languages. Lars Johanson and Éva Á. Csató (eds.), 203–235. London: Routledge. Czekman, Walery, and Elz˙bieta SmuΩkowa 1988 Fonetyka i Fonologia Je˛zyka BiaΩoruskiego z Elementami Fonetyki i Fonologii Ogólnej [Phonetics and phonology of the Belarusian language with elements of general phonetics and phonology]. Warszawa: Panstwowe Wydawnictvo Naukowe. Dankovicˇová, J. 1997 The domain of articulation rate variation in Czech. Journal of Phonetics 25: 287–312.

262

References

Dauer, Rebecca M. 1980 The reduction of unstressed high vowels in Modern Greek. Journal of the International Phonetics Association 10: 17–27. Dave, R. 1967 A formant analysis of the clear, nasalized and murmured vowels of Gujarati. Indian Linguistics 28: 1–30. de Jong, Kenneth 1995 The supraglottal articulation of prominence in English: Linguistic stress as hyperarticulation. Journal of the Acoustical Society of America 97 (1): 491–504. de Jong, Kenneth, and Bushra Adnan Zawaydeh 1999 Stress, duration, and intonation in Arabic word-level prosody. Journal of Phonetics 27: 3–22. Delattre, Pierre 1966 A comparison of syllable length conditioning among languages. International Review of Applied Linguistics 4: 183–198. Dilley, Laura, Stefanie Shattuck-Huffnagel, and Mari Ostendorf 1996 Glottalization of word-initial vowels as a function of prosodic structure. Journal of Phonetics 24: 423–444. Dimmendaal, Gerrit J. 1983 Topics in a grammar of Turkana. In Nilo-Saharan Language Studies. M. Lionel Bender (ed.), 239–271. East Lansing, Michigan: African Studies Center, Michigan State University. Dimmendaal, Gerrit J., and Anneke Breedveld 1986 Tonal influence on vowel quality. In The Phonological Representation of Suprasegmentals: Studies on African Languages Offered to John M. Stewart on his 60th Birthday, Koen Bogers, Harry van der Hulst, and Maarten Mous (eds.), 1–33. Dordrecht: Foris. Dinnsen, Daniel A., and Jan Charles-Luce 1984 Phonological neutralization, phonetic implementation and individual differences. Journal of Phonetics 12: 49–60. Dobrovolsky, Michael 1999 The phonetics of Chuvash stress: implications for phonology. Proceedings of the 14th International Conference of Phonetic Sciences: 539–42. Dyck, Carrie Joan 1995 Constraining the phonology-phonetics interface: With exemplification from Spanish and Italian dialects. Ph.D. diss., University of Toronto. Edwards, Jan, Mary E. Beckman, and Janet Fletcher 1991 The articulatory kinematics of final lengthening. Journal of the Acoustical Society of America 89 (1): 369–382. Elfenbein, Josef 1997 Pashto phonology. Phonologies of Asia and Africa 2, Alan S. Kaye (ed.), 733–760. Winona Lake: Eisenbrauns.

References

263

Emeneau, Murray B. 1961 Kolami: a Dravidian Language (Publications in Linguistics 2). Annamalainagar: Annamalai University. Emeneau, Murray B. 1984 Toda Grammar and Texts (Memoirs of the American Philosophical Society, Vol. 155). Philadelphia: American Philosophical Society. Ernestus, Miriam, and Harald Baayen in press The functionality of incomplete neutralization in Dutch: the case of past tense formation. Papers in Laboratory Phonology VIII. Farnetani, Edda, and Shiro Kori 1990 Rhythmic duration in Italian noun phrases: A study on vowel durations. Phonetica 47: 50–65. Farnetani, Edda, and Mario Vayra 1991 Word- and phrase-level aspects of vowel reduction in Italian. Proceedings of the XIIth International Congress of Phonetic Sciences, August 19–24, 1991, Aix-en-Provence, France. Volume 2: 14–17. Aix-en-Provence: Université de ProvenceService des Publications. Fischer-Jørgensen, Eli 1967 Phonetic analysis of breathy vowels in Gujarati. Indian Linguistics 28: 71–139. Flemming, Edward S. 1993 The role of metrical structure in segmental rules. MA thesis, University of California, Los Angeles. Flemming, Edward S. 1994 The role of metrical structure in segmental rules. In NELS 24: Proceedings of the North East Linguistic Society, Vol 1., Mercè Gonzàlez (ed.), 97–110. Amherst: University of Massachusetts. Flemming, Edward S. 1995 Auditory representations in phonology. Ph.D. diss., University of California, Los Angeles. Flemming, Edward S. 1997 Phonetic detail in phonology: Towards a unified account of assimilation and coarticulation. In Proceedings of the 1995 Southwestern Workshop in Optimality Theory (SWOT), K. Suzuki and D. Elzinga (eds.), University of Arizona. Flemming, Edward S. 2001a Scalar and categorical phenomena in a unified model of phonetics and phonology. Phonology 18 (1): 7–44. Flemming, Edward S. 2001b Vowel reduction and duration-dependent undershoot. Paper presented at the Conference on the Phonetics-Phonology Interface, ZAS, Berlin. October 12, 2001. Flemming, Edward S. 2004 Contrast and perceptual distinctiveness. Phonetically Based Phonology, Bruce Hayes, Robert Kirchner, and Donca Steriade (eds.), 232–276. Cambridge: Cambridge University Press.

264

References

Flemming, Edward S. 2005 A phonetically-based model of phonological vowel reduction. Ms., MIT. Fletcher, Janet 1991 Rhythm and final lengthening in French. Journal of Phonetics 19: 193–212. Flier, Michael 1978 On Obojansk dissimilative Jakan’e: the canonical case of absolute neutralization. Communication and Cognition 11 (3/4): 323–382. Fougeron, Cécile 1999 Prosodically conditioned articulatory variations: A review. UCLA Working Papers in Phonetics 97: 1–73. Fougeron, Cécile, and Patricia A. Keating 1996 Articulatory strengthening in prosodic domain-initial position. UCLA Working Papers in Phonetics 92: 61–87. Fourakis, Marios 1984 Should neutralization be redefined? Journal of Phonetics 12: 291–296. Fourakis, Marios, and Gregory K. Iverson 1984 On the ‘incomplete neutralization’ of German final obstruents. Phonetica 41: 140–149. Fox, R. A., and D. Terbeek 1977 Dental flaps, vowel duration and rule ordering in American English. Journal of Phonetics 5: 27–34. Fry, D. B. 1955 Duration and intensity as physical correlates of linguistic stress. Journal of the Acoustical Society of America 27 (4): 765–768. Gafos, Adamantios in press Dynamics: the non-derivational alternative to modeling phoneticsphonology. Papers in Laboratory Phonology VIII. Garbell, Irene 1965 The Jewish Neo-Aramaic Dialect of Persian Azerbaijan: Linguistic Analysis and Folkloristic Texts. Berlin/New York: Mouton de Gruyter. Garrett, Andrew, and Juliette Blevins To appear Analogical morphophonology. In The Nature of the Word: Essays in Honor of Paul Kiparsky. Cambridge, MA: MIT Press. Gauthiot, Robert 1913 La Fin de Mot en Indo-Européen. Paris: Librairie Paul Geuthner. Giannelli, Luciano 1997 Tuscany. In The Dialects of Italy, Martin Maiden and Mair Perry (eds.), 297–302. London: Routledge. Goldsmith, John 1982 Accent systems. The Structure of Phonological Representations (Part I). Harry van der Hulst and Norval Smith (eds.), 47–63. Dordrecht: Foris.

References

265

Gordon, Kent H. 1976 Phonology of Dhangar-Kurux. (Summer Institute of Linguistics, Institute of Nepal and Asian Studies). Kathmandu: Tribhuvan University Press. Gordon, Matthew K. 1998 The phonetics and phonology of non-modal vowels: a crosslinguistic perspective. Berkeley Linguistic Society 24: 93–105. Gordon, Matthew K. 1999 Syllable weight: Phonetics, phonology, and typology. Ph.D. diss., University of California, Los Angeles. Gordon, Matthew K. 2000 Re-examining default-to-opposite stress. Proceedings of the Berkeley Linguistics Society 26. Gordon, Matthew K., and Peter Ladefoged 2001 Phonation types: A cross-linguistic overview. Journal of Phonetics 29: 383–406. Guenther, Frank 1995 Speech sound acquisition, coarticulation, and rate effects in a neural network model of speech production. Psychological Review 102: 594–621. Hahn, Reinhard F. 1991 Spoken Uyghur. Seattle: University of Washington Press. Hale, Mark 2003 Neogrammarian sound change. In The Handbook of Historical Linguistics, Brian D. Joseph and Richard D. Janda (eds.), 343–368. Oxford: Blackwell. Hale, Mark, and Charles Reiss 2000 ‘Substance Abuse’ and ‘Dysfunctionalism’: Current trends in phonology. Linguistic Inquiry 31: 157–169. Hall, Robert Anderson 1976 Proto-Romance Phonology. New York: Elsevier. Hamel, Patricia J. 1994 A Grammar and Lexicon of Loniu, Papua New Guinea (Pacific Linguistics C-103). Canberra: The Australian National University. Hammond, Michael 1997 Vowel quantity and syllabification in English. Language 73 (1): 1–17. Hanson, Helen M., Kenneth N. Stevens, Hong-Kwang Jeff Kuo, Marilyn Y. Chen, and Janet Slifka 2001 Towards models of phonation. Journal of Phonetics 29: 451–480. Harris, John 1998 Licensing inheritance in an integrated theory of neutralization. Phonology 14: 315–370.

266

References

Hay, Jennifer 2000 Causes and consequences of word structure. Ph.D. diss., Northwestern University. Hayes, Bruce 1995 Metrical Stress Theory: Principles and Case Studies. Chicago: University of Chicago Press. Hayes, Bruce 1999 Phonetically-driven phonology: The role of Optimality Theory and inductive grounding. In Functionalism and Formalism in Linguistics, Volume I: General Papers, Michael Darnell, Edith Moravscik, Michael Noonan, Frederick Newmeyer, and Kathleen Wheatly (eds.), 243–285. Amsterdam: John Benjamins. Hayward, Richard J. 1982 Notes on the Koyra language. Africa und Übersee 65: 211–268. Hayward, Richard J. 1998 Vowel reduction, tone and nominal declension in D'irayta. In Productivity and Creativity. Studies in General and Descriptive Linguistics in Honor of E. M. Uhlenbeck, Mark Janse (ed.), 503–519. Trends in Linguistics. Studies and Monographs 116. Berlin/New York: Mouton de Gruyter. Henderson, Janette B. 1984 Velopharyngeal function in oral and nasal vowels: A cross-language study. Ph.D. diss., University of Connecticut, Storrs. Henton, Caroline, and Anthony Bladon 1988 Creak as a sociophonetic marker. In Language, Speech and Mind, Larry M. Hyman and Charles N. Li (eds.), 3–29. London: Routledge. Hock, Hans Henrich 1999 Finality, prosody, and change. In Proceedings of LP ’98: Item and order in language and speech, Osamu Fujimura, Brian D. Joseph, and Bohumil Palek (eds.), 15–30. Prague: Charles University Press (Karolinum). Hollenbach, Barbara 1977 Copala Trique phonology. In Otomanguean Phonology, William R. Merriefield (ed.), Dallas: SIL. Holmer, Arthur J. 1996 A Parametric Grammar of Seediq. Travaux de l’Institute de Linguistique de Lund 30. Lund: Lund University Press. Honti, László 1988 Die Ob-Ugrischen Sprachen. In The Uralic Languages: Description, History and Foreign Influences, Denis Sinor (ed.), 147–196. Leiden: E. J. Brill. Honti, László 1998 ObUgrian. The Uralic Languages, Daniel Abondolo (ed.), 327–357. London: Routledge.

References

267

Hsu, Chai-Shune, and Sun-Ah Jun 1998 Prosodic strengthening in Taiwanese: Syntagmatic or paradigmatic? UCLA Working Papers in Phonetic 96: 69–89. Hualde, José 1989 Autosegmental and metrical spreading in the vowel-harmony systems of northwestern Spain. Linguistics 27: 773–801. Hubbard, Kathleen 1994 Duration in moraic theory. Ph.D. diss., University of California, Berkeley. Hudson, Grover 1976 Highland East Cushitic. In Non-Semitic Languages of Ethiopia, M. Lionel Bender (ed.), 232–277. East Lansing: African Studies Center, Michigan State University. Hung, Henrietta J. 1994 The rhythmic and prosodic organization of edge constituents. Ph.D. diss., Brandeis University. Hyde, Villiana 1971 An Introduction to the Luiseño Language. Morongo Indian Reservation, Banning, CA: Malki Museum Press. Hyman, Larry M. 1975 Nasal states and nasal processes. In Nasálfest: Papers from a Symposium on Nasals and Nasalization, Charles A. Ferguson, Larry M. Hyman, and John J. Ohala (eds.), 249–264. Stanford, CA: Language Universals Project, Department of Linguistics, Stanford University. Hyman, Larry M. 1976 Phonologization. In Linguistic Studies Offered to Joseph Greenberg on the Occasion of his Sixtieth Birthday, Alphonse Juilland (ed.), 107–118. Saratoga: Anma Libri. Hyman, Larry M. 1977 On the nature of linguistic stress. In Studies in Stress and Accent, Larry M. Hyman (ed.), 37–82. Los Angeles: Southern California Occasional Papers in Linguistics. Hyman, Larry M. 1982 The representation of nasality in Gokana. In The Structure of Phonological Representations (Part I), Harry van der Hulst and Norval Smith (eds.), 111–130. Dordrecht: Foris. Hyman, Larry M. 1987 Prosodic domains in Kukuya. Natural Language and Linguistic Theory 5: 311–333. Hyman, Larry M. 1988 The phonology of final glottal stops. In Proceedings of the Western Conference on Linguistics, WECOL ’88, Vol. 1. 113–130. Fresno: California State University.

268

References

Hyman, Larry M. 1998 Positional prominence and the ‘prosodic trough’ in Yaka. Phonology 15 (1): 41–75. Hyman, Larry M. 1999 The historical interpretation of vowel harmony in Bantu. In Recent advances in Bantu historical linguistics, Jean-Marie Hombert and Larry M. Hyman (eds.), 235–295. Stanford: C.S.L.I. Hyman, Larry M., and Francis X. Katamba 1990 Final vowel shortening in Luganda. Studies in African Linguistics 21: 1–59. Hyman, Larry M., and Francis X. Katamba 1993 A new approach to tone in Luganda. Language 69: 34–67. Inkelas, Sharon, Jonathan Barnes, Jeffrey Good, Darya Kavitskaya, Orhan Orgun, Ronald Sprouse, and Alan Yu 2001 Stress and vowel-to-vowel coarticulation in Turkish. Paper presented at the annual meeting of the LSA, Washington D.C., January 2001. Jannedy, Stefanie 1995 Gestural phasing as an explanation for vowel devoicing in Turkish. OSU Working Papers in Linguistics 45: 56–84. Janiak, BronisΩawa 1995 Polsko-Ukrainskie Zwia˛zki Je˛zykowe na PrzykΩadzie Gwary Niemirowa nad Bugiem: Fonetika—Fonologia—SΩownictwo [PolishUkrainian linguistic ties as exemplified by the dialect of Niemirowa nad Bugiem: Phonetics—phonology—lexicon]. ∫odz: Wydawnictwo Universyteta ∫odzkiego. Jassem, Wiktor, and Lutoslawa Richter 1989 Neutralization of voicing in Polish obstruents. Journal of Phonetics 17: 317–325. Jha, Sunil Kumar 2001 Maithili: Some Aspects of its Phonetics and Phonology. Delhi: Motilal Banarsidass Publishers Private Limited. Johanson, Lars 1998 The history of Turkic. In The Turkic Languages, Lars Johanson and Éva Á. Csató (eds.), 81–125. London: Routledge. Johnson, Keith, and Jack Martin 2001 Acoustic vowel reduction in Creek: Effects of distinctive length and position in the word. Phonetica 58: 81–102. Kaidarov, A. T. 1969 Razvitie Sovremennogo Ujjgurskogo Literaturnogo Jazyka 1: Ujgurskie Dialekty i Dialektnaja Osnova Literaturnogo Jazyka. [The development of the modern Uyghur literary language 1: Uyghur dialects and the dialect base of the literary language]. Alma-Ata: Nauka.

References

269

Kamprath, Christine K. 1987 Suprasegmental structure in a Raeto-Romansh dialect: A case study in Metrical and Lexical Phonology. Ph.D. Diss., University of Texas, Austin. Kangasmaa-Minn, Eva 1998 Mari. In The Uralic Languages, Daniel Abondolo (ed.), 219–248. London: Routledge. Kaun, Abigail R. 1995 The typology of rounding harmony: An optimality theoretic approach. Ph.D. diss., University of California, Los Angeles. Kavitskaya, Darya 2001 Compensatory lengthening: Phonetics, phonology, diachrony. Ph.D. diss., University of California, Berkeley. Keating, Patricia A. 1988a The window model of coarticulation: Articulatory evidence. UCLA Working Papers in Phonetics 69: 3–29. Keating, Patricia A. 1988b Underspecification in phonetics. Phonology 5 (2): 275–292. Keating, Patricia A. 1988c The phonology-phonetics interface. In Linguistics: The Cambridge Survey, Volume I: Grammatical Theory, F. Newmeyer (ed.), 281–302. Cambridge: Cambridge University Press. Keating, Patricia A. 1990a The window model of coarticulation: Articulatory evidence. In Papers in Laboratory Phonology I: Between the Grammar and Physics of Speech, John Kingston and Mary E. Beckman (eds.), 451–470. Cambridge: Cambridge University Press. Keating, Patricia A. 1990b Phonetic representations in a generative grammar. Journal of Phonetics 18: 321–334. Keating, Patricia A. 1996 The phonology-phonetics interface. In Interfaces in Phonology, Ursula Kleinhenz (ed.), 262–278. Studia grammatica 41, Berlin: Akademie Verlag. Keating, Patricia A., Taehong Cho, Cecile Fougeron, and Chai-Shune Hsu 1999 Domain-initial strengthening in four languages. University of California Working Papers in Phonetics, 97: 139–151. Keating, Patricia A., Richard Wright, and Jie Zhang 1999 Word-level asymmetries in consonant articulation. UCLA Working Papers in Phonetics 97: 157–173. Keen, Sandra 1983 Yukulta. In Handbook of Australian Languages, Volume 3, R.M.W. Dixon and Barry J. Blake (eds.), 190–304. Amsterdam: John Benjamins B.V.

270

References

Kehoe, Margaret, and Carol Stoel-Gammon 1997 The acquisition of prosodic structure: An investigation of current accounts of children’s prosodic development. Language 73 (1): 113–144. Kelm, Orlando R. 1989 Temporal aspects of rhythm which distinguish Mexican Spanish and Brazilian Portuguese. Ph.D. diss., University of California, Berkeley. Kenstowicz, Michael 1972 Lithuanian phonology. Studies in the Linguistic Sciences 2: 1–85. Kenstowicz, Michael 1994 Sonority-driven stress. Ms., MIT. Rutgers Optimality Archive #331094. Kerr, Harland 1988 Cotabato Manobo grammar. Studies in Philippine Linguistics 7 (1): 1–124. Kimball, Geoffrey D. 1991 Koasati Grammar. Lincoln: University of Nebraska Press. Kingston, John 1992 The phonetics and phonology of perceptually motivated articulatory covariation. Language and Speech 35 (1–2): 99–113. Kiparsky, Valentin 1979 Russian Historical Grammar. Ann Arbor: Ardis. Kirchner, Robert 1998 An effort-based appoach to consonant lenition. Ph.D. diss., University of California, Los Angeles. Klatt, Dennis H. 1975 Vowel lengthening is syntactically determined in a connected discourse. Journal of Phonetics 3: 129–140. Klatt, Dennis H. 1976 Linguistic uses of segmental duration in English: Acoustic and perceptual evidence. Journal of the Acoustical Society of America 59 (5): 1208–1221. Kohler, Klaus J. 1983 Prosodic boundary signals in German. Phonetica 40: 89–134. Konrot, Ahmet K. 1981 Physical correlates of linguistic stress in Turkish. University of Essex Language Centre Occasional Papers 24: 26–53. Kopkallı, Handan 1993 A phonetic and phonological analysis of final devoicing in Turkish. Ph.D. diss., University of Michigan. Kopkallı-Yavuz, Handan 2000 Paper presented at the International Congress of Turkish Linguistics, Istanbul, August 2000.

References

271

Krakow, Rena A. 1993 Nonsegmental influences on velum movement patterns: Syllables, sentences, stress, and speaking rate. In Nasals, Nasalization, and the Velum: Phonetics and Phonology V, Marie K. Huffman and Rena A. Krakow (eds.), 87–116. San Diego, CA: Academic Press. Krakow, R. A., F. Bell-Berti, and Q. Wang 1991 Supralaryngeal patterns of declination: Labial and velar kinematics. Journal of the Acoustical Society of America 90 (4): 2343. Krishnamurti, Bh., and Brett A. Benham 1998 Konda. The Dravidian Languages, Sanford B. Steever (ed.), 241–269. London, Routledge. Kroeger, Paul 1992 Vowel harmony systems in three Sabahan languages. In Shifting Patterns of Language Use in Borneo: Papers from the Second Biennial Internationational Conference, Kota Kinabalu, Sabah, Malaysia, July, 1992, Peter W. Martin (ed.), 279–296. Kota Kinabalu: Borneo Research Council. Kroeger, Paul 1993 Kimaragang phonemics. In Phonological Descriptions of Sabah Languages, Michael E. Boutin and Inka Pekkanen (eds.), 31–45. Kota Kinabalu: Sabah Museum. Kwenzi Mickala, Jérome 1980 Esquisse phonologique du punu. In Éléments de Description du Punu, Francois Nsukka (ed.), 7–18. Lyon: Université Lyon. Labov, William 1994 Principles of Linguistic Change: Internal factors. Oxford: Blackwell. Ladefoged, Peter, and Ian Maddieson 1996 The Sounds of the World’s Languages. Oxford: Blackwell. Lahiri, Aditi, and Jorge Hankamer 1988 The timing of geminate consonants. Journal of Phonetics 16: 327–338. Lavoie, Lisa 2001 Consonant Strength: Phonological Patterns and Phonetic Manifestations. New York: Garland. Lea, Wayne A. 1977 Acoustic correlates of stress and juncture. In Studies in Stress and Accent, Larry M. Hyman (ed.), 83–119. Los Angeles: Southern California Occasional Papers in Linguistics. Lehiste, Ilse 1970 Suprasegmentals. Cambridge, MA: MIT Press. Lehiste, Ilse 1972 The timing of utterances and linguistic boundaries. Journal of the Acoustical Society of America 51 (6): 2018–2024.

272

References

Lehiste, Ilse 1984 The many linguistic uses of duration. In New Directions in Linguistics and Semiotics, James E. Copeland (ed.), 96–122. Amsterdam: The John Benjamins Publishing Company. Lencek, Rado L. 1982 The Structure and History of the Slovene Language. Columbus: Slavica. Li, Bing 1996 Tungusic Vowel Harmony: Description and Analysis. HIL Dissertation 18. Den Haag: Holland Academic Graphics. Li, Paul Jen-Kuei 1977 Morphophonemic alternations in Formosan languages. Bulletin of the Institute of History and Philology, Academia Sinica 48 (3): 375–411. Li, Paul Jen-Kuei 1980 The phonological rules of Atayal dialects. Bulletin of the Institute of History and Philology, Academia Sinica 51 (2): 349–403. Liljencrants, Johan, and Björn Lindblom 1972 Numerical simulation of vowel quality systems: the role of perceptual contrast. Language 48 (4): 839–862. Lindblom, Björn 1963 Spectrographic study of vowel reduction. Journal of the Acoustical Society of America 35 (11): 1773–1781. Lindblom, Björn 1986 Phonetic universals in vowel systems. In Experimental Phonology, John J. Ohala and Jeri J. Jaeger (eds.), 13–44. Orlando: Academic Press, Inc. Liphola, Marcelino M. 2001 Aspects of the phonology and morphology of Shimakonde. Ph.D. diss., Ohio State University. Lombardi, Linda 1991 Laryngeal features and laryngeal neutralization. Ph.D. Diss., University of Massachusetts, Amherst. Loporcaro, Michele 1997 Puglia and Salento. In The Dialects of Italy, Martin Maiden and Mair Perry (eds.), 338–348. London: Routledge. Lubotsky, Alexander 1993 Nasalization of the final ‘a’ in the R5gveda. Indo-Iranian Journal 36: 197–210. Macdonald, R. Ross, and Soenjono Darjowidjojo 1967 A Student’s Reference Grammar of Modern Formal Indonesian. Washington, D.C.: Georgetown University Press. Maddieson, Ian 1984 Patterns of Sounds. Cambridge: Cambridge University Press.

References

273

Major, Roy C. 1981 Stress-timing in Brazilian Portuguese. Journal of Phonetics 9 : 343–351. Major, Roy C. 1985 Stress and rhythm in Brazilian Portuguese. Language 61: 259–282. Majors, Tivoli Jane 1998 Stress dependent harmony: Phonetic origins and phonological analysis. Ph.D. diss., University of Texas at Austin. Manaster Ramer, Alexis 1996 A letter from an incompletely neutral phonologist. Journal of Phonetics 24: 491–511. Mansen, Richard 1967 Guajiro phonemes. In Phonemic systems of Colombian Languages, Viola G. Waterhouse (ed.), 49–59. Summer Institute of Linguistics Publications in Linguistics and Related Fields 19. Norman: SIL. Mathiessen, Terje 1997 A Short Grammar of Latvian. Columbus, OH: Slavica. Matisoff, James A. 1975 Rhinoglottophilia: The mysterious connection between nasality and glottality. In Nasálfest: Papers from a Symposium on Nasals and Nasalization, Charles A. Ferguson, Larry M. Hyman, and John J. Ohala (eds.), 265–287. Stanford, CA: Language Universals Project, Department of Linguistics, Stanford University. Matusevic¬, M. I. 1976 Sovremennyj Russkij Jazyk: Fonetika [The modern Russian language: Phonetics]. Moscow: Prosveshchenie. Maye, Jessica 2000 Learning speech sound categories from statistical information. Ph.D. diss., University of Arizona. McCarthy, John J. 1984 Theoretical consequences of Montañes vowel harmony. Linguistic Inquiry 15: 291–317. McCarus, Ernest N. 1997 Kurdish phonology. In Phonologies of Asia and Africa 2, Alan S. Kaye (ed.), 691–706. Winona Lake: Eisenbrauns. McCormick, Susan M. 1982 Vowel harmony and umlaut: implications for a typology of accent. Ph.D. diss., Cornell University. Michelson, Karin 1999 Utterance-final phenomena in Oneida. In Proceedings of LP ’98: Item and Order in Language and Speech, Osamu Fujimura, Brian D. Joseph, and Bohumil Palek (eds.), 31–45. Prague: Charles University Press (Karolinum). Miller, Amy 2001 A Grammar of Jamul Tiipay. Berlin/New York: Mouton de Gruyter.

274

References

Miller, Carolyn 1993 Kadazan/Dusun phonology revisited. In Phonological Descriptions of Sabah Languages, Michael E. Boutin and Inka Pekkanen (eds.), 1–14. Kota Kinabalu: Sabah Museum. Mistry, P. J. 1997 Gujarati. In Phonologies of Asia and Africa 2, Alan S. Kaye (ed.), 653–673. Winona Lake: Eisenbrauns. Mollaev, Allaguly 1980 Akusticheskaia Xarakteristika Udarnyx i Bezudarnyx Glasnyx v Dvuslozhnyx Slovax Turkmenskogo Jazyka [Acoustic characteristics of stressed and unstressed vowels in disyllabic words of the Turkmen language]. Ashxabad:Ylym. Montler, Timothy 1986 An Outline of the Morphology and Phonology of Saanich, North Straits Salish. University of Montana Occasional Papers in Linguistics: 4. Myers, Scott 2000 Boundary disputes: The distinction between phonetic and phonological sound patterns. In Phonological Knowledge: Conceptual and Empirical Issues, Noel Burton-Roberts, Philip Carr, and Gerard Docherty (eds.), 245–272. Oxford: Oxford University Press. Nespor, Marina, and Irene Vogel 1986 Prosodic Phonology. Dordrecht: Foris. Newman, Paul 2000 The Hausa Language: An Encyclopedic Reference Grammar. New Haven: Yale University Press. Newman, Roxana Ma, and Vincent J. van Heuven 1981 An acoustic and phonological study of pre-pausal vowel length in Hausa. Journal of African Languages and Linguistics 3: 1–18. Nicklas, Thurston Dale 1975 Choctaw morphophonemics. In Studies in Southeastern Indian Languages, James M. Crawford (ed.), 237–250. Athens: University of Georgia Press. Nobre, Maria Alvira, and Frances Ingemann 1987 Oral vowel reduction in Brazilian Portuguese. In In Honor of Ilse Lehiste, Robert Channon and Linda Shockey (eds.), 195–206. Dordrecht: Foris. Nord, Lennart 1975 Vowel reduction—Centralization or contextual assimilation? In Proceedings of the Speech Communication Seminar, Stockholm. Aug. 1–3, 1974. Gunnar Fant (ed.), 149–154, Stockholm: Almqvist & Wiksell.

References

275

Nord, Lennart 1987 Acoustic studies of vowel reduction in Swedish. In Proceedings of the Eleventh International Congress of Phonetic Sciences, August 1–7, 1987, Tallinn, Estonia, USSR. Volume 4, 157–160. Tallinn: Academy of Sciences of the Estonian S.S.R. Ohala, John J. 1975 Phonetic explanations for nasal sound patterns. In Nasálfest: Papers from a Symposium on Nasals and Nasalization, Charles A. Ferguson, Larry M. Hyman, and John J. Ohala (eds.), 289–316. Stanford, CA: Language Universals Project, Department of Linguistics, Stanford University. Ohala, John J. 1981 The listener as a source of sound change. In Papers from the Parasession on Language and Behavior, Chicago Linguistic Society, Carrie S. Masek, Roberta A. Hendrick, and Mary Frances Miller (eds.), 178–203. Chicago: CLS. Ohala, John J. 1983 The origin of sound patterns in vocal tract constraints. In The Production of Speech, P. F. MacNeilage (ed.), 189–216. New York: Springer. Ohala, John J. 1993a The phonetics of sound change. In Historical Linguistics: Problems and Perspectives, Charles Jones (ed.), 237–277. London: Longman. Ohala, John J. 1993b Coarticulation and phonology. Language and Speech 36: 155–170. Ohala, John J. 1994 Towards a universal, phonetically-based, theory of vowel harmony. In Proceedings of ICSLP 94, Yokohama. 491–94. Ohala, John J., and Manjari Ohala 1993 The phonetics of nasal phonology: Theorems and data. In Nasals, Nasalization, and the Velum (Phonetics and Phonology, Vol. 5), M.K. Huffman and Rena A. Krakow (eds.), 225–249. San Diego: Academic Press. Ohala, Manjari 1991 Phonological areal features of some Indo-Aryan languages. Language Sciences 13 (2): 107–124. Oller, D. K. 1973 The duration of speech segments: The effect of position in utterance and word length. Journal of the Acoustical Society of America 54: 1235–1273. Omar, Asmah Haji 1981 The Iban Language of Sarawak: A Grammatical Description. Kuala Lumpur: Dewan Bahasa dan Pustaka. Padgett, Jaye, and Marija Tabain in press Adaptive dispersion and phonological vowel reduction in Russian. Phonetica.

276

References

Pandit, P. B. 1955 e and o in Gujarati. Indian Linguistics 15: 15–44. Pandit, P. B. 1958 Duration, syllable and juncture in Gujarati. Indian Linguistics: Turner Jubilee Volume I, 212–218. Pandit, P. B. 1961 Historical phonology of Gujarati vowels. Language 37 (1): 54–66. Paulian, Christiane 1974 Le Kukuya: Langue Teke du Congo, Paris: S.E.L.A.F. Pekkanen, Inka 1993 Tatana’ phonemics. In Phonological Descriptions of Sabah Languages, Michael E. Boutin and Inka Pekkanen (eds.), 15–30. Kota Kinabalu: Sabah Museum. Peng, Shu-Hui 2000 Lexical versus “phonological” representations of Mandarin sandhi tones. In Papers in Laboratory Phonology V: Acquisition and the lexicon. Michael Broe and Janet Pierrehumbert (eds.), 152–167. Cambridge: Cambridge University Press. Penny, Ralph J. 1969 Vowel harmony in the speech of the Montes de Pas (Santander). Orbis 18: 148–166. Penny, Ralph J. 1986 Sandhi phenomena in Castilian and related dialects. In Sandhi Phenomena in the Languages of Europe, Henning Anderson (ed.), 489–503. Berlin/New York: Mouton de Gruyter. Pettersson, Thore, and Sidney Wood 1987 Vowel reduction in Bulgarian and its implications for theories of vowel production: A review of the problem. Folia Linguistica 21: 261–279. Pfeiffer, Martin 1972 Elements of Kurux Historical Phonology. Leiden: E. J. Brill. Pierrehumbert, Janet B. 1990 Phonological and phonetic representation. Journal of Phonetics 18: 375–394. Pierrehumbert, Janet B. 1995 Prosodic effects on glottal allophones. In Vocal Fold Physiology: Voice Quality Control, Osamu Fujimura and M. Hirano (eds.), 39–60. San Diego: Singular Publishing Group. Pierrehumbert, Janet B. 2001 Exemplar dynamics: Word frequency, lenition and contrast. In Frequency and the Emergence of Linguistic Structure, Joan Bybee and Paul Hopper (eds.). Amsterdam: John Benjamins. Pierrehumbert, Janet B. 2002 Word-specific phonetics. In Laboratory Phonology 7, Carlos Gussenhoven and Natasha Warner (eds.), 101–140. Berlin/New York: Mouton de Gruyter.

References

277

Pierrehumbert, Janet B., and David Talkin 1991 Lenition of /h/ and glottal stop. In Papers in Laboratory Phonology II, G. Doherty and D. R. Ladd (eds.), 90–117. Cambridge: Cambridge University Press. Poppe, Nikolaus 1960 Vergleichende Grammatik der Altaischen Sprachen, Teil 1, Vergleichende Lautlehre. Wiesbaden: Otto Harrassowitz. Port, Robert, and Penny Crawford 1989 Incomplete neutralization and pragmatics in German. Journal of Phonetics 17: 257–282. Port, Robert F., and Michael L. O’Dell 1985 Neutralization of syllable-final voicing in German. Journal of Phonetics 13: 455–471. Prentice, D. J. 1971 The Murut Languages of Sabah. Pacific Linguistics, Series C-18. Canberra: Pacific Linguistics. Prieto, Pilar 1992 Vowel reduction in Western and Eastern Catalan and the representation of vowels. Romance Languages Annual 1991, 567–572. Prince, Alan 1990 Quantitative consequences of rhythmic organization. Chicago Linguistic Society 26 (2): 355–98. Prince, Alan, and Paul Smolensky 1993 Optimality Theory: Constraint Interaction in Generative Grammar. Ms., Rutgers University. Puech, Gilbert 1978 A cross-dialectal study of vowel harmony in Maltese. In Papers from the fourteenth regional meeting, Chicago Linguistic Society, April 12–14, 1978, Donka Farkas, Wesley M. Jacobsen, and Karol W. Todrys (eds.), 377–389. Chicago: CLS. Pugh, Stefan M., and Ian Press 1999 Ukrainian: A Comprehensive Grammar. London: Routledge. Pulleyblank, Douglas 1988 Underspecification, the feature hierarchy and Tiv vowels. Phonology 5: 299–326. Raz, Shlomo 1983 Tigre grammar and texts. Malibu: Undena Publications. Recasens, Daniel 1986 Estudis de Fonètica Experimental de Català Oriental Central. Barcelona: Publicacions de l'Abadia de Montserrat, Barcelona. Rhodes, Richard A. 1996 English reduced vowels and the nature of natural processes. In Natural Phonology: The State of the Art. Bernhard Hurch and Richard A. Rhodes (eds.), 239–259. Trends in Linguistics. Studies and Monographs 92. Berlin/New York: Mouton de Gruyter.

278

References

Rigsby, Bruce 1976 Kuku-Thaypan descriptive and historical phonology. In Languages of Cape York, Peter Sutton (ed.), 68–77. Canberra: Australian Institute of Aboriginal Studies. Róna-Tas, András 1998 The reconstruction of Proto-Turkic and the genetic question. In The Turkic Languages. Lars Johanson and Éva Á. Csató (eds.), 67–80. London: Routledge. Rose, Sharon 1996 Variable laryngeals and vowel lowering. Phonology 13: 73–117. Sammallahti, Pekka 1988 Historical phonology of the Uralic languages, with special reference to Samoyed, Ugric, and Permic. In The Uralic Languages. Description, History and Foreign Influences, Denis Sinor (ed.), 478–554. Leiden: E. J. Brill. Sanders, Benjamin P. 1994 Andalusian vocalism and related processes. Ph.D. diss., University of Illinois at Urbana-Champaign. Sapir, Edward 1930 Southern Paiute, a Shoshonean Language. Proceedings of the American Academy of Arts and Sciences 65. 1. June 1930. Sapir, Edward, and Harry Hoijer 1967 The Phonology and Morphology of the Navaho Language. University of California Publications in Linguistics, Volume 50. Berkeley: University of California Press. Sasse, Hans-Jürgen 1976 Dasenech. In Non-Semitic Languages of Ethiopia, M. Lionel Bender (ed.), 196–221. East Lansing: African Studies Center, Michigan State University. Savà, Graziano, and Mauro Tosco 2000 A sketch of Ongota, a dying language of southwest Ethiopia. Studies in African Linguistics 29: 59–136. Savitska, I., and T. Bojadzhiev 1988 Bâlgarsko-polska sâpostavitelna gramatika, Tom 1. Fonetika i fonologija [Bulgarian-Polish contrastive grammar, Volume 1. Phonetics and phonology]. Sofia: Izdatelstvo na Bâlgarskata Akademija na Naukite. Schauer, Stanley, and Junia Schauer 1967 Yucuna phonemics. In Phonemic systems of Colombian Languages, Viola G. Waterhouse (ed.), 61–71. Summer Institute of Linguistics Publications in Linguistics and Related Fields 19. Norman: SIL. Schiffman, Harold F. 1983 A Reference Grammar of Spoken Kannada. Seattle: University of Washington Press.

References

279

Schiffman, Harold F. 1999 A Reference Grammar of Spoken Tamil. Cambridge: Cambridge University Press. Schuh, Russell G., and Lawan D. Yalwa 1999 Hausa. In Handbook of the International Phonetic Association, 90–95. Cambridge: Cambridge University Press. Scobbie, James M., Fiona Gibbon, William J. Hardcastle, and Paul Fletcher 2000 Covert contrast as a stage in the acquisition of phonetics and phonology. In Papers in Laboratory Phonology V: Acquisition and the lexicon. Michael Broe and Janet Pierrehumbert (eds.), 194–207. Cambridge: Cambridge University Press. Sebeok, Thomas A., and Frances J. Ingemann 1966 An Eastern Cheremis Manual. Uralic and Altaic Series, Volume 5. Bloomington: Indiana University Publications. Sevortjan, E. V. 1955 Fonetika Turetskogo Literaturnogo Jazyka [Phonetics of the Turkish literary language]. Moskva: Akademija Nauk SSSR. Sharma, D. D. 1998 Tribal languages of Ladak, Vol 1: A Concise Grammar and Dictionary of Brok-skad. New Delhi: Mittal Publications. Shcherba, L. V. 1912 Russkie Glasnye v Kachestvennom i Kolichestvennom Otnoshenii [Russian vowel quality and quantity]. Sankt Peterburg. Shevelov, George Y. 1964 A Prehistory of Slavic. The Historical Phonology of Common Slavic. Heidelberg: Carl Winter Universitätsverlag. Shevelov, George Y. 1979 A Historical Phonology of the Ukrainian Language. Heidelberg: Carl Winter Universitätsverlag. Silverman, Daniel 1997 Phasing and Recoverability. New York and London: Garland. Simões, Antônio R. M. 1987 Temporal organization of Brazilian Portuguese vowels in continuous speech: An acoustical study. Ph.D. diss., University of Texas, Austin. Simões, Antônio R. M. 1991 Towards a phonetics of the discourse. Cadernos de Estudos Lingüísticos 21: 59–78. Sirk, U. 1983 The Buginese Language. Moscow: Nauka. Slifka, Janet 2000 Respiratory constraints at prosodic boundaries in speech. Ph.D. diss., Massachusetts Institutute of Technology. Slowiaczek, Louisa M., and Daniel A. Dinnsen 1985 On the neutralizing status of Polish word-final devoicing. Journal of Phonetics 13: 325–341.

280

References

Smith, Jennifer 2000 Constraints and positions. Paper presented at the Tri-lateral Phonology Weekend, University of California, Santa Cruz, May 7, 2000. Smith, Jennifer 2002 Phonological augmentation in prominent positions. Ph.D. diss., University of Massachusetts, Amherst. Snider, Keith L. 1986 Apocope, tone and glottal stop in Chumburung. Journal of African Languages and Linguistics 8: 133–144. Steriade, Donca 1994 Positional neutralization and the expression of contrast. Ms., UCLA. Steriade, Donca 1995 Underspecification and markedness. In The Handbook of Phonological Theory, John A. Goldsmith (ed.), 114–174. Oxford: Blackwell. Steriade, Donca 1997 Phonetics in phonology: The case of laryngeal neutralization. Ms., UCLA. Steriade, Donca 2001 Directional asymmetries in place assimilation. In The Role of Speech Perception in Phonology, Elizabeth Hume and Keith Johnson (eds.), 219–250. San Diego: Academic Press. Stevens, Kenneth 1986 On the quantal nature of speech. Journal of Phonetics 17: 3–45. Stojkov, Stojko 1993 Reprint. Bâlgarska Dialektologija [Bulgarian dialectology]. Third edition. Maksim Mladenov (ed.). Sofia: Izdatelstvo na BAN. Original edition, 1962. Subrahmanyam, P.S. 1998 Kolami. In The Dravidian Languages, Sanford B. Steever (ed.), 301–327. London, Routledge. Suomi, Kari 1983 Palatal vowel harmony: a perceptually motivated phenomenon? Nordic Journal of Linguistics 6: 1–35. Suomi, Kari 1984 A revised explanation of the causes of palatal vowel harmony based on psychoacoustic spectra. Nordic Journal of Linguistics 7: 41–61. Tenishev, E. R. (ed.) 1984 Sravnitel’no-istoriceskaja Grammatika Tjurkskix Jazykov: Fonetika [Comparative-historical grammar of the Turkic languages: Phonetics]. Moskva: Nauka. Teoh, Boon Seong 1994 The Sound System of Malay Revisited. Kuala Lumpur: Ministry of Education.

References

281

Testen, David 1997 Old Persian and Avestan phonology. In Phonologies of Asia and Africa 2, Alan S. Kaye (ed.), 569–599. Winona Lake: Eisenbrauns. Thirumalai, M. S. 1972 Thaadou Phonetic Reader. Mysore: Central Institute of Indian Languages. Tieszen, Bozena 1997 Final stop devoicing in Polish: an acoustic and historical account for incomplete neutralization. Ph.D. diss,, University of Wisconsin, Madison. Tilkov, Dimitâr 1982 Gramatika na Sâvremennija Bâlgarski Ezik: Tom 1/ Fonetika [ A grammar of the modern Bulgarian language: Vol. 1/ Phonetics]. Sofia: Izdatelstvo na BAN. Timberlake, Alan 1983a Compensatory lengthening in Slavic, 1: Conditions and dialect geography. In From Los Angeles to Kiev, Dean Worth and V. Markov (eds.). Columbus, OH: Slavica. Timberlake, Alan 1983b Compensatory lengthening in Slavic, 2: Phonetic reconstruction. In American contributions to the ninth international congress of slavists, Michael Flier (ed.). Columbus, OH: Slavica. Timberlake, Alan 1993 Isochrony in Late Common Slavic. In American Contributions to the Eleventh International Congress of Slavists, Bratislava, August– September 1993. Literature. Linguistics. Poetics, Robert Maguire and Alan Timberlake (eds.), 425–439. Columbus, OH: Slavica. Topping, Donald 1969 Spoken Chamorro. Honolulu: University of Hawaii Press. Trubetzkoy, N. S. 1969 Principles of Phonology. Translation of Grundzüge der Phonologie by Christiane A. M. Baltaxe. Berkeley: University of California Press. Turk, Alice E., and Stefanie Shattuck-Hufnagel 2000 Word-boundary-related duration patterns in English. Journal of Phonetics 28: 397–440. Tuttle, Siri 2000 Duration, intonation and prominence in Apache. UCLA Working Papers in Phonetics 99: 35–56. Ueyama, Motoko 1999 An experimental study of vowel duration in phrase-final contexts in Japanese. UCLA Working Papers in Phonetics 97: 174–182.

282

References

Urbanczyk, Suzanne 1999 Echo vowels in Coast Salish. In Proceedings of the Workshop on Structure and Constituency in the Languages of the Americas IV: UBC Working Papers in Linguistics, Marion Caldicott, Suzanne Gessner, and Eun Sook Kim (eds.). Vancouver: University of British Columbia Working Papers in Linguistics. Vaissiere, Jacqueline 1988 Prediction of velum movement from phonological specifications. Phonetica 45: 122–139. van der Tuuk, H. N. 1971 A Grammar of Toba Batak. The Hague: Martinus Nijhoff. Vayra, Mario and Carol A. Fowler 1992 Declination of supralaryngeal gestures in spoken Italian. Phonetica 49: 48–60. Vasmer, Max 1986 Reprint. Etimologic¬eskij Slovar’ Russkogo Jazyka [Etymological dictionary of the Russian language]. Moscow: Progress. Original edition, Russisches Etymologisches Wörterbuch, Heidelberg, 1950– 1958. Vincent, Nigel 1988a Latin. In The Romance Languages, Martin Harris and Nigel Vincent (eds.), 26–78. Oxford: Oxford University Press. Vincent, Nigel 1988b Italian. In The Romance Languages, Martin Harris and Nigel Vincent (eds.), 279–313. Oxford: Oxford University Press. Vlasto, A. P. 1985 A Linguistic History of Russian to the End of the Eighteenth Century. Oxford: Oxford University Press. Wackernagel, Jakob 1896 Altindische Grammatik. Göttingen: Vandenhoeck und Rupricht. Walker, Rachel 2001a Round licensing, harmony, and bisyllabic triggers in Altaic. Natural Language and Linguistic Theory 19: 827–878. Walker, Rachel 2001b Two kinds of vowel harmony in Italian dialects. Ms. of paper presented at UC Irvine, November 30, 2001. Walker, Willard 1975 Cherokee. In Studies in Southeastern Indian Languages, James M. Crawford (ed.), 189–236. Athens, GA: University of Georgia Press. Walton, James, and Janice Walton 1967 Phonemes of Muinane. Phonemic Systems of Colombian Languages, Viola G. Waterhouse (ed.), 37–47. Summer Institute of Linguistics Publications in Linguistics and Related Fields 19. Norman: SIL. Warner, Natasha, Erin Good, Allard Jongman, and Joan Sereno in press Orthographic vs. morphological incomplete neutralization effects. Journal of Phonetics.

References

283

Warner, Natasha, Allard Jongman, Joan Sereno, and Rachel Kemps 2004 Incomplete neutralization and other subphonemic durational differences in production and perception: evidence from Dutch Journal of Phonetics 32: 251–276. Wetzels, W. Leo 1992 Mid vowel neutralization in Brazilian Portuguese. Cadernos de Estudos Lingüísticos 23: 19–55. Whalen, Doug H., and Patrice S. Beddor 1989 Connections between nasality and vowel duration and height: Elucidation of the Eastern Algonquian intrusive nasal. Language 65 (3): 457–486. Wightman, Colin, Stefanie Shattuck-Hufnagel, Mari Ostendorf, and Patti J. Price 1992 Segmental durations in the vicinity of prosodic phrase boundaries. Journal of the Acoustical Society of America 91: 1707–1717. Williams, Briony J. 1989 Stress in Modern Welsh. Bloomington: Indiana University Linguistics Club Publications. Wilson, Colin 2003 Experimental investigation of phonological naturalness. In WCCFL 22 Proceedings. G. Garding and M. Tsujimura (eds.), 101–114. Somerville, MA: Cascadilla Press. Wilson, W. A. A. 2000 Vowel harmony in Bijagó. The Journal of West African Languages 28 (1): 19–32. Wolgemuth, Carl 1969 Isthmus Veracruz (Mecayapan) Nahuat laryngeals. In Aztec Studies I: Phonological and Grammatical Studies in Modern Nahuatl Dialects, Dow F. Robinson (ed.), 1–14. Summer Institute of Linguistics Publications in Linguistics and Related Fields 19. Norman: SIL. Wood, Sidney A. J., and Thore Pettersson 1988 Vowel reduction in Bulgarian: The phonetic data and model experiments. Folia Linguistica 22: 239–262. Yu, Alan C. L. 2005 Sound change as exemplar redistribution: the case of tonal nearmerger in Cantonese. Ms., University of Chicago. Zhang, Jie 2001 The effects of duration and sonority on contour tone distribution — Typological survey and formal analysis. Ph.D. diss., University of California, Los Angeles. Zigmond, Maurice L., Curtis G. Booth, and Pamela Munro 1990 Kawaiisu: A Grammar and Dictionary with Texts. University of California Publications in Linguistics #119. Berkeley: University of California Press.

284

References

Zlatoustova, L. V. 1962 Foneticheskaia Struktura Slova v Potoke Rechi [Phonetic structure of the word in the speech stream]. Kazan’: Izdatel’stvo Kazanskogo Universiteta. Zlatoustova, L. V. 1981 Foneticheskie Edinitsy Russkoj Rechi [Phonetic elements of Russian speech]. Moscow: Izdatel’stvo Moskovskogo Universiteta. Zoll, Cheryl 1998 Positional Asymmetries and Licensing. Rutgers Optimality Archive #282-0998. Zsiga, Elizabeth Cook 1993 Features, gestures, and the temporal aspects of phonological organization. Ph.D. diss., Yale University. Zue, V., and M. Laferriere 1979 Acoustic study of medial t, d in American English. Journal of the Acoustical Society of America 66: 1039–1050. Zvelebil, Kamil 1970 Comparative Dravidian Phonology. Berlin/New York: Mouton de Gruyter. Zvelebil, Kamil 1990 Dravidian Linguistics: an Introduction. Pondicherry: Pondicherry Institute of Linguistics and Culture.

Index

!Xóõ, 165 Accentual Phrase, 76 acquisition, 73, 74, 77, 99, 221, 227 advanced tongue root, 16, 19, 22, 23, 27, 29, 39, 42, 72 Afar, 120–21, 132 affixes, 69, 82, 91, 92, 109, 140, 145, 179, 199, 200, 202, 203, 204, 205, 209, 217, 224 Akan, 126–27 akanje, 36, 214, 220 Altaic languages, 161, 174, 178, 200 American Sign Language, 74 amplitude, 43, 94, 96, 114, 123, 126, 128, 129, 131, 138, 147, 149, 156, 157, 192, 197 Apinaye, 132 apocope, 73, 87, 114, 116, 119, 133, 142, 144, 151, 157, 178 Arapaho, 166 articulatory declination, 94, 96, 135 articulatory effort, 7, 11, 30, 43, 61, 108, 114, 210, 211, 215 Articulatory Phonology, 11, 169 aspiration, 3, 6, 67, 112, 115, 116, 119, 120, 125, 168, 185 assimilative vowel reduction, 193, 195, 196 Assuriní, 138 Avestan, 151 Azerbaijani, 116 babbling, 75 backness. See frontness/backness bad vowels, 42, 79, 85 Bantu languages, 4, 161, 174, 200–203

Bare, 140, 141 Batak, 158 Beijing Chinese, 74 Belarusian, 21, 22, 35, 36–37, 41, 44, 46, 47, 87, 88, 97, 194 Bengali, 176 Berguener Romansh, 21, 22 bisyllabic triggers, 109, 110, 113 Biyagó, 143, 151 Bonggi, 93, 98 Brazilian Portuguese, 22, 30, 37–39, 45, 47, 48, 49, 67, 71, 72, 74, 89, 93, 97, 98, 104, 107, 108, 118, 143, 146, 212 breathy voice, 80, 112, 128, 138, 139, 142, 148, 161, 175 Brok-skad, 144 Buginese, 158 Bulgarian, 13, 21, 22, 30, 32–36, 41, 44, 45, 46, 49, 67, 71, 74, 80, 104, 118, 143, 194, 210 Burushaski, 28 Campidanian Sardinian, 166 Catalan, 22, 98 Central Eastern dialect, 21, 22, 90 Chamorro, 22 Cherokee, 138 Chichewa, 201 Chipewyan, 26 Choctaw, 151 Chong, 129 Chumburung, 126 CiYao, 201 Classical Manchu, 109 clipping, 119, 120, 125, 132–34, 151 closed syllable shortening, 80

286

Index

closed syllables, 79, 83, 88, 91, 93, 94, 95, 96, 97, 98, 99, 101, 106, 110, 124, 147, 153, 154, 185, 186, 209, 215, 218 coarticulation, 65, 66, 96, 102 vowel-to-vowel, 18, 27, 31, 85, 95, 113, 196–200, 204 window model of, 9–15, 60, 62–67, 103–8 codas, 3, 6, 66, 186, 188, 189, 190, 219 compensatory lengthening, 104, 106, 126, 154, 214, 218, 219 complex syllable margins, 66 consonant deletion, 218 contour tones, 78 Copala Trique, 27, 72 Cotabato Manobo, 158 creaky voice, 123, 125, 126, 127–30, 131, 142, 157, 159 Creek, 31, 74, 75, 95, 96, 97, 146 Czech, 74, 194 D’irayta, 133 Dagbaani, 125 Dasenech, 119, 121, 142, 151 devoicing, 27, 64, 73, 80, 84, 114, 115–25, 130, 132–34, 138, 140, 141, 142, 144, 147, 150, 156, 159, 172, 178, 207 Dhangar Kurux. See Kurux diphthongization, 31, 68, 141 Direct Phonetics, 7, 9, 18, 23, 101, 207, 209–20 dispersion, 21–22, 40–41, 70, 71 dissimilative vowel reduction, 193 Doyayo, 165 Dravidian languages, 26, 28, 151, 158, 161, 174, 193 duration and unstressed vowel reduction, 7, 8, 11, 12, 16,

27, 28, 29–32, 37, 38, 45–46, 49–67 as a perceptual cue, 40, 41, 42, 73, 79, 83, 85 of final syllables, 74–76, 79, 93–98, 101, 102, 118, 119, 123, 124, 132, 137, 143, 148, 155, 156, 158 of initial syllables, 168–73, 177–92 represented in phonology, 210–20 Dusunic languages, 110, 111, 112 Dutch, 74, 75, 124, 146, 150, 157 East Slavic, 35, 36, 46, 67, 193 Eastern Algonquian, 136 Eastern Cheremis. See Eastern Mari Eastern Mari, 89, 98, 108, 157 English, 17, 30, 65, 66, 74, 75, 76, 77, 90, 94, 95, 96, 98, 99, 101, 108, 114, 125, 141, 142, 166, 168, 172, 175, 178, 179–84, 186, 191, 192, 193, 196, 205 epenthesis, 5, 68, 69, 116, 121, 125, 126, 131, 132, 139 Evolutionary Phonology, 2, 206 F0. See pitch factorial typology, 39, 172 fast speech. See speech rate final fade, 96, 97, 130, 134, 154 final lengthening, 17, 31, 37, 38, 51, 63, 73, 74–76, 93–98, 100, 102, 104, 107, 118, 121–25, 133, 137, 146–50, 154–58, 168, 169, 192, 203, 207 final resistance, 17, 86–108, 121, 140, 159, 163 French, 74, 75, 82, 116, 168, 172, 192

Index frequency, 228, 234, 235 frontness/backness, 15, 16, 22, 23, 24, 26, 39, 42, 72, 161, 179, 195, 198 Gadaba, 151 German, 74, 75 gestural magnitude, 50, 94–97, 101, 121–24, 133, 134–35, 146, 149, 150, 168–73, 207 gestural targets, 11, 29–32, 43, 50, 104, 118, 121, 124, 130, 195, 210 Gilbertese, 133 glottalization, 3, 80, 120, 121, 123, 125–34, 138, 151 Goajiro, 119, 145 Gokana, 161, 174, 203 Gonja, 126 Grounded Phonology, 8 Guang, 126–27 Guarani, 27 Guhang Ifugao, 166 Gujarati, 140, 161, 176, 208 harmony, 18, 19, 20, 22, 24, 27, 41, 79, 92, 161, 167, 174, 179, 203, 204, 208 height, 200–203 laxing, 82, 83, 84 palatal, 24, 174, 176, 178, 193–200 rounding, 85, 91, 92, 109, 110, 113 RTR, 145 stress-based, 27, 84, 89, 197 translaryngeal, 69, 139 Hausa, 74, 79–81, 86, 97, 103, 132 height. See vowel height hiatus, 50, 51, 63, 66, 118, 135, 139 Highland East Cushitic, 144 Hindi, 136

287

hyperarticulation, 52, 101, 138 hypocorrection and sound change, 15, 84 iambic lengthening, 151–57 Iban, 144, 158 incomplete neutralization, 18, 56, 209, 223–38 Indo-European, 114 Indonesian, 147 initial strengthening, 18, 51, 66, 167–73, 176, 177–200, 201, 203, 208 insertion. See epenthesis Integrated Phonetics and Phonology. See Direct Phonetics Intonational Phrase, 76, 131, 187 Irantxe, 138 Israeli Hebrew, 74, 100 Isthmus Veracruz Zapotec, 133 Italian, 85, 102, 153 Bari dialect, 22, 24 Standard, 21, 22, 31, 41, 45, 46, 49, 74 Tuscan dialect, 94, 95 Jamul Tiipay, 28 Japanese, 74, 76, 130, 131 Jarawara, 138 Jicarilla Apache, 75 Jordanian Arabic, 75, 76, 99 Kabardian, 23, 42 Kadazan/Dusun, 111, 112 Kandahari Pashto, 145 Kannada, 29 Kawaiisu, 143 Khanty, 23, 24, 39 Kikerewe, 201 Kikuyu, 78 Kimaragang, 110, 111, 112, 113 Koasati, 132

288

Index

Kolami, 28, 151 Konda, 151 Korean, 75, 157, 168, 177, 192 Kota, 175 Koyra, 132, 151 Kuku-Thaypan, 154 Kukuya, 4, 6, 201 Kullo, 144 Kurdish, 103 Sulaimania dialect, 157 Kurux, 29, 151, 161, 174, 175 Kwara’ae, 119 Late Common Slavic, 36 Latin, 177 Latvian, 133, 157 length. See quantity lexical neighborhoods, 235 Licensing-by-Cue, 6–7, 8, 9, 18, 20, 39, 40, 100, 210 Lithuanian, 154, 157 Loniu, 21, 22 Luganda, 50, 75, 154, 172, 201 Luiseño, 132 Lunda, 145 Macedonian, 35, 194 Maintain Contrasts, 211 Maithili, 145 Malay, 147, 149 Maltese, 91, 98, 102, 133, 151 Malto, 29 Mandarin Chinese, 129 Mansi, 23, 24, 39 Mari, 158 Marshallese, 23, 42 Mbabaram, 166 metaphony, 15, 84–86, 112, 113, 126, 144, 150 Minimum Distance, 211, 213, 219 Mongolian, 166, 174, 176 monophthongization, 150, 175, 176

mora-clipping. See clipping moraicity, 4, 28, 50, 66, 101, 119, 133, 151, 153, 155, 158, 166, 201, 218 Moroccan Arabic, 157 Muinane, 145 Murutic languages, 110, 111, 112 Nahuatl Isthmus Veracruz Nahuat dialect, 130, 132, 139 Nanai, 140, 145, 146, 159, 160 nasalization, 19, 20, 84, 159, 160, 161 final syllables, 134–41 in unstressed syllables, 27, 29, 39, 42, 72 Navaho, 116 Nawuri, 50, 91, 94, 98, 106, 171 near-merger, 228, 229, 238 Neo-Aramaic, 137 neo-grammarian sound change, 229 non-finality, 153, 154, 157 Nyungar, 144

Ob-Ugrian languages, 23, 24, 25, 193 Oneida, 116 Ongota, 141 onsetless syllables, 49, 50, 51, 64, 66, 92, 171, 205 onsets, 3, 4, 50, 66, 77, 99, 123, 134, 163, 164, 165, 166, 167, 168, 169, 170, 171, 177, 179, 186, 205 open syllables, 37, 51, 53, 83, 87, 88, 89, 90, 91, 94, 93–98, 99, 100, 103, 108, 118, 120, 123, 124, 143, 147, 149, 153, 154, 157, 159, 178, 180, 185, 209, 215, 217, 219

Index Optimality Theory, 4–5, 41, 43–44, 65, 71, 77, 104–8, 152, 165–67, 170, 172, 209, 211 Oromo, 133

Paitanic languages, 110 palatalization, 48, 53, 87, 126, 138 Pali, 136 Papago, 119 paradigmatic contrast, 25, 168, 169 pharyngealization, 19, 22, 27, 142, 145, 159, 161 phonemicization, 15, 117 phonetic knowledge, 210, 221 Phonological Phrase, 181, 182 Phonological Word, 181, 187 phonologization, 1–2, 8–16, 19, 28 of final glottalization, 125, 130–31 of final reduction, 150, 159–60 of final strength and resistance, 93–108 of initial strength, 168–73 of palatal vowel harmony, 196–200 of vowel devoicing, 117–19, 121 of vowel reduction, 23–24, 30–32, 35–36, 37–39, 67–68 pi-gesture, 99, 124, 169 Pirahã, 138 pitch, 15, 16, 17, 68, 76, 94, 96, 116, 125, 127, 129, 130, 131, 136, 157, 159, 178, 192, 197, 207 pitch accent, 30, 66, 68, 95, 102, 179, 185 Polish, 194, 220 dialect of Niemirowa nad Bugiem, 98

289

Positional Augmentation, 111, 141, 142, 160, 162–67, 168, 169, 171, 173 Positional Faithfulness, 4–5, 71, 77, 100, 105, 106, 107, 162 Positional Markedness, 5, 41, 43, 71, 100 prominence scales, 43 Prosodic Hierarchy, 75, 131, 187, 191 Prosodic Word, 170, 209 Proto-Altaic, 174 Proto-Austronesian, 70, 111, 112, 158 Proto-Malayo-Polynesian, 111 Proto-Turkic, 176, 178, 194, 195, 196, 197, 199, 208 Proto-Uralic, 24, 176 psycholinguistic prominence, 73, 74, 77–78, 87, 98–100, 162, 163, 164, 166, 167, 169, 170, 172, 204, 209 Punjabi, 103, 158 Punu, 202 Pure Prominence, 3–6, 7, 73, 100, 104, 105, 107, 207 Quantal Theory, 40 quantity, 19, 25–26, 28–29, 36, 39, 103, 132, 151–58, 161 raddoppiamento sintattico, 153 reinterpretation. See phonologization resizing of target windows, 11, 61, 63, 64, 66, 108 retracted tongue root, 19, 22, 27, 142, 145 rhinoglottophilia, 138–40 Rhythm, 152, 154, 156 Richness of the Base, 65 Rikbaktsa, 138

290

Index

root (morphological), 4, 17, 92, 161, 173, 179, 194, 199, 200, 201, 202, 203, 209 Rotuman, 119 roundness, 16, 19, 23, 24, 25, 26, 29, 42, 50, 70, 72, 79, 85, 89, 91, 93, 109, 110, 112, 137, 161, 193, 194, 195 Runyambo, 50, 172, 201 Russian, 28, 75, 87, 91, 97, 116, 122, 123, 125, 128, 132, 138, 148, 157, 171, 175, 205 northern dialects, 67 southern dialects, 21 vowel reduction, 11, 12, 20, 35, 37, 45, 47–67, 71, 93, 104, 194, 195, 196, 202, 210

Saanich, 23 Sanskrit, 134, 135–37, 175 Seediq, 21, 22, 68–71, 72, 150, 220 Serbo-Croatian, 35, 194 Shilluk, 165 Shimakonde, 92, 93, 98, 201, 214–15 Slovenian, 25–26 sonority, 67, 69 enhancement, 31, 95, 96, 101, 111, 122, 141, 142, 147, 149, 155, 172 in unstressed syllables, 43, 42–47 rise from domain-initial position, 163–64, 165–67, 170 sonority-driven stress, 44 sound change. See phonologization Southern Paiute, 119, 130, 133 Spanish, 75, 85 Andalusian, 84, 143 Castillian dialects, 142

Pasiego, 81, 82, 83, 84, 86, 109, 112, 160 Tudanca, 83 spectral tilt, 131 speech rate, 11, 12, 38, 51, 54, 63, 87, 105, 106, 117, 119, 148, 210, 211 stem (morphological), 4, 5, 91, 201, 202, 203, 224, 237 stress, 4, 11, 12, 13 and children's truncations, 77 and final lengthening, 94–97, 100–103 and glottalization, 130–31 and harmony, 27, 83, 86, 193–97 and Positional Augmentation, 164 and sonority, 42–44 and vowel devoicing, 119–21 and vowel quantity, 28–29, 151–57 and vowel reduction, 5, 8, 19–72 duration-cued, 28, 29–32, 94, 192 fixed, 29, 98, 161, 167, 174–77, 178, 193, 194 non-duration-cued, 94, 186, 194 primary vs. secondary, 31, 180 vs. abstract accent, 4, 201 subglottal pressure, 17, 116, 118, 119, 120, 121, 122, 123, 124, 125, 127, 128, 129, 130, 131, 147, 156, 159, 207 Substance Abuse, 2, 9, 221 substantive filters on constraint construction, 8, 170 Swedish, 31, 75, 95, 96 syllabification, 43

Index syllable structure. See open syllables, closed syllables syllable weight, 152, 175, 178 syllable-timing, 102 syncope, 29, 175, 194 syntagmatic contrast, 168, 169

Taiwanese, 168 Tamil, 26, 29, 50, 171, 174, 175 Tapirape, 138 Tatana’, 111, 112 Telefol, 26 Tepehua, 133 Thaadou, 158 Tigre, 151, 157 Tigrinya, 157 Timugon Murut, 79, 93, 109–14, 195, 202 Tiv, 161, 174, 203 Toda, 29, 175 tone, 4, 15, 78–79, 127, 129, 154, 201 truncation, 77, 99, 140 Tungusic, 109, 145, 146, 161, 174, 176 Turkana, 124 Turkic, 174, 176, 208 Turkic languages, 94, 177–79, 193–200, 204 Turkish, 94, 116, 119, 146, 162, 174, 176, 177–79, 184–200, 204, 205 Turkmen, 94, 192 Tuvan, 161

Ukrainian, 88, 97 dialect of Niemirowa nad Bugiem, 98 umlaut. See metaphony

291

undershoot of articulatory targets, 12, 19, 29–32, 41, 43, 61, 103, 207, 210, 213 underspecification, 10, 65, 70 Universal Grammar, 1, 4, 9, 17, 18, 24, 30, 74, 159, 161, 165, 204, 206–9 unnaturalness, 2, 23, 72, 117, 215, 221, 225 Uralic languages, 24, 161, 174, 175, 176, 193, 200 utterance, 53, 67, 76, 86, 102, 115, 116, 120, 131, 134, 138, 143, 181, 187, 189, 190, 191 Uyghur, 90, 94, 98, 101, 102, 104, 106, 215–19, 215–20, 237

velum height, 96, 134, 135, 136, 137, 138, 139, 140 vertical vowel systems, 23, 42 voice onset time, 54, 168, 171, 177, 180, 185 vowel deletion. See syncope, apocope vowel height in final syllables, 94, 102, 104, 141–43 in harmonies, 200, 202 in metaphonies, 85 in unstressed syllables, 4–5, 12, 16, 19, 20–27, 29–32, 42 vowel reduction and Contrast Enhancement, 40–42 and Prominence Reduction, 42–47 and vowel harmony, 193–98, 201, 202 assimilative, in Russian. See assimilative vowel reduction

292

Index dissimilative, in Russian. See dissimilative vowel reduction in final syllables, 83, 141–50 in unstressed syllables, 5, 11, 12, 19–72, 81, 161, 204, 210–20 resistance to, 86–108

Warekena, 139 Warumungu, 28 Weight-to-Stress, 152

Welsh, 26 West Slavic, 202 windows. See coarticulation Xeta, 138 Yakan, 91, 98, 195, 202 Yowlumne, 161 Yucuna, 139 Yukulta, 151