Click Consonants 9789004232686, 9789004424357

Click Consonants is an indispensable volume for those who want to understand the linguistics of clicks. Contributions in

English Pages 467 [483] Year 2020

Table of contents :
Contents
Editor’s Preface
Illustrations
Chapter 1 Click Consonants: an Introduction
Chapter 2 Click Phonology
Chapter 3 The Interaction between Click Consonants and Tone in Tsua
Chapter 4 Click Loss and Click Insertion in Fwe
Chapter 5 Perception of Non-native Click Consonant Contrasts: Implications for Theories of Speech Perception
Chapter 6 Studying Clicks Using Real-Time MRI
Chapter 7 Recording and Measuring Acoustic Attributes of Clicks
Chapter 8 Nasalized Accompaniments in Proto-Khoe and in Khwe
Chapter 9 Click Loss in Khoe-Kwadi
Chapter 10 Click Replacement and Loss in Ju
Chapter 11 Production of Click Sounds in Acquired Apraxia of Speech: a View to the Motoric Nature of the Disorder
Chapter 12 The ArtiVarK Click Study: Documenting Click Production and Substitution Strategies by Learners in a Large Phonetic Training and Vocal Tract Imaging Study
Chapter 13 Notes on Child Acquisition of Clicks in Hadza
Chapter 14 Paralinguistic Use of Clicks in Chad
Chapter 15 False Alarms: Spurious Reports of Click Consonants
General Index
Language Index

Click Consonants

Empirical Approaches to Linguistic Theory

Series Editor
Brian D. Joseph, The Ohio State University

Editorial Board
Artemis Alexiadou, University of Stuttgart
Harald Baayen, University of Alberta
Pier Marco Bertinetto, Scuola Normale Superiore
Kirk Hazen, West Virginia University
Maria Polinsky, Harvard University

volume 15

The titles published in this series are listed at brill.com/ealt

Click Consonants

Edited by
Bonny Sands

LEIDEN | BOSTON

Library of Congress Cataloging-in-Publication Data Names: Sands, Bonny Eva, 1966- editor. Title: Click consonants / edited by Bonny Sands. Description: Leiden ; Boston : Brill, 2020. | Series: Empirical approaches  to linguistic theory, 22106243 ; volume 15 | Includes bibliographical  references and index. Identifiers: LCCN 2020014923 | ISBN 9789004232686 (hardback) | ISBN  9789004424357 (ebook) Subjects: LCSH: Clicks (Phonetics) | African languages—Clicks. Classification: LCC P235.5 .C55 2020 | DDC 496—dc23 LC record available at https://lccn.loc.gov/2020014923

Typeface for the Latin, Greek, and Cyrillic scripts: “Brill”. See and download: brill.com/brill-typeface. ISSN 2210-6243 ISBN 978-90-04-23268-6 (hardback) ISBN 978-90-04-42435-7 (e-book) Copyright 2020 by Koninklijke Brill NV, Leiden, The Netherlands. Koninklijke Brill NV incorporates the imprints Brill, Brill Hes & De Graaf, Brill Nijhoff, Brill Rodopi, Brill Sense, Hotei Publishing, mentis Verlag, Verlag Ferdinand Schöningh and Wilhelm Fink Verlag. All rights reserved. No part of this publication may be reproduced, translated, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without prior written permission from the publisher. Requests for re-use and/or translations must be addressed to Koninklijke Brill NV via brill.com or copyright.com. This book is printed on acid-free paper and produced in a sustainable manner.

Contents

Editor’s Preface
List of Illustrations

1 Click Consonants: an Introduction
  Bonny Sands
2 Click Phonology
  Wm. G. Bennett
3 The Interaction between Click Consonants and Tone in Tsua
  Timothy K. Mathes
4 Click Loss and Click Insertion in Fwe
  Hilde Gunnink
5 Perception of Non-native Click Consonant Contrasts: Implications for Theories of Speech Perception
  Catherine T. Best
6 Studying Clicks Using Real-Time MRI
  Michael Proctor, Yinghua Zhu, Adam Lammert, Asterios Toutios, Bonny Sands, and Shrikanth Narayanan
7 Recording and Measuring Acoustic Attributes of Clicks
  Sean Fulop and Richard Wright
8 Nasalized Accompaniments in Proto-Khoe and in Khwe
  E.D. Elderkin
9 Click Loss in Khoe-Kwadi
  Anne-Maria Fehn
10 Click Replacement and Loss in Ju
  Anne-Maria Fehn
11 Production of Click Sounds in Acquired Apraxia of Speech: a View to the Motoric Nature of the Disorder
  Anita van der Merwe and Mollie Steyn
12 The ArtiVarK Click Study: Documenting Click Production and Substitution Strategies by Learners in a Large Phonetic Training and Vocal Tract Imaging Study
  Scott Moisik and Dan Dediu
13 Notes on Child Acquisition of Clicks in Hadza
  Kirk Miller
14 Paralinguistic Use of Clicks in Chad
  Florian Lionnet
15 False Alarms: Spurious Reports of Click Consonants
  Kirk Miller

General Index
Language Index

Editor’s Preface

Clicks are exotic; clicks are interesting; clicks are expressive; clicks are curiosities; yet clicks are a natural part of human language. Thus, in short, clicks are a challenge, and as such, they need to be studied and studied carefully. Therefore when this volume came to my attention for possible inclusion in the Empirical Approaches to Linguistic Theory (EALT) series, I jumped at the chance to add it to the series’ roster. Exploring clicks from a variety of perspectives—phonetic, phonological, historical, acquisitional, and more—this volume offers the reader an empirically based overview of clicks that demonstrates clearly just how interesting, expressive, curious, and, yes, exotic, these sounds are.

While there are many studies of clicks in individual languages and from different angles, as editor Bonny Sands’ massive (20-page!) bibliography indicates, the present volume is perhaps the first ever of its kind anywhere that is focused entirely on clicks.

At the same time, this work also represents a first for this series, as it is the first EALT volume in the series that is devoted in its entirety to a phonological phenomenon. While the phonic side of language has not been completely neglected in the previous volumes taken together, mostly those prior to this one treated topics in morphology, syntax, semantics, interfaces and such. This volume is thus an especially welcome addition to the roster of EALT studies.

This series aims to offer a cross-linguistic perspective on matters of concern to linguistic theory, trying to get at the ways in which all languages are alike and the ways in which they differ. It thus has a typological dimension to its goals. In principle, although much of the literature in typology and the investigation of linguistic universals treats morphology and syntax, phonology is just as amenable to cross-linguistic investigation. Indeed, the careful and data-rich studies in this work illustrate this point amply.

Brian D. Joseph
EALT Series Managing Editor
Columbus, Ohio USA, 6 July 2020

Illustrations

Figures
1.1 a) Doke’s symbols for Grootfontein ǃXun clicks, b) Transcription equivalents of Grootfontein ǃXun click symbols
1.2 Ekoka ǃXun fricated palatal click /⨎/ and following breathy vowel, in the word /⨎ȁ̤hà/ ‘cut’
1.3 Ekoka ǃXun lateral click /ǁ/ and following vowel, in the word /ǁa̋è/ ‘monkey’
1.4 Burst spectra of the Ekoka ǃXun [⨎] (black) and [ǁ] (red) click tokens in Figures 1.1–1.2
1.5 Midsagittal diagram of a retroflex click production in Grootfontein ǃXun
1.6 Alveolar click in the word /ᶢǃűː/ ‘belly’ pronounced by Mishe ǁXae Nǀami of Grootfontein, Namibia
1.7 Retroflex (fricated postalveolar) click in the word /ᶢǃǃűː/ ‘water’ pronounced by Mishe ǁXae Nǀami of Grootfontein, Namibia
1.8 Palatography of Grootfontein ǃXun retroflex click in the word /ᶢǃǃűː/ ‘water’, as pronounced by Kaqe Useb
1.9 High-speed ultrasound traces of a Grootfontein ǃXun retroflex click in the word ǃǃhá [ǃǃʰá] ‘meat’
1.10 Spectrogram and waveform illustrating an ejected click in the Gǀui word /ǂ’áà/ ‘to faint, fall unconscious, play dead’
1.11 Spectrogram and waveform illustrating a glottalized click in the Gǀui word /ǂʔáà/ ‘sp. grass used for an arrow shaft’
1.12 Linguo-pulmonic unaspirated alveolar contour click /ǃ͡q/ and portion of the following vowel in the Nǀuu word /ǃ͡qɑo/ ‘to pluck, pick’
1.13 Unaspirated alveolar click /ǃ/ and portion of the following vowel in the Nǀuu word /ǃɑɑ/ ‘hartebeest’
1.14 Spectrogram of the voiceless unaspirated velar-alveolar click /ᵏǃ/ and portion of the following vowel in the ǃXóõ word ǃáa /ᵏǃáa/ ‘rain’
1.15 Spectrogram of the voiceless unaspirated uvular-alveolar click /qǃ/ and portion of the following vowel in the ǃXóõ word ǃqáũ /qǃáũ/ ‘be careful, sharp’
1.16 Extracted formant traces taken from the spectrograms in Figures 1.14–1.15
3.1 The tonal space of the six time-normalized and F0 averaged tone melodies across 3 Tsua consultants
3.2 F0 traces of HM ǁáē ‘to chew’ and DH-M ɡǁá͓m̄ ‘thorn’ to illustrate H tone depression
3.3 F0 traces of HL ǀχóò ‘gemsbok’ and DH-L ɡǀá͓à ‘Silver tree’ to illustrate H tone depression
3.4 The tonal space of the six non-depressed and two depressed time-normalized and F0 averaged tone melodies across 3 Tsua consultants
4.1 Spectrogram and waveform of [ⁿdùǀɔ́ mà] ‘It is papyrus.’
4.2 Spectrogram and waveform of [ɡǀìmà] ‘small fish sp.’
4.3 Spectrogram and waveform of [mù-ᵑǀ̥ápì] ‘small frog sp.’
4.4 Spectrogram and waveform of [zì-ᵑǀúmàrɛ̀] ‘berries of Grewia sp.’
4.5 Spectrogram and waveform of [zìᵑǀûʃù] ‘sores’
4.6 Spectrogram and waveform of [cìᵑǀûʃù] ‘sore’
4.7 Spectrogram and waveform of [ᵑǁûʃù] ‘sore’
4.8 Spectrogram and waveform of [mù-ɡɛ̂nɛ̀] ‘thin’
4.9 Spectrogram and waveform of [mù-ɡǀɛ̂nɛ̀] ‘thin’
6.1 rtMRI frame showing midsagittal articulation of the upper airway and maximum constriction of the initial dorsal stop in the Nama word ge [ɡè] ‘we (masc.)’
6.2 Time-aligned audio and video data acquired during Nama lateral click production
6.3 Comparison of word-initial clicks produced before mid-back vowels
6.4 Evolution of midsagittal lingual posture in four contrastive Nama click releases
6.5 siSwati ‘plain’ click production before a high-back vowel
6.6 siSwati nasal click production in a low back vowel context
6.7 siSwati nasal click production in a high front vowel context
6.8 Influence of vowel context on siSwati clicks
6.9 Place of articulation of siSwati velar stops
6.10 Places of articulation in English consonants produced by the beatbox artist
6.11 Articulation of a ‘side K’ rim shot effect as a lateral click
6.12 Articulation of a rimshot effect as an alveolar click
6.13 Articulation of a ‘clap’ snare drum effect as labialized dental click
7.1 Praat “Edit” screen view of the Taa word [ʘŋ̥áʔãː] ‘conceal oneself’
7.2 Waveform and spectrum of bilabial click burst in the Taa word [ʘŋ̥áʔãː] ‘conceal oneself’
7.3 Waveform and spectrum of dental click burst in Taa word [ǀŋ̥úʔwì] ‘be careful’
7.4 Waveform and spectrum of lateral click burst in Taa word [ǁŋ̥áʔám̀] ‘be damp’
7.5 Waveform and spectrum of postalveolar click burst in Taa word [ǃŋ̥áʔã] ‘attack’
7.6 Waveform and spectrum of palatal click burst in Taa word [ǂŋ̥úʔùã] ‘be out of reach’
7.7 Waveform and power spectrum of palatal click in Yeyi word [ʃ ìkǂá] ‘scarification’
7.8 Spectrogram of Taa word [ǀk’qàa] ‘hand’
7.9 Dental and velar ejective bursts in Taa word [ǀk’qàa]
7.10 Second dorsal burst in Taa word [ǀk’qàa]
7.11 Power spectra of first and second dorsal bursts in the Taa word [ǀk’qàa]
7.12 Spectrogram of Yeyi word [m̀ ŋǃójì] ‘Acacia erioloba’ (type of plant) exemplifying a voiced nasal alveolar click
7.13 Spectrogram of Yeyi word [kà ɡǃáwà] ‘calabash’ exemplifying a voiced alveolar click
7.14 Spectrogram of voiced prenasalized palatal click in the Taa word [ɴɢǂà̰̰li] for a kind of tree
7.15 Praat Edit view of the Xhosa word [ukúŋǃola] ‘to climb up’
7.16 Spectrogram of Taa word [ǀˀâː] ‘die’ showing glottalized dental click
7.17 Spectrogram of Taa word [ǀq’ɜ́n] ‘small, pl.’ showing dental ejective click
7.18 Waveform showing click burst from Taa word [ǀq’ɜ́n] ‘small, pl.’
7.19 Spectrogram of Taa word meaning ‘to cry incessantly’
7.20 Praat Edit screen showing initial portion of the Taa word from Fig. 7.19
7.21 Frequency response of a built in microphone on the Zoom H4n – fairly flat up to 20 kHz
7.22 Pickup patterns of omnidirectional, cardioid, and shotgun microphones
9.1 Click load of assorted Southern African Khoisan languages, based on the onsets of lexemes taken from a Swadesh 100 word list
9.2 The acoustic properties of the four click influxes dental ǀ, palatal ǂ, lateral ǁ and alveolar ǃ (following Traill and Vossen 1997: 23)
9.3 Schematisation of the process of click weakening, according to Traill and Vossen (1997)
9.4 The Khoe-Kwadi language family
10.1 The Kx’a family
10.2 Evolution of click inventories in the Ju dialect cluster
11.1 Production of ixoxo by a typical female Zulu second language speaker
11.2 Production of ngcono by a typical female Zulu second language speaker
11.3 Production of esingenamcamelo by a typical male Zulu first language speaker
11.4 Production of ixoxo by the Participant with AOS
11.5 Production of ngcono by the Participant with AOS
11.6 Production of esingenamcamelo by the Participant with AOS
12.1 Tracings of a radiographic image (a), palate casts (b–d), and a midsagittal slice through a 3D intraoral scan (e)
12.2 Visual and auditory aids used during the phonetic training phase to instruct participants in the production of (post)alveolar [aǃa] (top row) and dental [aǀa] click (bottom row) target sequences
12.3 Waveforms and spectrograms illustrating plain, (post-)glottalized, and nasalized clicks of the target (post)alveolar and dental click types
12.4 Static MR images of participants successfully producing sustained articulations of [ǃːːːː] and [ǀːːːː] targets held at the moment prior to click release
12.5 Static MR images of participants who failed to produce click postures for both [ǃːːːː] and [ǀːːːː] targets
12.6 Real-time MRI frame sequence showing successful production of [aǃa] with a postalveolar click
12.7 Real-time MRI frame sequence showing successful production of [aǀa] with a denti-alveolar click
12.8 Real-time MRI frame sequence showing successful production of [aǃa] with a postalveolar click in which the velopharyngeal port opens late in click production
12.9 Real-time MRI frame sequence showing successful production of [aǃa] with full lingual-palatal contact yielding an unusual palatal click
12.10 Real-time MRI frame sequence showing awareness of anterior (coronal) and posterior (dorsal) lingual articulatory components of the [aǃa] target but poor coordination and absence of velum engagement
12.11 Real-time MRI frame sequences showing examples of aspirated velar stop [akʰa] substitutions for [aǃa]
12.12 Real-time MRI frame sequence showing velar ejective stop substitution, [ak’a], for [aǃa]
12.13 Real-time MRI frame sequences showing examples of voiceless retroflex stop [aʈa] substitutions
14.1 Laal dental click [ǀ] (Speaker ID, natural speech)
14.2 Laal dental click [ǀ] (Speaker KD, natural speech)
14.3 Laal lateral click [ǁ] (speaker ID, natural speech)
14.4 Laal lateral click [ǁ] (AK, elicited, careful speech)
14.5 Laal velar click (speaker KD, elicited, careful speech)
14.6 Laal velar click (speaker AK, elicited, careful speech)
14.7 Laal “tchip” (speaker KD, natural speech)

Maps
1.1 Approximate areas where click languages are spoken
1.2 Southern African Bantu languages with clicks as regular consonants
1.3 Map of southern Africa with pie charts showing the functional load of clicks in the basic vocabulary of some click languages
4.1 Locations where Fwe is spoken
9.1 Geographical distribution of click loss in the Khoe language family
9.2 Geographical distribution of patterns of alveolar click loss in Khoe
9.3 Geographical distribution of palatal click replacements in Khoe
10.1 Geographical distribution of click replacement and loss within the Ju language complex

Tables
1.1 Cross-linguistically common phonetic realizations of the five most common click types
1.2 Words showing regular reflexes of the Proto-Ju retroflex click *ǃǃ type
1.3 Minimal and near-minimal pairs illustrating ejective and glottalized clicks in Gǀui
1.4 ǃXóõ click accompaniments contrastive for dorsal place
1.5 Proto-Kx’a *ʘ cognate sets proposed by Heine and Honken (2010: 27)
1.6 Kx’a cognate sets involving bilabial clicks and non-dental clicks
2.1 Nǀuu click and non-click pronouns
2.2 Changes to clicks in class 9/10 post-nasal condition
2.3 Presence of clicks implies presence of nasalized clicks
2.4 ǃXóõ inventory
2.5 Comparison of three cluster-based analyses of ǃXóõ
2.6 Inventory of Khoekhoe
2.7 Juǀ’hoan stop clusters
2.8 Gǀui clusters
2.9 Comparison of click and non-click clusters
3.1 Consonant-tone interaction in African and East Asian languages
3.2 Tsua click consonant inventory
3.3 Click replacement comparative data for ǃ and ǂ clicks
3.4 Root counts of tonal depression exceptions by depressor type
3.5 Root counts of tonal depression by tone melody
3.6 Gǀui, Kua and Tsua correspondences for the voiced obstruents
3.7 Root-initial nasal consonants in Kalahari Khoe East languages compared to Tsua root-initial voiced consonants
3.8 Examples of nasal clicks co-occurring with five of the six non-depressed tonal melodies in Tsua
3.9 Cognates where tonal depression occurs in Tsua but not in Gǀui or Kua
4.1 Words showing contrast between clicks and non-clicks in Fwe
4.2 Click/non-click alternations for each speaker
4.3 Click/non-click alternations for each word
4.4 Bantu non-click cognates of Fwe click words
6.1 Nama click consonants
6.2 Summary of Phonetic Properties of Click Consonants
6.3 Nama orthography and phonetic transcriptions, with Afrikaans disambiguators (where used during presentation), and English glosses
6.4 siSwati orthography and phonetic transcriptions, with English glosses
7.1 Hierarchy of elicitation techniques in relation to speech style variation
9.1 Kalahari Khoe doculects used in this paper
9.2 Symbols used in this paper deviating from IPA standards
9.3 Diachronic stages of click loss in Khoe
9.4 Alveolar click retention in ǁAni and Gǁana
9.5 Variation in alveolar click loss among five speakers of ǁAni
9.6 Non-click replacements of the alveolar click in Kalahari Khoe
9.7 Palatal click retention in Ts’ixa
9.8 Variation in palatal click loss among nine speakers of Ts’ixa
9.9 Non-click replacements of the palatal click in Kalahari Khoe
9.10 Assumed correspondences between ǀx’~ǀʔ in Khoekhoe and ts’ in Kalahari Khoe
9.11 Cognate forms between Kwadi and Khoe displaying click loss in Kwadi
9.12 Cognate forms between Sesfontein Damara and other Khoekhoe varieties displaying click loss in Sesfontein Damara
10.1 Reflexes of Proto-Ju *ǃǃ in Doke’s Neitsas ǃXun (C2), Juǀ’hoan, Northwestern ǃXun (W2), Mupa ǃXun and Snyman’s “A1”
10.2 Reflexes of Proto-Ju *ǂ in Juǀ’hoan, König and Heine’s (2008) Northwestern ǃXun (W2), Mupa ǃXun and Traill’s ǃXun from Menongue
10.3 Replacements for the fricated palatal click ⨎ attested in Mupa ǃXun
10.4 Non-click replacements for the alveolar click attested in Mupa ǃXun
10.5 Patterns of alveolar click loss in Mupa ǃXun, compared to Juǀ’hoan and König and Heine’s (2008) Northwestern ǃXun
10.6 Non-click replacements for a subset of dental and lateral influxes as recorded with one speaker of Mupa ǃXun
10.7 Patterns of dental click loss in Mupa ǃXun, compared to Juǀ’hoan and König and Heine’s (2008) Northwestern ǃXun (W2)
10.8 Patterns of lateral click loss in Mupa ǃXun, compared to Juǀ’hoan and König and Heine’s (2008) Northwestern ǃXun (W2)
11.1 The word stimuli and characteristics of the click sound in each word
11.2 Spectrogram analysis parameters
11.3 Click production characteristics across recordings 1 to 3 as judged perceptually
11.4 Comparison of acoustic data of the Participant with AOS and control participants
11.5 The difference between minimum and maximum acoustic measurements (range) per participant and control participants to demonstrate variability across recordings
12.1 Frequencies of different airstream and/or manner of articulation patterns observed in the participant production audio
12.2 Frequencies of different place of articulation patterns observed in the participant production audio (focusing in all cases on the anterior-most stricture)
14.1 Paralinguistic clicks in Laal and Wolof

Chapter 1

Click Consonants: an Introduction

Bonny Sands

© Koninklijke Brill NV, Leiden, 2020 | doi:10.1163/9789004424357_002

1 Introduction

Clicks are wonderful consonants. On their own, they can be used to express affection [ʘʷ] or dismay [ǀ], to give two examples from English. In other languages, clicks are produced next to vowels, in syllables and words. Clicks can be voiceless and oral, as in the English examples, or they may be produced with a range of different laryngeal, nasal and dorso-radical settings. Words with clicks can express concepts as varied as ʘqháa ‘between’, ʘxàa ‘elder half brother’, and ʘnáni-ʘnâni ‘kill an insect by rubbing it against the limb it is biting’, to give examples from ǃXóõ (Traill 1994).

Despite their reputation, click consonants are not exceptionally difficult to pronounce or borrow. Nor are they exempt from the universal pressures that can lead to sound change. They participate in phonological processes and show sociophonetic variation. The papers in this book are representative of a range of current-day linguistic approaches aimed at understanding how clicks are made, how they are perceived and acquired, and how they change.

More than twenty years ago, an important cross-linguistic survey of clicks appeared in Ladefoged and Maddieson’s (1996) book The Sounds of the World’s Languages. Since that time, we have learned more about clicks, primarily through 1) studies of previously unstudied languages and dialects, and 2) studies utilizing new techniques such as ultrasound and real-time-MRI (rt-MRI), which have enabled us to image the tongue dorsum and tongue root, as well as other structures involved in click production.

I have no doubt that we can study click consonants for another twenty years and still find that they are able to surprise us. There are still numerous languages and dialects which lack phonetic studies of their clicks. And, as these lects continue to change, their click inventories will also change. The list of “click languages” will also likely change as documentation improves.

Click articulations are extremely complex and varied and their phonetic detail is difficult to express using the small number of symbols currently approved by the International Phonetic Association (ʘ, ǀ, ǁ, ǃ, ǂ). There is a great need for more studies of click variation, particularly that which occurs in unelicited speech. A particular contrastive click consonant may be pronounced loudly or softly without changing the lexical meaning of a particular word, for instance. But this sort of variation is dependent on phonological factors and on pragmatic and sociophonetic factors, and these differ across languages and across speakers. Our ability to transcribe the full range of the phonetic variations of click pronunciations has made incredible strides in recent years. It is my hope that linguists reading this book will find the tools necessary to conduct their own studies of clicks, or to include click languages in their studies of typology, phonology and the like.

2 What Are Click Consonants?

Clicks are suction noises made with the mouth. The symbols ʘ, ǀ, ǁ, ǃ, ǂ refer to plain (i.e. voiceless, unaspirated) consonants, or a basic click type.1 To indicate other kinds of click consonants (e.g. glottalized, nasalized, voiced, etc.), the click symbol is accompanied by other symbols (e.g. ǃʔ, ŋǃ, ɡǃ, ǃχ’, ǃ’) or diacritics (e.g. ǃ̥̃, ǃ̬̃, ǃ̬).2 These so-called click accompaniments, together with the basic click type, form contrasts which vary from language to language.

2.1 Airstream Mechanism

Clicks are made through a series of articulatory gestures. First, two stop constrictions are formed, creating a pocket of air between them. The size of the pocket of air between these two stops differs according to the click type, language, and following vowel (see Proctor et al., this volume, Figures 6.3–6.4). The tongue moves to enlarge this cavity, which causes the air pressure in the cavity to drop, or be rarefied. When the front constriction is released, air rushes in, making the suction noise known as a click. The ways in which the cavity is enlarged also depend on factors such as click type, language, and adjacent vowel (Proctor et al., this volume; Miller 2008, 2010, 2016). Lionnet (this volume) describes a special case in which suction is created by the back closure being released first.
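The rarefaction step described above can be illustrated with a short calculation. The sketch below is not taken from the chapter: it assumes the sealed cavity behaves as an ideal gas at constant temperature (Boyle's law, p1·V1 = p2·V2), and both the function name `rarefied_pressure` and the cavity volumes are hypothetical values chosen only to show the direction and rough size of the effect.

```python
def rarefied_pressure(p_initial_kpa: float, v_initial_ml: float, v_final_ml: float) -> float:
    """Pressure in a sealed cavity after its volume changes.

    Assumes Boyle's law (p1 * V1 = p2 * V2), i.e. an ideal gas at constant
    temperature. This is a simplifying assumption, not a claim made in the text.
    """
    if v_initial_ml <= 0 or v_final_ml <= 0:
        raise ValueError("volumes must be positive")
    return p_initial_kpa * v_initial_ml / v_final_ml


# Hypothetical illustration: a 1.0 ml pocket of air at atmospheric pressure
# is enlarged to 1.5 ml by lowering the tongue body while both closures hold.
ATMOSPHERIC_KPA = 101.3
pressure_before_release = rarefied_pressure(ATMOSPHERIC_KPA, 1.0, 1.5)
pressure_drop = ATMOSPHERIC_KPA - pressure_before_release

print(f"cavity pressure before release: {pressure_before_release:.1f} kPa")
print(f"drop relative to outside air:   {pressure_drop:.1f} kPa")
```

Under these assumptions, any enlargement of the sealed cavity lowers the pressure inside it, so air flows inward once the front closure is released. Real cavity volumes differ by click type, language and vowel context, as the text notes.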

1 The term ‘click type’ is used rather than ‘place of articulation’ because the (alveolar) lateral click ǁ is distinguished from the (central) alveolar click ǃ by the direction of airflow rather than by the place of articulation.
2 Elderkin (1989: 22–25) notes that the use of diacritics is more consistent with IPA practices for other consonants than is the use of separate symbols. The use of other symbols is perhaps more prevalent, however, because of the relative ease of typing them and in distinguishing them in print and in handwritten notes.

A click can be made without any simultaneous airflow coming from the lungs or initiated by the glottis. This makes clicks distinct from purely pulmonic consonants such as /m, t, ɡ, f, r, tʃ/. Clicks are also distinct from implosives, e.g. /ɓ, ɗ, ɠ/, and ejectives, e.g. /p’, t’, k’, tʃ’/, otherwise known as glottalic consonants. As a class, clicks have been called velaric consonants (Catford 1977: 64), or alternatively, lingual consonants (Miller et al. 2009: 130). In the case of pulmonic consonants, the main source of airflow used to create the consonant (or airstream mechanism) comes from the lungs; the main source of noise in glottalic consonants comes from airflow created by actions of the glottis/­larynx. Since the main noise source of clicks comes from ingressive airflow created by tongue movements and not movements of the velum, the term ‘lingual’ is a better parallel to the other airstream mechanism labels than is the term ‘velaric’. The term ‘velaric’ refers to the velum, or soft palate. Click consonants are made with the back of the tongue (dorsum) pressed up against the velum. Consonants made using the back of the tongue and the velum include velar consonants such as /k, ɡ, ŋ, kx’, ɠ/, and uvular consonants such as /q, ɢ, ɴ, qχ’, ʛ/. Clicks can be made with either a velar constriction or a uvular one. Speakers of Nǀuu consistently use uvular constrictions (Miller et al. 2009), while speakers of Hadza tend to use either velar or somewhat post-velar constrictions. It is known that the position and movement of the tongue dorsum and root (and tongue shape) is dependent on click type, as Miller et al. (2009), and Miller (2016) have shown for Nǀuu and Mangetti Dune ǃXun, respectively, using quantitative ultrasound data. In this volume, Proctor et al. 
present rt-MRI data showing that the position of the post-velar constriction is dependent on click type for a speaker of Nama, with dental clicks before /ɑ/ having a velar constriction and /ǃ, ǂ, ǁ/ click types having a uvular one. The dental /ǀ/ and alveolar /ǃ/ clicks are released with a lowering of the front of the tongue body while palatal /ǂ/ and lateral /ǁ/ clicks involve retraction of the tongue body. The lateral and palatal clicks have more tongue root retraction than /ǃ/ and /ǀ/ and also have wider laminal anterior constrictions. Bradfield (2014: 4) proposed a turned small-caps K symbol /Ʞ/ to represent the basic click mechanism. Note that lower-case turned k [ʞ] is used by Lionnet (this volume) to indicate a back-released velar click.3 A turned upper-case K (Ʞ) might

3  Tuhuse and Traill (1999) describe a variant of the palatal click in ǃXóõ used in a bird imitation. The closures are the same as for a palatal click /ǂ/, but the dorsal constriction is released while the anterior constriction is not released.

4

Sands

be a better symbol to stand for any click, as this would better parallel the uses of capitalized letters V and C to stand in for any vowel or consonant, respectively.4 2.2 Click Types ʘ, ǀ, ǁ, ǃ, ǂ All clicks are made with two constrictions. The name of a click type comes from the more anterior location of the two constrictions, e.g. at the lips: bilabial [ʘ], at the teeth: dental [ǀ], or from the direction of airflow, e.g. lateral [ǁ]. There are five click types recognized by the International Phonetic Alphabet (IPA): [ʘ, ǀ, ǁ, ǃ, ǂ]. The most cross-linguistically typical phonetic realizations of these click types are shown in Table 1.1, which also indicates the anterior part of the tongue used in the clicks as well as the nature of the burst associated with the release of the anterior click constriction. Example spectra and waveforms of the different click types can be found in Fulop and Wright (this volume). To indicate click types, orthographies either use the letters ʘ, ǀ, ǁ, ǃ, ǂ, or they use the following equivalents: c (ǀ), x (ǁ), q (ǃ), tc or ç (ǂ).5 The IPA once used the click type symbols: ʇ, ʗ, ʖ (equivalent to ǀ, ǃ, ǁ) following Doke’s work on ǃXun (Doke 1923), but these were replaced after the 1989 Kiel convention (Köhler et al. 1988) with symbols introduced by Lepsius (1873) which have been in longstanding use in Khoekhoegowab (Haacke 2018: 145). Other symbols used for click types are surveyed in Breckwoldt (1972) and Haacke (1989). Doke’s symbols (1923: 144) included separate symbols for voiceless, voiced and nasal clicks, as shown in Figure 1.1(a); transliterations of these into a modern transcription table 1.1

Cross-linguistically common phonetic realizations of the five most common click types

Click type letters Place of IPA Roman Articulation

Anterior tongue Direction of Anterior burst contact airflow characteristics

ʘ ǀ ǃ ǂ ǁ

none laminal apical apicolaminal laminal

c q tc, ç x

bilabial dental (post)alveolar palatal (post)alveolar

central central central central lateral

fricated fricated abrupt abrupt fricated

4  Note that small-cap turned k is somewhat problematic as a stand-in for a generic click as it has been proposed as a stand-in for a generic consonant, and lower-case turned k has been used for other consonants as well (MacMahon 1994). 5  Boyce and Davis (1844: 4) use ‘qc’ to spell a click heard in only a few words in isiXhosa. Bleek (1862: 13) speculates that this likely refers to the palatal click ǂ. Bradfield (2014: 27) notes the use of ǂ in place of ǃ in child-directed speech in isiXhosa.

5

Click Consonants: an Introduction

figure 1.1(a) Doke’s symbols for Grootfontein ǃXun clicks Source: Doke 1923: 144; Copyright © the University of Witwatersrand, reprinted by permission of Taylor & Francis Ltd, http://www.tandfonline.com on behalf of the University of Witwatersrand

ǀ

ǂ

ǃ

ǃǃ

ǁ

ᶢǀ

ᶢǂ

ᶢǃ

ᶢǃǃ

ᶢǁ

ŋǀ

ŋǂ

ŋǃ

ŋǃǃ

ŋǁ

figure 1.1(b) Transcription equivalents of Grootfontein ǃXun click symbols Note: Doke’s labels of ‘alveolar’ and ‘palato-alveolar’ refer to clicks that are respectively labelled ‘palatal’ (ǂ) and ‘postalveolar’ (ǃ) in this work.

system is shown in Figure 1.1(b). Separate symbols enable clicks to be treated on a par with symbols for non-click consonants such as t, d, n. Beach (1938: 89) used a curly-tail diacritic to indicate nasalized clicks in Khoekhoegowab. Although a distinct click symbol is equivalent to a particular click type, it should not be assumed to represent a contrastive click consonant. In a language such as Fwe (see Gunnink, this volume), different speakers may produce the same word with a different click type. Ideally, the click type symbol with or without diacritics would be able to convey information about the articulation and its burst characteristics as listed below. But, place of articulation labels for clicks are not so straightforward, as they leave out much information such as the shape of the tongue body and the direction and speed of the release of the click, which one would need to know in order to replicate the pronunciation well enough to avoid it being mistaken for another click.


Sands

Constriction and release descriptors relevant for distinguishing click types include:
– Place of articulation: bilabial, labiodental, labial-dental, dental, alveolar, postalveolar, retroflex, palatoalveolar, alveolopalatal, palatal. Lateral passive articulator: premolar, molar teeth;
– Point of contact on tongue for the anterior click constriction: apical, apicolaminal, laminal, subapical, sublaminal, center of tongue (plus blade and/or tip);
– Release: noisy/fricated vs. abrupt; strong vs. weak (high or low amplitude compared to other sounds);
– Tongue body shape: low & flat, concave, domed;
– Tongue body movement: tongue body lowering, tongue body/root retraction (into upper vs. lower pharynx, cf. Miller 2016);
– Direction of airstream: central vs. lateral.

Even bilabial clicks have variant pronunciations that are difficult to transcribe in an unambiguous way. Although bilabial clicks are typically made with somewhat pursed or rounded lips (Traill 1985: 105), they may also be made with lips that are compressed, lying flat or drawn towards the teeth (as in the case of infant lip smacks, Diepstra et al. 2016, or as an allophone of a labialvelar stop in Ninkyob, a Nigerian language, Harley 2012: 59–62). Bilabial clicks in Nǀuu, ǃXóõ and ǂHoan are typically more fricated than those that occur as allophones of labialvelar stops. The dental diacritic may be used with the bilabial click symbol to indicate a labiodental articulation [ʘ̪], as has been described as a variant pronunciation of /ʘ/ in Eastern Nǀuu before phonemically unrounded vowels (Miller, Brugman et al. 2007: 123). Different from this is the labial-dental variant of /ʘ/ produced by some ǃXóõ speakers in some tokens, transcribed by Traill as [ʘ͡ǀ] (Traill 1985: 103–104), with simultaneous bilabial and dental constrictions. Different yet again is the bilabial bracing before a dental click produced by a speaker of Kora (du Plessis 2017: 134; 2018: 87).
Dental clicks may be interdental, apical dental, or denti-alveolar; laminal dental clicks are often made with the tongue tip braced against the lower teeth (Ladefoged and Maddieson 1996: 251). In the case of the [ǃ] and [ǂ] click types, these have been referred to in the literature with an array of place of articulation labels (cf. Ladefoged and Maddieson 1996: 248). The labels “alveolar” and “palatal” have been used for both [ǃ] and [ǂ], for instance. Phoneticians who lack first-hand experience with African click languages interpret the [ǃ] and [ǂ] click symbols with a range of different pronunciations (cf. the real time MRI IPA charts, https://sail.usc.edu/span/rtmri_ipa/index.html, Toutios et al. 2016). The IPA uses the label “(post)alveolar” for [ǃ] and “palatoalveolar” for [ǂ] but most Khoisanist linguists today


refer to [ǂ] as palatal rather than palatoalveolar (following Ladefoged and Maddieson (1996) and others). The [ǃ] is alveolar in languages such as Hadza, Nǀuu and Sandawe, but postalveolar in languages such as Juǀ’hoan, isiZulu and Khoekhoegowab. As can be seen in Figures 6.3–6.4 of Proctor et al. (this volume), the [ǂ] click in Nama (Khoekhoegowab) is produced with a broad anterior constriction involving the center of the tongue and the palate, much like palatal consonants such as [c, ɟ]. The term palatoalveolar implies the use of the tongue blade rather than the center of the tongue along with the tip and blade. There is consistency in which words are transcribed by Khoisanists with each symbol: [ǃ] is used for a darker, more hollow-sounding click and [ǂ] for a click with a brighter, more high-pitched sound. Of course, a true palatoalveolar click can also be made, as shown in Figure 12.9 of Moisik and Dediu (this volume), of a click produced by a person who does not speak a click language; this articulation gives an acoustic effect closer to [ǃ] than to [ǂ] as typically pronounced in languages which have both [ǃ] and [ǂ] sounds (such as Juǀ’hoan, Nǀuu, ǂHoan and Nama). The confusion over the precise labels for some click types is due to several factors. First, there are phonetic differences across languages, speakers and individual tokens that are not adequately captured by just five symbols. Many accounts simply reuse labels from the IPA or other sources and do not provide independent phonetic evaluations of how the clicks are made. Second, the choice of a click type symbol has sometimes been made based on the location of constriction on the roof of the mouth without also considering the part of the tongue touching the roof of the mouth, the tongue body shape or the positioning of the tongue root – differences that are phonologically important (cf. Miller, Namaseb and Iskarous 2007). 
Third, the place of articulation of a click may change during the production of the click, particularly just at or just before the point of release of the constriction. The point of release is what determines the characteristic sound of the consonant, and so should be preferred in the choice of label over the place of constriction during the (silent) closure portion of the click. The positioning and shape of the tongue body and tongue root may also differ dynamically across the production of the click, affecting the acoustic profile of the click type. Finally, although some clicks (/ʘ, ǀ, ǁ/) are typically released more slowly, with frication, while others (/ǃ, ǂ/) typically have an abrupt release, it is possible to produce abruptly released bilabial, dental and lateral clicks and it is possible to produce alveolar and palatal clicks with fricated releases. For instance, a fricated palatal click release is described as occurring before /i/ vowels in Mangetti Dune ǃXun (Miller 2019). A fricated palatal click also occurs in Yeyi (Fulop et al. 2003; Fulop and Wright, this volume).


There is no established way of transcribing clicks which have atypical frication patterns for their anterior place of articulation. For instance, a low amplitude fricated click may be produced on the alveolar ridge, but this is quite unlike a typical production of /ǃ/ in languages which contrast two or more click types; though technically alveolar, it may sound more like a typical dental click (due to the fricated release), or like an allophone of [ǂ] (due to the high frequency noise in the burst). For some speakers of some languages, it is possible to realize a click phoneme without noise due to suction; this occurs when the dorsal click closure is released before the anterior closure (see Proctor et al. Figure 6.3 (this volume)).

A coronal sound is one made with the front part of the tongue. The acoustic effect of the constriction is not just dependent on the part of the roof of the mouth that is touched with the tongue, but also on the part of the tongue used (tip, blade, center, underside of tip/blade), and the shape and height of the tongue behind the constriction. There is typically a correlation between apicality/laminality and the tongue body shape. For clicks, the tongue body shape is phonologically significant (Miller 2010), but the IPA symbols do not currently encode tongue body shape for clicks the way that symbols for fricatives do. In practice, Khoisanists use [ǃ] for clicks made with a concave tongue body shape. The sublingual space (between the front of the tongue and the floor of the mouth) acts as a resonating chamber, giving the click a characteristic hollow sound, or relatively low frequency burst (cf. Stevens 1998: 121–124). The [ǂ] symbol is used by Khoisanists for laminal or apico-laminal clicks made with a very high, domed tongue body, much like that seen in palatal consonants e.g. [c, ç].
This is unlike the grooved tongue body shape typical for [ʃ] and [ɕ], which might be implied for some by the IPA use of the label palatoalveolar for [ǂ].

2.3 Additional Click Types

2.3.1 Transcription Issues Related to Additional Click Types

Since no click language has been described with more than 5 contrastive click types, it may not be immediately obvious why 5 symbols are inadequate to describe the phonetic range of click types seen cross-linguistically. But, phonological accounts are increasingly making use of phonetic details, and the IPA symbols are not easily modified to transcribe significant phonetic details. Ignoring the bilabial and lateral clicks for the moment, we are left with three click symbols: ⟨ǀ, ǃ, ǂ⟩. The main IPA consonant chart distinguishes 5 places of articulation for consonants made with the tip/blade or center of the tongue: dental, alveolar, postalveolar, retroflex and palatal. In the literature, clicks have


been described with each of these places of articulation, including one language, Grootfontein ǃXun, with four of the five being contrastive (Doke 1923). Fricatives are given unique symbols for (voiced and voiceless) consonants at each of the 5 places of articulation: θ, ð; s, z; ʃ, ʒ; ʂ, ʐ; ç, ʝ. Lateral alveolar fricatives ɬ, ɮ and alveolo-palatal fricatives ɕ, ʑ also have their own symbols, meaning that there are 7 sets of fricative symbols in the IPA6 corresponding to just 4 coronal click symbols ǀ, ǃ, ǂ, ǁ, even though the coronal clicks can be made in comparable ways. The set of diacritics which can be used on plosives t, d, n to specify whether the consonant is dental (t̪, d̪, n̪), or advanced or retracted do not easily fit under most click symbols, particularly if another diacritic under the consonant is already being used (such as C̬ to indicate voicing). Diacritics may be used to indicate a voiceless palatal lateral fricative [ʎ̝̊] that contrasts with an alveolar lateral [ɬ] (as occurs in Dahalo), but it is unclear how to best indicate a similar articulatory difference in apical vs. laminal lateral clicks or those that have a more forward lateral release (as in Khoekhoe) vs. a more back lateral release (as in Sandawe), respectively. If diacritics are used to indicate voicing as well as to indicate a more retracted or a more forward articulation, the space under the click symbol may become quite cluttered. Retroflex pulmonic consonants used to be indicated in the IPA with a dot underneath the primary symbol but are now indicated with a retroflex hook, e.g. ⟨ʈ, ɖ⟩. The (post)alveolar click symbol ⟨ǃ⟩ already comes supplied with what was historically the retroflex dot added to the dental click symbol. To indicate a true (sublaminal or subapical) retroflex click as has been documented by Doke (1923; 1926: 301–302), the symbol ⟨ǃǃ⟩ has increasingly been used (e.g. Sands 2010; Starostin 2018).
However, this symbol has also been used for the fricated reflex of the Proto-Ju *ǂ palatal click in Ekoka ǃXun (the W2 lect of ǃXun spoken in Eenhana District, northern Namibia) (König and Heine 2008: 5). A retroflex hook on the dot on the symbol for a (post)alveolar click ⟨ǃ̢⟩ would be a sanctioned IPA transcription but it is not as distinctive looking as ⟨ǃǃ⟩. Because a palatal click is made with a constriction involving the tongue blade (with or without the tongue tip) and the center of the tongue, the noise from a fricated release might occur in the alveolar region or further back (cf. Miller and Holliday 2014; Miller and Elsner 2017). I propose the symbol ⟨⨎⟩ for the fricated palatal click in W2 ǃXun as I hear a burst from a constriction released behind the postalveolar region, involving the blade/center of the tongue, as Heine and König (personal communication) do. The anterior

6  Note that there are additional symbols for lateral fricatives in the [2015] revised extIPA chart (Ball et al. 2018).


click constriction is released along the midline of the tongue (i.e. with central, rather than lateral airflow), as the tongue moves from a domed shape to a grooved shape. The symbol ⟨⨎⟩ was initially proposed by Beach (1938: 77) to be the transcriptional equivalent of Nama orthographic ǂ. Miller et al. (2011) have proposed the transcription [ǃ͡s] for the contrastive Ekoka ǃXun fricated postalveolar click type (< *ǂ), as it is sometimes made with an apical constriction. More research needs to be done to understand how best to characterize the fricated palatals of Ekoka ǃXun and Yeyi.

2.3.2 The Fricated Palatal /⨎/ Click Type

The /⨎/ click type in Ekoka ǃXun has a similar burst duration and intensity to /ǁ/, as shown in Figures 1.2–1.3. The /⨎/ and /ǁ/ click types differ in the spectral characteristics of their bursts, with the former having a lower frequency emphasis and the latter having a higher frequency emphasis in the 0–4000 Hz range, as shown in Figure 1.4. The fricated palatal /⨎/ has more energy below 2000 Hz than does /ǁ/, particularly in the F1/F2 region. It also has a low F3 locus, as also seen in retracted rhotics. The center of gravity of this click burst, however, is high at 4527 Hz, like other palatal clicks (Fulop and Wright, this volume). The lateral click burst has a center of gravity at 2974 Hz. The fricated ǂ-click in Yeyi, as shown in Fulop and Wright (this volume), has a diffuse spectrum with high frequency energy, suggesting that the frication takes place on the alveolar ridge while the tongue body is still high. This is quite different from the /⨎/ click type of Ekoka ǃXun.

2.3.3 The Retroflex /ǃǃ/ Click Type

Retroflex clicks were documented in Central ǃXun, near Grootfontein, Namibia, by Doke (1923). He describes the click as being made with the underside of the tongue tip pressed against the hard palate, and released backwards, as shown in Figure 1.5. This creates a click that is not “instantaneous” (Doke 1923: 148), i.e.
somewhat fricated. Although Doke was a well-respected phonetician, the fact that researchers working on other languages, including the closely related Juǀ’hoan language, did not find retroflex clicks, led Ladefoged and Maddieson (1996: 253) to conclude that the retroflex pronunciation was a variant of the /ǃ/ click type. However, alveolar /ǃ/ clicks do contrast with postalveolar apical or subapical /ǃǃ/ clicks in the speech of some ǃXun living in Grootfontein (Miller et al. 2009), and can be reconstructed for Proto-Ju (Sands 2010; Starostin 2018). The Grootfontein ǃXun alveolar click type /ǃ/ Amanda Miller and I recorded in 2008 is abrupt, while the retroflex or postalveolar click /ǃǃ/ is fricated, as shown in Figures 1.6–1.7, which illustrate the minimal pair: /ᶢǃűː/ ‘belly’ vs. /ᶢǃǃűː/ ‘water’. Retroflex clicks have a darker sounding burst than do fricated palatal or lateral clicks. The retroflex click is distinct from a [ǃ¡] plopped click


Figure 1.2 Ekoka ǃXun fricated palatal click /⨎/ and following breathy vowel, in the word /⨎ȁ̤hà/ ‘cut’ Source: Recorded by Johanna Brugman in 2006


Figure 1.3 Ekoka ǃXun lateral click /ǁ/ and following vowel, in the word /ǁa̋è/ ‘monkey’ Source: Recorded by Johanna Brugman in 2006


Figure 1.4 Burst spectra of the Ekoka ǃXun [⨎] (black) and [ǁ] (red) click tokens in Figures 1.2–1.3
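The center-of-gravity values cited for these bursts are a summary statistic of the burst spectrum. As a rough illustration of the idea, the sketch below computes a power-weighted mean frequency from a short sampled signal. It is an assumption-laden example, not the procedure used for the measurements reported here: the function names are hypothetical, the weighting by power (rather than by amplitude) is only one common convention, and no windowing or pre-emphasis is applied.

```python
import cmath
import math


def power_spectrum(samples, sample_rate):
    """Return (frequencies, powers) for DFT bins 0..N/2 of a real signal.

    A naive O(N^2) discrete Fourier transform is used so that the example
    needs only the standard library; it is suitable for short excerpts only.
    """
    n = len(samples)
    freqs, powers = [], []
    for k in range(n // 2 + 1):
        acc = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(samples)
        )
        freqs.append(k * sample_rate / n)
        powers.append(abs(acc) ** 2)
    return freqs, powers


def center_of_gravity(samples, sample_rate):
    """Power-weighted mean frequency (Hz) of the signal's spectrum."""
    freqs, powers = power_spectrum(samples, sample_rate)
    total = sum(powers)
    if total == 0:
        raise ValueError("silent signal has no defined center of gravity")
    return sum(f * p for f, p in zip(freqs, powers)) / total


if __name__ == "__main__":
    rate = 8000  # Hz, illustrative sampling rate
    # Synthetic test signal: equal-amplitude components at 1000 and 3000 Hz.
    signal = [
        math.sin(2 * math.pi * 1000 * i / rate)
        + math.sin(2 * math.pi * 3000 * i / rate)
        for i in range(256)
    ]
    print(round(center_of_gravity(signal, rate)))
```

On this measure, a burst with most of its energy at high frequencies yields a high center of gravity even if it also has a noticeable low-frequency component, which is the pattern described for the fricated palatal click.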

Figure 1.5 Midsagittal diagram of a retroflex click production in Grootfontein ǃXun Source: Doke 1923: 148; Copyright © the University of Witwatersrand, reprinted by permission of Taylor & Francis Ltd, http://www.tandfonline.com on behalf of the University of Witwatersrand


Figure 1.6 Alveolar click (and following vowel) in the word /ᶢǃűː/ ‘belly’ pronounced by Mishe ǁXae Nǀami of Grootfontein, Namibia


Figure 1.7 Retroflex (fricated postalveolar) click (and following vowel) in the word /ᶢǃǃűː/ ‘water’ pronounced by Mishe ǁXae Nǀami of Grootfontein, Namibia


described as an allophone of /ǃ/ in Neitsas ǃXun (Doke 1923) and Sandawe (Maddieson et al. 1999) in which the tongue tip hits the floor of the mouth.7 Palatography photos (linguograms and palatogram) of a retroflex click produced by Grootfontein ǃXun speaker Kaqe Useb in the word /ᶢǃǃűː/ ‘water’ (Scott et al. 2010) are shown in Figure 1.8. These images show that the retroflex click is produced by making contact with the underside of the tongue blade as well as with the top of the tongue blade. The methodology of palatography is described in detail by Ladefoged (2003) and Anderson (2008). The linguogram in Figure 1.8(a) shows a laminal constriction with the upper side of the tongue blade, while 1.8(b) shows contact with the underside of the tongue.8 This contact pattern could be labelled as laminal-sublaminal, as it involves contact with the tongue tip and blade as well as with the underside of the tongue blade. The linguogram in 1.8(a) shows a U-shape constriction typical of coronal consonants (actually, an O-shape, given the dorsal constriction barely visible at the top of the image). The palatogram in 1.8(c) does not have a clean U-shape, but rather, shows an irregular shape extending from the start of the mid-palatal region to the postalveolar region (cf. Butcher 1995: 18), leaving an untouched space well behind the teeth; similar irregular patterns are seen in palatograms of retroflex consonants in Gujarati (Dave 1977). The streaks of palatography paint in the palatogram in 1.8(c) are thinner and more spread apart in the postalveolar region than they are in the mid-palatal region, suggesting a forward motion of the underside of the tongue. It is known that there is some variability across speakers and across languages in the production of non-click retroflex consonants (cf.
Ladefoged and Maddieson 1996; Dart and Nihalani 1999; Hamann 2003).9 For instance, Dart and Nihalani’s 1999 study of pulmonic retroflex consonants in Malayalam found either sublaminal (involving only the underblade) or apicosublaminal (involving both the tongue tip and underblade) contact patterns for non-click retroflex consonants (Dart and Nihalani 1999: 133). Although Doke (1923: 148) describes the retroflex click as being released with a backward movement

7  Plopped clicks have also been described as occurring in Hadza (Maddieson et al. 1999) but at least some instances of double bursts interpreted as plopped clicks in this language may instead be due to a weak anterior click burst followed by an audible dorsal burst.
8  The contact with the underside of the tongue might arguably be called subapical rather than sublaminal. However, given the difficulty of making palatography paint (a mixture of charcoal powder and oil) adhere to the relatively moist underside of the tongue, and due to indications that the location of the constriction moves, I think it more likely that the constriction involves a relatively broad portion of the underside of the tongue best labelled “sublaminal”.
9  Retroflex consonants also show variability due to phonological factors such as different prosodic environments and vowel contexts (Hussain et al. 2017; Tabain 2019).


Figure 1.8 Palatography of Grootfontein ǃXun retroflex click in the word /ᶢǃǃűː/ ‘water’, as pronounced by Kaqe Useb Note: (a) linguogram showing anterior and dorsal contact on the upper part of the tongue; (b) linguogram showing contact on the underside of the tongue; (c) palatogram showing anterior and dorsal contact on the palate. Source: Scott et al. 2010

of the tongue tip, the palatogram in Figure 1.8(c) appears to show a forward movement of the anterior click closure. There is little cross-linguistic laboratory phonetic data to indicate the variability of the direction of the release of the constriction of retroflex consonants but it is physically possible to produce retroflex clicks with the tongue-tip being released in either a forward (anterior) or a backward (posterior) direction. Certainly, retroflex flaps are produced with a forward motion of the tongue, and Kannada speakers in one ultrasound and EMA study were found to have voiceless retroflex stop constrictions that were released with a forward movement (Kochetov et al. 2014: 177). Figure 1.9 shows high-speed ultrasound traces produced by Amanda Miller and Abigail Scott (Scott et al. 2010; Miller and Scott 2011; Miller 2011) of a retroflex click in Grootfontein ǃXun in the word ǃǃhá [ǃǃʰá] ‘meat’. Each frame shows a different point in time in the production of the retroflex click. The timing of each frame is noted in ±ms relative to t = 0, the time of the frame immediately before the release of the anterior click closure. The release of the anterior click closure is shown in frame 137, at the bottom of the figure, which was taken 8.8 ms after frame 136 (t = 0). Note that the front of the mouth in the ultrasound images is to the right, and not to the left, as is customary for midsagittal diagrams such as Figure 1.5. For discussion of the methodology used for these ultrasound images, see Miller and Finch (2011) and Miller (2016). The images include a palate trace (in red) and a trace showing the contour of the


tongue (in black). The palate trace was taken at the beginning of the recording session, with the speaker swallowing water; meaning that the position of the palate relative to the tongue body is estimated and does not reflect possible movements of the lower portion of the velum (velic traverse) and uvula relative to the palate (cf. Gick et al. 2013). In order for there to be a click, there must have been a closure involving the tongue dorsum. This can be presumed by the raised tongue dorsum seen in the left sides of the frames in Figure 1.9 (but which is not shown by contact between the black and red traces). Also necessary for click production is the presence of a cavity (between front and back click closures) which is expanded, thus lowering the air pressure in the cavity so air will rush in when the anterior closure is released. This cavity expansion can be seen in Figure 1.9 frames 133–136, which have a lower center of the tongue and a more retracted tongue root than in preceding frames. The top four frames in Figure 1.9 show the center of the tongue beginning to lower, eventually forming a concave shape behind the anterior constriction in the following 5 frames. The blade of the tongue is noticeably raised in frame 130, and it continues to rise and retract in frames 132 and 133. The position of the front constriction can be seen relative to a dotted vertical line at the 35 cm mark, which marks the most anterior point of constriction during the click. The position of the anterior constriction changes rapidly in the four frames leading up to the click burst. Frames 132–133 show a retracted constriction. Frame 135 shows a more anterior constriction, followed by Frame 136 which again has a more retracted constriction. From these ultrasound traces, I infer that contact is first made with the underside of the tongue tip (frame 132), then with the underside of the blade (frame 133). 
As the click cavity expands, the anterior constriction moves forward (frame 135), either just before or as the tongue tip unrolls. Then, a (laminal) constriction is made with the upper side of the tongue (frame 136) before the release of the click (frame 137). The position of the tongue in Frame 137 shows that the final release would have involved a forward, downward movement of the tongue. We do not yet have evidence as to how the articulation of retroflex clicks may vary across speakers, in different vowel contexts, accompaniments, etc., although such variation very likely occurs as it does for other retroflex consonants. Similarly, we know relatively little about the diversity of ǃXun lects spoken in Grootfontein, Namibia. According to Nedvěd (2016), ǃXun in the Grootfontein area include the ǃXunigǃu ǃXun (Makalani people), ǃU ǃXun (Dune people), Kxabinǃa ǃXun (Tamboti people, or Tsintsabis ǃXun), and Sarisa ǃXun. Since Levi Namaseb and I recorded several ǃXun speakers in Grootfontein in 2010 who did not contrast /ǃ/ and /ǃǃ/, it seems likely that they might


Figure 1.9 High-speed ultrasound traces of a Grootfontein ǃXun retroflex click in the word ǃǃhá [ǃǃʰá] ‘meat’, said by ǀXai Tsubeb Notes: The x-axes show horizontal position (in cm) from back to front, with the tongue root towards the lefthand side of each trace, and the front of the mouth towards the right. Y-axes show vertical position (in cm). Black traces show tongue position; red traces show the speaker’s palate (pal). The dotted vertical line indicates the position of the anterior-most constriction (dist), or distance = 35 cm. Selected frames shown here are numbered in sequence with a frame number (fr) and marked with an indication of its time (t) relative to the frame immediately preceding the release of the click burst (f = 136, t = 0). Source: Images produced by Amanda Miller and Abigail Scott (Scott et al. 2010; Miller and Scott 2011; Miller 2011)


identify with a different group than those Amanda Miller and I recorded in 2008. Unfortunately, we were not familiar with Nedvěd’s findings at the time of our fieldwork and do not know which of these groups the speakers we worked with identify with.

2.3.4 The Forward-Released Lateral /ǀǀǀ/ Click Type

An additional click type that has been described is a forward-released lateral [ǀǀǀ], documented by Miller-Ockhuizen and Sands (2000) in the speech of one speaker of Mangetti Dune ǃXun who lived in a Juǀ’hoan-speaking area at the time. She produced this click only in words corresponding to Proto-Ju retroflex clicks (cf. Sands 2010; Starostin 2003). We did not hear this click type when recording ǃXun speakers in Mangetti Dune in 2008 (cf. Miller 2016), so it may be a stylistic variant used as a way to emphasize the differences between Juǀ’hoan and Mangetti Dune ǃXun. There is additional support for this click type from Vedder (1910–1911) who described a second lateral /ǀǀǀ/10 contrastive with /ǀ, ǁ, ǂ, ǃ/ in the ǃXun spoken at Gaub farm near Tsumeb, Namibia. This click type occurs only in words with reflexes of a Proto-Ju retroflex click *ǃǃ. While four of the click types (*ǀ, *ǁ, *ǂ, *ǃ) reconstructed for Proto-Ju are generally the same across Ju lects, the retroflex click type has a number of different regular reflexes (Sands 2010; Starostin 2003), as illustrated in Table 1.2. Further support for this click type comes from Doke (1923: 164–165) who notes a “peculiar dental click” made by a Haiǁom speaker named Old Kuɺu when speaking words of “Mixed Haiǁom”, which seemingly reflects a language originally used by Haiǁom that is distinct from both Neitsas/Nurugas ǃXun and from Khoekhoegowab. This click was made with the tongue tip being “moved forward to release, the resulting sound seeming to be acoustically mid-way between [ǀ] (normal dental) and [ǁ] (lateral)”, and occurred in place of an expected /ǃ/ click; /ɡǃɑ̀n/ ‘chin’ is given as one such word.
Given the similarity of Khoekhoe ǃɡȁń-s ‘chin’ and the words for ‘chin’ in Ju languages (as shown in Table 1.2) which may be reconstructed with a retroflex click *ɡǃǃ, it appears that Old Kuɺu may have used a ǃXun loan (or retention) similar to the form recorded by Vedder. The various reflexes of the Proto-Ju retroflex click type suggest that the retroflex click is less stable than are the dental, lateral, (post)alveolar and palatal click types. The change from a fricated, forward-released retroflex click to a forward-released fricated click with lateral airflow may be related to ease of articulation, as it is perhaps easier to create a seal necessary to produce a click with the upper side of the tongue than with the underblade. A subsequent

10  Essing (1964) follows Vedder (1910–1911) in using /// for a second type of lateral click but Anders (1934–1935, 1937) uses /// for a retroflex click. The symbol ⟨⫻⟩ "triple solidus binary relation" may also be used for this click.


Table 1.2

Words showing regular reflexes of the Proto-Ju retroflex click *ǃǃ type

Gloss | Gaub ǃXun (Vedder 1910/11) | Neitsas/Nurugas ǃXun (Doke 1923) | Okongo ǃXun (Heikkinen 1986) | Juǀ’hoan (Snyman 1975)
‘to die’ (sg) | ǀǀǀɡē | ǃǃɑ́ �ɪ ́ | ǁé | ǃái
‘man, male’ (sg) | ǀǀǀɡōˤ | – | ɡǁȍˤ | -ɡǃōˤ
‘chin’ | ǀǀǀɡyan | ǃǃɑ̃ ‘chin tip’; ɡǃǃàŋ ‘chin’ (Leeunes ǃXun) (Snyman 1997) | ɡǁȁŋ | ɡǃā̃i

Note: Original transcriptions using non-IPA symbols have been modified.
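The regular correspondences in Table 1.2 can be checked mechanically against transcribed forms. The sketch below is illustrative only: the lect keys, the `REFLEXES` mapping and both function names are hypothetical, one reflex per lect is assumed on the basis of the table alone, and the click is identified simply as the first run of click letters in the word.

```python
# Reflex of Proto-Ju *ǃǃ in each lect, as illustrated in Table 1.2.
# The keys are hypothetical short labels, not established lect codes.
REFLEXES = {
    "gaub": "ǀǀǀ",             # Gaub ǃXun (Vedder 1910/11)
    "neitsas_nurugas": "ǃǃ",   # Neitsas/Nurugas ǃXun (Doke 1923)
    "okongo": "ǁ",             # Okongo ǃXun (Heikkinen 1986)
    "juhoan": "ǃ",             # Juǀ’hoan (Snyman 1975)
}

CLICK_LETTERS = "ʘǀǁǃǂ"


def click_sequence(word):
    """Return the first unbroken run of click letters in a transcription."""
    run = ""
    for ch in word:
        if ch in CLICK_LETTERS:
            run += ch
        elif run:
            break  # the run has ended; ignore any later click letters
    return run


def shows_expected_reflex(lect, word):
    """True if the word's click matches the reflex listed for the lect."""
    return click_sequence(word) == REFLEXES[lect]
```

For instance, `shows_expected_reflex("okongo", "ɡǁȍˤ")` is true, whereas a Juǀ’hoan form beginning with a lateral click would not match the reflex listed for that lect.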

merger of /ǀǀǀ/ and /ǁ/ click types would then be attributed to acoustic similarity. Articulatory variability of retroflex consonants would lead us to expect that some speakers might produce a retroflex click without sublaminal contact. This would produce a non-fricated click, acoustically very similar to a postalveolar /ǃ/ click. Although the contrast in Grootfontein ǃXun is between an alveolar /ǃ/ and retroflex /ǃǃ/, other ǃXun lects have a postalveolar articulation of /ǃ/. Postalveolar articulations of /ǃ/ might be due in part to language contact or articulatory ease, or, more likely, to an enhancement of the acoustic characteristics of alveolar /ǃ/ to increase its distinctiveness from palatal /ǂ/.

2.3.5 Towards a Complete Description of Click Type Variants

To add to the variety of click types already mentioned, the “same” click type can differ from token to token in intensity. The difference between a very weak click and a strong click can be quite striking, which can present difficulties for the automatic recognition of click types (Miller and Elsner 2017). I have noticed differences in click amplitude across speakers, across speech contexts, and across languages (e.g. isiXhosa and ǃXóõ speakers having louder bursts than Hadza speakers). Miller and Elsner (2017) have also found effects of speaker and speech context; for instance, clicks in natural speech are often weaker than those in careful, elicited tokens. Marquard et al. (2015) have shown differences in click amplitude across different prosodic positions. Generally, [ǃ, ǁ, ǂ] have greater intensity than [ǀ] and [ʘ], which have greater intensity than [t, k, q] (Traill 1995; 1997: 110), but I have encountered many instances where clicks have less intense bursts than pulmonic stops. A click phoneme may even be produced without an audible click burst, i.e. as a doubly articulated stop.


Laboratory phonetic methods such as rtMRI, EPG and ultrasound reveal how the place of articulation of clicks changes during the course of click production, but rtMRI and ultrasound also reveal how the shape and position of the front of the tongue, the tongue body and tongue root change during the course of the click. We are learning that a prototypical set of 5 click types is not sufficient to capture the range of variation seen crosslinguistically, and across phonetic environments and communicative events. Understanding the range of phonetic variation of each of the click types should help us better understand diachronic processes of click type change and loss. A complete description of a click type would include information about both the anterior and posterior articulations of a click and how both change over the course of the click. The anterior click constriction has both active (lower lip, top or bottom of the tongue tip, blade, and/or center of the tongue), and passive articulators (upper lip, upper or lower teeth, alveolar ridge, postalveolar region, palatal region) that should be considered in addition to the shape of the center and back of the tongue and the tongue root. The concept of click type also includes information about the direction of airflow (central vs. lateral). Passive articulators of lateral clicks may include pre-molar and/or molar teeth. Sometimes suction may involve the inner cheek wall as well. The buccal cavity acts as a resonating chamber for lateral clicks. The loudness of a click burst will depend on the size of the click cavity, which varies across click types. Within click types, differences in loudness appear to be conditioned by prosodic, sociophonetic, and speaker-dependent factors.

2.4 Overview of Click Accompaniments

A complete description of a click consonant includes not only the phonetic characteristics of the click type (place of anterior constriction location, central vs.
lateral direction of airflow), but also gestures involving additional articulators commonly referred to as accompaniments. The term “accompaniments” does not describe a natural class of speech gestures but refers to various laryngeal, nasal, dorsal and other gestures that may occur before, during, or after the release of the anterior click closure. Non-clicks might also be described as having accompaniments, and clicks might instead be described without use of the term (cf. Miller, Brugman et al. 2007), but I find it to be a useful concept in helping navigate through the variety of click transcriptions and descriptions one encounters in the literature. Previous crosslinguistic surveys of click accompaniments include: Ladefoged and Traill (1994), Ladefoged and Maddieson (1996). Phonation types of vowels following clicks interact with click accompaniments (Miller-Ockhuizen 2003; Miller 2010; Elderkin 2014; Garellek 2020), so a thorough description of an

Click Consonants: an Introduction


accompaniment would also consider these interactions. The dividing line between click accompaniment and vowel feature is a non-trivial matter in some cases. For instance, in narrow phonetic transcriptions, I have sometimes debated whether a syllable such as [ǃɑʢ] with an epiglottalized vowel might not be better considered as containing a click with an epiglottalized accompaniment [ǃʜɑ].11 Accompaniments are important for linguistic descriptions because the number of click consonants (or click clusters) in the inventory of a language depends on the number of accompaniments. There is a great deal of variety in how equivalent click accompaniments are spelled and transcribed across languages. Some of this variation is due to different phonological analyses. For instance, the choice between the transcriptions /ǃʷa/, /ǃwa/, /ǃua/ or /ǃoa/ is primarily one of phonological analysis; whether the click is interpreted as having a labiovelarized accompaniment (/ǃʷa/), as being part of a consonant cluster (/ǃwa/), or as preceding a diphthong (/ǃua/, /ǃoa/). Labiovelarized clicks occur in Dahalo and in some Bantu languages but some accounts treat them as consonant clusters rather than as distinct consonants. As discussed in Bennett (this volume), the choice of a unitary vs. cluster analysis of click accompaniments should be determined based on the patterning of sounds in the language in question and not on a cross-linguistic typology. 2.5 Laryngeal Accompaniments The range of laryngeal contrasts that occur with non-click plosives can also be produced with clicks. Even for fairly straightforward laryngeal accompaniments, there have been a range of transcriptions used. For instance, plain unaspirated clicks may be transcribed with a solitary click symbol, e.g. [ǃ], or with the click symbol preceded by a velar stop [kǃ] or with a superscript [ᵏǃ] to indicate a voiceless dorsal closure (cf. Ladefoged and Maddieson 1996). 
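The dependence of inventory size on the number of accompaniments can be illustrated with a small enumeration. This is only a sketch under a strong simplifying assumption, namely that every click type combines with every accompaniment; attested inventories often have gaps. The toy sets below are hypothetical and do not describe any particular language.

```python
from itertools import product


def click_inventory(click_types, accompaniments):
    """Cross every click type with every accompaniment (assumes no gaps)."""
    return [click + accompaniment
            for click, accompaniment in product(click_types, accompaniments)]


# Hypothetical toy sets: four click types; plain, aspirated, glottalized.
types = ["ǀ", "ǃ", "ǂ", "ǁ"]
accompaniments = ["", "ʰ", "ˀ"]
inventory = click_inventory(types, accompaniments)
print(len(inventory), inventory)
```

Under this assumption, adding a single accompaniment to the toy system adds four click consonants, one per click type.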
Voiceless aspirated clicks may be transcribed with either an inline ([ǃh]) or superscript ([ǃʰ]) ‘h’. The inline form is more common, but the superscript makes it clearer that the sequence of symbols transcribes a single consonant and not a sequence of consonants. Voiced clicks have been transcribed with a voiced velar stop symbol before ([ɡǃ, ᶢǃ]) or after the click ([ǃɡ]), or with a diacritic to indicate voicing ([ǃ̬]). Pre-voicing is contrastive in ǃXóõ and Juǀ’hoan (e.g. /ɡǃh/ vs. /ǃh/, /ɡǃχ’/ vs. /ǃχ’/ (Snyman 1978)). Voicing of clicks may be modal, breathy or slack. Modal voicing appears to be the most common cross-linguistically. In isiXhosa, the series of clicks spelled gc, gq, gx that is often labelled “voiced” (e.g. Roux 2007) is actually devoiced in most environments (Jessen 2002). Maphalala et al. (2014) transcribed them as [ǀɡ̊ , ǃɡ̊ , ǁɡ̊ ]. They have also been described as slack voiced (Jessen and 11  Also consider Figure 7.19 in which a click with a glottalized uvular release might instead be considered to be a click followed by a vowel with interrupted phonation.


Roux 2002) and as breathy voiced (Ladefoged and Maddieson 1996: 260). The role of “voiced” clicks as tone depressors in Nguni languages such as isiXhosa suggests that larynx height or vocal fold tension are potentially important phonetic characteristics of this accompaniment type. Voiced clicks in Sandawe are relatively rare and some speakers have replaced them with voiceless aspirated clicks (Hunziker et al. 2008: 19). Glottalized clicks are very common cross-linguistically, occurring in all non-Bantu click languages. The glottal click accompaniment refers to a click with glottal constriction that is released after the anterior click release, e.g. [ǃˀ] or [ǃʔ]. Glottalized clicks are often produced with voiceless nasal airflow before and during the click closure (Ladefoged and Traill 1984), which may become voiced if preceded by a vowel (Elderkin 1992; Brugman 2009). Thus, glottalized clicks may also be transcribed with diacritics for nasal airflow [ᵑǃ̊ˀ], [ŋ̊ǃʔ] or [ǃ̃ˀ̥]. The duration and degree of glottal constriction vary across tokens, from something like a full stop to more of a tap or even an approximant. Nasal airflow provides a helpful additional acoustic cue to the glottal click accompaniment. Preglottalized voiced nasal clicks occur in ǂHoan/Nǃaqriaxe (Bell and Collins 2001; Gerlach 2016) and ǃXóõ (Traill 1985), where they contrast with both glottalized clicks and voiced nasal clicks. Glottalized clicks have commonly been transcribed with an apostrophe, e.g. [ǃ’], particularly in languages such as Hadza and Sandawe which lack a contrast between glottalized and ejected clicks, i.e. /ǃʔ ~ ǃˀ/ vs. /ǃ’/.12 The apostrophe is more convenient to type than the glottal stop symbol and it has widely been used to transcribe glottalized (non-ejective) clicks, despite its stated use in the IPA to indicate ejectives. The apostrophe is also used in many orthographies; the Naro orthography uses an apostrophe to indicate glottalized clicks (e.g.
q’) (but note that the Khoekhoegowab orthography uses the plain click symbols to represent glottalized clicks (Haacke 2001)). Research has shown that a few languages such as Gǀui (Nakagawa 2006) and Nǃaqriaxe (Gerlach 2016) do have a contrast between ejective (e.g. /ǃ’/) and glottalized (e.g. /ǃˀ/, or /ǃʔ/) clicks. Minimal and near-minimal pairs illustrating the contrast between ejected and glottalized clicks in Gǀui are given in Table 1.3. One speaker of Nǃaqriaxe is also said to contrast a voiced (presumably pre-voiced) ejective click (Gerlach 2016: 140). Furthermore, ejected and (nonnasal) glottalized variants of the plain click series in isiXhosa have been argued to occur (e.g. Jessen 2002), paralleling the ejected and glottalized variants of the plain stops in that language. 12  The transcription ⟨ǃʔ⟩ implies an analysis of a cluster of click plus glottal stop, while the transcription ⟨ǃˀ⟩ implies a single consonant.

Table 1.3 Minimal and near-minimal pairs illustrating ejective and glottalized clicks in Gǀui

Gǀui        English
/ǀ’áò/      ‘to bend’ (twigs, etc.)
/ǀʔáò/      ‘blood’
/ǃ’ã́ã̄/      ‘to have a good taste and flavor of giraffe meat’
/ǃʔã́ã̄/      ‘to know’
/ǂ’áà/      ‘to faint, fall unconscious, play dead’
/ǂʔáà/      ‘sp. grass used for an arrow shaft’
/ǁ’áā/      ‘flower’
/ǁʔáá/      ‘bat-eared fox’

Source: Nakagawa 2006; Nakagawa et al. 2014

Using aerodynamic data, Nakagawa (2006: 168–171) shows that ejected clicks in Gǀui do not occur with the positive nasal airflow characteristic of glottalized clicks. The absence or presence of nasalization of a preceding vowel would thus be an important cue for distinguishing the accompaniments in connected speech. Nakagawa’s airflow data show a more abrupt influx of air in the oral cavity for ejected /ǃ’/ than for /ǃʔ/; and the ejected click bursts do sometimes sound shorter and sharper than the corresponding glottalized click bursts (with a shorter rise-time to peak amplitude). Vowels after ejective clicks tend to have slightly higher F0 than glottalized clicks in the small number of tokens I have examined, but this, and the relatively long VOT, might show more variation in a larger dataset, comparable to variation seen in non-click ejectives (e.g. Wright et al. 2002). Often, the ejected clicks have a somewhat more hollow-sounding burst. This is the case of the tokens shown in Figures 1.10–1.11, produced by the same male speaker of Gǀui. The ejected click has more burst noise associated with the release of the dorsal click constriction, as indicated by the arrows in the spectrogram and waveform in Figure 1.10. Since glottalized clicks vent air through the velopharyngeal port, the dorsal constriction is released more silently and the clicks may be more narrowly transcribed with voiceless nasalization, e.g. /ᵑǃ̊ ˀ/, /ǃ̃ˀ̥ /. Despite these differences, I find it difficult to consistently distinguish these accompaniments on words produced in isolation and I would not be surprised if ejected clicks have gone undetected in some languages or datasets. Khoisanists are familiar with an additional accompaniment commonly referred to as ‘delayed aspiration’ (Ladefoged and Traill 1994), or a delayed glottal fricative (Haacke 2001: 498; Snyman 1978: 154). Clicks with this accompaniment

Figure 1.10 Spectrogram and waveform illustrating an ejected click (and following vowel) in the Gǀui word /ǂ’áà/ ‘to faint, fall unconscious, play dead’
Note: The red arrows indicate noise associated with the release of the dorsal constriction.
Source: Recorded by Hirosi Nakagawa in 1994

may also be called voiceless nasal aspirated clicks13 and occur in Nǀuu, for instance (cf. Miller et al. 2009). Delayed aspirated clicks have longer VOTs than corresponding aspirated clicks and the amplitude of a following vowel rises more slowly. The glottal frication may be more breathy [ɦ] than voiceless [h]. Transcriptions such as ǃ¯h, ǃ’h, and ǃhɦ attempt to show both the long duration of the accompaniment and the changing nature of the frication. Delayed aspirated clicks are produced with a lowered velum during the click closure, which results in (voiceless) nasal airflow (Ladefoged and Traill 1984). Because of this airflow, delayed aspirated clicks are best transcribed with symbols indicating voiceless nasalization, e.g. [ŋ̊ǃʰ], [ᵑǃ̊ʰ], [ǃ̃ʰ̥]. This nasal venting

13  Note that a contrast between (pre)voiced and voiceless nasal aspirated clicks occurs in ǃXóõ and Juǀ’hoan, e.g. /ŋǃʱ/ vs. /ŋ̊ǃʰ/ (Miller-Ockhuizen 2003: 125). Dickens (1991) describes this contrast in Juǀ’hoan as between prevoiced prenasal clicks with velar-inaudible aspiration and voiceless clicks with velar-inaudible aspiration.

Figure 1.11 Spectrogram and waveform illustrating a glottalized click (and following vowel) in the Gǀui word /ǂʔáà/ ‘sp. grass used for an arrow shaft’
Source: Recorded by Hirosi Nakagawa in 1994

prevents the build-up of air behind the dorsal click closure so that when the click is released, there is no audible burst noise associated with the dorsal click closure. Unlike aspirated clicks (/ǃʰ/), which may be produced with audible dorsal frication (particularly in languages which do not contrast /ǃʰ/ and /ǃχ/), delayed aspirated clicks are fairly silent immediately after the click release. It is unwise to presume that the accompaniments in one language will be limited to those found in better-attested languages. In describing the laryngeal accompaniments of clicks, researchers should be guided by general phonetic practices as the range of possible click accompaniments is often nearly the same as found in non-click consonants. That being said, no language has been described, to the best of my knowledge, as having contrastive creaky voiced clicks, preaspirated clicks, or imploded clicks,14 though these are all physically possible. 14  See Miller-Ockhuizen (1998: 219) for discussion of a possible case of implosion.
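One cue mentioned earlier in this section, the shorter rise time to peak amplitude of ejected bursts compared with glottalized ones, lends itself to a simple measurement. The sketch below is an illustration under stated assumptions rather than a procedure used in the studies cited: the 10% onset threshold, the function name and the toy sample values are all hypothetical.

```python
def rise_time_ms(samples, sample_rate_hz, onset_fraction=0.1):
    """Time in ms from the first sample reaching a fraction of the peak to the peak.

    `onset_fraction` is an assumed threshold (10% of peak magnitude by default).
    """
    magnitudes = [abs(x) for x in samples]
    peak = max(magnitudes)
    if peak == 0:
        raise ValueError("silent window")
    peak_index = magnitudes.index(peak)
    threshold = onset_fraction * peak
    # The peak itself meets the threshold, so this search always succeeds
    # at or before peak_index.
    onset_index = next(i for i, m in enumerate(magnitudes) if m >= threshold)
    return 1000.0 * (peak_index - onset_index) / sample_rate_hz


# Hypothetical toy bursts sampled at 1000 Hz, so one sample equals 1 ms.
sharp_burst = [0.0, 0.2, 1.0, 0.4, 0.1]
gradual_burst = [0.0, 0.15, 0.3, 0.6, 1.0, 0.5]
print(rise_time_ms(sharp_burst, 1000), rise_time_ms(gradual_burst, 1000))
```

With real recordings the measurement would normally be taken on a smoothed amplitude envelope rather than on raw sample magnitudes.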


2.6 Nasal Accompaniments

Voiced nasal clicks (e.g. [ᵑǃ]) occur in all click languages for which click accompaniments have been explicitly documented.15 Voiced nasalization occurs prominently during the click closure (before the click burst), so this accompaniment has sometimes been mistakenly referred to as a prenasalized click. A voiced nasal click [ᵑǃ], however, is more comparable to a nasal plosive such as /n/ than to a prenasalized stop such as /ⁿd/ because the nasal click and /n/ both have voiced nasalization throughout the consonant closure. The phonemic status of prenasalized clicks varies across languages. A voiced prenasalized click accompaniment (e.g. /ᵑᶢǀ ~ ᵑǃ̬/) has been described as occurring in Khwe by Kilian-Hatz (2008: 358), who spells it with letters before and after the click, e.g. nǀɡ. Elderkin (this volume) details the diachronic origin of this click accompaniment, which contrasts with the voiced nasal click accompaniment. A contrastive prenasalized voiced uvular click occurs in ǃXóõ (Traill 1985: 131–132). Prenasalized clicks also occur in Nguni languages, where they are often the result of a noun class prefix added to a stem. Although some Nguni prenasalized clicks are derived, others are not predictable and may be treated as single segments (e.g. Dogil and Roux 1996). A prenasalized voiced click in Sandawe has been interpreted as being a medial allophone of a voiced click (Wright et al. 1995), but Hunziker et al. (2008: 19) interpret the same sequence as consisting of a phonemically nasalized vowel followed by a voiced click. The glottalized /ᵑǃ̊ˀ/ and delayed aspirated /ᵑǃ̊ʰ/ click accompaniments described above must be mentioned along with the other nasal accompaniments. In addition to these, a voiceless nasal click /ᵑǃ̊/ (without glottalization) has been described in the unrelated languages Fwe (Gunnink, this volume) and ǃXóõ (Ladefoged and Traill 1984; 1994; Traill 1985). A voiced nasal click with breathy voiced aspiration (i.e.
[ᵑǃʱ]) occurs in Gǀui (Nakagawa 1996). The voiced aspirated click of Juǀ’hoan (variously transcribed: [ŋǃʰ] (Dickens 1994: 13), [ŋǃʱ] (Miller-Ockhuizen 2003: 20), [ŋŋ̊ǃʰ] (Ladefoged and Traill 1994: 61)) may be understood as a prevoiced delayed aspirated click (paralleling other prevoiced accompaniments in the language).

2.7 Dorsal Accompaniments

Click accompaniments occur which involve distinctive constrictions of the back and/or root of the tongue. Distinct dorsal articulations found across click types include: place differences (velar vs. uvular), constriction differences (stop, fricative, affricate), and timing differences of the dorsal burst with

15  For some languages which have a very small number of click words it is not uncommon to see only plain clicks listed in a consonant inventory even though words with orthographic nasal clicks can usually be found in examples provided in the same publication.


respect to the anterior burst (near-simultaneous or with an audible lag). These different dorsal articulations may be transcribed as differing with respect to voicing, timing, aspiration, prenasalization and airstream mechanism (i.e. pulmonic vs. glottalic, as in the case of /ǃ͡χ/ vs. /ǃ͡χ’/ found in Nǀuu). Note that it is customary to transcribe many click accompaniments (e.g. ǃ, ǃʰ, ǃ’) without a symbol indicating a dorsal place constriction even though such a constriction is necessary for the production of the click. Many click languages do not have clicks that contrast based on velar vs. uvular place alone. In such cases, whether the dorsal constriction for clicks such as ǃ, ǃʰ, ǃ’ is velar or uvular may vary by language, click type, or by speaker. Nǀuu has both linguo-pulmonic clicks such as /ǃ͡χ/, /ǃ͡q/ and /ǃ͡qʰ/, and linguoglottalic clicks such as /ǃ͡χ’/. I follow Miller, Brugman et al. (2007) in analyzing such clicks, which have distinctly audible anterior and dorsal bursts, as being click contour segments and not as obstruent clusters.16 Just as the components of a more familiar contour segment such as the affricate /t͡ʃ/ differ in manner (stop vs. fricative), the components of a click contour segment differ in airstream mechanism. A uvular linguo-pulmonic click such as /ǃ͡q/ in Nǀuu differs from a plain click /ǃ/ in Nǀuu in the timing of the release of the dorsal burst, and not in dorsal place. The anterior and dorsal bursts of a plain click such as /ǃ/ in Nǀuu are produced nearly simultaneously but the dorsal burst of a /ǃ͡q/ click is delayed with respect to the anterior click burst, as shown in Figure 1.12 (For additional examples, see Miller, Brugman et al. (2007), Miller et al. (2009), and Exter (2008a)). A dorsal symbol such as q written after a click symbol represents a dorsal closure and burst that occurs after the anterior click burst. 
In the token shown in Figure 1.12, there is a 25 ms period of silent closure between the anterior click burst and the release of the uvular constriction. The plain click /ǃ/ in Nǀuu is also made with a uvular constriction (Miller et al. 2009), but its anterior click burst, dorsal burst and vowel onset are nearly simultaneous, as shown in Figure 1.13. No language has been documented as having a delayed velar burst accompaniment (i.e. a velar linguo-pulmonic click, e.g. /ǃ͡k/) that contrasts with either a plain click (e.g. /ǃ/) or a click with a delayed uvular burst (e.g. /ǃ͡q/). In their work on ǃXóõ, Ladefoged and Traill (1994; also see: Ladefoged and Maddieson 1996: 266) transcribe velar vs. uvular place as being the primary feature distinguishing three sets of click accompaniments: /ɡǃ/ vs. /ɢǃ/, /kǃ/ vs. /qǃ/ and /ɡǃh/ vs. /ɢǃh/. These accompaniments and their orthographic equivalents are shown in Table 1.4, along with my narrow transcriptions based on

16  A recent paper espousing the view that some click accompaniments are obstruent clusters is Güldemann and Nakagawa (2018).

Figure 1.12 Linguo-pulmonic unaspirated alveolar contour click /ǃ͡q/ and portion of the following vowel in the Nǀuu word /ǃ͡qɑo/ ‘to pluck, pick’ pronounced by Ouma Katrina Esau in 2006. Formant tracking is shown with superimposed dots. A superimposed arrow indicates the timing of the uvular burst

Figure 1.13 Unaspirated alveolar click /ǃ/ and portion of the following vowel in the Nǀuu word /ǃɑɑ/ ‘hartebeest’ pronounced by Ouma Katrina Esau in 2004. Formant tracking is shown with superimposed dots


a small set of recordings made by Tony Traill. Ladefoged and Traill also transcribe an ejected uvular accompaniment /qǃ’/, two partially uvular clusters: /kǃ’q’/ and /ɡǃq’/, and eight velar click accompaniments: /kǃʔ/, /kǃʰ/, /ŋǃ/,/ ŋ̊ ǃ/, /ʔŋǃ/, /ŋ̊ ǃʰ/, /kǃˣ/, /ɡǃkx/. The revised ǃXóõ dictionary (Traill 2018) includes a slightly different set of accompaniments. I transcribe ǃXóõ /kǃ/ vs. /qǃ/ with a superscript velar or uvular stop symbol before the click symbol, i.e. /ᵏǃ/ vs. /qǃ/. The spectrogram in Figure 1.14 shows a voiceless unaspirated velar-alveolar17 click /ᵏǃ/ and a portion of the /a/ vowel which follows it in the ǃXóõ word ǃáa /ᵏǃáa/ ‘rain’. This can be compared to Figure 1.15 which shows a voiceless unaspirated uvular-alveolar click /qǃ/ and a portion of the following vowel in the word ǃqáũ /qǃáũ/ ‘be careful, sharp’. Dots showing formant tracking are overlaid on each of the spectrograms. This transcription is not ideal since it fails to denote the noisy dorsal burst which occurs after the anterior click burst. But, it is a convention which helps to distinguish the clicks in ǃXóõ from the /ǃ͡q/ click in Nǀuu that has different timing characteristics. In using diacritics to transcribe /ᵏǃ/ and /qǃ/, I suggest that these do not pattern phonetically as contour segments or as clusters. Table 1.4 ǃXóõ click accompaniments contrastive for dorsal place18

Accompaniment           Ladefoged and Traill (1994)   Orthography (Traill 1994)   Transcription (Sands)
Voiceless unaspirated   kǃ, qǃ                        ǃ, ǃq                       ᵏǃ, qǃ
Voiced                  ɡǃ, ɢǃ                        ǃɡ, ǃɢ                      ᶢǃ, ɴɢǃ
Voiced aspirated        ɡǃh, ɢǃh                      ɡǃqh¹⁹, ɢǃqh                ᶢᵏǃʰ, ᶰqǃʰ

17  Although Ladefoged and Traill (1994) label the ǃ clicks in ǃXóõ as alveolar, some speakers pronounce them as postalveolar (Traill 1985: 103; Ladefoged and Maddieson 1996: 252–253). The low frequency emphasis in the bursts in the clicks in Figures 1.14 and 1.15 suggests that this speaker has postalveolar ǃ clicks, and not alveolar ones (as seen in the ǃ click in Nǀuu shown in Figures 1.12 and 1.13).
18  Note that the voiced uvulars are transcribed as prenasalized clicks here; I have not heard tokens of voiced or voiced aspirated uvular accompaniments without prenasalization, i.e. no [ᴳǃ], [ᴳǃʰ].
19  This accompaniment is spelled ɡǃh in Traill (2018).
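Burst spectra such as those in Figures 1.14 and 1.15 are often summarized by a spectral center of gravity. The sketch below uses one common definition, the power-weighted mean frequency of a short window, computed with a plain discrete Fourier transform. It is not the procedure behind any values reported in this chapter; the function names, the 8000 Hz sampling rate and the synthetic test tones are assumptions made for illustration.

```python
import cmath
import math


def spectral_center_of_gravity(samples, sample_rate_hz):
    """Power-weighted mean frequency (Hz) of a short window, via a plain DFT."""
    n = len(samples)
    weighted_sum = 0.0
    total_power = 0.0
    for k in range(1, n // 2 + 1):  # positive-frequency bins, skipping 0 Hz
        coeff = sum(
            samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        power = abs(coeff) ** 2
        weighted_sum += (k * sample_rate_hz / n) * power
        total_power += power
    if total_power == 0.0:
        raise ValueError("window has no energy above 0 Hz")
    return weighted_sum / total_power


def tone(freq_hz, sample_rate_hz=8000, n=64):
    """Synthetic sine window, used here only to check the measure."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate_hz) for t in range(n)]


low = spectral_center_of_gravity(tone(1000), 8000)
high = spectral_center_of_gravity(tone(2000), 8000)
print(round(low), round(high))
```

On real bursts the result depends on window length, windowing function and frequency range, so values are only comparable when these settings are held constant across tokens.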


Although there are some differences among the three sets of corresponding velar and uvular accompaniments20 in ǃXóõ transcribed in Table 1.4 in terms of the duration of voicing and presence of prenasalization (Ladefoged and Traill 1994: 50–52), the relative timing of the anterior click burst, the dorsal burst and the vowel onset is very similar for the clicks shown in Figures 1.14 and 1.15. The center of gravity of the uvular-alveolar /qǃ/ click shown in Figure 1.15 is lower than that of the velar-alveolar /ᵏǃ/ click shown in Figure 1.14 (1000 Hz vs. 1390 Hz). This is consistent with a uvular vs. velar place difference. Formant transitions also suggest a contrast of dorsal place. In order to facilitate comparison of these tokens, I have extracted their formant contours, as seen in Figure 1.16. The velar-alveolar click /ᵏǃ/ and following vowel shown in red have a higher F2 and lower F3 and F4 than do the uvular-alveolar click /qǃ/ and its following vowel (shown in black). The onset of the vowel following the velar-alveolar click /ᵏǃ/ is shown by the leftmost dotted vertical line (marking a “velar pinch” where F2 and F3 converge). The uvular-alveolar click /qǃ/ has very little burst energy in the F3 region. It is difficult to come up with a definitive cross-linguistic list of dorsal accompaniments because some have only been documented in the speech of certain individuals (Gerlach 2016: 138), or in a very small number of words. For instance, Exter (2008a) has described a voiced linguo-pulmonic click in Nǀuu, /ᶢǀ͡ɢ/, that occurs in only two words (/ᶢǀ͡ɢɑ̀nɑ́/ ‘spleen’ and /ᶢǀ͡ɢùú/ ‘be constipated’). He notes (pp. 41–42) that it may be realized with voiced uvular frication [ǀ͡ʁ], or with a voiced uvular [ǀ͡ɢ] or velar [ǀ͡ɡ] stop. Also, linguists may simply disagree in their judgement of which accompaniments are phonemic and which are not. For example, Exter (2008a: 29) treats the linguo-glottalic click [ǃ͡q’] in Nǀuu as a contrastive click while Miller et al.
(2009) interpret it as an allophone of /ǃ͡χ’/. Of course, differences in labels given to the dorsal accompaniments also make it difficult to make a cross-linguistic comparison. For instance, the linguopulmonic /ǃ͡χ/ and linguo-glottalic /ǃ͡χ’/ clicks in Nǀuu and many other languages involve uvular rather than velar frication, but these sounds are commonly written and labelled as having velar frication, ǃx, ǃx’, even when they are clearly phonetically uvular. Because no language has been shown to contrast velar and uvular place in fricative (/ǃx/ and /ǃχ/), or ejected affricated (/ǃx’/ and /ǃχ’/) click 20  Nakagawa (2006: 197) notes timing differences between the anterior and dorsal bursts for uvular and velar accompaniments in Gǀui that are greater than those I have seen in the corresponding accompaniments in ǃXóõ but less than what is typically seen in Nǀuu clicks. Gǀui clicks might be transcribed with a superscripted dorsal stop symbol after the click, e.g. /ǃq/.

Figure 1.14 Spectrogram of the voiceless unaspirated velar-alveolar click /ᵏǃ/ and portion of the following vowel in the ǃXóõ word ǃáa /ᵏǃáa/ ‘rain’ (pronounced by a male speaker, recorded by Tony Traill). Formant tracking is shown with superimposed dots

Figure 1.15 Spectrogram of the voiceless unaspirated uvular-alveolar click /qǃ/ and portion of the following vowel in the ǃXóõ word ǃqáũ /qǃáũ/ ‘be careful, sharp’ (pronounced by a male speaker, recorded by Tony Traill). Formant tracking is shown with superimposed dots

Figure 1.16 Extracted formant traces taken from the spectrograms in Figures 1.14–1.15; /ᵏǃ/ is shown in red and /qǃ/ in black. The leftmost dotted line shows the onset of the vowel following /ᵏǃ/ (which corresponds to a local dip in F3), and the rightmost dotted line shows the onset of the vowel following /qǃ/ (which corresponds to a local peak in F2)

releases, some researchers have chosen to treat the velar place as being more basic. So, what appears to be a narrow phonetic transcription may actually be an overly broad transcription (given that velars and uvulars pattern differently with respect to the Guttural OCP). One should be wary of taking click accompaniment labels and transcriptions at face value because they may vary even when the clicks described are the same. For instance, Miller (2007) has described Juǀ’hoan as having voiced and voiceless epiglottalized clicks /ǀʜ, ɡǀˤ/ where others have transcribed them as ejected velar affricated clicks /ǃx’, ɡǃx’/ (e.g. Snyman 1975; Miller-Ockhuizen 2003: 18). I would transcribe these ‘radical’ clicks as having ejected uvular frication /ǃ͡χ’, ᶢǃ͡χ’/ and would interpret the epiglottalization as being due to the raised larynx of the glottalic components


of the articulations. Of course, the tongue root moves as the back of the tongue body moves, so dorsal accompaniments might instead be referred to as dorsoradical accompaniments. Transcriptions are based on phonological models and as these abound, so do transcriptions.

2.8 Accompaniments in Orthographies

The orthographies of languages such as Juǀ’hoan (Dickens 1994; Miller-Ockhuizen 2000), ǃXóõ (Traill 1994, 2018) and Khoekhoegowab (Haacke 2001; Haacke and Eiseb 2002) look much like phonetic transcriptions but cannot be straightforwardly interpreted as such. For example, the spelling of aspirated clicks in Juǀ’hoan orthography (e.g. ǃh) is the same as that for delayed aspirated clicks in Khoekhoegowab (Haacke 2001: 498) and ǃXóõ (Traill 1994: 11). Voiceless nasal aspirated clicks in Juǀ’hoan are spelled with an apostrophe signalling a pause (e.g. ǃ’h). The voiceless aspirated accompaniment transcribed /kǃʰ/ in Ladefoged and Traill (1996) is spelled as a uvular ǃqh in Traill’s (1994) ǃXóõ dictionary. The velar letters k and g following a click letter have different phonetic realizations in different languages. For example, ǃk in Juǀ’hoan orthography indicates an ejected affricated click accompaniment (/ǃkx’/; Dickens 1994: 12), while ǃkh in Khoekhoegowab orthography indicates a uvular fricated click [ǃ͡χ] (Brugman 2009: 22). The letter g in Naro orthography is used for fricated clicks, e.g. qg /ǃx/, qg’ /ǃx’/. Several recent works have proposed orthographic systems which can be used across different subsets of click languages. Some of these works are geared primarily for language development (e.g. Chebanne 2016; Sibanda et al. 2015; Wakumelo-Nkolola et al. 2008; Ndala et al. 2014). Others (e.g. Güldemann 2016: 197–198; Witzlack-Makarevich and Nakagawa 2019) developed out of a need for a practical orthography for (cross-linguistic) language documentation.
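Because the same spelling can stand for different accompaniments in different orthographies, any automatic handling of click spellings needs to be keyed by language. The lookup below is a minimal sketch that records only the correspondences stated in this section; the table name, the function name and the wording of the labels are hypothetical, and the table is far from a complete description of any of these orthographies.

```python
# (language, spelling) -> accompaniment, restricted to examples given in the text.
ORTHOGRAPHY = {
    ("Juǀ’hoan", "ǃh"): "aspirated click",
    ("Juǀ’hoan", "ǃ’h"): "voiceless nasal aspirated click",
    ("Juǀ’hoan", "ǃk"): "ejected affricated click /ǃkx’/",
    ("Khoekhoegowab", "ǃh"): "delayed aspirated click",
    ("Khoekhoegowab", "ǃkh"): "uvular fricated click [ǃ͡χ]",
    ("ǃXóõ", "ǃh"): "delayed aspirated click",
    ("ǃXóõ", "ǃqh"): "voiceless aspirated click /kǃʰ/",
}


def interpret(language, spelling):
    """Return the accompaniment a spelling stands for in a given orthography."""
    try:
        return ORTHOGRAPHY[(language, spelling)]
    except KeyError:
        # Refuse to guess: an unlisted spelling is not borrowed from another language.
        raise KeyError(f"no entry for {spelling!r} in {language!r}") from None


print(interpret("Juǀ’hoan", "ǃh"))
print(interpret("Khoekhoegowab", "ǃh"))
```

Raising an error for unlisted combinations, rather than falling back on another language's convention, reflects the point that these spellings cannot be read as phonetic transcriptions.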
The practical transcription systems described in Güldemann (2016) and Witzlack-Makarevich and Nakagawa (2019) differ in the number of accompaniments recognized (19 vs. 23) but are similar in that they analyze many click accompaniments as consonant clusters. The somewhat abstract nature of these systems makes it difficult to interpret what the phonetic correlate of a given transcription will be in a particular language. For instance, it would not be immediately obvious to most phoneticians how the unitary consonant ǃʰ and the cluster ǃh would (or should) differ in pronunciation, yet Witzlack-Makarevich and Nakagawa (2019) use ǃʰ for an aspirated click [ǃʰ] and ǃh for a (voiceless) delayed aspirated click [ᵑǃ̊ ʰ].21 The nasal airflow and breathy voicing of the 21  It is unclear to me how a prevoiced nasalized aspirated click such as /ŋǃʱ/ is to be represented in their system.


delayed aspirated clicks are, in their analysis, a minor phonetic detail. This stands in contrast to the facts of click loss in Mupa ǃXun (Fehn, this volume), which indicate that nasalization is phonologically active in delayed aspirated clicks and should be represented in transcription.

3 Which Languages Have Click Consonants?

The areas where clicks are used as regular consonants can be seen in Map 1.1. The majority of click languages are spoken in southern Africa, but three click languages are found in East Africa. Countries with click languages include: Angola, Botswana, Lesotho, Kenya, Namibia, South Africa, Swaziland, Tanzania, Zambia, and Zimbabwe.22 There are more than two dozen distinct language clusters widely accepted as “click languages”, but eight of these are now moribund or extinct. This is a conservative estimate because some of the clusters are comprised of lects that would be considered separate languages based on sociohistorical and emic criteria.23 There are another ten languages where clicks occur in a small number of lexical items (Pakendorf et al. 2017: 6; Maddieson and Sands 2019). Clicks as regular consonants occur in seven distinct genealogical families in Africa, two of which are isolates. Clicks were also used as regular consonants in Damin, an auxiliary language once used by speakers of the Australian languages Lardil and Yangkaal (Hale and Nash 1997; Memmott et al. 2006: 36). The click languages24 of Southern Africa have been traditionally divided into Khoisan and Bantu groups, with clicks being borrowed into Bantu from Khoisan. The Khoisan languages are now widely regarded as belonging to three distinct language families (Kx’a, Tuu, Khoe-Kwadi).25 Segment inventories of Khoisan languages can be found in Vossen (2013). These three families have been in long-term contact in the Kalahari Basin linguistic area (KBA) (Güldemann 1998; 2014; Güldemann and Fehn 2017), indicated on Map 1.1. They 22  Clicks also occur in some languages of Mozambique as marginal consonants. 23  For instance, ǃXun speakers in Platfontein, RSA appear to view Juǀ’hoan as a distinct language with limited mutual intelligibility (Jones 2017: 135). ǃXun and Juǀ’hoan are treated by Güldemann (2014) as belonging to a single language complex/dialect cluster. 
24  The language names used in this volume reflect the diversity of naming conventions used in the literature. That is, I use some names which have affixes or diacritics (e.g. isiZulu and ǃXóõ, in preference to Zulu and ǃXoon or Taa), while not in other cases (e.g. Yeyi and Juǀ’hoan, instead of ShiYeyi and Juǀ’hoansi or Žuǀ’hõasi).
25  These roughly correspond to Greenberg’s (1963) Northern, Southern and Central Khoisan groupings, which derive in large part from Dorothea Bleek’s work (e.g. Bleek 1927).

Map 1.1 Approximate areas where click languages are spoken

are also spoken in a wider geographical range, extending beyond the KBA as far north as southern Angola. Language varieties outside of the Kalahari Basin area tend to have fewer contrastive clicks compared to those inside the KBA (Sands and Gunnink 2019) (see also Map 1.3). The Bantu languages of Southern Africa with clicks can be divided into two main geographical areas: Southwest Bantu (SWB) and Southeast Bantu (SEB) (Pakendorf et al. 2017). With the exception of Zimbabwean Ndebele, clicks in Bantu languages outside of these areas are fairly marginal. The click languages of East Africa belong to three genealogically distinct families. Dahalo, spoken along the northern Kenyan coast, is a Cushitic language (Tosco 1991; Maddieson et al. 1993). Hadza and Sandawe are isolates spoken in North Central and Central Tanzania, respectively (Sands 1998). There are some indications that Sandawe has links to the Khoe-Kwadi languages of

southern Africa (Güldemann and Elderkin 2010), but it is currently regarded as a linguistic isolate. Phonetic studies of East African click languages include: Maddieson et al. 1993, 1999, Wright et al. 1995, and Sands et al. 1996. Khoe-Kwadi is a large family with a dozen language clusters (Güldemann 2014: 27), three of which are no longer spoken. In this family, the majority of instrumental phonetic studies have been conducted on Gǀui (e.g. Nakagawa 1996; 2006; Kagaya and Nakagawa 1986) and Nama-Damara (Khoekhoegowab) (e.g. Panconcelli-Calzia 1924; Beach 1938; Ladefoged and Traill 1984; Miller, Namaseb and Iskarous 2007; Proctor et al. 2016). Other phonetic studies of languages in this family include Kagaya (1978) on Naro and Killian (2009). For Kora, also known as Korana, ǃOra or ǃKora, also see: Beach 1938, Ponelis 1975, and Du Plessis 2018. There are a few published spectrograms of clicks in Khwe (Kilian-Hatz 2008: 360–361). Segment inventories of many Khoe languages are given in Vossen (1997) and Fehn (this volume; 2018; 2019). The Kx’a languages divide into two subgroups: ǂ’Amkoe and Ju (also known as ǃXun, and formerly referred to as Northern Khoisan). Güldemann (2014: 27) treats each of these subgroups as a single language cluster, but there is evidence to suggest that ǃXun consists of lects which are not mutually intelligible (Jordan and Manuel 2018; Miller-Ockhuizen and Sands 1999). ǂ’Amkoe consists of varieties called ǂHoan, Nǃaqriaxe and Sàsí (Gerlach 2016; Collins and Gruber 2014), while Ju consists of lects that may be grouped into Northern, Central and Southeastern subgroups (Sands 2010; Heine and König 2015). There have been several instrumental phonetic studies of Juǀ’hoan, a Southeastern Ju language (e.g. Snyman 1978; 1980; Miller-Ockhuizen 2003). Mangetti Dune ǃXun and Ekoka ǃXun (both Northern Ju languages) have also been the focus of phonetic documentation (Miller 2016; Miller and Holliday 2014; Miller et al. 2011). 
Grootfontein ǃXun (Doke 1923) is a Central Ju lect. There is a great deal of variety across Ju lects with respect to their click inventories. ǂ’Amkoe lects (including idiolects) are also not uniform with respect to their click inventories (Gerlach 2016; Bell and Collins 2001). The Tuu languages divide into two subgroups, Taa and ǃUi, which consist of some seven language clusters (Güldemann 2014: 27). Phonetic studies of the Lone Tree dialect of East ǃXóõ (e.g. Traill 1985, etc.), a Taa language, have particularly informed our understanding of clicks. A recent study of West ǃXóõ is Naumann (2016). Documentation is sparse of other dialects of ǃXóõ and of extinct Taa lects such as Kakia, Nǀuǁ’en, ǀ’Auni, and ǀHaasi. Most ǃUi languages are no longer spoken, but phonetic studies of Nǀuu (also known as Nǁng and ǂKhomani) have been conducted (e.g. Miller, Brugman et al. 2007; Miller et al. 2009; Miller 2010; Exter 2008a). Extinct ǃUi languages include ǁXegwi, ǀXam and ǂUngkue.

Map 1.2  Southern African Bantu languages with clicks as regular consonants
Notes and sources: Languages where clicks are marginal are shown in italics. This map is based on a similar figure in Pakendorf et al. (2017), but includes Ikuhane (Botswanan Subiya, also referred to as Chiikuhane) based on information in Ndana et al. (2017) and Davey (1965). Subiya spoken in Namibia and Zambia is not reported as having words with clicks (Baumbach 1997; Jacottet 1896).

Bantu languages with clicks are shown in Map 1.2. Languages of the Southeast Bantu geographical zone include the closely related languages of the Nguni group (isiZulu, isiXhosa, siSwati, isiNdebele, sePhuthi) as well as Southern Sotho. Languages of the Southwest Bantu geographical zone include Kwangali, Manyo, Mbukushu, Fwe, Yeyi and Botswanan Subiya (Ikuhane) (cf. Baumbach 1997; Dammann 1957; Ndana et al. 2017; Sands and Gunnink 2019; Maddieson 2003; Pakendorf et al. 2017; Maddieson and Sands 2019). It is somewhat surprising that, apart from studies of isiZulu (e.g. Thomas-Vilakati 1997; 2010; Roux 2007; Roux and Dogil 1998; Selmer 1933; Doke 1926), isiXhosa (Roux 1991; Sands 1991; 1992; Johnson 1993; Scharf et al. 1995; Roux et al. 1995) and Yeyi (Fulop et al. 2003), there have been few instrumental phonetic studies of clicks in Bantu languages. Little is known about how clicks may vary across the isiXhosa dialects such as Mpondo, Bhaca, Hlubi, Mpondomise, Cele,
Ngqika and Ntlangwini (Nyamende 1994). IsiZulu dialects in close contact with siSwati (Ngubane 1991: 63) and Southern Sotho show non-standard click inventories (Aycard 2014: 107–108; Gunnink 2014). Among languages where clicks are used as onsets of syllables, I have not counted as “click languages” ones where clicks only occur in one or two words in semantic fields typically associated with sound symbolism (Hinton et al. 1994: 10; Bostoen and Sands 2012). For instance, in the Ngwato dialect of Tswana, clicks only occur in the following words: /ŋǁɛ/ ‘sorry, excuse me’, /ŋǀɛŋǀɑnɪ/ ‘very small’, /lɪŋǁɑ, mɑŋǁɑ/ ‘an expletive’ (sg., pl.) (Tlale 2005: 209–210). Similarly, in Digo (spoken in Kenya), clicks occur in just two words: /ŋǀa/ ‘go away’, /ŋǀakule/ ‘miniscule, minute’ (Walsh 2006). Nor do I count varieties of Chinese where the words for ‘duck’ and ‘goose’ have clicks in a nursery rhyme (Nathan 2001). Gil (2005) notes the common use of clicks in words with affective meanings as well as logical meanings (‘yes’ and/or ‘no’). Masalit, a Nilo-Saharan language spoken in Sudan, for instance, is reported to have a click in words meaning ‘sucking’ and ‘yes’ (Edgar 1991). The marginal click languages shown in Map 1.2 are those which appear to have only a small number of lexical items with clicks that have neither logical nor affective meanings. For instance, Kgalagari is reported as having a word mo-nǃú ‘big intestine’ (Dickens 1986: 29) that is likely borrowed from Naro [ǃùu] ‘large intestine’ (Visser 2001: 166). In this case, the Naro source has a voiceless click while the Kgalagari word has a voiced nasal click, which appears to be a phonological modification to facilitate ease of articulation. Clicks may be less marginal in the Ngamiland dialect of Kgalagari, which is reported to have “a preponderance of clicks adopted from neighbouring languages such as ǁGana, ǀGui, Shiyeyi and Thimbukushu …” (Lukusa and Monaka 2008: 8). 
(It is unclear whether the Ngamiland lect has more click consonants as well as more click words than do other Kgalagari dialects.) Clicks occur in at least three lexical items in Midlands varieties of Karanga (Pongweni 1990: 16), Tsonga (Afido et al. 1989: 105; Baumbach 1974), Chopi (Bailey 1995), Ronga (Afido et al. 1989: 114), and Ndau (Mkanganwi 1972: 117). It seems likely that these languages have many more words with clicks since the small sample of words reported often show that the language has more than one accompaniment. Ronga, for instance, has words with voiceless (qatha ‘naco’ (‘loaf’)), voiced (mugqomo ‘vasilha em lata’ (‘canister’)) and nasal (nqawu ‘cachimbo’ (‘smoking pipe’)) clicks (Afido et al. 1989: 114). Some speakers drop the clicks in Chopi (Bailey 1995), and it is unclear whether clicks still occur in Tswa words (qabi ‘stick used in fencing’, qama ‘ram’) as reported by Persson (1932: 18) nearly a century ago. Marginal clicks in languages of the eastern part of southern Africa can in large part be attributed to contact with isiZulu or another Nguni language, as

is the case with the Mzimba dialect of Tumbuka (Moyo 1995). Without more information, it is hard to know whether clicks introduced through loan words should be considered as something more than a marginal part of the language’s phonology. These cases show that it is not particularly difficult to borrow clicks. The presence or absence of clicks in the segment inventory is one basis for determining what is a “click language” (Hyman 2009: 214), but phonetic studies of clicks have tended to focus on the languages for which clicks have the greatest functional load, both in terms of the number of clicks in the segment inventory and the proportion of click words in the vocabulary, particularly in basic vocabulary. Different researchers may report click inventories that appear quite different on the surface depending on the idiolectal or dialectal variants used by the speakers they worked with. For instance, siSwati can have q-clicks or c-clicks depending on the dialect (Lanham 1960: 60), or perhaps even a palatal click (Ilunga 2001). Dahalo is reported by Maddieson et al. (1993) as having only dental clicks, while Ehret (2013) reports contrastive dental and lateral clicks in the language, while Elderkin (personal communication) has heard a lateral click as a variant of the dental click. What is not clear is how many (or what sort of) lexical items with clicks a language should have for clicks to count as part of the segment inventory. Just as it can be difficult to determine the segment inventory of some click languages, it can be difficult to get anything approaching a representative list of lexical items containing clicks. Many click languages are still lacking dictionaries, and some dictionaries may leave out click words found only in a single dialect. 
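One of the measures of functional load mentioned here, the proportion of click words in a vocabulary, can be illustrated with a short Python sketch. This is only a toy illustration under stated assumptions, not the procedure of any study cited in this chapter: the function names `has_click` and `click_word_share` are hypothetical, the test assumes IPA-style transcriptions that use the dedicated click letters, and the three-item list (Nǀuu forms for ‘wind’, ‘mouth’ and ‘ostrich’ quoted elsewhere in this chapter) is far too small to say anything about the language.

```python
# Toy sketch: share of word forms that contain at least one click letter.
# Assumption: forms are IPA-style transcriptions using the dedicated click
# letters. Practical orthographies that write clicks with c, q, x (as in
# Nguni languages) would need a different test.
CLICK_LETTERS = frozenset("\u0298\u01c0\u01c1\u01c2\u01c3")  # ʘ ǀ ǁ ǂ ǃ


def has_click(form):
    """Return True if the transcription contains any click letter."""
    return any(ch in CLICK_LETTERS for ch in form)


def click_word_share(forms):
    """Proportion of forms that contain a click (0.0 for an empty list)."""
    if not forms:
        return 0.0
    return sum(has_click(f) for f in forms) / len(forms)


# Illustrative only: /ǂqʰoe/ 'wind', /cu/ 'mouth', /cue/ 'ostrich'.
# The palatal plosive <c> in the last two forms is not a click.
toy_forms = ["\u01c2q\u02b0oe", "cu", "cue"]
print(f"{click_word_share(toy_forms):.2f}")
```

A count of this kind depends heavily on which word list is used (basic vocabulary versus a full dictionary) and on how dialectal variants are treated, which is one reason why reported figures for the same language can differ.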
Sands and Gunnink (2019) provide different measures of the functional load of clicks in various click languages, showing that the functional load of clicks generally decreases with distance from the core of the Kalahari Basin linguistic area. Map 1.3 shows the functional load of clicks in the basic vocabulary of some southern African click languages. The functional load of clicks in Kx’a (ǂHoan, Juǀ’hoan), Tuu (ǃXoon, Nǀuu) and Khoe (Khoekhoe, Naro, Gǀui, Kua, Shua) languages is greater than that found in Bantu languages. Clicks in Nguni languages and Yeyi have a higher functional load than those of the other Bantu languages.

Map 1.3  Map of southern Africa with pie charts showing the functional load of clicks in the basic vocabulary of some click languages of the core of the Kalahari Basin area (indicated by a heavy dashed line) and on its fringe (indicated by a dotted line). The percentage of clicks is shown by a darker color; blue circles represent Bantu languages. Source: Sands and Gunnink 2019

4 How Have Click Consonants Been Studied?

In this section, I survey the different types of literature which make reference to click consonants. The majority of works focused on clicks have been phonetic studies of lexical/phonemic clicks, and these are discussed in Section 4.1, with the exception of click consonant perception studies, which are discussed in Best (this volume). Phonetic studies of paralinguistic and emergent clicks are discussed in Section 4.2. Section 4.3 is a very brief survey of linguistic anthropological perspectives on clicks. Section 4.4 looks at clicks in L1 and L2 acquisition, as well as in disordered speech. Section 4.5 is a brief survey of diachronic processes affecting clicks. Two topics I will not survey in this paper are phonological studies of clicks (including phonotactics), which are surveyed in Bennett (this volume), and the possible role of click consonants in language evolution, which I have surveyed and discussed in depth in Sands and Güldemann (2009).

4.1 Phonetic Studies of Click Consonants

From a fairly early date, clicks have been at the forefront of instrumental investigative phonetic techniques. A century ago, for instance, Panconcelli-Calzia
(1918–1919; 1919–1920; 1924) studied clicks using X-rays. A few decades later, Pienaar (1936; 1939) and Selmer (1933: 25–26, 51–54) published kymograph tracings of words with clicks. Beach (1938)²⁶ remains an excellent phonetic study of Khoekhoe (Nama-Damara) using palatography, based on his doctoral dissertation from several years earlier at the University of London.²⁷ Other early works on the phonetics of clicks include: Dempwolff (1913), Otto (1922), Planert (1927), Bleek (1927), Copland (1938), Klingenheben (1937). Some of the most important early phonetic research into clicks was produced by Clement Doke. He worked on languages from the Bantu, Kx’a and Tuu families. He produced several works on isiZulu (1923a, b; 1926: 123–138) as well as important papers on a dialect of ǃXun (Doke 1923) and on Nǀuu (Doke 1936). Some early recordings of Tuu and Khoe languages have been made available on CD (Traill 1999). Recording equipment at the Phonetics Laboratory of the University of Cape Town allowed Doke and subsequent researchers such as E.O.J. Westphal and Jan Snyman to produce well-transcribed works on click languages. An archive of Westphal’s field notes and recordings of Kx’a, Tuu and Khoe languages is available online at the UCT Libraries Digital Collections (http://digitalcollections.lib.uct.ac.za/ernst-westphal-san-languages). There were many early attempts to spell clicks, even dating back to 1634 (Herbert 1634, cited by Breckwoldt 1972: 282). Surveys of early click transcriptions can be found in Haacke (1989) and Breckwoldt (1972; 1978; 1979). W.H.I. Bleek, who was trained as a linguist, described clicks as early as 1858 and discussed early click transcriptions in a more widely accessible work, Bleek (1862: 12–15). Phoneticians have struggled with how to transcribe clicks for over a century (cf. Jones 1907). Versions of the modern symbols have appeared in African language surveys at least since Werner’s 1925 book. 

26  Phillips (1993: 28) shows a picture of Douglas Beach using the kymograph at the University of Cape Town’s new phonetics laboratory in 1929, along with a Khoekhoegowab speaker, Johannes Dâusab.
27  His book attracted the interest of Nikolai Trubetskoy (1939), founder of the Prague School.

In the latter quarter of the 20th century, phonetic studies of click languages were led by South Africans Anthony Traill and Justus Roux, and British researchers Peter Ladefoged and Ian Maddieson, who were based at the University of California, Los Angeles. Students trained by these researchers include: Hirosi Nakagawa, Philip W. Lewis, Bonny Sands, Kimberley Thomas-Vilakati, Richard Wright and Sean Fulop. Amanda Miller, a former Peace Corps volunteer who learned Juǀ’hoan while in Namibia, also contributed many important phonetic and phonological studies of clicks and trained students such as Johanna Brugman and Abigail Scott. Articulatory laboratory techniques employed by researchers in the latter quarter of the 20th century include: X-ray imaging (e.g. Traill 1985), palatography (e.g. Traill 1983), electromagnetic articulography (EMA) (Scharf et al. 1995), and aerodynamic techniques (e.g. Maddieson et al. 1999; Snyman 1999; Traill 1991; 1992; Traill and Wright 1984). Acoustic analysis was, and still is, commonly done using spectrograms and waveforms (e.g. Traill 1979), together with spectral analyses and duration measurements. During this time, analysis was typically done on elicited speech, i.e. careful pronunciations of single words in comparable prosodic positions. Ladefoged’s phonetics textbooks (1993, 2000) have been particularly important in introducing clicks to linguists, as Westermann and Ward’s 1930 textbook did for an earlier generation. Recordings of click languages made by UCLA scholars are available for downloading at the UCLA Phonetics Lab Archive (http://archive.phonetics.ucla.edu/). Funding from the U.S. National Science Foundation supported Ladefoged’s collaborations with many different researchers, resulting in phonetic studies on each of the seven language families with click languages, and culminating in the survey of clicks found in The Sounds of the World’s Languages (Ladefoged and Maddieson 1996), which followed up on the influential surveys of clicks he co-authored with Anthony Traill (Ladefoged and Traill 1980, 1984, 1994). In the last 20 years, new techniques have emerged for studying clicks. Amanda Miller and colleagues have been at the forefront of using ultrasound to study clicks (e.g. Miller 2008, 2010, 2016; Miller, Brugman et al. 2007; Miller, Namaseb and Iskarous 2007; Hudu et al. 2009). Miller has also applied quantitative spectral analysis to clicks (Miller 2019). Electropalatography (EPG) has been used on some Bantu languages (e.g. Thomas-Vilakati 2010; Roux 2007). 
Magnetic resonance imaging (MRI) is used by Moisik and Dediu (this volume) to image the articulations made by people learning to produce clicks. Michael Proctor and colleagues have used real-time magnetic resonance imaging (rtMRI) to image clicks produced by native speakers of click languages such as siSwati and Khoekhoegowab (Proctor et al. 2016; Proctor et al. this volume), by beatboxers (Proctor et al. 2013), and by phoneticians (Toutios et al. 2016). An important recent trend is for researchers to study click production in natural speech, in varying prosodic conditions (e.g. Midtlyng 2011; Miller and Elsner 2017; Marquard et al. 2015). The increasing availability of recordings of click languages in archives such as the Endangered Languages Archive (ELAR) at SOAS and the UCLA Phonetics Lab Archive has made it easier for researchers to hear clicks produced by different speakers of different languages. Increasingly, it is becoming standard for linguists to publish phonetic analyses of languages they work on even if their primary interest is in other aspects of grammar (e.g. Bell and Collins 2001; Eaton 2006). There are a few phonetic studies which focus on acoustic features of clicks (e.g. Jakobson 1968; Johnson 1993; Traill 1997; 2002; Exter 2008b; 2010) or on the acoustic modelling of clicks (Exter 2011; Stevens 1998: 121–124). Recently, clicks have also been modelled using a three-dimensional biomechanical model (Moisik and Dediu 2015; 2017). There have been several studies looking at coarticulation involving clicks, including Thomas-Vilakati (2010), Sands (1991), Dogil et al. (1997), and Miller (2016; 2017).

4.2 Phonetic Studies of Paralinguistic and Emergent Clicks

Clicks occur in many languages other than the click languages discussed so far. When clicks violate the phonological patterning of consonants in a language by occurring without an accompanying vowel, for instance, they are referred to as ‘paralinguistic’ clicks. The term ‘paralinguistic’ is a bit of a misnomer because the production of such clicks certainly reflects linguistic knowledge. In other cases, clicks occur in normal syllables, but only in a few sound-symbolic words, i.e. words with affective, directive or logical (‘yes’, ‘no’) meanings (Gil 2005). Lionnet (this volume) is one of the few publications to provide an acoustic analysis of paralinguistic clicks. Another category to be considered is when clicks are used as discourse markers. Furthermore, in some languages, clicks occasionally result from the overlap of non-click consonants. These emergent clicks are phonetic variants of non-clicks and are not considered part of the regular segment inventory. A number of phonetic studies have been published which analyze the use of clicks as discourse markers. Clicks may be used as hesitation markers, to indicate a new discourse sequence, etc. 
There are phonetic studies of clicks used as discourse markers in several languages: English (Ogden 2013; Wright 2005, 2007, 2011), German (Trouvain 2013, 2014, 2015), and Spanish (Pinto and Vigil 2018). Scobbie et al. (2011) looks at such clicks in German and in English spoken by German speakers. The use of clicks as discourse markers has also been used in forensic phonetic speaker-identification studies (Gold 2012; Gold et al. 2013). Clicks may also be used as discourse markers in click languages. A voiceless palatal click occurs as a marker of the start of a turn in the speech of some Khoekhoe women, for instance (Christian Rapold, personal communication, 2013). Emergent clicks arise as a result of overlap between a sequence of bilabial + dorsal, or coronal + dorsal obstruent constrictions. A weak emergent bilabial click resulting from the overlap of a bilabial and dorsal constriction has been documented in many African languages as a variant of a labial-velar stop

(cf. Demolin 1991; Harley 2012; Ladefoged 1968: 9). Weak bilabial clicks can also result from the overlap of a sequence of bilabial and velar consonants. For instance, sequences of /p/ and /k/ spoken by Norwegian subjects resulted in a ‘click-like phenomenon’ (Koreman and Morland 2007). An aerodynamic study of nasal clicks resulting from sequences of /m/ and /ŋ/ in Kinyarwanda is provided in Demolin (2016). Emergent clicks have been documented in a number of European languages such as German, English and French, though it is likely to be a more widespread phenomenon (Ohala 1995, 1997). Acoustic evidence for weak clicks in German is shown in Simpson (2007; 2001: 36). There is electropalatographic data showing weak clicks in French due to overlap of dental and velar consonants (Marchal 1987). EPG and aerodynamic data on weak clicks in German due to overlap of various consonant sequences is shown in Fuchs and Rodgers (2013) and Fuchs et al. (2007).

4.3 Linguistic Anthropological Studies of Clicks

In addition to having a purely linguistic function, clicks can have social or emblematic meanings attached to them. For instance, clicks have been used to signal ethnicity amongst people of Khoe and San ancestry in South Africa (Brown and Deumert 2017), and among newly orthodox Jews in the U.S. (Benor 2004: 189). The hesitation marker used in the U.S. is typically lower in amplitude than many click consonants used in South African languages yet it is still quite salient. The use of a ǀXam phrase as a motto on the South African coat of arms carries a social and political meaning (cf. Barnard 2003). Both in the motto and elsewhere, the visual representation of clicks with the non-ASCII letters ǀ, ǃ, ǁ, ǂ helps them stand out from consonants in orthographies that do not use these letters, and helps draw a connection to a non-Bantu, non-European Khoisan heritage (Brown and Deumert 2017). 
Social meanings attached to click consonants may have played a role in language contact scenarios where clicks were either adopted or lost (Irvine and Gal 2000; Wilmsen and Vossen 1990). Given the widespread availability of Unicode, there is no longer any technical impediment towards a more widespread use of letters ǀ, ǃ, ǁ, ǂ by international organizations. The use of these letters in astronomical names has not yet been approved by the International Astronomical Union (IAU), but several names with clicks from Juǀ’hoan and ǀXam were submitted to the Small Bodies Nomenclature committee of the IAU (see Grundy et al. 2019). Some of the hesitation to adopt such names may be linked to ideologies about what a name should look like, or that an international spelling should be restricted to a more widely used set of Latin characters.
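The point about Unicode can be made concrete with a short Python sketch. It is offered only as an illustration: the variable names are hypothetical, and the code-point values come from the Unicode character charts rather than from any of the works cited here. The snippet prints the code point and formal Unicode name of each click letter, and checks that the two letters most easily mistaken for ASCII punctuation are in fact distinct characters classified as letters.

```python
import unicodedata

# Click letters written as escapes so that the code points are explicit.
CLICK_LETTERS = ["\u01c0", "\u01c3", "\u01c1", "\u01c2", "\u0298"]  # ǀ ǃ ǁ ǂ ʘ

for ch in CLICK_LETTERS:
    # unicodedata.name() returns the formal Unicode character name,
    # which need not match current phonetic terminology.
    print(f"U+{ord(ch):04X}  {ch}  {unicodedata.name(ch)}")

# ǀ and ǃ resemble the ASCII characters '|' and '!' but are separate
# characters, and str.isalpha() treats them as letters, not punctuation.
lookalikes = {"\u01c0": "|", "\u01c3": "!"}
for click, ascii_char in lookalikes.items():
    assert click != ascii_char
    print(click.isalpha(), ascii_char.isalpha())
```

The formal Unicode names preserve older labels (ǂ is named an ‘alveolar’ click and ǃ a ‘retroflex’ click), so they should not be read as phonetic descriptions. In practice, substituting the ASCII look-alikes for ǀ and ǃ is a common source of data-entry errors, since software will treat the result as punctuation rather than as part of a word.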

4.4 Clicks in Language Acquisition and Disordered Speech

There have been several studies of click consonants in L1/L2 acquisition but none which have focused on ‘Khoisan’ languages apart from Kirk Miller’s short contribution on the acquisition of clicks in Hadza (this volume).²⁸ Studies of clicks in disordered speech have mostly looked at click productions by English speakers. The study of clicks in acquired apraxia of speech in an isiZulu speaker (van der Merwe and Steyn, this volume) is a most notable and welcome exception. Studies of L1 acquisition of clicks have mainly looked at isiXhosa language acquisition (e.g. Mowrer and Burger 1991; Lewis 1994a; Lewis and Roux 1995; Tuomi et al. 2001; Gxilishe 2004; Maphalala et al. 2014), but there are a few studies that have looked at the acquisition of clicks in isiZulu (e.g. Naidoo et al. 2005; Herbert 1983). Clicks are generally among the last consonants reliably acquired in these languages, along with affricates and prenasalized stops (Naidoo et al. 2005; Maphalala et al. 2014). The dental is generally the first click type acquired followed by either the lateral or the postalveolar (Gxilishe 2004; 1994a; Herbert 1983; Mowrer and Burger 1991). Before they are fully acquired, clicks may be substituted with non-clicks (velars, coronals or palatals) or by different clicks (Lewis and Roux 1996). Lewis and Roux also noticed a tendency for some accompaniments to be modified, for instance, voiceless accompaniments were often produced with nasalization. Aspirated clicks are generally acquired after voiceless unaspirated clicks (Lewis 1994). All three click types are acquired before the full set of accompaniments, and accompaniments may not be acquired across all click types at the same time (Maphalala et al. 2014).

28  There is some documentation of Nogau ǃXun words as pronounced by children (Lebzelter 1934: 66). One word with a dental click in children’s speech matches a target dental click in adult speech, while in another case, a dental click reflects a target lateral click. Other words either show click deletion or replacement with coronal or velar consonants. Affricates and sibilants also show a mix of target and non-target pronunciations.

L2 click-acquisition shows some similarities to L1 acquisition. One study (Lewis 1994b) showed that English- and Afrikaans-speaking children acquiring isiXhosa as L2 who were between the ages of 10;0 and 13;0 produced similar substitutions for clicks and accompaniments as did L1 learners (Lewis and Roux 1996). The L2 learners also produced dental clicks much more reliably than postalveolar clicks. Adult L2 learners of isiXhosa studied by Lewis and Roux (1998; 1999) were perceived by native speakers to produce plain and nasalized click accompaniments more reliably than aspirated or ‘voiced’ accompaniments. They were generally perceived to have substituted one click for
another rather than to substitute a click with a non-click. Although there have been no studies of L2 acquisition of palatal /ǂ/ clicks, anecdotal evidence from teaching clicks to phonetics students suggests that palatal /ǂ/ and alveolar /ǃ/ are generally more difficult for English speakers to acquire than dental /ǀ/ and lateral /ǁ/ click types. Fehn (this volume) suggests that an influx of L2 speakers into Kalahari Khoe groups may have triggered the loss of abrupt /ǂ/ and /ǃ/ clicks in those languages. Moisik and Dediu (this volume) provide a unique look at the L2 acquisition of clicks by looking at the articulations made by novice learners who were asked to imitate clicks. There are some cases of disordered speech where the clicks targeted for pronunciation are replaced by non-clicks or are distorted (cf. van der Merwe and Steyn, this volume). In the course of doing fieldwork, I have encountered one Sandawe speaker unable to produce any laterals, including clicks, and one speaker of a Tuu language unable to produce bilabial clicks. Tooth loss has caused some individuals to be unable to produce certain clicks (Beach 1938: 78). One isiXhosa speaker with a mild phonological delay has been reported as having difficulty with affricates but not with clicks (Rossouw and Pascoe 2018). Clicks are available for use in innovative non-linguistic and paralinguistic functions. An interesting non-linguistic use of clicks ([ǃ] or perhaps [ǂ]) is in echolocation for people who are blind (Thaler et al. 2011; Yost et al. 2014). Clicks may also occur in twin languages (Bakker 1990: 89) and as an audible co-gesture in Sign languages (e.g. Edward 2018; de Vos 2012: 135). Cases where English pulmonic consonants have been replaced by clicks also occur (cf. Ball and Müller 2007). Clicks have been reported to replace sibilants (Bedore et al. 1994), sonorants (Heselwood 1997), stops and affricates (Gibbon et al. 2008), bilabials (Howard 1993), and a wide range of consonants (Khaila et al. 
2007). Conditions associated with such substitutions include: Down’s syndrome (Heselwood 1997), velocardiofacial syndrome (Gibbon et al. 2008), cleft palate and hearing impairment (Howard 1993) and tracheostomy (Khaila et al. 2007).

4.5 Historical/Comparative Studies of Clicks

Much of what is known about the diachronic development of clicks in Khoe and Kx’a languages is surveyed in Fehn’s contributions to this volume, which build on earlier work such as Traill and Vossen (1997). In this section, I will briefly discuss recent findings on the diachronic patterns involving clicks in Tuu, Kx’a and Bantu languages. The diffusion of clicks into Bantu languages is discussed in a number of recent publications: Pakendorf et al. 2017, Gunnink et al. 2015, Louw 2013 and Sands and Gunnink 2019. To summarize some of these findings, words appear
to have been borrowed with clicks in several different language contact situations, without any prerequisite that there be a special speech register such as hlonipha as has been suggested by Herbert (1990). In several instances, click loss subsequently occurred. Clicks were borrowed from Khoisan languages into Bantu, but clicks also appear in items with Proto-Bantu roots, particularly in words with affective, directive or logical meanings. Clicks have also been borrowed from Nguni into other languages. Click loss tends to be greatest among lects found on the periphery of the Kalahari Basin linguistic area (Sands and Gunnink 2019; Sands et al. 2017). Gunnink (this volume) details patterns of click loss in Fwe. For a language to have a high functional load for clicks, it apparently helps to be surrounded by other languages that can supply loan items containing clicks. As Fehn (this volume) has shown, the alveolar click type is prone to being lost before the palatal click type in Khoe languages.²⁹ This may also be the case for ǃUi languages.³⁰ Traill and Vossen (1997: 41–42) provide evidence that some alveolar clicks were lost in ǁXegwi. ǁXegwi speakers were in close contact with siSwati speakers and at least some words with alveolar clicks in ǁXegwi correspond to siSwati forms (e.g. /iŋ-ǃɑlɑː-e/ ‘I begin’ (ǁXegwi), /ǃɑːla/ ‘begin’ (siSwati) (Lanham and Hallowes 1956)), suggesting that ǁXegwi reborrowed the alveolar click type from this Nguni language or possibly from Southern Sotho (Westphal, n.d.). ǁXegwi appears to have lost some Proto-ǃUi palatal *ǂ clicks which are realized as lateral clicks or as non-clicks (Sands 2007). Surprisingly, ǀXam also appears to have lost some Proto-ǃUi palatal *ǂ clicks and it may be the case that palatal clicks were borrowed back into the language from Kora, a Khoe language. For example, the word for ‘wind’ can be reconstructed with a palatal click *ǂqʰ on the basis of the Nǀuu form /ǂqʰoe/. 
Reflexes in ǁXegwi (/ʃwee/) and ǀXam (/ǃʰwé/) exemplify click loss and click merger, respectively. Many lexical items with a palatal click in ǀXam have similar forms in Kora or Khoekhoegowab, e.g. ‘springhare’ /ǂɡoː/ (ǀXam), /ǂɡõː-p/ (Kora), /ǂɡȍó-b/ (Khoekhoegowab), so it seems likely that ǀXam reborrowed the palatal click from a neighboring language (Sands 2014). Interestingly, the loss of Proto-ǃUi palatal clicks in ǁXegwi and ǀXam seems to correspond to a loss of palatal place contrast in non-clicks as well. Nǀuu forms with a palatal plosive correspond to ǁXegwi and ǀXam forms with alveolar plosives, e.g. ‘mouth’ /cu/ (Nǀuu), /tuː/ (ǀXam), /tu/ (ǁXegwi); ‘ostrich’ /cue/ (Nǀuu), /toːi/ (ǀXam), /towe/ (ǁXegwi) (Sands 2014).

Palatal clicks also appear to be unstable in some Taa languages (Sands 2014). For instance, reflexes of the lexical item glossed as ‘wind’ in Taa may also be reconstructed with a palatal click on the basis of forms such as: ǃXóõ /ǂqhùe/ ‘wind’ (Traill 1994) and Nǀuǁ’en /ǂweː ~ ǂxeː/ ‘wind’. In Kakia, however, this root appears with a lateral click: /ǁʰweː ~ ǁkxweː/ ‘wind’.31 Kakia also shows alveolar reflexes in words with likely Proto-Tuu palatal clicks, e.g. ‘mouth’ /ǃweː/ (Kakia) (cf. /ǂûe/ ‘mouth’ (ǃXóõ), /ǂũ/ ‘mouth’ (Nǀuǁ’en)) (Sands 2014). Both Nǀuǁ’en, a Taa lect, and ǀXam, a ǃUi lect, show some evidence for a sound change from a palatal *ǂ to an alveolar /ǃ/ click type.

Table 1.5  Proto-Kx’a *ʘ cognate sets proposed by Heine and Honken (2010: 27)

ǂHoan | SE-ǃXun | NW-ǃXun | Meaning
ʘ’ũ | ǀ’áú | ǀ’āō | ‘duiker’
dzǎ-ʘúí | nǀúí | nǀúí | ‘friend, other’
ʘóa | gǀà’á | gǀà’ā | ‘eye’
nʘóʕa | nǀã’̏ ȁn | nǀã’ā | ‘sky’

A bilabial click *ʘ is reconstructed for Proto-Kx’a by Heine and Honken (2010), who propose that the bilabial clicks merged with the dental clicks in the Ju family. This accounts for cognates shown in Table 1.5. I propose instead (Sands 2014; 2018) that bilabial clicks in ǂ’Amkoe derive from labiovelarized clicks (i.e. *ǃʷ, *ǃǃʷ, *ǀʷ, *ǁʷ). This proposal helps to account for forms that appear to be cognate but which involve other click types, such as those shown in Table 1.6. The lack of cognate sets involving bilabial ʘ with palatal ǂ clicks is not surprising given the rarity of labiovelarized palatals crosslinguistically. Labiovelarized clicks occur in Nguni languages, e.g. ukugcweka /uku-ᶢǀʷeka/ ‘to curse’, where the likely source is a Khoe diphthong /ɡǀoe/ ‘to curse’ (Louw 1977). Note that it appears from such instances that clicks are easier to borrow than diphthongs.

29  Mathes (2016) shows that while Tsua, a Khoe language, has contrastive alveolar and palatal click types, these occur in combination with fewer contrastive accompaniments than do the dental and lateral click types. Furthermore, the number of contrastive alveolar click consonants is smaller than the number of palatal click consonants, suggesting that alveolar click loss is more advanced.
30  Note, however, that palatal ǂ clicks have merged with (post)alveolar ǃ clicks for some speakers of Kora (Engelbrecht 1936: 202) and Griqua (Sands and Namaseb 2007) and are cross-linguistically less common. In the case of Nguni languages, palatal clicks in Khoekhoe loans were adapted as either dental or postalveolar clicks (Louw 1977; 2013).
31  It is possible that the lateral click transcribed by Dorothea Bleek for Kakia was actually something like the fricated /⨎/ found in Ekoka ǃXun.

Table 1.6  Kx’a cognate sets involving bilabial clicks and non-dental clicks

ǂHoan | SE-ǃXun | NW-ǃXun | Proto-Ju click type (if known)
ʘχúì ‘tail’ | ǃxúi ‘tail’ | ǁxóe ‘tail’ | *ǃǃ
ɡʘə̄ ʢù ‘heartburn’ | ɡǃàʢ ‘burp’ | – | *ǃǃ
ʘhò“me ‘suck ostrich egg of water’ | | nǃò’m̀ -sȁŋ ‘drink by sucking through a straw’ | *ǃ
ʘkōa ‘kill’ (pl.) | ǃ’óán ‘kill’ (pl.) | – | *ǃ or *ǃǃ
zǎ-ʘkūɪ ‘friend’ | | ǃṹi̋ ‘friend, relative’ | *ǃ
ʘkōɑ̃ ‘tortoise’ | ǁò’á ‘tortoise’ | ɡǁò’à ‘tortoise’ | *ǁ
mʘnò’ru ‘squeeze’ | nǁoaqrà ‘squeeze out’ | – | *ǁ

Source: Sands 2014; 2018 (citing data from Sands (field notes); Collins and Gruber (forth.); Dickens (1994); Heikkinen (1986))

5 Conclusion

Click consonants are integral to the segment inventories of many languages and a great deal of information about their patterning and production is now known. Phonological theories need to be able to account for the patterning of clicks. Although there are many phonetic studies of clicks, numerous languages and dialects still lack basic phonetic documentation of their clicks and we have yet to grasp the range of variation due to prosodic position, speech rate, coarticulation, speech style, etc. With better documentation, we can get better insight into the diachronic and synchronic processes that affect click consonants.

Acknowledgments

Thanks to all the professors, colleagues and language teachers who enabled me to learn about clicks. I would particularly like to thank Louis Goldstein who taught me how to pronounce Nguni clicks as an undergraduate, Peter Ladefoged who suggested I work on the acoustics of clicks for my M.A. thesis, and Ian Maddieson who suggested I work on East African click languages for my Ph.D. thesis. Colleagues Rosalie Finlayson, Nhlanhla Thwala, Mazisi Kunene, George Vilakazi and Kim Vilakati-Thomas were important in exposing me to Nguni languages early in graduate school. I thank Pam Beddor, Ed Elderkin and John Ohala who showed me early on that I could combine my interests in phonetics and historical linguistics. I am especially grateful to Amanda Miller, Chris Collins, Johanna Brugman, Levi Namaseb, Andy Chebanne, Herman Batibo, Kemmonye Monaka, Willi Haacke, and Nicholas Blurton Jones, along with Peter and Ian, who enabled me to do fieldwork on a diverse set of click languages. Amanda Miller’s detailed work on the phonetics and phonology of clicks has greatly benefited this current study.

My deepest gratitude goes to the many other speakers of click languages that I have worked with, including speakers of Nǀuu, Dahalo, Sandawe, Hadza, ǂHoan, Griekwa, Juǀ’hoan and ǃXun. I thank them for their friendliness and willingness to share their language with me. Among my many language teachers, I would like to single out Gudo Bala (Hadza), Levi Namaseb (Damara), Katrina Esau (Nǀuu), Griet Seekoei (Nǀuu), Andries Oleyn (Nǀuu) and Muchuwaiko Tsaidla (ǂHoan) and thank them for their exceptionally positive attitude towards linguistic fieldwork and patience in teaching me how to pronounce things. For their contributions to the analyses in this paper, I would particularly like to thank Grootfontein ǃXun speakers: Joseph ǀUkx’a Geise, Maria Kunǁa Nǀami, Mishe ǁXae Nǀami, ǀXai Tsubeb, Kaqe Useb, Rosalia ǀHoan Nǂunkuris, Selma Xaro Samseb. I would also like to thank Mangetti Dune ǃXun speakers: Sabine Towe Riem, Jenggu Rooi Fransisko, Martin ǁOshe Aromo, Bingo Kanoho Costa, Caroline Tumbo Kalehi.

Many thanks to Amanda Miller, Johanna Brugman and Hirosi Nakagawa for sharing data shown here. I thank Kirk Miller for his numerous and detailed comments on this paper. Others who provided helpful feedback include Michael Proctor, Anne-Maria Fehn, Kerry Jones, Chris Collins, Sean Fulop, Richard Griscom, Amanda Miller, Richard Wright, Scott Moisik and Florian Lionnet. I thank Brian Joseph and Catherine Best for their encouragement that helped me finish this book. I also thank Will Grundy, without whom this work would not have been possible.

This material is based upon work supported by the National Science Foundation under Grants #BCS-0726200: ‘Collaborative Research: Phonetic and Phonological Structures of Post-velar Constrictions in Clicks and Laterals’, BCS-0236795: ‘Collaborative Research: Descriptive and Theoretical Studies of Nǀuu’, and SBR-9514214: ‘Research on Hadza Demography and Language’. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author and do not necessarily reflect the views of the National Science Foundation.


References

Afido, Pedro, Gregório Firmino, John Heins, Samba Mbuub, and Manuel Trinta. 1989. Relatório do I Seminário sobre a Padronização da Ortografia de Línguas Moçambicanas. Maputo: Universidade Eduardo Mondlane. Anders, H. 1934–1935. “A note on a South Eastern Bushman dialect.” Zeitschrift für Eingeborenen-Sprachen 25: 81–89. Anders, H.D. 1937. “The clicks.” South African Journal of Science 33: 926–939. Anderson, Victoria B. 2008. “Static Palatography for Language Fieldwork.” Language Documentation and Conservation 2(1): 1–27. Aycard, Pierre Benjamin Jacques. 2014. “The use of Icamtho by children in White City-Jabavu, Soweto: slang and language contact in an African urban context.” Ph.D. thesis, University of Cape Town. Bailey, Richard. 1995. “Issues in the phonology and orthography of Chopi (ciCopi S 61).” The Complete Linguist: Papers in Memory of Patrick J. Dickens, ed. Anthony Traill, Rainer Vossen, and Megan Biesele. Cologne: Rüdiger Köppe, pp. 135–170. Bakker, Peter. 1990. “The structure of autonomous languages: natural or not?” Natürlichkeit der Sprache und der Kultur: Acta Colloquii, ed. W.A. Koch. Bochum: Brockmeyer, pp. 74–95. Ball, Martin J., Sara J. Howard, and Kirk Miller. 2018. “Revisions to the extIPA chart.” Journal of the International Phonetic Association 48(2): 155–164. Ball, Martin J., and Nicole Müller. 2007. “Non-pulmonic-egressive speech in clinical data: A brief review.” Clinical Linguistics & Phonetics 21(11–12): 869–874. Barnard, Alan. 2003. “ǃKe e: /xarra //ke – Multiple origins and multiple meanings of the motto.” African Studies 62(2): 243–250. Baumbach, E.J.M. 1974. Introduction to the Speech Sounds and Speech Sound Changes of Tsonga. Pretoria: van Schaik. Baumbach, Erdmann J.M. 1997. Languages of the Eastern Caprivi. Namibian Languages: Reports and Papers, ed. Wilfrid H.G. Haacke, and Edward D. Elderkin. Cologne: Rüdiger Köppe, pp. 307–451. Beach, D.M. 1938. The Phonetics of the Hottentot Language. Cambridge: W. 
Heffer & Sons Ltd. Bedore, Lisa M., Laurence B. Leonard, and Jack Gandour. 1994. “The substitution of a click for sibilants: a case study.” Clinical Linguistics & Phonetics 8(4): 283–293. Bell, Arthur, and Chris Collins. 2001. “ǂHoan and the typology of click accompaniments in Khoisan.” Khoisan: Syntax, Phonetics, Phonology, and Contact, ed. Arthur Bell and Paul Washburn. Cornell Working Papers in Linguistics 18, pp. 126–153. Benor, Sarah B. 2004. “Second Style Acquisition: The Linguistic Socialization of Newly Orthodox Jews.” Ph.D. thesis, Stanford University.


Bleek, Dorothea F. 1927. “The distribution of Bushman languages in South Africa.” Festschrift Meinhof, ed. F. Boas, O. Dempwolff, G. Panconcelli-Calzia, A. Werner, and D. Westermann. Hamburg: Heinrich Augustin, pp. 55–64. Bleek, W.H.I. 1858. “The family of sex-denoting languages: South African species, 1: The Hottentot Language.” Catalogue of the Library of His Excellency Sir George Grey, vol. 1 (Philology), pt. 1 (South Africa). London: Trübner & Co, pp. 4–30. Bleek, W.H.I. 1862. A comparative grammar of South African languages. London: Trübner & Co. Bostoen, Koen, and Bonny Sands. 2012. “Clicks in south-western Bantu languages: Contact-induced vs. language-internal lexical change.” Proceedings of the 6th World Congress of African Linguistics, Cologne, 17–21 August 2009, ed. M. Brenzinger and Anne-Maria Fehn. Cologne: Rüdiger Köppe, pp. 121–132. Boyce, William B., and William J. Davis. 1844. A Grammar of the Kaffir Language, 2nd ed. augm. and improved. London: Printed for the Wesleyan Missionary Society. Bradfield, Julian. 2014. “Clicks, concurrency and Khoisan.” Phonology 31(1): 1–49. Breckwoldt, G.H. 1972. “A critical investigation of click symbolism.” Proceedings of the 7th International Congress of Phonetic Sciences, Montreal 1971, ed. André Rigault and René Charbonneau. Janua linguarum: series maior 57. Hague & Paris: Mouton, pp. 281–293. Breckwoldt, G.H. 1978. “Critical review of early descriptions and symbols of click sounds.” Hamburger Phonetische Beiträge 25: 73–95. Breckwoldt, G.H. 1979. “African click sounds: early descriptions and symbols.” Current Issues in the Phonetic Sciences: Proceedings of the IPS-77 Congress, Miami Beach, Florida, 17–19 December 1977, ed. Harry Hollien and Patricia Hollien. Current Issues in Linguistic Theory 9. Amsterdam: John Benjamins, pp. 509–520. Brown, Justin, and Ana Deumert. 2017. “ ‘My tribe is the Hessequa. I’m Khoisan. 
I’m African’: Language, desire and performance among Cape Town’s Khoisan language activists.” Multilingua: Journal of Cross-Cultural and Interlanguage Communication 36(5): 571–594. Brugman, Johanna. 2009. “Segments, Tones and Distribution in Khoekhoe Prosody.” Ph.D. Dissertation, Cornell University. Butcher, Andrew. 1995. “The phonetics of neutralisation: The case of Australian coronals.” Studies in General and English Phonetics: Essays in Honour of Professor J. D. O’Connor, ed. Jack Windsor Lewis. London & New York: Routledge, pp. 10–38. Catford, J.C. 1977. Fundamental problems in phonetics. Bloomington, Indiana: Indiana University Press. Chebanne, Andy. 2016. “Writing Khoisan: harmonized orthographies for development of under-researched and marginalized languages: the case of Cua, Kua, and Tsua dialect continuum of Botswana.” Language Policy 15(3): 277–297.


Collins, Chris, and Jeff Gruber. 2014. A Grammar of ǂHȍã. Quellen zur Khoisan-Forschung/Research in Khoisan Studies 32. Cologne: Rüdiger Köppe. Collins, Chris, and Jeffrey S. Gruber. forthcoming. ǂHȍã Vocabulary, Recorded Utterances and Oral Texts. Quellen zur Khoisan-Forschung. Cologne: Rüdiger Köppe. Copland, B.D. 1938. “Notes on the phonetics of Sandawe.” Le Maître Phonétique 53: 60–64. Dammann, Ernst. 1957. Studien zum Kwangali: Grammatik, Texte, Glossar. Abh. aus dem Gebiet der Auslandskunde 63. Hamburg: Cram, de Gruyter & Co. Dart, Sarah N., and Paroo Nihalani. 1999. “The articulation of Malayalam coronal stops and nasals.” Journal of the International Phonetic Association 29(2): 129–142. Dave, Radhekant. 1977. “Retroflex and dental consonants in Gujarati: A palatographic and acoustic study.” ARIPUC (Annual Report of the Institute of Phonetics, University of Copenhagen) 11: 27–155. Davey, Anthony S. 1965. “Morphology of the Substantive in Subiya.” B.A. Honors thesis, University of the Witwatersrand. Demolin, Didier. 1991. “Les consonnes labio-vélaires du mangbetu.” Pholia 6: 85–105. Demolin, Didier. 2016. “Clicks, stop bursts, vocoids and the timing of articulatory gestures in Kinyarwanda.” Studies in African Linguistics 45(1–2): 141–160. Dempwolff, Otto. 1913. “Sprechapparate beim Unterricht in der Namasprache.” Vox. Aus dem Seminar für Kolonialsprachen des Hamburgischen Kolonialinstituts Reihe H, Heft 5, pp. 246–255. de Vos, Connie. 2012. “The Kata Kolok perfective in child signing: Coordination of manual and non-manual components.” Sign Languages in Village Communities: Anthropological and Linguistic Insights, ed. Ulrike Zeshan and Connie de Vos. Sign Language Typology, 4. Boston & Nijmegen: De Gruyter Mouton & Ishara Press, pp. 127–152. Dickens, Patrick. 1986. “Qhalaxarzi Phonology.” Masters Dissertation. Johannesburg: University of the Witwatersrand. Dickens, Patrick. 1991. 
“Juǀ’hoan orthography in practice.” South African Journal of African Languages 11(1): 99–104. Dickens, Patrick J. 1994. English-Juǀ’hoan, Juǀ’hoan-English dictionary. Quellen zur Khoisan-Forschung/Research in Khoisan Studies 8. Cologne: Rüdiger Köppe Verlag. Diepstra, Heidi, Sandra E. Trehub, Alice Eriks-Brophy, and Pascal H.H.M. van Lieshout. 2017. “Imitation of non-speech oral gestures by 8-month-old infants.” Language and Speech 60(1): 154–166. Dogil, Grzegorz, and Justus C. Roux. 1996. “Notes on unencoded speech: Clicks and their accompaniments in Xhosa.” Proceedings of the Sixth Australian Conference on Speech Science and Technology, ed. Paul McCormack. Canberra: Australian Speech Science and Technology Association, pp. 55–60.


Dogil, Grzegorz, Jörg Mayer, and Justus C. Roux. 1997. “Syllables and unencoded speech: clicks and their accompaniments in Xhosa.” Phonologica 1996: Syllables!?: Proceedings of the 8th International Phonology Meeting, ed. John R Rennison and Klaus Kühnhammer. The Hague: Holland Graphics, pp. 49–60. Doke, Clement M. 1923a. “A dissertation on the phonetics of the Zulu language.” Bulletin of the School of Oriental Studies, University of London 2(4): 685–729. Doke, Clement M. 1923b. “Notes on a problem in the mechanism of the Zulu clicks.” Bantu Studies 2(1): 43–45. Doke, Clement M. 1923. “An outline of the phonetics of the language of the ʗhũ̬ː Bushmen of North-West Kalahari.” Bantu Studies 2(3): 129–165. DOI: 10.1080/02561751.1923.9676181. Doke, Clement M. 1926. The phonetics of the Zulu language. Bantu Studies 2, special number. Johannesburg: University of Witwatersrand Press. (Nendeln/Liechtenstein: Kraus Reprint 1969). Doke, Clement M. 1936. “An outline of ǂKhomani Bushman phonetics.” Bantu Studies 10(4): 433–461. du Plessis, Menán. 2017. “The name of the fourth river: a small puzzle presented by a fragment of Kora, for Johan Oosthuizen.” Stellenbosch Papers in Linguistics 48: 123–137. du Plessis, Menán. 2018. Kora: A Lost Khoesan Language of the Early Cape and the Gariep. Cape Town: South African History Online & UNISA Press. Eaton, Helen. 2006. “Illustrations of the IPA: Sandawe.” Journal of the International Phonetic Association 36(2): 235–242. Edgar, John T. 1991. Maba group lexicon. Sprache und Oralität in Afrika, 13. Berlin: Dietrich Reimer Verlag. Edward, Mary. 2018. “Our signs matter: Protecting sign language in Adamorobe (Ghana).” Ilorin: Linguistic Immersion Centre, University of Ilorin, Nigeria. Institut Universitaire Panafricain. Downloaded 29 Jan. 2019, from Academia.edu. Ehret, Christopher. 2013. “The Extinct Khoesan languages in East Africa.” The Khoesan Languages, ed. Rainer Vossen. New York: Routledge, pp. 465–479. Elderkin, Edward D. 1989. 
“The Significance and Origin of the Use of Pitch in Sandawe.” Ph.D. thesis, University of York. Elderkin, Edward D. 1992. “Predictable nasality before East African clicks.” Afrikanistische Arbeitspapiere 29: 111–129. Elderkin, Edward D. 2014. “Clicks, prosodies and Khoisan.” Beyond ‘Khoisan’: Historical Relations in the Kalahari Basin, ed. Tom Güldemann and Anne-Maria Fehn. Amsterdam: John Benjamins, pp. 103–122. Engelbrecht, J.A. 1936. The Korana: An Account of their Customs and their History, with Texts. Cape Town: Maskew Miller.


Essing, D. 1964. “Die Urbevölkerung des Südlichen Afrika.” Neues Afrika 6(7): 238–239. Exter, Mats. 2008a. “Properties of the Anterior and Posterior Click Closures in Nǀuu.” Ph.D. thesis, Universität zu Köln. Available online: http://kups.ub.uni-koeln.de/4979/. Exter, Mats. 2008b. “Proposed feature representations for clicks. An overview.” Khoisan Languages and Linguistics: Proceedings of the 2nd International Symposium January 8–12, 2006, Riezlern/Kleinwalsertal, ed. Sonja Ermisch. Quellen zur Khoisan-Forschung/Research in Khoisan Studies 22. Cologne: Rüdiger Köppe, pp. 137–152. Exter, Mats. 2010. “Stiffness as an articulatory gesture in Nǀuu.” Poster presented at LabPhon 12, 8–10 July, 2010, Albuquerque, New Mexico. Exter, Mats. 2011. “The acoustic modeling of click types.” The 17th International Congress of Phonetic Sciences (ICPhS XVII), Hong Kong, August 17–21, 2011, pp. 655–658. Fehn, Anne-Maria. 2018. “New data on Northeastern Kalahari Khoe phoneme inventories: A comparative survey.” Africana Linguistica 24: 5–29. Fehn, Anne-Maria. 2019. “Phonological and lexical variation in the Khwe dialect cluster.” Zeitschrift der Deutschen Morgenländischen Gesellschaft 169(1): 9–38. Fuchs, Susanne, and Blake Rodgers. 2013. “Negative intraoral pressure in German: Evidence from an exploratory study.” Journal of the International Phonetic Association 43(3): 321–337. Fuchs, Susanne, Laura Koenig, and Ralf Winkler. 2007. “Weak clicks in German?” Proceedings of the International Congress of Phonetic Sciences XVI, Saarbrücken, Germany, August 2007, pp. 449–452. Available online: http://www.icphs2007.de/conference/Papers/1678/. Fulop, Sean A., Peter Ladefoged, Liu Fang, and Rainer Vossen. 2003. “Yeyi clicks: Acoustic Description and Analysis.” Phonetica 60: 231–260. Garellek, Marc. 2020. “Acoustic discriminability of the complex phonation system in ǃXóõ.” Phonetica 77(4): 131–160. DOI: 10.1159/000494301. Gerlach, Linda. 2016. 
Nǃaqriaxe – The Phonology of an Endangered Language of Botswana. Wiesbaden: Harrassowitz. Gibbon, Fiona, Alice Lee, Ivan Yuen, and Lisa Crampin. 2008. “Clicks produced as compensatory articulations in two adolescents with velocardiofacial syndrome.” The Cleft Palate-Craniofacial Journal 45(4): 381–392. DOI: 10.1587/06-232.1. Gick, Bryan, Naomi Francis, Anna Klenin, Ezra Mizrahi, and Denise Tom. 2013. “The velic traverse: An independent oral articulator?” Journal of the Acoustical Society of America 133(3): EL208–EL213. Gil, David. 2005. “Paralinguistic usages of clicks.” The World Atlas of Language Structures, ed. Martin Haspelmath, Matthew S. Dryer, David Gil and Bernard Comrie. Oxford: Oxford University Press, pp. 572–575.


Gold, Erica, Peter French, and Philip Harrison. 2013. “Clicking behavior as a possible speaker discriminant in English.” Journal of the International Phonetic Association 43(3): 339–349. Gold, Erica. 2012. “Calculating Likelihood Ratios for Forensic Speaker Comparisons Using Phonetic and Linguistic Parameters.” Ph.D. thesis, University of York. Grundy, W.M., K.S. Noll, M.W. Buie, S.D. Benecchi, D. Ragozzine, and H.G. Roe. 2019. The mutual orbit, mass, and density of transneptunian binary Gǃkúnǁ’hòmdímà (229762 2007 UK126). Icarus 334: 30–38. DOI: 10.1016/j.icarus.2018.12.037. Güldemann, Tom. 1998. “The Kalahari basin as an object of areal typology – a first approach.” Language, Identity, and Conceptualization among the Khoisan, ed. Mathias Schladt. Quellen zur Khoisan-Forschung/Research in Khoisan Studies 15. Cologne: Rüdiger Köppe, pp. 137–169. Güldemann, Tom. 2014. “ ‘Khoisan’ linguistic classification today.” Beyond ‘Khoisan’: Historical Relations in the Kalahari Basin, ed. Tom Güldemann and Anne-Maria Fehn. Amsterdam: John Benjamins, pp. 1–41. Güldemann, Tom. 2016. “Phonological regularities of consonant systems in genetic lineages of Khoisan.” Lone Tree: Scholarship in the Service of the Koon, Essays in Memory of Anthony T. Traill, ed. Rainer Vossen and Wilfrid H.G. Haacke. Cologne: Rüdiger Köppe, pp. 159–207. Güldemann, Tom, and Edward D. Elderkin. 2010. “On external genealogical relationships of the Khoe family.” Khoisan Languages and Linguistics: Proceedings of the 1st International Symposium, January 4–8, 2003, Riezlern/Kleinwalsertal, ed. M. Brenzinger and Christa König. Quellen zur Khoisan-Forschung/Research in Khoisan Studies 24. Cologne: Rüdiger Köppe, pp. 15–52. Güldemann, Tom, and Anne-Maria Fehn. 2017. “The Kalahari Basin area as a ‘Sprachbund’ before the Bantu expansion.” The Cambridge Handbook of Areal Linguistics, ed. Raymond Hickey. Cambridge: Cambridge University Press, pp. 500–526. Güldemann, Tom, and Hirosi Nakagawa. 2018. 
“Anthony Traill and the holistic approach to Kalahari Basin sound design.” Africana Linguistica 24: 45–73. Gunnink, Hilde. 2014. “The grammatical structure of Sowetan tsotsitaal.” Southern African Linguistics and Applied Language Studies 32(2): 161–171. Gunnink, Hilde, Bonny Sands, Brigitte Pakendorf, and Koen Bostoen. 2015. “Prehistoric language contact in the Kavango-Zambezi transfrontier area: Khoisan influence on southwestern Bantu languages.” Journal of African Languages and Linguistics 36(2): 193–232. Gxilishe, Sandile. 2004. “The acquisition of clicks by Xhosa-speaking children.” Per Linguam 20(2): 1–12.


Haacke, Wilfrid H.G. 1989. “Nama: survival through standardization?” Language reform, vol. 5, ed. István Fodor and Claude A. Hagège. Hamburg: Helmut Buske, pp. 397–429. Haacke, Wilfrid H.G. 2001. “Nama.” Facts About the World’s Languages: An Encyclopedia of the World’s Major Languages, Past and Present, ed. Jane Garry and Carl Rubino. New York: H.W. Wilson Co, pp. 497–501. Haacke, Wilfrid H.G. 2018. “Khoekhoegowab (Nama/Damara).” The Social and Political History of Southern Africa’s Languages, ed. Tomasz Kamusella and Finex Ndhlovu. London: Palgrave Macmillan, pp. 133–158. Haacke, Wilfrid H.G., and Eliphas Eiseb. 2002. A Khoekhoegowab Dictionary, with an English-Khoekhoegowab Index. Windhoek: Gamsberg Macmillan. Hale, Kenneth, and David Nash. 1997. “Damin and Lardil phonotactics.” Boundary Rider: Essays in Honour of Geoffrey O’Grady, ed. Darrell Tryon and Michael Walsh. Pacific Linguistics C-136. Canberra: Research School of Pacific and Asian Studies, Australian National University, pp. 247–259. Hamann, Silke. 2003. The Phonetics and Phonology of Retroflexes. LOT Dissertation Series 75. Utrecht: LOT. Harley, Matthew. 2012. “Unusual sounds in Nigerian languages.” Advances in Minority Language Research in Nigeria, vol. 1, ed. Roger M. Blench and Stuart McGill. Kay Williamson Educational Foundation, 5. Cologne: Rüdiger Köppe, pp. 39–66. Heikkinen, Terttu. 1986. “Phonology of the ǃXũ dialect spoken in Ovamboland and western Kavango.” South African Journal of African Languages (Suid-Afrikaanse Tydskrif vir Afrikatale) 6(1): 18–28. Heine, Bernd, and Henry Honken. 2010. “The Kx’a family: A new Khoisan genealogy.” Journal of Asian and African Studies (Ajia Afuriku gengo bunka kenkyu) 79: 5–36. Heine, Bernd, and Christa König. 2015. The ǃXun Language: A Dialect Grammar of Northern Khoisan. Quellen zur Khoisan-Forschung/Research in Khoisan Studies 33. Cologne: Rüdiger Köppe. Herbert, Robert K. 1983. “Clicks in normal and delayed acquisition of Zulu.” [abstract.] 
Abstracts of the Tenth International Congress of Phonetic Sciences, 1–6 August 1983, Utrecht, The Netherlands. Dordrecht, Holland & Cinnaminson, NJ: Foris, p. 629. Herbert, Robert K. 1990. “The sociohistory of clicks in Southern Bantu.” Anthropological Linguistics 32(3–4): 295–315. Herbert, Thomas. 1634. A relation of some yeares travaile, begvnne Anno 1626, into Afrique and the greater Asia, especially the territories of the Persian Monarchie, and some parts of the Orientall Indies and Iles adiacent. Of their religion, language, habit discent, ceremonies, and other matters concerning them. London: Stansby & Bloome. Heselwood, Barry. 1997. “A Case of Nasal Clicks for Target Sonorants: A Feature Geometry Account.” Clinical Linguistics & Phonetics 11(1): 43–61.


Hinton, Leanne, Johanna Nichols, and John Ohala, eds. 1994. Sound Symbolism. Cambridge: Cambridge University Press. Howard, Sara J. 1993. “Articulatory constraints on a phonological system: a case study of cleft palate speech.” Clinical Linguistics & Phonetics 7: 299–317. Hudu, Fusheini, Amanda L. Miller, and Douglas Pulleyblank. 2009. “Ultrasound imaging and theories of tongue root phenomena in African languages.” Proceedings of the Conference on Language Documentation and Linguistic Theory 2, ed. Peter K. Austin, Oliver Bond, Monik Charette, David Nathan and Peter Sells. London: SOAS, pp. 153–163. Available online: http://www.hrelp.org/publications/ldlt2/papers/ldlt2 proceedings.html. Hunziker, Daniel A., Elisabeth Hunziker, and Helen Eaton. 2008. “A description of the phonology of the Sandawe language.” SIL Electronic Working Papers (SILEWP) 2008–004. Hussain, Qandeel, Michael Proctor, Mark Harvey, and Katherine Demuth. 2017. “Acoustic characteristics of Punjabi retroflex and dental stops.” Journal of the Acoustical Society of America 141(6): 4522–4542. Hyman, Larry M. 2009. “How (not) to do phonological typology: the case of pitch-accent.” Language Sciences 31(2–3): 213–238. Ilunga, Mpunga wa. 2001. “siSwati.” Facts About the World’s Languages: An Encyclopedia of the World’s Major Languages, Past and Present, ed. Jane Garry and Carl Rubino. New York: H.W. Wilson Co, pp. 660–663. Irvine, Judith T., and Susan Gal. 2000. “Language ideology and linguistic differentiation.” Regimes of Language: Ideologies, Polities, and Identities, ed. Paul Kroskrity. Oxford: James Currey, pp. 35–83. Jacottet, Édouard. 1896. Études sur les langues du haut-Zambèze. Première partie: Grammaires Soubiya et louyi. Publications de l’école des lettres d’Alger, Bulletin de correspondance africaine 16. Paris: Ernest Leroux. Jakobson, Roman. 1968. 
“Extrapulmonic consonants (ejectives, implosives, clicks).” Quarterly Progress Report of the Research Laboratory of Electronics at the Massachusetts Institute of Technology 90: 221–227. Jessen, Michael. 2002. “An acoustic study of contrasting plosives and click accompaniments in Xhosa.” Phonetica 59: 150–179. Jessen, Michael, and Justus C. Roux. 2002. “Voice quality differences associated with stops and clicks in Xhosa.” Journal of Phonetics 30: 1–52. Johnson, Keith. 1993. “Acoustic and Auditory Analyses of Xhosa Clicks and Pulmonics.” UCLA Working Papers in Phonetics 83: 33–45. Jones, Daniel. 1907. “Implosive sounds and clicks (implousiv saundz ənd kliks).” Le Maître Phonétique Nov.–Dec.: 111–114. Jones, Kerry Lee. 2017. “Language Attitudes as a Change Agent for Language Vitality: A Case Study of Two Khoesan Languages in Platfontein (RSA).” Ph.D. thesis, University of KwaZulu-Natal, South Africa.


Jordan, Linda, and Isata Manuel. 2018. “Sociolinguistic survey of ǃXun in Huila Province.” SIL Electronic Survey Reports, 2018–015. Kagaya, Ryohei. 1978. “Soundspectrographic analysis of Naron clicks: A preliminary report.” Annual Bulletin of Institute of Logopedics and Phoniatrics 12: 113–125. Kagaya, Ryohei, and Hirosi Nakagawa. 1986. “A phonetic and soundspectrographic study of the G/uikhue language.” Journal of Asian and African Studies (Tokyo) 31: 1–23. [In Japanese.] Khaila, Harveen, Jill House, Lesley Cavalli, and Elizabeth Nash. 2007. “A phonetic and phonological study of so-called ‘buccal’ speech produced by two long-term tracheostomised children.” Proceedings of the International Congress of Phonetic Sciences XVI, Saarbrücken, Germany, August 2007, pp. 2033–2036. Available online: http://www.icphs2007.de/conference/Papers/1476/1476.pdf. Kilian-Hatz, Christa. 2008. A Grammar of Modern Khwe (Central Khoisan). Quellen zur Khoisan-Forschung/Research in Khoisan Studies 23. Cologne: Rüdiger Köppe. Killian, Donald. 2009. “Khoemana and the Griqua: Identity at the Heart of Phonological Attrition.” M.A. thesis, University of Helsinki. Klingenheben, August von. 1937. “Die Schnalze in den afrikanischen Sprachen.” Schriften der Deutschen Gesellschaft für Tier- und Ursprachenforschung, Reihe H, Bd 8. Deutschen Gesellschaft für Tier- und Ursprachenforschung. Leipzig: Kurt Vieweg Verlag, pp. 1–12. Kochetov, Alexei, N. Sreedevi, Midula Kasim, and R. Manjula. 2014. “Spatial and dynamic aspects of retroflex production: An ultrasound and EMA study of Kannada geminate stops.” Journal of Phonetics 46: 168–184. Köhler, Oswin, Peter Ladefoged, Jan Snyman, Anthony Traill, and Rainer Vossen. 1988. “The symbols for clicks.” Journal of the International Phonetic Association 18: 140–142. König, Christa, and Bernd Heine. 2008. A Concise Dictionary of Northern ǃXun. Quellen zur Khoisan-Forschung/Research in Khoisan Studies 21. Cologne: Rüdiger Köppe. 
Koreman, Jacques, and Aleksander Morland. 2007. “Speaker consistency of coarticulatory gestures in clusters of labial and velar plosives.” Proceedings of the 16th International Congress of Phonetic Sciences, Saarbrücken, 6–10 August 2007, pp. 577–580. Ladefoged, Peter. 1968. A Phonetic Study of West African Languages: An Auditory-Instrumental Survey, 2nd ed. Cambridge: Cambridge University Press. Ladefoged, Peter. 1993. A Course in Phonetics, 3rd ed. London: Harcourt Brace College Publishers. Ladefoged, Peter. 2000. Vowels and Consonants: An Introduction to the Sounds of Languages. Oxford: Blackwell. Ladefoged, Peter. 2003. Phonetic Data Analysis: An Introduction to Fieldwork and Instrumental Techniques. Oxford: Blackwell.


Ladefoged, Peter, and Ian Maddieson. 1996. The Sounds of the World’s Languages. Oxford: Blackwell. Ladefoged, Peter, and Anthony Traill. 1980. “Phonological features and phonetic details of Khoisan languages.” Bushman and Hottentot Linguistic Studies, 1979, ed. Jan W. Snyman. Miscellanea Congregalia, 16. Pretoria: University of South Africa, pp. 134–166. Ladefoged, Peter, and Anthony Traill. 1984. “Linguistic phonetic descriptions of clicks.” Language 60(1): 1–20. Ladefoged, Peter, and Anthony Traill. 1994. “Clicks and their accompaniments.” Journal of Phonetics 22: 33–64. Lanham, L.W. 1960. “The Comparative Phonology of Nguni.” Ph.D. thesis, University of the Witwatersrand, Johannesburg. Lanham, L.W., and D.P. Hallowes. 1956. “Linguistic relationships and contacts expressed in the vocabulary of Eastern Bushman.” African Studies 15(1): 45–48. Lebzelter, Viktor. 1934. “Die Nogau-Buschmänner von Epukiro.” Eingeborenenkulturen in Süd- und Südwestafrika: wissenschaftliche Ergebnisse einer Forschungsreise nach Süd- und Südwestafrika in den Jahren 1926–1928. Rassen und Kulturen in Südafrika 2. Leipzig: Karl W Hiersemann, pp. 65–79. Lepsius, Karl R. 1873. “Über Buschmänner und Hottentotten, sowie über die Stein- und Eisenzeit im alten Aegypten.” Verhandlungen der Berliner Gesellschaft für Anthropologie, Ethnologie und Urgeschichte. Berlin, pp. 62–65. Lewis, Philip W. 1994a. “Aspects of the Phonological Acquisition of Clicks in Xhosa.” M.A. Thesis, University of Stellenbosch. Lewis, Philip W. 1994b. “The Acquisition of Clicks by non-mother-tongue speakers.” Stellenbosch Papers in Linguistics Plus 26: 98–114. Lewis, Philip W., and Justus C. Roux. 1996. “A Phonological Process Analysis of the Acquisition and Loss of Clicks in Xhosa.” South African Journal of African Languages 16(1): 1–7. Lewis, Philip W., and Justus C. Roux. 1998. “On Xhosa L2 speech and intelligibility: vowels and clicks.” South African Journal of Linguistics, Supplement 36: 51–63. 
Lewis, Philip, and Justus C. Roux. 1999. “On Xhosa L2 speech and intelligibility: Ejectives, implosives and clicks.” Proceedings of the 14th International Congress of Phonetic Sciences, 1–7 August, San Francisco, ed. J. Ohala, Y. Hasegawa, M. Ohala, D. Granville and A. Bailey. Berkeley: Department of Linguistics, University of California at Berkeley. Vol. 2, pp. 1149–1152. Louw, Jacobus Abraham. 1977. “Clicks as loans in Xhosa.” Bushman and Hottentot Linguistic Studies 1975, ed. Jan W. Snyman. University of South Africa. Pretoria, pp. 82–100. Louw, Jacobus Abraham. 2013. “The impact of Khoesan on Southern Bantu.” Handbook of Khoesan languages, ed. Rainer Vossen. New York: Routledge, pp. 435–444.

Lukusa, Stephen T.M., and Kemmonye C. Monaka. 2008. Shekgalagari Grammar: A Descriptive Analysis of the Language and its Vocabulary. CASAS Book Series 47. Cape Town: Centre for Advanced Studies of African Society.
MacMahon, Michael K.C. 1994. “A mid-18th-century use of [ə], [ɔ], and [ʞ] as phonetic symbols.” Journal of the International Phonetic Association 24(1): 19–20.
Maddieson, Ian. 2003. “The sounds of the Bantu languages.” The Bantu languages, ed. Derek Nurse and Gérard Philippson. London: Routledge, pp. 15–41.
Maddieson, Ian, Siniša Spajić, Bonny Sands, and Peter Ladefoged. 1993. “Phonetic Structures of Dahalo.” Afrikanistische Arbeitspapiere 36: 5–53.
Maddieson, Ian, Peter Ladefoged, and Bonny Sands. 1999. “Clicks in East African languages.” African Mosaic: Festschrift for J.A. Louw, ed. Rosalie Finlayson. Pretoria: UNISA Press, pp. 59–91.
Maddieson, Ian, and Bonny Sands. 2019. “The sounds of the Bantu languages.” The Bantu Languages, ed. Mark van de Velde, Koen Bostoen, Derek Nurse and Gérard Philippson. London: Routledge-Curzon, pp. 79–127.
Maphalala, Zinhle, Michelle Pascoe, and Mantoa Rose Smouse. 2014. “Phonological development of first language isiXhosa-speaking children aged 3;0–6;0 years: A descriptive cross-sectional study.” Clinical Linguistics and Phonetics 28(3): 176–194.
Marchal, Alain. 1987. “Des clics en français?” Phonetica 44: 30–37.
Marquard, Carina, Oliver Niebuhr, and Alena Witzlack-Makarevich. 2015. “Phonetic reduction of clicks – Evidence from Nǀuu.” Proceedings of the 18th International Congress of Phonetic Sciences, ed. The Scottish Consortium for ICPhS 2015. Glasgow, UK: the University of Glasgow. Retrieved from http://www.icphs2015.info/pdfs/Papers/ICPHS0240.pdf.
Mathes, Timothy K. 2016. “The segmental inventory and status of click replacement in Tsua.” Khoisan Languages and Linguistics: Proceedings of the 5th International Symposium, July 13–17, 2014, Riezlern/Kleinwalsertal, ed. S. Shah and M. Brenzinger. Quellen zur Khoisan-Forschung/Research in Khoisan Studies 34. Cologne: Rüdiger Köppe, pp. 223–242.
Memmott, Paul, Nicholas Evans, and Richard Robins. 2006. “Understanding isolation and change in island human populations through a study of indigenous cultural patterns in the Gulf of Carpentaria.” Transactions of the Royal Society of South Australia 130(1): 29–47.
Midtlyng, Patrick J. 2011. “The effects of speech rate on VOT for initial plosives and click accompaniments in Zulu.” Selected Proceedings of the 40th Annual Conference on African Linguistics, ed. Eyamba G. Bokamba, Ryan K. Shosted and Bezza Tesfaw Ayalew. Somerville, MA: Cascadilla Proceedings Project, pp. 105–118.
Miller, Amanda L. 2007. “Guttural vowels and guttural co-articulation in Juǀ’hoansi.” Journal of Phonetics 35: 56–84.

Miller, Amanda L. 2008. “Click cavity formation and dissolution in IsiXhosa: Viewing clicks with high-speed ultrasound.” Proceedings of the 8th International Seminar on Speech Production, ed. Rudolf Sock, Susanne Fuchs and Yves Laprie, pp. 137–140. Available online: issp2008.loria.fr/Proceedings/PDF/issp2008-28.pdf.
Miller, Amanda L. 2010. “Tongue body and tongue root shape differences in Nǀuu clicks correlate with phonotactic patterns.” Turbulent Sounds: An Interdisciplinary Guide, ed. Susanne Fuchs, Martine Toda and Marzena Żygis. Berlin: Mouton de Gruyter, pp. 245–279.
Miller, Amanda L. 2011. “Click change in the Juu languages.” Paper presented at the Symposium on Historical Linguistics, 85th Linguistics Society of America Annual Meeting, Pittsburgh, PA, 8 January, 2011.
Miller, Amanda L. 2016. “Posterior lingual gestures and tongue shape in Mangetti Dune ǃXung clicks.” Journal of Phonetics 55: 119–148.
Miller, Amanda L. 2017. “C-V coarticulation in consonants with multiple lingual constrictions.” Abstract of paper presented at Ultrafest VIII, October 4, 2017.
Miller, Amanda L. 2019. “Palatal click allophony in Mangetti Dune ǃXung: Implications for sound change.” Journal of the International Phonetic Association 49(2): 153–181.
Miller, Amanda L., Johanna Brugman, Bonny Sands, Levi Namaseb, Mats Exter, and Chris Collins. 2007. “The sounds of Nǀuu: Place and airstream contrasts.” (ed. Hye-Sook Lee and Pittayawat Pittayaporn. Ithaca, NY: CLC Publications). Working Papers of the Cornell Phonetics Laboratory 16: 101–160.
Miller, Amanda L., Johanna Brugman, Bonny Sands, Levi Namaseb, Mats Exter, and Chris Collins. 2009. “Differences in airstream and posterior place of articulation among Nǀuu clicks.” Journal of the International Phonetic Association 39(2): 129–161.
Miller, Amanda L., and Micha Elsner. 2017. “Click reduction in fluent speech: a semiautomated analysis of Mangetti Dune ǃXung.” Proceedings of the 2nd Workshop on the Use of Computational Methods in the Study of Endangered Languages, Honolulu, Hawaiʻi, USA, March 6–7, 2017, ed. Antti Arppe, Jeff Good, Mans Hulden, Jordan Lachler, Alexis Palmer and Lane Schwartz. Association for Computational Linguistics, pp. 107–115.
Miller, Amanda L., and Kenneth B. Finch. 2011. “Corrected high-speed anchored ultrasound with software alignment.” Journal of Speech, Language, and Hearing Research 54(2): 471–486.
Miller, Amanda L., and Jeffrey J. Holliday. 2014. “Contrastive apical post-alveolar and laminal alveolar click types in Ekoka ǃXung.” Journal of the Acoustical Society of America 135(4.2): 2351–2352.
Miller, Amanda L., Jeffrey Holliday, David Howcroft, Sara Phillips, Bridget Smith, and Tsz-Him Tsui. 2011. “The phonetics of the modern-day reflexes of the Proto-palatal click in Juu languages.” Paper presented at the 4th International Symposium on Khoisan Languages and Linguistics, 11–13 July 2011, Riezlern/Kleinwalsertal.

Miller, Amanda L., Levi Namaseb, and Khalil Iskarous. 2007. “Tongue body constriction differences in click types.” Proceedings of Laboratory Phonology 9, ed. Jennifer S. Cole and José Ignacio Hualde. Berlin: Mouton de Gruyter, pp. 643–656.
Miller, Amanda L., and Abigail Scott. 2011. “The phonetic basis of synchronic and diachronic phonological patterns involving the retroflex click in Juu languages.” Poster presented at the International Seminar on Speech Production 2011 (ISSP 2011), Montréal, Québec, June 20–23, 2011.
Miller-Ockhuizen, Amanda L. 1998. “Towards a unified decompositional analysis of Khoisan lexical tone.” Language, Identity, and Conceptualization among the Khoisan, ed. Mathias Schladt. Quellen zur Khoisan-Forschung 15. Cologne: Rüdiger Köppe, pp. 217–243.
Miller-Ockhuizen, Amanda L. 2000. “Issues in Juǀ’hoansi orthography and their implications for the development of orthographies for other Khoesan languages.” The State of Khoesan Languages in Botswana, ed. Herman M. Batibo and Joseph Tsonope. Gaborone: Tasalls Publishing and Books for the Basarwa Languages Project, pp. 108–124.
Miller-Ockhuizen, Amanda L. 2003. The Phonetics and Phonology of Gutturals: A Case Study from Juǀ’hoansi. Outstanding Dissertations in Linguistics. New York & London: Routledge.
Miller-Ockhuizen, Amanda L., and Bonny Sands. 1999. “ǃKung as a Linguistic Construct.” Language and Communication 19(4): 401–413.
Miller-Ockhuizen, Amanda L., and Bonny E. Sands. 2000. “Contrastive Lateral Clicks and Variation in Click Types.” Proceedings of ICSLP 2000, vol. II. Beijing, China, 499–500.
Mkanganwi, K.G. 1972. “The relationships of coastal Ndau to the Shona dialects of the interior.” African Studies 31: 111–137.
Moisik, Scott R., and Dan Dediu. 2015. “Anatomical biasing and clicks: Preliminary biomechanical modelling.” Proceedings of the 18th International Congress of Phonetic Sciences (ICPhS 2015) Satellite Event: The Evolution of Phonetic Capabilities: Causes, constraints, consequences, ed. H. Little. Glasgow: ICPhS, pp. 8–13.
Moisik, Scott Reid, and Dan Dediu. 2017. “Anatomical biasing and clicks: Evidence from biomechanical modeling.” Journal of Language Evolution 2(1): 37–51.
Mowrer, Donald E., and Sharon Burger. 1991. “A comparative analysis of phonological acquisition of consonants in the speech of 2 1/2–6 year old Xhosa- and English-speaking children.” Clinical Linguistics & Phonetics 5(2): 139–164.
Moyo, Themba. 1995. “Language Contact and Language Change: The Case for Chitumbuka in Northern Malawi.” South African Journal of African Languages 15(4): 86–191.
Naidoo, Yugeshiree, Anita van der Merwe, Emily Groenewald, and Elsie Naudé. 2005. “Development of speech sounds and syllable structure of words in Zulu-speaking children.” Southern African Linguistics and Applied Language Studies 23(1): 59–79.
Nakagawa, Hirosi. 1996. “A first report on the click accompaniments of ǀGui.” Journal of the International Phonetic Association 26(1): 41–54.
Nakagawa, Hirosi. 2006. “Aspects of the Phonetic and Phonological Structure of the Gǀui Language.” Ph.D. thesis, University of the Witwatersrand. Johannesburg.
Nakagawa, Hirosi, Kazuyoshi Sugawara, and Jiro Tanaka. 2014. “ǀGui-English Dictionary (A Preliminary Edition, [April, 2014]).” Manuscript.
Nathan, Geoffrey S. 2001. “Clicks in a Chinese nursery rhyme.” Journal of the International Phonetic Association 31(2): 223–228.
Naumann, Christfried. 2016. “The phoneme inventory of Taa (West ǃXoon dialect).” Lone Tree: Scholarship in the Service of the Koon, Essays in Memory of Anthony T. Traill, ed. Rainer Vossen and Wilfrid H.G. Haacke. Cologne: Rüdiger Köppe, pp. 311–351.
Ndala, António (falante mais velho), Kavila Lititinga Ketsi, Antonio Luciano Cassanga, Manuel Cativa, Helena Domingas, and Fernando Tchilunda. 2014. Ortografia Padrao Unificada Da Língua ǃXun. Orthografia Padrao unificada Das Língua Khoi E San (Africa do Sul, Angola, Botswana e Namibia). CASAS Monograph Series 259. Cape Town: Centre for Advanced Studies of African Society.
Ndana, Ndana, Kapule David Mabuta, and Andy Chebanne. 2017. Chiikuhane (Subiya) Manual with Orthography. CASAS Book Series 123. Cape Town: Centre for Advanced Studies of African Society.
Nedvěd, Radek. 2016. “On socio-linguistic features of Grootfontein urban ǃXun dwellers.” Europe and the Others. Occasional Research Bulletin, Centre for the Studies of Contacts and Conflicts in Language and Culture in Prague 1, ed. Petr Zima, Norbert Cyffer, Robert Nicolaï and Nad’a Zimová. Munich: LINCOM, pp. 2–36.
Ngubane, Sihawukele Emmanuel. 1991. “A Survey of the Northern Zululand Dialects in the Ingwavuma District.” M.A. thesis, University of Natal, Durban.
Nyamende, Abner. 1994. “Regional variation in Xhosa.” Stellenbosch Papers in Linguistics Plus 26: 202–217.
Ogden, Richard. 2013. “Clicks and percussives in English conversation.” Journal of the International Phonetic Association 43(3): 299–320.
Ohala, John J. 1995. “A probable case of clicks influencing the sound pattern of some European languages.” Phonetica 52: 160–170.
Ohala, John J. 1997. “Emergent stops: diachronic and phonetic data.” Proceedings of the SICOL (Seoul International Conference on Linguistics). Seoul: Linguistic Society of Korea, pp. 84–91. Available online: http://linguistics.berkeley.edu/phonlab/users/ohala/index3.html.
Otto, [Brother]. 1922. “A plea for more method.” Bantu Studies and General South African Anthropology 1(2): 4–7.

Pakendorf, Brigitte, Hilde Gunnink, Bonny Sands, and Koen Bostoen. 2017. “Prehistoric Bantu-Khoisan language contact: A cross-disciplinary approach.” Language Dynamics and Change 7(1): 1–46.
Panconcelli-Calzia, G. 1918/19. “Untersuchungen über l̞, p, b, q͡χ, tˆ̩, tˆ, tˆ̰ mit Röntgenstrahlen.” Zeitschrift für Kolonial-Sprachen 9: 20–25.
Panconcelli-Calzia, G. 1919–1920. “Über die Abhängigkeit bezw. unabhängigkeit der Laute von der Atmung.” Zeitschrift für Eingeborenen-Sprachen 10: 32–43.
Panconcelli-Calzia, G. 1924. Die Experimentelle Phonetik in Ihrer Anwendung Auf Die Sprachwissenschaft. Berlin: Walter De Gruyter.
Persson, J.A. 1932. Outlines of a Tswa Grammar, with Practical Exercises. Cleveland: Central Mission Press.
Phillips, Howard. 1993. The University of Cape Town 1918–1948, the formative years. Cape Town: University of Cape Town Press.
Pienaar, Pierre de Villiers. 1936. “A few notes on the phonetic aspects of clicks and the relationship thereof to certain other classes of speech sound.” Bantu studies 10(1): 41–55.
Pienaar, Pierre de Villiers. 1939. “Click formation and distribution.” Proceedings of the 3rd International Congress of Phonetic Sciences. Gent: Rijksuniversiteit Gent, pp. 344–353.
Pinto, Derrin, and Donny Vigil. 2018. “Clicks as discourse markers in Peninsular Spanish.” Spanish in Context 15(3): 441–464.
Planert, Wilhelm. 1927. “Die Schnalzsprachen.” Bibliotheca africana 2: 296–315.
Ponelis, Fritz A. 1975. “ǃOra clicks: problems and speculations.” Bushman and Hottentot linguistic Studies, ed. Anthony Traill. Communications from the African Studies Institute 2. Johannesburg: University of the Witwatersrand, pp. 51–60.
Pongweni, Alec J.C. 1990. Studies in Shona phonetics: an analytical review. Harare: University of Zimbabwe.
Proctor, Michael, Erik Bresch, Dani Byrd, Krishna Nayak, and Shrikanth Narayanan. 2013. “Paralinguistic mechanisms of production in human ‘beatboxing’: A real-time magnetic resonance imaging study.” Journal of the Acoustical Society of America 133(2): 1043–1054.
Proctor, Michael, Yinghua Zhu, Adam Lammert, Asterios Toutios, Bonny Sands, Ulrich Hummel, and Shrikanth Narayanan. 2016. “Click consonant production in Khoekhoe: a real-time MRI study.” Khoisan Languages and Linguistics: Proceedings of the 5th International Symposium, July 13–17, 2014, Riezlern/Kleinwalsertal, ed. S. Shah and M. Brenzinger. Quellen zur Khoisan-Forschung/Research in Khoisan Studies 34. Cologne: Rüdiger Köppe, pp. 335–362.
Rossouw, Kate, and Michelle Pascoe. 2018. “Intervention for bilingual speech sound disorders: a case study of an isiXhosa-English-speaking child.” South African Journal of Communication Disorders 65(1): 1–10.

Roux, Justus C. 1991. “On ingressive glottalic and velaric articulations in Xhosa.” Actes du XIIème Congres International des Sciences Phonetiques vol 3. Aix-en-Provence: Université de Provence, pp. 158–161.
Roux, Justus C. 2007. “Unresolved issues in the representation and phonetic description of click articulation in Xhosa and Zulu.” Language Matters 38(1): 8–25.
Roux, Justus C., and Grzegorz Dogil. 1998. “On the phonetic representation of clicks: Some experimental phonetic considerations.” Proceedings of the XVIth International Congress of Linguists, 20–25 July 1997, ed. Bernard Caron. Paris: CNRS. CD-ROM format. [Paper 183, 5pp.]
Roux, Justus C., Grzegorz Dogil, and Wolfgang Wokurek. 1995. “Click articulations in Xhosa: New perspectives through Wigner distribution analysis.” Proceedings of the 13th International Congress of Phonetic Sciences, vol. 2, ed. Kjell Elenius and Peter Branderud. Stockholm: Congress organizers at KTH and Stockholm University, pp. 574–577.
Sands, Bonny. 1991. “Evidence for Click Features: Acoustic Characteristics of Xhosa Clicks.” M.A. thesis, University of California, Los Angeles.
Sands, Bonny. 1992. “An acoustic study of Xhosa clicks.” Actes du 12ème congres international des sciences phonétiques, vol 4. Université de Provence. Aix-en-Provence, pp. 130–133.
Sands, Bonny. 1998. Eastern and Southern African Khoisan: Evaluating Claims of Distant Linguistic Relationships. Quellen zur Khoisan-Forschung/Research in Khoisan Studies 14. Cologne: Rüdiger Köppe Verlag.
Sands, Bonny. 2007. “The contribution of language documentation to historical phonology.” Proceedings of Conference on Language Documentation and Linguistic Theory (75 years of Linguistics at SOAS, 5 years of the Endangered Languages Project, 7–8 December 2007, School of Oriental and African Studies, University of London), ed. Peter K. Austin, Oliver Bond and David Nathan. London: Hans Rausing Endangered Languages Project & Department of Linguistics, School of Oriental and African Studies, pp. 209–219. Available online: http://www.hrelp.org/publications/ldlt/papers/ldltproceedings.html.
Sands, Bonny. 2010. “Juu subgroups based on phonological patterns.” Khoisan Languages and Linguistics: Proceedings of the 1st International Symposium, January 4–8, 2003, Riezlern/Kleinwalsertal, ed. M. Brenzinger and Christa König. Quellen zur Khoisan-Forschung/Research in Khoisan Studies 24. Cologne: Rüdiger Köppe, pp. 85–114.
Sands, Bonny. 2014. “Adoption, maintenance and loss of click contrasts.” Paper presented at “Sound Change in Interacting Human Systems”, 3rd Biennial Workshop on Sound Change, University of California, Berkeley, May 28–31, 2014. Available online: http://linguistics.berkeley.edu/SCIHS/abstracts/5_FridayAfternoon/Sands.pdf.

Sands, Bonny. 2018. “From Calgary to the Kalahari: An exploration of language diversity and endangerment.” Presentation given at the University of Calgary, School of Languages, Linguistics, Literatures and Cultures Speaker Series, Friday, March 2, 2018.
Sands, Bonny, and Koen Bostoen. 2009. “Fwe (Bantu, K402) clicks in crosslinguistic perspective.” Paper presented at the 39th Colloquium on African Languages and Linguistics (CALL 39), Leiden, Aug. 24–26, 2009.
Sands, Bonny, and Tom Güldemann. 2009. “What click languages can and can’t tell us about language origins.” The Cradle of Language, ed. Rudolf Botha and Chris Knight. Studies in the Evolution of Language 12. Oxford: Oxford University Press, pp. 204–218.
Sands, Bonny, and Hilde Gunnink. 2019. “Clicks on the fringes of the Kalahari Basin Area.” Theory and Description in African Linguistics: Selected Papers from the 47th Annual Conference on African Linguistics, ed. Emily Clem, Peter Jenks and Hannah Sande. Berlin: Language Science Press, pp. 703–724.
Sands, Bonny, Hilde Gunnink, Brigitte Pakendorf, and Koen Bostoen. 2017. “How hard is it to borrow clicks?” Paper presented at the 91st Annual Meeting of the Linguistic Society of America, Austin Texas (Symposium: Cross-Linguistic Variability in Processes of Language Change, January 8, 2017).
Sands, Bonny, Ian Maddieson, and Peter Ladefoged. 1996. “The Phonetic Structures of Hadza.” Studies in African Linguistics 25(2): 171–204.
Sands, Bonny, and Levi Namaseb. 2007. “Documenting click loss and change: focus on the palatal click.” Paper presented at the 38th Annual Conference on African Linguistics “Linguistic Theory and Language Documentation”, March 22–25, 2007, Gainesville, Florida.
Scharf, G., I. Hertrich, Justus C. Roux, and Grzegorz Dogil. 1995. “An articulatory description of clicks by means of electromagnetic articulography.” Proceedings of the 13th International Congress of Phonetic Sciences, vol. 1, ed. Kjell Elenius and Peter Branderud. Stockholm: Congress organizers at KTH and Stockholm University, pp. 378–379.
Scobbie, James M., Sonja Schaeffler, and Ineke Mennen. 2011. “Audible aspects of speech preparation.” The 17th International Congress of Phonetic Sciences (ICPhS XVII), Hong Kong, August 17–21, 2011, pp. 1782–1785.
Scott, Abigail, Amanda Miller, Levi Namaseb, Bonny Sands, and S. Shah. 2010. “Retroflex clicks in two dialects of ǃXung.” Paper presented at the University of Botswana, Department of African Languages, June 2, 2010.
Selmer, Ernst W. 1933. Experimentelle Beiträge zur Zulu Phonetik. Avhandlinger utgitt av det Norske Videnskaps-Akademi i Oslo, II, Hist.-filol. Kl. No. 1. Oslo: Kommisjon Hos Jacob Dybwad.

Sibanda, Magredi, Mhlanganiselwa Mpofu, Markson Ngcoli Sibanda, Msindo Best Moyo, and Mthandazo Kuphe Vundla. 2015. A Unified Standard Orthography for Tjoao Languages: Angola; Botswana, Namibia; South Africa & Zimbabwe. CASAS Monograph Series 264. Cape Town: Centre for Advanced Studies of African Society.
Simpson, Adrian P. 2001. “Does articulatory reduction miss more patterns than it accounts for?” Patterns of Speech Sounds in Unscripted Communication: Papers from an International Symposium held at Akademie Sankelmark/Flensburg, Germany, 8th–11th October, 2000, ed. Klaus J. Kohler and Adrian P. Simpson. (Journal of the International Phonetic Association 31(1)): 29–39.
Simpson, Adrian P. 2007. “Acoustic and auditory correlates of non-pulmonic sound production in German.” Journal of the International Phonetic Association 37(2): 173–182.
Snyman, Jan W. 1975. Žuǀ’hõasi Fonologie & Woordeboek. Communication 37 of the University of Cape Town School of African Studies. Cape Town & Rotterdam: A. A. Balkema.
Snyman, Jan W. 1978. “The clicks of Žuǀ’hõasi.” Proceedings of the 2nd Africa Languages Congress of UNISA (Tweede Afrikatale-kongres van UNISA), ed. Ernst J.M. Baumbach. Miscellanea congregalia 5. Pretoria: University of South Africa, pp. 144–168.
Snyman, Jan W. 1980. “The relationship between Angolan ǃXũ and Žuǀ’hõasi.” Bushman and Hottentot Linguistic Studies (Papers of seminar held on 27 July 1979), ed. Jan W. Snyman. Miscellanea Congregalia 16. Pretoria: University of South Africa, pp. 1–58.
Snyman, Jan W. 1999. “The phonetic description of the Žuǀ’hõasi clicks: A confusion of sounds?” African Mosaic: Festschrift for J.A. Louw, ed. Rosalie Finlayson. Pretoria: UNISA Press, pp. 92–117.
Starostin, George S. 2003. “A Lexicostatistical Approach Towards Reconstructing Proto-Khoisan.” Mother Tongue 8: 1–45.
Starostin, George. 2018. “Lexicostatistical studies in Khoisan I: The Ju-ǂHoan relationship.” Journal of Language Relationship 16(1): 19–61.
Stevens, Kenneth N. 1998. Acoustic Phonetics. Cambridge, MA: MIT Press.
Tabain, Marija. 2019. “An electropalatographic study of variability in Arrernte consonant production.” Phonetica 76(6): 399–428.
Thaler, Lore, Stephen R. Arnott, and Melvyn A. Goodale. 2011. “Neural correlates of natural human echolocation in early and late blind echolocation experts.” PLoS ONE 6(5): e20162. doi:10.1371/journal.pone.0020162.
Thomas, Kimberly. 1997. “EPG and aerodynamic evidence for the coproduction and coarticulation of clicks in IsiZulu.” Proceedings of Eurospeech ‘97: Fifth European Conference on Speech Communication and Technology: Rhodes, Greece, 22–25 September, 1997, 1: 379–382.

Thomas-Vilakati, Kimberly Diane. 2010. Coproduction and Coarticulation in IsiZulu Clicks. University of California Publications in Linguistics 144. Berkeley: University of California Press.
Tlale, One. 2005. “The Phonetics and Phonology of Sengwato, a Dialect of Setswana.” Ph.D. Dissertation, Georgetown University.
Tosco, Mauro. 1991. A Grammatical Sketch of Dahalo. Kuschitische Sprachstudien 8. Hamburg: Helmut Buske Verlag.
Toutios, Asterios, Sajan Goud Lingala, Colin Vaz, Jangwon Kim, John Esling, Patricia Keating, Matthew Gordon, Dani Byrd, Louis Goldstein, Krishna Nayak, and Shrikanth Narayanan. 2016. “Illustrating the production of the International Phonetic Alphabet sounds using fast real-time magnetic resonance imaging.” Proceedings of Interspeech 2016, September 8–12, 2016. San Francisco: ISCA, pp. 2428–2432.
Traill, Anthony. 1979. “Another click accompaniment in ǃXóõ.” Khoisan linguistic studies 5, ed. Anthony Traill. Johannesburg: African Studies Institute, University of the Witwatersrand, pp. 22–29.
Traill, Anthony. 1983. “Phonetic details of Khoisan consonants.” ALASA Khoisan Special Interest Group Newsletter 1: 3–6.
Traill, Anthony. 1985. Phonetic and Phonological Studies of ǃXóõ Bushman. Quellen zur Khoisan-Forschung/Research in Khoisan Studies 1. Hamburg: Helmut Buske.
Traill, Anthony. 1991. “Pulmonic control, nasal venting, and aspiration in Khoisan languages.” Journal of the International Phonetic Association 21(1): 13–18.
Traill, Anthony. 1992. “A confusion of sounds: the phonetic description of ǃXũ clicks.” African linguistic contributions (Festschrift Ernst Westphal), ed. Derek F. Gowlett. Pretoria: Via Afrika, pp. 345–362.
Traill, Anthony. 1994. A ǃXóõ Dictionary. Quellen zur Khoisan-Forschung/Research in Khoisan Studies 9. Cologne: Rüdiger Köppe.
Traill, Anthony. 1995. “Deserted tongues.” Arena: The Quarterly Magazine for Alumni and Friends of the University of the Witwatersrand, January 1995, pp. 6–9.
Traill, Anthony. 1997. “Linguistic phonetic features for clicks: Articulatory, acoustic and perceptual evidence.” African Linguistics at the Crossroads: Papers from Kwaluseni, ed. Robert K. Herbert. Cologne: Rüdiger Köppe, pp. 99–117.
Traill, Anthony. 1999. Extinct: South African Khoisan languages. Compact disc & booklet. Johannesburg: Department of Linguistics, University of the Witwatersrand.
Traill, Anthony. 2002. “The perception of clicks in ǃXóõ.” Mouton Classics: From Syntax to Cognition. From Phonology to Text. Berlin: Mouton de Gruyter, pp. 59–72. [Originally published 1994, Journal of African Languages and Linguistics 15(2): 161–174].

Traill, Anthony. 2018. A Trilingual ǃXóõ Dictionary. ǃXóõ – English – Setswana, ed. Hirosi Nakagawa and Anderson M. Chebanne. Quellen zur Khoisan-Forschung/Research in Khoisan Studies 37. Cologne: Rüdiger Köppe.
Traill, Anthony, and Rainer Vossen. 1997. “Sound change in the Khoisan languages: new data on click loss and click replacement.” Journal of African Languages and Linguistics 18: 21–56.
Traill, Anthony, and P.G. Wright. 1984. “Nasal venting in ǃXóõ Bushman: An aerodynamic investigation.” [abstract.] South African Journal of Science 80.
Trouvain, Jürgen. 2013. “Schnalzlaute als Diskursmarker im Deutschen.” Abstract of paper presented at Phonetik & Phonologie 9 (P&P9), Zürich, 11–12 October 2013.
Trouvain, Jürgen. 2014. “Laughing, breathing, clicking – the prosody of nonverbal vocalisations.” Social and Linguistic Speech Prosody: Proceedings of the 7th International Conference on Speech Prosody (SpeechProsody 7), Trinity College Dublin, May 20–23, 2014, ed. Nick Campbell, Dafydd Gibbon and Daniel Hirst. Dublin: Trinity College, pp. 598–602.
Trouvain, Jürgen. 2015. “On clicks in German.” Trends in Phonetics and Phonology: Studies from German-Speaking Europe, ed. Adrian Leemann, Marie-José Kolly, Stephan Schmid and Volker Dellwo. Frankfurt am Main: Peter Lang, pp. 21–33.
Trubetskoy, Nikolai. 1939. “Zur Phonetik der Hottentotensprache.” Anthropos 34: 267–276.
Tuhuse, Bolo ǁXao, and Anthony Traill. 1999. “ǂhan-ǂhan-sè, the desert cisticola, implements an acoustic target.” Proceedings of the 14th International Congress of Phonetic Sciences, San Francisco, 1–7 August 1999, ed. J. Ohala, Y. Hasegawa, M. Ohala, D. Granville and A. Bailey. Berkeley: Department of Linguistics, University of California at Berkeley. Vol. 2, pp. 1041–1042.
Tuomi, S.K., Sandile Gxilishe, and L. Matomela. 2001. “The acquisition of Xhosa phonemes.” Per Linguam 17(1): 14–23.
Vedder, H. 1910–1911. “Grundriss einer Grammatik der Buschmannsprache vom Stamm der ǃKũ-Buschmänner.” Zeitschrift für Kolonialsprachen 1: 5–24; 2: 106–117.
Visser, Hessel. 2001. Naro Dictionary: Naro-English, English-Naro. 4th edition. Ghanzi, Botswana: Naro Language Project & SIL International.
Vossen, Rainer. 1997. Die Khoe-Sprachen: ein Beitrag zur Erforschung der Sprachgeschichte Afrikas. Quellen zur Khoisan-Forschung/Research in Khoisan Studies 12. Cologne: Rüdiger Köppe.
Vossen, Rainer, ed. 2013. The Khoesan Languages. New York: Routledge.
Wakumelo-Nkolola, Mildred, Raphael Mbala, Paulinus Haingura, Boniface Makumbi, Robert Muganda, Karel Nairenge, Jakura Kavari, and Petrus Mbenzi. 2008. A Unified Standard Orthography for Namibian Bantu Languages: Oshiwambo, Otjiherero, Rukwangali, Rumanyo, Silozi and Thimbukushu. CASAS Monograph Series 231. Cape Town: Centre for Advanced Studies of African Society.

Walsh, Martin. 2006. “A click in Digo and its possible historical interpretation.” Azania 41: 158–166.
Werner, Alice. 1925. The Language-Families of Africa. 2nd ed. London: Kegan Paul International.
Westermann, D., and Ida C. Ward. 1930. Practical Phonetics for Students of African Languages. London: Kegan Paul International.
Westphal, E.O.J. n.d. “x”e (Bushmen of Amsterdam, district Ermelo). (BC 1143 (C14), E O J Westphal Papers).” Unpublished notes housed at the Rare Documents and Manuscripts Department, University of Cape Town.
Wilmsen, Edwin N., and Rainer Vossen. 1990. “Labour, language and power in the construction of ethnicity in Botswana.” Critique of Anthropology 10(1): 7–37.
Witzlack-Makarevich, Alena, and Hirosi Nakagawa. 2019. “Linguistic features and typologies in languages commonly referred to as ‘Khoisan’.” The Cambridge Handbook of African Linguistics, ed. H. Ekkehard Wolff. Cambridge: Cambridge University Press, pp. 382–416.
Wright, Melissa. 2005. “Studies of the Phonetics-Interaction Interface: Clicks and Interactional Structures in English Conversation.” Ph.D. thesis, University of York.
Wright, Melissa. 2007. “Clicks as markers of new sequences in English conversation.” Proceedings of the International Congress of Phonetic Sciences XVI, Saarbrücken, Germany, August 2007, pp. 1069–1072. Paper and sound files available online: http://www.icphs2007.de/conference/Papers/1155/.
Wright, Melissa. 2011. “On clicks in English talk-in-interaction.” Journal of the International Phonetic Association 41(2): 207–229.
Wright, Richard, Sharon Hargus, and Katharine Davis. 2002. “On the categorization of ejectives: Data from Witsuwit’en.” Journal of the International Phonetic Association 32(1): 43–77.
Wright, Richard, Ian Maddieson, Peter Ladefoged, and Bonny Sands. 1995. “A phonetic study of Sandawe clicks.” UCLA Working Papers in Phonetics 91: 1–24.
Yost, William, Xuan Zhong, and Anbar Najam. 2014. “Sound source localisation: Clicks and click trains.” Journal of the Acoustical Society of America 135(4.2): 2253.

Chapter 2

Click Phonology Wm. G. Bennett 1 Introduction The basic articulatory process of producing clicks has become increasingly well-understood, and has been the focus of a long tradition of work going back over a century,1 including new contributions by many of the other papers in this volume. At the same time, consensus about the basic phonological representation of clicks has remained elusive. The goal of this article is to review the extant proposals and the evidence that supports them, and to summarize the ‘state of the art’ in the phonological representation of clicks. Traditionally clicks are divided into two components, known by various names in the work of various authors. One component corresponds to the location and release type of the front closure, commonly termed the ‘click type’ (Traill 1980; Ladefoged and Traill 1984, 1994), or ‘influx’ in earlier work (Beach 1938, e.g.). The other component corresponds to the rest of the characteristics of a click, including its laryngeal properties (voicing, aspiration, etc.), nasality, and any other articulations accompanying the release of the back closure. This is referred to as the ‘click accompaniment’ (Traill 1985 et seq.; Sands 1991; Ladefoged and Traill 1984, 1994), or ‘efflux’ in earlier work (Beach 1938). Previous work (Miller et al. 2009a, among others) has long noted that the notion of ‘click accompaniment’ is problematic, in that it does not correspond to any natural phonetic or phonological characteristic. The term is used conventionally to pick out the collection of properties that can differentiate clicks with the same front closure. This includes laryngeal features (e.g. voicing, aspiration) and nasality: properties which non-clicks seem to be inherently specified for. 
It also encompasses some phonetic details that are less clearly connected to well-understood laryngeal features, such as ‘delayed aspiration’ – which normally goes hand in hand with nasality (Beach 1938; Snyman 1978; Traill 1985; Ladefoged and Traill 1984, 1994; Miller et al. 2009a), and thus does not seem quite so clearly tied to the [+spread glottis] feature normally used for

1  See Chladni (1824), Bleek (1862), Jones (1907), Panconcelli-Calzia (1918/1919), Doke (1923), and Beach (1938) for some relatively early examples.

© Koninklijke Brill NV, Leiden, 2020 | doi:10.1163/9789004424357_003


aspiration in non-clicks. Finally, the ‘accompaniments’ encompass other phonetic elaborations on release of the posterior closure of clicks, such as extra dorsal frication. These sometimes also make distinctions among non-clicks, but seem to do so in a much more limited way. Thus, ‘click accompaniments’ include a mix of phonetic characteristics of various calibers, some of which cut across clicks and non-clicks, and some of which do not. For the sake of expository convenience, I will retain this term, but acknowledge that it does not pick out any phonetically or phonologically natural class of entities. While the phonetics of clicks have received long and careful study, the phonological implications of this excellent body of work are not always clear. For this reason, we begin the discussion with a review of the phonological evidence. In the discussion that follows, I will take phonology to be defined very narrowly, from a generative and synchronic perspective: i.e. as grammatical knowledge about the cognitive representation and manipulation of abstract units of sound (features and segments), within a single grammar (and therefore a single speaker). My aim here is to start from the most restrictive definition and work outwards. If we want to be absolutely confident that clicks participate in phonological patterns, the most firmly convincing examples would be patterns that are synchronically productive, and that cannot be attributed to universal or language-specific phonetic factors, nor to historical developments. This is not to say that sound changes or phonetically grounded patterns cannot in principle reveal things about phonology; but in principle, some of those phenomena could arise from the interaction of phonological systems with other extrinsic factors (such as coarticulatory and perceptual pressures, or biases in acquisition and transmission). 
If we find evidence of clicks participating in phonological patterns, while operating under the very restrictive definition of phonology, we can be certain that any adequate theory of phonology necessarily must handle those kinds of patterns. The question, then, in assessing evidence for phonological patterning of clicks, is whether they can feasibly be explained by extra-phonological factors, without appealing to phonology.

2 Evidence and Alternations

Part of the complication in navigating to a phonological structure for clicks derives from a paucity of evidence in the form of alternations. This is the bread and butter of the study of non-click segments: the changes they undergo or participate in provide a window into their internal workings. For example, the


segment [u] can be observed across various languages to harmonize with other vowels for height, backness, rounding, and advancement. Such phonological interactions elevate these properties of [u] from incidental phonetic details to phonologically significant components of its representation. Thus, [u] is represented featurally as [+back, +high, +round, +ATR] not merely on the basis of it involving a raised dorsal gesture (and lip rounding, etc.), but also on the basis of these properties affecting – and being affected by – other segments in a form. With clicks, however, evidence from alternations is quite scant. This section reviews the reported cases of such interactions that are known to me. It is surely not an exhaustive list, but aims to summarize a variety of different types of evidence that may bear on the phonological structure of clicks, and to comment on why confounds sometimes render their interpretation unclear.

2.1 Alternations between Clicks and Non-clicks

2.1.1 Click and Non-click Pronouns in Nǀuu

The best candidate for a phonological alternation between clicks and non-clicks comes from relatively recent work on Nǀuu, also called Nǁŋ and ǂKhomani (Collins and Namaseb 2007; Miller 2006). Nǀuu exhibits a distinction between three sets of pronouns, termed the ‘simple form’, ‘click form’, and ‘A-form’. These are given in Table 2.1.

Table 2.1

Nǀuu click and non-click pronouns

                    Simple      Click    A-form
1.sg                ʔŋ́          ŋǀŋ̀      n-á
2.sg                ʔá          ɡǀà      ʔá
3.sg: (human)       kú          –        kú-a
      (nonhuman)    kí          –        kí-a
      (anaphoric)   hǎ          –        hǎ
1.pl                sí ~ ʔí     ɡǀì      s-á ~ [cá] (= /ki-á/)
2.pl                ʔú          ɡǀù      b-á (= /u-á/)
3.pl: (human)       kín         –        kín-a
      (nonhuman)    kíkē        –        kík-ā
      (anaphoric)   hǹ          –        hǹ-a
indefinite          ǂí                   ǂá

Source: Collins and Namaseb 2007




Collins and Namaseb (2007) report that the simple pronouns are the general default set, appearing in object position, as possessors, and as subjects in non-questions. Click pronouns appear only as the subject of a yes/no question, or in syntactic contexts where they follow the linker particle (e.g. when used as a locative or as a predicate). A-form pronouns appear as the subject of declarative clauses, or the first object of a double-object construction. Miller (2006) characterizes the variation between these pronouns as a phonological alternation: the clicks of click pronouns are presumed to be underlying, and are reduced to non-clicks in the simple pronouns. On this interpretation, it would constitute an alternation between clicks and non-clicks.

There are some reasons to be wary of such a phonological interpretation of this pattern, however. First, the differences between the click and non-click pronouns are not limited to a straightforward choice between a click consonant and a non-click one. In all cases, the click pronouns do not have the same tone as their corresponding simple pronouns: the click to non-click switch is accompanied by a switch from low to high tone. This tone change cannot obviously be accounted for as a general property of simple pronouns, since non-high tones can be found on other simple pronouns (e.g. 3.pl anaphoric hǹ). Additionally, the non-click pronouns are not fully consistent for what stands in place of the click. In most cases we find the click corresponding to a glottal stop, but 1.pl pronouns show [ɡǀ] corresponding to [s] or [c] (in the A-form), and in the 2.pl A-form we find [b]. Here again, the extra complexities do not appear to be the straightforward result of other phonological factors: in the A-form pronouns, the [c] in the 1.pl can be analyzed as underlying /ki-a/ and the [b] in 2.pl as underlying /u-a/. But, the 3.sg pronouns show surface [ki] and [u-a] sequences that do not change to [c] and [ba], respectively.
As such, the additional changes beyond the click ~ non-click alternation cannot be understood as the product of the general phonology of the language. Finally, the indefinite pronouns have only click forms: they show no alternation between clicks and non-clicks, even when appearing in precisely the same syntactic contexts as the other pronouns.2 So, while it may be tempting to interpret the variation between click and non-click pronouns as a process of neutralization that happens at the edge of a particular syntactic or prosodic domain (as indeed Miller (2011) proposes),

2  It is also worth noting that potential alternative phonological analyses do not entail an alternation between clicks and non-clicks. For instance, the 1st and 2nd click pronouns could be analyzed as a /ɡǀ/ prefix on the simple pronouns, with nasal assimilation in the 1.sg form [ŋǀŋ]. On this interpretation, the data would show an alternation for nasality, and not an alternation between clicks and non-clicks.


this kind of analysis does not readily explain the data. An alternative analysis is to treat this as a morphological phenomenon, e.g. as suppletion (on the same order as English he ~ him). This seems to me to be a better fit for the Nǀuu data; but, it does mean that this case is not a genuine phonological alternation between click and non-click consonants.

2.1.2 Click Loss and Replacement

A number of languages exhibit click loss: the deletion of clicks, or replacement of them with other consonants. For an extensive survey of the literature reporting and describing this phenomenon in some detail, see Traill and Vossen (1997), and sources cited therein. The focus of this section is on the phonological ramifications of this process. Traill and Vossen (1997: 32) offer an autosegmental characterization of click replacement, which analyzes it as de-linking of the [Dorsal] feature responsible for the velar articulation, possibly accompanied by de-linking of the [Coronal] feature to distinguish total loss from replacement by another segment. Under a strict formal interpretation of this suggestion, it would constitute a phonological change from a click to a non-click.

There is good reason, however, to be dubious of this particular formal account of click loss, as Traill and Vossen note. They argue that patterns of click loss across a number of languages are not predictable from their articulatory features, but are predictable from the acoustic phonetics of the segments involved, and from the acoustic consequences of articulatory undershoot. If the alternations we find in click loss are determined primarily by phonetics, then they don’t clearly fall in the domain of what an articulatorily-based theory of features should explain.

There is also reason to be skeptical about the appropriateness of any kind of formal phonological analysis of click loss, regardless of formulation. Phonological theories are theories about the form of phonological grammars.
But click loss and click replacement appear to be a means of shifting from one grammar to another grammar. In the case of diachronic loss, this would entail a shift across the generation gap between speakers – a jump from one speaker’s grammar to another speaker’s grammar. A change from one final-state grammar to another could also plausibly happen within a single speaker, who has learned two similar – but distinct – phonological systems that produce different patterns (e.g. two co-phonologies). The alternations we find, then, are not necessarily a representation of mappings that are possible within a single grammar. That is, the loss of clicks, even within a single speaker, does not necessarily reflect a grammatical mapping from an underlying form with a click to a surface form with a non-click. Rather, it would seem to be the result


of shifting from a grammatical system that faithfully maps underlying morphemes with clicks, to a system with similar lexical items that have no clicks on the surface. It is not straightforward to prove that the underlying forms are the same in both systems – and this is crucial to the interpretation of click loss as a phonological process. Click loss and click replacement seem to be surprisingly common, in one form or another. Traill and Vossen (1997) cite a number of sources reporting on click loss in other cases, and there has been further work documenting the process in other Khoisan languages in recent years; see, for instance, Sands and Namaseb (2007), as well as some of the other papers in this volume. But Bleek (1928–29: 85) reports the substitution of clicks with stops in ǀXam storytelling to imitate the speech of the Ichneumon, which suggests that deliberate stylistic click loss predates the potential effects of language attrition. There are also sporadic reports from Bantu languages: Ziervogel (1959: 33), for example, gives examples of Northern Transvaal (South African) Ndebele words with [ŋ] or [kx’] in place of clicks in other Nguni languages, such as Zulu -qala [ǃala] ~ Ndebele [kx’wala] ‘begin’, and Zulu enxenye [ɛnǁɛɲɛ] ~ Ndebele [ɛŋɛŋɛ] ‘elsewhere’. And there are anecdotal clues that similar processes may be more widespread than the literature implies. Xhosa, for instance, has a distinct lexical root -qhakancu [-ǃʰakaⁿǀu] (presumably historically a compound, with the -ncu element likely from a Khoekhoe term [nǀu], a name used for various San groups, and the likely origin of the name Nǀuu (Güldemann 2017: 95)), meaning, variously, ‘click sounds, Khoi or Khoisan language or customs, or speaking with lots of clicks’.3 One young, urban, Xhosa speaker who I worked with reported that she could not pronounce clicks correctly.4 There are also well-substantiated reports of lexical doublets that differ only in the quality of clicks (see Louw 1962: 72ff). 
All these suggest a high degree of salience of the presence of clicks in speech – potentially in contrast to their absence.

2.1.3 Click/Non-click Variation in Fwe

A pattern of click ~ non-click variation recently documented by Hilde Gunnink (2013; see also Bostoen and Sands 2012; and Gunnink et al. 2015, Gunnink, this volume) in Fwe (a Caprivian Bantu language) differs from most cases of click loss in that some individual speakers produce both click and non-click variants of the same word. The generalizations reported by Gunnink (this volume)

3  The first two senses here come from The Greater Dictionary of Xhosa (Pahl et al. 1989: 28–29). The third meaning was given to me by a native Xhosa speaker.
4  This speaker did not appear to have any difficulty producing [ǀ] or [ǁ], but her production of [ǃ] had frication and very little other burst, as if she had difficulty producing suction.


and Bostoen and Sands (2012) are that a relatively small number of words have clicks, which may variably be produced as [ǀ], [ǁ], [ǂ], or [ǃ], and may additionally be voiceless, or voiced, or nasalized. All of the clicks can alternate with non-click segments, e.g. [nduǀoma]~[ndukoma] ‘reed sp.’, and [nǀoreza]~[ŋoreza] ‘joint’. Crucially, individual speakers are known to produce click and non-click variants of most words which have clicks. This is suggestive of a single phonological grammar producing both variants – which in turn suggests that the grammar of those speakers can map an underlying click to a surface non-click. This is the same sort of pattern implied by other cases of click loss, but what sets the Fwe case apart is that it can be observed within single speakers. Moreover, Gunnink (2013) proposes that some of the variation reflects a pattern of click insertion, on the grounds that younger speakers appear to use more clicks than older speakers. If we accept this proposal, it would be evidence that mapping an underlying non-click to a surface click is also phonologically possible.

2.2 Manner and Laryngeal Feature Alternations

While the representation of place in clicks is a matter of some debate due to them having multiple closures, the notion that clicks are specified for the same laryngeal features and manner features as non-click consonants is relatively uncontroversial.5 It is also supported by good evidence from morphophonological alternations, as well as restrictions on positional distribution and co-occurrence. Somewhat surprisingly, much of this evidence comes from Nguni languages.

2.2.1 Post-nasal Alternations in Nguni Languages

In Nguni languages such as Zulu and Xhosa, we can find alternations affecting the nasality and voicing of clicks. These occur with the prefixes for noun classes 9 and 10, and their associated adjectival and relative clause agreement markers – all of which canonically end in homorganic nasals.
Some examples are given in (1), from Zulu (Doke 1927). The prefix for noun class 9 is normally /iN-/, and its adjectival agreement marker is /eN-/; for class 10, the noun class prefix is /iziN-/, and its adjectival agreement marker is /eziN-/. Since the prefixes for classes 9 and 10 both have homorganic nasals, they do not normally show alternations in the singular/plural paradigm. But, alternations can be seen with adjectival agreement markers, and with nouns derived into class 9 or 10; and in nouns of class 11, which normally makes plurals in class 10.

5  Bennett (2008, 2017), however, claims that nasality in clicks is fundamentally different from nasality in non-clicks, and is not genuinely connected to the feature [±nasal].


(1) Homorganic nasals in class 9/10 prefixes in Zulu (Doke 1927)
a. -kʰulu  eŋ-k’ulu  ‘big’/‘big (cl. 9 agr.)’
b. -de  en-de  ‘tall’/‘tall (cl. 9 agr.)’
c. -ɓi  ezim-bi  ‘evil, bad’/‘bad (cl. 10)’
d. -ʃa  ezin-tʃ’a  ‘new’/‘new (cl. 10)’
e. -ɬe  ezin-tɬ’e  ‘fine, nice, beautiful’/‘fine (cl. 10)’
f. u-pʰapʰe  izim-p’apʰe  ‘feather(s)’ (cl. 11 sg./cl. 10 pl.)
g. u-tʰi  izin-t’i  ‘stick(s)’
h. u-ɡu  iziŋ-ɡu  ‘river bank(s)’

When this assimilation happens, certain following consonants also undergo changes in manner of articulation (Doke 1926; Halpert 2012, a.o.). Some of these changes involve laryngeal features: aspirated stops become ejectives, implosive [ɓ] neutralizes to [b], and fricatives become affricates.6 Some examples can be seen in (1) above.

(2) Class 9/10 prefixes in Zulu with clicks (Doke 1927)
a. -ǁoɓisa  iŋ-ᶢǁoɓiso  ‘worry’/‘trouble’ (cl. 9 noun)
b. -ᶢǀina  iŋ-ᶢǀina  ‘wax up hive’/‘wax’
c. -ǁʰant’a  i-ᵑǁant’a  ‘sprout (v.)’/‘shoot, branch’
d. -ǀʰwaja  i-ᵑǀwaji  ‘perform hut dance’/‘hut dancer’
e. um-ᵑǁa  i-ᵑǁa  ‘side’ (cl. 1 noun)/‘side’ (cl. 9 noun)
f. u-ǀʰuʃela  izi-ᵑǀuʃela  ‘sharp instrument(s)’ (cl. 11 sg./cl. 10 pl.)
g. u-ǃʰududu  izi-ᵑǃududu  ‘tall careless person(s)’
h. u-ǁʰuʃela  izi-ᵑǁuʃela  ‘sharp instrument(s)’
i. u-ǀezu  iziŋ-ᶢǀezu  ‘slice(s)’
j. u-ǃweǃwe  iziŋ-ᶢǃweǃwe  ‘scab(s)’

Clicks undergo similar changes in the class 9/10 morphological context, though not in exactly the same way. Aspirated clicks lose their aspiration, just as stops do in the basic case (cf. (2) and Table 2.2). But, clicks diverge from that pattern in surfacing as either voiced or nasal (Doke 1926; McLaren 1942; Louw 1977; Sands 1991). There is some disagreement about the precise quality of these derived post-nasal clicks. Doke (1926: 136) reports, for Zulu, that aspirated clicks become fully nasal clicks (/N+ǃʰ/ → [ⁿǃ]), while unaspirated clicks become

6  Other changes occur somewhat more sporadically, including /l/ to [d], and fortition of /h/ to [ŋk’] and /w/ to [ŋɡw] (Doke 1927: 14).


Table 2.2 Changes to clicks in class 9/10 post-nasal condition

Zulu Default Voiceless aspirated Voiced (murmured) Nasal Nasal and murmured

Xhosa Cl. 9/10 post-N

Default

Cl. 9/10 post-N

ǃ ǃʰ ᶢǃ

(N)ǃ

(N)ᶢ ǃ

ǃ ǃʰ ᶢǃ

ⁿǃ

ⁿǃ

ⁿǃ ⁿ̤ǃ

ⁿǃ ⁿ̤ǃ

(N)ᶢǃ

voiced, but remain oral, though with a preceding nasal (/N+ǃ/ → [ŋᶢǃ]).7 In Xhosa, on the other hand, unaspirated clicks become voiced, while aspirated ones remain voiceless and oral (/N+ǃʰ/ → [ŋ+ǃ]) (McLaren 1942: 12). In both cases, it seems then that the contrast between voiced and aspirated clicks is retained, even though the quality of each is changed – and even though some contrasts with nasal clicks are lost. Some previous work has used this case of nasal place assimilation to argue that the posterior closure of clicks has a genuine [Place] feature (Sagey 1986; Maddieson and Ladefoged 1989; Miller 2011).8 However, the status of the preclick nasals as velar is based on Doke’s impressionistic observations, and has not been directly measured.9 Some other work, moreover, disagrees with this interpretation. Padgett (1995: 196–170) suggests that the choice of [Dorsal] nasals before clicks may be explained on perceptual grounds. Bennett (2014: 121–122) argues that the Zulu and Xhosa data does not actually support treating the

7  Interestingly, Doke (1926; 1954) states that examples of voiceless clicks following nasals do exist in both Zulu and Xhosa, though they are not common. He also follows McLaren in noting that post-nasal aspirated clicks are rare but attested in Xhosa. A Xhosa speaker I consulted was not familiar with any cases of post-nasal aspirated clicks, and reported an intuition that such sequences felt unacceptable.
8  Traill (1977: 110ff) also reports a pattern of nasal place assimilation in ǃXóõ, but in this case nasals assimilate to the front place of the click, and not to a dorsal nasal or doubly-articulated one. Thus, [m̩ʘqāa] ‘my child’ and [mʘk’úm] ‘it is nice’ have [m] before [ʘ], and not a labiodorsal [ŋ͡m] or [m͡ŋ] of the sort observed with prenasalized clicks in forms like [mʘɢāi] ‘type of fly’. By the same token, assimilated nasals are dental before dental clicks and alveolar before lateral clicks – they are not velar or uvular: [n̪̩ǀîi] ‘I am here’, and [n̩ǁáli] ‘it is large’.
9  Though see Doke’s (1923) x-ray study.


nasal in the iN-/iziN- prefixes as truly placeless, and that the nasals surface as [Dorsal] by default even in cases where no assimilation is evident. What is quite clear from these class 9/10 post-nasal alternations, though, is that clicks can interact phonologically with non-clicks – and that they can undergo changes in laryngeal specifications and nasality. Even if the nasal assimilation facts do not satisfactorily support positing a [Place] feature for the back closure in clicks, the de-aspiration of post-nasal clicks is entirely undisputed. Since we can observe this de-aspiration cutting across clicks and non-clicks alike, it is good evidence that the laryngeal specifications of clicks involve categorically the same features as non-click segments.

2.2.2 Laryngeal Harmony in Zulu

The laryngeal and nasality alternations seen in class 9/10 in Nguni are by no means the only evidence for manner of articulation features in clicks. Khumalo (1987: 104–105) reports a form of static laryngeal harmony among clicks in Zulu. The key generalization is that voiced and aspirated clicks are restricted to root-initial position, unless they follow another click that is completely identical.10 Thus, words with identical aspirated or voiced clicks are attested, e.g. u-qhoqhoqho [u-ǃʰoǃʰoǃʰo] ‘windpipe’ and uku-gcagca [uku-ᶢǀaᶢǀa] ‘to get married’. But, if a root begins with a non-click, then only laryngeally unspecified clicks may occur later in the root, e.g. uku-phoqa [uku-pʰoǃa] ‘to force’, and uku-gaqa [uku-ɡaǃa] ‘to crawl’ (*uku-pʰoᶢǃa, *uku-ɡaǃʰa, etc.). What makes the laryngeal agreement restriction on clicks significant is that Khumalo (1987: 26ff) also observes a parallel form of consonant harmony among stops. The generalization is that stops in roots normally agree in laryngeal features.
That is, there are roots with two voiced stops (bheda [beda] ‘talk nonsense’), roots with two unaspirated and/or ejective stops (peta [p’et’a] ‘dig up’), and roots with two aspirates (thupha [tʰupʰa] ‘thumb’). But, roots with two stops that have different laryngeal specifications are relatively uncommon; and, they often fall into coherent groups of exceptions. Some are historically morphologically complex, others are the result of post-nasal or positional laryngeal neutralization (of the sort considered in §2.2.1 above).11 Thus, we see stops agreeing with stops, and clicks agreeing with clicks – and the parameters for such agreement seem to be the same in both.

10  Khumalo analyzes nasal clicks as prenasalized consonants, on par with obstruents preceded by homorganic nasals. Such prenasal stops do not appear to participate in the agreement pattern, nor do nasal clicks.
11  See Khumalo, and Hansson (2010) and Sibanda (2004) for more discussion of the details of this harmony.
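The restriction on voiced and aspirated clicks reported by Khumalo can be stated as a simple check over the consonants of a root. The Python sketch below is illustrative only. The `Seg` record, the three laryngeal labels, and the reading of ‘follows an identical click’ as ‘any earlier identical click in the same root’ are assumptions made for this example and are not part of Khumalo’s analysis. Nasal clicks are set aside, as in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Seg:
    """A consonant in a root (hypothetical representation for this sketch)."""
    symbol: str      # transcription, used for display and identity
    is_click: bool
    laryngeal: str   # assumed labels: "plain", "aspirated", or "voiced"


def click_restriction_ok(consonants):
    """Return True if every non-initial voiced or aspirated click is
    preceded somewhere in the root by a completely identical click."""
    for i, seg in enumerate(consonants):
        marked = seg.is_click and seg.laryngeal in ("aspirated", "voiced")
        # A marked click is fine root-initially (i == 0) or after an
        # identical click; frozen dataclasses compare by field values.
        if i > 0 and marked and seg not in consonants[:i]:
            return False
    return True


# Consonant skeletons of forms cited in the text.
windpipe = [Seg("ǃʰ", True, "aspirated")] * 3                          # -qhoqhoqho
force = [Seg("pʰ", False, "aspirated"), Seg("ǃ", True, "plain")]       # -phoqa
unattested = [Seg("pʰ", False, "aspirated"), Seg("ᶢǃ", True, "voiced")]
```

Under these assumptions, `click_restriction_ok` accepts the skeletons of u-qhoqhoqho and uku-phoqa and rejects the starred pattern with a voiced click after a non-click.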


2.3 Alternations Induced by Clicks

2.3.1 Raising of /a/ in ǃXóõ

ǃXóõ12 has a pattern of raising of /a/ conditioned in part by certain clicks (Traill 1985, 1994a, 1994b; see also Bradfield 2014, Lionnet 2018 for much more detailed discussion). The essential generalization is that the low vowel /a/, when non-pharyngealized, is raised to [ɜ] or [ɪ̠] or [i] in the conditions in (3).13 Thus, we see raising in the examples in (4), but not those in (5).

(3) Conditions for /a/-raising in ǃXóõ (Traill 1985: 70):
a. The preceding consonant is one of the dental non-clicks {t tʰ s n l} or the clicks {ǂ ǀ}
b. The following mora is /i/ or /n/
c. If there is an intervening consonant before the following mora, it is {n l}

(4) Raising of /a/ (examples from Traill 1994a: 123; 1985: 70ff):
a. /ǀā̤i/ → [ǀīi̤] ‘aardwolf’
b. /ǂài/ → [ǂìi] ‘steenbok’
c. /ǂàli/ → [ǂɪ̠̀lì] ‘fold (Cl. 1)’
d. /tán/ → [tɜ́n] ‘to it’

(5) Failure of /a/-raising (Traill 1994a: 123):
a. [ǂànà] ‘fold (Cl. 2)’
b. [ʘɐ́’i] ‘abomasum’
c. [ǁɐ̤́i] ‘old (Cl. 1)’

What makes this pattern especially remarkable is that we find some clicks patterning with dental non-clicks, to the exclusion of others. If it is indeed

12  Here and throughout, I will use the term ‘ǃXóõ’ to refer to the eastern Taa variety studied by Traill and in subsequent related work. For work on western ǃXoon, see Naumann (2008, 2016).
13  The degree of raising is highly variable, but Bradfield (2014: 12) points out that it depends in part on the content and position of the following mora: partial raising of /a/ → [ɜ] happens before moraic nasals and [Ci], and full /a/ → [i] raising happens only when the following mora is [i] and there is no intervening segment (see also Traill 1994b: 40). There may be an effect of non-moraic consonants too, though: Traill (1994b: 185) transcribes partial raising before [m] in /sàmi/ [sɛ̀mi], while /sáni/ [síni] apparently has full raising before [n]. For simplicity of exposition, I will not distinguish between full and partial raising here; the distinction between them depends entirely on factors unrelated to the role of clicks in conditioning the pattern.


phonological, the /a/-raising pattern therefore shows clicks sharing the same phonological features as non-click segments – thus shedding light on their featural representations. However, certain aspects of Traill’s (1985) original description call into question the phonological status of this /a/-raising pattern, and suggest that it may be better explained as coarticulation instead of assimilation. First, Traill observes – and his measurements objectively show – that the degree of raising is quite variable (Traill 1985: 71ff; Traill 1994b: 40), and its occurrence is sporadic after certain consonants (Traill 1994a: 123–124). It is also heavily affected by speech rate: the most extensive raising of /a/ is found in speech in a ‘rapid, unmonitored style’ (Traill 1985: 70). Finally, Traill also notes that /a/ is also affected in other ways by other segments: it is raised and rounded when preceding [o] or [u], and carries variable amounts of labialization when following [ʘ]. These effects are similarly dependent on what follows the /a/: Traill (1985: 70) notes, for example, that /ʘân/ ‘sleep’ may be realized as [ʘɔ̂ n], but that /ʘqʰâna/ ‘duikers’ may not have [ɔ] as the first vowel, due to the following /a/. All of these facts are consistent with an explanation of /a/-raising based on co-articulation and articulatory planning, without invoking any classically phonological mechanisms. Dental consonants require an apical gesture. The clicks [ǂ ǀ] require a coronal gesture that pulls the tongue body further forward than in other clicks. Since the tongue body is connected to the tongue root, it’s possible that this would also cause the dorsal gesture to shift forward as well – a gesture that requires the tongue root to be in a much higher position than is typical for low vowels.14 Together, these point to the ‘raising’ of /a/ as articulatory undershoot: the tongue does not fully achieve the target for [a] between two segments that require it to be much higher. 
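For comparison with this gradient, coarticulatory account, the categorical reading of the conditions in (3) can be written out as a predicate. The Python sketch below is only a restatement of (3a–c) under simplifying assumptions. The function name, the argument structure, and the choice to pass only the click type (ignoring accompaniments) are hypothetical, and the sketch says nothing about the degree of raising or its sensitivity to speech rate.

```python
# Segment classes taken from condition (3); everything else is an assumption.
RAISING_ONSETS = {"t", "tʰ", "s", "n", "l", "ǂ", "ǀ"}   # (3a)
RAISING_MORAS = {"i", "n"}                               # (3b)
TRANSPARENT_CONSONANTS = {"n", "l"}                      # (3c)


def a_raising_expected(onset, next_mora, intervening=None, pharyngealized=False):
    """Return True if the categorical conditions in (3) are all met.

    onset          -- consonant (or click type) preceding /a/
    next_mora      -- the mora following /a/
    intervening    -- consonant between /a/ and the next mora, if any
    pharyngealized -- raising targets only non-pharyngealized /a/
    """
    if pharyngealized:
        return False
    if onset not in RAISING_ONSETS:
        return False
    if next_mora not in RAISING_MORAS:
        return False
    return intervening is None or intervening in TRANSPARENT_CONSONANTS


# Forms from (4) and (5), reduced to the properties that (3) refers to.
steenbok = a_raising_expected("ǂ", "i")                      # (4b)
fold_cl1 = a_raising_expected("ǂ", "i", intervening="l")     # (4c)
fold_cl2 = a_raising_expected("ǂ", "a", intervening="n")     # (5a)
old_cl1 = a_raising_expected("ǁ", "i")                       # (5c)
```

A predicate of this kind returns a yes/no answer for each form, which is exactly what the variable and rate-dependent measurements described above call into question.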
By this reasoning, we expect the effect to be most pronounced in fast speech: the less time there is between the raised gestures, the less closely the target of [a] can be approximated. We also expect the raising to be highly gradient: the degree of movement necessary to reach the target tongue position for [a] is dependent on the positions the tongue needs to assume for the segments on either side. Additionally, we expect that it should be easier to achieve the lowered dorsum

14  Further study of the dynamics of click releases is sorely needed to confirm or deny this. Miller’s (2016) study of Mangetti Dune ǃXung finds that for one speaker (of 3 measured), a portion of the closure of [ǂ] involves the tongue root reaching significantly further back on release than other clicks. So, whether this articulatory interaction could explain the facts depends very much on finer details of relative timing. (I thank an anonymous reviewer for raising this issue.)


target of [a] when the following vowel is another /a/, and no subsequent raising needs to be planned for.15

2.3.2 The Back Vowel Constraint

The Back Vowel Constraint (BVC) is a phonotactic restriction on the co-occurrence of certain consonants with back vowels, first identified by Traill (1985: 89–92) for ǃXóõ. Traill defines this constraint as an implication: in any CV sequence, if C is [+back], then V is also [+back]. The [+back] consonants of the language are velars and uvulars {k, kʰ, ɡ, kx’, ɡᵏˣ’, q, qʰ, ɢ, q’, x}, all coronals with velar accompaniments {Cˣ, Cᵏˣ’}, as well as bilabial, lateral, and alveolar clicks {ʘ ǁ ǃ} – to the exclusion of dental and palatal clicks {ǀ ǂ}.16 Thus, we find certain clicks patterning together with dorsal consonants in being disallowed with front vowels, while other clicks are permitted to co-occur with front vowels.

Is the BVC truly phonological? The answer hinges in part on assumptions made about the phonetics-phonology interface. Many previous approaches observe a strong phonetic basis for the BVC, but take it to be part of phonology proper too (see, e.g. Miller-Ockhuizen 2003; Nakagawa 2006; Miller 2010b). Under the operating definition of phonology assumed here, the question at issue is whether there are BVC effects that cannot be explained by appealing to phonetics, and must be handled in phonology. There is evidence that the robustness and productivity of the BVC differs across languages, suggesting that it may be narrowly phonological in some languages, but not necessarily an active part of the phonology in others. In ǃXóõ, at least, the BVC appears to be a categorical restriction, operating on a well-defined natural class of vowels. Leaving aside the clicks, the consonants that it holds over are also a well-defined class: those with some kind of dorsal articulation.
Moreover, Traill (1985: 90) observes that certain loanwords change in order to comply with the restriction: Afrikaans donkie [dɔŋki] ‘donkey’ is borrowed as [tońtì], and not the BVC-violating form *toŋki. These all point to the constraint being phonologically active in the synchronic grammar. In some other languages, however, the status of the BVC is not so clear cut. In Khoekhoe, for example, something like the BVC is evident as a statistical

15  Why the undershoot happens to the /a/ instead of the other adjacent segments – i.e. why /ǂài/ becomes [ǂii] and not [ǂaɛ] – is an interesting question that I will not attempt to solve here. But, the fact that /a/ also picks up rounding and backness in a similar way seems relevant, and suggests that achieving the full target for /a/ is perhaps somehow less of a priority than for other vowels. A model designed to explain cumulative effects (e.g. Lionnet 2014) seems like a promising approach.
16  Traill (1985: 123) actually characterizes all clicks as [+back], and notes the occurrence of [ǀ ǁ] before front vowels as exceptions to the BVC.


skew in the lexicon, but not as an inviolable “hard” constraint (Miller et al. 2007). That is, front vowels are more likely to co-occur with ‘front’ consonants, and less likely to co-occur with ‘back’ consonants, though such co-occurrence is nonetheless attested. In Gǀui, Nakagawa (2006) reports that the BVC is categorical for {ǃ ǁ}, and uvulars, and coronals with uvular releases, but is robustly counter-exemplified with velars and {ǀ ǂ} – a pattern very similar to that in ǃXóõ. In Juǀ’hoansi, on the other hand, Miller-Ockhuizen (2000) finds that the BVC is counter-exemplified with velars, but only by a relatively small number of forms. The significance of these exceptions is unclear, in part because velar stops are infrequent in the language generally. Loanwords in Juǀ’hoansi, however, do seem to freely permit the co-occurrence of [k] with front vowels (e.g. [tə̀ɾèkìì] ‘saddle cloth’, from Afrikaans saal kleedjie [saːl kʰlèkʰi], and [kéɾéꜜsíí] ‘candle’ from [kʰérs]) (Miller-Ockhuizen 2003: 113). Miller-Ockhuizen proposes that the BVC in Juǀ’hoansi is a result of the articulatory difficulty of producing front vowels in tandem with the pharyngeal narrowing involved in articulating uvulars and pharyngeals. If this phonetic mechanism explains the BVC pattern, is it also necessary to posit the same constraint as part of phonology? If it is interpreted as part of the (perhaps language-specific) phonetics, and not part of phonology proper, that would fit with the BVC having exceptions, and being freely counter-exemplifiable in borrowings. If this interpretation is correct, then it means BVC effects are narrowly phonological in some languages (like ǃXóõ), but not in others (like Juǀ’hoansi). As such, its consequences for phonology are not straightforward, and depend on language-particular evidence for its categoricality and synchronic activity.
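Traill’s formulation of the BVC as an implication over CV sequences lends itself to a direct statement in code. The Python sketch below encodes only the categorical ǃXóõ version of the constraint. The consonant set is a subset of the classes listed above, the treatment of /a o u/ as the [+back] vowels is an assumption made for the example, and the function names are hypothetical. It is not a model of the statistical or exception-ridden patterns reported for Khoekhoe, Gǀui, or Juǀ’hoansi.

```python
# Subset of the [+back] consonants listed in the text (velars and uvulars).
BACK_CONSONANTS = {"k", "kʰ", "ɡ", "q", "qʰ", "ɢ", "x"}
# Clicks that pattern as [+back]; dental and palatal clicks are excluded.
BACK_CLICKS = {"ʘ", "ǁ", "ǃ"}
# Assumed vowel classification for this sketch.
BACK_VOWELS = {"a", "o", "u"}


def is_back_consonant(c):
    """True if the consonant counts as [+back] for the BVC."""
    return c in BACK_CONSONANTS or c in BACK_CLICKS


def satisfies_bvc(c, v):
    """BVC as an implication: if C is [+back], then V is [+back]."""
    return (not is_back_consonant(c)) or (v in BACK_VOWELS)


def bvc_violations(cv_pairs):
    """Return the (C, V) pairs in a form that violate the BVC."""
    return [(c, v) for c, v in cv_pairs if not satisfies_bvc(c, v)]


# The final CV of Afrikaans donkie versus the borrowed form's final CV.
donkie_final = satisfies_bvc("k", "i")
borrowed_final = satisfies_bvc("t", "i")
```

On this encoding the [ki] sequence of donkie violates the constraint while the [ti] of the borrowed form does not, which matches the loanword adaptation described for ǃXóõ.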
2.3.3 Lateral Harmony in Gǀui

Nakagawa (2006: 135) notes an apparent pattern of lateral assimilation triggered by clicks in Gǀui. Gǀui has a single liquid consonant, which is typically produced as a tap [ɾ], or occasionally as a trilled [r]. The relevant generalization is that when the liquid /ɾ/ is the second consonant of a root with an initial lateral click, then it surfaces as [l] instead of [ɾ]. Some examples are given in (6). The agreement in laterality seen here suggests that the lateral clicks bear the same [+lateral] feature normally assumed for [l].

(6) Gǀui lateral harmony triggered by [ǁ] (Nakagawa 2006: 135):
a. /kǃàrà/ → [kǃàɾà] ‘lekkerbreek (Ochna pulchra) tree hook’ ([VɾV])
b. /kǁárá/ → [kǁálá] ‘camel thorn tree’ (/r/→[l] after [ǁ])
c. /kǁχárá/ → [kǁχálá] ‘fence’
d. /kǁʰárà/ → [kǁʰálà] ‘cultivate, plow’
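The generalization behind (6) can be stated as a small rewrite rule. The following Python sketch covers only that generalization as worded above (an initial lateral click turns a following liquid into [l]); the function name, the list-of-segments representation, and the restriction to CVCV roots are assumptions made for illustration, and the liquid is written as the tap /ɾ/ throughout.

```python
LATERAL_CLICK = "ǁ"  # an initial segment containing this symbol counts as a lateral click
TAP = "ɾ"            # usual realization of the single Gǀui liquid


def realize_liquid(root):
    """Return the surface form of a root given as a list of segment strings.

    Assumption: the root has the shape C1 V C2 V, so the second consonant
    sits at index 2. Roots of any other length are returned unchanged.
    """
    surface = list(root)
    if len(surface) == 4 and LATERAL_CLICK in surface[0] and surface[2] == TAP:
        surface[2] = "l"  # liquid agrees in laterality with the initial click
    return surface


# Forms from (6), segmented by hand
print(realize_liquid(["kǃ", "à", "ɾ", "à"]))  # non-lateral click: the tap is kept
print(realize_liquid(["kǁ", "á", "ɾ", "á"]))  # lateral click: the liquid surfaces as l
```

Stating the rule this way makes its single trigger explicit, which is useful when asking whether other contexts produce the same output.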


There is a confound to this pattern, however: the same [l] realization of /r/ also occurs after pharyngealized vowels, even when there are no other laterals in the word. Examples from Gǀui are given in (7). These examples do not appear to be one-off flukes: Miller et al. (2007: 107) observe the same [ɾ]→[l] alternation after epiglottalized vowels in Nǀuu, suggesting it is more widespread.17 In these cases, the appearance of [l] cannot be explained as agreement for [+lateral]; consequently, if there is to be a unified explanation for both kinds of examples, it cannot be framed in terms of laterality alone. Thus, it is not safe to conclude that lateral clicks are genuinely [+lateral]. Since different laterals may differ in the location of their side releases, it stands to reason that their phonological designations may differ as well.

(7) Gǀui lateral harmony triggered without [ǁ] (Nakagawa 2006: 135):
a. /ɡǀá̰rà/ → [ɡǀá̰là] ‘wild vegetable (Dipcadi viride)’ (/r/→[l] after V̰)
b. /ɡǀṵ̀rɪ̄/ → [ɡǀw̰à̰lɪ̄] ‘mongoose’

2.4 Distributional Restrictions

There are a number of robust distributional restrictions on clicks, observed in a number of languages. These are clear cases of phonology affecting clicks, in a way that does not produce visible alternations, but nonetheless offers important insights about their phonological representation. First, distributional restrictions are an area where we can see that clicks pattern quite transparently like obstruents – a point made previously (by Miller-Ockhuizen 2003, and Miller 2011). In many Khoisan languages, such as ǃXóõ, Khoekhoe, and Juǀ’hoansi, clicks are restricted to morpheme-initial position – as are nearly all non-click stops and fricatives (Traill 1985; Brugman 2009; Miller 2010a, among others). ǃXóõ also allows closed syllables if the coda is a nasal consonant. Though some 20 clicks involve nasality in some form or another, none of these clicks are observed in final or medial positions.
The nasal clicks behave in this respect just like other clicks, and like stops and fricatives; they do not behave like nasals or other sonorants. Very similar patterns show up in other languages: both Naro and Juǀ’hoan also permit closed syllables, so long as the coda is nasal (Visser 1998; Miller-Ockhuizen 2003; Miller 2010a), and also prohibit all nasal clicks from that position.

17  Interestingly, Exter (2008: 65) gives the word ǁùɾú ‘quartz’, with [ɾ] and not [l], suggesting that [l] in Nǀuu is conditioned by epiglottalized vowels, but not by lateral clicks, though this may be a loan from Khoekhoe (Sands, p.c.).

This is evidence


that clicks are uniformly obstruents, and not sonorants; contrary to what would be expected from analyzing them as “clicked nasals”.18 Second, there are also co-occurrence restrictions that suggest that the front closures of clicks might reflect active phonological features. Sibanda (2004: 165) observes for Ndebele that there are no roots containing two different clicks, and Khumalo (1987: 104) makes a very similar observation for Zulu. Hansson (2010: 69) interprets this as a kind of consonant harmony, in which clicks agree for place of articulation. This is an interesting pattern, and suggests that there are indeed more phonological behaviors of clicks than have been observed so far; but, because it does not involve any interaction between clicks and nonclick consonants, this case does not reveal anything about how to categorize clicks in terms of the same features posited for non-clicks. Additionally, some recent theories of consonant harmony in the Agreement By Correspondence framework (Rose and Walker 2004; Bennett 2015) hold that such harmony is driven by shared phonological feature specifications. In such theories, place of articulation harmony among clicks – and just clicks – entails that the property of being a click does actually reflect some feature (or combination of them).19 Finally, there are more general typological patterns that we can identify from the distribution of clicks in terms of syllable structure. Perhaps the most widely evident restriction on clicks is a positional one: they are preferentially found in word-initial or root-initial positions. In most Khoisan languages, clicks are limited to word-initial position (Doke 1925; Greenberg 1966: 67; Traill 1985; Güldemann 2001; Bennett 2008; and especially Brugman 2009). Echoes of this can also be found in Bantu languages. In Nguni languages, it appears that the root-initial position is the only position where all clicks can occur unrestrictedly (as discussed in §2.2.2 above). 
In Yeyi, a non-Nguni Caprivian Bantu language, clicks appear to be found only in root-initial position.20 These trends suggest two typological universals for click distribution, listed in (8) (Bennett 2008; Brugman 2009).

(8) Positional distributional universals for clicks
a. Clicks never occur in coda position; they may only be onsets21
b. If a language allows clicks in a position other than the initial onsets, then clicks also occur in initial onsets

However, these restrictions come with substantial language-family-specific confounds. Most Khoisan languages exhibit strict limits on which consonants may occur in non-initial positions. For example, in ǃXóõ, the only consonants permitted in word-final position are nasals [m n] (and, dialectally, [ŋ]). In intervocalic position, only {b dʲ m n ɲ l} may occur. Thus, while ǃXóõ does ban the occurrence of clicks in word-medial position, this is actually somewhat unsurprising: nearly all other obstruents are similarly banned in that position. Some other languages are somewhat more tolerant of word-final consonants: Nama, for example, allows word-final /p ts s/ as well as nasals (Beach 1938; Hagman 1977: 11; Brugman 2009), but this is still a minority of the non-click consonants of the language. Since restrictions of this sort pervade the Khoisan languages in general, the absence of clicks in syllable-final position is somewhat unsurprising, and may be due to historical reasons as much as phonological ones.22

The situation in Bantu languages with clicks is quite similar. In most Bantu languages, the only coda consonants permitted are nasals that are homorganic with a following consonant. Zulu and Xhosa, two languages that have borrowed clicks extensively, are only marginally more permissive: codas may be homorganic nasals, or [m]. So, the absence of coda clicks is again perhaps unsurprising.

18  There are some more marginal examples that point to alternations between nasal clicks and nasals, though. Heselwood (1997) reports an individual with language impairment substituting nasal clicks for sonorants, including nasals. A substitution of [ⁿǃ] for [ŋ] is also reported in a Chinese nursery rhyme (Nathan 2001).

19  In Agreement By Correspondence theories, harmony among clicks would also lead to a prediction that click dissimilation is also possible (e.g. ǃ…ǃ → ǃ…k) (see Bennett 2015 for discussion of why this prediction arises). It’s not clear if this is attested or not; while there are languages where at most one click is allowed per word, many of them follow from more general positional and word shape restrictions.

20  This observation was made based on the word list provided by Fulop et al. (2000), but appears to hold over the much larger lexicon of Lukusa (2009). Some apparent exceptions can be analyzed as reduplicated forms, with clicks at the start of halves. Other possible counter-examples, such as [ma-kʰweŋʔkǀʔumu] ‘sunrise’ and [ʃi-kuŋʔkǀʔumu] ‘sunset’, look like compounds with a click in the initial position of the second root. (Note, however, that the relatively low frequency of clicks and the high degree of inter-speaker variation make it difficult to be sure; see Fulop et al. 2000, 2003 for more details about this.)

21  Elderkin (1992: 123) reports one example from Dahalo where a click undergoes templatic morphological gemination. If this geminate straddles the syllable boundary as would normally be expected, this would be an exception.

22  Nakagawa (2010) makes the similar suggestion that restrictions on non-initial consonants in Gǀui are historical in nature, based on findings from a constructed language game: Gǀui speakers are very much capable of transposing syllables to produce forms with clicks and other normally-initial obstruents in root-internal position.

Ball and Müller (2007) report a number of cases of substitution of clicks for non-clicks in impaired speech – including in word-final positions. One particularly interesting case, initially documented by Howard (1993), is that


of a girl with cleft palate and hearing impairment, who used bilabial clicks in place of target [p] in both word-initial and word-final positions (though not word-medially), e.g. realizing tap as [ʔæʔʘ]. The interpretation of such reports should ideally be informed by a theory of how disordered speech relates to phonology in general; but, even without such a theory, cases like this would seem to be clear evidence that realizing clicks in word-final codas is both phonetically and phonologically possible.

2.5 Recap

Two broad conclusions are evident from the cases reviewed above. The first is that clicks are in some ways phonologically very ‘normal’. They are subject to distributional restrictions, of the same caliber as patterns observed with nonclick consonants. They can trigger alternations in other segments, including both vowels and consonants. They can also undergo alternations, for at least laryngeal features and perhaps nasality. In short, there is convincing evidence that clicks are not somehow outside of regular phonology: to the contrary, they seem to pattern clearly like consonants.23

The other major conclusion to be drawn is that the available evidence from alternations tells us surprisingly little about the phonological representation of clicks. Examples of assimilation are relatively easy to come by with non-click consonants: nasal place assimilation, voicing assimilation, and other sorts of agreement are robustly attested in languages of the world. Reduction and/or neutralization processes are similarly well-attested: final devoicing is very well attested, as are languages that exhibit other manner or place neutralizations in coda positions. Such processes are clear windows onto the phonological properties of segments. For example, final devoicing of obstruents suggests that they bear a feature [+voice] that is not found on voiceless consonants or sonorants.
However, the available evidence for phonological processes involving clicks does not offer the same clarity. If, for instance, we could observe a synchronic alternation where /ʘ/ neutralizes to [p] in codas, this would be clear evidence that [p] and [ʘ] share certain features. If we could observe /ʘt/ clusters surfacing as [ǀt], this would similarly point to certain features being shared between [ʘ] and [ǀ], and between [ǀ] and [t]. Patterns not too far off from this do seem to be attested, but the details of such cases limit what we can glean from them.

23  In contrast to, e.g. Dogil and Mayer’s (1998: 170) claim that clicks are ‘less encoded in the speech string than other sounds’.

In Gǀui, for example, lateral


clicks cause a following liquid to surface as [l], where it would normally be [ɾ~r]. This points to clicks like [ǁ] bearing the same [+lateral] feature assumed for [l] – or, at least it would point to that, if the /ɾ/→[l] alternation were not also triggered by pharyngealized vowels that are clearly not [+lateral]. By the same token, Fwe shows variation between clicks with various anterior places, and dorsal consonants [k ɡ ŋ]. This points to clicks sharing the same [dorsal] feature as velars. However, since the variation is free rather than phonologically conditioned, it also submits to a ‘pseudo-optionality’ analysis in which speakers vary between two underlying forms of the same word. On this interpretation, the connection between the clicks and non-clicks is not a phonological one, but more akin to morphological variation (e.g. burnt ~ burned). Additionally, the Back Vowel Constraint points to a feature being shared between [ʘ ǃ ǁ] and [q χ], and not shared with [ǀ ǂ]. But, Miller et al. (2006; 2009a) find that the position of the tongue root in [ǃ] is much closer to that of [q] than in [ǂ], and conclude that this difference is a consequence of the tongue body shape needed to maintain a cavity between the two constrictions. If the BVC can be explained by physiology, it calls into question the need for a phonological account of it.24 The available data shows that clicks do participate in phonology in general; but, the confounds of specific cases make it difficult to draw generalized conclusions about the phonological structure of clicks. Given the paucity of evidence from active phonological patterns, proposals about the phonological representation of clicks have often relied heavily on arguments from one or another kind of theoretical parsimony. 
Thus, click inventories have commonly been analyzed – and even described – in whatever terms make them most intuitively analogous to the non-click inventories, or in the terms that make them most analogous to other previous descriptions of other languages. By and large these sorts of assumptions are made on a case by case and language by language basis, or else on a historical and comparative basis, rather than as part of a well-defined theory of phonology.

24  Along similar lines, Beach (1938: 75) characterizes dental clicks as ‘affricative’, because there is frication accompanying the release of the front closure. However, this frication has a straightforward anatomical explanation: teeth have gaps between them, and therefore do not form an air-tight seal. Thus, the frication Beach noted is not a legitimate basis for positing a [+delayed release] feature as subsequent feature theories have done (Chomsky and Halle 1968; Sagey 1986, among others). A similar point is made by Roux (2007), who argues that the turbulent noise in dental and lateral clicks actually begins before the burst, and is therefore ‘pre-affrication’. A similar story also explains the noisiness of release in lateral clicks (which are also articulated against the teeth), without positing a [delayed release] feature for them.

However, these


do add up to incremental progress towards such a theory, by making data and generalizations more available for analysis of the cross-linguistic typology of phonological systems that have clicks.

3 Typological Patterns in Click Systems

While active phonological alternations may not paint a full picture of clicks, many important insights can be gleaned by looking at clicks cross-linguistically, and observing patterns in the typology of click systems. Key among these are implicational universals, which can be observed both across click ‘accompaniments’ (closure, release, and place-independent properties), and click ‘types’ (front places of articulation).

3.1 Nasality and Glottalization

A previous cross-linguistic survey of languages with clicks found that every language which uses clicks as genuine segments (rather than just paralinguistically) has clicks with some form of phonetic nasality – but not vice versa (Bennett 2008, 2017).25 Thus, we find languages which have both oral and nasal clicks, including some that make many further distinctions within both of those categories. We also find languages like Dahalo, which has only nasal clicks. Dahalo does not appear to be a lone fluke in having only nasalized clicks. Though it is the only natural language that seems to use exclusively nasal clicks as part of its normal consonant inventory, we do find the same pattern recurring elsewhere. Walsh (2006) reports two ideophonic words with a nasalized dental click in the Ts’imba variety of Digo, a Bantu language from Kenya. In Damin, a constructed language based on Lardil, Hale and Nash (1997) report a robust inventory of clicks, all of them nasal. Additionally, Nathan (2001) documents a nursery rhyme in Ningdu Chinese where clicks are substituted for velar nasals in two words; the resulting clicks are invariably nasal. Sandawe also manifests this pattern in root-internal position: even though Wright et al. (1995) transcribe non-nasal clicks in medial position in a number of examples, the included aerodynamic data clearly shows accompanying nasality that precedes the click release. Thus, there are languages with only nasal clicks, and languages with both nasal and oral clicks. But there appear to be no languages which have only oral clicks. This typological generalization is summarized in Table 2.3.

Table 2.3 Presence of clicks implies presence of nasalized clicks

Nasal clicks?   Oral clicks?   Example languages
✓               ✓              ǃXóõ, Juǀ’hoan, Khoekhoe, Hadza, Xhosa, Zulu, Fwe, Sandawe (root-initial positions)
✓               –              Dahalo; also Digo, Damin, and root-medial positions in Sandawe; see also Ningdu Chinese, and South African Englishᵃ
–               –              Non-click languages
–               ✓              (Unattested)

ᵃ At least some varieties of South African English appear to attest the same ‘all clicks are nasal’ pattern as Dahalo. The word [nǃa] or [nǀa] (with variant form [nǃanǃa]~[nǀanǀa]) ‘nice’, which has a nasal click, is used even by functionally monolingual English speakers, including speakers who systematically fail to produce oral clicks in other words (such as the language name Xhosa [ǁʰɔsa], often substituted as [kʰosə]). Anecdotal reports of linguists pronouncing this language name as [nǁosə] also abound, suggesting a grammar that maps /ǁ/ to [nǁ].

25  The claim that all languages with clicks have nasal clicks was initially made by Bennett (2008), but the crucial facts of languages like Dahalo and Sandawe were observed long before; see, e.g. Elderkin (1992), as well as Ladefoged and Traill’s (1984) finding that ‘delayed aspirated’ clicks in Nama are phonetically nasal.

Bennett (2008, 2017) proposes an explanation for the click nasality universal based on the aerodynamic properties of clicks. The ubiquity of nasalized clicks is explained by treating their nasality as a product of nasal venting, a consequence of maintaining pulmonic airflow during a click, rather than as the realization of a genuine phonological [+nasal] feature specification. This mechanism of nasal venting is well documented: it has been independently observed in non-clicks (since at least Yanagihara and Hyde 1966), and as a means of enhancing a pre-voicing contrast in oral clicks with a uvular posterior closure, i.e. [qǃ] vs. [(ɴ)ɢǃ] (Ladefoged and Traill 1994; Gerlach 2015: 89ff). On this proposal, clicks with nasality are classified together with pulmonic, non-click consonants, as segments using a pulmonic airstream. The ǃ → Nǃ implication then follows from a theory where pulmonic consonants are always mapped faithfully. The prevalence of nasalized clicks thus connects to a broader typological fact: that every language with click consonants also has non-clicks.
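The generalization in Table 2.3 amounts to a one-way implication: oral clicks imply nasal clicks. The Python sketch below simply encodes that implication and runs it over the four logically possible inventory types; the function name and the boolean encoding are assumptions made for illustration, and the sketch says nothing about why the implication holds.

```python
def obeys_click_nasality_universal(has_nasal_clicks, has_oral_clicks):
    """Oral clicks imply nasal clicks; every other combination is allowed."""
    return has_nasal_clicks or not has_oral_clicks


# (nasal clicks?, oral clicks?) for the four rows of Table 2.3
inventory_types = {
    "ǃXóõ": (True, True),
    "Dahalo": (True, False),
    "non-click language": (False, False),
    "oral clicks only": (False, True),
}

for label, (nasal, oral) in inventory_types.items():
    print(label, obeys_click_nasality_universal(nasal, oral))
```

Only the last row fails the check, which matches the one cell of the table marked as unattested.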


The aerodynamic analysis of click nasality has broader consequences for other kinds of consonants, depending on how the implementation of the pulmonic airstream is formally defined. The proposal made by Bennett (2008, 2017) is that [+pulmonic] is realized by having the pharyngeal pressure be neutral or low, while [-pulmonic] is realized as increased pharyngeal pressure. On this definition, non-nasal clicks may be predicted to pattern together with certain kinds of non-click obstruents, such as pharyngeal stops, or other stops with secondary articulations targeting elevated pharyngeal pressure (such as, perhaps, tense stops in Korean). Nasal clicks, by contrast, are predicted to pattern together with other kinds of clicks that employ alternative means of decreasing pharyngeal pressure, such as reducing airflow into the pharyngeal cavity through glottal constriction. This seems to be borne out: languages that have distinctions between voiced and voiceless nasalized clicks typically have glottalization on the voiceless ones – restricting airflow out of the lungs in much the same way that voicing does.

3.2 Place Implications

Similar implicational universals can be observed across anterior places of articulation. In languages with relatively small click inventories, three anterior places tend to recur: dental [ǀ], alveolar-lateral [ǁ], and (post)-alveolar [ǃ]. The click inventories of a wide range of unrelated languages contain exactly these three click types, including Hadza, Sandawe, Zulu, and Yeyi. While further articulatory study is needed to verify whether the details of the closures are exactly the same across these languages, these three places nonetheless appear to be privileged among clicks. Languages which have more than 3 click types all appear to have a superset of these. Known languages that distinguish 4 click types all appear to have [ǀ ǁ ǃ], plus palatal [ǂ] (e.g.
Khoekhoe, Yeyi, Juǀ’hoan, Gǀui, Kxoe), or plus fricated palatal [⨎] rather than palatal [ǂ], in the case of Ekoka ǃXun (König and Heine 2001: 22; Sands, this volume). Languages with 5 distinct click types add either palatal [ǂ] and labial [ʘ] (e.g. ǃXóõ, Nǀuu, ǂHoan26), or palatal and retroflex (Grootfontein ǃXung; Miller et al. 2009b; Sands, this volume). Languages with fewer than three distinct click types all appear to have a subset of [ǀ ǁ ǃ]. Thus, in Dahalo, for instance, we find only [ǀ] (Maddieson et al. 1993), and in Sesotho only [ǃ] (Doke 1954: 35).

26  See Bell and Collins (2001) for ǂHoan.

In other languages there seems to be more variation. Fwe, for instance, has no contrasts for click type, but the clicks produced by speakers appear to be limited to the [ǀ ǁ ǃ] (Gunnink 2013; Bostoen and Sands 2012), or [ǀ ǁ ǂ] types (Gunnink et al., 2015, Gunnink, this


volume). Similar free variation confined to within these three click types is also reported in Gciriku (Möhlig 2013: 447).

Additionally, there is a further asymmetry between lateral clicks on the one hand, and dental and alveolar ones on the other. While there are languages described as having only dental clicks (Dahalo), or only alveolar clicks (Sesotho), there appear to be no reports of languages with only lateral clicks. Siswati (Ziervogel 1952: 8) is reported to have free variation between dental and alveolar clicks, but to systematically lack lateral ones – despite being closely related and in contact with languages like Zulu, that have them. Certain varieties of Ndebele are also reported to lack lateral clicks (Wilkes 2001: 315; see also Skhosana 2009: 74). Cumulatively, these languages paint a picture where the simplest click systems cannot include lateral clicks to the exclusion of dental and/or alveolar clicks. The resulting typological picture thus contains a number of apparently universal implications, summarized together in (9). Retroflex and bilabial clicks both imply palatal clicks, and palatal clicks imply dental, lateral, and alveolar clicks.27 Additionally, lateral clicks imply at least one other type of click – either dental or alveolar.28

(9) Implicational universals among click types in inventories
a. {ǃǃ, ʘ} → ǂ → ⟨ǀ ǁ ǃ⟩
b. ǁ → ǀ or ǃ

27  An exception to this is the case of ǁXegwi (Traill 1999), which retains bilabial clicks but has changed Proto-ǃUi palatal clicks to laterally-released affricates. However, the available data on ǁXegwi is not quite rich enough to be fully confident that palatal clicks were entirely absent from the language.

28  A possible exception to this is the case of Dahalo. Dahalo lacks a ǃ click type but has been reported as having a lateral click allophone of the dental click (cf. Sands, this volume), which has alternatively been claimed to be a contrastive click type (Ehret 2013).

3.3 Recap

Comparing across the typology of click inventories reveals things about the phonological structure of clicks that cannot be determined from known cases of phonological alternations. Traditional analyses of nasal clicks have regarded them on par with non-click nasals. However, the way they pattern in segmental inventories is at odds with this: oral clicks imply nasal clicks, while in non-clicks, this is clearly not the case. This suggests that nasality


in some clicks may be fundamentally not the same as nasality in other segments: a conclusion which cannot be drawn based on known phonological alternations. In the same vein, comparing click types (anterior places) across inventories reveals a rich structure of complexity, where certain click types are clearly more marked than others – even though there is no universally least-marked click type. These examples point to examination of the typology of click systems as a promising means of reaping further insights about clicks and non-clicks alike.

4 Approaches to Representing ‘Clickness’

Approaches to phonologically representing the distinction between clicks and non-clicks can be divided into two general strategies. The first group seeks to derive the distinction between clicks and non-clicks in a non-featural way, representing clicks using the same set of phonological primitives posited on the basis of non-clicks. The second group posits a phonological feature to separate clicks from non-clicks. The former approach treats the lingual airstream as a sort of incidental side-effect of particular combinations of other features, such as the overlap of multiple closures. The latter builds the property of being a click into the universal framework for representing segments. Without any clear phonological alternations between clicks and non-clicks, the choice of how to represent this distinction is a very much unsolved problem. The extant proposals are considered in turn here.

4.1 Dorsal Articulation and Velarization

Some approaches to representing clicks, stemming from early work in generative phonology, distinguish clicks from non-clicks based on their having a second posterior closure. For instance, Trubetzkoy (1969) analyzes clicks as consonants with ‘extreme velarization’. This suggestion is followed by Chomsky and Halle (1968), in the framework laid out in The Sound Pattern of English, which represents clicks as consonants with the same feature combination [+high, +back] that characterizes other velar consonants. The same essential idea is also prominent in later work set in feature geometry, particularly Sagey (1986). Sagey’s proposal treats clicks as complex segments having two [Place] nodes, one for the front closure of the click, and one for the back closure. The two-place approach to clicks does not need to assume any direct representation of airstream mechanisms.
This seems like a conceptual advantage, particularly because consonants with a glottalic airstream – implosives and ejectives – have standardly been handled with the laryngeal features


[±spread glottis] and [±constricted glottis], without the need for a distinct feature for their airstream. But, this approach yields a number of problems as well. If clicks are treated as merely velarized consonants, then there is no straightforward way to distinguish them from velarized non-clicks, such as [kʷ]. Distinguishing labio-velars from bilabial clicks is particularly problematic, since both have simultaneous bilabial and dorsal closures. Sagey’s (1986) solution is an extra formal apparatus, the ‘pointer’, which designates one place in a complex segment as primary, and as the closure to which the segment’s closure features apply. This allows for a way to separate labiovelars and labial clicks, by designating [Labial] and [Dorsal] as primary places in [k͡p], but just [Dorsal] as primary in [ʘ]. The pointer has been widely criticized on theoretical grounds (Traill 1993; Halle et al. 2000) since it is a fundamentally different kind of primitive than features and association lines. Its empirical adequacy is also questionable, particularly for labial clicks. Sagey (1986: 207) treats nonlabial clicks as having a pointer designating [Dorsal] as the primary articulation in clicks, but this structure is parallel to the representation posited for labialized velars like [kʷ]. This leads the framework to offer an unhappy outcome: the only way to make a three-way distinction between [kʷ], [k͡p], and [ʘ] is to treat labial clicks as having a much different representation than all other clicks.

4.2 An Airstream Feature for Clicks

Other approaches are more direct, and encode the distinction between clicks and non-clicks in the set of features assumed, on the grounds that clicks have a fundamentally different mechanism of production. The basic notion of an airstream feature was long ago suggested as a phonetic feature by phoneticians (see, e.g.
Abercrombie 1967; Ladefoged 1971), in work which treats features primarily as a frame for description and categorization, rather than as the essential atomic unit of phonological patterns. In phonological work, we can also find the same sort of direct representation of airstream. Halle (1995) posits a feature [±suction], with all and only clicks bearing [+suction], and all other segments bearing [–suction].29

29  The same basic idea of having a property to separate clicks based on their airstream mechanism also appears in frameworks where features play a less primary role. For example, Ewen (1980) posits a dependency model which does not have a specific featural element for clicks, but the framework does differentiate them from non-clicks using a distinct structural component, the ‘initiatory gesture’. Thus, though no feature directly encodes the lingual airstream, there is still a distinct configuration of elements that has that effect, and which is distinct from the representation of dorsal consonants.

Some other, more recent, work on clicks


specifically posits instead a [lingual] airstream feature (Miller et al. 2006, 2009a; Miller 2006, 2011; see also Nakagawa 2006; Bradfield 2014, whose proposals involve similar features). Positing a distinct feature for clicks – whether framed in terms of suction, or in terms of the lingual airstream mechanism – seems wholly descriptively adequate for clicks. However, previous criticisms have pointed out that airstream features do not obviously generalize beyond clicks: they conclude that it is not necessary to posit a feature such as [suction] or [glottalic airstream] in order to explain implosives, for instance (Halle and Stevens 1971; Sagey 1986; Clements and Osu 2002). Similarly, it is intuitive that an airstream feature might tie clicks to ejectives, and this might seem appealing given languages like ǃXóõ and Nǀuu, where clicks and ejectives are both restricted to word-initial position (Traill 1985; Miller 2006, 2011, etc.). But, it is not obvious what airstream property is shared between clicks and ejectives besides [–pulmonic]; and previous airstream-free treatments of ejectives have met with considerable success (e.g. [+constricted glottis], Fallon 2002; also [long VOT], Gallagher 2011). If airstream features are redundant phonetic details in all segments other than clicks, then a [±lingual] airstream feature offers little more explanation than merely stipulating a feature [±click].

4.3 Concurrent Segments?

A more radical proposal has recently been made by Bradfield (2014), which treats clicks as concurrent phonemes. This is closely affiliated with cluster analyses of clicks (discussed in §4 below), but goes one step further: it takes all clicks to be clusters, necessarily, which are produced simultaneously instead of sequentially.
Thus, an aspirated click [ǃʰ] is analyzed as the fusion of two phonemes: a ‘pure click’ /ǃ/, and a ‘pure accompaniment’ /_ʰ/, which has the specification [+spread glottis], along with other specifications for its laryngeal and manner properties. The concurrency proposal thus dissects all clicks into two components, corresponding to the ‘influx’ or ‘click type’, and to the ‘efflux’ or ‘click accompaniment’, respectively. What sets Bradfield’s proposal apart from previous approaches is that the two components of clicks are taken to be phonemes in their own right, and that they are produced concurrently instead of sequentially. A rule of fusion combines two concurrent segments into one, which bears the union of their features. A universal constraint is stipulated to require the ‘pure click’ segments to combine with exactly one ‘accompaniment’ segment. The resulting theory posits a [lingual] feature to distinguish clicks from all other segments, but uses it in a highly restrictive way: it is all and only the ‘pure clicks’ that bear


this feature, and they may not combine it with any other airstream (in contrast to other proposals that aim to use airstream features to cross-classify, or that allow unit clicks to have airstream contours). The theory also bears a strong similarity to certain non-featural approaches (especially Sagey 1986) in that accompaniments are represented with two concurrent articulatory pieces, which are combined at a level within the segment proper. On its own, Bradfield’s proposal does not take a fundamentally different stance than those outlined above. In essence, the ‘pure clicks’ can be understood as ostensibly the bundling of a [lingual] airstream feature together with the place and other features that define the major click types. So, the basic idea is closely akin to cluster analyses of clicks. The novelty of the proposal lies in its formal characterization of the clustering operation, which sets up an intuitive connection to the notion of simultaneity, and a different kind of connection between click and non-click inventories. If the concurrency proposal is extrapolated a bit further, though, an interesting and even more novel possibility emerges: what if clicks are themselves concurrent consonants? This idea is effectively the same as treating the ‘pure’ click element as being equivalent to [k] or [q], as Bradfield (2014: 40) points out. While this kind of account presents a number of the same problems as radical cluster analyses (see §5 below), it would seem to dovetail very elegantly with numerous reports of ‘intrusive’ clicks emerging from consonant clusters such as [tk] or [pk] (Silverman and Jun 1993; Ohala 1995; Demolin and Delvaux 2001; Koreman and Morland 2007; Fuchs et al. 2007). Such an approach could potentially offer a way to unify clicks and labio-velar stops, without appealing to a distinct featural structure such as a complex segment.
Whether this idea could be made to fit with the phonetic realities of clicks is very much uncertain, though – particularly since some languages like Nǀuu are known to have uvular constrictions in clicks, without having a full complement of uvular non-click plosives.

5 Complexity: Units vs. Clusters

Some languages have very small inventories of clicks; Dahalo, for instance, has been described as having only four clicks (Maddieson et al. 1993) – and this number can be reduced to just one click if following labialization is treated as a segmentally-distinct [w]. Other languages, at the other end of the spectrum, exhibit clear contrasts between large numbers of clicks, sometimes dwarfing the non-click inventory. ǃXóõ, for instance, has 39 non-click consonants, compared


to 80 distinct clicks (Traill 1985: 99).30 As Traill (1985: 208) points out, this puts the inventory for ǃXóõ at nearly one and a half times as many consonants as the previously suggested record of 80 consonants in Ubykh (Catford 1977: 291). Large click systems like this pose two recurrent questions: first, why is it that languages with clicks can have such large consonant inventories? Second, how are the clicks in such inventories connected to the non-click consonants in the same language?

5.1 Unit and Cluster Analyses of Clicks

Traditional descriptive work on clicks often recognizes their phonetic complexity, but generally regards them as phonologically unitary. Thus, Snyman (1978) describes a number of the clicks in Juǀ’hoan as consisting of two ‘phonetic segments’, but regards them as single phonological units. Similar ideas can be found throughout other descriptive works (see e.g. Bleek 1928–29; Doke 1936; Beach 1938; Lanham and Hallowes 1956; Snyman 1975). Treatments of clicks in phonological theories generally drew on these descriptions, and also took a unitary approach (Trubetzkoy 1969; Chomsky and Halle 1968), with complexity being encoded below the level of the segment (Sagey 1986).

Some comparatively recent work has re-analyzed certain consonants, especially clicks, as consonant clusters. Examples of this approach, initially suggested by Traill (1985; 1993) include Güldemann (2001), Nakagawa (2006), and Naumann (2008, 2016), as well as Bradfield (2014). The core observations at the foundation of this approach are two. The first is that large inventories of clicks can yield dizzyingly large consonant inventories. The second key observation is that many of the clicks in ǃXóõ – and elsewhere in Khoisan – are conspicuously similar to independently-attested consonants.
Some of these, such as [χ] in ǃXóõ, involve phonetic components which also occur as releases of non-clicks, suggesting some degree of permutability on par with independent segments in languages that tolerate clusters more freely. In this case, Bradfield (2014: 20) astutely notes that full decomposition of clicks reduces the phoneme inventory considerably: the combinations of click types and accompaniments change from 5 × 16 to 5 + 16, reducing the number of clicks for ǃXóõ to nearly one quarter as many (21 vs. 80). Thus, it seems that cluster analyses of clicks solve two problems with one stroke: they bring the size of segment inventories down into the range expected based on languages with no clicks, and they unify click and non-click systems (by effectively positing that some clicks include a non-click component drawn from the same inventory).

30  Traill (1994b) adds several more consonants, but they appear to be marginal and rare, so I do not include them in this count. See also Güldemann (2006) for counts of clicks and non-clicks, and ratios between them, in a number of other Khoisan languages.

As an illustration of the cluster approach, let us consider the consonant inventory of ǃXóõ, given in Table 2.4 below. Consonants are transcribed based on the descriptions given by Traill (1985: 151), and the transcriptions of Ladefoged and Traill (1994: 49); consonants with pre-voicing are indicated as voiced stops with a pre-posed voicing diacritic (i.e. as [ˬd] rather than as a voiced-voiceless sequence like [dt]). Segments in parentheses are marginal; those also marked with asterisks are only listed by Traill (1994b), not Traill (1985). Some of the clicks have been described in multiple different ways, and warrant some clarification (see Bradfield’s (2014) appendix for more detailed discussion). The voiceless aspirated clicks are transcribed by Traill (1985) with a uvular closure, as [ǃqʰ], but are given with a velar one ([kǃʰ]) in later work. Clicks with a voiced uvular stop accompaniment are described by Traill (1985) as having prenasalization, which is not included in later transcriptions ([ɢǃ] = [⁽ᶰ⁾ɢǃ]). The descriptions of each series of clicks come from Traill (1985: 124); the pairs of numbers that follow indicate the corresponding row in tables given in other sources (Ladefoged and Traill 1994; Traill 1985: 124, 151).

Table 2.4 ǃXóõ inventory

Non-clicks | Labial | Dental | Post-dental, palatal | Velar | Uvular | Glottal
Stops | (p) b (pʰ) | t d tʰ ˬd̥ʰ (t’)* | | k ɡ kʰ (ˬɡ̥ʰ)* (k’)* | q ɢ qʰ (ˬɢ̥ʰ)* (q’) | ʔ
Affricates | | | ts dz tsʰ ˬd̥sʰ ts’ ˬd̥s’ | kx’ ˬɡ̥x’ | |
W/ uvular fric. | | tχ ˬd̥χ | tsχ ˬd̥sχ | | |
W/ eject. uvular affric. | (p’kχ’)* | t’kχ’ ˬd̥’kχ’ | ts’kχ’ ˬd̥s’kχ’ | | |
Fricatives | (f)* | | s | x | | (h)*
Nasals | m ʼm | n ʼn | | (ŋ) | |
Liquids | | (l) | | | |

Clicks | Bilabial | Dental | Alveolar | Lateral | Palatal
Voiceless unaspirated (2/1) | kʘ | kǀ | kǃ | kǁ | kǂ
Voiced (1/2) | ɡʘ | ɡǀ | ɡǃ | ɡǁ | ɡǂ
Voiceless aspirated (3/8) | kʘʰ | kǀʰ | kǃʰ | kǁʰ | kǂʰ
Voiceless uvular stop (5/6) | qʘ | qǀ | qǃ | qǁ | qǂ
Voiced uvular stop (with prenasalization) (4/7) | ⁽ᶰ⁾ɢʘ | ⁽ᶰ⁾ɢǀ | ⁽ᶰ⁾ɢǃ | ⁽ᶰ⁾ɢǁ | ⁽ᶰ⁾ɢǂ
Voiced and aspirated (14/15) | ɡʘh | ɡǀh | ɡǃh | ɡǁh | ɡǂh
Voiceless nasalized (7/5) | n̥ʘ | n̥ǀ | n̥ǃ | n̥ǁ | n̥ǂ
Voiced nasalized (6/3) | nʘ | nǀ | nǃ | nǁ | nǂ
Preglottalized and nasal (8/4) | ʔnʘ | ʔnǀ | ʔnǃ | ʔnǁ | ʔnǂ
Delayed aspiration (with prenasalization) (9/12) | ⁽ᶰ̥⁾ʘʰ | ⁽ᶰ̥⁾ǀʰ | ⁽ᶰ̥⁾ǃʰ | ⁽ᶰ̥⁾ǁʰ | ⁽ᶰ̥⁾ǂʰ
Glottal stop (15/13) | kʘʔ | kǀʔ | kǃʔ | kǁʔ | kǂʔ
Uvular fricative (10/9) | kʘχ | kǀχ | kǃχ | kǁχ | kǂχ
Voiced and uvular fricative (11/14) | ɡʘχ | ɡǀχ | ɡǃχ | ɡǁχ | ɡǂχ
Ejected uvular stop (16/10) | qʘ’ | qǀ’ | qǃ’ | qǁ’ | qǂ’
Ejected uvular affricate (12/11) | kʘχ’ | kǀχ’ | kǃχ’ | kǁχ’ | kǂχ’
Voiced and ejected uvular affricate (13/16) | ɡʘqχ’ | ɡǀqχ’ | ɡǃqχ’ | ɡǁqχ’ | ɡǂqχ’
Pre-voiced and aspirated uvular stop (17/–) | | (ɡǀqʰ)* | (ɡǃqʰ)* | (ɡǁqʰ)* |

Source: Ladefoged and Traill 1994; Traill 1985, 1994b
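The inventory arithmetic cited above (Traill 1985; Bradfield 2014) can be checked with a few lines of Python. This is only a sketch of the counting argument: the figures are those quoted in the text, and the variable names are introduced here purely for illustration.

```python
# Figures quoted in the text; the marginal 17th click series is left out,
# as in the counts given by Traill (1985).
CLICK_TYPES = 5        # bilabial, dental, alveolar, lateral, palatal
ACCOMPANIMENTS = 16    # click series (rows of the click table)
NON_CLICKS = 39        # non-click consonants of ǃXóõ
UBYKH_CONSONANTS = 80  # previously suggested record (Catford 1977)

# Unit analysis: every type/accompaniment combination is its own phoneme.
unit_clicks = CLICK_TYPES * ACCOMPANIMENTS

# Full decomposition: each click type and each accompaniment is a phoneme.
decomposed_clicks = CLICK_TYPES + ACCOMPANIMENTS

total_unit_inventory = NON_CLICKS + unit_clicks

print(unit_clicks, decomposed_clicks, total_unit_inventory)
print(total_unit_inventory / UBYKH_CONSONANTS)  # roughly one and a half
print(decomposed_clicks / unit_clicks)          # roughly one quarter
```

The multiplicative count is what makes click inventories grow so quickly under a unit analysis, while the additive count is what a full decomposition buys.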

The ǃXóõ inventory is obviously large by any standard, and offers a range of consonants that intuitively seem like very plausible clusters – including both clicks and non-clicks. Unit analyses (e.g. Sagey 1986; Miller et al. 2009a; Miller 2011) take all clicks to be single units. Previous cluster analyses have proposed various degrees of decomposition. Table 2.5 compares three proposals: Güldemann (2001), and two possibilities outlined by Nakagawa (2006). Güldemann’s proposal divides the clicks into three types: ‘simple’ clicks, treated as phonological units, are specified only for voicing and nasality. Other ‘complex’ clicks are treated as simple clicks combined with a further laryngeal co-articulation, such as aspiration or glottalization. Finally, there are click ‘clusters’, consisting of a simple click and a following consonant. Nakagawa’s Moderate Cluster Analysis (MCA) recognizes the same three categories. The Radical Cluster Analysis (RCA), on the other hand, simplifies things further: it recognizes only a single unit click (one for each place of articulation), and treats all other ‘accompaniments’ as clusters formed with that click. Nakagawa only presents the RCA treatment for Gǀui, and focuses specifically on non-nasal clicks; as a result, I have taken some educated guesses at the RCA treatments of clicks in ǃXóõ, and these inferred analyses are marked with ‘(?)’.

5.2 The Explanatory Value of Cluster Analyses

What is explained by cluster analyses of clicks? Treating certain types of clicks as clusters is quite intuitive, and seems appealing insofar as it brings the total counts of various consonant inventories down closer to the same range observed in many other languages. But, we are still forced to look elsewhere for

Table 2.5 Comparison of three cluster-based analyses of ǃXóõ

 | Güldemann’s (2001) structural analysis | Moderate Cluster Analysis (MCA) (Nakagawa 2006) | Radical Cluster Analysis (RCA) (Nakagawa 2006)
kǃ | unit: voiceless | unit: voiceless | nǃ + k
ɡǃ | unit: voiced | unit: voiced | nǃ + ɡ
kǃʰ | kǃ + qʰ~kʰ | kǃ + qʰ | nǃ + qʰ
qǃ | kǃ + q | kǃ + q | nǃ + q
⁽ᶰ⁾ɢǃ | ɡǃ + q | kǃ + ɢ | nǃ + ɢ
ɡǃh | complex: aspirated ɡǃ | ɡǃ + h (or complex ɡǃʰ) | nǃ + ɡʰ (?)
n̥ǃ | unit: nasal, voiceless | unit: nasal, voiceless | nǃ + h (?)
nǃ | unit: nasal, voiced | unit: nasal, voiced | unit
ʔnǃ | complex: glottalized nǃ | complex: glottalized nǃ | nǃ + ʔn (?)
⁽ᶰ̥⁾ǃʰ | complex: aspirated kǃ | kǃ + h (or complex kǃʰ)a | nǃ + kʰ (or nǃ + h)
kǃʔ | complex: glottalized kǃ | kǃ + ʔ (or complex kǃ’) | nǃ + k’ (or nǃ + ʔ)
kǃχ | kǃ + x | kǃ + χ | nǃ + χ
ɡǃχ | ɡǃ + x | ɡǃ + χ | nǃ + ɢχ (?)
qǃ’ | kǃ + q’ | kǃ + q’ | nǃ + q’
kǃχ’ | kǃ + kx’~k’ | kǃ + qχ’ | nǃ + qχ’
ɡǃqχ’ | ɡǃ + kx’~k’ | ɡǃ + qχ’ | nǃ + ɢ̥χ’ (?)
(ɡǃqʰ)* | ɡǃ + kʰ~qʰ | kǃ + ɢ̥ʰ | nǃ + ɢʰ (?)
tχ | t + x | t + χ | t + χ
ˬd̥χ | d + x | d + χ | d + χ
t’kχ’ | t + kx’~k’ | t + qχ’ | t + qχ’
ˬd̥’kχ’ | d + kx’~k’ | d + qχ’ | d + qχ’

a Nakagawa (2006: 293) includes distinctions for ǃXóõ not made by Güldemann, Traill, or Ladefoged and Traill. Specifically, he includes both voiced and voiceless aspirated clicks [kǃʰ] and [ɡǃʰ], as well as clusters of voiced and voiceless clicks followed by /h/, [kǃh] and [ɡǃh]. The former are defined in the MCA as complex stops, and the latter as clusters. It is somewhat unclear how this 4-way distinction connects to Traill’s previous descriptions, and to Güldemann’s analysis: Nakagawa’s inventory ends up with 20 sets of clicks, vs. the 17 sets encountered in prior work. This may in part reflect dialect differences; see Naumann (2008, 2016) for discussion.

explanations of the structure of combined inventories, and to explain the phonetic details of how clicks relate to non-clicks. Some of these issues are reviewed in this section.


5.2.1 Phonetic Disparities between Clicks and Non-clicks?

Strong cluster analyses like Güldemann (2001) involve a trade-off between phonetic accuracy or consistency and reducing the size of consonant inventories. Güldemann’s (2001: 29) analysis of Juǀ’hoansi groups nasal (voiced) aspirated clicks together with (pre)voiced aspirates like [dʱ] and [dsʱ], which involve no nasality at all. Thus, the phonological analysis of [nǁʱ] as a voiced aspirate leaves one of its systematic phonetic characteristics – nasalization – unaccounted for. The motivation behind the analytical choice seems quite straightforward: Juǀ’hoansi has only one set of nasals, which are unaspirated, and therefore are seen as parallel to unaspirated nasal clicks.31 Since the nasal aspirated clicks are voiced and aspirated, pairing them with the voiced aspirates is the next most obvious choice, if the nasals are already taken to be parallel to something else. But the consequence of this analysis is that there are clear phonetic disparities between the clicks and the non-click consonants that they are ‘integrated’ with. Integrating clicks with non-clicks requires the analysis to overlook clear differences between them. Creating groups by simply overlooking disparities is an exercise doomed to succeed – regardless of whether it explains anything or not. In the Juǀ’hoansi case, grouping nasal aspirated clicks together with voiced aspirates does not explain why the former have detectable nasality, and the latter do not. This clearly does not follow from some more general asymmetry between clicks and non-clicks. For instance, we could imagine that the retention of a posterior closure during a click requires some degree of nasal venting in order to accommodate the continuation of airflow needed to produce voicing.
This story seems entirely unsatisfactory, though, when we consider the fact that Juǀ’hoansi also has voiced aspirated clicks with no nasality [ɡǀʱ ɡǂʱ ɡǁʱ ɡǃʱ], which Güldemann analyzes as clusters of a voiced click plus an aspirated stop (e.g. /ɡǀ+kʰ/). So, the co-occurrence of aspiration and voicing during a click is not a viable explanation for why ‘voiced + aspirated’ clicks are phonetically nasal. This phonetic detail is not a mere artifact of coarticulation, but rather a non-predictable – and indeed contrastive – difference; and it is a difference that we gain no insights about from merely listing [nǁʱ] on the same row as [bʱ dʱ].

31  Note however that Miller-Ockhuizen (2003: 21) does report exactly one clitic, [mʱii], with an aspirated nasal; though neither Miller-Ockhuizen nor Güldemann include this in the consonant inventory.

The language Nǀuu provides a particularly interesting case study for the cluster idea, largely because of direct ultrasound data collected in the work of Miller et al. (2009a). While Nǀuu clicks had previously been described as having a velar closure (Doke 1936), Miller et al. find that all of the clicks actually have uvular posterior constrictions.32 As such, click ‘accompaniments’ which were long thought to be velar turn out to actually be uvular. Thus, the click initially transcribed by Doke as [ǃkʰ] is actually [ǃqʰ]. The distinction previously attributed to a uvular vs. velar posterior place, Miller et al. find, is actually one of timing and loudness of releases. The ‘velar’ clicks actually have a uvular constriction with no audible posterior release; the clicks described as ‘uvular’ have releases that produce an audible burst which follows the anterior release. It is worth noting, though, that Güldemann (2001: 22) very astutely identifies the velar vs. uvular discrepancy in clusters as an area where the model requires further work.

The finding that Nǀuu clicks are uniformly uvular challenges the prospects for a cluster analysis to integrate them with non-clicks. Güldemann’s (2001) cluster analysis posits four types of clusters: /ǃ+x/, /ǃ+kx’/, /ǃ+k’/, and /ǃ+kʰ/, based on the contrasts reported among velars in previous work (Doke 1936). We can envision a revised analysis that captures the phonetic reality that clicks in Nǀuu are uniformly uvular, by taking them to be clusters with uvular offsets, i.e. /ǃ+χ/, /ǃ+qχ’/, /ǃ+q’/, and /ǃ+qʰ/. But this encounters an immediate problem: Nǀuu doesn’t have [qʰ] (Miller et al. 2009a: 132). The result is an uncomfortable choice to make: either the cluster analysis of Nǀuu clicks must posit non-clicks that do not exist, or it must ignore the fact that all clicks – including clusters with velars – are systematically uvular and not velar. For the discussion that follows, I will retain the velar ~ uvular ‘distinction’ in transcriptions as convention, in the interests of engaging cluster analyses that assume this distinction on their own terms.

5.2.2 Predicting Clicks from Non-clicks?

Does a cluster analysis explain click inventories better than a unit analysis?
Put differently, does treating (some) clicks as clusters allow us to predict which clicks a language will have, based on its non-click inventory? Here, again, the answer seems to be ‘not quite’. In some cases, we can find clear instances where the absence of a particular segment in the non-click inventory goes hand in hand with the absence of a certain kind of click. For example, Gǀui has a four-way contrast in uvular stops: [q ɢ qʰ q’] (Nakagawa 2006), and Nakagawa reports a corresponding distinction between four sets of clicks [kǀq kǀɢ kǀqʰ kǀq’]. Languages without such uvular stops are not reported to have these distinctions among clicks: Khoekhoe has no uvular stops of any sort, and traditionally has not been described as having any clicks with uvular releases (Beach 1938; Brugman 2009). In many other cases, though, there is not a very clear or concrete connection between the non-click inventory and the sub-inventory of clicks that are presumed to be clusters. As an example, consider the inventory of Khoekhoe (Nama), given in Table 2.6 (Brugman 2009; Ladefoged and Traill 1984; Beach 1938).

32  The precise location varies somewhat depending on the front constriction; in some clicks, Miller et al. (2009a) report the posterior closure as upper pharyngeal, rather than properly uvular.

Table 2.6 Inventory of Khoekhoe

Non-clicks | Labial | Alveolar | Velar | Glottal
Stops | p~b | t~d | k~ɡ | ʔ
Affricates | | ts ~ tʃ | kx ~ kʰ |
Fricatives | | s | x~χ | h
Nasals | m | n | |
Liquids | | ɾ | |

Clicks | Dental | Alveolar | Lateral | Palatal
Voiceless, plain | ǀ | ǃ | ǁ | ǂ
Affricated | ǀχ ~ ǀʰ | ǃχ ~ ǃʰ | ǁχ ~ ǁʰ | ǂχ ~ ǂʰ
Voiced nasal | ŋǀ | ŋǃ | ŋǁ | ŋǂ
Voiceless nasal, ‘delayed’ aspirated | ŋ̥ǀʰ | ŋ̥ǃʰ | ŋ̥ǁʰ | ŋ̥ǂʰ
Voiceless nasal, glottalized | ŋ̥ǀˀ | ŋ̥ǃˀ | ŋ̥ǁˀ | ŋ̥ǂˀ

In the click inventory of Khoekhoe, we see a five-way distinction at each place of articulation. Three of these involve nasality. This is puzzling if we view clicks as clusters: the language has only one set of nasals, which are always voiced. From the non-click inventory, we should therefore predict just one set of nasal clicks, with normal (modal) voicing; this is not the case. Alternatively, based on the click inventory, we would predict a distinction between voiced and voiceless nasals in the non-click inventory; this too is not the case. Güldemann (2001) resolves this tension by analyzing the voiceless nasalized clicks as clusters with /h/ and /ʔ/. This fits quite sensibly with the glottalization and aspiration that have been reported since Beach’s (1938) excellent descriptions. But, it does mean looking elsewhere for an explanation of why these clicks are phonetically nasalized – and it requires a concomitant relaxing of the parallelism between clicks and non-clicks that is touted as a central aim of the cluster approach. The result, then, is an analysis that is not internally consistent. Nasality


is an unexplained phonetic detail among voiceless clicks, while in the voiced clicks it is the main indicator of their phonological specification as nasal; at the same time, voicing is relegated to non-phonological status. The consequence: voiced clicks are diagnosed as phonologically nasal because they are phonetically voiced, and they are phonetically voiced because they are phonologically nasal. A further issue for cluster analyses emerges from the dorsal non-clicks. We can observe that there is a three-way contrast among the non-click dorsals. The first is a stop, traditionally analyzed as /k/, which may sometimes be voiced (Beach 1938: 58), for at least some speakers. The second is an affricated and/or aspirated velar, which varies between [kʰ], [kx], and [kxʰ], both within speakers and from one speaker to another; following Brugman (2009) and Beach (1938), let us take it to be /kx/ (abstracting away from any possible velar/uvular distinction in posterior closure). The third dorsal is a voiceless velar or uvular fricative, [x]~[χ]. Based on this three-way stricture distinction, an aggressive cluster analysis would predict a parallel three-way distinction among clicks, e.g. /ǃ+k/ vs. /ǃ+kxʰ/ vs. /ǃ+χ/. This prediction is not borne out. While the plain voiceless clicks could be interpreted as /ǃ+k/ clusters, the fact remains that there is only one set of clicks with fricated dorsal release: whether the clicks [ǀχ ǃχ ǁχ ǂχ] are analyzed as /ǃ+χ/ or as /ǃ+kx/, there is a distinction made in the non-clicks that is not reflected among the clicks. This is unavoidable, no matter what the specific clusters are taken to be: pulmonic obstruents show a three-way distinction (stop, affricate/aspirate, fricative), while clicks have only a two-way distinction (affricate/aspirate vs. stop). 
This gap is not something that we are likely to find an extrinsic phonetic basis for: other languages such as Gǀui and ǃXóõ and Juǀ’hoan are reported to have contrasts between aspirated and fricated releases, e.g. [kǃx] vs. [kǃʰ], so it seems unreasonable to assume that this is a phonetic impossibility. Güldemann’s (2001) proposal for Khoekhoe takes [ǃχ] to be /ǃ+χ/, and treats [ŋ̥ǃʰ] as parallel to /kx ~ kʰ/, overlooking the affrication in favour of the aspiration; but this still does not give any explanation of why the velar /kx ~ kʰ/ can be produced with dorsal affrication, and [ŋ̥ǃʰ] systematically has none. The point is that even when we take significant liberties in interpreting the phonetics, a cluster analysis does not reveal a systematic logic that predicts which clicks a language has based on its non-click inventory. It is trivially easy to construct an analysis by starting with one type of click and pairing it to the most intuitively similar non-click, and repeating that procedure until all click types are paired to some non-click. But analyses arrived at in this way are fundamentally rooted in the intuitions of the analyst, and therefore need not carry any predictive power or explanatory value.
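The pairing procedure criticized above can be made concrete with a small Python sketch. Everything in it is an illustrative assumption rather than part of any published analysis: the feature labels are ad hoc, the similarity score (shared labels minus mismatched labels) is invented here, and only the alveolar click series of Khoekhoe is included.

```python
# Hypothetical, deliberately coarse feature labels for the Khoekhoe
# alveolar click series and for some non-click consonants.
CLICKS = {
    "ǃ": {"stop"},
    "ǃχ": {"fricated"},
    "ŋǃ": {"nasal", "voiced"},
    "ŋ̥ǃʰ": {"nasal", "aspirated"},
    "ŋ̥ǃˀ": {"nasal", "glottal"},
}
NON_CLICKS = {
    "k": {"stop"},
    "kx": {"stop", "fricated", "aspirated"},
    "χ": {"fricated"},
    "n": {"nasal", "voiced"},
    "h": {"aspirated"},
    "ʔ": {"glottal"},
}


def similarity(a, b):
    """Shared labels minus mismatched labels (an ad hoc score)."""
    return len(a & b) - len(a ^ b)


def pair_greedily(clicks, non_clicks):
    """Pair each click with its most similar non-click.

    Returns {click: (non_click, residue)}, where the residue is the set of
    click properties that the chosen non-click does not account for.
    """
    pairing = {}
    for click, feats in clicks.items():
        best = max(non_clicks, key=lambda seg: similarity(feats, non_clicks[seg]))
        pairing[click] = (best, feats - non_clicks[best])
    return pairing


pairing = pair_greedily(CLICKS, NON_CLICKS)
for click, (partner, residue) in pairing.items():
    print(click, "->", partner, "| unexplained:", sorted(residue))
```

Under these assumptions the procedure never fails to return a partner for each click, which is exactly why its success carries no explanatory weight; what it leaves behind is the residue, such as the nasality of the voiceless nasal clicks once they are paired with /h/ and /ʔ/.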


5.2.3 Predicting Non-clicks from Clicks?

If a cluster analysis of clicks does not make it possible to predict which clicks a language has based on its non-click inventory, what about the reverse direction? If there is a theory of how clustering works in clicks, it is conceivable that it would not allow us to predict which clicks are possible. But, if phonetically complex clicks are analyzed as clusters, then there should be a theory that explains the structure of such clusters. As such, we would expect to be able to find systematic generalizations that cut across clicks and non-clicks when they occur in clusters. And, more specifically, we have a concrete expectation: if a given cluster occurs with clicks, we should expect to find it with non-clicks as well. But even a cursory examination of a few cases reveals that an implication in this direction is falsified.

To see this, let us consider the case of Juǀ’hoan, analyzed in cluster terms by Güldemann (2001), and in unit terms – and in great phonetic detail – by Miller-Ockhuizen (2003), as well as in earlier work by Snyman (1975, 1978, etc.) and Dickens (1994). Juǀ’hoan has 6 velar obstruents: [x k ɡ kx kʰ ɡʱ]. The cluster analysis proposed by Güldemann posits three types of clusters: C+x, C+k’ (realized as [C+kx’]), and C+kʰ, as illustrated in Table 2.7.33 One of these three sets of clusters, C+kʰ, is asserted only to hold with clicks. Another, C+k’ (= [C+kx’]), is attested only with clicks and [t]: it does not occur with affricates like [ts] or [tʃ], and does not exhibit any voicing distinction (*dkx’). Thus, only clusters of the form C+x exhibit a genuine parallelism between clicks and non-clicks. Some details of the analysis seem contentious, but can be overlooked as being not at the core of the proposal. For instance, it is peculiar that voiced clicks should cluster with /kʰ/ instead of /ɡʱ/, when voicing agreement in consonant clusters is so ubiquitous in the world’s languages (see e.g. Lombardi 1999).
But, it would be quite a stretch to decry that as a flaw in Güldemann’s approach, since [ɡǀʱ] could quite feasibly be analyzed as /ɡǀ + ɡʱ/, with the parallelism among clusters still maintained. Plus, pre-voiced consonants do occur, as in pre-voiced ejective affricates like [dʃ’], so a contour from voiced to voiceless is supported elsewhere in the language. Along the same lines, Güldemann treats the clusters like [ɡǃkx’] as /ɡǃ+k’/, which effectively reduces the frication observed in previous descriptions to a phonetic coincidence. It also adds to the consonant inventory in that /k’/ does not occur independently in the language. The motivation for this analytical choice is rooted in maintaining similarities with other Khoisan languages, which do have [k’]. But, if we abandon the requirement for similarity, there is a ready alternative analysis that is more phonetically transparent, and does not require positing a novel and otherwise-unattested segment. While Juǀ’hoansi does not have a plain ejective [k’], it does have an affricate [kx’], and so [ɡǃkx’] can instead be thought of as /ɡǃ + kx’/. And in fact, in both cases, alternative refinements of this sort are actually explicitly proposed by Nakagawa (2006: 292ff) as improvements on Güldemann’s initial proposal. These finer decisions of how to implement the clustering idea are independent from the fundamental idea, which is at issue here.

33  It is worth noting that there are numerous disparities between different descriptive sources. What Güldemann (2001) takes to be an ejective velar affricate (based on earlier descriptions, e.g. Snyman 1975), and analyzes as underlyingly a cluster ending with an ejective stop /C+k’/, Miller-Ockhuizen (2003) characterizes as epiglottalization.

Table 2.7 Juǀ’hoan stop clusters

 | Non-clicks | | | Clicks | | |
 | alveolar stop | alveolar affricate | palatal affricate | dental | palatal | lateral | alveolar
plain + /x/ | tx | tsx | tʃx | ǀx | ǂx | ǁx | ǃx
voiced + /x/ | dx | dzx | dʒx | ɡǀx | ɡǂx | ɡǁx | ɡǃx
plain + /k’/ | tkx’ | | | ǀkx’ | ǂkx’ | ǁkx’ | ǃkx’
voiced + /k’/ | | | | ɡǀkx’ | ɡǂkx’ | ɡǁkx’ | ɡǃkx’
plain + /kʰ/ | | | | kǀʰ | kǂʰ | kǁʰ | kǃʰ
voiced + /kʰ/ | | | | ɡǀʱ | ɡǂʱ | ɡǁʱ | ɡǃʱ

Source: Following Güldemann (2001: 29)

What is more peculiar about the cluster analysis of Juǀ’hoan, and more central to Güldemann’s proposal, is that the clusters C+k’ and C+kʰ occur only with clicks. The analysis of [ɡǃkx’] and [ɡǃkʱ] as clusters clearly requires the assumption that there can be clusters that are permitted only with clicks, and not with non-clicks. This is because clusters such as *[tkx’] or *[t͡skʰ] – clusters formed by taking the same ‘accompaniments’ [_kx’] and [_kʰ] from the clicks and combining them with non-click obstruents – are unattested in the language. So, a cluster analysis of clicks requires the general assumption that languages can allow clusters with clicks while not allowing them with other non-clicks – even if the latter are articulatorily much simpler.

Looking at other languages, it is not hard to find other cases that raise the same issues as Juǀ’hoan. For example, Nakagawa’s (2006) analysis of Gǀui (which builds heavily on Güldemann’s proposal) posits 8 types of clusters, shown in Table 2.8. Of these, only two are also found with non-clicks. Interestingly, these clusters are of almost precisely the same form as the ones found in Juǀ’hoan: clusters of a coronal stop, plus a following dorsal fricative or affricate.

Table 2.8 Gǀui clusters

 | Clicks | Non-clicks
plain + /χ/ | kǀχ kǃχ kǁχ kǂχ | tχ tsχ (cχ)
plain + /qχ’/ | kǀqχ’ kǃqχ’ kǁqχ’ kǂqχ’ | tqχ tsqχ (cqχ)
plain + /q/ | kǀq kǃq kǁq kǂq | –
plain + /ɢ/ | kǀɢ kǃɢ kǁɢ kǂɢ | –
plain + /qʰ/ | kǀqʰ kǃqʰ kǁqʰ kǂqʰ | –
plain + /q’/ | kǀq’ kǃq’ kǁq’ kǂq’ | –
plain + /ʔ/ | kǀʔ kǃʔ kǁʔ kǂʔ | –
plain + /h/ | kǀh kǃh kǁh kǂh | –

Source: Following Nakagawa 2006
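The asymmetry in Table 2.8 can be stated as a pair of implications and checked mechanically. The sketch below simply records, for each of the eight Gǀui cluster types, whether the table lists it with clicks and with non-clicks; the dictionary layout and variable names are assumptions made for illustration only.

```python
# One entry per row of Table 2.8, keyed by the second member of the cluster.
GUI_CLUSTERS = {
    "χ": {"clicks": True, "non_clicks": True},
    "qχ’": {"clicks": True, "non_clicks": True},
    "q": {"clicks": True, "non_clicks": False},
    "ɢ": {"clicks": True, "non_clicks": False},
    "qʰ": {"clicks": True, "non_clicks": False},
    "q’": {"clicks": True, "non_clicks": False},
    "ʔ": {"clicks": True, "non_clicks": False},
    "h": {"clicks": True, "non_clicks": False},
}

# Cluster types attested with both clicks and non-clicks.
shared = [c2 for c2, row in GUI_CLUSTERS.items()
          if row["clicks"] and row["non_clicks"]]

# Cluster types attested with clicks only.
click_only = [c2 for c2, row in GUI_CLUSTERS.items()
              if row["clicks"] and not row["non_clicks"]]

# Does a cluster with non-clicks imply the same cluster with clicks?
non_click_implies_click = all(
    row["clicks"] for row in GUI_CLUSTERS.values() if row["non_clicks"])

# Does a cluster with clicks imply the same cluster with non-clicks?
click_implies_non_click = all(
    row["non_clicks"] for row in GUI_CLUSTERS.values() if row["clicks"])

print(len(GUI_CLUSTERS), len(shared), len(click_only))
print(non_click_implies_click, click_implies_non_click)
```

For Gǀui the first implication holds and the second fails: six of the eight cluster types would be permitted only when the first consonant is a click.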

In Kxoe, and ǂKhomani, the situation is similar as well: multiple manners of clicks are amenable to analysis as clusters,34 yet the only types of clusters that cut across both the clicks and non-clicks are those consisting of a consonant plus some form of dorsal frication (described variously as [x] or [χ]). Across all of these languages, then, we find an assortment of clusters that can occur only if the first consonant is a click, and not if it is any other consonant. The situation in Kxoe, ǂKhomani, Gǀui, Juǀ’hoan, and ǃXóõ contrasts very clearly with what we observe in Khoekhoe. This is illustrated in the comparative table below (Table 2.9), based on Nakagawa’s (2006) moderate cluster revision of Güldemann (2001). In the first 5 languages, we can see that the inventory of clusters evident from clicks are for the most part not found with non-clicks – with the major exception of C+χ and C+qχ clusters. However, looking at Khoekhoe, we see that the presence of fricated-release clicks like [ǃχ] does not imply the existence of other C+[χ] clusters in the same language. Rather, the implication goes in the opposite direction: if a language has C+χ clusters with non-clicks, then it also has them with clicks – [tχ] implies [ǃχ].

34  See Nakagawa (2006: 292ff) and Güldemann (2001: 26, 33) for details.

Table 2.9 Comparison of click and non-click clusters

Columns: C+χ (fricative); C+qχ’ (affricate); C+q’ (stop); C+h (glot. fric.); C+ʔ (glot. stop)
Rows: clicks and non-clicks, in turn, for ǃXóõ, Juǀ’hoan, Gǀui, Kxoe, Nǀuu (ǂKhomani), Khoekhoe, and Sandawe
[Table body: entries ✓, (✓), ✓+, and (✓?); alignment of entries to cells not recoverable]

a Doke (1936: 443) notes: ‘c is further found in combination with the velar fricative x, e.g. cxei (excrete), cxan (dung), in which case it almost composes a separate syllable, the effect being near that of cəxan, etc. With some speakers an alternative in these cases is to substitute t for c giving txei, etc.’ This description suggests that a vowel breaks up the cluster for some speakers. Bonny Sands (p.c.) informs me that the variety of Nǀuu described by Miller et al. (2009a) does not appear to have any trace of intrusive vowels in such forms, though.
Note: ✓+ indicates multiple series of clusters with more distinctions within each category, e.g. C+q vs. C+ɢ. Shaded cells highlight cluster types found only with clicks.
Source: Following Nakagawa 2006

Finally, in considering cluster analyses of Sandawe, we see that there is also no implicational relationship among the clusters. Sandawe is analyzed by Nakagawa, and by Güldemann, as having only /ǃ+ʔ/ clusters, and lacking /ǃ+χ/. So, it cannot be the case that /ǃ+χ/ are somehow the simplest click clusters possible. The majority of Southern African Khoisan languages may have clicks with fricated releases, but this is not universal.

The net result, then, is that we cannot make firm predictions about a language’s non-click inventory based on its clicks. There are mismatches to be found in abundance at the level of the individual segment – click ‘accompaniments’ that do not phonetically match up with even their closest counterparts in the non-click inventory. There are also comparable mismatches at the level of the cluster inventory – the clusters allowed with clicks do not entail anything about which non-click clusters are permissible in the language. So, the asymmetries run in both directions, and leave us with little cross-linguistic basis to posit any systematic connection between the click and non-click inventories of languages. This is not to say that no relationship exists; rather, the existing cluster analyses simply do not offer a clear way to identify what it is, much less to explain why it is so.

5.2.4 Support from the Typology of Consonant Clusters?

The typology of clusters that results from cluster analyses of clicks is at odds with well-substantiated generalizations about clusters in other languages. Apart from clicks that also submit to a unit analysis, the vast majority of Khoisan languages tolerate conspicuously few consonant clusters; most non-Khoisan languages with clicks are similarly restrictive as well. Cluster analyses of clicks may simplify segmental consonant inventories, but often this leads to a peculiar space of consonant clusters permitted.35

As an example, let us return to the cluster analysis of Juǀ’hoan from Table 2.7 above. Güldemann’s (2001) cluster analysis and Nakagawa’s (2006) ‘moderate’ cluster analysis converge on a total set of 31 clusters. These clusters are a remarkably narrow group, which can be characterized by the properties in (10). No other types of clusters would seem to be permitted. For instance, even though nasal clicks are treated as units by both Nakagawa and Güldemann, they are not free to combine with frication of the posterior release (*nǃχ).

(10) Properties of all consonant clusters in Juǀ’hoan, in a cluster analysis of clicks
a. All clusters consist of exactly two obstruents
b. C1 is always a non-continuant (a click, stop, or affricate)
c. C2 is always some kind of continuant (a fricative or affricate)
d. C1 is always Coronal (a coronal obstruent or click with coronal front closure)
e. C2 is always Dorsal or Glottal (never Coronal)
f. C2 is always voiceless
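The properties in (10) amount to a conjunction of conditions on C1 and C2, which can be sketched as a simple predicate. This is only an illustration: the `Seg` class, the small segment table, and the coarse manner and place labels are simplifying assumptions made here, not feature values taken from Nakagawa or Güldemann.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Seg:
    manner: str   # "click", "stop", "affricate", "fricative", "nasal", "liquid"
    place: str    # "coronal", "dorsal", "glottal", "labial"
    voiced: bool


# Hypothetical, simplified feature assignments for illustration only.
SEGS = {
    "ǃ": Seg("click", "coronal", False),
    "t": Seg("stop", "coronal", False),
    "b": Seg("stop", "labial", True),
    "s": Seg("fricative", "coronal", False),
    "χ": Seg("fricative", "dorsal", False),
    "h": Seg("fricative", "glottal", False),
    "n": Seg("nasal", "coronal", True),
    "r": Seg("liquid", "coronal", True),
}

OBSTRUENTS = {"click", "stop", "affricate", "fricative"}


def satisfies_10(cluster):
    """Check a sequence of segment symbols against properties (10a-f)."""
    if len(cluster) != 2:                                   # (10a) exactly two
        return False
    c1, c2 = SEGS[cluster[0]], SEGS[cluster[1]]
    return (
        c1.manner in OBSTRUENTS and c2.manner in OBSTRUENTS  # (10a) obstruents
        and c1.manner in {"click", "stop", "affricate"}      # (10b)
        and c2.manner in {"fricative", "affricate"}          # (10c)
        and c1.place == "coronal"                            # (10d)
        and c2.place in {"dorsal", "glottal"}                # (10e)
        and not c2.voiced                                    # (10f)
    )


for cluster in [("ǃ", "χ"), ("t", "χ"), ("ǃ", "h"), ("s", "t"), ("b", "r"), ("n", "t")]:
    print("+".join(cluster), satisfies_10(cluster))
```

Under these assumptions, shapes such as ǃ+χ, t+χ and ǃ+h pass, while the typologically familiar shapes s+t, b+r and n+t all fail, which is the asymmetry discussed in the following paragraphs.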

35  It is worth noting that Bradfield’s (2014) concurrency proposal diverges from cluster analyses on this point. In this analysis, clicks are concurrent clusters, and not sequential ones. It is therefore unsurprising that they don’t show the same cross-linguistic behavior as sequential clusters.


This constellation of properties is undeniably odd from a cross-linguistic perspective. Being a very salient point of cross-linguistic variation, the typology of consonant clusters has been the focus of a large body of research, including substantial cross-linguistic surveys (Blevins 1995; Morelli 1999; Kreitman 2008; see also Goad 2011 and sources cited therein). From this sizable literature, we can discern three types of consonant clusters that seem to have privileged status, listed below.

(11) Typologically ‘privileged’ types of consonant clusters
a. sC: a coronal sibilant plus a following consonant
b. CR: consonant plus a following sonorant – a liquid or approximant
c. NC: a nasal plus a following consonant

There is clear evidence that these types of clusters are more readily tolerated than other kinds of clusters. Kreitman (2008), based on a survey of 62 languages, finds that all languages which permit obstruent-obstruent clusters also permit obstruent-sonorant clusters. Morelli (1999) surveys 29 languages, and finds that all languages which permit stop-stop clusters also permit some form of fricative-stop clusters. In addition to such ‘by-the-numbers’ arguments, there are also similar trends evident in individual languages. In English, for instance, word-initial consonant clusters may contain three consonants only if the first one is [s], and obstruent clusters may occur in onset position only if they begin with [s].36 Completely unrelated languages show the same kind of asymmetry: in Isthmus Zapotec, for example, consonant clusters are limited to Cw and Cy, unless the first consonant is [s] or [ʃ~ʂ] (Marlett and Pickett 1987).

There are also some indicators that the generalizations about clusters based on non-clicks are reasonable to extend to clicks, and to languages that have them. Zulu and Xhosa both limit clusters to N+C sequences (which may be homorganic, or with [m]), and C+w.
In these languages, clicks behave entirely like other obstruents with respect to these clusters: they may be preceded by [m] or a homorganic nasal, or followed by [w]. Additionally, s+C clusters are marginally found in both languages, arising either from borrowing or from reduction of an intervening vowel. The latter situation is not reflected in standard orthography, but can be readily observed in personal names, such as Sthembiso, Sphokazi, Snqobile and Stemela (< Sithembiso, Siphokazi, Sinqobile and Sitemela, respectively), and in neologisms, e.g. the Toyota Ses’fikile [sɛsfikiːle], a model of minibus taxi (from /se-si-fik-ile/ ‘we have arrived’). Reduction or deletion of a vowel to produce an s+C cluster happens in just the same way before clicks: for example, in fast or connected speech in Xhosa, the language name isiXhosa is commonly produced as [isĭ̥-ǁʰɔsa] or [is-ǁʰɔsa], with the vowel before the click shorter and fully devoiced, or reduced to the point of being indistinguishable. We can also find s+C clusters in Nama, at least across word boundaries. For instance, Ladefoged and Traill (1984: 3) give acoustic and aerodynamic measurements of the phrase [nes ɡe a kǁʰaos] ‘this is striking’, with an [s+ɡ] cluster formed between the first two words. While some of these lines of evidence are circumstantial, they give us no indication that the typological generalizations about cluster shapes are somehow fundamentally inapplicable to clicks or to the languages that have them as consonants.

36  Compare, e.g. spear vs. *thpear [θpiɹ], *tpear [tpiɹ], *psear [psiɹ].

In light of the typological evidence, what we would expect to find is that clicks in Khoisan should most readily occur in three types of clusters: clusters with [s] (s+ǃ), clusters with a preceding nasal (N+ǃ), and clusters with a following sonorant (ǃ+w, ǃ+r, ǃ+l). The situation we arrive at with a cluster analysis of Juǀ’hoan is not even remotely similar to this. There are no reports of s+ǃ clusters, nor of ǃ+r clusters. Clusters of ǃ+w may exist on a phonetic level, but have traditionally been understood as diphthongs, or else analyzed as a unit including a labial off-glide (see e.g. Maddieson et al. (1993) on Dahalo).37 Similarly, nasal clicks exist, and involve voicing and nasalization before the click release, but have been analyzed as units rather than as N+ǃ clusters, even in Güldemann’s aggressive cluster proposal. So, we find no obstruent+sonorant clusters, no nasal+obstruent clusters, and no s+obstruent clusters; instead, there are a myriad of other obstruent+obstruent clusters.
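The mismatch between the expected and the posited cluster types can be made concrete with a small classifier for the three privileged shapes in (11). This is a rough sketch: the segment sets below are deliberately tiny and are an assumption made for illustration, as is the choice to test sC before NC before CR when a cluster could fit more than one type.

```python
# Hypothetical, minimal segment classes for illustration only.
SIBILANTS = {"s", "ʃ"}
NASALS = {"m", "n", "ŋ"}
SONORANTS = {"w", "j", "r", "l"}   # liquids and approximants


def privileged_type(c1, c2):
    """Return 'sC', 'NC', or 'CR' for a two-consonant cluster, else None."""
    if c1 in SIBILANTS:
        return "sC"
    if c1 in NASALS:
        return "NC"
    if c2 in SONORANTS:
        return "CR"
    return None


# Cluster shapes that typology leads us to expect with clicks.
expected = [("s", "ǃ"), ("n", "ǃ"), ("ǃ", "w")]
# Illustrative cluster shapes of the kind permitted by (10).
posited = [("ǃ", "χ"), ("t", "χ"), ("ǃ", "h")]

for c1, c2 in expected + posited:
    print(f"{c1}+{c2}: {privileged_type(c1, c2)}")
```

Each of the expected shapes falls into one of the privileged categories, whereas the shapes a cluster analysis posits fall into none of them.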
What makes the Juǀ’hoan case very informative is that more familiar, and more typologically ‘normal’, non-click clusters are not merely unattested in the language; evidence from loanword adaptation strongly suggests that they are actively prohibited by the synchronic phonology. Miller-Ockhuizen (2003: 111–113) gives numerous examples of Afrikaans loanwords that have been borrowed into Juǀ’hoan, which show Cr and sC clusters being avoided through epenthesis. Some examples are given in (12) below.

(12) Cluster avoidance in Juǀ’hoan borrowing

    Afrikaans source              Juǀ’hoan borrowing   Gloss
a.  stor (< En.)    [stɔrː]       tòɾà                 ‘store’
b.  trou            [tʰrou]       tóɾó                 ‘wedding’
c.  draad           [draːt]       tȁʱɾà                ‘wire’
d.  tronk           [tróŋk]       tóɾókʰòè             ‘jail’
e.  petrol (< En.)  [pʰɛ́trol]     pʰéétʰòɾò            ‘petrol’
f.  broek           [bruk]        búɾúkʰòè             ‘trousers’
g.  patron          [pʰatrún]     bàátʰóɾó             ‘pattern’
h.  kruiva          [kréva]       kə̀ɾíꜜbáá             ‘wheelbarrow’
i.  kers            [kʰérs]       kéɾéꜜsíí             ‘candle’
j.  knoop           [knúp]        kónóꜜbéé             ‘button’

37  For instance, Dickens (1994: 27) gives the word thúí ‘abscess’ and his description of the orthography implies the pronunciation [tʰwí].
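The two repair strategies visible in (12), deletion of [s] before a consonant and vowel epenthesis into consonant-plus-sonorant onsets, can be sketched as a toy function over segment lists. This is a deliberately crude illustration and not an analysis of Juǀ’hoan: it assumes the epenthetic vowel copies the following vowel (which fits most of the forms in (12) but not kruiva, where the epenthetic vowel is [ə]), and it ignores tone, the change of [r] to [ɾ], and everything that happens at the end of the word.

```python
VOWELS = set("aeiouɛɔə")
SONORANTS = {"r", "n"}   # the second members of the onset clusters seen in (12)


def repair_onset(segments):
    """Toy repair of a word-initial cluster, returning a new segment list.

    Assumptions (illustrative only):
    - initial s + consonant: the s is deleted, as in stor > tòɾà
    - consonant + sonorant + vowel: a copy of the vowel is inserted
      between the two consonants, as in broek > búɾúkʰòè
    """
    segs = list(segments)
    # Deletion: drop an initial [s] that precedes another consonant.
    if len(segs) >= 2 and segs[0] == "s" and segs[1] not in VOWELS:
        segs = segs[1:]
    # Epenthesis: C R V ... becomes C V R V ...
    if (
        len(segs) >= 3
        and segs[0] not in VOWELS
        and segs[1] in SONORANTS
        and segs[2] in VOWELS
    ):
        segs = [segs[0], segs[2], segs[1]] + segs[2:]
    return segs


print(repair_onset(["b", "r", "u", "k"]))   # broek
print(repair_onset(["s", "t", "ɔ", "r"]))   # stor
print(repair_onset(["k", "n", "u", "p"]))   # knoop
```

The outputs still end in consonants, unlike the attested borrowings; the sketch models only the treatment of the initial cluster.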

Though Juǀ’hoansi has no native morphemes with Cr, sC, or NC clusters, these forms show us a good approximation of what the phonology might do when faced with such inputs. The result is that clusters of all three types are removed, one way or another. The most typical solution is epenthesis of a vowel, for example as in broek [bruk] > [bú.ɾú.kʰòè] ‘trousers’, where the [br] cluster is broken up by the vowel [u]. In other cases, we find deletion of one consonant, e.g. tronk [troŋk] > [tó.ɾó.kʰòè] ‘jail’, where the source cluster [ŋk] is reduced to [kʰ]; and in stor [stɔrː] > [tòɾà] ‘store’, where [st] reduces to [t]. So, it seems that the phonology of the language actively bans all of these relatively unmarked consonant clusters, and enforces that ban by removing clusters when presented with them. In view of this cluster avoidance, an analysis that treats some clicks as clusters seems quite strange. It requires us to assume that the phonology of Juǀ’hoan systematically avoids consonant clusters except for a narrow handful that fall into the cross-linguistically most restricted category of obstruent-obstruent clusters. Under a unit analysis, by contrast, the avoidance of clusters is readily explainable. Juǀ’hoan can be characterized as a language that does not allow complex onsets at all, in which case the epenthesis or deletion we observe in loanwords is to be expected. The trade-off is to admit a larger segmental consonant inventory, including a somewhat higher degree of complexity within the individual segment (particularly in the case of units like [tχ] and [ǃχ]). Across other languages with clicks, the situation is much the same: assuming that any of the clicks are clusters leads to treatment of clusters that runs squarely against cross-linguistic trends. 
The same generalizations about potential clusters in Juǀ’hoan given in (10) above hold for all of the other languages analyzed by Nakagawa (2006), and by Güldemann: in all the potential cases of click clusters, we find not a single example of a cluster formed by a click and a coronal obstruent, nor of clusters of clicks followed by liquids or other nonsyllabic sonorants. In short, then, the typological picture points to click ‘clusters’ having behavior that is quite unlike any other clusters. This can be seen as evidence that clicks are not genuinely clusters.


5.2.5 Other Lines of Evidence?

Naumann (2008) reports the result of a pilot experiment, in which speakers of Western Taa were trained with Afrikaans words to simplify word-initial sC and CR consonant clusters. The participants were then asked to simplify clicks, and made no distinction between clicks analyzed as clusters and those analyzed as units: all clicks were simplified to [h], [ʔ], or deleted entirely. Though only pilot data is available, this seems like good evidence to distinguish cluster from unit analyses. While further research is clearly warranted, the available evidence points away from cluster analyses.

At the same time, we can find some historical support for certain treatments of clusters. Traill (1985: 211) observes a sound correspondence between Nama and Korana. Nama lacks [kx’], and also lacks clicks with affricated ejective posterior release [ǃkx’]. Thus, we find a sound correspondence that seems to cut across clicks and non-clicks: Korana [kx’] corresponds to Nama [ʔ], both as an independent segment, and as a click accompaniment. This points towards a historical change of *kx’ > ʔ happening in all situations; a generalization that makes more sense if we see [ǃkx’] and [ǃʔ] as clusters rather than as units. But such an analysis is only one possibility: the correspondence could also be explained as debuccalization of segments with glottalic releases, classifying across clicks and non-clicks by referring to their phonetic characteristics directly, without reifying the release as a segment in its own right.

5.3 Section Summary

The phonetic complexity observed in clicks is indisputably real, and indisputably both interesting and significant for a general theory of phonology. Under unit analyses of clicks, we arrive at large inventories of extraordinarily complex segments. Under cluster analyses, we arrive at slightly smaller inventories of somewhat less complex consonants, but with numerous unanswered questions. Why do click clusters diverge so starkly from all other kinds of consonant clusters? Why, at the same time, are there such strong restrictions on the form of click clusters, which seem to recur cross-linguistically? Why do all of the relevant languages seem to allow clusters with clicks so much more freely than any other kinds of clusters?

The conclusion evident from this section is that analyzing clicks as clusters does not explain much about their behavior. Cluster analyses are devised by connecting click ‘accompaniments’ to independently occurring non-click consonants largely on the basis of impressionistic phonetic similarity. This commonly offers multiple ways of analyzing a given type of click: for example, [h] could conceivably be seen as the cluster offset in a voiceless nasal click [n̥ǃ], or as part of a voiceless aspirated click [ǃʰ], or as the source of aspiration in delayed aspirated clicks [ǃh]. There is no principled way to decide between these analyses in a generalized fashion: the choice is made based on impressions about phonetic similarity, intuitions about simplicity, and an impetus for diachronic or cross-linguistic parallelism. These are all appealing and intuitive bases for analysis, but their combination can lead to an uncomfortable sort of confirmation bias.

The space of phonetic possibilities for click ‘accompaniments’ is only so large, and overlaps considerably with the space of possible non-clicks. All clicks involve a dorsal closure of some sort, and the dorsum can only do a finite number of things. Some degree of recurrence between clicks and non-click dorsals is to be expected, a priori of any phonological analysis. If the goal is to represent clicks with as few atomic units as possible, there is a good chance of finding at least one consonant that is vaguely similar to some portion of any click, such that treating the click as a cluster is not terribly at odds with what we know about the phonetic realities of the segment.38 But simplification for its own sake offers little in the way of real explanation. The value of any phonological theory comes from what it explains, and in the case of cluster analyses of clicks, we find explanation for surprisingly little.

If the goal is to reduce clicks to as few representative primes as possible, there is surely no better framework for doing so than van der Hulst’s (1994a, 1994b, et seq.) Radical CV phonology, which cuts all segments down to structural arrangements of just two primary elements – C and V. When extrapolated to its logical extreme, this type of system begins to approximate straightforward binary code – and we can be entirely confident that all clicks can indeed find distinct representations in strings of 1s and 0s.
This amounts to nothing more than glorified counting, though, and in abstracting away from all phonetic detail, it reveals no deeper insights about the phonological structure of clicks, and cannot hope to explain why they are so. Cluster approaches to clicks are significantly more useful than this: they can be convenient frames for organizing descriptive insights and generalizations, and for collating cross-linguistic observations – e.g. historical sound correspondences. What they actually predict is murky, though, without a clear and well-defined theory of how click clusters relate to non-clicks, to clicks, and to other consonant clusters in general.
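The point about binary code can be illustrated directly. The sketch below assigns a distinct bit string to every member of a small click inventory simply by counting; the inventory itself (five click types combined with five accompaniments) is an arbitrary subset assumed here for illustration and does not correspond to any particular language.

```python
from itertools import product
from math import ceil, log2

# Assumed, illustrative inventory: five click types x five accompaniments.
CLICK_TYPES = ["ʘ", "ǀ", "ǃ", "ǂ", "ǁ"]
ACCOMPANIMENTS = ["", "ʰ", "χ", "ʔ", "kx’"]

inventory = [click + acc for click, acc in product(CLICK_TYPES, ACCOMPANIMENTS)]

# Number of binary digits needed to give every segment its own code.
bits = ceil(log2(len(inventory)))

# 'Glorified counting': the code is just the segment's position in the list.
codes = {seg: format(i, f"0{bits}b") for i, seg in enumerate(inventory)}

print(len(inventory), "segments,", bits, "bits each")
for seg in ["ǃ", "ǃχ", "ǃʔ", "ǁkx’"]:
    print(seg, codes[seg])
```

Every segment receives a unique code, yet reordering the inventory list would yield a different and equally valid assignment. Nothing in the codes reflects which clicks pattern together, which is why representational economy of this kind carries no explanatory weight.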

38  The chances of doing so expand considerably if we are willing to exchange some amount of phonetic similarity for more cross-linguistic parsimony, for instance by admitting /ǃ+h/ or /ǃ+ʔ/ clusters in languages where [h] or [ʔ] do not appear to be independent consonants – as indeed Traill (1985) does in his analysis of ǃXóõ. (The revised inventory including rarer sounds in Traill (1994b) does add [h] to the segment inventory, though.)


The end result is that the notion that some clicks are clusters seems intuitively plausible, but is quite difficult to prove – or disprove. The effect of cluster analyses is a trade-off between two kinds of theoretical parsimony. In reducing the size of click consonant inventories, cluster analyses expand the typology of clusters in other ways that seem bizarre by comparison with the typology of non-click clusters. But inasmuch as a cluster approach yields more questions than answers, it does have one major virtue: it seeks to go beyond the first step of description and categorization, into a deeper level of analysis.

6 Conclusion

The aim of this chapter has been to provide an overview of some current issues in the phonology of clicks, and to review the main lines of evidence available, with an eye towards insights about their phonological representation in particular. While insights into the structure of non-clicks can be garnered from phonological alternations, such evidence is much less readily available with clicks. This is likely due in large part to their restricted geographic distribution, and confounds from areal and familial features: the languages with the richest click systems are also languages where morphemes are canonically very limited in shape and structure. Nonetheless, there is extremely good evidence that clicks can in principle participate in the regular phonology of a language: they are subject to distributional restrictions that sometimes cut across clicks and non-clicks alike; they can be seen to undergo active alternations at least in terms of laryngeal features; and they potentially can cause alternations in other segments (e.g. in languages where the Back Vowel Constraint is phonologically productive). Many unanswered questions still remain, including the most basic question of how to formally distinguish clicks from non-clicks.
Also chief among the unresolved basic representational issues is whether certain clicks are clusters rather than singleton consonants. This is an area where recent work shows enormous progress, from the direct measurements in ultrasound studies (e.g. Miller et al. 2009a, b), to Naumann’s (2008) tentative experimental test of whether speakers treat clicks as clusters of the same order as those found in other languages. While typological evidence does not appear to support cluster analyses (at least in respects other than the arbitrary size of segment inventories), it is possible that the typological picture is simply incomplete – perhaps there is something special about the sort of dorsal frication that allows it to pattern in a fundamentally different way, outside the norms of clusters in other languages. And though cluster analyses seem problematic in some cases, the faults of unit analyses remain, and await a deeper explanation.


Acknowledgments

For helpful discussion surrounding the contents of this chapter, I want to thank Bonny Sands, Nick Danis, Amanda Miller, Julian Bradfield, Florian Lionnet, Hirosi Nakagawa, Nomvula Sitole, Sisa Mapetu, Msindisi Sam, and Richard Bailey. Any mistakes are my own.

References

Abercrombie, David. 1967. Elements of General Phonetics. Edinburgh: Edinburgh University Press.
Ball, Martin J., and Nicole Müller. 2007. “Non-pulmonic-egressive speech in clinical data: A brief review.” Clinical Linguistics & Phonetics 21(11–12): 869–874.
Beach, D.M. 1938. The Phonetics of the Hottentot Language. Cambridge: W. Heffer & Sons Ltd.
Bell, Arthur, and Chris Collins. 2001. “ǂHoan and the typology of click accompaniments in Khoisan.” Cornell Working Papers in Linguistics 18: 126–153.
Bennett, Wm.G. 2008. “False nasality and the typology of click distribution.” Qualifying paper, Linguistics department, Rutgers University.
Bennett, Wm.G. 2014. “Some differences between clicks and labio-velars.” South African Journal of African Languages 34(2): 115–126.
Bennett, Wm.G. 2015. The phonology of consonants: Harmony, dissimilation and correspondence. Cambridge Studies in Linguistics. Cambridge: Cambridge University Press.
Bennett, Wm.G. 2017. “Pulmonic venting and the typology of click nasality.” Rutgers Optimality Archive, ROA-1308. Available online at http://roa.rutgers.edu/article/view/1622.
Bleek, Dorothea F. 1928/29. “Bushman grammar: a grammatical sketch of the language of the /xam-ka-ǃk’e.” Zeitschrift für Eingeborenensprachen 19: 81–98.
Bleek, W.H.I. 1862. A Comparative Grammar of South African Languages. Trübner and Co., London.
Blevins, Juliette. 1995. “The syllable in phonological theory.” The Handbook of Phonological Theory, ed. John A. Goldsmith. London: Blackwell, pp. 206–244.
Bostoen, Koen, and Bonny Sands. 2012. “Clicks in south-western Bantu languages: contact-induced vs. language internal lexical change.” Proceedings of the 6th World Congress of African Linguistics, Cologne, 17–21 August 2009, ed. M. Brenzinger and Anne-Maria Fehn. Cologne: Rüdiger Köppe, pp. 121–132.
Bradfield, Julian. 2014. “Clicks, concurrency and Khoisan.” Phonology 31(1): 1–49.
Brugman, Johanna. 2009. “Segments, Tones and Distribution in Khoekhoe Prosody.” Ph.D. thesis, Cornell University.


Catford, J.C. 1977. “Mountain of tongues: The languages of the Caucasus.” Annual Review of Anthropology 6: 283–314.
Chladni, E.F.F. 1824. “Über die Hervorbringung der menschlichen Sprachlaute.” Annalen der Physik 76(2): 187–216.
Chomsky, Noam, and Morris Halle. 1968. The Sound Pattern of English. New York: Harper and Row.
Clements, George N., and Sylvester Osu. 2002. “Explosives, implosives and nonexplosives: the linguistic function of air pressure differences in stops.” Laboratory Phonology 7, ed. Carlos Gussenhoven and Natasha Warner. Berlin: Mouton de Gruyter, pp. 299–350.
Collins, Chris, and Levi Namaseb. 2011. A Grammatical Sketch of Nǀuuki with Stories. Quellen zur Khoisan-Forschung/Research in Khoisan Studies 25. Cologne: Rüdiger Köppe.
Demolin, Didier, and Véronique Delvaux. 2001. “Whispery voiced nasal stops in Rwanda.” EUROSPEECH 2001 Scandinavia, 7th European Conference on Speech Communication and Technology, 2nd INTERSPEECH event, September 3–7, 2001, Aalborg Congress and Culture Centre, Aalborg-Denmark: Proceedings, ed. Paul Dalsagaard, Borge Lindberg and Henrik Benner. Bonn: ISCA-Secretariat, pp. 651–654.
Dickens, Patrick J. 1994. English-Juǀ’hoan, Juǀ’hoan-English Dictionary. Quellen zur Khoisan-Forschung/Research in Khoisan Studies 8. Cologne: Rüdiger Köppe.
Dogil, Grzegorz, and Jörg Mayer. 1998. “Selective phonological impairment: a case of apraxia of speech.” Phonology 15: 143–188.
Doke, Clement M. 1923. “Notes on a problem in the mechanism of the Zulu clicks.” Bantu Studies 2(1): 43–45.
Doke, Clement M. 1925. “An outline of the phonetics of the language of the ʗhũ̬ː Bushmen of north-west Kalahari.” Bantu Studies 2: 129–165.
Doke, Clement M. 1926. The Phonetics of the Zulu Language, supplement to volume 2 of Bantu Studies. Johannesburg: University of the Witwatersrand Press. Reprinted 1969 by Klaus Reprint, Nendeln.
Doke, Clement M. 1927. Textbook of Zulu Grammar, 6th edition. Cape Town: Maskew Miller Longman. Reprinted 1984.
Doke, Clement M. 1936. “An outline of ǂKhomani Bushman phonetics.” Bantu Studies 10: 433–461.
Doke, C.M. 1954. The Southern Bantu Languages. London: Oxford University Press.
Ehret, Christopher. 2013. “The Extinct Khoesan languages in East Africa.” The Khoesan Languages, ed. Rainer Vossen. New York: Routledge, pp. 465–479.
Elderkin, E.D. 1992. “Predictable nasality before east African clicks.” Afrikanistische Arbeitspapiere 29: 111–112.
Ewen, Colin J. 1980. “Aspects of Phonological Structure with Particular Reference to English and Dutch.” Ph.D. thesis, University of Edinburgh.


Fallon, Paul D. 2002. The Synchronic and Diachronic Phonology of Ejectives. New York: Routledge.
Fuchs, Susanne, Laura Koenig, and Ralf Winkler. 2007. “Weak clicks in German?” Proceedings of the International Congress of Phonetic Sciences XVI, Saarbrücken, Germany, August 2007, pp. 449–452. Available online: http://www.icphs2007.de/conference/Papers/1678/
Fulop, Sean A., Peter Ladefoged, and Rainer Vossen. 2000. “The dying clicks of Yeyi.” MS., UCLA.
Fulop, Sean A., Peter Ladefoged, Liu Fang, and Rainer Vossen. 2003. “Yeyi clicks: Acoustic description and analysis.” Phonetica 60: 231–260.
Gallagher, Gillian. 2011. “Acoustic and articulatory features in phonology – the case for [long VOT].” The Linguistic Review 28: 281–313.
Gerlach, Linda. 2015. “Phonetic and Phonological Description of the Nǃaqriaxe Variety of ǂ’Amkoe and the Impact of Language Contact.” Ph.D. thesis, Humboldt-Universität zu Berlin.
Goad, Heather. 2011. “The representation of sC clusters.” The Blackwell Companion to Phonology, ed. Marc van Oostendorp, Colin Ewen, Elizabeth Hume and Keren Rice. Oxford: Wiley-Blackwell, pp. 898–923.
Greenberg, Joseph H. 1966. The Languages of Africa. Bloomington: Indiana University.
Güldemann, Tom. 2001. Phonological regularities of consonant systems across Khoisan lineages. University of Leipzig Papers on Africa, Languages and Literatures 16. Leipzig: Institut für Afrikanistik, Universität Leipzig.
Güldemann, Tom. 2006. “Structural isoglosses between Khoekhoe and Tuu: The cape as a linguistic area.” Linguistic areas: Convergence in Historical and Typological Perspective, ed. Yaron Matras, April McMahon and Nigel Vincent. Basingstoke: Palgrave Macmillan, pp. 99–134.
Güldemann, Tom. 2017. “Casting a wider net over Nǀǀng: The older archival resources.” Anthropological Linguistics 59(1): 71–104.
Gunnink, Hilde. 2013. “Variation and change in the click consonants of Fwe.” Paper presented at the Colloquium on African Languages and Linguistics, Leiden, 27 August 2013.
Gunnink, Hilde, Bonny Sands, Brigitte Pakendorf, and Koen Bostoen. 2015. “Prehistoric language contact in the Kavango-Zambezi transfrontier area: Khoisan influence on southwestern Bantu languages.” Journal of African Languages and Linguistics 36(2): 193–232.
Hagman, Roy S. 1977. Nama Hottentot Grammar. Bloomington: Indiana University.
Hale, Kenneth, and David Nash. 1997. “Damin and Lardil phonotactics.” Boundary Rider: Essays in Honour of Geoffrey O’Grady, ed. Darrell Tryon and Michael Walsh. (Pacific Linguistics C-136). Canberra: Research School of Pacific and Asian Studies, Australian National University, pp. 247–259.


Halle, Morris. 1995. “Feature geometry and feature spreading.” Linguistic Inquiry 26(1): 1–46.
Halle, Morris, and Kenneth N. Stevens. 1971. “A note on laryngeal features.” Quarterly Progress Report, Research Laboratory of Electronics, (MIT) 101: 198–212.
Halle, Morris, Bert Vaux, and Andrew Wolfe. 2000. “On feature spreading and the representation of place of articulation.” Linguistic Inquiry 31(3): 387–444.
Halpert, Claire. 2012. “Overlap-driven consequences of nasal place assimilation.” Consonant Clusters and Structural Complexity, ed. Philip Hoole, Lasse Bombien, Marianne Pouplier, Christine Mooshammer and Barbara Kühnert. Berlin: Mouton de Gruyter, pp. 345–368.
Hansson, Gunnar Ó. 2010. Consonant Harmony: Long-Distance Interaction in Phonology. University of California Publications in Linguistics 145. Berkeley: University of California Press.
Heselwood, Barry. 1997. “A case of nasal clicks for target sonorants: a feature geometry account.” Clinical Linguistics & Phonetics 11(1): 43–61.
Howard, Sara J. 1993. “Articulatory constraints on a phonological system: a case study of cleft palate speech.” Clinical Linguistics & Phonetics 7: 299–317.
Hulst, Harry G. van der. 1994a. “An introduction to Radical CV Phonology.” SKY 1994: Yearbook of the linguistic association of Finland, ed. Susanna Shore and Maria Vilkuna. Helsinki: Suomen kielitieteellinen yhdistys, pp. 23–56.
Hulst, Harry G. van der. 1994b. “Radical CV Phonology: The locational gesture.” UCL Working Papers in Linguistics 6: 439–477.
Jessen, Michael. 2002. “An acoustic study of contrasting plosives and click accompaniments in Xhosa.” Phonetica 59: 150–179.
Jessen, Michael, and Justus C. Roux. 2002. “Voice quality differences associated with stops and clicks in Xhosa.” Journal of Phonetics 30: 1–52.
Jones, Daniel. 1907. “implousiv saundz ənd kliks.” Le Maître Phonétique (Nov.–Dec.): 111–114.
Khumalo, James Steven Mzilikazi. 1987. An Autosegmental Account of Zulu Phonology. Ph.D. thesis, University of the Witwatersrand, Johannesburg.
König, Christa, and Bernd Heine. 2001. “The ǃXun of Ekoka: A demographic and linguistic report.” Khoisan Forum Working Paper 17, ed. Yvonne Treis. Cologne: Institut für Afrikanistik, University of Cologne.
Koreman, Jacques, and Aleksander Morland. 2007. “Speaker consistency of coarticulatory gestures in clusters of labial and velar plosives.” Proceedings of the 16th International Congress of Phonetic Sciences, Saarbrücken, 6–10 August 2007, pp. 570–580.
Kreitman, Rina. 2008. “The Phonetics and Phonology of Onset Clusters: The Case of Modern Hebrew.” Ph.D. thesis, Cornell University.
Ladefoged, Peter. 1971. Preliminaries to Linguistic Phonetics. Chicago: University of Chicago Press.


Ladefoged, Peter, and Anthony Traill. 1984. “Linguistic phonetic descriptions of clicks.” Language 60(1): 1–20.
Ladefoged, Peter, and Anthony Traill. 1994. “Clicks and their accompaniments.” Journal of Phonetics 22(1): 33–64.
Lanham, L.W. 1969. “Generative phonology and the analysis of Nguni languages.” Lingua 24: 155–162.
Lanham, L.W., and D.P. Hallowes. 1956. “An outline of the structure of Eastern Bushman.” African Studies 15: 97–118.
Lionnet, Florian. 2014. “Doubly triggered harmony in Laal as subphonemic agreement by correspondence.” Proceedings of the 2013 Meeting on Phonology. Linguistic Society of America Online Conference Proceedings, ed. John Kingston, Claire Moore-Cantwell, Joe Pater and Robert Staubs. Washington DC: Linguistic Society of America.
Lionnet, Florian. 2018. “Phonological teamwork in Kalahari Basin languages.” Africana Linguistica 24: 75–97.
Lombardi, Linda. 1999. “Positional faithfulness and voicing assimilation in optimality theory.” Natural Language and Linguistic Theory 17: 267–302.
Louw, J.A. 1962. “On the segmental phonemes of Zulu.” Afrika und Übersee 46: 43–93.
Louw, J.A. 1977. “Clicks as loans in Xhosa.” Bushman and Hottentot Linguistic Studies, ed. Jan W. Snyman. Pretoria: University of South Africa, pp. 82–100.
Lukusa, Stephen T.M. 2009. Shiyeyi-English Dictionary. Munich: LINCOM EUROPA.
Maddieson, Ian, and Peter Ladefoged. 1989. “Multiply articulated segments and the feature hierarchy.” UCLA Working Papers in Phonetics 72: 116–138.
Maddieson, Ian, Siniša Spajić, Bonny Sands, and Peter Ladefoged. 1993. “Phonetic structures of Dahalo.” Afrikanistische Arbeitspapiere 36: 5–53.
Marlett, Stephen A., and Velma B. Pickett. 1987. “The Syllable Structure and Aspect Morphology of Isthmus Zapotec.” International Journal of American Linguistics 53(4): 398–422.
McLaren, J. 1942. A Xhosa Grammar. Longmans, Green, and Co., 3rd edition.
Miller, Amanda L. 2006. “The phonology of click consonants.” Ms., Cornell University.
Miller, Amanda L. 2010a. “A prosodic account of Juǀ’hoansi consonant distributional asymmetries.” Khoisan Languages and Linguistics: Proceedings of the 1st International Symposium, January 4–8, 2003, Riezlern/Kleinwalsertal, ed. M. Brenzinger and Christa König. Quellen zur Khoisan-Forschung/Research in Khoisan Studies 24. Cologne: Rüdiger Köppe, pp. 53–84.
Miller, Amanda L. 2010b. “Tongue body and tongue root shape differences in Nǀuu clicks correlate with phonotactic patterns.” Turbulent Sounds: An Interdisciplinary Guide, ed. Susanne Fuchs, Martine Toda and Marzena Żygis. Berlin: Mouton de Gruyter, pp. 245–280.
Miller, Amanda L. 2011. “The representation of clicks.” The Blackwell Companion to Phonology, vol. 1. ed. Marc van Oostendorp, Colin Ewen, Elizabeth Hume and Keren Rice. Oxford: Blackwell, pp. 416–439.

Miller, Amanda L. 2016. “Posterior lingual gestures and tongue shape in Mangetti Dune ǃXung clicks.” Journal of Phonetics 55: 119–148. Miller, Amanda L., Johanna Brugman, Jonathan Howell, and Bonny Sands. 2006. “Tongue dorsum location and tongue root retraction in alveolar and palatal clicks in the endangered language Nǀuu.” Journal of the Acoustical Society of America 120(5.2): 3377. Miller, Amanda L., Levi Namaseb, and Khalil Iskarous. 2007. “Tongue body constriction differences in click types.” Proceedings of Laboratory Phonology 9, ed. Jennifer S. Cole and José Ignacio Hualde. Berlin: Mouton de Gruyter, pp. 643–656. Miller, Amanda L., Johanna Brugman, Bonny Sands, Levi Namaseb, Mats Exter, and Chris Collins. 2009a. “Differences in airstream and posterior place of articulation among Nǀuu clicks.” Journal of the International Phonetic Association 39(2): 129–161. Miller, Amanda L., S. Shah, and Bonny Sands. 2009b. “Five coronal click types in Grootfontein ǃXung.” Paper presented at the 83rd Annual Meeting of the Linguistic Society of America, San Francisco. Miller-Ockhuizen, Amanda L. 2000. “C-V coarticulation and complex consonants: Evidence for ordering in click place gestures.” Item Order in Language and Speech: Proceedings of LP ‘98, Columbus, the Ohio State University, September 15–20, 1992, ed. Osamu Fujimura, Brian Joseph and Bohumil Palek. Prague: Charles University Press, pp. 301–330. Miller-Ockhuizen, Amanda L. 2003. The Phonetics and Phonology of Gutturals: A Case Study from Juǀ’hoansi. Outstanding Dissertations in Linguistics. New York & London: Routledge. Möhlig, Wilhelm J.G. 2013. “Traces of Bantu-Khoesan interaction in Manyo (Kavango Bantu, Namibia).” The Khoesan Languages, ed. Rainer Vossen. London: Routledge, pp. 446–449. Morelli, Frida. 1999. The Phonotactics And Phonology Of Obstruent Clusters In Optimality Theory. Ph.D. thesis, University of Maryland. Nakagawa, Hirosi. 2006. 
Aspects of the Phonetic and Phonological Structure of the Gǀui Language. Ph.D. thesis, University of the Witwatersrand. Nakagawa, Hirosi. 2010. “Phonotactics of disyllabic lexical morphemes in Gǀui.” Working Papers in Corpus-based Linguistics and Language Education 5: 23–31. Tokyo University of Foreign Studies, Tokyo. Nathan, Geoffrey S. 2001. “Clicks in a Chinese nursery rhyme.” Journal of the International Phonetic Association 31: 223–228. Naumann, Christfried. 2008. “The consonantal system of west ǃXoon.” Handout of paper presented at the 3rd International Symposium on Khoisan Languages and Linguistics. Riezlern. Naumann, Christfried. 2016. “The phoneme inventory of Taa (West !Xoon dialect).” Lone Tree: Scholarship in the Service of the Koon, Essays in Memory of Anthony T.

Traill, ed. Rainer Vossen and Wilfrid H.G. Haacke. Cologne: Rüdiger Köppe. pp. 311–351. Ohala, John J. 1995. “A probable case of clicks influencing the sound patterns of some European languages.” Phonetica 52: 160–170. Padgett, Jaye. 1995. “Partial class behavior and nasal place assimilation.” Proceedings of the South Western Optimality Theory Workshop 1995. Tuscon, AZ: University of Arizona Coyote Papers, pp. 145–183. Pahl, H.W., A.M. Pienaar, and T.A. Ndungane. 1989. The Greater Dictionary of Xhosa: volume 3, Q – Z. Alice, South Africa: University of Fort Hare. Panconcelli-Calzia, G. 1918/19. “Untersuchungen über l̞, p, b, q͡χ, tˆ̩, tˆ, tˆ̰ mit Röntgenstrahlen.” Zeitschrift für Kolonial-Sprachen 9: 20–25. Rose, Sharon, and Rachel Walker. 2004. “A typology of consonant agreement as correspondence.” Language 80(3): 475–531. Roux, Justus C. 2007. “Unresolved issues in the representation and phonetic description of click articulation in Xhosa and Zulu.” Language Matters 38(1): 8–25. Rycroft, David K. 1980. The depression feature in Nguni languages and its interaction with tone. Communication No. 8. Grahamstown: Department of African Languages, Rhodes University. Sagey, Elizabeth C. 1986. “The representation of features and relations in nonlinear phonology.” Ph.D. thesis, MIT, Cambridge, MA. Sands, Bonny. 1991. “Evidence for click features: Acoustic characteristics of Xhosa clicks.” UCLA Working Papers in Phonetics 80: 6–37. Sands, Bonny, and Levi Namaseb. 2007. “Documenting click loss and change: Focus on the palatal click.” Presentation at the 38th Annual Conference on African Linguistics. Sibanda, Galen. 2004. “Verbal Phonology and Morphology of Ndebele.” Ph.D. thesis, University of California Berkeley. Silverman, Daniel, and Jongho Jun. 1993. “Aerodynamic evidence for articulatory overlap in Korean.” UCLA Working Papers in Phonetics 85: 97–112. Skhosana, Philemon Buti. 2009. “The Linguistic Relationship Between Southern And Northern Ndebele.” Ph.D. 
thesis, University of Pretoria. Snyman, Jan W. 1975. Ž uǀ’hõasi Fonologie en Woordeboek. Communication 37 of the University of Cape Town School of African Studies. Cape Town & Rotterdam: AA Balkema. Snyman, Jan W. 1978. “The clicks of Ž uǀ’hõasi.” Proceedings of the 2nd Africa Languages Congress of UNISA (Tweede Afrikatale-kongres van UNISA), ed. Ernst J.M. Baumbach. Miscellanea congregalia 5. Pretoria: University of South Africa, pp. 144–168. Traill, Anthony. 1985. Phonetic and Phonological Studies of ǃXóõ Bushman. Quellen zur Khoisan-Forschung/Research in Khoisan Studies 1. Hamburg: Helmut Buske. Traill, Anthony. 1993. “The feature geometry of clicks.” Linguistica, Festschrift E. B. van Wyk: ‘n Huldeblyk, ed. P. von Standen. Pretoria: J. L. van Schaik, pp. 134–140.

Traill, Anthony. 1994a. “Place of articulation features for clicks: anomalies for universals.” Studies in General and English Phonetics in Honor of Professor J.D. O’Connor, ed. Jack Windsor Lewis. London: Pergamon Press, pp. 121–129. Traill, Anthony. 1994b. A ǃXóõ Dictionary. Quellen zur Khoisan-Forschung / Research in Khoisan Studies 9. Cologne: Rüdiger Köppe. Traill, Anthony. 1999. Extinct: South African Khoisan languages. Compact disc & booklet. Johannesburg: Department of Linguistics, University of the Witwatersrand. Traill, Anthony, J.S.M. Khumalo, and P. Fridjhon. 1987. “Depressing facts about Zulu.” African Studies 46: 255–274. Traill, Anthony, and Rainer Vossen. 1997. “Sound change in the Khoisan languages: new data on click loss and click replacement.” Journal of African Languages and Linguistics 18: 21–56. Trubetzkoy, N.S. 1969. Principles of Phonology. Los Angeles: Center for Research in Languages and Linguistics, University of California, Los Angeles. English translation of Grundzüge der Phonologie, first published 1939. Visser, Hessel. 1998. “The phonological system of Naro.” Languages, Identity, and Conceptualization among the Khoisan, ed. Mathias Schladt. Quellen zur KhoisanForschung/Research in Khoisan Studies 15. Cologne: Rüdiger Köppe, pp. 117–136. Walsh, Martin. 2006. “A click in Digo and its historical interpretation.” Azania Archaeological Research in Africa 41: 158–166. Wilkes, A. 2001. “Northern and Southern Ndebele – why harmonisation will not work.” South African Journal of African Languages 21(3): 310–322. Wright, Richard, Ian Maddieson, Peter Ladefoged, and Bonny Sands. 1995. “A phonetic study of Sandawe clicks.” UCLA Working Papers in Phonetics 91: 1–24. Yanagihara, Naoki, and Charlene Hyde. 1966. “An aerodynamic study of the articulatory mechanism in the production of bilabial stop consonants.” Studia Phonologica 4: 70–80. Ziervogel, D. 1952. A Grammar of Swazi (siSwati). Johannesburg: Witwatersrand University Press. Ziervogel, D. 1959. 
A Grammar of Northern Transvaal Ndebele. Pretoria: J.L. Van Schaik.

Chapter 3

The Interaction between Click Consonants and Tone in Tsua

Timothy K. Mathes

1 Introduction

The purpose of this chapter is two-fold. The first aim is to demonstrate that click consonants may interact with tone. The second is to show that exceptions to the interaction between click consonants and tone can be plausibly accounted for when historical sound change processes are considered.

Click consonant-tone interaction may occur via a process of Fundamental Frequency (F0) lowering of 50 Hz or more of a root-initial High (H) tone when it is preceded by voiced or aspirated obstruents, including clicks, and followed by a non-High tone. F0 lowering after certain consonant types is referred to in the literature as tonal depression. The phenomenon has been reported in the Khoisan language Tsua (Mathes 2015).

Tsua is classified by Güldemann and Vossen (2000) as a Kalahari Khoe East language of the Khoe branch and part of the Tshwa subgroup. The Kalahari Khoe East languages occupy the eastern part of Botswana, southeast of the Central Kalahari Game Reserve and eastward towards the Western Sandveld of the Central District. These languages are spoken mainly around the Makgadikgadi Salt Pans, as well as towards the Shashe river, Serule and Mabesekwa areas in eastern Botswana, Serowe and Shoshong (Barnard 1992).

The glottal fricative h is a tone depressor in Tsua as well. The result is a rare consonant-tone interaction pattern with depressor types found in both African and East Asian tone languages. The pattern’s rarity can be observed when considering cross-linguistic data on depressor consonants in Bradshaw (1999) and Tang (2008). It is quite common for voiced obstruents to be tone depressors, particularly in African languages. Aspirated obstruents are reported to be depressors in many East Asian languages. It is not common for the glottal fricative h to be a depressor in either African or East Asian languages.
It is rare for all three depressor types to be found in one language as in Tsua, and particularly noteworthy that clicks and non-clicks participate in the interaction. Table 3.1 has examples of African and East Asian tone languages and their respective depressor consonant types, with citations. Yaka is analyzed as having Low tone insertion after root-initial voiced obstruents. Bassa voiced obstruents and the glottal fricative h most often co-occur with Low tones. Ikalanga [Kalang’a] has voiced obstruents and aspirated obstruents as depressor consonants. Voiced obstruents and aspiration have an affinity for Low tone in the Chinese Wujiang dialect, while in Jingpho, voiced obstruents and ħ have an affinity for Low tone. Manange aspiration has an affinity for Low tone.

© Koninklijke Brill NV, Leiden, 2020 | doi:10.1163/9789004424357_004

Table 3.1 Consonant-tone interaction in African and East Asian languages

African languages
Language | Location | Effect
Yaka | Central African Republic, Rep. of Congo-Brazzaville | Low inserted after root-initial voiced obstruents, Low triggers voicing (Kutsch Lojenga 1998)
Bassa | Liberia | Voiced obstruents and /h/ have an affinity for Low (Hobley 1964)
Kalang’a | Botswana | Voiced obstruents and aspiration have an affinity for Low tone (Downing and Gick 2005)

East Asian languages
Language | Location | Effect
Wujiang dialect | China | Voiced obstruents and aspiration have an affinity for Low tone (Shen 1994)
Jingpho | Myanmar | Voiced obstruents and /ħ/ have an affinity for Low tone (Maran 1971)
Manange | Nepal | Aspiration has an affinity for Low tone (Hildebrandt 2003)

Source: Based on Bradshaw 1999 and Tang 2008

Interestingly, 19.3% of the root-initial High tones that are expected to be lowered in the post-depressor context are not depressed in the Tsua dataset of ~1,300 lexical items (Mathes and Chebanne 2013). It is argued that tonal depression exceptions involving voiced clicks correspond to historically nasalized clicks, consonants which are not depressors in Tsua synchronically or diachronically. Thus, voiced clicks function as depressors in lexical items that

have not undergone this historic sound change, an explanation referred to as the Tsua Depression Exceptions Hypothesis in this chapter.

The Tsua click consonants are especially notable due to a number of inventory gaps, especially concerning alveolar clicks. It may be that a process of click replacement has asymmetrically reduced the Tsua inventory, a process which has been reported in Kalahari Khoe East languages (see Vossen 2013: 72). I have opted to use the concept “click replacement” to mean the substitution of a click or non-click sound segment in place of a click segment.¹ I use the term “asymmetrical” to refer to the phenomenon where certain click place/manner series possess the full set of accompaniments while others do not. For example, the dental and lateral clicks have contrastive aspiration but the alveolar click does not in the current dataset. These phonological changes have been observed for quite some time, as Traill (1978) notes: “One such detail involves the erosion of clicks that takes place as one moves eastwards. Kua retains more clicks or reflexes of them than its neighbours” (p. 258). It is likely that the Tsua click inventory gaps emerged as a consequence of the sociolinguistic situation and extensive linguistic contact in eastern Botswana with languages such as Setswana and Ikalanga (see Wilmsen and Vossen 1990: 22–24 for a discussion). Thus, this chapter addresses the complex relationship between click consonants, tone and click replacement in Tsua.

The data presented herein are based on original field research in Botswana during the summers of 2012–2013 with 3 Tsua consultants (Mathes and Chebanne 2013). Comparative evidence from the Khoe languages Kua and Gǀui is provided to support the forthcoming claims on Tsua consonant-tone interaction. Due to lack of space, this chapter focuses on the click consonant portion of the Tsua segmental inventory.

2 Tsua Click Consonants

The click consonant inventory presented in Table 3.2 follows Ladefoged and Traill (1994) by using the terms “click type” and “accompaniment” to describe the clicks. Tsua has four click types symbolized as: dental ǀ, palatal ǂ, alveolar ǃ, and lateral ǁ, with voicing and nasality transcribed before the click symbol. For instance, ɡǀ represents the voiced dental click and ŋǀ is the nasalized dental click. A ‘k’ before the click symbol to indicate voicelessness is not used in the transcriptions to reduce redundancy and to make the forthcoming Tsua click replacement patterns easier to discern.

Table 3.2 Tsua click consonant inventory (n = 34)

Accompaniment | Dental | Palatal | Alveolar | Lateral
voiceless | ǀ | ǂ | ǃ | ǁ
voiced | ɡǀ | ɡǂ | | ɡǁ
aspirated | ǀʰ | ǂʰ | | ǁʰ
ejective | ǀ’ | | | ǁ’
glottalized | ǀʔ | ǂʔ | ǃʔ | ǁʔ
clusters | ǀχ | | | ǁχ
clusters | ǀqχ’ | | ǃqχ’ | ǁqχ’
clusters | ǀq | ǂq | ǃq | ǁq
clusters | ǀɢ | | | ǁɢ
clusters | ǀqʰ | ǂqʰ | | ǁqʰ
nasals | ŋǀ | ŋǂ | ŋǃ | ŋǁ

Note: Tone depressors are bold and in gray-scale.

¹ With respect to the concepts “click replacement” and “click loss”, Traill and Vossen (1997: 24) diverge from Traill (1986) by stating: “We apply the concept of “click loss” to languages where a click … has been replaced with a non-click consonant and “click replacement” to cases in which one of the influxes … has been replaced by another influx … Obviously the term “replacement” is not altogether felicitous because both “loss” and “replacement” involve replacement of one sound by another.”

The single aspiration type in Table 3.2 is noteworthy because previous work on Khoisan languages has found two different types of click consonant aspiration. The two aspiration types are reported in Traill (1991) for Nama. In the first aspiration type, voiceless aspirated clicks like ǁʰ have an absence of nasal airflow before, during and after the click. The air pressure rises in the pharynx leading up to the anterior closure release with a rapid acceleration of oral airflow, co-occurring with an audible release of the posterior closure. In the second aspiration type, clicks with delayed aspiration, nasal airflow commences before the click closure and is sustained during and after click release. The air pressure is vented through the nasal cavity. The oral airflow upon click release is far slower compared to the first aspiration type because of this passive nasal venting. As a consequence, aspiration slowly increases in amplitude over time for delayed aspirated clicks, as opposed to reaching full amplitude right after click release as in aspirated clicks, quickly followed by an inaudible posterior closure release.

The current dataset does not provide conclusive evidence of delayed aspirated clicks in the Tsua inventory, although Chebanne and Collins (2014) list them as part of the Kua inventory. Honken (2008: 188) states that most Khoe languages only have the type of aspiration found in aspirated non-clicks,

while the contrast between aspiration and delayed aspiration is common in Non-Khoe languages. There is one example in which a Kua cognate has a root-initial delayed aspirated lateral click, transcribed as ǁ’h in Chebanne and Collins (2014), that corresponds to a Tsua root-initial nasalized click, evidence of a potential sound change involving the passive nasal venting of the Kua delayed aspirated click being produced as a nasalized click in Tsua.

(1) Kua delayed aspirated click to Tsua nasal click correspondence
Kua | Tsua | Gloss
ǁ’haˤo | ŋǁaˤo | ‘to throw nimbly’

It is tempting to conclude that delayed aspiration was a casualty of click replacement (i.e. a delayed aspirated click replaced by a nasalized click in Tsua) brought about in part by sociolinguistic factors. Honken (2008: 188–192) gives examples of a correspondence between delayed aspirated clicks in Khoekhoe and nasal clicks in Naro, Gǀui and ǁAni. Nevertheless, the one example in (1) does not supply ample evidence of a sound change that explains the lack of Tsua clicks with delayed aspiration but is included here as one possibility to be explored in future field research. All that can be said at the moment is there is no evidence for or against the existence of delayed aspiration in the Tsua click consonant inventory, so this issue must be left for future study.

Another general property of the Tsua click consonant inventory revealed by Table 3.2 is the palatal and alveolar click types have accompaniment gaps (the blank cells) while the dental and lateral click types have the full set of accompaniments. These gaps are considered in the next section framed within the context of Traill’s observations regarding click replacement (see Fehn, Chapter 9, this volume, for an overview on click replacement and loss in Khoe).

3 Traill’s Click Replacement Patterns

Traill (1986) observes that certain Khoe languages but not all replace clicks with non-clicks, and when the phenomenon occurs, palatal ǂ and alveolar ǃ clicks may be replaced by cognate palatal and velar stops, respectively. In addition, there are Khoe varieties that replace at least some alveolar ǃ clicks while preserving all palatal ǂ clicks, or varieties that replace both, but never a language that has replaced ǂ while keeping ǃ. For instance, Nama, Naro and Gǀui are conservative in that they preserve all of their clicks as phonetically and phonologically stable elements at one end of the spectrum, but Shua, Teti and Tyire are non-conservative varieties that replace both alveolar and palatal

clicks at the other end of the spectrum. The Khoe click-series systems are summarized in (2) in diachronic succession (ibid: 305).

(2) Khoe click-series systems
Stage 1: ǀ ǁ ǂ ǃ
Stage 2: ǀ ǁ ǂ k
Stage 3: ǀ ǁ c k
but *ǀ ǁ c ǃ

A Khoe language in Stage 1 preserves all four click types. A Stage 2 language has replaced at least some alveolar clicks with velar stops. For instance, the aspirated alveolar click ǃʰ is no longer in a language’s inventory because it has been replaced with the aspirated velar stop kʰ, although alveolar clicks with other accompaniments may still be present. A Stage 3 language has replaced at least some alveolar clicks with velar stops and at least some palatal clicks with palatal stops. Dental and lateral click types are always preserved. There is not a situation in which at least some palatal clicks have been replaced by palatal stops while retaining all alveolar clicks.

Traill and Vossen (1997) appeal to the single factor “weakening” of what they categorize as the “abrupt” clicks (i.e. the palatal and alveolar clicks) as the phonetic basis for these sound changes, arguing that it provides a unitary account. Acoustic data reveal the peak amplitudes, spectra and intensities of weakened palatal and alveolar clicks are less like damped impulses, being noisier and more diffuse, compared to the compact peak of the unweakened forms (pp. 46–48). These weakened clicks are correlated with languages that have click replacement while the unweakened versions most commonly occur in languages that do not. The mechanism underlying this acoustic attenuation is articulatory undershoot of the abrupt clicks, in which the degree of rarefaction is reduced due to the formation of a larger click cavity prior to click release, a lingual gesture that compromises the compact peak normally associated with these clicks.
Traill and Vossen (1997: 49) provide evidence of articulatory undershoot in palatographic form to support their argument, concluding that the reduction of articulatory complexity ends with replacement of a click by a non-click. However, it should be stated that more measurements from a variety of languages and speakers are needed before a definitive conclusion can be reached.

The click consonant inventory gaps illustrated in Table 3.2 are a reflection of the extent of Tsua click replacement, with some click consonants being replaced by non-clicks in accord with Traill’s Stage 3. The clicks absent from Tsua are: ɡǃ, ǃʰ, ǂ’, ǃ’, ǂχ, ǃχ, ǂqχ’, ǂɢ, ǃɢ, and ǃqʰ. Further evidence of Stage 3 click replacement in Tsua comes in comparative form with vocabulary based on Traill (1986: 305), shown in Table 3.3.²

Table 3.3 Click replacement comparative data for ǃ and ǂ clicks

Gǀui (Stage 1) | Gǁana (Stage 2) | Ts’ixa (Stage 3) | Tsua (Stage 3) | Gloss
ǃnaro | ŋaro | ŋɡaro | ɡaro | ‘chameleon’
ǃɡanee | ɡanee | ɡaani | ɡani | ‘chin’
ǃhae | khae | khae | kʰae | ‘pierce’
ǃxan | kxan | xan | χan | ‘sew’
ǂnu | ǂnuu | ɲɟuu | ɟuu | ‘black’
ǂɡoa | ǂɡoa | ɟua | ɟoa | ‘ash’
ǂxoa | ǂxoa | cxoa | cχoa | ‘elephant’
ǂhuni | ǂhuni | chuni | cʰuni | ‘elbow’

Source: Traill (1986: 305) for Gǀui, Gǁana and Ts’ixa; Mathes and Chebanne (2013) for Tsua

Table 3.3 shows a few examples of click replacement patterns for alveolar and palatal clicks from Traill (1986: 305). Dental and lateral click types have not been replaced in these languages for reasons that have already been discussed. Gǀui has preserved all four click types, thereby being a Stage 1 language. Gǁana is a Stage 2 language because at least some alveolar clicks have been replaced with velar stops. For instance, ǃɡanee ‘chin’ in Gǀui is the clickless ɡanee in Gǁana. Ts’ixa is a Stage 3 language since at least some alveolar clicks have been replaced with velar stops and at least some palatal clicks with palatal stops in its lexicon. The Tsua data are most similar to Ts’ixa: at least some alveolar clicks have been replaced with velar stops and at least some palatal clicks with palatal stops. Table 3.3 has ǃxan ‘sew’ in Gǀui versus the clickless χan in Tsua. The click replacement comparative data give us insight into the lack of certain clicks in the Tsua inventory, namely, ɡǃ, ǃʰ, ǃχ, and ǂχ, as they have undergone replacement by ɡ, kʰ, χ, and cχ, respectively (see Traill and Vossen 1997: 29 for identical findings in other Khoe varieties). There are extant clicks in the inventory that are replaced in some cases and not in others. ŋǃabe ‘giraffe’ in Gǀui is ɡabe in Tsua, but Tsua has the nasalized alveolar click in ŋǃau ‘small frog’.

² Traill’s transcription system is somewhat different from the system used here. However, the vocabulary is still illustrative of the click replacement patterns under discussion.
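The stage assignments discussed for Table 3.3 can be restated as a small procedure. The Python sketch below is only an illustration under stated assumptions: it treats the Gǀui form as the reference for which click type an etymon had, it counts a click type as "replaced" as soon as one cognate lacks a click, and the names `TABLE_3_3`, `has_click`, `replaced` and `traill_stage` are hypothetical helpers, not anything from Traill (1986). Click letters are written as Unicode escapes so that the code does not depend on look-alike characters.

```python
# Click letters as Unicode escapes: ǀ U+01C0, ǁ U+01C1, ǂ U+01C2, ǃ U+01C3
PALATAL = "\u01c2"
ALVEOLAR = "\u01c3"
CLICKS = "\u01c0\u01c1" + PALATAL + ALVEOLAR

# Column order follows Table 3.3: Gǀui, Gǁana, Ts'ixa, Tsua
LANGUAGES = ("G\u01c0ui", "G\u01c1ana", "Ts'ixa", "Tsua")

# Forms copied from Table 3.3 (transcriptions as given there)
TABLE_3_3 = [
    (ALVEOLAR + "naro", "ŋaro", "ŋɡaro", "ɡaro"),        # 'chameleon'
    (ALVEOLAR + "ɡanee", "ɡanee", "ɡaani", "ɡani"),      # 'chin'
    (ALVEOLAR + "hae", "khae", "khae", "kʰae"),          # 'pierce'
    (ALVEOLAR + "xan", "kxan", "xan", "χan"),            # 'sew'
    (PALATAL + "nu", PALATAL + "nuu", "ɲɟuu", "ɟuu"),    # 'black'
    (PALATAL + "ɡoa", PALATAL + "ɡoa", "ɟua", "ɟoa"),    # 'ash'
    (PALATAL + "xoa", PALATAL + "xoa", "cxoa", "cχoa"),  # 'elephant'
    (PALATAL + "huni", PALATAL + "huni", "chuni", "cʰuni"),  # 'elbow'
]


def has_click(form):
    """True if the form contains any of the four click letters."""
    return any(ch in CLICKS for ch in form)


def replaced(column, click):
    """True if at least one etymon with this click type (judged from the
    Gǀui form, an assumption) has a clickless reflex in the given column."""
    return any(
        row[0].startswith(click) and not has_click(row[column])
        for row in TABLE_3_3
    )


def traill_stage(column):
    """Return 1, 2 or 3, or None for the pattern the text says is unattested
    (palatal clicks replaced while all alveolar clicks are kept)."""
    alveolar_gone = replaced(column, ALVEOLAR)
    palatal_gone = replaced(column, PALATAL)
    if palatal_gone and not alveolar_gone:
        return None
    return 1 + int(alveolar_gone) + int(palatal_gone)


STAGES = {name: traill_stage(i) for i, name in enumerate(LANGUAGES)}
```

Because the rule uses "at least some", a single clickless cognate is enough to move a language to the next stage. This matches the wording of the stages but says nothing about how far a change has spread through the lexicon.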

Click replacement operates on a continuum in terms of the degree to which a language has undergone a particular change, i.e. the degree to which it has spread through the lexicon (Traill 1986: 306). The click replacement patterns found across Khoe languages help explain the Tsua click inventory gaps, since the missing clicks have been replaced by non-clicks. Most importantly, historical sound changes vis-à-vis Tsua’s clicks help explain exceptions to tonal depression, a topic that is addressed shortly.

4 Non-depressed Tonal Melodies

We now turn to the inventory of tonal melodies in Tsua. A “tonal melody” is defined here as a permitted tone sequence docked to the tone-bearing units (TBUs) of a bimoraic root, resulting in the form [T₁T₂], with the mora acting as the TBU. Nasal consonants can bear tone. The high vowels i, u perturb tone melody realization by increasing F0 overall compared to tones docked to nonhigh vowels and are therefore excluded. Six contrastive tonal melodies have been observed as listed in (3).

(3) Tsua contrastive tonal melodies
Tone Category | Notation
a. High-level | acute + acute accents; e.g. áá
b. HM-falling | acute + macron accents; e.g. áā
c. HL-falling | acute + grave accents; e.g. áà
d. Mid-double-rise | macron + macron; e.g. āā
e. MH-rising | macron + acute; e.g. āá
f. ML-falling | macron + grave; e.g. āà

Given the nature of tone melody realization in Tsua, one issue is whether a contour tone unit analysis or a compositional analysis is most appropriate. A contour tone unit analysis, in which a contour behaves as a non-decomposable phonological unit as in Beach (1938), predicts that contour tones can occur on monomoraic forms, while a compositional analysis, in which a contour is a sequence of level tones, predicts that only level tones can occur on monomoraic forms. All of the monomoraic Tsua forms are grammatical morphemes in the current dataset and they only bear level tones, thus supporting the prediction of the compositional analysis and the mora as the TBU.

A near-minimal CVV sextuplet in (4a) provides support for a six-way tonal contrast in Tsua, with examples of the form CVCV showing the same tone patterns in (4b). While the data in (4) focus on the tonal patterns for the CVV and CVCV forms, all six tonal patterns occur with the CVN syllable structure,

with each pattern being produced identically to the CVV and CVCV tonal patterns. These basic tonal melodies were elicited in citation form, i.e. in isolation, which is standard field practice (Hyman 2010: 205; Kutsch Lojenga 2010: 7).

(4) Tsua tone patterns on CVV and CVCV roots

a. Tonal near-minimal sextuplet on CVV roots
Tone Category | Lexical Item | Gloss
High-level | ǁáé | ‘to teach’
HM-falling | ǁáē | ‘to chew’
HL-falling | ǁáè | ‘to brood (eggs)’
Mid-double-rise | ǁāē | ‘seed necklace’
MH-rising | ǁāé | ‘to swear at someone’
ML-falling | ǁāò | ‘wing’

b. Tonal patterns on CVCV roots
Tone Category | Lexical Item | Gloss
High-level | sá.rá | ‘to fall short’
HM-falling | tsó.rō | ‘shell; husk’
HL-falling | ǁχá.bà | ‘hunchback’
Mid-double-rise | ǁχā.mā | ‘red hartebeest’
MH-rising | dzā.rá | ‘bird (generic)’
ML-falling | kā.rò | ‘to run’

The six Tsua contrastive tonal categories were established through speaker judgements, data from related languages and impressionistic categorization. The phonetic realization of the six tonal categories was confirmed through a rigorous acoustic analysis. The tonal space is given to illustrate the cross-speaker defining landmarks for each tonal category. Cross-speaker F0 averaging is done on a logarithmic scale, $\mathrm{mean\_f0} = \exp\left(\frac{1}{n}\sum_{i=1}^{n}\ln f0_i\right)$, in ProsodyPro 5.5.2 (Xu 2013). Five CVV lexical items with root-initial voiceless consonants followed by non-high, non-nasalized and non-pharyngealized vowels are measured for each category to give the typically produced tonal melodies with the least amount of F0 perturbation. Each speaker produced 3 repetitions of each lexical item: 5 CVV lexical items × 3 repetitions × 3 speakers = 45 utterances per F0 averaged tone melody. ProsodyPro was set to time-normalize the contours at 10 points per tone, resulting in 20 time-normalized measurement points for each bi-tonal lexical item. Time normalization is done to allow graphical comparisons across melodies.
A grand total of 900 measurements (45 utterances × 20 time-normalized points) are taken for each melody.
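The averaging and normalization steps described above can be sketched in a few lines of Python. This is a minimal illustration and not ProsodyPro's implementation: `log_mean_f0` applies the logarithmic-scale mean quoted in the text, while `time_normalize` assumes plain linear interpolation onto equally spaced points, which is an assumption made here for the sake of a runnable example. All function names are hypothetical.

```python
import math


def log_mean_f0(values_hz):
    """Mean on a logarithmic scale: exp(sum(ln(f0_i)) / n)."""
    if not values_hz:
        raise ValueError("need at least one F0 value")
    return math.exp(sum(math.log(v) for v in values_hz) / len(values_hz))


def time_normalize(contour_hz, n_points=10):
    """Resample a contour to n_points equally spaced samples.

    Linear interpolation is an assumption of this sketch; the text only
    says that contours were normalized to 10 points per tone.
    """
    if n_points < 2 or not contour_hz:
        raise ValueError("need a non-empty contour and n_points >= 2")
    last = len(contour_hz) - 1
    resampled = []
    for k in range(n_points):
        pos = k * last / (n_points - 1)   # fractional index into the contour
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        resampled.append(contour_hz[lo] * (1 - frac) + contour_hz[hi] * frac)
    return resampled


def average_melody(tokens):
    """Average equally long, time-normalized tokens point by point."""
    return [log_mean_f0(point) for point in zip(*tokens)]


# Token counts stated in the text
UTTERANCES_PER_MELODY = 5 * 3 * 3                    # items x repetitions x speakers
MEASUREMENTS_PER_MELODY = UTTERANCES_PER_MELODY * 20  # 20 points per utterance
```

Averaging on a log scale means that a token at 100 Hz and one at 400 Hz average to 200 Hz rather than 250 Hz, so speakers with different pitch ranges contribute proportionally rather than additively.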

Figure 3.1 The tonal space of the six time-normalized and F0 averaged tone melodies across 3 Tsua consultants (n = 45 per tone melody)

The tonal space of the six time-normalized and F0 averaged melodies across the three Tsua speakers is shown in Figure 3.1. The High-level, HM-falling and HL-falling shapes have F0 peak values close to or above 210 Hz within normalized time points 1–10, i.e. in the first half of the rime in the CVV syllable. I interpret this as confirmation that the root-initial High tone analysis is appropriate. The Mid-double-rise melody stays below the High-level melody throughout, although the first half of the shape (normalized time points 1–10) is closer in frequency compared to the second half (normalized time points 11–20). The MH-rising and ML-falling contours are close in frequency soon after their respective F0 onsets until they diverge at normalized time point 4. At that point, the MH-rising contour begins to rise and the ML-falling contour starts to fall. All six of the F0 offsets cluster in pairs at normalized time point 20: the High-level and MH-rising melodies at approximately 190 Hz, supporting a final High tone interpretation; the HM-falling and Mid-double-rise melodies at about 175 Hz, supporting a final Mid tone; and the HL-falling and ML-falling melodies at around 150 Hz, supporting a final Low tone.

5 Depressed Tonal Melodies

There are two additional melodies that are the result of a root-initial depressed High tone. The HM-falling and HL-falling melodies become Depressed

High-Mid (DH-M) and Depressed High-Low (DH-L) when the root-initial consonant is a voiced or aspirated click. These clicks are seen in bold in Table 3.2. Non-clicks that serve as depressors are: /b d dz ɟ ɡ ɢ pʰ tʰ tsʰ cʰ kʰ qʰ h/. The effect on F0 is essentially the same as click depressors, pointing to a click nonclick parallelism with respect to this phenomenon. Moreover, there is a general avoidance of depressor consonants co-occurring with the High-level melody (i.e. HH) as there are only 11 HH lexical items with root-initial depressors in the current dataset. These are depression exceptions in which the initial High tone is not depressed, a topic that is addressed in §6. In other words, there is no evidence of a Depressed High-High (DH-H) melody. This is perhaps unsurprising, given that co-occurrence restrictions between tone and onset consonants are often manifested in tone languages where F0 differences indicate lexical contrasts (Chen 2011: 612). Tone lowering by certain clicks has been observed in Khoisan languages for quite some time. Beach (1938: 235) describes a characteristic of “Hottentot” [Khoekhoegowab] dialects in which voiced consonants cause tone lowering, arguing that “the parent language of both Nama and Korana had four tones, pronounced very much as in modern Korana … The two lower tonemes … each became subdivided in Nama into two: … the subdivision was produced by voiced consonants.” Beach lists the consonants associated with depressed melodies by stating “of click accompaniments the originally voiced velar click efflux (Cg), delayed aspiration (Ch) and the nasal accompaniment (Cn); of egressive consonants the voiced plosives b, d, g, as well as ts, kx, h, m and n” (p. 251). 
The depressed H tones in Tsua are consistent with observations in Lanham (1958: 74) for Xhosa: “… in passing from a depressor consonant which introduces the syllable, to the immediately succeeding vowel, there is release from the effect of the consonant and a consequent striving to achieve the required pitch level of /H/ … It is clear that under these circumstances a rising glide results and a glide of this kind coinciding with a depressor is a certain indication of /H/. /H/ can thus be realized by an allotone of the shape [/] at various pitch levels.” The depressed H tones become apparent when comparing the tone melodies of Gǀui, Kua and Tsua cognates. The HM and HL melodies map consistently between these languages. However, when Tsua has a root-initial voiced or aspirated click, a following H tone may be depressed. (5) and (6) provide a subset of the click-initial cognates with HM and HL melodies, respectively. The Gǀui data come from Nakagawa et al. (2014) and the Kua data come from Chebanne and Collins (2014) unless otherwise noted.

(5) Gǀui, Kua and Tsua cognates with the HM tone melody

Gǀui (HM)    Kua (HM)    Tsua (HM)    Gloss
ǁáē    ǁáē    ǁáē    ‘to chew’
ǀqχ’áō    ǀqχ’áō    ǀqχ’áō    ‘snake’
ǀʔéē    ǀʔéē    ǀʔéē    ‘fire; firewood’
ǀχáā    ǀχáā    ǀχáā    ‘meat’
ǁqórē    ǁqórē    ǁqórē    ‘claw; fingernail’

(6) Gǀui, Kua and Tsua cognates with the HL tone melody

Gǀui (HL)    Kua (HL)    Tsua (HL)    Gloss
ǀχóò    ǀχóò    ǀχóò    ‘gemsbok’
ǀqχ’áǹ    ǀqχ’áǹ    ǀqχ’áǹ    ‘bad; ugly’
ǀʔáò    ǀʔáò    ǀʔáò    ‘money; blood’
ǁqχ’áà    ǁqχ’áà    ǁqχ’áà    ‘to wash’
ǁχábà    ǁχábà    ǁχábà    ‘hunchback’

(7) shows a subset of the Gǀui and Kua HM cognates that correspond to Tsua DH-M. (8) shows a subset of the HL and DH-L cognates. Henceforth, the DH-M and DH-L tonal transcriptions are marked with an ‘x’ symbol [ ͓] below the vowel with the depressed H tone, i.e. [á͓ā] is DH-M and [á͓à] is DH-L. The Tsua Stage 3 click replacement patterns for alveolar and palatal clicks discussed in §3 can be observed in these data.

(7) Gǀui, Kua HM and Tsua DH-M cognates

Gǀui (HM)    Kua (HM)    Tsua (DH-M)    Gloss
ɡǁám̄    ɡǁám̄    ɡǁá͓m̄    ‘thorn’
ǁʰṹĩ̄    ǁʰṹĩ̄    ǁʰṹ͓ĩ̄    ‘to sigh’
ǂqʰúrū    ǂqʰúrū    ǂqʰú͓rū    ‘dust cloud’
ǃʰáē    kʰáē    kʰá͓ē    ‘to stab’
ǂʰúnī    ǂʰúnī    cʰú͓nī    ‘elbow’

(8) Gǀui, Kua HL and Tsua DH-L cognates

Gǀui (HL)    Kua (HL)    Tsua (DH-L)    Gloss
ɡǀáà    ɡǀáà    ɡǀá͓à    ‘Silver tree’
ɡǃúm̀    ɡúm̀    ɡú͓m̀    ‘to breathe on’
ɡǃám̀    ɡám̀    ɡá͓m̀    ‘to throw something’
ǁʰábà    ǁʰábà    ǁʰá͓bà    ‘to stumble’
ǁʰóbè    ǁʰóbè    ǁʰó͓bè    ‘to creep toward’

Figures 3.2 and 3.3 compare HM, DH-M and HL, DH-L, respectively, to illustrate the interaction between click consonants and tone in Tsua. Figure 3.2 shows superimposed F0 traces of HM ǁáē ‘to chew’ with DH-M ɡǁá͓m̄ ‘thorn’. Figure 3.3 compares HL ǀχóò ‘gemsbok’ with DH-L ɡǀá͓à ‘Silver tree’. The figures are a striking depiction of depressed versus non-depressed contours. The F0 lowering for the depressed melodies is apparent at their respective onsets. The curves converge and fall to their respective F0 offsets.

Figure 3.2 F0 traces of HM ǁáē ‘to chew’ (upper trace at F0 onset) and DH-M ɡǁá͓m̄ ‘thorn’ (lower trace at F0 onset) to illustrate H tone depression as produced by Tsua speaker S. F0 lowering is realized at the start of pitch onset. The curves converge and fall to similar F0 offsets.

Figure 3.3 F0 traces of HL ǀχóò ‘gemsbok’ (upper trace at F0 onset) and DH-L ɡǀá͓à ‘Silver tree’ (lower trace at F0 onset) to illustrate H tone depression as produced by Tsua speaker S. F0 lowering is realized at the start of pitch onset. The curves converge and fall to similar F0 offsets.

Figure 3.4 The tonal space of the six non-depressed and two depressed time-normalized and F0 averaged tone melodies across 3 Tsua consultants (n = 45 per tone melody).
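The time normalization and averaging behind a tonal-space plot of this kind can be illustrated with a minimal sketch. It assumes that each token is a list of F0 samples in Hz taken over the rime, resamples the token to 20 equally spaced normalized time points by linear interpolation, and then averages tokens point by point. The function names and the toy contours are hypothetical, and the sketch is an illustration only, not the procedure used to produce the figure.

```python
from statistics import mean


def time_normalize(f0_hz, n_points=20):
    """Resample one F0 contour to n_points (>= 2) equally spaced points
    using linear interpolation between neighbouring samples."""
    if len(f0_hz) < 2 or n_points < 2:
        raise ValueError("need at least two F0 samples and two output points")
    last = len(f0_hz) - 1
    out = []
    for k in range(n_points):
        pos = k * last / (n_points - 1)  # fractional index into the samples
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(f0_hz[lo] * (1 - frac) + f0_hz[hi] * frac)
    return out


def average_melody(tokens, n_points=20):
    """Average several tokens of one melody at each normalized time point."""
    normalized = [time_normalize(t, n_points) for t in tokens]
    return [mean(column) for column in zip(*normalized)]


# Toy contours (invented values) with different numbers of raw samples.
tokens = [
    [170, 185, 200, 190, 170, 150],
    [172, 190, 205, 196, 180, 160, 152],
]
avg = average_melody(tokens)
```

Because every token is reduced to the same 20 points, contours of different durations can be overlaid and their onsets (point 1) and offsets (point 20) compared directly.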

The tonal space of the six non-depressed plus two depressed time-normalized and F0 averaged melodies across the three Tsua speakers is shown in Figure 3.4. The DH-L contour is indicated by the dashed line with the lowest F0 onset. The DH-M contour is shown with the solid line with the lowest F0 onset. Both F0 traces have an onset value of about 170 Hz, thereby demonstrating that a root-initial depressed High tone is lower than a root-initial Mid tone. The DH-L curve rises while the DH-M curve falls from their respective onsets. The DH-L curve reaches its F0 peak sooner than the DH-M curve, possibly because the DH-L melody has to fall more in an equivalent amount of time to reach its F0 offset. The DH-L’s offset value is quite similar to the HL-falling and ML-falling offset values, therefore confirming the root-final Low tone analysis. The DH-M’s offset value is almost exactly the same as the Mid-double-rise and HM-falling offset values, confirming that the root-final Mid tone analysis is appropriate.

6 Tonal Depression Exceptions and the Relevance of Click Replacement

There are cases in which a root-initial H tone is not depressed when it follows a depressor consonant, whether click or non-click. These depression exceptions reveal an important property of Tsua tonal depression, a property that involves click replacement. Table 3.4 gives a breakdown of exceptions in the current dataset by depressor type. The depression exceptions count is listed in the first data row. The second data row, labeled ‘depression expected’, is the total number of roots where tonal depression is expected to occur, including the exceptions. In addition, the 171 roots with tonal depression are broken down by tone melody (Table 3.5). Tonal depression exceptions for root-initial H tones constitute 19.3% of the data (41 of 212 roots). The voiced obstruents type has the highest number of depression exceptions, with a total of 33. This equals a 29.5% exception rate for that depressor type when divided by the 112 roots with root-initial voiced obstruents where depression is expected. It is informative to consider why so many exceptions occur with voiced obstruents by inspecting the cognates between Gǀui, Kua and Tsua. Many of the voiced stops in Tsua have nasal counterparts in Gǀui and Kua, as observed in Table 3.6. Tsua click replacement for these systematic exceptions comes in three forms: (i) nasalization loss, (ii) click and nasalization loss, and (iii) click and nasalization loss plus fronting. Table 3.6 lists the exceptions for voiced obstruents sorted by click replacement type. The common denominator is that all three forms of click replacement involve nasalization loss for these tonal depression exceptions.

Table 3.4 Root counts of tonal depression exceptions by depressor type

                         Voiced obstruents    Aspirated obstruents    h     Total
Depression exceptions    33                   4                       4     41
Depression expected      112                  88                      12    212
Exception rate                                                              19.3%

Table 3.5 Root counts of tonal depression by tone melody

         Voiced obstruents    Aspirated obstruents    h    Total
DH-M     46                   56                      4    106
DH-L     33                   28                      4    65
Total                                                      171
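The percentages quoted above follow directly from the root counts in Tables 3.4 and 3.5. As a check, the arithmetic can be sketched as below; the dictionary keys and the helper `pct` are illustrative names only, while the counts are transcribed from the two tables.

```python
# Root counts transcribed from Table 3.4 (by depressor type).
exceptions = {"voiced": 33, "aspirated": 4, "h": 4}
expected = {"voiced": 112, "aspirated": 88, "h": 12}

# Root counts transcribed from Table 3.5 (depressed roots by tone melody).
depressed = {
    "DH-M": {"voiced": 46, "aspirated": 56, "h": 4},
    "DH-L": {"voiced": 33, "aspirated": 28, "h": 4},
}


def pct(part, whole):
    """Percentage of part in whole, rounded to one decimal place."""
    return round(100 * part / whole, 1)


total_exceptions = sum(exceptions.values())   # 41 roots
total_expected = sum(expected.values())       # 212 roots
overall_rate = pct(total_exceptions, total_expected)

# Exception rate for each depressor type separately.
rate_by_type = {k: pct(exceptions[k], expected[k]) for k in expected}

# Roots that do undergo depression, summed over both melodies.
total_depressed = sum(sum(row.values()) for row in depressed.values())

# Consistency check between the tables: for every depressor type,
# expected minus exceptions equals the DH-M plus DH-L counts.
for k in expected:
    assert expected[k] - exceptions[k] == depressed["DH-M"][k] + depressed["DH-L"][k]
```

The two tables are mutually consistent: 212 expected roots minus 41 exceptions leaves the 171 depressed roots of Table 3.5, and the same holds within each depressor type.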

Table 3.6 Gǀui, Kua and Tsua correspondences for the voiced obstruents

Gǀui [a]    Kua [b]    Tsua [c]    Gloss
ŋǁúá    ŋǁúá    ɡǁúá    ‘rock; stone’
ŋǁúì    ŋǁúì    ɡǁúì    ‘oil’
ŋǁóē    ŋǁóē    ɡǁóē    ‘moon’
ŋǁúì    ŋǁúì    ɡǁúì    ‘fat’
ŋǁóòrī [d]    ŋǁóòrī    ɡǁóòrī    ‘nephew; niece’
ŋǁáò    ŋǁáò    ɡǁáò    ‘old; old person; to age’
ŋǁárò    –    ɡǁálò    ‘to carry (on the shoulder)’
ŋǁáé    ŋǁáé    ɡǁáé    ‘to count; to read’
ŋǀíĩ̀ ~ ŋǀĩ̀ [d]    ŋǀéè    ɡǀéè    ‘there (DEM)’
ŋǃóō    ŋúō    ɡóō    ‘to be silent; to shut up’
ŋú [e]    ŋɡû [f]    ɡúù    ‘far; to go far away’
ŋǃárò    ŋárò    ɡárò    ‘chameleon’
ŋǃóē    ŋóē    ɡóē    ‘porcupine’
ŋǃárō    ŋárō    ɡárō    ‘to clot; to coagulate; to freeze’
ŋǃúrī    –    ɡúrī    ‘to spoil; to damage; to ruin’
ŋǃáò    ŋáò    ɡáò    ‘to hide; to bury’
ŋǃáé    ŋáé    ɡáé    ‘to pass by’
ŋǃúū    ŋúū    dzúū    ‘hut’
ŋǂúú    ŋǂúú    ɟúú    ‘black’
ŋǂórō    ŋǂórō    ɟórō    ‘back; backbone’
ŋǂóē    ŋǂéē    ɟúē    ‘to deny; not to trust; to oppose’

Note: Tonal depressor exceptions by click replacement type are shown (where comparative data are available) (n = 21).
[a] Nakagawa et al. (2014)
[b] Chebanne and Collins (2014)
[c] Mathes and Chebanne (2013)
[d] Nakagawa (p.c.)
[e] Tanaka (1978)
[f] Vossen (1997)

It is striking to observe that the dental and lateral clicks are the most stable out of the four Tsua click types, being preserved in all of the cases. This observation is consistent with the claim that of the four basic clicks in Khoe languages, only the non-affricated (alveolar and palatal) clicks are replaced by clickless consonants while the affricated (dental and lateral) clicks remain stable (Traill 1986). Looking at the first example ‘rock; stone’, click replacement via nasalization loss is exemplified as:

(9) Tsua click replacement via nasalization loss

Gǀui    Kua    Tsua    Gloss
ŋǁúá    ŋǁúá    ɡǁúá    ‘rock; stone’

Click and nasalization loss are evident in six of the exceptions. All six examples involve alveolar click replacement. We can compare ŋǃóē ‘porcupine’ in Gǀui with the Kua and Tsua cognates ŋóē and ɡóē, respectively.

(10) Tsua click replacement via click and nasalization loss

Gǀui    Kua    Tsua    Gloss
ŋǃóē    ŋóē    ɡóē    ‘porcupine’

The third type of click replacement, click and nasalization loss plus fronting, is apparent in the cognates for ‘hut’. We might expect the root-initial velar nasal consonant ŋ to correspond to the voiced velar stop ɡ in Tsua, following the pattern seen in (10). However, the Tsua correspondent is the voiced alveolar affricate dz.

(11) Tsua click replacement via click and nasalization loss plus fronting

Gǀui    Kua    Tsua    Gloss
ŋǃúū    ŋúū    dzúū    ‘hut’

Independent evidence for Tsua nasalization loss is found in Vossen (1997), which has a comparative word list from various Kalahari Khoe East languages. Cognates from the Kalahari Khoe East languages ǀXaise, Deti, Cara, Ts’ixa and Danisi as compared to Tsua are listed in Table 3.7. The Tsua examples have root-initial voiced consonants corresponding to root-initial nasal consonants in the rest of the Kalahari Khoe East languages.3

Table 3.7 Root-initial nasal consonants in Kalahari Khoe East languages compared to Tsua root-initial voiced consonants

ǀXaise    Deti    Cara    Ts’ixa    Danisi    Tsua    Gloss
ŋɡabe    ŋabe    ŋɡabe    ŋɡabe    ŋɡabe    ɡabe    ‘giraffe’
ŋɡao    –    ŋɡao    –    ŋɡao    ɡao    ‘sp. Acacia tree’
ŋǀu    ŋǀnu    ŋǀu    ŋǀu    ŋǀu    ɡǀu    ‘leg’
ŋǁobu    ŋǁnubu    ŋǁobu    –    ŋǁxobu    ɡǁubu    ‘to shake’
–    ŋǁai    ŋǁai    ŋǁai    ŋǁai    ɡǁai    ‘to sing’
ŋɡabi    ŋabi    ŋɡabi    –    –    ɡabi    ‘to turn over’

Source: Based on Vossen (1997)

3  Vossen (1997) transcribes nasal clicks with a tilde above the click symbol and voiced clicks with a tilde below the click symbol. I have modified Vossen’s transcriptions by using ŋ to indicate nasal clicks, e.g. ŋǀ, and ɡ to indicate voiced clicks, e.g. ɡǀ, to make the root-initial consonant comparisons easier to see. No further modifications were made.

The data in Tables 3.6 and 3.7 suggest that the nasal form is the earlier form if we use the comparative method of linguistic reconstruction (e.g. Crowley and Bowern 2010). For instance, the sound change ŋǃ > ɡ occurred historically in Tsua. Tsua has been less conservative in terms of the historically nasal consonants compared to the other languages in Tables 3.6 and 3.7. Thus, a diachronic sound change in Tsua involved nasalization loss and in some cases the root-initial consonants underwent fronting. These roots did not originally have depressed H tones, since nasalized clicks and nasal non-clicks did not trigger tonal depression historically and do not involve depression synchronically. These particular H tones never became depression targets. The lack of F0 lowering in these cases may be motivated diachronically. What remains uncertain is why tonal depression did not affect these H tones after the sound changes as a synchronic rule. One possibility is that the consonant-tone interactions were lexicalized by the time click replacement commenced. Therefore, the roots in question retained their original H tones after click replacement, even though the tones were eventually preceded by voiced obstruents. What might be unusual here is the apparently frozen-in-time nature of tonal depressor effects, since depression occurs following voiced obstruents that did not originate as nasalized clicks or plain nasal consonants. It may be that depression was a synchronic rule at an earlier stage of the grammar but the depressed tonal contours had been lexicalized by the time click replacement took place. Alternatively, there may be a synchronic depression rule, with the roots in Table 3.6 constituting lexical exceptions. It is a question that must remain open for the time being.

One might consider whether the co-occurrence of root-initial nasal clicks with respect to the six non-depressed melodies is restricted. Root-initial nasal clicks are attested in 5 out of the 6 non-depressed melodies. Table 3.8 has examples for each tonal melody. Only the ML melody does not co-occur with a nasal click in the current Tsua dataset. It is not clear at this time whether this restriction will hold if more data are collected. While it cannot be definitively stated that root-initial nasal clicks co-occur with all six non-depressed melodies, we can say that they do co-occur with root-initial H and M tones.

Table 3.8 Examples of nasal clicks co-occurring with five of the six non-depressed tonal melodies in Tsua

Melody    Tsua example    Gloss
HH    ŋǁáḿ    ‘to scrape’
HM    ŋǁúm̄    ‘nighthawk’
HL    ŋǁáù    ‘to spray liquid’
MM    ŋǁūm̄    ‘cloud’
MH    ŋǁāí    ‘game stick’
ML    –    –

The proposal above can be formulated as the hypothesis:

(12) Tsua Depression Exceptions Hypothesis
Tsua voiced obstruents that were nasal sonorants or nasalized clicks historically are not depressors synchronically.

There are three predictions made by the hypothesis. First, root-initial voiced clicks in Gǀui and Kua should correspond to depressed Tsua H tones, whether the clicks have been replaced or not. Second, depressed H tones in Tsua should never correspond to root-initial nasals in Kua and Gǀui. Third, any lexical items with root-initial voiced obstruents that entered the grammar after the Tsua sound changes may not be subject to depression if the depression pattern is not productive.

Evidence supporting the first prediction is in Table 3.9. The top five cognate sets show root-initial correspondences between Gǀui, Kua and Tsua where the voiced alveolar click has been replaced but tonal depression still occurs in Tsua. The tone melodies are in parentheses after each cognate for clarity. With respect to the second prediction, there are no instances of Tsua depressed H tones following voiced obstruents with cognate nasal correspondences in Kua or Gǀui in the dataset. There is also evidence to support the third prediction: recent Setswana loan words are not subject to tonal depression. For example, the Tsua HML débēdù ‘dewlap; chicken jowl’ and HM díā ‘to delay’, which should have root-initial depressed H tones, are likely the Setswana loans lebedu ‘dewlap’ and diiwa ‘to delay’, respectively. At least 2 of the 4 exceptions for the glottal fricative h are loans: the HL húmà ‘to become rich’ and the HH hírá ‘to hire’ are from Setswana huma ‘to become rich’ and hira ‘to hire’. Thus, by appealing to sound change, at least 25 out of 41 exceptions can be plausibly accounted for pending the availability of supplementary comparative data.

Table 3.9 Cognates where tonal depression occurs in Tsua but not in Gǀui or Kua

Gǀui [a]    Kua [b]    Tsua [c]    Gloss
ɡǃãĩ (HM)    ɡãĩ (HM)    ɡãĩ (DH-M)    ‘steenbok’
ɡǃoo (HL)    ɡuo (HL)    ɡoo (DH-L)    ‘aardvark’
ɡǃui (HM)    ɡui (HM)    ɡui (DH-M)    ‘rope; trapping string’
ɡǃam (HL)    ɡam (HL)    ɡam (DH-L)    ‘to throw’
ɡǃari (HL)    –    ɡari (DH-L)    ‘to thatch’
ɡum (HL)    ɡum (HL)    ɡum (DH-L)    ‘to blow’
ɟua (HL)    ɟua (HL)    dua (DH-L)    ‘kudu’
ɟuu (HL)    ɟuu (HL)    duu (DH-L)    ‘eland’
ɡǁam (HM)    ɡǁam (HM)    ɡǁam (DH-M)    ‘thorn’
ɡǁuu (HM)    ɡǁuu (HM)    ɡǁuu (DH-M)    ‘chest’

[a] Nakagawa et al. (2014)
[b] Chebanne and Collins (2014)
[c] Mathes and Chebanne (2013)

7 The Anatomy of DH

“Depressed High” has been used as a convenient cover term, but it can now be made more precise to account for the Tsua data. The structure of DH could be analyzed as resulting from Low tone insertion. In this case, the anatomy of DH would simply be [LH]. Thus, DH-L is [LHL] and DH-M is [LHM]. The Tsua tonal depression rule is proposed in (13).

(13) Tsua tonal depression rule with the [+slack] feature
∅ → L / [-sonorant, +slack] ___ H [-H] #

In (13), a Low tone is inserted in the environment following [-sonorant], [+slack] and when followed by a sequence of a High tone plus a non-high tone (i.e. M or L). The rule in (13) appeals to the unifying feature [+slack] as the source of F0 lowering in Tsua. §8 gives an explanation in favor of appealing to this feature. For now, if we assume that voiced obstruents, aspirated obstruents and /h/ can be unified via [+slack], while Mid and Low tones could be represented as [-H], (13) provides a plausible account for the shapes of the depressed contours. Figure 3.4 demonstrates that the F0 target at the start of the depressed contours is consistent with a Low tone interpretation, because it is clearly distinct from and below a root-initial Mid tone target. Subsequently, the curves rise to their respective High tone targets followed by a fall to the root-final tones. The curves do not show evidence of High tone deletion. Moreover, the effect is unlikely to be purely phonetic, as root-initial Mid tones do not undergo F0 lowering in the post-depressor environment. The formulation in (13) is highly context-dependent and takes into account the lack of a Depressed High-High melody discussed earlier.

Nevertheless, the Tsua tonal depression rule in (13) may be problematic when we consider the exceptions. There are two ways to interpret the results seen in Table 3.4. The first interpretation is that a rule such as the one in (13) is present in Tsua’s phonology. In turn, the voiced obstruent exceptions, which pattern together as being historically non-depressor nasals, do not undergo the rule via lexical specification. The second interpretation is that the rule is not present in Tsua’s phonology. DH-M and DH-L are specified underlyingly as /LHM/ and /LHL/, surfacing as [LHM] and [LHL], respectively. Both accounts rely on lexicalization, but in different ways. In the first, there is a productive Low tone insertion rule, but there are lexically marked exceptions that resist the rule. In the second, there is not a rule, as the contours have been lexicalized. The current dataset cannot conclusively distinguish between these two possibilities. Future research that incorporates the probing of Tsua speakers’ grammars must be conducted to reach a definitive conclusion.
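Under the first interpretation, the rule in (13) together with lexically marked exceptions can be sketched as a small function. This is an illustration under stated assumptions, not an implementation from the chapter: the non-click depressors are the segments listed in §5, voiced and aspirated clicks are detected by a coarse string check, melodies are written as strings of H, M and L, and the `lexical_exception` flag is a hypothetical stand-in for the lexical specification of roots such as those in Table 3.6.

```python
CLICKS = "ǀǁǃǂ"

# Non-click depressors listed in §5: voiced and aspirated obstruents plus /h/.
NON_CLICK_DEPRESSORS = {
    "b", "d", "dz", "ɟ", "ɡ", "ɢ",
    "pʰ", "tʰ", "tsʰ", "cʰ", "kʰ", "qʰ", "h",
}


def is_depressor(onset):
    """Rough stand-in for [-sonorant, +slack]."""
    if any(ch in onset for ch in CLICKS):
        # Assumption: voiced clicks are written with initial ɡ and
        # aspirated clicks contain ʰ; nasal clicks (initial ŋ) are excluded.
        return onset.startswith("ɡ") or "ʰ" in onset
    return onset in NON_CLICK_DEPRESSORS


def apply_depression(onset, melody, lexical_exception=False):
    """Insert L before a root-initial H followed by a root-final non-High
    tone, as in (13). HH and Mid-initial melodies are left unchanged."""
    if lexical_exception or not is_depressor(onset):
        return melody
    if len(melody) == 2 and melody[0] == "H" and melody[1] in "ML":
        return "L" + melody
    return melody
```

With these assumptions, a voiced click onset with HM yields LHM (DH-M), a plain voiceless click leaves HM unchanged, and a root flagged as a lexical exception keeps its HL melody even after ɡ. Under the second interpretation no such function would exist: the LHM and LHL melodies would simply be stored in the lexicon.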

8 Summary and Implications

This chapter has detailed the interaction between Tsua click consonants and tone. There is a systematic class of exceptions that historically had root-initial nasal clicks or sonorants. These consonants are not depressors synchronically even though click replacement has seemingly caused them to become part of the voiced obstruents depressor type. The Tsua Depression Exceptions Hypothesis presented in this chapter gives us two crucial pieces of knowledge. First, Traill’s (1978) observation of the geographical West to East click replacement tendencies has consequences not just for the Tsua clicks themselves but also for understanding the Tsua tonal system. Therefore, future research on Khoisan tone should look at comparative lexical data because click replacement patterns may disguise tonal depression in other Kalahari Khoe East languages. Second, the Tsua evidence suggests that click replacement involves the nasalization accompaniment, which undermines a claim in Traill and Vossen (1997: 23–24): “… accompaniments consist of independent articulations involving glottal states (voiceless, voiced, glottal stop), voice onset time, nasalization and ejection … Neither process [click loss and click replacement] involves the accompaniment.” It would appear that this claim is too strong.4

The fact that F0 lowering may follow Tsua voiced stops, including clicks, is unsurprising. The effect of prevocalic voiced stops on F0 has been studied for quite some time (e.g. House and Fairbanks 1953; Lehiste and Peterson 1961; Löfqvist 1975; among others). The data from these and other studies make clear that the F0 of a following vowel is higher after voiceless than after voiced stops, and it does not vary as a function of place of articulation (Hombert et al. 1979: 39). The common cross-linguistic pattern of F0 lowering by prevocalic voiced stops holds for tonal as well as non-tonal languages. The studies in Hombert (1978) confirm this pattern by presenting the F0 perturbation time course after voiced stops in Yoruba and American English. In sum, F0 lowering on the following vowel after voiced stops, as compared to voiceless stops, has been reported in numerous phonetic studies and can be attributed to slackening of the vocal folds (Halle and Stevens 1971; Kingston and Diehl 1994). The slackening gesture is the articulatory correlate of the [+slack] feature (Halle and Stevens 1971).

On the other hand, F0 lowering after Tsua aspirated stops requires an explanation. The effect of aspirated stops is less consistent with respect to F0 perturbation trends cross-linguistically. F0 lowering by aspirated stops occurs in tone languages spoken in Africa and Asia. However, there are languages in which a higher F0 has been reported after aspirated stops compared to unaspirated stops.
For instance, data from Danish (Jeel 1975), Korean (Han and Weitzman 1970), Cantonese (Zee 1980), Thai (Gandour 1974) and Japanese (Ewan 1976) show that an aspirated stop gives rise to a higher F0 at the onset of the following vowel (as cited in Xu and Xu 2003: 165). Moreover, the phonetic F0 raising effect has been shown to phonologize into an H tone in some languages (see Rivierre 1972, 1993). It appears that there is a dichotomy between F0 lowering and F0 raising after aspirated stops across languages.

Appealing to the [+slack] feature as the unifying cause of Tsua’s synchronic tonal depression is complicated by classificatory systems that consider voiceless, aspirated stops as [-slack] (e.g. Halle and Stevens 1971: 51). Nonetheless, Chen (2011: 622) states: “To the extent that aspirated stops do introduce different perturbation effects as compared to their voiceless unaspirated counterparts, the cross-linguistic variability suggests that speakers may use different strategies to produce aspirated stops in different languages which lead to different perturbation effects.” Citing the vocalis muscle effects described in Hoole (2006), Chen continues:

Although vocal fold activity is a very complex phenomenon and much still remains to be learned, we may hypothesize that languages can differ in the strength of the vocalis muscle effects … It is possible that in Shanghai Chinese at the release of an aspirated stop, such an effect is either weak or not as effective as those after a voiceless unaspirated stop since the glottis opening after an aspirated stop is much greater. This in turn may lead to relative slackness of the vocal folds at the onset of the vowel after an aspirated onset, resulting in a certain amount of extended glottal opening and gradual onset of voicing. Consequently, there is a breathy transition between the aspiration and the vowel voicing with weak glottal buzz, manifested in higher H1‒H2 and f0 lowering, as compared to voiceless unaspirated stops. In other words, not only the timing of laryngeal adjustments relative to supraglottal gestures (manifested as the VOT values) are important to differentiate stop contrasts, the state of the glottis constriction (e.g. spread vs. constricted and/or stiff vs. slack) can also play a role even in cases where VOT seems to be a sufficient cue for phonological distinctions.
Chen 2011: 622

4  Many thanks to an anonymous reviewer of Mathes (2016), for pointing out that click replacement may involve the accompaniment in Shua dialects.

If Chen’s hypothesis is on the right track, then vocal fold slackness may have caused the F0 lowering following voiced and aspirated stops, including clicks, in Tsua, assuming that aspiration is articulated in a similar fashion as described above for Shanghai Chinese. The voiceless glottal fricative h is an additional complication to the [+slack] possibility, in part because it is classified with the feature [-slack] (Halle and Stevens 1971: 51). There are two pieces of evidence to consider in this case. First, we may look at the related language Shua. Westphal’s handwritten field notes (n.d.) on Shua verb tense and pronouns have the voiced glottal fricative ɦ in the transcriptions, for example ɦĩĩ ‘to work’ and ɦiʃe ‘you (fem.)’. The handwritten transcription of ɦ does not seem to be Westphal’s idiosyncratic way of writing ‘h’ in general, because he clearly transcribes ‘to stab’ as khae and ‘person’ as khwe. Second, we may consider evidence from Ikalanga, a neighboring Bantu language that has had extensive contact with Tsua, which could have been a source of historical influence on Tsua phonology (Chris Collins, Andy Chebanne, personal communication). Ikalanga has the voiced glottal fricative ɦ as a depressor in its inventory (Mathangwane 1999: 175). The voiced glottal fricative ɦ is categorized as [+slack] by Halle and Stevens, which would explain its F0 lowering effect on a following tone. It may be that Tsua’s voiceless glottal fricative h was historically voiced and the resulting F0 lowering became phonologized. Recently, ɦ became devoiced in Tsua but remained voiced in the Shua dialects at the time of Westphal’s field research.

While a conclusive answer remains elusive, this chapter raises intriguing questions about which consonants function as depressors across Khoisan languages and what the time course and temporal ordering were for phonologization by depressor type.

References

Barnard, Alan. 1992. Hunters and Herders of Southern Africa: A Comparative Ethnography of the Khoisan Peoples. Cambridge Studies in Social and Cultural Anthropology 85. Cambridge: Cambridge University Press.

Beach, Douglas M. 1938. The Phonetics of the Hottentot Language. Cambridge: W. Heffer & Sons LTD.

Bradshaw, Mary M. 1999. “A Crosslinguistic Study of Consonant-Tone Interaction.” Ph.D. thesis, Ohio State University.

Chebanne, Andy M., and Christopher T. Collins. 2014. “Kua Lexicon.” Unpublished manuscript, University of Botswana and New York University.

Chen, Yiya. 2011. “How does phonology guide phonetics in segment-f0 interaction?” Journal of Phonetics 39: 612–625.

Crowley, Terry, and Claire Bowern. 2010. An Introduction to Historical Linguistics. New York: Oxford University Press.

Downing, Laura J., and Bryan Gick. 2005. “Voiceless tone depressors in Nambya and Botswana Kalang’a.” Proceedings of the 27th Annual Meeting of the Berkeley Linguistics Society, February 16–18, 2001, ed. Charles Chang, Michael J. Houser, Yuni Kim, David Mortensen, Mischa Park-Doob, and Maziar Toosarvandani. Berkeley, CA: Berkeley Linguistics Society, pp. 65–80.

Ewan, William G. 1976. “Laryngeal Behavior in Speech.” Ph.D. thesis, University of California, Berkeley.

Gandour, Jack. 1974. “Consonant types and tone in Siamese.” Journal of Phonetics 2: 337–350.

Güldemann, Tom, and Rainer Vossen. 2000. Khoisan. African Languages: An Introduction, ed. Bernd Heine and Derek Nurse. Cambridge: Cambridge University Press, pp. 99–122.

Halle, Morris, and Kenneth N. Stevens. 1971. “A note on laryngeal features.” Quarterly Progress Report, Research Laboratory of Electronics, (MIT) 101: 198–213.

Han, Mieko S., and R.S. Weitzman. 1970. “Acoustic features of Korean /P, T, K/, /p, t, k/ and /ph, th, kh/.” Phonetica 22(2): 112–128.

Hildebrandt, Kristine A. 2003. “Manange Tones: Scenarios of Retention and Loss in Two Communities.” Ph.D. thesis, University of California, Santa Barbara.

Hobley, June. 1964. “A preliminary tonal analysis of the Bassa language.” Journal of West African Languages 1(2): 51–55.

Hombert, Jean-Marie. 1978. Consonant types, vowel quality, and tone. Tone: A Linguistic Survey, ed. Victoria A. Fromkin. New York: Academic Press, pp. 77–111.

Hombert, Jean-Marie, Ohala, John J., and William G. Ewan. 1979. “Phonetic explanations for the development of tones.” Language 55: 37–58.

Honken, Henry. 2008. “The split tones in Central Khoesan.” Khoisan Languages and Linguistics: Proceedings of the 2nd International Symposium January 8–12, 2006, Riezlern/Kleinwalsertal, ed. Sonja Ermisch. Quellen zur Khoisan-Forschung / Research in Khoisan Studies 22. Cologne: Rüdiger Köppe, pp. 185–224.

Hoole, Philip. 2006. Experimental studies of laryngeal articulation. Habilitationsschrift, University of Munich.

House, Arthur S., and Grant Fairbanks. 1953. “The influence of consonant environment upon the secondary acoustical characteristics of vowels.” Journal of the Acoustical Society of America 25: 105–113.

Hyman, Larry M. 2010. “How to Study a Tone Language, with exemplification from Oku (Grassfields Bantu, Cameroon).” UC Berkeley Phonology Lab Annual Report: 179–209.

Jeel, V. 1975. “An investigation of the fundamental frequency of vowels after various Danish consonants, in particular stop consonants.” Technical Report (University of Copenhagen, Institute of Phonetics) 9: 191–211.

Kingston, John, and Randy L. Diehl. 1994. “Phonetic Knowledge.” Language 70: 419–454.

Kutsch Lojenga, Constance. 1998. “Depressor consonants and rising tones in Yaka (C10).” Paper presented at the 28th Conference on African Languages and Linguistics, Leiden University.

Kutsch Lojenga, Constance. 2010. “Procedure for Tone Analysis.” The 3L International Summer School on Language Documentation and Description, Leiden University Center for Linguistics.

Ladefoged, Peter, and Anthony Traill. 1994. “Clicks and their accompaniments.” Journal of Phonetics 22: 33–64.

Lanham, L.W. 1958. “The tonemes of Xhosa.” African Studies 17(2): 65–81.

Lehiste, Ilse, and Gordon E. Peterson. 1961. “Some basic considerations in the analysis of intonation.” Journal of the Acoustical Society of America 33: 419–425.

Löfqvist, Anders. 1975. “Intrinsic and extrinsic F0 variations in Swedish tonal accents.” Phonetica 3: 226–247.

Maran, La Raw. 1971. Burmese and Jingpho: A Study of Tonal Linguistics Processes. Occasional Papers of the Wolfenden Society on Tibeto-Burman Linguistics 4, ed. F.K. Lehmen. Urbana, IL: Center for Asian Studies, University of Illinois.

Mathangwane, Joyce T. 1999. Ikalanga Phonetics and Phonology: A Synchronic and Diachronic Study. Stanford, CA: CSLI Publications.

Mathes, Timothy K. 2015. “Consonant-Tone Interaction in the Khoisan Language Tsua.” Ph.D. thesis, New York University.

Mathes, Timothy K. 2016. “The segmental inventory and status of click replacement in Tsua.” Khoisan Languages and Linguistics: Proceedings of the 5th International Symposium, July 13–17, 2014, Riezlern/Kleinwalsertal, ed. M. Brenzinger and S. Shah. Quellen zur Khoisan-Forschung / Research in Khoisan Studies 34. Cologne: Rüdiger Köppe.

Mathes, Timothy K., and Andy M. Chebanne. 2013. “Tsua Lexicon.” Unpublished manuscript, New York University and University of Botswana.

Nakagawa, Hirosi, Sugawara, Kazuyoshi, and Jiro Tanaka. 2014. Gǀui-English-Japanese Dictionary. (April 2014 preliminary edition). Tokyo University of Foreign Studies.

Rivierre, Jean-Claude. 1972. “Les tons de la langue de Touho (Nouvelle-Calédonie): Etude diachronique.” Bulletin de la Société de Linguistique de Paris 67(1): 301–316.

Rivierre, Jean-Claude. 1993. “Tonogenesis in New-Caledonia.” Oceanic Linguistics Special Publications 24, Tonality in Austronesian Languages, pp. 155–173.

Shen, Zhongwei. 1994. “The tones in the Wujiang dialect.” Journal of Chinese Linguistics 22(2): 278–314.

Tanaka, Jiro. 1978. A San vocabulary of the central Kalahari: Gǁana and Gǀwi dialects. African languages and ethnography 7. Tokyo: Institute for the Study of Languages and Cultures of Asia and Africa (ILCAA), Tokyo University of Foreign Studies.

Tang, Katrina Elizabeth. 2008. “The Phonology and Phonetics of Consonant-Tone Interaction.” Ph.D. thesis, UCLA.

Traill, Anthony. 1978. “Preliminary report on the linguistic situation amongst the Kua Basarwa living on the cattleposts Bae, Mosetlharobega, Metsimonate in the Central district.” Appendix 28 in 1979, Kalahari Cattle Posts, vol. 2, ed. Robert K. Hitchcock. Gaborone: Government Printer, pp. 252–260.

Traill, Anthony. 1986. “Click replacement in Khoe.” Contemporary Studies on Khoisan: In Honour of Oswin Köhler on the Occasion of his 75th Birthday, vol. 2, ed. Rainer Vossen and Klaus Keuthmann. Quellen zur Khoisan-Forschung 5.2. Hamburg: Helmut Buske, pp. 301–320.

Traill, Anthony. 1991. “Pulmonic control, nasal venting, and aspiration in Khoisan languages.” Journal of the International Phonetic Association 21(1): 13–18.

Traill, Anthony, and Rainer Vossen. 1997. “Sound change in the Khoisan languages: new data on click loss and click replacement.” Journal of African Languages and Linguistics 18(1): 21–56.


Vossen, Rainer. 1997. Die Khoe-Sprachen: ein Beitrag zur Erforschung der Sprachgeschichte Afrikas. Quellen zur Khoisan-Forschung 12. Cologne: Rüdiger Köppe Verlag.

Vossen, Rainer. 2013. “Phonetics and Phonology, Tshwa subgroup.” The Khoesan Languages, ed. Rainer Vossen. New York: Routledge, pp. 73–75.

Westphal, E.O.J. n.d. “Shua. (BC 1143 (C8), E O J Westphal Papers).” Unpublished notes housed at the Rare Documents and Manuscripts Department, University of Cape Town.

Wilmsen, Edwin N., and Rainer Vossen. 1990. “Labour, Language and Power in the Construction of Ethnicity in Botswana.” Critique of Anthropology 10(1): 7–37.

Xu, Ching X., and Yi Xu. 2003. “Effects of consonant aspiration on Mandarin tones.” Journal of the International Phonetic Association 33(2): 165–181.

Xu, Yi. 2013. ProsodyPro – A Tool for Large-scale Systematic Prosody Analysis, version 5.5.2. URL: http://www.homepages.ucl.ac.uk/~uclyyix/ProsodyPro/. Proceedings of Tools and Resources for the Analysis of Speech Prosody (TRASP 2013), Aix-en-Provence, France, 7–10.

Zee, Eric. 1980. “The effect of aspiration on the F0 of the following vowel in Cantonese.” UCLA Working Papers in Phonetics 49: 90–97.

Chapter 4

Click Loss and Click Insertion in Fwe

Hilde Gunnink

Fwe¹ (Bantu, K402) is one of a cluster of five Bantu languages in the south-west of the Bantu-speaking area that use clicks. This paper gives the first description of the click inventory of the central dialect of Fwe, spoken on the border between Zambia and Namibia. In this variety of Fwe clicks are frequently interchanged with non-click consonants. In this paper I discuss the use of clicks in Fwe, focusing on the alternation between clicks and non-clicks and the way this alternation relates to the processes of click insertion and click loss.

The paper is organized as follows. The first section gives background on the Fwe language, its speakers and its genetic affiliation. The second section describes where and how the data on which this paper is based were collected. The third section gives a phonological description of clicks in Fwe, and section four discusses the position of clicks in the phonotactic structure of the language. Section five discusses the alternation between clicks and egressive consonants as found in the speech of 13 Fwe speakers who were interviewed for this study. Sections six and seven discuss the role of click loss and click insertion in the history of Fwe, and how these processes relate to the modern alternation between clicks and non-clicks.

1 Fwe

Fwe is spoken by between 10,000 and 20,000 speakers in south-western Zambia and the Zambezi region (the former Caprivi strip) of Namibia (Sakuhuka et al. 2011), as shown in Map 4.1. In Zambia, Fwe is spoken in the Imusho and Sinjembela regions, and parts of the Mutomena region. In Namibia, Fwe is spoken in the north-eastern part of the Zambezi region. Namibian Fwe and Zambian Fwe are quite different, and the Fwe variety spoken in the Imusho region, situated between Namibia in the south and Sinjembela in the north, represents a transition zone. Clicks are absent from the northernmost variety of Zambian Fwe.
1  I am grateful to Bonny Sands and Tim Mathes for their useful comments on earlier versions of this paper, to Jan Gunnink for providing the map used in this paper, and to all Fwe speakers I worked with for their contribution to this study.

© Koninklijke Brill NV, Leiden, 2020 | doi:10.1163/9789004424357_005

Map 4.1 Locations where Fwe is spoken

Fwe is a Bantu language classified as K.402 in Maho’s updated version of Guthrie’s referential system (Maho 2009). Genealogically, it is part of the Bantu subgroup called Bantu Botatwe (de Luna 2010). Fwe is part of the western Bantu Botatwe subgroup together with Subiya, Totela and Shanjo (Bostoen 2009). None of these languages have clicks, with the exception of the Botswana variety of Subiya (Sands, this volume). Fwe is one of five Bantu click languages, the others being Manyo, Mbukushu, Kwangali and Yeyi (the “South-West Bantu click languages”, Gunnink et al. 2015), which are geographically distinct from another, better-known cluster of Bantu click languages spoken in South Africa, which includes the Nguni languages and Southern Sotho (the “South-East Bantu click languages”, Pakendorf et al. 2017).

2 Methodology

The present research is based on the Fwe variety that is spoken in the Imusho region of Zambia. The Imusho region is bordered by two other predominantly Fwe-speaking areas, the Sinjembela region in the north, and the Zambezi region (formerly Caprivi strip) in Namibia in the south. A list of click words has been collected, which is provided in the appendix. From this list, a native Fwe speaker who acted as a translator selected nineteen words that he considered to be the most frequent, and therefore the most likely to be known by all speakers. The pronunciation of these words was then checked with thirteen native speakers of Fwe living in the Imusho area: five women and eight men between 23 and 64 years old. With the help of this translator, each speaker was asked to pronounce each of the nineteen words three times in isolation, and three times in a sentence, though some speakers produced more or fewer tokens than asked for. Although the majority of speakers knew all words, a few words were rejected by certain speakers as being from a different language variety, and an alternative lexeme was offered instead.

Table 4.1 Words showing contrast between clicks and non-clicks in Fwe

(a)  ᶢǀìmà, ᶢǂìmà, ᶢǁìmà ‘small fish sp.’   vs.   sîmà ‘well’
(b)  kù-ᵑǀûr-à, kù-ᵑǂûr-à, kù-ᵑǁûr-à INF-click-FV ‘to click’   vs.   kù-fûr-à INF-sharpen-FV ‘to sharpen’
(c)  kù-ǀàpùr-à, kù-ǂàpùr-à, kù-ǁàpùr-à INF-tear-FV ‘to tear’   vs.   kù-háꜝpúr-à INF-open-FV ‘to open (a book)’

Note: The following abbreviations are used in the glosses: AUG ‘augment’, COP ‘copulative prefix’, FV ‘final vowel suffix’, INF ‘infinitive’, NP ‘noun class prefix’. Numbers refer to the numbered noun classes.

3 The Click Inventory of Fwe

Clicks in Fwe have a very low functional load; there are few click phonemes, and few words in which clicks are used. A total of 87 click words has been found in Fwe. There are four distinct click phonemes: a voiceless click, a voiced click, a voiceless nasal click² and a voiced nasal click. Clicks may be realized as dental, palatal or lateral, but click type is not contrastive. The phonemic status of clicks is difficult to establish due to the low number of click words. Examples (a)–(c) in Table 4.1 show minimal pairs between clicks and non-clicks in Fwe.

Voicing and nasality are contrastive features of clicks. The following spectrograms illustrate the four contrastive clicks found in Fwe. These four spectrograms all show the speech of the same 31-year-old female speaker (F31). Note that above each spectrogram is a time-aligned waveform. Waveforms are useful for detecting periods of voicing and for determining the amplitude of the click burst relative to other sounds.

(1) ⁿ-dù-ǀɔ́mà
    COP-NP₁₁-papyrus
    ‘it is papyrus’

Figure 4.1 ⁿdùǀɔ́mà ‘It is papyrus.’ (F31)

The spectrogram shows the pronunciation of ⁿdù-ǀɔ́mà, containing a voiceless oral dental click. Voicing is absent during the articulation of the closure and the release of the click.

(2) ᶢǀìmà
    ‘small fish sp.’

2  The voiceless nasal click is transcribed here with a voiceless diacritic under the click and with a superscript ŋ before the click, e.g. /ᵑǀ̥/, in preference to alternatives such as /ǀ̥̃/ and /ᵑ̥ǀ/, which are somewhat less legible.


Figure 4.2 ᶢǀìmà ‘small fish sp.’ (F31)

The spectrogram shows the pronunciation of ᶢǀìmà, containing a voiced oral dental click. The closure of the click is voiced, but voicing disappears during the click burst.

(3) mù-ᵑǀ̥ápì
    NP₃-frog
    ‘small frog sp.’

Figure 4.3 mùᵑǀ̥ápì ‘small frog sp.’ (F31)
Note: The lack of voicing of the consonants in this word can easily be seen in the waveform, which has two relatively flat periods, in contrast to the waveform in Figure 4.4, which shows a word that is voiced throughout.


The spectrogram shows the pronunciation of mù-ᵑǀ̥ápì, containing a voiceless nasal dental click. The click is voiceless throughout the closure and the click burst. Nasalization is clearly heard on the vowel preceding the click, although it cannot be excluded that this is the effect of the preceding consonant /m/ rather than the following nasal click. In all words with a voiceless nasal click, the click is preceded by a syllable with a nasal consonant. There are also click words, however, where a syllable with a nasal is followed by a syllable with an oral click, such as màᶢǀúꜝkúmù ‘fruit sp.’ and mùᶢǀɛ̂nɛ̀ ‘thin’, showing that a nasal in the preceding syllable does not always result in a nasal click.

(4) zì-ᵑǀúmàrɛ̀
    NP₁₀-berry
    ‘berries of Grewia sp.’

Figure 4.4 zìᵑǀúmàrɛ̀ ‘berries of Grewia sp.’ (F31)

The spectrogram shows the pronunciation of zìᵑǀúmàrɛ̀, containing a voiced nasal dental click. The vowel preceding the click is also nasalized. As the syllable preceding the nasal click does not contain a nasal consonant, and Fwe does not have phonemic nasal vowels, the nasalization of the vowel must be due to the influence of the nasal click. The click is voiced throughout the closure and the click burst.

The following minimal pair shows that voicing is contrastive in clicks.

(5) Minimal pair showing voicing contrast in clicks
    kù-ǀàpùr-à     vs.   kù-ᶢǀàpùr-à
    INF-tear-FV          INF-spread_limbs-FV
    ‘to tear’            ‘to spread one’s arms or legs’


Due to the small number of click words, there are no minimal pairs to prove the phonemic contrast between all four click phonemes. However, voicing and nasality of clicks are pronounced consistently by each speaker and upon each repetition of the word, showing that there is no variation in the realization of these features. This suggests that the voiceless, voiced, voiceless nasal and voiced nasal click are four separate phonemes.

Click type is not contrastive for clicks in Fwe. Clicks are most commonly realized as dental, though occasionally clicks may be pronounced as lateral or palatal. Place of articulation preferences differ from speaker to speaker. Most speakers use only the dental click type. Some speakers use two click types, the dental and the palatal click, or the dental and the lateral click. No speakers were recorded who used all three different click types. The following spectrograms show the realization of tʃì-ᵑǀûʃù ‘sore’ by three different speakers: the first uses a dental click, the second uses a palatal click, and the third uses a lateral click. The three speakers each use the nominal root -ᵑǀûʃù in a different noun class; the first speaker uses a noun class prefix zì- indicating plural, the second speaker uses a noun class prefix tʃì- indicating singular, and the third speaker uses a prefixless noun class. Individual variation in noun class assignment is quite common in Fwe, and unlikely to be related to the realization of click type.

According to Bostoen and Sands (2012) clicks in Fwe may also be realized as alveolar. I have not found this realization among the speakers I interviewed, which is likely to be related to the highly idiolectal nature of the realization of the place of articulation of clicks in Fwe.

A number of other click languages also show variation between click types, while the click accompaniments remain unaffected. The Bantu languages Manyo and Mbukushu, both spoken geographically close to Fwe though genealogically not part of the same Bantu subgroup, also display free variation between dental and other click types, but make contrastive use of voicing and nasality (Fisch 1998: 4; Möhlig 1967: 13–14). Urban Sotho and urban Zulu spoken in Soweto, South Africa, show free variation between dental and postalveolar clicks, but do distinguish the qualities of voicing and nasality (Gunnink 2014). In cases of click loss in Khoisan languages, clicks are changed to non-clicks but their voicing and nasality remain unaltered (Traill and Vossen 1997).

Although the front closure of clicks may vary between dental, palatal and lateral, the back closure of clicks is usually realized as velar. In most cases the dorsal release of the click is audibly velar; in other cases the release is less prominent.

The analysis presented here deviates slightly from a description of Fwe phonology by Baumbach (1997), who describes a voiceless aspirated click in Fwe.


Figure 4.5 zìᵑǀûʃù ‘sores’ (F31)

Figure 4.6 tʃìᵑǀûʃù ‘sore’ (M40)

Figure 4.7 ᵑǁûʃù ‘sore’ (M40)


I have not found aspirated clicks in my data. This may be due to a regional difference, as Baumbach describes Fwe spoken in Namibia.

4 Clicks in the Phonotactic Structure

Clicks in Fwe are only used in lexical stems. Clicks are most often used as the initial consonant of a lexical stem, as in (6), but in some words both the first and the second consonant of a stem are clicks, as in (7). Only one example is attested of a click in non-initial position that is not preceded by another click, as seen in (8), a word which is likely to be an ideophone.

(6) rù-ᵑǀɔ̀rɛ̀
    NP₁₁-toe
    ‘toe’

(7) rú-ꜝᵑǀáᵑǀà
    NP₁₁-sedge
    ‘sedge, reed’

(8) ɛ̀-tʃúᵑǀà
    AUG-dislike
    ‘expression of refusal, dislike’

Clicks are usually followed by a non-front vowel, /a/, /ɔ/ or /u/. Of the 87 Fwe click words that have been collected, only six click words are attested in which a click is followed by a front vowel. Some may be borrowings from Yeyi; two Fwe click words in which the click is followed by a front vowel have a possible source word in Yeyi.

(9) ᵑǀ̥íᵑǀ̥à ‘fruit of the wild date palm’
    cf. Yeyi zì-ngǀìngǀá ‘fruits of the date palm tree’ (Seidel 2008)

(10) -ᶢǀɛ̂nɛ̀ ‘thin’
     cf. Yeyi -nǀ’ene ‘thin’ (Seidel 2008)


5 Click/Non-click Alternation

A surprising characteristic of click use in Fwe is that clicks may be interchanged with non-clicks. If a click is realized as a non-click, the voicing and nasality features of the phoneme are maintained, and the place of articulation changes to velar. Voiceless oral clicks alternate with a voiceless velar plosive [k].

(11) kù-ǀàpùr-à ~ kù-kàpùr-à
     INF-tear-FV
     ‘to tear’

(12) kù-ǀáᵐpwìz-à ~ kù-káᵐpwìz-à
     INF-click-FV
     ‘to click (in anger or resentment)’

Voiced oral clicks may alternate with a voiced velar plosive [ɡ].

(13) mù-ᶢǀɛ̂nɛ̀ ~ mù-ɡɛ̂nɛ̀
     NP₁-thin
     ‘thin (person)’

(14) mà-ᶢǀúꜝkúmù ~ mà-ɡúꜝkúmù
     NP₆-melon
     ‘brown melon sp.’

Voiced nasal clicks may alternate with a velar nasal [ŋ].

(15) ᵑǀɔ́rɛ̀zà ~ ŋɔ́rɛ̀zà
     ‘resin’

(16) kù-ᵑǀúmɛ̀ⁿt-à ~ kù-ŋúmɛ̀ⁿt-à
     INF-kiss-FV
     ‘to kiss’

Only one example is attested of a voiceless nasal click which alternates with a prenasalized voiceless velar plosive [ᵑk].

(17) mù-ᵑǀ̥ápì ~ mù-ᵑkápì
     NP₃-frog
     ‘small frog sp.’
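The correspondences in examples (11)–(17) amount to a small rewrite rule: the click is dropped and a velar consonant with the same voicing and nasality takes its place. A minimal Python sketch of that rule follows. The function name `nonclick_variant` and the regular-expression encoding are illustrative assumptions, not part of the analysis itself, and the sketch assumes transcriptions that mark voicing with a superscript ᶢ, nasality with a superscript ᵑ, and voicelessness of nasal clicks with a ring below the click symbol, as in this chapter.

```python
import re

# Symbols as used in the transcriptions of this chapter.
CLICKS = "\u01c0\u01c2\u01c1"   # dental, palatal and lateral click symbols
NASAL = "\u1d51"                # superscript eng, marks a nasal click
VOICED = "\u1da2"               # superscript script g, marks a voiced click
RING_BELOW = "\u0325"           # combining ring below, marks voicelessness

# Ordered from most to least specific, so that a voiceless nasal click
# is not mistaken for a voiced nasal click or a plain voiceless click.
RULES = [
    (re.compile(f"{NASAL}[{CLICKS}]{RING_BELOW}"), NASAL + "k"),  # voiceless nasal click -> prenasalized [k]
    (re.compile(f"{NASAL}[{CLICKS}]"), "\u014b"),                 # voiced nasal click -> velar nasal
    (re.compile(f"{VOICED}[{CLICKS}]"), "\u0261"),                # voiced oral click -> voiced velar plosive
    (re.compile(f"[{CLICKS}]"), "k"),                             # voiceless oral click -> [k]
]


def nonclick_variant(word: str) -> str:
    """Return the clickless alternant of a transcribed word.

    Every click is replaced by the velar consonant that shares its
    voicing and nasality; all other material is left untouched.
    """
    for pattern, replacement in RULES:
        word = pattern.sub(replacement, word)
    return word
```

Applied to a form such as kù-ǀàpùr-à, the sketch would return kù-kàpùr-à, as in (11). Click type (dental, palatal or lateral) plays no role in the rule, in line with the observation that click type is not contrastive.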


Figure 4.8 mùɡɛ̂nɛ̀ ‘thin’ (M35)

Figure 4.9 mùᶢǀɛ̂nɛ̀ ‘thin’ (M35)

The spectrograms in Figures 4.8–4.9 show variant pronunciations of mù-ᶢǀɛ̂nɛ̀ by a 35-year-old male speaker, who produced three tokens of this word. The first two tokens were realized as mù-ᶢǀɛ̂nɛ̀, the third token was realized as mù-ɡɛ̂nɛ̀. The second and third token are shown in Figures 4.9 and 4.8, respectively.

From an articulatory point of view, the alternation between clicks and non-clicks can be analyzed as a loss of the front closure. Clicks in Fwe are articulated by closing the vocal tract at the front, for instance at the teeth in the case of the dental click, and at the velum, creating an ingressive airstream mechanism. When a click is interchanged with a non-click, the front closure is lost but the back closure remains, resulting in a consonant with an egressive airstream mechanism. As the dorsal closure of clicks in Fwe is velar, the resulting egressive consonant has a velar place of articulation. The features of voicing and nasality are unaffected by the change in airstream mechanism.

The variation between clicks and non-clicks is seen in the speech of all thirteen speakers who were interviewed, but not all speakers interchange clicks to the same degree. All speakers produced multiple tokens of each click word. Many speakers produced tokens of the same word both with a click and without a click. Table 4.2 shows for each speaker the number of words that were pronounced exclusively with a click, the number of words pronounced exclusively with a non-click, and the number of words realized with a click and with a non-click in different tokens. M indicates a male speaker, F indicates a female speaker, and numbers indicate the age of the speaker. Some speakers did not produce all nineteen click words that were asked for, either because they were unfamiliar with certain words or because they considered these words to be of a different language or dialect than their own.

Table 4.2 Click/non-click alternations for each speaker

Sex and age      # of words    # of words always     # of words always         # of words pronounced with
of the speaker   pronounced    pronounced with a     pronounced with a         a click and a non-click
                               click                 non-click
F23              16            12                    3                         1
F29              16            4                     10                        2
F31              19            15                    3                         1
M35              16            7                     5                         4
M39              14            2                     11                        1
M40              19            13                    3                         3
F43              17            11                    1                         5
F49              17            3                     5                         9
M50              16            12                    3                         1
M51              15            4                     5                         6
M56              16            3                     11                        2
M58              16            2                     9                         5
M64              15            1                     14                        –
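The figures in Table 4.2 can be cross-checked and summarized in a few lines of Python. The sketch below is illustrative only: the tuple layout and the names `TABLE_4_2`, `rows_are_consistent`, `alternation_share` and `non_alternating_speakers` are assumptions made for this example, and the column order (words produced, always click, always non-click, both) follows the reading of the table under which speaker F49 has 3, 5 and 9 words in the three categories.

```python
# (speaker, words produced, always click, always non-click, click and non-click)
# The dash in the table (speaker M64) is entered as 0.
TABLE_4_2 = [
    ("F23", 16, 12, 3, 1),
    ("F29", 16, 4, 10, 2),
    ("F31", 19, 15, 3, 1),
    ("M35", 16, 7, 5, 4),
    ("M39", 14, 2, 11, 1),
    ("M40", 19, 13, 3, 3),
    ("F43", 17, 11, 1, 5),
    ("F49", 17, 3, 5, 9),
    ("M50", 16, 12, 3, 1),
    ("M51", 15, 4, 5, 6),
    ("M56", 16, 3, 11, 2),
    ("M58", 16, 2, 9, 5),
    ("M64", 15, 1, 14, 0),
]


def rows_are_consistent(rows):
    """Check that the three categories add up to the words produced."""
    return all(total == click + nonclick + both
               for _, total, click, nonclick, both in rows)


def alternation_share(rows):
    """Map each speaker to the fraction of their words realized both ways."""
    return {speaker: both / total for speaker, total, _, _, both in rows}


def non_alternating_speakers(rows):
    """Speakers with no word that was realized both with and without a click."""
    return [speaker for speaker, _, _, _, both in rows if both == 0]
```

Under these assumptions the row totals add up for every speaker, and the share of alternating words gives one simple way of ranking speakers by degree of alternation.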


Table 4.2 shows that most speakers alternate clicks and non-clicks in certain words. Speaker F49, for instance, produced seventeen of the nineteen click words that were asked for; she pronounced three click words with a click in each token, five click words with a non-click in each token, and for nine click words, she pronounced both clicks and non-clicks in different tokens. With the exception of speaker M64, all speakers alternate clicks with non-clicks in different tokens of the same word. The degree of alternation differs greatly from speaker to speaker.

In addition to speaker-based variation, there is also word-based variation; not all words are as frequently pronounced with a click as they are with a non-click. Table 4.3 shows how many speakers realized each word with only a click, only a non-click, or both a click and a non-click. The majority of click words are realized both with and without clicks, be it by the same speaker or by different speakers. The only exceptions are mú-ꜝᵑǀúlyà ‘lizard sp.’, and ᵐbû-ǀɔ̀ ‘tastelessness’. Both are considered by many speakers to be typical of Namibian Fwe, and some Fwe speakers of the Imusho region consider mù-ʃúⁿdùkìrɛ̀ to be the Zambian Fwe alternative to mú-ꜝᵑǀúlyà, and ᵐbù-ʃámù to be the Zambian Fwe alternative to ᵐbû-ǀɔ̀.

The interchangeability of clicks and non-clicks seems to suggest that Fwe has no phonemic contrast between clicks and velar stops, and that any velar stop may be realized as a click without change in meaning. This is not the case; there is only a small subset of words with /k/, /ŋ/ and /ᵑk/ where the velar non-click can alternate with a click consonant. The vast majority of words with these phonemes do not have a click alternative. The only exception is the voiced velar stop /ɡ/, which is rarely found as a phoneme in Fwe. /ɡ/ mainly occurs as an alternant of the voiced click, though it also occurs in words that do not have an alternative pronunciation with a click.
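The by-word pattern can be checked against the counts reported in Table 4.3. The following Python sketch is an illustration under stated assumptions: words are identified by their English glosses (the two words glossed ‘resin’ are distinguished as "resin (1)" and "resin (2)"), dashes in the table are entered as 0, the three count columns are read as "only clicks", "only non-clicks" and "both", and the names `TABLE_4_3`, `click_only_words` and `totals_match` are hypothetical.

```python
# (gloss, only clicks, only non-clicks, both, speakers who produced the word)
TABLE_4_3 = [
    ("small fish sp.", 4, 8, 1, 13),
    ("papyrus", 5, 6, 2, 13),
    ("lizard sp.", 4, 0, 0, 4),
    ("small frog sp.", 7, 5, 0, 12),
    ("resin (1)", 3, 5, 1, 9),
    ("resin (2)", 1, 6, 2, 9),
    ("brown melon sp.", 5, 3, 4, 12),
    ("berries of Grewia sp.", 9, 1, 2, 12),
    ("sore", 7, 3, 2, 12),
    ("grasshopper sp.", 5, 2, 2, 9),
    ("tastelessness", 11, 0, 0, 11),
    ("to tear", 3, 8, 2, 13),
    ("to sit with arms and legs extended", 3, 7, 2, 12),
    ("to shiver, fear", 4, 6, 3, 13),
    ("to drip", 5, 1, 5, 11),
    ("to kiss", 5, 3, 5, 13),
    ("to shell groundnuts", 1, 1, 0, 2),
    ("thin (person)", 4, 5, 4, 13),
    ("to pull out, uproot", 1, 1, 0, 2),
    ("to take maize off a cob", 1, 5, 1, 7),
]


def totals_match(rows):
    """Check that the three speaker counts add up to the reported total."""
    return all(only_click + only_non + both == total
               for _, only_click, only_non, both, total in rows)


def click_only_words(rows):
    """Words that no speaker realized with a non-click in any token."""
    return [gloss for gloss, _, only_non, both, _ in rows
            if only_non == 0 and both == 0]
```

On this reading of the table, the only words never realized with a non-click are ‘lizard sp.’ and ‘tastelessness’, the two exceptions named in the text.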
Table 4.3 Click/non-click alternations for each word

Columns give the number of speakers using only clicks, the number using only non-clicks, the number using both clicks and non-clicks, and the total number of speakers who produced the word.

Only clicks   Only non-clicks   Both   Total   Word
4             8                 1      13      ᶢǀìmà ‘small fish sp.’
5             6                 2      13      rù-ǀɔ́mà ‘papyrus’
4             –                 –      4       mú-ꜝᵑǀúlyà ‘lizard sp.’
7             5                 –      12      mù-ᵑǀ̥àpì ‘small frog sp.’
3             5                 1      9       ᵑǀɔ́rɛ̀zà ‘resin’
1             6                 2      9       ᵑǀàᵑɡì ‘resin’
5             3                 4      12      mà-ᶢǀúꜝkúmù ‘brown melon sp.’
9             1                 2      12      mà-ᵑǀúmàrɛ̀ ‘berries of Grewia sp.’
7             3                 2      12      tʃì-ᵑǀûʃù ‘sore’
5             2                 2      9       tʃì-ᵑǀùnà ‘grasshopper sp.’
11            –                 –      11      ᵐbû-ǀɔ̀ ‘tastelessness’
3             8                 2      13      kù-ǀàp-ùr-à ~ kù-ǀàp-ùr-ùr-àᵃ ‘to tear’
3             7                 2      12      kù-ᶢǀàkàm-ìn-à ‘to sit with arms and legs extended’
4             6                 3      13      kù-ᶢǀâz-à ‘to shiver, fear’
5             1                 5      11      kù-ᶢǀôⁿt-à ‘to drip’
5             3                 5      13      kù-ᵑǀúmɛ̀ⁿt-à ‘to kiss’
1             1                 –      2       kù-ᵑǀâᵑk-à ‘to shell groundnuts’
4             5                 4      13      mù-ᶢǀɛ̂nɛ̀ ‘thin (person)’
1             1                 –      2       kù-ᵑǀùm-ùn-à ‘to pull out, uproot’
1             5                 1      7       kù-ᵑǀáᵑkùm-ùn-à ‘to take maize off a cob’

a  -ur- is a derivational suffix. The reduplication of the derivational suffix gives the sense of intensity. Most speakers used only the form with the reduplication of the derivational suffix.

The alternation between clicks and non-clicks is linked to regional differences. In the northern part of the Fwe-speaking area, located around the village of Sinjembela, clicks are absent; Fwe speakers from this area always use the clickless variant. In the southern part of the Fwe-speaking area, in Namibia, Fwe speakers only use the click form, and alternation between clicks and non-clicks does not take place. The click utterances used in this study come from the Imusho region, which is situated between the clickless northern region of the Fwe-speaking area, and the click-using Namibian speech variety. The Imusho region therefore appears to be a transition zone, in which both click forms and clickless forms are used.

An alternative, less likely explanation for the interchangeability of clicks and non-clicks would be that Fwe is becoming moribund. In situations of language death, languages often go through a stage of extreme phonetic variation (Campbell and Muntzel 1989; Dressler 1988). Fwe, however, is in no way endangered, and therefore language loss cannot account for the interchangeability of clicks and non-clicks.

The difference in click use between northern and southern Fwe may be related to contact with other click languages. In the northern Fwe area, speakers
are no longer in contact with speakers of other click languages; there are no Khoisan languages spoken in this region, and the only Bantu click language spoken there is Mbukushu, where clicks are very marginal phonemes (Fisch 1998). In the south, however, Fwe speakers are still in contact with speakers of the Khoisan language Khwe (Jones and Dieckmann 2014), and with speakers of Yeyi, a Bantu language in which clicks have a fairly high functional load. Click loss in northern Fwe may thus be related to contact with non-click languages, a situation also observed for click loss in various Khoe languages (Traill and Vossen 1997).

The continuum between northern Fwe and southern Fwe, resulting in the use of both click forms and non-click forms in the Fwe variety of Imusho that is discussed here, may be indicative of a sound change in progress. This sound change may be one of click loss, whereby the click is the original phoneme and the non-click the innovation, or click gain/insertion, whereby an original non-click is being replaced by a click. The following sections consider the roles of click loss and click gain in the history of Fwe.

6 Click Loss

A number of click/non-click alternations in modern Fwe can be analyzed as click loss, as their etymology shows that these words originally contained a click. Only 11 Fwe click words have an assignable Khoisan origin, in which the click was borrowed with the word from a Khoisan source, and maintained up to now. One example is the Fwe word tʃì-ᵑǀùnà ‘grasshopper sp.’, a possible borrowing from Juǀ’hoan nǂaqnùn [ǂ̃aˤnũ] ‘corn cricket’ (Gunnink et al. 2015: 227). The Fwe word tʃìᵑǀùnà is realized as tʃìŋùnà by about half the speakers (see Table 4.3). Since the source word from which the Fwe word may have been borrowed has a click, the clickless pronunciation tʃìŋùnà can be seen as an innovation, and the alternation between tʃìᵑǀùnà and tʃìŋùnà as a sign of click loss in progress. Another example of click loss is seen in the Fwe word rú-ꜝᵑǀáᵑǀà ‘sedge, reed’, a possible borrowing from Khwe ǂˈáà-ǀɡã̀ã́ ‘sedgeleaf’ (Gunnink et al. 2015: 225), which is now realized by a number of speakers as rú-ꜝŋáŋà, thus exhibiting click loss.

There are also examples of click loss in which the current Fwe word can only be realized without a click, even though it may originally have contained a click. One such example is the Fwe word mù-tʃârɔ̀ ‘buffalo thorn’, which corresponds to the Khwe word ǂx’árò ‘buffalo thorn’ (Kilian-Hatz 2003: 197). As the Fwe word mù-tʃârɔ̀ is likely copied from a source word with a click, it has at some point in its history lost its click, possibly when it was copied, possibly later. In this case, the non-click consonant that has replaced the original click consonant is not a velar. Another possible example of click loss is the Fwe word tànànà ‘Croton megalobotrys (tree sp.)’, possibly borrowed from Juǀ’hoan nǂoaqnáhàn [ǂ̃waˤnáhã̀] ‘Croton gratissimus (tree sp.)’ (Dickens 1994: 246). If these words are indeed cognate, the click in the Ju source word has been changed to a non-click when or after it was copied in Fwe. Again, the consonant replacing the click in this example of click loss is not a velar, unlike in the modern alternation between clicks and non-clicks.

7 Click Insertion

Click insertion/gain³ has also taken place in Fwe. Gunnink et al. (2015) found 9 Fwe click words with a likely Bantu origin, listed in Table 4.4. Their Bantu origin can be seen from the fact that they have clickless cognates in Bantu languages related to Fwe, such as Ila, Lenje and Tonga. Moreover, some of these Fwe click words have a Bantu reconstruction, suggesting that these are indeed native Bantu words that were originally clickless.

These historical cases of click insertion in Fwe differ from the alternation between clicks and non-clicks in modern Fwe. Based on the reconstruction and cognates in other Bantu Botatwe languages, the forms of the first three words in Table 4.4 before the click was inserted would have been -ʒàpùrà, -tɔ̂ⁿtà and -tɔ̀pɔ̀rà respectively (see Bostoen 2009), and not the clickless variants -kàpùrà, -ɡɔ̂ⁿtà, and -kɔ̀pɔ̀rà that are used in modern Fwe.

Fwe -ᵑǀùmùnà ‘to pull out, uproot’ might be a reflex of the reconstructed root *-còmʊd- ‘pull out’ (Bastin et al. 2002), in which case the form of this word before the click was inserted would have been -ʃùmùnà; the change from /o/ in the reconstruction to /u/ in the Fwe reflex, however, is not regular. Another possibility is that Fwe -ᵑǀùmùnà is a reflex of the reconstructed root *-túmʊd- ‘take (firewood) from fire; tear asunder’ (Bastin et al. 2002), in which case the form of the word before the click was inserted in Fwe would have been -fùmùnà; the change from the high tone in the reconstruction to the low tone in the Fwe reflex, however, is not regular, nor does the meaning of the reconstructed root closely resemble the meaning of the Fwe reflex. Whether the form of -ᵑǀùmùnà would have been -ʃùmùnà or -fùmùnà before the click was inserted, this differs from the current non-click variant of the word, -ŋùmùnà.

3  I am grateful to Timothy Mathes for suggesting the term “click gain” as an alternative to “click insertion”.


Table 4.4 Bantu non-click cognates of Fwe click words

Languages: Fwe, Ila, Tonga, Bantu reconstruction
-ǀàp-ùr-à ‘to tear’
-zap-ul-a ‘to tear’
-zap-ul-a ‘to tear’
*-jápʊd- ‘to tear’

Languages: Fwe, Ila, Lenje, Bantu reconstruction
-ᶢǀɔ̂ⁿt-à ‘to drip’
-londa-uk-a ‘to drip’
i-londo ‘drop’
mu-ndondo ‘drop’
*-tònt- ‘to drip’

Languages: Fwe, Ila, Tonga, Bantu reconstruction
-ǀɔ̀p-ɔ̀r-à ‘to run fast’
-lob-ok-a ‘to run away in fear when one’s fault is found out’
-lob-ok-a ‘to run secretly’
–

Languages: Fwe, Ila, Tonga, Lenje, Bantu reconstruction
-ᵑǀùm-ùn-à ‘to pull out, uproot’
-som-on-a ‘to pull sticks out from the fire’
-fum-un-a ‘to pull out as grass from thatch’
-som-on-a ‘to pull out (as a tooth)’
-sòm-òn-òn-à ‘to pull out’
*-còmʊd- ‘to pull out’
*-túmʊd- ‘to take (firewood) from fire, tear asunder’

Not only are the non-click substitutes for clicks in the words in Table 4.4 not velar consonants, as is the case in the modern click/non-click alternations, but the voicing and nasality of the non-click original consonant and its click substitute may differ. In the case of -ǀàpùrà, a voiceless click replaced a voiced consonant /ʒ/, and in the case of -ᶢǀôⁿt-à, a voiced click replaced a voiceless consonant /t/. In -ᵑǀùmùnà a nasal click replaced an earlier oral consonant /ʃ/ or /f/. After click insertion took place, the original non-click word was lost; the forms -ʒàpùrà, -tɔ̂ⁿtà, -tɔ̀pɔ̀rà, and -ʃùmùnà/-fùmùnà are no longer used. Instead a new non-click alternative is formed using the modern pattern of click/non-click alternation, whereby the non-click is velar and shares the voicing and nasality of the click it alternates with. So, although click insertion has definitely taken place at some point in the history of Fwe, the historical process of click insertion is distinct from the synchronic process of click/non-click alternation. Hence, the historical process of click insertion cannot serve to explain the present-day alternation.
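The contrast drawn above between historical click insertion and the modern alternation can be made explicit by comparing feature values. The sketch below is a minimal illustration, not the author's procedure: the feature tables `CONSONANTS` and `CLICKS`, the ASCII labels ("g" for [ɡ], "ng" for [ŋ], "nk" for [ᵑk], "zh" for [ʒ], "sh" for [ʃ]) and the functions `mismatches` and `follows_modern_pattern` are assumptions introduced here, with feature values taken from the descriptions in the text.

```python
from typing import List, NamedTuple


class Features(NamedTuple):
    voiced: bool
    nasal: bool
    velar: bool


# Non-click consonants mentioned in the text, keyed by hypothetical ASCII labels.
CONSONANTS = {
    "k": Features(voiced=False, nasal=False, velar=True),
    "g": Features(voiced=True, nasal=False, velar=True),
    "ng": Features(voiced=True, nasal=True, velar=True),
    "nk": Features(voiced=False, nasal=True, velar=True),
    "zh": Features(voiced=True, nasal=False, velar=False),
    "t": Features(voiced=False, nasal=False, velar=False),
    "sh": Features(voiced=False, nasal=False, velar=False),
    "f": Features(voiced=False, nasal=False, velar=False),
}

# The four click phonemes; their back closure is described as velar.
CLICKS = {
    "voiceless": Features(voiced=False, nasal=False, velar=True),
    "voiced": Features(voiced=True, nasal=False, velar=True),
    "voiced nasal": Features(voiced=True, nasal=True, velar=True),
    "voiceless nasal": Features(voiced=False, nasal=True, velar=True),
}


def mismatches(consonant: str, click: str) -> List[str]:
    """List the respects in which a consonant/click pair departs from the
    modern alternation (velar place, same voicing, same nasality)."""
    c, k = CONSONANTS[consonant], CLICKS[click]
    found = []
    if not c.velar:
        found.append("place")
    if c.voiced != k.voiced:
        found.append("voicing")
    if c.nasal != k.nasal:
        found.append("nasality")
    return found


def follows_modern_pattern(consonant: str, click: str) -> bool:
    """True if the pair fits the modern click/non-click alternation."""
    return not mismatches(consonant, click)
```

Under these assumptions, the modern pairs ([k], [ɡ], [ŋ] and [ᵑk] with the corresponding clicks) all pass, whereas the reconstructed pre-click consonants /ʒ/, /t/, /ʃ/ and /f/ fail at least on place of articulation.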


There are also cases of click insertion that follow the modern pattern of click/non-click alternation. The Fwe click word kù-ᵑǀɔ̂ rà ‘to write’, with its non-click alternative kù-ŋɔ̂ rà, is a case of click insertion as it is borrowed from a clickless Lozi verb ku-ŋola ‘to write’(Gunnink et al. 2015: 231). The alternation between -ᵑǀɔ̂ ra/-ŋɔ̂ ra shows the alternation between a velar nasal and a nasal click, and is a clear case of click insertion rather than loss. Another example where a click is inserted to replace a velar consonant is -ǀâmà / -kâmà ‘to milk’, a native Bantu word as attested by its reconstruction *-kám- ‘squeeze, wring’ (Bastin et al. 2002). Click gain in Fwe may have several functions. In some cases, the insertion of the click may be related to sound symbolism (Bostoen and Sands 2012), as in the verbs -ǀàpùrà ‘to tear’, and -ᶢǀɔ̂ ⁿtà ‘to drip’, which both refer to events that produce a certain noise. Other examples of click words where the use of the click may be related to sound symbolism, even though there is no clear indication that the word is of Bantu origin, are for instance -ᵑǀúmɛ̀ⁿtà ‘to kiss’, -ᵑǀáᵐpwīzà ‘to click (in anger or resentment)’, -ǀùtùrà ‘to tear off’, -ǀáʃáùkà ‘to crackle’. The use of clicks and click words in Fwe may also be related to taboo, as suggested by the use of clicks in a number of words that refer to sexual or social taboos, or to the use of language to mark a separate identity (Gunnink et al. 2015). In modern Fwe, however, clicks are no longer used for the purpose of identity marking. Even though they are surrounded by clickless languages such as Shanjo and Lozi, modern Fwe speakers do not consider clicks to be a typical feature of their language. They also do not link the use of clicks to Khoisan languages or their speakers, and have neither a positive nor negative attitude towards the use of clicks, even though most Fwe speakers consider Khoisan speakers to have a lower social status than themselves. 
Rather than an indexical or sound-symbolic function, click insertion in modern Fwe may be the result of the position of Imusho Fwe on a continuum between the clickless Sinjembela speech variety and the click-using Namibian speech variety. Confronted with a click form and a clickless form of the same word, Imusho Fwe speakers occasionally apply the allophony between clicks and non-click velar consonants to words that were originally clickless.

8 Conclusion

This chapter has discussed the click inventory of Fwe, a Bantu language spoken in southwestern Zambia and the Zambezi region in Namibia. Fwe has a small click inventory displaying an extensive amount of free variation, particularly with respect to the alternation between clicks and non-clicks. Clicks may

Gunnink

alternate with non-click consonants, in which case the voicing and nasality of the click are maintained. These alternations may be a sign of click loss or click insertion, processes that have both played a role in the history of Fwe. Examples of click loss and click insertion are found that follow the modern pattern of alternation between clicks and velar egressive consonants, but there are also cases of click loss and click insertion that do not follow this pattern. The motivation for click insertion may have been sound symbolism and the marking of a separate identity. Modern Fwe speakers, however, do not consider clicks to be a feature that distinguishes them from speakers of surrounding non-click languages, nor do they have specific positive or negative attitudes towards the use of clicks. Instead, click insertion and loss in the Fwe variety of Imusho appear to be the result of an allophony introduced by code-switching between the click-using southern variety and the clickless northern variety.

Appendix

The following list represents all Fwe click words collected by me, from speakers of both Zambian and Namibian Fwe. All words are transcribed in IPA. Surface tones are marked, except when not reliably known.

No.    Click word    Part of speech    Translation
1.    ᶢǀìmà    n    ‘small fish sp.’
2.    rù-ǀɔ́mà    n    ‘papyrus’
3.    ka-ǀɔma    n    ‘palm nut’
4.    ᶢǀúꜝkúmù    n    ‘brown melon sp.’
5.    ᶢǀúmù    n    ‘edible reed’
6.    ᵑǀàᵑɡì    n    ‘resin’
7.    (shì-)mù-ᵑǀ̥ápì    n    ‘small frog sp.’
8.    mù-ᵑǀ̥âwà    n    ‘fruit of Rhus tenuinervis’
9.    ᵑǀɔ́rɛ̀zà    n    ‘resin’
10.    mu-ᵑǀɔrɛza    n    ‘joint’
11.    mú-ꜝᵑǀúlyà    n    ‘lizard sp.’
12.    ᵑǀúmàrɛ̀    n    ‘berries of Grewia sp.’
13.    tʃì-ᵑǀùnà    n    ‘grasshopper sp.’
14.    ᵑǀuʃu    n    ‘pumpkin without mealie meal’

15.    tʃì-ᵑǀûʃù    n    ‘sore’
16.    (shí-)mùᵑǀɔ̀pwì    n    ‘fish sp.’
17.    rú-ꜝᵑǀáᵑǀà    n    ‘sedge, reed’
18.    rù-ᵑǀɔ̀rɛ̀    n    ‘toe’
19.    rù-ǀàᵑɡà    n    ‘pumpkin seed’
20.    ka-ǀuma    n    ‘cheekbone’
21.    ǀúmù    n    ‘edible white substance on a reed’
22.    n-ᶢǀapura    n    ‘problem’
23.    kà-ǀùⁿdûbì    n    ‘plant sp.’
24.    ᶢǀɔ̂ⁿdɔ̀    n    ‘plant sp.’
25.    mù-ǀɔ̂ᵐbɛ̀    n    ‘anus’
26.    (mì-)ᵑǀ̥íᵑǀ̥à    n    ‘fruit of the wild date palm’
27.    ᵑǀàᵐbùrà ᵑkúsì    n    ‘pod of a Zambezi teak tree’
28.    tʃì-ᶢǀìᶮdʒᴐ̀    n    ‘tree sp.’
29.    ᵑǀàtùrà    n    ‘sp. of wild cucumber’
30.    kù-ǀáb-ùr-ùrà    v    ‘to stick on clothes (of thorns)’
31.    kù-ᶢǀákàm-ìn-à    v    ‘to sit with legs and arms extended’
32.    kù-ǀâm-à    v    ‘to milk’
33.    kù-ᶢǀàm-ùn-à    v    ‘to take a little bit’
34.    kù-ǀàᵐpùt-ùr-à    v    ‘to dig’
35.    kù-ǀàp-ùr-à ~ kù-ǀàp-ùr-ùr-à    v    ‘to tear’
36.    kù-ǀárùmùs-à    v    ‘to warm oneself’
37.    kù-ǀárùm-ùn-à    v    ‘to stretch a fishing net’
38.    kù-ᶢǀâz-à    v    ‘to shiver, fear’
39.    kù-ǀɔ̂-h-à    v    ‘to become tasteless’
40.    kù-ᶢǀɔ̂ⁿt-à    v    ‘to drip’
41.    kù-ǀɔ̀p-ɔ̀r-à    v    ‘to run fast’
42.    kù-ǀɔ̀p-ɔ̀r-à    v    ‘to remove flesh, an eye’
43.    kù-ǀùkùm-ùn-à    v    ‘to scrub’
44.    kù-ᵑǀâⁿk-à    v    ‘to shell groundnuts’
45.    kù-ᵑǀàᵐb-ùr-à    v    ‘to strip a tree’
46.    kù-ᵑǀáⁿkùm-ùn-à    v    ‘to take maize off a cob’
47.    kù-ǀáᵐpwìz-à    v    ‘to click (in anger or resentment)’

48.    kù-ᵑǀɔ̂r-à    v    ‘to write’
49.    kù-ᵑǀúmɛ̀ⁿt-à    v    ‘to kiss’
50.    kù-ᵑǀùm-ùn-à    v    ‘to pull out, uproot’
51.    kù-ᵑǀûr-à    v    ‘to offend someone by clicking’
52.    kù-ᵑǀásà-ùr-à    v    ‘to eat reluctantly’
53.    kù-ǀùt-ùr-à    v    ‘to tear off’
54.    kù-ᵑǀɔ́ᵑǀɔ̀wɛ̀z-à    v    ‘to eat or drink slowly’
55.    kù-ᵑǀám-à-ᵑǀàm-à    v    ‘to touch something lightly, to work slowly’
56.    kù-ǀáʃá-ùk-à    v    ‘to crackle, sparkle’
57.    kù-ᵑǀúᵐb-ùk-ìr-à    v    ‘to fall beyond reach’
58.    kù-ᵑǀâᵐp-à    v    ‘to be flat (of stomach)’
59.    ku-ᵑǀumɛⁿd-a    v    ‘to mend’
60.    kù-ᶢǀɔ̀tɔ̀m-ɔ̀n-à    v    ‘to scrub someone’s skin clean’
61.    kù-ǀɔ́ᵐb-ɔ̀r-à    v    ‘to pick a spoonful of honey’
62.    kù-ᵑǀɔ́ⁿzɔ̀r-ɔ̀k-à    v    ‘to be slimy (of okra)’
63.    kù-ᶢǀàp-ùr-à    v    ‘to stand with legs apart’
64.    kù-ᶢǀɛ́nɛ̀-h-ɛ̀s-à    v    ‘to make thin’
65.    -ᶢǀɛ̂nɛ̀    adj    ‘thin’
66.    kù-ǀán-à-ùn-à    v    ‘to divide, share’
67.    kù-ǀàrùm-ùn-à    v    ‘to search someone else’s property’
68.    kù-ᶢǀàⁿd-ùk-à    v    ‘to disperse’
a.    kù-ᶢǀìⁿt-à    v    ‘to hop’
69.    kù-ᶢǀìⁿt-ùr-à    v    ‘to lie with bent knees’
70.    kù-ᵑǀáᵑkùs-ùr-à    v    ‘to struggle free; to prise open’
71.    kù-ᵑǁáràᵑǁàs-à    v    ‘to rummage (noisily)’
72.    kù-ᵑǀùᵐp-ìk-à    v    ‘to cover a hole (in which a pole is planted)’
73.    kù-ᵑǀúᵐpw-ìk-à    v    ‘to throw in water (of inanimate object)’
74.    ᵑǀùᵐpwí    ideophone    ideophone of falling in water
       kù-ǀùt-ùr-à    v    ‘to bite off a piece of tough or undercooked meat’
75.    kù-ǀàp-ìk-à    v    ‘to apply plaster to a wall’
76.    kù-ᶢǀárùm-ùk-à    v    ‘to shout loudly’
77.    kù-ᶢǀíⁿkìt-à    v    ‘to pound with short, sharp movements (for instance pumpkin seeds)’
78.    kù-ᶢǀᴐ̀p-ᴐ̀k-à    v    ‘to widen (of a hole)’

79.    kù-ᵑǀàᵐb-ùk-à    v    ‘to burst (of the pod of a Zambezi teak tree)’
80.    kù-ᵑǀᴐ̂ᵐp-à    v    ‘to taste (by sticking one’s finger in a dish and licking it)’
81.    kù-ᵑǀᴐ̀ⁿd-ᴐ̀r-à    v    ‘to take a fingerful of something’
82.    kù-ᵑǀᴐ̀ᵑɡᴐ̀m-ᴐ̀n-à    v    ‘to hollow out’
83.    kù-ᵑǀûm-à    v    ‘to suck out blood (as treatment for pain, a snake bite, or a curse)’
84.    ku-ǀaǀas-a    v    ‘to rustle’
85.    -ǀɔ̀    adj    ‘tasteless’
86.    ᵑǀumarɛ    adv    ‘very far’
87.    ɛ̀-tʃúᵑǀà    ideophone    expression of refusal, dislike

References

Bastin, Yvonne, André Coupez, Evariste Mumba, and Thilo C. Schadeberg, eds. 2002. “Bantu lexical reconstructions 3 / Reconstructions lexicales bantoues 3.” Tervuren: Royal Museum for Central Africa, online database: http://linguistics.africamuseum.be/BLR3.html.

Baumbach, E.J.M. 1997. “Bantu Languages of the Eastern Caprivi.” Namibian languages: Reports and Papers, ed. Wilfrid H.G. Haacke and Edward E. Elderkin. Cologne: Rüdiger Köppe.

Bostoen, Koen. 2009. “Shanjo and Fwe as part of Bantu Botatwe: a diachronic phonological approach.” Selected Proceedings of the 39th Annual Conference of African Linguistics, ed. Akinloye Ojo and Lioba Moshi. Somerville: Cascadilla Proceedings Project, pp. 110–130.

Bostoen, Koen, and Bonny Sands. 2012. “Clicks in south-western Bantu languages: contact-induced vs. internally motivated change.” Proceedings of the 6th World Congress of African Linguistics Cologne 2009, ed. M. Brenzinger and Anne-Maria Fehn. Cologne: Rüdiger Köppe, pp. 129–140.

Campbell, Lyle, and Martha C. Muntzel. 1989. “The structural consequences of language death.” Investigating Obsolescence: Studies in Language Contraction and Death, ed. Nancy C. Dorian. Cambridge: Cambridge University Press, pp. 181–196.

de Luna, Kathryn. 2010. “Classifying Botatwe: M60 languages and the settlement chronology of south central Africa.” Africana Linguistica 16: 65–96.

Dickens, Patrick. 1994. English-Juǀ’hoan, Juǀ’hoan-English Dictionary. Cologne: Rüdiger Köppe.

Dressler, Wolfgang U. 1988. “Language death.” Language: The Cambridge Survey IV, ed. Frederick J. Newmeyer. Cambridge: Cambridge University Press, pp. 184–192.

Fisch, Maria. 1998. Thimbukushu Grammar. Windhoek: Out of Africa Publishers.

Gunnink, Hilde. 2014. “The grammatical structure of Sowetan tsotsitaal.” Southern African Linguistics and Applied Language Studies 32: 161–171.

Gunnink, Hilde, Bonny Sands, Brigitte Pakendorf, and Koen Bostoen. 2015. “Prehistoric language contact in the Kavango-Zambezi transfrontier area: Khoisan influence on southwestern Bantu languages.” Journal of African Languages and Linguistics 36: 193–232.

Jones, Brian T.B., and Ute Dieckmann. 2014. “Caprivi Region.” ‘Scraping the Pot’: San in Namibia Two Decades After Independence, ed. Ute Dieckmann, Maarit Thiem, Erik Dirkx and Jennifer Hays. Windhoek, Namibia: Land, Environment and Development Project of the Legal Assistance Centre and Desert Research Foundation of Namibia, pp. 399–422.

Kilian-Hatz, Christa. 2003. Khwe Dictionary. Namibian African Studies 6. Cologne: Rüdiger Köppe.

Maho, Jouni F. 2009. “NUGL Online: The online version of the New Updated Guthrie List, a referential classification of the Bantu Languages (Accessed June 4, 2009).” https://brill.com/fileasset/downloads_products/35125_Bantu-New-updated-Guthrie-List.pdf.

Möhlig, Wilhelm J.G. 1967. Die Sprache der Dciriku: Phonologie, Prosodologie und Morphologie. Cologne: Walter Kleikamp.

Pakendorf, Brigitte, Hilde Gunnink, Bonny Sands, and Koen Bostoen. 2017. “Prehistoric Bantu-Khoisan language contact: A cross-disciplinary approach.” Language Dynamics and Change 7: 1–46.

Sakuhuka, Clifford, Donald Mulimba, and James Lucas. 2011. “Report of a survey for five languages in the Barotseland / Western Zambia Region: Fwe, Kwamakoma, Kwamashi (and Akakwandu), Kwangwa.” Mongu: Luke Partnership, Western Zambia.

Seidel, Frank. 2008. A Grammar of Yeyi. Cologne: Rüdiger Köppe Verlag.

Traill, Anthony, and Rainer Vossen. 1997. “Sound change in the Khoisan languages: new data on click loss and click replacement.” Journal of African Languages and Linguistics 18: 35.

Chapter 5

Perception of Non-native Click Consonant Contrasts: Implications for Theories of Speech Perception

Catherine T. Best

© Koninklijke Brill NV, Leiden, 2020 | doi:10.1163/9789004424357_006

1 Introduction

This chapter reviews the range of existing findings on perception of click consonants and contrasts by non-native adult listeners whose native languages do not employ click consonants (i.e. languages that do not use the velaric mechanism for consonant production) versus those who speak click languages other than the stimulus target language. As we will see, the patterns observed in perception of non-native click contrasts by listeners of click languages versus those of non-click languages challenge both classic and current theoretical frameworks on experiential effects in speech perception. Specifically, listeners of non-click languages generally perceive the click bursts as non-speech sounds, failing to perceptually assimilate them to native consonant categories and thus discriminating many – though not all! – click contrasts quite well. Conversely, listeners of other click languages tend to hear those same non-native clicks as consonants, perceptually assimilating them to native click categories. As a result of perceptual assimilations to native clicks, non-native click language listeners may actually discriminate certain non-native click contrasts less well than naïve listeners of non-click languages, even though the latter listeners clearly have less overall experience with clicks, especially in linguistic contexts.1

1   However, some languages have only 1–2 click words (e.g. Namibian Afrikaans and Shekgalagadi; see Sands and Gunnink 2019), suggesting that clicks may serve a paralinguistic role in the language rather than a phonological one (my thanks to Bonny Sands for this suggestion). In such cases of very low functional load for clicks, non-native clicks are more likely to be perceived as nonspeech sounds, and non-native click contrasts thus discriminated accordingly, than to be assimilated to the exceedingly rare native clicks (see Best, Traill, Carter, Harrison, and Faber 2003, discussed later in this chapter).

We begin by providing relevant background on key distributional, articulatory and acoustic characteristics of clicks that are thought likely to affect their perceptibility. We will turn next to theoretical perspectives on the ways in which language-specific experience is expected to affect speech perception, in this case the perception of non-native click consonants. Following that background, we summarize findings on non-native click perception by adult speakers of non-click and click languages, as well as on early developmental changes in infants’ perception of non-native click consonants. Finally, the implications of these adult and developmental findings for theories of speech perception and of its attunement by language experience will be discussed.

2 Characteristics of Clicks That Could Affect How They Are Perceived

Click consonants capture attention, more so than any other consonant class. That they do is perhaps not so surprising, given that abrupt sounds with rapid rise and fall in amplitude in the external environment, e.g. the snap of a dry branch or the pop of a cork or balloon, can literally turn a listener’s head. In spoken languages, click consonants are double-released stops in which the ‘click’ is the first noise burst that results from releasing the anterior closure in advance of the posterior closure release. That initial articulatory action rapidly forms and then breaks an intraoral vacuum pocket, causing a sudden air influx (a suction release sound) at one of five main places: alveolar, palato-alveolar, alveolar-lateral, dental or bilabial. This is followed closely by an egressive posterior release (the efflux), which may be a simple stop or have various “accompaniments” such as nasalization, frication or affrication, usually velar or uvular (or dental, in the case of bilabial clicks). The influx burst of most clicks is much higher in amplitude than the surrounding vowels, an acoustic asymmetry that is quite the opposite of the intensity of other consonant classes relative to their vocalic (or other sonorant) neighbors. There are, however, exceptions to that amplitude characterization, in both directions: On the non-click side, for example, the amplitude of the alveolar fricative /s/ can be quite high (though still generally not exceeding that of surrounding vowels/sonorants). Among the clicks, conversely, the bilabials have low-amplitude bursts due to the small/non-existent resonating cavity in front of the anterior closure coupled with a large cavity posterior to that closure (back cavity) (Traill 2002a; see also acoustic differences across click places: Marquard et al. 2015). Thus, it is interesting to note that bilabial clicks are rare even in click languages, relative to other places of click articulation (Ladefoged and Traill 1994; Miller-Ockhuizen 2003). 
And even in those languages that use them, bilabial clicks occur in relatively few lexical items, that is, they carry a

low functional load (see Sands and Gunnink 2019), which may well be related to their relatively weak acoustic properties. A language-universal inverse relationship has been posited between the perceptual salience2 of phonological segments and their stability over time and geographic distribution within a language, i.e. that the less perceptually salient a segment is, the more likely it is to undergo loss/change over time and distance (e.g. Hayes and Steriade 2004; Hume and Johnson 2001; Kingston and Shinya 2003; Steriade 2001; Wright 2001). The rarity of bilabial clicks relative to the other click places of articulation is compatible with the perceptual salience hypothesis: they certainly have lower acoustic/perceptual salience than the other places of articulation for clicks (e.g. Ladefoged and Traill 1994). This characterization is further supported by evidence that ǃXóõ bilabial clicks are poorly identified even by native speakers, who instead show excellent identification of natural tokens of dental, alveolar, alveolar-lateral, and palato-alveolar clicks, and good categorical perception of synthesized versions (Traill 2002a). Findings reviewed later in this chapter on discrimination of non-native click contrasts both by speakers of non-click languages and by speakers of other click languages are also consistent with this same pattern of relative discriminability of bilabial versus non-labial click places. The general hypothesis about the relation between perceptual salience and language variation and change is, however, seriously challenged by the very low incidence of even the other, highly perceptually and acoustically salient clicks across the phonemic inventories of the world’s languages (~1% of languages: Maddieson 1984). Phonemic clicks are not only of low incidence in phonological systems, they are also quite restricted geographically. 
They are used phonemically only in a subset of southern and eastern African languages, with claims for just a single continental exception, Damin in Australia (Güldemann 2007; Sands and Güldemann 2009).3 This posited exception is of extremely limited scope, however, as it seems dubious that Damin had its own full phonological inventory or included minimal lexical contrasts distinguished only by their clicks, given that it was the now-extinct male ceremonial register of the Australian indigenous language Lardil (Hale 1973; Hale and Nash 1997), and is believed to have had only ~200 words.

2  ‘Perceptual salience’ is a term used in research both on cross-language perception and on language variation and change, referring to phonemes whose acoustic properties are thought to be easy to detect and to discriminate from other, contrasting phonemes. Unfortunately, definitions are often circular and “salience” is seldom directly tested.

3  Bilabial clicks have been reported to occur in some non-click African languages as allophones of other consonants, usually labiovelar stops, which are also double-articulated stops (Doke 1931; Ladefoged 1964/1968).

Meanwhile, although claims have been offered that European languages may display some incidences of clicks, these cases are clearly quite different to phonemic clicks, which, for one thing, are coarticulated with vowels. The clicks claimed to arise in some European languages are described as low-amplitude emergent properties resulting from the overlapping articulation of two adjacent (non-click) consonants (e.g. Marchal 1987; Ohala 1995),4 rather than as phonemes or allophones in these languages. Relatedly, clicks as non-linguistic human vocal tract gestures are, in fact, widely distributed across languages and around the globe as paralinguistic (non-speech) sounds, e.g. dental tsk-tsk (scolding, disapproving, etc.), bilabial kissing sound, alveolar-lateral click for emphasis, etc. (e.g. Gil 2005; Sands and Güldemann 2009), though not universally (paralinguistic clicks reportedly do not occur in, e.g. Kinubi, an Arabic creole of Kenya: Luffin 2005; Singapore Bazaar Malay: Khin Khin 2005; Media Lengua, Ecuador: Muysken 1997). They also arise as paralinguistic discourse markers in non-click languages, e.g. indicators of turn-taking, word-searching or phrase-organizing in conversation (Ogden 2013; Wright 2011). Such observations suggest the dearth of clicks in phonological inventories is unlikely to be due to difficulties in either perceiving or producing them (see also Engstrand 1997). Thus, all that is certain is that click consonants are found only in the phonological inventories of certain southern and eastern African languages.

4  It is usually some kind of dental or alveolar click that emerges through consonant overlap, generally of low amplitude. It is likely that speakers or listeners are generally not consciously aware of such emergent clicks, although they might be able to direct attention to them if explicitly instructed to do so. Some researchers speculate that such cases of consonantal overlap might be one way that clicks can enter a language as phonemes, but testing that possibility would be difficult or impossible. (Thanks to Bonny Sands for raising these points.)

If perceptual salience were a major driving force in establishing a given phonetic type in a language and maintaining it over time and distance (i.e. across language change and variation), as has been suggested (e.g. Ohala 1993a, b; cf. Dinnsen 1983; Beddor et al. 1986), we should expect to see more widespread use of clicks across languages, and less loss/replacement of clicks, than is found with other consonants. But that is not what is observed – clicks are quite rare across languages, while conversely they appear to undergo loss or replacement in those languages at the same rate and according to the same principles as other types of consonants (e.g. Baumbach 1974; Fehn, this volume; Bostoen and Sands 2009; Gunnink 2014; Sands 2007; cf. Traill and Vossen 1997). Moreover, it should be noted that click loss/change is not observed more frequently for the bilabial clicks than for the other clicks that display higher ‘acoustic salience’ (e.g. Traill 2002b), as the perceptual salience hypothesis would predict. Thus, the dearth of languages with clicks in their consonantal inventories, and the patterns of click loss and change, cannot be attributed to low perceptual salience.

Evidence indicates that, despite their rarity, clicks as a class are indeed particularly salient even to native listeners, i.e. attention-getting, easy to detect, and easy to categorize and discriminate from other types of consonants (e.g. Ladefoged and Traill 1994; Traill 1997). They strongly tend to occur in word-initial position, which could assist listeners in perceptually segmenting words from connected speech. Click consonants in other word positions are much rarer. Clicks occur frequently in the lexicons of some of the core click languages: 50–70% of words5 in certain languages of the Khoe, Ju and Tuu families have clicks (see Güldemann 2007; Pakendorf et al. 2017). In addition, click languages in the core families often have fairly large click inventories, with many “accompaniments” (secondary feature distinctions such as velar or uvular frication). Another indicator of the perceptual salience and conscious awareness of clicks for speakers of these languages is that they have specific lexical items to refer to extensive use of clicks, incorrect pronunciations of clicks, and/or different types of clicks (Sands 2007), whereas this is unusual for other consonant classes, aside from the lexicons of speech therapists and language specialists. Additionally, click language speakers have been reported to replace clicks with ‘plain’ stop consonants for sociolinguistic reasons, e.g. to sound more like members of a higher status, non-click majority language (e.g. Bostoen and Sands 2012; Wilmsen and Vossen 1990), which implies they are perceptually salient to the speakers of both the click and non-click languages.
Indeed, empirical findings on click consonant perception support the notion that not only speakers of click languages, but also speakers of non-click languages find clicks perceptually compelling, as will be described and discussed in the remainder of this chapter. Non-native listeners who are naïve to clicks as consonants are likely to perceive them as non-speech sounds that seem to be somehow independent of the flow of the spoken linguistic message carried by the other consonants and vowels – that is, to constitute a separate “stream” from that of the linguistic message.

5  However, lexical frequency for words with clicks varies widely in other language families, reaching only 12–25% of the lexicon or even much lower (see Güldemann 2007; Sands and Gunnink 2019), and the inventories of click places and accompaniments are often much simpler in those other languages (i.e. outside of the Khoe, Ju, Tuu families), such as those that adopted clicks through contact (e.g. Bantu languages).

It is easy to see, then, why click consonants hold considerable intrigue for linguists and psycholinguists. As discussed in other chapters of this volume (see Bennett; Fehn; Gunnink; Sands; see also, e.g. Sands 2007; Traill 1997; Traill and Vossen 1997), their articulatory and acoustic characteristics, their phonological behaviors, and their rarity and restricted distribution across languages have the potential to offer unparalleled insights into the limits and internal coherence of theoretical accounts of phonological representations and processes, including diachronic change and variation. Complementary to those linguistic challenges, the phonetic and phonological characteristics of click consonants and contrasts also offer unparalleled opportunities for examining competing psycholinguistic hypotheses about perception of non-native speech contrasts, e.g. of likely acoustic versus articulatory contributions. My initial cross-language click perception study (1988) was, in fact, the catalyst for developing the Perceptual Assimilation Model of non-native speech perception (PAM: Best 1995; Best, McRoberts and Sithole 1988; Best, McRoberts, LaFleur and Silver-Isenstadt 1995), more recently extended to L2 learners and bilinguals (Antoniou et al. 2013; Antoniou et al. 2012; Best and Tyler 2007).

3 Theoretical Perspectives

Those collective characteristics of click consonants offer unique opportunities to evaluate core theoretical hypotheses about speech perception, in particular whether it relies on specialized linguistic or basic auditory processes, and how it is influenced by language experience. Given that human vocal tract clicks are phonemic elements in only a small and geographically circumscribed group of languages, but occur more widely across cultures as paralinguistic “mouth sounds” (see above),6 they offer unique opportunities for teasing apart how specifically linguistic versus non-linguistic auditory experience contribute to perception of non-native speech contrasts. In addition, because speakers who are naïve to clicks as consonants strongly tend to perceive them as non-speech sounds (e.g. Best et al. 1988), click consonants provide a test case for assessing perception of the same target sounds as linguistic units versus as non-linguistic sound-producing events. These questions can be investigated using both behavioral methods and electrophysiological/imaging studies designed to pinpoint activation of language-specific versus general auditory systems in the brain.

6  Note that voiceless frication sounds can also sometimes be used paralinguistically, e.g. a “shush” (akin to a long [ʃ] without a vowel) to calm a baby or to quiet someone down. However, voiceless fricatives are much more widely distributed as phonemic elements across the world’s languages than clicks are, and languages that have voiceless fricative phonemes use frication both linguistically and paralinguistically, e.g. English uses /s/ and /ʃ/ in, e.g. see and she, as well as in long “shhh!” and “sss!” without any vowel context, to quiet someone down and to indicate disapproval, respectively.

The issue of whether speech perception is handled by language-specialized mechanisms and processes (e.g. Liberman et al. 1967; Liberman and Mattingly 1985, 1989; Best and Avery 1999; Dehaene-Lambertz et al. 2005), or whether instead general auditory processes can fully account for the observed perceptual patterns (e.g. Belin et al. 1998; Binder et al. 2000; Diehl and Kluender 1989; Joanisse and Gati 2003) has been a long-standing, fundamental debate in speech perception research. Perception of click consonants can provide novel insights on the issue. Differences in perception between speakers of non-click languages, who are naïve to clicks as consonants, versus speakers of click languages, who may in turn differ depending on whether the target clicks are used in their native language or are non-native, would be compatible with specialized linguistic processes. Moreover, such group differences would indicate that those processes are maintained or tuned up by language-specific experience with clicks as consonants. But if, instead, those linguistically disparate non-native listener groups nonetheless show essentially the same performance in perceptual tasks with click consonants, despite the dramatic differences in their experience of clicks as linguistic elements, this would suggest that general auditory processes alone may suffice to account for observed patterns in cross-language speech perception. Click consonant perception is clearly germane to understanding the impact of language-specific versus general auditory experience on speech perception.
Three theoretical models of cross-language speech perception should be considered with respect to predictions and interpretations of non-native click consonant perception. They include the Speech Learning Model (SLM: e.g. Flege 1995; Aoyama et al. 2004; Guion et al. 2000) and the Native Language Magnet model (NLM: e.g. Kuhl and Iverson 1995; and its update, NLM-e: Kuhl and Rivera-Gaxiola 2008; Kuhl et al. 2008). Our discussion will focus, however, on the third model, the Perceptual Assimilation Model (PAM: Best 1995; Best et al. 1988; and its extension to L2 learning: Best and Tyler 2007), which has generated nearly all cross-language click consonant perception studies to date. We will briefly describe and consider the actual or extrapolated predictions of SLM, NLM and PAM for perception of native versus non-native click consonants, as grounding for our following discussion of existing cross-language findings on click consonant perception.

We begin with SLM (Flege 1995; Aoyama et al. 2004; Guion et al. 2000), which posits that any L2 phoneme that is perceived as identical or similar to an L1 phoneme will be perceptually equivalence classified as a member of that L1 category. On the other hand, if learners discern phonetic differences between an L2 phoneme and the phonetically closest L1 phoneme they may form a new L2 category, the likelihood of which correlates with the magnitude of perceived phonetic distance between the L1–L2 phones. What might SLM predict about naïve listeners’ perception of non-native click consonants, including their discrimination of minimal click contrasts? Extrapolation rather than straight prediction from SLM principles is needed, because SLM has not addressed the possibility that a non-native phoneme subclass may fall fully outside the native phonological space, as do click consonants for speakers of non-click languages. It does not anticipate the possibility of speech segments being perceived as non-speech sounds, i.e. as categorically unconnected in any way to phonological categories. Those issues noted, some extrapolations can be drawn. For adult speakers of non-click languages, in SLM all click consonants should be “new”, as they will all be perceptibly dissimilar from any native consonants. For naïve speakers of other click languages, however, non-native target clicks should be perceived as consonants, and either equivalence classified to the closest L1 click (“similar”) or perceived as deviating from even the closest native click (“new”), depending on whether or not phonetic differences from the closest L1 click(s) are discerned. According to SLM principles, these listeners should be easily able to form L2 categories for the “new” clicks, and should discriminate between “similar” and “new” clicks. However, SLM has not considered the initial state of inexperienced perception, i.e.
in infants, and the trajectory of language-specific “tuning” over the first year or so. Therefore, it is not possible to make SLM predictions about early developmental changes in perception of non-native clicks, an issue that is crucial to addressing the effects of linguistic versus auditory experience on perception of click consonants. In contradistinction to SLM, NLM focuses almost exclusively on infant perception of non-native versus native speech, and posits that a critical period in neural maturation is the source of early developmental changes in perceptual sensitivity to experienced versus unexperienced acoustic properties of speech (e.g. Kuhl and Iverson 1995; Kuhl and Rivera-Gaxiola 2008; Kuhl et al. 2008). It posits that perceptual prototypes become established in infancy by experience with the categories used in the native language, via statistical learning mechanisms, which results in a warping of perceptual space around the prototypes. Native prototypes then begin to serve as perceptual magnets, attracting similar

tokens in perceptual categorization while increasing sensitivity to inter-token differences near the category boundaries. This leads to better discrimination among within-category variations around non-prototypes than around prototypes (discrimination asymmetry).7 For acoustic properties of speech that are not experienced during the critical period, prototypes fail to emerge, boundaries weaken, and neural commitment fails, resulting in reduced sensitivity to non-native contrasts. Like SLM, the NLM fails to anticipate that clicks could be perceived either as consonants or as non-speech sounds, and that this difference might affect performance. Thus, we can only extrapolate that NLM would make the same predictions for clicks as for any other type of non-native consonant, i.e. due to lack of exposure to non-native clicks, perceptual sensitivity to acoustic differences among non-native clicks should decline notably after the critical period, regardless of presence/absence of other clicks in the listener’s language, and of whether they are perceived as speech or non-speech. 
7  The magnet effect is also claimed to be uniquely human, as monkeys do not display an asymmetry in discrimination between prototypical and non-prototypical tokens of a speech category (Kuhl 1991). However, this reflects a tautological flaw. Given that monkeys do not have the amount or type of early exposure nor the social experience with the target language that NLM claims are the foundations for formation of phonetic category prototypes and perceptual warping, the NLM prerequisites are not met. Thus, speech prototype effects cannot by definition be critically evaluated in non-human animals.

PAM differs from both SLM and NLM in crucial ways that make it the optimal model for predicting and evaluating perception of click consonants by non-native listeners of both L1 types: 1) PAM alone makes specific predictions of variations in assimilation and discrimination of different non-native contrasts based on principled considerations of phonetic-articulatory similarities and differences as well as phonological fit between the non-native phones and the listener’s native phonological inventory; 2) PAM assumes that perceptual attunement to the native language involves discovering through experience the phonetic-to-phonological mappings for the native phonological system, which it alone posits are organized around the perceived articulatory properties of native phonemes; 3) following from those assumptions, only PAM distinguishes among straightforward assimilation of a non-native phone to a single given native phoneme (Categorized) versus weaker, less-categorical assimilation distributed across two or more native phonemes (Uncategorized), versus failure to assimilate certain types of non-native phonemes to the native phonological system at all, resulting in perception of them as non-speech sounds (Non-Assimilable). The Non-Assimilable type is exemplified by perception of clicks by listeners of languages that do not employ the click mechanism in their consonantal systems, whereas non-native clicks would much more likely
be assimilated as Categorized or Uncategorized consonants (i.e. as speech) by listeners whose native language employs clicks. PAM states that the assimilation types for each of the two phonemes in a non-native contrast result in the following assimilation patterns for minimal pair segmental contrasts, which correspondingly predict relative discriminability of each contrast type:

1. In Single Category (SC) assimilation both phonemes of a non-native contrast are categorized as equally good/fair/poor exemplars of the same single native category and discrimination is thus expected to be poor.
2. In Two Category (TC) assimilation, on the other hand, each phoneme of a non-native contrast is categorized as a different native phoneme, thus the pair is assimilated to a native phonological contrast and accordingly discrimination will be excellent.
3. In a Category Goodness (CG) difference assimilation, both non-native phonemes are categorized to the same native phoneme but differ significantly in perceived phonetic goodness of fit, which is expected to result in good discrimination of the pair, rather than poor (SC) or excellent (TC) discrimination.
4. If the two non-native phonemes are Uncategorized (as described above) and Categorized, respectively (UC assimilation), discrimination should be very good if the Uncategorized one is never heard as the same native phoneme as the Categorized one, but should be lower to the extent that there is overlap in which native phonemes are perceived for the Uncategorized and the Categorized item.
5. If both members of a non-native contrast are Uncategorized (UU), discrimination levels are similarly predicted to vary depending on the amount of overlap between the native categories perceived in the two non-native phonemes, i.e. very good discrimination if there is no native category overlap, poor if the same subset of native categories is perceived in both non-native phonemes, and intermediate if some but not other perceived native categories overlap for the two non-native phonemes.
6. Importantly for perception of non-native clicks by speakers of non-click languages, if the articulatory-phonetic properties of a non-native contrast deviate dramatically from anything in the native phonological system, its members will not be assimilated as phonemes but will instead be perceived as non-speech sounds (Non-Assimilated contrast: NA). Discrimination performance should range from excellent to poor depending on the magnitude of perceived non-phonetic auditory differences between the non-native items.
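The six contrast types can be read as a small decision rule. The following Python sketch is only an illustration of that reading, not an implementation taken from the PAM literature: the `Percept` encoding, the `goodness_gap` threshold and the `"mixed"` label are all assumptions introduced here, and the predicted outcomes simply paraphrase the list above.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Percept:
    """Hypothetical encoding of how a listener assimilates one non-native phone."""
    native: FrozenSet[str]   # native categories it is heard as; empty = non-speech
    goodness: float = 0.0    # goodness-of-fit rating (used only for one category)


def status(p: Percept) -> str:
    """C = Categorized, U = Uncategorized, N = Non-Assimilable."""
    if not p.native:
        return "N"
    return "C" if len(p.native) == 1 else "U"


def contrast_type(a: Percept, b: Percept, goodness_gap: float = 1.0) -> str:
    """Label a two-phone contrast; goodness_gap is an arbitrary illustrative cut-off."""
    sa, sb = status(a), status(b)
    if sa == "N" and sb == "N":
        return "NA"
    if "N" in (sa, sb):
        return "mixed"  # speech vs. non-speech: not one of the six listed types
    if sa == "C" and sb == "C":
        if a.native != b.native:
            return "TC"
        return "CG" if abs(a.goodness - b.goodness) >= goodness_gap else "SC"
    if sa == "U" and sb == "U":
        return "UU"
    return "UC"


# Predicted discrimination, paraphrasing the numbered list in the text.
PREDICTED = {
    "SC": "poor",
    "CG": "good",
    "TC": "excellent",
    "UC": "very good, lower with more native-category overlap",
    "UU": "very good to poor, depending on native-category overlap",
    "NA": "excellent to poor, depending on perceived auditory difference",
}
```

For instance, two phones both heard as /b/ with clearly different goodness ratings come out as CG, while two phones heard as non-speech come out as NA, whatever their acoustic similarity.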

Thus PAM, unlike the other two models, provides specific predictions for perception of non-native clicks either as consonants (i.e. perceived as phonological segments) or as non-speech sounds (not perceived as phonological elements), with clear implications for discrimination of click contrasts by the two subtypes of non-native listeners: those whose native languages have versus lack click consonants. For speakers of click languages other than the stimulus language, PAM’s predictions about TC, CG and SC (or UU/UC) assimilations and discrimination patterns should be observed, with the patterns depending on the phonetic-articulatory and phonological similarities and differences between the target clicks and the tested non-native click contrasts, as has been found for many other types of non-native consonants (e.g. Best et al. 2001; Best and Hallé 2010; Best and Strange 1992; Bohn and Best 2012; Hallé, Best and Levitt 1999; Tyler, Best, Goldstein and Antoniou 2014; see also Hallé and Best 2007). Speakers of non-click languages, on the other hand, should perceive click consonants as non-speech sounds (NA) rather than as consonants, given how strikingly they differ from the consonant types in their native inventories. That is, they are expected to perceive clicks qualitatively and quantitatively differently from listeners of other click languages. This leads to what may initially seem a paradoxical prediction: for click contrasts that other click language listeners assimilate as Single Category and thus discriminate poorly, non-click language listeners may show significantly higher discrimination. One might expect, by extension, that non-click language listeners should perform similarly in perception of click consonants they hear as non-speech sounds, and of non-speech environmental/mechanical clicks and other rapid percussive sounds (e.g. 
finger snap, click of a tap shoe or high heel, twig snap, cork pop, snap of an elastic band against a hard surface, the tick-tocks of analog clocks). Unfortunately, there is virtually no published research on categorization and discrimination among environmental or mechanical clicks that could serve as a basis for evaluating that expectation (though see Guastavino 2007; Gygi et al. 2007; Lemaitre et al. 2010, for perceptual categorization of environmental sounds more generally). Instead, non-speech clicks have been employed in auditory studies mostly to examine sound localization (e.g. Hafter and Dye 1983; Yost et al. 1971; Wightman and Kistler 1997; Hartmann and Rakerd 1993) and brain activation to abrupt, transient acoustic properties (e.g. Howard and Poeppel 2009). More germane to the divide between speech perception and non-speech auditory perception, however, clicks have also been employed to investigate how perception of spoken sentences or phrases is organized by their grammatical structure (e.g. Abrams and Bever 1969). In those studies, mechanical click

sounds were superimposed at various points in utterances. English-speaking listeners were asked to respond as soon as they heard the click, and to locate where in the utterance it had occurred. They responded more quickly to clicks at clause boundaries, and strongly tended to locate clicks at a clause boundary even if they had actually occurred clause-internally. These findings suggest that non-native listeners of non-click languages are likely to perceive the clicks of conversational speech in a click language as being separate and independent from the stream of the non-click consonants and vowels. Indeed, they might have actually treated them like paralinguistic clicks in conversational English, which tend to occur at phrase boundaries.8 This could, in turn, exacerbate such listeners’ difficulties in late second language learning of a click language. While there is virtually no research on perception of contrasts among different environmental non-speech clicks and sharp percussive sounds, even the body of work on click consonant perception is itself fairly small. Very few papers other than those from my labs have examined non-native click consonant perception. Hence, nearly all of the studies reviewed below are ours. The majority of the papers have examined adults’ perception of non-native clicks, but a few (including the first reported study of non-native click consonant perception) explored developmental changes in infants’ perception of clicks. We turn first to the research on adult non-native listeners, and will then move to studies with infants.

8  Thanks to Bonny Sands for suggesting this possibility.

4 Experimental Research on Cross-language Perception of Click Consonants

4.1 Adult Perception of Non-native Clicks

The foundational study of non-native click consonant perception (Best et al. 1988) was the catalyst for development of the Perceptual Assimilation Model of non-native speech perception (PAM), as the findings could not be accommodated by either of the two primary theoretical hypotheses at that time: 1) Linguistic – that adults have difficulty discriminating any phonetic contrast not employed distinctively in their native language, and 2) Psychoacoustic – that adults have difficulty discriminating contrasts to which they have not had early acoustic exposure. Refuting both of these hypotheses, we found that English-speaking adults discriminated a wide range of click consonant contrasts quite well rather than very poorly, despite the fact that their
native language obviously lacks phonemic clicks, hence they lack experience with the click acoustic properties of coarticulated speech (i.e. click bursts, voicing differences, and formant transitions). In the study, American English-speaking adults completed discrimination tests for the 18 minimal place and voicing contrasts among the three places of articulation (apical [dental], alveolar and alveolar-lateral) crossed with the three non-nasalized voicing settings (pre-voiced, voiceless unaspirated, voiceless aspirated) of Zulu clicks. The stimuli were natural recordings of Zulu click+/a/ syllables, presented in AXB trials in which A and B tokens always reflected a minimal click contrast and X was a different token of the A or B click. Listeners had to indicate whether the X was of the same category as A or B. They performed quite well on all contrasts (mean correct: 91.5%, range 81–99%), including the pre-voiced versus voiceless unaspirated voicing distinction that is difficult for English listeners to discriminate when it occurs in non-native stops (e.g. Italian or Spanish stop voicing). Even their poorest performance was still quite good: 81% correct on the unaspirated dental versus alveolar-lateral click place distinction, which was tested again comparing a new group of English listeners (78%) against 6 native Zulu listeners (87%; difference ns). The stimuli were replayed to English participants after the task in both studies, who were asked to describe what the “sound at the start of the syllable” was. All reported hearing the clicks as non-linguistic events (finger snap, cork pop, clops, percussion instrument, plop, water drip, tongue clucking or popping, etc.). 
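The design of the discrimination test can be made concrete with a short sketch. The Python below is an illustration under stated assumptions rather than the procedure used in the study: it assumes that the 18 contrasts are the place pairs within each voicing setting plus the voicing pairs within each place, and the token identifiers, function names and random trial assembly are hypothetical.

```python
import random
from itertools import combinations

PLACES = ["dental", "alveolar", "alveolar-lateral"]
VOICINGS = ["pre-voiced", "voiceless unaspirated", "voiceless aspirated"]


def minimal_contrasts():
    """Enumerate minimal pairs that differ in place only or in voicing only."""
    contrasts = []
    for v in VOICINGS:                                # place contrasts
        for p1, p2 in combinations(PLACES, 2):
            contrasts.append(((p1, v), (p2, v)))
    for p in PLACES:                                  # voicing contrasts
        for v1, v2 in combinations(VOICINGS, 2):
            contrasts.append(((p, v1), (p, v2)))
    return contrasts


def axb_trial(cat_a, cat_b, tokens, rng):
    """Build one AXB trial; X is a different token of the A or the B category.

    tokens maps each category to a list of token ids (hypothetical) and
    needs at least two tokens per category.
    """
    a_tok = rng.choice(tokens[cat_a])
    b_tok = rng.choice(tokens[cat_b])
    x_cat = rng.choice([cat_a, cat_b])
    used = a_tok if x_cat == cat_a else b_tok
    x_tok = rng.choice([t for t in tokens[x_cat] if t != used])
    return {"A": a_tok, "X": x_tok, "B": b_tok,
            "answer": "A" if x_cat == cat_a else "B"}


def percent_correct(responses, trials):
    """Percentage of trials on which the response names the matching category."""
    hits = sum(r == t["answer"] for r, t in zip(responses, trials))
    return 100.0 * hits / len(trials)
```

Under these assumptions `minimal_contrasts()` yields nine place and nine voicing contrasts, which matches the 18 reported. Because X is never physically identical to A or B, listeners cannot succeed by matching tokens and must judge category membership.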
We created the PAM framework to help interpret these findings, reasoning that because the English listeners had failed to perceptually assimilate the clicks to native consonants (Non-Assimilable to English phonemes), discrimination was not constrained by their attunement to their native phonological system. Therefore, they discriminated the clicks quite well as non-linguistic sounds, and certainly very much better than the linguistic and psychoacoustic experience hypotheses would predict (Best et al. 1988). This was a very different performance pattern from that found in our subsequent study of American English adults’ assimilation and discrimination of three other non-native but non-click consonant contrasts from Zulu: voiced plosive/implosive bilabial stops, voiceless aspirated/ejective velar stops, and voiced/voiceless lateral fricatives (Best, McRoberts and Goodell 2001). As predicted by PAM, most participants in that study assimilated the bilabials as equally good English /b/s (SC assimilation) and discriminated them quite poorly. However, a subset of them reported a difference in quality of the /b/ (CG assimilation) and discriminated the contrast significantly better than the SC majority: they heard the plosive as a “good” /b/ but the implosive as a “murmured” or “swallowed” /b/. The velar stops were unanimously assimilated as a CG difference (the ejective was heard

as a “choked” or “hit in the throat” /k/), which they discriminated well.9 All listeners assimilated the lateral fricatives as a TC contrast (as a fricative voicing contrast in /l/ clusters, or a fricative vs. /l/ contrast), and discriminated them excellently, significantly better than the CG velar stops, as predicted by PAM.

A second study of non-native click perception compared adult native listeners of a non-click language (American English) against native listeners of two click languages from a different language family (Zulu and Sesotho, South African Bantu languages) than the target stimuli, which were natural multiple-token productions of click-vowel syllables in ǃXóõ (a Khoesan language of Botswana: Best, Traill, Carter, Harrison and Faber 2003). None of the listener groups had experience with this or other similar Khoesan languages, nor with the specific click types of the target stimuli: bilabial versus dental clicks, and alveolar versus palato-alveolar clicks, all articulated with a uvular frication accompaniment, i.e. /ʘχ/-/ǀχ/ and /ǃχ/-/ǂχ/, respectively. While Zulu and Sesotho have phonemic click consonants, adopted from contact with Khoesan languages via trading and intermarriage, neither has uvular fricatives nor uses frication as a click accompaniment. Zulu has a richer click system than Sesotho, using the voicing and place distinctions described above, as well as nasalized and pre-nasalized clicks at each place. Sesotho employs only dental clicks differing in voicing but not in nasalization, and has fewer lexical items containing clicks. Both of these South African languages lack bilabial and palato-alveolar clicks; Sesotho also lacks contrastive alveolar clicks. English of course has no click consonants at all. 
9  I know of no studies that have evaluated discrimination of a homorganic contrast between a voiceless click and an ejective stop, e.g. dental click /ǀ/ versus ejective /t’/ (e.g. in Xhosa), or glottalized lateral click /ǁ’/ versus lateral ejective /t͡ʎ̥ʼ/ (e.g. in Hadza). PAM principles would lead us to expect non-click language listeners to assimilate such non-native consonant pairs as a non-speech sound versus a voiceless stop with some “odd” quality, i.e. a contrast between a non-speech sound and some variant of a native consonant (speech) (or cluster, in the case of the lateral ejective: see Best and Hallé 2010; Hallé and Best 2007). If this prediction is correct, the contrast should be easily discriminated according to PAM (thanks to Bonny Sands for suggesting this possibility and the Hadza example).

PAM principles predict that Zulu listeners should be more likely to assimilate ǃXóõ clicks to native clicks than Sesotho listeners, but should nonetheless perceive all of these ǃXóõ clicks as less than ideal exemplars of Zulu clicks because of their non-native uvular frication accompaniment. Sesotho listeners would, in turn, be expected to assimilate some ǃXóõ clicks to their dental clicks (e.g. the bilabials), but to hear others (e.g. the alveolars) as non-speech sounds. English listeners should clearly be most likely to perceive the clicks as non-speech sounds, and least likely to assimilate them to English consonants. It
follows from PAM principles that discrimination of minimal ǃXóõ click place contrasts will depend on whether they are perceived as (variants of) native consonants or as non-speech sounds. Discrimination should be poorest for Single Category (SC) assimilations, intermediate for Category Goodness (CG) assimilation, and excellent for Two Category (TC) assimilations, by the South African groups. However, for clicks perceived as non-speech sounds, i.e. mainly by English listeners, discrimination should instead reflect the magnitude of acoustic difference between the contrasting clicks, which is larger for the alveolar/palato-alveolar than the bilabial/dental contrast. A corollary prediction is that English listeners should show better discrimination than the South African groups for contrasts that the latter listeners assimilate to a single native click (SC assimilation) – despite the fact that they are clearly much less experienced with clicks, especially in speech. Results supported these predictions: a) English listeners were more likely to hear the clicks as non-speech sounds (NA assimilation for both contrasts) than Sesotho listeners, who in turn heard them as non-speech sounds more often than Zulu listeners; b) Sesotho and Zulu groups both showed mainly SC assimilation of the alveolar/palato-alveolar click contrast to one of their native clicks, and discriminated it significantly worse (78% and 77%, respectively) than the English listeners (89%); c) for the bilabial/dental contrast, however, Zulu listeners showed primarily CG assimilation to a single native click, and better discrimination (81%) than Sesotho listeners (75%), who showed SC assimilation; d) English listeners discriminated the bilabial/dental contrast as well (80%) as the Zulu group, but less well than they discriminated the larger acoustic difference between the alveolar and palato-alveolar clicks as non-speech sounds. 
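The ordinal predictions and the reported group means can be lined up in a few lines of code. The sketch below is only a bookkeeping aid: the percentages and majority assimilation types are those reported above, while the dictionary layout and the function name are assumptions introduced here. It compares means only and says nothing about statistical significance.

```python
# Majority assimilation type and mean percent correct, as reported in the
# text for the two ǃXóõ click place contrasts.
RESULTS = {
    "alveolar/palato-alveolar": {
        "English": ("NA", 89), "Zulu": ("SC", 77), "Sesotho": ("SC", 78),
    },
    "bilabial/dental": {
        "English": ("NA", 80), "Zulu": ("CG", 81), "Sesotho": ("SC", 75),
    },
}


def check_predictions(results):
    """Return each ordinal prediction with whether the means agree with it."""
    ap = results["alveolar/palato-alveolar"]
    bd = results["bilabial/dental"]
    return {
        # Non-speech perception should beat SC assimilation.
        "NA > SC (alveolar/palato-alveolar)": all(
            ap["English"][1] > ap[group][1] for group in ("Zulu", "Sesotho")
        ),
        # CG assimilation should beat SC assimilation.
        "CG > SC (bilabial/dental)": bd["Zulu"][1] > bd["Sesotho"][1],
        # As non-speech, the larger acoustic difference should be easier.
        "English: alveolar/palato-alveolar > bilabial/dental":
            ap["English"][1] > bd["English"][1],
    }


for prediction, holds in check_predictions(RESULTS).items():
    print(f"{prediction}: {'consistent' if holds else 'not consistent'}")
```

All three comparisons come out in the predicted direction for the means given in the text.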
Additional insights about the nature of non-native click perception, especially by adults whose native languages lack clicks as phonemes, can be gained from audiovisual (AV) studies on perception of place of articulation among clicks versus among stops. AV perception of non-click stop consonants has been taken to indicate how the articulatory information presented by a talking face impacts on perception of the acoustically-synchronous consonant. AV influences on categorization of stop consonants have been studied particularly effectively using the McGurk paradigm, in which synchronized but phonetically incongruent audio and visual (talking face) syllables are presented, and participants must identify “the consonant they heard” (McGurk and MacDonald 1976). In the original McGurk study (as in subsequent studies, e.g. MacDonald and McGurk 1978; Skipper, van Wassenhove, Nusbaum and Small 2007; cf. audio-tactile McGurk effect: Fowler and Dekle 1991) participants failed to detect discrepancies between the paired audio and video signals. Crucially, in critical trial types the stop place they perceived was neither simply

the audio nor the video consonant, but rather a phonetically compatible “compromise”. For example, when presented with audio /apa/ and video /aka/ they perceived /ata/, i.e. the place of articulation intermediate between labial /p/ and velar /k/; on trial types such as audio /apa/ and video /ata/ (with no intermediate place of articulation) they perceived sequences such as /apta/. For non-speech, non-oral events (e.g. plucked-bowed violin sounds), however, McGurk-like effects are much weaker or lacking altogether (e.g. de Gelder and Vroomen 2000; Rosenblum and Fowler 1991; Saldaña and Rosenblum 1993). Thus, the McGurk approach offers an ideal tool for examining AV perception of clicks by listeners of non-click languages such as English, who strongly tend to perceive clicks as non-speech sounds. We compared McGurk effects for American English participants’ AV perception of native voiceless stops versus non-native voiceless clicks (Brancazio, Best and Fowler 2006). When click bursts were presented in isolation the McGurk effects were quite weak as compared to those found for stop consonant-vowel syllables. However, the McGurk effects for isolated voiceless stop bursts were found to be equally weak as for the click bursts. Conversely, and most crucially, McGurk effects for click consonant-vowel syllables were found to be equivalent to those for stop-vowel syllables. These results indicate that McGurk AV integration is strong for coarticulated stop consonant-vowel syllables, but much weaker for isolated release bursts, regardless of whether they are from native or non-native consonants, or are perceived as speech or as non-speech. It was concluded that the McGurk effect appears to be triggered by a rather specific type of event – rapid release of a vocal tract constriction into a vowel – and yet it does not depend on the components being recognized as phonological elements, i.e. the events need not be perceived as speech. 
This interpretation is also relevant to infants’ AV speech perception, discussed below. Neuro-cognitive approaches can also reveal distinctions between linguistic versus non-speech processing of identical speech stimuli, perhaps even more sensitively than AV speech perception. A left-hemisphere (LH) advantage is a fundamental neurocognitive aspect of language, i.e. greater activation of the left than the right hemisphere for perception of phonological and syntactic elements in spoken utterances. In speech perception, the left hemisphere advantage is strongest for identification and discrimination of stop consonants, as compared to, e.g. glide consonants or vowels (e.g. Altmann et al. 2014; Cutting 1974; Shankweiler and Studdert-Kennedy 1967). That observation, among others, led some researchers to propose that the left hemisphere has a domain-general auditory specialization for processing rapid spectral changes rather than for specifically linguistic information (e.g. Belin et al. 1998; Binder et al. 2000; Schwartz and Tallal 1980). Recent work, however, supports the view
that the left hemisphere is specialized for processing of linguistic elements in speech (see Hickok and Poeppel 2015; McGettigan and Scott 2012; Scott and McGettigan 2013). I have employed two neuro-cognitive techniques to probe differences in perception of Zulu clicks by non-native non-click listeners (English) as compared to native listeners. The first study used dichotic listening, in which competing stimuli are presented simultaneously to each ear. Listeners display a right ear advantage (REA) in identifying or discriminating phonemes; this REA is most robust for stop consonants (e.g. Shankweiler and Studdert-Kennedy 1967). Clicks are a subtype of stop, but had not been dichotically tested before. Because click-naïve listeners cannot by definition use click consonant labels, we adapted the dichotic procedure to allow dichotic matching judgments by non-native non-click language listeners (Best and Avery 1999). On each trial a dichotic target was presented, followed by four binaural choices. The participant was instructed to attend to one ear at a time in different blocks, and to choose the binaural choice that matched the attended-ear target. The stimuli were two types of Zulu click stimuli: full consonant-vowel syllables (CVs) and the isolated click bursts excised from the syllables. The participants were click-naïve English listeners and native Zulu10 listeners. The English listeners performed as well on average as the Zulu listeners for the excised clicks and only slightly less well on the clicks in syllables, but only the Zulu listeners displayed a right-ear/left hemisphere advantage in either task. No ear differences were found for the English listeners, who showed poorer right-ear but better left-ear accuracy than the Zulus. 
10  Some participants in this group also spoke Xhosa, another South African Bantu language that has essentially the same click consonant inventory as Zulu.

We drew two conclusions from those findings: 1) The left hemisphere is specialized for detecting linguistic information, rather than simply responding to domain-general acoustic properties in speech; and 2) that ability is dependent on specific language experience, in this case linguistic experience with clicks as consonants. Our second study used functional magnetic resonance imaging (fMRI) to examine regional brain activation during English versus Zulu listeners’ discrimination of a Zulu click place contrast (dental vs. alveolar-lateral unaspirated clicks /ǀ/-/ǁ/) in comparison to an English and a Zulu fricative voicing contrast (/s/-/z/ and lateral fricatives /ɬ/-/ɮ/, respectively) (Best et al. 1999; Best, Mencl, Pugh, Gore and Constable, in preparation). These target contrasts were chosen because all three are discriminated quite well by both native Zulu and non-native English listeners, and English listeners assimilate the Zulu lateral fricatives to English fricatives + /l/, i.e. they perceive them as consonants,
whereas they perceive Zulu clicks as non-speech sounds (Best et al. 1988, 2001). This same pattern was seen in the fMRI study’s behavioral results for discrimination tests conducted both in a quiet room before or after the MRI session, and during the session inside the machine. Brain activation patterns during discrimination of each phonological contrast within the MRI session were calculated against the listeners’ baseline pure-auditory activation when discriminating frequency-modulated tone sweeps. For the two non-click consonant contrasts, both language groups showed specific left hemisphere activation localized in speech processing areas (middle temporal gyrus: MTG), with significantly greater activation in the native listener group in each case. Those results provide the critical backdrop against which to consider activation patterns for the two groups on the crucial Zulu click contrast, perceived as native consonants by Zulus but as non-speech sounds by non-native English listeners. The Zulu listeners showed strong engagement of MTG and superior temporal gyrus (STG), with greater activation in the left than the right hemisphere. The English listeners instead showed a small area of greater activation than Zulu listeners in the left inferior frontal gyrus, generally associated with speech production, not seen in either group for the other contrasts. Overall the activation patterns indicate greater activation of left hemisphere speech processing areas in native listeners, tuned by specific language experience, for discrimination of consonants perceived as speech.11 The unique activation response of non-native English listeners to the clicks is compatible with perceiving them differently than native or non-native consonants perceived as speech. Intriguingly, it was in the left hemisphere speech production area, which we speculate is a sign that their brains recognized the clicks as having been articulated by a human vocal tract, e.g. 
as non-speech “tsk” (disapproval) and “giddyup” (urging a horse on) sounds.12

11  Language-specific tuning of left hemisphere language area activations has also been found for discrimination of other language-specific stimulus properties that non-native listeners perceive as non-speech, e.g. for discrimination of Thai lexical tones by Thai but not English listeners (Gandour, Wong and Hutchins 1998), and for perception of utterances in the special whistle language of the Canary Islands, Silbo Gomero (SG), by proficient SG whistlers but not by naïve controls (Carreiras, Lopez, Rivero and Corina 2005).
12  I thank an anonymous reviewer for the interesting proposition that speakers of languages that lack clicks even paralinguistically (e.g. as noted earlier, Lengua Media) may not only perceive clicks as non-speech, but fail to recognize them as articulations of the human vocal tract. Given PAM’s premises that articulatory information is the basis of both native and non-native speech perception, we might extrapolate that speakers of languages lacking even paralinguistic clicks should fail to display activation of the left speech production area by Zulu clicks, as was seen in English listeners.

4.2 Infant Perception of Non-native Click Consonants

Given that language experience appears to play a central role in whether adults perceive and neuro-cognitively process clicks as phonological elements (linguistic) or as non-speech sounds (non-linguistic events or paralinguistic signals), a fundamental question for theories of both speech perception and linguistic universals is how clicks are perceived at the outset of language acquisition, in the initial stages of learning a specific native language. Of course, it is impossible for preverbal infants to give us a direct answer, but clever testing procedures can provide insights on the issue. Several series of studies have investigated developmental changes in English-learning infants’ perception of click consonants, which diverge in interesting and informative ways from their perception of both native and other types of non-native consonants. The first such study (Best et al. 1988) examined 6- to 14-month-old English-learning infants’ discrimination of the same Zulu clicks (dental vs. alveolar-lateral /ǀ/-/ǁ/) used in the adult fMRI study. Numerous studies indicate that infants under 8–10 months of age discriminate many non-native consonant contrasts well, but show a dramatic decline by 10–12 months, when they fail to discriminate the very same contrasts (see reviews by Best 1994a, b; Werker 1989). But we found instead that infants at every age we tested (6, 10 and 14 months) discriminated the non-native click contrast, with no age-related decline (Best et al. 1988). In a second study, new groups of English-learning 6–8- and 10–12-month-olds discriminated both the click contrast and a native English contrast (/b/-/d/), whereas non-native Nthlakampx (Salish) uvular-velar ejective stops /k’/-/q’/ were discriminated only by the younger and not the older group (Best et al. 1995), consistent with prior reports (Werker, Gilbert, Humphrey and Tees 1981). 
However, click contrasts are not the only non-native consonant contrasts for which infants fail to show a decline in discrimination. While English-learning infants showed the expected 10–12 month decline in discrimination of several non-click contrasts from Zulu (voiced plosive vs. implosive bilabial stops /b/-/ɓ/; voiceless aspirated vs. ejective velar stops /kʰ/-/k’/; voiceless vs. voiced lateral fricatives /ɬ/-/ɮ/), they instead continued to display excellent discrimination at 10–12 months for a Tigrinya labial versus alveolar ejective stop contrast /p’/-/t’/ (Best and McRoberts 2003), which adults hear as a “funny /p/ vs. /t/” rather than as non-speech sounds (Best et al. 2001). Those findings leave it unclear, however, whether infants at the end of the first year discriminate these two contrasts because they hear both clicks and ejective stops as non-speech sounds, or for some other reason. Given that two other non-native ejective contrasts do show a decline in discrimination at 10–12 months (Nthlakampx /k’/-/q’/ and Zulu /kʰ/-/k’/), some other explanation is needed at least
for developmental maintenance of /p’/-/t’/ discrimination. And as no other click contrasts had been tested with infants, it would also be important to see whether a different click contrast shows continuity of discrimination at 10–12 months, or the more frequently reported decline. We examined these questions in a developmental study of infants’ perception of articulatory congruity in the primary articulator of paired consonants across the auditory and visual speech modalities. In the audio-only infant perception studies described above, the non-native consonant contrasts that 10–12-month-olds still discriminated involved a between-articulator distinction: lips versus tongue tip gestures for /p’/-/t’/ ejectives, and tongue tip versus tongue body release gestures for /ǀ/-/ǁ/ clicks. Conversely, the non-native contrasts that 10–12-month-olds failed on all involved a single articulator but differed in voicing, manner or place of articulation: tongue body for /k’/-/q’/ and /kʰ/-/k’/, tongue tip for /ɬ/-/ɮ/, lips for /b/-/ɓ/. The Articulatory Organ Hypothesis (AOH: Goldstein and Fowler 2003; Studdert-Kennedy and Goldstein 2003; see also Best and McRoberts 2003) posits that between-organ (between-articulator) distinctions should be discriminable even at 10–12 months, whether native or non-native, but that within-organ distinctions will be more difficult even if native, and will show a decline by 10–12 months when they are non-native. To assess how language experience affects infants’ sensitivity to the articulators involved in native and non-native consonant contrasts, we created a new procedure to test cross-modal recognition of articulator congruency between audio-only and silent video-only CV syllables (Best, Kroos and Irwin 2010, 2011, 2014).
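As an illustrative aid only, the AOH prediction for non-native contrasts can be written out as a simple decision rule. The Python sketch below is not part of the original studies: the class and function names are hypothetical, and the outcome values merely restate the audio-only findings summarized above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contrast:
    """One non-native consonant contrast from the audio-only infant studies."""
    label: str
    language: str
    articulators: frozenset      # primary articulator(s) of the two consonants
    maintained_at_10_12: bool    # reported outcome, restated from the text


# Hypothetical encoding of the contrasts listed in the text (not a published dataset).
CONTRASTS = [
    Contrast("/p’/-/t’/", "Tigrinya", frozenset({"lips", "tongue tip"}), True),
    Contrast("/ǀ/-/ǁ/", "Zulu", frozenset({"tongue tip", "tongue body"}), True),
    Contrast("/k’/-/q’/", "Nthlakampx", frozenset({"tongue body"}), False),
    Contrast("/kʰ/-/k’/", "Zulu", frozenset({"tongue body"}), False),
    Contrast("/ɬ/-/ɮ/", "Zulu", frozenset({"tongue tip"}), False),
    Contrast("/b/-/ɓ/", "Zulu", frozenset({"lips"}), False),
]


def aoh_predicts_maintenance(contrast: Contrast) -> bool:
    """AOH rule for non-native contrasts: between-organ distinctions remain
    discriminable at 10-12 months, within-organ distinctions decline."""
    return len(contrast.articulators) > 1


def compare_with_reports(contrasts):
    """Return (label, predicted, reported) triples for inspection."""
    return [
        (c.label, aoh_predicts_maintenance(c), c.maintained_at_10_12)
        for c in contrasts
    ]


if __name__ == "__main__":
    for label, predicted, reported in compare_with_reports(CONTRASTS):
        print(f"{label}: AOH predicts maintained={predicted}, reported={reported}")
```

Under this encoding the rule agrees with each of the six reported outcomes, which restates the pattern described above rather than adding new evidence. The sketch deliberately says nothing about native contrasts, for which the hypothesis predicts greater difficulty for within-organ distinctions rather than a decline.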
For our purposes, all native and non-native consonant contrasts had to be between-articulator, the articulator difference had to be visible, and infants young enough not to show native language attunement in consonant perception had to be compared to infants old enough to show native attunement. In three separate experiments, we tested 4- versus 11-month-old English-learning infants’ detection of cross-modal audio-only → silent-video (A→V) congruency in visible between-organ labial (lips) versus coronal (tongue tip) stop consonant distinctions: 1) native English voiceless stops /pʰa/-/tʰa/; 2) non-native Tigrinya ejective stops /p’a/-/t’a/; 3) non-native ǃXóõ bilabial versus dental uvular-affricated clicks /ʘχ/-/ǀχ/, which adult English speakers perceive as non-speech (akin to “blowing a kiss” and “tsk” sounds accompanied by a throaty noise). In each test, infants were first habituated to either the audio-only (AO) bilabial consonant or the AO coronal consonant, then tested on their visual preference between silent video-only (VO) articulation of an English bilabial stop + /a/ versus coronal stop + /a/. For the native English contrast, both ages showed a VO looking preference in the first test trial for the consonant that
was articulatorily congruent with their AO habituation stimulus (Best et al. 2010). For the non-native Tigrinya ejectives, which adult English speakers perceive as speech, i.e. as “funny” versions of /p/ and /t/, the 4-month-olds again showed a VO articulator congruency preference, that is, a familiarity preference for the same articulator across A→V (Best et al. 2011). Interestingly, however, the 11-month-olds instead showed an A→V incongruency preference – they looked longer to the VO articulation that differed from that of their AO habituation stimulus. Even more intriguingly, for the ǃXóõ bilabial versus dental clicks the 4-month-olds showed an A→V incongruency preference, opposite to their congruency preference for the English and Tigrinya contrasts but like the older group’s reversed preference for the Tigrinya contrast. And in further contrast, the 11-month-olds failed to show any VO preference for ǃXóõ clicks (Best et al. 2011), even though a separate AO discrimination test showed they could hear the distinction (Best et al. 2014). We speculate that this pattern of age differences in A→V articulator congruency preference across the three language contrasts suggests that: 1) the AO Tigrinya ejectives are perceived by both ages as consonants (i.e. as speech), which the older but not the younger infants recognize as differing critically from VO English bilabial and coronal stops; 2) 4-month-olds perceive the AO ǃXóõ clicks as possible consonants (i.e. as speech) but can already recognize that they deviate notably from English VO consonants, while 11-month-olds have come to recognize that clicks actually fall entirely outside of the English consonant system, i.e.
they hear them as non-speech, and thus perceive AO clicks as unrelated to the VO bilabial versus coronal stop syllables.13

13  The only other published report on native versus non-native cross-modal consonant congruency compared English and Spanish 6- and 11-month-olds on a visible within-organ (lips) contrast found in English but not Spanish: /b/-/v/ (Pons, Lewkowicz, Soto-Faraco and Sebastián-Gallés 2009). As expected, both language groups preferred A→V congruency at 6 months, but only English-learning infants did so at 11 months. However, we cannot tell whether Spanish 11-month-olds failed to make an A→V connection, as those authors argued, or whether they could see the VO difference but failed to show a preference because they could not hear the preceding AO distinction. In our study, there is no such ambiguity: the 11-month-olds were able to discriminate the AO clicks but did not connect them with the VO speech articulation difference.

This leads us to the final study on infants’ perception of click consonants, the only one conducted in another lab, which examined developmental changes across children’s second year in learning novel word forms composed of a click consonant + vowel (May and Werker 2014). Given that children under 18 months but not those over 20 months can learn non-linguistic signals (toy noise, pictogram, manual gesture) as paired-associate “word labels” for novel objects (e.g. Woodward and Hoyne 1999), and that speakers of non-click languages perceive click consonants as non-speech sounds and discriminate them easily, those authors reasoned that 14-month-olds but not 20-month-olds should learn click + vowel syllables as labels for novel objects. That is, the older children should reject them as possible word names because they perceive clicks as non-speech sounds and reject non-speech labels for novel objects. That is essentially what they observed: only the younger group learned the click syllables as labels for the novel objects. But there were two caveats: 1) even the younger group only learned the new words if contextual support was provided that the task was a referential labeling task, and 2) 20-month-olds who had small expressive vocabularies for their age (which were nonetheless larger than the 14-month-olds’) also learned the click “names” for the objects in the referential context. Thus, children acquiring a non-click language display a growing recognition across their second year that clicks are not elements of their language (i.e. are non-speech sounds), and thus cannot serve as components of native words. Moreover, that dawning awareness appears to be linked to their vocabulary development.

5 Discussion and Conclusions

The hypothesis that segment types rarely found across world languages are low in “perceptual salience” is refuted by numerous studies, as discussed earlier, and this conclusion is bolstered by perceptual studies with click-naïve non-native perceivers, both infants and adults. Click contrasts are generally quite easy for them to discriminate; indeed, in some cases naïve non-click listeners are better at discriminating click contrasts than are speakers of other click languages. Still, some click contrasts are rather difficult for both click and non-click L1 listeners to discriminate, as well as for native listeners of the target click language. That is, most but not all clicks appear to meet the description of “acoustically salient”. The overall pattern of experimental findings, then, runs counter to the hypothesis that perceptual salience is a major factor in the frequency distribution of segment types across the world’s languages. We have argued that discrimination of click contrasts by non-native perceivers instead reflects whether, and how, they are perceptually assimilated to the listener’s native phonological inventory. In keeping with the principles of the Perceptual Assimilation Model (PAM: e.g. Best 1995; Best and Tyler 2007), it is the phonetic and phonological similarities to the perceiver’s L1 segmental inventory, i.e. linguistic relevance or lack thereof, that determine assimilation or non-assimilation of non-native clicks, and corresponding levels of
discrimination by perceivers of different L1s. The full array of findings cannot be explained simply in terms of non-linguistic salience of the acoustic properties of the clicks. For participants who speak another click language, PAM successfully predicts very good discrimination for a non-native click contrast that is assimilated to a native click contrast, but poor discrimination for contrasting clicks perceived as phonetically equivalent exemplars of a single native click. Conversely, for participants who speak a non-click language, non-native clicks tend to be perceived instead as non-speech sounds, i.e. they are not assimilated to native phonological categories; it is in this case that discrimination depends solely on non-linguistic acoustic salience. It is the clicks’ linguistic relevance to a perceiver that matters first and foremost; acoustic salience is secondary. The cross-language perceptual findings are more closely compatible with PAM than with the other theoretical models of non-native speech perception reviewed earlier in the chapter: Flege’s Speech Learning Model (SLM) and Kuhl’s Native Language Magnet model (NLM). Nonetheless, many interesting questions remain open, with respect to both PAM predictions and those of SLM and NLM. For example, do speakers of non-click languages that employ ejective stops perceive clicks as non-speech sounds or as “odd” versions of their native ejectives? Ejectives use a glottalic airstream mechanism rather than the velaric mechanism of clicks, but their release bursts can be more similar to clicks in acoustic salience (intensity, abruptness14) than are the bursts of non-ejective stops. Would such participants perceive clicks’ velaric release as being similar to ejectives’ glottalic release? Additionally, comparing learners of L2 click languages who have click versus non-click L1s could critically test SLM predictions about L2 learning of “new” versus “similar” phonemes.
For example, it might be difficult for adult learners with a non-click L1 to shift the L2 clicks into the speech realm: they may find it difficult to keep the clicks from perceptually streaming off from the rest of the utterance, and/or difficult to perceive or produce coarticulation of L2 clicks with adjacent vowels.

14  Note, though, that these properties range rather dramatically across ejective languages.

In conclusion, clicks are a unique subclass of consonants, possibly the only type likely to be perceived as non-speech sounds by listeners of languages that lack click consonants. Indeed, as noted in our introductory review, they may even have unusual perceptual status in the consciousness of native speakers of click languages, e.g. some Khoesan languages have specific words to refer to the productions of certain clicks. This has allowed psycholinguistic research to exploit them to probe the contribution of linguistic versus general auditory processing to speech perception, and specifically how early language experience shapes the perception of clicks as linguistic elements or as non-speech sounds. Conversely, perceptual aspects of click consonants could potentially provide insights into historical changes in languages that do use clicks as phonemes, including those that have borrowed clicks into their inventories through inter-language contact.

References

Abrams, Kenneth, and Thomas G. Bever. 1969. “Syntactic structure modifies attention during speech perception and recognition.” The Quarterly Journal of Experimental Psychology 21(3): 280–290.
Altmann, Christian F., Maiko Uesaki, Kentaro Ono, Masao Matsuhashi, Tatsuya Mima, and Hidenao Fukuyama. 2014. “Categorical speech perception during active discrimination of consonants and vowels.” Neuropsychologia 64: 13–23.
Antoniou, Mark, Michael D. Tyler, and Catherine T. Best. 2012. “Two ways to listen: Do bilinguals perceive stop voicing differently according to language mode?” Journal of Phonetics 40: 582–594.
Antoniou, Mark, Catherine T. Best, and Michael D. Tyler. 2013. “Greek-English bilinguals’ and Greek and English monolinguals’ perception of nonnative Ma’di voicing contrasts.” Journal of the Acoustical Society of America 133: 2397–2411.
Aoyama, Katsura, James E. Flege, Susan G. Guion, Reiko Akahane-Yamada, and Tsuneo Yamada. 2004. “Perceived phonetic dissimilarity and L2 speech learning: The case of Japanese /r/ and English /l/ and /r/.” Journal of Phonetics 32: 233–250.
Baumbach, Erdmann Johannes M. 1974. Introduction to the Speech Sounds and Speech Sound Changes of Tsonga. Pretoria: van Schaik.
Beddor, Patrice S., Rena A. Krakow, and Louis M. Goldstein. 1986. “Perceptual constraints and phonological change: A study of nasal vowel height.” Phonology 3(1): 197–217.
Belin, Pascal, Monica Zilbovicius, Sophie Crozier, Lionel Thivard, Anne Fontaine, Marie-Cécile Masure, and Yves Samson. 1998. “Lateralization of speech and auditory temporal processing.” Journal of Cognitive Neuroscience 10: 536–540.
Best, Catherine T. 1994a.
“Learning to perceive the sound pattern of English.” Advances in Infancy Research (vol. 9), ed. Carolyn K. Rovee-Collier and Lewis Paeff Lipsitt. Norwood, NJ: Ablex Publishing Corporation, pp. 217–304.
Best, Catherine T. 1994b. “The emergence of language-specific phonemic influences in infant speech perception.” Development of Speech Perception: The Transition from Speech Sounds to Spoken Words, ed. Judith C. Goodman and Howard C. Nusbaum. Cambridge, MA: MIT Press, pp. 167–224.
Best, Catherine T. 1995. “A direct realist perspective on cross-language speech perception.” Speech Perception and Linguistic Experience: Theoretical and Methodological
Issues in Cross-Language Speech Research, ed. Winifred Strange. Timonium, MD: York Press, pp. 167–200.
Best, Catherine T., and Robert A. Avery. 1999. “Left hemisphere advantage for click consonants is determined by linguistic significance.” Psychological Science 10: 65–69.
Best, Catherine T., and Pierre A. Hallé. 2010. “Perception of initial obstruent voicing is influenced by gestural organization.” Journal of Phonetics 38: 110–127.
Best, Catherine T., Christian Kroos, and Julia Irwin. 2010. “Now I see what you said: Infant sensitivity to place congruency between audio-only and silent-video presentations of native and non-native consonants.” Proceedings of AVSP (AudioVisual Speech Perception). Hakone, Japan, Sept.
Best, Catherine T., Christian Kroos, and Julia Irwin. 2011. “Do infants detect A→V articulator congruency for non-native click consonants?” Proceedings of AVSP (AudioVisual Speech Perception). Volterra, Italy, Sept. 1–2, 2011.
Best, Catherine T., Christian Kroos, S. Gates, and Julia Irwin. 2014. “Baby steps in perceiving articulatory foundations of phonological contrasts: Infants detect audio→video congruency in native and non-native consonants.” Laboratory Phonology, Tokyo, Japan, July.
Best, Catherine T., and Gerald W. McRoberts. 2003. “Infant perception of non-native consonant contrasts that adults assimilate in different ways.” Language and Speech 46(2–3): 183–216.
Best, Catherine T., Gerald W. McRoberts, and Elizabeth Goodell. 2001. “American listeners’ perception of nonnative consonant contrasts varying in perceptual assimilation to English phonology.” Journal of the Acoustical Society of America 109: 775–794.
Best, Catherine T., Gerald W. McRoberts, Rosemarie LaFleur, and Jean Silver-Isenstadt. 1995. “Divergent developmental patterns for infants’ perception of two non-native consonant contrasts.” Infant Behavior and Development 18: 339–350.
Best, Catherine T., Gerald W. McRoberts, and Nomathemba M. Sithole. 1988.
“Examination of perceptual reorganization for nonnative speech contrasts: Zulu click discrimination by English-speaking adults and infants.” Journal of Experimental Psychology: Human Perception and Performance 14: 345–360.
Best, Catherine T., Einar Mencl, Kenneth R. Pugh, Todd Constable, R. Fulbright, John C. Gore, C. Lacadie, P. Skudlarski, Alice Faber, and D.K. Harrison. 1999. “Native-language phonetic and phonological constraints on perception of nonnative speech contrasts.” Joint Meeting of the Acoustical Society of America and the German Acoustical Society. Berlin, Germany, March.
Best, Catherine T., Einar Mencl, Kenneth R. Pugh, Todd Constable, and John C. Gore. (in preparation). “Brain activation during discrimination of nonnative consonant contrasts perceived as speech versus as nonspeech.”
Best, Catherine T., and Winifred Strange. 1992. “Effects of language-specific phonological and phonetic factors on cross-language perception of approximants.” Journal of Phonetics 20: 305–330.
Best, Catherine T., Anthony Traill, Allyson Carter, K. David Harrison, and Alice Faber. 2003. “ǃXóõ click perception by English, Isizulu, and Sesotho listeners.” Proceedings of the 15th International Congress of Phonetic Sciences, Barcelona, Spain, August 3–9, 2003. Universitat Autònoma de Barcelona, pp. 853–856.
Best, Catherine T., and Michael D. Tyler. 2007. “Nonnative and second-language speech perception: Commonalities and complementarities.” Language Experience in Second Language Speech Learning, ed. Ocke-Schwen Bohn and Murray J. Munro. Amsterdam: John Benjamins, pp. 13–34.
Binder, J.R., J.A. Frost, T.A. Hammeke, P.S. Bellgowan, J.A. Springer, J.N. Kaufman, and E.T. Possing. 2000. “Human temporal lobe activation by speech and nonspeech sounds.” Cerebral Cortex 10: 512–528.
Bohn, Ocke-Schwen, and Catherine T. Best. 2012. “Native-language phonetic and phonological influences on perception of English approximant contrasts by Danish and German listeners.” Journal of Phonetics 40(1): 109–128.
Bostoen, Koen, and Bonny Sands. 2012. “Clicks in south-western Bantu languages: Contact-induced vs. language-internal lexical change.” Proceedings of the 6th World Congress of African Linguistics, Cologne, 17–21 August 2009, ed. M. Brenzinger and Anne-Maria Fehn. Cologne: Rüdiger Köppe, pp. 121–132.
Brancazio, Lawrence, Catherine T. Best, and Carol A. Fowler. 2006. “Visual influences on perception of speech and nonspeech vocal-tract events.” Language and Speech 49: 21–53.
Carreiras, Manuel, Jorge Lopez, Francisco Rivero, and David Corina. 2005. “Linguistic perception: Neural processing of a whistled language.” Nature 433: 31–32.
Cutting, James E. 1974. “Two left-hemisphere mechanisms in speech perception.” Perception & Psychophysics 16: 601–612.
De Gelder, Beatrice, and Jean Vroomen.
2000. “The perception of emotions by ear and eye.” Cognition and Emotion 14: 289–311.
Dehaene-Lambertz, Ghislaine, Christophe Pallier, Willy Serniclaes, Liliane Sprenger-Charolles, Antoinette Jobert, and Stanislas Dehaene. 2005. “Neural correlates of switching from auditory to speech perception.” NeuroImage 24: 21–33.
Diehl, Randy L., and Keith R. Kluender. 1989. “On the objects of speech perception.” Ecological Psychology 1: 121–144.
Dinnsen, Daniel A. 1985. “A re-examination of phonological neutralization.” Journal of Linguistics 21: 265–279.
Doke, Clement M. 1931. A Comparative Study of Shona Phonetics. Johannesburg: University of the Witwatersrand Press. (See pages 113, 123, 161).
Engstrand, Olle. 1997. “Why are clicks so exclusive?” Papers from FONETIK-97, the Ninth Swedish Phonetics Conference, held in Umeå. Reports from the Department of Phonetics, Umeå University (PHONUM) 4: 191–194.
Flege, James E. 1995. “Second language speech learning: Theory, findings and problems.” Speech Perception and Linguistic Experience: Theoretical and Methodological Issues in Cross-Language Speech Research, ed. Winifred Strange. Timonium, MD: York Press, pp. 233–277.
Fowler, Carol A., and Dawn J. Dekle. 1991. “Listening with eye and hand: cross-modal contributions to speech perception.” Journal of Experimental Psychology: Human Perception and Performance 17: 816.
Gandour, Jack, Donald Wong, and Gary Hutchins. 1998. “Pitch processing in the human brain is influenced by language experience.” NeuroReport 9: 2115–2119.
Gil, David. 2005. “Paralinguistic usages of clicks.” The World Atlas of Language Structures, ed. Martin Haspelmath, Matthew S. Dryer, David Gil and Bernard Comrie. Oxford: Oxford University Press, pp. 572–575.
Goldstein, Louis, and Carol A. Fowler. 2003. “Articulatory phonology: A phonology for public language use.” Phonetics and Phonology in Language Comprehension and Production: Differences and Similarities, ed. Niels O. Schiller and Antje S. Meyer. Berlin: Mouton de Gruyter, pp. 159–208.
Guion, Susan G., James E. Flege, Reiko Akahane-Yamada, and J.C. Pruitt. 2000. “An investigation of current models of second language speech perception: The case of Japanese adults’ perception of English consonants.” Journal of the Acoustical Society of America 107: 2711–2724.
Güldemann, Tom. 2007. Clicks, Genetics and “Proto-World” from a Linguistic Perspective. University of Leipzig Papers on Africa, Languages and Literatures 29. Leipzig: University of Leipzig.
Gunnink, Hilde. 2014. “The grammatical structure of Sowetan tsotsitaal.” Southern African Linguistics and Applied Language Studies 32: 161–171.
Guastavino, Catherine. 2007.
“Categorization of environmental sounds.” Canadian Journal of Experimental Psychology 61: 54–63.
Gygi, Brian, Gary R. Kidd, and Charles S. Watson. 2007. “Similarity and categorization of environmental sounds.” Perception & Psychophysics 69: 839–855.
Hafter, E.R., and R.H. Dye. 1983. “Detection of interaural differences of time in trains of high-frequency clicks as a function of interclick interval and number.” Journal of the Acoustical Society of America 73: 644–651.
Hale, Kenneth. 1973. “Deep-surface canonical disparities in relation to analysis and change: An Australian example.” Current Trends in Linguistics, vol. II (Diachronic, Areal and Typological Linguistics), ed. T.A. Sebeok. The Hague: Mouton, pp. 401–458.
Hale, Kenneth, and David Nash. 1997. “Damin and Lardil phonotactics.” Boundary Rider: Essays in Honour of Geoffrey O’Grady, ed. Darrell Tryon and Michael Walsh.
Pacific Linguistics C-136. Canberra: Research School of Pacific and Asian Studies, Australian National University, pp. 247–259.
Hallé, Pierre A., and Catherine T. Best. 2007. “Dental-to-velar perceptual assimilation: A cross-linguistic study of the perception of dental stop+/l/ clusters.” Journal of the Acoustical Society of America 121: 2899–2914.
Hallé, Pierre A., Catherine T. Best, and A. Levitt. 1999. “Phonetic versus phonological influences on French listeners’ perception of American English approximants.” Journal of Phonetics 27: 281–306.
Hartmann, William Morris, and Brad Rakerd. 1993. “Auditory spectral discrimination and the localization of clicks in the sagittal plane.” Journal of the Acoustical Society of America 94: 2083–2092.
Hayes, Bruce, and Donca Steriade. 2004. “Introduction: The phonetic bases of phonological markedness.” Phonetically-based Phonology, ed. Bruce Hayes, Robert Kirchner and Donca Steriade. Cambridge: Cambridge University Press, pp. 1–32.
Hickok, Gregory, and David Poeppel. 2015. “Neural basis of speech perception.” The Human Auditory System: Fundamental Organization and Clinical Disorders, ed. Gastone G. Celesia and Gregory Hickok. Edinburgh: Elsevier, pp. 149–160.
Howard, Mary F., and David Poeppel. 2009. “Hemispheric asymmetry in mid and long latency neuromagnetic responses to single clicks.” Hearing Research 257: 41–52.
Hume, Elizabeth, and Keith Johnson. 2001. “A model of the interplay of speech perception and phonology.” The Role of Speech Perception in Phonology, ed. Elizabeth Hume and Keith Johnson. New York: Academic Press, pp. 3–26.
Joanisse, Marc F., and Joseph S. Gati. 2003. “Overlapping neural regions for processing rapid temporal cues in speech and non-speech signals.” NeuroImage 19: 64–79.
Kingston, John, and Takahito Shinya. 2003. “Markedness asymmetries in place perception in consonant clusters.” Proceedings of the International Congress of Phonetic Sciences 15: 399–402.
Khin Khin, Aye. 2005.
“Bazaar Malay: History, Grammar and Contact.” Ph.D. thesis, Department of English Language and Literature, National University of Singapore, pp. 41–61.
Kuhl, Patricia K., Barbara T. Conboy, Sharon Coffey-Corina, Denise Padden, Maritza Rivera-Gaxiola, and Tobey Nelson. 2008. “Phonetic learning as a pathway to language: new data and native language magnet theory expanded (NLM-e).” Philosophical Transactions of the Royal Society B: Biological Sciences 363: 979–1000.
Kuhl, Patricia K., and Paul Iverson. 1995. “Linguistic experience and the ‘perceptual magnet effect’.” Speech Perception and Linguistic Experience: Issues in Cross-Language Research, ed. Winifred Strange. Timonium, MD: York Press, pp. 121–154.
Kuhl, Patricia K., and Maritza Rivera-Gaxiola. 2008. “Neural substrates of language acquisition.” Annual Review of Neuroscience 31: 511–534.
Ladefoged, Peter. 1964/1968. A Phonetic Study of West African Languages: An Auditory-Instrumental Survey, 2nd ed. Cambridge: Cambridge University Press.
Ladefoged, Peter, and Anthony Traill. 1994. “Clicks and their accompaniments.” Journal of Phonetics 22: 33–64.
Lemaitre, Guillaume, Olivier Houix, Nicolas Misdariis, and Patrick Susini. 2010. “Listener expertise and sound identification influence the categorization of environmental sounds.” Journal of Experimental Psychology: Applied 16: 16–32.
Liberman, Alvin M. 1996. Speech: A Special Code. Cambridge, MA: Bradford Books/MIT Press.
Liberman, Alvin M., and Ignatius G. Mattingly. 1985. “The motor theory of speech perception revised.” Cognition 21: 1–36.
Liberman, Alvin M., and Ignatius G. Mattingly. 1989. “A specialization for speech perception.” Science 243: 489–494.
Liberman, Alvin M., F. S. Cooper, Donald P. Shankweiler, and Michael Studdert-Kennedy. 1967. “Perception of the speech code.” Psychological Review 74: 431–461.
Luffin, Xavier. 2005. Un créole arabe: Le kinubi de Mombasa. Munich: Lincom.
MacDonald, John, and Harry McGurk. 1978. “Visual influences on speech perception processes.” Perception & Psychophysics 24: 253–257.
Maddieson, Ian. 1984. Patterns of Sounds. Cambridge: Cambridge University Press.
Marchal, Alain. 1987. “Des clics en français?” Phonetica 44: 30–37.
Marquard, Carina, Oliver Niebuhr, and Alena Witzlack-Makarevich. 2015. “Phonetic reduction of clicks – Evidence from Nǀuu.” Proceedings: International Congress of Phonetic Sciences 18: n.p. Glasgow.
May, Lillian, and Janet F. Werker. 2014. “Can a click be a word? Infants’ learning of nonnative words.” Infancy 19: 281–300.
McGettigan, Carolyn, and Sophie K. Scott. 2012. “Cortical asymmetries in speech perception: What’s wrong, what’s right and what’s left?” Trends in Cognitive Sciences 16: 269–276.
McGurk, Harry, and John MacDonald. 1976. “Hearing lips and seeing voices.” Nature 264: 746–748.
Miller-Ockhuizen, Amanda L. 2003. The Phonetics and Phonology of Gutturals: A Case Study from Juǀ’hoansi. New York: Routledge.
Muysken, Pieter. 1997.
“Media Lengua.” Contact Languages: A Wider Perspective, ed. Sarah G. Thomason. Amsterdam: John Benjamins, pp. 365–426.
Ogden, Richard. 2013. “Clicks and percussives in English conversation.” Journal of the International Phonetic Association 43: 299–320.
Ohala, John J. 1993a. “Sound change as nature’s speech perception experiment.” Speech Communication 13: 155–161.
Ohala, John J. 1993b. “The phonetics of sound change.” Historical Linguistics: Problems and Perspectives, ed. Charles Jones. London: Longman, pp. 237–278.
Ohala, John J. 1995. “A probable case of clicks influencing the sound pattern of some European languages.” Phonetica 52: 160–170.
Pakendorf, Brigitte, Hilde Gunnink, Bonny Sands, and Koen Bostoen. 2017. “Prehistoric Bantu-Khoisan language contact: A cross-disciplinary approach.” Language Dynamics and Change 7(1): 1–46.
Pons, Ferran, David Lewkowicz, Salvador Soto-Faraco, and Núria Sebastián-Gallés. 2009. “Narrowing of intersensory speech perception in infancy.” Proceedings of the National Academy of Sciences 106: 10598–10602.
Rosenblum, Lawrence D., and Carol A. Fowler. 1991. “An audio-visual investigation of the loudness/effort effect for speech and non-speech events.” Journal of Experimental Psychology: Human Perception and Performance 17: 976–985.
Saldaña, Helena M., and Lawrence D. Rosenblum. 1993. “Visual influences on auditory pluck and bow judgments.” Perception & Psychophysics 54: 406–416.
Sands, Bonny. 2007. “The contribution of language documentation to historical phonology.” Proceedings of Conference on Language Documentation and Linguistic Theory (75 years of Linguistics at SOAS, 5 years of the Endangered Languages Project, 7–8 December 2007, School of Oriental and African Studies, University of London), ed. Peter K. Austin, Oliver Bond and David Nathan. London: Hans Rausing Endangered Languages Project & Department of Linguistics, School of Oriental and African Studies, pp. 209–219.
Sands, Bonny, and Tom Güldemann. 2009. “What click languages can and can’t tell us about language origins.” The Cradle of Language, ed. Rudolf Botha and Chris Knight. Studies in the Evolution of Language 12. Oxford: Oxford University Press, pp. 204–218.
Sands, Bonny, and Hilde Gunnink. 2019. “Clicks on the fringes of the Kalahari Basin Area.” Theory and Description in African Linguistics: Selected Papers from the 47th Annual Conference on African Linguistics, ed. Emily Clem, Peter Jenks and Hannah Sande. Berlin: Language Science Press, pp. 703–724.
Schwartz, Joyce, and Paula Tallal. 1980.
“Rate of acoustic change may underlie hemispheric specialization for speech perception.” Science 207(4437): 1380–1381.
Scott, Sophie K., and Carolyn McGettigan. 2013. “Do temporal processes underlie left hemisphere dominance in speech perception?” Brain and Language 127: 36–45.
Shankweiler, Donald, and Michael Studdert-Kennedy. 1967. “Identification of consonants and vowels presented to left and right ears.” The Quarterly Journal of Experimental Psychology 19: 59–63.
Skipper, J.I., V. van Wassenhove, H.C. Nusbaum, and S.L. Small. 2007. “Hearing lips and seeing voices: How cortical areas supporting speech production mediate audiovisual speech perception.” Cerebral Cortex 17: 2387–2399.
Steriade, Donca. 2001. “Directional asymmetries in place assimilation: A perceptual account.” The Role of Speech Perception in Phonology, ed. Elizabeth Hume and Keith Johnson. New York: Academic Press, pp. 219–250.
Studdert-Kennedy, Michael, and Louis Goldstein. 2003. “Launching language: The gestural origin of discrete infinity.” Language Evolution, ed. Morton H. Christiansen and Simon Kirby. Oxford, UK: Oxford University Press, pp. 235–254. Traill, Anthony. 1994. “The perception of clicks in ǃXóõ.” Journal of African Languages and Linguistics 15: 161–174. Traill, Anthony. 1997. “Linguistic phonetic features for clicks: Articulatory, acoustic and perceptual evidence.” African Linguistics at the Crossroads: Papers from Kwaluseni, ed. Robert K. Herbert. Cologne: Köppe, pp. 99–117. Traill, Anthony. 2002a. “The perception of clicks in ǃXóõ.” Mouton Classics: From Syntax to Cognition. From Phonology to Text. Berlin: Mouton de Gruyter, pp. 59–74. Traill, Anthony. 2002b. “The Khoisan languages.” Language in South Africa, ed. Rajend Mesthrie. Cambridge: Cambridge University Press, pp. 27–49. Traill, Anthony, and Rainer Vossen. 1997. “Sound change in the Khoisan languages: New data on click loss and click replacement.” Journal of African Languages and Linguistics 18: 21–56. Tyler, Michael D., Catherine T. Best, Louis Goldstein, and Mark Antoniou. 2014. “Contributions of native-language tuning and articulatory organs to infants’ discrimination of native and nonnative consonant contrasts.” Developmental Psychobiology 56: 210–227. Werker, Janet F. 1989. “Becoming a native listener.” American Psychologist 77: 54–59. Werker, Janet F., John H. V. Gilbert, Keith Humphrey, and Richard C. Tees. 1981. “Developmental aspects of cross-language speech perception.” Child Development 52(1): 348–355. Wightman, Frederic L., and Doris J. Kistler. 1997. “Sound localization in the presence of multiple distracters.” Journal of the Acoustical Society of America 101: 3105. Wilmsen, Edwin N., and Rainer Vossen. 1990. “Labour, language and power in the construction of ethnicity in Botswana.” Critique of Anthropology 10: 7–37. Woodward, Amanda L., and Karen L. Hoyne. 1999. 
“Infants’ learning about words and sounds in relation to objects.” Child Development 70: 65–77. Wright, Melissa. 2011. “On clicks in English talk-in-interaction.” Journal of the International Phonetic Association 41: 207–229. Wright, Richard. 2001. “Perceptual cues in contrast maintenance.” The Role of Speech Perception in Phonology, ed. Elizabeth Hume and Keith Johnson. New York: Academic Press, pp. 251–277. Yost, W.A., F.L. Wightman, and David M. Green. 1971. “Lateralization of filtered clicks.” Journal of the Acoustical Society of America 50: 1526–1532.

Chapter 6

Studying Clicks Using Real-Time MRI

Michael Proctor, Yinghua Zhu, Adam Lammert, Asterios Toutios, Bonny Sands, and Shrikanth Narayanan

1 Introduction

The phonetic properties of clicks have been investigated in a wide range of languages, using a variety of techniques. Aerodynamic and acoustic properties of lingual consonants¹ have been examined in Zulu (Doke 1923a), Naro (Kagaya 1978), Khoekhoe (Ladefoged and Traill 1984), ǃXóõ (Traill 1985; Traill 1991; Ladefoged and Traill 1994), Xhosa (Sands 1991), Gǀui (Nakagawa 1996; Nakagawa 2006), Hadza (Sands et al. 1996), Nǀuu (Miller et al. 2007a), and Yeyi (Fulop et al. 2003). X-ray (Doke 1923b; Doke 1925; Traill 1985), palatography and linguography (Doke 1923a; Doke 1925; Beach 1938; Traill 1985; Nakagawa 2006; Sands et al. 1996; Sands et al. 2007), and more recently, high-speed ultrasound studies (Miller et al. 2007b; Miller et al. 2009b; Miller et al. 2009a, etc.), have provided rich detail about the lingual articulation of clicks in these and other languages.

Collectively, these data have provided a good understanding of the principal mechanisms of click consonant production; however, because each of these experimental methods focuses on specific aspects of lingual articulation, there are other aspects of click production that are not as well understood. Because ultrasound and palatography do not provide information about the whole of the soft palate, it is not clear whether the dorsal constrictions of some clicks are best classified as velar or uvular, and to what extent – if at all – posterior place of articulation is contrastive in clicks. Neither ultrasound nor palatography provides any information about the pharynx or tongue root, so we have little data on the role of these articulators in click production.
More generally, it is not yet clear if lingual ingressive consonants in all languages can be adequately described in terms of a small set of basic underlying clicks and combining accompaniments, or if a complete description of these sounds requires reference to other details of articulation of the tongue body, tongue root, jaw, larynx, and velum. Cross-linguistic surveys (e.g. Köhler et al. 1988; Ladefoged and Maddieson 1996; Vossen 1997; Maddieson 2003) propose that there are five fundamental click contrasts – bilabial, dental, alveolar, palatal, and lateral – yet it is not clear whether this is an adequate inventory, or if clicks sharing the same anterior constriction location are produced in exactly the same way across languages. Consonants represented as /ǃ/, for example, have been variously described as alveolar (Sands et al. 1996; Vossen 1997; Ladefoged and Maddieson 1996), postalveolar (Maddieson 2003), palatal (Lanham 1964; Snyman 1975; Herbert 1990; Roux 2007), palato-alveolar (Doke 1954), alveo-palatal (Doke and Mofokeng 1957), apico-lamino-palatal (Finlayson et al. 1991), and retroflex (Doke 1926; Cruttenden 1992). Some of this descriptive variation clearly reflects differing analyses of the same sounds; in other cases, different sounds, individual speaker variation, and dialect differences are being described. For example, there is considerable variation in the realization of lingual ingressive consonants in Bantu languages which use only a single click place contrast, even when these are considered to be allophones of the same phonological unit (Ziervogel 1952; Doke and Mofokeng 1957; Maddieson 2003).

Our incomplete knowledge of the phonetic properties of clicks has important implications for their phonological representation – a topic of ongoing debate (Köhler et al. 1988; Nakagawa 2006; Brugman 2009; Miller et al. 2009a; Miller 2011; Bradfield 2014). Views differ over whether clicks are best described as a small set of underlying segments which combine in clusters (Traill 1985; Traill 1993; Nakagawa 2006), or as a larger inventory of unitary segments which resist decomposition (Beach 1938; Snyman 1970; Ladefoged and Traill 1994; Miller-Ockhuizen 2003).

¹ ‘Lingual’ refers to the airstream mechanism (cf. ‘glottalic’ and ‘pulmonic’), and is preferred by some authors (Taljaard and Snyman 1989) over the terms ‘velaric’ (Ladefoged and Traill 1984; Ladefoged and Traill 1994), or ‘linguo-velaric’, for reasons outlined in Miller et al. (2007b) and Miller (2011).

© Koninklijke Brill NV, Leiden, 2020 | doi:10.1163/9789004424357_007
Because articulatory data on clicks has largely been restricted to the region of the vocal tract associated with the primary mechanisms of sound production, it is not surprising that fundamental questions about these consonants remain. A comprehensive understanding of their phonetic and phonological properties will require more data from more languages: spatially-detailed and temporally-rich information about the configuration of the entire vocal tract. While data of this nature has previously been provided by x-ray studies (Doke 1923b; Traill 1985), this modality is no longer generally considered viable because of concerns about the risks of exposing subjects to ionizing radiation.

In this chapter, we report results from some initial investigations into the use of real-time Magnetic Resonance Imaging (rtMRI) to study clicks. The primary goal of this work is to investigate the utility of rtMRI as a method for studying lingual ingressive consonant production. We present data acquired from individual speakers of two different languages that make use of click consonants: Khoekhoegowab and siSwati. We compare these consonants with data acquired from a beatbox artist who makes paralinguistic use of clicks in his vocal percussion repertoire. In each case study, we discuss the advantages and disadvantages of rtMRI compared to other phonetic methods used to study clicks, and report some findings that will inform our understanding of the phonology of the languages under investigation, and our knowledge of click production in general.

2 Real-Time MRI of Speech

All data presented in this chapter were acquired using a real-time Magnetic Resonance Imaging protocol developed specifically for the dynamic study of the upper airway during speech production (Narayanan et al. 2004; Bresch et al. 2008). Video and audio recordings of the target languages were acquired from the participants during a series of scan sessions conducted at LA County Hospital. Stimuli were presented to the participants in the orthography of the study languages as they lay supine in an MRI scanner bore. Participants’ vocal tracts were imaged as they produced spontaneous speech, short passages of prose, and lists of words targeting the consonants of interest. Subjects were paid for their participation.

2.1 Image and Audio Acquisition

Subjects’ upper airways were imaged in the midsagittal plane on a GE Signa Excite HD 1.5T scanner with gradients capable of 40 mT/m amplitude and 150 mT/m/ms slew rate, using a custom 4-channel head-and-neck receiver coil. Data from two front channels were used for image reconstruction. Image data were acquired from a 5 mm midsagittal slice centered on the subject’s tongue, extending over a 200 mm × 200 mm field-of-view. The imaging region was positioned to include the subject’s trachea, larynx, velum, hard and soft palates, and the lips and jaw, over the full range of excursion of the articulators during speech. Image data were acquired with a fast gradient echo sequence (TR = 6.028 ms) and an interleaved spiral readout: a new complete image was acquired every 54 ms, using information from 9 partial acquisitions captured every 6.028 ms. Spatial resolution in the sagittal plane was 68 × 68 pixels (2.9 × 2.9 mm²). Further details of image acquisition and reconstruction are provided in Bresch et al. (2008).

Audio was simultaneously recorded at a sampling frequency of 20 kHz inside the MRI scanner while the subject was imaged, using a custom fiber-optic microphone system. Audio recordings were subsequently noise-canceled and reintegrated with the reconstructed MRI image sequences (Bresch et al. 2006). The resulting video data provide dynamic visualization, with synchronous audio, of the informant’s entire midsagittal vocal tract, from the upper trachea to the lips, including the oropharynx, velum and nasal cavity. Teeth do not image in MRI, so the location of dentition must be inferred when in contact with soft tissue. Scan planes were located midsagittally through the larynx, so that information about the state of the vocal folds and surrounding structures could also be inferred from pixel intensity in the laryngeal region.

Figure 6.1 rtMRI frame showing midsagittal articulation of the upper airway Notes: Image captured at maximum constriction of the initial dorsal stop in the Nama word ɡe /ɡè/ ‘we (masc.)’. Left: original MR image resolution (68 × 68 px); Right: interpolated image frame (340 × 340 px) with superimposed anatomical labels. Teeth do not image in MRI. An arc-shaped artifact caused by co-planar cardiac activity partially distorts the image through the upper pharynx and tongue root. An additional artifact caused by dental fillings affects resolution of the lower lip, tongue tip, and sublingual cavity.

2.2 Data Analysis

Images were up-sampled by a factor of five, using bicubic interpolation, from the original image acquisition resolution of 68 × 68 pixels, to enhance resolution of vocal tract structures and to facilitate estimation of distances between articulators (Fig. 6.1). Image sequences were reconstructed post-acquisition using a sliding-window technique, to produce oversampled high-speed video with an effective rate of 165.9 frames per second (one frame every 6.028 ms, constructed from 9 partial acquisitions over 54 ms), so that images could be better time-aligned with acoustic events of interest. Midsagittal palatal outlines for each speaker were derived from mean images calculated across multiple frames, and superimposed onto image sequences to improve resolution of passive structures around the hard palate, where the relative lack of soft tissue provides a weaker and more variable signal.

Start and end times of each utterance were located by examining time-aligned audio, spectral, and video data in a custom inspection and analysis tool (Proctor et al. 2010a; Narayanan et al. 2014). A typical image sequence, aligned with companion acoustic data, is illustrated in Figure 6.2. Every 10th frame, spaced at 60.2 ms intervals, selected from the high-speed video reconstruction of the rtMRI data is shown, with corresponding landmarks indicated on the speech waveform. Cardiac artifacts – distortions which can arise when the imaging plane intersects the heart – can be seen in Figure 6.2 and other images in these datasets. Because this noise source is localized, it does not usually affect the analysis, although it can compromise study of nasalization when it distorts the region of the image corresponding to the velum and nasopharynx.

3 rtMRI Insights into Nama Clicks

Nama is a variety of Khoekhoegowab, a Khoe language primarily spoken in Namibia (ISO 639-3 code: naq), whose other varieties include the ethnolects Damara and Haiǁom (Haacke et al. 1997; Güldemann and Vossen 2000). Twenty of the thirty-four Nama consonant phonemes use a lingual airstream: each of four different clicks may be produced with five contrastive manners of articulation (Brugman 2009). The four click types are described as dental, alveolar, palatal and lateral (Beach 1938; Westphal 1971; Ladefoged and Traill 1984; Ladefoged and Traill 1994; Güldemann 2001; Haacke and Eiseb 2002), but the phonetic and phonological characterization of the other contrasts is debated.
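The count of twenty click phonemes follows from crossing the four click types with the five contrastive manners. The short sketch below enumerates those combinations using the orthographic spellings given in Table 6.1; the dictionary and function names are illustrative choices, not part of the original study.

```python
from itertools import product

# Click letters and accompaniment spellings as listed in Table 6.1.
# The variable and function names here are hypothetical, chosen for illustration.
CLICK_TYPES = {"dental": "ǀ", "alveolar": "ǃ", "palatal": "ǂ", "lateral": "ǁ"}
ACCOMPANIMENTS = {
    "glottal closure": "",          # written with the bare click letter
    "voiceless unaspirated": "ɡ",
    "voiceless aspirated": "kh",
    "delayed aspiration": "h",
    "voiced nasal": "n",
}


def click_inventory():
    """Map every (click type, accompaniment) pair to its orthographic form."""
    return {
        (place, manner): letter + suffix
        for (place, letter), (manner, suffix) in product(
            CLICK_TYPES.items(), ACCOMPANIMENTS.items()
        )
    }


inventory = click_inventory()
print(len(inventory))                        # number of click phonemes
print(inventory[("lateral", "voiced nasal")])
```

The dictionary keys make the two dimensions of the inventory explicit, which is convenient when word lists need to be grouped by click type or by accompaniment.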
Ladefoged and Traill (1984) and Ladefoged and Traill (1994) describe a system of basic clicks modified by different combinations of laryngeal and nasal accompaniments (Table 6.1). Nama lingual ingressive consonants have also been analyzed as consonant clusters (Traill 1993; Güldemann 2001), unitary segments differentiated by contour airstreams (Miller 2011), and separate series of stops, affricates and nasals (Brugman 2009).

3.1 Informant and Corpora

The study participant was a 35-year-old Namibian male trilingual speaker of Afrikaans, Nama, and English. Born and raised in Windhoek, he has lived most of his life in Namibia, and had been living in the United States for a year before participating in this study. His mother speaks Nama as her first language and his father, Afrikaans. The informant lay supine in an MRI scanner and read out words presented in Nama orthography, with accompanying Afrikaans and/or English translations. A list of the subset of words analyzed for this study is provided in Appendix A; most items were elicited twice.

Figure 6.2 Time-aligned audio and video data acquired during Khoekhoe lateral click production Notes: Eight image frames (top two rows) showing midsagittal articulation at key stages of the consonant-vowel sequence beginning the word ǁɡɑm /ǁɑ́m̀/ ‘talk’. Broken vertical lines superimposed on the waveform of the companion acoustic recording indicate the location in time of each frame. A decaying series of echoes spaced at 53 ms intervals (an artefact of the noise reduction processing) can be seen in the waveform and spectrogram (bottom), following the click release burst at 0.31 s.

A comprehensive analysis of these data is beyond the scope of this chapter. Initial findings have been presented in Proctor et al. (2016); here we highlight some of the most important ways in which real-time MRI can inform our understanding of the phonetic properties of Nama clicks.

Table 6.1 Nama click consonants

                        Dental   Alveolar   Palatal   Lateral
Glottal closure         ǀ        ǃ          ǂ         ǁ
Voiceless unaspirated   ǀɡ       ǃɡ         ǂɡ        ǁɡ
Voiceless aspirated     ǀkh      ǃkh        ǂkh       ǁkh
Delayed aspiration      ǀh       ǃh         ǂh        ǁh
Voiced nasal            ǀn       ǃn         ǂn        ǁn

Notes: Orthographic representations (Curriculum Committee for Khoekhoegowab 2003) and phonetic descriptions (Ladefoged and Traill 1984; Ladefoged and Traill 1994) of the twenty click phonemes of Nama.

3.2 Tongue Shaping and Movement in Nama Clicks

Real-time MRI reveals the shape of the whole tongue, and the way that the tongue is coordinated with the jaw, at each point of an utterance. By examining tongue shape at comparable stages in minimally-different words, we can see how the four basic types of lingual ingressive consonant are formed and released by a speaker of Nama. Midsagittal tongue postures captured at the acoustic onset, and 100 ms into the release of word-initial clicks are illustrated in Figure 6.3. The data reveal four characteristic midsagittal tongue postures and patterns of release associated with each of the Nama clicks. Dental and alveolar clicks are produced with a more apical coronal gesture, compared to the palatal and lateral clicks, which involve a more extended constriction of the tongue blade across the alveolar ridge and back into the hard palate. The alveolar click is produced by this speaker with a particularly apical, slightly retroflexed tongue posture that is retained throughout release and into the following vowel.

Cavity rarefaction mechanisms and release kinematics associated with each type of click are better understood by tracking changes in lingual posture in greater temporal and spatial detail. Image frames capturing midsagittal vocal tract configurations at the moment of click release, and at 18 millisecond intervals thereafter (every third frame in the sequence) were examined in word-initial clicks produced before the same long, low vowel: ǀɑ̄ /ᵑǀ̥Ɂɑ́ɑ̀/ ‘sharp’, ǃɑ̄ /ᵑǃ̥Ɂɑ̀ɑ̋/ ‘hang’, ǂɑ̄ /ᵑǂ̥Ɂɑ̋ɑ́/ ‘slaughter’, and ǁɑ̄ /ᵑǁ̥Ɂɑ̋ɑ́/ ‘wash’. Vocal tract boundaries were located using the method described in Proctor et al. (2010a), and are superimposed in Fig. 6.4 to show the tongue trajectories associated with each type of release.²

² A palate and pharynx trace captured at a single point in time is superimposed on each image to locate the tongue edges with respect to the passive structures. The tongue root was traced beneath the epiglottis in each frame. Soft palate traces are extracted from the first frame in each sequence; the exact location of the velum and uvula varies from frame to frame.

Figure 6.3 Comparison of word-initial clicks produced before mid-back vowels Notes: Top row: midsagittal lingual posture at acoustic onset of click; Bottom row: lingual posture 100 ms after click release. L-to-R: dental: ǀom /ᵑǀ̥Ɂòm̀/ ‘breathe’, alveolar: ǃom /ᵑǃ̥Ɂóm̀/ ‘remove thorn’, palatal: ǂom /ᵑǂ̥Ɂóm̀/ ‘sew’, lateral: ǁom /ᵑǁ̥Ɂòm̋/ ‘sleep’.

The data reveal in more detail the apical nature of the coronal gestures which characterize dental and alveolar click release (Fig. 6.4, top row), and how these differ from the more laminal anterior constrictions used in the palatal and lateral clicks (Fig. 6.4, bottom row). The retroflexed lingual posture of /ǃ/ is even more apparent in this vowel context. The anterior place of articulation of the alveolar click is quite retracted, consistent with some previous descriptions of Khoekhoe /ǃ/ as postalveolar (Miller et al. 2007b), and alveolar/postalveolar (Beach 1938: 81).

Greater differences in dorsal articulation are also apparent in the Nama clicks produced before a low vowel. The posterior constriction location at formation and during the initial release phase of the dental click is clearly characterized as velar in these tokens (Fig. 6.4, top left); in the other three clicks produced in this vowel context, part of the dorsal seal is created and released against some part of the uvula. Across all vowel contexts, /ǀ/ involves a more advanced posterior constriction location than that observed in the other three clicks produced by this speaker.

Figure 6.4 Evolution of midsagittal lingual posture in four contrastive Nama click releases Notes: Top left: ǀɑ̄ /ᵑǀ̥Ɂɑ́ɑ̀/ ‘sharp’ (dental); Top right: ǃɑ̄ /ᵑǃ̥Ɂɑ̀ɑ̋/ ‘hang’ (alveolar); Bottom left: ǂɑ̄ /ᵑǂ̥Ɂɑ̋ɑ́/ ‘slaughter’ (palatal); Bottom right: ǁɑ̄ /ᵑǁ̥Ɂɑ̋ɑ́/ ‘wash’ (lateral). In each sequence, the first (darkest) outline traces the tongue edge at the moment of click release. Subsequent outlines show tongue posture at 18 ms intervals, as the tongue moves toward the following low vowel target.

These data also show how cavity rarefaction is achieved using different parts of the tongue, and with different lingual kinematics. Dental and alveolar click release involves rapid lowering of the tongue blade and the front of the tongue body in the region immediately behind the anterior constriction. In palatal and lateral clicks, the lingual cavity is rarefied primarily by lowering and retracting the tongue body. In all four clicks, lingual articulation is accompanied by jaw lowering and retraction during release (Proctor et al. 2016).

3.3 Discussion

Real-time MRI provides rich information about the location and type of tongue tip and tongue body constriction used to produce each of the four Nama click types, the way that the lingual cavity is rarefied, and lingual motion involved in click release. Not all of these details of production can be obtained from other experimental methods that provide more restricted information about the tongue and the way it interacts with the passive articulators. For this speaker, /ǀ/ and /ǃ/ are produced with narrow apical coronal gestures in the midsagittal plane, while /ǂ/ and /ǁ/ involve laminal tongue tip constrictions that contact a much wider region of the alveolar ridge and hard palate. The posterior constriction of the dental click produced before the low vowel extends forward into the velar region of the soft palate, but in all four clicks a large part of the tongue body remains constricted against the uvula (Fig. 6.3, top row), increasingly so during release.

Ultrasound provides similar information about the shape of the upper edge of the tongue around the click cavity (see, for example, Miller et al. 2007b); however, because rtMRI reveals the location of the entire mass of the tongue in the midsagittal plane, it allows us to better understand how click constrictions are formed and released, and how parts of the tongue below the oral airway are articulated and coordinated with other activity in the vocal tract. Figs. 6.3 and 6.4 reveal that different parts of the tongue are lowered – in different ways and with different timing – during cavity rarefaction. The dental clicks examined here, for example, are released with more tongue body retraction than alveolar clicks, which are formed with a more retracted tongue body posture to begin with, and are therefore characterized primarily by tongue lowering. The apex of the tongue dorsum in the dental click produced before a low vowel, for example (Fig. 6.4, top left) is located 10 mm further forward along the soft palate compared to the dorsal apex in the alveolar click, and then retracts 11 mm towards the pharynx during release, while the dorsum retracts less than 5 mm during /ǃ/ release. The lowest point of the lingual cavity retracts over 15 mm during the same interval in /ǀ/, compared to 4 mm retraction in the alveolar click release. Asymmetries can also be seen in the trajectories of the anterior and posterior constrictions: this speaker’s palatal clicks are rarefied and released primarily through dorsal lowering and retraction, while the tongue blade remains constricted against the hard palate (Fig. 6.4, bottom left: 18…54 ms).

Greater pharyngeal apertures are observed for this speaker during /ǀ/ and /ǃ/ formation and release, compared to /ǂ/ and /ǁ/. In clicks produced before a low vowel, for example, the tongue root (measured at the base of the epiglottis) is approximately 11 mm more advanced at /ǃ/ release (Fig. 6.4, top row, early frames) than at the same point during /ǂ/ release (Fig. 6.4, bottom row, early frames). These data suggest that dental and alveolar clicks may be produced in Khoekhoegowab with a more advanced tongue root than palatal and lateral clicks.
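Distances such as these depend on the mapping between image pixels and millimetres. The sketch below shows one way that conversion could be computed from the values reported in Section 2 (200 mm field-of-view, 68 × 68 acquired pixels, five-fold up-sampling). The chapter does not describe its measurement procedure at this level of detail, so the function names and the example landmark coordinates are hypothetical.

```python
import math

FOV_MM = 200.0        # field-of-view along each image axis (Section 2.1)
ACQUIRED_PX = 68      # acquired pixels along each image axis (Section 2.1)
UPSAMPLE_FACTOR = 5   # interpolation factor (Section 2.2): 68 px -> 340 px


def mm_per_pixel(fov_mm=FOV_MM, acquired_px=ACQUIRED_PX, upsample=UPSAMPLE_FACTOR):
    """Width of one pixel in millimetres after up-sampling."""
    return fov_mm / (acquired_px * upsample)


def displacement_mm(start_px, end_px, scale=None):
    """Return (dx, dy, distance) in mm between two (x, y) pixel positions."""
    if scale is None:
        scale = mm_per_pixel()
    dx = (end_px[0] - start_px[0]) * scale
    dy = (end_px[1] - start_px[1]) * scale
    return dx, dy, math.hypot(dx, dy)


# Hypothetical landmark positions in the 340 x 340 interpolated image:
# a point that moves 17 pixels horizontally between two frames.
dx, dy, dist = displacement_mm((120, 150), (137, 150))
print(round(mm_per_pixel(1.0 * FOV_MM, ACQUIRED_PX, 1), 2))  # acquired pixel size
print(round(mm_per_pixel(), 3))                              # interpolated pixel size
print(round(dist, 1))                                        # displacement in mm
```

Under these assumptions an acquired pixel spans about 2.9 mm, matching the reported spatial resolution, and an interpolated pixel about 0.59 mm. Interpolation refines where a boundary is drawn but does not add information beyond the acquired resolution, so millimetre values obtained this way are best read as estimates.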


Looking beyond the tongue, these data illustrate another major advantage of rtMRI for studying non-pulmonic consonants: the global midsagittal view of the vocal tract reveals how other articulators are coordinated with lingual activity. The images in Fig. 6.3 show the state of the velum at any point in time, which remains lowered throughout the entire period of click production in these words. Although it is not always possible to determine the exact state of the glottis from rtMRI data, the images in Fig. 6.3 also show that the glottis is at least partially constricted during click release (cf. the completely open glottises shown in Fig. 6.13). On the basis of these observations, the rtMRI data provide support for previous characterizations of the ‘plain’ Nama click series as voiceless nasal lingual ingressive consonants with a glottalized component (Brugman 2009; Miller 2011); however, in palatal and lateral clicks, the posterior constriction location may be better characterized as uvular: /ᵑǀ̥Ɂ/–/ᵑǃ̥Ɂ/–/ᶰǂ̥Ɂ/–/ᶰǁ̥Ɂ/.

4 rtMRI Insights into siSwati Clicks

SiSwati is a Southern Bantu language, primarily spoken in Swaziland and South Africa. siSwati (ISO 639-3 code: ssw) is a Tekela variety of the Nguni language group (S43: Guthrie 1971), along with Ndebele, Phuthi, and Lala (Gowlett 2003). There is no consensus in the literature over the number and nature of click contrasts in siSwati. Most accounts argue for a single click place, most commonly characterized as dental (Nussbaum 1969; Rycroft 1981; Taljaard et al. 1991; Kockaert 1996), but also described as alveolar (Gowlett 2003). Doke (1954), and Chen and Malambe (1998) list /ǀ/ and /ǃ/ as distinct phonemes. The picture is complicated by multilingualism and the influence of loan words, as many neighbouring languages use more extensive click inventories. Even in the native lexicon, siSwati clicks appear to be realized with considerable variation in place (Ziervogel 1952; Nussbaum 1969; Corum 1980).
There is also disagreement over the laryngeal and nasal features that differentiate clicks in siSwati (Ziervogel 1952; Doke 1954; Nussbaum 1969; Kockaert 1996), although again, this may reflect different approaches to transcription and phonological description. One reason for the incomplete understanding of these segments is a lack of phonetic data. Most descriptions of the sound system of siSwati appear to rely solely on impressionistic transcription. To the best of our knowledge, no instrumental studies of siSwati click consonants have previously been published. Here, we present some results from an initial rtMRI investigation into the articulatory characteristics of clicks produced by a single speaker of siSwati, to shed more light on some of these issues.


4.1 Informant and Corpora

The study participant was a 47-year-old female, born and raised in Swaziland to first language speakers of siSwati. The informant also speaks English and Xitsonga but identifies siSwati as her primary language. She has lived most of her life in Swaziland, and has also lived in South Africa and Arizona as an adult. The participant reported normal hearing abilities and no speech pathologies. She has three missing teeth: left and right lower second molars and upper right second pre-molar. The participant read out short passages of prose and lists of words targeting siSwati phonological contrasts of interest. In addition to the study corpus, some spontaneous speech was recorded. Twenty-one recordings were made in total, each lasting between 19 and 34 seconds. The subset of the corpus analyzed in this study is listed in Appendix B.

4.2 Articulatory Characterization of siSwati Clicks

A real-time MRI sequence of a ‘plain’ siSwati word-initial click produced before a high-back vowel is illustrated in Figure 6.5, with time-aligned acoustic data. Every 5th frame is shown, spaced at 30 ms intervals, with corresponding landmarks indicated on the speech waveform. SiSwati click production before low and high front vowels is illustrated in the frame sequences in Figures 6.6 and 6.7, for comparison. Only minor influences of vowel context on anterior constriction formation are seen in these data. In each vowel context, the target of the initial tongue-tip gesture is dental (Fig. 6.5, frames 1–6; Fig. 6.6, frames 78–84), and the anterior seal is formed with an apico-laminal constriction extending from the back of the upper teeth over the entire alveolar ridge. As observed in the Nama dental clicks, the laminal component of the tongue tip gesture becomes more prominent at the point of release of the siSwati click before a low vowel (Fig. 6.6: 90–102).
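Frame numbers in these figures can be related to time using the acquisition parameters given in Section 2: one reconstructed frame per repetition time (TR = 6.028 ms), with each complete image drawing on 9 partial acquisitions. The sketch below works through that arithmetic; the function names are illustrative, and the frame numbers in the usage lines are taken from the Figure 6.6 notes.

```python
TR_MS = 6.028     # repetition time per partial acquisition (Section 2.1)
INTERLEAVES = 9   # partial acquisitions combined into one complete image


def full_image_ms(tr_ms=TR_MS, interleaves=INTERLEAVES):
    """Time spanned by the partial acquisitions behind one complete image."""
    return tr_ms * interleaves


def effective_fps(tr_ms=TR_MS):
    """Frame rate of the sliding-window reconstruction (one frame per TR)."""
    return 1000.0 / tr_ms


def frame_offset_ms(frame, reference_frame, tr_ms=TR_MS):
    """Signed time of `frame` relative to `reference_frame`, in milliseconds."""
    return (frame - reference_frame) * tr_ms


print(round(full_image_ms(), 1))            # 54.3 ms per complete image
print(round(effective_fps(), 1))            # 165.9 frames per second
print(round(frame_offset_ms(5, 0), 1))      # 30.1 ms between every 5th frame
print(round(frame_offset_ms(78, 102), 1))   # -144.7 ms: frame 78 vs. click onset at 102
```

The last value is close to the "140 ms before release" given for frame 78 in the Figure 6.6 notes, which suggests the published offsets are rounded. Because each reconstructed frame draws on roughly 54 ms of data, events shorter than that window are smoothed across neighbouring frames even though frames are spaced about 6 ms apart.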
Before the low vowel /ɑ/, the posterior seal is created with a broad dorsal gesture extending across the lower velum and uvula (Fig. 6.6: 90–108). As in Nama /ǀ/, rarefaction appears to be achieved primarily through lowering of the tongue blade and the centre of the tongue in the region immediately behind the anterior constriction, below the center of the hard palate (Fig. 6.6: 102–114). Click release in this vowel context was characterized by a large amount of jaw lowering and retraction (−13 mm vertical, and −13 mm horizontal displacement, from click release to vowel target).

Articulation of the nasalized click in the word ncisha /ᶰǀiʃɑ/ ‘delete it’ is shown in Fig. 6.7. The coarticulatory influence of the high front vowel is evident throughout the sequence: at each stage of production, the tongue body is articulated in a higher, more fronted position, compared to the posture captured at equivalent points in time in Fig. 6.6. The dental place of articulation of the anterior seal remains the same in this vowel context (Fig. 6.7, frames 754–772). The posterior seal is still achieved with a broad dorsal gesture, but with a more advanced closure target. The dorsal constriction extends forward into the velar region and becomes progressively more uvular during rarefaction and release (Fig. 6.7, frames 772–787). Tongue lowering ceases approximately 50 ms after click release, after which the tongue raises and advances towards the vowel target (Fig. 6.7, frames 793–805).

Figure 6.5 siSwati ‘plain’ click production before a high-back vowel Notes: Eight image frames (top two rows) showing midsagittal articulation at key stages of the consonant-vowel sequence beginning the word cula /ǀúlɑ/ ‘knock down’. Broken vertical lines superimposed on the waveform of the companion acoustic recording indicate the location in time of each frame.

Another difference between siSwati clicks in these two vowel contexts is the articulation of the pharynx: before the high front vowel, the same click is produced with an advanced tongue root (evident in the relative displacement of the epiglottis), and greater overall pharyngeal aperture (Fig. 6.7, frames 90–108). Vowel context also influences the kinematics of the jaw and tongue dorsum: click release into a following /i/ vowel is characterized by less dorsal lowering and retraction, and much less jaw motion (−4 mm vertical, and −1 mm horizontal displacement, click release to vowel target) than was observed during click release into a following /ɑ/ vowel.

Figure 6.6 siSwati nasal click production in a low back vowel context: mid-sagittal articulation of the onset consonant in ncata /ᶰǀɑtɑ/ ‘sound made when irritated’ Notes: MRI frames shown at 36 ms intervals. Frame 78: initial posture (140 ms before release); Frames 84–96: coronal closure followed by dorsal constriction; Frame 102: acoustic onset of click; Frame 120: lingual posture at acoustic target of post-consonantal vowel.

Differences in clicks arising from vocalic coarticulatory effects are better observed in a side-by-side comparison; lingual postures of voiceless aspirated siSwati clicks produced before three different vowels are compared in the enlarged image frames juxtaposed in Figure 6.8. Differences in the size and geometry of the midsagittal cavity can be observed. These images also show the influence of vowel context on labial posture: the onset consonant of choba /ǀʰoɓɑ/ ‘crush’ is realized with a smaller labial aperture (20 mm) and more lip protrusion (tongue tip to lips = 6 mm) at the moment of click release, compared to clicks produced before non-rounded vowels (/ǀʰi/: labial aperture = 6 mm; tongue tip to lips = 17 mm).

We can characterize these clicks more accurately by comparing them to other consonants used in siSwati. Tongue postures captured at the articulatory centers of velar stops produced in three different vowel contexts are compared in Fig. 6.9. As expected, the images reveal large differences in dorsal posture


Figure 6.7 siSwati nasal click production in a high front vowel context: mid-sagittal articulation of the onset consonant in ncisha /ᶰǀiʃɑ/ ‘delete it’ Notes: MRI frames shown at 36 ms intervals. Frame 754: initial posture (140 ms before release); Frames 763–781: coronal closure followed by dorsal constriction; Frame 787: acoustic onset of click; Frame 805: lingual posture at acoustic target of post-consonantal vowel.

Figure 6.8 Influence of vowel context on siSwati clicks: Lingual posture at acoustic onset of voiceless aspirated clicks Notes: Left: lichibi /liǀʰiɓi/ ‘pond, puddle’; Centre: chacha /ǀʰɑǀʰɑ/ ‘undo’; Right: choba /ǀʰoɓɑ/ ‘crush’.

due to the coarticulatory influence of the vowels: the apex of the dorsum is 28 mm more advanced in the velar stop in kimi /kʼimi/ [k̟ʔimi], compared to liduku /líd̤ukʼú/ [líd̤uqʔu]. Comparison with the images in Figs. 6.6–6.7 reveals that the clicks use a generally more retracted and extended dorsal constriction: the anterior limit of the posterior click seal is always more retracted than the most advanced velar allophones, and most closely resembles that of the uvular allophones. The dorsal constriction of the click typically appears to extend further back than the equivalent stop constriction, making contact with a greater


Figure 6.9 Place of articulation of siSwati velar stops Notes: Left to right: lingual posture at point of maximum constriction of velar stop in three vowel contexts: kimi /kʼími/ ‘in me’; kabi /kʼɑ́ɓi/ ‘badly, severely’; liduku /líd̤ukʼú/ ‘piece of cloth’.

length of the uvula, because the tongue root appears to be less advanced and the lower pharynx more constricted during lingual ingressive consonant production; further investigation will be required to see if this observation holds across a wider range of utterances.

5 Paralinguistic Click Production

Vocal percussion – the use of the vocal organs to generate percussion sounds, or imitate percussion instruments – has a long history in many cultures and musical performance genres, such as konnakol recitation in southern Indian musical traditions (Nelson 2008). Recently, vocal percussion has come to find new expression in hip-hop performance, where ‘human beatboxers’ imitate synthetic drum sounds, typically to create an accompanying percussion track for another rapper. Many vocal percussion sounds can be described using the International Phonetic Alphabet because they are produced in the same ways, and share many phonetic properties with sounds exploited phonologically in languages (Lederer 2005; Proctor et al. 2013). Of particular interest are the clicks used by beatbox artists to imitate percussion elements such as wood blocks, snare drums, and rimshots (Stowell and Plumbley 2008). These data present an interesting case study for two reasons: the clicks are produced with musical, rather than linguistic goals, and they are typically produced by speakers of languages that do not use clicks phonemically. In this section, we review rtMRI data showing click articulation by a third informant, a beatbox artist. These data (originally described in Proctor et al. 2010b; Proctor et al. 2013) allow us to examine mechanisms of click production in a speaker uninfluenced by phonological constraints. By comparing these


sounds with equivalents in Nama and siSwati, we can consider which aspects of click production might be language-specific, and which aspects arise from more general mechanisms of sound production.

5.1 Participant and Corpus

The participant is a male professional singer, working in a wide variety of vocal performance styles including hip-hop, pop, and R&B. At the time of the study, the subject was 27 years old, and had been working for 10 years as an emcee in a hip-hop duo, and as a session vocalist with other groups. The informant was born in Orange County, California, to Panamanian parents, and lives and works in Los Angeles. He is a native speaker of American English, and a heritage speaker of Panamanian Spanish. The participant demonstrated all of the percussion effects in his repertoire, and several beatboxing sequences, performing in short intervals as he lay supine in the scanner bore. Each target effect was demonstrated at least five times, separated by short pauses of approximately two seconds. Image data were acquired with a different pulse sequence to that used for the other click studies (for details see Proctor et al. 2013), but the same methods were used for audio acquisition, image processing, and analysis.

5.2 Articulation of Clicks in Vocal Percussion

Of the seventeen phonetically-distinct percussion effects in the informant’s repertoire, four made use of clicks. These were described by the subject as a snare drum, a high-hat, and two different rimshot effects. Before examining how these sounds were produced, we first identified the lingual postures associated with English consonants produced by the informant, to establish articulatory landmarks which would allow us to characterize his clicks. Typical dental, alveolar, and velar constriction targets (captured at consonant midpoints in rapidly-rapped English words) are illustrated in Fig. 6.10. Articulation of the effect described as a ‘side K rimshot’ is illustrated in the image sequence in Fig. 6.11, acquired over a 480 ms interval. The posterior constriction is formed first, followed by the coronal constriction, completed 119 ms before the acoustic onset of the click. The seal formed between the body of the tongue and the soft palate initially extends from the velar region to the bottom of the uvula (frame 283); after click release, the tongue remains in contact with the bottom of the uvula (f293). Place of articulation cannot be characterized in terms of a single target for the anterior lingual seal at formation, as the blade of the tongue forms a continuous constriction along the entire midline of the palate (f289). After click release, the midsagittal anterior constriction location can be characterized as alveolar (Fig. 6.11: f293; cf. Fig. 6.10: /t/).


Figure 6.10 Places of articulation in English consonants produced by the beatbox artist Notes: Mid-consonantal posture captured in dental, alveolar and two velar obstruents, in rapid speech. Left-to-right: /ð/ (‘t͟he’); /t/ (‘masṯer’); /ʃ/ (‘s͟he’); /k/ (‘authentic̱’).

At the point of maximum lingual constriction, no midsagittal cavity can be seen between the tongue and the palate (Fig. 6.11: f289). Rarefaction and cavity formation occur in the mid palatal region while anterior and posterior lingual seals are maintained (frames 290–291), suggesting that the consonantal influx is lateralized, consistent with the subject’s description of the click as being produced at ‘the side of the mouth’. The velum remains lowered throughout. The same articulatory sequence was observed in all seven examples of this sound effect demonstrated by the subject, which may be described as a voiceless nasalized lateral click: [ᶰǁ̥]. An additional rimshot effect demonstrated by the subject is illustrated in Fig. 6.12: five image frames acquired over a 384 ms interval. Unlike the Nama dental and alveolar clicks, which were initiated with an apico-(post)alveolar gesture, the anterior seal in this sound effect is formed with a laminal coronal alveolar constriction. The posterior closure is spread across the velar/uvular regions (Fig. 6.12, frames 17–21), but initially at a more anterior location than that observed during constriction formation in Fig. 6.11. In terms of initial constriction location, and the way that the seal formed between the tongue body and the velum/uvula, articulation of the posterior constriction in this effect closely resembles that of the Nama /ǀ/. As in the lateral rimshot, posterior constriction formation precedes anterior closure in most cases, although some variation in the relative timing of these gestures was observed across eight repetitions of this effect. Rarefaction occurs in the palatal/velar region, as a result of jaw and tongue lowering. The velo-pharyngeal port remains open throughout, and the velum is lowered onto the rear part of the tongue dorsum as the click is released, so that the final location of the posterior lingual seal is against the lowest part of the uvula. 
This effect may be described as a type of voiceless nasalized alveolar click: [ᶰǃ̥]. The informant used the additional description ‘sucking in’ to differentiate this sound from the previous rimshot effect, which suggests that he


Figure 6.11 Articulation of a ‘side K’ rimshot effect as a lateral click [ᶰǁ̥] Notes: Frame 283: posterior constriction formed between tongue dorsum and soft palate/uvula; f289: alveolar anterior constriction completes lingual seal; f290: rarefaction of palatal cavity; f293: final lingual posture after lateral influx: lowered tongue body retaining anterior & posterior lingual seals. Velum remains lowered throughout click production.

Figure 6.12 Articulation of a rimshot effect as an alveolar click [ᶰǃ̥] Notes: Frame 15: lingual raising and advancement towards palate; f17: lingual seal extending from alveolar ridge to soft palate; f20–21: rarefaction of lingual cavity beneath palate; f23: final lingual posture after click release. Velum remains lowered throughout click production.

may have accompanied the click release with some ingressive pulmonic activity. The glottis appears to remain largely open throughout this effect, but it is not possible to establish whether additional airstream mechanisms were used from these data alone. The image sequence in Fig. 6.13 – four frames acquired over a 340 ms interval – illustrates articulation of a ‘clap snare’ sound effect. As in most of the clicks demonstrated by this subject, an extended lingual seal is first created along the entire midline of the hard and soft palates (frame 387). The front constriction in this click is more anterior than that observed in the lateral (Fig. 6.11) and alveolar clicks (Fig. 6.12), with the point of influx closer to the subject’s teeth (Fig. 6.13, frames 393–4; cf. Fig. 6.10: /ð/). The dorsal seal extends along the entire length of the uvula, and the place of articulation of the posterior constriction is clearly uvular at release (frames 393–394). Labial approximation begins when the tongue and jaw are raised to form the lingual seal; the lips are completely closed during the rarefaction stage (frame 391),


Figure 6.13 Articulation of a ‘clap’ snare drum effect as a labialized dental click [ᶰǀ̥ʷ] Notes: Frame 387: tongue pressed into palate; f391–393: rarefaction of palatal cavity: lowering of jaw and tongue centre; f394: click release reveals dental-alveolar anterior constriction and extended uvular posterior lingual seal; Velum remains lowered throughout.

and reopened as the click is released (frame 393). The velum remains lowered and the glottis is clearly open throughout the entire sequence, which may be described as a labialized voiceless nasalized dental click: [ᶰǀ̥ʷ]. One final sound effect in the repertoire was produced as a click, described by the subject as a ‘hi-hat closed: kiss teeth’. Six examples of this sound were demonstrated; lingual articulation in each case very closely resembled that observed in the ‘clap snare’ (Fig. 6.13). Some minor variation was observed in the timing of labialization and the degree of constriction of the glottis; however, the effect may also be described as a labialized voiceless nasalized dental click [ᶰǀ̥ʷ].

5.3 Insights from Beatboxing

These data demonstrate that speakers of languages lacking lingual ingressive consonants can produce different types of clicks for paralinguistic purposes, and that, at least in the case of this informant, these sounds are generated in very similar ways to the click phonemes produced by native speakers. The sound effects demonstrated by this beatboxer can be described using standard IPA because they are characterized by the same broad places and patterns of articulation as the /ǀ/, /ǃ/ and /ǁ/ consonants used in Bantu and Khoisan languages, although we do not yet know if all clicks produced for paralinguistic purposes have equivalents in phonological systems. The posterior constriction location in all of these sound effects is best characterized as uvular. All these clicks are intrinsically voiceless, and pervasively nasalized (produced with the velo-pharyngeal port open), suggesting that when nasalization does not need to be controlled for phonological reasons, the dorsal lingual seal may be more naturally formed against a lowered velum and uvula. More research is needed to establish how different airstream mechanisms combine with nasalized clicks (cf. Miller et al. 2009a), and to examine


the acoustic consequences of nasal coupling at different stages in the production of non-pulmonic consonants in more detail. A notable difference between the clicks produced by the beatboxer and those of the two native speakers of click languages involves the way the anterior constriction is formed. While some separation between posterior and anterior seals can be observed during closure in all the clicks illustrated in Figs. 6.3 to 6.8, the beatboxer’s clicks were formed with what appears to be a continuous seal along the entire palate. It is not clear if this reflects the paralinguistic nature of the task, anatomical differences between speakers, or language differences. The clicks produced by this siSwati speaker in high front vowel contexts, for example, are formed with a much smaller initial cavity than comparable clicks produced by the Nama speaker. Further research might examine whether native speakers of click languages ever articulate clicks without an initial cavity within the lingual-palatal seal. Extensive pre-labialization was used to shape the sound of two of the four types of clicks demonstrated – a secondary articulation not commonly exploited in click phonologies, even in the most extended lingual ingressive consonant inventories (Maddieson 1984; Ladefoged and Maddieson 1996).

6 Discussion

These data illustrate the utility of real-time MRI as a method for investigating the phonetic and phonological properties of clicks. By providing a means of visualizing the dynamic configuration of the entire vocal tract, rtMRI offers some important advantages over other methods used to study clicks. rtMRI provides rich information about place of articulation in click consonants. Unlike ultrasound, in which the location of the palate must be reconstructed, rtMRI directly images the alveolar ridge, hard palate, velum and uvula, and reveals exactly how and where the tongue comes into contact with them. Unlike palatography, the field of view of rtMRI includes the entire soft palate, including the uvula. In practice, the accuracy and amount of detail that rtMRI can provide about linguo-palatal contact will depend on the spatio-temporal resolution of the imaging sequences and other factors (see Sections 6.1–6.3). These data demonstrate the importance of this information for accurate characterization of the phonetic properties and differences between clicks. Nama dental clicks are characterized by an apico-laminal coronal constriction extending from the back of the upper teeth over the entire alveolar ridge; siSwati clicks are initiated as apico-dentals and released as lamino-alveolars; while the beatboxer initiates all clicks with an extended lingual constriction across the whole palate.


Because it reveals the location and posture of the whole mass of the tongue in a given imaging plane, rtMRI can provide richer insights into lingual articulation and coarticulation than modalities that only provide information from part of the vocal tract. Dental and alveolar clicks were found to be produced by this speaker of Nama with a greater pharyngeal aperture and more advanced tongue root than his palatal and lateral clicks. Palatal clicks are produced by this speaker with a more bunched tongue posture than /ǃ/, which shows greater independence of coronal and dorsal tongue gestures. Clicks produced by this speaker of siSwati before back rounded vowels have been shown to involve greater labial constriction and protrusion than her clicks produced in other vowel contexts. More research is required to test these characterizations for other speakers of these languages, and for other speakers’ paralinguistic click productions. Neither palatography nor ultrasound is able to track the formation and release of the posterior click constriction with the same level of detail as rtMRI, as neither method provides reliable information about the articulation of the velum and uvula against the tongue dorsum. rtMRI data from these three subjects reveal that click production typically involves an extended dorsal seal which may begin as far forward as the velar place of articulation and extend all the way to the bottom of the uvula. Characterizing the posterior ‘place of articulation’ of click consonants is not straightforward, as the anterior point of contact between the tongue dorsum and the soft palate often retracts throughout click release, may not be revealed in the midsagittal plane until after acoustic release of the click, and is influenced by vowel coarticulation. Nevertheless, these data suggest that the posterior constriction is typically best characterized as uvular, and velar in the case of Nama dental clicks. 
The main phonetic properties of the clicks examined in these data are summarized in Table 6.2. X-ray offers many of the same key advantages as rtMRI: the ability to view the whole vocal tract, and observe the coordination of articulators beyond the tongue during click production. However, because of the risks associated with exposing subjects to ionizing radiation, X-ray is no longer considered to be a suitable method for general phonetic investigation. rtMRI therefore offers the prospect of safely examining some aspects of click production that have not been attempted since Doke’s (1923a; 1923b) and Traill’s (1985) pioneering X-ray studies of click consonants.

6.1 Limitations

We present these data as an initial exploration of the potential of rtMRI for studying clicks. The selection of languages is not intended to represent a typologically-balanced sample of phonological systems that use click


Table 6.2 Summary of phonetic properties of click consonants

Click  Speaker  Anterior constriction  Coronal posture   Posterior constriction  Cavity location  Release mechanism
/ǀ/    Nama     dental-alveolar        apical            velar                   palatal          dorsal lowering
/ǀ/    Swati    dental-alveolar        laminal           velar                   palatal          dorsal retraction
/ǀ/    BBox     dental-alveolar        apical            uvular                  palatal          dorsal retraction
/ǃ/    Nama     postalveolar           apical/retroflex  uvular                  palato-velar     tongue lowering
/ǃ/    BBox     alveolar               laminal           uvular                  palatal          dorsal retraction
/ǁ/    Nama     alveolar               laminal           uvular                  velar            dorsal lowering
/ǁ/    BBox     alveolar               laminal           uvular                  palatal          dorsal retraction
/ǂ/    Nama     palatal                laminal           uvular                  velar            tongue lowering

Notes: L-to-R: anterior constriction location at click release; coronal posture at release; posterior constriction location at release; location of lingual cavity at greatest aperture; and primary lingual motion characterizing click release.

consonants, but was determined in part by the availability of participants. The use of a single informant in each of the target languages is another obvious limitation; it is not clear which of the patterns of articulation described here are speaker-specific, nor how click production might be influenced by factors including vocal tract morphology (Lammert et al. 2013) and prosody (Brugman 2009). The audio component of these data is not ideal for studying lingual ingressive consonants. Acoustic recordings in an MRI scanner are characterized by limited bandwidth and echo artifacts that are particularly problematic for highly transient sounds such as clicks. Speech recordings acquired during an MRI scan additionally require extensive post-processing to attenuate the pervasive scanner noise (Bresch et al. 2006; Vaz et al. 2013). Although the resulting recordings are of sufficient quality for general acoustic analysis (e.g. Lammert et al. 2013) and automatic phonetic transcription (Katsamanis et al. 2011), the audio signal is degraded in ways that currently prevent more extensive analysis. These data also demonstrate some inconsistencies in image quality which can affect MRI studies of the upper airway. Signal-to-noise ratios may vary throughout the image depending on the type and configuration of receiver coil, which can affect resolution of the velum, pharynx and larynx. Cardiac artifacts can mask or distort key anatomical regions, in particular the tongue root and velum. In some cases, implanted medical devices prevent some subjects from participating in any kind of magnetic resonance imaging study, due to safety issues. More commonly, metallic substances used in dental work introduce


imaging artifacts which distort parts of the signal beyond the point where accurate information about vocal tract configuration can be recovered. Although the minor dental artifacts observed in these Nama data were not problematic, the much larger artifacts that occasionally degrade other subjects’ MRI data can prove to be prohibitive for some types of analysis.

6.2 Recommendations

Although not yet as widely used as static structural imaging, real-time MRI is an increasingly viable technology for speech studies. Lingala et al. (2016) outline the requirements, considerations, and expectations for rtMRI, and suggest protocols for acquiring speech data with different spatio-temporal resolutions. Although click consonants are especially demanding objects of study, different aspects of click production can be fruitfully studied using a wide variety of different sequences on commonly available scanners. Global and consistent imaging of the vocal tract midline is not always possible: due to misorientation of the participant’s head and/or anatomical asymmetries, different regions of the vocal tract may be more closely aligned with the midsagittal imaging plane. A study focusing on airstream mechanisms and laryngeal setting should verify the mid-glottal alignment of the imaging plane by eliciting voiced, voiceless, and glottally-stopped segments during localization. There is a trade-off between imaging slice thickness and signal-to-noise ratio: slices wider than 5 mm offer good temporal and spatial resolution at the high frame rates needed to image clicks (Lingala et al. 2016), but may be too wide to resolve some anatomical details. It is important to immobilize the participant’s head so that they are as comfortable and stable as possible throughout the scan session. Nevertheless, small movements can and do occur, shifting the anatomical alignment across images acquired over the course of an utterance or a scan session. 
Where necessary, misalignment due to head movement can be corrected by calculating displacement between frames (using correlation of static anatomical features), and applying compensating image transformations.

6.3 Future Directions

The data presented here were all acquired using a midsagittal imaging plane, so they provide no information about articulation beyond the midline of the vocal tract. A comprehensive understanding of click production will need to consider activity in other parts of the oral cavity. MRI can be acquired using parasagittal, coronal and axial imaging planes, planes of arbitrary orientation that intersect any part of the upper airway of interest, and with interleaved acquisition from multiple imaging planes (Proctor et al. 2008; Proctor et al. 2010c; Kim et al. 2012; Zhu et al. 2013). Some of these advanced imaging techniques


would provide additional information about tongue shaping and vocal tract configuration, which would be especially helpful in the study of lateral clicks. Image sequences offering greater SNR, and improved spatial and temporal resolution (e.g. Niebergall et al. 2013; Lingala et al. 2017) will be important to better resolve the hard palate and allow more accurate analysis of laryngeal and velic activity.

Acknowledgments

This work was supported by National Institutes of Health grant R01 DC007124-01 and Australian Research Council Award DE150100318. Special thanks to our informants for their participation, patience, talents, and insights into their languages.
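
The head-movement correction recommended in Section 6.2 (estimating displacement between frames from the correlation of static anatomical features, then applying a compensating image transformation) can be sketched as follows. This is a minimal illustration and not the procedure used for the data reported in this chapter: it assumes that movement is a pure in-plane translation of whole pixels, that frames are two-dimensional NumPy arrays, and that a region containing only static anatomy (for example, the hard palate) can be chosen by hand. The function names and the `roi` parameter are hypothetical.

```python
import numpy as np


def estimate_shift(reference, frame, roi=None):
    """Estimate the integer (row, col) roll that best aligns `frame` with `reference`.

    `roi` is an optional tuple of slices selecting a region of static anatomy
    (an assumption of this sketch); if omitted, the whole image is used.
    """
    if roi is not None:
        reference, frame = reference[roi], frame[roi]
    ref = reference - reference.mean()
    frm = frame - frame.mean()
    # Circular cross-correlation computed in the Fourier domain.
    xcorr = np.fft.ifft2(np.fft.fft2(ref) * np.conj(np.fft.fft2(frm))).real
    peak = np.unravel_index(np.argmax(xcorr), xcorr.shape)
    # Peak indices wrap around the image; map them to signed displacements.
    return tuple(int(p) if p <= n // 2 else int(p) - n
                 for p, n in zip(peak, xcorr.shape))


def align_frame(reference, frame, roi=None):
    """Apply the compensating translation so that `frame` lines up with `reference`."""
    shift = estimate_shift(reference, frame, roi)
    return np.roll(frame, shift, axis=(0, 1))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    reference = rng.random((64, 64))                  # stand-in for a reference MRI frame
    moved = np.roll(reference, (3, -5), axis=(0, 1))  # simulated head movement
    print(estimate_shift(reference, moved))
    print(np.allclose(align_frame(reference, moved), reference))
```

Because `np.roll` wraps pixels around the image border, this sketch is only suitable for displacements of a few pixels; larger movements would call for padding or cropping, and sub-pixel movement or rotation would require a different registration method.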

Appendix A: Nama Elicitation Items

Table 6.3 Nama orthography and broad phonetic transcriptions, with Afrikaans disambiguators (where used during presentation), and English glosses

Nama   IPA           Afrikaans                 English
ǀā     /ᵑǀ̥ʔɑ́ɑ̀/     skerp                     ‘sharp’
ǁā     /ᵑǁ̥ʔɑ̋ɑ́/     was                       ‘wash’
ǃā     /ᵑǃ̥ʔɑ̀ɑ̋/     ophang                    ‘hang out’
ǂā     /ᵑǂ̥ʔɑ̋ɑ́/     slag                      ‘slaughter’
ǀom    /ᵑǀ̥ʔòm̀/     asemhaal                  ‘breathe’
ǁom    /ᵑǁ̥ʔòm̋/     slaap                     ‘sleep’
ǃom    /ᵑǃ̥ʔóm̀/     doring uithaal            ‘remove thorn’
ǂom    /ᵑǂ̥ʔóm̀/     werk met naald            ‘sew’
ǀgam   /ǀɑ̋ḿ/        warm                      ‘warm’
ǁgam   /ǁɑ́m̀/        gesels                    ‘talk’
ǃgam   /ǃɑ̏ḿ/        diep                      ‘deep’
ǂgam   /ǂɑ̏ḿ/        van blydskap rondspring   ‘jumping for joy’
ga     /kɑ̏ɑ́/                                  ‘fool, trick’
ge     /kè/                                    ‘we’ (masc.)
gu     /kù/                                    ‘they’ (masc.)
ega    /e̋kɑ̏/                                  ‘a bit later’


Appendix B: siSwati Elicitation Items

Table 6.4 siSwati orthography and broad phonetic transcriptions, with English glosses

siSwati   IPA          English
caba      /ǀɑ́ɓɑ/      ‘cut down’
coba      /ǀoɓɑ/       ‘break, snap’
cula      /ǀúlɑ/       ‘knock down’
ncata     /ᶰǀɑtɑ/      ‘sound made when irritated’
ncusa     /ᶰǀusɑ/      ‘ask a favor, send with a message’
ncisha    /ᶰǀiʃɑ/      ‘delete it (isiZulu)’
ncesi     /ᶰǀêsi/      ‘sorry!’
choba     /ǀʰoɓɑ/      ‘steal (isiZulu)’
lichubu   /liǀʰuɓu/    ‘hump, hunchback’
lichibi   /líǀʰiɓí/    ‘lake, pond’
chefe     /ǀʰefe/      ‘irritating’
chacha    /ǀʰɑǀʰɑ/     ‘undo’
kimi      /kʼimi/      ‘in me’
kabi      /kʼɑɓí/      ‘badly, severely’
liduku    /líd̤úkʼu/   ‘piece of cloth’

References

Beach, Douglas M. 1938. The Phonetics of the Hottentot Language. Cambridge: W. Heffer & Sons. Bradfield, Julian. 2014. “Clicks, concurrency and Khoisan.” Phonology 31(1): 1–49. Bresch, Erik, Yoon-Chul Kim, Krishna Nayak, Dani Byrd, and Shrikanth Narayanan. 2008. “Seeing speech: Capturing vocal tract shaping using real-time magnetic resonance imaging [Exploratory DSP].” IEEE Signal Processing Magazine 25(3): 123–132. Bresch, Erik, Jon Nielsen, Krishna Nayak, and Shrikanth Narayanan. 2006. “Synchronized and noise-robust audio recordings during realtime magnetic resonance imaging scans.” Journal of the Acoustical Society of America 120(4): 1791–1794. Brugman, Johanna. 2009. “Segments, tones and distribution in Khoekhoe prosody.” Ph.D. thesis, Cornell University. Chen, Su-I, and Gloria B. Malambe. 1998. “Palatalization in siSwati: an optimality theoretical approach.” Language History and Linguistic Description in Africa, ed. Ian Maddieson and Thomas J. Hinnebusch. Trenton, NJ: Africa World Press, pp. 137–146.


Corum, Claudia W. 1980. SiSwati: Special Skills Handbook. Brattleboro, VT: Peace Corps. Cruttenden, Alan. 1992. “Clicks and syllables in the phonology of Dama.” Lingua 86(2–3): 101–117. Curriculum Committee for Khoekhoegowab. 2003. Khoekhoegowab: 3IIî xoaigaub (Orthography 3). Windhoek: Gamsberg Macmillan Publishers. Doke, Clement M. 1923a. “A Dissertation on the Phonetics of the Zulu Language.” Bulletin of the School of Oriental and African Studies 2(4): 687–729. Doke, Clement M. 1923b. “Notes on a problem in the mechanism of the Zulu clicks.” Bantu Studies 2(1): 43–45. Doke, Clement M. 1925. “The phonetics of ʗhũ̬ː Bushman.” Bantu Studies 2(3): 129–165. Doke, Clement M. 1926. “The Phonetics of the Zulu Language.” Bantu Studies 2. Johannesburg: Witwatersrand University. Doke, Clement M. 1954. The Southern Bantu languages: Handbook of African Languages. London: Oxford University Press for the International African Institute. Doke, Clement M., and S. Machabe Mofokeng. 1957. Textbook of Southern Sotho Grammar. Cape Town: Longman. Finlayson, Rosalie, J. Jones, K. Podile, and Jan Snyman. 1991. An Introduction to Xhosa Phonetics. Hout Bay: Marius Lubbe. Fulop, Sean A., Peter Ladefoged, Fang Liu, and Rainer Vossen. 2003. “Yeyi Clicks: Acoustic Description and Analysis.” Phonetica 60(4): 231–260. Gowlett, Derek F. 2003. “Zone S.” The Bantu Languages, ed. Derek Nurse and Gérard Philippson. 4. London, New York: Routledge, pp. 609–638. Güldemann, Tom. 2001. “Phonological regularities of consonant systems across Khoisan lineages.” University of Leipzig Papers on Africa 16: 1–50. Güldemann, Tom, and Rainer Vossen. 2000. “Khoisan.” African Languages: An Introduction, ed. Bernd Heine and Derek Nurse. Cambridge: Cambridge University Press, pp. 99–122. Guthrie, Malcolm. 1971. Comparative Bantu: an Introduction to the Comparative Linguistics and Prehistory of the Bantu Languages. Farnborough: Gregg Press. Haacke, Wilfrid H.G., and Eliphas Eiseb. 2002. 
A Khoekhoegowab Dictionary with an English-Khoekhoegowab Index. Windhoek: Gamsberg Macmillan. Haacke, Wilfrid H.G., Eliphas Eiseb, and Levi Namaseb. 1997. “Internal and External Relations of Khoekhoe Dialects: a Preliminary Survey.” Namibian Languages: Reports and Papers, ed. Wilfrid H.G. Haacke and Edward D. Elderkin. Köln: Rüdiger Köppe Verlag; University of Namibia (UNAM), pp. 125–209. Herbert, Robert K. 1990. “The relative markedness of click sounds: Evidence from language change, acquisition, and avoidance.” Anthropological Linguistics 32(1–2): 120–138. Kagaya, Ryohei. 1978. “A phonetic sketch of Naron around Ghanzi.” Journal of Asian and African studies/Ajia Afuriku gengo bunka kenkyu 16: 29–63.


Katsamanis, Athanasios, Matthew Black, Panayiotis Georgiou, Louis Goldstein, and Shrikanth Narayanan. 2011. “SailAlign: Robust long speech-text alignment.” Proceedings of the Workshop on New Tools and Methods for Very-Large Scale Phonetics Research. University of Pennsylvania. Philadelphia, pp. 44–47. Kim, Yoon-Chul, Michael I. Proctor, Shrikanth S. Narayanan, and Krishna S. Nayak. 2012. “Improved imaging of lingual articulation using real-time multislice MRI.” Journal of Magnetic Resonance Imaging 35(4): 943–948. Kockaert, Hendrik J. 1996. Corrective Strategies for the Pronunciation of French as a Foreign Language among Swazi Learners. Johannesburg: Randse Afrikaanse Universiteit. Köhler, Oswin, Peter Ladefoged, Jan Snyman, Anthony Traill, and Rainer Vossen. 1988. “The symbols for clicks.” Journal of the International Phonetic Association 18(2): 140–142. Ladefoged, Peter, and Ian Maddieson. 1996. The Sounds of the World’s Languages. Oxford; Cambridge, MA: Blackwell. Ladefoged, Peter, and Anthony Traill. 1984. “Linguistic phonetic description of clicks.” Language 60(1): 1–120. Ladefoged, Peter, and Anthony Traill. 1994. “Clicks and their accompaniments.” Journal of Phonetics 22: 33–64. Lammert, Adam, Michael Proctor, and Shrikanth Narayanan. 2013. “Interspeaker Variability in Hard Palate Morphology and Vowel Production.” Journal of Speech, Language, and Hearing Research 56(6): S1924–S1933. Lanham, Leonard W. 1964. “The proliferation and extension of Bantu phonemic systems influenced by Bushman and Hottentot.” Proceedings of the 9th International Congress of Linguists, Cambridge 1962, ed. Horace Lunt. The Hague: Mouton & Co., pp. 382–391. Lederer, Karen. 2005. “The phonetics of beatboxing.” B.A. dissertation. Leeds University. Lingala, Sajan G., Brad P. Sutton, Marc E. Miquel, and Krishna S. Nayak. 2016. “Recommendations for real-time speech MRI.” Journal of Magnetic Resonance Imaging 43(1): 28–44. 
Lingala, Sajan G., Yinghua Zhu, Yoon-Chul Kim, Asterios Toutios, Shrikanth Narayanan, and Krishna S. Nayak. 2017. “A fast and flexible MRI system for the study of dynamic vocal tract shaping.” Magnetic Resonance in Medicine 77(1): 112–125. Maddieson, Ian. 1984. Patterns of Sounds. Cambridge: Cambridge University Press. Maddieson, Ian. 2003. “The Sounds of Bantu Languages.” The Bantu Languages, ed. Derek Nurse and Gérard Phillipson. London: Routledge, pp. 15–41. Miller-Ockhuizen, Amanda L. 2003. Phonetics & Phonology of Gutturals: A Case Study from Juǀ’hoansi. Outstanding Dissertations in Linguistics. New York: Routledge.


Miller, Amanda L. 2011. “The Representation of Clicks.” The Blackwell Companion to Phonology, vol. 1, ed. Marc van Oostendorp, Colin Ewen, Elizabeth Hume and Keren Rice. Oxford: Blackwell, pp. 416–439. Miller, Amanda L., Johanna Brugman, and Bonny Sands. 2007a. “Acoustic and auditory analysis of Nǀuu lingual and linguo-pulmonic stop bursts.” Proceedings of the 16th International Congress of Phonetic Sciences, ed. Jürgen Trouvain and William J. Barry. Saarbrücken: ICPhS, pp. 769–772. Miller, Amanda L., Johanna Brugman, Bonny Sands, Levi Namaseb, Mats Exter, and Chris Collins. 2009a. “Differences in airstream and posterior place of articulation among Nǀuu clicks.” Journal of the International Phonetic Association 39(02): 129–161. Miller, Amanda L., Levi Namaseb, and Khalil Iskarous. 2007b. “Tongue body constriction differences in click types.” Proc. Laboratory Phonology 9, ed. Jennifer S. Cole and José Ignacio Hualde. Berlin: Mouton de Gruyter, pp. 643–656. Miller, Amanda L., Abigail Scott, Bonny E. Sands, and S. Shah. 2009b. “Rarefaction gestures and coarticulation in Mangetti Dune ǃXung clicks.” Proceedings of the International Conference on Speech Communication and Technology. Brighton, pp. 2279–2282. Nakagawa, Hirosi. 1996. “A first report on the click accompaniments of Gǀui.” Journal of the International Phonetic Association 26(1): 41–54. Nakagawa, Hirosi. 2006. “Aspects of the phonetic and phonological structure of the Gǀui language.” Ph.D. thesis, University of the Witwatersrand. Narayanan, Shrikanth, Krishna Nayak, Sungbok Lee, Abhinav Sethy, and Dani Byrd. 2004. “An approach to real-time magnetic resonance imaging for speech production.” Journal of the Acoustical Society of America 115(4): 1771–1776. Narayanan, Shrikanth et al. 2014. “Real-time magnetic resonance imaging and electromagnetic articulography database for speech production research (TC).” Journal of the Acoustical Society of America 136(3): 1307–1311. Nelson, David P. 2008. 
Solkattu Manual: An Introduction to the Rhythmic Language of South Indian Music. Middleton, CT: Wesleyan University Press. Niebergall, Aaron, Shuo Zhang, Esther Kunay, Götz Keydana, Michael Job, Martin Uecker, and Jens Frahm. 2013. “Real-time MRI of speaking at a resolution of 33 ms: Undersampled radial FLASH with nonlinear inverse reconstruction.” Magnetic Resonance in Medicine 69(2): 477–485. Nussbaum, Loren V. 1969. Understanding and Speaking siSwati. Washington, DC: Center for Applied Linguistics. Proctor, Michael I., Daniel Bone, and Shrikanth S. Narayanan. 2010a. “Rapid semiautomatic segmentation of real-time Magnetic Resonance Images for parametric vocal tract analysis.” Proceedings of the International Conference on Speech Communication and Technology. Makuhari, Japan, pp. 1576–1579.


Proctor, Michael I., Krishna S. Nayak, and Shrikanth S. Narayanan. 2010b. “Linguistic and para-linguistic mechanisms of production in human “beatboxing”: a rtMRI study.” Proceedings of Intersinging. University of Tokyo. Proctor, Michael I., Christine H. Shadle, and Khalil Iskarous. 2010c. “Pharyngeal articulation differences in voiced and voiceless fricatives.” Journal of the Acoustical Society of America 127(3): 1507–1518. Proctor, Michael I., Christine Shadle, and Khalil Iskarous. 2008. “A method of coregistering multiple Magnetic Resonance Imaged vocal tract volumes for fricatives.” Journal of the Acoustical Society of America 123(5): 3735–3735. Proctor, Michael, Erik Bresch, Dani Byrd, Krishna Nayak, and Shrikanth Narayanan. 2013. “Paralinguistic mechanisms of production in human “beatboxing”: a real-time magnetic resonance imaging study.” Journal of the Acoustical Society of America 133(2): 1043–1054. Proctor, Michael, Yinghua Zhu, Adam Lammert, Asterios Toutios, Bonny Sands, Ulrich Hummel, and Shrikanth Narayanan. 2016. “Click Consonant Production in Khoekhoe: a Real-time MRI Study.” Khoisan Languages and Linguistics – Proceedings of the 5th International Symposium, July 13–17, 2014, Riezlern/Kleinwalsertal, ed. S. Shah and M. Brenzinger. Research in Khoisan Studies, vol. 34. Köln: Rüdiger Köppe Verlag, pp. 337–366. Roux, Justus. 2007. “Unresolved issues in the representation and phonetic description of click articulation in Xhosa and Zulu.” Language Matters: Studies in the Languages of Africa 38(1): 8–25. Rycroft, David K. 1981. Concise SiSwati Dictionary. Pretoria: J.L. van Schaik. Sands, Bonny. 1991. “Evidence for click features: Acoustic characteristics of Xhosa clicks.” UCLA Working Papers in Phonetics 80: 6–37. Sands, Bonny, Johanna Brugman, Mats Exter, Levi Namaseb, and Amanda Miller. 2007. “Articulatory characteristics of anterior click closures in Nǀuu.” Proceedings of the 16th International Congress of Phonetic Sciences, ed. 
Jürgen Trouvain and William J. Barry. Saarbrücken: ICPhS, pp. 401–404. Sands, Bonny, Ian Maddieson, and Peter Ladefoged. 1996. “The phonetic structures of Hadza.” Studies in African Linguistics 25(2): 171–204. Snyman, Jan W. 1970. Introduction to ǃXũ (ǃKung) Language. Cape Town: A.A. Balkema. Snyman, Jan W. 1975. Žuǀ’hõasi Fonologie & Woordeboek. Communication 37 of the University of Cape Town School of African Studies. Cape Town; Rotterdam: A.A. Balkema. Stowell, Dan, and Mark D. Plumbley. 2008. “Characteristics of the beatboxing vocal style.” Technical Report, Centre for Digital Music, Dept. of Electronic Engineering, University of London, C4DM-TR-08–01: 1–4. Taljaard, P.C., J.N. Khumalo, and Sonja E. Bosch. 1991. Handbook of siSwati. Pretoria: J.L. van Schaik.


Taljaard, Petrus C., and Jannie W. Snyman. 1989. An Introduction to Zulu Phonetics. Constantia: M. Lubbe. Traill, Anthony. 1985. Phonetic and Phonological Studies of ǃXóõ Bushman. Hamburg: Helmut Buske. Traill, Anthony. 1991. “Pulmonic control, nasal venting, and aspiration in Khoisan languages.” Journal of the International Phonetic Association 21(1): 13–18. Traill, Anthony. 1993. “The feature geometry of clicks.” Linguistica: Festschrift E. B. van Wyk, ’n Huldeblyk, ed. Paul Michael Siegfried von Standen. Pretoria: J.L. van Schaik, pp. 134–140. Vaz, Colin, Vikram Ramanarayanan, and Shrikanth S. Narayanan. 2013. “A two-step technique for MRI audio enhancement using dictionary learning and wavelet packet analysis.” Proceedings of the 14th Annual Conference of the International Speech Communication Association. Lyon, France. Vossen, Rainer. 1997. “What click sounds got to do in Bantu.” Human Contact through Language and Linguistics, ed. Birgit Smeja and Meike Tasch. Frankfurt: Peter Lang, pp. 353–366. Westphal, Ernst O.J. 1971. “The click languages of Southern and Eastern Africa.” Linguistics in Sub-Saharan Africa, ed. Jack Berry and Joseph H. Greenberg. Berlin: Mouton de Gruyter, pp. 367–420. Zhu, Yinghua, Yoon-Chul Kim, Michael I. Proctor, Shrikanth Narayanan, and Krishna S. Nayak. 2013. “Dynamic 3D Visualization of Vocal Tract Shaping during Speech.” IEEE Transactions on Medical Imaging 32(5): 838–848. Ziervogel, Dirk. 1952. A Grammar of Swazi (siSwati). Bantu grammatical archives, vol. 3. Johannesburg: Witwatersrand University Press.

Chapter 7

Recording and Measuring Acoustic Attributes of Clicks

Sean Fulop and Richard Wright

1 Introduction

In this chapter we describe elicitation, recording, and acoustic measurement techniques that we think are appropriate for acoustic field research on clicks. While some of the methodologies and techniques may be applicable to speech sounds more broadly, we focus on acoustic description of clicks because we feel that practical guides to the acoustics of obstruents in general, and clicks in particular, are largely missing from the field-research literature. In this way this chapter can be seen as a complement to other chapters in this volume on articulatory/gestural descriptions and measures of clicks (see Sands; Proctor et al., this volume). The chapter is organized in modules that follow the order in which a researcher is likely to proceed, making material easy to find and easy to skip depending on the needs of the researcher. It begins with the basics of developing elicitation material, continues with some practical elicitation techniques and recording methods, and ends with a series of acoustic measures of place of articulation.

With the miniaturization of digital recorders and computational resources, high-quality digital recordings, and to a lesser degree detailed acoustic phonetic description and analysis, have become a common part of descriptive phonetic and phonological fieldwork. High fidelity recordings are especially important when describing endangered languages or languages that are hard to access because the recordings may represent the sole resource for subsequent studies. This presents a special responsibility for research on Khoesan languages, and other languages with clicks, since many are remote, endangered, or both. For this reason we spend the beginning of this chapter describing various recording techniques. While we draw on experience from fieldwork on a variety of languages and regions, we have tried to take into account specific factors related to recording in southern and eastern Africa where languages with clicks are spoken.

© Koninklijke Brill NV, Leiden, 2020 | doi:10.1163/9789004424357_008

2 Considerations for Recordings

Population: Research guides usually recommend healthy monolingual speakers (e.g. Ladefoged 2002), and many also specify speakers of a variety of ages and social roles so that the researcher may situate the observed variation in the context of its social milieu (see Labov 1984; Eckert 2000). However, few populations in the world are monolingual, particularly in Africa, and many languages of interest have so few speakers that one feels lucky to encounter more than a handful. Endangered languages may be spoken only by elderly members of a community, who often have missing teeth and hearing loss in addition to other conditions of age that limit what the researcher can expect to do. This means that the researcher should take into account the increased variability that is typical of bilingual or trilingual communities, and adjust their expectations and their tasks to suit the age and abilities of their language consultants. Moreover, population movements may mean that the speakers participating in the study now live in a city far from their original communities (sometimes even in a European or North American city where there are few speakers of their language). In the end, it is the researcher’s responsibility to fully document the characteristics of the consultants from a linguistic and demographic perspective. This knowledge is gained using a reasonably thorough demographic interview, which can be modeled on a standardized interview but which must be customized for the particular language and culture of interest. At the very least the following demographic information should be noted about the language consultants: gender, approximate age, primary dialect spoken, other dialects spoken, other languages spoken, general areas where the person has lived for extended periods, and education level (if appropriate).

The most difficult questions are those about dialects: most people do not draw a clear distinction between dialect and language, or often between language and culture, so the researcher may have to arrive at the answers obliquely by asking how people sound who live nearby versus farther away, and by asking whether the consultant can understand their “dialect”. Ideally the demographic survey is administered in the consultant’s native language, and it is typically administered orally. This gives the researcher a chance to assess aspects of the consultant’s speech, hearing, and attention. A brief lexical elicitation task may also be appropriate at this point to help assess the linguistic knowledge of the consultant. Many members of communities where there are glossonyms or ethnonyms, or where the language of interest is disappearing, may describe themselves as “speakers” of a language because of cultural or ethnic affiliation without having a very fluent grasp of the language (see Sands et al. 2011 for a discussion).

3 Recording Tasks

While this chapter focuses on two common elicitation tasks (lexical elicitation and collaborative), there are a variety of elicitation/recording techniques, each with its own advantages and disadvantages (see Dipaolo and Yaeger-Dror 2010; Labov 1984; Sands et al. 2011; Schilling 2013; Schilling-Estes 1998; Warner 2011; Warren and Hay 2011 for discussions). For example, spontaneous speech in dyads or groups of peers engaged in everyday activities elicits the most naturalistic speech, and therefore is ideal for studying variant pronunciations of words or sounds that are restricted to the most informal styles of speech. It is also useful for observing aspects of prosody, intonation, pragmatics, and discourse, which only occur in longer stretches of spontaneous speech. However, what this technique gains in naturalness, it loses in other aspects of coverage and statistical power; in acoustic studies it is important that the objects of comparison occur in equivalent discourse and prosodic environments, and that any set of speech sounds of interest occur in equivalent segmental environments with equivalent frequency. The reason for this is that each of these factors may have a dramatic impact on the phonetic realization of a speech sound. For example, click release-burst intensity may vary as the functional load (redundancy) of the lexical item that the click is in varies from “new” to “given” in the same way that other speech sounds vary with discourse changes (e.g. Aylett and Turk 2004). Additionally, making sufficiently long recordings of conversations to achieve statistical power over a set of observations can sometimes prove prohibitive for the field linguist. 
More formal techniques such as wordlist or lexical elicitation permit greater control, a more balanced sample of speech sounds, and greater density of observations, but they elicit a more formal (less natural) and typically a more standardized style of speech (but see Schilling-Estes 1998; Schilling 2013; Warren and Hay 2011). Many forms of interest, particularly socially stigmatized or very reduced forms, may not be possible to elicit with the most formal techniques such as wordlists. Roughly speaking, as speech tasks move from least formal to most formal, they elicit decreasing naturalness in terms of patterns of reduction and lexical usage, and offer increasing control over contextual variables. In the least formal styles, speakers engage in the widest variety of pronunciation strategies and will typically be more likely to employ stigmatized forms (e.g. Labov 1984), but there are strategies for eliciting more natural and more varied speech behavior at intermediate levels of formality than is typically described in the sociolinguistic literature (e.g. Schilling 2013; Warner 2011; Warren and Hay 2011). We provide a slightly modified version of the sociolinguistic interview hierarchy in Table 7.1.

Table 7.1 Hierarchy of elicitation techniques in relation to speech style variation

Least formal (most variation)
1. Conversation between peers
2. Collaborative problem solving tasks
3. Interviews, map tasks, and monologues
4. Reading passages (only appropriate in literate populations)
5. Word lists, lexical elicitation
6. Minimal pairs tasks
Most formal (least variation)

Of these, we will describe techniques 2 (collaborative tasks) and 5 (lexical elicitation) because they lend themselves to eliciting a broad range of pronunciations (2) as well as a high density and broad range of coverage of linguistic structures (2 and 5). Of all of these, the word list or lexical elicitation task remains central to initial stages of phonetic investigation of a language or of a community of speakers because it allows the researcher to elicit a range of sounds highlighting the sets of contrasts and allophonic variation of specific types. There may be words that might not come up in an everyday conversation but which nevertheless are well known to the language consultants and are important for observing specific sounds in specific environments or for making cross-dialect or cross-language comparisons (Sands et al. 2011).

3.1 Word Lists/Lexical Elicitation Task

There are several good sources on constructing lexical elicitation tasks, including Sands et al.’s (2011) paper on lexical elicitation, so we will not go into detail here on the construction of wordlists specifically. Sands et al. cite a variety of wordlists for African languages that are particularly useful for coming up with a set of words to elicit: Snider and Roberts (2006), Hutchinson (1971), Bouquiaux and Thomas (1992), and Heath and McPherson (2009). However, it is important to modify these lists as appropriate for the technology and culture of the specific population of speakers under study. Similarly, for languages that have no dictionaries of their own, dictionaries of other Khoesan languages are useful in the preparation of wordlists, such as those for ǃXóõ (Traill 1994), Juǀ’hoansi (Dickens 1994), Ekoka ǃXun (König and Heine 2008), Naro (Visser 2001), Khoekhoegowab (Haacke and Eiseb 2002), and Khwe (Kilian-Hatz 2003).


Here we list a few important factors that are specific to acoustic phonetic research (see Ladefoged 2003 and other phonetic field guides for more detail). We describe the elicitation as if it were a “word list”; however, if the population is not literate in the language’s orthography, it is recommended that the words be elicited in a naming task using props such as pictures or objects.

First, verify the words before putting them on the list, and verify that they are productive in the community at hand. A word that is either archaic or not in common use will often be mispronounced or pronounced in a peculiar way, making it useless for phonetic research.

Second, in choosing words for the wordlist, try to find ones in which the sounds of interest are in similar segmental and prosodic environments: for example, in word-initial position, at the onset of a stressed syllable (if the language has prosodic stress), and preceding the vowel /a/. This is because each segmental and prosodic environment will have its own effect on the acoustic characteristics of a sound. Ideally, words of equivalent length are also used because word length influences segmental duration. It is also recommended that several environments (such as before the vowels /i/ and /u/) be included in the wordlist so that a range of allophonic or phonetic variation can be observed from the same set of consultants. The drawback of adding variation is that each extra variable added to the list multiplies the number of items to elicit (a two-valued variable doubles it). For these reasons, lists from specific research traditions or lists that are used for comparative research (such as Swadesh lists) may not be appropriate for phonetic research; they may not contain a full sample of sounds of interest, and they most certainly won’t have the set of sounds in equivalent segmental and prosodic contexts.

Third, include several words for each sound of interest. This permits the researcher to verify the pronunciation of the sound of interest. It is also insurance against the inevitable loss of the occasional word due to background noise or other interference.

Fourth, group words semantically. The typical consultant has a much easier time recalling words when they are grouped thematically. Moreover, a consultant will generally feel that the recording session is more meaningful if words that are related to each other, either semantically or by practical use, are elicited together. By contrast, grouping words phonetically by place, manner, etc. can create a kind of tongue-twisting alliteration that elicits a higher speech error rate.

Fifth, record more than one instance of each word. A minimum of three repetitions should be recorded. This is both to ensure internal consistency in pronunciation and to allow the researcher a number of instances to choose


from in deciding what to measure. It is ideal if the entire list is repeated (three to five times) rather than having the consultant simply repeat each word on the spot, because an untrained speaker who is asked to repeat will tend to run the repetitions together like [ŋǁɑlo ʔɑŋǁɑlo ʔɑŋǁɑlo ʔɑ] rather than [ŋǁɑlo ʔɑ, ŋǁɑlo ʔɑ, ŋǁɑlo ʔɑ], even if instructed to pause between repetitions. Immediate repetition also makes the speech task more artificial.

Sixth, randomize the order of words within their semantic/thematic groups between repetitions of the list. This is because neighboring words can influence the pronunciation of a particular word. This isn’t absolutely necessary, but it is useful for checking that a particular pronunciation isn’t due to wordlist context. Each elicitation repetition should be in a separate random order so that any unintended ordering effects are neutralized across repetitions rather than being reinforced. An easy way to randomize the ordering of sets of lexical items across repetitions is to have a set of index cards for each lexical set and to shuffle them between repetitions. Even if physical props are being used, the cards can help the researcher ensure randomness in the elicitation order. Each participant should ideally have a different random order of elicitation. Computer- or tablet-based shuffling can be done with a variety of tools. A quick and easy way of creating a random list is to use a spreadsheet to generate a column of random numbers (e.g. in Excel =rand() is the formula for a random number) next to the column with the set of lexical items, then sort the lexical list by the random number column.

Last, start and end the elicitation procedure with “warm up” and “cool down” words that are not of particular interest. When consultants begin a recording session, they typically hyperarticulate the first few words before settling into the recording. Similarly, when they sense that a session is coming to an end, they will hyperarticulate the last few words. We recommend at least five words as a “buffer”.

3.2 The Collaborative Task

A collaborative task (e.g. Warren and Hay 2011; Freeman et al. 2014) is typically used as a follow-up to a lexical elicitation task, in order to elicit specific lexical items in fluent conversation. It is a refinement of the more widely used map task (Anderson et al. 1991). In the field, the map itself is a prop that is not typically culturally appropriate, since the population may not be map-literate and since eliciting specific names on a map tends to rely on orthographic notation. The collaborative task typically involves a problem conceived of by the researcher and intended to elicit specific vocabulary, morphology, syntactic structures, etc. To solve the problem, speakers, typically in a dyad, must use a variety of linguistic variables (in the case of this chapter, lexical items). The problem


must be carefully researched to be both linguistically appropriate, eliciting the structures of interest, and culturally appropriate, that is, put to the consultants in a way that is naturally and correctly interpreted. For example, the researcher might present the following problem to a pair of consultants: “Here are several objects …” (here elicitation props can be very useful, in this case objects or pictures of objects that necessitate different carrying styles) “… how could you move all of these to a different site and how long will it take?” This problem might be used to observe the pronunciation and use of a specific vocabulary set such as the one presented in Sands et al.’s (2011) list of ǂHoan “carry” terms. This task is similar to the sociolinguistic interview in that it elicits spontaneous speech on specific topics (thereby increasing the odds that you will observe specific words), but since it elicits a conversation between two consultants rather than a monologue, it contains a wider range of syntactic, discourse, and pragmatic variation. By necessity this task can only be designed after the researcher has a reasonably good grasp of the lexicon and phonology of the language of interest. Therefore it is typically used as a follow-up procedure to previous field research where the researcher has questions about the range of variation in pronunciation of specific sounds.

For reference, recording and digital sampling methods are included in Appendix A.
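Returning to the wordlist procedure of Section 3.1, the randomization and buffering steps can be sketched in a few lines of Python as an alternative to index cards or a spreadsheet. This is only an illustrative sketch under stated assumptions: the function names, the placeholder item labels, and the seeding scheme (one seed per participant and repetition, so that each order can be regenerated later) are hypothetical and not part of any published protocol.

```python
import random


def build_elicitation_order(groups, warm_up, cool_down, seed):
    """Return one elicitation order: fixed buffers, items shuffled within groups."""
    rng = random.Random(seed)  # seeded so that the same order can be reproduced
    order = list(warm_up)
    for words in groups.values():  # thematic groups stay together, in a fixed order
        shuffled = list(words)
        rng.shuffle(shuffled)
        order.extend(shuffled)
    order.extend(cool_down)
    return order


def build_session(groups, warm_up, cool_down, participant_id, repetitions=3):
    """Return one independently shuffled order per repetition of the whole list."""
    return [
        build_elicitation_order(
            groups, warm_up, cool_down, seed=f"{participant_id}-rep{rep}"
        )
        for rep in range(repetitions)
    ]


# Placeholder labels only; a real list would hold verified words of the language.
GROUPS = {
    "animals": ["animal_1", "animal_2", "animal_3", "animal_4"],
    "body_parts": ["body_1", "body_2", "body_3", "body_4"],
}
WARM_UP = [f"warmup_{i}" for i in range(1, 6)]      # five buffer words at the start
COOL_DOWN = [f"cooldown_{i}" for i in range(1, 6)]  # five buffer words at the end

session = build_session(GROUPS, WARM_UP, COOL_DOWN, participant_id="P01")

# Target tokens per participant = items of interest x repetitions (buffers excluded).
target_tokens = sum(len(words) for words in GROUPS.values()) * len(session)
print(f"{len(session)} repetitions, {target_tokens} target tokens")
for rep, order in enumerate(session, start=1):
    print(rep, order)
```

Passing a different `participant_id` seeds a different sequence of shuffles, which is one simple way to give each participant their own set of orders. The token count also makes the cost of adding environments to the list easy to check before going into the field.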

4 Measurement Methods

In this section we will discuss techniques for using speech analysis software to examine the acoustic features of clicks. We will specifically refer to Praat software (Boersma and Weenink 2014), but the same general procedures can be performed in other software as well. Techniques will focus on easily used acoustic representations such as the waveform and standard spectrogram, as well as the power spectrum. Other acoustic metrics which are simple to obtain in Praat include the intensity plot, the center of gravity (mean) of a power spectrum, and the harmonicity.

Let us first state that we do not suggest averaging data across speakers to report measured values of sounds. Each speaker is a different person, and this leads naturally to the expectation of systematic differences in spectra and other measures. There is simply no meaning, linguistic or otherwise, to the notion of an “average spectral peak” across speakers. To report results from more than one person, we suggest either presenting all subjects’ values together in some kind of plot or table, or picking a specific speaker who is illustrative of the point being made with the graph.


There are a few useful steps before performing any analysis. First, it is handy to use Praat’s resample function to change (“downsample”) the sampling rate of your sound file to equal twice the highest frequency of interest in the clicks. A sampling rate of 22050 Hz is quite useful, providing a maximum frequency in the sound spectrum of 11025 Hz. This is normally at or near the highest frequency of any utility in normal speech. The purpose of downsampling is to discard irrelevant high frequencies in the signal, which keeps file sizes to a minimum and also eliminates distracting upper frequencies when inspecting signal spectra.

Initially, it helps to use a standard sort of view in Praat (see Figure 7.1) to look at the words of interest in the sound file; the standard view shows a waveform plot together with a spectrogram. The waveform plot does not normally need additional setup, but the spectrogram can readily be improved by changing the settings. For showing the fine time scale of clicks, it is important to set the “window length” to a small value in the neighborhood of 0.005 seconds (5 ms). It also helps to ensure that the dynamic range is between 50 and 60 dB to start with, since experience has shown that a larger dynamic range clutters the spectrogram with too much background noise. Praat default settings show spectrograms with frequencies up to 5 kHz, but for clicks and other obstruents it can be very useful to show up to 8 or 10 kHz at least.

In comparing click-release intensities it is important to turn off Praat’s default autoscaling function in the “advanced” settings of the spectrum function. If intensity comparison is a major goal, Praat’s Intensity plot can be quite useful, apart from spectrum analysis. If spectral energy distribution is of interest (for example as a measure of place of articulation), then the preemphasis function should be set to 0.0 as well. 
This is because preemphasis changes the spectrum to enhance the visibility of upper vowel formants in the spectrogram, but it should be deactivated for consonants in general, or whenever measurements need to be obtained from the frequency spectrum. If click intensities are going to be compared it is important to control microphone distances, ideally using a head-mounted microphone (see Appendix A for recording techniques). It also helps to have a large number of lexical items, with even sampling across the click types being compared. If the researcher is going to conduct a statistical analysis of the clicks, a reference amplitude or a normalization technique may be needed. This is because some speakers’ clicks will be consistently louder than other speakers’ clicks. A common reference technique is to use the ratio of intensities of the click release to the following vowel. Note that this technique is only valid if the following vowel is of the same quality since vowel intensity varies with vowel quality. Alternatively a normalization technique can be employed, such as a z-score (standard score)


Figure 7.1 Praat “Edit” screen view of the Taa word [ʘŋ̥áʔãː] ‘conceal oneself’, with bilabial burst highlighted Note: All Taa examples are from a single speaker, available in the UCLA Phonetics Lab archive (listed under ǃXóõ), publicly accessible online. The recording is marred by a whistle (seen as one or more faint horizontal bars in the spectrograms here) due to electrical interference in the recording equipment.

in which the intensities of the set of compared click types are normalized within speaker. The formula for a z-score is z = (x − μ)/σ, where x is a single measured intensity, μ is the mean of the set of intensities for a particular speaker, and σ is the standard deviation of those intensities. The z-score normalization allows for the comparison of relative intensities across speakers, but it does not compensate for within-speaker fluctuation in intensity related to changes in microphone distance. In the event of both poor microphone distance control and lack of control of flanking vowel quality (for the reference intensity), comparison of the release intensities is not reliable.

4.1 Determining Places of Articulation

Click sounds at the various places of articulation may be characterized as either abrupt or non-abrupt, based on the duration of the anterior burst. The bilabial and dental clicks are normally not abrupt, while the apical (post)alveolar [ǃ] and palatal [ǂ] clicks are generally abrupt. The abruptness of lateral clicks may be variable across languages or even speakers of the same language.
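As a brief aside on the intensity normalization described above, the within-speaker z-score can be sketched as follows. This is a minimal illustration and not a prescribed procedure: the function name and data layout are hypothetical, the use of the population standard deviation (`statistics.pstdev`) rather than the sample standard deviation is an assumption, and the intensity values are invented for the example rather than measured.

```python
from collections import defaultdict
from statistics import mean, pstdev


def zscore_within_speaker(measurements):
    """Normalize burst intensities within each speaker.

    measurements: list of (speaker, click_type, intensity_db) tuples.
    Returns a list of (speaker, click_type, z) tuples in the same order.
    """
    by_speaker = defaultdict(list)
    for speaker, _click_type, intensity in measurements:
        by_speaker[speaker].append(intensity)

    stats = {}
    for speaker, values in by_speaker.items():
        sd = pstdev(values)  # assumption: population SD; a sample SD is also defensible
        if sd == 0:
            raise ValueError(f"no intensity variation for speaker {speaker!r}")
        stats[speaker] = (mean(values), sd)

    return [
        (speaker, click_type, (x - stats[speaker][0]) / stats[speaker][1])
        for speaker, click_type, x in measurements
    ]


# Invented values (dB) for illustration only: speaker B is 10 dB quieter throughout.
example = [
    ("A", "dental", 60.0), ("A", "palatal", 70.0), ("A", "alveolar", 80.0),
    ("B", "dental", 50.0), ("B", "palatal", 60.0), ("B", "alveolar", 70.0),
]
for speaker, click_type, z in zscore_within_speaker(example):
    print(f"{speaker} {click_type:9s} z = {z:+.3f}")
```

In this invented example the two speakers differ in overall loudness, and after normalization their click types receive identical z-scores, which is the property that makes cross-speaker comparison of relative intensities possible. As the text notes, this does nothing to correct for microphone distance changing within a speaker's session.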


4.1.1 Click Primary (Anterior) Place

– A first task is to locate the anterior click burst using a spectrogram and waveform. A good trick is to magnify the view so that closely spaced anterior and dorsal bursts are separable by their different appearance in the waveform. See, for example, Figure 7.18 and accompanying discussion. Note that some glottalized clicks and all kinds of nasal clicks will not normally have any secondary (dorsal) burst.
– Select the anterior burst and use CTRL-L in Praat for View Spectral Slice; this computes and displays a power spectrum object obtained from an average over the selection. The selection has windows applied that are the same kind that you have set under Spectrogram Advanced Settings. Gaussian is usually the best window shape among the Praat alternatives.1 An alternative procedure is to select the burst of interest and first Extract Windowed Selection to get a sound object consisting of purely the burst, and then use the Spectrum function to compute a spectrum object. The spectrum obtained by this second procedure uses a single window covering the selection, and will not be exactly the same as with CTRL-L due to differences in the windowing and averaging algorithms used by Praat.
– Further processing can easily be done to characterize the features of this spectral object to determine the place of articulation. Inspect the spectrum first; if it has basically one hump overall, then Query: Get center of gravity can be used to obtain the spectral mean as a useful value. If the spectrum is generally flat or has multiple humps with high amplitude, the queried values will be less useful, and manually measuring the spectral peaks may be the most fruitful approach.
– Before measuring spectral peaks, it may be helpful to apply the Cepstral Smoothing function; this applies a mathematical transformation to yield a new spectrum object that has fewer peaks than the original spectrum, but with the same overall outline.
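For readers who want to see what the spectral slice and center-of-gravity steps compute, the following Python sketch applies a single Gaussian window to an extracted burst, takes a power spectrum with a plain discrete Fourier transform, and returns the power-weighted mean frequency. It is an illustration under stated assumptions and is not Praat's algorithm: the function names are hypothetical, the window width (one eighth of the selection) is an arbitrary choice, and values will differ somewhat from those Praat reports. The demonstration signal is a synthetic 3000 Hz tone, not a click recording.

```python
import cmath
import math


def gaussian_window(n):
    """Gaussian window over n samples; the width (n / 8) is an arbitrary choice."""
    mid = (n - 1) / 2.0
    sigma = n / 8.0
    return [math.exp(-0.5 * ((i - mid) / sigma) ** 2) for i in range(n)]


def power_spectrum(samples, sample_rate):
    """Return (frequencies in Hz, power values) from one windowed DFT of the samples."""
    n = len(samples)
    windowed = [s * w for s, w in zip(samples, gaussian_window(n))]
    freqs, powers = [], []
    for k in range(n // 2 + 1):  # non-negative frequencies up to the Nyquist limit
        acc = 0j
        for i, x in enumerate(windowed):
            acc += x * cmath.exp(-2j * math.pi * k * i / n)
        freqs.append(k * sample_rate / n)
        powers.append(abs(acc) ** 2)
    return freqs, powers


def center_of_gravity(freqs, powers):
    """Power-weighted mean frequency of the spectrum (the spectral mean)."""
    total = sum(powers)
    if total == 0:
        raise ValueError("spectrum has no energy")
    return sum(f * p for f, p in zip(freqs, powers)) / total


def spectral_peak(freqs, powers):
    """Frequency of the single highest-power bin."""
    return freqs[powers.index(max(powers))]


# Synthetic stand-in for a 13 ms burst selection at a 22050 Hz sampling rate.
SAMPLE_RATE = 22050
N_SAMPLES = int(0.013 * SAMPLE_RATE)
tone = [math.sin(2 * math.pi * 3000 * i / SAMPLE_RATE) for i in range(N_SAMPLES)]

freqs, powers = power_spectrum(tone, SAMPLE_RATE)
print(f"Nyquist limit: {SAMPLE_RATE / 2:.0f} Hz")
print(f"center of gravity: {center_of_gravity(freqs, powers):.0f} Hz")
print(f"spectral peak: {spectral_peak(freqs, powers):.0f} Hz")
```

For a spectrum with a single hump, as here, the center of gravity and the peak frequency nearly coincide. For flat or multi-humped spectra the two diverge, which is why the text recommends inspecting the spectrum before relying on the queried mean.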
To provide a few examples, we show waveforms of the five click places in Taa, as they appear with voiceless nasal accompaniment (thus leaving them with no dorsal burst to clutter our view, as in Figure 7.1). We also show how the three non-abrupt clicks can be discriminated by examining their burst spectra.

¹ If the researcher wants to make a direct statistical comparison between the abrupt (postalveolar, palatal) stops and the noisier (bilabial, dental) releases, they may want to use a short fixed window centered on the intensity peak of the release noise. It is important that the window be short enough that it doesn’t include any of the following vowel, or the spectral measure will be meaningless.

Recording and Measuring Acoustic Attributes of Clicks

We look first at the (non-abrupt) bilabial. The noisy burst lasts from approximately the 4 ms point through to 17 ms in this typical example. The power spectrum below was computed using the above procedure by selecting this span of the waveform.

Figure 7.2 Waveform and spectrum of bilabial click burst highlighted in Figure 7.1; spectrum center of gravity (COG) = 2772 Hz

The dental click waveform following displays an even longer burst; in fact this event lasts from the 3 ms mark through to 25 ms, which is more like a brief fricative than a consonant burst.

Figure 7.3 Waveform and spectrum of dental click burst in Taa word [ǀŋ̥úʔwì] ‘be careful’; dental burst spectrum COG = 3683 Hz

We observe that the dental is readily distinguished from the bilabial using basic attributes of the power spectrum. The following lateral click seems to be typical of most Taa speakers in our archive recordings. The burst duration is similar to that of the bilabial above.

Figure 7.4 Waveform and spectrum of lateral click burst in Taa word [ǁŋ̥áʔám̀ ] ‘be damp’; lateral burst spectrum COG = 2271 Hz

Note that the lateral spectrum center of gravity is once again quite different from that of the bilabial, and perhaps more importantly it also contains two rather prominent peaked areas that are absent from the other spectra.

Now we turn to the abrupt clicks, beginning with the (post)alveolar waveform below. This click place is quite unique in that its initial impulse can be so perfect that it is basically not visible in the waveform because it does not generate any noise. The most salient feature of the (post)alveolar click is its distinctive “wood block” resonant sound generated by the impulse, which can be observed in the waveform as a rapid vibration lasting from the 3 ms point through to 10 ms, after which its fading signal blends into the vowel phonation just beginning.

Figure 7.5 Waveform and spectrum of postalveolar click burst in Taa word [ǃŋ̥áʔã] ‘attack’; spectrum COG = 932 Hz

The other abrupt type of click is the palatal which, unlike the apical (post)alveolar above, is generated more with the tongue blade in the prepalatal or postalveolar area. This lends the click burst a distinctive high-pitched noise that begins at around the 2 ms mark in the waveform below, but has almost completely died away before the 10 ms point.

Figure 7.6 Waveform and spectrum of palatal click burst in Taa word [ǂŋ̥úʔùã] ‘be out of reach’; spectrum COG = 2710 Hz

The waveform analysis is sufficient on its own to distinguish these two click types from each other and from their non-abrupt counterparts above. Nevertheless we also observe large differences in the burst spectra. While (post)alveolar apical clicks are almost always abruptly released, the laminal (pre)palatal clicks vary in this respect. The preceding Taa example is abrupt, while Figure 7.7 shows an example from Yeyi which is not abrupt. Notice the sloppy anterior burst extending from 12 to 25 ms in the waveform plot, whereas the burst in the Taa example of Figure 7.6 lasted just a few milliseconds. The accompanying burst spectrum is accordingly very different from that of the Taa example.

Figure 7.7 Waveform and power spectrum of palatal click in Yeyi word [ʃìkǂá] ‘scarification’; spectrum COG = 5944 Hz

4.1.2 Secondary (Dorsal) Place

– Clicks other than nasal or glottalized should generally have a dorsal burst, which must follow the anterior burst. Sometimes the two can be produced almost simultaneously, but the articulatory demands of producing click suction dictate that they cannot in fact be simultaneous. Using the suggested technique of magnifying the waveform to separate different bursts, the dorsal burst can be isolated for analysis.
– As above, create a spectral slice object from the selected burst; this object can then be viewed and further analyzed.
– It may be useful to apply cepstral smoothing and then measure multiple peaks.
– It may be useful to compute the spectral center of gravity; generally the uvular clicks may be distinguished from the velar in having a lower spectrum overall with a lower frequency center.

The Taa word [ǀk’qàa] ‘hand’ provides us with an ejective “cluster click” that includes a double dorsal release. Let us analyze both dorsal releases for illustrative purposes. The waveform in Figure 7.9 shows a close-up view of the first combined burst; the dental burst occurs first, followed immediately by what the UCLA archive transcribers judged as a velar burst. The 125 ms time point is the approximate divider between the two, as can be seen from the sudden change in the nature of the waveform.

Figure 7.8 Spectrogram of Taa word [ǀk’qàa] ‘hand’

Figure 7.9 Dental and velar ejective bursts in Taa word [ǀk’qàa]

Figure 7.10 Second dorsal burst in Taa word [ǀk’qàa]

Next we show a close-up of the plosive burst at the beginning of the vowel in Figure 7.10, which the transcribers judged as a uvular [q]. This occurs between 350–360 ms at least. The time scale is continued from the above, so we see that the initial click/ejective double burst is separated from the remainder by over 200 ms of silence. Now, can we use spectrum analysis to decide which of the two dorsal bursts in this cluster is velar and which is uvular? The two respective power spectra are shown in Figure 7.11, and they look unfortunately too similar in overall shape to be judged in favor of either place of articulation.

Figure 7.11 Power spectra of first and second dorsal bursts in Taa word [ǀk’qàa]

The listening test isolating the second burst and its accompanying vowel does sound quite uvular, though not extremely so. The place of articulation here may also be determined from the formant transitions going into the immediately following vowel. Unfortunately, this is not as “definitive” a feature of places of articulation as one might wish, but nevertheless it can be observed in the spectrogram that this dorsal release does not betray the telltale proximity of formants 2 and 3 which is indicative of a velar (the so-called “velar pinch” effect), and so this dorsal is more likely to be a uvular in accord with the auditory impression.

4.2 Nasality in Clicks

– A velar or uvular voiced nasal for some time preceding the click anterior burst is a hallmark of the voiced nasal click.
– The nasal accompaniment may cease almost at the moment of the anterior burst, or may continue for some period of time.
– It is fairly easy to distinguish the nasal phase of these clicks from the voiced closure during a voiced click; the nasal should generally be louder than a voiced closure and have a spectrum that extends into frequencies of 1000 Hz or above. A voiced closure, on the other hand, will show only a “voice bar”, the low spectral peak around 200 Hz.
– Nasal clicks lack a dorsal burst; this is how they can be distinguished from prenasalized voiced clicks, and this conspicuous absence can also signal a voiceless nasal accompaniment in the event that no nasal can actually be heard.
– Some languages (e.g. Taa) also present voiceless nasal clicks. In these examples, such as were used earlier to discuss the places of articulation, there is neither any audible nasal nor any audible dorsal burst. The fact of the voiceless nasal accompaniment can only be firmly established with the aid of airflow measuring equipment.

Figure 7.12 Spectrogram of Yeyi word [m̀ ŋǃójì] ‘Acacia erioloba’ (type of plant) exemplifying a voiced nasal alveolar click
Note: Yeyi is a Bantu language with numerous clicks; recordings were obtained by Peter Ladefoged and Rainer Vossen in Botswana in 1998.

4.3 Voicing in Clicks

– Voiced clicks have a voiced velar or uvular stop accompanying; this can be recognized by the voice bar (around 200 Hz spectral peak) preceding the first burst.
– Voiced clicks also show a dorsal burst which follows the anterior burst, although this may be difficult to separate from the vowel which follows immediately.
– Some voiced clicks are also prenasalized; these look like voiced clicks in every respect but they begin with a brief nasal phase prior to the purely voiced closure.
– Some languages contrast voiced clicks with voiced prenasalized, but in these cases the latter can equally well be analyzed as a cluster comprising a nasal followed by a homorganic voiced click.

Figure 7.13 shows a voiced click [ɡǃ] in Yeyi; Figure 7.14 shows a voiced prenasalized click [ɴɢǂ] in Taa. The slight energy in the 1000 Hz area should be noted as indicative of the prenasalized closure, while the slight gap following the click release leads to a dorsal burst from the [ɢ]. This demonstrates a difference from a purely nasal click; here we find both a nasal and a voiced uvular stop in quick succession.

Figure 7.13 Spectrogram of Yeyi word [kà ɡǃáwà] ‘calabash’ exemplifying a voiced alveolar click

Figure 7.14 Spectrogram of voiced prenasalized palatal click in the Taa word [ɴɢǂà̰̰li] for a kind of tree

4.4 Breathy/Whispery Voicing in Clicks

– Breathy or whispery voicing, sometimes called “murmur,” can accompany clicks; this may occur during the voiced closure but is more likely to occur during the first part of the vowel following release of the voiced click. In the case of a nasal accompaniment, the nasal itself will have the breathy voicing.
– Breathy voicing, produced with slack vocal cords which allow excess airflow, is characterized by, among other things, a decreased harmonicity value in comparison to modal voicing. The harmonicity is a measure of the sound power in the harmonics (periodic part of the sound) in proportion to the power in the noise. There is no one standard way of computing it, and some methods call it the harmonics-to-noise ratio, or HNR. Praat has an excellent algorithm for computing harmonicity.
– Whispery voicing, produced by means of a small glottal opening which allows unvoiced airflow alongside the vocal vibration, has been shown (Fulop and Golston 2008) to measure even lower harmonicity than breathy voicing.
– It is important, however, to compare harmonicity in vowels or voiced sounds which are otherwise identical; the harmonicity of [a] is naturally different from that of [e] or [u].
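As a rough illustration of what a harmonicity value expresses, the sketch below estimates a harmonics-to-noise ratio from the strongest normalized autocorrelation peak r within a plausible pitch range, using HNR = 10·log10(r / (1 − r)). This is a deliberately simplified stand-in and not Praat's algorithm, which is considerably more refined. The function names, the pitch range of 75–500 Hz, and the synthetic "voices" are all assumptions made for the example.

```python
import math
import random

def harmonicity_db(samples, sample_rate, min_pitch=75.0, max_pitch=500.0):
    """Estimate harmonics-to-noise ratio (dB) from the best autocorrelation peak.

    For each candidate lag in the pitch range, the signal is correlated with a
    lagged copy of itself; the strongest normalized correlation r gives
    HNR = 10 * log10(r / (1 - r)).
    """
    n = len(samples)
    min_lag = int(sample_rate / max_pitch)
    max_lag = min(int(sample_rate / min_pitch), n // 2)
    best_r = 0.0
    for lag in range(min_lag, max_lag + 1):
        a = samples[: n - lag]
        b = samples[lag:]
        num = sum(x * y for x, y in zip(a, b))
        den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
        if den > 0:
            best_r = max(best_r, num / den)
    best_r = min(max(best_r, 1e-6), 1.0 - 1e-6)  # keep the logarithm finite
    return 10.0 * math.log10(best_r / (1.0 - best_r))

def synth_voice(noise_level, sample_rate=8000, f0=125.0, duration=0.1, seed=0):
    """Hypothetical voiced sound: two harmonics plus Gaussian noise."""
    rng = random.Random(seed)
    n = int(sample_rate * duration)
    return [
        math.sin(2 * math.pi * f0 * i / sample_rate)
        + 0.5 * math.sin(2 * math.pi * 2 * f0 * i / sample_rate)
        + noise_level * rng.gauss(0.0, 1.0)
        for i in range(n)
    ]

# More aspiration-like noise in the signal yields a lower harmonicity value.
modal_like = harmonicity_db(synth_voice(0.05), 8000)
breathy_like = harmonicity_db(synth_voice(0.3), 8000)
```

The comparison between the two synthetic signals mirrors the point made above: a harmonicity value is interpretable mainly relative to an otherwise comparable sound, not as an absolute number.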

Figure 7.15 Praat Edit view of the Xhosa word [ukúŋǃola] ‘to climb up’, wherein a portion of the nasal is selected for computing the harmonicity
Source: Xhosa example is taken from a recording in the UCLA Phonetics Lab Archive

– In Praat, harmonicity is one of the functions available under Periodicity, which may be applied to a sound object. To measure the harmonicity during a relevant portion of a vowel or other sonorant sound, select the portion of interest and extract it to a sound object. The harmonicity function will compute a “track” of the harmonicity through the duration of a sound, which can be drawn, or queried for its mean value or specific values at different times.

The Praat Edit view of the Xhosa word [ukúŋǃola] ‘to climb up’ is shown in Figure 7.15, wherein a portion of the nasal is selected for computing the harmonicity. The harmonicity of this nasal has a mean value of 24 dB, while the corresponding breathy nasal in the word [ukúŋ̤ǃala] ‘to go straight’ has a mean harmonicity of 15 dB. This difference is the simplest way to establish the presence of breathy or whispery voicing, but it does require a corresponding nonbreathy sound of the same type for comparison. The absolute value of the harmonicity varies widely for different speakers and phonetic contexts.

Some sources describe these Xhosa clicks as slack voiced rather than breathy, but the difference between slack and breathy voicing is mostly a matter of degree; the general posture of the vocal cords is the result of slackening in each case, and the acoustic correlates of breathy and slack voicing have been found to be very similar overall, with breathy voice somewhat more extreme.

It is also possible, and commonplace, to measure breathy voicing using the relative amplitudes of the lower harmonics in a power spectrum of a fairly long (around 50 ms) portion of the sound in question. Fulop and Golston (2008), for example, demonstrated that both this measure and the harmonicity are about equally good at measuring breathy voicing in Hmong. Harmonicity, however, is much easier to measure given the available tools, and may even be slightly more reliable.

4.5 Glottalized vs. Ejective Clicks

Glottalized and ejective clicks both have an accompanying glottal stop. The difference is that the glottalized variety maintains the glottal stop both before and after the dorsal closure period, thereby preventing any dorsal release burst from occurring. For this reason they are often transcribed without a dorsal symbol, as in the Taa word [ǀˀâː] ‘die’ (see Figure 7.16). Ejective clicks instead produce a laryngeal egressive release of the dorsal closure during the glottal closure, usually occurring immediately following the anterior click release.

Figure 7.16 Spectrogram of Taa word [ǀˀâː] ‘die’ showing glottalized dental click

An example is found in the Taa word [ǀq’ɜ́n] ‘small, pl.’, shown in Figure 7.17. It is difficult to see in the spectrogram below that the apparent burst is actually a double burst, consisting of the dental click followed by the uvular ejective. The waveform in Figure 7.18 shows the relevant portion of the sound, in which it is clear that there are two separate obstruent events of distinct character.

Figure 7.17 Spectrogram of Taa word [ǀq’ɜ́n] ‘small, pl.’ showing dental ejective click

Figure 7.18 Waveform showing click burst from Figure 7.17; observe the dental burst from ~95–115 ms, and the dorsal ejective burst from ~130–145 ms.

4.6 “Cluster” Clicks: Sorting Out Multiple Events

Some languages such as Taa and Juǀʼhoansi employ complex syllable onsets involving clicks mixed with other stops in unusual ways. Many of the transcriptions of such clusters which can be found in literature or other documentation are unfortunately not correct, so for these onsets it is especially important to analyze or reanalyze any sound recordings which may be accessible.

An example is furnished in Figure 7.19, which depicts the Taa word meaning ‘to cry incessantly,’ transcribed in the UCLA phonetic archive material as [ɡǃk’qãː]. This transcription means that we should observe a voiced stop closure followed by an alveolar click followed by a velar ejective, the latter requiring a separate voiceless portion with a velar burst (since an actually voiced ejective is phonetically impossible), which would then leave a gap for the glottal stop, and finally a uvular stop before the vowel. Closer analysis of the recording shown in the spectrogram demonstrates easily that there is in fact no velar ejective. Instead, there is a voiced alveolar click which is dorsally released into a glottal stop. The Praat Edit screen view in Figure 7.20 shows the relevant portion, in which the vertical dashed line separates the very abrupt alveolar click from the velar release which follows. There is then a brief vocalic episode which slows to a glottal stop. There is no evidence whatsoever of a velar ejective. The corrected transcription of the item should then be [ɡǃʔqáã], in which the supposed ejective is replaced with a glottal stop.

Figure 7.19 Spectrogram of Taa word meaning ‘to cry incessantly’

Figure 7.20 Praat Edit screen showing initial portion of the Taa word from Fig. 7.19

5 Conclusion

In this chapter we have described elicitation, recording, and acoustic measurement techniques that we think are appropriate for acoustic field (community-based) research on clicks. We focus on acoustic description of clicks because we feel that practical guides to the acoustics of obstruents in general, and clicks in particular, are largely missing from the field-research literature. We have highlighted both the lexical/wordlist and collaborative task elicitation methods because they both allow a high sampling density of desired forms. While the former permits greater control over the contextual and lexical factors, the latter elicits a more natural style of speech. In the acoustic methods section we have focused primarily on release burst characteristics, but we have also discussed accompanying articulations and segmental contexts. Discussion of microphone and recording technical issues has been relegated to the appendix.

We hope that this chapter will encourage researchers to include acoustic phonetic descriptions in their research.

Appendix A: Recording Methods

In recording a session there are several practices that are widely employed for practical and ethical reasons. First of all, it is an absolute necessity that appropriate human subjects review (IRB) has approved the research at the host institution. Many institutions may view any voice recording and all video recording as needing informed consent, but it really depends on the nature of the behavior being recorded and the guidelines of the institution. Most institutions will insist on IRB approval even if it is not legally mandated in the country where the recordings are taking place. Most journals will not publish research unless the recordings were made under ethical conditions including IRB approval.

It is important to have all of the equipment tested and ready to roll well before the recording session begins. In the field, batteries are often the sole source of power, so it is important that they be tested too. There are often unexpected interruptions and there are sometimes opportunities to make supplemental recordings, so extra batteries should be available. Any technical issues during the recordings can create tension and spoil the moment. With delays, participants may become tense, resulting in unnatural speech, or they may get bored and wander off. Similarly, all of the wordlists, conversational tasks, etc., should be well worked out and “piloted” before any real data is recorded.

Best practices for audio recordings recommend that a backup recorder be rolling to avoid data loss if the main recorder fails. They also encourage video documentation of the entire recording session. A video can be very useful in interpreting the events during the recording, and it may capture important aspects of the interactions, such as nonverbal gestures, that help to disambiguate meanings of words.
The video camera should be positioned so that the language consultant is clearly seen but so that the consultant and the researcher can both be heard. Before choosing recording equipment it is important to take cultural factors into consideration: some cultures may object to, or may be made uncomfortable by, video recordings. Similar considerations should be made when choosing a microphone. For example, clothing may not be amenable to clip-on lavalier microphones; on the other hand, head-mounted microphones may be problematic because of restrictions on touching the head and face. In setting up the recording environment, the researcher should be audible enough for their words to be heard so that elicitation ordering and glosses are recorded, but it is not necessary for them to be optimally recorded. This may affect the choice of microphone array. At the beginning of the recording the researcher should indicate important metadata about the recording (date, language, dialect, purpose of the recording, etc.) so that there is a permanent record embedded in the recording. Similarly, if multiple participants are being recorded, their microphone channels should be indicated.



Recorders and Microphone Considerations

There are currently several recording options available to the researcher: portable flash recorders, external A-D (analog to digital) devices connected to a computer, or recording directly into a computer. In general it is better to avoid using a computer during a recording for several reasons: its fan can generate a significant amount of background noise in the recording, it is less convenient to position for recordings than smaller devices, the onboard sound card in a typical laptop is lower quality than a professional A-D device, and it is more obtrusive than smaller devices. Obtrusiveness is an important consideration if natural spontaneous speech is the object of the recordings. If a computer is being used in a recording, an external professional USB microphone (described below) or an external A-D device is a necessity, since there are no professional microphones that use the 1/8” jacks that laptop computers have. External professional A-D devices with XLR inputs (such as the Focusrite Scarlett 2i2 with two XLR inputs, or the Steinberg UR824 with 8 XLR inputs) are commonly used in laboratory recording settings because they provide the cleanest recordings (high SNR), because they have a faithful frequency response, and because they permit multiple XLR microphone inputs. The last item can be a necessity if multiple consultants are being recorded at the same time. In a laboratory setting with a sound booth, the computer noise is not an issue because it is isolated from the microphones; in the field, however, such isolation is more difficult. Additional considerations include power sources for the A-D device. However, if multiple participants are being recorded simultaneously this may be the preferred option. There are also several professional recording options for low-noise devices such as iPads or other tablets (such as the Alesis iO Dock, or the Apogee Duet).
These devices have the advantage of being lighter and sturdier than a typical laptop; however, storage capacity can be an issue in the field. The researcher should be sure to select a professional recording interface with XLR inputs and low-noise A-D encoding (typically 16–32 bit), a sampling rate that captures a broad frequency range (typically 44.1 kHz), and lossless encoding (wav, aiff, and not mp3 or mp4). These devices are typically more expensive than the equivalent quality computer-USB interface, but the noise reduction and smaller size may be worth the extra expense.

Solid state flash recorders are by far the most commonly chosen option (such as the Zoom H4n and Tascam DR-40 with two XLR inputs, or the Tascam DR-680MKII and Roland R-44 with four XLR inputs). There is a wide range in the recording quality available in these devices, so care should be taken in ensuring that they have high fidelity encoding across the relevant frequencies (20 Hz to 10 kHz). Typically the lowest priced recorders are intended more as Dictaphones than as professional recording tools. It is also important to ensure that they have 48 V phantom power for XLR condenser microphones. In choosing a recorder for the field, power considerations and durability are an issue. If a recorder can run off of commonly sold batteries (in addition to power from an outlet) it may be preferable to one that has rechargeable batteries.

Decisions about microphones should be driven by research considerations. First of all, the frequency response of the microphone should be taken into account: the microphone should have as “flat” a response as possible within the speech range (50–10,000 Hz) as in Figure 7.21. When choosing specialized microphones, professional “balanced” microphones with an XLR jack (with three prongs in a triangular orientation) are absolutely necessary if analog microphones are used, because they have a shielded cable and so won’t create a hum in the recording. Furthermore, a professional microphone will typically have a better frequency response than consumer microphones. Alternatively, professional USB microphones are becoming more widely available. There are a lot of low quality USB microphones available and they should be avoided because they have poor quality encoding of speech signals. The advantage of the professional USB microphones is that the A-D sampling occurs within the microphone and only digital information is sent across the cable, thereby eliminating the potential problem of line-noise. A disadvantage is that they typically require a computer to “record” the digital signal, and computers are notoriously noisy. Therefore, efforts should be made to isolate the computer from the recording area.
An additional problem is that USB microphones are offered in a more limited selection, and the appropriate type of microphone for the recording task may not be available. Many portable professional flash recorders (Zoom H4n, Sony PCM-D100, Tascam DR-100mkII, and others) have built-in microphones that have reasonably good frequency responses (the Zoom H4n built-in frequency response is illustrated in Figure 7.21) in addition to allowing XLR inputs for more specialized microphones.

Figure 7.21 Frequency response of a built-in microphone on the Zoom H4n: fairly flat up to 20 kHz

The best way to optimize the speech-signal to noise ratio (SNR) is to reduce the distance of the microphone to the speech source. This is particularly important in field recording conditions where there is less control over background noise than in lab recordings. Head-mounted microphones are the best way to minimize source to microphone distances; however, many sociolinguistic researchers consider head-mounted microphones to be too obtrusive (running the risk of making the language consultant self-conscious), and they can be challenging to place correctly to avoid microphone overloading or “pops” from breath. They are especially useful for dyadic conversational recordings since they minimize noise and interlocutor pickup (aka “crosstalk”) that can otherwise make acoustic measurements impossible. Because they fix the distance to the voice source, an important consideration if measures of amplitudes are important to the study, head-mounted microphones can be crucial.

Lavalier microphones, small microphones that either clip onto the consultant’s clothing or can be worn with a lanyard around the neck, are popular because they can be worn and are therefore less obtrusive than head-mounted microphones, and because they remain relatively close to the source, thereby increasing the intensity of the signal in relation to the background noise. However, their placement can be a bit tricky: it is important that they be placed out of the “chin shadow”, about the third button down from the neck on a western shirt. “Tabletop” microphones are popular in the field because they are relatively easy to orient and they are rugged; however, it is important to use a microphone stand, or some other way of getting the microphone close enough to the speech source to make high fidelity recordings.

Under most field conditions, “cardioid” (AKA “directional”) microphones should be used rather than “omnidirectional” microphones because cardioid microphones have a narrower field and pick up less of the environmental noise. “Shotgun” (AKA “superdirectional”) microphones are also a good choice for the field as they are good for recording in noisier environments. However, they are more expensive, delicate, and obtrusive than regular cardioid microphones. Figure 7.22 shows the pickup patterns (the dark lines) of omnidirectional, cardioid, and shotgun microphones. As can be seen from the patterns, the shotgun microphone will pick up the least environmental noise, while the omnidirectional microphone will pick up environmental noise from all directions. Most researchers use condenser microphones in the field because they are sensitive to a wide range of sounds and have a louder output than dynamic microphones.
They typically require 48 V phantom power, which they draw from the recorder or from a portable power source. Phantom power means that the microphone can be connected directly to the recorder (with phantom power enabled) without a microphone amplifier (thereby significantly decreasing the amount of equipment); however, it also uses significantly more power (a consideration when batteries are being used).

Figure 7.22 Pickup patterns of omnidirectional, cardioid, and shotgun microphones



Digital Recording Considerations

When a recording is being digitized there are a few important guidelines to follow. A common digital recording quality is a 44.1 kHz sampling rate with a 16 bit depth (CD quality) and a lossless format (.wav, .aiff, etc.). Lossy formats, such as .mp3 or .mp4, should be avoided since they introduce noise and typically have shaped spectral encoding (useful for compression but destructive to the signal). The standard sampling rate of 44.1 kHz, with a Nyquist frequency (the highest frequency actually captured, equal to half the sampling rate) of 22.05 kHz, is more than sufficient for most purposes. It is the industry standard because human hearing is sensitive up to only approximately 20 kHz. However, a lower sampling rate of 22.05 kHz is also sufficient, since the speech signal contains important information only up to approximately 10 kHz. The standard 16 bit sampling depth provides an excellent noise floor, undetectable under most listening conditions. Nevertheless, some archives require much higher sampling rates and bit depths (e.g. 48 kHz 24 bit, or 96 kHz 32 bit), so if the recordings are destined for a particular archive their guidelines should be followed.

The biggest issue with digital recordings that researchers typically encounter is clipping, which occurs when the incoming signal exceeds the maximum level that the chosen bit depth can represent. On most recorders, when the loudness meter is in the yellow there is danger of clipping, and when it is in the red clipping will occur. A clipped signal cannot be repaired, and if the clipping occurs during important portions of the signal it will make acoustic measures, and maybe even transcriptions, impossible. The solution is to be conservative in recording; only the loudest useful (e.g. excluding laughing) portion of the signal should ever go into the yellow.
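The arithmetic behind these guidelines can be checked with a short sketch. The helper names below are hypothetical, the clipping check assumes signed integer PCM samples, and the dynamic range formula is the idealized 20·log10(2^bits) figure for linear PCM rather than a property of any particular recorder.

```python
import math

def nyquist_hz(sample_rate_hz):
    """Highest frequency representable at a given sampling rate."""
    return sample_rate_hz / 2.0

def ideal_dynamic_range_db(bit_depth):
    """Idealized dynamic range of linear PCM at a given bit depth."""
    return 20.0 * math.log10(2 ** bit_depth)

def clipped_fraction(samples, bit_depth=16):
    """Fraction of signed integer PCM samples sitting at the extreme values.

    Runs of samples pinned at the minimum or maximum code are a typical
    symptom of clipping; a nonzero fraction is a prompt to inspect the audio.
    """
    lo = -(2 ** (bit_depth - 1))
    hi = 2 ** (bit_depth - 1) - 1
    if not samples:
        return 0.0
    return sum(1 for s in samples if s <= lo or s >= hi) / len(samples)

# A 44.1 kHz recording captures up to 22.05 kHz; 22.05 kHz sampling still
# covers the ~10 kHz region that matters for speech.
cd_nyquist = nyquist_hz(44100)
low_rate_nyquist = nyquist_hz(22050)

# Hypothetical 16-bit samples, three of which are pinned at the limits.
demo = [0, 12000, 32767, 32767, -32768, 500]
demo_clipped = clipped_fraction(demo)
```

Checking a pilot recording this way before the main session is one means of confirming that input levels were set conservatively enough.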

Recording and Measuring Acoustic Attributes of Clicks

273

References

Anderson, Anne H., Miles Bader, Ellen G. Bard, Elizabeth H. Boyle, Gwyneth M. Doherty, Simon C. Garrod, Stephen D. Isard, Jacqueline C. Kowtko, Jan M. McAllister, Jim Miller, Catherine F. Sotillo, Henry S. Thompson, and Regina Weinert. 1991. “The HCRC Map Task corpus.” Language and Speech 34(4): 351–366.
Aylett, Matthew, and Alice Turk. 2004. “The smooth signal redundancy hypothesis: A functional explanation for relationships between redundancy, prosodic prominence, and duration in spontaneous speech.” Language and Speech 47(1): 31–56.
Boersma, Paul, and David Weenink. 2014. Praat: doing phonetics by computer [Computer program]. Version 5.4.04, retrieved 28 December 2014 from http://www.praat.org/.
Bouquiaux, Luc, and Jacqueline M.C. Thomas. 1992. Studying and Describing Unwritten Languages. Translated by James Roberts. Dallas: Summer Institute of Linguistics.
Di Paolo, Marianna, and Malcah Yaeger-Dror. 2010. “Field methods: Gathering data, creating a corpus, and reporting your work.” Sociophonetics: A Student’s Guide, ed. Marianna Di Paolo and Malcah Yaeger-Dror. New York: Routledge, pp. 7–23.
Dickens, Patrick J. 1994. English-Juǀ’hoan, Juǀ’hoan-English dictionary. Quellen zur Khoisan-Forschung/Research in Khoisan Studies 8. Cologne: Rüdiger Köppe.
Eckert, Penelope. 2000. Linguistic Variation as Social Practice. Oxford: Blackwell.
Feagin, Crawford. 2002. “Entering the Community: fieldwork.” Handbook of Language Variation and Change, ed. Jack K. Chambers, Peter Trudgill and Natalie Schilling-Estes. Oxford: Blackwell, pp. 20–39.
Freeman, Valerie, Julian Chan, Gina-Anne Levow, Richard Wright, Mari Ostendorf, and Victoria Zayats. 2014. “Manipulating stance and involvement using collaborative tasks: An exploratory comparison.” Proceedings of the 15th Annual Conference of the International Speech Communication Association (Interspeech 2014), Singapore, Sept. 14–18.
Fulop, Sean A., and Chris Golston. 2008. “Breathy and whispery voicing in White Hmong.” Proceedings of Meetings on Acoustics, vol. 4. Melville, NY: Acoustical Society of America.
Haacke, Wilfrid H.G., and Eliphas Eiseb. 2002. A Khoekhoegowab Dictionary, with an English-Khoekhoegowab Index. Windhoek: Gamsberg Macmillan.
Heath, Jeffrey, and Laura McPherson. 2009. “Cognitive set and lexicalization strategy in Dogon action verbs.” Anthropological Linguistics 51(1): 38–63.
Hutchison, John P. 1971. “A composite English-French research guide for vocabulary elicitation in Africa: based on the vocabulary lists of Meinhof, Greenberg, University of Ibadan, Koelle, Swadesh-200 and 100, totaling one thousand and sixty-nine entries.” Unpublished Manuscript. [99pp].

Fulop and Wright

Kilian-Hatz, Christa. 2003. Khwe Dictionary. With a supplement on Khwe place names of West Caprivi. Namibian African Studies 6. Cologne: Rüdiger Köppe.
König, Christa, and Bernd Heine. 2008. A Concise Dictionary of Northern ǃXun. Quellen zur Khoisan-Forschung/Research in Khoisan Studies 21. Cologne: Rüdiger Köppe.
Labov, William. 1984. “Field methods of the project on linguistic change and variation.” Language in Use: Readings in Sociolinguistics, ed. John Baugh and Joel Sherzer. Englewood Cliffs, NJ: Prentice-Hall, pp. 28–66.
Ladefoged, Peter. 2003. Phonetic Data Analysis: An Introduction to Fieldwork and Instrumental Techniques. Oxford: Blackwell.
Sands, Bonny, Johanna Brugman, Mats Exter, Levi Namaseb, and Amanda L. Miller. 2007. “Articulatory Characteristics of Anterior Click Closures in Nǀuu.” Proceedings of the International Congress of Phonetic Sciences XVI, Saarbrücken, Germany, August 2007, pp. 401–404. Available online: http://www.icphs2007.de/conference/Papers/1540/.
Sands, Bonny, Andy Chebanne, and S. Shah. 2011. “Direct Lexical Elicitation in Language Documentation: Examples from ǂHoan.” Paper presented at ACAL 2011, 42nd Annual Conference on African Linguistics, June 10–12, 2011, University of Maryland, College Park, MD.
Schilling, Natalie. 2013. Sociolinguistic Fieldwork. Cambridge: Cambridge University Press.
Schilling-Estes, Natalie. 1998. “Investigating ‘self-conscious’ speech: The performance register in Ocracoke English.” Language in Society 27: 53–83.
Snider, Keith, and James Roberts. 2006. “SIL Comparative African Wordlist (SILCAWL).” SIL Electronic Working Papers 2006–005. Available online at: www.sil.org/silewp/2006/silewp2006-005.pdf.
Traill, Anthony. 1994. A ǃXóõ Dictionary. Quellen zur Khoisan-Forschung/Research in Khoisan Studies 9. Cologne: Rüdiger Köppe.
Visser, Hessel. 2001. Naro Dictionary: Naro-English, English-Naro, 4th edition. Ghanzi, Botswana: Naro Language Project & SIL International.
Warner, Natasha. 2011. “Experimentation Methods For Studying Spontaneous Speech.” The Oxford Handbook of Laboratory Phonology, ed. James M. Scobbie, Jane Stuart-Smith, Natasha Warner, Paul Warren and Jennifer Hay. Oxford: Oxford University Press, pp. 621–633.
Warren, Paul, and Jennifer Hay. 2011. “Methods and Experimental Design for Studying Sociophonetic Variation.” The Oxford Handbook of Laboratory Phonology, ed. James M. Scobbie, Jane Stuart-Smith, Natasha Warner, Paul Warren and Jennifer Hay. Oxford: Oxford University Press, pp. 634–642.

Chapter 8

Nasalized Accompaniments in Proto-Khoe and in Khwe

E.D. Elderkin

1 Introduction

Clicks are versatile sounds in that they can co-occur with many simultaneous and near simultaneous phenomena to form what are often analyzed as single consonants. One of these phenomena is an open velic passage. This may co-occur distinctively with both glottalized and aspirated clicks within individual Khoe languages, although it would seem that, in proto-Khoe, glottalized clicks were articulated with an open velic passage; aspirated clicks do not reconstruct to proto-Khoe. Within NKhoisan and SKhoisan it seems that, historically, the collocation of nasality and prosodies of phonation type has brought about a fascinating proliferation of what seem to be click units with nasality.

This paper is historically based and set within the Khoe languages. It becomes concerned with those clicks which can be seen to have a place in the phonological system parallel to nasal and (pre)nasalized consonants produced with a pulmonic airstream. In a phonological system, just using one place of articulation as an example, the four units t, d, ⁿd and n, may be considered analogous to ǀ̥, ǀ̬, ᵑǀ̬ and ǀ̃ respectively. This paper focusses on the two latter accompaniments, prenasalized and nasalized.

Khwe, and some other languages in the Kalahari branch of Khoe, have two distinctive accompaniments involving nasalization with clicks. One is a straightforward (voiced) nasalized click, where the cessation of the nasalization is followed immediately by a vowel. The second is, in Khwe, described as a (voiced) prenasalized voiced click, in which the nasalization ceases before the click is released, but voicing remains, which results in the release of the influx being voiced before the transition to the vowel. Within Khwe, these two different clicks are found in CVV forms and therefore are potentially contrastive although no minimal pair has been found. It therefore cannot be predicted in Khwe where nasalized and prenasalized clicks occur and they are treated as phonologically distinct units. The prenasalized clicks sit uneasily in the system of proto-Khoe, which has no other prenasalized units.

© Koninklijke Brill NV, Leiden, 2020 | doi:10.1163/9789004424357_009

1.1 Reconstructions: Sources and Attributions

The Khoe languages are here considered to fall into two groups, following Voßen (1997): Cape (Voßen’s Khoekhoe) and Kalahari (Voßen’s Nicht-Khoekhoe). The Kalahari languages have two groups, Eastern and Western; Western is again subdivided into South West (Naro and Gǀui) and North West (only Khwe is mentioned here). ‘Proto’ is abbreviated to p and written as one word with the name of the group; Kalahari is abbreviated to Kal.

Reconstructions are made from the four major corpora of tonally reliable data for individual Khoe languages which have become available in the third millennium, listed below.
– Khoekhoegowab: Haacke and Eiseb (2002)
– Naro: Visser (2001)
– Gǀui: Nakagawa et al. (2004)
– Khwe: Kilian-Hatz (2003)

Where the name of the language is given, it refers to these data sources, the dictionaries, listed above. Khoekhoegowab is abbreviated to SNK (Standard Namibian Khoekhoegowab), which is defined by Haacke and Eiseb (2002). Recently, data which appears in Mathes and Chebanne (2014) and Mathes (2014, 2015) has allowed forms from the most southeasterly Khoe language to feed into reconstructions. The sources for Juu (also known as Ju, or ǃXun) are Snyman (1975), (Afrikaans glosses), and Dickens (1994), (English glosses). Traill (1994) is used for ǃXóõ.

Usually only reconstructions which have reflexes in both SNK and Khwe are here considered. Where Voßen has established a proto form including evidence from Eastern Kalahari, even without tonal corroboration, his label proto-Khoe (pKhoe) is retained. Where a form is reconstructed from at least SNK and Khwe but does not have further corroboration from Eastern Kalahari, it is labelled cWKhoe (common Western Khoe). (Forms labelled ‘common’ do not uniquely share a superordinate grouping.)

Other abbreviations are
– cSWFringe form found in SNK and at least one of Naro and Gǀui
– cNEKal form found in Khwe and Eastern Kalahari
– pKal proto-Kalahari
– pWKal proto-Western Kalahari
– pSWKal proto-South-Western Kalahari
– preSNK form deduced by internal reconstruction from SNK

1.2 Conventions of Representation

X is used as a cover term to represent any symbol for a click (ǀ, ǂ, ǃ or ǁ). Besides the reconstruction for which he uses *X̃ N, which will be discussed later, Voßen also uses the symbols X̃ and X̃ n. There is a problem of representation. In order to achieve consistency with the symbolization used here, and to
achieve the maximum differentiation in symbols, Voßen’s transcription is here changed.

Voßen        X̃ n    X̃
this paper   X̃      ᵑX̬

However, there is one exception: in dealing with the Cape languages (Khoekhoe and ǃOra) X̃ is used by Voßen to represent [X̃ ]; the present transcription retains X̃ . Although the present use of X̃ is phonetically plausible, it is not claimed that the representation ᵑX̬ necessarily asserts that a prenasalized voiced click is the phonetic realization: the use of ᵑX̬ means that Voßen used X̃ and not X̃ n. At the pKhoe level, Voßen’s *X̃ is retained.

Reconstructions presume a five-vowel system at V₁ for pKhoe. They follow the practical system given in Elderkin (2016). Occasionally reconstructions from Voßen (1997) are quoted; they are preceded by the abbreviation ‘RV97’.

2 Reconstructions

The earliest attempt to reconstruct lexical proto forms for the Khoe languages is that of Baucom (1974). Only one type of nasalized click was reconstructed. Voßen (1997) is based in the main on his own field work of 1983 and 1984, and is to date the most important work in one volume on the reconstruction of pKhoe, morphologically as well as phonologically. He basically reconstructs two clicks with nasalization for pKhoe; he writes them as *X̃ and *X̃ N. We may first comment on the latter, and then disregard it.

Of the nine reconstructions with *X̃ N, there are six which, with the hindsight of the detail of modern dictionaries, are seen to have aberrant tonal patterns (for details, see Elderkin 2013). These tone patterns do not fit the usual reflexes of pKhoe *HH, *HL, *LH or *LL, and are usually a corollary of the donor item having a phonation type, such as breathiness or pharyngeality, which was unfamiliar to speakers of pKhoe. In Eastern Kalahari and several of the more easterly varieties of Western Kalahari, the initial consonant of these six examples with *X̃ N loses its nasality and its voicing. This can be attributed to a restriction on what consonants can co-occur with the relevant phonation type. The phonation type is no longer heard in the Khoe languages, but its results, tonal and segmental, persevere.

The remaining three items with *X̃ N, all with a palatal click, [ǂ̃], also lose their nasality as a concomitant of palatal click loss in some eastern varieties, and this similarity led Voßen to place them with the early loans. But they do not have aberrant tonal patterns and fit more properly into his *X̃ category.

Having suggested that Voßen’s *X̃ N is unnecessary, Voßen’s *X̃ remains. He recognizes that there are two major reflexes of this and attributes the seemingly random distribution of these, X̃ and ᵑX̬ , to the chaos caused by a rule which is in progress. This rule, however, is not stated in full. This paper attempts to state the rule, to show where it applies, and to comment on its application in Khwe.

After Voßen, two other workers have attempted reconstructions for pKhoe. Both project the distinctiveness of Khwe X̃ and ᵑX̬ backwards from Khwe into pKhoe. Honken (2010) and Elderkin (2008: 123) therefore postulate that Voßen’s *X̃ should be separated into two proto units, symbolized by Honken as *ŋXɡ and *ŋX, and by Elderkin as *X̃ ₁ and *X̃ ₂.

In order to investigate the position of these two starred clicks in the system, a list of reconstructions is now presented for each of these clicks with nasalization. Symbols for the two starred nasalized clicks are reduced to one as the lists are grouped according to the realization of the nasalized click in Khwe. To those reconstructions which were already available to the writer, five extra reconstructions were added using the work of Honken (2010), which is, as its title suggests, an eminently useful and indispensable research aid.

(1) Khwe Reflex: PRENASALIZED (VOICED) (Honken *ŋXɡ; Elderkin *X̃ ₁)
cWKhoe *ǀ̃úró v rub, curry, make tired
pKhoe *ǀ̃àó n Pterocarpus angolensis
cWKhoe *ǀ̃ùβí vt winnow, separate
cWKhoe *ǀ̃ùù n leg
cWKhoe *ǂ̃áí v cut strips off
cWKhoe *ǂ̃àì vt hunt, pursue
cWKhoe *ǃ̃éé v pass, surpass
cWKhoe *ǃ̃áò vt hide, load
cWKhoe *ǃ̃òó v be quiet
cWKhoe *ǃ̃ùú n house, hut, nest
pKhoe *ǃ̃òé n porcupine
cWKhoe *ǃ̃ùrí vi tell on somebody, maliciously spoil something
cWKhoe *ǃ̃arì vt *HL move, vi *LL move
pKhoe *ǁ̃uì n *HL oil, fat; v *LL grease
cWKhoe *ǁ̃áò v be old
pKhoe *ǁ̃ùú v accompany

and, but problematic:

(2)
pKal *ǂ̃úi n *HH (pKal), *HL (preSNK) vagina
pKhoe *ǃ̃úù v? far [SNK also has the tone pattern 14]
pWKal *ǃ̃àβè preSNK *ǃ̃àì n giraffe

(3) Khwe Reflex: NASALIZED (VOICED) (Honken *ŋX; Elderkin *X̃ ₂)
pKhoe *ǂ̃um̀ v *LL (SNK) make charcoal, n *HL (SNK), *LL (Kal)
pKhoe *ǂ̃ṹṹ v sit
cWKhoe *ǃ̃ùmá n root, tree trunk
pKhoe *ǃ̃ònà num three
cWKhoe *ǁ̃amí v *HH coil up (Cape); n *LH pangolin (Kal)

and, but problematic:

(4)
cWKhoe *ǀ̃ùì v rub, wipe [only Khwe has nasalized vowel]
cWKhoe *ǂ̃áḿ v whistle [Cape *a, Kalahari *u, at V₁]
cWKhoe *ǂ̃ã́ã̀ vi become dry [Khwe oral vowel]
pKhoe *ǁ̃ã́ã̀ n horn [Khwe oral vowel]

3 Implications

It is clear from the above lists that the two forms are in complementary distribution. If there is no nasality in the rhyme of the morph (i.e. on V₁ and anything following) the velic passage becomes closed early, before the end of the click, to give a voiced plosive release to the anterior closure. If there is nasality in the rhyme of the morph, this does not happen, the velic passage is not closed, which probably means that the V₁ in morphs with the rhyme -VNV is non-distinctively nasalized. These two accompaniments are not the last relics of a former distinction, old and inherent to pKhoe, but a low-level rule, which applied in the eastern part of the Kalahari language area and which produced two allophones for *X̃ . Therefore, at the pKhoe level there is no need to reconstruct two starred forms; only one, *X̃ , is needed. This rule will be referred to as the prenasalization rule.

The phonologization of the two allophones may have been assisted by loans. Even so, once the allophony had been established, the nasalization in the rhyme of CVV forms becomes predictable from the consonant, and thus, without even any reinforcement from borrowings, there exists the setting for the phonologization of the distinction if the nasalization on the vowels is dispensed with. It is then possible for pKhoe *X̃ Ṽ Ṽ to be realized in, for example, Khwe, both as X̃ Ṽ Ṽ and X̃ VV, the former being the more conservative form.
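The complementary distribution described above can be restated as a small decision procedure. The Python sketch below is purely illustrative: the function names, the assumed set of nasal consonant symbols, and the use of the combining tilde as the sole marker of vowel nasalization are assumptions made for the example, not part of the reconstruction.

```python
import unicodedata

# Assumed inventory of nasal consonants that may occur in a rhyme.
NASAL_CONSONANTS = {"m", "n", "ŋ", "ɲ"}
COMBINING_TILDE = "\u0303"  # assumed marker of a nasalized vowel


def rhyme_has_nasality(rhyme: str) -> bool:
    """Return True if the rhyme (V1 and anything following) contains
    a nasalized vowel or a nasal consonant."""
    # Decompose first, so that tone marks and tildes become separate
    # code points and base letters such as m or n can be inspected.
    decomposed = unicodedata.normalize("NFD", rhyme)
    return any(
        ch == COMBINING_TILDE or ch in NASAL_CONSONANTS
        for ch in decomposed
    )


def nasal_click_allophone(influx: str, rhyme: str) -> str:
    """Predict the allophone of a starred nasalized click under the
    prenasalization rule, given the click influx and the rhyme."""
    if rhyme_has_nasality(rhyme):
        # Nasality in the rhyme: the velic passage stays open.
        return influx + COMBINING_TILDE
    # Oral rhyme: early velic closure gives a prenasalized voiced click
    # (superscript eng before the influx, voicing diacritic below it).
    return "\u1d51" + influx + "\u032c"
```

Under these assumptions, the rhyme of *ǃ̃ònà ‘three’ contains n, so the nasalized click is kept, whereas the oral rhyme of *ǃ̃ùú ‘house, hut, nest’ yields the prenasalized voiced click, in line with lists (3) and (1) respectively.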
Nasalized vowels became oral vowels in ‘horn’ and ‘become dry’, (see (4) above); to these should also be added ‘sit’, because, as well as Khwe ǂ̃ṹ, the form used to justify the reconstruction given, ǂ̃ú also exists, (Kilian-Hatz 2003: 190; derivatives of this stem on that same page also show both nasalized and oral vowels). Mathes and Chebanne (2014: 5) have an interesting comment on Tsua
‘horn’: ‘not full nasal(ization)’. This suggests that the process of denasalization is presently active in Tsua. However, this choice (of denasalizing the rhyme) seems not to have been available to reconstructions of the form *X̃ VNV.

The prenasalization rule applies equally to early loans. As well as those early loans which led Voßen to posit *X̃ N, there are others which have the aberrant tonal patterns but which do not lose the nasality in C₁ in the easterly varieties. Two examples are given next.

(5)
RV97 pKhoe *ǁ̃uβu buttern / (schütteln)
Khwe ᵑǁ̬úβu̍ v shake (milk), make (butter)

(6)
RV97 pKhoe *ǀ̃am lieben
Khwe ǀ̃ám̀ v 1 like, love; prefer 2 have sexual intercourse with

4 Developments in the Khoe Languages

Voßen’s data provides a foundation for comments on the implementation of the rule which results in prenasalized clicks. In only five of the 18 varieties on which he bases his reconstructions has the rule not operated at all: Nama and ǃOra (i.e. all of Cape), Naro, Gǀui and ǁAni. These are all in the west of the Kalahari speaking area. (ǁAni has odd reflexes in the C₁ of ‘house’, and of ‘diaphragm’; these are discussed later.) In eight varieties, the rule has operated in all the data which Voßen presents. These are Buga, ǀ̬Anda and six of the seven varieties of Eastern Kalahari; the exception is ǀXaise.

The application of the prenasalization rule is spreading from the east: between those languages in which it has not operated and those where it has completely operated, there are several varieties in which, for starred forms without nasality, forms with a prenasalized click and forms with a nasalized click occur seemingly at random. The prenasalization rule is here clearly in progress. These are Deti, Kxoe (= Khwe), ǁ̬Ana, ǂHaba and ǀXaise. The situation in Khwe, where Voßen’s data has revealed a rule in progress, will be discussed in the next section based on the entries in Kilian-Hatz (2003), which augment Voßen’s data base.

Among those languages which have completely applied the prenasalization rule, one has gone further. Mathes (2015) shows that in Tsua, prenasalized clicks lose their prenasalization and become voiced clicks. They are indistinguishable from those voiced clicks which derive from starred voiced clicks except in
one way. Voiced clicks from starred voiced clicks are depressors; voiced clicks from starred nasalized clicks are not depressors.

There is clearly a connection between prenasalization and prevoicing. If voiced prenasalization loses the open velic passage, it becomes prevoicing. Prevoicing is disfavored in articulatory terms: air has to escape through the glottis in order to produce voicing, but the duration of the sound is severely limited because of the small volume of the space between the glottis and the posterior closure for a click when the velic passage is closed. Possibly for this reason, prevoicing is understandably more practicable if a voiceless click follows; this is not yet known from Khoe.

At this point, it is relevant to give examples of items with a nasalized click being related to others with a voiced click, even if the method by which the two are related is not always clear. Firstly, there are instances of the alternation of nasalized and voiced click, either within Khoe, (7) or between the source and the borrowed item (8). For both these two examples, Juu can be cited as a possible source.

(7)
preSNK *ǃ̃ùí nmf snare vt 1 set (trap); lay (snare) 2 trap
pKal *ǃ̬ùí Sansevieria spp, rope, snare
Juu ǃᵏʰúí n Sansevieria aethiopica

(8)
Naro ǁ̃àí n nightjar
Gǀui ǁ̃àɪ̍ n Rufous-cheeked nightjar, Caprimulgus rufigena
Juu ǁ̬àn ̤ ì n European nightjar, Caprimulgus europaeus (breathiness from Dickens)

Secondly, two of Voßen’s reconstructions with nasalized clicks also have forms with voiced clicks. Considering all the entries in Voßen 1997, and the modern dictionaries, there seem to be several potentially related stems, of varying spread, but all in the ‘cheek’, ‘jaw’, ‘chin’ semantic area. Neither tonally nor segmentally is it easy to see the mechanics of any relatedness.

(9)
RV97 pKhoe *ǃ̬an(i) Kinn (chin)
RV97 pKhoe *ǃ̃ani Kinnlade/Unterkiefer (lower jaw)
cf.
preSNK *ǃ̬àǹ chin
pWKal *ǃ̬àní chin
cSWFringe *ǃ̃ã́ã̀ / *ǃ̃ánì lower jaw / jaw
cNEKal *ǃ̃àrì jaw bone / lower jaw

For one other item, Voßen reconstructs a starred nasal, but the modern dictionaries attest to a starred voiced click. Juu concurs; a ǃXóõ form which shows a nasalized click and pharyngealization is added, but the segmentals of the rhyme are not close to the other examples.

(10)
RV97 pKal *ǁ̃abe ~ ᵑǁ̬abe Zwerchfell
cf.
pWKal *ǁ̬áβè n thorax, (waist and above), diaphragm
Juu *ǁ̬a̍βe̍ n diaphragm
ǃXóõ * n diaphragm

Voßen (1997: 511) does give a voiced click for the C₁ in Khwe, and a voiceless click for that of Tsua. The semantic range of the glosses for this item is wide: ‘chest’, ‘diaphragm’, ‘waist’.

It is well known that reconstructed forms can be spurious and not relatable to any proto grouping. From SNK, Naro, Gǀui and Khwe, it was possible to reconstruct cWKhoe *ǃ̃ùú n ‘house, hut, nest’. As there was no data from Eastern Kalahari, there was hesitation in attributing the form to pKhoe. With the help of Voßen’s data, this caution is seen to be justified. ǁAni, which has no other indication of starting to apply the prenasalization rule, nevertheless has an initial ᵑɡ. The two most extreme Eastern Kalahari languages are also odd: Kua has ɡ (ᵑɡ would be expected) and Tsua has ʣ (ɡ would be expected). Kua borrowed the form with ɡ from a (now not known) variety, which, as Tsua today, had eliminated the prenasalization from ᵑɡ. It was then borrowed, as ɡ, into Tsua at a time when it was able to fall in with the reflexes of *ɡ; with them, it fronted to ʣ. With them, it should have undergone the usual depression of tone with the reflexes of starred voiced units. At one point (Mathes and Chebanne 2014) it is recorded with a depressed tonal pattern, but in Mathes (2015) with a nondepressed pattern. At the time of borrowing, it was uncertain whether it should submit to the depression rule or not. This item was a loan which underwent dialect hopping (horizontal transmission) and was not vertically transmitted from any proto language. (To make matters more interesting, the form is recorded in Mathes and Chebanne (2014) with an initial ʤ, (?ʥ); in Mathes (2015) it has an initial ʣ (as in Voßen).)

We will now turn to the situation in Khwe.

5 Regular Developments in Khwe

Voßen (1997) gives just over 50 sets of data leading him to reconstructions with a nasalized click. From an inspection of these it can be seen that two rules are
involved: firstly, the instantiation of an allophonic variant, a prenasalized plosive, as a separate phonematic unit; secondly, the loss of the click influxes ǃ and ǂ. These rules interact and operate at different speeds in the various languages. Data in (11) and (12) assume that both have operated completely, and give the end point of the development from pKhoe.

The development of the nasalized clicks in morphs which have nasality in the rhyme is straightforward. There is probably no phonetic change where the click is retained.

(11)
pKhoe    *ǀ̃    *ǂ̃    *ǃ̃    *ǁ̃
reflex   ǀ̃     ɲ     ŋ     ǁ̃

However, where there is no nasality in the rhyme of the starred form, a prenasalized unit results.

(12)
pKhoe    *ǀ̃    *ǂ̃    *ǃ̃    *ǁ̃
reflex   ᵑǀ̬    ᶮʥ    ⁿɡ    ᵑǁ̬

Kilian-Hatz (2003: 7) states that ⁿɡ is the realization in initial position, not ᵑɡ. Khwe does not lose ǂ, so the two reflexes in that language are

(13)
pKhoe             *ǀ̃    *ǂ̃    *ǃ̃    *ǁ̃
before nasality   ǀ̃     ǂ̃     ŋ     ǁ̃
before orality    ᵑǀ̬    ᵑǂ̬    ᵑɡ    ᵑǁ̬

For the palatal click, *ǂ̃, Kilian-Hatz has an interesting doublet in Khwe, which shows both a form with a click, and a form in which click loss has taken place.

(14)
Khwe ᵑǂ̬á(-ò) n 1 place, center 2 moment, time
Khwe ᶮʥa̍ n place, direction

The tonal difference is possibly related to the use of the latter in compounds and then its reestablishment as an independent noun. Alternatively, or additionally, Khwe may have been influenced by a language which lost ǂ. But not enough data is available to reconstruct this etymon.

The following section explores the possible reasons for occasions where there are problems in the form of nasalized clicks in modern Khwe as presented in Kilian-Hatz (2003).
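The Khwe correspondences in (13) amount to a lookup keyed by the starred click influx and by whether the rhyme carries nasality. A minimal Python sketch follows; the names `KHWE_REFLEXES` and `khwe_reflex` are invented for illustration, and the symbols are simply copied from (13).

```python
# Khwe reflexes of the pKhoe nasalized clicks, copied from (13).
# Key: click influx of the starred form.
# Value: (reflex before nasality, reflex before orality).
KHWE_REFLEXES = {
    "ǀ": ("ǀ̃", "ᵑǀ̬"),
    "ǂ": ("ǂ̃", "ᵑǂ̬"),  # Khwe does not lose ǂ
    "ǃ": ("ŋ", "ᵑɡ"),    # the alveolar influx is lost
    "ǁ": ("ǁ̃", "ᵑǁ̬"),
}


def khwe_reflex(influx: str, nasal_rhyme: bool) -> str:
    """Return the expected Khwe reflex of a starred nasalized click."""
    before_nasality, before_orality = KHWE_REFLEXES[influx]
    return before_nasality if nasal_rhyme else before_orality
```

For instance, `khwe_reflex("ǃ", True)` returns the plain velar nasal, while `khwe_reflex("ǂ", False)` returns the prenasalized voiced palatal click, which is the click-retaining member of the doublet in (14).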

6 Aberrant Nasalized Clicks in Khwe

Khwe is one of the languages listed in Section 4 which have not completely implemented the prenasalization rule. About 14 entries in Kilian-Hatz’s dictionary do not conform to the pattern. Only lexical items are included in this discussion, not grammatical items such as demonstratives or bound parts of compounds: members of closed systems (grammatical items) are not to be compared with members of open sets (lexical items); like is compared with like.

6.1 Unexpected Prenasalized Voiced Clicks

These have nasality in the rhyme, which is not expected with prenasalized voiced clicks. Some of these inappropriate prenasalized voiced clicks relate to cognate voiced clicks. For one, Voßen has offered a pKal reconstruction, but the pharyngeality in Gǀui clearly indicates a borrowing.

(15)
pKal *ǁ̃om sich den Mund vollstopfen (Voßen (1997): 504.1.0)
Gǀui v fill one’s mouth with food
Khwe ᵑǁ̬óḿ n full mouth

Others do not show pharyngeality.

(16)
pSWKal *ǁ̬àḿ n thorn
Khwe ᵑǁ̬àḿ n 1 thorn 2 nail 3 Acanthospermum hispidum

(17)
pSWKal *ǂ̬ã́ã̀ v trust, believe
Khwe ᵑǂ̬ã̍ã́ vt [c&c < dep*HL] 1 hope 2 trust

The usual reflex of pKhoe X̬ is a voiced click, never a prenasalized voiced click. But the Khwe tones in both (16) and (17) are depressed, which is not expected with a prenasalized click at C₁, but would be expected with a voiced click in that position. A prenasalized click appears in Khwe instead of the expected voiced click.

One inappropriate prenasalized click has a voiceless alternant within Khwe.

(18)
Khwe ᵑǀ̬õ̀ã́ and ǀ̥õ̀ã́ n boil, abscess (no cognate yet traced)

And finally, two entries in Kilian-Hatz (2003) have the same form, ᵑǀ̬iń í. Comparing other Khoe forms, (Elderkin 2017), ‘fly’ is clearly a compound, and so n is therefore probably part of the second element. The form is, as yet,
without an etymology. It is possible that ‘ebony’ is the same item, for semantic reasons that are at present still opaque.

(19)
Khwe ᵑǀ̬iń í n fly
Khwe ᵑǀ̬iń í n ebony, Diospyros mespiliformis

6.2 Both Nasalized Clicks Recorded

As a bridge to commentary on the nasalized clicks without following nasality in the morph, three etymons recorded with both nasalized and prenasalized click must be mentioned. The sand camwood tree has three forms, one written with an initial dental nasalized click, but alphabetized in Kilian-Hatz (2003) as if it has a nasalized palatal initial click.

(20)
Khwe ǀ̃oɛ́rɛ́ n Baphia massaiensis [plant sp] (?ǂ̃oɛ́rɛ́ as from *ǂ̃óré)

It may well represent the same item as

(21)
Khwe ǂ̥órè or ᵑǂ̬órè n Baphia massaiensis (-órè as from *-úrè)

It is difficult to comment on these forms; no cognate has yet been traced. The variety of forms indicates a loan. Similarly difficult is the following correspondence set.

(22)
pKhoe *ǂ̃òé v dispute, deny
Khwe ᵑǂ̬óɛ́ v 1 contest, dispute, deny 2 not know, ignore
Khwe ǂ̃óɛ́ v challenge

The reconstructed form ‘should’ give the prenasalized click in Khwe, but retention or borrowing with ǂ̃ is present to give a doublet. (Both examples in Kilian-Hatz (2003) have tones in their conjugation pattern which would relate to *HH, but the classification of the first in Köhler (1989) clearly indicates that it derives from *LH. The non-depressed realizations of *LH and *HH in Khwe are identical and this has been the trigger for the reallocation of the items to the tonal conjugation pattern proper to *HH.)

A final etymon where both nasalized and prenasalized forms are probably found in Khwe is ‘cook’. The majority of forms give *ǀ̃óá.

(23)
pKal *ǀ̃óá to cook (not given in Visser (2001) for Naro)
Khwe ᵑǀ̬óá to cook

However, emending the dictionary ‘cock’ to ‘cook’ gives another series

(24)
pWKal *ǀ̃ṍã́ to cook
Ts’aokhoe ǀ̃ṍã́ to cook (from Visser (2001: 48); Ts’aokhoe is a Naro dialect)
Khwe ǀ̃óá vt [no conj info] to cock [sic]

How can this doublet be explained? It is interesting to note that Voßen (1997) records both

(25)
pKal *ᵑǀ̬oa (at p458)
pKal *ǀ̃oa (in endlist at p524, where *ǀ̃óá is also quoted, in parentheses)

which confirms the variability. But the source may be

(26)
Juu ǀ̃óá vt kook, opsit (kos –); cook, boil

Assuming that Juu is the source, the nasalization can then be explained as an attempt to make an illegitimate form (with X̃ but without nasality following) conform to the norm at a stage of development when the allophonic nature of X̃ and ᵑX̬ was still productive.

6.3 Nasalized Clicks Followed by Oral Vowels

Some have been already mentioned: ‘horn’, ‘become dry’ and ‘sit’, (see Section 3 above); there are other instances of nasalized clicks followed by oral vowels. One item which shows loss of nasality in Khwe does have oral cognates outside Khoe, but was not brought into the earlier discussion as it is absent from the Cape branch of Kalahari.

(27)
pWKal *ǃ̃ṹṹ geographical region
Khwe ŋú 1 country, land 2 area, region, place, settlement 3 world

This etymon is present in ǃXóõ (Traill 1994: 96) where it appears both with an oral and a nasalized vowel, but the ultimate stem is oral. However, the oral vowel in Khwe is an acceptable reflex of a nasalized vowel in X̃ Ṽ Ṽ. Similarly

(28)
Khwe ǂ̃áxò vt lay down, put (sthg) on sthg, put down

may show the same etymon as Voßen’s
(29)
pKhoe *ǂ̃a gießen (Voßen 1997: 443.2.0)

However, SNK, Naro and Gǀui show a nasalized vowel and *HH; Naro not only shows pharyngealization, a marker of foreignness, but shows that the verb is well integrated into the tonal system by having an intransitive form in *LL. Ignoring the pharyngeality in Naro, a starred form can be posited.

(30)
cWKhoe *ǂ̃ã́ã́ v pour
Naro v pour, throw something liquid vt water
Naro vi pour out [Intransitive *LL]

This may be an early borrowing into Khoe, adopted without the pharyngealization; Naro later introduced pharyngealization, as a hypercorrection, from their contact with Juu. The source seems to be reflected in Juu where the two following items are found.

(31)
Juu *ǂ̃á vt uitgooi (’n vaste stof of vloeistof met ’n swaaibeweging) / throw (liquid) away
Juu  v uitgooi (’n vaste stof of vloeistof met ’n swaaibeweging) / throw (liquid) out of something, empty something of a liquid

Clearly we need to know more about the historical morphophonology of Juu! Nasality (implied by the initial ǂ̃ in Khwe) and pharyngeality (implied in Gǀui by the uvularity, and implying loan status,) again co-occur in the next set.

(32)
Khwe ǂ̃úrú I vi [conjugation patterns indicate

Map 9.2 Geographical distribution of patterns of alveolar click loss in Khoe
[Map not reproduced. Surviving legend fragments: “k, !>ky)”; “Loss of ! (!>k)”. Map labels: Namibia, Nama/Damara, Caprivi, Khwe, ǁAni, Ganadi, Buga, Tyire, Danisi, Tjwao, Nata Shua, Deti, Kua (cattle posts), Malatswae Kua, Naro, Southern Kua, Botswana, Gǀui, Ts’ixa, Zimbabwe, Pandamatenga Shua, Gǁana, Lephephe Kua.]

Fehn

Table 9.4 Alveolar click retention in ǁAni and Gǁana

                                                     ǁAni           Gǁana
click retention                                      22 (~53,7%)    6 (~19,4%)
click loss                                           19 (~46,3%)    25 (~80,6%)
# of lexical items reconstructed with an
alveolar click for PWKalK                            41 (~100%)     31 (~100%)

Source: All data from Vossen (1997)

Table 9.5 Variation in alveolar click loss among five speakers of ǁAni

Gloss               ǁAni #1          ǁAni #2          ǁAni #3          ǁAni #4          ǁAni #5          Proto-Form
                    (Fehn et al.)    (Fehn et al.)    (Fehn et al.)    (Fehn et al.)    (Vossen 1997)    (Vossen 1997)

ǃ ‘plain’
‘go’                kṹũ̀              kṹũ̀              kṹũ̀              kṹũ̀              kṹũ̀              *ǃũ (PK)
‘long’              ǃáò              ǃáò              ǃáò              ǃáò              ǃáò              *ǃao (PKalK)
‘heavy’             ǃóḿ              ǃóḿ              ǃóḿ              ǃóŋ́              ǃóḿ              *ǃóḿ (PK)
‘tie’               káé              káé              káé              káé              káé              *ǃae (PK)

ɡǃ ‘voiced’
‘aardvark’          ɡóò              ɡóò              ɡóò              –                ɡǃóò             *ɡǃo (PKalK)
‘blow’              ɡóm̀              ɡóm̀              ɡóm̀              ɡóm̀              ɡóm̀              *ɡǃom (PK)
‘throw’             ɡám̀              ɡám̀              ɡám̀              ɡám̀              ɡǃám̀             *ɡǃam (PKalK)

ǃh ‘aspirated’
‘cut’               ǃhòḿ             ǃhōm̄             ǃhōm̄             ǃhòḿ             ǃhòḿ             *ǃhòḿ (PKalK)
‘push’              khyìrí           khyìrí           khyèrí           –                –                –
‘stab’              ǃhèí             ǃhēī             ǃhāī             –                ǃhàé             *ǃhae (PK)

ŋǃ ‘nasal’ (and ŋɡǃ ‘prenasalized voiced’)
‘country’           ŋṹṹ              ŋṹṹ              ŋṹṹ              ŋṹṹ              ŋúú              *ŋǃu (PK)
‘far’               ŋǃúù             ŋǃúù             ŋǃúù             ŋɡǃúù            –                *ŋǃu (PK)
‘foot’              ŋǃāréè           ŋǃāréè           ŋǃāréè           kēréè            ŋǃàdé            *ŋǃadi (PK)
‘house’             ŋɡúú             ŋɡúú             ŋɡúú             ŋɡúú             ŋɡúú             *ŋǃu (PK)
‘three’             ŋǃōànà           ŋǃōànà           ŋǃōànà           ŋǃōànà           ŋǃóánà           *ǃ̃ona (PK)

ǃx ‘affricated’
‘hold’              ǃxóó             ǃxóó             ǃxóó             ǃxóó             ǃxóó             *ǃxo (PK)
‘sew’               xáń              xáń              xáń              xáń              xáń              *ǃxan(i) (PK)

ǃʔ ‘glottalized’
‘face, forehead’    –                ǃʔã́ã́             ǃʔã́ã́             –                ǃʔá(= ʔō)        *ǃʔã (PK)
‘fear’              ǃʔáò             –                ǃʔáò             –                ǃʔáò             *ǃʔao (PK)
‘know’              ʔã́ã́              ʔã̄ã̄              ʔã́ã́              ʔã́ã́              ʔã̀ã́              *ǃʔã (PK)

ǃx’ ‘affricated ejective’
‘neck’              ǃx’áó            kx’áó            kx’áó            ǃx’áó            kx’áó            *ǃx’ao (PK)

The picture presented in Table 9.5 is rather uniform. Variation between idiolects is only observed with four lexemes (‘aardvark’, ‘foot’, ‘neck’, ‘throw’), while all others display stable onsets. The form kēréè for ‘foot’ in Vossen’s (1997) dataset #4, which displays both click loss and denasalization, is most likely a borrowing from another Khwe dialect, presumably Buga. The same may be said for the form khyèrí~khyìrí ‘to push’ (in Vossen’s (1997) datasets #1–3), which displays palatalization otherwise not found in ǁAni (cf. also §3.3.3 below). Voiced clicks were only found in Vossen’s (1997) dataset #5; whether their absence with other speakers is due to geography or to a diachronic development cannot be answered satisfactorily at this point.
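The count of variable lexemes can be checked mechanically. The following is a minimal sketch, not the author's procedure: it assumes that "variation" means that at least one attested form of a lexeme contains a click letter while at least one other does not, and it uses only a subset of the forms from Table 9.5. The names `has_click`, `varies_in_click_use` and `ANI_FORMS` are hypothetical.

```python
# Click letters used in the transcriptions: ǀ (dental), ǁ (lateral),
# ǂ (palatal), ǃ (alveolar). Written as escapes to avoid look-alike characters.
CLICK_LETTERS = {"\u01c0", "\u01c1", "\u01c2", "\u01c3"}


def has_click(form: str) -> bool:
    """Return True if the transcribed form contains a click letter."""
    return any(ch in CLICK_LETTERS for ch in form)


def varies_in_click_use(forms: list[str]) -> bool:
    """True if some attested forms have a click and others do not."""
    return len({has_click(f) for f in forms}) == 2


# Forms attested across the ǁAni datasets (a subset of Table 9.5).
ANI_FORMS = {
    "go": ["kṹũ̀"] * 5,
    "long": ["ǃáò"] * 5,
    "aardvark": ["ɡóò", "ɡóò", "ɡóò", "ɡǃóò"],
    "throw": ["ɡám̀", "ɡám̀", "ɡám̀", "ɡám̀", "ɡǃám̀"],
    "foot": ["ŋǃāréè", "ŋǃāréè", "ŋǃāréè", "kēréè", "ŋǃàdé"],
    "hold": ["ǃxóó"] * 5,
    "neck": ["ǃx’áó", "kx’áó", "kx’áó", "ǃx’áó", "kx’áó"],
}

variable = sorted(g for g, forms in ANI_FORMS.items() if varies_in_click_use(forms))
print(variable)
```

Under this assumption the sketch singles out the same four lexemes named in the text. Differences in tone or accompaniment (e.g. ŋǃúù vs. ŋɡǃúù) are deliberately ignored, since only the presence of the click is compared.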


The comparative data available for ǁAni do not allow for a conclusive assessment of the role the phonological environment may play in sound change processes involving the alveolar click. There were no examples involving alveolar clicks with voiced or voiceless uvular accompaniments. According to Nakagawa (quoted as p.c. in Traill and Vossen (1997)), alveolar clicks in Gǁana were most stable with uvular effluxes.

3.3.2 Replacements for the Alveolar Click

When the alveolar influx series is lost, the click influx is replaced by a non-click consonant. Reflexes of the alveolar influx series in Khwe and Eastern Kalahari Khoe are velar stops, fricatives or nasals; the glottal stop ʔ replaces the glottalized click ǃʔ. ǃ and k share the acoustic feature [+grave], which has already been identified as one of the defining characteristics of the alveolar click (Traill and Vossen 1997: 31). The grave spectrum results from the resonance of the sublingual cavity, which produces a relatively low-frequency click burst. All replacements for the alveolar click found in Traill’s (1986a) and my own data from Kalahari Khoe are provided in Table 9.6.

Table 9.6 Non-click replacements of the alveolar click in Kalahari Khoe

[+click] | [-click] | examples
ǃ | k | (1)–(2)
ɡǃ | ɡ | (3)–(4)
ǃh | kh | (5)–(6)
ŋǃ | ŋ | (7)–(10)
ŋɡǃ | ŋɡ | (9)–(10)
ǃx | kx~x | (11)–(12)
ǃʔ | ʔ | (13)
ǃx’ | kx’~k’ | (14)

Note: Compare also Traill 1986a: 308
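The correspondences in Table 9.6 can be read as a lookup from click onset to non-click replacement. The sketch below is only an illustration under simplifying assumptions: it operates on toneless reconstructed forms, it ignores palatalization (§3.3.3) and dialect-specific conditioning, and the names `ALVEOLAR_REPLACEMENTS` and `declick` are hypothetical.

```python
# Table 9.6 as a lookup: alveolar click onset -> attested non-click replacements.
ALVEOLAR_REPLACEMENTS = {
    "ŋɡǃ": ["ŋɡ"],
    "ǃx’": ["kx’", "k’"],
    "ɡǃ": ["ɡ"],
    "ǃh": ["kh"],
    "ŋǃ": ["ŋ"],
    "ǃx": ["kx", "x"],
    "ǃʔ": ["ʔ"],
    "ǃ": ["k"],
}


def declick(form: str) -> list[str]:
    """Return candidate clickless forms for a reconstructed form such as '*ǃae'."""
    stem = form.lstrip("*")
    # Try longer onsets first so that ǃx’ is not mistaken for ǃx or plain ǃ.
    for onset in sorted(ALVEOLAR_REPLACEMENTS, key=len, reverse=True):
        if stem.startswith(onset):
            rest = stem[len(onset):]
            return [repl + rest for repl in ALVEOLAR_REPLACEMENTS[onset]]
    return [stem]  # no alveolar click onset: form is left unchanged


print(declick("*ǃae"))    # candidate for 'tie'
print(declick("*ǃx’ao"))  # candidates for 'neck'
```

Where Table 9.6 lists alternatives (kx~x, kx’~k’), the function returns every candidate rather than choosing one, because the choice depends on the variety.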

Examples of all non-click replacements for the alveolar click listed in Table 9.6 and discussed in the following sections are provided in (1)–(14). Palatalized replacements (cf. §3.3.3 for a discussion) are given after the regular replacements and marked "palatalized".

(1) *ǃũũ ‘go’ (Vossen 1997: 441)
West: Gǁana: kṹũ̀ (Gǁ, Vossen 1997: 441)
West: Khwe: kṹũ̀ (ǁA; Bu; Ca, Kilian-Hatz 2003: 374)
East: Shua: kṹũ̀ (Ts, Da, Na, Pa, De)
East: Tshwa: kũũ (LK, K, NTy, STy, Tsh)

(2) *ǃae ‘tie’ (Vossen 1997: 423)
West: Khwe: káé (ǁA); palatalized: kyáé (Bu; Ca, Kilian-Hatz 2003: 76)
East: Shua: káé~kɛ́ɛ́ (Ts), káé (Da, Na, Pa, De)
East: Tshwa: kāē (LK), káē (STy); palatalized: kyāē (K, MLK)

(3) *ɡǃoo ‘aardvark’ (Vossen 1997: 431)
West: Gǁana: ɡóò (Gǁ, Vossen 1997: 432)
West: Khwe: ɡóò (ǁA, Bu), ɡōó (Ca, Kilian-Hatz 2003: 232)
East: Shua: ɡóò (Da), ɡóò (Pa, De)
East: Tshwa: ɡǒò (LK, MLK), ɡǒòh~kǒò (K)

(4) *ɡǃam ‘throw’ (Vossen 1997: 507)
West: Khwe: ɡám̀ (ǁA); palatalized: ɡyāḿ (Bu; Ca, Kilian-Hatz 2003: 364)
East: Shua: ɡám̀ (Ts), ɡám̀ (Da)
East: Tshwa: ɡām~kham (LK), ɡam (NTy); palatalized: ɡyam (MLK)

(5) *ǃhae ‘pierce, stab’ (Vossen 1997: 494)
West: Khwe: ǃhèí (ǁA); palatalized: khyèí (Bu), khyàé (Ca, Kilian-Hatz 2003: 354)
East: Shua: khāé (Ts), kháé (Da), khae (De)
East: Tshwa: khāé (LK, STy), khae (K, SK)

(6) *ǃhadi ‘push’ (no reconstruction available)
West: Khwe: palatalized: khyèrí~khyèdí (ǁA; Bu; Ca, Kilian-Hatz 2003: 331)
East: Shua: khādí (Ts)

(7) *ŋǃuu ‘land, country’ (Vossen 1997: 461)
West: Gǁana: ŋúú (Gǁ, Vossen 1997: 461)
West: Khwe: ŋṹṹ~ŋúú (ǁA, Bu), ŋúú (Ca, Kilian-Hatz 2003: 257)
East: Shua: ŋṹṹ (Ts), ŋṹṹ~ŋúú (Da), ŋúú (Pa, Na, De)
East: Tshwa: ŋóò~nóò (LK), ŋûú (SK, K), nuu (NTy), ŋũũ~ŋuu~nũũ (Tsh)

(8) ŋǃam ‘begin’ (origin unclear, Khwe group only)
West: Khwe: ŋǃám (ǁA, Heine 1999: 118); palatalized: ɲáḿ (Ca, Kilian-Hatz 2003: 238)

(9) *ŋǃuu ‘house’ (Vossen 1997: 447)
West: Gǁana: ŋúù (Gǁ, Vossen 1997: 447)
West: Khwe: ŋɡúú (ǁA; Bu; Ca, Kilian-Hatz 2003: 94)
East: Shua: ŋɡúú (Ts)
East: Tshwa: ŋúū~ɡúū (LK), ŋúū~ɡúū~ɡúú (K); palatalized: djúú (SK), dzúū (MLK), nʤuu (NTy, Tsh)

(10) *ŋǃado ‘chameleon’ (Vossen 1997: 427)
West: Gǁana: ŋádù (Gǁ, Vossen 1997: 427)
West: Khwe: ŋɡárò (ǁA); palatalized: ŋɡyárò (Bu; Ca, Kilian-Hatz 2003: 250)
East: Shua: ŋɡárò (Ts, Da), ŋɡárò (Pa), ŋárò (De)
East: Tshwa: ɡádô (LK), ɡaro (SK), ŋɡárò (Ga, STy); palatalized: ɡyárô~ɡaro (MLK), ɡyàdo (K), ŋɡaro (Tsh)

(11) *ǃxan(i) ‘sew’ (Vossen 1997: 470)
West: Gǁana: xáǹ (Gǁ, Vossen 1997: 470)
West: Khwe: xáń (ǁA), xã�ĩ�́ (Bu), xàń (Ca, Kilian-Hatz 2003: 344)
East: Shua: xáń (Ts, Da, Na, Pa, De)
East: Tshwa: kxān̄ (LK, SK), xan (MLK)

(12) *ǃxoo ‘hold’ (Vossen 1997: 446)
West: Gǁana: xóó (Gǁ, Vossen 1997: 447)
West: Khwe: ǃxóó (ǁA), xóó (Bu), xóó (Ca, Kilian-Hatz 2003: 144)
East: Shua: xóó (Ts, Da, Na, Pa, De)
East: Tshwa: kxóō (LK), kxóǒ (SK), xúō~xôó (MLK), xuo (K), xoo (Tsh)

(13) *ǃʔãã ‘know’ (Vossen 1997: 507)
West: Gǁana: ʔã�ã� (Gǁ, Vossen 1997: 508)
West: Khwe: ʔã�ã� (Bu; Ca, Kilian-Hatz 2003: 18)
East: Shua: ʔã�ã� (Ts, Da, Na, Pa, De)
East: Tshwa: ʔáńā (LK, MLK), ʔánâ (SK), ʔã�ã�~ʔānà (K), ʔáã (STy), ʔãã (NTy, Tsh)

(14) *ǃx’ao ‘neck’ (Vossen 1997: 446)
West: Gǁana: kx’áó (Gǁ, Vossen 1997: 446)
West: Khwe: ǃx’áó~kx’áó (ǁA)
East: Shua: k’áó (Da, Na, Pa), k’àó (De)
East: Tshwa: ǃq’ɣao (LK, SK, MLK), kx’ao (NTy), k’áō (STy), k’ao (Tsh)

The replacements for the alveolar click are remarkably uniform across languages. The contrast between ŋ~ɲ and ŋɡ (>ŋɡy) only exists in languages that contrast two nasal clicks ŋǃ (nasal) and ŋǃɡ (prenasalized-voiced),4 i.e. the Khwe group, the Shua cluster except the western variety of Danisi, and the Tjwao and Tyire varieties of Tshwa. Some Kua varieties further display a loss of nasalization and only retain the feature [+voiced] (cf. (9) and (10) above). In the case of the affricated alveolar click ǃx, the click is usually elided completely and only the accompaniment is retained. Two of Traill’s “Kua” dialects display both the velar stop and the fricative, although this variation does not appear to be triggered by a particular phonetic environment (cf. (11) and (12) above). Replacement of the glottalized click ǃʔ yields ʔ, with no remaining trace of the click (cf. (13) above). As Traill and Vossen (1997: 32) remark, the expected replacement would in fact be a glottalized velar stop, i.e. an ejective k’. They hypothesize that this replacement might have been blocked to avoid neutralization, as the affricated alveolar ejective click ǃx’ regularly yields an affricated ejective kx’ as replacement, which in Shua is regularly deaffricated to k’ (cf. (14) above).5 Nevertheless, Traill and Vossen (1997) admit that the replacement of alveolar clicks with velar stops in itself leads to a widespread neutralization of contrasts in the lexicon, which makes this explanation somewhat unsatisfying.

4  It seems likely that both clicks go back to a proto-form *ŋǃ which in some dialects became ŋǃɡ before oral rhymes (cf. Elderkin, this volume). When the nasality of some rhymes following ŋǃ was lost over time, the former allophones became contrastive.

5  Vossen (1997) reconstructs a non-affricated ejective *k’ for Proto-Eastern Kalahari Khoe, along with complete neutralization of the glottalized vs. ejective contrast in the click inventory. My own research (Fehn 2018) suggests that affricated ejective clicks are merely deaffricated in some Eastern Kalahari Khoe languages, leading to a sound shift (here exemplified by the dental click) ǀx’ > ǀ’ (or ǀk’~ǀq’), rather than ǀx’ > ǀʔ. Among the languages surveyed, only Nata Shua may have completely lost the glottalized vs. ejective contrast. As reflexes of Proto-Khoe *ǃx’ and *ǂx’ in Eastern Kalahari Khoe (including Nata Shua) are kx’~k’, and c’, respectively, it must be assumed that the click loss predates neutralization of the glottalized vs. ejective contrast. During our survey, deaffrication of ejectives and ejective clicks was not only observed in Eastern Kalahari Khoe, but also in the Buga dialect of Khwe, and in Ts’ixa. In Buga and Ts’ixa, affrication is clearly a matter of idiolectal variation, i.e. it is found with some speakers, but not with others. Recent research on the Tshwa variety Tjwao of Zimbabwe (Fehn and Phiri, field notes) also yielded variation, with some speakers producing fully affricated ejectives and ejective clicks. The use of this sound shift for subgrouping may therefore be put into question.


3.3.3 Palatalization of Alveolar Click Replacements in Khwe and Kua

In Khwe (Traill and Vossen 1997: 29) and at least two Kua varieties, k, ɡ, ŋɡ and kh are palatalized before the back vowel /a/ and the mid-high vowel /e/~/ɛ/. Palatalization may affect the following replacements: k (>ky), ɡ (>ɡy), kh (>khy), ŋ (>ɲ) and ŋɡ (>ŋɡy). Only ky and ɡy are found in Kua; all other instances of palatalization are restricted to the Buga and Caprivi dialects of Khwe. In the ǁAni dialect, only palatalization of aspirated velar stops (kh>khy) is occasionally observed; more data would be needed to determine whether the palatalized forms constitute borrowings from the neighboring Buga or Caprivi dialects.

In the Khwe group, the palatalization of velar stops appears to affect the lexicon as a whole (Vossen 1997: 95–96, 102–103). Thus, one may wonder whether the phenomenon is directly related to the process of click replacement, or whether we are dealing with a case of seemingly unmotivated palatalization, as has been observed elsewhere in the Kalahari Khoe group (e.g. palatalization of alveolar stops in Gǀui, cf. Nakagawa 2006). A comparison of palatograms of palatalized alveolar click replacements from the Buga and ǁAni dialects of Khwe made by Traill (1983, 1984) indicates that palatalization may after all be directly related to the gradual process of alveolar click replacement, suggesting a sound shift ǃ>ky, rather than ǃ>k>ky (Traill 1983: 4). The sound produced by the Buga speaker involves an anterior (post-dental) and a posterior (velar) closure, rendering the resulting palatogram strikingly similar to that of an alveolar click. The corresponding sound produced by the ǁAni speaker, however, matched a palatalized stop ky as found in other languages displaying that segment, e.g. Russian. The auditory proximity of the corresponding sound in ǁAni, despite the phonetic disparity, might indeed hint at borrowing of palatalized consonants from Buga or Caprivi, rather than an independent development.

3.4 Loss of the Palatal Influx

Eastern Kalahari Khoe languages are affected not only by loss of the alveolar click, but also by loss of the palatal click. To date, no language classified as a member of the Western Kalahari Khoe subgroup has been documented as being affected by palatal click loss. In contrast, all varieties of the Shua and Tshwa clusters replace at least a subset of their palatal clicks with palatal stops. Similar to what has been observed for the alveolar click in ǁAni and Gǁana, Ts’ixa and Kua retain the palatal click in more lexical items than the other affected languages, which retain only a small number of tokens. The geographical spread of palatal click loss and variation in replacements (cf. §3.4.2) is outlined in Map 9.3 below:

Map 9.3 Geographical distribution of palatal click replacements in Khoe. Key: retention of ǂ; partial loss of ǂ (ǂ, ǂ > c); loss of ǂ (ǂ > c~tc); loss of ǂ (ǂ > tc); loss of ǂ (ǂ > tc~ts). Varieties and places labelled on the map: Caprivi Khwe, ǁAni, Buga, Ts’ixa, Nama/Damara, Tjwao, Nata Shua, Deti, Kua (cattle posts), Malatswae Kua, Naro, Southern Kua, Gǀui, Pandamatenga Shua, Danisi, Tyire, Ganadi, Gǁana, Lephephe Kua; Angola, Namibia, Botswana, Zimbabwe.

3.4.1 Frequency and Variation of Palatal Click Use in Ts’ixa

Two purported members of the Eastern Kalahari Khoe subgroup have retained a subset of their palatal influxes: Ts’ixa (Shua) and Kua (Tshwa). In this section, an assessment of palatal click loss and retention in Ts’ixa will be provided (see Table 9.7), based on the author’s own fieldwork and data collected by Westphal (n.d.) during the 1950s, Traill (n.d.) during the early 1980s and Vossen (1997) in 1983/4. The inclusion of older data allows us to assess the situation from a diachronic perspective and shed some light on the impact of recent social developments on click loss processes.

A statistical comparison of the frequency with which different speakers retain the palatal click was rendered difficult by a lack of uniform datasets. Consequently, I surveyed all lexemes with a palatal onset available for each individual speaker and then counted how many clicks were retained; I also took variation into account, i.e. I assessed whether a form was realized both with and without a click by one and the same speaker. I then performed a χ² test to see whether the differences between the rates of click retention are statistically significant. The test showed no statistically significant difference between the retention rates in my own and Traill’s data (#3–9), but a highly significant difference between these data and those collected by Westphal and Vossen (#1–2).

Table 9.7 Palatal click retention in Ts’ixa

 | #1 | #2 | #3 | #4 | #5 | #6 | #7 | #8 | #9
Total | 21 | 39 | 31 | 37 | 37 | 21 | 44 | 12 | 23
click loss | 7 | 7 | 23 | 32 | 26 | 13 | 32 | 7 | 17
click retention | 14 | 26 | 4 | 5 | 9 | 8 | 9 | 5 | 4
variation | 0 | 6 | 4 | 0 | 2 | 0 | 3 | 0 | 2
retention in % (incl. variation) | 66,7% | 82,1% | 25,8% | 13,5% | 29,7% | 38,1% | 27,3% | 41,7% | 28,1%

Source: #1 from Westphal (n.d.), #2 from Vossen (1997), #3 from Traill (n.d.), #4–9 own data

The wordlists of Westphal and Vossen (#1 and #2) were recorded with the same speaker, but more than twenty years apart. Nevertheless, they are fairly coherent in that the speaker in question retained the click in more than 50% of lexical items reconstructed with a palatal onset. My data included two sons of said speaker (#6 and #7), who indeed confirmed that both their father and mother, as well as older speakers they remembered from childhood, would display a higher click frequency. Nevertheless, Traill’s speaker (#3) only displayed about 25,8% retentions. The six idiolects from my own data showed retention rates between 13,5% (minimum) and 41,7% (maximum).

My research suggests that it would be misleading to consider the relative age of the speakers as the main explanation for the variation present. The comparative data includes two speakers over 70, one male (#4) and one female (#5). #5 retained about 29,7% of clicks in palatal onsets, while #4 displayed a comparatively low retention rate of 13,5%. #4 had spent time away from Mababe village and had a Shua-speaking mother, whereas #5 had spent all her life in Mababe, surrounded by speakers of Ts’ixa. We may therefore infer that other factors, such as language contact and mobility, may also contribute to differences in click use. However, age plays a role insofar as speakers younger than 40 usually spent at least a few years at school, where San children were (and still are) strongly discouraged from speaking their own languages. Several speakers confirmed that they would tone down the clicks when speaking in the schoolyard, so as not to be targeted by the teachers.

We may identify two possible sociolinguistic factors influencing the use of palatal clicks in Ts’ixa, neither of which can actively be traced back to long-term impact from a Bantu-speaking group: 1) contact with dialects of Shua displaying complete loss of the palatal influx series, and 2) recent social pressure from a dominant, non-Khoisan speech community (Tswana). Interference from Shua probably has some historical depth and may have instigated the click loss process in the first place. The significant difference between my data and the datasets recorded by Westphal and Vossen, however, is most likely the result of social developments during the second half of the 20th century, such as resettlement, increased mobility, and schooling. Table 9.8 exemplifies the variation in palatal click use across the datasets surveyed:

Table 9.8 Variation in palatal click loss among nine speakers of Ts’ixa

Gloss

#1

#2

#3

(Westphal n.d.) (Vossen 1997, 2011) (Traill n.d.)

#4

#5

#6

#7

#8

#9

(Fehn) (Fehn) (Fehn) (Fehn) (Fehn) (Fehn)

ǂ ‘plain’ ‘ear’ ‘enter’ ‘head’

ǂéé ǂã́ã̀ tyú̧

ǂé ǂã̀ ǂú~cú

‘heart’

ǂáó

‘nose’

tcuí

ǂáó ~cáó ǂúì

ǂé:~cé: céé cã́: cã́ã̄ ǂú: ~cú: cáó

céé cã́ã̄

céé ǂíù

céé cã́ã̄ cúú

cáó

cáó

cúī

cúī

ɟúà

ɟúù ɟúà

ɟúù ɟúà ɟírà

cūī

cúī

ǂyáó ~cáó cúī

ɡǂúù ɡǂúà ~dyúà ɡǂárà

ɟú: ɟʊ̀ a

ɟúù ɟóà

ɟúù ɟóà

ɟō:

ɟòò

ɟīrā ɟòò

ɟírà ɟòò

ǂháó cùné

chāó chūnī

chūní

chūné

chūní

céé cúú

ǂúī

céé cã́ã̄

ǂúī~cúī

ɡǂ ‘voiced’ ‘angry’ ‘ashes’

dyúá

‘request’ ‘springhare’ ǂh ‘aspirated’ ‘ashamed’ ‘elbow’

ǂhùní

chūní chūné


Gloss

#1

#2

#3

#4

#5

ɲúm̀

nǂúm̀

#6

#7

#8

#9

ŋǂ ‘nasal’ ‘mould’ ‘mouse’ ‘pour’

ɲúnì

ɲúm

ɲūnē

ɲúm̀ ɲúnì ŋǂánà

ɲánà

ŋɡǂ ‘prenasalized voiced’ ‘back’

nɡǂóró

‘black’ ‘walking stick’

nɡǂódò~ ǹɟóró ntyórò nɡǂú ɲɟú ɲɟùa

ɲɟóró

ɲɟóró

ɲɟóró

ɲɟúú ɲɟóá

ɲɟúú ɲɟúá

ɲɟúá

ǂʔóà

ǂʔúā: ~ʔjā: ʔɲṹ:

ʔyúā

ǂʔíī

ǂʔíì ǂʔáró

ǂʔúā ~ʔyúā ʔyũ̄ṹ ǂʔūrú ǂʔíì ǂʔáró

ǂʔó:lì ~ʔyórì

ʔyórī

ǂʔódī

ʔyāń

ǂʔāń

ǂʔāń

ʔyám̀ ʔyã́ã́

ʔyám̀ ʔyã́ã́

ǂʔã́ã́

cxṹũ̄

cxṹũ̄

ɲɟíró

ɲɟóró

ɲɟúú

ɲɟúú ɲɟúá

ǂʔóā

ʔyúā

ʔyũ̄ṹ ʔyũ̄ṹ ʔyūrú ǂʔíì ǂʔérú ǂʔáró ~ʔyáró ǂʔódī

ʔyũ̄ṹ ǂʔūrú

ǂʔúbí

ǂʔúbí ǂʔāní

ǂʔ ‘glottalized’ ‘ask’ ‘eat’ ‘dove’ ‘kick’ ‘seed’

ʔɲũ̄ũ

ǂʔū̌

‘sour plum’ ‘testicles’ ‘think’ ‘top’ ‘wind’

ǂʔùbí

ǂʔam ǂʔã́

ǂʔã́

ʔyũ̄ṹ

ʔyũ̄ṹ ǂʔūrú ǂʔáró

ǂʔāń ~ʔyāń ʔyám̀ ʔyã́ã́ ~ʔɲã́ã́

ǂʔã́ã́

ǂx ‘affricated’ ‘brain’

ǂxũ̀

chṍ: ~cxṍ:

cxṹũ̄

cxṹũ̄

ǂʔáró

ʔyám̀ ʔyã́ã́


Gloss

#1

#2

#3 cxóā

ǂxéí

ǂxòà ~cxòà ǂxáí

‘elephant’ ‘eye’

#4

#5

#6

cxóā

#7

#8

#9

cxéí

cxéí

tcxóā

cxéí

cxéí

cxáí

cxéí

c’ɛ́ɛ́

c’óá c’ɛ́ɛ́

c’ɛ́ɛ́

c’óá c’ɛ́ɛ́

ǂqáē ǂqōnā

ǂqáē ǂqōnā

ǂ’ (or ǂk’) ‘ejective’ ‘exit’ ‘spit’

c’ūà

ǂʔóá c’uà ǂʔɛ́(.xù) c’ɛɛ

c’ɛ́ɛ́

ǂq ‘uvular’ ‘bone marrow’ ‘crocodile’ ǂúná ǂòná ǂqona (?ǂqoná) (?ǂqoná) ‘mud’ ǂqó:̄ ‘springbok’ ǂaí (?ǂqai) ǂqái

ǂqōnā

ǂqáē

ǂqóō

ǂqáē

ǂqóō

ǂɢ ‘voiced uvular’ ‘slippery’

ŋǂɣārī

ŋǂɣādī ŋǂɡādī

ŋǂɡādī
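The retention percentages in Table 9.7 and the kind of χ² comparison described in §3.4.1 can be sketched as follows. This is an illustration under stated assumptions, not a reconstruction of the author's test: the text does not say how the datasets were grouped or which variant of the test was used, so the sketch simply pools #1–2 against #3–9, counts variable items as retained, and applies Pearson's χ² for a 2×2 table without continuity correction. Since #1 and #2 stem from the same speaker, the pooled observations are not strictly independent. The names `TABLE_9_7`, `retention_rate`, `pooled` and `chi_square_2x2` are hypothetical.

```python
import math

# Counts per dataset from Table 9.7: (total, click loss, click retention, variation).
TABLE_9_7 = {
    1: (21, 7, 14, 0),
    2: (39, 7, 26, 6),
    3: (31, 23, 4, 4),
    4: (37, 32, 5, 0),
    5: (37, 26, 9, 2),
    6: (21, 13, 8, 0),
    7: (44, 32, 9, 3),
    8: (12, 7, 5, 0),
    9: (23, 17, 4, 2),
}


def retention_rate(total, loss, retained, variation):
    """Share of lexemes realized with a click, counting variable items as retained."""
    return (retained + variation) / total


def pooled(datasets):
    """Sum (retained incl. variation, lost) over several datasets."""
    kept = sum(TABLE_9_7[d][2] + TABLE_9_7[d][3] for d in datasets)
    lost = sum(TABLE_9_7[d][1] for d in datasets)
    return kept, lost


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for [[a, b], [c, d]], df = 1."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = math.erfc(math.sqrt(stat / 2))  # upper tail of chi-square with 1 df
    return stat, p


older = pooled([1, 2])          # Westphal and Vossen
recent = pooled(range(3, 10))   # Traill and Fehn
stat, p = chi_square_2x2(*older, *recent)

print(round(100 * retention_rate(*TABLE_9_7[5]), 1))  # 29.7
print(stat > 10.83, p < 0.001)  # 10.83 is the critical value for p = 0.001, df = 1
```

With these pooled counts the statistic lies far above the 0.001 critical value, which is in line with the "highly significant difference" reported in the text.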

The speaker recorded by Westphal and Vossen (#1 and #2) retained the palatal click with all accompaniments.6 In my data, however, a clear preference for certain phonological environments becomes visible: clicks are occasionally retained in plain form or with a nasal accompaniment, but the great majority of retained clicks are either glottalized or have a uvular accompaniment. Interestingly, the lexemes displaying uvular accompaniments in Ts’ixa have cognate forms in Shua varieties in which the click has been lost. However, none of them retains the uvular stop accompaniment, although a uvular reflex in the form of a fricative may show up, e.g. cxáé ‘bone marrow’ (Danisi) and cxona~tcxona ‘crocodile’ (several Shua varieties). The explanation for click retention with uvular stop accompaniments may be sought in an apparent difficulty to palatalize uvular stops, thus blocking the expected replacement qj (C. Naumann, p.c.). While palatalized uvular stops do exist (e.g. in languages of the Caucasus), they appear to be cross-linguistically rare (cf. Catford 1977) and (so far) have not been reported for any Southern African Khoisan language. In the end, speakers might find it easier to retain the click than to replace it with a sound that creates new and possibly more challenging problems for fluent speech.

6  The forms listed under “ejective” were noted by Vossen (1997) with a glottal stop, but it is quite likely they are deaffricated ejective clicks, following a sound change pattern affecting all influxes in Ts’ixa (cf. also footnote 5 above).

3.4.2 Replacements for the Palatal Click

When the click is lost, reflexes of the palatal influx series found in Eastern Kalahari Khoe may be palatal stops or nasals, palatal affricates, or postalveolar affricates. Traill (1986a) identifies a “western” and an “eastern” pattern, but the distribution of phonemic reflexes of the palatal click across the varieties surveyed is actually more complex than he assumed. An overview is provided in Table 9.9. It may be suggested that the variation in click replacements implies a diachronic evolution ǂ > c > tc > ts (ɡǂ > ɟ > dj > dz, and nɡǂ > ndj > ndz, respectively). However, from the distribution of palatal clicks and their replacements across varieties of Shua and Tshwa, it becomes apparent that none of these sound shifts is a very good indicator for genealogical subclassification.

Table 9.9 Non-click replacements of the palatal click in Kalahari Khoe (compare also Traill 1986a)

[+click] | West | >>> | East | Examples
ǂ | c | tc | ts | (15)–(17)
ɡǂ | ɟ | dj | dz | (18)–(19)
ǂh | ch | tch | tsh | (20)
ŋǂ | ɲ | ɲ | ɲ | (21)
ŋɡǂ | ɲɟ | ndj | ndz | (22)–(23)
ǂx | cx | tcx | tsx | (24)–(25)
ǂʔ | ʔy~ʔɲ | ʔy~ʔɲ | ʔy~ʔɲ~ʔ | (26)–(27)
ǂx’~ǂq’~ǂk’~ǂ’ | c’ | tc’ | ts’ | (28)–(29)

Varieties, from west to east: Kua, Ts’ixa; Danisi; Shua; Deti, Tyire, Tjwao, Ganade

On the left end of the spectrum, we find Kua and Ts’ixa, which retain a subset of palatal click influxes and otherwise replace ǂ with a palatal stop c. On the right end of the spectrum, Deti and the northeastern Tshwa varieties Tyire, Ganade and Tjwao front a subset of their click replacements, leading to a co-existence of a palatal affricate tc and a post-dental affricate ts as replacements for ǂ. In between, replacement patterns cut across varieties, suggesting a geographical, rather than a genealogical cline. The c~tc variation in the Danisi dialect of Shua is between speakers, rather than between lexemes or phonological environments. tc~ts variation exists between lexemes, though the distribution of tc and ts across the lexicon differs between Deti, Tyire, Ganade and Tjwao. In Tjwao, a complementary distribution of ts and tc may be suggested: ts (and, by extension, dz and ndz) appears before o and u, and tc (and, by extension, dj and ndj) before i, e and a. For the time being, this allophony has to be considered a hypothesis, as more comprehensive data for all click accompaniments is still lacking. A similar pattern appears to exist in Sepako Tyire, while Deti, Ganade and Nata Tyire display a greater degree of tc~ts variation which, in part, appears to be dependent on the speaker. Examples (15)–(29) cover all instances of palatal click replacement summarized in Table 9.9.

(15) *ǂee ‘ear’ (Vossen 1997: 474)
Shua: ǂéé~céé (Ts), céé~tcéé (Da), tcéé (Na, Pa), tséé (De)
Tshwa: ǂéē (LK, SK), cíē (K), tcee (NTy), tcèè (Ga), tcee (Tsh)

(16) *ǂui ‘nose’ (Vossen 1997: 471)
Shua: ǂúī~cúī (Ts), cúí~tcúí (Da), tcúí (Na, Pa), tsúí (De)
Tshwa: ǂúī (LK), cúī (K), tcui (NTy), tsúū [?tsúī] (STy), tsùí (Ga), tsui (Tsh)

(17) *ǂao ‘heart’ (Vossen 1997: 449)
Shua: ǂáó~cáó (Ts), cáó~tcáó~tcóó (Da), tcáó~tcóó (Na, Pa), tsáó~tcóó~tsóó (De)
Tshwa: ǂáō~ǂóó (LK), cēo (K), tcoo (NTy), tsóō´ (STy), tsòó (Ga-Westphal), tcóō´ (Ga-Traill), tsoo (Tsh)

(18) *ɡǂoo ‘springhare’ (Vossen 1997: 493)
Shua: ɟòò (Ts), ɟóò~djōò (Da), djōò (Na, Pa)
Tshwa: ɡǂôó (LK), djóō´ (STy), djóō´ (Ga)

(19) ɡǂau ‘kori bustard’ (no reconstruction available)
Shua: ɟáú (Ts), ɟáú~djàú (Da), djáú~djàú (Na, Pa), zàú (De)
Tshwa: ɡǂœ́u~ɡǂíu~ɡǂúù (LK), ɡǂíǔ~ɡǂìù (SK), ɟīú~ɡīú (K), dyuu (STy), dzūū (Ga)

(20) *ǂhuni ‘elbow’ (Vossen 1997: 430)
Shua: chūní (Ts), chúní~tchúní~tcúní (Da), tchūní~tcūní (Na, Pa), tchúnī~tshúnī (De)
Tshwa: ǂhùńì~thùńì (SK), tyhúní (MLK), tshūní (STy), tshùní (Ga-Westphal), tsuini (Tsh)

(21) *ŋǂam ‘whistle’ (Vossen 1997: 475)
Shua: ŋǂúḿ~ɲúḿ (Ts), ɲáḿ~ɲúḿ (Da), ɲáḿ (Na, Pa)
Tshwa: ŋǂím̄~ŋǂúm̄ (LK), ŋǂúm (SK), ɲíḿ (K)

(22) ŋǂoa (> ŋɡǂoa) ‘walking stick’ (no reconstruction available)
Shua: ɲɟóá (Ts), ɲɟóà~ndjóà (Da), ndjóà (Na, Pa)
Tshwa: ɡǂóà~ŋǂóā (LK), ɡǂōā (SK), ɟōà~ɟūà (K), ǹdzua (STy)

(23) *ŋǂadu (> ŋɡǂadu) ‘back’ (Vossen 1997: 479)
Shua: ɲɟóró (Ts), ɲɟóró~ndjóró (Da), ndjóró (Na, Pa), yóró (De)
Tshwa: ɡyīro (K-Traill), yódó (K; Vossen 1997: 479), ndjoro (NTy), dzóró~ǹdzórō (STy), ndjóró (Ga-Traill), nzóró (Ga-Westphal), ndzoro~ndjoro (Tsh)

(24) *ǂxai ‘eye’ (Vossen 1997: 418)
Shua: cxáí (Ts), cxáí~tcxáí~tcháí (Da), tcxáí~tcháí (Na, Pa), tchàí~tshàí (De)
Tshwa: ǂxaî (SK), cxāi (K), tcxai (NTy), tcxáì (STy), tcxáì (Ga), tcxai (Tsh)

(25) *ǂxoa ‘elephant’ (Vossen 1997: 430)
Shua: cxóā (Ts), cxōà~tcxōà~tchōà~tcōà (Da), tcxóā~tchóā (Na, Pa), tcxóā~tchóā (De)
Tshwa: ǂxúà (LK), ǂxòā (SK), cxòā (K), tcoa (NTy), tcxóā (STy), tcxóā (Ga), tsxoa~tsoa (Tsh)

(26) *ǂʔoa ‘ask’ (Vossen 1997: 438)
Shua: ǂʔóā (Ts), ʔyóà~ʔyúā-ʔyùà (Da), ʔyóà (Na, Pa), ʔyóà (De)
Tshwa: ǂʔua~ʔyaa (LK), ʔuà (K-Traill), ʔyā (STy)

(27) *ǂʔũũ ‘eat’ (Vossen 1997: 433)
Shua: ʔyũ̄ṹ (Ts, De), ʔyũ̄ṹ~ʔɲũ̄ṹ (Da, Na, Pa)
Tshwa: ǂʔū̂ ṹ (LK), ʔĩṹ̄ (K-Traill), ʔɲũũ (NTy), ʔɲ́ ũ�ṹ (STy), ʔɲũũ~ʔyũũ (Tsh)

(28) *ǂx’oa ‘exit, go out’ (Vossen 1997: 450)
Shua: c’óá (Ts), c’ōá~tc’ōá (Da), tc’óá~tcóá (Na, Pa), tc’óá~ts’óá (De)
Tshwa: ǂq’ɣóà́ (LK), ǂq’oà (SK), c’uà (K), tc’oa (NTy), ts’úa (STy), ts’oa (Tsh)

(29) *ǂx’ai ‘spit’ (Vossen 1997: 493)
Shua: c’áé~c’ɛ́ɛ́ (Ts), c’āé~tc’ēé (Da)
Tshwa: ǂq’ɣaé (LK), ǂkx’āē~t’ɜxāē~ǂɣae (SK), c’ēe (K), ts’ée (STy), ts’éī (Ga)

In Traill’s Kua varieties (Cattlepost (K), Lephephe (LK) and Southern (SK)), ongoing click loss and eventual replacement with a non-click consonant is sometimes accompanied by raising of V1 of the rhyme following the weakened click or non-click consonant (cf. (19) and (20) above). It should be noted that this phenomenon is, albeit rarely, also found with speakers of Ts’ixa (cf. *ǂuu ‘head’ realized as ǂíú by speaker #6, and the non-click reflex ɟírà of *ɡǂara ‘to request’ in Table 9.8 above). Some forms, in particular from the Tshwa varieties, also display progressive vowel assimilation (cf. (17), (19) and (26) above). In Lephephe Kua, the vowel assimilation also appears in forms in which the click is retained, i.e. ǂóó ‘heart’ in (17) and ɡǂúù ‘kori bustard’ in (19).

Loss of accompaniments compared to the respective proto-forms is found throughout the subgroup: lexemes displaying a prenasalized-voiced onset ŋɡǂ (cf. footnote 4 above) in cognates from other Eastern Kalahari Khoe varieties are denasalized in Kua, irrespective of whether the click is replaced or not (cf. (22) and (23) above). This sound change also affects other influxes and is presumably unconnected to click loss. The non-click forms for ‘back’ in (23) constitute an interesting case. While Traill’s Kua and Sepako Tyire display the same denasalization pattern also seen in (22), Deti as well as Vossen’s Kua display a form in which neither nasality nor voicing is present. Only the palatal gesture is retained in the form of a word-initial glide y. This does not appear to be a regular pattern, though, and for the moment, I suggest treating this replacement as a lexicalized exception.

In our sample, the aspiration gesture of ǂh was lost in non-click cognate forms used by a subset of Shua speakers from all dialects as well as by speakers of Tjwao (cf. (20)). It is worth noting that loss of aspiration was never observed with replacements for the alveolar click (cf. (5) and (6) in §3.3.2), suggesting that this sound change is specifically connected to the loss of the palatal influx and the resulting non-click onset. Affrication becomes aspiration or is lost with some speakers of Shua and northeastern Tshwa (ǂx>tch>tc, cf. (24) and (25) above), leading to a further reduction of contrasts. A lenition of affricated clicks is also found with the dental (ǀx>ǀh) and lateral (ǁx>ǁh) influx series that are not affected by click loss; it also affects the non-click cluster /tsx/, which becomes [tsh]. This sound change appears to operate on the idiolectal plane and, to the best of my knowledge, does not affect any speech variety as a whole. It is, however, more frequent in Shua and northeastern Tshwa than in Ts’ixa and Kua.

The affrication of ejective clicks ǂx’ is generally lost when the click is replaced by a non-click consonant, though the ejective component is retained (cf. (28) and (29)). As in the case of replacements for the alveolar ejective click ǃx’, it seems that the contrast between ejective and glottalized clicks had not been levelled at the time when the click loss occurred (as is nowadays the case in Nata and Pandamatenga Shua, cf. footnote 5). The reflex of ǂʔ is either a preglottalized palatal glide ʔy, or a preglottalized palatal nasal ʔɲ (before nasalized rhymes). The palatal gesture may be lost in some Tshwa dialects, leading to neutralization of the contrast between the reflexes of ǃʔ and ǂʔ (both become ʔ).

3.5 Loss of the Lateral Influx ǁ

The lateral influx ǁ is the next most intense click after ǃ and ǂ, and one might expect its loss once the two [+abrupt] influxes have disappeared from the phoneme inventory. Indeed, Traill and Vossen (1997: 43) quote a total of five lexemes in which the lateral click is lost before a glottal stop accompaniment. No replacement by a non-click sound occurs, leading to a neutralization of the contrast between ǃʔ, ǁʔ and ʔ in C1 position. They list the forms ǁʔoo ‘to die’, ǁʔau ‘fish’, ǁʔao ‘to offend’, ǁʔan ‘to ripen’ and ǁʔae ‘village’, but provide no information on their distribution, except for the fact that they occur in “certain Eastern NKK [Kalahari Khoe] dialects” (Traill and Vossen 1997: 43). My own data (see below) suggests that only ʔau and ʔae occur in the Shua cluster (except Ts’ixa), whereas the other clickless forms appear to be restricted to the Tshwa cluster. The usefulness of this particular sound change for subgrouping may however be questioned, as the easternmost Tshwa dialect Tjwao, which has lost both the alveolar and palatal influx, retains the lateral click with ǁʔoo ‘to die’, while it has been lost in Kua varieties retaining some of their palatal influxes.
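The neutralization described here can be made explicit by grouping onsets according to their reflexes: whenever two or more distinct onsets share a reflex, the contrast between them is lost. The sketch below assumes a schematic eastern variety whose reflexes follow the statements in the text; it is not a description of any single documented dialect, and the names `REFLEXES` and `mergers` are hypothetical.

```python
from collections import defaultdict

# Reflexes of glottalized onsets in a schematic eastern variety (assumption).
REFLEXES = {
    "ǃʔ": "ʔ",   # alveolar click lost (Table 9.6)
    "ǂʔ": "ʔy",  # palatal click replaced by a preglottalized glide
    "ǁʔ": "ʔ",   # lateral click lost before the glottal stop
    "ǀʔ": "ǀʔ",  # dental click retained
    "ʔ": "ʔ",    # plain glottal stop
}


def mergers(reflexes):
    """Map each reflex to the onsets that have merged into it (two or more)."""
    groups = defaultdict(list)
    for onset, reflex in reflexes.items():
        groups[reflex].append(onset)
    return {r: sorted(onsets) for r, onsets in groups.items() if len(onsets) > 1}


print(mergers(REFLEXES))

# Variant in which the palatal gesture of ʔy is also lost, as reported for
# some Tshwa dialects: ǂʔ then joins the merged set.
print(mergers({**REFLEXES, "ǂʔ": "ʔ"}))
```

In the first case ǃʔ, ǁʔ and ʔ fall together, as stated in the text; in the second, ǂʔ is added to the same set, while the dental series remains distinct.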

(30) *ǁʔae ‘village’ (Vossen 1997: 508)
Shua: ǁʔáé~ǁʔɛ́ɛ́ (Ts), ʔáé (Da, Na, Pa, De)
Tshwa: ʔāé (LK, MLK), ʔāi (SK, K), ʔâē (STy), ʔae (Tsh)

(31) *ǁʔau ‘fish’ (Vossen 1997: 436)
Shua: ǁʔáū (Ts), ǁʔáū~ʔáū (Da), ʔáū (Na, De)
Tshwa: ʔau (K, NTy, Tsh), ʔáū (STy)

(32) *ǁʔoo ‘die’ (Vossen 1997: 495)
Tshwa: ʔōó (LK, SK, MLK, K), ʔóō´ (STy), ǁʔoo (NTy, Tsh)

So far, lateral clicks with accompaniments other than the glottal stop ʔ are not affected by click loss. However, an interesting case also quoted in Traill and Vossen (1997: 44) concerns the person-gender-number suffix for the 3rd person masculine plural. While the Western Kalahari Khoe languages (plus Ts’ixa) all have -ǁù(à), Khoekhoe and the Eastern Kalahari Khoe subgroup have -kù(à). The correspondence between ǁ and k seems intriguing. However, Güldemann (2004, forthcoming), in reconstructing the pronoun system of Proto-Khoe, presents a convincing case for contact influence from Ju. According to his reconstruction, the formation of the masculine plural form involves the Ju plural suffix *ǃa- plus a final element *u marking person and number. Historically, the form -ku found in Khoekhoe and Eastern Kalahari Khoe could therefore be explained as the result of *ǃa-u > *ǃu. In this scenario, the velar stop k would be the regular replacement for the alveolar click ǃ (see §3.1 above), rendering the lateral click of Western Kalahari Khoe the variant in need of explanation.

3.6 Loss of the Dental Influx ǀ

So far, no Kalahari Khoe language has been reported to replace a subset, let alone all, of its dental click series. However, Traill and Vossen (1997: 42) present a set of likely cognates (shown in Table 9.10 below) in which a dental click with an affricated ejective ǀx’ (ǃOra) or glottal stop accompaniment ǀʔ (Nama) in Khoekhoe corresponds to an alveolar ejective affricate ts’ in Kalahari Khoe. As the clickless forms are also reconstructed for Western Kalahari Khoe, i.e. for languages like Naro and Gǀui which are otherwise unaffected by click loss, one may assume that this sound change is not associated with the ongoing processes of click loss and replacement described above. Indeed, Traill and Vossen (1997: 43) call it a “lexically confined click loss presumably of a greater time-depth than the more extensive click loss already discussed”. They further stress that both ǀʔ and ǀx’ are well attested sounds in Kalahari Khoe, making


Table 9.10 Assumed correspondences between ǀx’~ǀʔ in Khoekhoe and ts’ in Kalahari Khoe

| Gloss | Khoekhoe | PKK | PWKalK | PEKalK | Kwadi |
| ‘to milk’ | ǀx’ao (ǃOra), ǀʔao (Nama) | *ǀx’ao | *ts’ao | *ts’ao | n/a |
|  | ǀx’oo (ǃOra), ǀʔoo (Nama) | *ǀx’oo | *ts’oo | *ts’oro | n/a |
|  | ǀx’aa (ǃOra), ǀʔaa (Nama) | *ǀx’ai | *ts’áì | *ts’áì | n/a |
|  | ǀx’an (ǃOra), ǀʔan (Nama) | *ǀx’an | *ts’ánì | *ts’ani | n/a |
|  | ǀx’ãã (ǃOra), ǀʔãã (Nama) | *ǀx’ãã | *ts’ã́ã̀ | *ts’ã́ã̀ | n/a |
|  | n/a | n/a | ts’áà |  | [ǂaŋ́]ᵃ |

(prob. *ǀ͡χˀ > Nama ǀˀ. However, if we accept Güldemann’s (2004, 2010) classification of Kwadi as a higher order relative of the Khoe family, [+click] forms found in Westphal’s Kwadi data shown in Table 9.10 suggest a reconstruction *ǀx’ or *ǀʔ of the onset for Proto-Khoe-Kwadi. This reinforces the hypothesized loss of the dental click in Proto Kalahari Khoe, rather than click genesis in Khoekhoe. While click and non-click replacements do not appear to share obvious articulatory features, Traill and Vossen (1997: 43) stress their close acoustic similarities. As virtually no other case of dental click loss has been described in the literature, it is hard to assess how common or likely the pattern outlined above is from a comparative perspective.

4 Click Loss in Kwadi and Sesfontein Damara

The discussion on click loss in the Khoe-Kwadi family has mostly focused on Kalahari Khoe, due to the purported classificatory importance of the sound change within this subgroup. More recent data have, however, shown that click loss occurs elsewhere in the family, i.e. in the Kwadi language of southwestern Angola (cf. Güldemann 2004; Güldemann and Elderkin 2010), and in the Khoekhoe dialect spoken by the Damara of Sesfontein in northern Namibia. Click loss in these varieties is not restricted to the alveolar and palatal influxes and displays both differences and similarities to what has been observed in Kalahari Khoe. While it has been hypothesized that the sound change might have spread from Eastern Kalahari Khoe to more distantly related languages through contact (Ehret 2008 for Kwadi), this seems improbable given the present-day location of the speech communities in question. Click loss in Kwadi and Damara most likely arose independently from similar phenomena described for Kalahari Khoe. This observation casts further doubt on the rarity and classificatory usefulness of lenition processes affecting click sounds in Khoe.

4.1 Kwadi

Kwadi is a virtually extinct language of southwestern Angola, which was classified as a higher-order relative of the Khoe family by Güldemann (2004, 2014; see also Güldemann and Elderkin 2010). It was spoken among the Kwepe shepherds of the Angolan Namib until approximately the beginning of the 20th century, when the entire speech community started shifting to the neighboring Bantu language Kuvale. In the 1950s, de Almeida (n.d.) and Westphal (1964/5) were able to record a few elders still able to speak Kwadi; rememberers have also been found by Gibson (1972) and the present author (Fehn, field notes). Due to the existence of audio recordings, we have a rather good idea of the phoneme inventory of Kwadi before the language became extinct.
While comparisons with Khoe forms allow for the reconstruction of a click inventory featuring four influxes ǂ, ǀ, ǃ, ǁ for Proto Khoe-Kwadi, the only click retained on recordings is the dental influx ǀ. Although the available data does not allow for the identification of regular sound correspondences between Khoe and Kwadi, some patterns arise (see also Güldemann and Elderkin 2010). Those involving click loss are exemplified below in Table 9.11:

Table 9.11 Cognate forms between Kwadi and Khoe displaying click loss in Kwadiᵃ

| Gloss | Kwadi | Related Khoe Form |
| ǂ > tc~ts~ʃ | | |
| ‘nose’ | tcui (W), ʃui (A) | *ǂui (PK) |
| ‘heart’ | tsoo (A, W) | *ǂao (PK) |
| ɡǂ > tc~ts | | |
| ‘ashes’ | tsoo(ᵑ) (W), tcoo (F) | *ɡǂoa (PKalK) |
| ŋǂ (> ŋɡǂ) > dz | | |
| ‘black’ | dzuu (W) | *ŋǂuu (PK) |
| ǂʔ > ʔy~ʔɲ | | |
| ‘to eat’ | ʔyũũ (A, F), ʔɲũũ (W) | *ǂʔũũ (PK) |
| ‘top’ | ʔɲãã (W) | *ǂʔam (PK) |
| ǂx’ > ts’ | | |
| ‘saliva, to spit’ | ts’ee(-ke) | *ǂx’ae (PK) |
| ǃ > k, ǃ > x | | |
| ‘to go’ | kũũ (A, F, W) | *ǃũũ (PK) |
| ‘good’ | kãŋ (W) | *ǃãĩ (PK) |
| ‘penis’ | xami(ᵑ) | *ǃam (PKalK) |
| ‘heavy’ | xum(-wi-yo) (G) | *ǃom~ǃum (PK) |
| ‘to want’ | xa (W), xe (F, W) | *ǃaa (PK) |
| ǁ > kʟ̥~ts | | |
| ‘tooth’ | kʟ̥ũũ (W), tsũũ (A) | *ǁũũ (PK) |
| ‘to stand up’ | kʟ̥ãã (F, W) | ǁãã (‘to rise’ in Ts’ixa) |
| ɡǁ > tc | | |
| ‘female, woman’ | tcee (F, W) | *ɡǁae (PKalK) |
| ǁx > x | | |
| ‘moon’ | xaa | ǁxaa (Khoekhoe) |
| ‘to open’ | xoe (W) | *ǁxoba (PK), *ǁxobe (PKalK) |
| ǁʔ > ʔ | | |
| ‘fish’ | ʔau | *ǁʔau (PKalK) |
| ‘to sleep’ | ʔmũũ (A), ʔũũ (F), ʔmu (W) | *ǁʔum (PK) |
| ǁx’ > kʟ̥’ | | |
| ‘to wash’ | kʟ̥’aa | *ǁx’aa (PK) |

a  There might be more, yet undetected correspondences that involve click loss in Kwadi. Table 9.11 merely quotes some of the more obvious cases for which audio recordings were available; for further examples based on Westphal’s field notes, the reader is referred to Güldemann and Elderkin (2010).
Note: Abbreviations: A = de Almeida, F = Fehn, G = Gibson (transcribed by Riikka Halme-Berneking and Bonny Sands), W = Westphal; PK = Proto-Khoe, PKalK = Proto-Kalahari Khoe
Source: All Kwadi data taken directly from recordings; reconstructions by Vossen (1997) if not stated otherwise

The examples which have been shaded display obvious similarities with click loss patterns also observed in Kalahari Khoe: The palatal affricate tc appears as a replacement for the palatal click ǂ in Shua and Tshwa varieties, and a post-dental affricate ts is found in Nata Shua, Deti, Ganadi and Tjwao (cf. ex. (15)–(17) above). Vowel assimilation as seen in tsoo ‘heart’ is equally attested in Ganadi and Tjwao (cf. (17)). Loss of the palatal click, e.g. in dzuu ‘black’, is also found in some eastern Kalahari Khoe varieties, and denasalization is attested for Kua (cf. (22)–(23)). The Kwadi form for ‘to eat’ is also found in Shua and Tshwa (cf. (27)), including variation between a pre-glottalized glide ʔy and a pre-glottalized palatal nasal ʔɲ. An ejective ts’ as replacement for ǂx’ also appears in Tyire and Ganadi (cf. (29)). A velar stop k replacing the alveolar click ǃ as in forms for ‘to go’ and ‘good’ is attested in Khwe, Shua and Tshwa (cf. (1)–(2)), and loss of the lateral click with the glottal stop accompaniment remaining is found in Nata, Deti and the Tshwa cluster (cf. (31)).

More puzzling correspondences involve forms with a voiced click accompaniment in Khoe. tcoo ‘ashes’ (< *ɡǂoa) and tcee ‘woman’ (< *ɡǁae) both display a voiceless onset and vowel assimilation. Whether the loss of voicing is connected to the click loss or constitutes a general feature of the phoneme inventory of Kwadi vs. Khoe cannot be answered at this point. The palatal affricate tc appears as replacement for both the palatal and the lateral click. A partial overlap between replacements for the palatal and the lateral click is also suggested by the form tsũũ for ‘tooth’, although other occurrences of lateral click replacement either display kʟ̥ (‘to stand up’, ‘to wash’) or complete deletion of the click (‘fish’, ‘to sleep’, ‘moon’, ‘to open’). Replacements for the lateral click involving a lateral reflex are also found in some Angolan ǃXun varieties (cf. Fehn, this volume) and might constitute an areal feature. While the pattern ǃ>k also appears in Kalahari Khoe, some apparent cognates between Kwadi and Khoe seem to involve a shift ǃ>x (‘penis’, ‘heavy’). Whether the fricative x actually acts as a replacement for the alveolar click ǃ, or whether the underlying form should be reconstructed as *ǃx, with loss of affrication in Proto-Khoe, cannot be answered with the data at hand.

4.2 Sesfontein Damara

Sesfontein Damara is a cluster of related Khoekhoe varieties in northern Namibia identified by Haacke et al. (1997). Haacke (1999) and Job (2014) provide examples of click loss that are quite distinct from patterns found elsewhere.
In most cases quoted, the click is dropped, while the secondary articulation, i.e. the accompaniment, is retained. Unlike in Kalahari Khoe and Kwadi, all clicks – including the dental one – may be affected by click loss, but no click has disappeared entirely. In Table 9.12, clickless forms from Sesfontein Damara are contrasted with their counterparts in other Namibian Khoekhoe varieties.

Table 9.12 Cognate forms between Sesfontein Damara and other Khoekhoe varieties displaying click loss in Sesfontein Damara

| Gloss | Sesfontein Damara | Khoekhoe [+click] |
| ǂʔ > ʔ, ǂʔ > hi | | |
| ‘to eat’ | ʔuu ⟨û⟩ (J: 5), híù ⟨hîû⟩ (H: 150) | ǂʔùű ⟨ǂùű⟩ (H&E: 509) |
| ǂhh > h, ǂhh > s | | |
| ‘to publicize’ | hûa (J: 5) | ǂhhoa ⟨ǂhoa⟩ (J: 5) |
| ‘to tell a lie’ | sûa (H: 150) | tsuu-ǂhhoa ⟨tsuǂhôa⟩ (H: 150, ǂAakhoe, Haiǁom, Ghaub) |
| ŋǀ > n | | |
| ‘huntable game’ | ʔam-ni (amni) (H: 149, J: 5) | ʔám̀-ŋǀèè ⟨ám̀ǀnèè⟩ (H&E: 535) |
| ǀx > x, ǀx > s | | |
| ‘with’ | xa (H: 149; J: 5) | ǀxa ⟨ǀkha⟩ (H&E: 735) |
| ‘to come’ | sii ⟨sī⟩ (H: 149) | ǀxìi̋ ⟨ǀkhìi̋⟩ (H&E: 475) |
| ǀ > ʔ | | |
| ‘fountain’ | ʔau ⟨au⟩ (J: 5) | ǀʔàű ⟨ǀàű⟩ (H&E: 532) |
| ǃ > k | | |
| ‘brother’ | kui ⟨kûi⟩ (H: 149) | ǃùì ⟨ǃɡùì⟩ (H&E: 457) |
| ŋǃ > n | | |
| ‘in’ | naa ⟨nâ⟩ (H: 150) | ŋǃàà ⟨ǃnàà⟩ (H&E: 560) |
| ǃx > x | | |
| ‘place’ | xai (J: 5) | ǃxáì ⟨ǃkháì⟩ (H&E: 624) |
| ǃʔ > ʔ | | |
| ‘to receive’ | ǃxoo-ʔoa ⟨ǃkhō-oa⟩ (H: 149) | ǃxòő-ǃʔőá ⟨ǃkhòőǃőá⟩ (H&E: 641) |
| ŋǁ > ŋ | | |
| ‘that, there’ | ŋaa~naa ⟨nɡâ, nâ⟩ (J: 5) | ŋǁàa̋ ⟨ǁnàa̋⟩ (H&E: 705) |
| ǁʔ > ʔ | | |
| ‘that aforementioned’ | ʔíì ⟨î⟩ (J: 5) | ǁʔíì ⟨ǁî⟩ (H&E: 238) |
| ‘normal person’ | ʔau-khoe ⟨au-xoe⟩ (J: 5) | ǁʔau-khoe ⟨ǁau-khoe⟩ (J: 5) |

Note: If not stated otherwise, the [+click] form is from Standard Namibian Khoekhoe. Khoekhoe orthography as given in source is added in angle brackets; source and page number are shown in parentheses. Abbreviations: H = Haacke 1999; H&E = Haacke and Eiseb 2002; J = Job 2014

Most instances of click loss in Sesfontein Damara concern what might be interpreted as consonant clusters consisting of a click and a non-click consonant (cf. Güldemann 2001; Nakagawa 2006). Clicks followed by a glottal stop, a uvular fricative, or a glottal fricative (“delayed aspiration”) are simply dropped while the accompaniment is retained; no difference is made between different click types. Deviations from this pattern are the forms híù ‘to eat’ and sii ‘to come’. In the case of híù, V1 is fronted and the glottal stop is replaced by a glottal fricative. The vowel fronting might be considered a reflex of the palatal click, similar to what has been observed in Eastern Kalahari Khoe (cf. §3.4.2 above). Click loss with nasalized clicks generally results in a nasal n or ŋ. Both variants are also found as replacements for ŋǃ in Kalahari Khoe (cf. ex. (7) above). The example kui ‘brother’ displays replacement of a click by a non-click consonant. The sound shift ǃ>k is regularly attested in Kalahari Khoe and Kwadi, but seems rarer in Sesfontein Damara.

The sources from which the data were taken do not allow for an assessment of how frequent the patterns quoted in Table 9.12 are, and whether idiolectal variation occurs. It may however be mentioned that click loss is also attested – if to a lesser extent – in other varieties of the Khoekhoe subgroup, including Haiǁom and possibly Namidama (Haacke 1999: 148). It may even be assumed that click loss in Damara has some historical depth, considering the phenomenon was already mentioned by the missionary Heinrich Vedder (cf. Inskeep 2003: 451) in the beginning of the 20th century.

One last remark should be made about replacement of one click by another: Job (2014: 5) quotes two instances in which speakers of Sesfontein Damara replace an alveolar click with a lateral click: ŋǃani > ŋǁani ‘six’, and ŋǃae > ŋǁae ‘to be born’. This is the only case⁷ found so far in the Khoe family in which “weakening” of an abrupt click as described by Traill and Vossen (1997) does not lead to immediate replacement by a non-click consonant, but instead results in a [+noisy] click retaining the acoustic feature [+grave] (see §2.2 above).

7  Historical sources on Kalahari Khoe (e.g. Maingard 1961) appear to suggest that correspondences between different click types indicative of click weakening exist throughout the family. However, a comparative analysis shows no regularities, and it may be assumed that erroneous transcriptions, rather than click weakening and replacement, are to be held responsible.

5 Discussion

This article discussed click loss in three genealogical units within the Khoe-Kwadi language family: Kalahari Khoe, Khoekhoe, and Kwadi. Patterns emerging within the Kalahari Khoe subgroup and, to a lesser extent, in Kwadi, could be shown to be fairly uniform, thus confirming regularities of sound change as postulated by Traill (1986a) and Traill and Vossen (1997). However, the linguistic and geographical distance between Kalahari Khoe and Kwadi supports a scenario in which both linguistic units started losing click consonants independently from each other. The occurrence of click loss in selected dialects of Khoekhoegowab as well as in the Non-Khoe families Kx’a (Fehn, this volume) and Tuu (Sands, this volume) gives further weight to the assumption that click loss is by no means a rare sound change and therefore a problematic indicator for genealogical subgrouping. The example of Ts’ixa, where a statistically significant decrease in palatal click use could be shown to have occurred within one generation, casts further doubt on whether reconstructions like Vossen’s (1997) Eastern Kalahari Khoe without alveolar and palatal influx series will really hold.

If click loss indeed constitutes a sound shift which may arise multiple times in different places and in both related and unrelated languages, one may ask whether all attested cases of click loss share a common linguistic or extralinguistic trigger. It is worth noting that the Khoe-Kwadi languages affected by click loss are all spoken by groups residing on the northern (Kwadi, Damara, Khwe) and eastern (Shua, Tshwa) fringes of the Kalahari Basin (cf. Map 9.1 in §3 above). Consequently, they were more exposed to influence from invading Bantu-speakers than their linguistic relatives from the Central Kalahari where languages retain the highest click load (cf. Figure 9.1). It has been argued that the likelihood of click loss increases in sociolinguistic environments in which click sounds have become negatively connoted identity markers (Wilmsen and Vossen 1990). One may however wonder if the present-day social situation of the affected Khoe-speaking groups (cf. e.g. Chebanne 2002) can be compared to the historical setting during which the alveolar and palatal influxes were first replaced by non-click consonants. Tswana influence in the area is unlikely to predate the middle of the 18th century (Wilmsen and Vossen 1990: 9), and we know in fact very little about early Bantu – Khoe interactions to the north and east of the Kalahari Basin. It seems clear that comparatively rapid click loss can only be explained by sociolinguistic pressure (cf. §3.4.1 for the role of institutionalized schooling in speeding up palatal click loss in Ts’ixa) or ongoing language shift.
However, some historical factors may have contributed to the preference of non-click over click consonants: A leveled contact scenario in which Khoe men married Bantu women likely led to the introduction of clickless pronunciations through adult L2 learners into Khoe-speaking communities. Khwe, Shua and Tshwa, the very Khoe groups affected by click loss, all display a considerable degree of Bantu admixture (Pickrell et al. 2012), including maternal DNA of Bantu origin (Barbieri et al. 2014). This implies a historical scenario in which gene flow was not unilateral, i.e. Bantu women became integrated in Khoe-speaking communities and possibly impacted the languages they came to adopt. Once initiated, click loss may first have spread through the lexicon of individual languages before eventually transcending linguistic boundaries. Palatal click loss presumably entered Ts’ixa through contact with western dialects of Shua (§3.4.1), and the Khwe dialect ǁAni seems to have borrowed clickless forms from the related Buga variety (§3.3.1). Both examples support an areal spread from one Khoe language to another, rather than inheritance of the sound shift from a common ancestor.

Taking this into account, I suggest that the sociolinguistic context may have influenced the outcome of regular lenition processes involving clicks without necessarily being the initiating force. In this framework, it becomes possible to see click replacement as found in Non-Khoe and click loss as found in the Khoe family as different outcomes of a common phonetic process. While the initial trigger of click weakening remains the desire to facilitate difficult articulations in speech flow, the process is not linear and does not lead from weakening to replacement to loss. It rather leads to replacement of a targeted [+abrupt] click by another, weaker consonant. This consonant may, but does not have to, be a click.

References

Almeida, António de. n.d. Audio recordings of Kwadi. Lisbon: Instituto de Investigação Cientifica Tropical (IICT), unpublished.
Barbieri, Chiara, et al. 2014. “Unraveling the complex maternal history of southern African Khoisan populations.” American Journal of Physical Anthropology 153(3): 435–448.
Beach, Douglas M. 1938. The Phonetics of the Hottentot Language. Cambridge: Heffer.
Catford, John C. 1977. Fundamental Problems in Phonetics. Bloomington & London: Indiana University Press.
Chebanne, Andy. 2002. “Shifting identities in eastern Khoe: ethnic and language endangerment.” Pula: Botswana Journal of African Studies 16(2): 147–157.
Chebanne, Andy, and Chris Collins. 2017. “Tense and aspect in Kua: A preliminary assessment.” Khoisan Languages and Linguistics: Proceedings of the 4th International Symposium, July 2011, Riezlern/Kleinwalsertal, ed. Anne-Maria Fehn. Cologne: Rüdiger Köppe, pp. 91–108.
Dickens, Patrick J. 1994. English – Juǀ’hoan/Juǀ’hoan – English dictionary. Quellen zur Khoisan-Forschung/Research in Khoisan Studies 8. Cologne: Rüdiger Köppe.
Doke, Clement M. 1925. “An outline of the phonetics of the language of the ʗhũ̬ː Bushmen of north-west Kalahari.” Bantu Studies 2: 129–166.
Dornan, Samuel S. 1917. “The Tati Bushmen (Masarwas) and their Language.” Journal of the Royal Anthropological Institute 47(1): 37–112.
Fehn, Anne-Maria. 2016. “A grammar of Ts’ixa (Kalahari Khoe).” Ph.D. thesis, University of Cologne. (http://kups.ub.uni-koeln.de/7062/, accessed 14/11/2017).
Fehn, Anne-Maria. 2018. “New data on Eastern Kalahari Khoe phoneme inventories. A comparative survey.” Africana Linguistica 24: 5–29.


Gibson, Gordon D. 1972. Kwepe basic vocabulary, collected at Ngwaia. Washington: National Anthropological Archives, Smithsonian Institution; unpublished.
Güldemann, Tom. 1998. “The Kalahari Basin as an Object of Areal Typology: A first approach.” Language, Identity, and Conceptualisation among the Khoisan, ed. by Mathias Schladt. Quellen zur Khoisan-Forschung/Research in Khoisan Studies 15. Cologne: Rüdiger Köppe, pp. 137–169.
Güldemann, Tom. 2001. Phonological Regularities of Consonant Systems across Khoisan Lineages. University of Leipzig Papers on Africa: Languages and Literatures 16. Leipzig: Institut für Afrikanistik.
Güldemann, Tom. 2004. “Reconstruction through ‘de-construction’: The marking of person, gender, and number in the Khoe family and Kwadi.” Diachronica 21(2): 251–306.
Güldemann, Tom. 2010. “Kwadi: from family-level to family-internal isolate.” Handout for a paper presented at the workshop “Language Isolates in Africa,” Lyon, December 3–4, 2010.
Güldemann, Tom. 2014. “‘Khoisan’ Linguistic Classification Today.” Beyond ‘Khoisan’: Historical Relations in the Kalahari Basin, ed. Tom Güldemann and Anne-Maria Fehn. Amsterdam & Philadelphia: John Benjamins, pp. 1–40.
Güldemann, Tom. Forthcoming. “Person-gender-number marking from Proto-Khoe-Kwadi to its descendants: A language contact perspective.” Festschrift for Bernd Heine, ed. Christa König and Rainer Vossen. Routledge African Linguistics Series. London: Routledge.
Güldemann, Tom, and Edward D. Elderkin. 2010. “On External Genealogical Relationships of the Khoe Family.” Khoisan languages and linguistics: Proceedings of the 1st International Symposium, January 4–8, 2003, Riezlern/Kleinwalsertal, ed. M. Brenzinger and Christa König. Quellen zur Khoisan-Forschung/Research in Khoisan Studies 24. Cologne: Rüdiger Köppe, pp. 15–52.
Güldemann, Tom, and Anne-Maria Fehn, eds. 2014. Beyond ‘Khoisan’. Historical Relations in the Kalahari Basin. Amsterdam & Philadelphia: John Benjamins.
Güldemann, Tom, and Anne-Maria Fehn. 2017. “The Kalahari Basin area as a ‘Sprachbund’ before the Bantu expansion – an update.” The Cambridge Handbook of Areal Linguistics, ed. Raymond Hickey. Cambridge: Cambridge University Press, pp. 500–526.
Güldemann, Tom, and Robyn Loughnane. 2012. “Are there “Khoisan” roots in body-part vocabulary? On linguistic inheritance and contact in the Kalahari Basin.” Methodology in Linguistic Prehistory, ed. Tom Güldemann, Gary Holton, Robyn Loughnane and Laura C. Robinson (= Language Dynamics & Change, 2.2). Leiden: Brill, pp. 215–258.
Güldemann, Tom, and Mark Stoneking. 2008. “A historical appraisal of clicks: A linguistic and genetic population perspective.” Annual Review of Anthropology 37: 93–109.


Güldemann, Tom, and Rainer Vossen. 2000. “Khoisan.” African languages: An introduction, ed. Bernd Heine and Derek Nurse. Cambridge: Cambridge University Press, pp. 99–122.
Haacke, Wilfrid H.G. 1999. “Phonological gleanings from the dialects of Khoekhoegowab (Nama/Damara): Towards internal reconstruction.” African Mosaic: Festschrift for J.A. Louw, ed. Rosalie Finlayson. Pretoria: Unisa, pp. 131–163.
Haacke, Wilfrid H.G., and Elias Eiseb. 2002. A Khoekhoegowab Dictionary with an English-Khoekhoegowab index. Windhoek: Gamsberg Macmillan.
Haacke, Wilfrid, Eiseb, Elias, and Levi Namaseb. 1997. “Internal and external relations of Khoekhoe dialects: A preliminary survey.” Namibian Languages: Reports and Papers, ed. Wilfrid Haacke and Edward D. Elderkin. Namibian African Studies, 4. Cologne: Köppe, pp. 125–209.
Heine, Bernd. 1999. “The ǁAni: Grammatical notes and texts.” Khoisan Forum 11. Cologne: Institut für Afrikanistik, University of Cologne.
Heine, Bernd, and Henry Honken. 2010. “The Kx’a family: A new Khoisan genealogy.” Journal of Asian and African Studies 79: 5–36.
Heine, Bernd, and Christa König. 2016. The ǃXun Language. A Dialect Grammar of Northern Khoisan. Cologne: Rüdiger Köppe.
Honeybone, Patrick. 2008. “Lenition, weakening and consonantal strength: Tracing concepts through the history of phonology.” Lenition and Fortition, ed. Joaquim Brandão de Carvalho, Tobias Scheer and Philippe Ségéral. Berlin: Mouton de Gruyter, pp. 9–93.
Honken, Henry. 2012. “Some Notes on the History of Khoe.” Unpublished manuscript.
Inskeep, Adi. 2003. Heinrich Vedder’s The Bergdama. An annotated translation of the German original with additional ethnographic material, 2 vols. Cologne: Köppe.
Job, Sylvanus. 2014. “Features of a Sesfontein dialect of Khoekhoegowab.” Presentation at the African linguistic colloquium, Humboldt University of Berlin.
Kilian-Hatz, Christa. 2003. Khwe Dictionary. Namibian African Studies 7. Cologne: Rüdiger Köppe.
Knight, Alec, Peter A. Underhill, Holly M. Mortensen, Lev A. Zhivotovsky, Alice A. Lin, Brenna M. Henn, Dorothy Louis, Merritt Ruhlen, and Joanna L. Mountain. 2003. “African Y Chromosome and mtDNA Divergence Provides Insight into the History of Click Languages.” Current Biology 13(6): 464–473.
König, Christa, and Bernd Heine. 2008. A Concise Dictionary of Northwestern ǃXun. Quellen zur Khoisan-Forschung/Research in Khoisan Studies 21. Cologne: Rüdiger Köppe.
Ladefoged, Peter, and Daniel Everett. 1996. “The status of phonetic rarities.” Language 72(4): 794–800.
Ladefoged, Peter, and Ian Maddieson. 1996. The Sounds of the World’s Languages. Cambridge, MA: Blackwell.


Maingard, L.F. 1961. “The central group of the click languages of the Kalahari.” African Studies 20(2): 114–122.
Miller-Ockhuizen, Amanda L., and Bonny E. Sands. 1999. “ǃKung as a linguistic construct.” Language and Communication 19: 401–413.
Nakagawa, Hirosi. 2006. “Aspects of the Phonetic and Phonological Structure of the Gǀui Language.” Ph.D. Dissertation, University of the Witwatersrand, Johannesburg.
Nakagawa, Hirosi. 2014. “Gǀui Dictionary.” Version April 2014. Unpublished Manuscript.
Pickrell, Joseph K., Nick Patterson, Chiara Barbieri, Falko Berthold, Linda Gerlach, Tom Güldemann, Blesswell Kure, Sununguko Wata Mpoloka, Hirosi Nakagawa, Christfried Naumann, Mark Lipson, Po-Ru Loh, Joseph Lachance, Joanna Mountain, Carlos D. Bustamante, Bonnie Berger, Sarah A. Tishkoff, Brenna M. Henn, Mark Stoneking, David Reich, and Brigitte Pakendorf. 2012. “The genetic prehistory of southern Africa.” Nature Communications 3: 1143.
Sands, Bonny. 2010. “Juu subgroups based on phonological patterns.” Khoisan Languages and Linguistics: Proceedings of the 1st International Symposium, January 4–8, 2003, Riezlern/Kleinwalsertal, ed. M. Brenzinger and Christa König. Quellen zur Khoisan-Forschung/Research in Khoisan Studies 24. Cologne: Rüdiger Köppe, pp. 85–114.
Sands, Bonny, and Hilde Gunnink. 2019. “Clicks on the fringes of the Kalahari Basin Area.” Theory and Description in African Linguistics: Selected Papers from the 47th Annual Conference on African Linguistics, ed. Emily Clem, Peter Jenks and Hannah Sande. Berlin: Language Science Press, pp. 703–724.
Traill, Anthony. 1986a. “Click replacement in Khoe.” Contemporary Studies on Khoisan, 2 vols., ed. Rainer Vossen and Klaus Keuthmann. Quellen zur Khoisan-Forschung/Research in Khoisan Studies, 5. Hamburg: Helmut Buske, pp. 301–320.
Traill, Anthony. 1986b. “Do the Khoi have a place in the San? New data on Khoisan linguistic relationships.” Tagungsberichte des Internationalen Symposions “Afrikanische Wildbeuter”, Sankt Augustin, Januar 3–5, 1985. 2 vols., ed. Franz Rottland and Rainer Vossen. Sprache und Geschichte in Afrika, 7. Hamburg: Helmut Buske, pp. 407–430.
Traill, Anthony. 1994. A ǃXóõ dictionary. Quellen zur Khoisan-Forschung/Research in Khoisan Studies 9. Cologne: Rüdiger Köppe.
Traill, Anthony. n.d. “Field notes on Khoe.”
Traill, Anthony, and Rainer Vossen. 1997. “Sound change in the Khoisan languages: New data on click loss and click replacement.” Journal of African Languages and Linguistics 18: 21–56.
Valiente-Noailles, Carlos. 1993. The Kua: Life and Soul of the Central Kalahari Bushmen. Rotterdam & Brookfield: A.A. Balkema.
Visser, Hessel. 2001. Naro Dictionary: Naro – English/English – Naro. Gantsi: Naro Language Project.
Vossen, Rainer. 1991. “What Do we Do with Irregular Correspondences? The Case of the Khoe Languages.” History in Africa 18: 359–379.


Vossen, Rainer. 1997. Die Khoe-Sprachen: Ein Beitrag zur Erforschung der Sprachgeschichte Afrikas. Quellen zur Khoisan-Forschung/Research in Khoisan Studies 12. Cologne: Rüdiger Köppe.
Vossen, Rainer. 2011. “Patterns of Linguistic Convergence in the Khoe-Speaking Area of Southern Africa.” Geographical Typology and Linguistic Areas, with Special Reference to Africa, ed. Osamu Hieda, Christa König and Hirosi Nakagawa. Tokyo University of Foreign Studies, Studies in Linguistics 2. Amsterdam & Philadelphia: John Benjamins, pp. 189–200.
Westphal, Ernst O.J. 1964/5. “Kwadi field notes and recordings.” Cape Town: Manuscripts and Archives Department, University of Cape Town, unpublished.
Westphal, Ernst O.J. n.d. “Ganadi field notes.” Cape Town: Manuscripts and Archives Department, University of Cape Town, unpublished.
Westphal, Ernst O.J. n.d. “Handa field notes.” Cape Town: Manuscripts and Archives Department, University of Cape Town, unpublished.
Wilmsen, Edwin N., and Rainer Vossen. 1990. “Labour, language and power in the construction of ethnicity in Botswana.” Critique of Anthropology 10(1): 7–37.

Chapter 10

Click Replacement and Loss in Ju

Anne-Maria Fehn

1 Introduction¹

This article discusses click replacement and loss in the Ju (ǃXun) dialect cluster of the Kx’a language family. Kx’a classification is shown in Figure 10.1. The paper is a companion piece to “Click loss in Khoe” (Fehn, this volume) in considering sound changes, notably lenition processes, that affect click consonants in the typological unit southern African Khoisan. Click replacement and loss in Ju was first systematically described by Traill and Vossen (1997). The authors presented data on patterns of click replacement and loss in Angolan Ju varieties, based on their own research and data published by Snyman (1980, 1997). Since then, more data has become available, and a considerable amount of research has broadened our knowledge on the historical relations of the Ju dialect cluster, culminating in the establishment of a language family termed “Kx’a” which links the Ju dialects of Namibia, Botswana and Angola to the ǂ’Amkoe² language of the Central Kalahari (Heine and Honken 2010).

Click replacement, i.e. the replacement of one phonemic click by another, is a phenomenon frequently attested in this family, in particular with regard to reflexes of the Proto-Kx’a retroflex click *ǃǃ and the palatal click *ǂ (§2). In addition, data from the northernmost Ju varieties spoken in Angola suggest that click loss may affect both [+abrupt] and [+noisy] influxes, even amongst speech communities which otherwise display no signs of language shift and loss (§3). Map 10.1 below provides a very preliminary overview of click replacement and loss as attested in documented Ju varieties.

1  This article is based on data compiled as part of the European Science Foundation EUROCORES programme EuroBABEL, which was supported by funds from the German Research Foundation (Deutsche Forschungsgemeinschaft) DFG. Additional research on Angolan ǃXun varieties was funded through the contract CEECIND/02765/2017, by FEDER funds through the Operational Programme for Competitiveness Factors – COMPETE, and by Portuguese National Funds through FCT – Foundation for Science and Technology under the PTDC/BIA-EVF/2907/2012, FCOMP-01-0124-FEDER-028341, and PTDC/BIA-GEN/29273/2017. The author would like to thank Bonny Sands, Tom Güldemann, Bernd Heine and Jorge Rocha for comments on a previous version of this paper.
2  Formerly called ǂHoan, see Güldemann (2014) for further discussion.

© Koninklijke Brill NV, Leiden, 2020 | doi:10.1163/9789004424357_011

Figure 10.1 The Kx’a family
Kx’a
  ǂ’Amkoe: ǂHoan, Nǃaqriaxe, Sasi
  Ju (= ǃXun)
    Northwestern: Northern (N1-3), Western (W1-3), Kavango (K)
    Central (C1-2): C1 Gaub, C2 Neitsas
    Southeastern (E1-3): E1 Juǀ’hoan, E2 Dikundu, E3 ǂX’aoǁ’aen
Note: Ju and ǂ’Amkoe are both language complexes (or dialect clusters)
Source: cf. Güldemann 2014, Heine and Honken 2010; Heine and König 2016

Map 10.1 Geographical distribution of click replacement and loss within the Ju language complex
Key: Retention of Proto-Ju *ǃǃ; Proto-Ju *ǃǃ > ǃ; Proto-Ju *ǃǃ > ǁ; Proto-Ju *ǃǃ > ǁ and Proto-Ju *ǂ > ⨎; Proto-Ju *ǃǃ > ǁ, Proto-Ju *ǂ > ⨎ and click loss
Note: Cuando/Quito, Quito/Cubango and Cubango/Cunene refer to approximate geographical areas in Angola with reference to major rivers.
Source: cf. Snyman 1997

This section intends to update the information on click loss in Central Angolan ǃXun first mentioned by Bleek (1927) and expanded on in Traill and Vossen (1997) with data from a Ju variety collected in the Mupa National Park of Angola’s Cunene province during fieldwork in July 2014.

2 Click Replacement

Heine and Honken (2010: 24–27) reconstruct five influxes for Proto-Kx’a: dental ǀ, palatal ǂ, lateral ǁ, alveolar ǃ, and retroflex ǃǃ. In selected varieties of the Ju dialect cluster, two of them have regularly been replaced by click sounds with different acoustic properties: The Proto-Ju retroflex click *ǃǃ corresponds to an alveolar click ǃ in Southeastern Ju, and to a lateral click ǁ in Northwestern Ju (§2.1). The Proto-Ju palatal click *ǂ corresponds to a fricated palatal click ⨎³ in some varieties of Northwestern Ju (§2.2). Examples for both sound shifts, including a description of their phonetic properties and geographic distribution, are discussed in the sections below.

3  ⟨!͡s⟩ in the transcription of Miller et al. (2011); ⟨⨎⟩ was suggested by B. Sands (p.c.).

2.1 Reflexes of the Proto-Ju Retroflex Click *ǃǃ

Miller-Ockhuizen and Sands (1999, 2000) and Sands (2010: 92) convincingly argue that Southeastern Ju *ǃ and Northwestern Ju *ǁ can be considered reflexes of a Proto-Ju retroflex click *ǃǃ. Such a click has been recorded in some Central Ju varieties (cf. Miller-Ockhuizen and Sands 1999: 409; Sands 2010: 91), e.g. in Vedder’s (1910/11) Gaub ǃXun, in one variety of Bleek and Lloyd’s “ǃKung” (1911), in Snyman’s (1997) “ǃXung” of Leeunes Farm and in Doke’s (1925) data from Neitsas and Otjimavare. Table 10.1 exemplifies the correspondences; the items ‘chin’, ‘iron’, ‘porcupine’, ‘son’, ‘eland’ and ‘river’ are also discussed in Traill and Vossen (1997: 36). Snyman’s (1980) data from Angolan ǃXun⁴ is quoted in the rightmost column.

4  Snyman’s (1980) data originates from the Cuando-Cubango province and probably stems from varieties belonging to the “Northern” branch of the “Northwestern” cluster (cf. Figure 10.1 above). The author’s own research suggests that these varieties occupy a different place on the dialect spectrum than those from the more western provinces of Huíla and Cunene, which appear to be closer to the “Western” dialects studied by König and Heine (2001).

In Table 10.1, the retroflex click is only retained in Doke’s (1925) data from Neitsas. Juǀ’hoan displays a sound shift from retroflex to alveolar *ǃǃ > ǃ, and the data taken from König and Heine’s (2008) dictionary of Northwestern ǃXun aligns with the two dialects recorded in Angola in having *ǃǃ > ǁ. The correspondences appear to be regular, and none of the varieties surveyed displayed what


Table 10.1 Reflexes of Proto-Ju *ǃǃ in Doke’s Neitsas ǃXun (C2), Juǀ’hoan, Northwestern ǃXun (W2), Mupa ǃXun and Snyman’s “A1” Gloss

Neitsas ǃXun (C2) (Doke 1925)

Juǀ’hoan (Dickens 1994)

NW-ǃXun (König and Heine 2008)

Mupa ǃXun (Fehn field notes)

Angolan ǃXun (A1) (Snyman 1980)

[kǃǃum̠ ] (‘thigh’) [kǃǃa͡ɪ́]

ǃaq (100) ǃóm (106, ‘hind leg’) ǃáí (64)

ǁȁq (140) ǁḿ (ǁóm ~ǁúm) (143) ǁāē (64)

[ǁàˤà] [ǁóḿ]

ǁà̰

[kǃǃaⁿ] (‘chin tip’) [kǃǃâʊ] [kǃǃuⁿ]

ɡǃãĩh (54) [ɡǃǃàŋ] ɡǃáu (91) ɡǃú (177)

ɡǁȁŋ (118)

[ɡǁȁŋ]

ɡǁàò (130, 136) ɡǁú (181)

[ɡǁáó∼ɡʟ̥áó] [ɡǁúú]

[kǃǃhá]

ǃhá (313)

ǁhā-mà (108)

[ǁʰáá]

ǃ’hán (323)

ǁ’hān (130)

nǃaoh (249) nǃànɡ (73)

nǁàȍ (113) nǁŋ̄ (127)

[ŋǁȁȍ] [ŋǁŋ]

ǃxòm (139) ǃxúí (162)

ǁxùm (102) ǁxōē (173)

[ǁxòm̀ ] [ǁxōē]

ǁxòm

ǃ’óm (130)

ǁ’m̄ (155)

[ǁʔm̄ ]

ǁ’òm

*ǃǃ~ǃ~ǁ ‘iron’ ‘leg’ ‘to die’

[ǁēē]

*ɡǃǃ~ɡǃ~ɡǁ ‘chin’ ‘hand’ ‘water’

ɡǁānɡ

*ǃǃh~ǃh~ǁh ‘meat, animal’ *ǃǃ’h~ǃ’h~ǁ’h ‘son’

ǁ’hã̀

*nǃǃ~nǃ~ǁh ‘bow’ ‘eland’

[ŋǃǃaⁿ]

nǁànɡ

*ǃǃx~ǃx~ǁx ‘river’ ‘tail’ *ǃǃʔ~ǃʔ~ǁʔ ‘porcupine’

Note: The orthographical representation of all forms has been retained from the original. For data taken from the dictionaries quoted, page numbers are indicated in parentheses.


could be considered “relic forms” or instances of idiolectal variation (compare Fehn, this volume, for Kalahari Khoe). The sound shift apparently did not affect click accompaniments. However, the three forms for ‘chin’, ‘hand’ and ‘water’ noted by Doke (1925) with a voiceless retroflex click ǃǃ (⟨kǃǃ⟩) have voiced correspondences in the northwestern and southeastern dialects. Whether this should be interpreted as an irregular correspondence or as an erroneous transcription on Doke’s part cannot be answered at this point.

Traill and Vossen (1997: 35f) considered the ǃ~ǁ correspondences to be a case of click replacement which involves weakening of the alveolar click ǃ.⁵ Despite ǃǃ being the proto-form and not a weakened version of ǃ, lenition may still hold as an explanation for the shift from *ǃǃ to ǁ: According to Sands (2010: 92), the seal of the retroflex click ǃǃ is difficult to produce. This may cause speakers to unroll the tongue before the closure is released, resulting in lateral leakage. The click hence becomes acoustically more similar to the lateral click ǁ, and eventually, the contrast between the two is neutralized.

The Ju variety spoken at Mangetti Dune in northwestern Namibia might have retained an intermediate stage of this sound change process.⁶ Miller-Ockhuizen and Sands (2000) describe this click, which they orthographically represent as ⟨ǀǁ⟩, as a “forward released denti-alveolar lateral click”. According to their research, ǀǁ has a longer burst duration than ǃ and ǁ; furthermore, ǁ rises faster to peak intensity than ǀǁ, but slower than ǃ. A visual cue for the presence of a forward released lateral click might be seen in the speaker’s tongue protruding between the front teeth (Sands 2010: 1993).

In conclusion, we may say that both Juǀ’hoan ǃ and Northwestern ǃXun ǁ as reflexes of a Proto-Ju retroflex click *ǃǃ are instances of click replacement, i.e. replacement of one click by another. However, it seems futile to speculate about implications for the history of the Ju dialect cluster: while it appears clear that Proto-Ju *ǃǃ > Southeastern Ju *ǃ constitutes a separate innovation in which ǃǃ merged with the acoustically similar ǃ, we do not know whether Proto-Ju *ǃǃ merged with ǁ in one single instance ancestral to all Ju dialects in which *ǃǃ corresponds to ǁ, or whether the same innovation happened multiple times (cf. Sands 2010, Heine and König 2016: 22).
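The two mergers discussed in this section can be illustrated with a small sketch. The Python snippet below is not part of the original analysis: it assumes only the five Proto-Ju influxes listed in §2 and the two shifts described above (*ǃǃ > ǃ in Southeastern Ju, *ǃǃ > ǁ in Northwestern Ju, retention in Central Ju). The names `PROTO_JU`, `MERGERS`, `reflex_inventory` and `inventories` are hypothetical labels chosen for the example.

```python
# Proto-Ju influxes as listed in the text (symbols as used in this chapter).
PROTO_JU = ["ǀ", "ǂ", "ǁ", "ǃ", "ǃǃ"]

# Hypothetical encoding of the shifts described in section 2.1:
# each cluster maps a proto-click to its reflex; unlisted clicks are retained.
MERGERS = {
    "Central": {},                 # retroflex click retained
    "Southeastern": {"ǃǃ": "ǃ"},   # *ǃǃ merges with the alveolar click
    "Northwestern": {"ǃǃ": "ǁ"},   # *ǃǃ merges with the lateral click
}


def reflex_inventory(proto, shifts):
    """Apply the shifts to each proto-click and drop duplicates, keeping order."""
    result = []
    for click in proto:
        reflex = shifts.get(click, click)
        if reflex not in result:
            result.append(reflex)
    return result


def inventories():
    """Return the resulting click inventory for every cluster in MERGERS."""
    return {name: reflex_inventory(PROTO_JU, shifts) for name, shifts in MERGERS.items()}


if __name__ == "__main__":
    for name, inventory in inventories().items():
        print(name, inventory)
```

Under these assumptions the Southeastern and the Northwestern cluster both end up with four influxes, but through different mergers. The sketch is silent on the open question raised above, namely whether the Northwestern merger happened once or several times.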

5  Their argument is as follows: The alveolar influx ǃ becomes increasingly noisy, which results in the retention of the feature [+grave], but loss of the feature [+abrupt], i.e. eventual replacement of the alveolar click ǃ with a lateral click ǁ. 6  However, this particular click type was only recorded with a single speaker who allegedly knew Juǀ’hoan and might have exaggerated the pronunciation difference (B. Sands, p.c.).


2.2 Replacement of the Palatal Click ǂ in Northwestern Ju

A palatal click ǂ in Southeastern Ju corresponds to a different click type in the Northwestern cluster which, in the past, was commonly noted as retroflex,⁷ i.e. ⟨ǃǃ⟩ (cf. e.g. König and Heine 2008, Traill and Vossen 1997). Based on acoustic measurements involving ultrasound recordings from Ekoka ǃXun (W2), Miller et al. (2011, 2019) were able to show that the click corresponding to the palatal click is a fricated postalveolar or palatal click ⨎. ⨎ contrasts with the abrupt alveolar click ǃ. This sound shift may be interpreted as lenition involving frication of a click. The measurements of Miller et al. (2011, 2019) revealed ⨎ to have a longer burst duration and longer rise times than the corresponding palatal clicks in Mangetti Dune ǃXun. This corresponds to Traill and Vossen’s (1997) loss of the feature [+abrupt], leading to an increased noisiness of the click.

7  Traill and Vossen (1997) mistake ⨎ for the retroflex click ǃǃ noted for the Central Ju varieties by authors such as Doke (1925) and Bleek (1956) (see §2.1 above) and assume that it constitutes a “weakened” version of the alveolar click ǃ on its way to replacement by a lateral click ǁ and eventual loss.

Examples for ǂ~⨎ correspondences with different accompaniments are provided in Table 10.2. In the two rightmost columns, additional data from Mupa and Traill and Vossen’s (1997) Angolan ǃXun has been included. The correspondences between Dickens’ (1994) Juǀ’hoan and the Northwestern ǃXun forms from the W2 dialect given in König and Heine (2008) are regular, i.e. Juǀ’hoan ǂ always corresponds to W2 ⨎. Accompaniments do not correspond in all cognate sets provided in Table 10.2. The form for ‘far’ has a plain palatal click ǂ in Juǀ’hoan that corresponds to ⨎x in König and Heine’s (2008) data. The affrication is also present in the clickless form recorded in Mupa. The aspirated nasal click nǂh in ‘pangolin’ corresponds to delayed aspiration (or aspiration) in all Northwestern forms, including those recorded in Angola.

In addition to the expected ǂ > ⨎ shift, both datasets from Angola are affected by click loss, but correspondences differ. In the cognate forms from Mupa, Juǀ’hoan ǂ either corresponds to ⨎, or to a voiceless alveolar stop t. Variation between speakers or lexemes was not observed, but it should be kept in mind that the data was elicited with two male speakers only. Traill and Vossen’s (1997: 37) data from two speakers shows variation in click use. One speaker retained ⨎ with all forms given, the other also gave clickless forms with a velar stop k corresponding to ǂ and ⨎. In the data from Angola, correspondences do not always involve retention of the accompaniment. In Mupa ǃXun, the forms [⨎ʰōē] ‘dog’ and [txei, txai] ‘scorpion’ do not retain the prevoicing. The forms for


Table 10.2 Reflexes of Proto-Ju *ǂ in Juǀ’hoan, König and Heine’s (2008) Northwestern ǃXun (W2), Mupa ǃXun and Traill’s ǃXun from Menongue Gloss

Juǀ’hoan (Dickens 1994)

NW-ǃXun (König and Heine 2008)

Mupa ǃXun

⨎ ȍqì-mà ⟨ǃǃȍqì-mà⟩ (113) ⨎ à’ō, ⨎ àò ⟨ǃǃà’ō, ǃǃàò⟩ (89) ⨎ xān ⟨ǃǃxān⟩ (129) ⨎ òȁh ⟨ǃǃòȁh⟩ (133)

[tòˤì-mà] [tàʔò] [txã] [toah] [tȁūh kōō] [thoŋ]

Angolan ǃXun (Traill and Vossen 1997: 37)

ǂ > ⨎ (> t) ‘boy’ ‘cold’ ‘far’ ‘giraffe’ ‘kidney’ ‘star’

ǂà’ú (56) ǂán (76) ǂoah (85) ǂauh(-còcòrò) (103) ǂuhn (155)

⨎ ȕn ⟨ǃǃȕn⟩ (170)

ɡ⨎ uu ⟨ǃǃɡu⟩, ku

ɡǂ > ɡ⨎ (> d) ‘beard’ ‘forearm’ ‘long’ ‘old’

ɡǂaqò (81) ɡǂà’ín (109) ɡǂah (121)

ɡ⨎ ȍŋɡōlō ⟨ɡǃǃȍŋɡōlō⟩ (111) ɡ⨎ àqo ⟨ɡǃǃàqo⟩ (132) ɡ⨎ ā’ŋ̄ ⟨ɡǃǃā’ŋ̄ ⟩ (144) ɡ⨎ ȁhȁ ⟨ɡǃǃȁhȁ⟩ (150)

[dȍnɡōlō]

ǂháí (112) ǂhà (126)

⨎ ’hí ⟨ǃǃ’hí⟩ (146) ⨎ hà ⟨ǃǃhà⟩ (152)

[thíí]

ɡǂhúín (68) ɡǂhàì (151)

ɡ⨎ hōē ⟨ɡǃǃhōē⟩ (125) [⨎ hōē] ⨎ háí-⨎ hàiyà ⟨ǃǃháí-ǃǃhàiyà⟩ (167)

ǂ’han (33) ǂ’hùrì (50)

⨎ ’hȁn ⟨ǃǃ’hȁn⟩ (109) ⨎ ’hùli ⟨ǃǃ’hùli⟩ (116)

[⨎ ’hãã] [⨎ ’hùlī]

‘python’

nǂanɡ (134)

’ǹ⨎ ŋ̏ ⟨’ǹǃǃŋ̏ ⟩ (158)

[⨎ ŋ]

‘to beat’ ‘to cuddle, to fold arms’ ‘to sneeze’

nǂà’m (245) nǂànɡnǂànɡ (244)

n⨎ ā’m̀ ⟨nǃǃā’m̀ ⟩ (111)

[ŋ⨎ àʔm̀ ]

[dāʔŋ̀ ]

ǂh > ⨎ h ‘many, much’ ‘path, road’ ɡǂh~ɡ⨎ h ‘dog’ ‘smooth’ ǂ’h > ⨎ ’h ‘arm, branch’ ‘calf (of leg)’ nǂ > n⨎

nǂaih (151)

[ŋ⨎ āīh]

n⨎ aŋ ⟨nǃǃaŋ⟩, n⨎ anɡ ⟨nǃǃanɡ⟩ n⨎ eŋɡa ⟨nǃǃeŋɡa⟩


Table 10.2 Reflexes of Proto-Ju *ǂ in Juǀ’hoan, König and Heine’s (2008) Northwestern (cont.) Gloss

Juǀ’hoan (Dickens 1994)

NW-ǃXun (König and Heine 2008)

Mupa ǃXun

Angolan ǃXun (Traill and Vossen 1997: 37)

nǂhaò (76) nǂhòqí (125)

n⨎ hȁò ⟨nǃǃhȁò⟩ (55) ⨎ ’húí ⟨ǃǃ’húí⟩ (152)

[⨎ ’hui]

⨎ ui [ǃǃui], ⨎ hui [ǃǃhui]

nǂh > n⨎ (ʔ⨎ ’h) ‘to fall down’ ‘pangolin’ ǂx > ⨎ x (> tx) ⨎ xanu [ǃǃxanu],

‘book’

ǂxanù (303)

‘daughter’ ‘to rub’ ‘to shiver’

ǂxàè (303) ǂxàù (140)

⨎ xàè ⟨ǃǃxàè⟩ (123) ⨎ xāŋ̀ ⟨ǃǃxāŋ̀ ⟩ (164)

[txàù] [txŋ̀ ]

ɡǂxàìì (143)

ɡ⨎ xáín ⟨ɡǃǃxáín⟩ (132)

[txéí, txáí]

ǂ’ùì (51)

⨎ ’ùì-mà ⟨ǃǃ’ùì-mà⟩ (116) ⨎ ’ūín ⟨ǃǃ’ūín⟩ ⨎ ’úí ⟨ǃǃ’úí⟩ (170)

[⨎ ʔũ̄ĩ]́� [⨎ ʔúí]

kxanu

⨎ xae [ǃǃxae], kxae

ɡǂx > ɡ⨎ x (> tx) ‘scorpion’ ǂʔ~⨎ ʔ ‘caracal’ ‘to open’ ‘to sprout’

(ǂ’úí ‘burst, crack’ (308))

Note: The orthographical representation of all forms has been retained from the original, except for the data from König and Heine (2008) and Traill and Vossen (1997) where ǃǃ has been replaced by ⨎ , with the original transcription provided in angle brackets. For data taken from the dictionaries quoted, page numbers are indicated in parentheses.

‘pangolin’ align with König and Heine’s (2008) data in displaying delayed aspiration (Mupa ǃXun) or aspiration (Angolan ǃXun), rather than with the nasalized aspirated accompaniment found in Juǀ’hoan. Click loss in Angolan ǃXun, including the replacements for ⨎ found in Table 10.2 above, will be further discussed in §3 below.

3 Click Loss in Central Angolan ǃXun

Bleek (1927) was the first scholar to mention replacement of clicks by non-click consonants in a Ju variety from Angola. Since then, more data has become available (Traill and Vossen 1997), and it seems clear that part of Central Angolan Ju is indeed affected by click loss. However, the scope and historical depth of the sound changes observed are hard to determine, mostly due to a lack of systematically collected data from a wide range of speakers and geographical locations. Traill and Vossen’s (1997) data from Menongue was collected with two speakers only, and the same is true for my own data from Mupa. Nevertheless, click loss in Mupa was also observed in informal discussions between speakers that occurred during the elicitation, and no objections were raised with regard to the forms given by the speakers interviewed.

Speakers from Mupa were either monolingual, or bilingual in Kwanyama, the Bantu language dominant in the area. No signs of an ongoing language shift or language loss could be observed. During our visit in 2014, ǃXun was the sole language of everyday conversation, and the language in which children were addressed. The Mupa ǃXun are nomadic and retain a foraging way of life. Occasionally, individual members of the speech community work for the neighboring Kwanyama to obtain food and alcoholic beverages. Although the relationship between Kwanyama and ǃXun is usually tense and, in part, hostile, it seems rather unlikely that interference or pressure from Kwanyama can be held accountable for the click loss.

The data available does not suggest that clicks are lost in any particular order, nor are there examples for complete loss of any influx series. The two male speakers with whom our data from Mupa ǃXun was collected⁸ tended to replace a subset of all documented influxes (⨎, ǃ, ǁ, ǀ) by non-click consonants, suggesting that in this particular variety, the loss of one influx series does not precede the loss of another. The extent to which clicks are lost in the variety Traill recorded in Menongue is hard to determine. In the examples given in Traill and Vossen (1997), one speaker replaces a subset of his fricated palatal clicks *⨎ (or *ǂ) with a velar stop k (cf. ‘star’, ‘book’ and ‘daughter’ in Table 10.2 above), and both speakers appear to regularly⁹ replace the plain lateral click ǁ¹⁰ with a laterally released velar stop kʟ̥ (and ɡǁ with ɡʟ̥, respectively).

8  Both speakers are male, between 25–35 years of age, and live full-time with their speech community. They speak Mupa ǃXun as their first language and understand and speak Kwanyama. One of them has a limited knowledge of Portuguese, which he obtained during a work stay in Lubango.
9  They quote one example in which a lateral click ǁ in Juǀ’hoan corresponds to a dental click ǀ in their Angolan data (A2): ‘God’, Juǀ’hoan ǁɡãõ̀ã̀, A2 ǀɡãũã; however, the second speaker gave a non-click form with a laterally released velar stop ɡʟ̥ãũã. This case of an irregular correspondence cannot be explained at this point.
10  Traill interprets kʟ̥ as a replacement for the alveolar click ǃ. However, the [+click] cognate forms he compares his data to are all from Juǀ’hoan, so we are dealing with reflexes of Proto-Ju *ǃǃ, i.e. the Northern Ju phoneme underlying Traill’s kʟ̥ would be a lateral ǁ, rather than an alveolar click ǃ (cf. §2.1).

In the following, I will give examples for all patterns of click loss occurring in Mupa ǃXun. For the fricated palatal click ⨎ and the alveolar click ǃ, variation between speakers may seem comparatively minor, but it should be kept in mind that the corpus is not big enough to determine whether click loss regularly occurs with all speakers.

3.1 Loss of the Fricated Palatal Click ⨎

Examples showing the loss of the fricated palatal click are found in Table 10.2 above. The following replacements are attested in the data shown in Table 10.3. In my data, replacement was regular. The speakers I worked with replaced all instances of the onsets quoted in Table 10.2 with non-click consonants. Other onsets involving a fricated palatal click were retained. These include aspiration, delayed aspiration, nasalization and glottalization. Note, however, that not all accompaniments were attested in the data, e.g. ejective affricated onsets were not recorded. The list of non-click replacements for the fricated palatal click is therefore non-exhaustive and has to be considered preliminary.

Replacement of this click type also occurs in the data from Menongue quoted in Traill and Vossen (1997, see Table 10.2 above), but replacements differ. In their data, one speaker replaced ⨎ with k and ⨎x with kx. To date, no thorough articulatory study on this click type and its replacements has become available; it is difficult to assess what might have led to replacement of ⨎ with two different non-click consonants. For now, it might be speculated that the velar stop k was blocked as a replacement in Mupa ǃXun because it already replaces a subset of the alveolar click ǃ.

Table 10.3 Replacements for the fricated palatal click ⨎ attested in Mupa ǃXun

Proto-Form    NW-ǃXun (W2) (cf. König and Heine 2008)    Mupa ǃXun
*ǂ            ⨎                                          t
*ɡǂ           ɡ⨎                                         d
*ǂx           ⨎x                                         tx
*ɡǂx          ɡ⨎x                                        tx

3.2 Loss of the Alveolar Click ǃ

The two speakers I worked with replaced a subset of their alveolar clicks with non-click consonants. The following replacements are attested in the data in Table 10.4. It can be observed that these non-click replacements for the alveolar click are remarkably similar to patterns observed in Kalahari Khoe (Fehn, this volume). Whether there really is a difference between kx and x as replacements for the fricated alveolar click ǃx and the voiced fricated alveolar click ɡǃx, respectively, cannot be answered at this point. As the phoneme inventory of Mupa ǃXun has not yet been established, we know very little about the variety’s overall complexity and the accompaniments that do occur. It is quite possible that even without taking click loss into account, the click inventory of Mupa ǃXun is already much reduced in comparison to related dialects such as König and Heine’s (2001) Ekoka ǃXun. In Table 10.5, examples for correspondences between forms in Juǀ’hoan and Northwestern ǃXun (W2), and forms in Mupa ǃXun are provided. Note that the alveolar click is still retained with a subset of lexical items. These have been highlighted.

Table 10.4 Non-click replacements for the alveolar click attested in Mupa ǃXun

Proto-Form    NW-ǃXun (cf. König and Heine 2008)    Mupa ǃXun
*ǃ            ǃ                                     k
*ɡǃ           ɡǃ                                    ɡ
*ŋǃh          ŋǃh                                   ŋ
*ǃh           ǃh                                    kh
*ǃx           ǃx                                    kx
*ɡǃx          ɡǃx                                   x
*ǃx’          ǃx’                                   kx’
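The onset correspondences in Table 10.4 can be read as a small rewrite system. The sketch below is an illustration only and makes several assumptions that go beyond the text: forms are treated as plain strings, the click onset is matched as a string prefix with the longest onset tried first, and glottalized onsets are added as identity entries because the text reports them as retained. The names `ALVEOLAR_REPLACEMENTS` and `replace_onset` are hypothetical. Since click loss only affects a subset of lexical items in Mupa ǃXun, the function yields the expected clickless shape of a form and does not predict that a given lexeme has actually lost its click.

```python
# Onset correspondences copied from Table 10.4 (NW-ǃXun onset -> Mupa ǃXun onset).
ALVEOLAR_REPLACEMENTS = {
    "ǃ": "k",
    "ɡǃ": "ɡ",
    "ŋǃh": "ŋ",
    "ǃh": "kh",
    "ǃx": "kx",
    "ɡǃx": "x",
    "ǃx’": "kx’",
    # Assumption for this sketch: glottalized onsets (including delayed
    # aspiration ǃ’h) are listed as retained, so they map to themselves.
    "ǃ’": "ǃ’",
}


def replace_onset(form, table=ALVEOLAR_REPLACEMENTS):
    """Replace the click onset of a form, trying the longest onsets first."""
    for onset in sorted(table, key=len, reverse=True):
        if form.startswith(onset):
            return table[onset] + form[len(onset):]
    # No listed onset matches (e.g. nasal clicks written with n-): keep the form.
    return form


if __name__ == "__main__":
    for nw_form in ["ǃūlú", "ɡǃàlú", "ǃhúlú", "ɡǃxȍȁ", "ǃx’úí", "ǃ’ú", "nǃȁì"]:
        print(nw_form, "->", replace_onset(nw_form))
```

Matching the longest onset first matters because the plain click ǃ is a prefix of every other onset in the table; without that ordering, a form such as ǃx’úí would be rewritten with the replacement for the plain click.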


Table 10.5 Patterns of alveolar click loss in Mupa ǃXun, compared to Juǀ’hoan and König and Heine’s (2008) Northwestern ǃXun

Gloss

Juǀ’hoan (Dickens 1994)

NW-ǃXun Mupa ǃXun (König and Heine 2008)

ǃù’úrú (118)

ǃūlú (130) ǃȕhùn (128) ǃālí (137) ǃú (149) ǃò’m̀ (142) ǃàȁh (159)

[kūlú] [ǃùhŋ́ ] [kālí] [kúú] [kòm̀ ] [kȁȁh]

ǃāō (175) ǃȁhŋ́ (177) ǃūí (161) ǃùm (84)

[kùì] [kāō] [kȁhŋ́ ] [ǃūí] [kùm̀ ]

ɡǃú (171) ɡǃàlú (126) ɡǃà’ŋ̄ (133) ɡǃō (151) ɡǃȕn (153) ǃòmà (165) ɡǃú (171) ɡǃà’í (134) ɡǃáí (163) ɡǃȁȉ (134) ǃōbē (‘firewood’, 131)

[ɡàhm] [ɡűű] [ɡàlú] [ɡàʔŋ́ ] [ɡōō] [ɡùŋ̀ ] [ɡòmà] [ɡúú] [ɡàʔí] [ɡíí] [ɡȁȉ] [ɡōʋē]

ǃhúlú (127) ǃhú (82)

[kʰúlú] [kʰúú]

ǃ>k ‘claw, nail’ ‘enclosure’ ‘hard’ ‘name’ ‘knot’ ‘red-crested bustard’ ‘pus’ ‘three’ ‘tree’ ‘to be rotten’ ‘to plant’

ǃú (317) ǃòmm (104)

ǃùì (133) ǃãĩ�h̀ (310)

ɡǃ > ɡ ‘bed’ ‘belly’ ‘drum’ ‘full’ ‘ostrich’ ‘pestle’ ‘short’ ‘stomach’ ‘to go out’ (pl) ‘to sew’ ‘wildebeest’ ‘wood’

ɡǃahm (39) ɡǃú (41) ɡǃárú (70) ɡǃà’ín (83) ɡǃùn (127) ǃòmà (147) ɡǃú(-tzí) (41) ɡǃà’í (87) ɡǃáí (145) ɡǃaih (180)

ǃh > kh ‘elbow’ ‘horn’

ǃhúrú (73) ǃhú (96)


Table 10.5 Patterns of alveolar click loss in Mupa ǃXun, compared to Juǀ’hoan (cont.)

Gloss

Juǀ’hoan (Dickens 1994)

NW-ǃXun Mupa ǃXun (König and Heine 2008)

‘to kill’

ǃhún (314)

ǃhún (141)

[kʰúŋ́ ]

nǃòq’ḿ (149) nǃȍqm̄ (143) ’nǃóm (160) nǃún~ǃún (170) nǃon’a (176), nǃùàn nǃȁ (180) nǃm̀ (160) nǃȁì (51)

[ŋǃȍˤm̄ ] [ŋǃòm̀ , ŋǃòʔm̀ ) [ŋǃòˤm̄ ] [ŋǃóḿ] [ǃúń] [ŋǃũ̀ã̀] [ŋǃȁ] [ŋǃòm̀ ] [ŋǃàì]

nǃ (no replacement) ‘anus’ ‘navel’ ‘to lie’ ‘to ripen’ ‘to stand’ (sg) ‘to throw’ ‘to wait’ ‘stone’ ‘war’

nǃòqm (99) nǃò’m (118) nǃóm (139) nǃún (155) nǃo’àn (166) nǃòm (252) nǃàì (176)

nǃh (>ŋ) ‘kudu’ ‘to finish’ ‘to twist’

nǃhȍȁn (142) ŋ̀ ǃhãĩn (131)

[ŋǃʰũ̀ã̀] [ŋã̄ĩ]́� [ŋǃʰòòh]

ǃ’hòàn (116) ǃ’hm̀ (143)

[ǃ’hũ̀ã̀] [ǃ’hm̀ ] [ǃ’hã̀ã̀]

ǃxó (73)

ǃxàān (115) ǃxō (127) ǃxũ̄ṹ (153)

[ǃxãã̀ ̄, kxã̀ã̄] [kxōō] [kxúŋ́ ]

ɡǃxòà (103)

ɡǃxȍȁ (142)

[xȍȁ]

nǃhoan (104) nǃhoròh (172)

ǃ’h (no replacement) ‘calabash’ ‘leopard’ ‘well’

ǃ’hòàn (48) ǃ’hòm (107) ǃ’hàn (178)

ǃx > kx ‘bundle’ ‘elephant’ ‘ǃXun person’ ɡǃx > x ‘knee’

Table 10.5 Patterns of alveolar click loss in Mupa ǃXun, compared to Juǀ’hoan (cont.)

Gloss

Juǀ’hoan (Dickens 1994)

NW-ǃXun Mupa ǃXun (König and Heine 2008)

ɡǃx’ > x ‘to snore’

ɡǃx’ùnnù (167)

[xònò]

ǃx’òān (119) ǃx’úí (136) ǃx’ā (137)

[kx’ũ̀ã̄] [kx’úí] [kx’āā]

ǃ’ú (113) ǃ’āè (112) ǃ’áín (174)

[ǃʔúú] [ǃʔèè] [ǃʔã́ĩ]́�

ǃx’ > kx’ ‘charcoal’ ‘hair’ ‘heart’

ǃx’úí (91) ǃx’á (93)

ǃʔ (no replacement) ‘bone’ ‘to bite’ ‘to taste’

ǃ’ú (44)

Note: The orthographical representation of all forms has been retained from the original. For data taken from the dictionaries quoted, page numbers are indicated in parentheses.

Click loss has mostly affected the plain (ǃ), voiced (ɡǃ), aspirated (ǃh), affricated (ǃx) and affricated ejective (ǃx’) influxes. Only three forms with a plain alveolar click were found in the Mupa ǃXun data: [ǃūí] ‘to be rotten’, [ǃùhŋ́] ‘enclosure’ and [ǃúń] ‘to stand’. However, Heikkinen (1986) provides a form nǃùʰũʰ ‘enclosure’ for the Northwestern ǃXun variety of Ovamboland, and Mupa ǃXun [ǃúń] ‘to stand’ corresponds to nasalized forms in Juǀ’hoan and Northwestern ǃXun. These cognate forms might explain the retention, as nasal clicks are never replaced in Mupa; in this scenario, the loss of nasalization would postdate the click replacement. One of two speakers also retained the click with a fricative accompaniment in [ǃxã̀ã̄] ‘bundle’.

Regular retention was observed with nasal (nǃ), nasalized aspirated (nǃh), delayed aspirated (ǃ’h) and glottalized (ǃʔ) accompaniments. One speaker provided a clickless correspondence [ŋã̄ĩ́] for Northwestern ǃXun ŋ̀ǃhãĩn ‘to finish’. Here, loss of the click is accompanied by loss of the feature [+aspirated]. Whether this is a regular pattern cannot be answered at this point, due to a lack of additional examples.


The correspondences for the fricative x and the affricated stop kx are interesting. kx always corresponds to an affricated click in other Ju varieties. x corresponds to both the prevoiced affricated click ɡǃx and to its ejective counterpart ɡǃx’ (cf. ‘knee’ and ‘snore’). A preliminary analysis of the phoneme inventory of the ǃXun variety I recorded in Mupa did not yield any prevoiced affricated or prevoiced affricated ejective clicks. The contrast between affricated and prevoiced affricated clicks also appears to have been neutralized with the replacements of the fricated palatal click (cf. ‘rub’, ‘shiver’ and ‘scorpion’ in Table 10.2 above).¹¹ Whether the two instances of [x] replacements in Table 10.5 therefore constitute an irregularity or a reflex of a previously existent contrast cannot be answered until more data becomes available.

11  Loss of prevoicing with affricated and ejective clicks is a common feature of multiple dialects of Ju (cf. Sands 2010).

3.3 Loss of the Dental Click ǀ and the Lateral Click ǁ

Dental and lateral clicks are not regularly lost in Mupa ǃXun, with one exception: All speakers replace the plain dental click ǀ with a voiceless alveolar stop t in the case of the Ju negation particle ǀóá (cf. Heine and König 2016: 84) which hence becomes tóá in Mupa ǃXun. Only one of the two speakers we recorded occasionally replaced other instances of ǀ and ǁ with non-click consonants, in particular when the forms in question appeared in phrase-medial or phrase-final position. Loss of the dental and lateral click in lexemes pronounced in isolation was rarely observed. An exhaustive list of the dental and lateral click replacements present in the data is provided in Table 10.6.

Table 10.6 Non-click replacements for a subset of dental and lateral influxes as recorded with one speaker of Mupa ǃXun

Dental
Proto-Form    NW-ǃXun (cf. König and Heine 2008)    Mupa ǃXun
*ǀ            ǀ                                     t
*ɡǀ           ɡǀ                                    d
*ǀx’          ǀx’                                   t’
*ɡǀx          ɡǀx                                   tx

Lateral
Proto-Form    NW-ǃXun (cf. König and Heine 2008)    Mupa ǃXun
*ǁ, *ǃǃ       ǁ                                     kʟ̥
*ɡǃǃ          ɡǁ                                    ɡʟ̥
*ɡǃǃx’        ɡǁx’                                  kʟ̥’

The replacements for the dental click ǀ appear to match those for the fricated palatal click ⨎ (cf. Tables 10.2 and 10.3 above). However, a more thorough articulatory study would be needed to ascertain that they are realized in exactly the same way. Acoustically, they appear to be identical. The same replacements for the plain and voiced lateral clicks were also found with one of the speakers Traill worked with in Menongue (cf. Traill and Vossen 1997). Whether they occur more regularly with other speakers or in other dialects of Angolan ǃXun remains a question for future research. Table 10.7 exemplifies replacement of the dental click as found with our speaker:

Table 10.7 Patterns of dental click loss in Mupa ǃXun, compared to Juǀ’hoan and König and Heine’s (2008) Northwestern ǃXun (W2)

Gloss            Juǀ’hoan (Dickens 1994)    NW-ǃXun (König and Heine 2008)    Mupa ǃXun (Fehn field notes)
‘to wash’        ǀoah (177)                 ǀòȁ~ǀòȍ (75)                      [tòàh]
‘eye’            ɡǀà’á ǃ’ú ǃx’úí (75)       ɡǀà’ā ǃʔú (129)                   [dā’à kōē]
‘foot’           ǀx’áí (317)                ǀx’áí (132)                       [t’éí]
‘mucus’          ǀx’òàn (117)               ǀx’ùàn (148)                      [t’ṹã̀]
‘to be tired’    –                          ǀx’úí (176)                       [t’úí]
‘to urinate’     ɡǀxàm (180)                ɡǀxàm (174)                       [txàm̀]

Note: The orthographical representation of all forms has been retained from the original. For data taken from the dictionaries quoted, page numbers are indicated in parentheses. Note that all Mupa forms are free variants that coexist with their [+click] counterparts.

In the 700-item wordlist the above data was taken from, the plain ǀ and voiced ɡǀ dental clicks were only replaced in two instances. The affricated ejective click ǀx’ was replaced three times; in all instances, the affrication was lost. The Juǀ’hoan and Northwestern ǃXun correspondences for ‘to urinate’ all have a prevoiced affricated click ɡǀx. The voicing is lost both in the click replacement, as well as in the [+click] form, which in Mupa ǃXun is [ǀxàm̀ ] (cf. footnote 11 above). The cases of lateral click replacement in the same dataset mentioned above are shown in Table 10.8.


Table 10.8 Patterns of lateral click loss in Mupa ǃXun, compared to Juǀ’hoan and König and Heine’s (2008) Northwestern ǃXun (W2)

Gloss            Juǀ’hoan (Dickens 1994)    NW-ǃXun (König and Heine 2008)    Mupa ǃXun (Fehn field notes)
‘medicine’       ǀ’ún-ǁáí (113)             ǀ’únǁāē (146)                     [ǀʔṹ-kʟ̥ēē]
‘neck’           ǃàìn (118)                 ǁȁŋ (149)                         [kʟ̥ȁŋ̏]
‘to be drunk’    –                          ǁāē (126)                         [kʟ̥āē]
‘to die’ (sg)    ǃáí (64)                   ǁāē (123)                         [kʟ̥ēē]
‘hand’           ɡǃáu (91)                  ɡǁàò (136)                        [ɡʟ̥àò]
‘rib’            ɡǃkàá ǃ’úa (138)           ɡǁx’ā (160)                       [kʟ̥āā ǃʔúà]

Note: The orthographical representation of all forms has been retained from the original. For data taken from the dictionaries quoted, page numbers are indicated in parentheses. Note that all Mupa forms are free variants that coexist with their [+click] counterparts.

Except for two examples, ‘medicine’ and (possibly) ‘to be drunk’, all instances of lateral click replacement concern lateral clicks which appear as reflexes of the Proto-Ju retroflex click *ǃǃ. Whether this is due to the reflex of the retroflex click having articulatory properties different from other lateral clicks cannot be answered at this point. The majority of instances in which the click is replaced concern a plain lateral click ǁ. Its voiced counterpart ɡǁ was replaced in one instance only. The prevoiced affricated ejective click in other Ju varieties corresponds to a devoiced, deaffricated click replacement (cf. ‘rib’). 4 Discussion This article discussed sound change processes affecting clicks in the Southeastern and Northwestern Ju dialects. The full click inventory of Proto-Ju [ǀ, ǂ, ǁ, ǃ, ǃǃ] has only been retained in the Central Ju varieties. In the Southeastern cluster, the click inventory was reduced to four influxes when the retroflex click *ǃǃ became merged with the alveolar click *ǃ. In the Northwest, *ǃǃ merged with the lateral click *ǁ, and in a subset of dialects, the palatal click *ǂ was affected by a sound change eventually resulting in a fricated palatal click ⨎ . The evolution of the Proto-Ju click inventory across the main sub-clusters is outlined in Figure 10.2 below. Although all sound shifts impacting clicks seem to trace genealogical subgroupings, their uniqueness may be questioned. The Proto-Kx’a retroflex click

353

Click Replacement and Loss in Ju Proto-Ju *[|, , ||, !, !!]

Southeastern *[|, , ||, !] Figure 10.2

Central *[|, , ||, !, !!]

Northwestern *[|, ~⨎, ||, !]

Evolution of click inventories in the Ju dialect cluster

*ǃǃ was retained in Proto-Ju, but merged with the lateral click *ǁ in its sister branch ǂ’Amkoe (Heine and Honken 2010: 27). The click therefore underwent exactly the same shift reported for Northwestern Ju, but evidently in a historically and geographically unconnected setting. It also seems futile to try and explain these sound shifts by extra-linguistic triggers, as has become the norm for click loss in Kalahari Khoe (Fehn, this volume). Following Traill and Vossen (1997), the replacement of one click by another should be interpreted as a lenition process (“weakening”, in their terms), i.e. a regular sound change. While the affected clicks do not necessarily become “weaker” in terms of intensity, an increase in friction may at least be observed for *ǃǃ > ǁ and *ǂ > ⨎ . The hypothesis that these shifts are by themselves a prelude to eventual click loss seems unlikely: The phoneme inventories of most Ju dialects which have been historically affected by click replacement appear stable over time, with no impending sign of click loss. The only exception is constituted by the northernmost Ju varieties spoken in southern and central Angola. Traill, Snyman and the present author all collected data from dialects belonging to the Northwestern cluster which were affected by click loss. As in the Khoe family, phonemic clicks are replaced by non-click consonants. In the case of Mupa ǃXun, all four clicks were partly affected, but no complete loss of any influx series could be observed. It is however interesting to note that the series most affected were the alveolar ǃ and the fricated palatal click ⨎ , with the latter going back to a proto-Ju palatal click *ǂ. This is remarkably similar to click loss in Kalahari Khoe, which primarily affects the alveolar ǃ and palatal ǂ influx series. Like the dialect recorded by Snyman at Menongue, Mupa ǃXun is spoken at the northern fringe of the Ju-speaking area. 
However, unlike in the scenario described by Bleek (1927: 56) who observed 20–30-year-old men dropping clicks, bilingualism in a neighboring Bantu language is not the rule, children grow up speaking Ju, and we could not observe a beginning shift to either Kwanyama or


Portuguese, the dominant European language of Angola. Social dominance by Bantu speakers, intermarriage, bilingualism and impending shift have all been described as sociolinguistic triggers for click loss in the Khoe language family. For the Ju scenario in Angola, they do not appear to provide a convincing explanation. It should however be noted that data from peripheral Ju varieties is still sparse. More comparative material from both male and female speakers from all age groups would be needed to assess the phenomenon from a broader perspective and to possibly detect patterns arising within speech communities not necessarily connected to outside forces. Still, click loss – as in Khoe (Fehn, this volume) and Tuu (Sands, this volume) – remains a feature of the periphery of the “Khoisan” speaking area in southern Africa. Evidently, clicks as a characterizing feature of the Kalahari Basin Area ‘Sprachbund’ (cf. Güldemann 1998) behave like other core features in that they show the highest density in the center, but become more erratic on the fringes (Sands and Gunnink 2019). This is certainly due to a higher exposure to languages not taking part in the ‘Sprachbund’, with regions like the eastern Kalahari and southern Angola acting as transitioning zones.

Bibliography

Bleek, Dorothea F. 1927. “Buschmänner von Angola.” Archiv für Anthropologie (Neue Folge) 21: 47–56.
Bleek, Wilhelm H.I., and Lucy C. Lloyd. 1968 [1911]. Specimens of Bushman folklore. (= Africana Collectanea, 28). Cape Town: C. Struik.
Dickens, Patrick J. 1994. English – Juǀ’hoan/Juǀ’hoan – English dictionary. Quellen zur Khoisan-Forschung / Research in Khoisan Studies 8. Cologne: Rüdiger Köppe.
Doke, Clement M. 1925. “An outline of the phonetics of the language of the ʗhũ̬ː Bushmen of north-west Kalahari.” Bantu Studies 2: 129–166.
Güldemann, Tom. 1998. “The Kalahari Basin as an Object of Areal Typology: A first approach.” Language, Identity, and Conceptualisation among the Khoisan, ed. Mathias Schladt. Quellen zur Khoisan-Forschung / Research in Khoisan Studies 15. Cologne: Rüdiger Köppe, pp. 137–169.
Güldemann, Tom. 2014. “‘Khoisan’ Linguistic Classification Today.” Beyond ‘Khoisan’: Historical Relations in the Kalahari Basin, ed. Tom Güldemann and Anne-Maria Fehn. Amsterdam and Philadelphia: John Benjamins, pp. 1–40.
Heikkinen, Terttu. 1986. “Phonology of the ǃXũ Dialect spoken in Ovamboland and western Kavango.” South African Journal of African Languages 6: 18–28.
Heine, Bernd, and Henry Honken. 2010. “The Kx’a family: A new Khoisan genealogy.” Journal of Asian and African Studies 79: 5–36.
Heine, Bernd, and Christa König. 2016. The ǃXun Language. A Dialect Grammar of Northern Khoisan. Cologne: Rüdiger Köppe.
König, Christa, and Bernd Heine. 2001. The ǃXun of Ekoka: A demographic and linguistic report. Khoisan Forum 17. Cologne: Institut für Afrikanistik, Universität zu Köln.
König, Christa, and Bernd Heine. 2008. A Concise Dictionary of Northwestern ǃXun. Quellen zur Khoisan-Forschung / Research in Khoisan Studies 21. Cologne: Rüdiger Köppe.
Miller, Amanda L. 2019. “Palatal click allophony in Mangetti Dune ǃXung: Implications for sound change.” Journal of the International Phonetic Association 49(2): 153–181.
Miller, Amanda L., Jeffrey Holliday, David Howcroft, Sara Phillips, Bridget Smith, and Tsz-Him Tsui. 2011. “The phonetics of the modern-day reflexes of the proto-palatal click in Juu languages.” Paper presented at the 4th International Symposium on Khoisan Languages and Linguistics, July 2011, Riezlern/Kleinwalsertal.
Miller-Ockhuizen, Amanda L., and Bonny Sands. 1999. “ǃKung as a linguistic construct.” Language and Communication 19: 401–413.
Miller-Ockhuizen, Amanda L., and Bonny Sands. 2000. “Contrastive lateral clicks and variation in click types.” Proceedings of ICSLP 2000, vol. II, pp. 499–502. China: Beijing.
Sands, Bonny. 2010. “Juu subgroups based on phonological patterns.” Khoisan languages and linguistics: Proceedings of the 1st International Symposium, January 4–8, 2003, Riezlern/Kleinwalsertal, ed. M. Brenzinger and Christa König. Quellen zur Khoisan-Forschung / Research in Khoisan Studies 24. Cologne: Rüdiger Köppe, pp. 85–114.
Sands, Bonny, and Hilde Gunnink. 2019. “Clicks on the fringes of the Kalahari Basin Area.” Theory and Description in African Linguistics: Selected Papers from the 47th Annual Conference on African Linguistics, ed. Emily Clem, Peter Jenks and Hannah Sande. Berlin: Language Science Press, pp. 703–724.
Snyman, Jan W. 1980. “The relationship between Angolan ǃXũ and Žuǀ’hõasi.” Bushman and Hottentot linguistic studies (Papers of seminar held on 27 July 1979), ed. Jan W. Snyman. Miscellanea Congregalia 16. Pretoria: University of South Africa, pp. 1–58.
Snyman, Jan W. 1997. “A preliminary classification of the ǃXũũ and Žuǀ’hõasi dialects.” Namibian Languages: Reports and Papers, ed. Wilfrid H.G. Haacke and Edward D. Elderkin. Namibian African Studies 4. Cologne: Rüdiger Köppe, pp. 107–116.
Traill, Anthony, and Rainer Vossen. 1997. “Sound change in the Khoisan languages: New data on click loss and click replacement.” Journal of African Languages and Linguistics 18: 21–56.
Vedder, Heinrich. 1910/11. “Grundriß einer Grammatik der Buschmannsprache vom Stamm der ǃKũ-Buschmänner.” Zeitschrift für Kolonialsprachen 1: 5–24.

Chapter 11

Production of Click Sounds in Acquired Apraxia of Speech: a View to the Motoric Nature of the Disorder

Anita van der Merwe and Mollie Steyn

1 Introduction

The purpose of the study reported in this chapter is to explore on both an auditory-perceptual and acoustic level the production of three Zulu clicks by a native African Bantu language speaker with apraxia of speech (AOS). This speech motor planning disorder in its acquired form results from brain damage in the dominant (left) hemisphere. Damage to the highest level in the speech motor control hierarchy results in AOS (Van der Merwe 1997, 2009). Areas proposed to be involved are cortical motor areas such as the left lateral premotor cortex (Brodmann’s area 6) and Broca’s area (Brodmann’s area 44 in the left hemisphere) (McNeil, Robin and Schmidt 1997, 2009; and Van der Merwe 1997, 2009). Current characterization of the salient features of AOS is largely based on studies of speech production in English and other Germanic languages such as German and Afrikaans (Miller and Lowit 2014; Van der Merwe, Uys, Loots and Grimbeek 1987; Van der Merwe and Le Roux 2014a, b; and Ziegler, Aichert and Staiger 2012). A study of the nature of speech errors in languages that present with unique sound system characteristics, such as the Bantu languages, may enhance our understanding of the underlying motor planning disruption in this enigmatic disorder. A phonetic level comparison of English and Afrikaans with Zulu and Xhosa, both of which are Bantu languages belonging to the Nguni group within this family of languages, brought Niesler, Louw and Roux (2005) to the conclusion that both Zulu and Xhosa are “substantially more phonetically complex and diverse than the remaining two languages” (p. 468) and that a greater variety of sounds are in active use than in English and Afrikaans (Niesler et al. 2005).
Idiosyncratic sound system characteristics of Bantu languages and the phonetically complex nature of some sounds may place unique demands on the motor planning system (Van der Merwe and Le Roux 2014a, b). Click sounds, which occur in some Bantu languages, are a group of diverse sounds with manifold articulatory, laryngeal and airstream features. Click sound production requires

© Koninklijke Brill NV, Leiden, 2020 | doi:10.1163/9789004424357_012


complex articulatory adjustments when coarticulated with other sounds during word or sentence production (Herbert 1990; Ladefoged and Traill 1994; Bernhardt and Stemberger 1998; and Thomas-Vilakati 2010). Accurate temporal and spatial planning of movements during click production is necessary and may pose particular challenges to the speaker with AOS. A segmental level analysis of how AOS affects click production could provide a view to the motoric nature of the disorder. The essential component of click production is the rarefaction of air enclosed between an anterior and a posterior articulatory closure within the oral cavity and the subsequent release of the forward closure. Air in the cavity between the two closures is rarefied by a downward and/or backward movement of the center of the tongue. The mechanism for producing a loud transient sound at the moment of release is the velaric ingressive airstream mechanism. The expression click accompaniment describes properties of the back oral articulation together with pulmonic activity, laryngeal setting, and nasal airflow that accompany the click. There are five click types that are widely recognized: bilabial, dental, lateral, alveolar, and palatal (Ladefoged and Maddieson 2008: 246–247; Ladefoged and Traill, 1994). Zulu and Xhosa use dental, lateral and postalveolar clicks. The three basic clicks are extended to 15 by the addition (accompaniment) of different combinations of aspiration, nasalization or voicing. Some of these combinations are phonemic and change the sound into a different phoneme (Niesler et al. 2005). The labels for the different clicks refer primarily to the frontal place of articulation. However, there has been enormous diversity in articulatory descriptions of the different aspects of frontal articulation by different authors (Ladefoged and Maddieson 2008; Ladefoged and Traill 1994; and Roux 2007). 
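The arithmetic behind the figure of 15 can be made explicit with a minimal sketch. It assumes five accompaniment categories per click type. The text above names only the ingredients (aspiration, nasalization, voicing), so the five labels below are illustrative assumptions rather than the authors' own inventory.

```python
from itertools import product

# The three basic click types used in Zulu and Xhosa, as named in the text.
BASIC_CLICKS = ["dental", "lateral", "postalveolar"]

# Assumption: five accompaniment categories, labelled here for illustration only.
ACCOMPANIMENTS = [
    "plain voiceless",
    "aspirated",
    "voiced",
    "nasalized",
    "nasalized with breathy voice",
]

def click_inventory(clicks, accompaniments):
    """Return every (click type, accompaniment) pairing."""
    return list(product(clicks, accompaniments))

inventory = click_inventory(BASIC_CLICKS, ACCOMPANIMENTS)
print(len(inventory))  # 3 click types x 5 accompaniments = 15
```

Each pairing is a candidate click consonant. Which pairings are phonemic in a given language is an empirical question that this enumeration does not settle.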
In view of this diversity in terminology, the wide range of clicks with different accompaniments, and the many variables that coarticulation of the click with surrounding sounds brings into articulatory manifestation, only three clicks were included in the current exploratory study (see Table 11.1 for a complete description of stimuli). These were first, the voiceless, lingual, lateral, alveolar click /ǁ/; second, the full breathy voiced (partially unvoiced), lingual plus pulmonic, medial plus nasalized, lamino-dental, click /ŋǀ/; and third, this same click but with other accompaniments and described as the voiceless, lingual, medial, lamino-dental click /ǀ/. Several diverse spatial and temporal (rate/timing) parameters of movement need to be planned during click production. For example, dental and lateral clicks are affricated with slow frontal release, whereas (post)alveolar and palatal clicks are sharply released (Ladefoged and Maddieson 2008: 256). Thus, accurate temporal control is integral to accurate click production. All three target clicks in this study are either lateral or dental and presumably need slow


affricated release. From a spatial perspective the major articulatory distinction between apical and laminal clicks is not place of articulation, but rather the part of the front of the tongue that is used and the shape and positioning of the tongue body and tongue root (see Proctor et al., this volume). The apical postalveolar click requires a greater degree of lowering of the center of the tongue for cavity expansion than does the laminal dental click. Lateral clicks are produced by lateral release which is attained by moving one side of the tongue at the level of the molar teeth (Ladefoged and Maddieson 2008: 255–257). Unilateral tongue movement during speech sound production is probably an idiosyncratic sound system characteristic of languages that utilize lateral click sounds. Another idiosyncratic feature is the simultaneous use of two air stream mechanisms during production of a sound. The second sound described above, the nasalized click /ŋǀ/, utilizes both the velaric and the pulmonic airstream. This sound is also produced with full breathy voice – vocal fold vibration does occur but without appreciable contact (Ladefoged and Maddieson 2008: 48). Planning of airstream release at the level of the vocal folds and of the air stream mechanism (velaric + pulmonic) is necessary for the latter sound, apart from all other specifications. The three click sounds in the current study appear to challenge speech motor control and the prediction is that individuals with AOS will display segmental level speech motor errors during production of words containing these sounds. The purpose of this exploratory study was to analyze the nature of errors in click production.

2 Background

Apraxia of speech (AOS) is a speech motor planning disorder and can be differentiated from aphasia, a language-based disorder, and dysarthria which is a speech execution problem.
These distinctions are driven by a three-level model of speech output that differentiates between disorders in linguistic-symbolic planning (aphasia), motor planning (apraxia) and speech execution (dysarthria). AOS has a long history of controversy regarding the nature and primary deficit of this speech disorder. Pathognomonic features of AOS suggest a motor-based disorder (McNeil, Pratt and Fossett 2004; McNeil et al. 1997, 2009; and Duffy 2013). The typical signs of AOS include sound distortions (inaccurate production of the sound due to spatial or temporal inaccuracies during production), “islands” of less impaired speech, voice onset time errors (voiced sounds are partially or fully devoiced and voiceless sounds are partially or fully voiced), distorted sound substitutions (the error is heard as a substitution, but not as a clear, well produced sound), extended vowel and consonant duration, and increased inter-segment and transition durations leading to slow speech and syllable by syllable production (syllable segregation) of the utterance


(Duffy, Strand and Josephs 2014; McNeil et al. 1997, 2004, 2009). Lengthened segment duration is regarded a kernel acoustic feature (McNeil et al. 2004: 399). Kinematic analysis of temporal and spatial movement features shows phase plane trajectories for closing gestures during speech that are decoupled in temporal-spatial relations (Forrest, Adams, McNeil and Southwood 1991). The disorder in temporal and spatial motor planning results in slow, struggling speech with inconsistent sound distortions. In a severe form the individual is unable to produce speech and cannot communicate verbally. In such instances, differential diagnosis from the expressive aphasias, such as Broca’s aphasia and conduction aphasia is difficult, and remains a contentious and clinically challenging matter. AOS often co-occurs with Broca’s agrammatic aphasia and differentiating between the contributing apraxic versus language level dysfunction is challenging and needs careful consideration. Broca’s aphasia is a language disorder which impacts syntactic planning, word finding, reading and written language. These abilities are not impaired in “pure” AOS (Duffy 2013; and McNeil et al. 1997, 2004, 2009). Inadequate differentiation between levels of breakdown displayed by research participants may invalidate the findings of research. The challenge of differentiating between language- and speech motor planning driven disorders in the presenting signs of research participants and clients in clinical practice could be addressed by research that further explores segmental level disruption, as in the current study. The signs of AOS have also traditionally been confused with phonemic paraphasias, such as sound substitution and phoneme sequencing errors which are a speech level sign of conduction and Wernicke’s aphasia. Phonemic paraphasias occur in the presence of typical speech rate and few or no sound distortions (McNeil et al. 1997, 2004, 2009). 
The lack of differentiation between phonemic paraphasia and AOS signs is largely due, first, to the incomplete modelling of the phases in the preparation of speech output and, second, to the implementation of a broad phonetic transcription method in speech error analysis as opposed to a narrow phonetic transcription approach. A broad phonetic transcription does not allow for the phenomenon of distortion of a sound during production and interprets such errors as phoneme substitutions, hence as linguistically driven phonological errors. A narrow phonetic analysis, which notes the presence of sound distortions that are subtle subphonemic errors, is the only way to detect and explain the nature of AOS signs. Most researchers in the field regard narrow phonetic auditory-perceptual analysis as the gold standard for analyzing AOS signs. The liability of perceptual analysis is its subjective nature, yet such analyses lead to a more comprehensive understanding of the nature of AOS (Shuster and Wambaugh 2000).
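The consequence of the transcription choice can be illustrated with a small sketch. It assumes a hypothetical set of narrow-transcription labels for ten attempts at one target sound and shows how a broad scheme, which has no category for distortion, recodes those tokens as substitutions. The labels and the sample data are invented for illustration and are not results from any study.

```python
from collections import Counter

# Hypothetical narrow-transcription labels for ten attempts at one target sound.
narrow_labels = [
    "correct", "distortion", "distortion", "correct", "substitution",
    "distortion", "correct", "deletion", "distortion", "correct",
]

def to_broad(label):
    """Recode a narrow label under a broad scheme with no distortion category.

    Following the description in the text, a distorted token is interpreted
    as a phoneme substitution.
    """
    return "substitution" if label == "distortion" else label

narrow_counts = Counter(narrow_labels)
broad_counts = Counter(to_broad(label) for label in narrow_labels)

print(narrow_counts["distortion"], narrow_counts["substitution"])  # 4 1
print(broad_counts["distortion"], broad_counts["substitution"])    # 0 5
```

In this invented sample the broad tally suggests a predominantly phonological error pattern, while the narrow tally shows that most of the errors are distortions.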


The challenge of differentiating the speech symptoms and signs of AOS from Broca’s aphasia and from phonemic paraphasia is one feature of the problematic nature of AOS, while the other is characterizing the nature of the deficit in AOS. The four-level model of Van der Merwe (1997, 2009) was designed to characterize disorders in speech sensorimotor control and clarify the underlying nature of motor speech disorders. The preparation of speech output is depicted in some detail and the traditional three-level model is expanded to include a fourth level or phase in the motor control hierarchy, as informed by neurophysiological and motor control data. This model can broadly guide differential diagnosis between linguistic and motor planning disorders and hypothetically characterize motor-based speech disorders. The phases in this model are linguistic-symbolic planning, motor planning, motor programming and motor execution. Linguistic planning includes syntactic, morphological and phonological planning. During motor planning, the “strategy” (Brooks 1986) for movement is determined by cortical motor areas, such as the prefrontal cortex and supplementary motor area, and during the programming phase “tactics” (Brooks 1986) are added to the plan by “lower” levels in the hierarchy such as the basal ganglia and lateral cerebellum. During motor planning the spatial and temporal parameters of the motor plan for the production of a sound are recalled from sensorimotor memory. Adaptation of motor plan parameters to the phonetic context, coarticulation of the series of movements and interarticulatory synchronization of movements of different structures for the production of a syllable, word or longer utterance are also planned. During motor programming, muscle-specific tone, velocity, direction, and range of movements are specified together with the repeated feedforward of motor programs.
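The ordering of the phases can be sketched as a simple data structure. This is a schematic illustration only: the phase names and the contents of the planning and programming phases are taken from the description above, whereas the class, field and function names are hypothetical, and the model itself is not presented by its author as a computational one.

```python
from dataclasses import dataclass
from enum import IntEnum

class Phase(IntEnum):
    """Phases of the four-level model, ordered from highest to lowest."""
    LINGUISTIC_SYMBOLIC_PLANNING = 1
    MOTOR_PLANNING = 2
    MOTOR_PROGRAMMING = 3
    MOTOR_EXECUTION = 4

@dataclass(frozen=True)
class PhaseSummary:
    phase: Phase
    specifies: tuple

# Contents paraphrased from the text. Execution is not detailed there,
# so it is left empty rather than filled in.
MODEL = (
    PhaseSummary(Phase.LINGUISTIC_SYMBOLIC_PLANNING,
                 ("syntactic planning", "morphological planning",
                  "phonological planning")),
    PhaseSummary(Phase.MOTOR_PLANNING,
                 ("spatial and temporal parameters",
                  "adaptation to phonetic context",
                  "coarticulation",
                  "interarticulatory synchronization")),
    PhaseSummary(Phase.MOTOR_PROGRAMMING,
                 ("muscle tone", "velocity", "direction", "range of movement")),
    PhaseSummary(Phase.MOTOR_EXECUTION, ()),
)

def phase_specifying(feature):
    """Return the phase whose summary lists the given feature, or None."""
    for summary in MODEL:
        if feature in summary.specifies:
            return summary.phase
    return None

print(phase_specifying("coarticulation").name)  # MOTOR_PLANNING
```

Using an ordered enumeration keeps the hierarchy explicit, so that a comparison such as `Phase.MOTOR_PLANNING < Phase.MOTOR_PROGRAMMING` reflects that planning precedes programming in the model.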
Within the context of this model, AOS is primarily characterized as a motor planning disorder which may impact programming secondarily (Van der Merwe 2009). The proposal of this model created an awareness of the differentiation between motor planning and programming. However, due to incomplete evidence of differential impairment in these two phases, AOS is currently referred to by leaders in the field as a motor planning and/or programming disorder (McNeil et al. 1997, 2004, 2009) or as a motor planning or programming disorder (Duffy 2013). Irrespective of the summarized label, the definition of McNeil et al. (2009: 264) is widely accepted and captures the essence of AOS as a motor speech disorder: Apraxia of speech is a phonetic-motoric disorder of speech production. It is caused by inefficiencies in the translation of well-formed and -filled phonological frames into previously learned kinematic information used for carrying out intended movements. These inefficiencies result in


intra- and interarticulator temporal and spatial segmental and prosodic distortions. It is not attributable to deficits of muscle tone or reflexes, nor to primary deficits in the processing of sensory (auditory, tactile, kinesthetic, proprioceptive), or language information. AOS and dysarthria are both motor speech disorders. Consistency and inconsistency in the presence of speech errors differentiate AOS from dysarthria. Dysarthria speech signs reflect a disorder in the execution phase of speech production and are due to muscle tone and movement coordination problems as a result of damage to the upper motor neurons, the basal ganglia, the cerebellum and/or the lower motor neurons. As the muscle tone and coordination problems are consistently present, dysarthria signs are consistently present and no “islands of error-free speech” occur. AOS errors appear to be inconsistent in occurrence. But, much controversy exists regarding the phenomena which could be consistent or inconsistent (Wertz, LaPointe and Rosenbek 1984; Mauszycki, Wambaugh and Cameron 2010; and Staiger, Finger-Berg, Aichert and Ziegler 2012). However, there is strong evidence that an individual speaker may have predictable difficulty in producing specific speech sounds (Wambaugh, Duffy, McNeil, Robin and Rogers 2005). Inconsistency (variability) appears to be influenced by task characteristics (Mauszycki et al. 2010; and Staiger et al. 2012). Contextual factors such as length of the utterance, motor complexity, speech rate, and novelty of an utterance may determine task difficulty (McNeil et al. 2004; and Van der Merwe 1997, 2009). Contextual loading (for example, longer, motorically more complex, and novel utterances) may induce an increase in the number and severity of speech errors (Van der Merwe 1997, 2009). In the limb movement literature, complexity is defined as the “number of parts or components and the amount of information-processing demands” of a skill (Magill 2007: 436). 
However, this too is an area of dispute. According to Ziegler, Aichert and Staiger (2012) we have no objective criteria to decide what is easier or more difficult to articulate for adult speakers. In spite of this objection to a motor complexity perspective on the speech signs and symptoms in AOS (Ziegler et al. 2012), a viable prediction would be that sounds that require multiple articulatory gestures would pose a challenge to a speaker with AOS. Click sounds could represent such motorically difficult sounds. Late development of clicks in the speech of young children could serve as evidence for high motor complexity. However, developmental data regarding the age of acquisition of click sounds by children are contradictory (Herbert 1990; Lanham 1969; Mowrer and Burger 1991; and Naidoo, Van der Merwe, Groenewald and Naude 2005). Methodologies vary among studies and this


prohibits direct comparison of results. Mowrer and Burger (1991) came to the conclusion that clicks are mastered early among Xhosa speakers. They observed correct production of the three clicks of Xhosa in the speech of children as young as 2.6 to 3.0 years. However, each click was elicited in a single word during a picture naming task. Accurate production of clicks in different phonetic environments and in the context of connected speech was not assessed. Their results are contrary to the findings of Naidoo et al. (2005) who came to the conclusion that clicks develop late during speech acquisition. In the latter study, data based on a 100-word spontaneous speech sample from 18 children in three age groups, varying between the ages of 3.0 and 6.2 (six per group), showed that the only click that was noted in the two youngest groups was the lateral alveolar click /ǁ/. In the eldest group, no click sounds were present. This result could be due to the criterion that at least five children in a group had to produce the sound correctly, based on a narrow phonetic analysis, for it to be considered present in the sound inventory of the age group (Naidoo et al. 2005). In their study on the acquisition of clicks by Xhosa-speaking children, Lewis and Roux (1996) noted many speech errors in production and also the occurrence of phonological processes, such as simplification and reduction, which simplify click production. They support the viewpoint of Lanham (1969: 159) that clicks are difficult to acquire by first language (L1) speakers and that “clicks are amongst the last consonants to be acquired”. These results collectively appear to argue for late development of accurate click production and by inference the high motor complexity of click sounds. Ladefoged and Maddieson (2008: 280) declare that clicks are easy to make and are perceptually salient. However, they do agree that the addition of diverse complex accompaniments provides a phonetic challenge.

3 Previous Research

To our knowledge only one study has been published in which click sound production in a speaker with an acquired speech disorder was explored. This was a study by Dogil and Mayer (1998). Data in the Dogil and Mayer study was collected during three recordings of a repetition test of 64 words, 26 of which contained clicks. Only 17 errors were made in three test sessions and no click errors occurred. Further analysis of spontaneous speech showed occasional click errors on some words. Their conclusion was that AOS does not impact click production negatively. But, a number of questions arise regarding participant selection, error analysis and the conclusions of this study. The participant in


their study had no problems with language comprehension, reading, writing, or grammatical output. She was reported to present with speech signs which included articulatory groping and phonetic distortions. These signs are consistent with AOS. However, they also report that the participant in their case study presented with “substantial language impediment” and “phonological substitutions and uncertainties” (p. 172). The phonological level errors (phonological substitutions and uncertainties) are contradictory to salient features of AOS. This fact is important as the nature of the disorder will determine the nature of the speech errors. If phonological level errors were present this may indicate the presence of phonemic paraphasia which is a disorder on a phonological planning, and not a motor planning, level (McNeil et al. 1997, 2004, 2009; and Van der Merwe 1997, 2009). The authors also declare that only phonological errors, in other words only substitutions or omissions were counted. A broad phonetic analysis method, as opposed to a narrow analysis, was implemented. The two different approaches will render different results in the case of AOS. A narrow phonetic analysis is necessary to capture subtle speech sound distortion which is a key sign of AOS (McNeil et al. 2004, 2009). In view of these problems the data from this study cannot be accepted as evidence regarding click sound production in AOS.

4 Purpose of the Study

The current study explored production of three clicks by a multilingual individual who presented with AOS. Click production across three recordings of three words, each containing a click sound, was analyzed perceptually and acoustically. Analyses were focused on only the click sound in the word. AOS signs were thus explored on a segmental level. Segmental level disruption of spatial and temporal parameters of speech sound production could contribute further evidence towards the motor-based nature of the disorder and be instrumental in the theoretical and clinical differentiation of AOS from speech level errors in Broca’s aphasia and phonemic paraphasia. The auditory-perceptual error analysis implemented a narrow phonetic transcription method which allows for the occurrence of distortion of sounds. The aim was to determine the presence of sound distortion (unclear production of the target sound due to mistiming of a movement or inaccurate spatial placement/features), deletion or substitution of the click sound. The acoustic analysis focused on total click duration, burst duration and click release to vowel onset duration (end of burst to point where formants of the following vowel are visible). An analysis of click


production across recordings may inform our understanding of features of AOS such as increased segmental length, sound distortion, and variability of error location and type.

5 Method

5.1 Participants

Purposive sampling was used to select the participant with AOS. The Participant was a multilingual 48-year-old African male who sustained a gunshot wound in the left hemisphere of the brain 10 years prior to the current study. His home- and first acquired language (L1) was Tswana and he could also speak Zulu, English and Afrikaans pre-onset. The Participant learnt Zulu in his community as a pre-school child. Urban African language speakers in South Africa are traditionally multilingual (Jacobson and Traill 1986). He received schooling in Tswana, English and Afrikaans and later university training in English. He was exposed to all four of these languages before the end of the first decade of life and could be classified as “early” multilingual (Kim, Relkin, Lee and Hirsch 1997). He could read and write in all four of these languages before the incident happened and also during the current study. Before and after the incident, English was the language most often used in the workplace. His home language continued to be Tswana. Two months after the Participant sustained brain damage, he was assessed by the first author. At the time, he displayed slow effortful speech, groping articulatory movements, syllable segregation, start-restart behavior, sound distortions, and he used gestures to augment his speech. These are all salient features of AOS (McNeil et al. 1997, 2004, 2009). Speech production was attempted only in Tswana, his first language. Communication was restricted to single words and short phrases. He displayed no receptive language problems as reported by the treating clinician and the Participant’s wife.
He displayed a minor right-sided hemiplegia and minor involvement of the tongue and lower quarter of the face on the same side, consistent with a mild unilateral upper motor neuron dysarthria. After this first assessment he received speech intervention for two years and the speech motor learning (SML) approach (Van der Merwe 2011) was followed. At the time of the current study (10 years post-onset) the Participant was able to communicate in full sentences in both Tswana and English. Speech was still mildly slow in these two languages, sound distortions occurred occasionally, some multisyllabic words were produced by separating the syllables and some self-corrections occurred. Production of Afrikaans and Zulu was more


effortful and still displayed all the characteristic signs of AOS in high frequencies. A diagnosis of AOS was made by agreement of the two authors and two other independent speech pathologists while listening to recordings of the speech of the Participant. At the onset of the current study the Western Aphasia Battery (WAB) (Kertesz 1982), which is widely used to assess language performance and skills, was applied to formally assess the language ability of the Participant. An aphasia quotient of 88.8 was attained. This score reflects a mild aphasia according to the WAB. However, a person with AOS will also attain low scores on the sub-tests that require speech. The WAB, published in 1982, was not designed to diagnose AOS or to provide a differential diagnosis between AOS and an expressive aphasia. Low scores on speech tasks caused by AOS are not considered by the WAB. The diagnosis of AOS with negligible, if any, co-occurring aphasia was made based on the following considerations: The Participant attained full score on all three receptive tasks (yes – no questions, word recognition, and following commands), the sentence completion task, and the responsive speech task which were all performed in English. On word fluency and naming tasks he displayed some articulatory errors and problems in producing long words. No perseveration, semantic paraphasias, or circumlocutions due to word finding problems were noted. He performed the picture description task in Tswana (L1) and was able to construct sentences of up to nine words with accurate syntax. For information content during picture description he attained full score. To assess comprehension of Zulu and Afrikaans he was requested to translate sentences of six to eight words into Tswana. He was able to do this accurately. The performance of the participant on the sub-tests of the WAB argues against the presence of aphasia and rather points to a diagnosis of AOS. 
Confirming this diagnosis is the fact that the Participant could go back to work two years after the incident when he was discharged from speech intervention. At that point in time his speech had improved to the extent that he was able to communicate in full sentences in both Tswana and English. He resumed his position as a stock clerk and the pay-slip manager at a casino. He still held this position during the current study. He also worked as lay preacher and delivered sermons in Tswana on a regular basis. No signs of dysarthria, such as paralysis/ paresis or asymmetry of oral structures, or involuntary movements were detected at the time of the current study. Three control participants were also included for the acoustic analyses. They were a 70-year-old female interpreter (who also acted as model during the recordings and as listener during the perceptual analyses) and also one male and one female individual in their mid-twenties. They were all first language Bantu language speakers and all acquired Zulu as a second or third


van der Merwe and Steyn

language during their pre-school years. Three control participants who varied in age and gender were regarded as sufficient as the typical acoustic spectra for the different click sounds have been well-documented (see discussion and references in Ladefoged and Maddieson 2008) and they all displayed typical spectrographic images. All participants gave their informed consent to take part in the study. Ethical clearance for the research was granted by a Faculty ethics committee.

5.2 Speech Stimuli
The speech stimuli for the current study consisted of two single Zulu words (ix̱ox̱o and ngc̱ono). The third word esingenamc̱amelo was produced in the context of the Zulu sentence Ngihlala esihlalweni esingenamc̱amelo (I sit on a chair without cushions). This was the third of a set of three sentences of increasing length (I sit; I sit on a chair; I sit on a chair without cushions). The click sounds that occurred in the three stimuli are highlighted and underlined in the words and the phonetic features are summarized in Table 11.1.

Table 11.1 The word stimuli and characteristics of the click sound in each word

| Words with clicks underlined, syllables indicated, and meaning | Phonetic symbol | Nature of airstream | Airstream mechanism | Airstream release | Place of articulation release |
|---|---|---|---|---|---|
| ix̱ox̱o (i.xo.xo) (mouse) | ǁ | Voiceless | Lingual (velaric) | Lateral | Alveolar |
| ngc̱ono (ngco.no) (better) | ŋǀ | Full breathy voiced | Lingual (velaric) + Pulmonic | Medial + Nasalized | Laminodental |
| esingenamc̱amelo (e.si.nge.na.m.ca.me.lo) (without cushions) | ǀ | Voiceless | Lingual (velaric) | Medial | Laminodental |


5.3 Procedures Three recordings of the speech stimuli were utilized for the purpose of the current study. The first recording was made 17 months prior to the other two. This first recording was an initial exploratory study in which AOS speech signs across four languages (English, Afrikaans, Zulu and Tswana) were compared during the production of 30 words and nine sentences in each language. The research was expanded at a later date and two further recordings were made. The last two recordings were made on the same day. Two single words and one word in one of the sentences contained click sounds and these were used as stimuli in the current study. During the last two recordings the whole list containing words and sentences was produced once and then again for a second time. The data of all three recordings were deemed suitable for the current study as procedures across all recordings were similar and because both occasions of recording were several years post-onset (eight years, five months and 10 years). During the intervening 17 months no significant changes occurred in his speech production ability. During all recordings the participant was requested to produce each target word once. Restarts and self-corrections were allowed. Only the final attempt was analyzed. The Participant had the opportunity to read the words and sentences before recording commenced. The control participant, who acted as model, read three words or sentences at a time and the Participant could read along. Then a prompt in English followed to repeat the three words or sentences. This procedure facilitated self-initiated production of the targets as opposed to imitated production. The Participant had the orthographic version available to read from. The words containing click sounds were extracted from the recordings and utilized as data in the current study. 
5.4 Perceptual Analysis of the Data Speech errors were judged perceptually by a native multilingual Bantu language speaker and the first author. The multilingual speaker spoke seven Bantu languages and English and Afrikaans fluently. She had 30 years of experience as interpreter and assistant for speech-language pathologists in a multilingual hospital setting and assisted with speech and language assessments. At the time of the study she was assisting two speech-language pathologists who specialized in medico-legal assessments of Bantu language speaking individuals who sustained brain damage. She was able to point out distortion of sounds and played a major role during analyses. The first author (L1 Afrikaans and L2 English) received five years of formal schooling in Sepedi, a Bantu language, and had previously worked in a hospital setting where speakers of


Bantu languages were treated. She had been doing narrow phonetic analyses of neuromotor speech errors for more than 30 years. Recordings were played repeatedly on more than one occasion to ensure that all errors were noted. A consensus analysis method was followed. Reliability analyses of all recordings were performed by the current authors who both have a background in Bantu language studies. These analyses confirmed the results of the first analysis. The focus in the present study was only on the production of the click sounds in the target words. The three possible error types were: click distortion (the click appeared to be distorted in spatial placement or the burst release was present, but not sharp and clear), click deletion (the sound was deleted from the word), or substitution of the click with another well produced click or other speech sound. Distorted substitution was another possible category, but it was difficult to make reliable decisions based on perceptual characteristics of the clicks and this category was omitted. Errors other than deletion or substitution were noted as distortions. Syllable segregation was accepted and not noted as an error as the authors regard that as a compensatory strategy and not a primary error. If there were restarts or attempts to self-correct errors, the final production of the word was analyzed. 5.5 Acoustic Analysis of the Data Audio recordings of click productions were acoustically analyzed using the Computerized Speech Lab (CSL) Main program (Model 4150B, Version 3.4; Kay PENTAX, 2010). The following procedure was followed for each click sample: A two-second portion of the waveform, with the click sound approximately at the center, was selected and used to generate a spectrogram. Spectrogram analysis parameters (see Table 11.2), which remained consistent across all click samples, were set so as to yield accurate measurements and images. 
IPA phonetic symbols of individual sounds were inserted in the upper transcription line of the waveform. Three measurements were then performed, using the waveform and spectrogram images, as well as the auditory signal thereof. Three temporal measurements were done: total click duration, click burst duration, and click release to vowel onset duration. First, the data mark was placed at the onset of the click burst and the data cursor at the onset of the subsequent vowel. The duration between these points was noted as the total click duration (but note that this does not include the closure portion of the click). The data cursor was then moved to the end of the click burst while the data mark remained fixed; this was noted as the duration of the click burst. Last, the period between the click release (end of the click burst) and the onset of the following vowel was determined. Waveform and spectrogram images of the initial two second selection were then saved and duration values were transferred from the CSL logbook to a Microsoft Excel spreadsheet for further analyses.
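The three temporal measurements described above follow from three landmarks per click: burst onset, end of burst (click release), and onset of the following vowel. A minimal Python sketch of that arithmetic is given here. The names `ClickLandmarks` and `click_durations` are hypothetical, and the absolute time points in the example are invented for illustration; only the resulting durations correspond to the values reported for C1 of ixoxo in recording 1 (Table 11.4). The study itself read these values from the CSL logbook.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClickLandmarks:
    """Three time points (in milliseconds) read off the waveform and spectrogram."""
    burst_onset_ms: float   # data mark: onset of the click burst
    burst_end_ms: float     # click release (end of the click burst)
    vowel_onset_ms: float   # onset of the vowel that follows the click


def click_durations(lm: ClickLandmarks) -> dict:
    """Return the three durations; pre-burst closure is not included."""
    if not (lm.burst_onset_ms <= lm.burst_end_ms <= lm.vowel_onset_ms):
        raise ValueError("landmarks must be in temporal order")
    return {
        # burst onset to vowel onset
        "total_click_ms": round(lm.vowel_onset_ms - lm.burst_onset_ms, 1),
        # burst onset to end of burst
        "burst_ms": round(lm.burst_end_ms - lm.burst_onset_ms, 1),
        # end of burst (release) to vowel onset
        "release_to_vowel_ms": round(lm.vowel_onset_ms - lm.burst_end_ms, 1),
    }


# Hypothetical landmark positions chosen so that the durations match
# the recording 1 values for C1 of ixoxo in Table 11.4.
example = ClickLandmarks(burst_onset_ms=100.0, burst_end_ms=119.7, vowel_onset_ms=151.5)
durations = click_durations(example)
print(durations)
```

Because the total is measured from burst onset to vowel onset, burst duration and release-to-vowel duration should sum to the total click duration, apart from rounding.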


Table 11.2 Spectrogram analysis parameters

| Analysis parameter | Setting |
|---|---|
| Analysis size (bandwidth) | 200 points (323.0 Hz)* |
| Window weighting | Hamming |
| Pre-emphasis level | 0.80 |
| Display range (y-axis) | 0–10,000 Hz |
| Frequency scale | Linear |
| Palette levels | 48.00–0.00 dB |

Note: * Analysis size (bandwidth) was changed to 100 points (234.38 Hz) for two samples, as this yielded more accurate measurements and images than the 200 points (323.0 Hz) setting. All other parameters remained consistent throughout analyses.
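Intra-rater reliability for the acoustic measurements was assessed point to point: an original and a repeat measurement counted as agreeing when they lay within 10 milliseconds of each other. The sketch below shows one way such a percentage could be computed. The function name is hypothetical, the example values are invented, and treating a difference of exactly 10 ms as agreement is an assumption, since the text does not say whether the bound is inclusive.

```python
def point_to_point_agreement(original_ms, repeat_ms, tolerance_ms=10.0):
    """Percentage of paired measurements that differ by at most tolerance_ms."""
    if len(original_ms) != len(repeat_ms) or len(original_ms) == 0:
        raise ValueError("need two non-empty lists of equal length")
    # Count pairs whose absolute difference is within the tolerance
    # (inclusive bound: an assumption, not stated in the chapter).
    agreements = sum(
        1 for o, r in zip(original_ms, repeat_ms) if abs(o - r) <= tolerance_ms
    )
    return 100.0 * agreements / len(original_ms)


# Invented illustrative values, not data from the study.
original = [51.5, 19.7, 31.8]
repeat = [55.0, 33.0, 30.0]
print(round(point_to_point_agreement(original, repeat), 1))  # 66.7
```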

To determine intra-rater reliability, 23% of the recordings were re-analyzed acoustically. Samples from each of the three duration measurements of each recording were randomly selected. Point-to-point comparisons between the reliability measurements and the original measurements were made. To be regarded as in agreement, the values of the original and reliability measurements had to be within 10 milliseconds of each other. Intra-rater reliability was calculated as 86.7%.

6 Results

6.1 Results of the Perceptual Analysis
The results of the perceptual analysis are summarized in Table 11.3. This analysis revealed that all three of the click sounds as produced by the Participant with AOS were incorrect during at least one of the recordings. The lateral click /ǁ/ in the word ix̱ox̱o was distorted across all three recordings in both the first consonant (C1) position and the second consonant (C2) position. The /ŋǀ/ click in the word ngc̱ono was deleted from the word during the first recording, but was produced correctly during the other two recordings. During all three productions of the sentence containing the word esingenamc̱amelo, the Participant displayed many restarts and attempts to self-correct errors. The nature of errors during click production varied across the three recordings. During the first recording the click was substituted with /sk/, and during the second and third recordings the click was present, but distorted. A range of errors, including sound deletion and substitution, occurred on the other sounds in this word.


Table 11.3 Click production characteristics across recordings 1 to 3 as judged perceptually

| Target word | Recording 1 | Recording 2 | Recording 3 |
|---|---|---|---|
| ix̱ox̱o | C1: Distorted; C2: Distorted | C1: Distorted; C2: Distorted | C1: Distorted; C2: Distorted |
| ngc̱ono | Click sound deleted | Correct production | Correct production |
| esingenamc̱amelo | Substitution (produced /sk/) | Distorted | Distorted |

To provide supportive evidence for the perceptual analysis, a sample of spectrographic representations of productions is included. Examples of the three words produced by the control participants can be viewed in Figures 11.1 to 11.3. In Figures 11.4 to 11.6 examples of the Participant’s productions are portrayed. The arrow in each figure indicates the position of the click sound. In the upper transcription line of the waveform the phonetic symbols indicate the sound that was produced, though in the case of the Participant these were in some instances not the target sounds. A comparison of the word ix̱ox̱o as produced by a control participant (Figure 11.1) and by the Participant with AOS (Figure 11.4) shows clear, sharp click burst spikes and a delayed click release¹ (end of burst) to vowel onset duration in the speech of the typical speaker, but more diffuse energy and a brief click release to vowel onset duration in the case of the Participant. Both clicks in this word were perceived as being distorted. The production of the click by the Participant in the word ngc̱ono (Figure 11.5) appears to be sharp and clear, with a longer click release to vowel onset duration than in the previous word. This production was perceived as correct. Click production in the word esingenamc̱amelo also shows diffuse energy with no clear spike (Figure 11.6) compared to the spectrogram of the control speaker (Figure 11.3). The production of this click by the Participant was perceived as being distorted. He displayed multiple restarts and attempts to self-correct the wide range of errors that occurred during production of this word, but did attempt to produce the latter part of the word that contained the click sound. Only the latter part of this word, in which the click sound occurs, is portrayed in Figure 11.6. The spectrographic representations appear to confirm the perceptual observations of distortion or correct production of the click.
1  This delay is indicative of a complete glottal stop closure after the click burst suggesting an ejected dorsal click release.


Figure 11.1

Production of ixoxo by a typical female Zulu second language speaker (control participant 1, recording 1)

Figure 11.2

Production of ngcono by a typical female Zulu second language speaker (control participant 1, recording 2)


Figure 11.3

Production of esingenamcamelo by a typical male Zulu first language speaker (control participant 2, recording 2)

Figure 11.4

Production of ixoxo by the participant with AOS (recording 2 – click judged perceptually as distorted)


Figure 11.5

Production of ngcono by the participant with AOS (recording 3 – click judged perceptually as produced correctly)

Figure 11.6

Production of esingenamcamelo by the participant with AOS (recording 3 – click judged perceptually as distorted)


6.2 Results of the Acoustic Analyses Due to the paucity of data we only applied descriptive statistics and the results are displayed in tables. No further analysis was performed. The results of the acoustic analyses are summarized in Tables 11.4 and 11.5. In Table 11.4 the total click duration, burst duration and click release (end of click burst) to vowel onset duration for each click produced by the Participant across recordings are presented. For the control participants as a group the mean and also the minimum and maximum values are presented for each measurement. The aim was to determine if the values recorded for the participant fall within the range of the upper and lower bounds of the control participants. Of the 18 values recorded for the Participant during production of the word ix̱ox̱o, only three are outside the range of the control group’s values (see Table 11.4). Total click duration is outside the range of control participants for C1 during recording 2 by 1.7 (80.5−78.8) milliseconds. Click burst duration is outside of the control group’s range during two of the recordings of C1. Across recordings 1 to 3 of C1, two of the values for burst duration are close to each other (19.7 and 27.3) while the third is longer (62.2 milliseconds). Two of these values are outside the normal range with the one value displaying a shorter and the other a longer duration. These results indicate variability in duration across recordings of the same word. Perceptually the production of ix̱ox̱o was perceived as distorted even though aspects of segmental duration were within the normal range in most instances (15 out of 18 opportunities). During production of the word ngc̱ono, the click sound was deleted by the participant during the first recording. During Recordings 2 and 3 of this word two of the values are outside the control range while three are close to the upper or lower bound of the control participants (see Table 11.4). 
The two productions during Recordings 2 and 3 were both noted as being produced correctly. The differences from the control group do not appear to be large. For example, total click duration was 62.2 and 69.7 milliseconds while the average of the control speakers was 65.2. Click burst duration tended to be shorter or close to the lower bound of the control speakers, while click release to vowel onset duration tended to be longer or close to the upper bound. For the analyses of the word esingenamc̱amelo the data of the control participant who acted as model during the recordings were excluded. Her production of the sentence was unnaturally slowed down, presumably because she knew the Participant struggled with the task. During the first recording the Participant did not produce the click, but substituted it with /sk/. The results of four out of the six duration measurements of the Participant were outside the range of the control participants (see Table 11.4). In some instances, duration was shorter than that of the control participants while in others duration was longer. Click production was noted as distorted during both Recordings 2 and 3.
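The comparison against the control group can be made explicit with a small classification routine. The sketch below is an illustration only: the function name is hypothetical, and reading "within 2.5% of maximum" as "at least 97.5% of the control maximum" (and correspondingly "at most 102.5% of the control minimum") is an assumption about how the markers in the note to Table 11.4 were assigned. The example values are the recording 3 measurements for ngcono and the corresponding control ranges from Table 11.4.

```python
def classify_against_controls(value_ms, ctrl_min_ms, ctrl_max_ms, near_fraction=0.025):
    """Label a participant value relative to the control minimum and maximum."""
    if value_ms > ctrl_max_ms:
        return "outside (longer)"
    if value_ms < ctrl_min_ms:
        return "outside (shorter)"
    # Inside the range: check closeness to either bound (assumed definition).
    if value_ms >= ctrl_max_ms * (1 - near_fraction):
        return "close to upper bound"
    if value_ms <= ctrl_min_ms * (1 + near_fraction):
        return "close to lower bound"
    return "within range"


# ngcono, recording 3, with control ranges as listed in Table 11.4.
ngcono_recording_3 = {
    "total click duration": (69.7, 59.2, 71.2),
    "click burst duration": (24.2, 24.2, 44.0),
    "click release to vowel onset": (45.5, 22.7, 44.0),
}
for name, (value, lo, hi) in ngcono_recording_3.items():
    print(f"{name}: {classify_against_controls(value, lo, hi)}")
```

Under these assumptions the three values are labelled close to upper bound, close to lower bound, and outside (longer), which matches the markers shown for this recording in Table 11.4.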


Table 11.4 Comparison of acoustic data of the Participant with AOS and control participants

| Word | Measurement (in milliseconds) | Participant: Recording 1 | Participant: Recording 2 | Participant: Recording 3 | Control: mean (minimum to maximum) |
|---|---|---|---|---|---|
| ixoxo | Total click duration†, C1 | 51.5 | 80.5 * | 57.7 | 54.6 (44.0–78.8) |
| ixoxo | Total click duration†, C2 | 57.6 | 57.7 | 42.5 | 65.6 (39.5–81.9) |
| ixoxo | Click burst duration, C1 | 19.7 ** | 62.2 * | 27.3 | 35.1 (24.3–43.9) |
| ixoxo | Click burst duration, C2 | 27.3 | 34.9 | 33.4 | 47.4 (21.2–68.3) |
| ixoxo | Click release (end of burst) to vowel onset duration, C1 | 31.8 | 18.2 | 30.4 | 19.5 (9.1–34.9) |
| ixoxo | Click release (end of burst) to vowel onset duration, C2 | 30.3 | 22.8 | 9.1 | 18.2 (3.1–39.5) |
| ngcono | Total click duration† | Not available for acoustic analysis: click was deleted | 62.2 | 69.7 *** | 65.2 (59.2–71.2) |
| ngcono | Click burst duration | Not available | 18.2 ** | 24.2 **** | 32.03 (24.2–44.0) |
| ngcono | Click release (end of burst) to vowel onset duration | Not available | 44.0 *** | 45.5 * | 33.16 (22.7–44.0) |
| esingenamcamelo | Total click duration† | Not available for acoustic analysis: click substituted with /sk/ | 65.1 * | 42.6 ** | 57.3 (50.0–65.0) |
| esingenamcamelo | Click burst duration | Not available | 39.3 * | 24.3 | 30.5 (18.2–37.9) |
| esingenamcamelo | Click release (end of burst) to vowel onset duration | Not available | 25.7 | 18.2 ** | 26.8 (18.8–38.0) |

Control values for ixoxo and ngcono: Control (N = 3), mean and minimum to maximum duration across 8 productions (2 + 3 + 3). Control values for esingenamcamelo: Control (N = 2).

Note: C = consonant; *Outside the range of controls (longer); ** Outside the range of controls (shorter); *** Close to upper bound (within 2.5% of maximum) of controls; **** Close to lower bound (within 2.5% of minimum) of controls; † Does not include (pre-burst) closure duration.


To determine if the Participant displayed greater variability in duration across the three recordings than the control participants, the range (difference between maximum and minimum values in milliseconds) of every measurement for each participant is summarized in Table 11.5. The Participant does not display consistently smaller (less variability across recordings) or larger (more variability across recordings) ranges. For each of the measurements the Participant displays an instance of a smaller range (in three instances) than any of the control participants. In two measurements the participant displays a larger range. This is for click burst duration and total click duration. No clear trends emerged from these data. Table 11.5 The difference between minimum and maximum acoustic measurements (range) per participant and control participants to demonstrate variability across recordings

Range across recordings (in milliseconds)

| Word | Measurement | Participant | Control 1 | Control 2 | Control 3 |
|---|---|---|---|---|---|
| ixoxo | Total click duration†, C1 | 29.0 | 33.3 | 6.3 | 9.1 |
| ixoxo | Total click duration†, C2 | 15.2 * | 41.0 | 18.0 | 21.2 |
| ixoxo | Click burst duration, C1 | 42.5 ** | 7.5 | 12.1 | 16.7 |
| ixoxo | Click burst duration, C2 | 7.6 | 6.1 | 22.5 | 47.1 |
| ixoxo | Click release (end of burst) to vowel onset duration, C1 | 13.5 | 25.8 | 9.2 | 7.6 |
| ixoxo | Click release (end of burst) to vowel onset duration, C2 | 21.2 | 34.9 | 4.6 | 27.3 |
| ngcono | Total click duration† | 7.5 | 6.2 | 10.6 | 4.6 |
| ngcono | Click burst duration | 6.0 | 9.2 | 4.6 | 7.6 |
| ngcono | Click release (end of burst) to vowel onset duration | 1.5 * | 2.9 | 12.1 | 9.1 |
| esingenamcamelo | Total click duration† | 22.5 ** | Excluded | 8.7 | 11.9 |
| esingenamcamelo | Click burst duration | 15.0 | Excluded | 18.9 | 13.7 |
| esingenamcamelo | Click release (end of burst) to vowel onset duration | 7.5 * | Excluded | 19.1 | 8.0 |

Note: * smaller range; ** larger range; † Does not include (pre-burst) closure duration.
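The range values in Table 11.5 are the difference between the largest and smallest duration obtained for a measurement across recordings. As a rough sketch of that calculation and of the smaller/larger marking, the hypothetical helpers below recompute two of the Participant's ranges for ixoxo from the durations in Table 11.4 and compare them with the three control ranges listed in Table 11.5. Marking a range as smaller or larger only when it lies beyond every control range is an assumption based on the wording of the text.

```python
def duration_range(values_ms):
    """Difference between the maximum and minimum duration, in milliseconds."""
    if not values_ms:
        raise ValueError("at least one value is required")
    return round(max(values_ms) - min(values_ms), 1)


def compare_with_controls(participant_range, control_ranges):
    """Return 'smaller', 'larger' or 'within' relative to all control ranges."""
    if participant_range < min(control_ranges):
        return "smaller"
    if participant_range > max(control_ranges):
        return "larger"
    return "within"


# ixoxo, C1 click burst duration across recordings 1 to 3 (Table 11.4).
burst_c1 = duration_range([19.7, 62.2, 27.3])
print(burst_c1, compare_with_controls(burst_c1, [7.5, 12.1, 16.7]))

# ixoxo, C2 total click duration across recordings 1 to 3 (Table 11.4).
total_c2 = duration_range([57.6, 57.7, 42.5])
print(total_c2, compare_with_controls(total_c2, [41.0, 18.0, 21.2]))
```

An excluded control participant is handled simply by leaving that participant out of the list of control ranges.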


7 Discussion
The present study aimed to explore the impact of AOS on the auditory-perceptual characteristics and the acoustic features of three Zulu click sounds. The production of these clicks by a multilingual L1 Tswana speaker, who learnt Zulu (L2) as a pre-school child and acquired AOS at the age of 38 years, was analyzed. Speech was recorded three times while the Participant produced a list of words and sentences. Three of the words contained click sounds and were analyzed in the current study. The results suggest that click sounds can be produced correctly or be distorted, deleted or substituted consistently or inconsistently across three recordings. None of the durational measurements were consistently longer or shorter than those of the control participants. Only nine of the 30 opportunities displayed duration measures outside the normal range. Five of these displayed a longer duration than the control measurements while four displayed a shorter duration. Some of the measures that were outside the bounds of the control group appeared as outliers compared to the duration of the other recordings of the same word (see, for example, total duration and burst duration of C1 in ix̱ox̱o across recordings – Table 11.4). It was also found that a click can be noted as perceptually distorted even though the duration measurements are within normal bounds. Other inaccuracies in production appear to underlie the distortion of click sounds. Inaccurate spatial planning, disrupted coarticulation of different articulators and mistimed interarticulatory synchronization during burst release and click release to vowel onset appear to be central to the distortion of clicks.

7.1 Nature of Speech Errors
The error types that were present were distortion, deletion, and substitution of the click. Speech errors that were noted are similar to the features of AOS as derived from error analysis in speakers of English. Sound distortion was the error type most often noted.
The click in the word ix̱ox̱o was distorted during all six opportunities to produce this lateral click. Spectrographically the distributed energy and the lack of a sound burst were noticeable (see Figure 11.4), supporting the perceptual findings. The voiceless lateral click, /ǁ/, is reported to be a highly marked click (Herbert 1990) and appears to be challenging for this Participant. The unilateral release of air, which can be on either side, probably needs careful motor planning to achieve well executed production. The click in this word had to be produced in the initial position of two CV syllables. Little coarticulation with other sounds is required and yet this sound


challenged the Participant, possibly pointing to high motor complexity of this click. The dental click, /ǀ/, in the word esingenamc̱amelo was also in error across all three recordings. During the first recording, the click was substituted with /sk/, while distortion occurred during the next two productions. This word was longer than the other two target words and was produced in the context of a sentence. An increase in utterance length is known to increase the frequency of errors (McNeil et al. 2004). Another dental click occurred in the word ngc̱ono. During the first recording of this word the click was deleted, but was produced correctly during Recordings 2 and 3. The spectrographic representation of this word (see Figure 11.5) appears to confirm the results of the perceptual analysis. This click is produced with full breathy voice and two airstream mechanisms have to be planned. However, these features did not appear to impact accuracy of production during the last two recordings. Segmental length of the click sounds was increased during five of the 30 opportunities. Slow rate of segmental articulation does not appear to be a constant feature during click production by a speaker with AOS as would have been predicted from the literature (McNeil et al. 2004). These results appear to suggest that not all segments are lengthened and could be a contra-indication that increased length is a primary sign of AOS. This result may also indicate that slow rate of speech in AOS may primarily be due to slow inter-segment and inter-syllabic transitions as mentioned by McNeil et al. (2004). This deduction in turn may suggest that slow rate is due to slow motor planning of multiple successive segments which increases complexity of the motor task (Van der Merwe 1997, 2009). Slow rate may therefore be a compensatory measure and not a primary symptom of AOS. 
Increased segmental duration would therefore not be a reliable index of the presence of AOS during differential diagnosis from Broca’s aphasia and phonemic paraphasia in conduction aphasia. Another significant finding is that durational measures were in a number of instances within the bounds of the control group, but perceptually the sound was perceived as being distorted. The word ix̱ox̱o is an appropriate model for this observation. This finding may suggest that the underlying cause for distortion is not primarily centered in duration of the sound, but rather in disrupted spatial parameters, coarticulation and interarticulatory synchronization of the movements. 7.2 Error Variability The consistency as opposed to variability of errors in AOS remains a contentious matter. The results of the current study contribute some further evidence towards the factors that drive error variability. Errors occurred consistently on


the clicks in the words ix̱ox̱o and esingenamc̱amelo, but the type of error was inconsistent on the latter word. The dental click, /ǀ/, in the latter word was probably less challenging for this speaker than the lateral click in the first word. The dental click also appears in the word ngc̱ono, but it has fewer accompaniments in the word esingenamc̱amelo. The /ŋǀ/ click in the word ngc̱ono was produced correctly on two occasions. It is possible that the longer word increased task complexity and that production of the click in esingenamc̱amelo was therefore consistently incorrect, though the errors were more variable. The consistent occurrence of distortion on the clicks in the word ix̱ox̱o shows that the Participant found the production of the lateral click particularly challenging. This result appears to confirm the prediction that individuals with AOS may have predictable problems with the articulation of specific sounds (Wambaugh et al. 2005). Rate of articulation of a sound in a particular word also appears to vary across different times of production. This was also the case for control speakers. However, in the case of this Participant, sudden and perhaps inadvertent instances of lengthened duration occurred. The occasional increase in length may point to variability in segmental length control. Similar results were found by Van der Merwe and Grimbeek (2006). In that study the participant with AOS displayed a reduced range in voice onset time compared to the control participants, but with occasional large outliers during repeated productions. This observation may correspond to the phenomenon that Staiger et al. (2012) call instability in the realization of specific articulations. The occasional increases in segmental length and the inconsistently present distortion, deletion or substitution of the clicks suggest error variability to be a feature of AOS when task complexity is moderate.
However, when task complexity is particularly high, the occurrence of errors and in particular sound distortion, seems to become consistent. There appear to be factors which may influence consistency and variability of error location and error type, but this phenomenon warrants further research. 8 Conclusions The perceptual speech errors that occurred during the production of click sounds were similar to those that occur in Germanic languages such as English (McNeil et al. 2004), German (Staiger et al. 2012) and Afrikaans (Van der Merwe, Uys, Loots and Grimbeek 1987). Click distortion, deletion and substitution were noted. The type of error varied across three recordings in the case of two of the words. Error occurrence was also not consistent in one of the words (ngc̱ono) as the click was produced correctly twice and deleted once. Though


increased segmental length is reported to be a core feature of AOS, this did not occur consistently across recordings in this study. Instances of shorter or longer duration than the control group or within the bounds of the control group were measured for the same sound. Inaccurate spatial features and disrupted coarticulatory movements and also errors in interarticulatory synchronization during click burst and click release to vowel onset are more likely the cause of distortion of clicks than slow rate of articulation. These preliminary findings also point to the negative impact of higher task complexity on speech accuracy in AOS. Higher motor complexity and increased length of utterances appear to induce speech errors which are consistent in occurrence, though not necessarily consistent in nature. The conclusions of this study are based on a limited set of data and are therefore mere indications. Future research should expand on this exploratory study. A greater number of participants with AOS should be involved and also a wider range of clicks in different and controlled phonetic contexts. To deepen our understanding of the impact of AOS on click production, additional objective measures such as cine radiology or dynamic palatography are necessary. This line of research appears promising and exploring the impact of the many idiosyncratic features of Bantu languages on AOS signs could deepen our understanding of this speech disorder (Van der Merwe and Le Roux 2014b).

References
Bernhardt, Barbara H., and Joseph P. Stemberger. 1998. Handbook of Phonological Development from the Perspective of Constraint-based Nonlinear Phonology. San Diego: Academic Press. Brooks, Vernon B. 1986. The Neural Basis of Motor Control. New York: Oxford University Press. Dogil, Grzegorz, and Jörg Mayer. 1998. “Selective phonological impairment: A case of apraxia of speech.” Phonology 15: 143–188. Duffy, Joseph R. 2013.
Motor Speech Disorders: Substrates, Differential Diagnosis and Management, 3rd edn. St Louis: Elsevier. Duffy, Joseph R., Edythe A. Strand, and Keith A. Josephs. 2014. “Motor speech disorders associated with primary progressive aphasia.” Aphasiology 28(8–9): 1004–1017. Forrest, Karen, Scott Adams, Malcolm R. McNeil, and Helen Southwood. 1991. “Kinematic, electromyographic, and perceptual evaluation of speech apraxia, conduction aphasia, ataxic dysarthria, and normal speech production.” Dysarthria and Apraxia of Speech: Perspectives on Management, ed. Christopher A. Moore,


Kathryn M. Yorkston and David R. Beukelman. Baltimore: Paul H. Brookes Publishing Co, pp. 147–172. Herbert, Robert K. 1990. “The relative markedness of click sounds: Evidence from language change, acquisition, and avoidance.” Anthropological Linguistics 32(1–2): 120–138. Jacobson, Monica C., and Anthony Traill. 1986. “Assessment of speech intelligibility in five South-Eastern Bantu languages: Critical considerations.” South African Journal of Communication Disorders 33: 15–27. Kertesz, Andrew. 1982. The Western Aphasia Battery. New York: Grune and Stratton. Kim, Karl H.S., Norman R. Relkin, Kyoung-Min Lee, and Joy Hirsch. 1997. “Distinct cortical areas associated with native and second languages.” Nature 388 (10 July): 171–174. Ladefoged, Peter, and Anthony Traill. 1994. “Clicks and their accompaniments.” Journal of Phonetics 22: 33–64. Ladefoged, Peter, and Ian Maddieson. 2008. The Sounds of the World’s Languages. Oxford: Blackwell Publishing. Lanham, Leonard W. 1969. “Generative phonology and the analysis of Nguni consonants.” Lingua 24: 155–162. Lewis, Philip, and Justus C. Roux. 1996. “A phonological process analysis of the acquisition and loss of clicks in Xhosa.” South African Journal of African Languages 16(1): 1–7. Magill, Richard. A. 2007. Motor Control and Learning: Concepts and Applications, 8th edn. New York: McGraw-Hill. Mauszycki, C. Shannon, Julie L. Wambaugh, and Rosalea M. Cameron. 2010. “Variability in apraxia of speech: Perceptual analysis of monosyllabic word productions across repeated sampling times.” Aphasiology 24(6–8): 838–855. McNeil, Malcolm R., Sheila R. Pratt, and Tapanta R.D. Fossett. 2004. “The differential diagnosis of apraxia of speech.” Speech Motor Control in Normal and Disordered Speech, ed. Ben Maassen, Raymond D. Kent, Herman F.M. Peters, Pascal H.H.M. van Lieshout and Wouter Hulstein. Oxford: Oxford University Press, pp. 389–414. McNeil, Malcolm R., Don A. Robin, and Richard A. Schmidt. 1997. 
“Apraxia of speech: Definition, differentiation, and treatment.” Clinical Management of Sensorimotor Speech Disorders, ed. Malcolm R. McNeil. New York: Thieme, pp. 311–344. McNeil, Malcolm R., Don A. Robin, and Richard A. Schmidt. 2009. “Apraxia of Speech.” Clinical Management of Sensorimotor Speech Disorders, 2nd edn., ed. Malcolm R. McNeil. New York: Thieme, pp. 249–268. Miller, Nick, and Anja Lowit. 2014. “Introduction.” Motor Speech Disorders: A CrossLanguage Perspective, ed. Nick Miller and Anja Lowit. Bristol: Multilingual Matters, pp. 7–28.

382

van der Merwe and Steyn

Mowrer, Donald, E., and Sharon Burger. 1991. “A comparative analysis of phonological acquisition of consonants in the speech of 2.6–6-year-old Xhosa- and English-speaking children.” Clinical Linguistics and Phonetics 5(2): 139–164. Naidoo, Yugeshiree, Anita van der Merwe, Emily Groenewald, and Elsie Naude. 2005. “Development of speech sounds and syllable structure of words in Zulu-speaking children.” Southern African Linguistics and Applied Language Studies 23(1): 59–79. Niesler, Thomas, Philippa Louw, and Justus C. Roux. 2005. “Phonetic analysis of Afrikaans, English, Xhosa and Zulu using South African speech databases.” Southern African Linguistics and Applied Language Studies 23(4): 459–474. Roux, Justus C. 2007. “Unresolved issues in the representation and phonetic description of click articulation in Xhosa and Zulu.” Language Matters 38(1): 8–25. Shuster, Linda I., and Julie L. Wambaugh. 2000. “Perceptual and acoustic analyses of speech sound errors in apraxia of speech accompanied by aphasia.” Aphasiology 14(5–6): 635–651. Staiger, Anja, Wolf Finger-Berg, Ingrid Aichert, and Wolfram Ziegler. 2012. “Error variability in apraxia of speech: A matter of controversy.” Journal of Speech, Language, and Hearing Research 55: S1544–S1561. Thomas-Vilakati, Kimberley, D. 2010. Coproduction and coarticulation in isiZulu clicks. California: University of California Press. Van der Merwe, Anita, Isabel C. Uys, Johannes M. Loots, and Jackie Grimbeek. 1987. “Die invloed van sekere kontekstuele faktore op die ouditief waarneembare foute by verbale apraksie.” [The influence of certain contextual factors on the perceptual symptoms of apraxia of speech]. South African Journal of Communication Disorders 34: 10–22. Van der Merwe, Anita. 1997. “A theoretical framework for the characterisation of pathological speech sensorimotor control.” Clinical Management of Sensorimotor Speech Disorders, ed. Malcolm R. McNeil. New York: Thieme, pp. 1–25. Van der Merwe, Anita, and Jackie Grimbeek. 
2006. “Variability of voice onset time, vowel duration and utterance duration in apraxia of speech.” Stem-, Spraak- en Taalpathologie 14 Supplement, Juni 2006. 5th International Conference on Speech Motor Control Nijmegen: Abstracts. Van der Merwe, Anita. 2009. “A theoretical framework for the characterization of pathological speech sensorimotor control.” Clinical Management of Sensorimotor Speech Disorders, 2nd edn., ed. Malcolm R. McNeil. New York: Thieme, pp. 3–18. Van der Merwe, Anita. 2011. “A speech motor learning approach to treating apraxia of speech: Rationale and effects of intervention with an adult with acquired apraxia of speech.” Aphasiology 25(10): 1174–1206.

Production of Click Sounds in Acquired Apraxia of Speech

383

Van der Merwe, Anita, and Mia le Roux. 2014a. “Dysarthria and apraxia of speech in speakers of selected African languages.” Motor Speech Disorders: A Cross-Language Perspective, ed. Nick Miller and Anja Lowit. Bristol: Multilingual Matters, pp. 125–142. Van der Merwe, Anita, and Mia le Roux. 2014b. “Idiosyncratic sound systems of the South African Bantu languages: Research and clinical implications for speechlanguage pathologists and audiologists.” South African Journal of Communication Disorders 61(1): 1–8. Wambaugh, Julie. L., Joseph R. Duffy, Malcolm R. McNeil, Don A. Robin, and Margaret A. Rogers. 2005. “Treatment guidelines for acquired apraxia of speech: A synthesis and evaluation of the evidence.” Journal of Medical Speech-Language Pathology 14(2): xv–xxxiii. Wertz, Robert. T., Leonard L. La Pointe, and John C. Rosenbek. 1984. Apraxia of Speech in Adults: The Disorder and its Management. Orlando, FL: Grune and Stratton. Ziegler, Wolfram, Ingrid Aichert, and Anja Staiger. 2012. “Apraxia of speech: Concepts and controversies.” Journal of Speech, Language, and Hearing Research 55: SI485–SI501.

Chapter 12

The ArtiVarK Click Study: Documenting Click Production and Substitution Strategies by Learners in a Large Phonetic Training and Vocal Tract Imaging Study

Scott Moisik and Dan Dediu

1 Introduction

This chapter presents preliminary results from a phonetic learning study of click consonants conducted with multimodal imaging of vocal tract structures using magnetic resonance imaging (MRI) and three-dimensional (3D) intraoral scanning. The click study is part of the ArtiVarK (sub)project (see e.g. Dediu, Janssen and Moisik 2017: 16), which was designed to quantify intra- and inter-population vocal tract variation and examine the relationship between this variation and variation in speech production. The aim of the click study was to ascertain whether and to what extent vocal tract morphology influences the ability to learn to produce clicks and, for those that do learn to produce them, the acoustic and articulatory characteristics of their productions. While the data of this study are still being analyzed to determine if vocal tract shape variation can predict details of speech production (such as the success at learning to produce clicks under carefully controlled phonetic training conditions), as a first step, we focus here on characterizing – in primarily qualitative terms – the variation in click attempts by our participants.

Clicks are among the most complex speech sounds to articulate, involving, in the simplest case, a rarefaction gesture formed most typically between the tongue and the palate. It has been suggested (Engstrand 1997; Traunmüller 2003; Demolin p.c.; Moisik and Dediu 2015, 2017) that individuals of the Khoisan group have a palate shape that is favorable for click production. One specific suggestion (Traunmüller 2003: 4) is that a prominent alveolar ridge (one which gives the palate a sharp sigmoidal contour when viewed in profile) requires more distortion of the tongue to form the lingual-palatal seal.
Various sources of evidence, such as Traill’s (1985) x-rays of ǃXóõ (Taa) speakers and measurements taken on palate impressions (van Reenen 1964; van Reenen and Allen 1987; Winkler and Kirchengast 1993), indicate that Khoisan palates are characterized by an anteriorly flat (smooth and mostly concave) profile. Not much is known about the palates of other linguistic communities in which clicks are used (such as Nguni Bantu languages with clicks, or Dahalo, a southern Cushitic language, or the isolates Hadza and Sandawe). There is some suggestion that the Hadza share some of the craniofacial characteristics of the Khoisan peoples (Ten Raa 1970; Traunmüller 2003), even though they are genetically distant from the San (Knight et al. 2003; although there is evidence for an ancient genetic link between the Hadza and the Khoisan populations; see Pickrell et al. 2012). Midsagittal tracings of plaster palate casts provide some indication that Hadza (see Sands, Maddieson and Ladefoged 1993: 73ff) – and even perhaps Dahalo (see Maddieson, Spajić, Sands and Ladefoged 1993: 28ff) – palates seem to fit the characterization applied to the Khoisan palate (see Figure 12.1). Of course, these observations are for only a handful of casts.

Figure 12.1 Tracings of a radiographic image (a), palate casts (b–d), and a midsagittal slice through a 3D intraoral scan (e) illustrating variation in anterior palate shape, particularly associated with the alveolar ridge (prominence of the alveolar margin). Note that the scale has been altered such that the incisors are all roughly the same size. Sources: (a) is from Traill (1985: e.g. pp. 102, 106–107), (b) and (c) are of two adult male Hadza speakers from Sands et al. (1993: 73), and (d) is of an adult male from Maddieson (1993: 28). The palate in (e) is from a Caucasian male who is a native speaker of English.

© Koninklijke Brill NV, Leiden, 2020 | doi:10.1163/9789004424357_013

In our previous studies (Moisik and Dediu 2015, 2017), we examined the proposed palate-shape bias using a simplified model of click production developed in ArtiSynth (www.artisynth.org). The results suggest that a larger alveolar ridge increases the muscular forces required to produce the click and reduces the rate of volume change during rarefaction and release, which presumably would have consequences for click acoustic properties (although acoustics were not simulated). While the model itself is subject to certain limitations, it provides a plausible causal mechanism for the proposed bias and shows that the bias seems to operate in the expected direction (favoring Khoisan-like palates, i.e. those lacking a substantial alveolar ridge prominence). Should such a bias be real, and should the distribution of palate shapes among the Khoisan peoples be more frequently of the sort favorable for producing clicks


with possibly better acoustic properties (e.g. more prominent bursts) that increase their perceptual distinctiveness (vis-à-vis stops, for example), this palate-shape bias hypothesis might help explain, in part, the rarity of phonemic integration of clicks (i.e. into languages beyond those of southern Africa in close proximity to Khoisan-type languages) through an “economy of effort” argument (e.g. Zipf 1949; Brosnahan 1961; Napoli, Sanders and Wright 2014). Moreover, the existence of the bias would serve as an example of how some of the variation found in speech sound systems might be driven by the vocal tract variation found across populations (Dediu et al. 2017) or other extra-linguistic factors, such as environmental conditions (Everett 2013; Everett, Blasi and Roberts 2015). To this date, clicks have not factored into the discussion very much, but a considerable amount of recent work is showing that the shape of the hard palate influences at least the fine details of speech production behavior (e.g. Baum and McFarland 1997; Honda, Fujino, and Kaburagi 2002; Tiede, Gracco, Shiller, Espy-Wilson and Boyce 2005; Brunner, Fuchs and Perrier 2009; Thibeault, Ménard, Baum, Richard and McFarland 2011; Weirich and Fuchs 2013; Stone et al. 2013; for a survey, see Lammert, Proctor and Narayanan 2013a, 2013b; also see Stavness, Nazari, Perrier, Demolin and Payan 2013). The click study component of ArtiVarK thus contributes to the examination of this biasing hypothesis by providing a large and rich set of experimental data on the matter. In addition, even though it does not provide insight into the behavior of native speakers of click languages, it also contributes to the need for research on individual variation in click production, as has been identified by Proctor et al. (this volume), and would indeed help establish our expectations about how people might vary in click production. 
2 Methods

To investigate this claim of a palate-shape bias on click production, we conducted a multiphase phonetic learning and production study with audio recording and multimodal imaging to capture information about acoustic and articulatory aspects of speech production and of morphological variation of the vocal tract. (As noted above, the click study was part of a larger project called ArtiVarK, which was designed to examine many more sounds than just clicks.) In this chapter, we will only present qualitative descriptions of our observations of the audio and MRI data, with some preliminary quantification. The statistical assessment of the 3D intraoral scan data in relation to our participants’ success (or lack thereof) with learning to produce clicks is reserved for future work.


2.1 Method for Phonetic Training of Clicks

In an initial phonetic training phase, our participants were taught, using only auditory and visual aids, how to produce (post)alveolar ([ǃ]) and dental ([ǀ]) clicks as they occur in an intervocalic [a] context (i.e. [aǃa] or [aǀa]). Sequences of coordinated vowel-click-vowel segments (i.e. [aǃa] or [aǀa]) rather than isolated clicks (i.e. [ǃ] or [ǀ]) were selected as the learning objective because the task needed to be challenging to obtain a good balance between successful and unsuccessful productions, to be consistent with the presentation of other non-click learning targets (needed for the remainder of the ArtiVarK study not discussed here), and also to represent as natural as possible a context to the learner while being language neutral. For each type of click, audio stimuli were presented six times with a brief (3 second) pause in between each production to provide time for the participant to produce the sequence and a longer (6 second) pause in between the third and fourth stimuli.

The audio stimuli used to aid in the training of clicks were provided by John Esling, former president of the International Phonetic Association (IPA): for each click, he recorded three very slightly different audio examples in intervocalic [a] context. The recordings were made at the University of Victoria (British Columbia, Canada) in an anechoic chamber using a ME 66 shotgun microphone and K6 powering module (both Sennheiser products) and sampled at 44,100 Hz at a depth of 16 bits. We take these to represent canonical productions of the corresponding clicks and to be consistent with the sorts of language-neutral productions one might hear being given to students of phonetics.

The visual stimuli were obtained from a figure showing radiographic traces of ǃXóõ clicks just prior to anterior-stricture release originally presented in Ladefoged and Traill (1984: 19, figure 9).
Alongside these images, we presented an orthographic form using a novel, non-IPA convention for the clicks that was intended to provide strong visual cues to the articulation of clicks. Note that the overall goal was to provide as many clues to the participants to help as many of them as possible be successful at learning to produce these clicks in an intervocalic context. The phonetic training session was automated using custom software, which presented instructions to the participant to follow and prompted the required elicitations. The entirety of each phonetic training session was recorded using an AT3035 cardioid condenser microphone (Audio-Technica) connected to a UA-25 external audio capture device (Edirol) sampled at 44,100 Hz at a depth of 16 bits. The recording environment was always a quiet office, but the location and hence room acoustics varied (because of variable room availability).
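The presentation schedule for the audio stimuli described in this section (six presentations per click type, a 3-second pause between presentations, and a 6-second pause between the third and fourth) can be sketched as follows. This is only an illustration and not the custom software used in the study: the function names are hypothetical, and the 1-second stimulus duration is an assumed placeholder, since the duration of the recorded stimuli is not reported.

```python
def pause_schedule(n_stimuli=6, short_pause=3.0, long_pause=6.0, long_after=3):
    """Return the pause (in seconds) that follows each stimulus except the last.

    The longer pause is placed after stimulus number `long_after`
    (1-based), i.e. between the third and fourth stimuli by default.
    """
    return [long_pause if i == long_after else short_pause
            for i in range(1, n_stimuli)]


def onset_times(stimulus_duration, pauses):
    """Return stimulus onset times (s), assuming equal-length stimuli."""
    onsets = [0.0]
    for pause in pauses:
        onsets.append(onsets[-1] + stimulus_duration + pause)
    return onsets


pauses = pause_schedule()
# Assumed stimulus duration of 1.0 s (hypothetical, for illustration only).
onsets = onset_times(1.0, pauses)
print(pauses)
print(onsets)
```

With these assumptions, one training block for a single click type lasts a little under half a minute; a different stimulus duration changes the onsets but not the pause pattern.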

Figure 12.2 Visual and auditory aids used during the phonetic training phase to instruct participants in the production of (post)alveolar [aǃa] (top row) and dental [aǀa] click (bottom row) target sequences. Note: The radiographic traces are from Ladefoged and Traill (1984: 19, figure 9) but were colorized to help participants easily identify vocal tract structures. Single examples of audio cues for each type of click are shown as waveforms and spectrograms.

2.2 Participants

The participants represent an ethnolinguistically diverse sample. For the full study (phonetic training, 3D intraoral scan, and MRI), we recruited 36 Dutch (19 females, 17 males), 15 North Indian (4 females, 11 males; native speakers of an Indo-Aryan group language), 19 South Indian (3 females, 16 males; native speakers of a Dravidian language), and 10 Chinese individuals (3 females, 7 males), giving 80 full-study participants in total. Additionally, phonetic training and intraoral scanning were obtained for 6 native English speakers (4 females, 2 males), 4 native German speakers (2 females, 2 males), 3 native Spanish speakers (2 females, 1 male), and 1 native Romanian speaker, giving a total of 14 partial-study participants. This makes for a grand total of 94 participants (80 full-study plus 14 partial-study participants). For all participants, we also collected very detailed background information (about place of origin, place of origin of parents, languages spoken, orofacial medical history, and so forth) using a questionnaire and also solicited information about prior phonetic training (from lectures, courses, degrees, and so forth; phonetic experience was graded on a scale from 1 to 5, with 1 signifying no prior exposure to phonetics, 2 signifying exposure through lectures, 3 signifying exposure through university-level courses, 4 signifying a degree in linguistics, and 5 signifying


an individual with professional experience working with phonetics). In this chapter, we consider data from all of the participants (not only those who participated in the full study).

2.3 Anatomical and Speech Production Imaging Using MRI and 3D Intraoral Scanning

These participants were then scanned using (i) static and real-time MRI and (ii) a 3D intraoral scanner (a digital dental impression system). The 3D intraoral scanning (the results of which are not presented in detail here) was conducted at the Department of Orthodontics and Craniofacial Biology, University Medical Center Radboud, Nijmegen, using a TRIOS system (3shape). Participants were first briefed about the scanning process and then seated comfortably in a dental unit; scanning covered the upper and lower dental arches and the hard palate, up to or just posterior to the transverse suture (corresponding roughly to the mid-point of the lingual surfaces of the second molars).

The MRI scans were obtained at the Donders Institute for Brain, Cognition and Behaviour (in the Donders Centre for Cognitive Neuroimaging or DCCN) on a 1.5T MRI system (Avanto, Siemens Healthcare) using a six-channel head and neck RF receiver coil array to enhance the signal in the region of the vocal tract. The static productions were imaged using a two-dimensional (2D) single sagittal slice T2-weighted turbo spin-echo sequence (acquisition time = 10 s, TE = 9.9 ms, TR = 380 ms, flip angle = 150°, slice thickness = 3 mm, field of view = 240 mm × 240 mm, voxel size = 3.0 mm × 0.94 mm × 0.94 mm). The purpose of the static scans was to obtain a much higher resolution image than what is possible with the real-time scans. As noted above, the acquisition time for the sequence was 10 s, which means participants needed to be trained to sustain a consonantal stricture for 10 s. Excessive movement during this acquisition time could result in poor image quality (especially blurring of the articulators). Thus, during the briefing process, participants were given detailed training on how to sustain a consonantal articulation, with [s] and [t] serving as examples. They were told how to hold the posture at the point in time when their tongue makes contact with the roof of their mouth and how to synchronize this with the MRI scanner. Participants were instructed that, if necessary, they could breathe during the scan acquisition process. Further, they were told that it was not necessary to produce any audible sound or sustain sound (if they chose to produce sound) for the entire duration of the scan. Most participants had some difficulty at first with this phonetic concept of sustaining a sound but eventually succeeded in producing the desired sustain time. During scanning, on occasion, the participant would accidentally change her articulatory posture, resulting in a blurry image. This did not happen very frequently, but


when it did, most of the time it was possible to redo the scan to obtain a better image. Images judged too blurry to be interpretable were simply not selected for analysis (see below).

The real-time productions were imaged using Siemens’ 2D single sagittal slice true fast imaging with a steady state precession (“true FISP”) sequence (acquisition time = 15 s, frame rate = 6.67 fps, TE = 1.09 ms, TR = 148.48 ms, flip angle = 49°, slice thickness = 5 mm, field of view = 160 mm × 160 mm, voxel size = 2.0 mm × 2.0 mm × 2.0 mm). This type of sequence is known to be associated with banding artifacts, which we observed in our data for some individuals, particularly if motions were excessive, and thus our participants were instructed to speak calmly and deliberately and avoid excessive movements. Some artifacts were occasionally observed, such as arc-shaped cardiac artifacts (see Proctor et al., this volume); other artifacts were much less commonly encountered, such as localized circular black-out patterns caused by dental fillings or, in only two cases, the presence of a dental wire (or retentiedraadje in Dutch). Development of the real-time sequence parameters was assisted by an MR physicist and the MRI technician of the DCCN. These parameters represent the best possible trade-off between frame rate and spatial resolution discoverable on the Avanto system with the help of their expertise.

In all cases, care was given to align the field of view to obtain a consistent midsagittal section through the participants’ vocal tract (as judged by the intersection of the midsagittal plane through the tip of the nose anteriorly and the point of maximum curvature of the visible portion of the posterior pharyngeal wall). Although participants’ heads were firmly fixed into the head and neck coil, small changes in participant head position did occasionally occur over the duration of the scan.
Whenever this was noticed to have occurred, if possible, an effort was made to correct the alignment. Prior to being scanned, participants were given extensive instruction on the procedure. In particular, they were instructed on how to sustain consonantal articulations for an extended period of time and on the appropriate speaking rate and elicitation schema (four citation forms, then one carrier sentence context, “I say X for him”, and then one final citation form) for the real-time sequences. During the session, participants produced a large number of different sounds (among these being clicks) that were part of the scope of the ArtiVarK study. Elicitation of the productions was guided always by communication on the part of the MRI operator (usually SRM) and by means of a script developed in Presentation (Neurobehavioural Systems). Before each scan, participants were presented with an example audio stimulus drawn from the same set used in the training phase along with the orthographic form (but not the radiographic


image). If requested, the example sound would be repeated until the participant confirmed readiness by producing it aloud. During the production, the participant was guided through the elicitation with text prompts only, including the novel orthographic form for the clicks. Although the sound was recorded for every scan using a FORMI-III (Optoacoustics Ltd.) dual-channel optical microphone system with noise cancelling, the quality of these recordings is not suitable for visual display (as waveforms or spectrograms) and thus these will not be further discussed here.

Ethical approval for the experiment was obtained as amendment number 45659.091.14 to the ethics approval for the “Imaging Human Cognition” project (Donders Center, Nijmegen) granted by the CMO Regio Arnhem-Nijmegen.

3 Results

Participants tended to split into three groups: (i) those that showed an (almost) immediate capacity to produce clicks, (ii) those that showed an awareness of the components of click production but lacked the necessary coordination of these, and (iii) those that did not produce clicks. Although the learning targets are technically unaspirated oral clicks, [ǃ] and [ǀ], productions were deemed to be successful (as clicks) if some evidence of lingual (velaric) ingressive production occurred (whether that be appropriate auditory characteristics in the recorded audio or appropriate articulatory characteristics in the imaging data); thus, this preliminary analysis considers click productions to be successful regardless of their secondary characteristics, i.e. whether they were nasalized (be this in the form of audible nasalization – either voiced or voiceless – or just velum lowering), glottalized, aspirated, or otherwise. The participants who did not learn to click used a rich array of phonetic substitutions, depending on click target type.
Typically, these were produced as either tongue tip/blade or dorsal articulations, with the former ranging from dental to deep palatal stricture (sometimes with strong retroflexion), and the latter tending towards velar stricture. Initiation was often pulmonic egressive and was accompanied by strong release bursts, but numerous participants used a glottalic egressive (ejective) mode instead. (Note that in all figures below, participant identification number, ethnic group [C = Caucasian, CH = Chinese, D = Dutch, NI = North Indian, SI = South Indian], and sex [F = female; M = male] are shown.)

3.1 Participant Production Audio Data

The participant production audio data were subjected to a detailed qualitative auditory analysis. In the analysis, productions were classified according to


general mode of production (as either clicks, ejective stops, unaspirated or aspirated voiceless stops, voiced stops, fricatives, or affricates) and place of articulation (dental, alveolar, postalveolar, retroflex, palatal, velar, uvular, and glottal); in the case of clicks, nasalization and lateral release were also noted. From this analysis, a general characterization of the frequencies of the different phonetic patterns observed for the two different types of clicks can be made. This is presented in Table 12.1 and Table 12.2. Only 54 (of 448) clicks (or 12%) had some audible voiced nasality and only 32 (7%) were classified as having a lateral release. (Note that characterization of the dorsal constriction in successful clicks was not carried out and it varies across tokens.)

Table 12.1 Frequencies of different airstream and/or manner of articulation patterns observed in the participant production audio

Production Type               [aǃa]          [aǀa]          Total
click                         227 (42.4%)    221 (40.9%)    448 (41.6%)
ejective                      104 (19.4%)     47 (8.7%)     151 (14.0%)
aspirated voiceless stop       45 (8.4%)      96 (17.7%)    141 (13.1%)
unaspirated voiceless stop    149 (27.9%)    137 (25.3%)    286 (26.6%)
affricate                       8 (1.5%)      31 (5.7%)      39 (3.6%)
fricative                       2 (0.4%)       8 (1.5%)      10 (0.9%)
voiced stop                     0 (0.0%)       1 (0.2%)       1 (0.1%)
Totals                        535            541            1076

Note: Column-wise percentages are given in parentheses (rounded to the nearest tenth).
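The column-wise percentages in Table 12.1 can be reproduced from the raw counts. The sketch below uses only the counts reported in the table; the variable and function names are illustrative and do not come from the study's own analysis scripts.

```python
# Counts from Table 12.1 as (count for [aǃa] target, count for [aǀa] target).
counts = {
    "click": (227, 221),
    "ejective": (104, 47),
    "aspirated voiceless stop": (45, 96),
    "unaspirated voiceless stop": (149, 137),
    "affricate": (8, 31),
    "fricative": (2, 8),
    "voiced stop": (0, 1),
}


def column_percentages(counts):
    """Return column totals and, per production type, the percentages
    for each target column and for the pooled (Total) column."""
    totals = [sum(row[i] for row in counts.values()) for i in range(2)]
    grand_total = sum(totals)
    table = {}
    for label, row in counts.items():
        per_target = tuple(round(100 * n / t, 1) for n, t in zip(row, totals))
        pooled = round(100 * sum(row) / grand_total, 1)
        table[label] = per_target + (pooled,)
    return totals, table


totals, percentages = column_percentages(counts)
for label, values in percentages.items():
    print(label, values)
```

The same function applies to any two-column count table of this shape, which makes it a convenient check when the counts are revised.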

Examination of those ‘successful’ productions reveals two main basic timing patterns. The first is characterized by relatively small positive Voice Onset Time (VOT; i.e. the period between click release burst and the onset of voicing); the second shows a relatively larger positive VOT typically associated with glottalization (not aspiration). These two cases are illustrated for each type of click target sequence in Figure 12.3 (top and middle rows, respectively). The discrepancy in VOT between the plain (top row) and glottalized (middle row) clicks is manifest for these particular examples. In addition to the increased positive VOT, the glottalized examples also show a degree of abruptness in the onset of phonation of the following vowel, supporting the assertion here that they are glottalized (Sands, p.c.).
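Voice Onset Time as defined here is simply the interval between the click release burst and the onset of voicing. A minimal sketch of that calculation from hand-annotated time points follows; the class, the field names, and the example times are hypothetical and are not measurements from this study.

```python
from dataclasses import dataclass


@dataclass
class ClickToken:
    """Hand-annotated landmarks for one click production (times in seconds)."""
    burst_time_s: float      # time of the click release burst
    voicing_onset_s: float   # time at which voicing of the following vowel begins


def voice_onset_time_ms(token: ClickToken) -> float:
    """Return VOT in milliseconds; positive when voicing follows the burst."""
    return (token.voicing_onset_s - token.burst_time_s) * 1000.0


# Hypothetical annotation values, for illustration only.
example = ClickToken(burst_time_s=0.50, voicing_onset_s=0.52)
print(round(voice_onset_time_ms(example), 1))
```

No threshold separating the plain from the glottalized pattern is proposed here, since the text characterizes the difference only in relative terms.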


Table 12.2 Frequencies of different place of articulation patterns observed in the participant production audio (focusing in all cases on the anterior-most stricture)

Target: [aǃa]
Place          click          ejective      stop          affricate     fricative      Total
dental           6 (2.6%)      1 (1.0%)      0 (0.0%)      0 (0.0%)     0 (0.0%)        7 (1.3%)
alveolar        28 (12.3%)    10 (9.6%)     26 (13.5%)     2 (25.0%)    0 (0.0%)       66 (12.4%)
postalveolar   122 (53.7%)     4 (3.9%)     16 (8.3%)      2 (25.0%)    0 (0.0%)      144 (27.0%)
retroflex        9 (4.0%)      0 (0.0%)     13 (6.7%)      0 (0.0%)     0 (0.0%)       22 (4.1%)
palatal         62 (27.3%)     0 (0.0%)      0 (0.0%)      0 (0.0%)     0 (0.0%)       62 (11.6%)
velar            0 (0.0%)     76 (73.1%)    55 (28.5%)     4 (50.0%)    0 (0.0%)      135 (25.3%)
uvular           0 (0.0%)     13 (12.5%)    13 (6.7%)      0 (0.0%)     1 (100.0%)     27 (5.1%)
glottal          0 (0.0%)      0 (0.0%)     70 (36.3%)     0 (0.0%)     0 (0.0%)       70 (13.1%)

Target: [aǀa]
Place          click          ejective      stop          affricate     fricative      Total
dental         150 (67.9%)    27 (57.5%)    32 (13.7%)    21 (67.7%)    1 (12.5%)     231 (42.7%)
alveolar        32 (14.5%)    12 (25.5%)    94 (40.2%)     4 (12.9%)    0 (0.0%)      142 (26.3%)
postalveolar    24 (10.9%)     5 (10.6%)    22 (9.4%)      6 (19.4%)    1 (12.5%)      58 (10.7%)
retroflex        1 (0.5%)      0 (0.0%)      9 (3.9%)      0 (0.0%)     1 (12.5%)      11 (2.0%)
palatal         14 (6.3%)      0 (0.0%)      2 (0.9%)      0 (0.0%)     0 (0.0%)       16 (3.0%)
velar            0 (0.0%)      2 (4.3%)     31 (13.3%)     0 (0.0%)     1 (12.5%)      34 (6.3%)
uvular           0 (0.0%)      1 (2.1%)     10 (4.3%)      0 (0.0%)     1 (12.5%)      12 (2.2%)
glottal          0 (0.0%)      0 (0.0%)     34 (14.5%)     0 (0.0%)     3 (37.5%)      37 (6.8%)

Note: Column-wise percentages (by click target type) are given in parentheses (rounded to the nearest tenth).

Figure 12.3 Waveforms and spectrograms illustrating plain (top row), (post-)glottalized (middle row), and nasalized (bottom row) clicks occurring as successful productions (but not necessarily matching the exact intended place) of the target (post)alveolar (a) and dental (b) click types. Curly braces indicate (positive) VOT; the arrow (a; bottom row) indicates some slight creakiness at the offset of the first vowel.

As pointed out, audible voiced nasalization of clicks was uncommon; an example for each target click type is shown in Figure 12.3 (bottom row). In each of these, voiced nasality was audible and the waveforms and spectrograms support this interpretation by way of relative damping of higher frequencies, particularly those above 1 kHz. As can be seen from the images, the nasalization both precedes and follows the click burst (as identified by the transcription). Preceding the nasality is a period of silence. The (a) example shows some signs of creakiness at the offset of the first vowel (arrow, Figure 12.3), indicating that there may be some pre-glottalization of this token. Such pre-glottalization was not uncommonly encountered.


Figure 12.4 Static MR images of participants successfully producing sustained articulations of [ǃːːːː] (top row) and [ǀːːːː] (bottom row) targets held at the moment prior to click release

3.2 Static MRI

Selected images of sustained productions of successful and unsuccessful click production attempts are shown in Figure 12.4 and Figure 12.5, respectively. Although most participants differentiated the alveolar and dental click targets in their productions (by some means), some participants evidently failed to perceive that the targets were different. For example, Figure 12.4a shows essentially identical postalveolar click postures; Figure 12.5a shows identical uvular articulations. The other examples in these figures demonstrate cases showing successful differentiation consistent with the nominal difference in anteriority between the two click types, that is, tongue position is typically more posterior for the alveolar click targets and more anterior for the dental ones. (Note that the standard way of differentiating click place of articulation is by reference to the posterior margin of the anterior stricture at click release; Ladefoged and Maddieson 1996: 250–251.)

Successful click articulations for the alveolar click targets shown here (Figure 12.4, top row) tend to have tongue tip retraction and form a palatal cavity with either laminal (b) or apical (c & d) stricture. The corresponding productions of the dental click targets (bottom row) appear to involve a broad apico-dental, lamino-alveolar stricture, but the place of the cavity (judged by the posterior edge of the anterior stricture, where it is visible) still appears to be quite deep (posterior). It is possible that this posteriority of the cavity was a result of the task (sustained static articulation), and this might indeed apply to all of the successful click productions (meaning that the visualization of click place of articulation these provide should be interpreted with this in


Moisik and Dediu

Figure 12.5 Static MR images of participants who failed to produce click postures for both [ǃːːːː] (top row) and [ǀːːːː] (bottom row) targets

mind). For (e), there appears to be full lingual-palatal contact (along the midline at least; see arrow, Figure 12.4). For those unsuccessful postures in Figure 12.5 (b–d), the alveolar-click-target cases (top row) are articulated with the dorsum, usually velar in place but varying in extent and anteriority of contact; the dental-click-target cases (bottom row) are articulated with the tongue tip (b) or blade (c) or even sublingually (d). The participant in Figure 12.5e demonstrates the situation in which one click target appears to have been (nearly) mastered (top row; here the articulation is a palatal click, instead of the intended alveolar target), but the other is not (bottom row; here the articulation is a broad apico-dental lamino-alveolar stricture). We also observed that participants often produced the clicks with a lowered velum (and usually, thus, an open velopharyngeal port). The deformation of the velum, which can be extreme, was particularly well imaged in the static sequences (Figure 12.4). The way in which it drops down from the level of the hard palate indicates it is highly active in producing the posterior lingual-palatal seal necessary for click production. This is consistent with the interpretation of dorsal articulations as characterized by an oropharyngeal isthmus (OPI) articulator rather than by the traditional active-passive tongue-focused model (Gick 2011; Gick et al. 2014). Note that the lowered velum for these clicks is not obviously just a consequence of sustained articulation (and participant breathing), since in the non-click productions (usually stops or ejectives) like those seen in Figure 12.5, the velopharyngeal port is very often tightly closed, as would be expected for an oral sound. Laryngeal state is more difficult to determine (as massing of tissue does not necessarily mean that the larynx is constricted but could just reflect a slightly


off-midsagittal cut). Some cases show clear abduction of the vocal folds (such as Figure 12.4b & e and perhaps Figure 12.5b). The other cases might indicate varying degrees of laryngeal constriction, but exactly how much is not easy to say; it is difficult to judge confidently from the sagittal slice of the larynx whether the vocal folds are in contact, whether vocal-ventricular fold contact is occurring, or even the degree of antero-posterior narrowing of the epilarynx (for an examination of laryngeal constriction in 3D MRI, see Moisik, Esling, Crevier-Buchman, Amelot and Halimi 2015).

3.3 Real-Time MRI

This section presents a selection of productions illustrating successful click productions (Figures 12.6, 12.7, 12.8, and 12.9) and various noteworthy cases of substitutions (Figures 12.10, 12.11, 12.12, and 12.13) as visible in real-time MRI (rt-MRI). The first two of these examples, Figures 12.6 and 12.7, demonstrate what appear to be fairly native-like productions (judging based on comparable articulatory data in Proctor et al., this volume) of the click types in the [aǃa] and [aǀa] targets, respectively. Place of articulation of these clicks, again, judged from the location of the posterior edge of the anterior stricture at click release, matches well with the target click types, being postalveolar and denti-alveolar, respectively. In Figure 12.6, the vocal folds adduct and appear to stay fully adducted throughout; however, vocal fold abduction is apparent in the production in Figure 12.7 (small arrowheads). Across speakers, the direction of lingual motion during the rarefaction sequence did not always proceed in the same way, although a retracting-lowering motion was most common (see arrows in Figures 12.6 and 12.7). 
The velum lowers in both cases, but, for the production in Figure 12.6, it does not clearly mean that the velopharyngeal port has opened, as there appears to be contact between the lower extent of the velum and the posterior pharyngeal wall (see arrowhead) even at click release. The production in Figure 12.7 much more clearly shows opening of the velopharyngeal port in conjunction with velum lowering (see large arrowhead). In both cases, the lowering is nearly synchronous with engagement of the posterior lingual stricture (frames 3 and 4 in Figure 12.6 and frames 5 and 6 in Figure 12.7). The production in Figure 12.8, which also appears to be a quite native-like instance of [aǃa], shows delayed velum lowering, only occurring towards the release of the anterior stricture (frames 27–29), and after this, it promptly returns to a closed position. Unlike native clicks (Ladefoged and Traill 1984; Thomas-Vilakati 1999), but not dissimilar to non-native clicks used in beatboxing (Proctor et al. this volume), we observed that the clicks produced by several of our participants featured full lingual-palatal contact (at least along the midline visible in sagittal


Figure 12.6 Real-time MRI frame sequence showing successful production of [aǃa] with a postalveolar click by participant 616098 (SI, M) Note: Numbers indicate frame number (out of 100; 1 frame = 150 ms); IPA characters loosely denote articulatory activity. Arrows provide an indication of lingual motion at click release. Arrowhead identifies contact between velum and posterior pharyngeal wall indicating closure of the velopharyngeal port (here just prior to click release). * = frame most closely aligned to click release.

Figure 12.7 Real-time MRI frame sequence showing successful production of [aǀa] with a denti-alveolar click by participant 615085 (CH, M) Note: Numbers indicate frame number (out of 100; 1 frame = 150 ms); IPA characters loosely denote articulatory activity. Arrows provide an indication of lingual motion at click release. The large arrowhead identifies lack of contact between velum and posterior pharyngeal wall indicating that the velopharyngeal port is (still) open just prior to click release. The small arrowheads indicate vocal fold abduction. * = frame most closely aligned to click release.

Figure 12.8 Real-time MRI frame sequence showing successful production of [aǃa] with a postalveolar click in which the velopharyngeal port opens late in click production by participant 616106 (SI, M) Note: Numbers indicate frame number (out of 100; 1 frame = 150 ms); IPA characters loosely denote articulatory activity. * = frame most closely aligned to click release. Note that there appear to be cardiac artifacts in frames 22–25 and a possible motion artifact associated with loss of tongue signal in frame 29.

section). For all rt-MRI examples shown thus far (i.e. those in Figures 12.6, 12.7, and 12.8), and typical for most MRI data of those participants who showed an ability to click, there is evidence of incomplete contact and thus an air cavity between the tongue and the palate. An example of complete lingual-palatal contact is seen in Figure 12.9 (frames 2–8). The place of articulation of this click is also exceptional. Although it is technically palatal (or even palatovelar) by dint of its passive articulation, the acoustic character gives an auditory percept much closer to [ǃ] (i.e. the targeted type) than a typical [ǂ] (Sands, p.c.). The very large front cavity combined with (probably) passive oropharyngeal stricture caused by the extremely retracted tongue position (arrow in frames 7–9) are likely causes for the low frequency acoustic character of this click (as typical palatals generally have concentration of higher frequency acoustic energy). Towards release, there is a resemblance to the static palatograms of what Beach (1938: 80–82) referred to then as cerebral clicks, insofar as the depth or posteriority of the anterior stricture is concerned. However, cerebral clicks occur with tongue tip retraction (cerebral is in fact synonymous with retroflex); yet the click produced in Figure 12.9 does not show signs of retroflexion (at least not compared, for instance, to retroflex stop productions in Figure 12.13). Instead, the tip and blade are initially pressed into


the postalveolar part of the palate and subsequently dragged backwards while maintaining their anterior orientation as the tongue slides along the palate. Beach notes that some speakers (e.g. of Xhosa) form “alveolar” clicks quite far back and with the cerebral/retroflex tongue posture, but also that the tongue is more “hollowed” in (post)alveolar clicks than palatal clicks. Nevertheless, some such “post-postalveolar” clicks have been identified as palatal previously, such as the [ǃ] in Juǀ’hoan (also spelled Žuǀ’hõasi; see Snyman 1980: 5) and the [ǃ] in Sandawe (Kagaya 1993, wherein the [ǂ] symbol is used instead). The velum shows extreme deformation (particularly in frames 7 and 8). Note that there is no clear frame showing cavity formation during this instance of the click, but we can infer that it is indeed a click based on the velic displacement in response to lingual suction. Additionally, in some subsequent productions, a very small cavity seems to appear just at the hard-soft palate junction (arrowhead, Figure 12.9). Because this click is primarily described from an articulatory view (rather than an auditory one because of the poor quality of the MRI audio), we characterize it as a palatal click (and here loosely use the symbol [ǂ]) on the basis of the passive articulator and the lingual cavitation pattern. It should be noted though that such a click, especially with oropharyngeal stricture (although uvulo-pharyngeal stricture is attested in palatal clicks found in Mangetti Dune ǃXung; see Miller 2016), is not truly comparable to any previously described click (Sands, p.c.). As was noted earlier, a number of our participants demonstrated some recognition of the two lingual strictures required to form the target sequence, [aǃa], but the lack of coordination between the lingual and velic gestures resulted in a failure to produce an audible click sound. Three slightly different examples of this are presented in Figure 12.10. 
These examples differ in terms of sequencing of the lingual gestures and the exact placement of the strictures, but, in all cases shown here, the posterior (dorsal) gesture precedes the anterior (coronal) one. In (a) and (b), a velar stricture is formed first (large arrowheads) followed by an apico-alveolar stricture (small arrowheads) in (a) and an apico-dental stricture in (b). In (a), the velar stricture is released prior to release of the alveolar stricture; however, in (b), the velar stricture is apparently maintained until full lingual release. The production in (c) is similar to (a) in terms of sequence, but involves a uvular rather than velar stricture and a postalveolar rather than alveolar stricture for the anterior component. In all of these, there is not much appreciable change in laryngeal-pharyngeal configuration and the velopharyngeal port appears tightly closed during the consonantal portions.

Figure 12.9 Real-time MRI frame sequence showing successful production of [aǃa] with full lingual-palatal contact yielding an unusual palatal click (although see discussion) by participant 625115 (SI, M) Note: Numbers indicate frame number (out of 100; 1 frame = 150 ms); IPA characters loosely denote articulatory activity. The large arrowhead identifies the approximate location of cavity formation (not visible in this particular sequence). The regular arrows indicate the location of oropharyngeal stricture. * = frame most closely aligned to click release.

The remaining rt-MRI illustrations (Figures 12.11, 12.12, and 12.13) present examples of substitutions for [aǃa] which show only a single lingual stricture. In Figure 12.11, two examples of an aspirated velar stop occur where a click is expected. The vocal fold abduction necessary for aspiration is more easily visible in (a) because these structures (small arrowheads in frames 5 and 6) move enough away from the imaging plane (which is 5 mm thick) that the amount of tissue visible in the slice is substantially diminished; in (b) the abduction is less clear (small arrowheads in frames 20 and 22), but it can still be inferred by comparison to the preceding and following frames showing [a] production (frames 18 and 24). Some participants ended up producing velar (and more rarely uvular) ejectives as substitutions, although (as Table 12.1 shows) this tendency was more common for the [aǃa] than the [aǀa] target sequences. One example of the former case is shown in Figure 12.12. Here the ejection is visible as a compression of the pharyngeal airspace achieved through tongue retraction (small arrows) and contraction of the pharyngeal constrictor muscles (as evident in a slight thickening of these structures in frames 21, 22, and 23) and larynx raising (compare the height of the base of the vocal folds across frames 20–24 using the solid white bar for reference).


Figure 12.10 Real-time MRI frame sequence showing awareness of anterior (coronal) and posterior (dorsal) lingual articulatory components of the [aǃa] target but poor coordination and absence of velum engagement by participants 402023 (a; top), 616087 (b; middle), and 619100 (c; bottom). Note: Numbers indicate frame number (out of 100; 1 frame = 150 ms); IPA characters loosely denote articulatory activity. Large arrowheads indicate posterior lingual stricture; small arrowheads indicate anterior lingual stricture.

The final example, shown in Figure 12.13, demonstrates substitution of [aǃa] by voiceless retroflex stops. In both of these, a sublingual stricture is formed through retroflexion of the tongue. The contact extends from a point posterior to the alveolar ridge (prominence) to the palatal roof, just posterior to the alveolar margin. In (a), the folding of the tongue tip is so extreme that the tongue almost folds back upon itself (such that the tip almost touches the anterior extent of the dorsum), and this does indeed occur for some productions. The voicelessness is evident in both cases by the apparent reduction of tissue mass around the vocal folds (arrowheads), especially so for frame 17 of (a). Intriguingly, while the participant in (a) is South Indian (and a native speaker of a Dravidian language, which are known to use sublingual articulation in the production of retroflex stops; see Ladefoged and Bhaskararao 1983) the participant in (b) is Dutch.

Figure 12.11 Real-time MRI frame sequences showing examples of aspirated velar stop [akʰa] substitutions for [aǃa] by 520060 (a; top) and by 520061 (b; bottom) Note: Numbers indicate frame number (out of 100; 1 frame = 150 ms); IPA characters loosely denote articulatory activity. Large arrowheads show velar stricture; small arrowheads show vocal fold abduction.

Figure 12.12 Real-time MRI frame sequence showing velar ejective stop substitution, [ak’a], for [aǃa] by participant 420044 (D, M) Note: Numbers indicate frame number (out of 100; 1 frame = 150 ms); IPA characters loosely denote articulatory activity. Arrows show tongue retraction; solid white line provides reference level to judge the degree of larynx raising.


Figure 12.13 Real-time MRI frame sequences showing examples of voiceless retroflex stop [aʈa] substitutions for [aǃa] by 616096 (a; top) and 521066 (b; bottom) Note: Numbers indicate frame number (out of 100; 1 frame = 150 ms); IPA characters loosely denote articulatory activity.
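The figure notes in this section identify events by frame number, with each sequence running to 100 frames at a stated 150 ms per frame. As a reading aid, the following minimal sketch converts frame indices into approximate time windows. It assumes that frames are numbered from 1 and that frame 1 begins at 0 ms; neither assumption is stated in the text, and the function names are hypothetical.

```python
FRAME_MS = 150  # frame duration as stated in the figure notes


def frame_window_ms(frame, frame_ms=FRAME_MS):
    """Return (start, end) in ms for a 1-indexed frame.

    Assumption (not from the text): frame 1 starts at 0 ms.
    """
    if frame < 1:
        raise ValueError("frame numbers are assumed to start at 1")
    return ((frame - 1) * frame_ms, frame * frame_ms)


def span_ms(first_frame, last_frame):
    """Return (start, end) in ms covering first_frame..last_frame inclusive."""
    start, _ = frame_window_ms(first_frame)
    _, end = frame_window_ms(last_frame)
    return (start, end)


# Frames 27-29: the delayed velum lowering described for Figure 12.8.
print(span_ms(27, 29))
# Frames 1-100: the full sequence length implied by the figure notes.
print(span_ms(1, 100))
```

Under these assumptions, a span of three frames covers 450 ms and a full 100-frame sequence covers 15 s.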

4 Discussion

As stated previously, the main goal of the ArtiVarK click study is to examine the link between hard palate morphology and the production of clicks, and this analysis is currently underway. However, the data we have gathered also represent a window into addressing the more basic question of how people learn to produce an unfamiliar speech sound and what phonetic substitutions they use (in the case of clicks) if they cannot match the target exactly. This is not a very well understood area in click research, although a recent study shows that infants can imitate correctly the gross place of articulation (at the level of the speech “organ”) when exposed to adult lip and tongue “smacks” (Diepstra, Trehub, Eriks-Brophy and van Lieshout 2017). The static and real-time MRI data give us a mostly clear view of all essential parts of the vocal tract (but, of course, limited to the midsagittal plane). From this we can make an assessment of some aspects of (non-native) click production which are difficult or impossible to judge with other traditional techniques for investigating clicks such as palatography or ultrasound (the article by Proctor et al., this volume, expounds on the virtues of real-time MRI research for examining native and non-native click production in further detail).


A common tendency in our data was for participants to produce clicks with a lowered velum. Some of these productions occurred as voiced nasalized clicks, which one might transcribe as [ǃᵑ] or [ǀᵑ] (implying, here, nasality throughout the greater part of the click). However, most clicks occurred without any obvious nasality – even voiceless nasality. The intended learning target sequences, [aǃa] and [aǀa], contain plain oral unaspirated clicks, thus it is intriguing that most participants showed a preference to lower the velum in forming the posterior stricture of the click gesture. We might be tempted to speculate that this is a property peculiar to learners of clicks. Moreover, it is indeed puzzling that, even though oral-nasal control for consonants is phonologically mandatory in the native languages of all of our participants, many participants did not master oral click production (Sands, p.c.). Perhaps it is somehow easier to form the posterior stricture when the velum is lowered. Neglecting the observation that a lowered velum is in some sense a default, minimal-effort posture for the respiratory system in general (as has been suggested by some; e.g. Matisoff 1975: 269; also see Boersma 1998: 22), dropping of the entire velum shares the burden of forming the posterior stricture between the tongue and soft palate. Maintaining a raised velum to close the velopharyngeal port presumably requires more work to be done by the tongue (to raise further upwards towards the soft palate than it would if the soft palate were lowered) in addition to the effort of velopharyngeal port closure. We also should not neglect the possibility of engagement of the velic traverse, a subpart of the soft palate (formed by the tissue spanning the faucial pillars). This structure can articulate independent of “nasovelum” movements (in the production of uvulars, at least; see Gick, Francis, Klenin, Mizrahi and Tom 2013). 
In our data, it seems to participate in forming the posterior stricture even in clicks with a closed velopharyngeal port. Finally, we note that, in the static MRIs, we cannot definitely say that velum lowering was not serving participant respiration (which they were permitted to do during these scans to prevent their discomfort). However, we do see ample evidence that our participants could produce static articulations sustained for 10 s without velopharyngeal port opening (in the case of the non-click productions). We also see usage of nasalization during the real-time MRI sequences, in which breath holding is not an issue. However, in our participant production audio, nasality was not commonly detected. It may be that participants who used a voiceless nasalized posture did not tend to produce much airflow during the click, making it impossible to hear whether there was nasalization or not in these cases. Alternatively, it is possible that nasality is much less common than the static MRIs would indicate. The pervasive use of velum lowering in these scans may reflect breathing to avoid breath holding


unconstrained by an explicit articulatory setting for the unfamiliar sound (and one that ends up possibly facilitating the click stricture gesture). This would be unlike the non-click substitutions that we observed in the static MRI, which evidently did have a closed velopharyngeal port specification. It would seem that even some native productions of supposedly oral clicks can occur with an open velopharyngeal port. This is most manifest in the nasal airflow data for the glottalized and “delayed aspiration” clicks in Nama discussed in Ladefoged and Traill (1984: 7–8), although these are indeed claimed to be phonologically [+nasal] clicks (despite their transcription) and do, when intervocalic, become fully nasal and cause nasalization of the preceding vowel (ibid.: 6). However, some nasal airflow occurs even in the strictly non-nasal clicks, such as [kǃ] or [kǂ] (using their transcriptions). Nasal flow also sometimes occurs for the glottalized clicks. The nasal airflow in these cases is small, but we would not expect much given that they lack a pulmonic egressive airstream. Such small flow could be generated by lingual motions compressing the pharyngeal air during click rarefaction (and the timing of this nasal flow just prior to the click burst is consistent with this interpretation). In the Proctor et al. (this volume) study, MR images of a single participant’s production of word-initial glottalized oral clicks (produced before close-mid back vowels) in Nama demonstrate that these are produced with a lowered velum. They suggest the transcriptions (also see Miller et al. 2009: 133), [ᵑǀ̊ ʔ], [ᵑǃ̊ ʔ], [ᵑǂ̊ ʔ], and [ᵑǁ̊ ʔ], to convey this opening of the velopharyngeal port (although it is not clear if any voiceless nasality, as would be conveyed by [ŋ̊ ], is actually audible in these tokens). 
Ladefoged and Traill’s (1984: 7–8) results for Nama show that, despite the occasional small velic leakage and in contrast to the glottalized oral clicks, velopharyngeal port closure is required for voiceless unaspirated and aspirated clicks. This is probably necessary to build up pharyngeal pressure (presumably for the purposes of generating strong release bursts and possibly aspiration) at the release of the posterior stricture (following the click proper). Reinforcing this are the re-tracings found in Ladefoged and Maddieson (1996: 250) of radiographic images of the five click types in ǃXóõ originally from Traill (1985): these clearly show a closed velopharyngeal port. Turning now to the larynx, among the substitutions observed, there was a tendency for participants to employ some form of laryngeal constriction in their productions. This may have occurred as an ejective or more simply as a glottal stop (on occasion preceded by very weak acoustic events suggesting velar and/or anterior lingual gestures). In fact, many click productions observed in our data could also be described as being pre- and/or post-glottalized. It is possible that participants were led to these productions by the training audio stimuli that we supplied. The phonetic training audio recordings provided by John


Esling (three for each type of click) were spoken carefully and at slow rate so as to provide clear examples of these sounds with mostly uniform segmental temporal spacing. The average VOT was found to be 0.114 s for the [aǃa] sequences and 0.052 s for the [aǀa] sequences. One consequence of this is that most of these productions might be mistaken as glottalized clicks by our participants (and perhaps by native speakers whose languages contain such sounds). In all cases, however, the phonetic training audio recordings show a gradual voicing onset, rather than an abrupt one as would be expected for glottalized clicks (at least in some languages, Sands, p.c.; however, cf. Brugman 2009: 42–43). Nasal airflow can also occur for glottalized clicks (Ladefoged and Traill 1984: 7–8), but no such nasal flow could be detected in the stimuli. Vowel amplitude at voicing onset following the click, as gauged by the average intensity observed for the first three glottal pulses, was found to be 56.08 dB for [aǃa] and 66.59 dB for [aǀa]; average peak vowel intensity was 77.22 dB and 79.66 dB respectively. (Examples of such gradually rising vowel amplitude for each click type can be seen in Figure 12.2.) These findings demonstrate that initial vowel amplitudes are quite low following the click, suggesting a gentle rather than abrupt phonatory onset following the click. Thus, we are confident that Esling’s productions most likely represent true plain unaspirated clicks, despite the long VOT. Nonetheless, it is entirely possible that this long lag in VOT would be perceptually interpreted by the participants as glottalization (i.e. as a period of glottal closure following the click) or even as ejection (which typically has a period following release of the oral stricture during which there is no phonation). 
This seems to be supported by the fact that a sizeable portion of the substitutions were produced with glottal constriction of some form: 24.30% (104 of 428) of all target sequences produced as stops were glottal stops (or 9.67% of all productions), and 14.03% (151 of 1076) of all productions were produced as ejectives. Further acoustic analysis needs to be made of those cases identified as glottal stops to determine if they indeed corroborate the auditory analysis. Participants often provided substitutions which demonstrated some awareness of the lingual components of click production, some of which showed both anterior and posterior components (as illustrated in Figure 12.10), but the lack of coordination of these in conjunction with velum movement resulted in a failure to produce a click. Some participants may have been influenced by the highly iconic non-IPA glyphs (see Figure 12.2) that were presented as stimuli during the elicitation. Interestingly, for ejective substitutions, there was a clearly greater tendency for these to be dorsal than coronal if the target was [aǃa]. If it was [aǀa], then it was usually coronal; stops showed similar frequencies of each across the two click types. Postalveolar and retroflex articulations were sometimes encountered in substitutions, and one might expect


that this would be a transfer effect of the L1 (since many of our participants were speakers of Indian languages), and in the case of the Indians, this seems to be the best explanation. The only problem is that several examples can be found of non-Indian individuals using these places in their substitutions (e.g. see Figure 12.5d and Figure 12.13b). Since retroflex stops were another sound taught to our participants in the phonetic training, it is possible that some of the non-Indians confused these with the clicks. While clicks are phonologically rare sounds, their general occurrence in speech should not be considered rare: they are common in paralinguistic usage (Gil 2013; Ogden 2013) and are excrescent (or “emerge”, following Ohala 1995) in phonetic variants of sequences of anterior and posterior oral consonants. Examples include [pʘk] in “ab Köln” or [nǃk] in “in Kiel” of German (see Simpson 2007) and [mʘɡ] in /ímbɡá/ ‘dog’ or [nǃŋ] in /ínhŋɡwârò/ ‘weapons’ of Kinyarwanda (Demolin 2016). Engstrand (1997) suggests that labial-velars (prevalent among African languages) might constitute another source for click-o-genesis (so to speak). From the authors’ own casual observations, clicks (including, minimally, [ʘ], [ǃ], and [ǁ]) are frequently used by parents when interacting with young infants, such as newborns and babies (and a formal study of the details surrounding this behavior is needed). Segmental clicks (i.e. those integrated into phonological sequences of consonants and vowels) are reported to occur in Chinese nursery rhymes such that nasalized postalveolar clicks replace initial velar nasals in some words (Nathan 2001). 
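The proportions of glottal-stop and ejective substitutions reported earlier in this discussion can be recomputed directly from the counts given in the text (104 of 428 stops; 104 and 151 of 1076 productions). The sketch below is only an arithmetic check; the variable and function names are illustrative. The combined figure at the end is derived here rather than reported in the chapter, and it assumes that the glottal-stop and ejective categories do not overlap.

```python
def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to two decimals."""
    return round(100 * part / whole, 2)


# Counts as reported in the text.
glottal_stops = 104      # substitutions heard as glottal stops
stop_productions = 428   # all target sequences produced as stops
ejectives = 151          # substitutions produced as ejectives
all_productions = 1076   # all productions

print(pct(glottal_stops, stop_productions))  # share of stops that were glottal
print(pct(glottal_stops, all_productions))   # glottal stops among all productions
print(pct(ejectives, all_productions))       # ejectives among all productions

# Derived here, not reported in the text; assumes the two categories are disjoint.
print(pct(glottal_stops + ejectives, all_productions))
```

The first three values agree with the reported 24.30%, 9.67%, and 14.03%. Under the stated assumption, roughly 23.7% of all productions involved a glottal stop or an ejective.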
In our data, we can observe that some individuals were able to produce clicks with relative ease, while others had significant trouble with the task.¹ Although we have not investigated here whether or not morphology is predictive of success at the task, we would offer the following interpretation about why morphology might be an important predictor. If there is a bias associated with palate shape on producing clicks, it ought to be only a weak one (otherwise, we would presumably have detected it before). In the biomechanical simulation of clicks (Moisik and Dediu 2015, 2017), we demonstrated that anterior palate shape might influence the efficiency of click production but, indeed, only weakly so (efficiency being gauged with muscular effort and volume change characteristics). While this mechanical model does suggest a causal explanation for the link between morphology and clicks, it does not account for perception. Successful perception of click articulation by our participants

¹ An anonymous reviewer suggests that, based on inspection of the failed productions presented here (in Figures 12.10–12.13), the anteriority component of palate shape variation identified in Lammert et al. (2013b) might be an important factor in predicting click production success/failure in our study.


is presumably required for them to know how to take the appropriate motor actions to respond with a successful click production when prompted. Those that benefit in click production by having a favorable palate shape might have a slightly increased proneness to producing spontaneous clicks, a slightly greater ease at producing these (e.g. in phonetic play), and perhaps even a slightly better ability at perceiving a click as a click. This is particularly important for clicks since it is often observed that they are not perceived as speech sounds by non-native click language speakers (Traunmüller 2003: 3) and often seem “external” to the speech signal itself. It has been shown experimentally that infant-like discrimination ability for nonnative contrasts remains intact for English adult speakers in the case of clicks (Best, McRoberts and Sithole 1988). The authors argue that this reflects the fact that the nonnative clicks cannot (unlike many other types of nonnative contrast) be mapped onto native (English) phonemic categories and so are instead processed auditorily (rather than phonemically). The question remains as to what the link is then between motor processes, in which a biasing effect might play a role, and perception. One answer to this might be found in the motor theory of speech perception (e.g. Liberman and Mattingly 1985), which posits that speech perception depends at least in part on “underlying motor processes” (ibid.: 2). The importance of sensorimotor information in infant speech perception has recently been demonstrated by Bruderer, Danielson, Kandhadai and Werker (2015). This study showed that limiting what the tongue can do by means of mechanical obstruction (teething instruments) hinders the discrimination of non-native contrasts in six-month-old infants. Oral somatosensation has been demonstrated to influence phonetic perceptual acuity (Ghosh et al. 
2010) and this type of information has a direct bearing on the motor control system through sensorimotor feedback (for a summary of several studies, see Perkell 2012). The implication is that the nature of the speech production apparatus including its somatosensory system has an influence on speech perception. At the level of cortical mapping, it is clear that the structure of the sensory end organs can have a dramatic influence on the cortical representation (as is demonstrated in experiments on rodents to manipulate the structure of the barrel cortex associated with the whiskers; see Erzurumlu and Gaspar 2012). Morphological variation might also be a factor in the nature of these complex interactions between the motor and perceptual systems and result in variation amongst individuals in their ability to produce and perceive sounds such as clicks. Those individuals with favorable palate morphology throughout their life may have a greater perceptual awareness of clicks because of the greater likelihood that they spontaneously click when speaking, perhaps produced clicks more frequently in phonetic

play early in life, and are perhaps better able to map auditory exposure to clicks to the motor control patterns necessary for their production. These notions are all still speculative, but we believe it is critical to consider both how production and perception might be affected by potential anatomical (morphological) biases. It should be pointed out here that several of the participants were briefly asked after the study was completed what they thought about the new sounds they learned, including the clicks. Many of the Indians, especially those from South India, observed that the [ǃ] was familiar to them because of its common paralinguistic functions, such as when catcalling actors upon the stage or when catching the attention of dogs or infants. Recognition of the click targets because of their salient paralinguistic use might help account for why some individuals were capable of successfully producing click sequences. We would like to emphasize here that we also believe that the task we presented to our participants was very difficult, despite it being as explicit and instructive as carefully-controlled uniform presentation would allow. As the study by Bradfield (2014: 26–28) shows, even click language speakers can have difficulty learning to produce unfamiliar (non-native) clicks. Given enough time and more standard phonetic pedagogical techniques and personalized tutorial, we are confident that we could indeed teach all of our participants to produce clicks in intervocalic [a] sequences with relative ease. Although the conditions of our training regime do not replicate the conditions of second language (L2) acquisition, the presence of clicks in Damin, a ritual language of the Lardil (Hale and Nash 1997), shows that clicks can be acquired in an L2 context. 
Another important detail to keep in mind when interpreting our results is that the stimuli used to train clicks were also language neutral phonetic exemplars, and it is not clear how more natural language specific stimuli might have influenced the results (if at all). The reader should also remember that our main interest is to examine the supposed effect of a most-likely weak morphological bias that would only become apparent under ‘stressed’ conditions. It is clear prior to any detailed analysis that our results will not demonstrate that certain people can click and others cannot simply and deterministically depending on the shape of their vocal tract anatomy. Rather we hope to investigate to what degree vocal tract morphology matters. However, we anticipate that even a full analysis will not be definitive without additional replications and more targeted experimental studies (clicks were not the only sounds that our participants were required to learn or produce and the study itself was very long in duration, taking upwards of three hours per participant). Such will be essential before making any deeper claims about what governs the

phonological integration of clicks, and why they appear in only a handful of the world’s languages.

4.1 Conclusions and Future Directions

We have documented the speech production behavior associated with a click-learning task, in which a diverse group of speakers of non-click languages were phonetically trained to produce dental and (post)alveolar clicks under carefully controlled conditions. For those that were successful at producing clicks, there was an especially strong tendency to employ nasalization, presumably as a way to facilitate formation of the back closure, but this was by no means universal. The data also support the view (Gick 2011; Gick et al. 2013, 2014) that (a portion of the) velum is behaving as an active articulator in the context of sounds produced with a “dorsal” component (here, clicks). We also observed an incredible richness in the types of substitutions made by those who were not successful at clicking, and these differ by place of articulation of the target click. Our participants frequently produced voiceless stops and ejectives instead of clicks, with a tendency towards ejection being slightly higher for the [ǃ] target. Some non-click attempts were sequences of dorsal and coronal stops (or double-articulations of these), indicating that some individuals were aware of the articulatory components of the clicks but lacked the correct coordination of these gestures. With regard to place of articulation, successful clicks were most often produced at the place of the intended target, but there were a large number of palatal clicks produced in the case of the [ǃ] target, while the [ǀ] target was replaced more often by alveolar or postalveolar clicks, and much less often by palatal clicks. The substitution by palatal clicks in these cases is somewhat surprising given that palatal clicks are considered to be the most difficult to learn (Sands, p.c.; and this may have to do with U-shaped maxillary dental arch morphology; see Beach 1938: 78).
A large number of unsuccessful attempts (non-clicks) were produced at velar or uvular place of articulation. Glottal stops were also moderately common for both the [ǃ] and [ǀ] targets. Moving forward, as we have indicated throughout, we hope to ascertain soon whether and to what degree our participants’ clicks and the substitution strategies used by those who did not manage to click are biased by vocal tract shape, particularly the shape of the anterior part of the hard palate. We plan to perform statistical analyses on a very comprehensive set of classic anatomical variables (both in the raw and under various data reduction approaches, such as principal components analysis and clustering based on psychometric principles) in relation to whether participants were successful at producing clicks or not. We also aim for a more advanced morphological analysis using

techniques such as non-rigid registration of the 3D intraoral scans (to establish correspondence for full 3D shape analysis) and methods from the field of Geometric Morphometrics (Zelditch, Swiderski and Sheets 2012). We further plan to provide a detailed quantification of lingual patterns in the real-time MRI sequences using active shape modeling (e.g. see Cootes, Taylor, Cooper and Graham 1995), and we hope to quantify acoustic properties of the clicks produced in the participant production audio and during the MRI productions.

Acknowledgements

We would like to thank the following people: all our ArtiVarK participants for their contribution to the project and for their patience and endurance during the many procedures; David Norris and Paul Gaalman for access to and piloting on the Avanto MRI scanner; Thomas Maal, Frans Delfos and Cees Kreulen for access to and help with the TRIOS intraoral scanner; Carly Jaques for participant recruitment and management; John Esling for recording the phonetic training materials; and Sabine Kooijman for assistance with ethics. We also are grateful for the generous support of the editor, Bonny Sands, for her numerous helpful suggestions, insights, and additions, and for the thoughtful and helpful comments provided by the anonymous reviewers. This work was funded by the Netherlands Organisation for Scientific Research (NWO) VIDI grant 276-70-022 to DD.

References

Baum, Shari R., and David H. McFarland. 1997. “The development of speech adaptation to an artificial palate.” The Journal of the Acoustical Society of America 102: 2353. Beach, D.M. 1938. The Phonetics of the Hottentot Language. Cambridge: W. Heffer & Sons Ltd. Best, Catherine T., Gerald W. McRoberts, and Nomathemba M. Sithole. 1988. “Examination of perceptual reorganization for nonnative speech contrasts: Zulu click discrimination by English-speaking adults and infants.” Journal of Experimental Psychology: Human Perception and Performance 14(3): 345–360. https://doi.org/10.1037/0096-1523.14.3.345. Boersma, Paul. 1998. “Functional Phonology.” Thesis, University of Amsterdam, Amsterdam, The Netherlands. Bradfield, Julian. 2014. “Clicks, concurrency and Khoisan.” Phonology 31(1): 1–49. https://doi.org/10.1017/S0952675714000025.

Brosnahan, Leonard Francis. 1961. The Sounds of Language: An Inquiry into the Role of Genetic Factors in the Development of Sound Systems. Cambridge: W. Heffer and Sons. Bruderer, Alison G., D. Kyle Danielson, Padmapriya Kandhadai, and Janet F. Werker. 2015. “Sensorimotor influences on speech perception in infancy.” Proceedings of the National Academy of Sciences, 201508631. https://doi.org/10.1073/pnas.1508631112. Brugman, Johanna C. 2009. “Segments, Tones and Distribution in Khoekhoe Prosody.” Ph.D. thesis, Cornell University. Retrieved from https://ecommons.cornell.edu/ handle/1813/13921. Brunner, Jana, Susanne Fuchs, and Pascal Perrier. 2009. “On the relationship between palate shape and articulatory behavior.” The Journal of the Acoustical Society of America 125(6): 3936–3949. https://doi.org/10.1121/1.3125313. Cootes, Timothy F., C.J. Taylor, D.H. Cooper, and James Graham. 1995. “Active shape models – their training and application.” Computer Vision and Image Understanding 61(1): 38–59. Dediu, Dan, Rick Janssen, and Scott R. Moisik. 2017. “Language is not isolated from its wider environment: Vocal tract influences on the evolution of speech and language.” Language & Communication 54: 9–20. https://doi.org/10.1016/j.langcom.2016.10.002. Demolin, Didier. 2016. “Clicks, stop bursts, vocoids and the timing of articulatory gestures in Kinyarwanda.” Studies in African Linguistics 45(1–2): 141–160. Diepstra, Heidi, Sandra E. Trehub, Alice Eriks-Brophy, and Pascal H.H.M. van Lieshout. 2017. “Imitation of non-speech oral gestures by 8-month-old infants.” Language and Speech 60(1): 154–166. Engstrand, Olle. 1997. “Why are clicks so exclusive?” Papers from FONETIK-97, the Ninth Swedish Phonetics Conference, held in Umeå. Reports from the Department of Phonetics, Umeå University (PHONUM) 4: 191–194. Erzurumlu, Reha S., and Patricia Gaspar. 2012. 
“Development and critical period plasticity of the barrel cortex: Barrel cortex plasticity.” European Journal of Neuroscience 35(10): 1540–1553. https://doi.org/10.1111/j.1460-9568.2012.08075.x. Everett, Caleb. 2013. “Evidence for direct geographic influences on linguistic sounds: The case of ejectives.” PLoS ONE 8(6): e65275. https://doi.org/10.1371/journal .pone.0065275. Everett, Caleb, Damián E. Blasi, and Seán G. Roberts. 2015. “Climate, vocal folds, and tonal languages: Connecting the physiological and geographic dots.” Proceedings of the National Academy of Sciences 112(5): 1322–1327. https://doi.org/10.1073/ pnas.1417413112. Ghosh, Satrajit S., Melanie L. Matthies, Edwin Maas, Alexandra Hanson, Mark Tiede, Lucie Ménard, Frank H. Guenther, Harlan Lane, and Joseph S. Perkell. 2010. “An investigation of the relation between sibilant production and somatosensory and

auditory acuity.” The Journal of the Acoustical Society of America 128(5): 3079–3087. https://doi.org/10.1121/1.3493430. Gick, Bryan. 2011. “Beware linguocentrism: The importance of vocal tract sphincters.” Presented at the Seeing Speech Workshop, Biosphere 2, Oracle, Arizona. Gick, Bryan, Peter Anderson, Hui Chen, Chenhao Chiu, Ho Beom Kwon, Ian Stavness, Ling Tsou, and Sidney Fels. 2014. “Speech function of the oropharyngeal isthmus: a modelling study.” Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization 2(4): 217–222. Gick, Bryan, Naomi Francis, Anna Klenin, Ezra Mizrahi, and Denise Tom. 2013. “The velic traverse: An independent oral articulator?” Journal of the Acoustical Society of America 133(3): EL208-EL213. Gil, David. 2005. “Paralinguistic usages of clicks.” The World Atlas of Language Structures, ed. Martin Haspelmath, Matthew S. Dryer, David Gil and Bernard Comrie. Oxford: Oxford University Press, pp. 572–575. Hale, Kenneth, and David Nash. 1997. “Damin and Lardil phonotactics.” Boundary Rider: Essays in Honour of Geoffrey O’Grady, ed. Darrell Tryon and Michael Walsh. Pacific Linguistics C-136. Canberra: Research School of Pacific and Asian Studies, Australian National University, pp. 247–259. Honda, Masaaki, Akinori Fujino, and Tokihiko Kaburagi. 2002. “Compensatory responses of articulators to unexpected perturbation of the palate shape.” Journal of Phonetics 30(3): 281–302. https://doi.org/10.1006/jpho.2002.0172. Kagaya, Ryohei. 1993. A Classified Vocabulary of the Sandawe Language. Asian and African Lexicon 26. Tokyo: Institute for the Study of Languages and Cultures of Asia and Africa. Knight, Alec, Peter A. Underhill, Holly M. Mortensen, Lev A. Zhivotovsky, Alice A. Lin, Brenna M. Henn, Dorothy Louis, Merritt Ruhlen, and Joanna L. Mountain. 2003. “African Y chromosome and mtDNA divergence provides insight into the history of click languages.” Current Biology 13(6): 464–473. 
https://doi.org/10.1016/ S0960-9822(03)00130-1. Ladefoged, Peter, and Peri Bhaskararao. 1983. “Non-quantal aspects of consonant production: a study of retroflex consonants.” Journal of Phonetics 11(3): 291–302. Ladefoged, Peter, and Ian Maddieson. 1996. The Sounds of the World’s Languages. Cambridge, MA: Blackwell. Ladefoged, Peter, and Anthony Traill. 1984. “Linguistic phonetic descriptions of clicks.” Language 60(1): 1–20. https://doi.org/10.2307/414188. Lammert, Adam, Michael Proctor, and Shrikanth Narayanan. 2013a. “Interspeaker variability in hard palate morphology and vowel production.” Journal of Speech, Language, and Hearing Research 56(6): S1924–S1933. https://doi.org/10.1044/ 1092-4388(2013/12-0211).

Lammert, Adam, Michael Proctor, and Shrikanth Narayanan. 2013b. “Morphological variation in the adult hard palate and posterior pharyngeal wall.” Journal of Speech, Language, and Hearing Research 56(2): 521–530. https://doi.org/10.1044/ 1092-4388(2012/12-0059). Liberman, Alvin M., and Ignatius G. Mattingly. 1985. “The motor theory of speech perception revised.” Cognition 21(1): 1–36. https://doi.org/10.1016/0010-0277(85)90021-6. Maddieson, Ian, Siniša Spajić, Bonny Sands, and Peter Ladefoged. 1993. “Phonetic structures of Dahalo.” UCLA Working Papers in Phonetics (84): 25–66. Matisoff, James A. 1975. “Rhinoglottophilia: The mysterious connection between nasality and glottality.” Nasálfest: Papers from a Symposium on Nasals and Nasalization, ed. Charles A. Ferguson, Larry M. Hyman and John J. Ohala. Stanford, California: Department of Linguistics, Stanford University, pp. 265–287. Miller, Amanda L. 2016. “Posterior lingual gestures and tongue shape in Mangetti Dune ǃXung clicks.” Journal of Phonetics 55: 119–148. https://doi.org/10.1016/ j.wocn.2015.12.001. Miller, Amanda L., Johanna Brugman, Bonny Sands, Levi Namaseb, Mats Exter, and Chris Collins. 2009. “Differences in airstream and posterior place of articulation among Nǀuu clicks.” Journal of the International Phonetic Association 39(2): 129–161. Moisik, Scott R., and Dan Dediu. 2015. “Anatomical biasing and clicks: Preliminary biomechanical modelling.” Proceedings of the 18th International Congress of Phonetic Sciences [ICPhS 2015] Satellite Event: The Evolution of Phonetic Capabilities: Causes constraints, consequences, ed. H. Little. Glasgow: ICPhS, pp. 8–13. Retrieved from http://pubman.mpdl.mpg.de/pubman/faces/viewItemOverviewPage.jsp?item Id=escidoc:2166182. Moisik, Scott R., and Dan Dediu. 2017. “Anatomical biasing and clicks: Evidence from biomechanical modeling.” Journal of Language Evolution 2(1): 37–51. https://doi .org/10.1093/jole/lzx004. Moisik, Scott R., John H. 
Esling, Lise Crevier-Buchman, Angélique Amelot, and Philippe Halimi. 2015. “Multimodal imaging of glottal stop and creaky voice: Evaluating the role of epilaryngeal constriction.” Proceedings of the 18th International Congress of Phonetic Sciences. Glasgow, UK. Napoli, Donna Jo, Nathan Sanders, and Rebecca Wright. 2014. “On the linguistic effects of articulatory ease, with a focus on sign languages.” Language 90(2): 424–456. https://doi.org/10.1353/lan.2014.0026. Nathan, Geoffrey S. 2001. “Clicks in a Chinese nursery rhyme.” Journal of the International Phonetic Association 31(2): 223–228. https://doi.org/10.1017/S0025100301002043. Ogden, Richard. 2013. “Clicks and percussives in English conversation.” Journal of the International Phonetic Association 43(3): 299–320. https://doi.org/10.1017/ S0025100313000224.

Ohala, John J. 1995. “A probable case of clicks influencing the sound patterns of some European languages.” Phonetica 52(3): 160–170. https://doi.org/10.1159/000262167. Perkell, Joseph S. 2012. “Movement goals and feedback and feedforward control mechanisms in speech production.” Journal of Neurolinguistics 25(5): 382–407. Pickrell, Joseph K., Nick Patterson, Chiara Barbieri, Falko Berthold, Linda Gerlach, Tom Güldemann, Blesswell Kure, Sununguko Wata Mpoloka, Hirosi Nakagawa, Christfried Naumann, Mark Lipson, Po-Ru Loh, Joseph Lachance, Joanna Mountain, Carlos S. Bustamante, Bonnie Berger, Sarah A. Tishkoff, Brenna M. Henn, Mark Stoneking, David Reich, and Brigitte Pakendorf. 2012. “The genetic prehistory of southern Africa.” Nature Communications 3: 1143. https://doi.org/10.1038/ ncomms2140. Ten Raa, Eric. 1970. “The couth and the uncouth: Ethnic, social, and linguistic divisions among the Sandawe of central Tanzania.” Anthropos 65(1–2): 127–153. Sands, Bonny, Ian Maddieson, and Peter Ladefoged. 1993. “The phonetic structure of Hadza.” UCLA Working Papers in Phonetics 84: 67–88. Simpson, Adrian P. 2007. “Acoustic and auditory correlates of non-pulmonic sound production in German.” Journal of the International Phonetic Association 37(2): 173. https://doi.org/10.1017/S0025100307002927. Snyman, Jan W. 1980. “The relationship between Angolan ǃXũ and Žuǀ’hõasi.” Bushman and Hottentot Linguistic Studies (Papers of seminar held on 27 July 1979), ed. Jan W. Snyman. Miscellanea Congregalia 16. Pretoria: University of South Africa, pp. 1–58. Stavness, Ian, Mohammad Ali Nazari, Pascal Perrier, Didier Demolin, and Yohan Payan. 2013. “A biomechanical modeling study of the effects of the Orbicularis Oris muscle and jaw posture on lip shape.” Journal of Speech, Language, and Hearing Research 56(3): 878–890. https://doi.org/10.1044/1092-4388(2012/12-0200). Stone, Maureen, Susan Rizk, Jonghye Woo, Emi Z. Murano, Hegang Chen, and Jerry L. Prince. 2013. 
“Frequency of apical and laminal /s/ in normal and postglossectomy patients.” Journal of Medical Speech Language Pathology 20(4): 106–111. Thibeault, Mélanie, Lucie Ménard, Shari R. Baum, Gabrielle Richard, and David H. McFarland. 2011. “Articulatory and acoustic adaptation to palatal perturbation.” The Journal of the Acoustical Society of America 129(4): 2112–2120. https:// doi.org/10.1121/1.3557030. Thomas-Vilakati, Kimberly D. 1999. “Coproduction and Coarticulation in IsiZulu Clicks.” Ph.D. thesis, University of California, Los Angeles. Tiede, Mark K., Vincent L. Gracco, Douglas M. Shiller, Carol Y. Espy-Wilson, and Suzanne E. Boyce. 2005. “Perturbed palatal shape and North American English /r/ production.” Journal of the Acoustical Society of America 117: 2568–2569. Traill, Anthony. 1985. Phonetic and Phonological Studies of ǃXóõ Bushman. Quellen zur Khoisan-Forschung / Research in Khoisan Studies 1. Hamburg: Helmut Buske.

Traunmüller, Hartmut. 2003. “Clicks and the idea of a human protolanguage.” Reports from the Department of Phonetics, Umeå University (PHONUM) 9: 1–4. van Reenen, J.F. 1964. “Dentition, jaws and palate of the Kalahari Bushman.” Journal of the Dental Association of South Africa (= Die Tydskrif van Die Tandheelkundige Vereniging van Suid-Afrika) 19: 1–37. van Reenen, J.F., and D.W. Allen. 1987. “The palatal vault of the Bushman (San), Vassekela and Himba.” The Journal of the Dental Association of South Africa (= Die Tydskrif van Die Tandheelkundige Vereniging van Suid-Afrika) 42(8): 489–492. Weirich, Melanie, and Susanne Fuchs. 2013. “Palatal morphology can influence speakerspecific realizations of phonemic contrasts.” Journal of Speech, Language, and Hearing Research 56(6): S1894–S1908. https://doi.org/10.1044/1092-4388(2013/12-0217). Winkler, Eike-M., and Sylvia Kirchengast. 1993. “Metric characters of the hard palate and their cephalometric correlations in Namibian ǃKung San and Kenyan tribes.” Human Biology 65(1): 139–150. Zelditch, Miriam Leah, Donald L. Swiderski, and H. David Sheets. 2012. Geometric Morphometrics for Biologists: A Primer, 2nd edn. Amsterdam: Elsevier. Zipf, George Kingsley. 1949. Human Behaviour and the Principle of Least Effort. Cambridge, MA: Addison-Wesley Press.

Chapter 13

Notes on Child Acquisition of Clicks in Hadza

Kirk Miller

Children typically acquire most of the phonemes of their language by 3 or 4 years of age. Phonemes that are acquired late and are rare cross-linguistically, such as English /θ, ð/, are often designated as ‘marked’. Due to their articulatory complexity and cross-linguistic rarity, clicks have been thought to be highly marked consonants (e.g. Herbert 1986: 241). It might be expected that clicks would be among the last consonants acquired by children, but that does not seem to be the case (Mowrer and Burger 1991: 147). Hadza children are reported to acquire clicks between approximately the ages of 2 and 3. They may acquire at least some clicks before such crosslinguistically common (and thus, by that criterion, relatively unmarked) consonants as /s/.

Unfortunately, no research into child acquisition of Hadza has been done. What follows are a few informal notes and observations. The author’s principal consultant, Mariamu Anyawire, noted the following sound substitutions made by very young Hadza children before they acquire clicks. The examples are hers, based on her recollection of words that such children use. They indicate that clicks are acquired after the most basic consonants, including at least /m, n, t, k, ʔ, tʃ, ɦ/ (and likely a few others).1

– /n/ for the voiced nasal click /ǃ̬̃/ in all positions, and for the glottalized nasal click /ǃ̃ˀ/ in medial position. For example,
    /ɦaǃ̃ˀako/ > /ɦanako/ ‘rock’.
– /t/ for the palatal lateral obstruents /c, cʰ, c’, ǁ, ǁʰ/. For example,
    /cʰeena/ > /teena/ ‘under’
    /ǁʰaʔano/ > /taʔano/ ‘dog’.
– /tʃ/ for the sibilants /s, ts, tsʰ, ts’/ and for the lateral fricative /ɬ/. For example,
    /seme/ > /tʃeme/ ‘eat, food’
    /ɬeme/ > /tʃeme/ ‘man’ (the same pronunciation as ‘food’)
    /ʔase/ > /ʔatʃe/ ‘sleep’
    /ɦets’ana/ > /ɦetʃana/ ‘at home’.
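The substitution pattern reported above can be restated as a small lookup, purely as an illustration. The sketch below is not part of the original notes: the function name `child_form`, the representation of words as lists of segments, and the reading of ‘medial’ as any position other than the first or last segment are assumptions made for the example. Only the three substitution classes and the cited word pairs come from the text.

```python
# Illustrative sketch only: names and the segment-list representation are
# assumptions, not part of the original notes.

# Combining marks are written as escapes so that segments compare reliably.
VOICED_NASAL_CLICK = "\u01c3\u032c\u0303"       # alveolar click, voiced, nasal
GLOTTALIZED_NASAL_CLICK = "\u01c3\u0303\u02c0"  # alveolar click, nasal, glottalized
EJ = "\u2019"                                   # mark used in the text for ejectives

# Segment classes reported as merging into a single early consonant.
TO_T = {"c", "cʰ", "c" + EJ, "ǁ", "ǁʰ"}
TO_TSH = {"s", "ts", "tsʰ", "ts" + EJ, "ɬ"}


def child_form(segments):
    """Apply the reported early substitutions to a list of segments."""
    out = []
    last = len(segments) - 1
    for i, seg in enumerate(segments):
        medial = 0 < i < last
        if seg == VOICED_NASAL_CLICK:
            out.append("n")          # reported for all positions
        elif seg == GLOTTALIZED_NASAL_CLICK and medial:
            out.append("n")          # reported for medial position only
        elif seg in TO_T:
            out.append("t")
        elif seg in TO_TSH:
            out.append("tʃ")
        else:
            # Includes the word-initial glottalized nasal click, for which
            # the notes give no information; it is left unchanged here.
            out.append(seg)
    return out


# Adult forms cited in the text, pre-segmented by hand.
EXAMPLES = {
    "rock": ["ɦ", "a", GLOTTALIZED_NASAL_CLICK, "a", "k", "o"],
    "under": ["cʰ", "e", "e", "n", "a"],
    "dog": ["ǁʰ", "a", "ʔ", "a", "n", "o"],
    "food": ["s", "e", "m", "e"],
    "man": ["ɬ", "e", "m", "e"],
    "sleep": ["ʔ", "a", "s", "e"],
    "at home": ["ɦ", "e", "ts" + EJ, "a", "n", "a"],
}

for gloss, adult in EXAMPLES.items():
    print(gloss, "".join(adult), ">", "".join(child_form(adult)))
```

Run as written, the loop reproduces the child forms listed above, including the merger of ‘food’ and ‘man’. Segmenting words by hand avoids having to tokenize IPA strings, where digraphs such as /ts/ and stacked diacritics make automatic splitting error-prone.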
I do not have information on the other clicks: /ǀ, ǃ, ǀʰ, ǃʰ, ǀ̬̃, ǁ̬̃, ǀ̃ˀ, ǁ̃ˀ/, nor for the glottalized alveolar click /ǃ̃ˀ/ in word-initial position. Although the pronunciation of ‘rock’ above suggests that the same substitutions will be made for the glottalized nasal clicks /ǀ̃ˀ, ǁ̃ˀ, ǃ̃ˀ/ as for the voiced nasal clicks /ǀ̬̃, ǁ̬̃, ǃ̬̃/, this is likely to be true only in word-medial position. I suspect that in word-initial position, where the glottalized clicks are voiceless, they will instead be conflated with the unaspirated oral clicks /ǀ, ǁ, ǃ/.

1  Aspiration was not recorded for the children’s pronunciations. Compare these early consonants with the set found in Hadza inflectional suffixes: /m, n, b, pʰ, t, k, kʷ, ʔ, tʃʰ, s, ɦ, j/.

© Koninklijke Brill NV, Leiden, 2020 | doi:10.1163/9789004424357_014

‘Baby words’ used by adults with children of approximately this age, such as /didi/ for /kʰintʃa/ ‘bug’ and /tibi/ for /ʔati-biʔi/ ‘water’, do not, to my knowledge, ever make use of clicks, but apparently neither do they have ejectives or more than three syllables.

The small amount of information available about Hadza acquisition is consistent with what has been reported for Xhosa. Lewis and Roux (1996) found that the voiceless unaspirated oral clicks /ǀ, ǁ, ǃ/ are frequently replaced by /t/ or /k/,2 the aspirated clicks /ǀʰ, ǁʰ, ǃʰ/ by aspirated /tʰ/ or /kʰ/, the slack-voiced clicks /ǀ̬̥ʱ̥, ǁ̬̥ʱ̥, ǃ̬̥ʱ̥/ by slack-voiced /d̥ʱ̥/ or /ɡ̊ʱ̥/, the nasal clicks /ǀ̃, ǁ̃, ǃ̃/ by nasal /n/, and the slack-voiced nasal clicks /ǀ̬̥̃, ǁ̬̥̃, ǃ̬̥̃/ by /n/ or /ŋɡ̊ʱ̥/. Lewis and Roux do not mention whether the choice of an alveolar or velar replacement corresponds to the place of articulation of the click, but Mowrer and Burger (1991: 154) record that, for the voiceless unaspirated oral clicks, /ǀ/ is replaced by /t/, /ǃ/ is usually replaced by /k/, and /ǁ/ is replaced by either /t/ or /k/ with approximately equal frequency.

Anyawire reported that Hadza children acquire clicks between the ages of 2 and 3 years. We later found this to be the case even for a non-speaker exposed to Hadza. Anyawire’s young son would speak only Swahili, his home language, but he was frequently addressed in Hadza by his grandmother.
He would also listen to other Hadza-speakers and to the author, who occasionally used Hadza words that had no Swahili equivalent, such as xxekwa /ǁ̃ˀekʷa/ (Vigna frutescens, a principal food tuber). The boy evidently found this amusing. At 2 years 5 months, he told his older sister about this Hadza word, which he had just overheard the author (‘Makubi’) using: /makubi alitʃema ǁekʷa na mama/ ‘Makubi said xxekwa to Mama.’ The Swahili verb ‘he said’ is /alisema/, with an /s/, not a /tʃ/. Although too young to have acquired /s/, the child was able to make a good lateral click without a prosodic break or other indication that it required any special effort.3

2  Both sets of Xhosa consonants, /ǀ, ǁ, ǃ/ and /t, k/, may be glottalized or even ejective (Jessen 2002) but are not nasalized.

3  His only error, according to his mother, was in not glottalizing the click. Glottalization is difficult to discern in utterance-initial position, but the error may well have been mine, in my pronunciation that he copied.

This accords with findings for Xhosa (Mowrer and Burger 1991), where at least some clicks are acquired before /s/.4 Consistent with their relatively early acquisition, click consonants are found in several Hadza words that are used predominantly by young children. For example, in-camp use of the word /ǀ̬̃ie/ ‘friend, sibling’, with a voiced nasal click, occurs primarily among small children talking to each other. (The usual adult word is /nie/, with a pulmonic nasal in place of the nasal click – the mirror image of the replacements found in baby-talk.) The only native Hadza words for ‘fish’, the synonyms /ǀ̃ˀala/ and /ǀ̃ˀama/, are often used by children. Anyawire describes them as not being ‘serious’ words and says that they are seldom used by adults. Nonetheless, both have an initial glottalized nasal click.

Relatively early acquisition of clicks in Hadza, and in other languages such as Gǀui (Hirosi Nakagawa, p.c. 2008) and Xhosa, has implications for theories of markedness, especially theories that use child language acquisition as evidence. The fact that a child may acquire at least some clicks before mastering the cross-linguistically highly frequent consonant /s/ belies claims that clicks are highly marked consonants, that they are acquired later than other consonants with high functional loads, or that they are rare cross-linguistically because of articulatory complexity (e.g. Herbert 1986: 29–30) – that, as Jakobson put it, rare sounds seem as a rule to be acquired relatively late and to present the most difficulties in learning (Battistella 1996: 29). The Hadza and Xhosa data is more consistent with the view that clicks are easily acquired, and are only rare cross-linguistically because their historical appearance depends on phonetic preconditions that are themselves rare (Engstrand 1997).

4  Mowrer and Burger unfortunately only investigated the three voiceless unaspirated oral clicks, /ǀ, ǁ, ǃ/, out of the 18 phonemic clicks in Xhosa, and did not consider the use of [ǂ] for /ǃ/ in baby-talk. They found that most consonants, including the clicks /ǀ/ and /ǃ/, were acquired by age 3; /s, z/ by age 3½, and the third click, /ǁ/, along with /ʃ, dz/ by age 4.

References

Battistella, Edwin. 1996. The Logic of Markedness. Oxford: Oxford University Press.
Engstrand, Olle. 1997. “Why are clicks so exclusive?” Papers from FONETIK-97, the Ninth Swedish Phonetics Conference, held in Umeå. Reports from the Department of Phonetics, Umeå University (PHONUM) 4: 191–194.
Herbert, Robert K. 1986. Language Universals, Markedness Theory, and Natural Phonetic Processes. Berlin: Mouton de Gruyter.
Jessen, Michael. 2002. “An acoustic study of contrasting plosives and click accompaniments in Xhosa.” Phonetica 59: 150–179.
Lewis, Philip W., and Justus C. Roux. 1996. “A Phonological Process Analysis of the Acquisition and Loss of Clicks in Xhosa.” South African Journal of African Languages 16(1): 1–7.
Mowrer, Donald, and Sharon Burger. 1991. “A comparative analysis of phonological acquisition of consonants in the speech of 2½–6 year old Xhosa- and English-speaking children.” Clinical Linguistics & Phonetics 5(2): 139–164.

Chapter 14

Paralinguistic Use of Clicks in Chad

Florian Lionnet

1 Introduction

Many languages without phonemic click consonants use clicks for paralinguistic purposes, i.e. as verbal gestures, defined by Grenoble (2014) as a “set of sounds or segments which stand outside a language’s phonemic inventory but are still part of the communicative system of the language”. In his worldwide survey of clicks used as verbal gestures, Gil (2011) defines two main paralinguistic functions of clicks: affective (approval, disapproval) and logical (“yes”, “no”). The present contribution focuses on the paralinguistic uses of clicks in southern Chad, with specific focus on Laal, a language isolate spoken by ca. 800 speakers in Gori and Damtar, two villages along the Chari River between the Moyen-Chari and Chari-Baguirmi regions of Chad. As can be seen from the inventories below, Laal does not have any phonemic click consonants.

(1) a. Laal vowels
       i    y    ɨ    u
       e    yo   ə    o
       ia   ya   a    ua

    b. Laal consonants
       p    t    c    k    [ʔ]
       b    d    ɟ    ɡ
       ᵐb   ⁿd   ᶮɟ   ᵑɡ
       ɓ    ɗ    ˀj
            s              h
       m    n    ɲ    ŋ
            l
            r
       w         j

The Laal case is representative of many if not all the languages of the southern part of the country, and most probably beyond: the areal distribution of these clicks, which has yet to be mapped precisely, is very likely to encompass much of West and Central Africa, and perhaps more, as we will see in this chapter.

Four non-phonemic click-like articulations are used in Laal for paralinguistic purposes: a dental click, a lateral click, a back-released velar click, and a bilabial fricated click. Most of the limited acoustic data presented in this chapter was recorded in Gori in January 2015, with two male consultants: Kalem Dakour (KD), in his late 50’s, and his son Adoum Kalem (AK), 28 years old. The lateral (three tokens, by AK) and back-released velar clicks (six tokens by KD, two by AK) were recorded during a brief conversation about those clicks, while the dental (one token each) and bilabial-lateral (one token by KD) ones were extracted from recordings where they occurred spontaneously. One token each of a lateral and a dental click uttered by a third speaker, Idris Kanyour (ID), during a spontaneous conversation are also provided. Unless otherwise specified, the recordings were made using a Zoom H4n recorder set at a sample rate of 44.1 kHz and 16-bit quantization, and a mono Røde NTG2 condenser shotgun microphone. Waveforms and spectrograms were produced using Praat (Boersma and Weenink 2014, default settings). All waveforms and spectrograms show an interval of about 200 ms, so as to make a comparison of the relative length of the four clicks easier. Note that the articulatory descriptions given in the following lines are only impressionistic, mostly based on my own productions of these clicks as I tried to reproduce native speakers’ productions. More generally, it will be clear to the reader that the present contribution is but a preliminary attempt at characterizing these paralinguistic clicks (acoustically, articulatorily, and pragmatically) in Chad, an attempt which I hope will pave the way to more detailed studies in the future.

© Koninklijke Brill NV, Leiden, 2020 | doi:10.1163/9789004424357_015

2 Dental Click [ǀ]

The dental click is used to express negation or disagreement, i.e. as a gestural¹ equivalent of “no” (logical use), or to express negative evaluation (affective use). The latter is illustrated in the following spontaneous conversation, recorded in Gori:²

(2) ID: kàlìté mē bàn ’yá̰ nūŋ
        kind death emph thus top
        ‘Such a death …’ (shaking head, expressing commiseration; NB: the person being talked about died after a long agony)
    (…)
    B:  pāyál ɗòòɡ wúlá ɗáɡàl ’á
        suffering:con poss.3f.sg finish here compl
        ‘Her suffering was over.’
    ID: ǀ kây!
        no excl
        ‘[dental click] really [what a horrible death it was]!’

¹ In Grenoble’s (2014) sense of “gesture”, cf. section 1 above.
² Abbreviations used in this paper conform to the Leipzig glossing rules, with the following exceptions: bkch = backchanneling; con = connective; emph = emphatic; excl = exclusive; f = feminine; m = masculine.

The waveforms and spectrograms in Figure 14.1 (corresponding to speaker ID’s dental click in example (2) above) and Figure 14.2 below show that the release burst of the dental click is somewhat delayed and produces noise that contributes to the fricative quality of the sound. The energy is concentrated in the higher part of the spectrum, in the 5~10 kHz range.
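A statement such as “the energy is concentrated in the 5~10 kHz range” can be checked numerically by computing the share of a burst’s spectral energy that falls inside a frequency band. The following is only a minimal sketch, not the procedure behind the figures in this chapter (which were inspected visually in Praat). The function name `band_energy_fraction` and its parameters are hypothetical, and the naive discrete Fourier transform it uses is practical only for short excerpts such as a click burst.

```python
import cmath
import math


def band_energy_fraction(samples, sample_rate, low_hz, high_hz):
    """Return the fraction of spectral energy lying between low_hz and high_hz.

    Hypothetical helper for illustration: it uses a naive DFT over the
    non-negative frequency bins, without windowing and without special
    weighting of the DC or Nyquist bins.
    """
    n = len(samples)
    total = 0.0
    in_band = 0.0
    for k in range(n // 2 + 1):
        # DFT coefficient for bin k (frequency k * sample_rate / n).
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(samples)
        )
        power = abs(coeff) ** 2
        freq = k * sample_rate / n
        total += power
        if low_hz <= freq <= high_hz:
            in_band += power
    # A silent excerpt has no energy to distribute.
    return in_band / total if total > 0 else 0.0


# Usage with a synthetic 10 ms, 7 kHz tone sampled at 44.1 kHz
# (a stand-in for a real burst excerpt, which is not reproduced here).
rate = 44100
tone = [math.sin(2 * math.pi * 7000 * t / rate) for t in range(441)]
share = band_energy_fraction(tone, rate, 5000, 10000)
```

With a real recording, `samples` would be the excerpt around the release burst; a share close to 1 for the 5~10 kHz band would be consistent with the description above.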

Figure 14.1 Dental click [ǀ] (speaker ID, natural speech, Zoom H4n built-in microphone)

Figure 14.2 Dental click [ǀ] (speaker KD, natural speech)

3 Lateral Click [ǁ]

The lateral click is frequently used in linguistic interactions in Laal, with two main functions: to express approval or endorsement (“yes”), and as a backchanneling strategy (“yes, go on”, “I get it”). Both KD and his son AK describe this click, and its allophone the back-released velar click [ʞ] (see next section), as indicating strong agreement: “Si quelqu’un va parler, si c’est une bonne parole, tu vas faire [ʞ] (~[ǁ])” (if someone speaks, if what they say is good, you are going to go [ʞ] (~[ǁ]); KD, 20 Jan. 2015). The example below, taken from natural conversations recorded in Gori, illustrates the backchannel use of [ǁ] (the strong endorsement function cannot be illustrated with naturalistic data, since it is not attested in the transcribed portion of my corpus).

(3) A: ò kú nàrɡàɲ bòkà já ɗāŋ
       you.sg see Nargany Boka con.m dist
       náár jí ɗāŋ àsyám jí ɗāŋ
       his.mother con.f dist Asyam con.f dist
       ‘You see Nargany Boka, his mother, Asyam?’

    B: m̀m̀
       bkch
       ‘Uh huh.’

    A: ùrú bàn ɨ́ɲì nùŋú ùrú ɡùr ɓɨ̀l ùrú ɓýýnù
       we.ex emph sit.pl here we.ex dig hole we.ex bury:her
       ‘We here, we dug the hole, we buried her.’

    B: ǁ
       bkch
       ‘Uh huh.’

Note that two backchanneling strategies are used in (3): the lateral click, and m̀m̀, which appears to have exactly the same function (although, as far as I can tell from my corpus, it is only ever used for backchanneling, and never expresses approval or endorsement).

The waveform and spectrogram in Figure 14.3 represent a lateral click realized by speaker ID in a natural conversation recorded in Gori. In many cases, the closure for the lateral click appears to involve the side of the tongue against the upper gums rather than the teeth, and the release seems to be further back than the expected lateral release, close to the side of the velum. Note that this articulatory inference is purely impressionistic, and is mostly based on my own attempts to reproduce the sound heard from native speakers as closely as possible. The waveform and spectrogram in Figure 14.4 illustrate such a lateral click, uttered in careful speech, showing a 30 ms interval of fricative noise accompanying the release.

As can be seen on the spectrograms in both Figure 14.3 and Figure 14.4, the lateral click release burst is characterized by two bands of high energy around 2 kHz and 4 kHz, as well as a wider band of high energy above 6 kHz, gradually fading away as frequency increases. It is also noteworthy that the burst has two to three successive peaks of amplitude.

Figure 14.3 Lateral click [ǁ] (speaker ID, natural speech, Zoom H4n built-in microphone)

4 Back-Released Velar Click [ʞ]

The back-released velar click may be used as an equivalent of the lateral click, with the same functions: approval and backchannel. Both clicks are very frequent in Laal discourse and appear to be allophones of each other: speakers AK and KD explicitly describe them as strictly identical from a semantic and pragmatic point of view. AK even insisted that he much prefers to use the lateral one.

Like any other click, [ʞ] is produced with an ingressive velaric (or lingual, cf. Miller et al. 2006; Miller et al. 2007) airstream. The oral cavity is closed in two places: at the back and at the front of the mouth. Air rarefaction in the intra-oral cavity is achieved mostly through tongue body lowering. However, instead of the front closure, it is the back closure that is released, allowing air to rush into the mouth from the back, either from the nasal cavity or from the post-velar cavity if the velopharyngeal port is closed. Keeping the velopharyngeal port open nevertheless seems to be important for the production of this click, since the nasal cavities appear to act as the main resonator (my own attempts at producing this click with a closed velopharyngeal port yield a muffled sound that is nothing like any of the recorded tokens I have of this click pronounced by native speakers). This might explain why Anonby (2011: 68–69), the first author to give a detailed articulatory and auditory description of this click in Mambay (Adamawa), qualifies it as “nasal”.

Figure 14.4 Lateral click [ǁ] (speaker AK, elicited, careful speech)


In the absence of articulatory measurements, it is difficult to determine where exactly the two closures are. Lukas (1937: 147) – the first linguist, to my knowledge, to ever describe this click – says it is a “post-velar click [with] lips closed” in Kanuri, suggesting a uvular rather than velar back closure in this language. Elders (2000: 586) also describes the back closure of this click as uvular in Mundang (“injectif uvulaire”, accordingly transcribed [q]). The back release sounds velar to me in all the Laal recordings at my disposal, but it is not unlikely that a uvular variant may also be possible. My own production of this click, which sounds very similar to what I hear from native speakers, and looks very similar on a spectrogram as well, involves a closure at the very edge of the hard palate, close to the velum. It also involves central airflow: the back of the tongue is still in contact with the gums and soft palate on both sides of the mouth after the release.

Most of the authors who have written about this click describe the front closure as bilabial, from visual evidence only (Lukas 1937: 147 for Kanuri; Anonby 2011: 69 for Mambay; Grenoble 2014: 108, fn. 1, and Grenoble et al. 2015: 116, fn. 3 for Wolof). Indeed, it is most of the time pronounced with full lip closure throughout in Laal as well, which is suggestive of a bilabial front closure. It is possible to surmise, however, that the front closure, hidden by the lips’ rest position, might actually be dental/alveolar instead, in which case lip activity would be irrelevant to the production of this click, since the position of the lips would have no acoustic/perceptual consequences. A dental/alveolar closure, by reducing the size and elasticity of the intra-oral cavity, actually makes the back-released click both articulatorily easier and perceptually more salient, which seems to suggest that the front closure is dental/alveolar, while the lips are most of the time left in their rest position, i.e. closed (note that this click is typically uttered by a listener in a conversation, i.e. someone whose mouth may well be closed anyway). Clearly, precise articulatory data is needed to determine the exact position of the front and back closures in these languages, and the degree of intra-speaker, inter-speaker, and cross-linguistic variation in the realization of this click type in its area of distribution.³

³ Tuhuse and Traill (1999) describe the same articulation in East ǃXóon as involving a coronal front closure, similar to that required for a palatal click [ǂ]. Note that this click also has an exclusively paralinguistic use in ǃXóon (the imitation of a specific bird call), although, interestingly, one very different from its use in West and Central Africa.

I propose that this click be transcribed with the turned k symbol ⟨ʞ⟩ originally recommended in the Principles of the International Phonetic Association (1949) to represent a (then purely hypothetical) velar click, i.e. a velar stop articulated with ingressive velaric airstream. This articulation was later said to be physically impossible, because a velaric ingressive stop must involve a front closure in addition to the velar one, and it is the front closure alone that is released: “the two closures must be separated by at least a centimeter or two, so a velar click is a contradiction in terms” (Pullum and Ladusaw 1996: 101). However, a velar click is articulatorily possible if the back closure is released first, which is exactly how the back-released velar click of Laal is realized.⁴

⁴ In 1949, turned ⟨ʞ⟩ was officially adopted by the IPA as the symbol for a velar click. It was then removed in 1979. Following Ball et al. (2004: 159), the symbol was reintroduced in the extIPA chart to represent a voiceless velo-dorsal stop. In 2015, after the first version of the present chapter was circulated, the extIPA symbol for the voiceless velo-dorsal stop was changed to a reversed k, to avoid any confusion with the use of turned [ʞ] for a back-released velar click (see Ball et al. 2018: 159, which cites a previous version of the present chapter). Note that turned ⟨ʞ⟩ also represents an unaspirated [k] in some early work by Dorsey on the Siouan language Ponca (mentioned in Pullum and Ladusaw 1996: 101). This is a little-known idiosyncratic choice that has not been widely adopted. Additionally, Bradfield (2014) uses a turned small capital K to refer to a generic click of any type, graphically different from turned ⟨ʞ⟩. Note that the first use of ⟨ʞ⟩ appears to be as a transcription of [ʃ] in John Yeomans’ spelling reform proposal for English (Yeomans 1759: 49, cited in MacMahon 1994).

Figure 14.5 and Figure 14.6 below present the waveform and spectrogram of two representative tokens of the back-released velar click, for both KD and AK. Both tokens were elicited, and correspond to careful speech. As can be seen, this click is characterized by a relatively short delayed release burst, making it sound somewhat fricated. The energy seems to be mostly concentrated in two bands, centered around 3~4 kHz and 9 kHz respectively. One can also see an additional low frequency peak, with less amplitude, at around 1~2 kHz. The average overall intensity of KD’s six velar click tokens is around 60 dB, which is surprisingly high for a closed-mouth sound (compare the 68 dB of his son’s open-mouth lateral click). This can be explained by the fact that, as said above, the nasal cavity acts as the main resonator. The intensity of the click is severely reduced when the velopharyngeal port is closed, as the sound may only escape the oropharyngeal cavity through the bones and tissues surrounding the vocal tract. This nasality could explain the lower frequency energy observed in the velar click, which is absent from all the other clicks used in Laal, none of which involve the nasal cavity. Note, however, that with a mean intensity of 43 dB, AK’s velar clicks are noticeably quieter than his father’s, and also quieter than his own lateral clicks (mean intensity: 68 dB). Interestingly, this speaker also expressed a marked preference during our recording session for the latter, a preference which might be due to the sharp difference in loudness between his productions of the two clicks.

Figure 14.5 Velar click (speaker KD, elicited, careful speech)

Anonby (2011: 68–69), describing the back-released velar click in Mambay, an Adamawa language of Northern Cameroon and Southwestern Chad, notes that its areal distribution “extends to much of northern Cameroon, southern Chad and likely other adjacent regions”, and mentions Mundang (Elders 2000: 586), geographically and genealogically very close to Mambay, and the Saharan language Kanuri (Lukas 1937: 147) as two other languages of the Lake Chad Basin where this click is attested with similar functions. It is also attested in Mauritanian Pulaar (Gil 2011, cited in Grenoble 2014: 108, fn. 1), and in Wolof (Grenoble 2014; Grenoble et al. 2015) as we will see later, as well as in the Mambiloid language Wawa (Marieke Martin, p.c., August 2015). In all the languages in which it has been described so far, the back-released click seems to serve the same two functions as in Laal: backchannel and agreement/approval.
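The intensity comparison made earlier in this section (around 60 dB for KD’s velar clicks, against 43 dB and 68 dB for AK’s velar and lateral clicks) rests on averaging a level in decibels over several tokens. The chapter does not state how these means were obtained, so the following is only a sketch of one simple way to compute such figures, assuming each token is available as a list of amplitude samples. The names `rms_level_db`, `mean_level_db` and `level_difference_db` are hypothetical. The levels are relative to an arbitrary reference amplitude, so only differences between tokens recorded with the same setup are meaningful.

```python
import math


def rms_level_db(samples, reference=1.0):
    """Root-mean-square level of one token, in dB relative to `reference`.

    Assumption for illustration: `samples` holds uncalibrated amplitudes,
    so the result is not a sound pressure level.
    """
    if not samples:
        raise ValueError("empty token")
    rms = math.sqrt(sum(x * x for x in samples) / len(samples))
    if rms == 0:
        raise ValueError("silent token has no defined level")
    return 20 * math.log10(rms / reference)


def mean_level_db(tokens, reference=1.0):
    """Arithmetic mean of the per-token dB levels."""
    levels = [rms_level_db(token, reference) for token in tokens]
    return sum(levels) / len(levels)


def level_difference_db(tokens_a, tokens_b):
    """Difference between two sets of tokens, in dB (positive if A is louder)."""
    return mean_level_db(tokens_a) - mean_level_db(tokens_b)
```

Averaging the dB values directly is a design choice; averaging the energies first and converting afterwards gives a different figure when tokens differ widely in level, so the method should be reported alongside the numbers.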

Figure 14.6 Velar click (speaker AK, elicited, careful speech)

5 Bilabial Fricated Click: the “Tchip”

This ingressive sound can be described as a complex bilabial-alveolar or bilabial-lateral click: the initial bilabial release is immediately followed by a prolonged suction of air through the teeth onto the tongue, either at the alveolar ridge or on one or both side(s) of the mouth. This sound is always pronounced with pursed lips moving down to the side. The resulting sound can be defined as an “ingressive affricate”: its release burst starts with a high amplitude peak corresponding to the release of the lip closure, followed by a long (about 124 ms in Figure 14.7) period of noise with lower amplitude, gradually fading. The energy seems to be rather evenly distributed above 1.5 kHz, with a noticeable band of higher energy between 5 kHz and 7 kHz.

Figure 14.7 “Tchip” (speaker KD, natural speech)

This verbal gesture expresses dissent, disapproval, or contempt, and is very widespread in (at least) West and Central Africa, as well as in African diaspora communities. Note that there seems to be cross-linguistic variation in the articulation of this gesture: Agwuele (2014: 90) does not mention lip closure as part of the articulation of this click in Yoruba, and describes the place of articulation of the ingressive fricative as being strictly alveolar, while Grenoble (2014) and Grenoble et al. (2015) describe the Wolof version as involving an ingressive labio-dental fricative. All descriptions of similar clicks, however, mention the accompanying downward lip movement noted in Laal, and often an additional frown and/or sideways head movement reinforcing the highly negative semantic and pragmatic value of this very complex gesture, both verbal and non-verbal.

Known as suck-teeth in African American communities (Rickford and Rickford 1976), and as tchip among the communities of African descent in the francophone world (from its Wolof name ciip⁵), this very salient verbal gesture is usually used as a strong marker of identity among communities of African descent outside of Africa.

6 Conclusion and Tentative Comparative Notes

To conclude this chapter, I will sketch a brief comparison of the uses and values of the clicks used as verbal gestures in Laal and Wolof, which lie at opposite ends of the West and Central African zone (the Wolof data are taken from Grenoble 2014). As can be seen in Table 14.1 below, both languages make use of four clicks. The alveolar click is attested only in Wolof, where it is used for backchannel communication, while only Laal uses the dental click, as a synonym of “no” or sign of disapproval. The other three clicks (velar, lateral, tchip) are attested in both languages.

Table 14.1 Paralinguistic clicks in Laal and Wolof

Click      Laal                            Wolof (single)   Wolof (repeated)
Lateral    Backchannel, “yes”/approval     Backchannel      “yes”/approval
Velar      Backchannel, “yes”/approval     Backchannel      “yes”/approval
Alveolar   –                               Backchannel      “yes”/approval
Dental     Negation, negative evaluation   –                –
Tchip      Disapproval                     Disapproval      Negation

The clicks that Laal and Wolof have in common seem to share similar properties. In both languages, the lateral and back-released velar clicks are allophones of each other and systematically have a positive meaning, both logical (“yes”, backchannel) and affective (approval, endorsement), while the tchip is used to convey negation (logical) or disapproval (affective). This also holds for the other West and Central African languages that have been described as having these clicks.

However, there are interesting differences between the two systems. First, Wolof makes a distinction between single occurrence and repeated versions of each click: the lateral, velar and alveolar clicks, when articulated only once, are used for backchannel communication, and express approval only when repeated. Similarly, disapproval is conveyed through a single occurrence of the tchip, while a repeated tchip expresses simple negation. Interestingly, the distinction between single and repeated occurrence does not correlate with the difference between logical and affective functions: the non-repeated clicks used for backchanneling all express a logical function, but their repeated versions have both a logical (“yes”) and affective function (approval). Conversely, it is the single version of the tchip that has an affective meaning (disapproval), while its repeated counterpart expresses a purely logical one (negation). Note that the same distinction between single and repeated occurrences exists in Mauritanian Pulaar: one occurrence of the dental click stands for “yes”, two for “no” (Gil 2011). Laal on the other hand does not make this distinction: backchannel, approval, negation and disapproval may all be conveyed through one single occurrence of the relevant click; repetition is only used to express insistence or emphasis. Finally, negation and disapproval are, as in Wolof, expressed (partly) separately, although not in exactly the same way: the tchip is used only for disapproval in Laal, while negation is expressed with a dental click (which also expresses negative evaluation), unattested in Wolof.

⁵ According to Diouf (2003), ciip is the noun that refers to the click, from which the verb ciipu (~ciipatu) is derived.

The clicks described in this paper are used as verbal gestures over a wide area of Western and Central Africa, at least from Senegal to southern Chad. Gil (2011) mentions 23 languages between Senegal and Cameroon known for their paralinguistic use of clicks. Pillion et al. (2019) also mention three clicks in three Narrow Bantu languages (Basaa, Bakoko, Bulu) and one Grassfields Bantu language (Ngoshie) of Cameroon: bilabial (dog call, “wolf whistle”), bilabial-lateral (i.e. tchip, with the same function as in Laal), and lateral (backchannel, as in all other languages). Very little is known, however, about the exact distribution and cross-linguistic properties of these clicks, and of verbal gestures in general, which are unfortunately too often left out of descriptive grammars. Whether the use of clicks as verbal gestures constitutes an areal feature of that part of Africa (or of the whole continent, as suggested by Gil (2011)) thus remains an open question, for which further research is needed.


Acknowledgements

For their very helpful comments on previous versions of this paper, I would like to thank Marieke Martin, Michael Proctor, Bonny Sands, and Kirk Miller. I would also like to thank the Laal speaking community, in particular Kalem Dakour and his son Adoum Kalem, for their warm welcome and invaluable help.

References

Agwuele, Augustine. 2014. “A repertoire of Yoruba hand and face gestures.” Gesture 14(1): 70–96.
Anonby, Erik J. 2011. A Grammar of Mambay: An Adamawa Language of Chad and Cameroon. Cologne: Rüdiger Köppe.
Ball, Martin, Rachel Manuel, and Nicole Müller. 2004. “An atypical articulatory setting as learned behaviour: a videofluorographic study.” Child Language Teaching and Therapy 20(2): 153–162.
Boersma, Paul, and David Weenink. 2014. Praat: doing phonetics by computer [Computer program]. Version 5.4, retrieved 4 October 2014 from http://www.praat.org/.
Bradfield, Julian. 2014. “Clicks, concurrency and Khoisan.” Phonology 31: 1–49.
Diouf, Jean-Léopold. 2003. Dictionnaire wolof-français et français-wolof. Paris: Karthala.
Elders, Stefan. 2000. Grammaire mundang. Leiden: CNWS.
Gil, David. 2011. “Paralinguistic usages of clicks.” The World Atlas of Language Structures online, ed. Matthew S. Dryer and Martin Haspelmath. Munich: Max Planck Digital Library, chapter 142. Available online at http://wals.info/chapter/142 (last accessed 15 April 2015).
Grenoble, Lenore A. 2014. “Verbal gestures: toward a field-based approach to language description.” Язык. Константы. Переменные: памяти Александра Евгеньевича Кибрика [Language. Constants. Variables: In memory of A.E. Kibrik], ed. Vladimir Plungian, Michael Daniel, Olga Fedorova, Ekaterina Lyutikova and Sergei Tatevosov. Saint Petersburg: Алетейя [Aletheia], pp. 105–118.
Grenoble, Lenore A., Rebekah Baglini, and Martina Martinović. 2015. “Verbal gestures in Wolof.” Selected Proceedings of the 44th Annual Conference on African Linguistics, ed. Ruth Kramer, Elizabeth C. Zsiga and One Tlale Boyer. Somerville, MA: Cascadilla Proceedings Project, pp. 110–121.
International Phonetic Association. 1949 (reprinted 1967). Principles of the International Phonetic Association. London: Department of Phonetics, University College (now Department of Phonetics and Linguistics, University College London).
Lukas, Johannes. 1937. Zentralsudanische Studien. Abhandlungen aus dem Gebiet der Auslandskunde 45 / Reihe B. Völkerkunde, Kulturgeschichte und Sprachen 24. Hamburg: Friederichsen, de Gruyter and Co.
MacMahon, Michael K.C. 1994. “A Mid-18th century use of [ə], [ɔ], and [ʞ] as phonetic symbols.” Journal of the International Phonetic Association 24(1): 19–20.
Miller, Amanda L., Johanna Brugman, Jonathan Howell, and Bonny Sands. 2006. “Tongue dorsum location and tongue root retraction in alveolar and palatal clicks in the endangered language Nǀuu.” Journal of the Acoustical Society of America 120: 3377.
Miller, Amanda L., Levi Namaseb, and Khalil Iskarous. 2007. “Tongue body constriction differences in click types.” Proceedings of Laboratory Phonology 9, ed. Jennifer S. Cole and José Ignacio Hualde. Phonology and Phonetics Series 4.3. New York: Mouton de Gruyter.
Pillion, Betsy, Lenore A. Grenoble, Emmanuel Ngué Um, and Sarah A. Kopper. 2019. “Verbal gestures in Cameroon.” Theory and Description in African Linguistics: Selected Papers from the 47th Annual Conference on African Linguistics, ed. Emily Clem, Peter Jenks and Hannah Sande. Berlin: Language Science Press, pp. 303–322.
Pullum, Geoffrey K., and William A. Ladusaw. 1996. Phonetic Symbol Guide. Second Edition. Chicago: The University of Chicago Press. First edition published in 1986.
Rickford, John R., and Angela E. Rickford. 1976. “Cut-eye and suck-teeth: African words and gestures in new world guise.” The Journal of American Folklore 89: 294–309.
Tuhuse, Bolo ǁXao, and Anthony Traill. 1999. “ǂHán-ǂhánsè, the desert Cisticola, implements an acoustic target.” The Proceedings of the 14th International Congress of Phonetic Sciences (ICPhS-14), pp. 1041–1042.
Yeomans, John. 1759. The Abecedarian, or, Philosophic Comment upon the English Alphabet, Setting Forth the Absurdities in the Present Custom of Spelling … London: J. Coote.

Chapter 15

False Alarms: Spurious Reports of Click Consonants
Kirk Miller

Reports of lexical click consonants periodically appear for languages that had not previously been described as having them. The sounds almost invariably turn out to be ejectives, which are often confused with clicks, both being commonly described as ‘popping’ sounds.¹ This is not particularly surprising – the forceful ejectives of the Salishan language Lillooet, for example, can sound very much like clicks, as the reader might judge from the sound files in Alexander (2016).²

The general public most frequently encounters claims of novel click consonants in descriptions of constructed languages. There have been numerous extraterrestrial insectoid species in science-fiction shows whose languages consist entirely of clicking and popping sounds, but these sounds are mechanically or electronically produced and are not assigned any linguistic meaning. Closer to the mark is the functional constructed language Na’vi, developed for the movie Avatar by linguist Paul Frommer, which was widely reported by the media to have clicks. For example, the London Telegraph reported that ‘the movie features … a brand new language called Na’Vi, which features clicks, glottal stops and unique ways of conjugating verbs’ (Zeitvogel 2009). The Na’vi ‘clicks’ are actually ejectives. The Telegraph article even noted Frommer’s use of the term ‘ejectives’, but then went on to say that their pronunciation ‘can best be described as a dulled-down version of the clicks in South Africa’s Xhosa language’.

¹ Such wording is not restricted to English. Clicks and ejectives have been conflated as claquements in French, castañetillas in Spanish and Schnalzen in German.
² Occasionally reports of a new click language turn out to be correct, as with Dahalo in Dammann (1950), Damin in Hale (1973), and the on-going spread of lexical clicks from Nguni to neighbouring Bantu languages such as Ndau. There are also occasional reports of weak incidental clicks that result from articulatory overlap, as in Jones (1911: 9) for /mw/ in Ndau, Ladefoged (1968: 9ff) for /k͡p, ɡ͡b, ŋ͡m/ in many Nigerian languages, Fuchs and Rodgers (2013) for /t#k, m#ɡ/ etc. in German, and Demolin (2016) for /mŋ, nŋ, mɡb/ in Rwanda. See Fuchs and Rodgers (2013) for a summary of the literature on such incidental clicks in non-African languages.

© Koninklijke Brill NV, Leiden, 2020 | doi:10.1163/9789004424357_016


Other than Damin and the occasional personal-language project (the best known of which is Dritok [Boozer 2007]), I’m aware of only two constructed languages that were designed with click consonants. In the 1989 television series Alien Nation, a rudimentary language called ‘Tenctonese’ was created, with English phonology augmented by a single click consonant. Evidently the actors would insert clicks wherever they liked, but in doing so set the pronunciation of the word. In the short glossaries available online (Darksun, Gathercole), the click (written ⟨/⟩) is no more frequent in word-initial position than elsewhere. In word-initial and word-final position it can occur as a single consonant or in a cluster; in a cluster, it occurs at the edge of the word: e.g. /pokhorona ‘funeral’, /lasta ‘chosen’, aeb/ ‘change’, kleezantsun/ ‘overseer’, practimalnal/ ‘hysterical’. Medial clicks are only reported in Gathercole, where they are always followed by another consonant: bah/na ‘seminal fluid’, ho/lodka (a drug), klokhla/tulya (male hormone).

In 2018, the Star Trek franchise introduced ‘Kelpien’, an alien language designed by Marc Okrand, who had also designed Klingon. Like Tenctonese, Kelpien has only a single click phoneme (Okrand, p.c., 2019). Unlike Tenctonese, the click is restricted to morpheme-initial (and, so far at least, to word-initial) position. A sample is kííZè yúùs Xǃèèdúànaak kaw lúúra Xǃàànéù Xǃàín ‘I was not certain you would come’, where ⟨Xǃ⟩ is the click (Kim 2019). For the main Kelpien character, the clicks were added in post-production, though they were omitted from the longest passage in the language, a song (Okrand, p.c., 2019). So far, Kelpien (unlike Klingon) hasn’t been developed as a language beyond what was needed for the dialogue in the show.

Confusion between clicks and ejectives occasionally arises in the linguistic literature as well. Early reports did not reliably distinguish clicks from ejectives, or even from implosives. Cust (1883: 437), for example, says of the distribution of clicks in the world,

the feature exists elsewhere, independent [of] the Khoikhoi and Sán […] Krapf observed it in the Swahíli Language [...].³ Similar Linguistic features are reported in the Languages of North and South America, in the Caucasian, and in some of the Melanesian Languages.

³ Swahili has weakly implosive voiced stops but no clicks or ejectives.

Obst (1912), the earliest decent account of Hadza, and later Malloy’s manuscript Hadza dictionary (ca. 1965), do not reliably distinguish clicks from ejectives. Obst, for example, uses the symbol ⟨ŝ⟩ indiscriminately for the denti-alveolar ejective and unvoiced clicks: /ts’, ǀ, ǀʰ, ǀ̃ˀ/, and ⟨t̰̂⟩ for the palatal lateral ejective and clicks: /c’, ǁ, ǁʰ, ǁ̃ˀ/. To be fair, the Hadza lateral ejective /c’/ and glottalized lateral click /ǁ̃ˀ/ are acoustically so similar when in word-initial position that even native speakers need to be attentive with an unfamiliar word, but there is no similar difficulty in distinguishing denti-alveolar /ts’/ from /ǀ̃ˀ/.

Wilhelm Bleek, who was intimately familiar with the ǀXam language of South Africa, briefly mentions that the Cushitic language Oromo (‘Galla’) has clicks, as ‘according to v. Klaproth’ does the Caucasian language Adyghe (‘Circassian’) (Bleek 1862: 14). Both languages have ejectives, not clicks. For Oromo, Bleek claims he is ‘not the only one who believes that according to Tutschek’s description, two of his letters indicate clicks’. He is presumably referring to Tutschek (1844) or (1845), which describe the velar ejective /k’/ as ‘a softer sound [than k] produced by pressing the tongue against the middle of the palate, and pushing or smacking with it’,⁴ and the /t’/ as ‘a T-sound peculiar to this language, and a very hard one, in pronouncing which the tongue is pressed closely against the upper fore-teeth, so that after the t˹, almost a soft s becomes audible, and the z [ts] that is wanting seems to be supplied’⁵ (Tutschek 1844: xxivf). The Adyghe report may be Klaproth (1823; 1836: II: 381), who says: la langue tcherkesse [offre,] dans plusieurs lettres, un claquement de langue impossible à imiter (‘the Circassian language [exhibits,] in several letters, a clicking of the tongue impossible to imitate’).

Bleek may also be at least partly responsible for the occasional reports of clicks in America. Gregerson (1977: 32) says that ‘clicks (or rumors of clicks) are reported from one or two languages in Central and South America’. Bleek had described clicks for the Mayan language K’iche’ (Quiché), where he went into some detail and even transcribed the name of the language with a dental click: ⟨ǀiχe⟩.
He writes,

Two clicks are distinguished in the ǀiχe (Ꜯiche) language, spoken in Guatemala, of which an old Spanish grammar is in manuscript in Sir G. Grey’s Library.⁶ Here the click expressed by the letter Ꜭ is produced with the middle of the tongue, by withdrawing the breath within, and giving a little snap (castanetilla) in the beginning of the palate with the centre of the tongue, in such manner that a sound is formed in the interior of the throat like as a guttural, if not something more in addition. An other click is marked by Ꜯ and pronounced with force by beating (slamming) the tip of the tongue, which is for that purpose drawn back sufficiently, against the upper teeth to their highest point. It is somewhat retarded or detained in its pronunciation. One cannot say that these ǀiχe clicks are exactly identical with any of the Hottentot clicks; but the second Ꜯ resembles mostly, the Hottentot dental click ǀ, and the first Ꜭ the palatal ǂ combined with some gutteral. (Bleek 1862: 15)

⁴ In the parallel German text, ein gelindes dieser Sprache eigenthümliches k; es wird durch einen Druck der Zunge gegen die Mitte des Gaumens mit einer Art von Schnalzen hervorgestossen (Tutschek 1844: liif).
⁵ In the parallel German text, es ist sehr hart und wird durch enges Anpressen hervorgebracht, so dass nach dem t˹ fast ein s hörbar wird und so gewissermassen der fehlende Consonant z (ts) ersetzt erscheint.
⁶ Sir George Grey, governor of Cape Colony (1855–1861). His collection is now at the National Library of South Africa (former South African Library branch).

These ‘clicks’ are actually the ejectives [k’] and [ts’], respectively.7 The detailed phonetic description Bleek provides gives the impression that this renowned expert on the click languages of southern Africa was personally familiar with the Mayan sounds, when he was merely interpreting the antiquated description of a circa 17th-century author who had found ejectives to be novel consonants that required extensive explanation. Such confusion is not merely of historical interest: As recently as 2005, a well-received introductory phonetics textbook stated as fact that ‘clicks are common in languages of Southern Africa and also occur in some North American languages’ (Hewlett and Beck 2005: 81). Native speakers of languages with ejectives may also believe that these sounds, which may be considered unusual by the standards of their country’s national language and require special accommodation in writing, are what linguists call clicks. However, use of the word ‘click’ by speakers is not always so innocent. Soon after I arrived in the field to investigate Hadza, I was approached by a Gorowa (Cushitic) speaker who wanted to work with me; he claimed that his language had click consonants like Hadza did. I suspected that he meant ejectives, but he demonstrated several words which he did indeed pronounce with clicks. When I went to interview his mother, who he claimed also had the sounds, it turned out that the words contained ejectives in her speech. Apparently he had methodically substituted Gorowa ejectives with homologous clicks (e.g. /ts’/ with a dental click) in an attempt to find employment.

7  In Colonial orthography, the name K’iche’ /k’itʃeːʔ/ was rendered ⟨Ꜭiche⟩, which Bleek would presumably have transcribed with a palatal click: ⟨ǂiχe⟩. However, he miscopied a diacritic onto the velar cuatrillo ⟨Ꜭ⟩ k’, producing a dental ⟨Ꜯ⟩ ts’, which he then took to be a dental click. He also mistook orthographic ⟨ch⟩ for its German value [χ].


Miller

References

Alexander, Qwa7yán’ak Carl. 2016. Sqwéqwel’ múta7 sptakwlh: St’át’imcets Narratives. UBCOPL / University of British Columbia Occasional Papers in Linguistics 3. Upper St’át’imc Language Culture and Education Society and the University of British Columbia, Vancouver. Sound files available at: epub.linguistics.ubc.ca/wp-content/uploads/2016/09/AlexanderStories1to10.zip & epub.linguistics.ubc.ca/wp-content/uploads/2016/09/AlexanderStories11to20.zip. Archived at web.archive.org.
Bleek, Wilhelm Heinrich Immanuel. 1862. A Comparative Grammar of South African Languages, part I: Phonology. London: Trübner & Co.
Boozer, Donald. 2007. “Dritok: The Sound of No Voice Speaking.” 2nd Language Creation Conference, UC Berkeley, 2007 July 7–8. Podcast URL: podcast.conlang.org/2009/02/dritok-the-sound-of-no-voice-speaking. Archived at web.archive.org.
Cust, Robert Needham. 1883. A sketch of the modern languages of Africa, volume 2. London: Trübner & Co.
Dammann, Ernst. 1950. “Einige Notizen über die Sprache der Sanye (Kenya).” Zeitschrift der Eingeborenen-Sprachen 35: 227–234.
Darksun Productions. n.d. “Tenctonese.” URL: www.darksun-productions.de/alien nation/tenctonisches Woerterbuch.html. Archived at web.archive.org.
Demolin, Didier. 2016. “Clicks, stop bursts, vocoids and the timing of articulatory gestures in Kinyarwanda.” Studies in African Linguistics 45 (1–2): 141–160.
Dempwolff, Otto. 1916/17. “Beiträge zur Kenntnis der Sprachen in Deutsch-Ostafrika. Teil 12: Wörter der Hatzasprache.” Zeitschrift für Kolonialsprachen 7: 319–325.
Fuchs, Susanne, and Blake Rodgers. 2013. “Negative intraoral pressure in German: Evidence from an exploratory study.” Journal of the International Phonetic Association 43: 321–337.
Gathercole, Tim. n.d. “Alien Nation Encyclopedia.” [Based on a glossary Peter Chambers put together from the television scripts. Expanded by Gathercole (p.c. 2019) from the movie scripts and the novels.] URL: kaioken.co.uk/an/anindex.htm. Archived at web.archive.org.
Gregersen, Edgar. 1977. Language in Africa: An Introductory Survey. New York: Gordon and Breach.
Hale, Kenneth. 1973. “Deep-Surface Canonical Disparities in Relation to Analysis and Change: An Australian example.” Current Trends in Linguistics, volume II: Diachronic, Areal, and Typological Studies, ed. Thomas Sebeok. The Hague: Mouton, pp. 401–458.
Hewlett, Nigel, and Janet Beck. 2005. An Introduction to the Science of Phonetics. Mahwah, NJ: Lawrence Erlbaum.
Jones, Daniel. 1911. The Pronunciation and Orthography of the Chindau Language. London: University of London Press.


Kim, Bo Yeon. 2019. Twitter post, @extspace, January 07.
Klaproth, Jule. 1823, 1836. Voyage au Mont Caucase et en Géorgie, volume II. Paris: Librairie de Charles Gosselin.
Ladefoged, Peter. 1964. A Phonetic Study of West African Languages. Cambridge: Cambridge University Press.
Malloy, Michael. 2008. Hadza Dictionary. Digitized files from field work, 1963 and 1966.
Obst, Erich. 1912. “Von Mkalama ins Land der Wakindiga.” Mitteilungen der geographischen Gesellschaft in Hamburg 27: 1–45.
Tutschek, Karl. 1844. A Dictionary of the Galla Language. Munich: Lorenz Tutschek.
Tutschek, Karl. 1845. A Grammar of the Galla Language. Munich: Lorenz Tutschek.
Zeitvogel, Karin. 2009. “Parlez-vous Na’Vi?” The Telegraph (London), December 18.

General Index Abercrombie, David 98 Abrams, Kenneth 189 abrupt release 4, 6–7, 10, 25, 48, 134, 180, 189, 201, 249, 250n, 254–256, 266, 297–302, 321, 323, 329, 331, 336, 340n1, 341 accompaniments, (overview of) 22–36 accompaniments, dorsal ejected (af)fricated clicks 2, 23, 29, 32, 34–35, 104–105, 112–113, 118, 132, 134, 180, 305–306, 309–310, 321–323, 345, 349–352 fricated dorsal click release accompaniments 27–29, 32, 34–35, 51, 75, 92, 101, 103–105, 107–109, 111–114, 117–119, 120, 132, 134–135, 141, 180, 183, 192, 296, 305, 310, 315, 321–323, 327, 341, 349–352 linguo-glottalic click 29, 32 uvular (constriction in click) 3, 23n, 27–35, 51, 87, 94, 100, 102–104, 107, 112–113, 132, 210, 219–222, 224–232, 257–261, 265–266, 287, 306, 316–317, 395, 400, 429 uvular frication 27, 29, 34–35, 51, 102–105, 107–109, 112–113, 132, 134, 180, 183, 296, 327 velar (constriction in clicks) 3, 23, 26n, 28–29, 31–35, 78, 82, 97–98, 102, 106–107, 139, 162, 166, 180, 210, 217–219, 222, 226–227, 231–232, 257–258, 260–261, 266, 400 accompaniments, laryngeal 23–27 aspirated, voiced 28, 31, 35n, 103–106. See also delayed aspiration aspirated, voiceless 23–24, 26–27, 31, 35, 47, 74, 81–83, 99, 102–106, 131–134, 139, 143, 162, 191, 216, 223–224, 296, 304, 314, 320–321, 345, 349, 406, 418n, 419. See also delayed aspiration breathy aspiration 28, 35, 151. See also delayed aspiration breathy voiced 23, 262–264, 357–358, 366, 378

devoiced 23, 26, 357, 277 ejective clicks 24–26, 31, 34–35, 102–103, 132, 150, 257–258, 264–267, 316, 321. See also accompaniments, dorsal > ejected (af)fricated clicks glottalized click 2, 24–28, 93–95, 103, 105, 108, 112–113, 118, 119n, 132, 216, 220, 250, 257, 264–267, 275, 305–306, 310, 315–317, 321–323, 325–328, 343, 345, 349, 391, 392, 394, 406–407, 418–420, 440. See also nasal venting plain 2, 23–24, 28n, 29, 31, 33, 47, 99, 103, 108–109, 111–112, 216, 220–222, 304, 314, 316, 349–352, 391–392, 394, 405, 407, 419 pre-voiced accompaniments 23, 24, 26n, 28, 35n, 94, 104, 110 slack-voiced 23, 148, 150–152, 262–264, 419 unaspirated 30–31, 81–82, 103, 106, 216, 391–392, 405–407, 419. See also accompaniments, plain voiced (oral) 2, 4, 14–15, 17, 23–24, 28, 31–32, 34, 40, 47, 80–83, 95, 103–106, 108–111, 129–132, 139, 141, 143–152, 159–162, 165–166, 168, 172, 191, 197, 216, 260–263, 266, 275, 277–282, 284, 296, 304–306, 310, 314–316, 327, 340–341, 346, 349, 350–352, 357, 366, 391–392, 394, 405, 418–420, voiceless 1–2, 4, 17, 23–26, 28, 31, 33–34, 35, 40, 45, 47, 80, 82, 95, 102–103, 105, 108–110, 114, 118, 132, 137, 150, 159–162, 165, 172, 191, 192n, 194, 216, 220, 223, 224, 227, 229, 250, 260, 266, 281–282, 284, 306, 340, 357–358, 366, 377, 391–392, 402–406, 411, 419, 420n, 430n. See also accompaniments, laryngeal > aspirated, unaspirated accompaniment, labio(velar)ized 23, 50, 229 accompaniment, epiglottalized 23, 34, 110n

446 accompaniments, nasalized 4–5, 24, 28, 35–36, 74, 88–89, 93–97, 104–106, 108–109, 114–116, 130, 149–150, 162, 165, 167, 172, 174, 192, 214, 229–230, 250, 257, 260–262, 345, 357, 405–408, 411, 428 aspirated nasal (= delayed aspiration, voiced or voiceless) 25–28, 35–36, 74, 93n, 103, 105–106, 108, 119, 132–133, 139, 216, 296, 327, 341, 343, 345, 349, 406. See also nasal venting glottalized voiceless nasal. See glottalized click, under accompaniments, laryngeal pre-glottalized nasal 24 preglottalized voiced nasal click 24, 31, 103, 105, 394 prenasalized (voiced, voiceless) 26n, 28–29, 31–32, 82, 82n1 (prenasalized voiceless aspirated), 82n2, 82–83, 83n1, 102–103, 260–262, 275, 277–278, 280–285, 289, 305, 310, 315, 320 voiced nasal 4–5, 28, 40, 46–47, 80–82, 83n1, 88–89, 93–95, 103, 105, 108–109, 116, 118, 130–133, 135–136, 139, 143–147, 159, 161–162, 165, 172–173, 216, 221, 223–224, 260, 262–263, 275, 277–289, 305, 310, 315–317, 329, 349, 358, 366, 391, 394, 405, 418–420 voiceless nasalized 24–26, 28, 35, 93–95, 103, 105, 108, 159, 160–162, 165, 227–229, 250, 260, 391, 406. See also aspirated nasal, glottalized click acoustic acoustic characteristic 4, 7–8, 10, 21–25, 31, 74–75, 179–184, 217–218, 245, 249–250, 262, 267, 366, 370, 392, 399, 426, 430. See also burst characteristics, place of articulation acoustic features 23, 29, 45, 74–76, 78, 80–83, 85, 87, 89, 91–92, 94, 97–100, 120, 148, 150–151, 162, 183, 220, 250, 297, 306, 310, 329, 340n1, 341, 349. See also burst acoustic measurement 44, 85, 137, 247– 268, 271–272, 298, 368–369, 374–378 acoustic modelling 45 acquisition 47–48, 75, 197–200, 361–362, 389, 410, 418–420

General Index advanced tongue root 9, 76, 218–219, 222–225, 231. See also tongue retraction aerodynamic 25, 44, 46, 93–95, 116, 210, 406. See also airflow Afido, Pedro 40 affricate 3, 28–29, 47n, 47–48, 81, 92, 92n, 96n1, 101–102, 108–111, 114, 145, 214, 317–318, 326–327, 392–393 Agwuele, Augustine 433 airflow 2n1, 3–4, 10, 20, 22, 24–26, 35, 94–95, 106, 132, 260, 262, 357, 405–407, 429 airstream mechanism 2–4, 29, 94–95, 97– 100, 166–167, 201, 210n, 214, 228–229, 233, 264, 275, 356–358, 366, 378, 392, 406, 427, 429 Alexander, Qwa7yán’ak, Carl 438 allophone (allophonic, allophony) 6, 8, 16, 28, 32, 96n2, 173–174, 181n2, 182, 211, 224, 244–245, 279, 283, 286, 310n1, 318, 425, 427, 434 Almeida, António de 295, 323–324, 326 alternation 75–87, 120, 156, 165–174, 281, 284, 287 Altmann, Christian F. 194 alveolar 9, 50, 82n2, 108, 111, 145, 180, 191, 197, 317, 322, 341, 345–346, 350, 393, 419, 439–440 alveolar click 1–10, 14, 16, 18, 20–21, 23, 30–32, 31n1, 33–34, 46–49, 49n1, 50–51, 80, 86–87, 89n1, 89n2, 92, 95–96, 96n2, 103–105, 108, 111, 131–135, 140, 144–145, 147–148, 158, 161–162, 164–165, 168, 170, 180–182, 191–193, 195, 197, 211, 214, 216–220, 226–232, 249–250, 254, 256, 260–261, 263, 266–267, 276, 283, 291, 293, 297–311, 321–325, 327–330, 338– 341, 344–350, 352–353, 357, 393–396, 400, 411, 418–419, 420n, 429, 432–435, 439. See also postalveolar click alveolar ridge 8, 10, 16, 22, 216, 219, 221, 228, 230, 384–385, 402, 432 alveolopalatal click 6 alveo-palatal click 211 amplitude 6, 8, 21, 25–26, 46, 132, 134, 180, 182n, 212, 248, 250, 264, 271, 298, 407, 426, 430, 432 anatomical/morphological variation 230, 232–233, 384–386, 408n, 409–411

General Index Anders, H.D. 20n Anderson, Anne H. 246 Anderson, Victoria B. 16 Angola 36–37, 294, 324, 336–338, 341, 344, 353–354 Anonby, Erik J. 428–430 anterior seal (anterior constriction location) 6–8, 17–19, 22, 92, 95, 97, 211, 217–219, 221–224, 226–232, 250, 311, 357, 395–397, 399–400, 402 Antoniou, Mark 184, 189 Aoyama, Katsura 185–186 apical 4, 6, 8–10, 85, 191, 216–217, 219, 227, 232, 249, 255–256, 258, 295, 395–396, 400. See also subapical, tongue tip apicolaminal 4, 6, 8, 211, 221, 230 apicosublaminal 16 Apraxia of speech (AOS) 47, 356, 358, 360 archaic 245 areal distribution. See linguistic area areal feature. See linguistic area articulation. See place of articulation; see also closure, constriction, lingual kinematics, lingual posture, posterior constriction location aspiration 29, 74–75, 81, 83 (deaspiration), 104, 106, 108–109, 118, 130–133, 151, 320–321 (loss of aspiration), 321, 341, 345, 357, 401, 406, 418n. See also accompaniments, aspirated; delayed aspiration assimilation 77n, 81–88, 91, 320, 326–327. See also perceptual assimilation audible dorsal burst 16, 27, 29, 107, 132, 162 audio 193–194, 198, 212–214, 232, 268–272, 368, 387 Australia 36, 181–182 autoscaling 248 Avery, Robert A. 185, 195 Aycard, Pierre Benjamin Jacques 40 Aylett, Matthew 243 back-released velar click 3, 422–423, 425, 427–431, 434 Back Vowel Constraint (BVC) 86–87, 92, 120 backchanneling 423n2, 425–427, 431, 434–435 Bailey, Richard 40, 121

447 Bakker, Peter 48 Ball, Martin J. 9n, 48, 91, 430n Barbieri, Chiara 330 Barnard, Alan 46, 129 Bastin, Yvonne 171, 173 Battistella, Edwin 420 Baucom, Kenneth L. 277 Baum, Shari R. 386 Baumbach, Erdmann J.M. 39–40, 162, 164, 182 Beach, Douglas M. 5, 10, 38, 43, 48, 74, 90, 92n, 101, 108–109, 136, 139, 210–211, 214, 217, 296, 399–400, 411 beatboxer 44, 212, 225–230, 397 Beck, Janet 441 Beddor, Patrice S. 52, 182 Bedore, Lisa M. 48 Belin, Pascal 185, 194 Bell, Arthur 24, 38, 45, 95n Bennett, Wm. G. 23, 42, 80n, 82, 89–90, 93–95, 184 Benor, Sarah B. 46 Bernhardt, Barbara H. 357 Best, Catherine T. 42, 52, 179n, 184–185, 189–200, 409 Bever, Thomas G. 189 Bhaskararao, Peri 402 bilabial 45–46, 48, 102, 108, 116, 191, 194, 197–199, 223, 228–229, 231 (see also labial, lips) bilabial click 1–2, 4, 6–8, 45–46, 48, 50–51, 82n2, 84–86, 91–92, 95–96, 98, 103–104, 180–183, 192–193, 198–199, 211, 249, 250n, 251, 253–254, 293, 297, 357, 408, 422–423, 429, 432, 435 Binder, J. R. 185, 194 Bleek, Dorothea F. 36n3, 43, 50n, 79, 101, 338, 341n, 344, 353 Bleek, Wilhelm Heinrich Immanuel 7n2, 43, 74n, 440–441 Blevins, Juliette 115 Boersma, Paul 247, 405, 423 Bohn, Ocke-Schwen 189 Boozer, Donald 439 borrowing 1, 36, 40–41, 49–50, 86–87, 90, 115–116, 164, 170–171, 173, 202, 279, 281–282, 284–285, 287, 300–303, 305, 311, 330 (see also loan words)

448 Bostoen, Koen 40, 79, 157, 162, 171, 173, 182–183 Botswana 36, 39, 129–131, 192, 260, 292, 294, 300, 302–303, 312, 336–337 Bouquiaux, Luc 244 Bowern, Claire 145 Boyce, William B. 4n2 Bradfield, Julian 3, 6n2, 84, 99–102, 114n, 211, 410, 430n Bradshaw, Mary M. 129–130 Brancazio, Lawrence 194 breathy 11, 24, 26, 28, 35, 151, 262–264, 296, 357–358, 366, 378 Breckwoldt, G. H. 4, 43 Bresch, Erik 212–213, 232 Brooks, Vernon B. 360 Brosnahan, Leonard Francis 386 Brown, Justin 46 Bruderer, Alison G. 409 Brugman, Johanna 6, 11–12, 22, 24, 29, 35, 38, 44, 52, 88–90, 108–109, 211, 214, 220, 232, 407 Brunner, Jana 386 buccal 22 Burger, Sharon 47, 361–362, 418–420 burst characteristics (see also release) 4–5, 8–10, 13, 16n1, 21–22, 25, 27–32, 79n2, 92n, 107, 160–161, 180, 194, 201, 243, 249–256, 258–261, 264–267, 340–341, 368, 370, 374–377, 380, 386, 391–392, 406, 424, 426, 430, Cameroon 430, 435 Campbell, Lyle 168 cardioid 271–272, 387 Carreiras, Manuel 196n1 Catford, J. C. 3, 101, 317 cavity 2, 18, 22, 25, 92, 95, 132, 134, 180, 213, 216, 218–219, 223, 227–230, 232–233, 298, 357–358, 395, 399–401, 427–430 center of gravity 10, 32, 247, 250–251, 254, 257 Central Africa 130, 422, 433–435 Chad 422–423, 430–431, 435 Chebanne, Andy M. 35, 52, 130–133, 135, 139, 144, 148, 152, 276, 279, 282, 300, 330 Chen, Su-I 220

General Index Chen, Yiya 139, 150–151 children and clicks 4n2, 47–48, 199–200, 313, 344, 353, 361–362, 418–420 Chladni, E.F.F. 74n Chomsky, Noam 92n, 97, 101 Clements, George N. 99 clicks click accompaniment (see accompaniment) click clusters (see cluster) clicks, false reports 438–441 clicks, in beatboxing (see vocal percussion) click insertion (see also emergent click, discourse marker) 80, 156, 171–174 click languages 1–2, 36–42, 94, 157, 162, 183, 438n2 click loss (see click replacement) clicks, mechanical 189–191 click mechanism 2–4, 98–99, 134, 166, 210, 216–225, 232, 357–358, 366 click replacement (see also consonant substitution, variant pronunciations) 47, 49, 78–80, 131–136, 140, 142–146, 149–150, 156, 162, 170–174, 182–183, 277, 283, 291, 294, 297–331, 336–354, 419–420 click sound errors in apraxia of speech 356, 358, 368–373 click symbols 1–9, 23–26, 43, 276–278, 296 (see also transcription) click type 2–22, 28–29, 47–51, 74, 95–97, 99–101, 109, 131, 133–135, 144, 159, 162, 192, 214, 217–218, 248–259, 327, 340n2, 341, 345, 394–395, 397, 406–407, 429 (see influx, alveolar click, bilabial click, dental click, lateral click, palatal click, retroflex click) clicks, vocal percussive (see vocal percussion) closure (see also constriction) 2, 3n, 7–8, 17–18, 22–24, 26–29, 74–75, 80, 82–83, 85, 89, 92n, 93–95, 97–98, 102, 106, 107n, 109, 114, 119, 132, 159–162, 166, 180, 222–224, 227, 230, 260–262, 264, 266, 279, 281, 297–299, 311, 340, 357, 398, 405–407, 411, 426, 428–430, 432–433

General Index cluster (consonant) 23, 24n, 29, 31, 35, 91, 99–120, 132, 211, 214, 257–259, 261, 266–267, 439 coarticulation 6–7, 16, 18, 22–25, 30–34, 45–46, 51, 75, 84–87, 92, 104, 106, 120, 161, 182, 191, 194, 201, 216–219, 221–225, 230–231, 259, 311, 320, 326–327, 357, 360, 377–378, 380, 394, 406 coarticulatory influence 221, 224 coda 88, 90–91 cognates 50–51, 118, 133, 140, 143–148, 170–172, 276–288, 299, 301–306, 313, 316–317, 322–328, 338–357 collaborative task 244, 246–247, 267 Collins, Chris 24, 38, 45, 52, 76–77, 95n, 132–133, 139, 144, 148, 151, 300 community-based 244, 267 comparative-historical research 48–51, 129–152, 245, 275–289, 291–331, 336–354 concave tongue body shape 6, 8, 18, 385 configuration 98n, 211, 216, 230, 232–234, 400 conlangs 438–439 consonants consonants in inflectional suffixes  418n consonant inventories (see segment inventory) consonant substitutions (see also click replacement, variant pronunciations) 47–48, 79, 89n1, 91, 93–94, 113, 131, 172, 358, 363, 368–370, 374–375, 377–379, 391, 397, 401–404, 406–408, 411, 418, 441 consonant-tone interaction (see also tonal depression) 129–131, 141, 1 46 consonant-vowel coarticulation (see coarticulation) constriction, vocal tract (see also closure) 2–4, 6–10, 16–19, 22, 25–26, 28–29, 45, 92, 100, 107, 194, 210–213, 216–232, 392 consultant 131, 242, 244–247, 268–271, 418, 423 continuous seal 226, 230 contour, tonal 136–138, 141–142, 146, 149 contour segment 29–31, 100, 110, 214

449 contrast 2, 5, 8–10, 20–21, 23–24, 27–29, 31–32, 49–50, 82, 94–95, 106–107, 109, 112, 136–137, 158–159, 161–162, 168, 179, 181, 184, 186–201, 211, 214, 218, 220, 244, 261, 275, 297–298, 310, 321, 340–341, 350, 409 coordination 216, 219–220, 231, 361, 387, 391, 400, 402, 407, 411 Cootes, Timothy F. 412 Copland, B. D. 43 correspondence sets (see cognates) coronal 8–9, 16, 45, 47, 78, 85–87, 111, 114–115, 117, 198–199, 216–217, 219, 223–224, 226–227, 230–233, 400, 402, 407, 411 coronal posture 232 corpora 214–216, 221, 226, 276 Corum, Claudia W. 220 creaky 27, 394 Crowley, Terry 145 Cruttenden, Alan 211 Cust, Robert Needham 439 Cutting, James E. 194 Dammann, Ernst 39, 438n2 Dart, Sarah N. 16 Dave, Radhekant 16 Davey, Anthony S. 39 Davis, William J. 4n2 De Gelder, Beatrice 194 de Luna, Kathryn 157 de Vos, Connie 48 decomposition 101, 104, 136, 211 Dediu, Dan 7, 44–45, 48, 384–386, 408 Dehaene-Lambertz, Ghislaine 185 Dekle, Dawn J. 193 delayed aspiration 25–28, 35–36, 74, 93n, 103, 108, 119, 132–133, 139, 216, 296, 327, 341, 343, 345, 349, 406 delayed dorsal release 29, 92n, 430 Delvaux, Véronique 100 demographic 242 Demolin, Didier 46, 100, 384, 386, 408, 438n2 Dempwolff, Otto 43 denasalization (of vowels) (see also prenasalization, loss of) 280, 305, 320, 326

450 dental 1–9, 46, 85, 102, 191, 311, 318, 326, 389 dental click 3, 4, 6–7, 9, 20–21, 32, 41, 46–48, 49n1, 49n2 50–51, 76–77, 79n2, 80, 82n2, 84–87, 91–96, 103–104, 108, 111, 131–135, 141, 144, 158–166, 170–173, 180–182, 191–193, 195, 197–199, 211, 214, 216–224, 226–227, 229–232, 234–235, 249, 250n, 252–253, 257–259, 264–267, 276, 283, 285, 293, 297–299, 301, 310n2, 321–324, 327–328, 338, 344, 344n2, 350–353, 357–358, 366, 369, 378–379, 387–388, 391–398, 400–401, 405–407, 411, 418–420, 422–425, 429, 434–435, 439–441 (see also labial-dental, labiodental, teeth) dentition, (dental arch, dental fillings, dental artifacts) 213, 233, 389, 390, 411 (see also teeth) depressors (see tonal depression) Deumert, Ana 46 devoicing, loss of voicing 26, 91, 116, 152, 327, 350n, 351–352, 358 Di Paolo, Marianna 243 diachronic patterns 28, 42, 48–51, 78, 133–136, 275–289, 145–148, 301–331 diacritics 2, 5–6, 9, 23–24, 31, 102, 441n Dickens, Patrick J. 26n, 28, 35, 40, 110, 116n, 171, 244, 276, 281, 293, 339, 341–343, 347–349, 351–352 Dieckmann, Ute 170 Diehl, Randy L. 150, 185 Diepstra, Heidi 6, 404 diffusion of clicks 48–49 digital archive 43–44, 272, 294 Dinnsen, Daniel A. 182 Diouf, Jean-Léopold 434n diphthong 23, 50, 116 direction of airflow/airstream 2n, 4, 6, 10, 20, 22, 366, 429 discourse 243–244, 247, 427 discourse marker 45, 182 disordered speech 47–48, 90–91, 356–380 Dogil, Grzegorz 28, 39, 45, 91n, 362 Doke, Clement M. 4–5, 9–10, 13, 16, 20–21, 38–39, 43, 74n, 80–82, 89, 95, 101, 106–107, 113, 181n, 210–211, 220, 231, 297, 338–340, 341n domed tongue body shape 6, 8, 10

General Index Dornan, Samuel S. 293 dorsal (see also accompaniments, uvular, velar) dorsal accompaniment 28–35 dorsal gesture 76, 85, 213, 221–222, 400, 402 dorsal place contrast 31, 102–103, 258–259 dorsal seal / constriction 3n, 6n, 8, 16–17, 25–26, 29, 45, 210, 217, 221–224, 226–231, 384, 392 Downing, Laura J. 130 downsampling 248 Dressler, Wolfgang U. 168 drum 225–226, 229 du Plessis, Ménan 6, 38 Duffy, Joseph R. 358–361 duration 10, 13, 24, 26, 32, 44, 245, 249, 253, 263, 281, 340–341, 358–359, 363, 368–370, 374–380, 389–390, 410 dyad 243, 246, 270 Dye, R. H. 189 dynamic 7, 85n, 212–213, 230 dynamic range 248 dynamic microphone 271 Eaton, Helen 45 echolocation 48 Eckert, Penelope 242 Edgar, John T. 40 Edward, Mary 48 efflux (see click accompaniment) egressive 139, 156, 166–167, 174, 180, 264, 391, 406 Ehret, Christopher 41, 96n2, 324 Eiseb, Eliphas 35, 214, 244, 276, 293, 295, 328 ejective ejectives 3, 25, 81, 83, 97, 99, 110–111, 191, 192n, 197–199, 201, 310, 391–393, 396, 401, 403, 406–407, 411, 419, 438–441 ejected (af)fricated clicks 32, 34, 104, 118, 132, 305–306, 309–310, 321–322, 345, 349–352 ejective clicks 24–26, 31, 34–35, 102–103, 132, 150, 257–258, 264–267, 316, 321 Elderkin, E. D. 2n2, 22, 24, 28, 38, 41, 52, 90n2, 93n, 277–279, 284, 289, 310n1, 324, 326

General Index Elders, Stefan 429, 431 electropalatography (EPG) 22, 44, 46, 380 elicitation 242–247, 267–268, 390–391, 407 Elsner, Micha 9, 21, 44 emergent click 45–46, 100, 182n, 408 Engelbrecht, J.A. 49n2 Engstrand, Olle 182, 284, 408, 420 EPG (see electropalatography) epiglottalized 23, 34, 88, 110n epiglottalized click 34 epiglottis 217, 219, 223 Erzurumlu, Reha S. 409 Essing, D. 20n ethnonym 242 Everett, Caleb 386 Everett, Daniel 292 evolution, of language 42 Ewan, William G. 150 Ewen, Colin J. 98n Exter, Mats 29, 32, 38, 45, 88n F0 25, 129, 136–139, 141–142, 146, 148–152 F0 lowering 24, 129–132, 136–144, 146–150, 281–282, 284–285 F0 raising 136, 150 Fairbanks, Grant 150 Fallon, Paul D. 99 feature(s) 23, 29, 45, 74–76, 78, 80–83, 85, 87, 89, 91–92, 94, 97–100, 120, 148, 150–151, 162, 183, 220, 250, 297, 306, 310, 329, 340n1, 341, 349 Fehn, Anne-Maria 36, 38, 48–49, 52, 133, 182, 184, 292–296, 299–300, 301n, 304, 310n2, 314, 324, 326–327, 329, 336, 339–340, 346, 351–354 Finch, Kenneth B. 17 Finlayson, Rosalie 54, 211 Fisch, Maria 162, 170 Flege, James E. 185–186, 201 forensic phonetics 45 formal 78, 95, 98, 100, 120 formality 243–244 formant 30–34, 191, 248, 259 formation 187n, 134, 217, 219, 221, 226–227, 231, 400–401, 411 Forrest, Karen 359 forward-released lateral click 20–21 Fowler, Carol A. 193–194, 198

451 free variation 41, 95–96, 162, 173, 351–352 Freeman, Valerie 246 frequency 8, 10, 31n1, 90n1, 183, 200, 212, 243, 248, 257, 260, 269–272, 303, 312–313, 369, 392–394, 399, 407, 426, 430 (see also fundamental frequency) frication (see fricated) fricated fricated (anterior) click release 4, 6–7, 8–10, 15, 20, 92, 252, 298–299, 341, 357, 422, 424, 426, 430, 432–433 fricated dorsal click release 27–29, 32, 34–35, 75, 92, 101, 103–105, 107–109, 111–114, 117–118, 120, 132, 134–135, 141, 180, 183, 192, 296, 305, 310, 315, 321–323, 327, 349–352 fricated palatal click 7, 9–13, 50n, 95, 256, 296, 299, 301n, 337–338, 341–346, 350, 352–353 fricated retroflex / post-alveolar click 10, 15, 20 fricatives 8–9, 81, 88, 102, 108–109, 113–115, 180, 184n, 191–192, 195, 197, 296, 306, 316, 327, 392–393, 418 fronted, fronting 143, 145–146, 221, 282, 327, 329 Fuchs, Susanne 46, 100, 386, 438n Fulop, Sean A. 4, 7, 10, 39, 43, 52, 89n3, 90n1, 210, 262, 264 functional load 41–42, 49, 158, 170, 179n, 181, 243, 420 fundamental frequency (F0) 25, 129, 136–139, 141–142, 146, 148–152 Gal, Susan 46 Gallagher, Gillian 99 Gandour, Jack 150, 196n1 gap in inventory 109, 131, 133–134, 136 Garellek, Marc 25 Gathercole, Tim 439 Gati, Joseph S. 185 Gerlach, Linda 24, 32, 38, 94, 293 gesture 2, 22, 48, 76, 85, 98n, 134, 150–151, 182, 198–199, 216–217, 219, 221–222, 227, 231, 268, 320–321, 359, 361, 364, 384, 400, 405–406, 411, 422, 423n1, 433–435 Ghosh, Satrajit S. 409 Gibbon, Fiona 48

452 Gibson, Gordon D. 295, 324, 326 Gick, Bryan 18, 130, 396, 405, 411 Gil, David 40, 45, 182, 408, 422, 431, 435 glossonym 242 glottal glottalic (airstream) 3, 29, 34, 97, 99, 118, 201, 210n, 391 glottal constriction 24, 95, 150–151, 216, 262, 407 glottal stop 24, 77, 91, 102–103, 108, 112, 118, 119n, 150, 233, 249, 252, 254–255, 266, 306, 316, 321–322, 392–393, 406–407, 411, 418, 422, 438 glottal frication 25–26, 129–130, 147, 151–152, 327, 329 glottalization 24, 28, 93–95, 104–105, 108, 192n, 220, 296 glottalized click 2, 24–25, 27–28, 93–95, 103, 105, 108, 112, 118, 119n, 132, 216, 220, 250, 257, 264–267, 275, 305–306, 310, 315–317, 321–323, 325–328, 343, 345, 349, 391, 392, 394, 406–407, 418–420, 440 linguo-glottalic click 29, 32 preglottalized 31, 321, 326–328, 406 preglottalized voiced nasal click 24, 103, 105, 394 glottis 3, 74, 98–99, 114, 151, 220, 228–229, 281, 233 Goad, Heather 115 Gold, Erica 45 Goldstein, Louis 51, 189, 198 Golston, Chris 262, 264 Gowlett, Derek F. 220 Greenberg, Joseph H. 36n3, 89 Gregersen, Edgar 440 Grenoble, Lenore A. 422, 423n1, 429, 431, 433–434 Grimbeek, Jackie 356, 379 Gruber, Jeff. 38 Grundy, W.M. 46, 52 Guastavino, Catherine 189 Guion, Susan G. 185–186 Güldemann, Tom 29n, 35–36, 38, 42, 79, 89, 101, 104–114, 116–117, 129, 181–183, 214, 291n, 292, 294, 299–300, 322–324, 326–327, 336n1, 336n2, 337, 354

General Index Gunnink, Hilde 5, 28, 37, 39–42, 48–49, 79–80, 95, 157, 162, 170–171, 173, 179n, 181–182, 183n, 184 Guthrie, Malcolm 157, 220 Guttural OCP 34 Gxilishe, Sandile 47 Gygi, Brian 189 Haacke, Wilfrid H. G. 4, 24–25, 35, 43, 52, 214, 244, 276, 293, 295, 327–329 Hafter, E. R. 192 Hagman, Roy S. 90 Hale, Kenneth 36, 93, 182, 410, 438n Halle, Morris 92n, 97–99, 101, 150–152 Hallé, Pierre A. 189, 192n Hallowes, D. P. 49, 101 Halpert, Claire 81 Hamann, Silke 16 Han, Mieko S. 150 Hansson, Gunnar Ó. 83n2, 89 hard palate (see palate) Harley, Matthew 6, 46 harmonic 262, 264 harmonicity 247, 262–264 harmonics-to-noise ratio 262 harmony, of consonants 83, 87–89 Hartmann, William Morris 189 Hay, Jennifer 243, 246 Hayes, Bruce 181 Heath, Jeffrey 244 Heikkinen, Terttu 21, 287, 349 Heine, Bernd 9, 38, 50, 95, 244, 287, 291, 293, 297, 306n, 308, 336–353 Herbert, Robert K. 47, 49, 211, 357, 361, 377, 418, 420 Herbert, Thomas 43 Heselwood, Barry 48, 89n1 Hewlett, Nigel 441 Hickok, Gregory 195 Hildebrandt, Kristine A. 130 Hinton, Leanne 40 hip-hop 225–230 historically nasal consonants 143–145, 305, 310, 320, 326, 349 hlonipha 49 Hobley, June 130 Holliday, Jeffrey J. 12, 38 Hombert, Jean-Marie 150

General Index Honda, Masaaki 386 Honeybone, Patrick 298 Honken, Henry 50, 132–133, 278–279, 289, 297, 323, 336–338, 353 Hoole, Philip 151 House, Arthur S. 150 Howard, Mary F. 189 Howard, Sara J. 48, 91 Hoyne, Karen L. 200 Hudu, Fusheini 44 Hulst, Harry G. van der 119 Hume, Elizabeth 181 Hunziker, Daniel A. 24, 28 Hussain, Qandeel 16n3 Hyde, Charlene 94 Hyman, Larry M. 41, 137 hyperarticulate 246 Hertz / Hz 212, 248, 269, 270–272, 394, 423–424, 426, 430, 433 idiolect (see individual variation) Ilunga, Mpunga wa 41 implication 75, 86, 93–96, 303, 305, 110–113, 149, 180, 189, 211, 279, 293, 340, 409, 420 impulse 134, 254 inaudible dorsal burst 25, 29n, 107, 132, 260 individual variation 32, 38, 41, 48, 79–80, 162, 211, 303, 305, 310n, 313, 321, 329, 340, 359, 361, 386, 408–409 (see also variation) influx 25, 48, 74, 99, 131n, 180, 227–228, 275, 283, 291, 297–303, 306, 311–313, 316n, 317–318, 320–324, 330, 336, 338, 340n1, 344, 349, 351–353 (see click type, alveolar click, bilabial click, dental click, lateral click, palatal click, retroflex click) informant 213–215, 221, 225–226, 232, 234 (see consultant, participant) ingressive 3, 166, 210–211, 214, 216, 220, 225, 228–230, 232, 357, 391, 427, 429, 432–433 Inskeep, Adi 329 instrumental phonetics 17, 22, 38–39, 42–46, 220, 260 (see aerodynamic, electropalatography, MRI, ultrasound, X-ray)

453 intensity 10, 21, 169, 180, 201, 213 (pixel intensity), 243, 247–249, 250n, 271, 298, 340, 353, 407, 430 interdental click 6 International Phonetic Alphabet (IPA) 2n2, 4, 6–9, 21, 24, 174, 225, 229, 234–235, 296, 368, 403–404, 439n intervocalic position 90, 387, 406, 410 (see also medial position) intonation 243 inventory (see segment inventory) IPA (see International Phonetic Alphabet) Irvine, Judith T. 46 Iskarous, Khalil 7, 38, 44 Iverson, Paul 185–186 Jacobson, Monica C. 364 Jacottet, Édouard 39 Jakobson, Roman 45, 420 jaw 211–212, 216, 218, 221, 223, 227–229, 281 Jeel, V. 150 Jessen, Michael 23–24, 419n1 Joanisse, Marc F. 185 Job, Sylvanus 327–329 Johnson, Keith 39, 45, 181 Jones, Brian T. B. 170 Jones, Daniel 43, 74n1, 438n2 Jones, Kerry 36n1, 52 Jun, Jongho 100 Kagaya, Ryohei 38, 210, 400 Kalahari Basin 36–37, 41–42, 49, 292, 294, 330, 354 Katsamanis, Athanasios 232 Kenya 36–37, 40, 93, 182 Kertesz, Andrew 365 Khaila, Harveen 48 Khin Khin, Aye 182 Khumalo, James Steven Mzilikazi 83, 89 kHz 212, 248, 269, 270–272, 394, 423–424, 426, 430, 433 Kilian-Hatz, Christa 28, 38, 170, 244, 276, 279–280, 283–285, 293, 295, 307–309 Killian, Donald 38 Kim, Bo Yeon 439 Kim, Karl H. S. 364 Kim, Yoon-Chul 233 Kingston, John 150, 181

454 Kirchengast, Sylvia 384 Kistler, Doris J. 189 Klaproth, Jule 440 Klingenheben, August von 43 Kluender, Keith R. 185 Knight, Alec 292, 385 Kochetov, Alexei 17 Kockaert, Hendrik J. 220 Köhler, Oswin 4, 211, 285, 291n König, Christa 9, 38, 95, 244, 287, 293, 337–352 Koreman, Jacques 46, 100 Kreitman, Rina 115 Kuhl, Patricia K. 185–186, 187n, 201 Kutsch Lojenga, Constance 130, 137 kymograph 43 L1 acquisition 42, 47–48, 197–198, 361–362, 418–420 L2 acquisition 42, 47–48, 184–186, 201, 410 L2 phonetics 47–48, 184–186, 201, 384–412 labial (see bilabial, lips) labial aperture 223 labial posture 223 labial-dental, labiodental, labialized click 1, 6, 100, 229–230, 423, 432 labialization 85, 98, 100, 229 labialized voiceless nasalized dental click 229 labiovelarized click 23, 50 labialvelar, labio-velar stop 6, 45–46, 98, 100, 181n2, 408 laboratory phonetics 17, 22, 38–39, 42–46, 220 (see aerodynamic, electropalatography, MRI, ultrasound, X-ray) Labov, William 242–243 Ladefoged, Peter 1, 6–7, 10, 16, 22–26, 28–29, 31–32, 35, 38, 43–44, 46, 51, 74, 82, 93n, 94, 98, 102, 104–105, 108, 116, 131, 180–181, 183, 210–211, 214, 216, 230, 242, 245, 260, 292, 297, 357–358, 362, 366, 385, 387–388, 395, 397, 402, 406–407, 438n2 Ladusaw, William A. 429, 430n laminal 3–4, 6, 8–9, 16, 18, 217, 219, 221, 227, 230, 232, 256, 357–358, 366, 395–396 (see also sublaminal)

General Index Lammert, Adam 232, 386, 408n landmarks 137, 214, 221, 226 language evolution 42 Lanham, Leonard W. 41, 49, 101, 139, 211, 361–362 laryngeal 1, 22, 74, 80–83, 91, 97, 99, 104, 120, 151, 213, 220, 233–234, 264, 356–357, 396–397, 400, 406 laryngeal accompaniments 23–28, 214 (see accompaniments) larynx 3, 24, 34, 211–213, 232, 396–397, 401, 403, 406 epilarynx 397 lateral lateral click 1–13, 20–22, 25, 40–41, 46–52, 79–80, 84, 82n2, 86–88, 91–92, 94–96, 96n2, 103–104, 108, 111, 131–135, 141, 144, 158–159, 162–163, 180–182, 191, 192n, 195, 197–198, 211, 214–220, 227–228, 231–232, 234, 249, 253–254, 276, 283, 293, 297–299, 301, 321–322, 324–329, 337–340, 341n, 344, 344n2, 345, 345n, 350–353, 357–358, 362, 366, 369, 377, 379, 392, 408, 418–419, 420n, 422–423, 425–428, 430, 432, 434–435, 440 forward-released lateral click 20–21 lateral fricatives, affricates 9, 48, 96n1, 191–192, 195, 197, 327, 345, 345n, 351–352, 418, 439–440 Le Roux, Mia 356, 380 learning 44, 185–186, 190, 197–199, 201, 364, 384, 386–387, 391, 401, 405, 410–411, 420 (see also acquisition) Lebzelter, Viktor 47n Lederer, Karen 225 Lehiste, Ilse 150 Lemaitre, Guillaume 189 lenition 298–299, 301n, 321, 324, 331, 336, 340–341, 353 (see also weak>click weakening) Lepsius, Karl R. 4 Lesotho 36 Lewis, Philip W. 43, 47, 362, 419 lexical elicitation 234–235, 241–247, 267–268, 344 lexicon 87, 89n3, 135–136, 183, 220, 301, 310–311, 318, 330 Liberman, Alvin M. 185, 409

General Index Lingala, Sajan G. 233–234 lingual lingual consonants (lingual airstream) 3, 97–100, 210–211, 214, 357, 366, 391, 427 lingual kinematics (motion) 134, 218–220, 228, 232, 397–398, 406 lingual posture (articulation) 210, 216–218, 223–229, 232, 400 lingual(-palatal) seal 227–230, 384, 396–399, 401–402 lingual cavity 219, 232 linguistic area 36, 41, 49, 90n3, 120, 292, 294, 327, 330–331, 354, 422, 430, 435 linguo-glottalic click 29, 32 linguo-pulmonic click 29–30, 32 linguo-velaric 210 linguography (see palatography) Lionnet, Florian 2–3, 45, 52, 86n1, 121 lips 4, 6, 22, 76, 198, 199n, 212–213, 223, 228–231, 404, 429, 432–433 (see also labial, bilabial) loan words 20, 41, 49, 86–88, 116–117, 147, 220, 277, 279–280, 282, 285, 287 Löfqvist, Anders 150 Lombardi, Linda 110 Loughnane, Robyn 293 Louw, Jacobus Abraham 48, 49n2, 50, 79, 81 lowering 3, 6, 18–19, 26, 85, 218–223, 227–229, 232, 357, 391, 396–397, 405–406, 427 (for F0 lowering, see tonal depression) Lowit, Anja 356 Luffin, Xavier 182 Lukas, Johannes 429, 431 Lukusa, Stephen T.M. 40, 89n3 MacDonald, John 193 MacMahon, Michael K. C. 4n, 430n Maddieson, Ian 1, 6–7, 10, 16, 22–23, 29, 31n1, 36–39, 41, 43–44, 51, 82, 95, 100, 116, 181, 211, 230, 297, 357–358, 362, 366, 385, 395, 406 Magill, Richard A. 361 Magnetic Resonance Imaging (see MRI) Maho, Jouni F. 157 Maingard, L. F. 293, 329 Malambe, Gloria B. 220

455 Malloy, Michael 439 Manuel, Isata 38 map task 244, 246–247 Maphalala, Zinhle 23, 47 Maran, La Raw 130 Marchal, Alain 46, 182 markedness 97, 117, 377, 418, 420 Marlett, S. A. 115 Marquard, Carina 21, 44, 180 Mathangwane, Joyce T. 152 Mathes, Timothy K. 49n1, 129–131, 135, 144, 148, 150n, 156n, 171n, 276, 279–280, 282, 291n Matisoff, James A. 405 Mattingly, Ignatius G. 185, 409 Mauszycki, C. Shannon 361 maximum constriction 213, 225, 227 May, Lillian 199 Mayer, Jörg 91n, 362 McFarland, David H. 386 McGettigan, Carolyn 195 McGurk, Harry 193–194 McLaren, J. 81–82 McNeil, Malcolm R. 356, 358–361, 363–364, 378–379 McPherson, Laura 244 McRoberts, Gerald W. 184, 191, 197–198, 409 mechanisms of sound production 2–4, 29, 97–99, 134, 166–167, 185–187, 201, 210–211, 216, 225–226, 228–229, 232–233, 357–358, 366, 378 medial position 28, 88, 90–91, 93–94, 350, 418, 439 Memmott, Paul 36 microphone 248–249, 268–272, 387, 391, 423 midsagittal 13, 17, 212–213, 215–220, 222– 224, 227, 231, 233, 385, 390, 397, 404 Midtlyng, Patrick J. 44 Miller, Amanda [= Miller-Ockhuizen, Amanda] 2–3, 6–10, 17–22, 26, 27n, 28–29, 32, 34–35, 38, 43–45, 52, 74, 76–77, 82, 85n, 86–88, 92, 95, 99, 104, 106–107, 110, 113, 116, 120–121, 180, 210–211, 214, 217, 219–220, 229, 297, 338, 340–341, 400, 406, 427 Miller, Kirk 47, 52, 436 Miller, Nick 356

456 mispronounce 48, 79n2, 91, 94, 245 Mkanganwi, K.G. 40 modality 211 Möhlig, W. J. G. 96, 162 Moisik, Scott R. 7, 44–45, 48, 52, 384–385, 397, 408 Monaka, Kemmonye C. 40, 52 Morelli, Frida 115 Morland, Aleksander 46, 100 morphology (linguistic) 78, 81, 83, 90n2, 92, 246, 277, 288, 360 (see also vocal tract morphology) motor complexity and apraxia 361–362, 378, 380 Mowrer, Donald E. 47, 361–362, 418–420 Moyo, Themba 41 Mozambique 36 MRI 1, 3, 6, 22, 44, 195–197, 211–234, 384, 388–390, 395–406, 412 Müller, Nicole 48, 91 multilingualism 220 Muntzel, Martha C. 168 Muysken, Pieter 182 Naidoo, Yugeshiree 47, 361–362 Nakagawa, Hirosi 24–28, 29n, 32n, 35, 38, 43, 52, 86–88, 90n4, 99, 101, 104–117, 121, 139, 144, 148, 210–211, 276, 293, 296, 306, 311, 327, 420 Namaseb, Levi 7, 18, 38, 44, 49n2, 52, 76–77, 79 Namibia 9, 18, 20, 26, 36, 39, 43, 156–157, 164, 168, 173, 214, 292, 294, 324, 327, 336, 340 Napoli, Donna Jo 386 Narayanan, Shrikanth 212, 214, 386 nasal (see also prenasalized) nasal accompaniment (see accompaniments) nasal cavity 132, 213, 428, 430 (see also velopharyngeal port) nasal venting 26, 94, 106, 132–133 nasalization 25–26, 28, 36, 47, 106, 116, 150, 161, 180, 192, 214, 229, 275–289, 345, 357, 391–392, 394, 405–406, 411 (see also prenasalization) nasalization loss 143–145, 305, 310, 320, 326, 349

General Index nasalization (unexpected) 284–285, 288 nasal clicks (see accompaniments) nasalized clicks (see accompaniments) Nash, David 36, 93, 182, 410 nasopharynx 25, 214 Nathan, Geoffrey S. 40, 89n1, 93, 408 naturalness 243 Naumann, Christfried 38, 84n1, 101, 105, 118, 120, 317 Ndala, António 35 Ndana, Ndana 39 Nedvěd, Radek 18, 20 Nelson, David P. 225 neutralization 77, 81, 83, 91, 246, 310, 321, 340, 350 Ngubane, Sihawukele Emmanuel 40 Niebergall, Aaron 234 Niesler, Thomas 356–357 Nihalani, Paroo 16 noisy release 6, 31, 92n, 134, 144, 250n, 250–253, 270, 297–299, 301n, 329, 336, 340n1, 341, 357, 432 (see also fricated release) non-click languages 94, 170, 174, 179, 181–183, 185–186, 188–190, 192, 194–195, 200–201, 411 non-native click perception 180, 192–193 non-phonemic click 422 non-pulmonic 220, 230 non-rounded vowels 223 normalization 137–138, 142, 248–249 Nussbaum, Loren V. 220 Nyamende, Abner 40 Obst, Erich 439 obstruent 29, 45, 83n1, 88–91, 95, 109–111, 114–117, 129–130, 143–144, 146–149, 227, 241, 248, 265, 267, 418 Ogden, Richard 45, 182, 408 Ohala, John J. 46, 52, 100, 182, 408 omnidirectional 271–272 onset 29, 32, 34, 40, 90, 115, 117, 138–139, 141–142, 150–151, 217, 223–224, 226, 245, 266, 292–293, 305, 312–313, 320–321, 323, 327, 345, 368 (for onset of voicing, see VOT) oropharynx 213, 396, 399–401, 430

General Index orthography 4–5, 10, 24, 28n, 29, 31, 35–36, 46, 115, 116n, 216, 234–235, 245–246, 323, 328, 339–340, 343, 349, 351–352, 367, 387, 390, 441n (see also transcription) Osu, Sylvester 99 Otto, [Brother] 43 Padgett, J. 82 Pahl, H. W. 79n1 Pakendorf, Brigitte 36–37, 39, 48, 157, 183 palate 3, 7, 10, 17–19, 48, 91, 210, 212, 214, 216, 217n, 219, 221, 226–231, 234, 384–386, 389, 396, 399–400, 404–405, 408–409, 411, 429, 440–441 (see also velum) palatal 8–9, 47, 50, 102, 111, 134–135, 296, 311–312, 317–318, 320–321, 326–327, 418, 439 (see also lingual(-palatal) seal) palatal click 1–10, 20–21, 26–27, 41, 45–46, 48–50, 84–87, 92, 95–96, 103–104, 108, 111, 131–135, 140, 144, 158–159, 162, 170–171, 211, 214, 216–220, 231–232, 249, 250n, 255–256, 261–262, 276–277, 283, 285, 291, 293, 297–303, 401, 310n2, 311–321, 323–330, 336–338, 341–345, 352–353, 357, 392–393, 396, 399–402, 411, 420n, 441 fricated palatal click 7, 9–13, 50n, 95, 256, 296, 299, 301, 337–338, 341–346, 350, 352–353 palatalization 296, 305, 307–309, 311, 317 palatoalveolar click 5–8, 180–181, 192–193, 211 palatovelar click 232, 399 palatography 16–17, 43–44, 134, 210, 230–231, 298, 311, 399, 404 (see also electropalatography) Panamanian 226 Panconcelli-Calzia, G. 38, 42, 74n paralinguistic 42, 45–46, 48, 93, 179n, 182, 184, 190, 196n2, 197, 212, 225, 229–231, 408, 410, 422–435 participant 118, 191, 193–195, 201, 212, 214, 221, 226, 232–233, 246, 268–269, 359, 362–367, 369–380, 388, 384, 386–393, 395–403, 405–408, 410–412 Pascoe, Michelle 48 passive articulators 6, 22, 214, 217n, 219, 396, 399–400

457 passive structures (see passive articulators) perception 41, 179–202, 408–410 perceptual assimilation 179, 184–185, 190, 200 percussion 191, 225 (see also vocal percussion) periodicity 263 Perkell, Joseph S. 409 Persson, J. A. 40 Peterson, Gordon E. 150 pharyngeal aperture (see pharynx) pharyngeal, pharyngealized 87–88, 92, 107, 219, 225, 277, 282, 284, 287–288, 296, 399–401, 406 pharynx 6, 95, 132, 210, 213, 217n, 219, 222, 223, 231–232, 390, 397–398 (see also oropharynx, nasopharynx, velopharyngeal port) Phillips, Howard 43n1 phonation 22, 255, 275, 277, 392, 407 (see also breathy, glottalized, pharyngealized, whispery) phonetic learning 384, 386 phonological representation 74, 88, 91–92, 120, 184, 211 phonological unit 101, 104, 136, 211 phonotactics 42, 86–93, 114–118, 156, 164, 296 Pickett, Velma B. 115 Pickrell, Joseph K. 330, 385 Pienaar, Pierre de Villiers 43 Pillion, Betsy 435 Pinto, Derrin 45 place of articulation 2, 4–8, 22, 89, 104, 108, 150, 162, 165, 167, 193–194, 198, 210, 217, 222, 225–226, 228, 230–231, 241, 248, 250, 258–259, 275, 297, 357–358, 366, 392–393, 395, 397, 399, 404, 411, 419, 433 (see also click type) plain click (see accompaniment, plain) Planert, Wilhelm 43 plopped click 10, 16n1 Plumbley, Mark D. 225 Poeppel, David 189, 195 Ponelis, Fritz A. 38 Pongweni, Alec J.C. 40 Pons, Ferran 199n

458 postalveolar click [= (post)alveolar, postalveolar] 1–10, 14–16, 18, 20–21, 23, 31n1, 33–34, 46, 49n2, 50–51, 79–80, 95–96, 211, 216–218, 227, 231–232, 249, 250n, 254–256, 266–267, 283, 293, 341, 387–388, 392–395, 397–404, 407–411 (see also alveolar click) fricated (post)alveolar click 10, 15, 79n posterior constriction location/seal 82, 94, 97, 106–107, 109, 210, 217–220, 226–229, 231–232 power spectrum (see spectrum) Praat 247–250, 262–263, 266–267, 423 pragmatics 2, 243, 247, 423, 427, 434 preemphasis 248, 369 preglottalization 24, 103, 321, 326–327, 394 prenasalized 28, 47, 83n1 prenasalization, loss of (see also denasalization (of vowels)) 143–146, 282 prenasalized clicks 26n, 28–29, 31n2, 32, 82n2, 102–103, 165, 192, 260–262, 275, 277–278, 280–285, 289, 305, 310, 315, 320 prevoicing 23–24, 26n, 28, 35n, 94, 102, 104, 110, 191, 281, 350n Proctor, Michael 2–3, 7–8, 38, 44, 52, 214–215, 217–218, 225–226, 233, 241, 386, 390, 397, 404, 406, 436 production 1, 7–8, 13, 15–18, 22, 29, 44–45, 47, 51, 79n2, 98, 179, 192, 196, 201, 210–212, 215, 219–226, 228, 230–233, 356–364, 367–375, 377–380, 384–412, 423, 428–430, 439 productive 75, 120, 147, 149, 245, 286 prosodic (see prosody) prosody 16n3, 21–22, 44, 51, 77, 232, 243, 245, 275, 361 Pullum, Geoffrey K. 429, 430n pulmonic 3, 9, 16, 21, 29–30, 32, 48, 94–95, 99, 109, 210n, 228, 275, 357–358, 366, 391, 406, 420 pulmonic activity 228, 357 pulse sequence 226 radical (see tongue root) Rakerd, Brad 189

General Index random 246, 278, 369 rarefaction 2, 134, 216, 218–219, 221–222, 227–229, 297–298, 357, 384–385, 397, 406, 427 real-time Magnetic Resonance Imaging (see MRI) reconstruction 10, 20, 49–51, 145, 171–173, 276–285, 289, 297, 299–300, 304–305, 307–309, 310n2, 313, 318–320, 322–327, 330, 338 recording 43–44, 191, 212–213, 221, 232, 241–243, 245–246, 248, 266, 268–272, 295, 323, 367, 387, 391, 406–407 release (see also burst characteristics) 2–10, 16–29, 34, 74–75, 85n, 87, 92n, 93–94, 96n1, 101, 107–109, 112–114, 116, 118, 132, 134, 139, 151, 159, 162, 180, 194, 198, 201, 215–224, 226–232, 243, 248–249, 250n, 256–257, 259, 261–262, 264–267, 275, 279, 297–298, 340, 344n2, 245, 357–358, 363, 366, 368, 370, 374–377, 380, 385, 387, 391–392, 395, 397–401, 406–407, 422–432, 434 release kinematics 216 (see also tongue body movement) repetition 137, 162, 227, 245–246, 362, 435 retraction (see tongue retraction) retroflex 8–9, 16–17, 21, 392, 402, 404, 407–408 retroflex click 5–6, 9–21, 50–51, 95–96, 211, 232, 296–297, 336, 338–343, 345n, 351–353, 291, 393, 399–400 retroflexed 216–217 Rickford, Angela E. 434 Rickford, John R. 434 rimshot 225–228 Rivera-Gaxiola 185–186 Rivierre, Jean-Claude 150 Roberts, James 244, 386 Rodgers, Blake 46, 438n Rose, Sharon 89 Rosenblum, Lawrence D. 194 Rossouw, Kate 48 Roux, Justus 23, 28, 39, 43–44, 47, 92n, 211, 356–357, 362, 419 rt-MRI (see MRI) Rycroft, David K. 220

General Index Sagey, Elizabeth C. 82, 92n, 97–101, 104 Sakuhuka, Clifford 156 Saldaña, Helena M. 194 sample 40, 231, 243, 245, 267, 362, 364, 368–370 sampling rate 212–213, 247–248, 270, 272, 387, 423 Sands, Bonny 9–10, 20, 31, 36–43, 45, 48–51, 74, 79, 81, 88n, 95, 96n, 113, 121, 156n, 162, 173, 179n, 181–184, 190n, 192n, 210–211, 241–244, 247, 291n, 297–298, 326, 329, 336n, 338, 340, 350n, 354, 385, 392, 399–400, 405, 407, 411–412, 436 Scharf, G. 39, 44 Schilling-Estes, Natalie [=Schilling, Natalie] 243 Schwartz, Joyce 194 Scobbie, James M. 45 Scott, Abigail 16–17, 19 Scott, Sophie K. 195 secondary articulation 95, 183, 230, 250, 257–259, 327 segment 28–29, 31, 75–76, 78, 80, 83, 85, 91, 93–94, 97–102, 107, 111, 117–119, 131, 181, 186, 189, 200, 211, 214, 220, 233 segment inventory 1, 23, 28n, 36, 38, 41, 45, 51, 92–93, 95–96, 100–110, 112–114, 117–118, 119n, 120, 131–132, 158–162, 173, 181–183, 187, 189, 195n, 200, 211, 220, 230, 292, 297, 301, 310, 321, 324, 346, 350, 352–353, 362, 422 segmental 100, 114, 117, 188, 243, 245, 267, 277, 281–282, 298, 357–359, 361, 363–364, 374, 378–380, 407–408 Seidel, Frank 164 Selmer, Ernst W. 39, 43 semantic 40, 245–246, 281–282, 365, 427, 434 Senegal 435 Shankweiler, Donald 194–195 Shen, Zhongwei 130 Shinya, Takahito 181 Shuster, Linda I. 359 Sibanda, Galen 83n2, 89 Sibanda, Magredi 35 sibilant 47n, 48, 115, 418

459 Sign languages 48 signal 35, 46, 193, 197, 199, 214, 232–233, 248, 255, 260, 270–272, 368, 389, 399, 409 Silverman, Daniel 100 Simpson, Adrian P. 46, 408 Skhosana, Philemon Buti 96 Skipper, J. I. 193 [+slack] feature 148, 150, 152 slack voicing 23, 148, 150–152, 262–264, 419 snare drums 225–226, 228–229 Snider, Keith 244 Snyman, Jan W. 21, 23, 25, 34, 38, 43–44, 74, 101, 110, 210n, 211, 276, 336–339, 353, 400 sociolinguistic interview 243–244, 247 sociophonetics 1–2, 22, 183, 330, 354 soft palate (see velum) South Africa 36, 43, 46, 79, 157, 162, 192–193, 220–221, 294, 364, 438, 440 Spajić, Siniša 385 spatial 211–212, 216, 233–234, 357–361, 363, 368, 377–378, 380, 390 spectra (see spectrum) spectral (see spectrum) spectrogram 25–27, 31, 33, 38, 44, 159–163, 166, 215, 247–250, 257, 259–267, 368–370, 377, 388, 394, 423–426, 430 spectrum 4, 10, 13, 134, 194, 214–215, 247–248, 250–261, 264, 272, 298, 306, 366, 424 speech motor planning 356, 358–360, 363, 377–378 spontaneous speech 212, 221, 243, 247, 269, 362, 409, 423 Staiger, Anja 356, 361, 379 Starostin, George S. 9–10, 20 statistical power 243 Stavness, Ian 386 Stemberger, Joseph P. 357 Steriade, Donca 181 Stevens, Kenneth N. 8, 45, 99, 150–152 Steyn, Mollie 47–48 stigmatized 243 Stone, Maureen 386 Stoneking, Mark 292

460 stop 2, 6, 17, 21, 23–24, 28–29, 32, 35, 45, 47–48, 77, 79, 81, 83, 87, 88, 95, 100, 102–109, 111, 113–115, 133–135, 143, 145, 150–151, 168, 180, 181n2, 183, 191–195, 197–199, 201, 213–214, 223–225, 250n, 261, 264, 266, 297, 306, 310–311, 316–318, 322, 327, 341, 344–346, 350, 386, 392–393, 396, 399, 401–404, 407–408, 411, 430n, 438–439 Stowell, Dan 225 Strange, Winifred 189 Studdert-Kennedy, Michael 194–195, 198 subapical 6, 9–10, 16 sublaminal 6, 9, 16 substitution 47–48, 79, 89n1, 91, 131, 358–359, 363, 368–370, 377, 379, 391, 397, 401–404, 406–408, 411, 418 suck-teeth 434 suction 2, 8, 22, 79n2, 98–99, 180, 257, 400, 432 supraglottal 151 Swadesh 245, 293 Swaziland 36, 220–221 syllable 1, 23, 40, 45, 88–90, 136, 138–139, 161, 191–195, 198–200, 245, 266, 358, 360, 364, 368, 419 Tabain, Marija 16n3 Taljaard, Petrus C. 210n, 220 Tallal, Paula 194 Tanaka, Jiro 144 Tang, Katrina Elizabeth 129–130 Tanzania 36–37 target 47n, 48, 85–86, 91, 146, 149, 179, 185–189, 192, 195, 200, 212, 218, 221–224, 226, 297, 323, 357, 363, 367–368, 370, 378, 387–388, 391–397, 400–402, 404–407, 410–411 teeth 4, 6, 16, 22, 92n, 166, 213, 221, 228–230, 242, 340, 358, 426, 432, 434, 440–441 (see also dental) temporal 152, 211, 216, 230, 233–234, 357–361, 363, 368, 407 Ten Raa, Eric 385 Thaler, Lore 48 thematic 245–246 Thibeault, Mélanie 386

General Index Thomas-Vilakati, Kimberly [= Thomas, Kimberly] 39, 43–45, 52, 357, 397 Thomas, Jacqueline M. C. 244 Tiede, Mark K. 386 time scale 248, 258 timing of dorsal release 28–32, 107, 219, 227 Tlale, One 40 tone language 129, 139, 150 tonal depression 24, 129–132, 136–144, 146–150, 281–282, 284–285 tonal melodies 136–143, 146–147, 149, 277, 280 tonal space 137–138, 142 tongue (see also lingual) tongue blade 4, 6–9, 16–20, 22, 216, 218–219, 221, 226, 255, 391, 396, 399 (see also laminal) tongue, center 6–9, 18, 22, 357 tongue root (dorso-radical) 1, 3, 6–7, 18–19, 22, 28, 34–35, 85, 92, 210–211, 213, 219, 223, 225, 231–232, 401 tongue tip 4, 6–10, 16–20, 22, 198, 213, 218–219, 221, 223, 391, 395–396, 399, 402, 441 (see also apical) tongue body tongue body movement 2–3, 6, 17–19, 22, 35, 85, 217–219, 221, 232, 397 tongue lowering 3, 6, 18–19, 85, 218–219, 222–223, 227–229, 232, 357, 397, 427 tongue retraction 3, 6, 18, 85n, 217–219, 221–224, 231–232, 397, 399, 401, 403 tongue trajectories 6, 217–219, 232, 359 tongue body shape / position 3, 5–8, 10, 18–19, 22, 85, 92, 198, 216–219, 221 Tosco, Mauro 37 Toutios, Asterios 6, 44 trachea 212–213 Traill, Anthony 1, 3n, 6, 21–22, 24–26, 28–29, 31–33, 35, 38, 43–44, 48–50, 74, 78–79, 82n2, 84–86, 88–89; 93n, 94, 96n1, 98–99, 101–102, 104–105, 108, 116, 118, 119n, 131–136, 144, 149–150, 162, 170, 179n, 180–184, 192, 210–211, 214, 216, 231, 244, 276, 286, 291, 293–298, 300–301,

General Index 306, 310–314, 317–323, 329, 336, 338, 340–345, 350, 353, 357, 364, 384–385, 387–388, 397, 406–407 transcription 2–10, 21–29, 31–32, 34–36, 43, 50n, 84n2, 102, 107, 131–132, 135n, 140, 145n, 151, 174, 220, 232, 234–235, 257–258, 264, 266, 272, 277, 296, 326, 338, 340, 343, 359, 363, 405–406, 429, 430n, 440, 441n Traunmüller, Hartmut 384–385, 209 Trouvain, Jürgen 45 Trubetskoy, Nikolai 43n, 97, 101 Tsua Depression Exceptions Hypothesis 147 Tsua tonal depression rule 148–149 Tuhuse, Bolo ǁXao 3n Tuomi, S.K. 47 Turk, Alice 243 Tutschek, Karl 440 Tyler, Michael D. 184–185, 189, 200 typology 2, 23, 89–97, 114–120, 292, 294 UCLA phonetic archive 44, 249, 263, 266 ultrasound 1, 3, 17–19, 22, 44, 106, 120, 210, 219, 230–231, 241, 404 unaspirated (see accompaniments) underlying segments 77–79, 80, 92, 110n, 211, 345n Unicode 46 unit 75, 98, 100–120, 136, 185, 211, 275, 278, 282–283 unitary segments 23, 100–120, 211, 214, 275 universal 1, 75, 90, 93–97, 99, 113, 181, 197 uvula 217–219, 225–231 uvular 3, 18, 86–87, 100, 102, 107–109, 197, 221, 224, 232, 257–260, 287, 392–393, 401, 405, 411 uvular (constriction in click) 3, 28–35, 87, 94, 100, 102–104, 107, 192, 210, 219–222, 224–232, 257–261, 265–266, 287, 306, 316–317, 395, 400, 429 uvular frication 35, 102–104, 109, 180, 183, 296, 327 Valiente-Noailles, Carlos 300 Van der Merwe, Anita 47–48, 356, 360–361, 363–364, 378–380 van Reenen, J. F. 384

461 variant pronunciations 3n, 6–10, 20–24, 41, 45, 51, 79–80, 92, 94–96, 167–168, 171, 220, 243, 283, 297, 202–305, 314, 322, 329, 351–352, 408, 429 variation 1–2, 18, 22–23, 25, 75–80, 85, 90n, 92, 95–96, 115, 162, 167–168, 173, 181–182, 184, 187, 211, 227, 229, 242, 244–245, 247, 304, 310n2, 312–318, 329, 340–341, 345, 384–386, 409, 429, 433 Vaz, Colin 232 Vedder, Heinrich 20–21, 329, 338 velar velar click (back-released) 3, 422–423, 425, 427–432, 434–435 velar (constriction in clicks) 3, 23, 26n, 28–29, 31–35, 78, 82, 97–98, 102, 106– 107, 139, 162, 166, 180, 210, 217–219, 222, 226–227, 231–232, 257–258, 260–261, 266, 400, 429 velar consonants (non-click) 3, 46–47, 86–87, 92–93, 97–98, 109–110, 133–135, 145, 165, 167–168, 172–174, 191–192, 194, 197, 223–227, 259, 297, 306, 310–311, 322, 327, 341, 344, 392–393, 396, 400–401, 403, 406, 411, 419, 440 velar frication 32, 34, 110n, 113, 183, 296 velaric consonants 3, 179, 201, 210n, 357–358, 366, 427 (see also lingual) velopharyngeal port 25, 227, 229, 396–400, 405–406, 428, 430 (see also nasal cavity) velum 3, 18, 26, 166, 210–214, 217n, 219–221, 226–232, 391, 396–400, 402, 405–407, 411, 426, 429 verbal gesture 422, 433–435 (see also paralexical, paralinguistic) video 193–194, 198, 212–215, 268 Vigil, Donny 45 Visser, Hessel 40, 88, 244, 276, 285–286, 293 vocabulary 41–42, 135, 200, 246–247 vocal fold slackness (see slack voicing) vocal folds 150–151, 213, 358, 397–398, 401–403 vocal organs 225, 409 vocal percussion 212, 225–230 vocal performance 225–226

462 vocal tract 166, 182, 194, 196, 211–213, 216, 219–220, 230–231, 233–234, 384, 388–390, 404, 430 vocal tract morphology, variation in 232, 384–386, 408n, 409–411 voicing 9, 23, 29, 32, 74, 80, 91, 94–95, 102, 104, 106–110, 116, 130–131, 151, 159–162, 165, 167, 172, 174, 191–192, 195, 198, 261–264, 275, 281, 357 (see also: breathy, devoicing, prevoicing, slack voicing, voiced, whispery voicing) voiced 2, 4, 9, 14–15, 17, 23–24, 28, 31–32, 34, 40, 47, 80–83, 95, 102–106, 108–111, 129–132, 139, 141, 143–152, 159–162, 165– 166, 168, 172, 191, 197, 216, 233, 260–263, 266, 275, 277–282, 284, 296, 304–306, 310, 314–316, 327, 340–341, 346, 349, 350–352, 357–358, 366, 391–392, 394, 405, 418–420, 439n voiceless 1–2, 4, 9, 17, 23–26, 28, 31, 33–34, 35, 40, 45, 47, 80, 82, 91, 95, 102–103, 105, 108–110, 114, 118, 132, 137, 150, 151–152, 159–162, 165, 172, 184n, 191, 192n, 194, 197–198, 216, 220, 223, 224, 227, 229, 250, 260, 266, 281–282, 284, 306, 327, 340, 341, 350, 357–358, 366, 377, 391–392, 402–406, 411, 419, 420n, 430n Vossen, Rainer [= Voßen, Rainer] 36, 38, 46, 48–49, 78–79, 129, 131, 134–135, 144–146, 150, 162, 170, 182–184, 211, 214, 260, 276–278, 280–282, 284, 286–287, 289, 291, 293–295, 297–300, 302–314, 316, 318–323, 326, 329–330, 336, 338, 340–345, 350, 353 VOT (Voice Onset Time); onset of voicing 25–26, 99, 150–151, 358, 370, 374–377, 379–380, 392, 394, 407 vowel 1–2, 4, 22–23, 26, 45, 76, 87, 91, 113, 115–117, 136–137, 164, 180, 194, 221–225, 248–249, 259, 261–263, 266, 277, 279, 286–288, 296, 298, 358, 407, 422 (see also coarticulation) vowel formants 30–34, 191, 248, 259 vowel onset 29, 32, 34, 150–151, 363, 368, 370, 374–377, 380, 392, 407 Vroomen, Jean 194

General Index Wakumelo-Nkolola, Mildred 35 Walker, Rachel 89 Walsh, Martin 40, 93 Wambaugh, Julie L. 359, 361, 379 Ward, Ida C. 44 Warner, Natasha 243 Warren, Paul 243, 246 waveform 26–27, 44, 159–161, 163, 166, 215, 222, 247–248, 250–258, 263, 265, 267, 368, 370, 388, 394, 423–425, 427–428, 431–433 weak click weakening 134, 298–299, 320, 329, 331, 340, 341n, 353 weak click 6, 16n1, 21, 45–46, 181, 438n2 Weenink, David 247, 423 Weirich, Melanie 386 Weitzman, R.S. 150 Werker, Janet F. 197, 199, 409 Werner, Alice 43 West Africa 422, 429–434 Westermann, D. 44 Westphal, Ernst O.J. 43, 49, 151–152, 214, 293, 295, 312–319, 323–326 whispery voicing 262–263 Wightman, Frederic L. 189 Wilkes, A. 96 Wilmsen, Edwin N. 46, 131, 183, 293, 330 Windhoek 214 window sliding window 213 window length 248, 250n window shape 250 window weighting 369 Winkler, Eike-M. 384 Witzlack-Makarevich, Alena 35 wood block 254, 225 Woodward, Amanda L. 200 wordlist 145, 243–246, 268, 293–294, 303, 312, 351 Wright, Melissa 45, 182 Wright, P. G. 44 Wright, Richard 4, 7, 10, 25, 28, 38, 43, 52, 93, 181 Wertz, Robert T. 361 X-ray 43–44, 82n3, 210–211, 231, 380, 384 Xu, Ching X. 150 Xu, Yi 137, 150


Yaeger-Dror, Malcah 243 Yanagihara, N. 94 Yeoman, John 430n Yost, William A. 48, 189 z-score 248–249 Zambia 36, 156–157, 168, 173 Zee, Eric 150

Zeitvogel, Karin 438 Zelditch, Miriam Leah 412 Zhu, Yinghua 233 Ziegler, Wolfram 356, 361 Ziervogel, Dirk 79, 96, 211, 220 Zimbabwe 36, 292, 294, 310n2 Zipf, George Kingsley 386

Language Index

Afrikaans 47, 86–87, 116, 118, 179n, 214–215, 234, 276, 356, 364–365, 367, 379 Angolan ǃXun (includes Mupa ǃXun) 36, 294, 327, 336–339, 341–353 Bakoko 435 Bantu 23–24, 36–37, 39, 41–44, 48–49, 79, 89–90, 93, 151, 156–157, 162, 170–173, 183n, 192, 195n, 211, 220, 229, 260, 313, 324, 330, 344, 353–354, 356, 365, 367–368, 380, 385, 435, 438n2 Basaa 435 Bulu 435 Central Ju 352–353. See also Gaub ǃXun, Grootfontein ǃXun, Neitsas ǃXun Chiikuhane (Ikuhane, Subiya) 39, 157 Chitumbuka 41 Chopi 40 click languages 1–2, 36–42, 94, 157, 162, 183, 438n2 Dahalo 9, 23, 37–38, 52, 90n2, 93–96, 100, 116, 385, 438n2 Damara. See Khoekhoegowab, Sesfontein Damara Damin 36, 93–94, 181, 410, 438n2 Digo 40, 93–94 East African click languages 38, 51. See also Dahalo, Hadza, Sandawe East ǃXóõ 38, 298. See ǃXóõ Eastern Kalahari Khoe 129, 131, 145–146, 149, 293, 299–300, 302, 306, 310n2–3, 311–312, 317, 320, 322, 324, 326, 329–330 Ekoka ǃXun 9–13, 38, 50n, 95, 244, 337, 341, 346 English, clicks in L1 1, 45–46, 94 French clicks 46 Fwe 5, 28, 39, 49, 79–80, 92, 94–95, 156–177 Gaub ǃXun 20–21 German clicks 45–46 Gorowa 441

Griqua (Griekwa) 49, 52 Grootfontein ǃXun 5, 9–10, 13–21, 38, 52, 95, 337 Gǀui 24–28, 32n, 38, 40–41, 87–88, 90n4, 92, 95, 104, 107, 109, 111–113, 131, 133, 135, 139–140, 143–145, 147–148, 210, 276, 281–282, 284, 287–288, 292–293, 295, 299–303, 311–312, 322, 420 Gǀui-Gǁana cluster 295, 299, 303 Gǁana 40, 135, 295, 298–304, 306–309, 311–312 Hadza 3, 7, 16n1, 21, 24, 37–38, 47, 52, 94–95, 192n, 210, 385, 418–420, 439–441 Haiǁom 20, 328 Ikuhane (Chiikuhane, Subiya) 39, 157 isiNdebele 37, 39, 79, 89, 96, 220 isiXhosa 4n2, 21, 23–24, 39, 47–48, 79–83, 90, 94, 115–116, 139, 192n, 195n, 210, 263, 356–357, 362, 400, 419–420, 438 isiZulu 7, 36n2, 39–40, 43, 47, 79–83, 89–90, 94–96, 115, 162, 191–193, 195–197, 210, 235, 356–357, 364–367, 371–372, 377 Ju languages 20, 38, 50, 276, 281–282, 286–288, 337. See also Juǀ’hoan, ǃXun Juǀ’hoan (Juǀ’hoansi) 7, 10, 20–21, 23, 26n, 28, 34–35, 36n1, 36n2, 38, 41, 43, 46, 52, 87–88, 94–95, 101, 106, 109–114, 116–117, 170–171, 244, 292–293, 338–352, 400 K’iche’ 440 Kakia 38, 50 Kalahari Khoe 48, 129, 145–146, 275–282, 284–286, 289, 291, 293–300, 301n, 302–303, 306, 310n2, 311–312, 317, 320–324, 326–327, 329–330, 340, 346, 353 Kalahari Khoe East. See Eastern Kalahari Khoe Kanuri 429, 431 Karanga 40 Kelpien 439 Kgalagari 40


Khoe (family) 38, 41, 43, 48–50, 129, 131–136, 144, 170, 183, 275–277, 280–281, 286, 288, 292–295, 297–303, 312, 323, 325–327, 329–331, 353–354. See also Kalahari Khoe, Khoe-Kwadi, Khoe languages; see also Khwe language Khoe-Kwadi 36–38, 291, 294, 299, 323–324, 329–330 Khoekhoe. See Khoekhoegowab Khoekhoegowab (Khoekhoe, Nama, Damara, Haiǁom ǂAakhoe) 3–5, 7, 9–10, 20, 24, 34–35, 38, 41, 43–45, 49, 52, 79, 86, 88, 90, 93n, 94–95, 107–109, 112–113, 116, 118, 132–133, 139, 210, 212–221, 226–228, 230–234, 244, 276–279, 280–282, 287–288, 291–293, 295, 299, 301–303, 312, 322–324, 326–330, 406. See also Sesfontein Damara, Mixed Haiǁom Khoisan (Khoesan) 36, 79, 88–90, 101, 111–116, 132, 139, 149, 152, 162, 170, 173, 192, 201, 229, 241, 244, 292–294, 297, 300, 317, 336, 354, 384–386. See also Hadza, Ju languages, Khoe-Kwadi, Kx’a languages, Tuu languages, Sandawe Khwe 28, 38, 95, 112–113, 170, 244, 275–289, 291n, 293, 295, 299–303, 305–312, 327, 330 Kinyarwanda 46, 408, 438n2 Kora 6, 38, 49, 118, 139, 277, 280, 292–293, 322–323 Korana. See Kora Kua 41, 131–133, 139–140, 143–145, 147–148, 282, 292–293, 296, 299–303, 310–312, 317, 320–321, 326 Kwadi 291, 294–295, 299, 323–327, 329–330 Kwangali 39, 157 Kwanyama 344, 353 Kx’a languages 36, 38, 41, 43, 48, 50–51, 292–294, 299, 329, 336–337. See also Ju languages, ǂ’Amkoe languages Kxoé. See Khwe

Laal 422–435 Lala 220 Lardil 36, 93, 182, 410 Lillooet 438

Mambay 428–431 Mambiloid 432

Mangetti Dune ǃXun 3, 7, 20, 38, 52, 85n, 337, 340–341, 400 Manyo 39, 157, 162 Masalit 40 Mbukushu 39–40, 157, 162, 170 Mixed Haiǁom 20 Mundang 429–430 Mupa ǃXun 36, 337–339, 341–353

Na’vi 438 Nama. See Khoekhoegowab Namibian Afrikaans. See Afrikaans Naro 24, 35, 38, 40–41, 88, 133, 210, 244, 276, 280–282, 285–288, 292–293, 299, 301–303, 312, 322 Ndau 40, 438n2 Ndebele 37, 39, 79, 89, 96, 220 Northern Transvaal Ndebele 39, 79, 96 Zimbabwean Ndebele 37, 39, 89 Neitsas ǃXun 20–21, 338–339 Ngoshie 435 Nguni group 24, 28, 39–41, 49–52, 79–83, 89, 157, 220, 356, 385, 438n2. See also Ndebele, isiXhosa, IsiZulu, siSwati, sePhuthi Ninkyob 6 Northern Ju 38, 344–350. See also Angolan ǃXun, Mangetti Dune ǃXun, Mupa ǃXun Northern Transvaal Ndebele 39, 79, 96 Northwestern Ju 293, 301n, 337–343, 349–353. See also Ekoka ǃXun, Okongo ǃXun Norwegian clicks 46 Nǀuu 3, 6–7, 26, 29–32, 38, 41, 43, 49–50, 52, 76–79, 88, 95, 99–100, 106–107, 112–113, 210, 294 Nǀuǁ’en 38, 50 Nǁng. See Nǀuu Nǃaqriaxe 24, 38 Okongo ǃXun 21, 337

Phuthi 220 Proto-Bantu 49, 172 Proto-Ju 9–10, 20–21, 51, 337–345, 352–353 Proto-Khoe (pKhoe) 275–289, 299–303, 310n2, 322–324 Proto-Kx’a 50–51, 297, 301n, 336–338, 352–353

Proto-ǃUi 49–50, 96n1 Pulaar 431, 435 Quiché 440 Ronga 40 Rukwangali 39, 157 Rumanyo 39, 157, 162 Rwanda 46, 408, 438n2 Sandawe 7, 9, 16, 24, 28, 37–38, 48, 52, 93–95, 113, 385, 400 Sàsí 38 sePhuthi 220 Sesfontein Damara 291, 324, 327–329 Sesotho 39–40, 49, 95–96, 157, 162, 192–193 Setswana 40, 131, 147, 300, 313, 330, 364–365, 367, 377 Shekgalagari 40 Shiyeyi 7, 10, 36n2, 39–41, 89, 95, 157, 164, 170, 210, 256, 260–261 Shua 41, 44, 133, 150n, 151–152, 292–295, 299–303, 307–313, 316–322, 326–327, 330 Sign languages 48 siSwati 39–41, 44, 49, 96, 212, 220–225, 230–232, 235 Sotho 39–40, 49, 95–96, 157, 162, 192–193 South African English 94 South African Ndebele 39, 79, 96 South-West Bantu click languages 37, 39, 157. See also Fwe, Kwangali, Manyo, Mbukushu, Subiya, Yeyi Southeastern Ju 38, 337–341, 352–353. See also Juǀ’hoan Southern Bantu 37, 39, 42, 220 Southern Sotho 39–40, 49, 95–96, 157, 162, 192–193 Spanish clicks 45 Standard Namibian Khoekhoe. See Khoekhoegowab Subiya (Ikuhane, Chiikuhane) 39, 157 Swati 39–41, 44, 49, 96, 212, 220–225, 230–232, 235 SWB click languages. See South-West Bantu click languages

Taa languages 39, 50. See also ǀ’Auni, ǀHaasi, Kakia, Nǀuǁ’en, ǃXóõ Tekela 220 Tenctonese 439 Thimbukushu 39–40, 157, 162, 170 Tjwao 292–293, 296, 299, 302–303, 310n2, 312, 317–318, 321, 326 Transvaal Ndebele 39, 79, 96 Ts’ixa 135, 145–146, 293–295, 299–303, 310n2, 311–317, 320–322, 325, 330 Tshwa (cluster) 129, 292–294, 296, 299–302, 307–312, 317–322, 326–327, 330 Tsonga 40, 221 Tsua 49n1, 129–152, 279–280, 282, 299 Tswa 40 Tswana 40, 131, 147, 300, 313, 330, 364–365, 367, 377 Tumbuka (Mzimba dialect) 41 Tuu languages 36, 38, 41, 43, 48, 50, 183, 292–294, 329, 354. See also Taa languages, ǃUi languages Urban Sotho 162 Urban Zulu 162 Wawa 431 West ǃXóõ 38, 118, 294 Western Kalahari Khoe 276–277, 299–300, 311, 322 Wolof 429, 431, 433–435 Xhosa 4n2, 21, 23–24, 39, 47–48, 79–83, 90, 94, 115–116, 139, 192n, 195n, 210, 263, 356–357, 362, 400, 419–420, 438 Xitsonga 40, 221 Xri. See Griqua Yangkaal 36 Yeyi 7, 10, 36n2, 39–41, 89, 95, 157, 164, 170, 210, 256, 260–261 Yoruba 150, 433 Zhuǀhõasi. See Juǀ’hoansi Zimbabwean Ndebele 37, 39, 89 Zulu 7, 36n2, 39–40, 43, 47, 79–83, 89–90, 94–96, 115, 162, 191–193, 195–197, 210, 235, 356–357, 364–367, 371–372, 377

ǀ’Auni 38 ǀHaasi 38 ǀXam 38, 46, 49–50, 79, 440 ǁAni 133, 280, 282, 293, 295, 299, 301, 304–306, 311–312, 329–330 ǁXegwi 38, 49–50, 96n1, 294 ǂAakhoe 328 ǂHoan 6–7, 24, 38, 41, 50–52, 95, 247, 336n ǂKhomani. See Nǀuu ǂUngkue 38 ǂ’Amkoe language(s) 38, 50, 292–294, 336–337, 353. See ǂHoan, Nǃaqriaxe, Sàsí ǃOra. See Kora ǃUi languages 38, 41, 49–50, 294. See also Nǀuu, ǂUngkue, ǀXam, ǁXegwi ǃXóõ (ǃXoon, Taa) (includes Taa, East ǃXóõ, West ǃXóõ) 1, 3n, 6, 21, 23–24, 26n, 28–29, 31–35, 36n2, 38, 41, 50, 82n2, 84n1, 84–88, 90, 94–95, 99–105, 109, 112–113, 118, 119n, 181, 192–193, 198–199, 210, 244, 249–262, 264–267, 276, 282, 286, 288, 292–294, 298, 384–385, 387, 406 ǃXun (ǃXung, ǃXuun) 3–5, 7, 9–21, 36, 38, 43, 47n, 50–52, 85n, 95, 244, 293–294, 301n, 327, 336–354, 400. See also Juǀ’hoan Angolan ǃXun (includes Mupa ǃXun) 36, 294, 327, 336–339, 341–353 Ekoka ǃXun 9–13, 38, 50n, 95, 244, 337, 341, 346 Gaub ǃXun 20–21 Grootfontein ǃXun 5, 9–10, 13–21, 38, 52, 95, 337 Kxabinǃa ǃXun (Tamboti people, or Tsintsabis ǃXun) 21 Mangetti Dune ǃXun 3, 7, 20, 38, 52, 85n, 337, 340–341, 400 Mupa ǃXun 36, 337–339, 341–353 Neitsas ǃXun 20–21, 338–339 Okongo ǃXun 21, 337 Sarisa ǃXun 21 ǃU ǃXun (Dune people) 21 Xunigǃu ǃXun (Makalani people) 21