The Native Languages of South America 1107044286, 9781107044289



English Pages 400 [399] Year 2014

The Native Languages of South America

In South America indigenous languages are extremely diverse. There are over 100 language families in this region alone. Contributors from around the world explore the history and structure of these languages, combining insights from archaeology and genetics with innovative linguistic analysis. The book aims to uncover regional patterns and potential deeper genealogical relations between the languages. Based on a large-scale database of features from sixty languages, the book analyzes major language families such as Tupian and Arawakan, as well as the Quechua/Aymara complex in the Andes, the Isthmo-Colombian region, and the Andean foothills. It explores the effects of historical change in different grammatical systems and fills gaps in The World Atlas of Language Structures (WALS) database, where South American languages are under-represented. An important resource for students and researchers interested in linguistics, anthropology, and language evolution.

Loretta O’Connor is a post-doctoral researcher in the South American languages research group of the Traces of Contact project, Radboud University Nijmegen.

Pieter Muysken is Academy Professor of Linguistics at Radboud University Nijmegen.

The Native Languages of South America
Origins, Development, Typology

Edited by
Loretta O’Connor and Pieter Muysken

University Printing House, Cambridge CB2 8BS, United Kingdom

Published in the United States of America by Cambridge University Press, New York

Cambridge University Press is part of the University of Cambridge. It furthers the University’s mission by disseminating knowledge in the pursuit of education, learning and research at the highest international levels of excellence.

www.cambridge.org
Information on this title: www.cambridge.org/9781107044289

© Cambridge University Press 2014

This publication is in copyright. Subject to statutory exception and to the provisions of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press.

First published 2014

A catalog record for this publication is available from the British Library

ISBN 978-1-107-04428-9 Hardback

Cambridge University Press has no responsibility for the persistence or accuracy of URLs for external or third-party internet websites referred to in this publication, and does not guarantee that any content on such websites is, or will remain, accurate or appropriate.

Contents

List of figures  page vii
List of maps  ix
List of tables  x
List of contributors  xii
Acknowledgments  xiii

1 Introduction: South American indigenous languages; genealogy, typology, contacts
  Pieter Muysken and Loretta O’Connor  1

Part I Introduction to South America

2 Human migrations, dispersals, and contacts in South America
  Loretta O’Connor and Vishnupriya Kolipakam  29
3 Basic vocabulary comparison in South American languages
  Harald Hammarström  56

Part II Case studies in contact

4 Structural features and language contact in the Isthmo-Colombian area
  Loretta O’Connor  73
5 The Andean foothills and adjacent Amazonian fringe
  Rik van Gijn  102
6 The Andean matrix
  Simon van de Kerke and Pieter Muysken  126
7 The Arawakan matrix
  Love Eriksen and Swintha Danielsen  152
8 The Tupian expansion
  Love Eriksen and Ana Vilacy Galucio  177

Part III Comparative perspectives on linguistic structures

9 Language internal and external factors in the development of the desiderative in South American indigenous languages
  Neele Müller  203
10 Verbal argument marking patterns in South American languages
  Joshua Birchall  223
11 The Noun Phrase: focus on demonstratives, redrawing the semantic map
  Olga Krasnoukhova  250
12 Subordination strategies in South America: nominalization
  Rik van Gijn  274

Part IV Major findings and conclusions

13 The languages of South America: deep families, areal relationships, and language contact
  Pieter Muysken, Harald Hammarström, Joshua Birchall, Swintha Danielsen, Love Eriksen, Ana Vilacy Galucio, Rik van Gijn, Simon van de Kerke, Vishnupriya Kolipakam, Olga Krasnoukhova, Neele Müller, and Loretta O’Connor  299

References  323
Subject and place index  366
Language index  371
Author index  376

Figures

4.1 The Chibchan language family, after Constenla (2012: 417). Boxed languages appear in this study.  page 82
4.2 Linguistic distances for Chibchan vs. non-Chibchan languages  90
4.3 Linguistic distances for pre-defined areal groups  91
4.4 Geographic distribution of feature values  93
5.1 NeighborNet representation of the distances between the languages of the sample  116
5.2 NeighborNet of distances between languages of the sample (phonological features only)  118
5.3 NeighborNet of distances between languages of the sample (morphosyntactic features only)  119
5.4 Correlation between linguistic distance and geographic distance  121
5.5 Correlation between geographic elevation and linguistic distance  122
5.6 Correlation between elevation and proximity to the Andean profile  123
6.1 NeighborNet representation for the relative distances of the members of the Quechuan language family  135
6.2 The distribution of languages per region, over time (Q = Quechua)  140
6.3 NeighborNet representation for the relative distances of the different Andean languages discussed in this chapter  142
7.1 The distribution of the personal paradigms in Arawakan languages  167
7.2 Minimum spanning network of the Arawakan language family (also taken from the NeighborNet algorithm, Huson and Bryant 2006)  168
7.3 Structural analysis of thirty-one Arawakan languages  174
8.1 The branches of the Tupian language family  180
8.2 NeighborNet representation of lexical distances among Tupian languages  182
8.3 NeighborNet representation of structural distances among Tupian languages  186
11.1 Scale of semantic features encoded by demonstratives  269
12.1 NeighborNet of nominalizations as subordination strategies in the languages of the sample  283
13.1 Parsimony reconstruction for alignment in Tupian (Birchall 2014, based on the tree typology of Walker et al. 2012)  307
13.2 Correlation between Hamming distances and geographic distances for the pairs of families in the sample  312

Maps

1.1 South American language families, extensions from 1500 CE  page 6
2.1 Archaeological sites from the Late Pleistocene–Early Holocene in South America  34
4.1 The Isthmo-Colombian area, noting the position of languages in this study  76
5.1 The greatest extent of the Quechuan, Aymaran, Panoan, Tupian, and Arawakan expansions  105
5.2 The languages in the sample and their geographic distribution  107
6.1 Approximate distribution of the indigenous languages in the Andes in the mid twentieth century  129
7.1 The reconstructed geographic dispersal of the Arawakan and Tupian language families at the time of European contact. For complete references, see Eriksen 2011: 12  163
8.1 The location of Tupí-speaking groups at the time of European contact  179
8.2 Current location of the languages in our structural sample  185
9.1 Desiderative/no desiderative marking in eighty-five South American languages  209
10.1 Languages in the sample, with regions used in this chapter  226
11.1 Geographic distribution of semantic features encoded by demonstratives  271
12.1 The use of participant nominalization as a relativization strategy  286
12.2 The encoding of notional subjects as possessors in subordinate clauses  290
12.3 The encoding of notional objects as possessors in subordinate clauses  291
12.4 The use of case marking to form adverbial clauses  292

Tables

1.1 Schematic brief overview of some of the current families  page 5
1.2 Preliminary overview of typological features mentioned in the literature  9
1.3 Highly ranked stable features in the meta-analysis of Dediu and Cysouw (2013)  19
3.1 Language families with their status in the work of Loukotka (1968), ASJP, and Campbell (2012a)  64
4.1 Languages in this study  83
4.2 Typological profiles, using features from Constenla (1991)  86
4.3 Numbers of features in major categories and their intersections  89
4.4 Ranges of features and types ranked by normalized median distance (N = 162)  94
4.5 Feature table  96
5.1 The languages in the sample and their sources  106
5.2 Areal studies of the Amazon and Andean regions used in this study  108
5.3 The phonological features  111
5.4 The morphosyntactic features  114
5.5 The constituent order features  114
5.6 The lexicon features  115
5.7 The average linguistic distance per river system  124
6.1 Languages of the study, compared to the varieties used by Heggarty (2005) and by Adelaar with Muysken (2004)  132
6.2 Morphosyntactic and phonological features that distinguish Southern Peruvian Quechua from Ecuadorian Quichua  137
6.3 Putative historical development of language use in Kallawaya villages in the Charazani region  146
6.4 The forms reconstructed for Proto-Quechua by Parker (1969)  147
6.5 Features in more than twelve of the seventeen Quechuan varieties in our database, as compared to their occurrence in Aymaran and other Andean languages  148
6.6 Quechua and Andean feature questionnaire  149
7.1 The Arawakan language family  165
9.1 Desiderative markers in three Nambikwaran languages  213
9.2 Desiderative markers in seven Cariban languages  215
9.3 Desiderative markers in ten Tupian languages  217
9.4 Desiderative marking in eighty-five SA languages  220
10.1 Comparison of characteristic features for Amazon vs. Eastern and Andes vs. Western  244
10.2 Language sample with additional information (by region)  247
11.1 Distribution of semantic features in the sample  267
11.2 Language sample (ordered by language family)  272
12.1 Semantic relations considered for subordination strategies  276
12.2 Questions on nominalization  278
12.3 Comparison of global and South American distributions of nominalized structures  281
12.4 Comparison of distribution of nominalized structures per semantic relation type  282
12.5 Overlap of semantic relation types  284
12.6 Non-core case markers and adpositions used to form adverbial relations  294
13.1 WALS features for which South American languages show a significantly distinct profile as a group  306
13.2 Comparing the top 200 language family pairs in the sample  310
13.3 Rank order correlation between the language pairs for TAME, SubOrd, ArgMar, and NP  313
13.4 The variance in the four domains  314
13.5 Average stability of our different sets of features for the language families  314
13.6 Language pairs ranked highest on the domains of NP, TAME, SubOrd, and ArgMar (languages from the western region bold, Tupian-Cariban-Macro-Jê languages italic)  315
13.7 Some well-known expansion varieties in South America  318

Contributors

Joshua Birchall, PhD student, Radboud University Nijmegen, Netherlands; postdoctoral researcher, Goeldi Museum, Belem, Brazil
Swintha Danielsen, postdoctoral researcher, University of Leipzig, Germany
Love Eriksen, postdoctoral researcher, Lund University, Sweden
Ana Vilacy Galucio, senior researcher, Goeldi Museum, Belem, Brazil
Rik van Gijn, postdoctoral researcher, Radboud University Nijmegen, Netherlands, and University of Zurich, Switzerland
Harald Hammarström, postdoctoral researcher, Radboud University Nijmegen and Max Planck Institute for Psycholinguistics, Netherlands
Simon van de Kerke, lecturer, Leiden University, Netherlands
Vishnupriya Kolipakam, PhD student, Max Planck Institute for Psycholinguistics, Netherlands
Olga Krasnoukhova, PhD student, Radboud University Nijmegen, Netherlands
Neele Müller, PhD student, Radboud University Nijmegen, Netherlands; postdoctoral researcher, University of Marburg, Germany
Pieter Muysken, Professor, Radboud University Nijmegen, Netherlands
Loretta O’Connor, postdoctoral researcher, Radboud University Nijmegen, Netherlands

Acknowledgments

We would like to acknowledge the support from various institutions and individuals that helped us in putting together this volume. The research for this book was funded by the European Research Council Advanced Grant 230310 “Traces of Contact” awarded to Pieter Muysken, the Veni Grant to Rik van Gijn from the Netherlands Organization for Research (NWO), as well as the Academy Chair awarded to Pieter Muysken by the Royal Netherlands Academy of Sciences (KNAW). We would also like to acknowledge the advice and support of a number of colleagues working on South American indigenous languages and on phylogenetic methods: Willem Adelaar, Mily Crevels, Dan Dediu, Michael Dunn, Fiona Jordan, Sergio Meira, Lev Michael, Annemarie Verkerk, and Hein van der Voort. Needless to say, none of these people can be held accountable for the claims made in this volume. Our student assistant Ellen van den Broek of Radboud University Nijmegen compiled and checked the references and helped prepare the manuscript for publication. All maps but one were produced by Thijs Hermsen, using ArcGIS 9.2, and we thank Love Eriksen of Lund University for providing the polygon shape file used in four of the maps.


1 Introduction: South American indigenous languages; genealogy, typology, contacts

Pieter Muysken and Loretta O’Connor

This chapter discusses general issues concerning the relationships among the indigenous languages of South America and sketches the background for the rest of the volume. Why are there so many language families, and why so many isolates? What is the distribution of both larger families and isolates? Given the apparent genealogical diversity, why are there so many shared specific areal typological patterns, some characterizing most of the continent, and some, individual parts? Do shared patterns reflect older historical genealogical links, or are they the result of convergence? What can we learn about these issues from the perspective of language history (vertical transmission) and language contact (horizontal transmission)? We also describe and justify our research methodology and briefly outline the chapters in the volume.

1 Solving an intellectual puzzle

The relationships among the indigenous languages of South America pose complex intellectual problems that invite innovative research methodologies. Until recently these languages were relatively unknown. Not only were the vast majority not at all or poorly documented, but the complex relations among them also remained obscure. Our knowledge base was woefully inadequate with respect to classification, history, and typology. In recent years, our knowledge of these languages has grown enormously, but the puzzles remain. With respect to language classification, using the comparative method there is a current consensus for some 108 separate language families on the continent, half of which are isolate languages. This total represents a large part of the overall inventory of language families of the world (420, according to Campbell 2012a: 59).

This introduction is based in part on material presented at the conference on methods in historical linguistics at Belém in September 2005, and a further developed version at the International Conference on Historical Linguistics (ICHL XIX) at the Radboud University Nijmegen in August 2009. We are grateful to the participants at these meetings, especially to Willem Adelaar, Rodolfo Cerrón-Palomino, Swintha Danielsen, Michael Dunn, Stephen Levinson, and Hein van der Voort for comments.


With respect to language typology, there is less diversity. While certainly there are considerable differences between the languages, there are many similarities as well, as shown below. Our puzzle falls into three parts:

(A) Why are there so many language families, and why so many isolates? What is the distribution of both larger families and isolates? Given the fact that South America was the most recently populated of all continents (15,000–13,000 BP, see O’Connor and Kolipakam, this volume), the genealogical diversity is surprising indeed.

(B) Given the apparent genealogical diversity, why are there so many shared specific areal typological patterns, some characterizing most of the continent, and some, individual parts? Do shared patterns reflect older historical genealogical links, or are they the result of convergence? In either case, why do classical comparative methodologies based on lexical data not yield better results for language classification?

(C) What can we learn about the relation between the issues in (A) and (B) from the perspective of language history (vertical transmission) and language contact (horizontal transmission)?

In our estimation, moving forward will require an extension of the historical analysis of South American languages to language typology and language contact. Auspiciously, the continent offers a unique opportunity, in that the relatively recent moment of its settlement roughly coincides with the upper limit of the time depth that can be reasonably investigated through language comparison.

2 Language relationships in South America

2.1 The field of historical linguistics and the comparative method

There is a long tradition of historical and comparative linguistics in South America, starting with Jesuit scholars such as Hervás (1800–1805), who identified Tupí-Guarani, and Gilij (1780–1784), who established the Arawakan and Cariban language families as a unit, two among nine lenguas matrices (‘mother languages’ or ‘base languages’) in Venezuela. Following these pioneers, there have been a large number of attempts at classification of the languages of South America and proposals for family groupings, from modest to grandiose. Uhle (1890) identified the Chibchan family, and Davis (1968) argued that some more distant clusters together formed Macro-Jê. During the past century, numerous scholars have explored genealogical relationships and proposed different classifications, including Loukotka (1968), whose work is commented on in Hammarström (this volume). Swadesh (1959, 1962) proposed four large ‘networks’: Macro-Mayan, Macro-Carib, Macro-Arawakan, Macro-Quechuan. Kaufman (1990) is fairly conservative, arriving at 118 groups. This number is further specified in Campbell’s report (2012a: 59) as 108 genealogical units, consisting of 53 families with more than one member and 55 one-member families, the so-called isolates.

Writing a history of the classificatory efforts would require a separate paper. Adelaar with Muysken (2004) provide a concise summary, and a detailed treatment is also given in Campbell (1997, 2012a) and Adelaar (2012a). Hammarström (2010: Appendix) provides an exhaustive meta-analysis of the published sources, returning to this issue in this volume.

Efforts to investigate existing language families and reconstruct hypothetical earlier forms continue, as in the work on Tupian, carried out in Brasilia (Rodrigues 1999; Rodrigues and Cabral 2012) and at the Goeldi Museum in Belem, and the work on Nambikwara at the Free University Amsterdam (Telles and Wetzels 2009), coupled with efforts to relate Tupian, Cariban, and Jêan as TuCaJê (Rodrigues 1985). There are also efforts to link individual unclassified languages to well-established language families, such as the recent linkage of Jabutí (Ribeiro and van der Voort 2010) and Chiquitano (Adelaar 2008b) to Macro-Jê, and smaller families have been proposed, such as Harakmbet and Katukina (Adelaar 2000). In Argentina, Viegas Barros (2005) has argued for the Chonan family.
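The family counts quoted above can be checked with a few lines of arithmetic. The sketch below is only an illustration: it uses the figures cited from Campbell (2012a: 59), and the variable names are our own labels, not terms from any of the cited sources.

```python
# Figures quoted in the text from Campbell (2012a: 59).
# Variable names are illustrative labels, not terms from the source.
multi_member_families = 53   # families with more than one member
isolates = 55                # one-member families
world_families = 420         # worldwide inventory of language families

# Total number of genealogical units in South America.
south_american_units = multi_member_families + isolates

# Share of isolates among the South American units.
isolate_share = isolates / south_american_units

# Share of the worldwide inventory found in South America.
world_share = south_american_units / world_families

print(south_american_units)      # 108
print(f"{isolate_share:.1%}")    # 50.9%
print(f"{world_share:.1%}")      # 25.7%
```

On these figures, isolates make up just over half of the South American units, and the continent accounts for roughly a quarter of the language families recognized worldwide.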
However, the overall picture is not one of unification, and this raises issues about the universal applicability of the comparative method, as argued by Sapir (1929: 208):

There can be no doubt that the methods first developed in the field of Indo-European linguistics are destined to play a consistently important role in the study of all other groups of languages, and that it is through them and through their gradual extension that we can hope to arrive at significant historical inferences as to the remoter relations between groups of languages that show few superficial signs of a common origin.

In contrast, scholars such as Thurston (1987) have argued that the traditional comparative method may not always be the right model, on two grounds: Wave and network models are sometimes better than tree models, and structural features may play a role next to lexical and morphological features.

In this book we will stick to the middle ground and adopt a plurality of methods for studying the historical linguistics of South American languages, using both published materials based on the comparative method and other techniques that rely on structural features. We also systematically explore the possible role of language contact, which can obscure genealogical relationships.

Like other historical linguists, we start from the Uniformitarian assumption, namely that processes of language change are uniform in different periods of human history. However, the precise implications of this need to be specified, since in the prehistory of the South American continent different conditions held, which must have had a profound impact on processes of change and contact (see O’Connor and Kolipakam, this volume). In the early period, we are dealing with relatively isolated hunter-gatherer societies, with low population density, in probably intermittent contact with other groups. Later, after the intensification of agriculture, more complex exchange networks were built up between communities in areas with much higher population density. Finally, in the colonial period, the tremendous demographic decline and ethnic destruction following the Iberian invasion led to much lower indigenous population densities once again, the destruction of social and commercial networks, and the restructuring of communities. All these changes must be taken into account when we start modeling language change in the South American continent.

Comparative-historical linguistics in the South American context faces a number of problems. There are few reconstructed proto-languages for comparisons at the level of families, and the coverage and quality of documentation are very uneven for many languages. In addition, information about word frequency, needed to apply some lexicostatistic methods, is non-existent. Similarly, lexical semantic information is incomplete and highly skewed. However, through enormous documentation efforts currently underway, both lexical and grammatical information on a large range of languages has become available. The challenge is to bring this material to bear on the issue of language relationships.

2.2 Greenberg’s Amerind

In an influential but controversial 1987 book, Joseph Greenberg assumed, based on mass comparisons of lexical material, that all the languages of South America belonged to a single language family dubbed Amerind. This hypothesis has met with little support among South American language specialists, even though it was accepted by outsiders, both linguists and others. Greenberg was criticized on four counts:

• The idea of a single overall family may not be incorrect, but it cannot be demonstrated empirically.
• The intermediate macro-families or stocks postulated have been criticized; alternative macro-families postulated such as the already mentioned Tupian, Cariban, and Jêan family cluster (TuCaJê) have more support.
• Many data used by Greenberg were inaccurate; the work is riddled with mistakes in data interpretation and reproduction.
• His method, mass lexical comparison, is superficial and cannot replace the traditional comparative method based on reconstructions.

Nonetheless, Greenberg’s ideas loom large in the background, for several reasons. First, there are odd resemblances, some very specific, between languages assumed to be entirely unrelated. There are a number of “pan-Americanisms” (Kaufman 1990: 26; Campbell 1997: 257–259), even though they do not allow us to reliably create larger family units. Second, there is a sense of “Amerindian language type,” even if it is not very precise. Third, as our knowledge of the individual languages increases and more comparative work is done, the evidence for macro-families grows, even if these do not correspond to the ones postulated by Greenberg. Without undue optimism, we can predict that further links will be established in the coming years. Currently, the mass comparison method propagated by Greenberg has been replaced by automated similarity judgment on basic lexicon in the ASJP (Automatic Similarity Judgment Program; Holman et al. 2008); see Hammarström (this volume).

Table 1.1 Schematic brief overview of some of the current families

Type | Name | Location
Larger families | Arawakan | Widespread, from Belize and Honduras to Bolivia and Paraguay
Larger families | Tupian | Central Amazon, from Paraguay to French Guyana and Peru
Larger families | Cariban | Northern Amazon
Larger families | Macro-Jêan | Central and southern Amazon
Larger families | Chibchan | Central America, from Honduras through northwestern Colombia
Larger families | Tucanoan | Western Amazon and Pacific coast
Larger families | Pano-Takanan | Western Amazon
Larger families | Chon | Southern Cone
Smaller families | Barbacoan, Arawan, Chapacuran, Makúan, Nambikwaran, Witotoan | In the Andean foothills, central and southern Amazon basin
Clusters | Aymaran, Quechuan | Andes
Isolates and small groupings | | Many in the Amazonian fringe

2.3 Current distribution of the language families and isolates

Table 1.1 presents some larger families in South America, as well as a number of smaller families and language isolates, and two entities we could label clusters, relatively shallow families with a wider spread.

The Andean Foothills and the Amazonian Fringe include foci of extreme linguistic diversity, such as the Vaupés region in the northwest and the Mamoré-Guaporé and Chaco regions in the southwest (see van Gijn, this volume). Some languages in these regions are Arawakan and Tupian, but many represent minor families and isolates. This fragmentation contrasts with the more homogeneous central and eastern plains, where most languages belong to stocks like Macro-Jê, Arawakan, Cariban, and Tupí-Guarani. Map 1.1 gives an impression of this distribution.

Map 1.1 South American language families, extensions from 1500 CE

2.4 Explanations for the current diversity

Why is there such diversity of languages in the continent, concentrated mostly on the Andean fringe area? Consider five possible hypotheses:


(a) Genealogical fragmentation is the original situation for all of South America. The ethnically and linguistically more homogeneous areas simply result from more recent expansions. Notice that we can date the expansion of some of the larger families fairly precisely. Aikhenvald (2012: 4) writes: “Amazonia and the Americas in general have a large number of recognizably distinct language families. This is compatible with the relatively late peopling of the continents. In areas of deeper antiquity of settlement such as Australia (estimated at 50,000 years), the time-depth promoted intensive convergence of languages towards a common prototype.” This explanation links the very diversity to the late date of settlement. However, the original population may not have been large enough to include the full number of 100+ families (plus ones that have already disappeared). We must assume that a substantial amount of the diversity is more recent than the moment of spread across the continent, and is consequently shallower.

(b) Fragmentation in the fringe zones arose because these areas functioned as zones of refuge, to which many smaller groups fled when pushed out of richer areas by stronger groups. Recent work on ethnogenesis (e.g. Hornborg 2005, Eriksen 2011) casts doubt on this, citing linguistic diversity as a possible outcome of contact and not necessarily of isolation. Certainly, isolation cannot have been the sole cause of linguistic fragmentation, as speakers of many language isolates were far from isolated. Rather, they participated in intensive trading networks. The areas along the fringe of the Amazon were a contact zone for lowland–highland exchanges. Lexical diversity may go hand in hand with contact and indeed be a product of language contact. Bowern et al. (2011) compared lexical borrowing in hunter-gatherer populations across the globe and state that although “loan levels varied both within and among regions, they were generally low in all regions . . . , despite substantial demographic, ecological, and social variation. Amazonian levels were uniformly very low, with no language exhibiting more than 4%. Rates were low but more variable in the other two study regions, in part because of several outlier languages where rates of borrowing were especially high.” The authors interpret this result as showing “an association between language and group identity that is relatively strong compared to many other parts of the world, and pertains widely within Amazonia.”

(c) The fragmentation along the eastern flanks of the Andes, the so-called montañas, is due to the fact that this was the oldest inhabited zone of South America, the path along which groups moved southward as they came through the Isthmus of Panama (Dahl et al. 2011). This is not likely to be the right scenario, given the time depth involved. A key problem is that the paleoarcheological evidence does not necessarily point to the montañas as the oldest inhabited zones (see O’Connor and Kolipakam, this volume, and van Gijn on the Andean foothills, this volume).


(d) Certain features of South American language systems may themselves contribute to the diversity puzzle. Nichols (1992) tentatively argues that in head-marking languages, language change tends to destroy the information needed for the reconstruction of deeper genealogical units, while in dependent-marking languages this information is more likely to be preserved over time. This argument merits further investigation.

(e) Social factors also play a role (see also 4.1). Thurston (1987) argues that there are group dynamics that strengthen specific historical developments. These can be grouped under the general rubric of exotericity and esoterogeny. Exotericity is involved in intergroup relations and may lead to the indigenization of a variety by a particular group. In esoterogeny, language is treated as an internal emblem of group membership, leading to elaboration, the emergence of morphologically complex forms involving suppletion and irregularity, the creation of idioms, and new vocabulary. On the whole, the norm-enforcing dynamic of such a group process is considerable, and could lead to differentiation. A similar line of argumentation has been developed in Nettle (1999) and Trudgill (2011).

3 Linguistic typology and the areal distribution of features

South American languages tend to have a number of recurrent typological features that distinguish them from the languages of the Old World (Wichmann et al. 2010c). There are fewer differences with the languages of Central and North America, but this has been investigated even less. Table 1.2 gives a very preliminary overview of some of the features mentioned in the literature so far, organized in terms of the four major categories of elements studied in this volume. Notice that the noun phrase and verbal argument marking have been more thoroughly explored than Tense-Aspect-Mood-Evidentiality (TAME). Similarly, certain geographic areas have been more thoroughly explored than others. Typologically, the languages involved are very interesting, but information about patterns of typological markedness is skewed by the earlier under-representation of South America in typological surveys. South American languages have been disproportionately absent in large language samples, in large part because few descriptive grammars of the type that typologists like to work with were available. The aforementioned classification by Greenberg (1987) of all South American languages into a single Amerind family did not help, as this justified the relative under-representation in the samples, on the basis of which some typologists have drawn statistical conclusions. From the qualitative perspective, the types of unusual patterns found in the languages of South America were insufficiently known. The recognition of new typological properties of the languages of South America will change our perspective considerably.

Table 1.2 Preliminary overview of typological features mentioned in the literature

| Feature | Region or language | References |
|---|---|---|
| Noun Phrase | | |
| A rich nominal determiner system, including nominal tense or aspect | Cariban, Tupian, Rondônia | Nordlinger and Sadler (2004); Muysken (2008c) |
| Positional deictics (standing, sitting, lying, (in)visible, etc.) | Southern Cone, Chaco, Rondônia | Kirtchuk (1996); Krasnoukhova (2012); Campbell and Grondona (2012) |
| Both nominal and verbal classifier systems | Some languages in Bolivia and Rondônia, southern Amazon in Brazil, western Amazon in Brazil and Colombia | Aikhenvald (2012); Crevels and van der Voort (2008); Payne, Doris L. (1987); Seifart and Payne (2007) |
| Inclusive/exclusive | Diverse spread | Crevels and Muysken (2005a) |
| Genitive classifier for possessed domestic animals | Chaco | Campbell and Grondona (2012) |
| Gender in demonstratives and pronouns | Chaco | Campbell and Grondona (2012) |
| Lack of classifiers | Andes | Adelaar (2008a); Torero (2002); Adelaar (2012b) |
| Lack of nominal number | Guaporé-Mamoré | Crevels and van der Voort (2008) |
| Argument marking and verbal morphology | | |
| Complex verbal morphology, (poly)synthesis and concatenative syntax-like morphology | Lowland South America | Payne, D. (1990); Crevels and van der Voort (2008) |
| A high incidence of prefixes | Guaporé-Mamoré | Crevels and van der Voort (2008) |
| Contrast between active/stative/inverse alignment | Chaco, Guaporé-Mamoré | Crevels and van der Voort (2008); Campbell and Grondona (2012) |
| Do-verbs | Northwestern South America | Jäger (2006) |
| Verb affixes marking direction, location, position, orientation; meaning extension to mark tense, aspect, mood | Western Amazon, Chaco, Guaporé-Mamoré | Payne, D. (1990); Crevels and van der Voort (2008); Campbell and Grondona (2012) |
| Sociative causatives | Tupian, Tacanan, Arawak, Panoan, Mosetén | Guillaume and Rose (2010); Aikhenvald (2012) |
| Serial verbs | Northern Amazon in Brazil, parts of Paraguay and adjacent Bolivia and Brazil | Aikhenvald (2012) |
| Verbal number | Guaporé-Mamoré | Crevels and van der Voort (2008) |
| Tense-Aspect-Mood-Evidentiality (TAME) | | |
| Rich evidential systems | Rio Negro, Rondônia/Bolivia, south-central Brazil | Aikhenvald (2004, 2012); Crevels and van der Voort (2008) |
| Subordination | | |
| Switch-reference and head-tail linking | North central Andes and foothills: Quechua, Panoan, Jivaroan, Barbacoan, Tucanoan, and Uru-Chipaya | Adelaar (2008a) |

The promising typological features for phonology include nasal spreading, nasal harmony, and various tonal patterns; for the lexicon, we find reduplication, ideophones and sound symbolism, noun incorporation (especially involving the distinction between noun incorporation and affixation), nominal classification systems, and both nominal and verbal classifiers (see Krasnoukhova, this volume, for properties of the noun phrase). In morphosyntax, features of interest include possession (especially the types of nouns that may be considered inalienable and alienable) as well as number, positional deictics (e.g. sitting, standing, lying), directional markers on the verb (e.g. grammaticalized verbal morphology for ‘upriver,’ ‘downriver,’ etc.), reflexive and reciprocal relations, verb compounding, and serialization. With respect to the variables focused on in this project, TAME has not been studied systematically, but among the striking features are the marking of tense on the noun in several families, complex evidentiality systems, and fine-grained tense marking (e.g. future beyond tonight) (see Müller, this volume). Argument realization in the Amazonian fringe is highly complex and involves reference systems marked on the verb, differential object marking and other animacy effects (where object markers occur on nouns depending on their animacy), and active/stative/inverse alignment (see Birchall, this volume). The occurrence, precise properties, and distributions of these features across the Amazonian fringe still need to be explored. This line of research has already proven to be highly promising (cf. e.g. Grinevald and Seifart 2004, Crevels and van der Voort 2008). In the field of subordination, Everett (2005) has argued that Pirahã lacks recursion in the clausal domain, and hence true subordination.
To take but one example of the link between typology and classification, as far as is known, nominal tense or aspect is strongly rooted in the Cariban language family and may have spread from there to individual members of other families, such as Tariana and Chamicuro (Arawakan); Wari’ (Chapacuran); Nambikwaran; Movima (isolate); Mosetén (Mosetenan); Cofán (isolate); Weenhayek’ (Matacoan); as well as Guaraní, Sirionó, and Yuki (Tupian). However, it is also quite possible that it is an original feature of other language families, including Tupian. Further study of the precise geographic distribution of the features involved is urgently needed, also in the light of the possible grammaticalization of lexical suffixes referring to ‘deceased’ and ‘future’ (see Müller, this volume).

4 Language contact

The topic of contact between indigenous languages in South America is vast and almost intractable; a first general exploration is presented in Muysken (2012b). We still know little about the history of the languages of the continent, and we lack essential sources of information, such as historical sources dating back more than a few centuries, full descriptions for the majority of languages or major representatives of language families, reliable family trees for a number of linguistic families, and reliable reconstructions of the features of potential ancestor languages. In quite a number of cases, perhaps the majority, we do not know whether a given instance of resemblance between two languages is due to contact or to shared ancestry. There are a great many incidental observations about the contact between indigenous languages in South America in the literature, and few scholars active in the field would deny its importance, but no consistent picture has emerged as yet, nor is there an inventory, let alone a typology, of contact phenomena.

Quite obviously, changes due to language contact depend on multilingualism. In contemporary Amerindian societies many patterns of multilingual language use exist. In some small communities in close contact, A knows B, but B does not know A, as with the Yurakaré in Bolivia who know the language of their Tsimane neighbors, but not vice versa (van Gijn, p.c.). In some cases, as with the Machigenga (Peru) and Yanomame (Brazil), A, B, and C all know each other’s (related) languages (Perri Fereira, p.c.). In other cases, such as the Vaupés region, A, B, and C all know each other’s languages, and these are not related (Aikhenvald 2002 and much related literature). Yet another pattern is found with the Waorani in Ecuador: Waorani speakers understand Quichua but do not speak it, while nobody speaks Waorani but the Waorani themselves (Muysken, fieldwork). In a number of communities in Bolivia, speakers speak dominant language B and (surviving remnant bits of) an ancestral language A as a minority language, as with Uchumataqu (Muysken, fieldwork), Puquina (Hannss, p.c.), and Paunaka (Danielsen, p.c.).
Sometimes, A and B are brought together in a new mixed setting, as with Aikana and Kwaza in Brazil (van der Voort, p.c.). There is evidence of substantial numbers of loan words in the large majority of languages, with the possible exception of Waorani core vocabulary, where only two loans have been documented (Peeke 1973: 4). In Yanomame, Cariban loans are manioc-related and recent. Thus the linguistic evidence suggests that truly isolated groups are very rare, if they existed at all. In the period between 1500 and 1800 CE contacts may have stopped due to sharp demographic decline, but this should not blind us to the dominant earlier pattern of contact. At the same time, it should be kept in mind that for thousands of years, population density in the continent was low. Even if we assume a population of over thirty million in 1492, on a continent of 17,840,000 km² in size, average density cannot have been much more than two persons per square kilometer. It is also evident, of course, that there must have been areal differences in population density. Some zones were virtually uninhabited, while others (the ones with favorable living conditions) were more densely populated (see O’Connor and Kolipakam, this volume).

4.1 Scenarios for language contact

In any case, we can identify specific prototypical situations or scenarios of language contact, both in terms of their sociolinguistic contours and of their linguistic outcomes. We will sketch some of these here, without pretending that this is an exhaustive list (see also Muysken 2012b).

Prestige borrowing. A number of high-prestige languages pass on words to neighboring languages with lower prestige. In addition to words, in some cases affixes are passed on this way, and occasionally phonetic properties.¹ The vocabulary may involve such domains as political functions, (higher) numbers, cultivated food, or animal names.

Bowern et al. (2011) argue that in the sample for South America that they studied, borrowing is predominantly asymmetric; Arawak languages are frequent sources of loans into other languages, although this directionality appears to be reversed in the Vaupés, where Arawak Tariana has experienced profound contact with Tucanoan languages. It is also reversed in southwest Colombia, where Arawak Resígaro has borrowed from Bora; note that these languages are all AG (= agricultural). HG (= hunter gatherer) languages in contact with AG languages are predominantly recipients of loans, both in the Vaupés, where the HG languages Hup, Yuhup, and Kakua have borrowed from AG Tucanoan, and also in other cases, e.g. HG Nadëb from AG Arawak.

Quechua words in many surrounding languages also typify this type of prestige borrowing.

Trading partner borrowing. Related to this, and not easy to distinguish from it, may be patterns of long-distance borrowing of names for household goods, plants and animals, and possibly words for rituals. Here there need not be a hierarchy, and the effects may be less local.

Metatypy. In some cases a particular language A is dominated by another one B. Typically, the speakers of A are also fluent in B, but not vice versa, and numerically and economically A is less strong than B. Over time, metatypy may occur: A starts adopting more and more structural features of B, but not vice versa. See Ross (1999, 2006).

Substrate. When large numbers of speakers of A shift to language B, they may import all kinds of semantic and pragmatic distinctions into their version of B, without overtly transferring structural features or many words from A into B. An example of substrate is Aymaran influence on Quechuan varieties in southern Peru and Bolivia.

Bilingual convergence and linguistic areas. When many speakers of two adjacent languages A and B are bilingual, there may be frequent code-switching between the languages, and in addition, the languages may start showing structural convergence. Depending on the patterns of multilingual usage in the community, this convergence may be bi-directional or even multi-directional, as illustrated with the Içana and Vaupés region (Aikhenvald 2002). We must also assume that cultural areas emerged before linguistic areas. There is no language contact without intensive human and cultural contact, although there may be cultural contact without much language contact.

Koineization. When a language spreads to a new area and speakers adopt it as a second language without strong native speaker input (incomplete shift), they may simplify and restructure the new second language somewhat. Furthermore, dialect distinctions existing in the original homeland tend to be leveled. This process of selective simplification and homogenization of patterns as they spread from one community or area to another is often referred to as koineization. Such languages resulting from incomplete shift may be illustrated with Cocama (Tupian) and with Ecuadorian Quechua.

Intertwining. New languages are created through combining the lexicon of one language with the grammar of another one. Intertwined languages can be illustrated with Kallawaya, a secret ritual language of Bolivia.

¹ A well-known example is the Parisian French velar (r) that was passed on to several other court cities: Copenhagen, Berlin, The Hague, presumably in the eighteenth century.

4.2 Linguistic areas

A central issue in our argumentation will concern the question of linguistic areas, regions in which the languages share an unusual number of features which cannot be attributed to shared genealogical inheritance. For South America a number of linguistic areas have been proposed:

• the Andes (Torero 2002; Adelaar 2008a, 2012b)
• the ‘Intermediate Area’ (Constenla 1991)
• the Chaco (Grondona 2003; Messineo 2011; Campbell and Grondona 2012)
• the Vaupés and Içana region (Aikhenvald 2002)
• the Guaporé-Mamoré area (Crevels and van der Voort 2008)

There is a large literature on the methodological and conceptual problems involved in postulating a linguistic area (see e.g. Muysken 2008b and references there). Aikhenvald (2012: 70) distinguishes between a linguistic area and a language region, assumed to differ in at least two respects: denseness of features (many more features involved in linguistic areas) and proven historical interactions “where we know how the contact has been, or is proceeding.” Nevertheless, even for classical “linguistic areas” such as the Balkan area, it is historically unclear exactly how it emerged and where all the “Balkan” features came from. Furthermore, researchers have distinguished between “core Balkan” and “peripheral Balkan,” blurring the distinction between area and region.


With regard to language regions, Aikhenvald (2012: 70) notes: “Similarities across a language region could be traces of old language contact, or former linguistic areas, no longer recoverable (due to ‘punctuations’ and language extinction, in the case of Amazonia). A feature or two could have spread through ‘intermediaries’, perhaps through trade or other interactions which have never been recorded.” The conventionalized definition of these notions is an area of active research.

5 Stability and the selection of features: lexical or structural

Stability concerns the ease with which features change values across time, under the influence of various processes (Dediu and Cysouw 2013). In other words, does a particular feature remain unchanged within a specific linguistic tradition (a language or a language family), no matter what happens to the family? It can also describe the prevalence of a feature within a geographic region. Stability is a complex notion, for at least two reasons. First, a lack of stability may result through contact with another language, often measured as feature borrowability, or from changes internal to a language or language family. Third person pronouns are not highly borrowable, but they often change due to language-internal developments, such as the bleaching of the deictic force of demonstrative pronouns. Second, there can be various sources of stability. It can result from frequency of use of a word or construction; it can result from the design of a language system; a particularly stable feature can be a badge of social identity in a specific contact setting; and finally, the interaction with typologically similar languages can lead to a stable typological profile. Stable linguistic features can serve as signals of genealogical relationship. Lexical data, the primary basis for the classic comparative method, are well suited to tracing family relationships: the arbitrary nature of the sign means that the number of possible words for any referent is virtually infinite, and the recognized rules of sound change underpin proposals of proto-forms and pathways of evolution. Genealogical history is however only part of the story of a language and therefore of its speakers. Communities are always in contact, and the impact of that contact plays an active role in shaping linguistic systems. Here, too, lexical data have a key role to play. 
The lexicon can be divided into rough but defensible subsets of so-called basic vocabulary, which tends to reflect vertical transmission and family resemblance, and cultural vocabulary, which may reflect horizontal transmission and the effects of contact among speakers. The underlying understanding is that basic vocabulary will reflect linguistic inheritance, and cultural vocabulary will reflect salient local culture and the overt choices made by speakers. In contrast, structural features have only a small number of realization options (for example, there are only six possible orderings of subject, verb, and object), and there are no agreed-upon patterns of structural change that permit reconstruction of earlier forms. Structural features have long played a secondary role in investigations of linguistic prehistory, complementing reconstructions of genealogical relationship based on lexical data. There is no accepted structural analogue to the notion of basic vocabulary that might provide a direct window to family relations by comparing structural features. Therefore, within historical linguistics, lexical data are used both to discern genealogical relations and to detect borrowing patterns, while before Nichols’ ground-breaking study (1992), the status of structural data was left undefined.

And yet, recognition of structural change and the search for appropriate ways to study its impact in language contact have occupied linguists since the Neogrammarians (see Winford 2003: 6–9 for a concise review). Discussions of structural convergence, the ways in which the grammatical systems of two or more neighboring languages become more similar, typically distinguish between two types of convergence, one of which involves the transfer of actual morphemes along with an abstract structural pattern and one of which does not (Winford 2003, 2005). Heath (1978) discusses the differing processes as direct borrowing as opposed to indirect diffusion, and terms to describe the results include pattern transfer and syntactic calquing, substance linguemes vs. schematic linguemes (Croft 2000), and Matter vs. Pattern (Matras and Sakel 2007a,b). In addition to linguistic factors, researchers also acknowledge social and psychological factors in assessing and predicting the types, intensity, and direction of structural convergence in specific contact scenarios (Winford 2003, Muysken 2010).

5.1 Lexical stability

The most advanced work in this area has been carried out on lexical stability, based on the assumption that certain vocabulary items termed “basic,” including body parts, kinship terminology, and the specific words to describe common activities, may be particularly stable. The most commonly used elicitation instruments in genealogical comparison are the 100- or 200-word Swadesh lists, established for comparative purposes and conceived as identifying the “basic” or most stable vocabulary in any language. The lists have been criticized as imprecise, culturally inappropriate, and in part arbitrary; yet despite these critiques they remain the standard both for initial exploration of sound systems and as inventories of candidates for lexical stability.

Recent work has both refined and reduced the traditional Swadesh list. As part of a large-scale project examining loanwords across 41 languages, Tadmor and colleagues (Haspelmath and Tadmor 2009; Tadmor et al. 2010) have developed the Leipzig-Jakarta list of basic vocabulary, a register of the 100 items most resistant to borrowing cross-linguistically that only partly overlaps with the Swadesh list. Other projects have divided the Swadesh list into sub-lists which differ in their degree of borrowability such as McMahon et al. (2005), who split up the list into a hihi segment (high frequency, high stability) and a lolo segment (low frequency, low stability). Their proposal is that limiting oneself to the hihi list will provide more reliable genealogical results.

A team at the MPI for Evolutionary Anthropology has further pared down the lexical inventory to a list of maximally 40 items, argued to be most stable across languages, which then can be subjected to uniform transcription and automated comparison in the Automated Similarity Judgment Program (ASJP, Holman et al. 2008). Hammarström (this volume) argues that the ASJP method recognizes about 90 percent of the families established with more traditional methods, a remarkable feat: the claim is that automated mass comparison of lexical similarity based on normalized transcription of 28–40 words per language produces much the same results as existing classifications based on traditional methods. However, some deeper South American families (Macro-Jê, Tupian, and Witoto-Boran) are not recognized as groupings through the ASJP method, and recent work by the ASJP team on dating sub-branch splits gives unsatisfactory results for older families such as Chibchan.

On a scale of increasing stability, the inventory of lexical items then includes:

• domesticated plant and animal words and ritual vocabulary
• general vocabulary
• Swadesh list of words
• hihi lists of frequent words from the Swadesh list (McMahon et al. 2005)
• 28- to 40-word list in the Wichmann and Holman ASJP project

Given this substantial work on borrowability of the lexicon by the Loanword Typology group and first-cut classifications based on mass comparison by the ASJP group, our ERC-funded project focuses on structural features in part to complement lexical findings.

5.2 Structural stability

Turning from the lexicon to structural features, the consensus view among historical linguists is that: “certain domains or components of linguistic structure tend to be more stable and resistant to change than others. For instance, phonology and grammar (and to some extent semantics) are more stable, while vocabulary is less stable” (Winford 2005). This would imply an important role for structural stability in the study of prehistory. However, structural stability is not easy to measure. Metrics have often been proposed a priori by structurally inclined historical linguists, but so far it has been difficult to empirically substantiate claims about stability. Some of the challenge lies in the operational definition of the term, which is itself an active area of research. The notion of the persistence of specific structures is tested from various angles: within a family or across families, independently within a dataset, as linguistic distance with respect to a measure of geographic distance, and in relation to dependency or other factors of linguistic systems in general. In addition, many sources discuss primarily the mechanisms of change and not the relative stability of features or clusters of features. Grammaticalization approaches to structural stability have been attempted by Heine and Kuteva (2005), while functional approaches to structural stability can be found in Matras and Sakel (2007a), Aikhenvald (2007), and many of the papers in Aikhenvald and Dixon (2007).

5.3 Combining structural and lexical stability

A third possibility is to consider the borrowability of items at the intersection of the grammar and the lexicon. This has been explored in van Hout and Muysken (1994), who studied internal stability factors and word classes in Quechua–Spanish lexical borrowing. They found that word frequency has a weak effect (since donor frequency effects are offset by recipient frequency effects). Strong blocking effects were found in this study for paradigmatic load in the recipient language and the presence of inflection on words in the donor language. A significant positive effect for borrowability came from the peripheral status of words in donor language constituents. Combining these factors in the specification of individual word classes led to familiar lexical borrowability hierarchy outcomes such as the following, which illustrates that Nouns are among the easiest and Clitic pronouns among the most difficult to borrow:

Nouns, Proper names
< Adverbs, Complementizers, Conjunctions, Exclamatives, Negators, Prepositions
< Adjectives, Auxiliaries, Copulas, Verbs
< Numerals, Quantifiers, Wh-forms
< Demonstratives, Determiners, Preposition+determiners, Possessives, Full pronouns, Clitic pronouns

This result has an interesting implication for the potential distinction between the lexicon and the syntax. In the traditional perspective, the lexicon is associated with items and syntax with rules. However, the factors mentioned constraining borrowability are very much structure-related. This is compatible with approaches such as Word Grammar and Construction Grammar, in which languages are seen as inventories of {items}, where {items} are form/meaning mappings either of the “word” type or of the “construction” type (see van Gijn, this volume, for a constructional approach to subordination strategies). This allows a more comprehensive perspective on the issue of stability.

5.4 Quantitative approaches

Since a single structural feature is highly unspecific (e.g. VO vs. OV), often groups of structural features need to be taken into account, and this calls for a quantitative approach. Nichols (1992, 1995, 2003) pioneered the use of quantitative approaches to structural stability. This has been taken up in work by Bickel, Cysouw, and many others in the burgeoning field of quantitative linguistics. The past fifteen years have thus witnessed the rise of quantitative historical linguistics, in which computational tools and statistical analyses are used to probe relationships in large databases of linguistic features (e.g. McMahon and McMahon 2005, Bickel 2007, Gray et al. 2007). Dunn et al. (2005, 2008) pioneered the use of phylogenetic analysis of structural features to give insights into language family history, yet opponents (Donohue and Musgrave 2007, Donohue et al. 2011) maintain that structural features will mostly or only reflect areal history, and insist that the modeling of genealogical history is the realm of the lexicon. Phylogenetic studies that explicitly compare lexical and typological data have found that while rates of change may be comparable across certain subsets of the two datasets, the results of typological analysis are not consistent across language families and as such give no clear picture of tree-like evolution vs. the effects of horizontal transmission (Gray et al. 2011, Greenhill et al. 2010). These last studies underline that a key challenge is to understand the rates of change, likelihoods of borrowability, and degrees of cohesion among certain (types of) features (Gray et al. 2011: 3924). 
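The point that a single structural feature is unspecific, while a group of features is more informative, can be illustrated with a minimal Python sketch. It is not the method of any of the studies cited above: the three languages, the four features, and all codings are invented for illustration, and the distance used (the share of mismatching features among those coded in both languages) is only one simple assumption among many possible measures.

```python
from itertools import combinations

# Invented codings for illustration only; not taken from any database.
FEATURES = ["object_verb_order", "adposition_type", "nominal_tense", "evidentials"]
LANGUAGES = {
    "A": ["OV", "postposition", "yes", "yes"],
    "B": ["OV", "postposition", "no", None],  # None = feature not documented
    "C": ["VO", "preposition", "no", "no"],
}


def structural_distance(x, y):
    """Proportion of mismatching features among those coded in both languages."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    if not pairs:
        raise ValueError("no features coded in both languages")
    return sum(a != b for a, b in pairs) / len(pairs)


# Pairwise distances for every pair of languages in the toy sample.
distances = {
    (p, q): structural_distance(LANGUAGES[p], LANGUAGES[q])
    for p, q in combinations(sorted(LANGUAGES), 2)
}

for pair, d in distances.items():
    print(pair, round(d, 2))
```

In this toy sample, languages A and B agree on object-verb order, which by itself says little about their relationship; graded distances emerge only across the set of features (1/3 for A and B, 2/3 for B and C, 1 for A and C). Undocumented values are skipped rather than counted as mismatches, which is itself a design choice that a real study would need to justify.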
Conflicting rankings on continua of specificity and stability imply that there is no single ideal type of feature to be chosen to establish genealogical relationships or areal relationships: content words are highly specific but subject to rapid change, while structural features are more stable over time but are rather unspecific (see O’Connor, this volume, for an assessment of contrasting categorizations of structural features). Further evaluation and exploration by Gray et al. (2010), Greenhill et al. (2010), and Dunn et al. (2011) suggest that structural stability will be more a complement than a mirror of lexical stability, with distinct evolutionary rates across families and greater sensitivity to local dependencies, and Dediu and Levinson (2012) propose family-specific “stability profiles” within more abstract patterns of structural stability and instability.

Recent quantitative papers generally use data from the World Atlas of Language Structures (WALS) (Haspelmath et al. 2005). Dediu and Cysouw (2013) compare seven different approaches to computing the stability of 132 typological features in WALS with an explicit contrast of the different assumptions about the definition of stability and the different quantitative techniques for its estimation. Their compilation and comparison of the eight sets of results finds considerable overlap in the ranking of features (see O’Connor, this volume, for an application of their ranking). The first 19 features in the combined list include phonological properties, morphosyntactic properties, and constituent orders (Table 1.3). The stability of clausal constituent order is particularly interesting, given the common understanding that word order is frequently unstable in a situation of language contact (e.g. Aikhenvald 2011: 173).

Table 1.3 Highly ranked stable features in the meta-analysis of Dediu and Cysouw (2013)

| Domain | Features |
|---|---|
| Phonology | Absence of common consonants; Front rounded vowels; Vowel nasalization; Uvular consonants; The velar nasal; Glottalized consonants; Tone |
| Morphosyntax | M-T pronouns; The optative; Verbal number and suppletion; Nominal and locational predication; Passive constructions; Predicative adjectives |
| Linear order | Order of genitive and noun; Order of object and verb; Order of adposition and noun phrase; Order of subject and verb; Order of numeral and noun; Order of adjective and noun |

In addition to these quantitative measures for structural stability, historical linguists have attempted to establish the stable features for individual families. We can thus study the stability of features within Chibchan (Constenla 2012), Arawakan, Cariban, and so forth.

An alternative to feature clusters would be the comparison of broad typological properties. Several studies have taken broad typological properties as their point of departure in trying to characterize the spread and differentiation of the languages of South America, rooted in a long tradition of macro-comparisons. Examples of such characterizations would be “polysynthetic” and “topic prominent.” The advantage of this approach is that it allows us to link the classification to broad typologies in the general literature. The disadvantage is that it is rather imprecise. It is possible to characterize many languages in our sample as “head marking,” but in actual practice this broad label may cover very many different language sub-types.

We have thus decided to study specific features and have focused on the morphosyntactic domain. This corresponds to our own research interests and enlarges the global perspective on this vein of research, taking into account various projects around the world. These include a group of scholars at UC Berkeley, led by Lev Michael, who have established a large phonological database (http://linguistics.berkeley.edu/~saphon/en/), and a new group centered in Bern and Zürich, headed by Fernando Zuñiga, who will establish a database for lexical morphology. In future research, comparative studies on different databases will yield increasingly rich results.

5.5 Morphosyntactic elements and categories

Regarding the comparison of grammatical morpheme inventories, many of these are very stable elements, and thus they would be ideal candidates to determine genealogical relationships. However, many of the elements involved are short, in the extreme case monosyllabic forms with cross-linguistically frequent consonants, such as South American -y ‘1st person’ and -m ‘2nd person’ (cf. Greenberg 1987), which means they are fairly low on the specificity hierarchy (Campbell 1997). Nichols and Peterson (1996) argue for an areal (“Pacific Rim”) rather than genealogical spread for these features. Altogether, this approach merits further exploration in the South American context, once the appropriate categories have been identified.

Yet another possibility would be the comparison of notions expressed by morphemes (through gloss inventories). Many descriptive grammars contain a list of the glosses (b) used to describe the grammatical morphemes used in the representation of the examples of the language, as in the Cuzco Quechua example (a):

a.  riku  -naya  -chi  -wa  -rqa  -n
    see   desi   caus  1ob  pas   3
    ‘It/he/she caused me to feel like seeing (something).’

b.  caus  causative
    desi  desiderative
    pas   past tense
    1ob   first person object marking
    ...
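The idea of comparing gloss inventories can be made concrete with a small sketch. The code below is only an illustration and not a method used in this volume: it measures the overlap between two gloss sets with the Jaccard index, and it assumes a hypothetical table (`COARSE`) that collapses fine-grained glosses into coarser ones so that inventories glossed at different levels of detail become comparable. The second inventory is invented.

```python
def jaccard(a: set, b: set) -> float:
    """Shared glosses divided by all glosses found in either inventory."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical coarse-graining table (an assumption of this sketch):
# fine-grained aspect labels are collapsed into a single "asp" gloss.
COARSE = {"dur": "asp", "inch": "asp"}


def normalize(inventory: set) -> set:
    """Replace each gloss by its coarser label where one is defined."""
    return {COARSE.get(gloss, gloss) for gloss in inventory}


# The four glosses listed in (b), plus an invented inventory for comparison.
cuzco_quechua = {"caus", "desi", "pas", "1ob"}
invented_language = {"caus", "pas", "asp", "dur"}

raw = jaccard(cuzco_quechua, invented_language)
coarse = jaccard(normalize(cuzco_quechua), normalize(invented_language))
print(f"raw overlap: {raw:.2f}, after collapsing aspect glosses: {coarse:.2f}")
```

The overlap score changes once the aspect glosses are collapsed, which shows how strongly such a comparison depends on the glossing conventions of individual authors.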

Theoretically it would be possible to systematically compare the glosses used in the different languages in South America, indicative of their semantic peculiarities and hence also of their relationships to other languages. Gloss lists are easily accessible, and thus this method would yield quick results. Also, underlying semantic notions could reflect stable properties of the languages and therefore be indicative of deep-time genealogical relationships. However, there are clear disadvantages in actual practice. First, what different authors mean by a given gloss may differ considerably. Thus terms like “associative plural” and “mirative evidential” may or may not be used to describe an identical phenomenon. Second, there are semantic “splitters” and “lumpers” among the glossing practitioners. Some will gloss all aspectual elements as asp, while others distinguish a series of glosses, including “durative,” “inchoative,” etc. A third disadvantage is that a number of notional categories, such as “causative,” will be extremely widespread, making a comparison less meaningful. Fourth, notions may have similar grammaticalization paths in different languages, and thus shared semantic categories do not necessarily point to genealogical relationships. The value of this method will probably increase once language descriptions become more standardized in the use of terms and glosses. Müller (this volume) on the desiderative exemplifies this approach.

6 Research methodologies in the present volume

In view of these considerations concerning language genealogy, linguistic typology, language contact, and the stability of linguistic properties, we have adopted a hybrid methodology and have chosen to study various types of features. Also, we have taken a mixed approach in terms of sampling languages.

6.1 Features and languages sampled

The features for the main database addressed in this study were selected from core areas of grammar in four main categories: (1) tense, aspect, modality, and evidentiality marking, (2) argument marking, (3) the noun phrase, and (4) subordination strategies. The primary researcher for each area of grammar, in consultation with the research group, assembled a questionnaire of a broad feature set within a few major components of each category. In the joint database, the component features of any one category can then be investigated independently or combined with any or all of the other categories to produce both narrow and broad typological profiles of South American languages. The obvious advantages of this approach include that it allows statistical comparisons of a considerable number of relatively independent features, and assessments of varying rates of stability among different feature clusters. However, the features often are binary or have few values and thus rank low on a specificity hierarchy, requiring a clustering approach. Furthermore, some features are dependent on one another, such as the type of noun class distinction (which depends on the presence of noun classes) or the form of verbal argument marking (which depends on the presence of verbal marking), and these require additional analytical tools to weed out redundancy. A general problem in this kind of comparative work is that the features require detailed grammatical descriptions, which may also be subject to misrepresentation or hard to establish on the basis of the description.
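The treatment of dependent features can be illustrated with a minimal sketch. It is not the procedure used for the project database. The language names, feature names, and values are all invented, and the distance measure (the share of mismatching values among features coded in both languages) is just one simple choice. A dependent feature is coded `None` when the feature it depends on is absent, so that it is skipped rather than counted as a second mismatch.

```python
from itertools import combinations

# Invented feature table. None marks a dependent feature that is not
# applicable because its parent feature is absent (e.g. no noun classes).
DATA = {
    "LangA": {"noun_classes": 1, "noun_class_type": "gender", "verbal_arg_marking": 1},
    "LangB": {"noun_classes": 0, "noun_class_type": None, "verbal_arg_marking": 1},
    "LangC": {"noun_classes": 1, "noun_class_type": "shape", "verbal_arg_marking": 0},
}


def distance(x: dict, y: dict):
    """Share of mismatching values over the features coded in both languages."""
    shared = [f for f in x if f in y and x[f] is not None and y[f] is not None]
    if not shared:
        return None  # nothing comparable
    return sum(x[f] != y[f] for f in shared) / len(shared)


# Pair-wise distance matrix, stored as a dictionary keyed by language pairs.
matrix = {
    (a, b): distance(DATA[a], DATA[b])
    for a, b in combinations(sorted(DATA), 2)
}

for pair, value in matrix.items():
    print(pair, round(value, 2))
```

A matrix of this kind is the usual input to distance-based clustering methods. Skipping inapplicable values keeps a single absent category from inflating the distance between two languages.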


The four typological studies are based on a global sample of about sixty languages, initially designed by Mily Crevels using the following criteria:

(a) geographic spread in the continent
(b) genealogical spread (maximally four languages from one family)
(c) quality of the language description and documentation (as of 2008)

The sample was slightly modified by the four authors for practical reasons. The advantage of this sample is that it gave us a broad overview of the languages of the continent; the disadvantage is that it does not allow studies of developments within families, nor does it by itself allow detailed analysis of specific regions or linguistic areas. Although each individual typological chapter works with a slightly modified version of the core sample to remedy some of these defects, the sample does allow for more balanced statements about the continent as a whole than have been possible so far.

In addition to the global sample, a number of specific samples have been used in individual chapters. Details are given in the chapters concerned, but this is an overview:

• 90 structural features for 14 Isthmo-Colombian languages
• 23 structural features for 30 languages of the Andean foothills
• 126 structural (noun phrase and argument realization) and 37 specific morphosyntactic features for 24 Andean languages
• 154 features, both general and Arawak-specific, coded for 36 Arawakan languages
• 100 lexical items for 50 languages and 20 structural features for 30 varieties in the Tupian family.

6.2 Database and computational tools

All data in our study are stored online and are accessible through an online browsable interface, created and maintained by Harald Hammarström for our project (cls.ru.nl/staff/hhammarstrom/sails.html). In addition to the languages and feature specifications covered, the interface provides several computational tools developed by Hammarström, such as an algorithm to calculate pair-wise distances between selected languages, an algorithm to calculate predictability between pairs of features, a tool that allows the online analysis of user-generated Sprachbunds, and a tool to map innovations within a sample of related languages.

A variety of statistical and computational tools are used in this volume to describe and illustrate the distribution of patterns and features. One tool is NeighborNet (Huson and Bryant 2006), a distance-based method that calculates a pair-wise distance matrix from the features database and then applies an agglomerative clustering algorithm to produce a network representation of similarity in the data. Some of the chapters also feature the use of GIS (Geographic Information Systems) for displaying the spatial distribution of linguistic features and other elements of material and non-material culture.

7 The present volume

In this volume we approach the issues involved from a number of perspectives. First there are two further chapters addressing general issues. In a chapter on migrations and dispersals, Loretta O’Connor and Vishnupriya Kolipakam review recent literature in the human sciences about population movements into and around South America and tie these to the investigation of language contact. Our understanding of the peopling of the Americas is undergoing massive revision, as discoveries of new data and development of new tools push settlement dates back in time and reveal previously unidentified links in the spread of people and culture. There are direct ramifications for the study of language history and for modeling the timing and extent of language change. Revised dates serve as starting points and waypoints in phylogenetic approaches. Genetic studies begin to suggest patterns of migration and dispersal identifiable in the DNA of archaeological remains and of contemporary speakers. Ground-breaking research explores the reach of specific cultures, engraved on ceramics and etched into the Earth’s surface, that delineate the interactions of particular speaker communities. A lack of written history and scarce linguistic documentation have long kept the linguistic history of South America poorly understood. Today, evidence from across the human sciences promises a fuller, clearer picture.

The complex issue of the lexical relationships between South American languages is analyzed by Harald Hammarström. Comparison of basic vocabulary has been the default method for sorting out the fundamental relationships between South American languages since the very beginning. While extensive grammatical data are increasingly becoming available, the notion of basic vocabulary and its role for genealogical classification is now also better understood, and lexical comparison continues to play a major role.
A key debate centers on how lexical items are compared and analyzed to determine family relationships. The classic method involves manual reconstruction of items through detailed lexical–phonological comparison, based on assumptions which are rigorous but require considerable time and expertise. Newer approaches automate lexical comparison, making it wider-scope, systematic, and objective but relying on reduced detail and simplistic assumptions. This chapter compares the outcomes of three classifications – two manual and one automated – and finds little difference among them, opening up the possibility that we can achieve worthwhile results quickly from very small numbers of lexical items per language.
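To give a sense of what automated lexical comparison can look like, here is a minimal sketch. It is not necessarily the procedure evaluated in the chapter: it assumes one common family of techniques, in which the edit distance between the words for the same meaning is normalized by word length and averaged over a short list of meanings. The word forms are made-up strings, not data from any language.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


def normalized(a: str, b: str) -> float:
    """Edit distance divided by the length of the longer word."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


def wordlist_distance(x: dict, y: dict) -> float:
    """Average normalized distance over the meanings attested in both lists."""
    shared = sorted(set(x) & set(y))
    if not shared:
        raise ValueError("the two word lists share no meanings")
    return sum(normalized(x[m], y[m]) for m in shared) / len(shared)


# Invented word lists keyed by meaning (hypothetical forms).
list_a = {"hand": "pata", "eye": "miru"}
list_b = {"hand": "paka", "eye": "solo"}

print(wordlist_distance(list_a, list_b))
```

The sketch makes the trade-off visible: the comparison is fast and reproducible, but it treats every segment difference alike and knows nothing about regular sound correspondences.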


The second part of the volume presents a series of case studies of different matrices or spheres, defined regionally or in terms of expansion zones of particular families. Languages of the Isthmo-Colombian area, addressed by Loretta O’Connor, are spoken at the gateway to South America, across the land bridge that connects the American continents and along the northwest coast of the southern landmass. Dominated by speakers of Chibchan languages for millennia, the zone is characterized by rich resources, long-term settlement, and relative social stability. Goods and technologies were exchanged within the region and with neighbors north, south, and east. O’Connor examines languages of this region to assess the role of structural features as indicators of the effects of language contact. Features from seven Chibchan languages and seven neighbors are compared with respect to stability and function. Results suggest that functional properties of features related to speaker interaction have played a stronger role than strictly formal properties in shaping the current linguistic profile. The Andean foothill region or the upper Amazon is surveyed by Rik van Gijn. This is one of the linguistically most diverse areas of the continent. Its position in between three culture areas that have been expanding over the last millennium (Arawakan and Tupian in the Amazon area to the east, Quechuan-Aymaran in the Andean area to the west) raises the question of what imprint these expansions have made on the many languages involved. This study assembles an inventory of features designed to maximize differences between “Andean” and “Amazonian” profiles for thirty of the region’s languages and analyzes these to measure linguistic distance and correlations with geographic factors. The foothill languages are characterized by a separate typological profile in between the two regional types. 
Morphosyntax seems more stable than phonology, and there is a partial correlation between elevation and a more Andean profile.

The well-known Andean matrix is discussed by Simon van de Kerke and Pieter Muysken. Over the last three thousand years the Central Andean area has been the stage for the rise and fall of different civilizations, peaking with the Inca Empire that controlled an area from Ecuador to Bolivia just before the Spanish invasion. Lexical comparisons within and between varieties of Aymaran and Quechuan suggest intimate language contact, but analyses based on morphological and syntactic features have never reached the same level of theoretical depth as those based on lexical comparison. This chapter aims at a more refined comparison of the two language families. It includes an analysis of Quechuan dialectal variation and takes other relevant languages such as Uru-Chipaya, Cholon, and Mochica into account.

Another important sociopolitical constellation is the Arawak matrix, discussed by Love Eriksen and Swintha Danielsen. In 1492, Arawakan languages were distributed from the Greater Antilles in the north to the Gran Chaco area in the south, and from the Amazon River mouth in the east, to the eastern Andean slopes in the west. Behind this successful linguistic expansion was a powerful cultural complex, the Arawak matrix, which this chapter defines, exemplifies, and maps using GIS to explore the distribution of cultural and linguistic features. The depiction of Arawakan culture over space and time illustrates the inception, expansion, and subsequent fragmentation of the Arawak matrix, linking sociocultural mechanisms of the Arawakan diaspora and the spatial distribution of linguistic features.

The final constellation in this section is constituted by the Tupí expansion, described by Love Eriksen and Ana Vilacy Galucio. Tupian languages are or were spoken from the Brazilian Atlantic coast through Paraguay to the eastern Andean slopes of Peru. The study compares material culture with phylogenetic results of lexical and structural comparison, using GIS to map the spatial distribution of cultural and linguistic features associated with Tupí-speaking groups and plot their historical expansion. The chapter breaks new ground in combining traditional studies of material culture with linguistic data and in mapping their relationship to other cultural attributes.

The third part of the volume focuses on language typology, through a comparative study of the distribution of various linguistic patterns. Languages from South America have until lately played a rather marginal role in typological studies, mainly because of the scarcity of available up-to-date language descriptions of an adequate quality. Recent years have seen the appearance of top-quality descriptive materials of many under-documented languages.

Within the wider topic of Tense, Aspect, Modality, and Evidentiality, a study by Neele Müller traces the internal and external forces that shape patterns of desiderative marking in South American indigenous languages.
Features were selected on semantic and formal grounds from a sample of eighty-five languages, and results show that desideratives are more frequent in South America than elsewhere in the world. A series of family-based and regional accounts illustrates the interplay of genealogical and geographic factors and the language-internal developments that shape specific inventories, as languages fill gaps created by semantic shift or split.

Claims of areal signatures in argument realization patterns in the languages of South America have often failed to account for the distribution of these properties over the continent as a whole. Joshua Birchall makes a first attempt at a more holistic description with a systematic analysis of the distribution of structures used to mark core arguments in a database composed of sixty-five South American indigenous languages, selected to optimize geographic and genealogical diversity. The quantitative analysis develops areal descriptions of argument marking strategies for seven regions. The study finds little support for existing claims about ergativity in Amazonia vs. accusativity in the Andean region but does identify significant areal distributions for patterns of hierarchical marking and clusivity.

Olga Krasnoukhova’s chapter on the Noun Phrase presents the first major cross-linguistic study of this constituent throughout the continent, providing a detailed review of specific typological structures as well as observations on their areal distribution. The chapter then focuses on semantic features encoded by demonstratives in a dataset of fifty-five languages, and the findings recommend additions to both the number of features and the typology of features reported in the demonstratives literature. Rich, illustrative examples are presented from recently published work, especially from languages of Western Amazonia and the Chaco regions.

Finally, subordination strategies are discussed by Rik van Gijn, who investigates the geographic distribution of these structures in clause subordination and evaluates the likelihood of their having been spread through contact, as opposed to genealogical inheritance or chance. He finds that while South America has a larger number of nominalized structures than other parts of the world, the distribution of these structures is not connected to any specific semantic domain. Both genealogy and contact influence the variation of nominalization type across languages, in patterns that suggest a series of local spreads but not a continent-wide spread region.

Major findings and conclusions are presented in a final chapter co-authored by the group, led by Pieter Muysken.

Part I Introduction to South America

2 Human migrations, dispersals, and contacts in South America

Loretta O’Connor and Vishnupriya Kolipakam

This chapter examines ancient population movements into and within South America to understand what these can tell us about contacts among people and therefore among languages. Basing our discussion on a rich sample of recent literature in the human sciences outside linguistics, we describe evidence of human migration, presence, and interaction. The genetics profile suggests that while three source populations from Siberia migrated into the American hemisphere, South America could have been populated by a single source population that entered from the northwest around 15,000 BP and slowly dispersed eastward in multiple crossings of the cordillera. Broad-spectrum subsistence strategies adapted to local ecologies encouraged the development of myriad pockets of population density and cultural development as communities slowly filled the continent. For millennia, exchanges of material and non-material culture took place mostly “down the line,” and not through major demic dispersals, at least until significant ethnolinguistic expansions that began some 4,000 years ago, contemporary with the emergence of large-scale agriculture.

1 Introduction

The story of migrations and dispersals plays an important role in sorting through the complex picture of South American linguistic diversity. In this chapter, we use evidence from outside linguistics to sketch answers to basic questions about scenarios for prehistoric language contact in the region. When did people first populate the Americas, and when and where did people first enter South America? How many source populations do we know of? Who were the early migrants, and how many were there? As people dispersed throughout the continent, what are the indicators of population density and interactions with other groups?

We thank Pieter Muysken, Olga Krasnoukhova, and Martine Bruil for thoughtful reviews of this chapter.

South America was the last continent to be inhabited by humans, and common sense assumptions have long shaped our notions of when and how the so-called New World was populated. It makes sense that the primary entry point to each American continent was a land bridge, from Asia across the Bering Strait into North America, and from Central America across the isthmus into South America; these natural bottlenecks suggest that humans (and languages) in the Americas might have evolved from a few sources or even a single founder group. Because large quantities of distinctive projectile points identified from a Clovis culture (11,000 BP) were found all across North America and identified in the southern continent, it made sense to assume that the first South Americans were descendants of big game hunters associated with this culture, who simply followed their prey south. And given the vastness of the continent and the variety of climate and topography, it would seem to make sense that the early South Americans entered pristine, empty landscapes, having little contact with each other, and that they developed complex civilizations in cooler, drier climates such as the Andes, rather than in the steamy, overgrown tropics and floodplains of the interior.

Research in various disciplines over recent decades reveals a different story. Not only are dates of human presence in South America being pushed back in time, but in addition prescriptive archaeological models both of cultural essentialism, based on a series of social stages of development from “simple” to “complex,” and of environmental determinism, which views humankind as fundamentally reactive and subject to the demands of specific ecologies, are giving way to an appreciation of the adaptability of early South Americans in diverse lifestyles and the ingenuity of the settlers in interaction with their surroundings. In this sense, the prehistory of the continent goes against received wisdom: from the start, people actively shaped their environments and engaged in a variety of broad-spectrum subsistence strategies.
Nevertheless, great gaps remain in what we can learn from the multidisciplinary record of South American prehistory, and the numbers and locations of contemporary populations, who have undergone massive loss of linguistic and genetic diversity, can be misleading.

This chapter aims to characterize the migrations and dispersals that constitute the history of South American speaker communities, and that shaped the language contact landscape, through a broad multidisciplinary assessment. Fundamental questions related to the number of original languages or language families in South America are addressed first (in Section 2) by examining what we know about the timing and number of initial entry points to the hemisphere and to the southern continent, using evidence primarily from archaeology and genetics. We then turn to the possibilities and likelihood of language contact in South America by examining the trace of population movements inside the continent and the evidence for population density and interaction. In Section 3 we review proposals for the earliest paths of internal population dispersals, noting especially the outcomes of genetic modeling, and in Section 4 we consider the factors that motivate the decision to migrate and offer summaries of evidence for population density and distribution based on primary technologies and activities. We conclude with a few remarks on the roles of different types of evidence in understanding what migrations and dispersals can tell us about language contact in prehistoric South America.

The approach of this chapter reflects an inclusionary movement in the study of prehistory that advocates the layering of data from as many sources as possible in the search to uncover the past (e.g. Blench and Spriggs 1997–1999; Hornborg and Hill 2011; Heggarty and Beresford-Jones 2012). While perspectives from linguistics, archaeology, anthropology, and ethnohistory have long been combined in African research, the practice is expanding in other area studies, chiefly due to recent developments that now permit the large-scale integration of information from many other fields, such as geographic ecology, botany, and, critically, human genetics. The field of molecular anthropology has contributed much to our understanding of human dispersal across the world. In addition, computational models enhance the manipulation and predictive power of quantitative data. A key challenge for our approach is the mapping of results across disciplines, linking rates and directions of evolutionary change, and linking instances of material culture with specific ethnolinguistic groups.

Research in human genetics investigates patterns among several types of markers, including classical markers, found in blood groups and serum proteins, as well as molecular genetic markers. The latter category encompasses non-sex-specific autosomal markers in addition to the key sex-linked markers of mitochondrial DNA (mtDNA), which shows maternal history, and Y chromosome DNA, which shows paternal history. Basic units of analysis include alleles (alternative forms of gene traits, such as blue vs. brown eyes), haplotypes (combinations of alleles that are transmitted together), and haplogroups (collections of haplotypes).
As in historical linguistics, different types of change can be represented with different graphics, using bifurcating trees (composed of a sequence of two-branch forks) for a series of fissions that radiate out from a center and using networks for changes such as mutation, transposition, and loss. The trick is in finding the appropriate way to compare and combine these diverse data: a strictly tree-like model based exclusively on lexical cognates misses a great deal of the history of a language (e.g. Ross 2003); genes can cross linguistic borders (e.g. Hunley and Long 2005); the transmission of non-linguistic cultural legacies responds to external factors such as ecological adaptation and geographic diffusion as well as family or clan inheritance (e.g. Guiglielmino et al. 1995), and any analysis of South American genetics is challenged by the trace of admixture that dilutes the pre-Columbian signal (Hunley and Healy 2011). Rates of change may vary, as well, across disciplines and within each discipline across time.

The recurring motif in this chapter is “pulse and pause”: periods of fission and rapid change, followed by periods of stability, as shown in studies on the multi-stage settlement of the Pacific region by Austronesians (Greenhill and Gray 2005, Gray et al. 2009). These studies explicitly tie linguistic data to archaeological data in quantitative analyses that apply statistical methods from biology to the investigation of language prehistory. The family tree model, particularly suitable for looking at modifications in language, culture, or genes engendered by rapid change events – a “pulse” of human migration or biological speciation – is often nicely represented by statistical methods that produce bifurcating trees. In genetics, a “pulse” of rapid population expansion is identifiable in features such as exclusive polymorphisms (genetic variations) and novel clades that are retained in large populations. Correspondingly, the type of modifications produced by slower rates of change – a “pause” characterized by longer-term contact and convergence – may be well represented by waves or networks that illustrate and consider the effects of lateral connections between ancestral units, whether language families, cultural practices, or genomes (e.g. Mace et al. 2005; Gray and Jordan 2000; Gray and Atkinson 2003; Gray et al. 2007; Greenhill et al. 2010); as more South American data are collected and collated, we can look forward to analyses in this direction.

A similar “pulse and pause” rhythm is exemplified in the model of “punctuation and equilibrium” (Dixon 1997). The key notion in this work is that during a punctuation, language change is compatible with a bifurcating tree model, as languages split or disappear at a fast rate, while during a period of equilibrium, language change is compatible with a wave or network model, as features diffuse and languages gradually become more similar. These explanations of language change have not been widely accepted in linguistics (e.g.
Bowern 2006), and the anthropological theory of ethnogenesis would argue for precisely the opposite outcome during equilibrium, when linguistic and other cultural differences may in fact be strengthened and even constructed as badges of ethnic identity (Hornborg and Hill 2011). Our point here is to note the pulse and pause rhythm while keeping an open mind about its specific repercussions on sequences of change in language or other cultural practices. A summary review chapter such as this one cannot do justice to the breadth and depth of the literature on the peopling of South America, and indeed we discuss only a handful of the relevant themes. Key archaeological sources include Dillehay (2000) and a variety of excellent papers in the Handbook of South American archaeology (Silverman and Isbell 2008), a special issue of Diversity journal (2010), and Hornborg and Hill (2011). All dates are reported in uncalibrated years BP (before present) unless otherwise noted.1 1

Archaeological dates are typically calculated by measuring the level of a radioactive isotope of carbon (Carbon-14, 14 C) in organic remains. As radiocarbon years differ slightly from calendar years, a number of “calibration curves” have been developed to align date assessments which are then reported as 14 C-years or cal BP. Sources used in this chapter mostly reported dates

Human migrations, dispersals, and contacts in South America

2

33

The first migrations

There is clear archaeological evidence of human presence throughout the South American continent dating from the Late Pleistocene (the geological epoch that ended roughly 12,000 years ago) and Early Holocene (the epoch of warmer climate following the Pleistocene that continues today). Starting on the isthmus in the northwest corner, the forests of La Yeguada in Panama have been inhabited and modified by humans continuously since 11,000 BP (Cooke 2005: 137; see Map 2.1). In Colombia, hearths, tools, and small animal bones at highland rock shelters in Tequendama and Tibit´o have been dated somewhere between 13,000 and 10,000 BP (Dillehay 2000: 123), and the Venezuelan sites of Taima-Taima, Muaco, and El Jobo have yielded remains of larger animals, tools, and projectile points from at least the same time frame and perhaps much earlier (Navarrete 2008: 432–433). Sites documented among the traces of early hunting and fishing camps along the Pacific include those of the Paij´an culture on the Peruvian coast and a Paleoindian culture in the Ayacucho Basin, both dated at about 11,000–9000 BP (Roosevelt 1999: 272–273), and we find older remains on the Chilean coast at the TaguaTagua butchering sites (11,400–11,000 BP, 10,190–9700 BP) and at the Monte Verde settlement, with a wide variety of evidence dating from at least 12,500 BP and with tools and charcoal dated with much less certainty to 33,000 BP (Dillehay 2000: 157–168). In the same Late Pleistocene–Early Holocene period, there were also people east of the Andes throughout the Southern Cone. Distinctive fishtail points dated between 11,600 and 10,200 BP have been found in southern Patagonia, the Pampas, and central Chile (Scheinsohn 2003: 346). There is evidence of megafauna hunting and consumption at Piedra Museo in Patagonia at almost 13,000 BP, and tools, points, and camelid bones from around 12,600 BP were excavated at Los Toldos in southern Patagonia (Scheinsohn 2003: 347, Dillehay 2000: 209). 
Points, animal bones, and human bones from some 11,000 years ago were found at Fell’s Cave in Tierra del Fuego (Dillehay 2000: 36). Moving north up the Atlantic coastal region of Brazil, stone artifacts from 12,070–9000 BP were uncovered at Lapa do Boquete (Dillehay 2000: 198), and people of the Itaparica Tradition complex (11,000–8000 BP) left painted rock shelters, projectile points, scrapers, bones, and plant remains (Roosevelt 1999: 314). At Pedra Furada, while proposed charcoal dates as early as 30,000 BP have been rejected as evidence of human presence, other artifacts indicate human activity by 11,500 BP (Scheinsohn 2003: 345, Dillehay 2000: 194). Late Pleistocene and Early Holocene records of human occupation in Amazonia include flakes, rock shelter deposits, and shellmounds, with the Monte Alegre culture at Pedra Pintada near Santarém, dating from 11,200 BP, as the best-documented example of Amazonian Paleoindian culture (Roosevelt 1999: 312–314). In summary, the archaeological picture to date establishes that there were human settlements of varying degrees of sophistication throughout South America, at roughly the same time, often located near coastlines and rivers. The question now becomes, how and when did they first arrive?

[...] in uncalibrated radiocarbon years Before Present (BP), with ‘present’ defined as 1950. Dates reported as BC, AD, or CE were recalculated in this chapter as BP using a Christian Era baseline of 2,000 years ago.

Map 2.1 Archaeological sites from the Late Pleistocene–Early Holocene in South America
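The chapter's date convention (BC, AD, or CE dates recalculated as BP against a baseline of 2,000 years ago) amounts to simple arithmetic. The sketch below is only an illustration of that rule; the function name `to_bp` and the constant name are hypothetical and do not come from the chapter.

```python
# Baseline stated in the chapter's note: Christian Era dates are
# recalculated as BP using a baseline of 2,000 years ago.
CHRISTIAN_ERA_BASELINE = 2000


def to_bp(year, era):
    """Convert a BC, AD, or CE year to years BP under the chapter's convention.

    `to_bp` is an illustrative helper, not something defined in the source.
    """
    era = era.upper()
    if era == "BC":
        return year + CHRISTIAN_ERA_BASELINE
    if era in ("AD", "CE"):
        return CHRISTIAN_ERA_BASELINE - year
    raise ValueError(f"unsupported era label: {era}")


print(to_bp(1000, "BC"))  # a date of 1000 BC
print(to_bp(500, "AD"))   # a date of AD 500
```

Under this rule 1000 BC corresponds to 3000 BP, which matches the equivalence quoted from Silverman and Isbell later in the chapter.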

2.1 Entry from Asia, initial migrations into the Americas

While there is debate about the first time and place that humans set foot in the Americas, most theories assume that the earliest migrations originated in northeast Asia, a view corroborated by genetic evidence to be detailed below. The first migrations took place during the final stages of the Pleistocene epoch (2,600,000–12,000 BP) or Ice Age, a period characterized by repeated glaciations of the planet’s surface. The Last Glacial Maximum (LGM), when the Earth’s ice cap was at its thickest and the seas were at their lowest, was reached between 22,000 and 18,000 years ago, and most proposals date the arrival of humans some 20,000 to 15,000 years ago. Sea levels 100 to 170 meters below current levels meant that a strip of land at times 1,000 miles across connected what are now Siberia and Alaska; this land bridge, called Beringia, permitted humans and animals to migrate by land into the Americas over what is now the Bering Strait. As the Laurentide and Cordilleran glaciers covering North America melted and parted in stages over the next several millennia, groups of Beringian hunters could have traveled south through an ice-free corridor along major river valleys near the western mountain spine. And yet, the earliest population movements to the Americas need not have been by land. Other theories propose Late Pleistocene travel along the Pacific Rim. Building on work by Fladmark (1979) for North America, Gruhn (1988) describes the likelihood of possibly multiple migrations of Asians to the Americas along the Pacific coasts. The periodic receding of glaciers even prior to the LGM meant that coastal hunters could feasibly have moved along shorelines during warm periods when temperatures were low. Coastal shelves were capable of sustaining travelers from Japan to Beringia, through the Aleutians, down the coast of the Americas, and all along the increasingly glacier-free southern edge of the landmass. 
Although the idea of ancient, periodic migrations along a western coastline is enticing, the proposal is hard to evaluate. Multiple tectonic plates shift and collide all up and down the western border of the American continents. As Dillehay (2000: 64) observes of the Pacific shoreline, “the interpretation of the South American coastal evidence is complicated by the rapid uplift of the Andes and its impact on archaeological visibility.” The relatively steep continental shelf suggests “the shoreline generally shifted by less than 50 kilometers, whereas along the Atlantic and Gulf coasts, where the shelf is wide and gently sloped, the shoreline moved by 100 kilometers or more” (p. 65). Evidence of human presence pre-15,000 BP would have to be found on islands or on lands now covered by the sea.

Another possibility for early population is entry to the Americas from the east, along the Atlantic coast. Based on perceived similarities between certain stone and bone industries of the Solutrean culture of Spain and southern France (24,000–16,000 BP) and artifacts of the Clovis culture as well as earlier sites in eastern North America, Bradley and Stanford (2004) have suggested there was migration along the edge of the ice sheet that connected northwest Europe and North America. Similar to the Pacific proposal, any archaeological evidence for the journey is underwater, and the Solutrean connection has been challenged by genetic analysis that shows the founder lineage from Europe could not have given rise to the X haplogroups present in the American continents (Malhi and Smith 2002). Alternatively, O’Rourke and Raff somewhat playfully point out that “it is useful to recall that Beringea had two coastlines, northern and southern” (2010: 206). An east-coast migration – from present-day Alaska through open waterways across northern Canada and then south along the Atlantic coast – would actually be consistent with certain facts, such as an unusual distribution of certain genetic haplogroups and the finding of the earliest and the most numerous Clovis artifacts in eastern North America. Evidence from genetic data strongly supports an eastern or central Asia origin for the first Americans and an initial migration that proceeded via Beringia. Analysis of autosomal DNA, mtDNA, and Y chromosomal data indicates that the diversity present within the Americas is a subset of the diversity found in Asia: for mtDNA, subsets of haplogroups A, B, C, D, and X, and for the Y chromosome, of haplogroups Q or C (Torroni et al. 1992, 1993; Forster et al. 1996). All native American mtDNA haplogroups show similar diversity indices and coalescence age estimates, as well as a suite of polymorphisms restricted to the Americas (Fagundes et al. 2008). 
The mtDNA evidence suggests that a single founding lineage of the Native American ancestors accumulated variation either before or after entering the New World and then rapidly expanded to populate the continent (Wang et al. 2007; Kemp et al. 2007; Tamm et al. 2007). Recent coalescent age estimates of genetic material restricted to the Americas and evaluations of accumulated polymorphisms point to an original separation from Asia 18–15,000 BP (Torroni et al. 1993; Forster et al. 1996; Zegura et al. 2004; Goebel et al. 2008; Fagundes et al. 2008; Kitchen et al. 2008; O’Rourke 2009; Perego et al. 2009), corroborating the archaeological record. There is no clear genetic or archaeological evidence for a direct voyage to the Americas across either ocean (Dillehay 2000: 65).

2.2 How many source populations were there?

Hamilton and Buchanan (2010) conduct diffusion analyses on a large dataset of radiocarbon dates to test for the origin, timing, and geographic pathways of modern humans from Eurasia into the Americas. Their results support a three-stage expansion that began in southern Siberia and moved across into modern Alaska 47,000–32,000 cal BP, experienced a “Beringian pause” 32,000–16,000 cal BP, and then expanded again post-LGM at around 16,000 cal BP.2 To date there is little archaeological evidence for an interim settlement of significant duration in Beringia, but the notion of multiple and perhaps quite early migrations from Asia is a factor in many recent models. For example, the dates and “pulse and pause” dynamic of Hamilton and Buchanan’s three-stage expansion are compatible with proposals that use differences in skull morphology as evidence of at least two major waves of migration into the Americas, both from east Asia (González-José et al. 2003; Hubbe et al. 2010). Some research suggests that the first wave represents a migration that started in south Asia, involving people who shared ancestry with modern Australians (Dillehay 2003). This interpretation posits a migration of humans with “Australo-Melanesian” skull morphology before 10,000 BP and other migrations of humans with “Mongoloid” type morphology after 10,000 BP.3 Other craniometric research (Pucciarelli et al. 2006) finds significant differences between skulls in South America east and west of the Andes. All studies urge more data and more sophisticated assessment to understand if craniometric differences stem from different human stocks or reflect processes such as uneven rates of genetic drift and intra-regional gene flow. While there is strong support from genetics for a founding population originating in Asia, it should be noted that a single founding population does not necessarily mean a single migration event. The Americas could have been successively colonized by multiple migrations from a single founding population. That said, some genetic research in the past decade has pointed to more than one source population.
Several proponents of dual-founder migration theories base their proposed routes on uneven geographic distribution of mtDNA haplogroups. Schurr and Sherry (2004) claim that bearers of haplogroups A, B, C, and D migrated via the Pacific coast from Siberia around 20–15,000 BP, followed by a second wave of migration, involving bearers of haplogroup X, to the eastern and continental parts of North America. Perego et al. (2009) identified two rare mtDNA lineages – D4h3, found only in the Pacific coastal regions of North and South America, and X2a, restricted to northeastern North America – and concluded that this indicated one coastal migration and a contemporaneous but separate migration into the interior of North America. O’Rourke and Raff (2010) showed that haplogroup B, one of the founding mtDNA haplogroups, is only found along the west coasts of both continents and not found in the northern part of North America or in the southern cone of South America. In complement to genetic studies using sex-specific markers, a recent study by Reich et al. (2012) finds evidence for not two but three separate gene streams in American populations that arrived from Siberia. The result is based on analysis of single nucleotide polymorphisms (SNPs, pronounced “snips”), variations in the DNA sequence tantamount to typos in the transmission of the billions of nucleotides (A, T, C, G) in the human genome. The findings suggest that only one of those gene streams is represented in South America.

2 “In general, however, for the past back to about 3000 BP (1000 BC) radiocarbon assays tend to give dates that are older than calendar years. Calibration brings these events forward, into more recent time. . . . On the other hand, from about 1000 BC back, the calibration of radiocarbon assays tends to produce older dates. At 2000 BC, calibration adds a couple of centuries until at 10,000 BC it adds a millennium and a half, to almost two millennia, which then becomes tremendously significant in discussions about the peopling of South America and its earliest sites” (Silverman and Isbell 2008: xix).

3 Another interpretation suggests “recurrent gene flow among Asian and the American populations in the Arctic region” (Long and Bortolini 2011: 492) and basically supports cross-Beringian interaction during a lengthy “pause.”

2.3 Group size and internal differentiation

Could small individual groups have survived without contact? Probably not. Models to describe population growth and dispersal implement estimates related to population, such as initial size, average size, reproductive rate, and size at which the group would split up, as well as geographic parameters like natural barriers, historical glacier borders, paleovegetation, and estimated range of group mobility (Steele et al. 1998, Anderson and Gillam 2000, 2001; Hazelwood and Steele 2004). Some models assume a high birth rate until a certain “carrying capacity” or maximum sustainable group size is reached, at which point the community will decrease the growth rate through “cultural mechanisms of fertility regulation, such as infanticide, delayed weaning and lactation-induced amenorrhea” (Steele et al. 1998: 299). Hypotheses suggest groups ranging from 25 to 150 people, but the viability of very small groups is challenged (e.g. Moore and Moseley 2001). Theories of paleodemography, the study of mortality, fertility, migrations, distributions, and densities of prehistoric populations, done mostly by examining skeletal remains (Meindl and Russell 1998: 376), suggest that small-group isolation was not a sustainable situation. The minimum viable size for the survival of a group depends upon the so-called “effective population size” or Ne, which for any group is “the breeding population and not . . . the total number of individuals of all ages” (Wright 1931: 110). Bocquet-Appel (1985) uses the metric of effective population size to explore hypotheses of genetic and cultural diversity in relation to population size and geographic dispersion. Because factors such as random fluctuations in the female to male birth rate would affect the number of reproductive members in small groups more severely than large groups, the smaller the Ne, the greater the necessity for “migratory flow,” finding sexual partners outside the group, in order to maintain reproductive viability. Migratory flow is estimated at “about 11% for a group of 20 individuals, 7% for 50 individuals” (p. 685). Higher levels of interaction for intermarriage suggest higher levels of interaction involving all kinds of cultural exchange – including linguistic – and these might take place over broad geographic ranges. This leads to certain predictions.

Genetically one therefore expects small populations to exhibit relatively high between-group homogeneity and inversely, large populations should exhibit high between-group heterogeneity. Culturally, the more freely individuals move between groups, the more likely it is that diverse techniques and practices will be exchanged. One would therefore expect more cultural unity between small populations, even over a large geographic area, than in large populations which tend to develop strong local or regional characteristics. (Bocquet-Appel 1985: 686, italics added)
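The sensitivity of effective population size to the balance of breeding males and females, mentioned above, can be illustrated with a short calculation. The sketch assumes the standard textbook approximation for unequal sex ratios, Ne = 4 × Nm × Nf / (Nm + Nf); the chapter does not state this formula, and neither the function name nor the example group compositions come from the source. It is not Bocquet-Appel's model of migratory flow.

```python
def effective_population_size(breeding_males, breeding_females):
    """Approximate Ne for a group with unequal numbers of breeding adults.

    Uses the textbook unequal-sex-ratio approximation
    Ne = 4 * Nm * Nf / (Nm + Nf); this formula is an assumption
    brought in for illustration, not taken from the chapter.
    """
    total = breeding_males + breeding_females
    if total == 0:
        return 0.0
    return 4.0 * breeding_males * breeding_females / total


# Hypothetical groups of 20 breeding adults with increasingly skewed sex ratios.
for males, females in [(10, 10), (5, 15), (2, 18)]:
    ne = effective_population_size(males, females)
    print(f"{males} males, {females} females: Ne = {ne:.1f}")
```

With the headcount held at 20, the approximation gives an Ne of 20 for an even split but only 7.2 when two males and eighteen females are breeding. This shows why a chance skew in a small group makes partners from outside the group more necessary.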

Based on evidence from skeletal remains from Saccopastore (Italy) and Broken Hill (now Kabwe, Zambia), Bocquet-Appel’s conclusions are that Pleistocene groups could have been small in number and geographically dispersed yet have maintained high migratory gene flow between groups, implying regular contact and the sharing and exchanging of cultural features (1985: 689–690). Larger groups in smaller areas would tend to promote less gene flow outside the group and greater cultural diversity within the group. We will see results attributable to these opposing patterns of interaction exemplified in Amazonian and Andean populations in Section 3. Genetic studies have proven to be a good tool in testing scenarios for the minimum viable populations that could have been involved in the populating of the Americas. Estimates of the effective size of the founding population of the Americas range from as low as 70 (females of child-bearing age) (Hey 2005) to as high as 1,300 people (Kitchen et al. 2008). Both these studies have used forward and backward simulations to test the viability of these founding numbers. Hey (2005) argued that a single founding population of 70 females could have been enough to start the colonization process, whereas Kitchen et al. (2008) propose it would have required about 640 women and a total of about 1,280 individuals to establish a minimum viable population which could have populated the rest of the New World. They put the source population in Asia at 8–10,000 women and in Siberia/Beringia at 4–5,000 women (Kitchen et al. 2008). In any case, these are low numbers suggesting little initial diversity.

3 Internal migrations and genetic profiles in South America

From a language contact perspective, the question of migratory entry points into the South American continent speaks to how many different language families may have been involved in the initial peopling of the continent. Equally important is an analysis of contact and interaction among the different communities. Did populations advance only into unoccupied territories? Were settlements so far apart that any interactions would have been primarily among speakers of the same language families for centuries at a time? What is the likelihood of – and the evidence for – contact involving unrelated groups, and when and how would this have evolved? We can imagine that many groups did in fact enter South America by crossing the Central American land bridge on foot, where they could have traveled south immediately on the Atrato, Sinú, Cauca, and Magdalena river systems or along the high-altitude flat plains in the Andes. Other groups would have come by boat, foraging along the shore and going inland only when the situation was inviting and necessary. The continent has some 145,000 kilometers of coastline and comprises nearly 18 million square kilometers of surface region. Major natural features include the Andes mountains all along the west coast, gentler highlands in the Guyana Shield, Brazil, and parts of Patagonia, and three enormous river drainage systems of the Orinoco in the north, the Amazon from west to east across the equator, and the Paraná in the southeast (see Map 2.1). Most models of population dispersal use known archaeological sites as waypoints and adopt one of three metaphors to describe the movement: wave of advance, string of pearls, or leap-frog. In a wave of advance model, there is slow and gradual expansion of the population frontier, usually in search of resources. Progress would be slow without the impetus of a specific motivation. A speedy wave of advance was hypothesized for the spread of big game hunters across the Americas in a migration that not only populated the continents with humans but also depopulated the continents of large mammals (e.g. Martin’s 1973 “Overkill” wave).
This model has been criticized as requiring an unreasonably high birth rate and a sustained motivation for a rapid push forward (Bird 2002: 11 and references therein). The other two strategies involve defining a group territory as a spatio-temporal element with a geographic range, often handily represented as a pearl-like circle, and placing these circles according to a group fission rate based on a time interval such as a generation or a century. In a string-of-pearls model, groups that split off are assumed to move into adjacent territory, and the total time of migration between two points is the number of circles (spatial territories) multiplied by the time interval implied by each new circle. The leap-frog model assumes a fissioned group will move to wherever advance scouts have found an attractive destination, which might be a large distance away. The vast majority of known archaeological waypoints in South America have indeed been found near the coast or along the banks of rivers. However, some models of population dispersal suggest that coasts and major rivers may not have been the primary means of initial internal migration. Anderson and Gillam (2000) use publicly available Geographic Information Systems (GIS) data as
a basis for calculating proposed “least-cost pathways” of early migrations through the Americas, based on a variety of start and end points. The pathways “assume that people would have taken easier rather than more strenuous routes when moving across the landscape, particularly if these routes gave them a reasonable expectation of finding food and other useful resources” (p. 45). Furthermore, a strict focus on immediate landscape is meant to represent the perspective of early migrants, with limited knowledge of what lay ahead. The authors modeled elevation levels in the natural topography, using glacier and lake boundaries from roughly 12,000 BP as natural barriers, controlled for differences in drainage areas, and generated a “roughness layer” based on slope to simulate the relative difficulty or ease of traversing different types of terrain. In the analysis especially pertinent to this chapter, a primary least-cost pathway is plotted from the Isthmus of Panama to the Los Toldos archaeological site in southern Argentina (13–12,000 BP), with later calculation of secondary routes ending in ten other Late Pleistocene or Early Holocene sites (see Map 2.1). The proposed primary pathway first goes northeast to Taima-Taima and El Jobo near the Caribbean coast of Venezuela before proceeding south on an inland route well to the east of the Andes (p. 51). Suggested secondary pathways follow the Pacific coast to El Inga (Ecuador) and Guitarrero Cave (Peru) and use internal rivers and valleys that branch off the central pathway to reach Monte Alegre and Pedra Furada, both in Brazil, and Quereo, Tagua-Tagua, and Monte Verde, along the Pacific Coast of Chile. The surprising outcome is that proposed least-cost pathways regularly deviated from coastlines and river valleys to include segments that involved climbing, and the authors suggest that modeled least-cost pathways may guide future research in finding more archaeological ruins. 
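The least-cost idea described above can be sketched as a shortest-path search over an elevation grid. The following is a minimal illustration and not Anderson and Gillam's GIS procedure. The grid values, the function name `least_cost_path`, the four-direction movement rule, and the cost formula are all assumptions made for the example. The cost is one unit per step plus a penalty proportional to the change in elevation, a simple stand-in for their slope-based "roughness layer".

```python
import heapq


def least_cost_path(elevation, start, goal, roughness_weight=1.0):
    """Dijkstra search for the cheapest route across an elevation grid.

    Step cost = 1 + roughness_weight * |change in elevation|.
    This cost rule is a hypothetical simplification for illustration.
    """
    rows, cols = len(elevation), len(elevation[0])
    best = {start: 0.0}      # cheapest known cost to reach each cell
    previous = {}            # back-pointers for path reconstruction
    queue = [(0.0, start)]
    while queue:
        cost, cell = heapq.heappop(queue)
        if cell == goal:
            break
        if cost > best[cell]:
            continue  # stale queue entry
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                climb = abs(elevation[nr][nc] - elevation[r][c])
                new_cost = cost + 1.0 + roughness_weight * climb
                if new_cost < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_cost
                    previous[(nr, nc)] = cell
                    heapq.heappush(queue, (new_cost, (nr, nc)))
    # Walk the back-pointers from the goal to recover the route.
    path = [goal]
    while path[-1] != start:
        path.append(previous[path[-1]])
    path.reverse()
    return best[goal], path


# Toy landscape: a high block in the middle surrounded by flat terrain.
toy_elevation = [
    [0, 0, 0, 0],
    [0, 9, 9, 0],
    [0, 9, 9, 0],
    [0, 0, 0, 0],
]
total_cost, route = least_cost_path(toy_elevation, (0, 0), (3, 3))
print(total_cost, route)
```

On this toy grid the cheapest route skirts the high ground. In a real analysis the cost surface would also encode barriers such as glaciers and lakes, which is how a modeled pathway can end up leaving coasts and river valleys.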
At the same time, a noted limitation of this particular analysis is the positioning of coastal shorelines based on measures of sea level as they are today, due to deficiencies and incompatibilities in GIS datasets that reflect the lower sea levels and expanded coastal exposures of the Late Pleistocene. In addition, the very notion of an idealized least-cost pathway is not without its critics. To investigate population movement along their proposed least-cost pathway, Anderson and Gillam (2000) devised a flexible model using assorted combinations of parameters and all three migration strategies. They assumed groups of 25 and 50 people, with varied founding populations, reproductive rates, fragmentation thresholds, and group ranges. By manipulating variables in a series of tests, they produced estimates ranging from 4,500 to only 600 years to fill up the entire New World (p. 54). In more structured testing, they assumed spatio-temporal elements of 400 km in diameter for both the string-of-pearls and leap-frog strategies, finding that a string of 24 “pearls” was required to reach Los Toldos, Argentina, from the Isthmus of Panama; assuming the group
split every century, the least-cost pathway would be traversed in 2400 years. This same trajectory could be traveled much faster using a leap-frog strategy, which in fact fits the archaeological facts better (Anthony 1990; Anderson and Gillam 2000), but the motivation for speed remains unclear. A string-of-pearls or realistic wave-of-advance approach would take a very long time, and with either of those we would expect to find older sites and larger concentrations of population at the beginning of the trajectory and smaller, newer sites at the end. Instead, what we find are dates in the 13,000–11,000 BP range all around the continent. Recent radiocarbon (re-)dating of South American sites and the modeling of population dispersal rates (e.g. Steele and Politis 2009) give repeated support to the idea that there were multiple early departures south from North America, but details remain imprecise. All in all, the picture of earliest internal migrations in South America is not clarified by the current archaeological record. We find more explicit proposals about the number and timing of founding populations and the specific geographic patterns of internal migration from genetics studies undertaken in recent decades. A defining feature of the genetic profile of South America is a distinct east vs. west pattern of genetic diversity roughly delimited by the Andes mountain range, which divides the demographic profile into coastal and Andean populations in the west and Amazonian and other lowland populations in the east. The presence of excessive rare alleles found in some eastern populations (a characteristic of genetic drift with greatest consequence in small, isolated groups) has multiple possible explanations, one of which is separate founders. Based on this type of region-internal diversity, early studies proposed that people from two different source populations populated the regions separately (Lalueza et al. 
1997; Lalueza Fox 1996) and that regional isolation followed. However, another explanation for the phenomenon is gene flow from west to east, which would also reduce the difference between Andean and Amazonian populations as a whole, a solution supported by AMOVA testing (ANOVA for molecular data), which showed negligible distance between Amazonian and Andean populations as a group (Lewis et al. 2007, Lewis 2010). Research from a slightly different angle examines the role of admixture in regional accounts of South American genetic diversity. Hunley and Healy (2011) caution against a quick attribution of east and west differences to multiple waves of population followed by subsequent differences in regional evolutionary histories, noting that European admixture has reduced genetic diversity in the west much more than in the east. The assumption of a single source population and reduced population sizes in the Amazonian region and/or a high level of gene flow between the two regions explains the scenario better and with less complexity, and the multiple-founder hypothesis for South America has been largely dismissed by some (Lewis et al. 2007; Bodner et al. 2012).

Another early hypothesis suggested that a single founding population split and populated the two regions separately (Rothammer and Silva 1989; CavalliSforza et al. 1994; Callegari-Jacques et al. 1994). However, this would also mean that the time estimates for Andean and Amazonian populations should be similar and that they should show negative correlation between genetic diversity and distance from the Bering Strait, the point of entry into the Americas. Wang et al. (2007) found this was true only for Andean populations. Furthermore, two novel mitochondrial lineages restricted to South America (D1g, D1j), whose ancestral haplogroup (D) is present throughout the Americas, were investigated. Given the distribution of the two lineages, it is assumed that they arose in South America and that the dates of the lineages would essentially correspond to the earliest settlement of South America. This date was found to be 16,000 BP ± 1500, which fits the timing of the earliest known archaeological sites, and the coalescence age of these lineages reveals that the populating of the entire continent could indeed have taken place within 2,000 years. This conclusion is also supported by an earlier forward simulation study based on genetic data of mtDNA (Fix 2005). Other features of the west–east genetic diversity provide additional support for initial migrations along the Pacific coast and point to distinct patterns of regional interaction. Callegari-Jacques and colleagues (1994) proposed, based on evidence from protein systems, that either the Amazon river acted as a barrier to gene flow and contact among people, or that genetically differentiated groups from the west entered the region and colonized the region. In more recent work, Lewis et al. (2007) report that Andean populations show a high level of genetic diversity (suggesting in-group growth) and only around 6 percent genetic differentiation among the populations (suggesting significant gene flow between groups). 
These are signs of large effective population size and recent expansion. The same study reports that eastern populations, including Amazonian populations, show lower genetic diversity (suggesting a small effective population or a recent bottleneck) and higher differentiation among groups, at around 30 percent (suggesting little gene flow between groups). For example, the populations of Chile and Peru, though separated by several thousands of kilometers, lie closer on the genetic distance matrix than several Amazonian populations that are much closer geographically but lie at the opposite ends of the distance matrix. This pattern of diversity and difference between eastern and western populations was observed in both mtDNA and Y chromosome patterns (Luiselli et al. 2000; Tarazona-Santos et al. 2001; Fuselli et al. 2003; Fix 2005; Lewis et al. 2005, 2007). Genetic evidence showed that Andean populations evinced higher levels of effective population size and signs of expansion when compared to eastern lowland populations, which showed low population sizes and signs of genetic drift. If eastern populations were colonized by populations from
the west and were then regionally isolated, it could lead to the pattern we see today. In summary, the most well-supported explanation from genetic evidence for the dispersal scenario in South America is that a single founding population first colonized the Andean region, either via the coast or by land from the north. Populations then moved east, crossing the cordillera at many different points, followed by regional isolation, perhaps due to biogeographic factors such as dense forests and diminished visibility and mobility, causing the highly localized genetic pattern we see in the Amazonian region today. This explanation motivates the apparently limited gene flow among the Amazonian populations, who likely came from a single founding population initially and then were regionally isolated. Further research is needed to reconcile questions of different skull types and of the effects of admixture from European and African gene pools.

4 Population distribution, density, and interaction in South America

Anthropologist William Denevan (1992) estimates that the population of the continent in 1492 totaled some 24,315,000 people. This figure includes 15,696,000 in the Andes region, comprising Central Andes, Colombia, and Venezuela, and 8,619,000 in Lowland South America, composed of Amazonia, Argentina, Chile, and the remainder of the continent (p. xxviii).4 How was this population distribution achieved? What can we know about movements and interactions among South Americans before the arrival of Europeans – that is, before the advent of written records? Archaeological studies provide evidence of the earliest known human presence, and genetics research is beginning to suggest the contours of earliest interactions, but they can tell us very little about the specific people linked to certain places or practices and even less about which languages they spoke, what their motivations were for dispersing, and whether they filled the continent slowly or quickly. In this section, we review how populations get distributed across a continent into pockets of density, noting the possibility of contactinduced change in all aspects of culture without the migration of people, and 4

These approximations are based in part on census records and projections backwards, depopulation ratios derived especially from disease behavior, and, perhaps most controversially, on estimates of population density according to habitat. While Denevan uses estimates per square kilometer of 14.6 for the Amazonian v´arzea or floodplains (based on an average of 28.0 for large floodplains and 1.2 for upland forests) and of 1.3 to 2.0 for lowland savannas, other scholars argue for an estimate of 0.3 people per square kilometer, noting multiple repopulations of the same sites and large areas that were uninhabited (Denevan 1992: xxvi–xxvii).

Human migrations, dispersals, and contacts in South America


we present instances of non-linguistic evidence for population density and possible interaction in South America throughout the centuries.

4.1 Dissemination of peoples and cultures

Why do people migrate? In brief, because they want or need to, they know how to, and they can. Anthony (1990) outlines a perspective on population movements based on structures of migration and general principles of human behavior, identifying three primary conditions that favor migration. First is a push–pull dynamic, based on such motives as economic need, population density, beliefs, quests, natural disaster, or invasion, that is sufficient to cause a group of people to migrate, combined with a belief, conviction, or hope that the new location is an appropriate goal. The establishment and maintenance of information flow is another important condition. This means that migrants would likely rely on reports from some type of scout before initiating the migration of a larger group into an unknown territory, and later there would be a flow of information, often through kin, and a flow of people, back and forth between the new location and the community of origin. The transfer and exchange of people and of goods may lead to changes in the original society, as unusual objects are exploited for trade, and the founding families of the new community may experience a change of status, gaining social importance through their role as the source of information and orientation for newcomers. Established pathways can then be traveled by steady streams of migrants in both directions, and people who have migrated once are more likely than habitual home-bodies to migrate again. A third condition that favors migration involves the costs of transportation. The presence or absence of natural obstacles, coupled with technological advances such as the construction of appropriate watercraft or the domestication of beasts of burden will influence a decision to migrate. 
This condition should have proven especially significant in the South American context, where the only domesticable animals large enough to play a significant role in long-distance travel are camelids, mostly found in the Andes region (Stahl 2008). Turning to practices and evidence, Anthony (1990) also observes that the economy practiced by a community will very often bear upon the type of population movement, and this in turn will produce certain features in the social, historical, or archaeological record. For example, a diffuse economy, exemplified by broad-spectrum foraging and gardening, as was common in South America for millennia, might encourage short-distance migration to an adjacent area with similar resources in migration models like the “wave-of-advance” or “string-of-pearls” strategies described earlier. This would be typical of communities whose primary sustenance came from the coast, the forest, or the river, responding to a need for fresh resources and without the


Loretta O’Connor and Vishnupriya Kolipakam

means to move a great distance. Old and new groups would be likely to maintain contact, especially through kin groups, and while short-range dispersals may be traceable through evidence such as patterns of marital residence, they can be difficult to detect archaeologically as they are often slow and within a confined zone of interaction. If, on the other hand, the community engages in a focal economy such as fishing or a particular type of agriculture that depends upon a specific set of natural resources, population movements are likely to involve long-distance moves and the “leap-frog” model of migration to find appropriate new locations. Dispersal is prone to be faster and farther, and migrants would rely heavily on information from scouts (hunters, trappers, prophets, and the like) and on the benefits of transportation technologies. Evidence of scouts might include the presence of only males in mortuary assemblages. Beyond initial dispersals of people into unknown territories, an understanding of the spread of social practices and elements of material and non-material culture is critical to an investigation of language contact.5 Proponents of ethnogenesis, the strategic construction of social identity (Hornborg 2005), argue precisely that cultural change, including linguistic change, can and does take place without demic dispersals. Hornborg and Hill (2011) maintain that while migrations surely took place, conclusions about the character and timing of population movements require contextualization in a multidisciplinary record.

Attempts to explain the distribution of indigenous languages and ethnic groups in Amazonia since the time of European contact, whether by historians, linguists, or archaeologists, have generally been founded on an essentialist conception of ethnolinguistic groups as more or less bounded, genetically distinct populations that have reached their recent territories through migration. . . . On closer examination, the evidence in Amazonia suggests a much more fluid relation among geography, language use, ethnic identity, and genetics (Hornborg 2005). . . . To understand the emergence, expansion, and decline of cultural identities over the centuries, we thus need to consider the roles of diverse conditioning factors such as ecological diversity, migration, trade, epidemics, conquest, language shifts, marriage patterns, and cultural creativity. (Hornborg and Hill 2011: 1–2)

The sample of non-linguistic evidence for population density and interaction presented below is organized into three time periods roughly modeled on archaeological periods in the literature defined by climate, technologies, and social behaviors. Section 4.2 sketches the interim of the first migrations and earliest Paleo-Indian, pre-ceramic societies; 4.3 concerns the period of Archaic hunter-foragers in a range of ecological niches and the beginnings of agricultural practices and ceramics; and Section 4.4 encompasses a type of

5 See Oliver (2008: 186–187, 191–194) for a brief and useful review of theoretical approaches and perspectives on tracing and interpreting the prehistory of South America through archaeology, with emphasis on the evolution of agriculture in Amazonia.


Formative period, characterized by sedentary villages, increasingly intensive agriculture, and greater social and political complexity. Each section presents a few illustrative highlights of data concerning subsistence, material culture, and other practices relevant to identifying the traces of contact among speaker communities.

4.2 Early subsistence economies 14,000–8000 BP

The story of subsistence strategies is a good place to begin examining evidence for population density, as a sufficient year-round food supply is thought to represent the key factor in the emergence of social and political structure, village organization, and other cultural development. The need for sustenance is of course fundamental to human survival. From all indications, the earliest South Americans engaged in a broad spectrum of subsistence activities adapted to the particular ecosystem of their habitat. They relied on a combination of foraging and gathering of naturally occurring products of shoreline or forest supplemented with fish, shellfish, and/or game and were not exclusively big game hunters armed with spears and at the mercy of the surrounding terrain. For example, the diet at Monte Verde in Chile (12,500 BP) included seeds, berries, animals, shellfish, aquatic plants, and medicinal herbs. Intriguingly, some of the herbs and plants came from environments hundreds of kilometers away, suggesting that “the Monte Verdeans either traveled regularly to distant environments or were part of a web of social and exchange relationships” (Dillehay 2000: 165). Furthermore, the remains of wild potatoes found at the site support suggestions that the potato originated in at least two places, in Peru and in Chile (p. 165). The analyzed remains of Paiján culture (11,000–9000 BP, north central coast of Peru) indicate that people ate fish, small birds, land mollusks, and small land animals; the distinctive long points found at the sites were likely used for spearing large fish and not larger animals like deer or mastodon (Roosevelt 1999: 273). Another excellent example of thriving local subsistence is found in the Monte Alegre culture, which began some 11,200–10,000 years ago at the juncture of the Tapajos and Amazon Rivers near Santarém, Brazil, with related sites found downriver as far as Belem (Roosevelt 1999: 313).
These presumed nomadic hunter-gatherers exploited riverine resources such as fish, turtles, and mollusks, as well as forest foods such as fruits, nuts, and small animals. Other examples of early, local agriculture include traces of bottle gourd, arrowroot, leren, and squash in Panama (9000–7000 BP), and evidence of maize, beans, and chili in Argentina (8000 BP). Small-scale gardening, the exploitation of local resources, and broad-spectrum foraging may leave little archaeological evidence of the numbers of people involved, but we find indications of population density from traces of activities related to subsistence. These include physical modifications to the


landscape such as garbage middens and garden heaps, which have long been investigated for clues to the numbers and habits of early South Americans. For example, Roosevelt (1999) concludes there was no contact between the earliest Peruvian cultures of the coast, where middens contain the remains of fish, seafood, and aquatic plants, and cultures of the highlands, where remains of camelids and other types of tools are found. The Monte Verde remains provide another detail about social practices of this period. The structures at this site suggest a complex and coordinated social organization, reflected in separate areas for residential and non-residential activities and evidence that specific locations within these areas had been dedicated to specific tasks, such as cooking, cleaning hides, and sharpening tools (Dillehay 2000: 167–168).

4.3 Ceramics technologies and plant domestication 8000–4000 BP

The earliest South American ceramics date from 7580 BP, found at Pedra Pintada near Santarém in Brazil, and the record includes other early pottery from Taperinha, Monte Alegre, and Salgado (all Brazil) dated between 7500 and 5000 BP, and from the Alaka culture (Guyana) dated 6000 to 4500 BP (Roosevelt 1999: 316–318). Roosevelt (1999: 315) surmises that the river floodplains of Amazonia provided more of the relevant raw materials for ceramic technologies than did, for instance, the Andean highlands, where pottery did not develop for another 4,000 years.6 Ceramics inform the archaeological record in several ways. The composition of the clay as well as the material used for tempering – sand, gravel, bone, shell, fibers, charcoal, crushed rock, crushed potsherds – can help trace where the pot was made. Large, globular, fiber-tempered tecomate pots first appeared about 5,000 years ago in Puerto Hormiga on the Caribbean coast of Colombia, from where the tradition apparently spread north into Central America and east into Venezuela (Allaire 1999: 679–692). These ceramics are distinct from sand-tempered but roughly contemporary wares from Valdivia (Ecuador), Monagrillo (Colombia), and the middle Orinoco (Venezuela). The shape and size of pots suggest how they were used, for cooking, serving, and preserving food and water, or perhaps for ceremonial purposes, such as burial urns. Physical features as well as motifs, materials, and general aesthetics of decoration play major roles in classifying ceramic traditions and in linking these with particular cultures. At the same time, it is also recognized that pots,

6 This same reasoning may apply to the Southern Cone, where fired ceramics came late, but clay itself was a component in fardos funerarios, bundles found in mortuary assemblages in the Puna and northern Argentina and especially associated with complex processes of mummification practiced from 7000 BP in the Chinchorro culture of northern Chile and southern Peru (Scheinsohn 2003: 351).


pot-making techniques, and even potters can be traded and/or transported rather easily among different communities, and we know little about the relative significance and durability of the myriad ceramics styles. Bray (1984: 338) urges caution in making claims about ceramic iconography which is “charged with symbolic messages designed to reinforce the sociopolitical ideology of a particular society. . . . In this kind of context, borrowing a new motif from outside can be a major ideological event – as serious, in its way, as adding a hammer and sickle to the American flag or hanging an icon in a Baptist chapel.” Pottery production often goes hand in hand with developments in subsistence practices, which in turn locate areas of population density and identify networks of commerce and exchange. For example, the necessary detoxification of bitter manioc was perhaps first achieved by cooking the tuber on flat clay plates called budares, artifacts of a practice in Puerto Hormiga and Monsú in northern Colombia that expanded broadly to the Andes, the northern South American lowlands, and the Caribbean (Navarette 2008: 436). In all corners of the continent, local plant domestication and small-scale agriculture came first and probably early, as chance landraces emerged by accident in dump-heap gardens. Birds and animals would have played roles in disseminating seeds, and even ocean currents may have introduced new varieties, as is presumed for bottle gourds native to Africa (Cooke 2005: 140). Large-scale agriculture developed late, but small-scale agriculture and the domestication of local varieties began early, probably in more than one place, and “perhaps at the beginning of the Holocene, rather than when production systems coalesced and became prominent 3,000 to 4,000 years BP” (Clement et al. 2010: 74). Clement et al. (2010) present recent Amazonian results from phylogeography, the analysis of the geographic distribution of genetic variants and lineages.
They review molecular genetics literature on eight of the eighty-three native Amazonian crops (manioc, cacao, peach palm, pineapple, inga, guaraná, Brazil nut, cupuassu), and they estimate first domestications of peach palm at perhaps 10,000 BP, manioc at 8000 BP, and Capsicum and pineapple at 6000 BP, projecting sites and dates for other crops using modeling techniques. Their results suggest that all but one of the eighty-three species examined originated in the periphery of the Amazon basin (p. 93) and later spread throughout the region. Examples from these early findings include the identification of manioc with a source cultivar in eastern Brazil and northern Bolivia being grown as a food crop in the Zaña and Nanchoc valleys of coastal Peru in 8000 BP (p. 77), and the possible spread of one type of peach palm from eastern Amazonia along the Madeira River to Bolivia and a separate landrace from southwestern Amazonia to the northeast and northwest (pp. 82–83). Distribution of species may reveal patterns of social practice and population densities, seen in the prevalence of bitter manioc along major Amazonian rivers and coastal areas of South America, where sedentary populations could process it, while sweet manioc is


more common in the headwaters of these rivers in western Amazonia and Peru (pp. 77–78). Oliver’s (2008: 198–205) account of the “itinerant gardeners” of Araracuara near the Colombia–Brazil border is a fascinating example of how the spread of cultivars and technologies could have taken place. These ancient gardeners seem to have planted a variety of root crop gardens with staggered harvest times at different places in the forest, and then moved from spot to spot as particular trees came into fruit. Radiocarbon dates at the Araracuara settlement suggest multiple separate occupations starting around 9000 BP and recurring over several millennia. Similar dates are found in the literature on plant domestication in the Andes (Pearsall 2008), with evidence of Cucurbita squash, lleren, tree crops, and tuber domestication dates from 9700–9000 BP. Very few native animals in South America were domesticated for use by humans; for example, Stahl (2008) notes only the muscovy duck and the llama in the Andes. Significant repercussions of this shortcoming are measured in terms of fewer affordances, for transport and farm work, and fewer pathogens shared by coexisting with animals, leading to less resistance in the human population to animal-borne diseases. More intensive agriculture would have developed gradually in South America, as a reliance on tubers and local foraging gave way to cereal-based economies, especially to maize from Mexico. In a study on the domestication and dissemination of maize and teosinte, Matsuoka et al. (2002) found that maize began in southern Mexico around 9000 BP, moved into northern South America, and then was adapted in the Andes. Inhabitants of the Caribbean region grew corn in the west and manioc in the east from perhaps 8000 BP (Allaire 1999: 678), and maize farmers slowly spread into more humid forests in Panama 7000–4000 BP (Cooke 2005: 144). 
Pearsall (2008) notes traces of early forms of maize in Colombia and coastal Ecuador pre-6000 BP and in coastal Peru about 4000 BP, while maize replaced manioc at Momil in northern Colombia some 2,500 years ago (Allaire 1999). The earliest acquisition of crops domesticated outside particular regions probably took place by “down-the-line contact among contiguous populations” and not through population movements (Cooke 2005: 141). New crops would be adapted gradually over the centuries to individual ecologies before being spread “sometimes widely, through social interactions among foragers/horticulturalists” (Pearsall 2008: 119). Oliver (2008: 200–201) posits different timelines in the shift to agriculture for different regions of Amazonia. In Lower Amazonia and the Pará coast of Brazil, the agrarian transition, from house gardening to the tending of wild plant food production and then to the systematic cultivation of high-yield crops, probably began around 8000 BP with the advent of ceramics. Developments in food production technologies in Colombian Amazonia are less well documented


but appear to be less tied to pottery production, which seems to have begun much later (maybe 2500 BP?), and more influenced by the introduction of maize around 4700 BP. Different regional histories for eastern and western Amazonia are supported by analyses of soil samples which “indicate that human impacts on Amazonian forests were heterogeneous across this vast landscape” (McMichael et al. 2012: 1429). What emerges is a picture in which farming is present but not focal in potentially populous and sophisticated societies around the continent. Rock art, textiles, regional craft styles, and evidence of monumental architecture and shared technologies enter the archaeological record.

Many non-agricultural societies manifest the characteristics considered to exemplify complex culture. In South America, the earliest complex societies in the Central Andes do not show evidence of strongly agricultural food economies. The earliest horticultural complex societies in both the Andes and the tropical lowlands relied on a variety of crops other than maize, with both temperate and tropical tubers holding positions of importance. Cultivated plants enter economies more for their importance in crafts and food processing (gourds and cotton) and their ceremonial importance (maize for beer) than for their direct food value for humans. In both the lowlands and the Andes, intensive agricultural economies were created by late prehistoric complex societies, not the other way around. (Roosevelt 1999: 266)

4.4 Landscape modification and the intensification of agriculture 4000–500 BP

Much of the evidence for population density and interaction during this period comes from human modifications to the landscape. Thousands of shell mounds or sambaquis are testaments to the presence of fisher people societies along the central and southern coasts of Brazil (Gaspar et al. 2008). While the earliest examples date from 9200 BP, most sambaquis are 4,000–2,000 years old. They mostly contain the remains of shellfish, as well as occasional evidence of intensive fishing, bone tools, grinding stones, and human burial, yet there is a notable scarcity of ceramics. These factors and their uniformity support theories that the mound builders comprised a regional network of communities, perhaps constrained to the coast.

The prolonged and widespread sharing of fundamental cultural patterns exhibited in sambaquis indicates intense, sustained interaction among the corresponding communities. Major shifts in cultural trajectories are not apparent, and there is no appreciable evidence for interaction and exchange with inland hunters and gatherers of other cultural traditions . . . or with subsequent ceramic people, until after 2000 BP. (Gaspar et al. 2008: 323)

In contrast to sambaquis, mounds on Marajo Island at the mouth of the Amazon are filled with funerary effects and polychrome ceramics; some


interpretations of these mounds posit potentially independent groups of a few thousand people who flourished here 1,500 to 700 years ago (Schaan 2008), while others (e.g. Eriksen 2011) link these ceramics and other practices such as mound-building and water management to (contact with) Arawakan culture. Various types of landscape modification suggest increasing degrees of population density, social cooperation, and intensified agriculture. Raised fields are found all over the continent, from northern Colombia throughout the Andes region to Lake Titicaca on the Peruvian–Bolivian border. The Andes are well-known sites of terracing, from perhaps as early as 4500 BP, and the remains of irrigation practices are found in mountain valleys as well as coastal regions of the Caribbean and Pacific as far south as central Chile. Strategies of irrigation range from small ditches in the Zaña Valley from about 6000 BP to complex systems of canals about 1,000 years old. Other specific examples of earthworks include raised fields, fish weirs, ring ditches, and causeways 400 to 800 years old in Amazonian Bolivia (Erickson 2000; Walker 2008) and in the Upper Xingu of Brazil (Heckenberger et al. 2003), constructed to manage farming, fishing, and community relations in concert with natural rhythms of annual flooding. Surinamese mound builders from the same time period practiced agriculture and built villages on higher ground (Versteeg 2008). Mounds dating from 2650 BP in lowland Ecuador (Salazar 2008) attest to the architectural planning and organized labor that contributed to their construction. Hornborg and Eriksen (2011: 143) submit that geoglyphs etched between 3,000 and 800 years ago in Acre (Brazil) are vestiges of a defensive perimeter, which they argue was constructed by speakers of Arawakan languages.
Another feature of the domesticated landscape is that of enriched, anthropogenic soil, sometimes called ADE or Amazonian dark earths.7 The dark color and the fertility of ADE come from charcoal, the result of repeated, in situ, low-temperature burning and spreading of charcoal to improve the soil for re-use (e.g. Oliver 2008; Arroyo-Kalin 2010, and sources therein). The first patches, found atop river bluffs in the central Amazon and Orinoco basins, are thought to date from some 2,000 years ago and have been linked to the production of manioc in particular. However, the analysis of ADE is very much an active area of current research: details are constantly updated as to when and where the practice began and where and how it was introduced elsewhere. Suffice it to say that anthropogenic soils open the possibilities for the long-term occupation by larger, denser populations of what would seem to be weak soils, and they provide the foundation for the intensification of agriculture. The biological and sociological impacts of agriculture have long been topics of discussion and research across many disciplines. For instance, the

7 Other names in the literature are terra preta do indio (Indian black earth), terra preta (black earth), and terra mulata (mixed earth).


relationship of agriculture to population density is of critical importance in the field of paleodemography. The significant increase in population numbers after the adoption of agriculture constitutes a demographic revolution called the “Neolithic demographic transition” or NDT, as in Bocquet-Appel (2002: 637), later denoted by the more general term “agricultural demographic transition” or ADT (Bocquet-Appel 2009: 659). The notion of transition stems from the chronological position of the ADT between two states of equilibrium: in each period, birth rates and death rates maintain roughly stable population densities of different sizes. After an initial spike in the birth rate, the mortality rate also rises to stabilize the later population at a larger absolute number of humans. Agriculture is the “pulse” factor between “pauses” of smaller and larger absolute numbers of people. Applications of the NDT model in the Americas include studies by Bandy (2005), who supports the two-stage dynamic of initial spike and subsequent stabilization, and Lesure (2008), who urges nuanced interpretations of population increases that take spatial scale and the character of specific subareas into account when calculating the effect of the particular DT. Diamond and Bellwood (2003) maintain that a transition to agriculture provides three main advantages: (1) food production means higher population densities; (2) sedentary farmers can store food and accumulate surpluses that will sustain populations through lean times; and (3) farmers become more resistant to diseases that emerge from the more crowded conditions of settlements and close association with domesticated animals (p. 597). The authors discuss the role of farming in population shifts around the globe involving fifteen language families, and indeed some of the major population expansions in South America probably were triggered by the intensification of agriculture. 
See Eriksen (2011) for a recent interdisciplinary study of the role of manioc, ADE, and other cultural artifacts and practices in Arawakan expansions 3000–1500 BP, and see Heggarty and Beresford-Jones (2010) for a thoughtful contextualizing of the role of agriculture in language dispersals with special focus on expansions 3,200 and 1,500 years ago in the Andean sphere.

5 Conclusions

We know a great deal more about the prehistory of South America than could be reported and categorized in the erstwhile seminal publication on the topic, the multi-volume Handbook of South American Indians (Steward 1946–1950). Today’s archaeological record overturns many of the predictions of sixty years ago. For example, we now know that by 11,000 BP South America was home to populations throughout the continent, living in communities adapted to local ecologies, which means that the Clovis culture, while certainly an important civilization in the history of the Americas, was not the origin of the first migrations to South America. Social stratification and plant domestication in South


America began long before the establishment of villages and large-scale agriculture, and the lowlands, in particular Amazonia, were not empty and pristine wildernesses but instead were the locus of social development and innovation at least contemporary with that in the better-documented societies of the South American highlands. The previous chapter in this volume presented the range of language classification schemes for South America, from the notion that all South American languages come from one superstock family called Amerind (Greenberg 1987) to the conservative view that there are more than 110 distinct genealogical linguistic units (Kaufman 1994; Campbell 2012a). This diversity stands in stark contrast to the situation on other continents, where linguistic evidence supports much smaller numbers of language families. Fruitful large-scale analysis and classification of the linguistic diversity in South America have been made difficult by scant documentation and a belief that only the Comparative Method can establish an accurate linguistic history. Meanwhile, in the absence of competing proposals, a general acceptance of the Amerind proposal (and, most damagingly, its sub-branches) as a working template persists outside linguistics (e.g. Reich et al. 2012). Heggarty and Beresford-Jones (2010: 178–179, 181) suggest that the plethora of shallow language families, the many shared typological features, and the absence of deep language families in South America simply reflect a reality that no single pre-4000 BP speaker community achieved the type of dominance that would lead to language shift. Scholars across the human sciences have proposed links between archaeological evidence and ethnolinguistic expansions. 
Constenla (2012: 418) notes that lexicostatistical analysis places the emergence of Proto-Chibchan at 8,000–9,000 years ago, at the beginnings of a “Hunters-Collectors” period, and that subsequent internal splits in the Chibchan family between 6600 and 4800 BP occurred during the transition to the time of the “Specialized Collectors-Domesticators.” Research on the Macro-Jê stock, which may have a time depth of 5,000–6,000 years (Ribeiro and van der Voort 2010), features De Souza’s (2011) use of ceramics and subsistence records to motivate a southward expansion that began some 1,200 years ago. Beresford-Jones and Heggarty (2012a) link an Aymaran expansion to the Chavín Early Horizon, around 3200 BP, and a major Quechuan expansion to the Wari Middle Horizon, about 1500 BP. Eriksen (2011) delineates the extent of a regional exchange system that disseminated Arawakan language, material culture, and social practices over a vast expanse of Amazonia between 3,000 and 500 years ago using evidence from Barrancoid ceramics, raised mounds, ADE, and ceremony, while multiple expansions of Tupian speakers 5,000, 2,500, and 1,200 years ago have been linked to polychrome ceramics and militarism (Noelli 2008; Eriksen 2011). Hornborg and Eriksen (2011) contextualize a Panoan expansion 1600 BP in an analysis of ceramics styles, trade routes, exchange items, and regional interaction.


The general vision of prehistory that emerges in this chapter is that, starting some 15,000 years ago, South America saw millennia of localized pockets of population growth and cultural development, with subsistence strategies bound to shoreline, forest, river, or altiplano and with domesticated tubers and cereals playing a supplementary role. There were vibrant networks of contact, especially with contiguous neighbors but also involving markets or sources of specific natural resources farther afield. This type of interaction would promote cultural exchange and convergence on multiple levels, as technologies were adopted and adapted by diverse communities. However, there is little evidence for major expansions of particular groups that would provoke language shift until the highland and lowland expansions that began 4,000 to 3,000 years ago and continued until and through the European conquest.

3

Basic vocabulary comparison in South American languages

Harald Hammarström

Comparison of basic vocabulary has been the default method for sorting out the fundamental relationships between South American languages since the very beginning. It is through basic vocabulary (Tadmor et al. 2010) that we are able to make the first distinctions between contact and inheritance and thus infer contact in grammar and other domains of vocabulary. I show that the classification of South American languages by Loukotka (1968), based on basic vocabulary inspection, closely mirrors the classification presented by Campbell (2012a), for which far more extensive lexical and grammatical data had become available. In addition to the classic manual comparative work, I compare the outcome of an automated procedure for lexical comparison to the existing manual classifications. This reveals that automated comparison has a high degree of correspondence to the manual ones, despite the simplistic assumptions of the former and question marks on systematicity and objectivity of the latter.

1

Introduction

From the very beginning, basic vocabulary comparison has been used by scholars of South American languages to find genealogical language families. The ideas underpinning the method of basic vocabulary comparison are simple. If legitimate, it constitutes a powerful tool, because, once the shallowest genealogical families are found, we can trace diffusion across those families, as well as make well-grounded investigations towards deeper relations.

2

South American avant-gardists of basic vocabulary comparison

The first¹ major attempt at classifying the languages of South America was Hervás’ (1784) catalog of languages of the world. Due to the Jesuits’ activities in the colonies, several hundred South American languages were known to various degrees (Hervás 1784: 10–11).

¹ In the centuries before, only a handful of South American languages were known through published data. De Laet (1643) is an even earlier precursor to basic vocabulary comparison, but effectively only had access to unrelated South American languages.


Hervás’ (1784, 1800) catalog lists the languages along with basic metainformation such as location and alternative names, while his vocabulary (Hervás 1787b) and extracts (Hervás 1787a) give basic vocabulary and text specimens respectively. Almost all of this data concerning South American languages was obtained by Hervás through extensive letter correspondence with his Jesuit colleagues (Batllori 1951; Clark 1937). Emanating from this scholarly exchange was the organization of languages in groups around a “matrix” language (“lengua matriz” in the original Spanish versus “lingua matrice” in the original Italian). Despite the centrality of the term and lengthy discussions on linguistic theory and the role of language,² we are never given an explicit definition of the matrix-language concept. However, its essential properties can be inferred from the context where the term occurs:

1. Languages belong to the same matrix language if items of data from them show affinity (“affinità”, and more precisely “affinità delle parole”), e.g. Hervás (1800: 29–30):

se halla freqüentemente que hablan dialectos provenientes de una misma lengua matriz naciones entre sí distintísimas, . . . De los lenguages de estas naciones tupí, guaraní y homagua los jesuitas sus misioneros me han dado varios documentos, con cuyo cotejo he hallado que los dichos lenguages tienen afinidad, y son dialectos provenientes de una misma lengua matriz.

it is frequently found that nations very different from each other speak dialects that stem from one and the same matrix language, . . . , concerning the languages of the nations Tupí, Guaraní and Homagua, the Jesuits who are missionaries with them have given me various documents, with whose authentication I have found that the said languages have affinity, and they are dialects stemming from one and the same matrix language.

Numerous examples of actual vocabulary comparisons and declarations of their affinity are found throughout the Jesuits’ language listings (Gilij 1784; Hervás 1784, 1787b).

2. The absence of affinity between languages means they are different matrix languages, e.g. Hervás (1800: 245):

De la lengua Puquina solamente he podido lograr la oracion dominical, cuyas palabras me parecen muy diferentes de las respectivas de otros idiomas de América, por lo que conjeturo que sea matriz.

Regarding the Puquina language I have only been able to obtain the Lord’s Prayer, whose words, in my opinion, seem very different from the respective words in other languages of the Americas, wherefore I conjecture that it is a matrix language.

² See especially Hervás 1800: 1–106 or Gilij 1784: 273–309 where we learn that linguistics is superior to physical characteristics (e.g., size of the head) for classifying the nations of the world, and that entire nations can switch languages, and that there is no difference in the complexity of languages of civilized versus primitive peoples.


3. Affinity of items of vocabulary between two languages does not imply that they are of the same matrix, if those vocabulary items are borrowings, e.g. Hervás (1787b: 32):

Passo ora ad esporre alcuni pratichi esempi della rispettiva affinità delle lingue di ognuna delle due Americhe nelle parole; non già perchè essi provino essere veramente affini le lingue di parole affini, ma perchè se ne rilevi il vicendevole commercio delle nazioni . . .

I will now show some practical examples of the relatedness in some words among the languages of the two Americas. Not, however, because these languages which have related words are truly related, but because the mutual trade between the nations is revealed in them . . .

4. Chance resemblances in basic vocabulary can be found even in different matrix languages (and thus do not indicate that the languages have a common matrix), e.g. Hervás (1800: 153–154):

Meaning | Tamanaco | Kiriris
Carne ‘meat’ | Chararú | Cradzó
Hijo ‘son’ | Emuru | Iñura
Lengua ‘tongue, language’ | Nuru | Nunu
Mañana ‘morning, tomorrow’ | Coronare | Carantzi
Negro ‘black’ | Kinême | Kotko
Noche ‘night’ | Kolco | Kaya

La semejanza de estas palabras es accidental: porque las lenguas tamanaca y kiriri son totalmente diversas, y provenientes de matrices diferentísimas . . .

The similarity between these words is accidental: because the languages Tamanaco and Kiriri are totally different, and come from very different matrix languages . . .

In particular, Hervás (1800: 43–44) discredits theories that “prove” that all languages descend from Hebrew based on a few superficially similar lexical items.

5. Any human language has basic vocabulary, e.g. Hervás (1800: 15–16):

El lenguage de la nacion mas bárbara tiene á lo ménos las palabras de todas las cosas mas necesarias para su subsistencia, y quando comercia ó trata con otra civil, recibe de esta las demas palabras de las cosas no tan necesarias. Por tanto en los idiomas de las naciones, que se advierte estar corrompidos con palabras forasteras, se deben buscar como primitivas las que signifiquen cosas de la mayor necesidad, ó del mas freqüente uso ó conversacion de los hombres;

Even the language of the most barbarous nation at least has words for the things most necessary for their subsistence, and when such a language engages in trade or deals with a civil language, it obtains from the latter words for other not so necessary things. So among the languages of the nations which are corrupted with unknown words, one has to find the primitive ones which carry the meanings that are the most necessary or of most frequent use or interaction of the people;


However, a number of issues relating to the matrix-concept seem to have remained undeveloped. First, if two languages A and B were concluded to belong to the same matrix language, it is not clear on what grounds it was decided that A is the matrix and B the dialect, rather than vice versa, and only in a few passages is the matrix language neither A nor B, but one that must have existed in the past. Second, when it is realized that there are different levels of relatedness, the terminology oscillates between matrix referring to the sub-family and matrix referring to the deepest-level family. Third, Hervás (1800: 11, 15) declares that languages can be compared not only on the lexical, but also on the phonological and grammatical level. While he is aware that grammatical similarities such as the order of adjective and noun (Hervás 1800: 24, Hervás 1799) have no particular implications for matrix-hood, in at least one passage, Hervás (1784: 41) admits that the “artificio gramaticale” has a higher probative value than vocabulary for showing that languages belong to the same matrix.

The culmination of the method pioneered by the Jesuits³ is Loukotka’s (1968) classification of all South American languages. The principles behind Loukotka’s classification, as described by himself (Loukotka 1968: 29–31) and in Wilbert’s (1968) introduction, can be summed up as follows:

1. Whenever possible, inspect a standard list of forty-five meanings for cognates with other languages.
2. If the standard list cannot be compiled, use whatever there is, and if there is no form–meaning data at all, classify based on any other information.

Languages which only have scraps (a few words, personal names) of data or no data at all, are systematically indicated as such in Loukotka’s (1968) listing. A selection of languages with five to ten vocabulary items are interspersed throughout his outcome listing, but otherwise what comparisons underlie what classification choices is left implicit.

That is how Loukotka describes the principles of his own work in language classification. However, the listing that actually appears in Loukotka’s book has a few families which Loukotka (1968: 29–30) describes as compromises between his own classification and that of other prominent scholars:

other scholars’ results, often their opinions did not coincide with the conclusions of my own comparative studies. In such cases it was necessary to strive for a reasonable compromise that retained the value of the other investigator’s work and at the same time meshed with my own.

³ Wilbert (1968: 8–10) ascribes this method, i.e., classification based on the inspection of standard lists of basic vocabulary, to Brinton (1891) because Brinton had an explicit standardized list of twenty-one meanings. Brinton and Loukotka, in fact, often found themselves using fewer or more items of vocabulary than from the standard lists, according to whatever was available. There is not much difference, then, between these methods and those of the Jesuits whose implicit notion of basic vocabulary must be taken to be roughly the meanings in Hervás’ (1787b) Vocabulario Poligloto, or whatever was available.

These “compromise-case” families are said to be Arawak, Karaib, Tupí, and Chibcha (Loukotka 1968: 29–30), but may include a few more (e.g. Chimú; see Rowe 1954). It seems that the “compromise-cases” cannot be defended on purely scientific grounds – Loukotka had them as such out of personal reverence for certain individuals. For example, regarding the Chibchan family, he followed Paul Rivet “his esteemed teacher” (Loukotka 1968: 50) – in fact, the 1968 book itself is dedicated to Paul Rivet. Campbell (2012a: 66) holds that “Loukotka’s method was generally criticized” but does not adduce adequate support for this claim. He cites Rowe (1954: 15) as evidence, but Rowe’s critique concerns the “compromise-cases” where Loukotka did not apply his method. A subsequent passage concerns “Mischsprachen” but this concept pertains to Loukotka’s 1942 classification (Loukotka 1942: 1), and the labels “Mischsprachen” and “Spuren” were almost completely abandoned for the 1968 classification.⁴

⁴ Many languages were so labeled in the 1942 classification, yet only a few Matacoan and Guaycuruan languages remain with these labels. Since Loukotka’s introduction to the 1968 classification does not mention “Mischsprachen” he presumably intended to remove them. Perhaps the few cases where they did remain are where he did not find the time to revise them before his death.

3

The standing of basic vocabulary comparison

The method of basic vocabulary comparison (BVC) may be characterized as follows:

• use a standard list (e.g., 45 words as per Loukotka, 200 words as per Swadesh) of basic vocabulary
• look for similarities in form and meaning
• select languages that share similarities in sound and meaning beyond randomness
• interpret the non-random similarity as inheritance rather than
  • universals
  • borrowing

BVC is a method that takes language data and outputs families. The three fundamental questions, for evaluating BVC and any such method, are:

• Is the method sound? I.e., does it produce false positives, or in the dearth of hard evaluation data, is the method based on defendable principles?
• Is the method complete? I.e., does it find all families that are recoverable from linguistic data?
• Can it be automated? (This is desirable in order to save time and to straighten out question marks on systematicity and objectivity with manual methods.)
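The requirement in the characterization of BVC above that similarities go “beyond randomness” can be made concrete with a permutation test. The Python sketch below is an illustration only and is not a procedure taken from any of the works cited in this chapter: the similarity measure (counting meanings whose forms share an initial segment), the function names, and the word lists for the invented languages X, Y, and Z are all assumptions made for the example.

```python
import random

def shared_initial_count(list_a, list_b, meanings):
    """Count meanings whose two forms begin with the same segment (a crude proxy)."""
    return sum(list_a[m][0] == list_b[m][0] for m in meanings)

def permutation_p_value(list_a, list_b, n_shuffles=999, seed=0):
    """Share of random meaning alignments that match at least as well as the real one."""
    meanings = sorted(set(list_a) & set(list_b))
    observed = shared_initial_count(list_a, list_b, meanings)
    rng = random.Random(seed)
    forms_b = [list_b[m] for m in meanings]
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(forms_b)  # break the link between form and meaning
        shuffled = dict(zip(meanings, forms_b))
        if shared_initial_count(list_a, shuffled, meanings) >= observed:
            hits += 1
    # Add one to both counts so the estimate is never exactly zero.
    return (hits + 1) / (n_shuffles + 1)

# Invented word lists (hypothetical data, not taken from any real language).
lang_x = {"hand": "pata", "water": "tuna", "stone": "kami",
          "fire": "wato", "eye": "nuri", "louse": "sika"}
lang_y = {"hand": "pato", "water": "tuni", "stone": "kama",
          "fire": "wati", "eye": "nure", "louse": "siko"}
lang_z = {"hand": "emo", "water": "ila", "stone": "oru",
          "fire": "ubi", "eye": "afa", "louse": "yek"}

print(permutation_p_value(lang_x, lang_y))  # small: matches depend on the meaning alignment
print(permutation_p_value(lang_x, lang_z))  # large: no matches to begin with
```

A low value says only that the resemblance is unlikely to be accidental. Deciding between inheritance, borrowing, and universals still requires the further arguments the method calls for.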


Regarding soundness, there are several components. First, regarding borrowing, a long overdue empirical investigation (Tadmor et al. 2010) shows that there is a subset of vocabulary which is more resistant to borrowing. Such vocabulary items are often (but need not be) monomorphemic (Urban 2012). More importantly, the set of meanings resistant to borrowing significantly overlaps with lists of meanings that are found in all human cultures and with lists of meanings that are frequent in discourse across human languages (Borin 2012) as well as the lists used by Hervás and Loukotka. Second, empirical investigations into universal tendencies in similarity between form and meaning – also known as sound symbolism – have shown that such tendencies are very slight (Urban 2012; Wichmann et al. 2010a). Third, chance resemblance can be ruled out in a variety of ways. For example, to the degree that regular sound correspondences can be established between different items of basic vocabulary, chance as an explanation is effectively eliminated (Campbell and Poser 2008; Hewson 2010). If chance resemblance is not transparently ruled out there are methods for explicit tests of chance similarity (Dunn and Terrill 2012). Thus, at the end of the day, there are good arguments for concluding that BVC, as just described,⁵ is a sound practice.

Regarding completeness, the question is whether comparison of other linguistic data, i.e., morphosyntax and non-basic vocabulary, produces significantly more or significantly different family relationships. Loukotka (1968: 15, 29) relied on basic vocabulary inspection for practical reasons: the time limitations of one single human and the lack of more extensive lexical or grammatical data for most South American languages. This is in sharp contrast with the classification of North American languages by Powell (1891: 11), who held the supremacy of BVC on theoretical grounds and would not have used other data even if it were available.
However, Powell’s (1891) BVC classification is nearly identical to the classification of Goddard (1996) which is not limited to BVC and takes into account a century of additional data and intensive study of historical relationships. The situation of South American languages at the time of Loukotka versus now is similar in the sense that essentially only wordlists were available in Loukotka’s time, whereas at present, grammatical data are available for the bulk of the languages that were still alive in the past 50 years. In 1964, when Loukotka handed over his manuscript, there were 46 South American languages with a grammatical description of around 150 pages, and 94 further languages with a grammar sketch of around 50 pages or the equivalent. In 2012, there are 233 languages with a grammatical description and 119 additional languages with a grammar sketch.⁶ A partial answer regarding completeness for South American languages is given in the next section, with the comparison of Loukotka’s (1968) classification with that of Campbell (2012a).

⁵ Of course, not any method based on inspection of basic vocabulary is sound. For example, Rivet (1924b) and Greenberg (1987) failed to observe the requirement of non-randomness in their basic vocabulary comparisons (Campbell and Poser 2008).

⁶ The figures were computed from data in the LangDoc project (see Hammarström and Nordhoff 2011) on September 1, 2012.

Regarding automation, recent work gives us a lower bound on what automation can do. The Automated Similarity Judgment Program (ASJP) project compares the forms of words on a standardized forty-item list of basic vocabulary (Brown et al. 2008). The words are entered in a uniform transcription system. The distance between two words can then be calculated as the number of character insertions/deletions/substitutions needed to transform one word to the other (also known as the Levenshtein distance). Dividing this by the length of the longer of the two words gives a score between 0 (identical) and 1 (completely different). The distance between two entire languages is then defined as the average distance between pairs of words with the same meaning, divided by the average distance between pairs of words with different meaning. The latter step is to discount similarity caused merely by small or similar phonetic inventories (see Wichmann et al. 2010b for further details). The ASJP program amounts to a reasonable formalization of basic vocabulary comparison. Forty-word lists for most South American languages (357 out of a total of 577 attested classifiable languages) have been included for the 15th edition of ASJP (Wichmann et al. 2012). A comparison between the ASJP classification and the manual classifications of Loukotka (1968) and Campbell (2012a) is described in the next section.
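The two-step distance just described can be sketched in a few lines of Python. This is a minimal illustration of the verbal definition given here, not the ASJP software itself: the function names are chosen for the example, plain strings stand in for the ASJP transcription, each meaning is assumed to have exactly one non-empty form per language, and the two word lists for languages P and Q are invented.

```python
def levenshtein(a, b):
    """Number of insertions, deletions, and substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]

def word_distance(a, b):
    """Levenshtein distance divided by the length of the longer word (0 to 1)."""
    return levenshtein(a, b) / max(len(a), len(b))

def language_distance(list_a, list_b):
    """Mean same-meaning distance divided by mean different-meaning distance."""
    shared = sorted(set(list_a) & set(list_b))
    same = [word_distance(list_a[m], list_b[m]) for m in shared]
    diff = [word_distance(list_a[m], list_b[n])
            for m in shared for n in shared if m != n]
    return (sum(same) / len(same)) / (sum(diff) / len(diff))

# The two 'tongue' forms from the Tamanaco/Kiriri comparison quoted earlier.
print(word_distance("nuru", "nunu"))  # 0.25: one substitution over four characters

# Invented word lists (hypothetical data) for two languages P and Q.
lang_p = {"hand": "pata", "water": "tuna", "stone": "kami"}
lang_q = {"hand": "pato", "water": "tuni", "stone": "kama"}
print(language_distance(lang_p, lang_q))
```

Dividing by the different-meaning average is what lets two languages with small or similar sound inventories score as distant when their same-meaning pairs are no more alike than arbitrary pairs.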
For the sake of clarity, it is worth pointing out that BVC as discussed here is not in conflict with the comparative method in historical linguistics – BVC constitutes (one possibility of achieving) the first step of the comparative method as defined by Ross and Durie (1996: 6), namely to “determine on the strength of diagnostic evidence that a set of languages are genetically related, that is, that they constitute a ‘family’.” The comparative method is furthermore in a position to achieve a reconstructed proto-language and a subclassification based on shared innovations – two worthy goals which are beyond the capability of simple basic vocabulary comparison. However, if enough basic vocabulary is reconstructed for one or more proto-languages, basic vocabulary comparison can, of course, be carried out on proto-languages to determine if they are related.

4

Three perspectives on language families in South America

We shall now compare the three classifications of South American languages:

L1968: The BVC-based classification of Loukotka (1968) as described in Section 2.

ASJP-NJ-268: A cut-off neighbor-joining (NJ) tree (Saitou and Nei 1987) based on the default ASJP language distance measure as described above. An NJ tree is the simplest way to convert a distance matrix between languages into a hierarchical classification with meaningful branch-lengths. The NJ-method necessarily produces one complete tree of all the input languages rather than a number of different families. To get a set of families from the tree, we cut the tree at a threshold distance t from the root of the tree. I chose the cut-off value t such that it maximizes the correspondence (least number of “misclassified” languages) with Campbell’s (2012a) classification (maximizing the correspondence with Loukotka 1968 yields a very similar result). The maximizing t-value happens to be 268 and results in the “misclassification” of about 40 languages (out of 357, or ca 11%).⁷

C2012: The classification of Campbell (2012a). This classification aims to follow any convincing evidence for genealogical relationship and is thus not restricted to basic vocabulary comparison. Although the outcome listing is claimed to represent only “generally accepted” genealogical units, no actual evidence is given, neither as direct linguistic evidence for the families themselves nor as indirect evidence pointing to experts or surveys that found them acceptable.
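The cut-off step for ASJP-NJ-268 can be illustrated with a toy example in Python. The sketch assumes a rooted tree that has already been built (it does not implement neighbor-joining), uses invented languages and branch lengths, and counts “misclassified” languages in one plausible way that is an assumption of this example: a language counts if it is a minority member of its cluster or sits outside the cluster holding most of its reference family. The chapter does not spell out its own counting rule.

```python
from collections import Counter

# A node is (branch_length, payload): payload is a language name for a leaf,
# or a list of child nodes for an internal node. All values here are invented.
TREE = (0.0, [
    (1.0, [(0.5, "A1"), (0.5, "A2")]),
    (0.4, [(0.8, [(0.3, "B1"), (0.3, "B2")]), (1.1, "C1")]),
])
REFERENCE = {"A1": "A", "A2": "A", "B1": "B", "B2": "B", "C1": "C"}

def leaves(node):
    _, payload = node
    if isinstance(payload, str):
        return [payload]
    return [name for child in payload for name in leaves(child)]

def cut_tree(node, t, depth=0.0):
    """Cut at distance t from the root; each subtree below the cut is a family."""
    length, payload = node
    end = depth + length
    if isinstance(payload, str) or end >= t:
        return [leaves(node)]
    return [fam for child in payload for fam in cut_tree(child, t, end)]

def misclassified(families, reference):
    """Count languages lumped with or split from their reference family."""
    cluster_of = {lang: i for i, fam in enumerate(families) for lang in fam}
    bad = set()
    for fam in families:  # lumping: minority members of a cluster
        majority = Counter(reference[l] for l in fam).most_common(1)[0][0]
        bad.update(l for l in fam if reference[l] != majority)
    by_ref = {}
    for lang, ref in reference.items():
        by_ref.setdefault(ref, []).append(lang)
    for langs in by_ref.values():  # splitting: members outside the main cluster
        main = Counter(cluster_of[l] for l in langs).most_common(1)[0][0]
        bad.update(l for l in langs if cluster_of[l] != main)
    return len(bad)

def best_threshold(tree, reference, candidates):
    """Candidate t with the fewest misclassified languages."""
    return min(candidates, key=lambda t: misclassified(cut_tree(tree, t), reference))

for t in [0.0, 0.3, 0.5, 1.1, 2.0]:
    fams = cut_tree(TREE, t)
    print(t, fams, misclassified(fams, REFERENCE))
print(best_threshold(TREE, REFERENCE, [0.0, 0.3, 0.5, 1.1, 2.0]))
```

A cut close to the root lumps unrelated languages together and a cut far from it splits real families, so the error count falls and then rises again as t grows. This is why a single best value can be picked from a range of candidates.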

⁷ Higher correspondence is achievable if we have a freely variable threshold or a threshold that depends on the size (number of languages) of the clade being cut.

The purpose of comparing these three classifications is as follows. Comparing L1968 to C2012 gives us an indication of how complete BVC is, since a wealth of descriptive work appeared between 1964 and 2012. Comparing ASJP-NJ-268 to either L1968 or C2012 gives us an indication of how automatable either one is, and, if ASJP-NJ-268 suggests new families that turn out to be valid on closer inspection, that comparison tells us how systematic and objective L1968 and C2012’s manual comparisons were. Table 3.1 lists all units not split by any of the three classifications and how those units are grouped. The exact languages comprising every unit can be found in the original listings.

Table 3.1 Language families with their status in the work of Loukotka (1968), ASJP, and Campbell (2012a)

Unit | L1968 | ASJP-NJ-268 | C2012
Aikana | Huari | F5 [Aikana-Kwaza] | Aikaná
Aimore | Botocudo | F49 [Aimore] | Krenakan
Andaqui | Chibcha | F33 [Andaqui-Guajiro] | Andaquí
Andoque | Andoque | F31 [Bororoan-Andoque] | Andoque
Arara do Rio Branco | Tupí | – | Unclassified
Araucanian | Mapuche | F30 [Kariri-Araucanian] | Mapudungun
Arawakan-Nuclear | Arawak | F74 [Arawakan-Nuclear] | Arawakan
Arawakan-Campa | Arawak | F73 [Arawakan-Campa] | Arawakan
Arawakan-Andaqui-Guajiro | Arawak | F33 [Andaqui-Guajiro] | Arawakan
Arawakan-Lokono-Caribbean | Arawak | F65 [Lokono-Caribbean] | Arawakan
Arawan | Arawa | F3 [Arawan] | Arawan
Atacame | Chibcha | – | Esmeralda
Atacameño | Atacame | F43 [Atacameño-Candoshi-Shapra] | Atacameño
Awake | Auake | F42 [Awake] | Awaké
Aymaran | Aymara | F0 [Aymaran-Quechuan] | Aymaran
Baenan | Baenan | – | Baenan
Barbacoan | Chibcha | F1 [Barbacoan] | Barbacoan
Betoi | Chibcha | – | Betoi
Boran | Bora | F72 [Bora-Resígaro] | Boran
Bororoan | Boróro | F31 [Bororoan-Andoque] | Bororoan
Cahuapanan | Kahuapana | F4 [Cahuapanan] | Cahuapanan
Candoshi-Shapra | Murato | F43 [Atacameño-Candoshi-Shapra] | Candoshi
Canichana | Canichana | – | Canichana
Cañar-Puruhá | Chimú | – | Cañar-Puruhá
Cariban | Karaib | F32 [Cariban] | Cariban
Cayuvava | Cayuvava | F71 [Cayuvava] | Cayuvava
Chapacuran | Chapacura | F19 [Chapacuran] | Chapacuran
Charruan | Charrua | – | Charrúan
Chibchan-Nuclear | Chibcha | F52 [Chibchan-Nuclear] | Chibchan
Chibchan-Aruakan | Chibcha | F55 [Aruakan-Katukinan] | Chibchan
Chiquitano | Chiquita | F40 [Chiquitano] | Chiquitano
Chocoan | Chocó | F57 [Chocoan] | Chocoan
Chonan | Patagon | F25 [Chonan-Payaguá] | Chonan
Chono | (Aksanas) | – | Chono
Cofan | Cofan | F38 [Cofan] | Cofán
Culli | Culli | – | Culle
Fulnio | Fulnio | F60 [Fulnio-Leko] | Yaté
Gamela | Gamela | – | Gamela
Guachi | Mixed Guaicuru | F48 [Trumai-Guachi] | Guachí
Guahiboan | Arawak | F23 [Guahiboan] | Guajiboan
Guaicuruan | Guaicuru | F64 [Abipón], F26 [Guaicuruan] | Guaicuruan
Guamo | Guamo | – | Guamo
Guato | Guató | F46 [Guato] | Guató
Harakmbet | Toyeri | F76 [Harakmbet] | Harákmbet-Katukinan
Hibito-Cholon | Cholona | F21 [Cholon] | Cholonan
Huarpean | Huarpe | – | Huarpean
Huitotoan | Uitoto | F62 [Huitotoan] | Witotoan
Iranxe | Iranshe | F66 [Iranxe-Nambikwaran] | Irantxe
Itonama | Itonama | F75 [Movima-Itonama] | Itonama
Jabutí | Yabutí | F29 [Jabutí-Kaingang] | Jabutían
Jê-Central | Ge | F17 [Jê-Central] | Jêan
Jê-Southern | Kaingan | F29 [Jabutí-Kaingang] | Jêan
Jirajaran | Jirajara | – | Jirajaran
Jivaroan | Jibaro | F10 [Jivaroan] | Jivaroan
Jodi | (–) | F39 [Jodi-Saliban] | Jotí
Kakua-Nukak | Makú | F18 [Puinave-Kakua-Nukak] | Makúan
Kamakanan | Kamakan | – | Kamakanan
Kamsa | Chibcha | F35 [Kamsa] | Camsá
Kanoe | Capixaná | F70 [Kanoe] | Kapixaná
Karaja | Karaja | F16 [Karaja] | Karajá
Kariri | Kiriri | F30 [Araucanian-Kariri] | Karirían
Katukinan | Catuquina | F55 [Chibchan-Katukinan] | Harákmbet-Katukinan
Kawesqar | Alacaluf | F41 [Kawesqar] | Qawasqaran
Kwaza | Koaia | F5 [Aikana-Kwaza] | Kwaza
Leko | Leco | F60 [Fulnio-Leko] | Leco
Lengua-Mascoy | Lengua | F47 [Lengua-Mascoy] | Mascoyan
Lule | Lule | F7 [Vilela-Lule] | Lule-Vilelan
Maku | Máku | F37 [Maku] | Máko
Matacoan | Mataco | F27 [Matacoan] | Matacoan
Matanawi | Matanawi | – | Matanauí
Maxakalian | Mashakali | F56 [Maxakalian] | Maxakalían
Mochica | Chimu | F11 [Mochica] | Mochica
Moseten-Chimane | Mosetene | F2 [Moseten-Chimane] | Mosetenan
Movima | Mobima | F75 [Movima-Itonama] | Movima
Muniche | Muniche | F67 [Muniche] | Muniche
Mura-Piraha | Mura | F44 [Mura-Piraha] | Muran
Mure | Chapacuran | – | –
Nadahup | Makú | F6 [Nadahup] | Makúan
Nambikwaran | Nambikwara | F66 [Iranxe-Nambikwaran] | Nambikwaran
Natú | Natú | – | Natú
Ofaie | Opaie | F68 [Ofaie] | Ofayé
Omurano | Mayna | – | Omurano
Oti | Otí | – | Unclassified
Otomaco | Otomac | – | Otomacoan
Paez | Chibcha | F36 [Paez] | Paezan
Pankararu | Pankarurú | – | Pankarurú
Panoan | Pano | F9 [Panoan-Tacanan] | Pano-Takanan
Payaguá | Mixed Guaicuru | F25 [Chonan-Payaguá] | Payaguá
Peba-Yagua | Yagua | F51 [Peba-Yagua] | Yaguan
Puelche | Gennaken | F12 [Puelche] | Chonan
Puinave | Makú | F18 [Puinave-Kakua-Nukak] | Makúan
Puquina | Puquina | – | Puquina
Puri-Coropo-Coroado | Puri | F22 [Puri-Coropo-Coroado] | Purian
Puruborá | Purubura | – | Tupian
Quechuan | Quechua | F0 [Aymaran-Quechuan] | Quechuan
Resígaro | Arawak | F72 [Bora-Resígaro] | Arawakan
Rikbaktsa | Erikbaktsa | F34 [Rikbaktsa] | Rikbaktsá
Saliban | Piaroa | F39 [Jodi-Saliban] | Sáliban
Sape | Kaliana | F59 [Sape] | Kaliana
Sechuran | Sechura | – | Sechura-Catacaoan
Tacanan | Tacana | F9 [Panoan-Tacanan] | Pano-Takanan
Tallan | Catacao | – | Sechura-Catacaoan
Taruma | Taruma | – | Taruma
Taushiro | (–) | F45 [Taushiro] | Taushiro
Tekiraka | Auishiri | – | Tequiraca
Ticuna | Ticuna | F24 [Ticuna] | Tikuna-Yurí
Timote-Cuica | Timote | – | Timotean
Tinigua | Tinigua | – | Tiniguan
Trumai | Trumai | F48 [Trumai-Guachi] | Trumai
Tucanoan | Tucano | F28 [Tucanoan] | Tucanoan
Karitiana | Tupí | F77 [Karitiana] | Tupian
Tarairiu | Tarairiu | – | Unclassified
Tupian | Tupí | F78 [Tupian] | Tupian
Tuxa | Tushá | – | Tuxá
Urarina | Itucale | – | Urarina
Uru-Chipaya | Uro | F69 [Uru-Chipaya] | Chipaya-Uru
Vilela | Vilela | F7 [Vilela-Lule] | Lule-Vilelan
Wamoé | Uman | – | Wamoé
Waorani | Sabela | F50 [Waorani] | Sabela
Warao | Uarao | F63 [Warao] | Warao
Xocó | Shocó | – | Unclassified
Xukuru | Shukuru | – | Xukurú
Yamana | Yamana | F15 [Yamana] | Yagan
Yanomamic | Yanoama | F8 [Yanomamic] | Yanomaman
Yaruro | Chibcha | F20 [Yaruro] | Yaruro
Yurakaré | Yuracaré | F13 [Yurakaré] | Yuracaré
Yurí | Yuri | – | Tikuna-Yurí
Yurumangui | Yurimangui | – | Yurumanguí
Zamucoan | Zamuco | F61 [Zamucoan] | Zamucoan
Zaparoan | Zaparo | F14 [Zaparoan] | Zaparoan

As already mentioned, the ASJP-NJ-268 does not include all South American languages, and features only 89 of the total of 108 (82%) Campbell (2012a) families. Except as featured in Table 3.1, there are no differences between the language inventories in L1968 and C2012 that have any bearing on the questions addressed in this chapter – they concern language/dialect divisions, the treatment of poorly attested languages or newly discovered easily classifiable languages rather than different classifications of sufficiently attested languages. Similarly, attested languages missing from both
L1968 and C2012 are not taken up. Finally, Chibchan languages that fall outside South America geographically have been excluded from consideration. The L1968 and C2012 classifications nearly always agree. A complete account of the differences between L1968 and C2012 is as follows:

• Linguistic data on Chono (Bausani 1975), Jodi (Guarisma Pinto and Coppens 1978), and Taushiro (Alicea 1975c) appeared only after 1964, and were thus missing in L1968.
• L1968 has a Chibcha family ambitiously comprising Atacame, Andaqui, Barbacoan, Betoi, Kamsá, and Yaruro as well as an Arawak family including Guahiboan. These are, in fact, cases where Loukotka explicitly stated he had not followed his method of BVC, but sought a compromise with the views of scholars he had a personal reverence for. The L1968 inclusion of the very poorly attested Cañar-Puruhá in Chimú is probably also such a case (Rowe 1954).
• C2012 combines units that L1968 keeps apart: Lule-Vilela, Pano-Takanan, Harakmbet-Katukinan, Chon-Puelche, Southern Jê-Central Jê, Puruborá-Tupian, Ticuna-Yuri, and Sechura-Catacaoan. Curiously, all of these are in fact argued on the basis of basic vocabulary comparison.⁸ If these are counted as valid families, there is some leakage in Loukotka’s BVC method, either in systematicity – did he ever compare Harakmbet and Katukinan? – or in the need for reconstructing proto-languages to arrive at Jê or Pano-Takanan.
• L1968 labels the two poorly attested languages Guachí and Payaguá as “mixed languages” under Guaicuru. Possibly he meant that they were non-Guaicuru languages that had come under Guaicuru influence, in which case his result is not very different from C2012.
• The poorly attested Arara do Rio Branco, Tarairiu, Oti, and Xocó are listed as unclassified in C2012. The reason for this is opaque since several languages with similar or less data than those have been classified as isolates in C2012. Mure (Teza 1868) is missed as a separate attested language in C2012. C2012 thus has a little leakage when it comes to consistency.

The correspondence between ASJP-NJ-268 and C2012 is very high (as well as with L1968, since L1968 and C2012 are very similar). A few large families have not been identified in their entirety in ASJP (Arawakan, Tupian, Chibchan, Jê),⁹ and a number of further families are realized (Aikana-Kwaza, Andaqui-Guajiro, Iranxe-Nambikwaran, Movima-Itonama, Jodi-Saliban, Aymaran-Quechuan, Bora-Resígaro, Atacameño-Candoshi-Shapra, Aruakan-Katukinan, Chonan-Payaguá, Bororoan-Andoque, Kariri-Araucanian, Fulnio-Leko, Trumai-Guachi). The first seven are old suggestions which have been investigated already,¹⁰ while the latter seven have not been seriously investigated. Of the latter seven, only Chonan-Payaguá seems ethnohistorically and geographically plausible on the face of it. Thus, it appears that classical human BVC search for genealogical relations as reflected in L1968 and C2012 is as systematic and complete as an initial computerized search.

⁸ For Lule-Vilela, see Viegas Barros (2001); Pano-Takanan, see Girard (1971), Key (1968), Ribeiro (2003) – though grammatical similarities have seriously entered the comparison afterwards; Harakmbet-Katukinan, see Adelaar (2000, 2007); Chon-Puelche, see Viegas Barros (2005); Southern Jê-Central Jê, see Davis (1985), Jolkesky (2010); Puruborá-Tupian, see Monserrat (2005), Vilacy Galucio (2005); Ticuna-Yurí, see also de Carvalho (2009), Nimuendajú (1977); and for Sechuran-Catacaos, only short vocabularies are available (Adelaar with Muysken 2004).

⁹ Curiously Abipón is not classified with Guaicurú. Since Abipón is a straightforward Southern Guaicuru language (Viegas Barros 2011) it raises the suspicion that the Abipón list used in ASJP is misidentified or poorly transcribed.

¹⁰ For Aikana-Kwaza, see van der Voort (2005); for Andaqui-Guajiro, Rivet (1924a) must have looked through the two; Iranxe-Nambikwaran must have been compared by Loukotka (1963); Movima-Itonama was checked already by Hervás (1784: 56); for Jodi-Saliban, see Jolkesky (2009); Aymaran-Quechuan, Heggarty (2011); and in Bora-Resígaro, similarities are the result of borrowing (Seifart 2011).

5

Conclusion

Even as extensive grammatical data are increasingly becoming available and used in the investigation of language prehistory, the notion of basic vocabulary and its role for genealogical classification is now also better understood and continues to play the main role. There is good evidence that BVC as practiced by Loukotka (1968) is sound and, if not complete, almost so, at least if we assume that Campbell (2012a) is a standard to measure this by. Attempts at automating one version of basic vocabulary comparison as practiced by the ASJP program reveal that manual BVC can be mimicked to a high degree and that manual BVC appears to have tested geographically and ethnohistorically plausible combinations thoroughly.

Part II

Case studies in contact

4 Structural features and language contact in the Isthmo-Colombian area

Loretta O’Connor

This chapter examines the role of structural linguistic features as indicators of nested levels of social history in a specific geographic region. The Isthmo-Colombian area, dominated by speakers of Chibchan languages for millennia, is a region of rich resources, long-term settlement, and relative social stability where goods and technologies were exchanged within the region and with neighbors north, south, and east of the Chibcha sphere. For this study, structural features from fourteen languages of the region were coded as stable or unstable, using a composite ranking of relative stability, and as template or contents, using a functional metric. Patterns of similarity indicate that the set of features defined as contents, that involve choosing what to encode in a given structural feature, is more successful than any other set at replicating areal patterns. The analysis suggests that structural features, like lexical items, can be divided into types which are more and less susceptible to conscious manipulation by speakers, and that their role must be interpreted within a specific sociohistorical context.

1 Introduction

By the time of European contact in the early sixteenth century, Chibchan languages were distributed across four non-contiguous regions in Central and South America (Constenla 2012: 419):
• eastern Honduras (Paya)
• from southern Nicaragua to western Panama (the Votic branch and most of the Isthmic branch)
• eastern Panama and northwest Colombia (Kuna (Isthmic), and the probably Chibchan extinct languages Catío and Nutabe)
• along the Magdalena River from Cundinamarca north to the sea (the Magdalenic branch)

This paper was improved by thoughtful comments from Pieter Muysken, Dan Dediu, and Simon van de Kerke. I am also grateful to Ana Vilacy Galucio for early discussions on database design and to Arnold van der Wal for statistical analyses and Figures 4.2 and 4.3.


This distribution presents a type of natural laboratory for looking at the effects of contact between particular Chibchan languages and genealogically diverse neighbors on the various borders, including Jicaquean and Misumalpan languages in the north, Chocoan, Barbacoan, and Paezan languages in the south, and Arawakan and Cariban languages in the east. This chapter examines patterns of structural similarity in seven Chibchan languages and seven neighbors to investigate multiple roles of structural data in tracing the history of language contact in the Isthmo-Colombian area. Quantitative investigations to date of the role of structural features in historical linguistics have focused a great deal on notions of dependency or predictability among features (Dunn et al. 2011, Hammarström and O’Connor 2013) and especially on assessment of the relative stability of individual features or meaningful subsets of features. Analysis by Dediu and Levinson (2012) of the World Atlas of Language Structures (WALS) database suggests that as few as ten to eighteen features may form the operative basis of abstract stability profiles in the language families of the world, yet the small set of crucial features varies across families and, importantly, includes both stable and unstable features. The study in this chapter makes use of a synthesis of stability rankings of individual WALS features, compiled in Dediu and Cysouw (2013), described further in Section 3.1. The limitation of models based solely on frequency of values in a database like WALS is partly inherent, as conclusions can only be based on the incomplete inventory of languages and features that could be included. The limitation is also partly a question of the narrow focus on linguistic factors, without incorporating quantifiable assessments of other fundamental properties of language, especially as a communicative system shaped by human interaction.
Patterns of structural stability and change seem to emerge from a cluster of factors that includes the structural resources in particular languages and language families, the physical characteristics of the geographic area of contact among speakers, and the size and tenor of overlapping social networks. We need categories that allow us to consider individual characteristics of specific sociohistorical contexts and that account for the psycholinguistic behavior of speakers in those contexts. This chapter contributes a multi-faceted approach to assessing the role of structural features in the investigation of language prehistory by expanding the categories for evaluating structural features and profiting from the position of a single language family in a particular set of natural and sociohistorical circumstances. Chibchan languages predominate in a relatively small and cohesive geographic region extending from Central America through the northwest corner of Colombia, and they are surrounded by languages from a variety of unrelated families. Section 2 introduces the region, the languages, and a characterization of the social scenario of contact, in which Isthmo-Colombian societies apparently incorporated non-linguistic objects and practices in a

specific type of cultural change and transmission. In Section 3, the linguistic data for analysis are categorized in two ways, with features classified as stable or unstable, based on the Dediu and Cysouw (2013) proposed ranking, and classified independently as template or contents, based on functional characteristics. Details of data categorization and data collection are presented in Section 3, and the analysis and results are discussed in Section 4. The final section offers some concluding remarks.

2 The Isthmo-Colombian area: region and languages

Languages of the Isthmo-Colombian area are spoken at the gateway to South America, across the land bridge that connects the American continents and along the northwest coast of the southern landmass. The territory in question stretches from northern Honduras through the Isthmus of Panama and into the northwestern areas of Colombia, Venezuela, and Ecuador (Map 4.1). The topography varies enormously, from the mountainous Isthmus, across the thickly jungled Darien, to the wide river plains and estuaries along the Caribbean coast. Within Colombia, the sphere of Isthmo-Colombian influence encompasses the Pacific coast region as well as the extensive riverine networks cut by the Atrato, Cauca, and Magdalena rivers and tributaries through the northern reaches of the cordilleras of the Andean mountain chain. The eastern border is traced by the Magdalena River valley north to the Sierra Nevada de Santa Marta and east to the Guajira Peninsula. Throughout history the speaker communities have shared borders with powerful Mesoamerican civilizations to the north and with dynamic Caribbean, Amazonian, and Andean groups to the south, and for some time the region was called the “Intermediate Area” to reflect its position between better-documented societies north and south (see Hoopes and Fonseca 2003: 51–54 for a discussion of the motivations and relative appropriateness of various terms used in the literature; the label “Isthmo-Colombian Area” was adopted from this paper). Scholarship in recent decades from across the human sciences has provided a more nuanced view of the region, as a place where technologies did indeed sweep through from all directions, but where the human populations remained relatively stable and intact. Objects, products, and practices probably transformed the material culture periodically, but there was little permanent immigration.

2.1 The historical and cultural context of the region

Map 4.1 The Isthmo-Colombian area, noting the position of languages in this study

A language contact scenario can be thought of as “the organized fashion in which multilingual speakers, in certain social settings, deal with the various languages in their repertoire” (Muysken 2008d). This chapter is based on the premise that language is in large part like any other cultural trait, whose practice can be inherited, acquired, modified, or lost by any generation of speakers (Mace et al. 2005; Gray et al. 2007, Gray et al. 2010). Therefore, an assessment of the non-linguistic record that encompasses details of the archaeology, ethnohistory, and ecological history of the region is relevant to our understanding of the language contact scenario: the hypothesis is that patterns

of speaker behavior in dealing with non-linguistic cultural practices will shed light on how speakers may have dealt with the various languages that entered their environments, as well. This section provides a brief sketch of the historical and cultural context of the Isthmo-Colombian area, looking at evidence for practices related to subsistence, social organization, and trade, starting from the earliest known populations through to the character and consequences of an apparent watershed moment, roughly 1,500 years ago. People have been living in the Isthmus of Panama since the late Pleistocene, some 12–10,000 years ago (see O’Connor and Kolipakam, this volume). Multidisciplinary research suggests that the Chibchan communities in the region today are in fact the genetic (Barrantes et al. 1990, Melton et al. 2007) and linguistic (Constenla 1991, 2012) descendants of the earliest inhabitants, having spread north into Nicaragua and southeast into Colombia, eventually occupying scattered territories of the Caribbean littoral and the drainage areas of the Cauca and Magdalena rivers. Although these groups apparently engaged in frequent conflict with each other and with non-Chibchan neighbors, they are also described as connected by a “diffuse unity” that encompassed belief systems and associated material practices (Hoopes and Fonseca 2003). Other studies discuss an “Isthmian Interaction Sphere” that extended from central Colombia to the Mexican Yucatan, in overlapping circles or down-the-line chains, within which commerce and the practice of other cultural activities flowed back and forth across stable local boundaries (Myers 1978, Bray 1984, Cooke 2005). 
As will be presented in more detail below, Proto-Chibchan probably split from an older Central American stock nearly 10,000 years ago, while all but one contemporary Chibchan language developed from a core branch that emerged 3,000 years later, its speakers gradually filling the narrow isthmian region of Costa Rica and Panama. The diverse ecology of the region, with deep mountain valleys and rich coastal resources on both shores, encouraged long-term settlement of small groups that practiced a wide variety of subsistence strategies. People ate fish, shellfish, and birds, and evidence of early, local agriculture includes traces of bottle gourd, arrowroot, leren, and squash in Panama 9000–7000 BP (Cooke 2005). Raymond (2008) notes maize, manioc, arrowroot, and yams on the Pacific coast, where skeletons show that maize was a primary ingredient of the diet by 6000 BP. The presence of both maize, from Mesoamerica, and manioc, from South America, demonstrates the impact of long-distance exchange in this cultural and commercial nexus. By the time Chibchan speakers started to radiate away from the core Chibchan region, beginning some 5,000 years ago, a clear record of sustained human settlement had already emerged in multiple places. This development is seen in three archaeological sites mentioned repeatedly in the archaeological literature that nicely delimit our region of focus: Cerro Mangote, from pre-6000 BP, on the Pacific coast of Panama; Las Vegas, from 8500–4600 BP, on the

Santa Elena peninsula in western Ecuador; and Puerto Hormiga, from 5100 BP, near the Caribbean coast in the estuaries of the Magdalena River in northern Colombia. Evidence at all three sites points to the existence of foragers and collectors who practiced a type of residential mobility (Raymond 2008: 80– 86). These mobile communities trekked in circuits, probably to achieve their attested varied diet of fish, birds, game, and other forest products, returning to the central bases to bury their dead in what Raymond notes as symbolic if not physically permanent homes (p. 81). As noted above for Panamanian societies, in Ecuador, too, we find evidence of very early plant domestication, with traces of squash and leren from nearly 10,000 BP, and maize and bottle gourds by about 8000 BP. All three regions described above had ceramics fairly early. Pots were found at Valdivia (after 4500 BP) near the older Las Vegas site, and sand-tempered pots of lower quality were found at Monagrillo, around 4400 BP, near the older Cerro Mangote site. Distinctive large and globular ceramic bowls called tecomate, made of clay tempered with fibers, were found throughout Colombia at sites such as Puerto Hormiga, Turbana, Monsu, and San Jacinto, with the earliest tecomate dated pre-5000 BP (Allaire 1999: 679). Similar tecomate ware has been documented throughout the greater Central American region: from 4000 BP, on the Guajira Peninsula and the Pacific coast of Guatemala; from 3000–3500 BP, at Tronadora and Chaparron in Costa Rica; and from 3000 BP, at Momil in the Sinu lagoons of northern Colombia (Allaire 1999). On the evidence of these ceramics, Myers (1978) made the case that overland trade routes could indeed have linked the coasts of Guatemala and Ecuador, and he also suggested that Puerto Hormiga ceramics looked more like those of the Orinoco and Amazon than like those of Ecuador. 
It is often quite difficult to prove an ultimate origin of particular ceramics; to complicate the question, the word tecomate appears to come from the Nahuatl word tecomatl, which describes this very type of pot. And yet, perhaps the Isthmo-Colombians traded tecomate pots for Saladoid and Barrancoid ceramics coming from the Orinoco region, as well as for jade from Guatemala, both of which entered the area by about 3500 BP. Despite our increasing knowledge of the region through archaeological and ecological data, we cannot reconstruct with confidence any major social trends or patterns of dominance for much of prehistory. Scholars of the Isthmo-Colombian area do however note a major moment of change sometime around 500 CE (Bray 1984: 331, Allaire 1999: 707, Hoopes 2005). The transition may have been motivated in part by increased consumption of maize, a much better crop for floodplain cultivation (Bray 1984), or by climatic events, such as environmental catastrophe (Hoopes 2005), and many think the sociocultural change was linked to a transition from jade to gold as the precious material of greatest cultural importance. Quilter (2003) describes the paradigm shift.

Mesoamerican jade was difficult to obtain, and its hardness meant that working with it was a slow if relatively easy process. Once shaped, jade artifacts were quite durable. Gold was found locally, and the rendering process was complex and somewhat mysterious yet relatively fast. Furthermore, gold artifacts could be melted down and transformed into an entirely different object. The power of gold, as a commodity to be mined, owned, worked, traded, stockpiled, refashioned, and passed on to descendants, played a key role in the emergence between 300 and 600 CE of the ranking, inequality, and complexity that still characterized the societies encountered 1,000 years later by Europeans (e.g. Quilter and Hoopes 2003, Hoopes 2005). Many details of social development remain to be deciphered. As observed by Bray (1984: 307), early Spanish chronicles from the Gulf of Uraba, between Panama and Colombia, mention witnessing “a thriving business in slaves, fish, salt, cotton cloth, and live peccaries, as well as gold.” Bray continues, “It is worth noting that most of the products on this list will leave no archaeological trace and that pottery does not figure at all.” Some of the societies that were powerful during the last millennium before the Conquest would be particularly relevant to the linguistic analysis in this chapter. Among these are two whose languages remain uncertain: the Zenu (or Senu), who erected astonishing raised fields over vast expanses of the San Jorge River basin in northwest Colombia, and the Quimbaya, renowned and prolific gold workers from farther south on the upper Cauca, whose gold work has been found throughout the Isthmo-Colombian area. There were also two important centers of Chibchan-speaking groups near the Magdalena River along the eastern edge of the area. The Muisca realm occupied the upper Magdalena, near present-day Bogota, at a site known to early Spaniards as El Dorado for its power and wealth, most notably in gold and emeralds. 
In the Sierra Nevada of Santa Marta near the mouth of the Magdalena, we find the Tairona civilization. The Tairona melded complex architecture with diverse agricultural practices in the design of terraced villages and fields at various altitudes, and they left a copious iconographic record of their rich religious life. Archaeological evidence suggests all these societies were chiefdoms, entities that demonstrate organization and hierarchy in political, religious, and economic activities, seen in artifacts such as large-scale public works, settlement layout, burial practices, and iconography (Bray 1984: 331). There is a growing body of literature from recent decades on what constitutes a chiefdom and on the many ways sociopolitical complexity can be manifest, and these definitions shape what we might expect from a given social scenario in terms of language contact. For example, Hoopes (2005: 6–9) discusses the literature on two modes of social power, known as network and corporate. A network type of chiefdom emphasizes such factors as individual power passed through hereditary lines, centralized chiefs, and commercial power achieved

and maintained by military means. This type of scenario seems more likely to lead to language shift, as speaker communities are conquered and subjugated to enrich a central power. In contrast, the outcome of a corporate mode of social power may be more consonant with diglossia and language maintenance, as a corporate mode emphasizes the power of the office, non-linear inheritance, and control achieved through ritual and ideological means. In this context, the important leaders could have been not chiefs but priests and shamans, who exercised locally the power of a broadly shared worldview, expressed in common iconography and “routinized ritual” rather than through control of key resources (Hoopes 2005: 31). Cooke (2005: 31) proposes a type of mixed model, in which “above the chiefdom, there were larger, equally important social units – to judge from the ethnographic record, some kind of descent group or groupings of ethnias with closely related languages and memories of common origins, shared songs and praises, and conflicts between real and mythical personalities and social groups.” Thinking again of the linguistic outcome, we might envision the role of Latin as a language of worship that left space for the maintenance of local languages. Within the Isthmo-Colombian region, chiefdoms of Central America tended toward the network model, perhaps due to influence from Mesoamerica, and chiefdoms of Colombia tended toward the corporate model, a difference that would likely have consequences for patterns of language contact. This section began with a mention of the “diffuse unity” said to characterize the Isthmo-Colombian area. Bray (1984: 336–337) argues the opposite side of the same coin: that despite constant contact and constant conflict, especially among close neighbors, individual cultures in the region remained distinct. 
He calls this phenomenon “conservatism in the face of opportunity for change” and proposes that while population stability may play a role, the true barriers to convergence were ideological:

    When borrowing does occur, what is usually taken over is the technology (metalworking, pottery painting, crop complexes), but this technology is used for purely local ends. There is surprisingly little direct copying. The more neutral the trait, the wider its distribution and the greater its chances of acceptance. As our comparisons have shown, geometrical designs travel faster and farther than figurative or symbolic themes, which are often strongly regional. (Bray 1984: 337)

The key notion here is that societies seem to have accepted the basic frameworks of new technologies, practices, or artifacts, and to have adapted and reproduced the new structures with locally relevant contents. With this notion in mind, I will summarize what this brief review of the non-linguistic literature on the Isthmo-Colombian area can bring to the question of the linguistic prehistory of the region, and especially to the investigation of the effects of language contact. Speakers of Chibchan languages have been in situ for millennia, with mostly

Chibchan neighbors throughout Costa Rica and Panama and among mostly non-Chibchan speakers in scattered pockets from Nicaragua to central Colombia. There was ongoing contact and conflict, especially among nearest neighbors, which suggests a degree of bilingualism (or multilingualism) and intermarriage (which may have been forced, as an outcome of conflicts). Something happened around 500 CE that led to greater sociopolitical complexity that affected the entire region, shaped societies for the next 1,000 years, and may have involved the imposition of dominant languages, in the form of language shift or of diglossia. Throughout, individual cultures remained relatively distinct, but at the same time, societies did take advantage of new technologies and practices, accepting the frameworks and adapting details of the content to fit local needs. When looking at language systems, stability may take different forms. Basic vocabulary is expected to be stable and to indicate family relations, while cultural vocabulary is expected to show more effects of cross-family borrowings that reflect the specific cultural and ecological context. A resistance to lexical borrowing is often interpreted as the conscious maintenance of a distinct social identity, a phenomenon documented in contact situations from the Vaupés (Aikhenvald 2002; Epps 2007a, 2008a) to Vanuatu (François 2011), and proposed for languages of Colombia as well (O’Connor 2011). As was discussed in the introduction of this chapter, categories of structural features and their interpretation are less clear, but there are general expectations that stable features will reflect genealogy better than unstable features will. This chapter contributes a perspective from structural data that unpacks relative stability, or relative resistance to borrowing, using two types of metrics, one of which explicitly operationalizes the notions of abstract template and locally relevant content, as described by Bray (1984).

2.2 The Chibchan family

The Chibchan language family is by far the largest family in the Isthmo-Colombian Area, in number of languages and in geographic spread, and we know a great deal about the history of this family thanks especially to the work of Constenla Umaña (e.g. 1981, 1991, 2012) and Quesada (e.g. 1999, 2007). Citing evidence from phonological, lexical, and grammatical comparison and reconstruction, Constenla (2012: 418) suggests that the proto-language split around 9,700 years ago from a Lenmichí “micro-phylum” composed of the Lencan, Misumalpan, and Chibchan families. The Paya language, now spoken in northern Honduras, probably split from the proto-language some 6,700 years ago, leaving what is known as Core Chibchan (see Figure 4.1). Judging by the distribution and degree of present diversity, Constenla (2012: 419) presumes a Chibchan homeland in southern Central America and estimates that it was from this Isthmian homeland that other branches developed as speakers migrated north (Votic branch, 5325 BP) and east across northern Colombia (Magdalenic branch, 5225 BP), with another migration east by Kuna speakers some 4,800 years ago. There were at least twenty-one languages, of which sixteen survive. Two of the extinct tongues would have been particularly useful for this study of language contact: the Antioquian languages Catío and Nutabe, thought to have been spoken between the Sinu and Cauca Rivers, near the Zenu and Quimbaya societies mentioned in Section 2. Sadly, they are virtually undocumented and therefore could not be included in Constenla’s classification of Chibchan subgroups. Of particular interest in this chapter is the observation that “Tairona . . . seems not to be another language, but a variant of the still spoken Damana” (Constenla 2012: 391, citing older literature).

I. Paya
II. Core Chibchan
    IIA. Votic: Rama, Guatuso
    IIB. Isthmic
        B1. Western Isthmic
            B1.1 Cabécar, Bribri
            B1.2 Teribe/Térraba
            B1.3 Boruca
        B2. Doracic: Dorasque, Chánguena
        B3. Eastern Isthmic
            B3.1 Guaymiic: Guaymí, Bocotá
            B3.2 Kuna
    IIC. Magdalenic
        C1. Southern Magdalenic
            C1.1 Chibcha: Muisca, Duit
            C1.2 Tunebo
            C1.3 Barí
        C2. Northern Magdalenic
            C2.1 Arhuacic
                C2.1.1 Kogi
                C2.1.2 ES Arhuacic
                    C2.1.2.1 Eastern Arhuacic: Damana, Kankuama
                    C2.1.2.2 Ika
            C2.2 Chimila

Figure 4.1 The Chibchan language family, after Constenla (2012: 417). Boxed languages appear in this study.

2.3 The languages in the study

The fourteen languages for this study, listed in Table 4.1, were chosen for their geographic location on the borders of the Chibcha sphere and because there were sufficient descriptive materials available for the data collection questionnaire, described in Section 3. A fundamental goal of this paper is to determine if any particular subset of features can be identified as a good “trace of contact”: in other words, will any category of feature highlight areal relations among languages and speakers by occurring in patterns that correlate with geographic proximity of languages irrespective of genealogical relationship? Regional subareas are defined here (see Table 4.1) as a Northern group (languages 1–4), an Isthmian group (5–7), a Southern group (8–11), and an Eastern group (12–14).1 The subset of features which best replicates these geographic subareas will therefore be claimed to contain the features most susceptible to the effects of contact in the given social scenario.

1 As such, this study and its goal are a micro-version of the seminal areal study of the Isthmo-Colombian region by Constenla (1991), which arrived at a macro-view of the area. Constenla’s goal was to determine if the cultural area designated as the Intermediate Area, defined mostly by anthropologists, also constituted a single linguistic area. He concluded that the languages were better classified into three groups, as members of a Central American-Colombian subarea (CAC), an Ecuadorian-Colombian (Andean) subarea (EC), and a Venezuelan-Antillean (Caribbean) subarea (VA). Nearly half of the structural features collected and analyzed by Constenla were also used in the present study, primarily in the set of stable features. Under Constenla’s (1991) scheme, the languages in this study are 1–9 and 12–13 in CAC, 10–11 in EC, and 14 in VA.

Table 4.1 Languages in this study

No.  iso  Language         Family       Location            Group
1    jic  Jicaque          Jicaquean    Honduras            North
2    pay  Paya             Chibchan     Honduras            North
3    miq  Misquito         Misumalpan   Nicaragua           North
4    rma  Rama             Chibchan     Nicaragua           North
5    tfr  Teribe           Chibchan     Costa Rica, Panama  Isthmian
6    gym  Guaymi           Chibchan     Costa Rica, Panama  Isthmian
7    cuk  Kuna             Chibchan     Panama              Isthmian
8    emp  Northern Embera  Chocoan(N)   Colombia            South
9    sja  Epena Pedee      Chocoan(S)   Colombia            South
10   kwi  Awa Pit          Barbacoan    Colombia, Ecuador   South
11   pbb  Paez             Paezan       Colombia            South
12   arh  Ika              Chibchan     Colombia            East
13   mbp  Damana           Chibchan     Colombia            East
14   guc  Guajiro          Arawakan     Colombia            East

It should also be noted that, even in this small inventory of languages, assigning languages to areal groups is itself a matter for investigation and experimentation (see Map 4.1). We might expect Paya to group with the Jicaquean and Misumalpan languages. Jicaquean is a small family unconnected to any other, and not much is known about the history of the people. It is spoken in northern Honduras, along the Caribbean coast, and while it is likely a long-term neighbor of Paya, there is no known interaction. Misumalpan is a small family of languages spoken primarily throughout central and eastern Nicaragua, extending across the border into southern Honduras, and with a small pocket of speakers on the border between Honduras and El Salvador (Constenla 1991). The only representative in our dataset is Misquito, especially interesting for its historical extension all along the Caribbean coast of Nicaragua, where speakers could have had contact with Paya to the north and Rama to the south. Rama, from the Votic branch of Core Chibchan, could pattern with the Northern group or alternatively with the Isthmian languages Teribe and Guaymi,

while the position of the third Isthmian language, Kuna, could be expected to vary between its Chibchan Isthmian cousins and the Chocoan languages with which it has surely been in contact. The Chocoan language family is composed of two living language varieties, Waunana and Embera. Embera is itself described as a set of closely related languages or as a dialect continuum, each variant named for the region where it is spoken, and it is divided into Northern and Southern branches. The sample here includes one Northern Embera variant (called Northern Embera) and one Southern Embera (Epena Pedee). Chocoan languages are spoken today all along the Pacific coast of Colombia and into eastern Panama, and the speakers call themselves by terms that translate as ‘mountain dwellers,’ ‘river dwellers,’ and ‘people of the wild cane’ (Mortensen 1999: 1). Historically, they are known as a “flexible and expanding population” who have settled in regions vacated by other groups during the process of colonization (Adelaar with Muysken 2004: 56–57). Several extinct languages prominent in the discussion of the Isthmo-Colombian prehistory have been associated with Chocoan, though none of these has sufficient documentation to confirm the relationship. These include Cueva, which was spoken on the Isthmus between Kuna to the east and the rest of Isthmian Chibchan to the west, and the extinct Colombian languages Quimbaya, of the Upper Cauca Valley in western Colombia, and Sinúfana, of the Sinú region between the Sinu and Lower Cauca rivers near the Caribbean coast. The Southern Embera languages may show areal similarities with a Southern group that contains Paezan and Barbacoan languages. Paez is the only language (or only surviving language) in the Paezan family, spoken on the eastern and western slopes of the Andean cordillera central in southwestern Colombia.
Paez has had known contact with the surrounding Barbacoan languages Guambiano and Totoro and likely contact with Southern Emberan (Chocoan) languages of the nearby Saija, San Juan, and Cauca River systems. The Barbacoan languages are spoken in separate pockets scattered from the mountainous regions of southwestern Colombia to the coastal lowlands of northwestern Ecuador. The Barbacoan language in this study, Awa Pit, is spoken in the western foothills of the Andes along the Colombia–Ecuador border. This language is perhaps an odd choice for a study of areal contact, as the community is (and may have long been) known for a culture of “secrecy” and inaccessibility (Curnow 1997; Curnow and Liddicoat 1998). The final subregion in this study is the Eastern group, centered on Ika and Damana, Chibchan Magdalenic languages of the Sierra Nevada de Santa Marta, which may have had contact with the Northern Maipuran Caribbean branch of Arawakan languages. As mentioned previously, Damana may be the modern version of the language spoken in the powerful Tairona chiefdom, a factor which may impact its regional profile. Arawakan is a large family of around thirty languages spread geographically from Belize to Bolivia. The language


of interest here is Guajiro, of the Guajira peninsula on the Caribbean coast at the border of Colombia and Venezuela. Regretfully, no Western Cariban languages could be included in this dataset due to insufficient documentation. Cariban is a large family of forty to sixty languages, many extinct, and mostly spoken from the Orinoco basin of eastern Venezuela across the Guianas to the Amazon, and into central Brazil. There were Cariban speakers in the Magdalena River valley, and some scholars suggest that at least some of the unknown languages spoken throughout the Caribbean lowlands of northern Colombia were also Cariban. Constenla notes that between the Kuna and the Magdalenic group of Chibchan “there was a series of people of proven or supposed Cariban affinities, such as the Opon, the Muzo, the Panche, and the Pijao” (2012: 419). The surviving Cariban language closest to the Isthmo-Colombian region is Yukpa, a language cluster with scant documentation, spoken just west of Lake Maracaibo along the Colombia–Venezuela border. Relying on small descriptions in older sources, Constenla (1991: 60) described Yukpa as SV/SVO, with genitive and demonstrative before the noun and adjective and numeral after the noun. In more recent work, Flores (2002) finds that while constituency is varied, the basic word order of Japreira, one language of the Yukpa group, is SOV. Postpositions are illustrated as suffixes, and nominal constituent order includes Genitive-Noun, possessor-possessed, and both Adjective-Noun and Noun-Adjective. A typological overview of the relevant language families, based on the Constenla (1991) binary-coded dataset, is presented in Table 4.2. While some feature values seem rather widespread, such as SOV basic word order, postpositions, and case suffixes everywhere but in Arawakan, we can also see a certain amount of variation and indeed several features with “mixed” answers within the Chibchan family. 
Interestingly, the history of the genealogical classification of some languages in the study has taken them from sisters to cousins to neighbors. The Chibchan family was identified as such by Uhle (1890). Genealogical classifications involving the Chibchan, Chocoan, Barbacoan, and Paezan families include efforts by Rivet (1924b) and Loukotka (1968), which grouped Barbacoan and Paezan inside Chibchan; by Greenberg (1987), which proposed a Chibchan-Paezan subgroup within the single family Amerind, placing Barbacoan and Chocoan inside the Paezan division; and by Campbell (2012a), which registers the groups as four distinct families without known interrelation.

Table 4.2 Typological profiles, using features from Constenla (1991)

Feature | Jicaquean | Misumalpan | Chibchan | Chocoan | Barbacoan | Paezan | Arawakan
langs N = | 1 | 4 | 15 | 3 | 4 | 1 | 1
clausal constituents | SOV | SOV | SOV | SOV | SOV | SOV | VSO
adpositions, case suffix | postp, N-sfx | postp, N-sfx | postp, N-sfx | postp, N-sfx | postp, N-sfx | postp, N-sfx | postp, prep_N
NP: genitive | Gen-N | Gen-N | Gen-N | Gen-N | Gen-N | Gen-N | N-Gen
NP: adjective | N-Adj | N-Adj | N-Adj-N | N-Adj | Adj-N | N-Adj | N-Adj
NP: numeral | N-Num | N-Num | N-Num-N | N-Num | Num-N | Num-N | Num-N
NP: demonstrative | N-Dem | N-Dem-N | N-Dem-N | Dem-N | N-Dem-N | Dem-N | N-Dem
ACC mark | no | mixed | mixed | no | yes | yes | no
AGT-PAT | no | no | mixed | no | no | no | no
ERG-ABS | no | no | mixed | yes | no | no | no
arg-marks: prefixes | yes | yes | mixed | no | no | no | yes
arg-marks: suffixes | yes | yes | mixed | no | yes | yes | no
TAM: prefixes | yes | no | no | no | mixed | yes | no
TAM: suffixes | no | yes | yes | yes | yes | yes | yes

3 Features and methods

The dataset for this study consists of binary answers (yes = 1, no = 0) to 90 questions about structural features in 14 languages of the Isthmo-Colombian region (see Table 4.5 in the appendix). Features were classified as stable vs. unstable and, independently, as template vs. contents. These categorizations are explained below.

3.1 Stable vs. unstable

The sets of stable and unstable features used in this study were selected from a proposed stability ranking of structural features (Dediu and Cysouw 2013), itself based on a comparative analysis of eight individual approaches to calculating the stability of features archived in the World Atlas of Language Structures (WALS). The eight studies made use of different statistical analyses and operated under different definitions of stability. Some were based on the persistence of features within families, others on estimates of the evolution of feature values through time, and others measured patterns of persistence within the dataset without initial consideration of known language families. Every study devised an estimate of relative stability for each individual WALS feature, up to a total of 132 features, depending on the study. Dediu and Cysouw then converted the stability estimates into relative ranks from 0.00 (least stable) to 1.00 (most stable) to facilitate comparability, reported in their Table 1, and their Table 7


presents a composite ranking for the 62 features represented in all eight analyses, based on principal component analysis. For the present study, features 1–31 in the 62-item list were classified as stable, and features 32–62 as unstable. To supplement the dataset, 19 additional features were chosen from the 132-item list, and the seven relative rank scores reported for each feature were averaged. The resulting average score was then compared to the average score of the “cut-off” feature (that is, feature 31 in the 62-item list), calculated by averaging the same seven relative rank scores, in order to situate each supplementary feature in the appropriate stable or unstable category.²

The features used in this chapter were chosen with four parameters in mind: (i) relative position in the proposed stability rankings, (ii) presence in the Constenla (1991) dataset of typological features, (iii) presence in WALS, and (iv) likelihood of appearance in existing descriptive materials for the languages in question. The last parameter is subjective yet realistic, given the scarcity and brevity of materials on under-described languages of South America.

The Constenla (1991) dataset consists of binary indications (yes = 1, no = 0) of the presence of 42 morphological features and 39 phonological features in 76 languages of Mesoamerica, Central America, and northwestern South America. The first step in data collection was to incorporate all relevant information from the Constenla dataset for the 35 languages of the region in question, yielding data for 35 features in all languages. Next, all possible information from WALS (accessed 13 August 2012) was added to the dataset, providing information on 59 more features for only some languages (reflecting the uneven coverage in WALS). If WALS contained more recent documentation that contradicted information from Constenla (1991), the WALS feature value was used. These collections were especially fruitful for the subset of stable features.
Published grammatical materials were then consulted to code the remaining features, resulting in data for 49 stable features and 49 unstable features for 14 languages. Certain pairs of questions, such as “Is the order of the NP Adj-N?” and “Is the order of the NP N-Adj?” gave only duplicate information for this set of languages, so these questions were eliminated. Nevertheless, dependencies remain in the data, in part because WALS features with multiple possible values were expressed in the questionnaire as multiple questions with yes/no answers.

² It should be noted that Dediu and Cysouw (2013) offer the lists as ranked compilations, not as explicit hypotheses to be tested as is being done in this chapter. Dediu (p.c.) stresses that the first principal component “agreement” upon which Table 7 is organized is a global negotiation among the different methods, and he suggests that the rankings in the Parkvall (2008) study, discussed in their paper, might serve as a fruitful basis for future investigations of interactions among vertical and horizontal transmissions. The 19 WALS features drawn from Table 1 for use in this Isthmo-Colombian investigation are features 34, 35, 36, 52, 71, 78, 101, 103, 112, 116 (unstable; with average stability estimates from 0.239 to 0.510) and features 33, 51, 63, 69, 81, 88, 98, 99, 100 (stable; with average stability estimates from 0.564 to 0.819). The corresponding average stability score of the “cut-off” item, item 31 in the 62-item list, is 0.523.
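The placement of a supplementary feature relative to the cut-off can be illustrated with a minimal Python sketch. The cut-off value 0.523 is the one reported in the note above; the function name, the feature names, and the rank scores are hypothetical and are not the published values. The treatment of an average exactly equal to the cut-off is also an assumption, since the text does not address that case.

```python
from statistics import mean

# Average stability score of the cut-off item, as reported in the text.
CUTOFF = 0.523


def classify(rank_scores, cutoff=CUTOFF):
    """Average the relative rank scores (0.00-1.00) of one feature and
    label the feature by comparing the average with the cut-off.

    Assumption: an average exactly equal to the cut-off counts as unstable.
    """
    average = mean(rank_scores)
    label = "stable" if average > cutoff else "unstable"
    return average, label


# Hypothetical rank scores for two invented features (seven scores each).
example_scores = {
    "feature_A": [0.70, 0.80, 0.90, 0.60, 0.80, 0.70, 0.80],
    "feature_B": [0.20, 0.30, 0.40, 0.10, 0.30, 0.20, 0.30],
}

labels = {name: classify(scores)[1] for name, scores in example_scores.items()}
```

With these invented scores, the first feature averages above the cut-off and the second below it, so they fall into the stable and unstable sets respectively.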


This left a final dataset of 90 features for each of the 14 languages, composed of 44 stable and 46 unstable features, including five cells of missing data. Initial hypotheses are that stable features will more often replicate genealogical patterns than areal patterns in the data, and unstable features will more often replicate areal patterns in the data. This is a conservative view, equating geographic proximity with the probability of language convergence and shared structures.

3.2 Template vs. contents: patterns that connect

The next step was to categorize each feature as template or contents according to its function within a language. Template and contents classifications are comparable to familiar structuralist categories of syntagmatic and paradigmatic relations, respectively. The label “template” appeals to formal, constructional properties, at the level of the phrase or the word. These features describe the sequence of forms in a construction (constituent order, the location of affix or adposition) and the quality of the form as bound or free (affix, clitic, free form). Template features indicate where and in what form a feature is expressed. In contrast, “contents” features indicate which value from a set of values will fill a given formal position. These features appeal to choices and computations made by the speaker, often in response to factors of social cognition, group identity, and cultural rules and preferences. Contents features include those relating to the sound system; to the choice of pronoun, based on gender, politeness, or inclusivity; to choices of nominal marking, based on animacy, inalienability, or other classificatory quality; and to the choice of verbal marking, based on the grammatical role, number, or referent of the participant. The categories of template and contents were operationalized in part from descriptions of structural change (Heath 1984; Zavala 2002; Winford 2003; Heine and Kuteva 2005; Matras and Sakel 2007), all of which draw on distinct feature types commonly known as “pattern” (the formal construction, corresponding to “template”) and “matter” (the morphophonological content). The key difference between “matter” and “contents” is that “contents” indexes a semantic component rather than a specific morphophonological form, identifying the meaningful distinctions encoded without examining the forms. 
The decision-based nature of contents features suggests these are more likely to be shaped and enforced by interaction in specific sociocultural contexts, in effect, computed for each utterance; the initial hypothesis is that contents features will reflect areal relationships. The stencil-like nature of template features suggests two possible outcomes. On a psycholinguistic basis, “template” could describe a property that would be backgrounded and perhaps not easily accessible and manipulable by speakers. In this case, the feature might be relatively stable and reflect inheritance more than contact. On the other hand, in the description of cultural borrowing patterns in Section 2.1, Bray (1984: 337) was quoted as observing, “The more neutral the trait, the wider its distribution and the greater its chances of acceptance.” If we connect the pattern of linguistic change to the pattern of cultural change, extending the dynamics of cross-linguistic priming to cross-modal priming of non-material culture, then template features may reflect a more general, regional level of contact, while contents features reflect the most local social groups. In this sense, the template and contents features could possibly both reflect areal patterns in a set of nested levels. Table 4.3 describes the resulting database of 90 features and two types of categorization, with intersecting subsets (and see appendix).

Table 4.3 Numbers of features in major categories and their intersections

 | Template | Contents | Total
Stable | 20 | 24 | 44
Unstable | 24 | 22 | 46
Total | 44 | 46 | 90

4 Assessment of feature role

The previous section presented hypotheses of what each category of structural features is likely to tell us about relationships among the various languages.
• stable features → genealogical relations
• unstable features → areal relations
• template features → genealogical relations or more general areal relations
• contents features → areal relations, perhaps of the very local region
In this section, the hypotheses are explored and tested using quantitative tools. The first subsection uses a linguistic distance matrix calculated with the NeighborNet algorithm in SplitsTree4 (Huson and Bryant 2006). This tool produces a distance matrix based on the presence or absence of each feature for every pair of languages. Patterns of similarity in the data matrix were represented graphically as networks and as bifurcating trees as a first quick look at relationships (not reproduced in this chapter); the examination suggested that the best guides to identifying contact relationships among languages in the dataset are the contents features, particularly the subset of stable contents features. The graphics also suggested that no set or subset of features will identify a family relationship by successfully grouping Chibchan languages apart from other languages.
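As a rough illustration of a pairwise distance built from binary feature values, the following Python sketch computes the proportion of features on which two languages differ. This normalized Hamming distance is an assumption made for illustration; the chapter does not state which distance measure SplitsTree4 applied. The language names and feature vectors are invented, and the skipping of missing cells is likewise an assumption.

```python
from itertools import combinations


def hamming_distance(a, b):
    """Proportion of features on which two languages differ.

    Features coded None (missing data) in either language are skipped;
    this handling of missing cells is an assumption.
    """
    compared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not compared:
        raise ValueError("no comparable features")
    return sum(x != y for x, y in compared) / len(compared)


def distance_matrix(profiles):
    """Distance for every unique pair of languages, keyed by sorted name pair."""
    return {
        (l1, l2): hamming_distance(profiles[l1], profiles[l2])
        for l1, l2 in combinations(sorted(profiles), 2)
    }


# Invented binary profiles (1 = yes, 0 = no, None = missing).
toy_profiles = {
    "lang_A": [1, 0, 1, 1, 0, None],
    "lang_B": [1, 1, 1, 0, 0, 1],
    "lang_C": [0, 1, 0, 0, 1, 1],
}

matrix = distance_matrix(toy_profiles)

# A group of seven languages yields 21 unique pairs.
pairs_in_group_of_seven = len(list(combinations(range(7), 2)))
```

Restricting the feature vectors to one subset (for example, only the stable contents features) before calling `distance_matrix` would give the per-category distances that the following subsection compares.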

Figure 4.2 Linguistic distances for Chibchan vs. non-Chibchan languages [box plots of linguistic distance (y axis, 0 to 0.7) for nine feature subsets (all features, stable, unstable, template, contents, stable-contents, stable-template, unstable-contents, unstable-template), shown separately for Chibchan (CH) and non-Chibchan (NCH) languages, 21 pairs each]

4.1 Linguistic distance by feature category and pre-defined group

In this subsection, linguistic distances from the matrix calculated by SplitsTree are compared by family grouping and by the regional groupings defined earlier. The graphics represent, for each feature type, the median distance (bar inside the box), the standard deviation spread around the average distance (top and bottom of box), and the maximum and minimum values in the data (whisker tips). The distribution in this dataset did not have extreme outliers, so the median is close to the average value for each type. The lower the linguistic distance, the more similar are the pairs of languages.

Figure 4.2 addresses genealogical relations, illustrating linguistic distance by feature subset for all Chibchan languages (CH) vs. all non-Chibchan languages (NCH), with 21 unique pairs in each calculation, noted as (21). The first observation is that average linguistic distance for “all features” is nearly identical between the two groups (although the spread is larger in non-Chibchan languages), perhaps reflecting the known similarity of structural features across the region. The results also suggest the importance of separating the simple category of stable features into different types of stable features to identify Chibchan family relationship: the best predictors of genealogy in this dataset are stable template features, while the stable contents features show a higher average distance and much greater spread. Within Chibchan, contents features as a whole and in subcategories consistently show slightly higher linguistic distances than unstable features. This suggests that a classification scheme of contents vs. template has more consistently identified properties of relative mutability among features than has the stable vs. unstable metric.

No particular category of features signals any type of relationship among the diverse languages in the non-Chibchan group. Stable template features in this group display an extremely wide spread, suggesting these features are quite poor indicators of a general areal relationship for this set of languages. The representation of linguistic distances in Figure 4.2 argues that the stable vs. unstable opposition is not useful in delineating the Chibchan languages in the dataset. Template features, particularly the stable template features, predict genealogical rather than areal relationships, and contents features seem most sensitive to change within Chibchan but make no predictions about contact relationships in the non-Chibchan group.

Figure 4.3 addresses areal relations, illustrating linguistic distance by feature subset among pairs of languages in northern, Isthmian, southern, and eastern groups. Data numbers are very small: groups with four members have six unique pairs for comparison, shown as (6), and groups with three members have three unique pairs for comparison, (3).

Figure 4.3 Linguistic distances for pre-defined areal groups [box plots of linguistic distance (y axis, 0 to 0.7) for eight feature subsets (stable, unstable, template, contents, stable-contents, stable-template, unstable-contents, unstable-template) in each regional group: North (N), 6 pairs; Isthmian (I), 3 pairs; South (S), 6 pairs; East (E), 3 pairs]

This discussion of Figure 4.3 examines the graphic from left to right within each regional group. The common sense hypothesis is that unstable features will be better than stable as predictors of areal relationship. This means that linguistic distances should be smaller for unstable features than for stable features – and this is only true in the Eastern group. In all other groups, the average distance of stable features is smaller than the average distance of unstable. This contrary outcome actually might make sense for Isthmian, where all three neighbor languages are also Chibchan, but in general we can say that unstable features do not consistently predict areal relations better than stable features in this dataset.

The prediction for the next two categories is that contents features will reflect the local area while template features may reflect a larger area or may reflect genealogy. Results here are quite mixed. Contents distances are smaller than template distances in Northern and (only slightly) in Southern groups, while this is not true in Isthmian and Eastern groups. The outcome is again striking in Isthmian, where the spread among features is quite small and the template distance is much lower than the contents distance. The results in Eastern are harder to interpret, as the spreads are similar and the numbers are so small. The summary statement for these categories is that neither type consistently predicts clear areal relationships in the dataset.

The perspective from the four multi-class categories suggests no common predictor of areal relation for the four pre-defined groups. The results here simply subdivide the unexpected outcome of the stable vs. unstable features described above, with stable features in three of the four regions showing smaller linguistic distances, which could be interpreted as the feature sets most useful in delimiting each areal group. These are stable contents for Northern, stable template for Isthmian, and either of those for Southern. In the Eastern group, unstable contents and unstable template features show the smaller distances.

4.2 Geographic distance by feature category

This subsection departs from the linguistic distance matrix calculated by SplitsTree and from pre-defined regional groups of languages. In this analysis, pairs of languages are compared using geographic distance to calculate a normalized median distance between languages that share values for each feature. There are 90 binary features, which yields 180 possible sets of shared values (pairs that share 1 and pairs that share 0, for each feature). Of the 180 possible sets, 162 were shared by at least one pair of languages. Distances between pairs of languages were calculated using point coordinates for language locations in the WALS database and an online calculator that used the haversine formula to calculate the shortest distance between two points over the earth’s surface (the great-circle distance).
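The great-circle calculation mentioned above can be sketched in Python with the standard haversine formula. The online calculator used in the study is not identified in the text, so this is an independent illustration rather than a reproduction of that tool. The function name and the mean Earth radius of 6371 km are assumptions, and the coordinates in the usage lines are arbitrary test points, not WALS language locations.

```python
import math

# Mean Earth radius in kilometres (an assumed constant, not from the chapter).
EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    delta_phi = phi2 - phi1
    delta_lambda = math.radians(lon2 - lon1)
    a = (
        math.sin(delta_phi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(delta_lambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Arbitrary test points: one degree of longitude along the equator,
# and two antipodal points on the equator.
one_degree = haversine_km(0.0, 0.0, 0.0, 1.0)
half_circumference = haversine_km(0.0, 0.0, 0.0, 180.0)
```

Applying such a function to the point coordinates of each pair of languages yields the geographic distances that enter the normalized median described below.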

Figure 4.4 Geographic distribution of feature values [scatter plot; x axis: normalized median distance between pairs (0 to 2); y axis: languages with shared values (0 to 16); series: Stable-Contents, Stable-Template, Unstable-Contents, Unstable-Template]

Normalized median distance was calculated as the median distance between all pairs that shared the value of a given feature divided by the median distance of all pairs with a value for that feature (see appendix). If all languages shared a value for a given feature, as was the case for example with feature S1 = 0, then the normalized median distance was 1. If no languages or only 1 language had a specific value for a feature, as with S1 = 1, then there was no pair and no median distance. Features S33, S34, U27, U34, and U35 contain missing data for one language each, and therefore the total number of languages with values for those features is 13. In all cases, the lower the normalized median distance, the closer the areal relation among the languages that share the value for the feature.

Figure 4.4 illustrates the plotting of feature category by the number of languages that share values and the normalized median distance between pairs.³ The general impression is of a rather homogeneous distribution of feature types, with a possible cluster of unstable contents features between 0.8 and 1.0 normalized median distance. The spread is characterized in Table 4.4, which shows counts by feature type for cumulative ranges of the 162 features ranked by normalized median distance. The distribution of stable and unstable features in each range is strikingly balanced, which contradicts the hypothesis that unstable features should show clearer areal affinities than stable features. The highest proportion of unstable features (%U, at 55%) occurs at either end of the scale, where 11 of the 20 lowest and 11 of the 20 highest normalized median distances are between shared values of unstable features.

³ Calculations were actually made using pairs of languages, where 14 languages produce 91 unique pairs, 13 languages produce 78 pairs, 12 languages produce 66 pairs, etc. Simple numbers of languages are used on the y axis to ease understanding of the plot.
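The definition of normalized median distance, together with the pair counts in the note above, can be sketched in Python as follows. The sketch reads “all pairs with a value for that feature” as all pairs among the languages coded for the feature, which is an interpretation of the wording. The function names, the toy languages, and the one-dimensional positions standing in for geographic locations are hypothetical.

```python
import math
from itertools import combinations
from statistics import median


def normalized_median(values, dist, shared_value):
    """Median distance among pairs sharing `shared_value`, divided by the
    median distance among all pairs of languages coded for the feature.

    values: dict mapping language -> 1, 0, or None (missing data).
    dist:   function (language, language) -> distance.
    Returns None ("na") when fewer than two languages share the value.
    """
    coded = sorted(lang for lang, v in values.items() if v is not None)
    sharing = [lang for lang in coded if values[lang] == shared_value]
    shared_pairs = list(combinations(sharing, 2))
    if not shared_pairs:
        return None
    all_pairs = list(combinations(coded, 2))
    return median(dist(a, b) for a, b in shared_pairs) / median(
        dist(a, b) for a, b in all_pairs
    )


# Hypothetical languages placed on a line (positions in km) for illustration.
positions = {"A": 0, "B": 100, "C": 200, "D": 1000}
feature = {"A": 1, "B": 1, "C": 1, "D": 0}


def line_distance(a, b):
    return abs(positions[a] - positions[b])


close_cluster = normalized_median(feature, line_distance, 1)
single_language = normalized_median(feature, line_distance, 0)

# Unique pairs for n languages, n(n - 1)/2, as in the note above.
pair_count = {n: math.comb(n, 2) for n in (14, 13, 12)}
```

In the toy data, the three languages coded 1 sit close together, so their normalized median distance falls well below 1, while the value 0 is held by a single language and therefore has no pair and no distance.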

Table 4.4 Ranges of features and types ranked by normalized median distance (N = 162)

Ranked features | Stable | Unstable | %U | Template | Contents | %C | “Contents” breakdown
20 lowest | 9 | 11 | 55% | 8 | 12 | 60% | 6S, 6U
30 lowest | 15 | 15 | 50% | 12 | 18 | 60% | 10S, 8U
40 lowest | 20 | 20 | 50% | 16 | 24 | 60% | 14S, 10U
50 lowest | 24 | 26 | 52% | 18 | 32 | 64% | 17S, 15U
60 lowest | 30 | 30 | 50% | 22 | 38 | 63.3% | 20S, 18U
70 lowest | 34 | 36 | 51.4% | 28 | 42 | 60% | 22S, 20U
81 lowest | 38 | 43 | 53.1% | 37 | 44 | 54.3% | 23S, 21U
81 highest | 40 | 41 | 50.6% | 38 | 43 | 53.1% | 22S, 21U
70 highest | 34 | 36 | 51.4% | 35 | 35 | 50% | 19S, 16U
60 highest | 30 | 30 | 50% | 28 | 32 | 53.3% | 17S, 15U
50 highest | 25 | 25 | 50% | 24 | 26 | 52% | 13S, 13U
40 highest | 19 | 21 | 52.5% | 20 | 20 | 50% | 8S, 12U
30 highest | 14 | 16 | 53.3% | 16 | 14 | 46.7% | 5S, 9U
20 highest | 9 | 11 | 55% | 10 | 10 | 50% | 3S, 7U
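The %U and %C columns of Table 4.4 are simple shares of each range, which the following Python sketch recomputes for three rows copied from the table. The function and variable names are hypothetical, and rounding to one decimal place is an assumption based on how the percentages are printed.

```python
def share(part, other):
    """Percentage that `part` represents of `part + other`, to one decimal."""
    return round(100 * part / (part + other), 1)


# (range, stable, unstable, template, contents), copied from Table 4.4.
rows = [
    ("20 lowest", 9, 11, 8, 12),
    ("50 lowest", 24, 26, 18, 32),
    ("81 lowest", 38, 43, 37, 44),
]

percent_unstable = {name: share(u, s) for name, s, u, _, _ in rows}
percent_contents = {name: share(c, t) for name, _, _, t, c in rows}
```

In each row the stable and unstable counts, and likewise the template and contents counts, add up to the size of the range, so both percentages are computed over the same set of ranked feature values.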

The proportions of template and contents feature types show a bit more imbalance, with the percentage of contents features in the low 60% bracket for most of the lower normalized distances. The highest percentage of contents features (%C, at 64%) occurs in the range of the 50 lowest normalized median distances. The final column of Table 4.4 shows the breakdown of stable and unstable secondary categorizations within the contents category, and indeed stable features outnumber unstable features throughout the range of lower rankings. These counts indicate that contents features and specifically stable contents features are best at capturing general areal relationships in this dataset.

4.3 Summary of quantitative assessments

The first subsection examined a matrix of linguistic distance among pairs of languages based on shared features, produced using SplitsTree4. Hypotheses of relationship among the languages were investigated in Section 4.1 with representations of linguistic distances by feature type within pre-defined groups. Here we saw that all stable features are not alike: while the comparison of all stable and unstable features was inconclusive in distinguishing the Chibchan family, the subset of stable template features was best at identifying the genealogical group. The perspective from linguistic distances in the four small regional groups was less fruitful, suggesting in fact that diverse categories of stable features were slightly better at characterizing individual groups. Hypotheses on


the roles of unstable and/or contents features in reflecting local areas were not confirmed; instead, the role of stable template features as predictors of family relations was given a small measure of support in the results for the Isthmian group, composed of three Chibchan languages.

The second subsection took a more general approach to the question of areal relations, calculating a normalized median geographic distance between pairs of languages that shared values for each feature. The overall picture suggested a rather homogeneous distribution of feature type. However, a closer look at proportions of feature categories in graded ranges of distance revealed a small but consistent predominance of contents features among the lower distances, with more stable contents than unstable contents, in a trend that peaked and then faded as the distances increased. The consistently uniform distribution of stable and unstable features at every range of distance was surprising; this picture may reflect the negotiated nature of the ranking from which features and labels were drawn, and it certainly reflects the complex character of structural stability. The prevalence of stable contents over unstable contents features as predictors of areal relation was also surprising, or counterintuitive, as one might have logically identified unstable contents as the category of features most susceptible to change. This outcome invites further investigation of the impact of social constraints on linguistic change.

5 Conclusions

If any linguistic feature can be borrowed (Thomason and Kaufman 1988; Curnow 2001), then we need something beyond the linguistic system to predict what will and will not be borrowed. Feature frequencies based on existing descriptions and inventories without weighting for geographic or sociohistorical factors are only part of the story. The fourteen languages in this study are very similar typologically, and at least half of them have occupied stable geographic locations for millennia. However, too many languages with insufficient documentation were missing from an ideal linguistic profile of the Isthmo-Colombian area. Furthermore, geography alone does not dictate the quality of contact; the importance to the linguistic system of simple proximity will vary, and most details of the social history are yet unknown. What are the linguistic impacts of “down-the-line” contact, and among which parts of society did this contact occur? How can the effects of sociopolitical transitions, of the network, corporate, or any other type, be incorporated into models of language change?

Genealogical classification of South American languages based on lexical data does not seem to match up with the interesting stories told by structural features, and indeed the record of genealogical inheritance is only part of the history of a language and of its speakers (e.g. Ross 2003). Among neighboring languages, even lexical items are sometimes consciously conserved to maintain


social difference, while at other times these serve as a good reflection of contact relations. How can this psycho-social parameter of speaker communities be operationalized for interpreting patterns of convergence and difference among all types of linguistic features?

This study has attempted to go beyond stability as defined by families and frequencies, proposing other categorizations of structural features that could be useful in tracing areal relations among languages in a specific social setting. The premise was that template and contents categorization, based partly on psychological notions, might mimic proposed contours of cultural exchange, might be a productive way to define linguistic practices that are sensitive to group membership, and might in some way be comparable to differences between basic and non-basic vocabulary. These preliminary proposals on the essence and predictive powers of template and contents features led to conflicting hypotheses about the role of each type in replicating areal patterns in the data, whether reflecting cultural convergence with speakers of neighboring languages or in providing a mechanism for the maintenance of separate cultural identities. Results from such a tiny database can only be suggestive, yet they do suggest that refining categories of “stable” and “unstable” with notions that reflect properties of human interaction will bear fruit. Studies of linguistic prehistory require the quantification and incorporation of data that delineate the social scenario of language contact; this study was meant to take a modest step in that direction.

Appendix

Table 4.5 Feature table

Function | Question (YES = 1, NO = 0) | Langs with 1 | Normd-median dist (1) | Langs with 0 | Normd-median dist (0)
STABLE
contents | S1. Is there one or more uvular occlusive phoneme? | 0 | na | 14 | 1
contents | S2. Is there a non-labial glottalized occlusive? | 1 | na | 13 | 0.97659
contents | S3. Is there a velar nasal phoneme? | 6 | 0.62192 | 8 | 0.91161
contents | S4. Is there a nasality contrast for vowels? | 10 | 1.06739 | 4 | 0.92763
contents | S5. Is there a rounding contrast for non-front (central or back) vowels of the same height? | 7 | 1 | 7 | 0.84348
contents | S6. Are there tonal contrasts? | 3 | 0.76902 | 11 | 1.04853
contents | S7. Does the language lack bilabials, fricatives, or nasals? | 1 | na | 13 | 1.00418

Table 4.5 (cont.)

Function | Question (YES = 1, NO = 0) | Langs with 1 | Normd-median dist (1) | Langs with 0 | Normd-median dist (0)
contents | S8. Is there an opposition between masculine and feminine personal pronouns? | 2 | 1.23741 | 12 | 0.95645
contents | S9. Is there a distinction between inclusive and exclusive for personal pronouns? | 6 | 0.76902 | 8 | 1.20527
contents | S10. Are there numeral classifiers? | 3 | 0.46197 | 11 | 1.20956
template | S11. Does the language have pronominal possessive prefixes? | 5 | 1.25670 | 9 | 0.81900
template | S12. Does the language have pronominal possessive suffixes? | 3 | 0.62192 | 11 | 1.08528
contents | S13. Is there a morpheme that marks genitive case in inalienable possession? | 3 | 0.19295 | 11 | 0.87454
contents | S14. Is there a morpheme that marks genitive case in alienable possession? | 5 | 1.05003 | 9 | 1.02218
contents | S15. Does the language mark past tense? | 10 | 0.95168 | 4 | 1.09921
template | S16. Are there prefixes to indicate tense or aspect? | 2 | 2.00450 | 12 | 0.95645
template | S17. Are there suffixes to indicate tense or aspect? | 13 | 0.97659 | 1 | na
template | S18. Is there VO order in transitive clauses? | 1 | na | 13 | 0.97659
template | S19. Is there VS (verb-agent) order in transitive clauses? | 1 | na | 13 | 0.97659
template | S20. Is there VS (S may be agent or object) order in intransitive sentences? | 1 | na | 13 | 0.97659
template | S21. Does the language have case prefixes or prepositional clitics? | 1 | na | 13 | 0.97659
template | S22. Does the language have case suffixes or postpositional clitics? | 12 | 1.02052 | 2 | 0.29494
template | S23. Is the basic order of constituents SOV? | 12 | 0.94343 | 2 | 0.19295
template | S24. Is nominal plurality marked with an affix or clitic? | 10 | 1.00836 | 4 | 1.19295
template | S25. Is the order of adpositions and nouns as follows: noun – postposition or noun – case suffixes? | 14 | 1 | 0 | na
template | S26. Is the order of the noun that is possessed and the noun that indicates the possessor (the genitive) N – Gen? | 4 | 1.10451 | 10 | 1.04853


Loretta O’Connor

Table 4.5 (cont.)

| Function | Question (YES = 1, NO = 0) | Langs with 1 | Normd-median dist (1) | Langs with 0 | Normd-median dist (0) |
|---|---|---|---|---|---|
| template | S27. Is the order of the noun that is possessed and the noun that indicates the possessor (the genitive) Gen – N? | 13 | 0.97659 | 1 | na |
| template | S28. Is the order of the adjective and the noun A – N? | 2 | 1.28991 | 12 | 0.96400 |
| template | S29. Is the order of numerals with respect to the indefinite nominal phrase Num – N? | 6 | 1.20956 | 8 | 0.75712 |
| template | S30. Is the order of the demonstrative and the noun Dem – N? | 10 | 0.99957 | 4 | 1.12406 |
| template | S31. Is the order of the interrogative word and the clause obligatorily question word – clause (i.e., are question words positioned initially)? | 6 | 1.33705 | 8 | 0.76987 |
| contents | S32. Is a zero copula possible for at least some predicate nominals (non-overt copula)? | 11 | 0.97632 | 3 | 1.57167 |
| contents | S33. Does the language use the same strategy to encode nominal and locational predicates? | 5 | 1.00839 | 8 | 0.86227 |
| contents | S34. In NP conjunction, is ‘and’ the same as ‘with’? | 4 | 0.72709 | 9 | 0.88195 |
| contents | S35. Does the language have a case system that does not distinguish between the agent or patient of an intransitive verb and the patient of a transitive verb? | 5 | 0.78637 | 9 | 1.07639 |
| contents | S36. Does the language treat S and O the same in pronouns? | 5 | 0.78889 | 9 | 1.21384 |
| contents | S37. Does the language treat S and O the same in verbal person marking? | 3 | 1.08849 | 11 | 0.96122 |
| contents | S38. Does the language have a case system that distinguishes between the agent of an intransitive action verb and the patient of intransitive process verbs? | 2 | 1.06739 | 12 | 1.02994 |
| contents | S39. Does the language distinguish between Sa and So in pronouns? | 2 | 1.06739 | 12 | 1.02994 |
| contents | S40. Does the language distinguish between Sa and So in verbal person marking? | 3 | 0.19295 | 11 | 0.87454 |
| template | S41. Do predicative adjectives have verbal encoding? | 5 | 1.17367 | 9 | 0.94343 |


Table 4.5 (cont.)

| Function | Question (YES = 1, NO = 0) | Langs with 1 | Normd-median dist (1) | Langs with 0 | Normd-median dist (0) |
|---|---|---|---|---|---|
| template | S42. Do predicative adjectives have non-verbal encoding? | 10 | 0.93518 | 4 | 1.40936 |
| contents | S43. Does the language encode direct evidentiality (perceived with senses)? | 3 | 1.03268 | 11 | 0.97686 |
| contents | S44. Does the language encode indirect evidentiality (hearsay, inference, etc)? | 5 | 0.98795 | 9 | 0.73613 |
| UNSTABLE | | | | | |
| contents | U1. Is there a voiceless lateral fricative phoneme? | 0 | na | 14 | 1 |
| contents | U2. Is there a voiced lateral approximant? | 8 | 0.90524 | 6 | 0.88804 |
| contents | U3. Is there a simple lateral vibrant phoneme? | 3 | 1.08849 | 11 | 0.96122 |
| contents | U4. Is there definite marking distinct from demonstratives? | 5 | 0.98666 | 9 | 1.06744 |
| contents | U5. Is there inflection for indicating the (grammatical) person on intransitive verbs? | 10 | 1.20956 | 4 | 0.62256 |
| contents | U6. Is there inflection for indicating the (grammatical) person of the agent on transitive verbs? | 12 | 1.10778 | 2 | 0.46197 |
| contents | U7. Is there inflection for indicating the (grammatical) person of the object on the transitive verb? | 8 | 1.25402 | 6 | 0.74523 |
| template | U8. Is there zero realization of at least some third person singular S forms? | 8 | 0.97814 | 6 | 0.63188 |
| template | U9. Is there zero realization of at least some third person plural S forms? | 5 | 1.26955 | 9 | 0.79435 |
| template | U10. Is nominal plural marking obligatory? | 1 | na | 13 | 1.00418 |
| contents | U11. Are nouns denoting humans marked for plural? | 12 | 0.96427 | 2 | 1.21813 |
| contents | U12. Are nouns denoting animates marked for plural? | 8 | 0.86972 | 6 | 0.87454 |
| contents | U13. Are nouns denoting inanimates marked for plural? | 5 | 1.24973 | 9 | 0.98816 |
| contents | U14. Is an associative or collective plural distinguished from the additive plural? | 4 | 1.01023 | 10 | 1 |
| template | U15. Is plurality in independent pronouns expressed with unanalyzable person-number stems? | 8 | 1.15599 | 6 | 0.84348 |


Table 4.5 (cont.)

| Function | Question (YES = 1, NO = 0) | Langs with 1 | Normd-median dist (1) | Langs with 0 | Normd-median dist (0) |
|---|---|---|---|---|---|
| template | U16. Is plurality in independent pronouns expressed with a stem and a nominal plural affix? | 2 | 0.79451 | 12 | 1.05442 |
| template | U17. Is plurality in independent pronouns expressed with a stem and a pronominal plural affix? | 4 | 1.02052 | 10 | 1.06750 |
| contents | U18. Do second person pronouns encode a politeness distinction? | 0 | na | 14 | 1 |
| template | U19. Are pronominal subjects obligatory pronouns in subject position? | 1 | na | 13 | 1.06391 |
| template | U20. Are pronominal subjects verbal affixes? | 9 | 1.13242 | 5 | 0.81900 |
| template | U21. Are pronominal subjects clitics on a variable host? | 1 | na | 13 | 0.97659 |
| template | U22. Is the indefinite article the same as ‘one’? | 7 | 1 | 7 | 0.97632 |
| template | U23. Is the indefinite article distinct from ‘one’? | 4 | 0.58940 | 10 | 1.20956 |
| contents | U24. Do demonstratives indicate a 2-way contrast? | 5 | 0.56873 | 9 | 0.79532 |
| contents | U25. Do demonstratives indicate a 3 (or more)-way contrast? | 9 | 0.79532 | 5 | 0.56873 |
| contents | U26. Is the coding of comitatives and instrumentals identical? | 4 | 1.28991 | 10 | 0.97686 |
| template | U27. Are nominal and verbal conjunction expressed with the same marker? | 1 | na | 12 | 0.97938 |
| contents | U28. Is there inflectional marking of a future/non-future distinction? | 12 | 1.01634 | 2 | 1.10992 |
| contents | U29. Does the language mark a perfective/imperfective distinction? | 11 | 0.99957 | 3 | 0.47418 |
| contents | U30. Does the language mark the perfect? | 8 | 0.91349 | 6 | 1.22348 |
| template | U31. Does the perfect marker come from ‘finish’ or ‘already’? | 0 | na | 14 | 1 |
| contents | U32. Is there a morphologically dedicated second singular as well as second plural imperative? | 7 | 0.95168 | 7 | 0.87529 |
| contents | U33. Is there a morphologically dedicated second person imperative that does not distinguish between singular and plural? | 4 | 1.13633 | 10 | 0.97632 |


Table 4.5 (cont.)

| Function | Question (YES = 1, NO = 0) | Langs with 1 | Normd-median dist (1) | Langs with 0 | Normd-median dist (0) |
|---|---|---|---|---|---|
| contents | U34. Is the prohibitive expressed with a special negative marker (not found in declaratives)? | 10 | 0.95760 | 3 | 1.11459 |
| contents | U35. Is the prohibitive expressed with a special imperative (not the normal 2S imperative)? | 6 | 1.21551 | 7 | 0.89551 |
| template | U36. Are evidentials marked on the verb as an affix or clitic? | 5 | 1.02052 | 9 | 0.84524 |
| template | U37. Are evidentials marked with a separate particle? | 2 | 0.97632 | 12 | 0.98061 |
| template | U38. Is a polar question indicated by a question particle? | 6 | 1.03268 | 8 | 1.00011 |
| template | U39. Is a polar question indicated by verbal morphology? | 9 | 1.15545 | 5 | 0.55742 |
| template | U40. Does the polar question indicator come at the beginning of the sentence? | 0 | na | 14 | 1 |
| template | U41. Does the polar question indicator come at the end of the sentence? | 7 | 1.06739 | 7 | 0.95168 |
| template | U42. Is clausal negation in declarative sentences signaled with a negative affix? | 10 | 0.99957 | 4 | 0.50809 |
| template | U43. Is clausal negation in declarative sentences signaled with a negative particle or word? | 8 | 0.69108 | 6 | 1.07349 |
| template | U44. Is clausal negation in declarative sentences signaled with a negative auxiliary verb? | 4 | 1.51596 | 10 | 0.97686 |
| template | U45. Can the structure of the negative be identical to the structure of the affirmative, except for the presence of the negative marker(s)? | 11 | 0.93518 | 3 | 0.97632 |
| template | U46. Are third person pronouns related to demonstratives? | 3 | 1.36169 | 11 | 0.97632 |

5 The Andean foothills and adjacent Amazonian fringe

Rik van Gijn

This chapter on the distribution of Andean and Amazonian features in the upper Amazon area shows that the transition from the Andean to the Amazonian area is gradual and complex. This is consistent with the intricate history of contact between the different ethnic groups of the area, and it presents a strong argument for connecting the research traditions associated with these areas. Morphosyntactic influence generally seems to represent older contact situations than phonological influence.

1 Introduction

South America is generally regarded as unusually diverse linguistically, especially in terms of genealogical units (including an exceptionally high number of isolates), but also in terms of the range of possibilities one finds in grammatical constructions. Nevertheless, several authors have also observed regional traits of varying extension that cross family boundaries. Some of these characteristics are shared widely by South American languages in general, and some are restricted to particular areas of varying size.

Two macro-areas within South America have received recurring attention from scholars in terms of shared grammatical features: the Amazon basin and the Andes (see also Birchall, this volume). The middle Andes, ranging from northern Ecuador to central Chile and Argentina, has been described as “a self-contained area that proved resistant to linguistic influences from the outside” (Adelaar 2012b: 586). Contact between the different languages that are and were spoken along the Andean mountain range, especially those spoken in the inter-Andean valleys and along the coast on the western slopes, left its imprint on the languages in the form of a number of shared traits (see e.g. Büttner 1983; Torero 2002; Adelaar 2012b). The Amazon basin is more diffuse typologically than the middle Andes, but several scholars have observed shared traits across language families over large territories (e.g. Derbyshire and Pullum 1986; Derbyshire 1987; Derbyshire and Payne 1990; Payne, D. 1990; Dixon and Aikhenvald 1999).

This paper was partly prepared at the Radboud University Nijmegen, supported by NWO grant 275–89–006, which is gratefully acknowledged. I thank the editors for useful comments on earlier drafts of this paper, and Françoise Rose, Lev Michael, and Marine Vuillermet for generously providing unpublished material and/or personal comments on specific data points. I furthermore thank Harald Hammarström for his invaluable help with the statistics. Remaining errors are mine.

In spite of the relative self-containedness of the Andean cultural region, and perhaps also in spite of the fact that Andean and Amazonian studies seem to form separate worlds, the transition from the Amazon basin into the Andes is clearly not abrupt: the two areas shade off into each other. Moreover, there is archaeological and ethnohistorical evidence that contact between the highlands and the upper Amazon area used to be much more intensive until quite recently, continuing into the post-Columbian era (Taylor 1999).

In this chapter, I take a closer look at the area where the Amazon basin and the Andes meet, an area that I will term the foothill-fringe (FF) area, covering the eastern slopes of the Andes and the westernmost fringe of the Amazon basin. The chapter is explorative in the sense that it does not aim to test specific hypotheses about this area (there is, for instance, no underlying claim that the foothill-fringe forms a linguistic area). Rather, it tries to take stock of the distribution of linguistic features of the FF languages, especially those that have been claimed to be important areal characteristics of the Amazonian and Andean areas. Many of the FF languages certainly had close historical connections with the Andean cultures (see e.g. Adelaar 2012b), as well as with Amazonian cultures like Arawakan and Tupian, including longer-distance riverine connections (Taylor 1999). In fact, a good many FF languages are classified as Arawakan or Tupian.

The chapter is structured as follows. In Section 2, I define what I mean by the FF area and introduce the languages that represent the area in this paper. Section 3 is devoted to a discussion of “Amazonian” and “Andean” linguistic features, as they have been proposed in the literature. Section 4 describes the approach taken to measuring distances between the languages of the sample, as well as the results. Section 5 presents the conclusions.

2 The foothill-fringe area

The eastern slopes, or the foothills, of the Andean mountain range and the western fringe of the Amazon basin are among the genealogically most diverse areas of the continent. The region is home to many isolates and small language families, as well as representatives of larger families that have extended into this transition zone. Defining this area is not an easy task, because it is essentially an area between two other zones. Therefore we will first direct our attention to the zones that border the FF area.


To the west, a number of successive Andean civilizations have occupied varying parts of the Andean mountains. The last of these indigenous civilizations, the Inca civilization, had its greatest extension as recently as the late fifteenth to early sixteenth century, when its influence stretched along the mountain range all the way from northern Ecuador/southern Colombia to central Chile (see Van de Kerke and Muysken, this volume). This relatively recent expansion has left a firm linguistic mark on the Andean landscape, not only in terms of the spread of the Quechuan languages and the extensive mutual interference with Aymaran languages, but also in terms of shallower contact with languages spoken on the outskirts of the empire, in Chile and Ecuador.

To the east of the FF area, two major expansive movements took place over the last millennia: that of the Arawakan culture (see Eriksen and Danielsen, this volume) and later that of the Tupí-Guaranian culture (see Eriksen and Galucio, this volume). These expansions were mostly by river and promoted the spread of Arawakan and Tupí-Guaranian languages. Different opinions exist about the homeland of these cultures, but it is clear that both expanded (among other directions) west towards the Andes.

Map 5.1 shows the maximum expansion of Quechuan and Aymaran languages in the Andes, as well as the probable maximum extensions of Tupian, Arawakan, and Panoan languages. Given that the different groups expanded at different times (see below), the map should not be regarded as representing the distribution of languages at any single point in history.

Roughly speaking, the FF region as understood in this chapter comprises the strip of land between the Andes and the Amazon, delimited by the river systems that flow together into the Amazon River, resulting in a geographic range from northern Ecuador to southern Bolivia. This territory can be divided into three major sub-areas on the basis of the river systems: a northern system defined by the Napo and upper Marañon Rivers, which join (with the Ucayali) to form the Amazon River near Iquitos; a central system where two major rivers (the Huallaga and the Ucayali) flow in a general south–north direction across Peru, joining the Marañon in northern Peru; and finally a southern system (Madre de Dios-Beni-Mamoré) covering southern Peru and Bolivia.

The position of the FF languages in the midst of a number of cultural-linguistic expansions raises the question of how speaker communities have dealt with these expansions and, more particularly, what imprint, if any, this cultural interaction has made on the languages that they speak. Reviewing all languages of this area is at this point beyond our reach, since data are scanty, and the time span for the current chapter was not long enough. Therefore I confine myself to reviewing a representative sample of languages, listed in Table 5.1 (the number refers to the number on Map 5.2).


Map 5.1 The greatest extent of the Quechuan, Aymaran, Panoan, Tupian, and Arawakan expansions

3 Andean versus Amazonian features

A number of different authors have proposed “areal” or “regional” features both for an Amazonian and for an Andean area. The proposals of these authors are not always easy to compare, since there is no clear consensus with respect to the precise extensions of the areas. This is the case especially for the Amazonian area. Some authors look at a limited number of language families that cover a broad territory (see e.g. Payne, D. 1990); others look at a sample of languages spoken in different parts of Amazonia (Derbyshire and Pullum 1986); and yet others look at the entire Amazon basin, which contains a multitude of families and may also contain smaller linguistic areas (see e.g. Dixon and Aikhenvald 1999). As a consequence, the results of these studies can be incompatible. In the discussion of the features, I will indicate the problematic points and the way I treat these problems. First, however, I briefly introduce the sources for the features in Table 5.2.

Table 5.1 The languages in the sample and their sources

| No. | Language (affiliation) | Source |
|---|---|---|
| 1 | Cofán (isolate) | Borman 1962, Fischer and Van Lier 2011, Tobar 1995 |
| 2 | Secoya (Tucanoan) | Johnson and Levinsohn 1990 |
| 3 | Imbabura Quechua (Quechuan) | Cole 1982 |
| 4 | Waorani (isolate) | Peeke 1973, Saint and Pike 1962 |
| 5 | Záparo (Zaparoan) | Peeke 1991 |
| 6 | Taushiro (isolate) | Alicea Ortiz 1975a, 1975b |
| 7 | Achuar-Shiwiar (Jivaroan) | Fast and Fast 1981, Fast, Fast and Fast 1996 |
| 8 | Shuar (Jivaroan) | Gnerre 1999 |
| 9 | Aguaruna (Jivaroan) | Overall 2007 |
| 10 | Urarina (isolate) | Olawsky 2006 |
| 11 | Muniche (isolate) | Michael et al. 2013, Michael et al. 2009, Michael, p.c. |
| 12 | Cocama (Tupí-Guaranian) | Vallejos-Yopán 2010 |
| 13 | Shipibo-Konibo (Panoan) | Valenzuela 2003 |
| 14 | Cholón (Hibito-Cholon) | Alexander-Bakkerus 2005 |
| 15 | Cashibo (Panoan) | Zariquiey Biondi 2011 |
| 16 | Yanesha’ (Arawakan) | Duff-Trip 1997, 1998 |
| 17 | Nomatsiguenga (Arawakan) | Shaver 1996 |
| 18 | Ashéninka Perené (Arawakan) | Mihas 2010 |
| 19 | Nanti (Arawakan) | Michael 2008 |
| 20 | Cuzco Quechua (Quechuan) | Cusihuamán 2001 |
| 21 | Amarakaeri (Harakmbet) | Helberg-Chávez 1984 |
| 22 | Ese Ejja (Tacanan) | Vuillermet 2012; p.c. |
| 23 | Cavineña (Tacanan) | Guillaume 2008 |
| 24 | Movima (isolate) | Haude 2006 |
| 25 | Trinitario (Arawakan) | Rose in press |
| 26 | Leko (isolate) | Van de Kerke 2009 |
| 27 | Southern Aymara | Hardman 2001 |
| 28 | Mosetén (Mosetenan) | Sakel 2004 |
| 29 | Yurakaré (isolate) | Van Gijn, 2006, in prep |
| 30 | Yuki (Tupí-Guaranian) | Villafañe 2004 |


Map 5.2 The languages in the sample and their geographic distribution

In what follows I will discuss proposals made by these authors for widely shared features in the Amazon and Andes with respect to phonology, morphology, syntax, and lexicon. I favor those characteristics that contrast the Andean area with the Amazonian area. Moreover, I favor those characteristics that pertain to languages and language families that are or were spoken in the FF zone between northern Ecuador and southern Bolivia.


Table 5.2 Areal studies of the Amazon and Andean regions used in this study

| Source | Code | Description | Area |
|---|---|---|---|
| Büttner 1983 | b | A lexical, phonological, and structural (broad typological features) comparison of the languages from the central Andes. | and |
| Derbyshire and Pullum 1986 | dp | Survey of a number of morphosyntactic “areal typological similarities” based on a sample of twenty languages. | amz |
| Derbyshire 1987 | d | Report based on a sample of forty languages, which reconfirms some of the Amazonian features mentioned in DP. | amz |
| Payne, D. 1990 | p1 | Survey of morphological characteristics, based on a sample of selected Amazonian languages. | amz |
| Dixon and Aikhenvald 1999 | da | List of features encountered across families in the whole of Amazonia. | amz |
| Payne 2001 | p2 | Review of Dixon and Aikhenvald in which the author criticizes the list of Amazonian features and proposes a number of additional ones. | amz |
| Torero 2002 | t | List of forty features for the middle Andean area, ranging from north Peru to northeast Argentina and Chile; includes proto-languages and extinct language data, also includes some foothill data. | and |
| Adelaar 2012b | a | Overview of the language situation in the central Andes, focusing on structural and lexical traits of the Aymaran and Quechuan language families. | and |

3.1 Phonology and morphophonology

Dixon and Aikhenvald (1999) list the following phonological features, which are explicitly marked as being absent or having different values in the Andean area.

1. one liquid phoneme, frequently a flap
2. affricates outnumber fricatives
3. presence of a high, unrounded central vowel
4. presence of mid vowels
5. contrastive nasalization of vowels


Andean languages, according to Dixon and Aikhenvald, typically have more than one liquid phoneme and a preference for fricatives over affricates in terms of numbers of phonemes. The high unrounded central vowel is mentioned by Torero (2002) as an Andean characteristic with limited extension, as it occurs in Mapudungun (central Chile) and the extinct northern Peruvian coastal language Mochica. He furthermore mentions that it is possibly reconstructable for Puquina, also extinct, which was spoken around Lake Titicaca at the present-day Bolivian–Peruvian border. It should be borne in mind that Torero’s Andean area has a wider extension than the area discussed in this chapter: it extended all the way down to the southern cone and also included the formerly spoken coastal languages. Since Mapudungun and Mochica fall outside the part of the Andes immediately adjacent to the foothill-fringe area, and because the high central vowel is present in members of the most dominant western Amazonian families (Arawakan, Tupí-Guaranian, and Panoan), I include it in the list of Amazonian features for this chapter.

With respect to the mid vowels /e/ and /o/, there are also a number of Andean languages that have mid vowels as phonemes (see Torero 2002: 524; Adelaar 2008a: 26), but the two most dominant language families of the Andes, Quechuan and Aymaran, have three-vowel systems containing only high and low vowels. Nevertheless, this feature should be considered with care, since Adelaar (2008a) reports that some variants of Quechuan and Aymaran have developed phonemic mid vowels, possibly due to Spanish and Portuguese influence. Vowel nasalization is decidedly Amazonian, and does not occur in Andean languages. Payne (2001) adds a sixth, morphophonological feature to the Amazonian list, nasal spreading, noting that the Tupí-Guaranian, Tucanoan, Jê, Panoan, and Makú families all show some form of this characteristic.

6. nasal spread

Moving on to the Andean literature, Torero (2002) distinguishes between a number of levels of feature diffusion (general – wide extension – limited extension – restricted). In the questionnaire I consider the first two groups, plus a subset of the features with limited extension, to the extent that they are found in languages or language families that cover major parts of the Andes along the foothill region as it is considered here. These, however, will be used with some caution, and especially to shed more light on subareas within the general region.¹

1 This particularly means those traits found in at least two of the following languages/language families in Torero’s list: Aymaran, Quechuan, Puquina, Uru-Chipaya, Cunza. Traits limited to coastal languages like Mochica and Sechura and/or to southern cone languages/families Mapudungún, Huarpe, and Cunza are not taken into consideration, since they are or were not spoken in areas adjacent to the foothill-fringe.


General and widely extended
1. presence of a palatalized nasal
2. frequently closed syllables
3. velar–uvular opposition for voiceless stops
4. presence of retroflex affricate

Limited extension
5. glottalization of stops (some Quechuan, Aymaran, Uru-Chipaya)
6. aspiration of stops (some Quechuan, Aymaran, Uru-Chipaya)
7. three-vowel system (Quechuan, Aymaran)

The palatal nasal, the velar–uvular distinction, and the retroflex affricate are mentioned as traits that distinguish the Andes from the Amazonian area (Torero 2002: 523–524). Glottalization and aspiration of obstruents in Quechuan languages are limited to those languages that are situated in southern Peru and Bolivia. This is very probably an Aymaran substrate feature (see e.g. discussion in Büttner 1983 and Adelaar 2012b). The three-vowel system consisting of the phonemes /u/, /i/, and /a/ is also in particular a feature of Quechuan and Aymaran languages (although perhaps not historically – Adelaar 2012b), and is rare in Amazonia.

Andean feature 2 requires a few extra remarks, first of all because it is not an entirely straightforward feature with respect to the Andean area, and second because it must be translated into a question to which an answer can be given in terms of discrete categories. Adelaar (2012b: 601–602) mentions that neither proto-Quechua nor proto-Aymara allowed complex codas in underlying form, but since Aymaran morphophonology contains complex rules that delete phonetic material, surface forms can contain highly complex consonant clusters. Moreover, Adelaar mentions that proto-Aymara may have been more restrictive in terms of the kinds of elements allowed in the coda, although modern Aymaran languages seem to have acquired greater coda tolerance, possibly as a result of contact with Quechuan languages (see Cerrón-Palomino 2008: 47). In addition, many Amazonian languages do allow a few consonants in the coda (usually nasals or fricatives), but tend to have more severe restrictions on what can be present in the coda. Therefore, rather than looking at abstract syllable structure, I analyze the issue of closed syllables as the degree to which restrictions are placed on segments in the coda of a syllable, not counting phonologically deviating words like ideophones, interjections, etc., and looking at underlying syllable structure.² The answer to this question can be based on the percentage of consonant phonemes that can occur in coda position, ranging from 0 to 100, divided into three groups: A: 0–30, B: 31–60, C: 61–100. More Andean-type syllable structures will fall into categories B and C, with Amazonian types in category A.

2 It would actually be preferable to also look at surface codas, since that is the signal that may be transferred from one language to the other, but lack of systematic data prevents this.

Since Andean characteristic 7 inherently contrasts with Amazonian characteristics 3 and 4, they can be collapsed. This leaves a total of twelve contrastive Andean and Amazonian phonological features for analysis: ten general features plus two which are more restricted (Table 5.3).

Table 5.3 The phonological features

| No. | Feature | Source | amz | and |
|---|---|---|---|---|
| 1 | Central high vowel | da | Y | N |
| 2 | Phonemic mid vowels | da | Y | N |
| 3 | Contrastive vowel nasalization | da | Y | N |
| 4 | Palatal nasal | t | N | Y |
| 5 | Velar–uvular opposition for stops | t, a | N | Y |
| 6 | Retroflex affricates | t, a | N | Y |
| 7 | Affricates > fricatives | da | Y | N |
| 8 | Single liquid phoneme | da | Y | N |
| 9 | Closed syllables | t, a | A | C |
| 10 | Nasal spread | p2 | Y | N |
| 11 | Glottalized stops (Peru, Bolivia) | t, a, b | N | Y |
| 12 | Aspirated stops (Peru, Bolivia) | t, a, b | N | Y |
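The three-way coding of the closed-syllable feature described above can be illustrated with a minimal sketch. The function name `coda_category` and its arguments are hypothetical and do not come from the chapter. The treatment of fractional percentages that fall between the stated groups (for example 30.5) is also an assumption: here everything up to and including 30 counts as A, and everything above 30 up to and including 60 counts as B.

```python
def coda_category(n_coda_consonants: int, n_consonants: int) -> str:
    """Return 'A', 'B' or 'C' for the share of consonant phonemes allowed in the coda.

    Groups follow the text: A = 0-30 %, B = 31-60 %, C = 61-100 %.
    Assumption (not from the source): fractional values between two
    groups are assigned to the lower group's upper bound test below.
    """
    if n_consonants <= 0:
        raise ValueError("the consonant inventory must not be empty")
    if not 0 <= n_coda_consonants <= n_consonants:
        raise ValueError("coda consonants must be a subset of the inventory")

    percentage = 100 * n_coda_consonants / n_consonants
    if percentage <= 30:
        return "A"  # Amazonian-type profile
    if percentage <= 60:
        return "B"
    return "C"      # Andean-type profile


# Invented inventories of 20 consonants, for illustration only.
print(coda_category(2, 20))   # 10 % of consonants allowed in the coda
print(coda_category(9, 20))   # 45 %
print(coda_category(15, 20))  # 75 %
```

With these invented figures, a language allowing only two of its twenty consonants in coda position would pattern with the Amazonian profile, while one allowing fifteen would pattern with the Andean profile.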

3.2 Morphosyntax

Both Andean and Amazonian languages are by and large characterized by having verbs with a highly synthetic, agglutinating morphological structure. Although this is a salient feature, it is not contrastive. Nevertheless, a number of contrasting features can still be listed on the basis of the proposals by the different scholars.

The status of argument cross-referencing on the verb is unclear. Derbyshire and Pullum (1986) claim that the tendency for Amazonian languages is to have a set of pronominal affixes for both subject and object participants, whereas Dixon and Aikhenvald (1999) claim that it is typically Amazonian to cross-reference only one core argument on the verb (which may differ according to context). Andean languages often cross-reference both subjects and objects on the verb, so this is potentially a contrastive feature. However, the three families with a large western Amazonian presence (Arawakan, Tupí-Guaranian, Panoan) differ with respect to this parameter. Whereas Arawakan languages usually have cross-reference markers on the verb for both subject and object, Tupí-Guaranian languages conform to Dixon and Aikhenvald’s prototypical Amazonian situation in that they mark one core argument on the verb, and Panoan languages, finally, have “no, incipient, or little developed argument marking in the main verb or auxiliary” (Valenzuela 2010: 68).

What is striking, however, is the number of Amazonian languages that have pronominal prefixes (see Payne, D. 1990: 221). This may be part of a more general difference between Amazonian and Andean languages, in that large-scale Andean languages like Quechuan and Aymaran are exclusively suffixing, whereas in Amazonian languages, prefixing is more common and is present in almost all languages to different degrees (see e.g. Payne, D. 1990; Dixon and Aikhenvald 1999: 9; Torero 2002: 526).

Another opposing feature relating to person markers is that isomorphism between possessors and one of the core arguments is common in Amazonia (Dixon and Aikhenvald 1999), and rather rare in the Andes. In Torero’s data, this is limited to the foothill languages Cholón and Cunza. The isomorphism feature can be extended to languages that do not have bound person markers by taking into account isomorphism on the basis of form parameters such as case marking or special forms of pronouns. I am more wary of basing isomorphism solely on positional encoding, since the possible variation is too limited. Therefore, languages that treat possessive and argument pronouns as the same only in terms of their position with respect to their head are counted as “non-applicable.”

In addition to verbal cross-referencing, many Andean languages employ rich case systems (Torero 2002: 527), including core case markers, whereas Amazonian languages tend to have elaborate applicative systems (Payne, D. 1990), and a rather small set of peripheral case markers (Dixon and Aikhenvald 1999: 8). This characteristic is hard to quantify, since it is difficult to tell what is a rich system and what is a restricted system. Iggesen (2012) classifies languages into nine categories. I will distinguish three categories in a less refined way: (A) small set of case markers or no case marking (0–4), (B) medium set of case markers (5–6), and (C) large set of case markers (>6), where the typical Amazonian profile is “small set of case markers” and the typical Andean profile “large set of case markers.” Aymaran and Quechuan languages moreover have accusative case markers. Core case markers are un-Amazonian, with the exception of Panoan languages, which often have an ergative marker.

Finally, an often mentioned trait of Amazonian languages is their tendency to have ergative alignment, or alignment systems with clear and substantial ergative elements (e.g. Derbyshire 1987). Although the range of the systems encountered in Amazonia is rather great and involves various types of split systems, fully accusative systems appear to be very rare in Amazonia (Dixon and Aikhenvald 1999: 8), so this feature can be contrasted with the Andean languages, such as Quechuan and Aymaran languages, as well as Barbacoan languages, Cunza, and Huarpe, which have accusative systems (Torero 2002: 529). The encoding strategies I consider are constituent order, verbal cross-referencing, and case marking. For a language to be coded as accusative, at least one of these three must follow an accusative pattern, and the others must not give a conflicting signal. I particularly look at NPs in simple clauses, and do not count as accusative any system that has a major alignment split (e.g. based on definiteness, semantic role, etc.).

In the nominal realm, possessive constructions can be contrasted. The typical Amazonian structure involves a head-marked construction, making use of bound person markers (see e.g. Dixon and Aikhenvald 1999: 8). The Andean type often involves dependent marking, sometimes in combination with head marking (Quechuan, Aymaran – see Torero 2002; Adelaar 2012b), sometimes not (Mochica, Huarpe, Barbacoan – Torero 2002: 528). Puquina and Mapudungun both have Amazonian-type possessive structures in that they mark possessive relations on the head by means of person prefixes. Nevertheless, it is reasonable to contrast this feature in terms of Andean versus Amazonian in that the former tend to have dependent-marking strategies, and the latter not.

Another salient Amazonian feature is the presence of a noun class or gender system of some sort. Noun class systems are also encountered in some of the languages that Torero counts as being part of the Andean area (Mochica and Cholón), but as mentioned Mochica is a coastal language (on the west side of the Andes) and Cholón is considered here to be part of the FF area.

In terms of negation, Andean languages display several different strategies: a preposed particle, a suffix, or a combination of those. Torero (2002: 528–529) mentions that the first two strategies are also common in Amazonian languages, especially suffixal negation. This feature is therefore not contrastive enough to include in the questionnaire.

Apart from the aforementioned subject and object cross-referencing and negation marking, a number of further traits are encountered to different degrees both in the Andean and the Amazonian area, and are therefore not contrastive: evidentiality, nominalized subordinate clauses, switch reference, phrase- or sentence-final particles or enclitics, inclusive–exclusive distinction, alienability, incorporation, and lack of passive. The above considerations leave us with a further seven contrastive morphosyntactic characteristics (Table 5.4).


Rik van Gijn

Table 5.4 The morphosyntactic features

No.  Feature                                                           amz  and
13   Prefixes                                                          Y    N
14   Isomorphism of possessor and core verbal argument person markers  Y    N
15   Elaborate case marking system                                     A    C
16   Core case                                                         N    Y
17   Accusative alignment in simple clauses                            N    Y
18   Dependent marking for possession                                  N    Y
19   Classifier or gender systems                                      Y    N
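Read as data, Table 5.4 gives each of the two profiles one value per feature. The sketch below encodes those values and compares profiles. The chapter does not spell out its distance metric, so the measure used here (the unweighted share of mutually scored features on which two profiles differ, with the three-valued feature 15 treated as a plain match or mismatch) is an assumption, as are the names `AMZ`, `AND`, and `profile_distance` and the invented example language.

```python
from typing import Dict, Optional

Profile = Dict[int, Optional[str]]  # feature number -> value, None if unscored

# Values copied from Table 5.4 (features 13-19).
AMZ: Profile = {13: "Y", 14: "Y", 15: "A", 16: "N", 17: "N", 18: "N", 19: "Y"}
AND: Profile = {13: "N", 14: "N", 15: "C", 16: "Y", 17: "Y", 18: "Y", 19: "N"}

def profile_distance(a: Profile, b: Profile) -> float:
    """Share of mutually scored features on which two profiles differ."""
    shared = [f for f in a if f in b and a[f] is not None and b[f] is not None]
    if not shared:
        raise ValueError("no feature is scored for both profiles")
    mismatches = sum(1 for f in shared if a[f] != b[f])
    return mismatches / len(shared)

# Invented foothill-type language: Andean-like case marking, otherwise
# Amazonian-like, with feature 19 left unscored for lack of data.
example: Profile = {13: "Y", 14: "Y", 15: "C", 16: "Y", 17: "N", 18: "N", 19: None}

print(profile_distance(AMZ, AND))
print(profile_distance(example, AMZ))
print(profile_distance(example, AND))
```

Under this measure the two idealized profiles differ on every feature in the table, and a language with mixed values falls somewhere between them; unscored features simply drop out of the comparison.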

Table 5.5 The constituent order features

No.  Feature                       amz  and
20   O before S constituent order  Y    N
21   AN order                      N    Y

3.3 Constituent order

Especially Derbyshire and Pullum (1986) and Derbyshire (1987) give close attention to issues of constituent order. Among the Amazonian constituent order traits, they include O before S (Derbyshire and Pullum 1986) or O-initial (Derbyshire 1987) constituent order in the sentence and the combination of NA, Pr-Pd orders and postpositions. Torero (2002), on the other hand, mentions SOV clause order as an Andean trait with limited but still wide extension (Quechuan, Aymaran, Chipaya, and also true for Barbacoan languages – see Curnow and Liddicoat 1998: 387), and AN and Pr-Pd orders as widely shared features. This results in two contrastive features (Table 5.5). I have chosen the formulation of feature 20 as noted in Table 5.5 because asking for SOV order would have been too restrictive on the Andean-like languages, and asking for O-initial, too restrictive for Amazonian languages (in the sense that “no” as an answer would encompass many more logical possibilities). I will give more detailed information on word order below.
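The reasoning behind the wording of feature 20 can be checked by enumerating the six logically possible orders of S, O, and V. The sketch below is illustrative only: the function name is hypothetical, and representing a basic order as a three-letter string is a simplifying assumption.

```python
from itertools import permutations

def o_before_s(order: str) -> bool:
    """Feature 20: does O precede S in a basic order such as 'SOV'?"""
    order = order.upper()
    if sorted(order) != ["O", "S", "V"]:
        raise ValueError(f"expected a permutation of S, O, V; got {order!r}")
    return order.index("O") < order.index("S")

ALL_ORDERS = sorted("".join(p) for p in permutations("SOV"))

yes_orders = [o for o in ALL_ORDERS if o_before_s(o)]
no_orders = [o for o in ALL_ORDERS if not o_before_s(o)]
o_initial = [o for o in ALL_ORDERS if o.startswith("O")]

print("O before S:", yes_orders)
print("S before O:", no_orders)
print("O-initial: ", o_initial)
```

Asking for SOV would place five of the six orders under "no", and asking for O-initial would place four of the six there, whereas "O before S" divides the six orders three to three.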

3.4 Lexicon

A final domain in which proposals have been made that allow us to contrast an Amazonian profile with an Andean profile is the lexicon. One salient feature for Andean languages is a decimal counting system, shared by many of the languages and language families: Quechuan, Aymaran,[3] Puquina, Mochica, Cholón, Uru-Chipaya, Cunza, Huarpe, and Mapuche. Although we cannot contrast this trait as such with the Amazonian type of numeral systems, Dixon and Aikhenvald (1999: 9) mention that there is generally only a small class of numerals in Amazonian languages. This means that we can set up an Andean–Amazonian contrast on the basis of elaboration of the numeral system, where a stable numeral system that goes to at least 10 (and that does not contain Spanish or Portuguese loans) is typically Andean, whereas smaller systems are typically Amazonian.

A final lexical characteristic of Amazonian languages is mentioned by Payne (2001): the presence of an elaborate class of ideophones. Ideophones can be defined as “marked words that depict sensory imagery” (Dingemanse 2011: 25), i.e. they are words that typically show deviating characteristics, especially in their phonology and phonetics but often also in their morphological and/or syntactic behavior, that depict a situation in such a way that it evokes a perceptual sensation or perceptual knowledge. This goes well beyond the arguably universal onomatopoeia, as ideophones can depict at higher levels of abstraction, often involving perceptual modalities other than hearing, such as vision, taste, smell, and touch.

Table 5.6 completes the list of twenty-three contrastive features. In the comparisons that are discussed in the next sections, I score the features for each of the thirty-two languages in the sample if the available data allow for it. By regarding the Andean profile and the Amazonian profile as “language” profiles, on a par with the profiles of the FF languages, I can calibrate the distance of an FF language to the Andean and Amazonian type.

Table 5.6 The lexicon features

No.  Feature      amz  and
22   Numerals >9  N    Y
23   Ideophones   Y    N

4 Results and discussion

4.1 Linguistic distance

Figure 5.1 represents the distance between languages by taking into account all of the twenty-three features discussed above, without any differences in weight for the features. The Andean and Amazonian profiles (which are maximally contrastive) are treated as if they were languages; they are boxed in the network. The distances between the languages are visualized in a Neighbor-Net network (Bryant and Moulton 2004), a distance-based method that shows splits between languages, but also signals that go against proposed splits in the form of reticulation or ‘webbing’.

Figure 5.1 NeighborNet representation of the distances between the languages of the sample

[3] Aymaran has in fact historically a five-term system, but this is supplemented with Quechua loans (Cerrón-Palomino 2000, Van de Kerke 2009, Adelaar 2012b).

A first major split we can observe is indicated by the horizontal thick dotted line in Figure 5.1. The group of languages above the dotted line contains all the Arawakan languages as well as a few others, like Cholón, Muniche, Movima, and Záparo, and towards the right the isolate Yurakaré and the Tupí-Guaranian language Yuki. The group below the dotted line contains the Quechuan, Tacanan, Panoan, and Jivaroan languages, as well as Aymara (Aymaran), Secoya (Tucanoan), Amarakaeri (Harakmbet), Cocama (Tupí-Guaranian) and the (semi-)isolates Leko, Waorani, Cofán, Mosetén, Taushiro, and Urarina. For ease of reference, I will refer to the group above the dotted line as “Amazonian” and to the group below the dotted line as “Andean.” If we contrast these two blocks, the binary features that contribute most to the contrast between them are (ordered according to contrast, the highest contrast first):


1. presence of core case markers (0% of the ‘Amazonian’ group versus 89% of the ‘Andean’ languages);
2. isomorphism of possessor and core verbal argument (91% Amazonian, 32% Andean);
3. dependent marking for possession (9% Amazonian, 68% Andean);
4. the presence or absence of an elaborate case marking system (64% of Amazonian languages have a small case marker inventory vs. 5% of the Andean group, and 18% of the Amazonian group and 84% of the Andean group have an elaborate case marking system);[4]
5. the presence of gender/classifier systems (73% Amazonian, 16% Andean);
6. accusative alignment (27% Amazonian, 74% Andean).

Both groups of languages show moreover a secondary contrast between languages to the left of the graph and languages to the right. This contrast is much less clear and seems more reminiscent of a continuum, or perhaps a tripartite distinction, and is indicated by the two thin vertical lines. If we take the most contrastive languages on the left–right axis, we can again distinguish an “Andean” group on the left, consisting of Arawakan languages Ashéninka, Nanti, and Yanesha’, and Cholón (Hibito-Cholón) and the isolate Muniche, as well as the Quechuan languages Imbabura and Cuzco Quechua, the Tacanan languages Cavineña and Ese Ejja, Aymara, and the isolate Leko. To the right of the graph we can distinguish an “Amazonian” group consisting of Yurakaré (isolate), Tupí-Guaranian languages Yuki and Cocama, the three Jivaroan languages Aguaruna, Achuar, and Shuar, Panoan Shipibo and Cashibo, isolate Urarina, and Amarakaeri (Harakmbet). For this axis the most contrastive features are the following:

1. basic adjective-noun order (73% of Andean, 0% of Amazonian);
2. phonemic central high vowel (18% Andean, 90% Amazonian);
3. the presence of more than one liquid phoneme (82% Andean, 10% Amazonian);
4. nasal spread (0% Andean, 70% Amazonian);
5. phonemic vowel nasalization (0% Andean, 60% Amazonian);
6. phonemic palatal nasal (100% Andean, 40% Amazonian).
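Percentages of the kind listed above can be derived from a table of binary feature scores. The following sketch is an assumption about such a procedure, not the script used for the chapter: it takes the share of "Y" scores per group, skips languages for which a feature is unscored, and ranks features by the absolute difference between the two shares. The language names, feature names, and scores are invented toy data.

```python
from typing import Dict, List, Optional, Tuple

# language -> feature -> "Y", "N", or None when the feature is unscored
Scores = Dict[str, Dict[str, Optional[str]]]

def share_yes(scores: Scores, group: List[str], feature: str) -> float:
    """Percentage of scored languages in `group` with `feature` scored 'Y'."""
    values = [scores[lang].get(feature) for lang in group]
    scored = [v for v in values if v is not None]
    if not scored:
        raise ValueError(f"feature {feature!r} is unscored in this group")
    return 100.0 * scored.count("Y") / len(scored)

def rank_by_contrast(
    scores: Scores, group_a: List[str], group_b: List[str], features: List[str]
) -> List[Tuple[str, float, float]]:
    """Return (feature, % in A, % in B), highest contrast first."""
    rows = [
        (f, share_yes(scores, group_a, f), share_yes(scores, group_b, f))
        for f in features
    ]
    return sorted(rows, key=lambda row: abs(row[1] - row[2]), reverse=True)

# Invented toy data for four hypothetical languages.
toy: Scores = {
    "lang_a1": {"core_case": "N", "gender": "Y"},
    "lang_a2": {"core_case": "N", "gender": "N"},
    "lang_b1": {"core_case": "Y", "gender": "N"},
    "lang_b2": {"core_case": "Y", "gender": None},
}
ranking = rank_by_contrast(
    toy, ["lang_a1", "lang_a2"], ["lang_b1", "lang_b2"], ["gender", "core_case"]
)
for feature, pct_a, pct_b in ranking:
    print(f"{feature}: {pct_a:.0f}% vs. {pct_b:.0f}%")
```

A three-valued feature such as the case inventory (A, B, C) does not fit this binary scheme directly; it would have to be compared once per value, much as item 4 of the first list reports the small and the elaborate inventories separately.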
While the rather clear top-to-bottom split is dominated by morphosyntactic features, the more diffuse left-to-right split seems to be particularly based on phonological features (except for the first). This may at least in part reflect the fact that the phonological features seem to be more sensitive to diffusion through contact, probably through the incorporation of loanwords.

If we split the phonological features from the morphosyntactic features,[5] we can observe that the distributions of Arawakan and to a lesser extent Quechuan languages (together with Aymara) are rather diffuse in the network based on the phonological features (Figure 5.2), and much closer together in the network based on morphosyntactic features (Figure 5.3). In Figure 5.3, all the Arawakan languages in the sample are in the left “tail” of the figure (with Movima and Muniche). The two Quechuan languages and Aymara are identical with respect to the morphosyntactic features, and converge completely on the Andean profile. In Figure 5.2 on the other hand, the Arawakan languages are spread all over the network, while Quechuan languages and Aymara are still rather close to each other, if not so close as in Figure 5.3.

Figure 5.2 NeighborNet of distances between languages of the sample (phonological features only)

Figure 5.3 NeighborNet of distances between languages of the sample (morphosyntactic features only)

Another interesting difference that can be observed is the fact that Panoan (Cashibo, Shipibo) and Tacanan (Cavineña, Ese Ejja) languages, which are often regarded as being related in a deep sense (see e.g. Key 1968; Girard 1971), are rather close together in the morphosyntactic representation, compared to the phonological feature representation. On the whole, then, the morphosyntactic picture gives the impression of representing a more conservative, genealogical picture than the phonological one. This can possibly be connected to the borrowing of linguistic forms (especially lexicon), which introduces new phonemes into the recipient language. Since grammar is generally assumed to be more resistant to borrowing than the lexicon, we might hypothesize that Figure 5.2 may be read as indicating patterns of (shallow) language contact, whereas Figure 5.3 may be reflecting either genealogical links or deep/intense contact.

In terms of features, the same major contributing factors that were identified for the top–bottom divide in Figure 5.1 are responsible for the main divide in Figure 5.3, which contrasts the same two groups of languages. The main contributing features of Figure 5.2 are nasal spread (79% for the Amazonian side, 0% for the Andean side), the presence of more than one liquid phoneme (5% for the Amazonian side, 79% for the Andean side), and the presence of a phonemic palatal nasal (32% for the Amazonian side, 100% for the Andean side).

[4] In the latter interpretation (i.e. the presence or absence of an elaborate case marking system), this feature is the second-highest contributing factor.

[5] The constituent order features are included in the morphosyntactic features; the lexicon features have not been considered in either of these networks.

4.2 Correlations with geographic factors

This chapter focuses on linguistic issues in the discussion on the foothill-fringe area. I will here touch on some possible geographic correlates, but it is clear that more in-depth research is necessary to give more detailed and definite answers to these matters.

The Andean side of the divide in Figure 5.2 suggests contact between Andean (Quechuan and Aymaran) languages and some of the languages spoken close to the Andes in northern Bolivia (Leko, Cavineña) and Peru (Nanti, Ashéninka, Cholón). From there towards the right of the graph the situation becomes more diffuse. There are a few probable contact pairs (Cofán and Secoya, perhaps Urarina with the Panoan languages – although there is quite a lot of reticulation, including the biggest divide in the graph), but there are more surprising positions: Záparo, Mosetén, and Amarakaeri, whose closest neighbors in the network are not their closest neighbors geographically. On the whole, then, the phonological graph seems to represent a rather specific Andean profile with some possible points of contact-induced change, and a much more diffuse Amazonian zone, where languages may share some traits but not others. The most widespread features in the Amazonian group are the presence of mid vowels (also quite common on the Andean side of Figure 5.2) and nasal spread (both 79 percent).

One clear exception to the pattern that phonology is less stable than morphosyntax is the fact that the Jivaroan languages are distributed much more diffusely in the morphosyntactic picture than in the phonological one. There is no straightforward explanation for this apparent anomaly. A suggestion towards an explanation may come on the one hand from the long-term and complex relations between Jivaroan groups and highland groups,[6] and on the other hand from the fact that the Jivaroan groups “show a particularly strong ethnic consciousness” (Adelaar with Muysken 2004: 432). The first factor may contribute to the result that Achuar is found rather close to the Andean profile, and the latter may account for the fact that the Jivaroan languages pattern so closely phonologically, perhaps due to an ethnic consciousness that includes a resistance to lexical (conscious) borrowing.

Figure 5.3 shows a relatively homogeneous “Amazonian” group, containing all Arawakan languages, but also the isolates Movima and Muniche and, at more distance, Cholón and Záparo. The Movima case may be explained by (deep) contact of Movima with Trinitario/Ignaciano and also Baure.
Other cases, such as the puzzling position of Muniche, Záparo, and Cholón in the midst of the Arawakan languages in the network, may require less straightforward explanations.[7]

Given the particular position of the languages in the sample, between two major geographical (and perhaps cultural) zones, a natural question to ask is whether we can find any correlations[8] between the linguistic patterns and geographic variables. A first, simple question would be whether there is any correlation between linguistic distance and geographic distance.[9] It seems that there is no such correlation, as can be observed in Figure 5.4, which shows geographic distance on the x-axis and linguistic distance on the y-axis. This confirms the observations made (i) that there are a few languages oddly placed in the graph, and (ii) that genealogical signals can be strong without necessarily coinciding with geographic proximity.

Figure 5.4 Correlation between linguistic distance and geographic distance

Two other geographic factors that seem intuitively important are elevation and river systems, as they are of consequence to how people travel, and perhaps they limit the range of contacts between peoples. Figure 5.5 shows the correlation between geographic elevation and linguistic distance. It should be read as follows: the greater the difference in elevation between a language pair, the greater the linguistic difference between these languages. In very general terms, this can be interpreted as identifying the Andean mountains as a barrier for contact, although the correlation is not very strong (r = 0.37). The idea that differences in elevation are barriers for contact is corroborated by the graph in Figure 5.6, which shows the correlation between elevation and proximity to the Andean profile. The x-axis indicates the height of the location where the languages are spoken, and the y-axis indicates distance from the Andean profile. As a general tendency, the higher a language is spoken, the more it conforms to the Andean profile.

Figure 5.5 Correlation between geographic elevation and linguistic distance

Figure 5.6 Correlation between elevation and proximity to the Andean profile

A final geographic factor that I want to take into consideration is the river system. Rivers in South America form pathways along which people move around, are in contact with each other, and thus possibly influence each other. The foothill-fringe area as presented here can be said to consist of three major river systems:

1. The northern basin, delimited in the north by the lower Napo and Aguarico Rivers and in the south by the lower Ucayali and Marañon Rivers, encompassing eastern Ecuador and northern Peru.
2. The drainage basin of the upper Ucayali and Huallaga Rivers, covering north-central to southern Peru.
3. The basin defined by the Madre de Dios and Mamoré Rivers, covering Bolivia and a small part of southern Peru.

The languages in the sample can be classified according to the river system they belong to (see above), and the average distance between their linguistic profiles can be compared to the average of the entire sample. However, it seems that the river systems do not have any impact on the average linguistic distance, as is shown in Table 5.7. One proviso that we should make with this result is that the genealogical diversity is greater in the Napo/Aguarico (ten families) and Madre de Dios/Mamoré basins (nine families) than in the Ucayali/Marañon basin (four families). This means that perhaps the picture should be adjusted somewhat and there might be a (relative) contact effect after all in the northern and southern basins. However, this is difficult to factor in, and must, moreover, await more detailed research.

Table 5.7 The average linguistic distance per river system

                        Average linguistic distance
Total average distance  0.41
Napo/Aguarico           0.41
Ucayali/Marañon         0.40
M. de Dios/Mamoré       0.41

[6] Adelaar with Muysken (2004: 432) suggest that the Jivaroan territory may even have extended into the Andean highlands.

[7] Lev Michael (p.c.) mentions that Muniche shows many traces of Arawakan elements in its lexicon.

[8] I thank Harald Hammarström for the calculations as well as providing me with the data points on geographic position and elevation of the languages in question.

[9] I have taken a crude approach here, distances being represented as the crow flies, and the languages being considered points rather than polygons on the map.

5 Conclusion

When reviewed in terms of areal linguistic features that are considered to be of importance for the Andean and the Amazonian areas, the FF languages conform neither to the Amazonian profile, nor to the Andean profile. Instead, they form a mixed group, which fits well with their position between these two areas, and reflects their complex past of multilateral contacts. The results of the study do clearly show that the FF area, which is mostly associated with the Amazon in traditional terms, does not conform to the Amazonian prototype.

On the basis of the results of this preliminary study, we can tentatively draw a few further conclusions (pending more research that incorporates results from ethno-historical and archaeological studies). In terms of genealogical patterns we cannot say very much on the basis of this sample, which would need to be expanded to allow for more firmly supported conclusions. Nevertheless, with this proviso in mind, some patterns can still be recognized and perhaps serve as hypotheses for future studies. The Quechuan languages (Cuzco and Imbabura) do end up relatively close to each other in all networks, but Aymara is generally closer to Cuzco Quechua than Imbabura is. This is in line with the conclusions drawn in Van de Kerke and Muysken (this volume) about Ecuadorian variants of Quechua being substantially different due to contact effects. The Tacanan and Panoan languages show rather strong signals, and even end up together when only morphosyntactic features are considered, possibly reflecting an even older connection. Arawakan and Jivaroan languages show ambiguous signals (see below).

Apart from the Andean sphere (including Tacanan and Leko), and a recurring northern group of Cofán, Waorani and Secoya, there are no obvious major areal patterns in the data. There may be some further more local areal patterns (Movima and Trinitario, Yurakaré and Yuki, Urarina with the Panoan languages). Closer scrutiny may reveal more of these local patterns.

In general terms, morphosyntactic features seem to represent more stable structural traits than the phonological features, which is possibly attributable to the fact that lexical borrowing, more likely to occur than structural borrowing, can influence phoneme systems. The phonological picture is more diffuse than the morphosyntactic one, which shows a clear – Arawakan-dominated – Amazonian group and a (somewhat more diffuse) Andean group. One of the hypotheses that could be tested further is that the patterns reflect two time layers: an older layer of languages with the longest presence in the area and the longest history of contact with the Andean civilizations, and a group of languages and language families that have moved into the area from Amazonia proper, dominated by the Arawakan profile, and which have undergone less long-term Andean influence. The Jivaroan languages form a notable exception to this pattern, which is possibly due to factors of a more ethno-cultural nature. This issue clearly requires more in-depth research.

Apart from linguistic and cultural-historical considerations, I have reviewed some geographic factors that may be of influence. There is no correlation between geographic and linguistic distance as such, but there is a correlation between difference in elevation between language pairs and their linguistic distance, suggesting that elevation differences form a natural barrier against contact. This is corroborated by the fact that there is a correlation between the elevation of a language and the degree of conformity to the Andean profile. Belonging to the same broad river system seems to be less influential when it comes to predicting linguistic distance, although future research may reveal that there is some impact in the northern and southern river systems, or that smaller river systems may give more meaningful results.
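The geographic comparisons summarized in this conclusion rest on pairwise quantities: for every pair of languages, a geographic distance as the crow flies, a difference in elevation, and a linguistic distance, followed by a correlation coefficient. A minimal sketch of that kind of calculation is given below. It is not the script used for the chapter; the great-circle (haversine) formula, the Pearson coefficient, and every coordinate, elevation, and distance in the toy data are assumptions made for illustration.

```python
import math
from itertools import combinations
from typing import Dict, List, Tuple

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def great_circle_km(p: Tuple[float, float], q: Tuple[float, float]) -> float:
    """Haversine distance between two (latitude, longitude) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))

def pearson_r(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sx * sy)

# Invented toy data: name -> (latitude, longitude, elevation in metres).
# Each language is treated as a point, not a polygon.
sites: Dict[str, Tuple[float, float, float]] = {
    "lang_1": (-13.5, -72.0, 3400.0),
    "lang_2": (-12.0, -69.0, 250.0),
    "lang_3": (-3.5, -75.0, 150.0),
}
# Invented linguistic distances per unordered pair of languages.
ling = {
    ("lang_1", "lang_2"): 0.55,
    ("lang_1", "lang_3"): 0.60,
    ("lang_2", "lang_3"): 0.35,
}

pairs = list(combinations(sorted(sites), 2))
geo = [great_circle_km(sites[a][:2], sites[b][:2]) for a, b in pairs]
elev = [abs(sites[a][2] - sites[b][2]) for a, b in pairs]
dist = [ling[pair] for pair in pairs]

print("r(geographic distance, linguistic distance):", round(pearson_r(geo, dist), 2))
print("r(elevation difference, linguistic distance):", round(pearson_r(elev, dist), 2))
```

Because the values for different pairs share languages and are therefore not independent observations, judging the significance of such a coefficient would normally call for a permutation approach such as a Mantel test; the sketch computes the coefficient only.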

6 The Andean matrix

Simon van de Kerke and Pieter Muysken

This chapter deals with several long-standing issues in Andean linguistics: What is the best way to classify the Quechua language family internally and what does this classification tell us about the history of the language, contrasting a new morphosyntactic dataset with the lexical data analyzed by Heggarty (2005, 2007)? What do the complex structural relations between Quechua, Aymara, and the other highland languages reveal about their historical relationship? We hope to show that the Quechua language family is quite coherent and stable in many respects. There is structural evidence for a QI/QII split rather than for a more wave- or network-like configuration within the family. The Aymaran languages are clearly set apart from Quechua varieties as a group, but at the same time, the structural distance between the northern varieties of Quechua and those of most of Peru and Bolivia is larger than that between Aymaran and e.g. Quechua I. The other Andean languages clearly have separate structural profiles.

We are grateful to Loretta O’Connor for comments on various earlier versions of this chapter, to Willem Adelaar for several suggestions, and to Harald Hammarström for technical support with the NeighborNet graphs. We also want to acknowledge discussions with Andeanist colleagues, notably Rodolfo Cerrón-Palomino and Paul Heggarty. None of these people are responsible for the errors of fact and interpretation in this chapter, of course.

1 Introduction

Over the last three thousand years the central Andean area has seen the rise and fall of different civilizations. Periods of centralization of power were followed by periods of regionalization, but on the whole the direction was towards large-scale cultural integration and increasing state control over an ever-growing territory, connected with the Chavín, Huari/Tiahuanacu, and Inca horizons. Maximal integration was reached when the Inca Empire controlled an area that ran from Ecuador to Argentina, just before the victory of the Spaniards in 1532 CE.

The central Andes area is one of the hot spots of human civilization. It has seen the development and specialization of different food supplies: tubers like potato, maize, camelids, the sacred coca leaf, all of them linked to the complex Andean ecosystem, known in the literature as the pisos ecológicos (‘ecological levels’). It also led to a “vertical” social organization, with exchanges between different eco-zones. Living and agricultural conditions called for massive labor. Moving labor forces around has led to mixing populations of different origins.

Under Inca rule, lasting no more than two centuries, a Quechuan lingua franca was propagated within the empire. Inca policy generally, however, was not to suppress local cultures and languages, but to overlay them with the state culture and the imperial language. This process was so successful that 500 years later some form of Quechuan, in its many local varieties, is still spoken by millions of people from the south of Colombia, through Ecuador, Peru, and Bolivia, into the north of Argentina. Only one important other language family managed to resist Spanish pressure: Aymaran. Of the other languages of the Andes, only a small community of speakers of Chipaya (Uru-Chipaya family) survives.

The territory from which Quechuan and Aymaran started to expand around 1 CE lies in central Peru. While Quechuan and Aymaran share a homeland, they probably did not share a direct ancestor. In spite of structural and lexical similarities between the two families, the question of whether genealogical relatedness or language contact may explain this has been the subject of debate (see Adelaar with Muysken 2004: 34–36; Muysken 2012a; Adelaar 2012a). Lexical comparisons within and between varieties of the two families suggest intensive language contact, but morphology and syntax have not been analyzed systematically. The current chapter aims at a more refined comparison of the two language families, enhancing the picture with a more structurally oriented analysis of dialectal variation within the Quechua family, taking into account the other relevant languages spoken in the area.
The Andean matrix has played an important role in South American indigenous linguistics, because of the postulated Andean linguistic area and civilization and its impact on its neighbors. Here we critically survey the current state of knowledge regarding the linguistic history of the Andes and present a new set of analyses based on structural rather than lexical or phonological criteria. We will use a data set of coded features for noun phrase structure and argument realization (see the contributions by Krasnoukhova and Birchall, this volume), as well as structural and morphological features specifically selected to distinguish Quechuan and other Andean languages. We mean to throw new light on several long-standing issues: (a) What is the best way to classify the Quechua language family internally and what does this classification tell us about the language history, contrasting our morphosyntactic data with the lexical data analyzed by Heggarty (2005, 2007) (Section 4)? (b) What does the complex structural relation of Quechua, Aymara, and the other highland languages suggest about their historical relationship (Section 5)?


We begin by presenting the distribution of languages in the region in Section 2, while Section 3 contains information about sampling and coding. We conclude and raise the questions remaining in Section 6.

2 Definition and distribution of languages

Although the Andean ridge runs all along the western coast of the South American continent, the Andean matrix proper refers to the geographic area closed off in the north by the Chibchan area, in the east by the Eastern Lowlands, and in the south by the Southern Cone (Map 6.1). This largely coincides with the area brought under Inca rule in the latest phase of the Inca horizon, which lasted from 1300 CE until the collapse of the empire in the mid sixteenth century. Not only the Incas were confined to this central Andean area; the preceding Huari (500–900 CE) and linked Tiahuanacu (500–1000 CE) horizons were as well. Smaller, more localized outbursts of power concentration like Chavín in northern Peru, Moche on the northern coast, and Nazca on the central Peruvian coast also flourished within these confines.

Environmental conditions include a generally arid coast, where life was confined to the river valleys that at the same time served as avenues to the highlands. A steep climb leads to altitudes between 2,000 and 3,500 meters, where food production makes larger concentrations of humans possible, either by tilling the land or by herding. Passing over the ridge of the Andes at 5,000 meters, a steep plunge through a very productive area leads one into the forested lower mountain slopes and then the jungle. Only during the larger archaeological horizons was there north to south contact, while in the intermediate periods we find small local kingdoms/cultures that manipulated the west to east link, integrating the different altitudes in the vertical exchange system.

Without written material it is clear that we have to rely fully on archaeological information up to the end of the Huari/Tiahuanacu horizon. After this date, Andean oral traditions, as recorded by the Spanish, start to play a role.
On the basis of this historical information, we may conclude that by that time the Bolivian Altiplano was populated with Aymara-speaking strongholds on the western side of Lake Titicaca and Puquina-speaking Collas on the eastern side. It is highly likely that the Urus, then as now, lived as hunter-gatherers in the surrounding river system. Further to the south we find first Atacame˜no in northern Chile and then Araucanian (Mapuche) in central Chile. Lule-Vilela was spoken in northern Argentina. Going to the north, we enter a large zone where speakers of Aymara, Quechua, and Puquina coexisted up to the area that was the supposed point of origin of Aymara and Quechua expansion, in central Peru. There we find Mochica, the language of the Moche or Chimu culture (1300–1438), and a number of other coastal languages: Tallana, Sechura, Olmos, and Quingnam. Very little is known of these latter languages (Adelaar

The Andean matrix


Map 6.1 Approximate distribution of the indigenous languages in the Andes in the mid twentieth century (Map 4, p. 169, from Willem F. H. Adelaar, with Pieter C. Muysken, The languages of the Andes (2004), Cambridge University Press)

Legend: Aymara; Quechua I; Quechua II; Aymara and Quechua II: overlapping area.

Explanation of language names indicated by numbers: 1 Awa Pit; 2 Cha'palaachi; 3 Tsafiki; 4 Mochica (†); 5 Pacaraos Quechua; 6 Jaqaru and Cauqui; 7 Callahuaya; 8 Uchumataqu; 9 Chipaya; 10 Atacameño (†); 11 Quechua dialects of Catamarca and La Rioja (†)


Simon van de Kerke and Pieter Muysken

with Muysken 2004: 397–407). In Southern Cajamarca and Northern Ancash the now extinct language Culli was spoken, possibly related to Cholón. Cholón was spoken further east, on the Amazonian fringe. Further to the north into Ecuador a number of languages were spoken in the pre-Columbian era, most of which have disappeared, largely replaced by Quechuan varieties (see Adelaar with Muysken 2004: 392–397). Some languages situated on the fringes of Inca state control, such as Atacameño and Coli (a variant of Puquina) on the southern coast, survived for a time before disappearing around 1900. All other languages whose existence is known to modern scholars, including Puquina, the language associated with the Tiahuanacu period, were replaced by Aymara and Quechua. It is only on the borders of the Inca Empire, mainly on the eastern side of the Andes where diseases, armed resistance, and geographic circumstances brought the Inca armies to a halt, that other languages were able to survive (see van Gijn on the Andean foothills, this volume).

3 Research methodology and language varieties studied

The main focus in our project has been on morphosyntactic characteristics. For this reason we use questions and data that were collected for the Noun Phrase (Krasnoukhova, this volume) and Argument realization (Birchall, this volume) questionnaires. We used a subset of the features in these questionnaires because the two overlap and because a number of the features were irrelevant for the languages in our sample. Of the total of 96 features from the Noun Phrase questionnaire we use 58; from the total of 83 features in the Argument realization questionnaire we use 68, giving a total of 126. We added data for a further 13 Quechua varieties to the 4 already represented in the Krasnoukhova and Birchall samples. Apart from that we composed a questionnaire specifically aimed at Quechuan, with features that in earlier studies were identified as distinguishing Quechuan varieties. Most questions, 25 in total, concern the form of morphemes (see the Quechua questionnaire in the appendix, Table 6.6). A small number of questions were more general and could also be used for a comparison of the whole set of Andean languages (see the Andean questionnaire with 12 features in the appendix, Table 6.6). This means that the Quechuan languages (including Kallawaya) may be contrasted on 163 features, and the whole complex of Andean languages on 138 features. The use of new analytical tools allows for an in-depth analysis of the variation within the Quechuan family on the one hand and between the Andean languages as a possible Sprachbund on the other, and the results may shed light on a number of questions put forward by Heggarty (2005, 2007). To facilitate a comparison with Heggarty, who worked with lexical data, we aimed at a comparable


set of dialects and languages, as shown in Table 6.1. It presents the language varieties included in this study, compared to those used in the lexico-semantic study of Heggarty and the groupings in Adelaar with Muysken (2004). For the linguistic data we rely on published descriptions of Andean languages and in a few cases on our own fieldwork data.¹ To analyze the material, we used feature distance matrices and NeighborNet analysis (Huson and Bryant 2006), which allows the representation of distances between languages as well as reticulations (shared features between different branches).

4 The internal structure of the Quechua language cluster

The internal structure of the Quechua family has been a subject of study from the moment that the Spanish friars tried to gain a grip on the complex language situation they were confronted with. Another period of attention came in the nineteenth century when large numbers of European explorers arrived to study the Andes, often as forerunners of commercial exploration. Another century had to pass before the pioneering work of Parker (1963) and Torero (1964) made it obvious that treating Quechua as if it were one language is mistaken; a reasonable point of comparison would be the variation within the Romance family. They argued for a split in the family into a more conservative central branch Quechua I (QI) and a more innovative southern and northern branch Quechua II (QII). This view is widely accepted, although the gradualness of distinctions between QI and QII varieties is debated. The QI dialects are spoken in a relatively small uninterrupted area in central Peru; they share a number of features that set them apart from the QII dialects, but among themselves they diverge widely. The QII dialects have much in common but are subdivided into a number of geographically based sub-branches. Apart from this basic opposition, Parker and Torero have shown that Quechua was not simply spread from imperial Cuzco, but in an earlier phase had emanated from the central Andean area. Subsequently, a debate started between proponents of the “Cuzco origin” and “Central Andean origin” schools. Many comparative studies were carried out on different aspects of the lexicon, phonology, and morphology of the different dialects, to get a better understanding of the developments within the Quechua language family, all using qualitative methods. The lexical data were used to estimate the time depth of a possible split using the lexicostatistical method (Torero 1972), but it was only in the last decade that Heggarty (2005) initiated

1 (Van de Kerke for Bolivian Quechua and Muysken for Ecuadorean Quechua.) Our main sources for the ethnohistorical and archaeological data are the essays in Heggarty and Beresford-Jones (2012).


Table 6.1 Languages of the study, compared to the varieties used by Heggarty (2005) and by Adelaar with Muysken (2004)

| Branch | Heggarty (2005) | Adelaar with Muysken (2004): subgroups | This study | ISO | Reference |
|---|---|---|---|---|---|
| QI | Chacpar; Yánac; Huánuco | Huaylas-Conchucos | Huaylash-Ancash | qwh | Parker (1976) |
| | | Alto Pativilca-Alto Marañón-Alto Huallaga | Huallaga-Huánuco | qub | Weber (1989, 1996) |
| | | Yaru | Northern Junín, Tarma | qvn | Adelaar (1977) |
| | | Jauja-Huanca | Jauja-Wanka | qxw | Cerrón-Palomino (1976) |
| | | Pacaraos | Pacaraos | qvp | Adelaar (1987) |
| | | Huangáscar-Topará | | | |
| QIIA | Cajamarca (Chetila); Incahuasi/Cañaris; Laraos | department Cajamarca (provinces Cajamarca and Bambamarca) | Cajamarca | qvc | Quesada (1976) |
| | | department Lambayeque (province Ferreñafe: districts Cañaris and Incahuasi) | | | |
| | | department Lima (province Yauyos: village dialects Laraos, Lincha, Madeán, Viñac; province Huaral: village dialect of Pacaraos) | Yauyos | qux | Taylor (1986, 1990a,b) |
| QIIB | Chimborazo (Troje); Tena (Serena) | Colombia | Inga | inb | Levinsohn (1976), Monguí and Levinsohn (1976) |
| | | Ecuador: highlands and eastern lowlands | Imbabura | qvi | Cole (1982) |
| | | | Salasaca | qxl | Fieldwork |
| | | | Arajuno | quw | Fieldwork |
| | | Peru: department Loreto (eastern lowlands); department San Martín: area of Lamas (Lamista); department Amazonas: Chachapoyas and Luya | Southern Pastaza | qup | Zahn et al. (2002) |
| | | | San Martín | qvs | Park and Wyss (1995), Coombs et al. (1976) |
| QIIC | Huancavelica (Atalla); Cuzco, Taquile, Puno; Curva, Pocona, Marawa | department Ayacucho etc. | Ayacucho | quy | Parker (1965), Soto-Ruiz (1976) |
| | | department Cuzco etc. | Cuzco | quz | Cusihuamán (1976) |
| | | Bolivia: northern and southern varieties | Cochabamba | quh | Fieldwork and Herrero and Sánchez de Lozada (1978) |
| | | Argentina | Santiago del Estero | qus | Alderetes (2001), Nardi (2002) |
| | | (Altiplano Bolivia) | Kallawaya | caw | Oblitas Poblete (1968), Girault (1989) |
| Aymara | Huancané, Sullkatiti, Puqui; Jaqaru, Kawki | | Northern | ayr | Hardman, Vásquez and Yapita (2001) |
| | | | Southern | ayc | Coler (2010) |
| | | | Jaqaru | jqr | Hardman (1966) |
| Mochica | | | Mochica | omc | Adelaar (2004), Hovdhaugen (2004), Torero (2002) |
| Uru-Chipaya | Chipaya | | Uchumataqu (Uru) | ure | Hannss (2008, 2009) |
| | | | Chipaya | cap | Cerrón-Palomino (2006, 2009) |
| Cholón | | | Cholón | cht | Alexander-Bakkerus (2005) |

a new research line by evaluating the data by means of phylogenetic network trees, a research line pursued in this chapter.

4.1 QI and QII

It is clear that there are important differences between QI and QII varieties (Parker 1963; Torero 1964; Cerrón-Palomino 1987; Adelaar with Muysken 2004). It is also generally agreed that the split between them took place quite early, possibly more than 1,500 years ago. The main question is whether this led to a sharp division, as assumed by the sources listed above, or a more


gradual one, as argued by Heggarty (2007: 335). Summarizing the NeighborNet analysis of the lexical data reported in Heggarty (2005) and a reanalysis of the data presented in Torero (1972), he writes:

    All of these graphical outputs look nothing like neatly branching trees, but webs suggestive not of some radical early split in Quechua but a gradual expansion into a broad dialect continuum. Indeed the varieties of Northern Highland Peru, supposedly QIIa, i.e. a sub-branch of QII, in fact appear much closer to QI than to the rest of QII.

Inspection of the graphs presented in Heggarty suggests that the sources of contention concern three datasets:
(a) The northern varieties of Cajamarca and Ferreñafe (Torero) and Cajamarca and Inkawasi (Heggarty) do not cluster with the other QII varieties, but either with Central QI (Torero) or as a separate branch (Heggarty).
(b) The central variety of Pacaraos that was classified as QII by Torero (1964) is actually closer to QI in Torero’s (1972) data.
(c) The varieties labeled Yauyos are intermediate between QI and QII in both datasets.
Our data involve systematic morphosyntactic datasets that can help clarify these issues. The main question will be: Is the split between QI and QII so deep that we can speak of genealogical units or do we see a process of dialectalization in a network-like form with numerous early split-offs? We will focus on the position of Cajamarca, Pacaraos, and Yauyos.

4.2 Quantitative results for the Quechua languages

The primary technique we used involves distance matrices, representations of the proportion of features on which any two varieties in the sample differ.² Applied to the three datasets we have, Noun Phrase, Argument expression, and Quechua 37 (the combined questions of Quechua and Andean), we observe that the internal variation within the Quechuan family is low when the analysis is based either on the Noun Phrase (GAD 0.11) or the Argument expression (GAD 0.12) questionnaire. The Noun Phrase feature database yields very few differences between the varieties (distances ranging from 0.05 to 0.20), suggesting that these features are relatively stable across the family. The Argument realization feature matrix results in a low distance on average, but somewhat greater internal variation (range 0.05–0.30). The NeighborNet graphs associated with these figures present the dialects as relatively randomly grouped. This may be interpreted as an indication of the fact that the basic morphosyntactic

2 Here a 0.10 value implies that two varieties diverge in only 10 percent of the features in the sample, while the global average distance (GAD) gives an indication of the overall variation between all of the varieties in the sample. A graphical representation of these distances is given in NeighborNet graphs, which give a good visual representation of underlying dependencies.
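The distance measure and the global average distance described in this footnote can be illustrated with a small sketch. It is not the authors' implementation: the function names, the toy feature values, and the treatment of uncoded features (skipped, marked here as `None`) are all assumptions made for illustration, and the chapter does not say how missing values were handled.

```python
from itertools import combinations


def feature_distance(a, b):
    """Proportion of comparable features on which two varieties differ."""
    # Assumption: None marks a feature that is not coded for a variety.
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        raise ValueError("no features coded for both varieties")
    return sum(x != y for x, y in pairs) / len(pairs)


def distance_matrix(data):
    """Distance for every unordered pair of varieties, keyed by name pair."""
    return {
        (p, q): feature_distance(data[p], data[q])
        for p, q in combinations(sorted(data), 2)
    }


def global_average_distance(matrix):
    """Mean of all pairwise distances (the GAD of the footnote)."""
    return sum(matrix.values()) / len(matrix)


def pooled_average(averages, feature_counts):
    """Feature-count-weighted mean of per-questionnaire averages."""
    total = sum(feature_counts)
    return sum(a * n for a, n in zip(averages, feature_counts)) / total


# Invented toy data: ten binary features for three hypothetical varieties.
toy = {
    "variety_A": [1, 0, 1, 1, 0, 0, 1, 0, 1, 1],
    "variety_B": [1, 0, 1, 1, 0, 0, 1, 0, 1, 0],
    "variety_C": [0, 1, 1, 1, 0, 1, 1, 0, 0, 0],
}
matrix = distance_matrix(toy)
gad = global_average_distance(matrix)
print(matrix)
print(round(gad, 2))
```

If every variety is coded for every feature, pooling questionnaires amounts to weighting each questionnaire's average by its number of features. With the feature counts given in the methodology section (58 Noun Phrase, 68 Argument realization, 37 Quechua and Andean) and the averages reported in this section (0.11, 0.12, and 0.41), `pooled_average` returns roughly 0.18, in line with the combined figure the chapter reports.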

Figure 6.1 NeighborNet representation for the relative distances of the members of the Quechuan language family

layout of Quechuan languages has remained amazingly stable, assuming that geographic split and subsequent dialect formation started 1,500 years ago. It comes as no surprise that the global average distance jumps considerably to 0.41 (range 0.15–0.70) with the 37 combined Quechua and Andean features. The associated NeighborNet strongly contrasts the QI, QIIb, and QIIc dialects. The effect of these features is so strong that even when they are lumped together with the Noun Phrase and Argument expression features, so that the global average of the combination (Quechuaall) falls to 0.18 (range 0.05–0.30), the associated NeighborNet gives the same clear picture: see Figure 6.1. The main branching between QII languages and QI languages is confirmed. The problematic variety of Pacaraos is close to QI, and problematic Cajamarca and Yauyos are close to QII. Ecuadorian Quechua, a sub-group within


Quechua II, is always the outlier, with other lowland Quechua languages such as Peruvian Pastaza Quechua and Inga (Colombia) close by. A first conclusion may be that the forms of morphemes may have changed (the 25 features in the Quechua database) but not the typological frame (Noun Phrase and Argument expression databases). Interestingly, variation is not random but clearly supports the classical division of Quechuan into QI and QII. In that way it provides answers to a number of the questions formulated above. The split between the Quechua I dialects, including Pacaraos, and the Quechua II dialects is deep and not likely the result of a slow dialectal spread. The northern Peruvian dialects Cajamarca and San Martín, as well as Yauyos Quechua, form a branch of the Quechua II cluster. This may be interpreted as support for the view that they are the result of early Huari expansion, as we will argue below. The loss of morphological complexity observed in the Ecuadorian dialects sets them clearly apart as a subgroup within QII. It is noteworthy that Kallawaya groups with them in this respect. Kallawaya does not resemble the surrounding Southern Quechua varieties, contrary to most remarks in the literature, according to which Kallawaya is simply a relexified variety of Bolivian QIIc. Its intermediate position next to the Ecuadorian Quechua varieties probably results from processes of simplification in the Quechua varieties that were the input to Kallawaya, similar to what happened in Ecuador.

4.3 The QII cluster and Ecuadorian Quechua

The QII varieties of the southern branch (spread from Ayacucho in the Peruvian highlands to northern Argentina) form a genealogical unit together with Cajamarca and Yauyos. However, the distance matrices and the NeighborNet analysis in 4.2 suggest that the northern QII languages (those of Ecuador and of some speakers in southern Colombia) form a group by themselves. The key question concerns the relation between the Peruvian QII varieties and those exported to Ecuador. The varieties of Ecuador have been linked to a lingua franca Quechuan form called lengua general, assumed to have been the Inca imperial expansion variety. The morphological features of Ecuadorian Quechua, or Quichua, show that it is an off-shoot of QII varieties. It is related to early Chinchay (Torero 1975) or “general Quechua” and has a few features of Cuzco Quechua as well. It was introduced into Ecuador in the Incaic period (see also the testimony of Cieza de León 1984 [1553]), and consolidated during the colonial period. Hocquenghem (2012) argues convincingly that all ethnohistorical and archaeological evidence points to expansion of Quechua into Ecuador during the Inca conquests after 1450 CE. We know of no compelling linguistic arguments which would support an earlier expansion.


Table 6.2 Morphosyntactic and phonological features that distinguish Southern Peruvian Quechua from Ecuadorian Quichua

| | Feature | Peru | Ecuador |
|---|---|---|---|
| A | 1st person plural | ñuqayku (excl.), ñuqanchis (incl.) | ñukanchis |
| B | benefactive | -paq | -pak |
| | genitive | -pa/-p | -pak |
| C | nominal possessives | -y/-yki/-n/-nchis | 0 |
| D | person marking subordinate verbs | -y/-yki/-n/-nchis | 0 |
| E | adverbial subordination | -pti/-spa | -kpi/-spa |
| F | object marking | 1ob -wa; 1su-2ob -yki; 3su-2ob -su-nki; 3su-4ob -wa-nchis | (-wa) |

Quichua differs considerably from the Quechuan languages it is related to, such as Ayacucho Quechua (Muysken 1977; 2000b). It was consolidated as the community language of the runa peasants of the Ecuadorian highlands during the Incaic period, and particularly the Colonial period. Prior to the advent and genesis of Quichua, other substrate languages were spoken in Ecuador, notably Barbacoan languages in the north and Jivaroan languages in the south. There is some variation in Quichua that may be attributed to different substrates. Table 6.2 presents some of the main differences between Quichua and its Peruvian relatives. The differences are striking particularly in the nominal domain. Compare verbal constructions (1a) and (1b), where the basic person indexing is maintained, to (2a) and (2b). Ecuador has lost nominal person indexing:

(1) Peru³
    a. (qan) puri-nki
       (you) walk-2sg
       ‘you walk’
    Ecuador
    b. (kan) puri-ngi
       (you) walk-2sg
       ‘you walk’

(2) Peru
    a. (qan-pa) mama-yki
       (you-gen) mother-2sg
       ‘your mother’
    Ecuador
    b. kan-pak mama
       you-gen mother
       ‘your mother’

The consequences of this loss are also crucial in the nominalization and hence subordination domain; cf. the contrast between (3a) and (3b):

3 ac = accusative; af = affirmative; ag = agentive; ben = benefactive; ds = different subject; fu = future; ge = genitive; nom = nominalizer; pr = progressive; pro = pronoun; re = reflexive; ss = same subject; to = topicalizer.


(3) a. puri-na-yki-ta yacha-ni
       walk-fu.nmlz-2sg-ac know-1sg
       ‘I know that you will walk’ (lit. I know you to walk (ac))
    b. (kan) puri-na-ta yacha-ni
       (you) walk-fu.nmlz-ac know-1sg
       ‘I know that you will walk’

However, not all Amazonian varieties have lost person marking on nominals, in nominalizations, and in adverbial clauses; cf. the following examples from Waters (1996: 167–169), of equivalents in (a) Pastaza (Peruvian Amazonian Quechua) and (b) Napo (Ecuadorian Amazonian):

(4) a. wasi-yki
       house-2sg
       ‘your house’
    b. kan-pa wasi
       2sg.pro-gen house
       ‘your house’

(5) a. yanu-hu-ni miku-na-nchi-pa
       cook-pr-1sg eat-nmlz-1pl-ben
       ‘I cook so that we [can] eat.’
    b. shamu-n miku-nká-j
       come-3sg eat-nmlz-ben
       ‘He comes to eat.’

(6) a. kay-ta tukuchi-shpa-yni shamu-nka
       this-ac finish-sub-1sg come-3.fu
       ‘When I finish this, s/he will come.’
    b. pay shamu-jpi miku-nchi
       3sg.pro come-ds eat-1pl
       ‘When s/he comes we eat.’

Thus the changes in the northern varieties (including Peruvian Amazonian), separating them structurally from all Peruvian Quechua varieties, cannot be uniquely attributed to the loss of nominal person marking.

4.4 Origin and history

Given the results from the NeighborNet analysis and the earlier literature, the scenario which we consider most likely for the origin and spread of Quechua contains the following elements.

Origin. As argued in Adelaar (2012a) and Muysken (2012a), Quechua emerged through interaction with Aymara. The precise details of the interaction are a matter of dispute and need further investigation. The overall typological profile of what we may think of as very early Quechua suggests affinities with languages in the north central Andes such as those of the Barbacoan and Jivaroan families. This suggests a possible northern origin for Quechuan.

Initial spread. The evidence regarding internal variation in the Quechua language family points to an initial dispersal from the central Peruvian highlands. First, varieties which later gave rise to QII moved towards the south (to roughly the Ayacucho area, associated with the Huari civilization). The original area remained highly differentiated, with some varieties later emerging as the current QI languages (including Pacaraos).


The Huari civilization. Beresford-Jones and Heggarty (2012b: 57–84) argue that the Huari horizon is associated with the expansion of Quechua.⁴ We agree with this and assume that later consolidation and spread of QII varieties was linked to the Huari civilization. In the Huari period terrace construction gained momentum, increasing the potential for maize cultivation through stone heat retention, but terracing requires substantial labor and thus population density. Huari settlements were very grid-like and regular in shape. Huari expansion was achieved through military rule rather than through extensive movement of people. We follow Adelaar (2012b) in the idea that Cajamarca and related Quechua varieties were an off-shoot from the Huari civilization.

The Inca Empire. The later spread of Quechua in the Inca period, both northward into Ecuador and southward into Argentina, can be linked to actual population movements. It is not evident that the Inca actually imposed Quechua on their subjects. There was very extensive movement of people, involving various types of subjects, during Inca rule, and in terms of numbers the mitmaqkuna were the most important. Mitmaqkuna were extended families or ethnic groups resettled by the Incas by force from their home territory to recently conquered areas.⁵ The Inca occupation of northern Argentina took place in the middle of the fifteenth century.

The Spanish occupation. According to Cook (1998) the Andean population was 9 million at the time of conquest, and this went down to 600,000 in 1620, an incredible devastation. The figure of 370,000 can be given for Peru in 1730, which increased to 1.5 million in 1876, and 2.9 million in 1940. At present the population is somewhere near its size at the time of conquest. Thus the current distribution of languages does not necessarily represent the original situation. Quechua and Aymara continued to spread into the foothill regions of Ecuador, Peru, and Bolivia during the colonial period.
An overview is given in Figure 6.2. Whether the Chavín and Huari/Tiahuanacu horizons need to be linked to language spread (Beresford-Jones and Heggarty 2012b) is an open question. The Chavín horizon is very early, 900–200 BCE, and is believed to have been limited in its expansion. It pre-dates the generally assumed presence of both Quechua and Aymara in central Peru, which began around 200 CE. As such, Chavín may be linked to pre-proto-Quechua, as has been argued by Heggarty (p.c.). On the other hand, linking Chavín with pre-proto-Aymara, as is done in Beresford-Jones and Heggarty (2012b), is not incompatible with (1) the idea that Quechua was an invading language coming from the north, maybe filling

4 Heggarty (p.c.) has later suggested associating QI with the Chavín culture, and QII with Huari in the Middle Horizon.

5 Other transplanted groups include Aqllakuna (women removed from their native homes at a young age and brought to state facilities called aqllahuasi, where they learned various crafts) and Yanakuna (people taken out of the ayllu system to work for the Incas as servants).


Figure 6.2 The distribution of languages per region, over time (Q = Quechua)

the gap after the fall of the Chavín cultural complex (Adelaar 2012a), and (2) the spread of Aymara down the coast into the Nazca area, where it became the language of the Nazca culture (100 BCE–700 CE). This may also have been the avenue for the attested presence of Aymara in the highland area around 1200 CE, which makes it plausible that Aymara was one of the languages spoken in the Tiahuanacu realm, in addition to Puquina and Uru. There is no evidence that Quechua was present in southern Peru until the end of the Tiahuanacu era, and this raises the question why by 1300 CE Quechua was chosen as the imperial language by the (supposedly Aymara- or Puquina-speaking) Inca elite. The most plausible reason is that large parts of the south central area of Peru were Quechua-speaking by that time, making it likely that during the earlier Huari horizon (500–900 CE) Quechua was spoken in this area, probably in addition to Aymara. This squares with the idea that Quechua- and Aymara-speaking groups, in a herding and agriculture symbiosis, moved southward from north central Peru into the Ayacucho area between 300 and 600 CE.

5 The relation between Quechua and the other languages in the Andean matrix

Our sample of seventeen Quechuan dialects and Kallawaya against three Aymaran and two Uru-Chipaya languages, as well as Mochica and Cholón, makes any comparison highly skewed towards the Quechua data. However, the distance matrices clearly show that we have introduced a number of unrelated languages. The low global average we have seen in our comparison of the Quechua dialects in the Noun Phrase and Argument realization databases, around 0.12, now jumps to 0.18. The Quechua dialects present scores ranging from low to high (0.05 to 0.35), reflecting little internal variation and a larger distance from the non-Quechua languages, just as we find for the Aymaran dialects (0.07 to 0.38), while the other languages present less variance: Mochica, for example, scores between 0.32 and 0.43 and is thus unlike any other language in the sample. If we combine the features of the Noun Phrase and Argument realization databases and represent them in a NeighborNet (Allandean NPArg), we see that the whole Quechua cluster, internally relatively unstructured, is set apart from Cholón, Mochica, and Uru-Chipaya and is somewhat closer to the Aymaran dialects. It is interesting to see that the addition of the twelve features of the Andean questionnaire to the combined features of the Noun Phrase and Argument realization databases into AllAndean not only adds a lot of structure to the Quechuan family, with evidence of a clear QI, QIIc, and QIIb group, but at the same time makes the Aymaran dialects shift to a position much closer to the clearly visible QI subgroup: see Figure 6.3. A number of other conclusions may be drawn. Obviously, the completely different typological make-up of Mochica makes it into an outlier. However, Cholón is also much less “Andean” than sometimes suggested, despite obvious Quechua borrowings. Uchumataqu (Uru) and Chipaya group together, but their split was not as early in the history of the Altiplano as their structural separation in the splitsgraph would lead us to believe. This may be the effect of language attrition and death in Uchumataqu. We will briefly discuss the different language groups, starting with the most important one, Aymaran. The intermediate position of the Aymaran varieties in the NeighborNet graph calls for an explanation.
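The per-language score ranges quoted above can be read as the lowest and highest distance between one variety and all the others in the matrix. The sketch below shows that reading on invented numbers; the function name, the variety names, and the distances are hypothetical and are not taken from the chapter's database.

```python
def distance_ranges(matrix):
    """Map each variety to the (min, max) of its distances to all others."""
    per_variety = {}
    for (p, q), d in matrix.items():
        # Each pairwise distance counts for both members of the pair.
        per_variety.setdefault(p, []).append(d)
        per_variety.setdefault(q, []).append(d)
    return {name: (min(ds), max(ds)) for name, ds in per_variety.items()}


# Invented distances: two close dialects and one unrelated language.
toy_matrix = {
    ("dialect_1", "dialect_2"): 0.05,
    ("dialect_1", "outlier"): 0.40,
    ("dialect_2", "outlier"): 0.35,
}
ranges = distance_ranges(toy_matrix)
for name, (low, high) in sorted(ranges.items()):
    print(name, low, high)
```

On this reading, a variety with close relatives in the sample shows a wide range starting near zero, whereas a language without relatives shows a narrow range that starts high, which is the pattern the text describes for Mochica.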

5.1 Aymaran

Apart from two very small and rather distinct older branches of the family in central Peru (Jaqaru and Cauqui), Aymara is spoken in a contiguous area in the south of Peru and northwest of Bolivia, with very little internal differentiation, as far as we know. Given the many similarities and limited structural distance


Figure 6.3 NeighborNet representation for the relative distances of the different Andean languages discussed in this chapter


between Quechua and Aymara, questions concerning the relation involve both ultimate relatedness and further contacts between the two families.

Ultimate relatedness. How similar are Quechua and Aymara, and what light does this shed on their genealogical relationship? Both proponents and opponents of such a relationship accept that, if the two are related, their split dates back to at least 2000 BCE, if not much earlier. Adelaar with Muysken (2004: 34ff.) argue that the evidence points to a separate origin for the two language families, and we have no reason to assume differently here. So, unless clear genealogical links between either family and languages spoken to the north were to make a scenario of movement of either language family into central Peru plausible, the question makes little difference for our current knowledge of the early history of the central Andes. From a linguistic point of view, however, it does matter, since it is a test case for the extent to which languages may converge or diverge.

Aymaran, QI, and QII. Adelaar assumes that Quechua coexisted in a (pre-)proto-form with (pre-)proto-Aymara-speaking populations, to which this proto-Quechua adjusted its form. Much later this transformed Quechua began its spread during the Huari horizon, both to the north (to the central highlands, where it became a superstrate on Aymara dialects, and to the coast) and to the southeast (Cuzco). However, this expansion can only have involved QII varieties. There are at least two points on which Aymara and QI differ strongly from QII: directionals in verbal derivation and verbal plural cross-reference marking. The Aymaran system is richer than the QI system, but both have directional and aspectual suffixes. QII uses a number of these suffixes (-yku, -rqu, -rpa), but only one has retained a directional meaning alongside an aspectual one; the others are purely aspectual. This does not look like a real innovation but more like a slow loss of complexity. Plural marking is a parallel case. It is an innovation in the QII dialects to create an extra slot for number marking in the verbal matrix and to get rid of the fairly complex event plurality as it is marked in the Aymara and QI dialects. Since Cajamarca, San Martín, and Yauyos share this innovation we must assume that they were later split-offs from Huari (Ayacucho) varieties rather than early remnants.

Substrate. Another intriguing aspect of the Quechua/Aymara contact is the fact that Aymara may have disappeared without leaving many traces when overlaid by Quechua. One of Adelaar’s arguments for assuming a much wider Aymara presence is the occurrence of a few Aymara lexical elements in the north-central Quechua-speaking area, where Aymara disappeared in time immemorial. A much more recent case is the disappearance of Aymara in southern Bolivia, where it was widely distributed well into colonial times and where it disappeared without leaving noticeable traces, for example in the Northern Potosi area.


Has the spread of QII southward led to Aymaran substrate influence in more southern varieties such as Cuzco and Puno? Regarding Aymaran influence, it has often been assumed that the glottalization and aspiration of initial stops in the QII varieties found in Cuzco and Bolivia (where historical sources suggest the earlier presence of Aymara) may reflect an Aymaran substrate. Likewise, the use of separate lexical dependent clause markers in southern varieties of QII, such as chayqa ‘that’ and hina ‘like’ may possibly be linked to Aymaran influence. However, a systematic exploration of possible syntactic convergence of southern varieties has not yet been undertaken.

5.2 Uru

The Uru languages Chipaya, still spoken, and Uchumataqu, which survived until recently, were spoken in parts of the Altiplano and the Lake Titicaca basin, mostly on the Bolivian side (Adelaar with Muysken 2004: 362–363). There is no evidence of population movements for the group as a whole, although it is clear that the range of communities where Uru languages were spoken was once much wider than the present, quite reduced aquatic zones along the lakes and rivers of the Altiplano. There is evidence of earlier structural influence of Quechua on the Uru languages, and of current Aymara influence (see e.g. Muysken 2000a), suggesting the possibility of metatypy (Ross 1999, 2006). Nonetheless, the Uru languages are clearly distinct from Quechua and Aymara structurally.

5.3 Cholón

Cholón was spoken until fairly recently in the upper Huallaga valley in northern Peru, and together with the related Hibito may have occupied a larger area in the Andean foothills. Alexander-Bakkerus (2005) has documented a number of borrowings from Quechua, both lexical and morphological. In spite of this, Cholón appears as a clearly separate entity in the NeighborNet trees.

5.4 Mochica

Mochica (Muchik) or Yunga was traditionally associated with the Moche or Chimu culture in coastal northern Peru. Mochica is typologically quite distinct from the other languages in the area, and has been linked to the Mayan languages (Stark 1972), without general agreement so far.⁶ Some of the “Mochica” were probably Muchik speakers, but there were multiple competing polities in Moche (Kaulicke 2012). Cerrón-Palomino (1987) has shown multiple contacts between Mochica and Quechua. This may have led to structural convergence, but Mochica remains clearly distinct.

6 A new ERC project (Willem Adelaar, PI) is exploring possible links between Mochica and Meso-America.

5.5 Arawakan, Puquina, and Kallawaya

Although it is still very much debated, for the sake of exposition we will operate on the assumption here that elements of what came to be known as Puquina came from an Arawakan language, with an early presence in the Altiplano. Relations between the Quechuan, Aymaran, and Arawakan language families are very complex and may span the last two thousand years. We may distinguish at least three stages in the interaction. First of all, the possibility has been suggested that Arawakan peoples participated in the culture of Tiahuanacu (300–1000 CE) near Lake Titicaca. However, Tiahuanacu may not have been exclusively Arawakan. Archaeologists Eduardo Machicado, Paul Goldstein, and Sarah Baitzel argue that Tiahuanacu was heterarchical rather than hierarchical, as can be seen in the archaeological remains. There were, for example, different styles of cranial modification. In Tiahuanacu itself there were specialized workshops with different food consumption patterns, suggesting multiethnic constituency. Tiahuanacu expansion was through people, with non-contiguous settlements and several co-existing styles in diasporic enclaves, which suggest that settlements were themselves multi-ethnic. Further north in the Andes, the presence of monkey imagery in Nazca cultural representations (the famous Nazca lines are dated 400–650 CE) may likewise be linked to Amazonian, possibly Arawakan, influence in coastal Peru. Monkeys were not present on the coast, but they appear in the lines in the desert as well as in pottery motifs. It is clear that there were Arawakan words in Cuzco Quechua, like unu ‘water,’ as well as the term for month, -quiz. This suggests an early influence on Inca civilization from Arawakan societies. Arawakan words have also been found in the Chilean coastal language Mapuche, further to the south, and Bertonio’s Aymara dictionary contains words from Puquina. 
Although it is clear that there was a strong Arawakan influence in the Andes starting at least 1 CE, this does not necessarily imply that there were large Arawakan-speaking populations. This is compatible with the idea that Puquina, identified above as one of the important languages in southern Peru and Bolivia in the pre-Inca and Inca periods, contains a substantial Arawakan component but is not a fully Arawakan language. Puquina is most likely one of the important languages of Tiahuanacu. Rodolfo Cerrón Palomino argues that Puquina was the early language of the Incas. His evidence for this comes from the names of the first members of the dynasty, from the names for the rituals in Inca civilization, and from the practice of sun worship. Cerrón Palomino also assumes that the term Colla, now used for the Altiplano Aymaran populations, originally referred to Puquina. Cesar Itier’s work on Amarete Quechua in northern Bolivia may reveal still more traces of Puquina vocabulary. In the colonial period there were intensive relations between the Arawakan language Amuesha and local Quechua in the central Peruvian foothills. Currently, Quechuan and Arawakan Campa languages are in close contact in southern Peru.

Of course the most famous example of Puquina–Quechua interaction is Kallawaya, the lexicon of which contains a number of Puquina elements, and the grammar of which, even though primarily Quechua, also has some unusual features, particularly in the early sources (Muysken 2009). Kallawaya is the (almost) extinct ritual language of the Charazani region (northern Bolivia). Katja Hannss is currently exploring the Kallawaya lexicon, attempting to find more Puquina roots. A typical example of the differences between Kallawaya and Quechua is found in (7), cited from Oblitas Poblete (1968: 40). Bold elements are Quechua endings.

(7)  mii-qa   llalli  oja-ku-j-mi     acha-n   Kallawaya
     runa-qa  allin   miku-(ku-)j-mi  ka-n     Quechua
     man-to   good    eat-re-ag-af    be-3sg
     ‘The man is a very greedy eater.’

Table 6.3 Putative historical development of language use in Kallawaya villages in the Charazani region

                1900      1930        1960              1990
Daily life      Puquina   (Puquina)   Quechua           Quechua
Ritual          Puquina   Kallawaya   Kallawaya         Relexified Quechua or Quechua
With outsiders  Quechua   Quechua     Quechua / Aymara  Quechua Spanish

In our interpretation, Kallawaya may be an example of a language changed by processes of metatypy (through which Puquina was gradually restructured under the influence of Quechua), after which its structural frame was fully replaced by Quechua. Quechua underwent relexification with words from Puquina and other languages. Table 6.3 summarizes a proposed history of the complex historical relationships of the region.

6 Stability in the Quechuan family and links to other languages

The time depth of 2000 years postulated for the Quechua family and the internal differentiation of the family into numerous documented branches encourage us to address the question of stability: To what extent are the structural features of the Quechua languages shared by all branches of the family? Can we discern particular components where more and fewer changes have occurred? The division of Quechuan into several branches is based primarily on lexical, morphemic, and phonological criteria. Altogether, there are only a few aspects of morphosyntactic organization that distinguish the different branches. Quechuan has remained surprisingly stable grammatically, as noted by Parker (1969: 130): “however, the texts available for many dialects show a very high degree of syntactic uniformity except as regards restructuring which has resulted in certain dialects from the borrowing of prepositions and conjunctions from Spanish.” This is illustrated in Table 6.4.

Table 6.4 The forms reconstructed for Proto-Quechua by Parker (1969)

Feature                                                     QI   QII  EcQ  Comments Parker
Head final order: OV, Mod N, N P, Mod A                     x    x    x    Stable Mod N and Mod A order
Case marking with acc/loc/dir/abl/gen/instr/ben             x    x    x    Six cases reconstructed by Parker, ablative potentially reconstructed
Nominalizations agentive / resultative / infinitive         x    x    x    Five forms reconstructed
Nominal suffixes marking possession etc                     x    x         Five forms reconstructed
Person marking on the noun with 1, 2, 3                     x    x         Four forms reconstructed
Person marking on the verb with 1, 2, 3, 4                  x    x    (x)  Four forms reconstructed
Verbalizers on the noun                                     x    x         Two forms reconstructed
A complex system of verbal derivational markers             x    x         Twenty-three forms reconstructed by Parker
Tense marking partly linked to person (future),             x    x    x    Decomposed by Parker
  or preceding person (past)
Switch reference marked on the verb                         x    x    x    No system reconstructed as such by Parker
Conditional marking following the person markers            x    x    x    Reconstructed
Clitics marking evidentiality, discourse information, etc.  x    x    x    Ten forms reconstructed

Note: the rows “Verbalizers on the noun” and “A complex system of verbal derivational markers” share a single x in the EcQ column; the layout of this copy does not show which of the two rows it belongs to, so both cells are left blank here.

Thus it is striking that on the whole so many features can be reconstructed in the family. Sometimes, the actual forms have changed, but, as we indicated above, the more abstract underlying categories persist. Several explanations can be given for this:

(a) high population density in the region, and subsequently intensive exchange and trade, keeping branches in contact;
(b) internal movements of Quechua peoples within the Inca state, as the result of resettlement policies;
(c) structural similarities at the outset between Quechua and Aymara and possibly other languages, and hence little syntactic change when an Aymaran-speaking population shifted to a Quechuan language, or vice versa, leading to many reconstructable features.⁷

To discuss this last possibility we may ask ourselves to what extent the most stable features in Quechua have counterparts in Aymara and the other Andean languages, which may have consolidated these features (see Table 6.5). The fact that we find structural overlap with Cholón, Uru-Chipaya, and Puquina may also reflect the influence of Quechuan on these languages. In the case of Cholón this is reflected in the morphological borrowing of additive -pit (Q -pis), apparently with the same range of meanings as in Quechua.

Table 6.5 Features in more than twelve of the seventeen Quechuan varieties in our database, as compared to their occurrence in Aymaran and other Andean languages

Columns: Aymaran, Uru-Chipaya, Puquina, Cholón, Mochica
Rows (features in >12 Quechuan varieties):
  Switch reference subordinate clauses
  Additive suffix in indefinites
  Agentive nominalizer in past habitual
  Comitative and Noun Phrase-conjunctions
  3rd person imperatives
  Nominal plural -kuna
  Agentive nominalizer in purposives
  Additive suffix in concessives
[The +, ± and ? cell values of this table are scrambled in this copy and cannot be assigned to rows and columns.]

7 Conclusions

We hope to have clarified the issue of the origin and spread of the languages of the Andes by exploring their similarity in terms of structural properties. A number of conclusions can be drawn. First, the Quechua language family is quite coherent and stable in many respects. Second, there is structural evidence for a QI/QII split rather than for a more wave- or network-like configuration within the family. Third, the Aymaran languages are clearly set apart from Quechuan varieties as a group, but, at the same time, the structural distance between the northern varieties of Quechuan and those of most of Peru and Bolivia is larger than that between Aymaran and e.g. Quechua I. The other Andean languages clearly have separate structural profiles, even though they have undergone influence from Quechuan and Aymaran.

7 Several cases of such shifts have been documented (Torero 1987).

Appendix

Table 6.6 Quechua and Andean feature questionnaire

Quechua questionnaire features (Quechua nrs. = A nrs.)

  1  What is the nominal plural marker?
     1=-kuna; 2=-zhapa
  2  What is the pronoun that expresses 1PLEXCL?
     1=ñoqayku; 2=ñoqa-kuna; 3=ñoqa-sapa; 4=there is no opposition
  3  Is the ‘Additional’ suffix found in concessive adverbial clauses?
     0=no; 1=yes
  4  Is the form of the genitive and the benefactive identical (-paq) or different (-pa/-paq)?
     1=identical; 2=different
  5  What is the form of the ablative?
     1=-pi(q)ta; 2=-manta; 3=-paq
  6  What is the form of the locative?
     1=-chaw/-chu; 2=-pi; 3=-pa
  7  What is the form of 1SG possessive?
     0=absent; 1=-y; 2=-yni; 3=-ni
  8  Infinitival complements (Action nominals) are marked with (type mikhuyta/mikunata munani)
     1=-y; 2=-na
  9  What is the form of the reflexive marker?
     1=-ku; 2=-ri
 10  What is the form of the reciprocal causative?
     1=-na-chi-; 2=-na-ka-chi-
 11  Is SS marked with -spa (1) or -r (2)?
     1=-spa; 2=-r; 3=both
 12  Is DS marked with -qti (1) or -kpi (2)?
     1=-qti; 2=-kpi
 13  Are there other switch reference markers than -spa/-r and -qti/-kpi (extended system)?
     0=no; 1=yes
 14  Does the language have productive gerund marked with -stin?
     0=no; 1=yes
 15  What is the form for the narrative?
     1=-ñaq; 2=-shqa
 16  What is the form for the potential 2SG?
     1=-ykiman/-nkiman; 2=-waq; 3=both
 17  What is the form for the potential 1PLINCL?
     1=-nchikman; 2=-chwan; 3=both
 18  What is the form for the durative marker?
     1=-yka; 2=-chka; 3=-ku (or reflexes)
 19  What is the form of 1sgSU Main Tense?
     1=-:; 2=-ni; 3=-y
 20  What is the form of 2SGPAST?
     1=-rqayki; 2=-rqanki
 21  What is the form of 1sgOB?
     1=-ma; 2=-wa
 22  What is the form of 3sg>2sgPAST?
     1=-sur(q)anki; 2=-r(q)asunki; 3=none
 23  What is the form of 1sg>2sgPRES?
     1=-yki; 2=-q; 3=none
 24  What is the form of the verbal plural marker?
     0=absent; 1=-sapa; 2=-pa:ku; 3=-ri/-rka; 4=-naku; 5=-kuna; 6=-ku; 7=different
 25  What is the form of the “equational” marker?
     1=-nuy/-naw; 2=-hina

Andean questionnaire features (Andean nrs. = A nrs.)

101  Is the “Additional” suffix obligatory in indefinites?
     0=no; 1=yes
102  Is the marker for the Causee in three place derived causative constructions identical to Comitative?
     0=no; 1=yes
103  Is the Agent nominalization used in predicative construction to express habitual?
     0=no; 1=yes
104  Is the Agent nominalization used as a complement with motion verbs to express purpose clause?
     0=no; 1=yes
105  How many different (formal) nominalizers does the language manipulate?
     0,1,2,3,4,5
106  Is the marker for NP conjunction identical to Comitative?
     0=no; 1=yes
107  Does the language have a class of verbal affixes expressing direction (up, down, inward)?
     0=no; 1=yes
108  Does the language have switch reference subordinate clauses?
     0=no; 1=yes
109  Does the language have 3rd person imperatives/hortative?
     0=no; 1=yes
110  Is the hortative (let’s X) the same as 1PLINCLFUT?
     0=no; 1=yes
111  Are verbal plural markers attached to person markers?
     0=no; 1=yes
112  In counting above 10 the possession marker is used
     0=-ni (Aymara); -yoq (Quechua); 1=other
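
Coded questionnaire answers of the kind listed in Table 6.6 can be stored per variety and compared pairwise. The following is a minimal sketch, not the procedure used in the chapter: it computes the share of mismatching answers between two varieties (a relative Hamming distance) over the features coded in both. The variety names and all coded values are invented for illustration, and treating every answer as an unordered category is an assumption.

```python
from typing import Dict, Optional

# Keys are questionnaire numbers from Table 6.6; values are coded answers.
# None marks a feature without data. All codings below are hypothetical.
Coding = Dict[int, Optional[int]]

VARIETIES: Dict[str, Coding] = {
    "variety_A": {1: 1, 5: 1, 11: 2, 12: 1, 21: 1, 108: 1},
    "variety_B": {1: 1, 5: 2, 11: 1, 12: 1, 21: 2, 108: 1},
    "variety_C": {1: 2, 5: 2, 11: 1, 12: 2, 21: 2, 108: None},
}


def feature_distance(a: Coding, b: Coding) -> float:
    """Share of mismatching answers among features coded in both varieties."""
    shared = [
        f for f in a
        if f in b and a[f] is not None and b[f] is not None
    ]
    if not shared:
        raise ValueError("no comparable features")
    mismatches = sum(1 for f in shared if a[f] != b[f])
    return mismatches / len(shared)


if __name__ == "__main__":
    names = sorted(VARIETIES)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            d = feature_distance(VARIETIES[first], VARIETIES[second])
            print(f"{first} vs {second}: {d:.2f}")
```

With these invented codings, variety_A and variety_B differ on three of their six shared features, giving a distance of 0.5. A matrix of such distances is the kind of input a network or tree method would take; which distance measure fits the data best is a separate question that this sketch does not settle.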

7 The Arawakan matrix

Love Eriksen and Swintha Danielsen

This chapter investigates the cultural and linguistic characteristics of the ethnolinguistic groups in the Arawakan language family, particularly relating to situations of contact and exchange within and outside the family. In 1492, Arawakan languages were distributed from the Greater Antilles in the north to the Gran Chaco area in the south, and from the Amazon River mouth in the east, to the eastern Andean slopes in the west. The Arawakan languages expanded successfully across the South American continental land mass during pre-Columbian times as part of a powerful cultural complex characterized by intensive contact and exchange with neighboring groups: the Arawakan matrix, which this chapter aims to investigate and map. Geographic Information Systems (GIS) and various phylogenetic methods are used to explore the spatial and temporal distribution of cultural and linguistic features of Arawakan-speaking people, to gain a more complete picture of their expansion. The chapter also adds to our current theoretical knowledge about the sociocultural mechanisms of the Arawakan diaspora and the spatial distribution of particular linguistic features characteristic of the Arawakan language family.

1 Introduction

The study of the expansion of Arawakan languages across prehistoric Amazonia has much to gain from the integration of linguistic (Danielsen) and archaeological (Eriksen) perspectives. Previous studies of the Arawakan language family have revealed that its members possess not only highly characteristic lexical and structural features (see Payne 1991; Aikhenvald 1999a; Danielsen et al. 2011) but also a set of cultural features clearly distinguishing them from their indigenous Amazonian neighbors (cf. Hill and Santos-Granero 2002; Eriksen 2011). In order to understand the means and timing of the Arawakan expansion, it is therefore necessary to integrate findings from ethnography, archaeology, and linguistics.

The Arawakan linguistic database was created partly with support from and in interaction with the LinC (Languages in Contact) project at the Radboud University Nijmegen.

Investigating the timing of the expansion of the Arawakan language family in Amazonia is more difficult than mapping the expansion of archaeological cultures and associated language families in e.g. the Pacific, where the Austronesian languages and the material culture of the speaking communities can be nicely plotted from island to island as the communities migrated across the Pacific, carrying with them material culture as well as language (Gray and Jordan 2000). In contrast, the lexical, structural, and cultural features of the Arawakan family had to be navigated through a cultural landscape already fully populated by such features belonging to other ethnolinguistic entities, making constant negotiations and renegotiations between the speakers an unavoidable component of the Arawakan expansion. Strikingly, the geographic distance between Arawakan languages only predicts 7 percent of the typological distance between them (the so-called “isolation by distance” measure), indicating that there were contacts between members of the family until fairly recently. The time depth of the ultimate diversification cannot be very great (Danielsen et al. 2011: 183f). This means that the Arawakan languages expanded relatively late in the prehistoric sequence, i.e. during a period when Amazonia had long since experienced advanced ceramic manufacture, intensive crop cultivation, and hierarchical social organization (see below).

The title of this chapter, the Arawakan matrix, refers to the set of cultural features – material as well as non-material – identified in multidisciplinary investigations of Arawak-speaking societies as the set that “constitutes simultaneously the background, framework, and source of information that informs the sociocultural practices of the members of a given language family” (Santos-Granero 2002: 42). The term was first coined by Santos-Granero (2002: 42ff.)
to refer to a set of five key Arawakan non-material cultural features in societies across Amazonia (Map 7.1):
(1) suppression of endo-warfare,
(2) a tendency to establish sociopolitical alliances with linguistically related groups,
(3) a focus on descent and consanguinity as the basis of social life,
(4) the use of ancestry and inherited rank as the foundation for political leadership, and
(5) an elaborate set of ritual ceremonies that characterizes personal, social, as well as political life.
By conducting a large-scale GIS-mapping of pre-Columbian material culture across Amazonia, Eriksen (2011: 9) was able to add four additional points, linked to material culture, to the list:
(6) various types of high-intensity landscape management strategies as the basis of subsistence (cf. Hill 2011),
(7) a tendency to situate their communities in the local and regional landscapes through the use of such techniques as “topographic writing,” ceremonial earthworks, extensive systems of place-naming, or rock art (cf. Santos-Granero 1998),
(8) an elaborate set of rituals including a repertoire of sacred musical instruments and extensive sequences of chanting, often performed as part of place-naming rituals (cf. Hill 2007),
(9) a proclivity to establish settlements along major rivers and to establish trade and other social relations through river transportation (cf. Hornborg 2005).

The current investigation seeks to map the timing of the expansion of these nine features, alongside a similar mapping of the linguistic features of Arawakan languages, thus seeking a detailed, multidisciplinary understanding of the Arawakan expansion. A linguistic database of Arawakan features was created by Danielsen using complex linguistic questionnaires. Our central theoretical assumption is that the best way to explain the interplay between non-material culture (points 1–5 above), material culture (points 6–9 above), and language features is to view these all as part of one single phenomenon: the ethnic identity of Arawak-speaking communities. Ethnic identities and ethnicity in indigenous Amazonia have recently been explored as an important interdisciplinary field of research (Hornborg and Hill 2011). This involves the formation and renegotiation of Amazonian ethnic identities – the concept of ethnogenesis – (Hill 1996; Hornborg 2005; Hornborg and Eriksen 2011), and results in a new understanding of the multitude of ethnic identities in Amazonia and their role in situations of contact and exchange between indigenous groups.
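
The “isolation by distance” figure cited in this section relates geographic distance between languages to typological distance between them. One way to read “predicts 7 percent” is as the squared correlation between the two sets of pairwise distances; that reading, and everything in the sketch below, is an assumption rather than the method of Danielsen et al. (2011). The language names, coordinates, and feature vectors are invented.

```python
import math
from itertools import combinations

# Hypothetical languages: (latitude, longitude) and a made-up binary
# feature vector. None of these values come from the chapter's database.
LANGS = {
    "lang_A": ((-14.8, -64.9), [1, 0, 1, 1, 0, 1]),
    "lang_B": ((-10.0, -67.8), [1, 0, 1, 0, 0, 1]),
    "lang_C": ((2.5, -67.6), [0, 1, 1, 0, 0, 1]),
    "lang_D": ((5.8, -55.2), [0, 1, 0, 0, 1, 0]),
}

EARTH_RADIUS_KM = 6371.0


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def typological_distance(u, v):
    """Proportion of features on which two equally long vectors differ."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have equal length")
    return sum(1 for x, y in zip(u, v) if x != y) / len(u)


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


pairs = list(combinations(LANGS, 2))
geo = [haversine_km(LANGS[a][0], LANGS[b][0]) for a, b in pairs]
typ = [typological_distance(LANGS[a][1], LANGS[b][1]) for a, b in pairs]

r = pearson_r(geo, typ)
r_squared = r * r  # share of typological distance "predicted" by geography

if __name__ == "__main__":
    print(f"r = {r:.3f}, r squared = {r_squared:.3f}")
```

Because pairwise distances are not independent observations, a significance test on such a correlation is usually done by permutation (a Mantel test) rather than with ordinary regression statistics; the sketch only computes the descriptive value.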
Here, the concept of ethnogenesis is used as a tool to understand the process by which components of the Arawakan matrix spread to new groups through sociocultural, material, and linguistic exchange, thereby integrating other Amazonian groups into the Arawakan identity. This process inevitably led to the incorporation of new cultural and linguistic elements among Arawak-speaking communities, and ultimately to a renegotiation of the Arawakan cultural and linguistic identity through small, constant changes and updates of the cultural matrix.

2 The ecology of the Arawakan expansion

2.1 The Amazonian pioneers

Since the first voyage of Christopher Columbus and his followers, South American landscapes, and particularly the Amazon region, have been thought of as the ultimate example of pristine wilderness, encompassing a unique example of rich biodiversity with little human disturbance. Informed by research in anthropology, archaeology, historical ecology, and soil science since the 1980s, the scientific community has slowly adjusted this image towards a view encompassing substantial human influence in the species composition of the world’s largest area of tropical rainforest. Since the discovery of Balée (1993: 231) that up to 12 percent of the Amazonian ecosystem is of anthropogenic origin, scholars have noted that much of the “pristineness” of Amazonian forest is actually an effect of the demographic collapse of the indigenous populations following in the wake of the European colonization. Furthermore, archaeological investigations reveal large-scale earthworks, subsistence systems, and settlements, confirming the hypothesis that the sparsely populated Amazonia of the historical period is a relatively recent anomaly when contrasted to the socio-economic development of the region during the last 3,000 years.

Human subsistence strategies in Amazonia have for at least 9,000 years involved domesticated crops (Piperno and Pearsall 1998: 4; Oliver 2008: 208). When small bands of hunter-gatherers at that time began the domestication of bitter manioc (Manihot esculenta Crantz), it marked the starting point of a landscape modification process that was to continue until the demographic collapse following the European colonization some 8,500 years later. By 7000 BP¹ the indigenous societies along the lower Amazon and the present Brazilian Atlantic coastline were producing ceramic vessels and shell middens, forming the earliest centers of pottery production in the New World. Along the middle and lower Amazon, the archaeological sites of Dona Stella, Pedra Pintada, and Taperinha show initial signs of horticultural activities between 8000 and 7000 BP (Roosevelt et al. 1996; Piperno and Pearsall 1998: 4; Petersen et al. 2004), and at Taperinha forest clearing is indicated by 7000 BP, and clearly documented at Lake Geral, a site located in the same region dated to 5760 BP (Bush et al. 2000). These early signs of food production and accomplished material culture soon spread through a regional exchange system operating along the coastline between the mouth of the Amazon and Orinoco Rivers (Eriksen 2011: 127f). Along the coastline of present-day Colombia, Venezuela, and Guyana, the location of a number of shell mounds with a characteristic lithic assemblage labeled the Ortoiroid series, sharing similarities with the above-mentioned sites of the Amazon river region, indicates the establishment of a wide-reaching exchange system already at this point in time (Boomert 2000: 74). Sometime between 6500 and 5250 BP, the art of ceramic manufacture was exchanged between the lower Amazon and the Guyana coastline, an event marked by the establishment of the Late Alaka phase of the latter area (Evans and Meggers 1960; Roosevelt 1997: 360; Plew 2005: 13). These two areas continued to share similarities when the Mina phase (5500–4000 BP), another archaeological complex producing crude ceramics and shell mounds, was established along the lower Amazon and at the coastline south of the river mouth (Simões and Araujo-Costa 1978; Roosevelt 1997).

Without losing ourselves in details of the early indigenous material culture, subsistence strategies, and exchange systems of northern South America, it is safe to say that many of the early accomplishments in these socio-economic categories took place through the sharing of important achievements between different groups separated by rather large geographic distances, a mechanism in itself indicative of the character of the future to come.

1 “Before present,” i.e. years before 1950 according to international standards for the calibration of C14 dates derived from the radiocarbon method.
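
Since BP dates are counted back from 1950, an already calibrated BP date can be turned into a BCE/CE label with plain arithmetic. The sketch below shows only that offset; it does not perform radiocarbon calibration, and the function name is a hypothetical one chosen for illustration. The one subtlety is that the historical calendar has no year 0, so 1 BCE directly precedes 1 CE.

```python
def bp_to_calendar(bp_years: int) -> str:
    """Convert calibrated years BP (before 1950 CE) to a BCE/CE label.

    This is a fixed offset only; uncalibrated radiocarbon ages must first
    be calibrated against a calibration curve, which is not done here.
    """
    if bp_years < 0:
        raise ValueError("BP values count backwards from 1950 and cannot be negative")
    year = 1950 - bp_years  # astronomical numbering, which includes a year 0
    if year > 0:
        return f"{year} CE"
    return f"{1 - year} BCE"  # shift by one because there is no year 0


if __name__ == "__main__":
    for bp in (7000, 5760, 1950, 0):
        print(bp, "BP =", bp_to_calendar(bp))
```

Under this offset, 7000 BP corresponds to 5051 BCE and 5760 BP to 3811 BCE.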

2.2 The birth of the Arawakan matrix

The complexity of pottery production grew steadily along the Guyana coastline and the Orinoco River, a process leading to the establishment of technologically more complex and stylistically elaborated wares in the form of the Saladoid² and Barrancoid³ series along the Orinoco River by around 3000 BP. By that time, agriculture also was substantially intensified in the same region. Along the Orinoco, the Saladoid and Barrancoid producing societies developed a technology for soil fertilization based on the addition of ash, charcoal, and domestic waste to the soil, with increased microbial activity and improved fertility as the result (Oliver 2008: 211; see also Arroyo-Kalin et al. 2009 for technical specifications). This process created sustainable conditions for high-intensity food production, and the black, fertile soils (also known as terras pretas or Amazonian Dark Earths [ADE]) spread widely across Amazonia between 900 BCE and 1500 CE (Lehmann et al. 2003; Glaser and Woods 2004; Woods et al. 2009). Apart from the addition of charcoal and ashes to the agricultural lands, burnt tree bark (Licania sp.) was also being added to the ceramics as a potent tempering material for increased solidity of the vessels. West of the Orinoco River, on the seasonally sedimentary soils of the flooded savannas of the Llanos, proper drainage of the soils was a bigger challenge than lack of available nutrients. In this area agricultural intensification took place through the construction of elevated cultivation surfaces, so-called raised fields or camellones, improving agricultural conditions by elevating parts of the otherwise flat landscape for the improvement of soil conditions, drainage, water management, and nutrient production in order to stimulate agricultural productivity (Denevan 1970; Darch 1983; Erickson 2006: 251).

The refinements of agricultural technologies and pottery production were not isolated technological advancements, but, as we will argue below, part of a cultural package that was just beginning its march across Amazonia. Interestingly, the cohesive links of this cultural package were not the technological advancements or the surplus production (even though they were both intrinsic parts of it) but language, and more particularly languages of the Arawakan language family. At the time of European contact⁴ at least sixty Arawakan languages were spoken from the Greater Antilles in the north to northern Argentina in the south, and from the mouth of the Amazon in the east to the eastern Andean slopes in the west (Grimes 2009 lists fifty-nine documented Arawakan languages, not including several extinct ones) (Map 7.1). Arawak-speakers across South America and the Caribbean are united by two main factors: (1) the genealogical relationship of their languages, i.e. their descent from a common proto-language, and (2) a shared set of cultural features, i.e. both material and non-material attributes.

By 1500 CE the Arawak-speaking groups inhabited numerous seasonally flooded environments of the South American tropical lowlands with raised field agriculture or similar technologies, including the Llanos of Venezuela and Colombia (Spencer and Redmond 1992), the Llanos de Moxos of Bolivia (Erickson 2006), Marajó Island at the mouth of the Amazon (Schaan 2008), and the Guyana Littoral (Versteeg 2008). They were also known as the moderators of an elaborate set of ritual ceremonies with the use of sacred musical instruments and chanting as essential ingredients (Izikowitz 1935; Hill 2009). Apart from this, Arawak-speakers such as the Taino of the Greater Antilles, the Lokono of the Guyana Littoral, the Manao of the central Amazon, the Achagua and Caquetío of the Llanos, and the Moxo of the Llanos de Moxos (just to name a few) were well-known traders carrying out exchange between various ethnolinguistic groups (Eriksen 2011: 275).

2 The Saladero phase dates to approximately 1300 BCE, i.e. 3000 BP (Roosevelt 1997; Boomert 2000).
3 The Barrancas and Isla Barrancas phases date to approximately 900 BCE, i.e. 2800 BP (Cruxent and Rouse 1958, 1959; Sanoja 1979; Sanoja and Vargas 1983; Barse 1989; Oliver 1989; Roosevelt 1997; Boomert 2000; Gassón 2002). The discrepancy of the BP-dates in footnotes 2 and 3 is due to the non-linear correspondence between BCE/CE and BP in the C14 calibration curves. This phenomenon is in itself an effect of uneven amounts of solar radiation hitting the earth during different time periods.
4 The Arawak-speaking Taino of the Greater Antilles were actually the first indigenous group encountered by Columbus on his first voyage (Rouse 1993).

2.3 The timing of the geographic expansion of the Arawakan matrix

The main ingredients of the Arawakan cultural package were first being brought together in the Orinoco region around 900 BCE. By that time, we find high-intensity landscape management systems in the form of raised fields and terras pretas for agricultural production and causeways for transportation, water management, and possibly also including ritual functions. Also present were ceramic artifacts rich in painted and plastic decoration, that is to say features indicative of a rich ceremonial life similar to that documented from Arawak-speaking communities of the historical period (Santos-Granero 1998; Heckenberger 2008; Hill 2011). Interestingly, the rich ceremonial life of the Arawak-speakers of the historical period was essentially constructed around two themes: (1) the presence of elaborate techniques for physically and ritually domesticating their surrounding landscapes, and (2) the use of fire in processes of landscape domestication and during other types of ritual ceremonies. As described above, by-products of fire such as charcoal and ashes were an essential part of the subsistence strategies and ceramic manufacture of the Orinoco Region already by 900 BCE. The elaborately decorated ceramics of the Saladoid and Barrancoid series act as indicators of the elaborate ceremonial life of the Orinoco communities, and, interestingly, the rich ethnographic record of Arawak-speaking communities across Amazonia shows that the use of tobacco smoke by Arawakan shamans is considered an essential aspect of ritual ceremonies, including healing processes. Thus, an image of a cultural package capable of transforming the landscape into a highly productive resource, while at the same time providing a powerful ceremonial life for its members, now arises through the archaeological, anthropological, geological, and historical records.

The combination of high-intensity landscape management strategies and a rich ceremonial life would prove to be highly successful during the centuries to follow. By 400 BCE the first evidence of terra preta farming appears in the central Amazon, and shortly thereafter the first earthworks of the Llanos de Moxos represent the initial signs of landscape modification in this region.
Judging by the great ecological differences between the habitats colonized through the subsistence strategies of this emergent regional system, the communities involved in this process had a great capacity for adaptation. In much of the central and lower Amazon, large surfaces of fresh, nutrient-rich sediments washed down annually from the Andes were available, creating excellent conditions for highly productive agriculture. These so-called várzeas eliminated the need for raised fields or terras pretas in many areas of central Amazonia and, interestingly, Versteeg (2008: 305) notes how the raised fields can be compared to artificial várzeas that are subjected to controlled inundation, bringing nutrient-rich sediments to the elevated surfaces during parts of the year. By 200 BCE Barrancoid pottery and terra preta farming were present along the Ucayali River in the Peruvian Amazon (Lathrap 1970: 117; Eden et al. 1984: 126), and at around 100 BCE archaeological dates of the huge earthwork complex of Acre, northwest of the Llanos de Moxos, begin to cluster (Saunaluoma 2010: 106). The geometrical earthworks of Acre, also known as geoglyphs, are perhaps the most visually stunning example of the ceremonial aspects of earth-moving that had been crystallizing across Amazonia from about 900 BCE. The Acre geoglyphs so far discovered consist of more than 200 (an estimated 10 percent of the total number [Mann 2000]) geometrical figures carved out of the soil by ditches and walls extending up to 3 meters deep and 11 meters wide. The earthworks measure up to 300 meters across and occur at densities of up to 5 geoglyphs per km² (Hornborg et al. 2013). The cultural associations of these earthworks are so far unknown, but the presence of pottery related to the Barrancoid series (Saunaluoma 2010: 94), their dating, and the engineering skills and focus on soil moving among the Arawak-speaking communities on the nearby savannah of the Llanos de Moxos make an association with the Arawakan cultural complex highly plausible. As for the functions of the geoglyphs, little is known, but suggestions that they were used for fortification purposes have been made. Although this may be true for the circular structures, particularly those dated to the late pre-Columbian period when military conflicts were expanding across the lowlands (see below), it is unlikely that the communities erected up to five elaborately constructed geometrical structures per square kilometer, with low ditches of little or no defensive capability, because they feared an external threat in the form of spears, arrows, or war clubs. By contrast, the fortified villages of the Arawakan cultural complex documented from the late pre-Columbian periods are semicircular structures, located on high ground, adapted to the local topography and surrounded by palisades (Rebellato et al. 2009). Interestingly, new research in the upper Xingú area, another region of Amazonia populated and culturally dominated by Arawak-speakers, is finding support for the notion that Arawak-speakers sometimes devoted themselves to strictly ceremonial domestication of their landscapes.
During the late pre-Columbian period, the upper Xingú area had developed integrated patterns of centers organized in multi-ethnic, or “galactic,” clusters populated by up to 2,500, and perhaps as many as 5,000, inhabitants (Heckenberger 2006: 330; 2008: 955). These multi-ethnic confederations, referred to as an early form of urbanism by Heckenberger et al. (2008), were integrated by wide roadlike causeways resembling the elevated causeways of the Llanos de Moxos, which facilitated cultural, linguistic, and material exchange within and between regions. The landscapes were domesticated by the creation of circular villages with a central plaza and radial road networks with perfectly straight passages connecting a multitude of such population centers to each other in a regional system. In Arawak-speaking areas where earthworks and other landscape-altering techniques were less prevalent, other strategies of landscape domestication were employed. The Yanesha’, an Arawak-speaking group of the eastern Andean slopes in present-day Peru, apply an intricate system of “topographic writing” in order to maintain an intimate relationship to their landscape (Santos-Granero 1998). Topographic writing is the concept Santos-Granero uses to describe how individual place names (topograms) are connected to extensive systems (topographs) and reiterated, for instance through chanting in ritual ceremonies, in order to strengthen the ties to the local and regional landscape (p. 128). Such ritual place-naming is also well documented among the Arawakan peoples of the northwest Amazon (Vidal 2000, 2002; Hill and Chaumeil 2011; Wright 2011) and among Arawakan groups in southern Amazonia such as the Paresi (Schmidt 1917: 21f). Santos-Granero (1998: 132, 139) refers to the landscape-domesticating process of topographic writing among the Yanesha’ as a form of proto-writing, also present among other Amazonian groups such as the Páez (to which it likely diffused through contact with nearby Arawak-speaking communities), a linguistic isolate between the Marañon and Napo Rivers, and the Arawak-speaking Kurripako (Wakuénai) of the northwest Amazon (Hill 1996: 153f; 2002: 235f; 2009: 250). Returning to the archaeological material, by 300 CE a new ceramic style, labeled the Amazonian Polychrome tradition, had developed out of Barrancoid material along the middle and lower Amazon (Hilbert 1968; Lathrap 1970: 156f; Eden et al. 1984: 137; Myers 2004: 79; Petersen et al. 2004: 9; Eriksen 2011: 181). At the time of European contact, the Arawak-speakers of Marajó Island at the mouth of the Amazon, the Aruã, were still manufacturing an undecorated variant of the Marajoara phase (one of the most elaborate phases of the Amazonian Polychrome tradition),⁵ labeled the Aruã phase (Brochado and Lathrap 1982: 53). Once again, the significance of burning is reflected in the anthropomorphic burial urns typical of the Amazonian Polychrome tradition, an inventory indicating secondary urn burials in which the cremation of the corpse and the storing of the ashes in the urn were central components.
In many instances, even the pottery itself included ashes in the form of caraipé temper utilized in the Ipavu phase in the upper Xingú (Heckenberger 1996: 136f), the Guarita phase in the middle Amazon (Petersen et al. 2003: 252), the Mazagão phase in Maracá (Meggers and Evans 1957: 596), the Koriabo phase of the Guyanas (Boomert 2004: 259), and, together with crushed sherds, in Marajoara (Brochado and Lathrap 1982: 50). The burial urns, also well known among the Arawakan peoples of the northwest Amazon, those of the Llanos de Moxos (the Moxo and the Baure) (Mann 2000; Métraux 1948c), and Arawakan groups in the Bolivian Chiquitanía such as the Paunaka (Métraux 1948b), were often stored in caves that could be visited and inspected (Chaumeil 2007: 250ff), indicating close ties to the ancestors and the importance of maintaining a strong relationship to deceased relatives, shamans, and political leaders, a custom typical of the Arawak-speaking communities of the historical period. Another way of maintaining a close link to the ancestors was through ritual consumption of their remains, as illustrated by the Arawak-speaking Guayupe and Sae, who cremated their ancestors and drank their ashes mixed with beer (Kirchhoff 1948: 387f.). The process of burning was also a central aspect of other Arawak-moderated rituals performed in their sphere of influence. At religious ceremonies performed in the northwest and southwest Amazon and in the upper Xingú area, tobacco smoke is a central element in healing ceremonies conducted by the Arawakan shamans, who blow the tobacco smoke on the patient’s body in order to eliminate the patient’s symptoms (Hill 2009: 249, 259; Hill and Chaumeil 2011). The blowing of tobacco smoke on patients in shamanic ceremonies has also been reported for the Paunaka, a Bolivian Arawakan group (Danielsen, own observation), and it is also the way the shaman gets in contact with the spirits of the deceased among the Baure (Riedel 2012). According to Goldman (1948: 789), smoke was also blown during funerals, reflecting the association between this custom and the importance of deceased ancestors. Via the historical and contemporary ethnographical sources, we find another interesting connection between, on the one hand, shamanistic blowing of smoke, and, on the other, the ritual wind instrument also utilized by Arawakan shamans during religious ceremonies. Among contemporary Arawak-speaking communities of the northwest Amazon, the upper Purús River (the Apurinã), and the upper Xingú area, ritual wind instruments play a central role in annual ceremonies and during rites de passage. Apart from the apparent association to shamanic blowing, the sacred flutes of the Arawakan people also had a striking connection to landscape and fire worth exploring further.

5 Brochado and Lathrap (1982: 51) at one point describe Marajoara as “one of the most complex art styles of the world.”
According to the legends of the Arawak-speakers of the northwest Amazon, the earth was created from the remains of a mythological hero, Kúwai, after his body had been destroyed in a fire. Besides being the ancestor from whose body the world of humans was created, Kúwai also provided material for the ritual wind instruments used in religious ceremonies. The Yuruparí flutes are artifacts directly derived from the bones of the mythological hero and thus representatives of the ancestors (Steverlynck 2008: 580). In the words of Robin Wright (2011):

After his [Kúwai’s] sacrificial death in an enormous conflagration, from the ashes of his body emerged the sickness-giving spirit Iupinai but also a giant tree from which the sacred flutes were made, and it is with these flutes that traditionally the men initiated boys and girls in the major rituals held at the beginning of the rainy season.

Overall, the sacred wind instruments of the Arawakan people were one of their most central characteristics. Sacred flutes have been known to occur among a number of Arawak-speaking groups of the northwest Amazon, including the Achagua, Baniwa, Bare, Cabiyarí,⁶ Kurripako, Maipure, Yucuna,⁷ Pasé,⁸ Resígaro, and Yumana, and they also occur among neighboring non-Arawak-speaking groups who maintain close sociocultural contact with the Arawakan groups (Chaumeil 2007; Wright 2011). Chaumeil (1997, cited in Steverlynck 2008: 579) points to the connection between the sacred flutes complex of the northwest Amazon and the use of ceremonial trumpets by Taino shamans of the Greater Antilles. Chaumeil (2007: 265) notes how Arawak-speaking groups dominate the sacred flutes complex throughout Amazonia, and Wright (2011) identifies the sacred flutes as an important element in the expansion of Arawakan languages. Arawak-speaking groups located outside of the northwest Amazon who also use sacred flutes include the Apurinã of the Purús River; the Baure and Moxo in the Llanos de Moxos; the Paresi further west; and the Mehinaku in the upper Xingú. Other groups belonging to the same complex include some Tupian-speaking groups such as the Cocama, Omagua, Mundurukú, Tupinambá, and Kamayurá. In the upper Xingú, the complex also spread to the Carib-speaking Kalapalo and Bakairí, who were “Arawakanized” by their Mehinaku, Kustenau, Yawalapití, and Waurá neighbors (Chaumeil 2007: 266). During female initiation rites among the Arawak-speaking Kurripako (Wakuénai) in the northwest Amazon, the sacred flutes are used during chanting ceremonies lasting up to six hours, in which an enormous series of place names along the rivers of northern South America is reiterated (Hill 1996: 153f; 2002: 235f; 2009: 250). These place names represent nodes in an exchange system once dominated by Arawak-speakers, but they are also part of a geographic network with strong mythological connotations.
This exchange network, known as the Kúwai route (borrowing its name from the creator), represents both a trade network constructed on the basis of physical travels over centuries and a collection of mythological places where Arawakan shamans head on their transcendental journeys during séances. Thus, the sacred flutes, the Kúwai routes, and the relationship to the ancestors and the mythological past form a trinity of inseparable components that collectively contributed to a strengthened identity and sociopolitical status of the Arawak-speaking communities. As in many indigenous cultures around the world, for Arawak-speaking communities the physical and religious aspects of the landscape form a whole. The landscape functions as a single meaningful unit, steadily present in the life and minds of its inhabitants. However, the Arawakan domestication of the landscape was not only meaningful to the Arawakan people themselves but also part of a vast socio-religious and economic exchange system that affected the lives of all inhabitants of northern South America between 1000 BCE and 1000 CE. Together with intensive agriculture, an effective exchange system, and an advanced sociocultural and religious concept based on social hierarchies and ancestor worship, Arawakan languages (which formed intrinsic parts of these three concepts) expanded across an enormous geographic territory on mainland South America and in the Caribbean. The diversity and the power of the Arawakan groups led to the complete or partial adoption of the Arawakan cultural matrix and associated languages by many indigenous groups between 1000 BCE and 1000 CE. At the time of the European arrival, Arawakan languages were widely spoken and northern South America showed extensive domesticated landscapes (Map 7.1). Arawakan culture also came to influence the Andean region substantially, as indicated by the presence of a large number of lowland products brought to the highlands via the Arawak-controlled trade routes (Eriksen 2011: 164). Along the trade routes of the eastern Andean slopes, an ethnolinguistic group known as the Kallawaya transported lowland products with pharmaceutical or hallucinogenic characteristics to the Andean cultures (Rowe 1946: 239; Wassén 1972: 63; Lathrap 1973: 180f; Taylor 1999: 199; Eriksen 2011: 78). The Kallawaya tongue was a mixed language based on the Arawakan language Puquina, the isolated Chipaya language, and Quechua (Gordon 2005, Hannss p.c.). Puquina was a high-status language spoken among the Inca elite (Torero 2002; Dudley 2009: 146). Along the eastern Andean slopes in present-day Peru, a number of Arawakan languages are still spoken, and advanced systems for ritual domestication of the landscape have been documented among the Yanesha’ of that region (Santos-Granero 1998) (Map 7.1). These groups bear testimony to the incredible ability of the Arawakan matrix to maintain its relevance for its users during sociocultural changes that have been ongoing for centuries.

6 Cabiyarí (Cauyari, Cabuyarí, Acaroa) is classified as a dialect of Tariana (Landar 1977: 454).
7 Yucuna is also known as Matapí (Matapí-Tapuya) (Lewis 2009).
8 Métraux (1948c: 708) writes that the “Pasé were considered the most advanced Indians of the middle Amazon.”

Map 7.1 The reconstructed geographical dispersal of the Arawakan and Tupian language families at the time of European contact. For complete references, see Eriksen 2011: 12.

2.4 The expansion of the Arawakan family from a linguistic perspective

As noted, the Arawakan language family has expanded over a very large area of South America, more than other language families (Map 7.1). A suggested Arawakan language classification is presented in Table 7.1, mainly based on geographic proximity, but also on grammatical features, summarized from Aikhenvald (1999a), Walker and Ribeiro (2011), and Danielsen and Terhart (forthcoming).⁹ The internal classification of Arawakan languages is difficult to establish (see Facundes 2002), and some reasons for this are discussed in 2.6. This section focuses more on the character of the Arawakan linguistic family as such.

9 Aikhenvald (1999a) is the basis for all subdivisions in Northern Arawakan. Danielsen and Terhart (forthcoming) specify the Southern Arawakan group, which is classified in less detail in Aikhenvald (1999a) owing to the earlier lack of information. Some more subgrouping could be done on the basis of the findings in Walker and Ribeiro (2011), which are supported by the analyses in the present chapter. The Purus subgroup has been claimed by Facundes (2002) under the name A-P-I; the name Purus is used in Walker and Ribeiro and on Ethnologue (Lewis 2009).

Table 7.1 The Arawakan language family

Northern Arawakan
  Caribbean/Extreme North: Island Carib, Garifuna, Taino
  ta-Arawak: Lokono, Guajiro, Paraujano
  Palikur group: Palikur, Marawan, Aruan
  Río Branco: Wapishana, Mawayana
  North-Amazonian
    Orinoco subgroup: Bare, Baniva, Yavitero, Mandawaka, Yabana
    Middle Río Negro: Cawishana, Manao, Bahwana
    Upper Río Negro: Kurripako, Tariana, Warekena
  Columbian group: Resígaro, Yucuna, Achagua, Piapoco, Cabiyari, Maipure

Southern Arawakan
  South-Western Arawakan
    Purus subgroup: Apurinã, Piro, Iñapari, Machineri
    Andean foothills Arawakan: Yanesha’, Chamicuro, Apolista
      Campa subgroup: Asháninka, Ashéninka, Caquinte, Machiguenga, Nomatsiguenga
  South Arawakan
    Baure languages: Baure, Carmelito, Joaquiniano
    Pauna languages: Paunaka, Paiconeca
    Moxo languages: Ignaciano, Trinitario
  South-Eastern Arawakan
    Terêna subgroup: Terêna, Kinikinau, Chané
    Paresi subgroup: Paresi, Saraveka
    Xingú subgroup: Waurá, Mehinaku, Yawalapiti, Kustenau, Ewanewê-Nawê

Instead of taking the lexicon as the basis of comparison, which was done in other studies (Payne 1991, Walker and Ribeiro 2011), we take grammatical features as our point of departure here and compare our findings to those of lexical comparisons. The similarities of the Arawakan languages in some key linguistic features suggest that the expansion happened rather quickly (see Danielsen et al. 2011). The personal paradigm, for example, is similar in many respects in most Arawakan languages, formally as well as functionally (see Payne, David L. 1987), and it has often served as the first characteristic for assigning the genealogical relationship to a language (Gilij 1780–84). The proto-system of person marking was presumably a “Latin-type” paradigm (see Cysouw 2003: 107): 1sg, 2sg, 1pl, 2pl, 3sg with a gender distinction (masculine/nonfeminine and feminine), and 3pl (also applied as a general pl suffix). Person markers are employed to mark the possessor on nouns (prefix), subject on verbs (SA, prefix), object(s) on verbs (suffixes), and the subject on stative or non-verbal predicates (SO, suffix).¹⁰ In addition, free pronouns are derived from these bound personal forms, as well as certain adpositions marked for person. This general grammatical system can be claimed for all Arawakan languages, and the differences lie mainly in the specific lexemes that make use of a certain kind of marking, such as which nouns are actually part of the category inalienably possessed (with person marking), and which verbs belong to the (active) set with SA marking or the (stative) set with SO marking. There may also be striking differences in the SAP (speech act participants, i.e. first and second person reference) marking system versus the 3rd person in some languages, and the SAP forms tend to be more stable. Here also certain functions only hold for sub-clusters of the language family, such as gender marking or derivation of adjectives or nouns by means of the same personal forms (suffixes) for 3rd person. If we model the Arawakan language family as a NeighborNet splitsgraph only with respect to the person marking system (forms and functions), we get a rather plausible picture of the geographic distribution of the languages and possible migration routes (Figure 7.1). The star-like splitsgraph in Figure 7.1 shows no clear northern versus southern branching of the language family with respect to the features examined, contrary to what Table 7.1 suggests. There is a tendency for Northern Arawakan languages and for Southern Arawakan languages to group together, and a few sub-clusters can be observed. The same is true for the lexicon, as analyzed in Walker and Ribeiro (2011: 2563).
Figure 7.1 The distribution of the personal paradigms in Arawakan languages¹¹

10 These are generally concepts expressed by adjectives in other languages.
11 In this graph, Southern Arawakan languages are marked by bold grey script, Northern by black bold italics. The grey broken lines encircle the members of possible subgroups, as given in Table 7.1.

This means that there are some shared Arawakan features almost equally distributed: the personal paradigms to some extent and some of the conservative lexicon. The reason for this presumably goes back to the Arawakan exchange network that functioned for a long time. If the languages had spread like Tupian, we would presumably be able to see certain clear expansion groups, which is not really the case (compare Figure 8.1 in Chapter 8 on the Tupian expansion). However, some groups and subgroups within the language family can be found clustering in the graph in Figure 7.1. Some Northern Arawakan languages cluster, like Lokono with Island Carib, Garifuna, and Paraujano, but also with Resígaro in this graph. While Bare and Tariana appear closely related, Piapoco and Achagua have been separated according to their person marking systems. So, there is some confusion and a geographically unclear picture of northern and central Amazonian Arawakan languages. A similar observation was made about the lexical relations (Walker and Ribeiro 2011: 2563). The Purus group and Campa Arawakan cluster nicely, and the Bolivian Arawakan languages (the Baure, Moxo, and Pauna languages) do so more or less as well. The less Arawakan character of Apolista, Yanesha’, and Chamicuro, presumably due to the Andean influence these languages have undergone, is reflected in their relatively isolated positions in the splitsgraph. This may be a sign of an earlier migration into the region than that of the other Arawakan languages of the Andean foothills. The complicated position of Resígaro within the Arawakan language family has been discussed in the literature before (Payne 1985). Its odd position among the Northern Arawakan languages in the graph, and also in Figure 7.2, is probably just a sign of a connection at some point in history; this may have been in times when the Arawakan web stretched from the Caribbean to the Andes. The Taino language, excluded from the analysis in Figure 7.1 due to the lack of sufficient data, has also lately been described as being on the one hand closely related to Northern Arawakan languages, such as Island Carib, Lokono, Guajiro, Piapoco, and Achagua (cf. Granberry and Vescelius 2004: 56). On the other hand, it also seems to have certain characteristics found in South-Western Arawakan languages, in particular the Campa group, namely certain nominal (classifying) root formatives (cf. Granberry and Vescelius 2004: 94). Is this another trace of a time of interaction over such distances? The Northern Arawakan language Palikur appears right among the Southern Arawakan languages, from which it is geographically far away. However, being a Brazilian Arawakan language, Palikur seems to display some connection to other eastern and southeastern (Brazilian) Arawakan languages. The latter have not been claimed to form any particular subgroup, but in Walker and Ribeiro (2011: 2563), these languages are included under the name “Central Brazil,” since they cluster in their lexical analysis. In Table 7.1, we call this presumed group South-Eastern Arawakan. The Arawakan languages of the Xingú group (Paresi, Saraveka, Waurá) and possibly those of the Terêna subgroup (Terêna, Kinikinau, Mehinaku) may well be a loose intermediate group with characteristics similar to their northern and northwestern neighbors as well as to their southern genealogical neighbors. The findings in Granberry and Vescelius (2004: 55 ff.) also seem to point in this direction, and Walker and Ribeiro (2011) demarcate Palikur (and Marawan) as a subgroup named “Northeast.” This, however, remains to be further substantiated. To get a better picture of the possible expansion of Arawakan languages, the same feature matrix as used for the NeighborNet splitsgraph in Figure 7.1 can be reduced to a Minimum Spanning Tree (MST), as given in Figure 7.2.

Figure 7.2 Minimum spanning network of the Arawakan language family (also taken from the NeighborNet algorithm, Huson and Bryant 2006)¹²

12 In this graph, the size of the circles indicates relative frequency of shared features of the present study. The grey shades of the circles refer to Southern Arawakan.
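The reduction of a feature matrix to a minimum spanning tree can be illustrated with a small sketch. The Python code below is not the procedure behind Figure 7.2; it is a minimal illustration that assumes binary features, a plain Hamming distance (the number of features on which two languages differ), and Prim's algorithm for the tree. The four "languages" A to D and all of their feature values are hypothetical and invented purely for the example.

```python
# Hypothetical binary person-marking features (1 = present, 0 = absent).
# The four "languages" and their values are invented for illustration only.
FEATURES = {
    "A": [1, 1, 0, 0, 1],
    "B": [1, 1, 0, 1, 1],
    "C": [0, 1, 1, 1, 0],
    "D": [0, 0, 1, 1, 0],
}


def hamming(u, v):
    """Count the features on which two languages differ."""
    return sum(a != b for a, b in zip(u, v))


def minimum_spanning_tree(features):
    """Return MST edges (language, language, distance) using Prim's algorithm."""
    names = sorted(features)
    in_tree = {names[0]}  # start the tree from the first language
    edges = []
    while len(in_tree) < len(names):
        # Pick the cheapest link from a language in the tree to one outside it;
        # ties are broken alphabetically by the tuple comparison.
        dist, u, v = min(
            (hamming(features[u], features[v]), u, v)
            for u in in_tree
            for v in names
            if v not in in_tree
        )
        edges.append((u, v, dist))
        in_tree.add(v)
    return edges


if __name__ == "__main__":
    for u, v, dist in minimum_spanning_tree(FEATURES):
        print(f"{u} - {v}: {dist} differing features")
```

With this toy matrix the tree links A to B (1 differing feature), B to C (3), and C to D (1). Each language is attached through its most similar neighbor already in the tree, which is the sense in which such a graph can be read as a chain of possible dispersal steps; a different distance measure or tie-breaking rule could yield a different tree from the same data.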
The graph in Figure 7.2 provides us with possible routes of the dispersal of the respective Arawakan languages. Even though this is only one interpretation of the given data as a dispersal route, this scenario has plausibility.¹³ The position of Island Carib and Garifuna shows an excursion of the Arawak-speaking people into the Caribbean Sea, probably at an early stage and starting off from Maipure and Palikur. Another expansion of Arawakan could have led to the northern coast with Lokono at its end. As already mentioned above for Figure 7.1, Resígaro may be a remnant of the Arawakan expansion towards the Andes and a sign that there was still regular exchange at that time between the west (Resígaro) and the east (Lokono and others) of Amazonia through Arawakan peoples.¹⁴ In the south, we may conclude that the Moxo languages Trinitario and Ignaciano came into the area through Baure. Its relatively conservative character provides some evidence that Baure came into the region earlier. It is here also suggested that Apolista and Yanesha’ are part of one migration route, starting off with Chamicuro, which is the northernmost member of this loose group of Andean-influenced Arawakan languages.

13 Cf. Salipante and Hall 2011 for criticism on the interpretation of these graphs.
14 Unfortunately, we do not have enough data for the inclusion of the Chané Arawakan language into the analysis of personal paradigms. It would indeed be interesting to see where this old and already extinct Arawakan language that reached the north of Argentina would be in the graph. Chané had been replaced by a Tupian language during the early days of the European colonization.

2.5 The fragmentation of the Arawakan matrix

By approximately 800 CE, the Arawakan languages and the associated cultural complex, here labeled the Arawakan matrix, had reached their maximal territorial extent. By that time, Arawakan languages from the Greater Antilles to northern Argentina and from the Atlantic to the Andes were united by a large and complex regional exchange system. The Arawak-speaking communities had by that time accumulated considerable land-based capital in the form of agricultural earthworks, aquacultural facilities, and infrastructure that was attractive to other indigenous Amazonian groups.¹⁵ From the early centuries of the first millennium CE, another major ethnolinguistic formation came to the fore in the Tupian language family, which had begun expanding out of its point of origin in the Brazilian state of Rondônia (Rodrigues 1964). Up until then, the geographic distribution of the Tupian family had remained very restricted, despite early internal branching (Eriksen and Galucio, this volume). The geographic expansion of the Tupian languages took place very differently from the spread of the Arawakan languages. While the Arawakan languages were part of a complex exchange system with strong mythological and ceremonial underpinnings, the Tupian language family was part of an expansionistic military culture. Where the Arawakan societies prioritized ancestry and descent as the bases for political power, the Tupians based their social hierarchies on feats on the battlefield (Eriksen and Galucio, this volume). Particularly among the communities of the Tupí-Guarani branch, the groups developed an effective ability to absorb cultural traits and technological elements from neighboring groups in order to strengthen their own social status, military power, and agricultural production (Eriksen and Galucio, this volume).
Due to these abilities of the Tupian cultures, it was inevitable that the encounter between groups speaking Tupian and Arawakan languages along the shores of the middle and lower Amazon around 700 to 800 CE would lead to extended periods of conflict (Map 7.1). Military aggression was in evidence from the start, and the remains in the archaeological record bear witness to burned villages, destroyed palisades, and ultimately a change in village layout from the circular villages of the Arawakan communities to the linear settlements of the Tupí-speakers documented from the historical period (Heckenberger 2005: 56; Rebellato et al. 2009: 22, 29). The military conflict between Tupí- and Arawak-speaking communities along the main river ultimately led to Tupian control of a large section of the Amazon River from the mouth of the Amazon to the tributaries in eastern Peru by 1200 CE (Map 7.1). In addition to this, other Tupí-Guarani languages had expanded along the Atlantic coastline, circumscribing the Macro-Jê speakers and restricting their distribution to the Brazilian highlands. The three other expanding branches of the Tupian language family, Mundurukú, Mawé-Sateré, and Yuruna, expanded in the area immediately south of the middle and lower Amazon River, contributing to a strong dominance of Tupian languages in southern Amazonia during the late pre-Columbian period (Eriksen 2011). Overall, the expansion of the Tupian family replaced the Arawakan dominance in many areas of Amazonia and led to the fragmentation of the previously pan-Amazonian character of the Arawakan regional exchange system (Map 7.1). However, the sociocultural processes through which Tupian languages replaced Arawakan were sometimes more complex than the predatory ethos of the Tupí-speaking groups would lead us to believe. According to several linguists (Cabral 1995; Jensen 1999: 129; Adelaar with Muysken 2004: 432), the structures of the Tupí-Guarani languages (Omagua, Cocama, and Cocamilla) of the upper Amazon indicate that they represent a language shift¹⁶ from some non-Tupian language(s) to Tupinambá. This indicates that a new cultural pattern, including both language and material culture, was adopted in the region about 1200 CE.

15 For an extended discussion on the relationship between the Arawak-created land-based capital and the socio-economic and cultural development of the region, see Hornborg et al. (2013).
Included in this cultural package was polychrome pottery, locally developed into the Napo and Caimito phases. Epps (2009: 599) has suggested that Cocama and Omagua represent two different language shifts from Arawakan languages to Nheengatú, the Tupinambá-based lingua franca still spoken in the northwest Amazon. The Tupian expansion at the expense of the Arawakan languages was also followed by a change in land use. While the Arawakan communities had been heavily sedentary, relying on their earthworks, terras pretas, and aqua-cultural facilities for long-term subsistence, the military apparatus of the Tupian groups acted as an incentive for more mobile subsistence strategies. Many ethnolinguistic groups of the Tupí-Guarani, Mundurukú, Mawé-Sateré,

16 The occurrence of multilingualism and language shifts has been documented in various parts of Amazonia (Schmidt 1917; Sorensen 1967; Jackson 1983; Campbell 1997; Aikhenvald 2002, 2003b). For other examples of language contact situations resulting in language shifts, see e.g. Thomason and Kaufman (1988); Sasse (1992).


Love Eriksen and Swintha Danielsen

and Yuruna branches launched annual war expeditions up the major rivers and tributaries. These expeditions, some of which were documented by the early Europeans of the continent, lasted for months and required access to easily transportable food resources. As a result of this, the Tupian groups along the main river came to rely heavily on short-ripening maize varieties that were grown on the annually flooded várzea areas. The Arawakan groups also produced a considerable food surplus, e.g. beer made from maize and manioc, to be consumed during religious ceremonies. Thus, the Tupians could rely on their ability to steal food during their war expeditions and their dominance over tributary populations for part of their subsistence (Santos-Granero 2009).17 It would take another millennium before Amazonia once again experienced an alteration of the landscape similar to the one that took place during the Arawakan expansion from roughly 1000 BCE until 1000 CE. Between 1000 and 1500 CE, the region suffered conflicts and warfare, and the most important socio-economic progress took place in the Andes. During this period, the landscape alteration processes were less intensive than during the first millennium CE. As a consequence of the demographic collapse among the indigenous populations that followed in the wake of the European colonization (an event that eradicated perhaps 90 percent of the population in Amazonia), the anthropogenic environments of Amazonia underwent a reforestation process that in most areas resulted in an advance of the tropical rainforest at the expense of previously maintained grounds. The image of the reforested Amazonia (a process that was completed in most areas of the lowlands before any Europeans entered) has contributed strongly to the image of Amazonia as an ecosystem with little human historical influence.

2.6 The fragmentation of the Arawakan language family from a linguistic perspective

A subdivision into Northern versus Southern Arawakan is not as straightforward as suggested in Aikhenvald 1999a (Danielsen et al. 2011, cf. also Walker and Ribeiro 2011: 2563). This is supported by the fact that no major branches can be shown for the language family (cf. Figure 7.1). As we have shown in Section 2.4, Arawakan languages are much alike, at least with respect to selected linguistic features such as the personal paradigms and the lexicon. This then results in the star-like splitsgraph in Figure 7.1. Some other features tend to be more Southern Arawakan, such as the morphological complexity of the verb

17 For an extended discussion on indigenous slavery and predation in Amazonia, see Santos-Granero (2009).


and applicative marking types on verbs.18 Taking all linguistic features into consideration, however, the picture is much more blurred. The splitsgraph in Figure 7.1 demonstrates that there are not many major splits between groups of languages of the Arawakan family, and the distances between them are relatively even, and much more balanced than a main subdivision into Northern versus Southern Arawakan would suggest. This tells us something about the nature of the Arawakan expansion: Firstly, the Arawakan matrix must have remained intact for quite some time, so that linguistic features could still be exchanged (and spread throughout Amazonia to other languages). This explains why general Amazonian features (see Derbyshire and Pullum 1986: 19; Dixon and Aikhenvald 1999: 8–9) mostly reflect Arawakan features, and why it is almost impossible to find an Arawakan feature that is not also Amazonian or vice versa. Secondly, the expansion of Arawakan was neither unidirectional, nor did it happen in one stroke. Walker and Ribeiro (2011: 2566) have suggested a more southern point of departure of the Arawakan expansion (western Amazonia, in the area of the Apurinã) than others have proposed before (the Caribbean coast in Aikhenvald 1999a: 75; the Upper Amazon referring to Lathrap 1970 and Oliver 1989 in Aikhenvald 1999a: 75). Usually, we take the area of most linguistic diversity within the language family as the probable homeland, as in the case of Tupian. However, the diversity must represent the source of divergence within the family. In the case of Arawakan, diversity may mean local linguistic interaction with other unrelated languages and is not directly related to the different migrations of Arawakan languages. The area of the northern Amazon and the Caribbean coast are both examples of intensive language contact, in particular between Arawakan languages and languages of other stocks.
Therefore, linguistic diversity alone may not always be taken as the key evidence for a homeland. Later language contact is probably the reason why an analysis of general linguistic features of Arawakan languages gives us the picture it does in Figure 7.3 (see Danielsen et al. 2011). In Figure 7.3, which is again a star without any major branching, we see that general linguistic features are shared by some subgroups within Arawakan (indicated by the dotted lines), but a great number of the languages appear to be simply mixed in the graph. This fact is the main reason why it has been complicated until now to come up with a decent internal classification of the language family. In contrast to Figures 7.1 and 7.2, Resígaro now appears closer to the languages that are also geographically closest, such as Tariana, and not to the Caribbean Arawakan language Lokono. Thus, local contact effects are stronger than possible historical genealogical relations that may be restored

18 Taking only features related to the marking of semantic roles on either the verb or not, we do get some Northern versus Southern branching (Danielsen, unpublished).

Figure 7.3 Structural analysis of thirty-one Arawakan languages19
[Splitsgraph not reproduced; scale bar 0.01. Labels recoverable from the graph: Lokono, Tariana, Maipure, Piapoco, Kurripaka, Baniva, Warekena, Bare, Yavitero, Resigaro, Yanesha′, Achagua, Paresi, Waurá, Palikur, Chamicuro, Wapishana, Guajiro, Nanti, Nomatsiguenga, Ashéninka, Campa, Kinikinau, Piro, Machiguenga, Ignaciano, Trinitario, Iñapari, Terêna, Apuriná, Baure, Garifuna, Paunaka, Purus.]

from the personal paradigm. Examples of languages that underwent language contact with genetically unrelated languages and are therefore grammatically quite different from other Arawakan languages are the following:

• Garifuna (see Escure 2004): Arawakan with Cariban (and European languages); Garifuna is a language of Arawakan origin (Island Carib) with substantial interaction with Cariban (at the time of colonization) and with English- and French-lexifier pidgins and creoles (during and after colonization).
• Tariana: Arawakan with Tucanoan (Aikhenvald 1999b, 2001, 2002); Aikhenvald has done a detailed study of language diffusion in the Vaupés area, and these strong effects of language contact hold for all the languages in this region, not only Arawakan.
• Resígaro: Arawakan with Bora (Seifart 2011); Resígaro has in particular changed its nominal morphology and underwent great grammatical changes under the influence of the Bora language; the lexicon and the verbal morphology remained more Arawakan.
• Yanesha’ (Wise 1976) and other Andean foothill Arawakan languages (see Table 7.1): Arawakan with Quechua; Yanesha’, Chamicuro, and Apolista were influenced by Quechua, and they are therefore grammatically distinct from other Arawakan languages in the same area, such as the Campan Arawakan languages.
• Paunaka: Arawakan with Bésiro (Macro-Jê) from the Chiquitanía (Danielsen and Terhart forthc.); the Paunaka language shows regional grammatical constructions in the morphology of borrowed verbs that are typical for the Chiquitanía, and Paunaka has also had lexical influence from Bésiro more recently.
• Moxo: Arawakan with possibly Bésiro (Macro-Jê) or already extinct language(s) of the area; the Moxo languages show a non-Arawakan pattern only in the 3rd person form of the personal paradigm, which distinguishes speaker gender (see Danielsen 2011, Rose, p.c.).

The long list of reported cases of language contact between Arawakan languages and other languages demonstrates why it is difficult to base an internal classification on the same features for all languages. While some Arawakan languages have been influenced in their nominal morphology, others have changed their verbal morphology, and still others the personal paradigms or the lexicon; these are the results of the fragmentation of the Arawakan matrix after its wide expansion. A more detailed comparative analysis is needed of the grammatical features of the languages that Arawakan languages came into contact with before a more comprehensive analysis of the fragmentation process can be carried out.

19 The feature list consists of the Constenla (1991) questionnaire and additional distinctive features selected by Danielsen; for more details see Danielsen et al. (2011). Excluded from the analysis for the combined feature set were Apolista, Enawenê-Nawê, Mehinaku, Saraveka, and Taino because of incomplete data.

3 Conclusion

In this chapter we have sketched the birth, expansion, and fragmentation of the Arawakan matrix, one of the most important cultural systems of prehistoric South America. It is characterized by a surprisingly robust uniformity in its earlier stages, but then in its aftermath, it underwent complex interactions with neighboring systems. The expansion of the Arawakan matrix was characterized by a network of contact and exchange manifested in a regional exchange system that spread the material culture and languages of the matrix to neighboring groups, but also absorbed linguistic and cultural traits – thereby contributing to constant renegotiations and renewal of the system.


The linguistic analysis shows that the regional exchange system of the Arawak-speaking communities must have been intact until late prehistory. This is evident from the distribution of linguistic features typical of the Arawakan family among most of the languages of the family. Many features typical of the Arawakan family are also characteristic of Amazonian languages in general. This is most likely the result of the fact that the Arawakan languages, through the cultural matrix which they were part of and the exchange system which they spread through, came into contact with a very large number of Amazonian languages belonging to other genealogical groupings. The process of contact and exchange between Arawakan and non-Arawakan languages resulted in a diffusion of features between these two. Another archaeological claim (apart from the existence of the regional exchange system) sustained by the linguistic analysis is the tendency of the Arawakan matrix to expand in a multidirectional and irregular fashion (Figures 7.2 and 7.3). According to the archaeological analysis, the Arawakan matrix constantly renegotiated its character through contact with new groups, and new items were added to the matrix as it expanded along the major rivers of the great basin. The regional exchange system facilitated recursive feedback of new features, thereby contributing to the fluent and dynamic character of the matrix, a feature probably contributing to the longevity of the system. As a result of the dynamic character of the Arawakan matrix, Figure 7.3, which depicts diversity of linguistic features within the Arawakan family, could just as well be used as an illustration of possible routes of contact and exchange of material culture or as the routes of mythological travels of the Arawakan shamans (both phenomena would have contributed to the linguistic exchange). 
Thus the expansion of Arawakan languages was a complex process where language, material culture, and non-material culture formed an inseparable entity and where all components were crucial for the successful expansion and renewal of the system. It also shows that in order to decipher such a process, a broad, multidisciplinary scientific approach is called for, matching the many different aspects of the system. Finally, the composition of the language groupings and the distribution of individual linguistic features among Amazonian languages is the result of long-term processes of contact and exchange, in which material culture, social organization, customs and traditions, and language have interacted to form complex sociolinguistic structures that require multidisciplinary research to unravel.

8 The Tupian expansion

Love Eriksen and Ana Vilacy Galucio

This chapter explores the expansion of the Tupian languages and culture across greater Amazonia to better understand the mechanisms and processes of cultural and linguistic contact and change. Tupian languages are or were spoken among indigenous groups in Lowland South America from the Brazilian Atlantic coast through Paraguay to the eastern Andean slopes of Peru. The investigation uses Geographic Information Systems (GIS) to map the spatial distribution of cultural and linguistic features associated with Tupí-speaking groups in order to plot the historical expansions of the Tupian languages and to characterize the sociocultural and linguistic context and consequences of these events, particularly relating to internal and external contact situations. Research is directed toward multidisciplinary integration of linguistic data with cultural data derived from anthropology, archaeology, ethnohistory, and geography in order to reach a multifaceted understanding of the history of contact and exchange involving Tupí-speaking groups. The chapter breaks new ground in combining traditional studies of material culture with linguistic data through the use of GIS, as well as in mapping and investigating the spatial distribution of linguistic features and their relationship to cultural attributes.

1 Introduction

Attempts to reconstruct the expansion of the major linguistic families have a long and proud history in the research of the tropical lowlands of South America. Schmidt (1917) described the expansion of Arawakan, while Nordenskiöld (1918–38) dealt with both Arawakan and Tupian, along with other ethnolinguistic groups. Lathrap (1970) described the expansion of Panoan, Arawakan, and Tupian, and his followers Brochado (1984) and Oliver (1989) concerned themselves with Tupian and Arawakan, respectively. Meggers (e.g. 1971) tried

Parts of the studies reported on here were carried out under the Tupí Comparative Project, a collaborative project ongoing at the Museu Goeldi/Brazil, since 1998, in cooperation with various Tupian specialists. We thank Pieter Muysken for making his personal notes on Língua Geral Amazônica and Cocama-Cocamilla available to us.

Downloaded from https://www.cambridge.org/core. University of Cambridge, on 12 Oct 2018 at 20:33:56, subject to the Cambridge Core terms of use, available at https://www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781107360105.010


to explain the expansion of the major linguistic families in the region as a consequence of population movements triggered by climate fluctuations, and Meggers and Evans (1978) proposed an origin of the Tupian family east of the Madeira River (a hypothesis already advocated by Métraux (1928) and Rodrigues (1964); see below). Noelli (1998, 2008) and Urban (1996) also devoted studies to the Tupian expansion, while Heckenberger (2002) addressed the Arawakan dispersal. More recently, Neves (2011) has attempted to correlate ceramic styles with Arawakan and Tupian languages from an archaeological perspective; Walker and Ribeiro (2011) have modeled the linguistic history of Arawakan, and Eriksen and Danielsen (this volume) have studied the Arawakan dispersal from a transdisciplinary perspective. There have been several advances in various academic disciplines relevant to our understanding of linguistic expansions in pre-Columbian Amazonia during the last two decades. One decisive theoretical advance in this field of research comes from the work of Hornborg (2005) and Hornborg and Hill (2011), who stress the importance of understanding the development of ethnic identities through the process of ethnogenesis (i.e. the development and continuous renegotiation of ethnic identities through sociocultural interaction) in order to decipher processes of cultural and linguistic exchange among indigenous groups. Another important factor includes the use of large-scale computerized databases of spatially distributed cultural and linguistic data (Geographic Information Systems, or GIS), which promotes multidisciplinary comparative studies of the interplay between cultural and linguistic variables through time (Eriksen 2011).
And finally, the field of linguistics has seen a veritable boom both in good quality documentation of South American languages and in the use of computational tools and large-scale databases to probe the internal relationships of Amazonian language families as well as the areal diffusion of lexical and structural features between different linguistic groupings (Muysken and O’Connor, this volume).

2 The Tupian language family and its branches

In order to contextualize the current investigation, we start with a basic and non-exhaustive orientation to what has been accomplished in previous Tupian studies. The Tupian family is one of the largest and most widely distributed language families in lowland South America, with languages still spoken in a large geographic area that covers a great part of Brazil as well as adjacent areas in Paraguay, Argentina, French Guiana, Bolivia, and Peru (Map 8.1). It has long been recognized, based on the time depth of regional Tupian diversity, that the vast expansion of the Tupian family stems from a single point of origin (Métraux 1928; Rodrigues 1964; Noelli 2008), located east of the Madeira-Guaporé basin, in the Brazilian state of Rondônia. From this point of origin, the


Map 8.1 The location of Tupí-speaking groups at the time of European contact

family has expanded into ten branches: Tuparí, Arikém, Puruborá, Ramarama, Mondé, Juruna, Mundurukú, Tupí-Guaraní, Awetí, and Mawé (Figure 8.1) over a time span of roughly 4,000–5,000 years (cf. estimates by Rodrigues 1964 and other researchers). These ten branches encompass about 40–45 languages, not counting the differences among dialects spoken by distinct ethnic groups (Moore et al. 2008).

Figure 8.1 The branches of the Tupian language family1
[Tree diagram not reproduced. Node labels recoverable from the figure: Tupí family; branches Ramarama-Puruborá, Arikém, Mawé-Awetí-Tupí-Guaraní, Tupí-Guarani, Mondé, Jurúna, Tuparí, Mundurukú; languages Karitiána, Jurúna, Xipáya, Salamay, Aruá, Zoró, Gavião, Cinta-larga, Suruí, Kuruáya, Mundurukú, Karo, Puruborá, Mekens, Akuntsú, Wayoro, Tuparí, Makurap, Awetí, Mawé.]

The genetic relationship and internal classification of the Tupian family shown schematically in Figure 8.1 incorporates recent historical-comparative studies concerning internal classification and proposals of intermediary stages in the evolution from Proto-Tupí to the current languages, including the results of lexical and grammatical comparison and reconstruction for the different branches of the Tupian family (Rodrigues 1984, 1985; Gabas Jr. 2000; Galucio and Gabas Jr. 2002; Moore 2005; Drude 2006; Picanço 2010; Galucio and Nogueira 2012). The close relationship between Awetí, Mawé, and the Tupí-Guaraní languages has long been recognized (Rodrigues 1964, 1984, 1985; Rodrigues and Dietrich 1997), and it is by now well established that these languages constitute a large branch inside Tupí, the Mawé-Awetí-Tupí-Guaraní branch (Drude 2006; Correa da Silva 2011; Drude and Meira to appear), termed the Mawetí-Guaraní branch by the latter two authors. This branch represents the major branch of the family, in number of languages and in territorial extension. Given the enormous diversity within the family in terms of territorial expansion, it is clear that the different Tupian groups have been shaped by distinct and individual historical experiences. An attempt to reconstruct the internal diversification and expansion of Tupian languages must therefore take these experiences into account, aided by a multidisciplinary approach that seeks to understand not only the genealogic relationship and contact history of the languages from a linguistic point of view, but also the particular historical experiences of the groups by mapping the sociocultural features associated with them.

3 Lexical and structural distances between Tupian languages

The genealogical classification of the Tupian family is presented in Figure 8.1, which shows the distinct levels of relationships between the

1 The Tupí-Guaraní branch has several languages and sub-branches, represented by the dotted lines, which are not shown in the diagram.


languages and their evolutionary paths from the ancestor language, Proto-Tupí. In this section, we present another view of Tupian language relations, based on distance matrices of shared features. We analyzed the data compiled for this study using quantitative techniques to visualize patterns of relationship in terms of lexical and structural similarity, without presuming an explicit genealogical history. We then compare assessments of similarity presented in network representations (Figures 8.2 and 8.3) to the internal relationships and genealogy of the Tupian family (Figure 8.1).

3.1 Linguistic distance based on lexical similarity analysis

Galucio and colleagues (to appear) present the results of a lexicostatistical and phylogenetic study based on the analysis of the Swadesh list of 100 diagnostic words considered to be most stable over time (Swadesh 1955) for all the nineteen Tupian languages outside the Tupí-Guaraní family and for four Tupí-Guaraní languages (Guaraní, Parintintim, Tapirapé, and Urubu-Kaapor). Their study shows the degree of distance across Tupian languages, confirms the two more recently established branches of Ramarama-Puruborá and Mawé-Awetí-Tupí-Guaraní, and also supports the internal structure of each branch of the family based on historical-comparative methods. In the case of Tuparí and Mondé, the two most diversified branches outside Tupí-Guaraní, the phylogenetic similarity tree agrees exactly with the independent internal classification of these branches (Moore 2005; Galucio and Nogueira 2012). We took their study and extended it to include a more complete set of languages from the Tupí-Guaraní branch. Using the NeighborNet algorithm implemented in SplitsTree4 (Huson and Bryant 2006), we generated an unrooted network expressing a distance measure among the Tupian languages on the basis of lexical similarity in the basic vocabulary for thirty-one Tupí-Guaraní languages and dialectal varieties and the nineteen languages from the other Tupian branches already established by Galucio et al. (to appear). The distances between the languages based on the percentage of shared lexical items are shown in the NeighborNet representation in Figure 8.2.2 The analysis is not intended to show the historical development of these languages but rather the degree of distance between them, based on lexical similarity that may also reflect the result of horizontal transfer.
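As a rough illustration of the kind of distance measure described above, the following Python sketch computes pairwise distances as one minus the proportion of shared lexical items. It is only a sketch under stated assumptions: the cognate-class codes are invented toy data (the language names are written without diacritics and serve only as labels), the function names are hypothetical, and the treatment of missing entries is one possible choice. The network construction itself (NeighborNet in SplitsTree4) is not reproduced here.

```python
from itertools import combinations

# Invented cognate-class codes for five meaning slots; None = no data.
# These values are toy data and do not come from the chapter.
TOY_WORDLISTS = {
    "Aweti": ["A", "B", "C", "D", "E"],
    "Mawe": ["A", "B", "C", "X", "Y"],
    "Karitiana": ["A", "Q", "R", "X", None],
}


def lexical_distance(list_a, list_b):
    """Return 1 minus the share of comparable slots with the same cognate class."""
    # Slots where either language lacks data are skipped (an assumption).
    comparable = [(a, b) for a, b in zip(list_a, list_b)
                  if a is not None and b is not None]
    if not comparable:
        raise ValueError("no comparable meaning slots")
    shared = sum(1 for a, b in comparable if a == b)
    return 1.0 - shared / len(comparable)


def distance_matrix(wordlists):
    """Build a symmetric dict-of-dicts distance matrix over all language pairs."""
    names = list(wordlists)
    matrix = {name: {name: 0.0} for name in names}
    for first, second in combinations(names, 2):
        d = lexical_distance(wordlists[first], wordlists[second])
        matrix[first][second] = d
        matrix[second][first] = d
    return matrix


if __name__ == "__main__":
    for name, row in distance_matrix(TOY_WORDLISTS).items():
        print(name, {other: round(d, 2) for other, d in row.items()})
```

A matrix of this shape is the usual input to distance-based network methods; how the authors handled missing items or partial cognacy is not stated in the text, so those details should be read as placeholders.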
It is nonetheless remarkable that the major clusters of languages that surface from the distance measure shown in the graphic are comparable to the proposed path of historical development for the Tupian languages, on the basis of the comparative method (cf. Figure 8.1). The NeighborNet representation places Awetí as the closest

2 Analysis relative to the non-Tupí-Guaraní languages draws directly from Galucio et al. (to appear).

Figure 8.2 NeighborNet representation of lexical distances among Tupian languages
[Network graph not reproduced. Cluster labels in the figure: Mundurukú, Tupí-Guaraní, Tuparí, Mawetí-Guaraní, Mondé, Ramarama-Puruborá, Kawahib, Arikém, Juruna; the individual language labels are omitted here.]

language to the Tupí-Guaraní cluster, followed by Mawé, and together forming the Mawetí-Guaraní larger cluster, which is consistent with the proposed path of evolution in the history of these languages (Drude 2006; Correa da Silva 2011; Drude and Meira, to appear). The six other lexical clusters (Juruna, Arikém, Ramarama-Puruborá, Mondé, Tuparí, and Mundurukú) and their subsplits correspond exactly to the more recent genealogic classification of these languages, as clearly seen in the Tuparí and Mondé branches (Moore 2005; Galucio and Nogueira 2012). The linguistic cohesiveness of the Tupí-Guaraní branch is also prominent in the network representation. Horizontal transfers due to contact and borrowing may be responsible for a great number of the synchronic resemblances in the Tupí-Guaraní lexicon, not all of them due to retention from a common ancestor language. Nonetheless, as expected from the known history of these languages, the thirty-one Tupí-Guaraní languages are closer to each other than to any other language in the Tupian family. However, the splits inside the Tupí-Guaraní cluster do not correspond exactly to classifications of the Tupí-Guaraní branch based on phonological criteria (Mello 2000; Rodrigues and Cabral 2002) or on a combination of lexical, phonological and grammatical criteria (Dietrich 1990). There is an overall absence of well-delimited lexical clusters inside the Tupí-Guaraní group in Figure 8.2.
Among the few specific clusters that surface from the quantitative lexical comparison are the Kawahib languages (Parintintim, Tenharim, Amondawa, and Uru-eu-uau-uau) that are classified as dialectal variants (Sampaio 1997); the Yuki-Sirionó cluster of two closely related Tupian Bolivian languages; the Wayampi-Emérillon cluster of languages spoken in the same geographic area in French Guiana; the Língua Geral Amazônica3-Urubu-Kaapor grouping, for which there have been claims of mutual influence through contact; and a Guaraní cluster that includes most of the languages in Rodrigues and Cabral’s subgroup I of Tupí-Guaraní (2002) but also includes Guarayo, spoken in Bolivia. The Cocama-Cocamilla language4 appears close to Xeta. The Cocama lexicon, including the core vocabulary, is primarily Tupian (Cabral 1995), but it also shows lexical traits of Arawakan, Panoan, and Quechuan origin, in addition to Portuguese and Spanish (Muysken 2012b).5

3.2 Linguistic distance based on structural similarity analysis

For the structural analysis, we designed a preliminary questionnaire of twenty prominent typological features, divided between phonology, morphology and

3 Also known as Nheengatú.
4 Also known as Kokama. In this volume the spelling Cocama-Cocamilla is adopted, following Peruvian usage.
5 The question of Cocama’s genetic affiliation is discussed in Section 5.


syntax.6 Because of limited data availability, our structural sample is smaller than the lexical sample. It consists of thirty languages, including eighteen from the Tupí-Guaraní branch and twelve from the other nine branches of the family. The features were coded on the basis of published material, complemented with direct verification with specialists working on particular languages. The current location of the analyzed languages is shown in Map 8.2. Only two features are identical for all the languages in the sample. All Tupian languages have the order possessor-possessed in the possessive phrase, and all have noun-postposition order in the noun phrase, which is consistent with the general head-marking characteristic of the family. With the exception of Cocama-Cocamilla, which has a causative suffix -ta, all other languages have a causative prefix of the form mV- (

[...]

Condition, Knowledge, Utterance, Propositional attitude. The hierarchy should be read as follows: if a nominalized form (i.e. one that can take case/adposition marking) is used to encode the dependent event at a point on the hierarchy, the points to its left will also allow a nominalized form. The specific distributions per type in Cristofaro’s and my samples are compared in Table 12.4. The numbers do not add up to reflect the number of constructions, because there is often a one-to-many relationship between constructions and meaning. The number of nominalized manipulation predicates is low in Cristofaro’s distribution in part because I have only counted direct (make) manipulation.

5 I have discounted utterance and location clauses, since Cristofaro does not consider the latter; for the former, see above.
6 No hierarchy was proposed for possession, but it follows a similar pattern (Cristofaro 2003: 235).
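The reading rule for the hierarchy can be sketched as a small check in Python. This is a minimal sketch, not the author's procedure: the hierarchy is modeled as an ordered list of ranks, each rank being a set of relation types, and a language passes if every type in every rank to the left of its right-most nominalized rank also allows nominalization. Because only the right-most part of the hierarchy is preserved in the text above, the left-hand rank below is an explicit placeholder, and the function name and example sets are hypothetical.

```python
# Ordered ranks, left to right. "left_point" is a placeholder for the part of
# the hierarchy that is not preserved in the text; the second rank lists the
# relation types that are visible there.
EXAMPLE_RANKS = [
    {"left_point"},
    {"condition", "knowledge", "utterance", "propositional attitude"},
]


def respects_hierarchy(nominalized, ranks):
    """Check the implicational reading of the hierarchy.

    nominalized: relation types a language encodes with nominalized forms.
    ranks: ordered list of sets of relation types (left-most rank first).
    """
    nominalized = set(nominalized)
    # Indices of ranks in which at least one relation type is nominalized.
    hit = [i for i, rank in enumerate(ranks) if rank & nominalized]
    if not hit:
        return True  # nothing nominalized: the implication holds trivially
    # Every rank left of the right-most hit must be fully nominalized.
    return all(rank <= nominalized for rank in ranks[:max(hit)])


if __name__ == "__main__":
    print(respects_hierarchy({"left_point", "knowledge"}, EXAMPLE_RANKS))
    print(respects_hierarchy({"knowledge"}, EXAMPLE_RANKS))
```

Types listed together within one rank are left unconstrained relative to each other in this sketch; whether that matches the intended reading of the original hierarchy is an assumption.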


Rik van Gijn

Table 12.4 Comparison of distribution of nominalized structures per semantic relation type

                          Cristofaro       %    Van Gijn       %    p-value
modals                            16     7.5          16     5.6       0.46
phasals                           16     7.5          16     5.6       0.46
purpose                           29    13.6          29    10.1       0.26
desideratives                     19     8.9          26     9.1       1
manipulatives                      5     2.3           2     0.7       0.14
perception                        21     9.9          22     7.7       0.42
temporal                          24    11.3          36    12.6       0.76
reason                            24    11.3          25     8.7       0.36

relativization                    25    11.7          43    15.0       0.36

condition                         10     4.7          21     7.3       0.26
knowledge                         10     4.7          21     7.3       0.26
utterance                          9     4.2          12     4.2       1
propositional attitude             5     2.3          18     6.2       0.05

total                            213                 287
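The comparison in Table 12.4 can be sketched in a few lines of Python. The counts are copied from the table. The choice of test is an assumption: this section does not say how the p-values were obtained, so the two-sided Fisher exact test below is only one plausible option and need not reproduce the printed column. The function names are illustrative.

```python
from math import comb

# Counts copied from Table 12.4: (Cristofaro, Van Gijn) per relation type.
COUNTS = {
    "modals": (16, 16), "phasals": (16, 16), "purpose": (29, 29),
    "desideratives": (19, 26), "manipulatives": (5, 2), "perception": (21, 22),
    "temporal": (24, 36), "reason": (24, 25), "relativization": (25, 43),
    "condition": (10, 21), "knowledge": (10, 21), "utterance": (9, 12),
    "propositional attitude": (5, 18),
}
TOTAL_C = sum(c for c, _ in COUNTS.values())
TOTAL_V = sum(v for _, v in COUNTS.values())


def percentages(label):
    """Share of each sample's nominalized structures that fall in this type."""
    c, v = COUNTS[label]
    return round(100 * c / TOTAL_C, 1), round(100 * v / TOTAL_V, 1)


def fisher_two_sided(a, b, c, d):
    """Two-sided Fisher exact test for the 2x2 table [[a, b], [c, d]].

    ASSUMPTION: the chapter's own test is not stated; this is a stand-in.
    """
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):  # hypergeometric probability of x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at most as likely as the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def compare(label):
    """Test one relation type against the rest of each sample."""
    c, v = COUNTS[label]
    return fisher_two_sided(c, TOTAL_C - c, v, TOTAL_V - v)
```

Calling `compare("manipulatives")`, for instance, tests 5 of 213 against 2 of 287. Each type is tested against the remainder of its own sample, which is again an assumption about how the table was built.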

Although there are differences between the distributions in the two samples, none of them is significant. This means that, if there were evidence of contact-induced diffusion of nominalized constructions, it would not be tied to a particular semantic field. A next step is therefore to look in more detail at the structural properties of the nominalizations of South American languages.

4

Types of nominalizations in South America and their distribution

Many scholars have recognized that nominalization shows considerable internal structural variation cross-linguistically. For instance, nominalizations may differ from each other in how they encode core arguments (see e.g. Koptjevskaja-Tamm 1993), or in the extent to which they allow verbal and nominal categories to be marked on the nominalized verb (see e.g. Malchukov 2006). Nominalizations may also be flagged in different ways, and of course they can differ in which semantic relation types they can encode. Because of this potential variation within the group of nominalized structures, it makes sense to evaluate the homogeneity of the nominalizations across the continent, and to see whether the internal variation found can best be explained as a geographic (contact) signal or a phylogenetic signal. For a first impression of the internal variation of nominalized constructions, see Figure 12.1, which gives a visual representation in the form of a NeighborNet network (Bryant and Moulton 2004) of the distance between the constructions based on similarity among the input features. The sheer number of constructions renders the figure rather difficult to read, but the star-shaped form and the general lack of tree-like branches indicate that the nominalizations that are used as subordination strategies are far from homogeneous.

Figure 12.1 NeighborNet of nominalizations as subordination strategies in the languages of the sample

In the remainder of this section, I will look in greater detail at the different nominalizing subordination strategies found in South American languages. I take semantics as a basis for comparison, based on the assumption that, if a language borrows a construction, or if two constructions in different languages converge as a result of contact, they will most likely have comparable semantics.
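A network such as the one in Figure 12.1 takes pairwise distances between constructions as its input. The sketch below shows one common way to derive such distances from coded features. The measure is an assumption: the chapter only says that distance is based on similarity among the input features, so the normalized mismatch count used here is a guess at the general idea. The construction names and feature codings are hypothetical, and NeighborNet itself is not reproduced.

```python
from itertools import combinations


def feature_distance(a, b):
    """Share of features coded in both constructions on which they differ."""
    shared = [k for k in a if k in b and a[k] is not None and b[k] is not None]
    if not shared:
        raise ValueError("no comparable features")
    return sum(a[k] != b[k] for k in shared) / len(shared)


def distance_matrix(constructions):
    """Symmetric distance table keyed by (name, name) pairs."""
    names = sorted(constructions)
    dist = {(n, n): 0.0 for n in names}
    for x, y in combinations(names, 2):
        d = feature_distance(constructions[x], constructions[y])
        dist[(x, y)] = dist[(y, x)] = d
    return dist


# Hypothetical constructions and codings, for illustration only.
# None marks a feature that could not be coded.
toy = {
    "lang1-cxA": {"participant_nlz": 1, "subj_possessor": 1, "case_marked": 0},
    "lang2-cxB": {"participant_nlz": 1, "subj_possessor": 0, "case_marked": 0},
    "lang3-cxC": {"participant_nlz": 0, "subj_possessor": None, "case_marked": 1},
}
toy_dist = distance_matrix(toy)
```

Features that are uncoded in either construction are skipped rather than counted as mismatches, so sparsely documented constructions are not pushed apart artificially.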

Table 12.5 Overlap of semantic relation types

            temp    reas     loc    purp    cond    phas    modl   desid   manip    perc    know ind utt    eval   s-rel   a-rel   o-rel
temp           -      41      13      31      41       5      10      15       2      18      13      11      15      10      10      13
reas          41       -      11      23      33       4       7      10       2      10      12       8      10       8       8      12
loc           13      11       -       7       9       7       6       8       0       9      10       5      10      10      10      12
purp          31      23       7       -      21       7      10      17       3      15      12       8      10       9       9      10
cond          41      33       9      21       -       2       5       7       0       7       8       6       7       6       6       8
phas           5       4       7       7       2       -      25      30      12      11      15       7      10       1       1       4
modl          10       7       6      10       5      25       -      33      16      15      16      10      12       2       2       4
desid         15      10       8      17       7      30      33       -      23      22      26      15      20       6       6      10
manip          2       2       0       3       0      12      16      23       -       4       4       2       3       0       0       0
perc          18      10       9      15       7      11      15      22       4       -      24      13      16       8       8       7
know          13      12      10      12       8      15      16      26       4      24       -      20      14       9       9      14
ind utt       11       8       5       8       6       7      10      15       2      13      20       -      10       4       4       8
eval          13      10      10      10       7      10      12      20       3      16      14      10       -       4       4       6
s-rel         10       8      10       9       6       1       2       6       0       8       9       4       4       -      43      37
a-rel         10       8      10       9       6       1       2       6       0       8       9       4       4      43       -      35
o-rel         13      12      12      10       8       4       4      10       0       7      14       8       6      37      35       -

However, defining comparable semantics can be a complex task, since we do not know which semantic building blocks of the different relation types are relevant. The way I approach this problem is to look at every semantic relation type defined in the questionnaire separately, and at its closest neighbors. Closest neighbor is defined on the basis of how often two semantic types are expressed by one and the same construction in the entire subordination database: given semantic type X (e.g. temporal relations) and the set of constructions Y that can encode this type in the entire database, what is the most frequently occurring other semantic type that is expressed by the set of constructions Y? Given these frequencies we can expand to include the closest neighbor(s), and bring semantic closeness into the equation as a parameter. In Table 12.5, an index of semantic closeness is presented in the form of an absolute number of shared constructions per semantic type (see Table 12.1 above) for the entire database. For each semantic type, the two closest neighbors are highlighted in different shades of grey.

Table 12.5 shows particularly strong connections between, on the one hand, the relative constructions and, on the other, temporal/reason/condition constructions, and to a lesser extent also with constructions of purpose relations. For complementation strategies the bonds between phasals, modals, and desideratives seem rather strong, as well as those between knowledge, perception, and again desideratives. If there are traces of contact to be found, we particularly expect them between these three types of semantic clusters. These different groupings can in turn be correlated to different morphosyntactic forms of the constructions. In particular, I will look at the following parameters:

1. the type of nominalization (participant versus event nominalizations);
2. the expression of core arguments as possessors;
3. case marking.

The subsections are organized according to these three formal parameters, in the order given above, followed by a final section that discusses other issues to do with nominalization.
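The closest-neighbor measure behind Table 12.5 can be sketched as follows: count, for every pair of semantic types, how many constructions can encode both, then rank the other types for a given type by that count. The mini-database is hypothetical and the function names are illustrative. Alphabetical tie-breaking is an assumption, since the chapter does not say how ties are resolved.

```python
from collections import Counter
from itertools import permutations


def shared_counts(constructions):
    """Count, for each ordered pair of types, the constructions encoding both."""
    counts = Counter()
    for types in constructions.values():
        for x, y in permutations(sorted(set(types)), 2):
            counts[(x, y)] += 1
    return counts


def closest_neighbors(counts, sem_type, k=2):
    """Return the k types that most often share a construction with sem_type."""
    row = {y: n for (x, y), n in counts.items() if x == sem_type}
    # ASSUMPTION: ties are broken alphabetically.
    ranked = sorted(row, key=lambda y: (-row[y], y))
    return ranked[:k]


# Hypothetical mini-database: construction id -> semantic types it can encode.
toy_db = {
    "cx1": {"temp", "reas", "cond"},
    "cx2": {"temp", "reas"},
    "cx3": {"temp", "purp"},
    "cx4": {"phas", "modl", "desid"},
    "cx5": {"phas", "desid"},
}
toy_counts = shared_counts(toy_db)
```

Here `closest_neighbors(toy_counts, "temp")` ranks `reas` first, because two constructions encode both types. With `k=2` the function mirrors the two highlighted neighbors per type mentioned for Table 12.5.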

4.1

The type of nominalization

Comrie and Thompson (2007: 334) make a major distinction between nominalizations that name an activity or state (“A forms”) and those that name an argument (“B forms”). They furthermore claim a basis for this division in that “the A forms retain certain properties of the verbs and adjectives they are related to, while those in B behave syntactically like other nouns in the language” (p. 334). In the questionnaire, whether a construction counts as a participant (argument) nominalization or as an event nominalization is linked to bound flagging: if a dependent EDU is marked by a bound marker, and that marker at the same time singles out a participant, the construction counts as an argument nominalization.7 I focus on those languages that have such markers and look at their distribution over the continent, as well as their distribution over the semantic space.

A total of thirty-one constructions in twenty-one languages meet the narrow definition of participant nominalizations given above. As expected, these constructions are highly skewed in terms of semantics. All of the constructions can encode relative relations, one of the clusters in Table 12.2, and sixteen of them are used exclusively for relative relations. Nevertheless, the constructions differ in terms of the other semantic relation types they can encode, with purposive, spatial, and temporal relations as the most common non-relative semantic types.

There are two broad strategies that participant-nominalizing languages follow in the relativization of core arguments: (i) the underspecification of participant-denoting nominalizers, and (ii) the use of a paradigm of role-specific nominalizers, specifying the semantic role of the relativized argument in the relative clause. The three groups are indicated on Map 12.1: white for no participant nominalizations as relativization strategy; black for those languages that do have participant nominalizations; and grey dots for those languages that have participant nominalizations in a construction where there is a semantically non-specific derivation.

Map 12.1 The use of participant nominalization as a relativization strategy

7 This definition is rather narrow and ignores, for instance, unmarked nominalizations or nominalizations marked by a free marker, and it is restricted to core arguments. However, it captures the most common patterns found in the corpus, and can therefore be expected to give meaningful patterns.

To illustrate this latter difference, consider examples from Desano (Tucanoan) and Kamaiurá (Tupí-Guaraní), which represent the two types. In Desano, there are animate and inanimate nominalizers. Normally, the animate nominalizers yield an agentive relativization and the inanimates a patientive one, but this is not necessarily so, and as a consequence, animate patients yield ambiguous nominalizations (Miller 1999: 142):

(1) buʔe-gi
    study-nlz.m.sg
    ‘the one who teaches/the one who studies’

In Kamaiurá, on the other hand, there are different nominalizers depending on the role of the relativized argument in the relative state of affairs. There are separate markers for deriving S (-ama’e), A (-tat), and P (-ipyt) arguments. Example (2) illustrates the S argument nominalizer (Seki 2000: 179).

(2) a-mo-y’u       rak  akwama’e-a  i-’ywej-ama’e-her-a
    1sg-cau-drink  at   man-nuc     3-be.thirsty-nlz-pst-nuc
    ‘I made the man who was thirsty drink.’

As mentioned above, some languages allow for other semantic relations to be expressed by these participant nominalizations. These extensions basically follow along the same lines as those mentioned above: non-specific versus paradigmatically opposed specific markers. An example of the first type is the suffix -taĩ in the Jivaroan language Aguaruna, which singles out a participant, broadly defined as non-S/A. The precise interpretation depends on whether it carries a case marker or not (Overall 2007: 435).

(3) a. [buukɨa  paka-taĩ-numa]         ɨhɨɰ̃a-u
       [skull   peel-non.a/s:nr-loc]  caus+arrive-rel
       ‘He brought them to the place where skulls were skinned (to make shrunken heads).’ (6:3:32)
    b. [iwa  wampatʃi  aɨntsu  ɨŋkɨpa-taĩ-utʃi-hĩ]
       [Iwa  backpack  person  put-non.a/s:nr-dim-pert:1pl/3]
       ‘his backpack that Iwa puts people in’

An example of the second type is the Nambikwaran language Mamaindê (Eberhard 2009: 523–524), where different classifiers, which have a derivational function, can mark different roles in the state of affairs.

(4) a. Paulo-soʔka    wanih-soʔka   kajauka    haiʔka    wanũn  set-soʔka
       paulo-ncl.hum  tell-ncl.hum  white.man  language  good   speak-ncl.hum
       kaʔjãinʔ-ø-tʰunna-wa
       write-s3-fut2-decl
       ‘Paulo, the teacher, the one who speaks the white man’s language well, he will write.’
    b. anuʔka-hen-ã         eu-khit-ten-lat.a-O-wa
       gather-ncl.time-fns  see-s1.pl-des-s3-prs-decl
       ‘When we gather together, we will see (about that).’

There are a few potential areal patterns on Map 12.1: (a) the south-central and north-central Andes and foothills (Cuzco Quechua, Huallaga Quechua, Aymara, and foothill languages Leko and Yurakaré and in the north-central area Awa Pit, Aguaruna, and Huallaga Quechua), (b) Rondônia and adjacent areas in eastern Bolivia (Baure, Itonama, Mekens, Mamaindê, Karo, and Apurinã), and (c) the border area between Colombia and Brazil and northeastern Peru (Puinave, Tariana, Desano, Miraña, Urarina). All three of these loosely defined areas are associated with linguistic areas and diffusion of linguistic features, in respective order: the Andean area (see e.g. Torero), the Guaporé-Mamoré area (Crevels and Van der Voort 2008),8 and the Vaupés (Aikhenvald 2002).

In particular, there seems to be an Andean tendency for agent nominalizations that can be used as relative clauses, but specific participant nominalizations also occur throughout the Amazon. Semantically neutral markers or strategies are found in some adjacent languages (Itonama and Baure in northeast Bolivia; Desano and Tariana in the Vaupés area in the border area between Colombia and Brazil; and Miraña slightly further off, in the border area between Colombia and Peru). Furthermore, a functional equivalence between participant nominalizations and relative clauses seems to be a genealogical trait of a few large families, such as Quechuan, Aymaran, Tupian, and Cariban. The general picture, therefore, seems to combine two factors: some of the most widely dispersed families have this characteristic, and the trait may also have spread through contact in a number of more regional environments.

A curious final point for this section is that there are four languages, spoken in non-adjacent areas, that permit the participant nominalization to mark same-subject purpose clauses. These constructions are cross-linguistically not very common. The examples come from Cuzco Quechua (Lefebvre and Muysken 1988: 22), Desano (Miller 1999: 153), and Kamaiurá (Seki 2000: 188), respectively.9

(5) a. mikhu-q  hamu-ni                                        (Cuzco Quechua)
       eat-ag   come-1
       ‘I come to eat.’
    b. wai   wẽhẽ-rã         wa-rã        ba-bo-rã             (Desano)
       fish  kill-an.nlz.pl  go-hort.imp  eat-pot-an.nlz.pl
       ‘Let’s go kill fish in order to eat!’
    c. morerekwar-a  je=r-enõj     je=r-etsak-ar-am            (Kamaiurá)
       chief-nuc     1sg=rel-call  1sg=rel-see-nlz-attr
       ‘The chief called me to see me.’

8 The extent of this area, especially towards the west in Bolivia, is unclear (it is argued to also include the foothill languages), but the clearest areal patterns seem to be found in Rondônia (see Muysken et al. in press).

9 The fourth language is Huallaga Quechua which, since it is related to Cuzco Quechua, is not represented in the examples.

4.2

Possession

An alternative way to express participants in nominalized constructions is by encoding them as possessors. Typological research suggests that S, A, and P participants are all potentially expressed as possessors in nominalizations, but that subject possessors (S/A) are more likely and more frequent than object possessors. One particular difficulty that arises for South America is that possessors are formally often expressed in the same way as one of the core arguments. For the coding of the questionnaire, this means that there are three answer categories for both subjects and objects: either they are not expressed as a possessor, they are expressed as a possessor, or it is impossible to tell because there is no formal difference between the expression of a possessor and a subject/object. The three categories are shown on Map 12.2 for the subject category and on Map 12.3 for the object category, with the languages that do not have a construction where the subject/object is expressed as a possessor in white, those that do have constructions where the subject/object is expressed as a possessor in black, and those for which it is impossible to tell in grey.

As can be observed on these maps, subject possessors are particularly common in the Andean and adjacent areas, presumably under the influence of Quechuan and Aymaran languages, but they also occur in non-contiguous spots in the Amazon. Object possessors are less common and, moreover, geographically more scattered. In terms of semantics, the constructions with subject possessors are more or less divided over the range of semantic relation types; the most frequent type is object relativization, illustrated by the contrastive pair from the isolate language Itonama (Crevels 2010: 688), where the b-example is a relative clause, with the subject expressed in the same way as a possessor.

(6) a. k’i-chuduwa’-na  lauro  chamaye
       appl-buy-neut    Lauro  manioc
       ‘Lauro bought manioc.’
    b. lowo’-tya       chamaye  ah-mi-k’i-chuduwa’-te   lauro
       be.rotten-stat  manioc   3-rel-appl-buy-cnt      Lauro
       ‘The manioc that Lauro bought was rotten.’

Map 12.2 The encoding of notional subjects as possessors in subordinate clauses

Other slightly more frequent relation types are temporal, reason, purpose, and desiderative relations, partially following Cristofaro’s case hierarchy given above. Given their infrequent occurrence, not much can be said about the semantics of constructions with object possessors. Moreover, the few constructions are more or less evenly divided over the semantic types. Both Cariban languages in the sample, Tiriyó (Meira 1999) and Hixkaryana (Derbyshire 1979), have constructions with object possessors. This can be connected to a more general characteristic of nominalizations in Cariban languages, which follow an ergative pattern in the sense that it is the absolutive argument that is expressed as a possessor (Gildea 1992: 125).

Map 12.3 The encoding of notional objects as possessors in subordinate clauses

Map 12.4 The use of case marking to form adverbial clauses

As a general conclusion of this section, it seems that expressing core arguments as possessors is possibly areally diffused in the case of the Andean area, with the Quechuan and Aymaran languages as the most likely agents of the spread. Object possessors are rarer, and more scattered geographically, but Cariban languages in general seem to have absolutive possessors in their nominalized clauses.
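The three answer categories used for Maps 12.2 and 12.3 can be sketched as a small coding scheme. The category names are illustrative. The way several constructions collapse into one map color per language is an assumption: the text states the three categories, but not which one wins when a language has constructions of different kinds.

```python
from enum import Enum


class Coding(Enum):
    """Answer categories for subject/object possessors, with their map colors."""
    NOT_POSSESSOR = "white"
    POSSESSOR = "black"
    INDETERMINATE = "grey"


def code_language(construction_codings):
    """Collapse per-construction codings into one map category for a language.

    ASSUMPTION: a clear possessor construction outranks an indeterminate one,
    which in turn outranks the absence of possessor marking.
    """
    codings = set(construction_codings)
    if Coding.POSSESSOR in codings:
        return Coding.POSSESSOR
    if Coding.INDETERMINATE in codings:
        return Coding.INDETERMINATE
    return Coding.NOT_POSSESSOR
```

For example, a language with one indeterminate construction and no possessor-marking construction would be plotted as `Coding.INDETERMINATE`, i.e. grey.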

4.3

Case and adpositions

One of the more common nominal features acquired by nominalized predicates is the ability to take case markers, or to be the object of an adposition. In fact, all languages of the sample that have case markers and/or adpositions use these in the formation of complex sentences, with the possible exception of Tapiete. It is therefore not very insightful to project this onto a map, so I have instead chosen to look at oblique case only, used in the formation of adverbial clauses, as shown in Map 12.4. As can be seen, the majority of languages can form adverbial clauses with case markers or adpositions. This makes this type of construction particularly interesting from the perspective of this chapter, as it is a potential candidate for diffusion.

Table 12.6 takes a closer look at the case/adposition-marked adverbial clauses in the sample, with each column indicating a different type of adverbial relation and, for each language, the case marker(s) or adposition(s) that can be used to form the respective adverbial relation type. Empty cells do not necessarily mean that cases or adpositions are not used to express those relation types but can also indicate a lack of information. The table describes the potential of constructions to take oblique case markers, not the obligatoriness of the markers. Furthermore, the information only concerns the semantic relation types that are considered in the questionnaire.

As shown in Table 12.6, temporal, reason, and locative clauses (partly corresponding to the “adverbial” cluster in Table 12.5) in particular tend to be marked with an adposition or a case marker. An often observed strategy is the extension of locational markers to encode temporal relations. Some of the languages that follow this strategy are spoken relatively close to each other (Hup and Tariana in the Vaupés area; Huallaga Quechua and Shipibo in northeastern Peru; Cuzco Quechua, Mosetén, Leko, and Yurakaré in the south-central Andean foothills). Others, such as Mekens and Tiriyó, are more isolated geographically. It may be that contact with members of the Quechuan family has promoted the spread of spatial markers to encode temporal clauses. Another recurring strategy is to use instrument markers for reason relations. The languages that do this, however, are not spoken in a shared vicinity.

In summary, case marking, or the use of adpositions, is a common strategy in South American languages to indicate relationships between events. Some of the sub-structures may be connected to proposed linguistic areas, such as the Andean area and the Vaupés. Again, Quechuan languages may have promoted the spread of this feature.

Table 12.6 Non-core case markers and adpositions used to form adverbial relations

tmp, rea, loc

Aguaruna      -numa ‘LOC’
Awa Pit       =akwa ‘because’; =ki ‘at’
Cuzco Q       ?; -rayku ‘cause’; ?
Desano        pi?ri after; kore before; bẽrã with; -ge ‘loc’
Hixkaryana    =ke because; =hona ‘towards’
Huallaga Q    -pita abl; -chaw loc; -pita abl; -chaw loc; -man dir
Hup           -V́t obl; -an dir; -V́t obl; -an dir
Ika           -ekɨ loc
Jarawara      jaa ‘peripheral’; jaa ‘peripheral’; jaa ‘peripheral’
Kamaiurá      r-ehe ‘about, wrt’; -ipe ‘loc’
Karo          =kəy dat
Kwazá         -ko instr; -ko instr
Leko          -ra loc; -ra loc; -ra loc
Mapudungun    -mew instr; (-mew instr)
Mekens        =ese loc; =ese loc; =eri abl
Miraña        -lií(hye) ben; -vu dir; -tu abl
Moseten       -tom com; =ya adess
Movima        n- obl
Puinave       -a all; -a all
Shipibo       pekao after (incl loc)
Tariana       -se loc; -nuku non a/s; -se loc
Tiriyó        =htao loc; =ke ins
Tsafiki       several locative
Urarina       baja after; bana when
Yanesha’      -ot loc
Yurakaré      =jsha abl; =la ins; =y loc; =chi dir

prp, cnd

-paq ‘ben’
-paq ‘ben’
-V́t obl
jaa ‘peripheral’
wi ‘ablative’ (avertive)
-ra loc
=ese loc
-dyesi’, -dyeti’ ben
n- obl
-a all
-a all
=htao loc

5

Conclusion

I set out to evaluate the claim that nominalization as a subordination strategy has spread through South America by diffusion through contact, rather than through chance or genealogical inheritance. In order to meet this challenge I tried to answer two questions, repeated here:

(i) Is the distribution of nominalized subordinate clauses geographically skewed towards South America?
(ii) Is there variation within the group of nominalized structures, and is that geographically skewed?

On the global level, question (i) can be answered positively: the occurrence of nominalizations as subordination strategies is significantly higher in South America than would be expected on the basis of Cristofaro’s (2003) global sample. This fact alone rules out chance as a possible explanation. Within South America, since almost all languages of the sample have nominalized constructions that can be used as a subordination strategy, there is no clear geographic skewing.

The first part of question (ii) can also be answered positively, as even a superficial look at the NeighborNet in Figure 12.1 shows. The second part of question (ii), whether the variation is geographically skewed within South America, is less clear. I reviewed three formal parameters along which nominalizations can differ from each other. In particular, participant nominalizations and case marking are very common strategies. Assuming a diffusion through contact scenario, the widespread occurrence of participant nominalization may be related to a combination of two facts: the major families (Quechuan, Tupian, Arawakan, Cariban) have these structures, and these features have spread in several smaller areas, such as the Vaupés, the Andean area, and Rondônia (the Guaporé-Mamoré). A similar account can be given for the use of case markers to form adverbial relation types, especially for the Andes. Moreover, the semantic coherence of these groups of constructions makes a spread scenario more plausible. With respect to possession, the semantic coherence is less clear, and the occurrence of core argument possessors is also less pervasive.
In particular, agent possession seems common in the Andean area and adjacent zones. The fact that the Andean area is so dominantly present in all of these areas goes against Dixon and Aikhenvald’s (1999: 10) claim that clause nominalization is an Amazonian, and not an Andean, phenomenon. The patterns furthermore only partly confirm Crevels and Van der Voort’s claims for subordination through nominalization as an areal feature for the Guaporé-Mamoré area. In the first place, as we have seen, clause nominalization is extremely common, and occurs well beyond the Guaporé-Mamoré area, and second, coherent patterns for the linguistic area itself seem to occur mainly on the Brazilian side of the area.

These patterns do not give us a definitive or direct explanation of the skewed distribution, but they are consistent with a scenario of spread through contact: not as the result of a continent-wide spread region, but rather as the result of several smaller spread zones, and through a few language families with major extensions, like Quechuan, Tupian, and Cariban. The patterns found do not completely rule out an inheritance-based account, but because nominalized structures are found throughout the continent and across language families and stocks, this would mean that the predominance of nominalizations is an extremely old pattern, and the variation found within the group of nominalized structures does not suggest extreme stability for this structure.

Another possible reason for the predominance of nominalized clauses is that it stands in a dependency relation to some other widespread, more fundamental structural feature of South American languages. This question falls outside the scope of this chapter, and is left for further research. Further research should also make clear whether similar patterns of regional spread can be found for the underrepresented areas in the sample, in particular towards the east.

Part IV

Major findings and conclusions

13

The languages of South America: deep families, areal relationships, and language contact

Pieter Muysken, Harald Hammarström, Joshua Birchall, Swintha Danielsen, Love Eriksen, Ana Vilacy Galucio, Rik van Gijn, Simon van de Kerke, Vishnupraya Kolipakam, Olga Krasnoukhova, Neele Müller, and Loretta O’Connor

After summarizing the earlier chapters, we sketch a general overview of the different phases in the development of South America. We then explore the possibility of a continental bias for typological features characteristic of South America, which may point to the early entry of a limited set of features into the continent. Subsequently we analyze possible deep families or macro-groups in the continent, and their regional distribution. We then turn to the issue of whether different subsets of structural features yield different distance matrices for the language families studied. To further explore contact possibilities, the results for language contact in our book are charted. Finally, we conclude and take stock of what has been achieved and how further research should proceed.

1

Introduction

In the contributions assembled in this book we have explored a number of specific cases of language expansion and contact, as well as four sub-domains in which the genealogical and geographic distributions of features in different domains of the grammar were charted. In this chapter we further reflect on how we can relate these contributions to the general questions posed at the beginning of this book:

(A) Why are there around 108 genealogical units in the continent? Why so many language families, and why so many isolates? What is the distribution of both larger families and isolates?
(B) Given the apparent genealogical diversity, why are there so many shared specific areal typological patterns, some characterizing most of the continent as a whole, and some individual parts of the continent?
(C) What can we learn about the relation between the issues in (A)–(B) from the perspective of language history (vertical transmission) and language contact (horizontal transmission)?

After summarizing the chapters in Section 2, we sketch a general overview of the different phases in the development of South America in Section 3. Section 4 deals with the possibility that there is a continental bias for typological features characteristic of South America, which may point to an entrance of a limited set of features into the continent in the early stages of its peopling. In Section 5 we turn to possible deep families or macro-groups in the continent, and their regional distribution. Section 6 raises the issue of whether different subsets of structural features could yield different distance matrices for the language families studied, and in Section 7 the results for language contact in our book are explored. In Section 8 we conclude and take stock: what has been achieved and how ought we to proceed in further research?

The present chapter has resulted from the work in our group over the last few years. We also acknowledge the input of the various researchers listed in our acknowledgments, notably also Helder Perri Fereira, at different points on the ideas presented here.

2

Summary of the contributions in the book

In the first chapter Muysken and O’Connor presented the main issues raised in this book, against the background of the genealogy, typology, and language contact situation of the South American indigenous languages. All three areas are underexplored so far, and particularly the relationship between them raised many unresolved questions. In the subsequent chapter O’Connor and Kolipakam developed a portrait of population movements and contacts in South America, from initial migrations some 15,000 years ago through millennia of dispersal and interaction, which resulted in localized pockets of population growth and cultural development. Current genetics research supports separate patterns of population density and interaction between East and West, and various types of evidence point to localized social complexity and down-the-line contact without major population dispersals until roughly 4,000 years ago. Hammarstr¨om examined the role of basic vocabulary comparison in the classification of South American languages with two empirical results emerging. First, the classification of South American languages by Loukotka (1968), based on basic vocabulary inspection, closely mirrors the classification presented by Campbell (2012a) for which far more extensive lexical and grammatical data had become available. Second, results of automated lexical comparison (ASJP) have a high degree of correspondence to those of traditional methods, despite the simplistic assumptions of the former and question marks on systematicity and objectivity of the latter. Thus shallow groups are robustly

Deep families, areal relationships, and language contact

301

recognizable in basic lexicon, and provide the foundation both for tracing earlier connections between shallow groups and for tracing contact that occurred within the time frame of the shallow groups. In a regional case study of the Isthmo-Colombian area, O’Connor devised a metric of feature categorization that incorporates sensitivity to properties of human interaction. Results indicate that analyses of both contact and genealogical relations are enhanced by categorization that reflects the impact of social constraints on linguistic change as well as conventional notions of stability in linguistic systems. Reflections of social scenarios need to be combined with simple frequency of contact. Van Gijn’s survey of the distribution of Andean and Amazonian features in the upper Amazon area shows that the transition from the Andean to the Amazonian area is gradual and complex. This is consistent with the intricate history of contact between the different ethnic groups of the area, and it presents a strong argument for connecting the research traditions associated with these areas. Morphosyntactic influence generally seems to represent older contact situations than phonological influence. In their chapter on the Andean matrix, Van de Kerke and Muysken argued that the traditional division of the Quechuan family into two main branches can be maintained for structural features. However, Aymaran is structurally closer to Central Peruvian Quechua than innovative Ecuador Quechua. Other Andean languages differ much more than previously assumed. Eriksen and Danielsen sketched the birth, expansion, and fragmentation of the Arawakan culture and languages across Amazonia. This ethnolinguistic complex is characterized by a robust uniformity that was sustained until late prehistory, resulting from an intensive exchange system that − despite expansion in a multidirectional and irregular fashion − managed to keep the system together across vast distances. 
Eriksen and Galucio showed that one out of five expansive Tupian branches, Tupí-Guaraní, expanded through a hybridizing culture that spread across vast geographic distances through the absorption of cultural and linguistic elements from neighboring populations. The linguistic analysis shows that lexical features were better preserved than structural ones, and that the expansion process likely continued into the historical period.

With respect to Tense/Aspect/Mood/Evidentiality (TAME) systems, Müller presented evidence that grammatical desiderative markers occur more frequently in South American languages than in other parts of the world. Desideratives in the sample stem from proto-forms, but they also developed due to language-internal pressure and contact-induced grammaticalization.

Birchall examined the diverse array of verbal argument marking patterns encountered across the continent and tested for regional distributions of certain often-discussed features. Statistical tests showed that many areal proposals in the literature are in fact not significant, and that an East–West division was often more significant than the classic Andean–Amazonian division.

Krasnoukhova showed that in Noun Phrase structure there is a split between languages spoken in the western part vs. the eastern part of the continent, and not between the Andes and the Amazon as has been traditionally assumed. While the western part corresponds to the Andean sphere, the eastern part includes languages spoken far beyond the Amazon region. Furthermore, in a case-study on semantic features encoded by demonstratives, Krasnoukhova has shown that the Chaco and the Southwest Amazon region stand out on the continent for encoding verbal categories with demonstratives.

And finally, Van Gijn showed that nominalization as a subordination strategy is significantly more pervasive in South America than would be predicted on the basis of global patterns. The patterns found within South America are most consistent with a scenario of several smaller spreads, possibly promoted by a few language families with major extensions (e.g. Quechuan, Tupian, Cariban).

3 Phases in the development of the South American languages

To organize our answers to these questions, we will use a framework in terms of four phases in the history of the continent, building on O’Connor and Kolipakam (this volume). It is impossible to look into the past as far back as 12,000 BCE, but the most likely scenario for the history of the languages of South America that we can infer from the current evidence involves the following:1

I 11,000–6000 BCE Initial settlement and dispersal
A small (