Mapping Unity and Diversity World-Wide: Corpus-Based Studies of New Englishes (Varieties of English Around the World) 9027249032, 9789027249036

This volume presents a collection of in-depth cross-varietal studies on a broad spectrum of grammatical features in Engl


English Pages 308 [309] Year 2012



Table of contents :
Title Page
Table of contents
International Corpus of English
Introduction
“Off with their heads”
1. Introduction
2. The corpus-driven approach to TAM
2.1 Tagging and chunking
2.2 Beheaded verb groups
2.3 Comparing observed and expected frequencies
3. Corpus-driven results and analysis
3.1 ICE-Fiji
3.2 ICE-India
3.3 ICE-New Zealand
3.4 ICE-Ghana
3.5 ICE-Great Britain
4. Analysis of selected features
4.1 Tense
4.1.1 Lexical heads and tense
4.1.2 Qualitative analysis: A case study on perfect constructions
4.1.2.1 Past perfect.
4.1.2.2 Present perfect.
4.2 Modality
4.3 The progressive
5. Conclusion
References
Appendix
Modals and quasi-modals in New Englishes
1. Introduction
2. Recent diachronic trends
3. The Englishes
4. The data
5. The Englishes compared
6. Speech and writing compared
7. The individual quasi-modals
7.1 have to
7.2 have got to
7.3 be going to
7.4 want to
8. Conclusion
References
The diverging need (to)’s of Asian Englishes
1. Introduction
2. Need and need to in 1960s and 1990s British and American English
3. A methodological preliminary: British English once more
4. The needs of four Asian Englishes
5. Conclusion
Acknowledgements
References
Corpora
Will and would in selected New Englishes
1. Introduction and previous research
2. New Englishes selected
3. Data and method
4. Results and discussion
4.1 Results and general findings
4.2 Trinidadian English
4.3 Jamaican English
4.4 Bahamian English
4.5 Fiji English
4.6 Indian English
4.7 Singapore English
5. Conclusion and outlook
References
Progressives in Maltese English
1. Introduction
2. Outline and contextualization of Maltese English
3. Previous research on progressives
4. The variable: Definition and constraints
5. Data
6. Quantitative analysis
6.1 Maltese and British newspaper corpora
6.2 Comparison of spoken and written corpus data
7. Qualitative analysis
8. Questionnaire data
9. Conclusion
References
Mapping unity and diversity in South Asian English lexicogrammar
1. Introduction: Unity and diversity in and across South Asian Englishes
2. Verb-complementational patterns as parameters of variation
3. Verb complementation of TCM-related verbs in South Asian Englishes
3.1 The patterns of CONVEY, SUBMIT and SUPPLY
3.2 TCM-related verbs: previous studies of verb-complementational variation
4. Corpus data
4.1 The International Corpus of English (ICE)
4.2 Web-derived newspaper corpora
5. Analysis and results
5.1 Verbs under scrutiny: CONVEY, SUBMIT and SUPPLY
5.2 CONVEY in the ICE and SAVE corpora
5.3 SUBMIT in the ICE and SAVE corpora
5.4 SUPPLY in the ICE and SAVE corpora
6. Discussion and conclusion
References
Particle verbs across first and second language varieties of English
1. Introduction: Unity and diversity in World Englishes
2. Particle verbs in first and second language varieties of English
3. Methodology
3.1 Corpus data – the International Corpus of English
3.2 Particle verbs with up
4. Quantitative analysis and results
4.1 PVUs in the ICE [W140] corpora
4.2 The distribution of PVUs within and across the ICE [W140] corpora
4.3 Formality, genres and PVUs in the ICE [W140] corpora
5. Qualitative analysis
5.1 Unrecorded PVUs across the ICE [W140] corpora
5.2 Additional particles in PVUs across ICE [W140] corpora
5.3 Innovative usages in PVUs across ICE [W140] corpora
6. Conclusion and outlook
References
Particle verbs in African Englishes
1. Introduction
2. Corpus and methodology
3. Analysing the data
4. Results and discussion
5. “Innovative” PVs in Ugandan English
5.1 Widening the search
6. Conclusion
Relatives worldwide
1. Introduction
2. Data and analysis
3. Results
3.1 Overall occurrence of relative clauses and relative markers
3.2 Types of relative clauses
3.3 Stylistic variability
4. Summary and discussion
References
Change from to-infinitive to bare infinitive in specificational cleft sentences
1. Introduction: The variable and its variants
2. Rapid shift to the unmarked infinitive in contemporary British and American English: Real-time evidence from the “BROWN family” and the DCPSE
3. Specificational clefts in ICE: Quantitative survey
4. Illustration of specific local usages in New Englishes
4.1 Syntactic complexity and the retention of to
4.2 All-clefts and what-clefts
4.3 Insertion of that after BE
4.4 Omission of BE.
5. Conclusion
“And they were all like ‘What’s going on?’”
1. Introduction
2. Background
3. Data and methods
4. Results and discussion
4.1 The factor “register”
4.2 The factor “collection period”
4.3 The factor “grammatical person of the quotative”
4.4 The factor “content of the quote”
4.5 The factor “speaker sex”
4.6 Multivariate analysis
5. Conclusion
References
Index

Citation preview

Mapping Unity and Diversity World-Wide

Varieties of English Around the World (VEAW)

A companion monograph series devoted to sociolinguistic research, surveys and annotated text collections. The VEAW series is divided into two parts: a text series contains carefully selected specimens of Englishes documenting the coexistence of regional, social, stylistic and diachronic varieties in a particular region; and a general series which contains outstanding studies in the field, collections of papers devoted to one region or written by one scholar, bibliographies and other reference works. For an overview of all books published in this series, please see http://benjamins.com/catalog/veaw

Editor
Stephanie Hackert, Ludwig-Maximilians University, Munich

Editorial Board
Manfred Görlach, Cologne
Rajend Mesthrie, University of Cape Town
Peter L. Patrick, University of Essex
Edgar W. Schneider, University of Regensburg
Peter Trudgill, University of Fribourg
Walt Wolfram, North Carolina State University

Volume G43 Mapping Unity and Diversity World-Wide: Corpus-Based Studies of New Englishes Edited by Marianne Hundt and Ulrike Gut

Mapping Unity and Diversity World-Wide Corpus-Based Studies of New Englishes Edited by

Marianne Hundt University of Zurich

Ulrike Gut University of Muenster

John Benjamins Publishing Company Amsterdam / Philadelphia


The paper used in this publication meets the minimum requirements of the American National Standard for Information Sciences – Permanence of Paper for Printed Library Materials, ANSI Z39.48-1984.

Library of Congress Cataloging-in-Publication Data Mapping Unity and Diversity World-Wide : Corpus-Based Studies of New Englishes / edited by Marianne Hundt and Ulrike Gut. p. cm. (Varieties of English Around the World, ISSN 0172-7362 ; v. G43) Includes bibliographical references and index. 1. English language--Variation--Foreign countries. 2. English language--Foreign countries. 3. Languages in contact. 4. Linguistic change. 5. Communication, International. 6. Linguistic change. I. Hundt, Marianne. II. Gut, Ulrike. PE2751.M37   2012 427--dc23 2011044820 ISBN 978 90 272 4903 6 (Hb ; alk. paper) ISBN 978 90 272 7494 6 (Eb)

© 2012 – John Benjamins B.V. No part of this book may be reproduced in any form, by print, photoprint, microfilm, or any other means, without written permission from the publisher. John Benjamins Publishing Co. · P.O. Box 36224 · 1020 ME Amsterdam · The Netherlands John Benjamins North America · P.O. Box 27519 · Philadelphia PA 19118-0519 · USA

Table of contents

International Corpus of English: List of corpora

Introduction: Mapping unity and diversity in New Englishes
Marianne Hundt & Ulrike Gut

“Off with their heads”: Profiling TAM in ICE corpora
Gerold Schneider & Marianne Hundt

Modals and quasi-modals in New Englishes
Peter Collins & Xinyue Yao

The diverging need (to)’s of Asian Englishes
Johan van der Auwera, Dirk Noël & Astrid De Wit

Will and would in selected New Englishes: General and variety-specific tendencies
Dagmar Deuber, Carolin Biewer, Stephanie Hackert & Michaela Hilbert

Progressives in Maltese English: A comparison with spoken and written text types of British and American English
Michaela Hilbert & Manfred Krug

Mapping unity and diversity in South Asian English lexicogrammar: Verb-complementational preferences across varieties
Marco Schilk, Tobias Bernaisch & Joybrato Mukherjee

Particle verbs across first and second language varieties of English
Lena Zipp & Tobias Bernaisch

Particle verbs in African Englishes: Nativization and innovation
Gerald Nelson & Ren Hongtao

Relatives worldwide
Ulrike Gut & Lilian Coronel

Change from to-infinitive to bare infinitive in specificational cleft sentences: Data from World Englishes
Christian Mair & Claudia Winkle

“And they were all like ‘What’s going on?’”: New quotatives in Jamaican and Irish English
Nicole Höhn

Index

International Corpus of English
List of corpora

INNER CIRCLE
ICE-GB Great Britain
ICE-NZ New Zealand
ICE-AUS Australia
ICE-IRE Ireland
ICE-CAN Canada
ICE-USA United States of America

OUTER CIRCLE
ICE-HK Hong Kong
ICE-PHI Philippines
ICE-SIN Singapore
ICE-MAL Malaysia
ICE-IND India
ICE-SL Sri Lanka
ICE-NIG Nigeria
ICE-GHA Ghana
ICE-EA East Africa
ICE-EA(Ken) East Africa (Kenya)
ICELite An Internet-sourced International Corpus of English (various varieties)
ICE-FJ Fiji
ICE-MTA Malta
ICE-JAM Jamaica
ICE-T&T Trinidad and Tobago
ICE-BAH Bahamas

Introduction Mapping unity and diversity in New Englishes Marianne Hundt & Ulrike Gut

University of Zurich / University of Muenster

The current volume presents research on a broad range of New Englishes and an array of grammatical and lexico-grammatical variables. It is based on new corpus data of New Englishes that allow in-depth cross-varietal studies on a broad spectrum of grammatical features and possible substrate influences. The previous limitation of research on New Englishes to the regions covered by the “old” International Corpus of English (ICE) corpora with a focus on Asian varieties, but including only one African and one Caribbean variety, has been overcome by the compilation of a new set of ICE corpora. Many of the authors who have contributed to this volume are involved in the collection of such new corpora within the ICE framework and have thus considerably broadened the regional spread of New Englishes under investigation by adding data on two further African Englishes (Ghanaian and Nigerian English), two Caribbean Englishes (Trinidad and Tobago English and Bahamian English), a New English from the South Pacific (Fiji English), as well as Sri Lankan and Maltese English.

The current volume is novel in both its theoretical and methodological approaches. In theoretical terms, it explores the structural unity and diversity of New Englishes and thus investigates central aspects of dialect evolution and language change. In particular, it offers answers to the question as to what constrains new dialect formation, and explores universal trends across a wide range of contact situations. On the methodological side, it demonstrates the possibilities (and limitations) of quantitative and qualitative corpus analyses in comparative studies of New Englishes. Since most of the contributors are actively involved in the compilation of ICE corpora, the chapters in this volume also critically reflect on methodological issues, such as comparability of corpus data across the varieties, as well as the novel approaches that syntactic annotation (tagging and parsing) of the data allows us to take in the description of New Englishes, the use (and limitations) of Web-derived data as an additional source of information, and the possibility of complementing corpus data with evidence from sociolinguistic fieldwork.




The book starts off with an exploratory study (Schneider & Hundt) of a corpus-driven approach to the profiling of tense, aspect and modality (TAM) in New Englishes. The authors use syntactically annotated data to retrieve VPs without their lexical heads in two Inner-Circle (IC) Englishes (British and New Zealand English) and three geographically and typologically unrelated Outer-Circle (OC) Englishes, namely Indian English as a representative of a South Asian English variety, Fiji English as one of the Pacific Englishes and Ghanaian English as a variety of English from Africa. They discuss both quantitative differences in TAM across these Englishes and the problem of mapping quantitative differences onto the qualitative level of analysis, i.e. the question of whether differences in the frequency of TAM items are due to functional differences across the varieties they study. Crucially, they argue that the corpus-driven approach is meant to supplement rather than replace the hypothesis-driven approach to variation.

The subsequent chapters all take a more traditional hypothesis-driven approach. Chapters 2–4 deal with modality in New Englishes and Chapter 5 with the progressive aspect. Collins and Yao focus on a set of modals (must, should, will and shall) and semantically related quasi- or semi-modals (have (got) to, be going to and want to). They compare findings from corpora of IC varieties like British, American, Australian and New Zealand English with nine OC varieties, namely Jamaican, Singapore, Philippine, Indian, Nigerian, Malaysian, Hong Kong, Kenyan and Fijian English. The corpus data confirm previous findings that report an ongoing change from modals towards semi-modals in IC varieties, led by American English. Collins and Yao show that the more established OC Englishes like Jamaican English are closest to IC varieties whereas varieties like Kenyan and Nigerian English lag behind in this development.

Van der Auwera, Noël and De Wit home in on one modal – need and its lexical counterpart need to – in Asian Englishes (Hong Kong, Singapore, Philippine and Indian English), taking findings in the BROWN family of corpora as their starting point. The ICE data for the Asian Englishes are synchronic, so van der Auwera et al. take differences between speech and writing as a proxy for ongoing change on the assumption that the change is more advanced in spoken than in written texts. They find similar developments (unity) in the IC varieties but more diversity in the OC varieties. The diversity is more difficult to interpret because the changes they observe focus on one modal and a semantically related lexical verb. For a fuller understanding, van der Auwera et al. argue, the findings for need and need to should be embedded in a study of other modals to arrive at “modality’s semantic map” which could then be discussed with respect to explanatory factors such as the evolutionary stage of an OC variety, possible substrate influence, learner strategies or universal principles.


Like van der Auwera et al., Deuber, Biewer, Hackert and Hilbert also focus on a very limited set of modals, namely will and would, but unlike van der Auwera et al. they limit their discussion to synchronic variation. Using evidence from three Caribbean ICE corpora (Trinidad, Jamaica and the Bahamas), ICE-India and ICE-Singapore as well as comparable data from Fiji, supplemented by evidence from the IC variety British English, they corroborate and refine the results of previous studies on New Englishes which suggested that the use of will and would was more variable in OC than in IC varieties. The diversity they find in the OC varieties stems mainly from the different kinds of New Englishes investigated, i.e. those countries where (local) Standard English is a second dialect (ESD) alongside an English-based creole, and those where English is an institutionalized second language (ESL). Their study shows (a) that even small datasets from incomplete ICE corpora can be used in studying variation across different New Englishes and (b) that different kinds of New Englishes need to be studied in order to arrive at a broader picture of variation than one that maps only ESL varieties.

The scope of different kinds of New Englishes is further broadened by Hilbert and Krug’s contribution with a focus on a variety whose status in Kachru’s concentric circles model is not straightforward: Maltese English. Their case study presents results from ICE-Malta and ICE-Great Britain on the use of the progressive aspect, comparing it with previous findings in other New Englishes. They also combine corpus data with questionnaire data, as some nativized structures are not attested in the limited corpus data of ICE components that are still under construction.

Lexico-grammatical variation has been an obvious and central area for research into (ongoing) nativization in New Englishes. Three chapters focus on the lexico-grammar of verbs (Chapters 6–8). Schilk, Bernaisch and Mukherjee investigate CONVEY, SUBMIT and SUPPLY and their complementation patterns in the written parts of ICE-Great Britain, the British National Corpus news section, ICE-India and ICE-Sri Lanka as well as large Internet newspaper corpora. Systematic inter-varietal differences are found in the preferences for one-argument, two-argument and three-argument complementation patterns and at the level of individual complementation patterns. Marked differences between the two Asian varieties in some of these patterns and varying degrees of unity with British English usage lead the authors to conclude that regionally motivated labels such as “Asian Englishes” should be used with great care.

Zipp and Bernaisch investigate the usage of particle verbs with up in nine ICE corpora of IC and OC varieties of English – ICE-Great Britain, ICE-New Zealand, ICE-Ireland, ICE-India, ICE-Singapore, ICE-Philippines, ICE-Fiji, ICE-Sri Lanka and ICE-Ghana – and in Web-based data. Based on a series of quantitative and genre-based comparisons they show both divergent regional clusters and shared genre conventions with a high degree of inter-varietal homogeneity in the distribution of particle verbs with up. The data thus support two relational models of varieties: lexical diversity and lexical “teddy bears” correlate with the evolutionary status of a given variety, while the cluster analysis of the distribution of these particle verbs across various genres shows geographical proximity as the main correlating factor. In addition, qualitative analyses reveal the novel usage of particle verbs with up in some OC varieties of English.

The chapter by Nelson and Ren is concerned with the investigation of the use of particle verbs in Ugandan English and other African varieties. On the basis of both classic corpus analyses and Web searches, they find unity in the relative distribution and stylistic usage of the four major structural types across native varieties and the New Englishes. Diversity is found with respect to some “innovative” particle verbs in the African varieties of English such as result into, request for, demand for, culminate into, and look forward for. The authors argue that while these particle verbs are very frequently used in formal, official contexts they are not fully established or nativized (e.g. they do not have corresponding variants in the passive voice).

The next two contributions (Chapters 9 and 10) investigate syntactic variation across New Englishes. Gut and Coronel provide a comparison of relativization strategies in Nigerian, Jamaican, Philippine and Singapore English. Based on an analysis of the four ICE subcorpora they show that the New Englishes share a large number of relativization strategies, while some differences across them exist in terms of relative marker choice. They test previous claims of reduced stylistic variability and the influence of norm-orientation in these OC Englishes. It turns out that all four varieties of English exhibit systematic variation of relativization strategies with text types, but this is less pronounced in Nigerian and Philippine English, the two varieties whose norm-orientation is more external and where a local standard form of English has not yet been established.

Mair and Winkle use spoken data from ten ICE corpora (Great Britain, Australia, Canada, Ireland, New Zealand, India, Jamaica, Hong Kong, Singapore, Philippines) to trace ongoing change (in apparent time) in specificational cleft sentences. Their analysis shows that the drift away from the marked and towards the unmarked infinitive in British and American English, which occurred in the twentieth century, can also be found in the IC varieties of English spoken in Australia, Canada, Ireland and New Zealand. Furthermore, they find that the British-input OC varieties spoken in India, Jamaica, Hong Kong and Singapore are less advanced in this development than the IC varieties; Philippine English, on the other hand, is intermediate between British-input OC varieties and most IC varieties, which they interpret as being linked to its historical American source.


Höhn, finally, demonstrates that ICE corpora can also be used to study pragmatic features (Chapter 11). She draws on evidence from ICE-Jamaica and ICE-Ireland in her investigation of the quotative be like, which (unlike quotative go) is a globalizing feature found in many IC and OC Englishes. The results of her multivariate analysis confirm that, in the process of spread, quotative be like shows diverse influence of such factors as “grammatical person” or “content of the quote” across different varieties of English. This diversity seems to be more pronounced in different kinds of Englishes (IC vs. OC) than amongst varieties of English as a native language, a finding that needs further support from the spoken data of the new ICE corpora currently being compiled.

The collection of new ICE corpora has brought us to what we, tongue-in-cheek, have referred to as “ICE Age 2” (see the articles in the ICAME Journal No. 34, 2010). The studies collected in this volume all make use of the broader range of available corpus data for the study of New Englishes. They have made a start in mapping out the unity and diversity among different Englishes, but, in doing so, they have really only begun to scratch away at the tip of the ICE-berg. Further studies are needed to develop our initial sketch of a map into a more reliable tool for navigating the complexities that underlie the evolution and development of the New Englishes.

We would like to thank the contributors to this volume for sharing their views on World Englishes with us over the last year, and those who have been involved in the compilation of new ICE corpora for the stimulating discussions at various informal workshops. Melina Ruoss and Danielle Hickey helped to prepare the manuscript for publication. Danielle’s contribution as editorial assistant was particularly valuable – with her keen eye for detail, she spotted many an inconsistency that had escaped our notice: heartfelt thanks to both of them.

“Off with their heads”: Profiling TAM in ICE corpora
Gerold Schneider & Marianne Hundt
University of Zurich

The main aim of our chapter is a methodological one, that of comparing a largely data-driven approach to regional variation in world Englishes and a corpus-based approach. As a case study, we examine tense, aspect and modality (TAM) differences between five varieties. Our investigation uses frequency differences in verb chunks and tags, based on syntactically annotated material from the International Corpus of English. Most of our results corroborate previous, corpus-based findings. The data-driven findings guide our qualitative investigation of the perfect tense, modal verbs and the progressive. While our approach is far from being fully automatic, only minimal manual interaction is needed for going through and filtering the top one or two dozen entries in ranked lists.

Keywords: syntactically annotated data; data-driven approach; surprise-based measure

1. Introduction

Tense, aspect and modality (TAM) are grammatical categories that have been well documented, not only from a typological perspective, but also for the two major reference varieties of English, British (BrE) and American English (AmE) (see, amongst others, Hopper 1982; Bybee, Perkins & Pagliuca 1994; Biber et al. 1999; Facchinetti, Krug & Palmer 2003).[1] Some TAM features have also been discussed in the context of vernacular universals (see Kortmann & Szmrecsanyi 2004; Sharma 2009).[2] Very often there is synchronic variation but also ongoing change in this central area of grammar. A case in point is the shift away from core modals like must and should towards semi- or quasi-modals like need and have (got) to, a change that appears to be led by AmE (Leech et al. 2009: 73). Mair (2009a: 50, 2009b: 18) points out that modals of obligation and necessity, for instance, are “an almost perfect diagnostic to assess the synchronic regional orientation of a New English with regard to British or American norms and also its degree of linguistic conservatism”.[3] Other corpus-based studies investigate variation and ongoing change in the use of the progressive (passive) or the perfect in Inner-Circle (IC) and Outer-Circle (OC) varieties (e.g. Sharma 2001; Hundt 2009; van Rooy 2009; Hundt & Vogel 2011).[4]

[1] Modality (i.e. modal and semi-modal verbs) has received more attention than aspect and tense.
[2] Note that Kortmann and Szmrecsanyi (2004) base their comparison on the attestation of a feature in a variety, whereas Sharma (2009) investigates the functional equivalence (or non-equivalence) and relates it to possible language contact.
[3] On modals of necessity, see also Nelson (2003), Collins (2009), Collins and Yao (this volume) and van der Auwera, Noël and De Wit (this volume).
[4] Note that while the progressive is commonly recognized as an aspectual category, the perfect is classified as having mainly aspectual meaning (e.g. Biber et al. 1999; Quirk et al. 1985) or as being a tense marker (see Huddleston & Pullum 2002). Van Rooy (2009: 310) uses the label “perfect construction” and leaves the classification open. His analysis shows that it is more frequently used with the function of a tense rather than an aspectual marker (2009: 318).

Most of the existing research on TAM in different Englishes is based on lexical corpus searches (all possible surface forms need to be searched). This poses some unwanted restrictions on the analyses, for instance with respect to the possibilities of defining the variable context under investigation, as some phenomena in the envelope of variation may be missed. In his study of past perfect constructions in Indian English (IndE), Sedlatschek points out that

Since the […] Corpus was untagged, there was no straightforward way of comparing the past-perfect forms to those of other tense forms in IndE and in the other varieties, which would have been desirable in a next step to describe the position of the IndE past perfect in relation to other tense forms in IndE more exactly. (Sedlatschek 2009: 260)

Leech et al. (2009) and Hundt and Smith (2009) use tagged corpora but also a corpus-based rather than corpus-driven approach, i.e. they mostly use corpus data to test hypotheses on grammatical change in contemporary English. In a corpus-driven approach, the differences arise from the data set itself, as we discuss in Section 2.

In this chapter, we explore a partly corpus-driven approach to profiling TAM in grammatically annotated International Corpus of English (ICE) subcorpora. We use chunking in addition to tagging. Chunking is an intermediate level between part-of-speech tagging and syntactic parsing. We compare observed and expected frequencies of the non-lexical parts of verb chunks. This partly corpus-driven approach has advantages and disadvantages. One advantage is that it brings all surprising frequency differences to the researcher’s attention, thus reducing the risk of oversights. Furthermore, it does not look at individual features but at a broad range of phenomena. Moreover, working hypotheses largely arise from the frequency differences. In other words, the data in all their complexity demand an interpretation. Instead of focusing on a single phenomenon and its envelope of variation (Labov 1969), the researcher is forced to interpret many features, which possibly interact with each other. The detection of the envelope of variation is typically not corpus driven and often not clear. Arppe et al. (2010), for example, point out that their

[…] focus on alternations is the result of theoretical heritage from generative syntax and a matter of methodological convenience. Most linguistic decisions that speakers make are more complex than binary choices … alternations are as simplistic and reductionistic as the theories of language that originally studied them. (Arppe et al. 2010: 12)

Making the step from corpus-derived features to linguistic categories is not easy, however. Surface strings are often multifunctional (e.g. be is used as auxiliary for both progressive and passive); inflected and negated forms of be, on the other hand, will have to be subsumed in an analysis of progressive forms. In order to partly counteract this danger, we include qualitative analyses and more detailed quantitative analyses of many features.

In Section 2, we introduce our methodology, and present quantitative differences in Section 3. In Section 4, we explore some quantitative differences in more detail, i.e. whether more frequently used patterns in New Englishes turn out to be used differently at the qualitative level of analysis. For our case study, we focus on the two Inner-Circle Englishes, BrE and New Zealand English (NZE), and Outer-Circle varieties from different regions, namely IndE (South Asia), Fiji English (FijE, Pacific) and Ghana English (GE, Africa). ICE-GB, ICE-NZ and ICE-IND have all been completed and are available; ICE-FJ and ICE-GHA are still being compiled. We therefore limit our analyses to parts of the ICE corpora that were tailored to fit the material available from ICE-FJ and ICE-GHA.[5]

[5] For a list of texts chosen in our analyses, see Table 1a in the appendix.

2. The corpus-driven approach to TAM

The distinction between corpus-driven and corpus-based approaches has been described in detail by Tognini-Bonelli (2001). In corpus-based approaches, existing hypotheses are tested, while in purely corpus-driven or data-driven approaches, hypotheses arise entirely from the corpus data. In its more radical forms, the corpus-driven approach puts the axioms of linguistics, accepted linguistic categories like word classes or syntactic structure, into question (e.g. Yngve 1996). In less radical forms, it investigates gradience and interactions between lexis and grammar without preconceptions, e.g. using probabilistic models that predict feature values.

Corpus-driven approaches have a number of advantages and disadvantages. An advantage is that, in areas of gradience and subtle differences, a corpus-driven approach can bring patterns to the surface that went unnoticed by linguists (e.g. Hunston & Francis 2000). Variationist linguistics often deals with very subtle differences and gradient categories. An obvious disadvantage of corpus-driven approaches is that they are somewhat at the mercy of the data and thus the quality of the corpus: “… since the information provided by the corpus is placed centrally and accounted for exhaustively, then there is a risk of error if the corpus turns out to be unrepresentative” (Tognini-Bonelli 2001: 88). In our approach, we use both pre-established word classes and syntactic assumptions, but we use frequency patterns in verbal groups as a stopgap to variationist differences.

For a corpus-driven approach to grammatical categories, an obvious requirement is that the corpora be grammatically annotated. We tagged, chunked and parsed several ICE components and measured which sequences of the grammatical part of a chunk occur more frequently than expected in a given ICE component, in comparison to other ICE components. For example, in the verb group will eat the first word is the grammatical part, a closed-class structural word expressing future tense, while the second word expresses the event, using an open-class content word. We discuss our method in more detail in Section 2.2. Obviously, the annotation introduces predefined categories. In addition, the focus on verb chunks implies a tacit hypothesis, namely that significant frequency differences in the predefined grammatical categories are indicative of differences between varieties:
In addition, the focus on verb chunks implies a tacit hypothesis, namely that significant frequency differences in the predefined grammatical categories are indicative of differences between varieties: In the Firthian framework, the typical cannot be severed from actual usage, and “repeated events” are the central evidence of what people do, how language functions and what language is about. The statements derived from the formalisation of repeated events, therefore, are taken to correlate directly with language as a semiotic system, as realised in a specific corpus.  (Tognini-Bonelli 2001: 89)

2.1  Tagging and chunking

The use of tagging, chunking and parsing offers new perspectives to descriptive linguistics, as we discuss in more detail in Schneider and Hundt (2009). In the



“Off with their heads”

current investigation, we mainly use the chunking stage, which is a part of the parsing approach described in Schneider (2008). Chunking is based on part-of-speech tagging.6 Chunkers group certain sequences of tags or words together, usually verb groups and noun chunks. While tagsets are standardized,7 different chunkers may follow different policies. We use the conditional random field chunker Carafe.8 This is a chunker that takes a relatively greedy and semantic approach. A greedy chunking option typically coincides with being more semantic and less syntactic in nature.

Verb-group chunks can be more or less greedy. The expression going to sleep may be interpreted as consisting of a single chunk, or of the two chunks going and to sleep. Carafe is a relatively greedy chunker and reports large chunks, therefore going to sleep is a single chunk. Even control structures like wants to sleep are reported as one chunk, despite the fact that they extend across two clauses. An advantage of this greedy option is that the going to future is recognized at chunk level, and so are control structures (want to go), semi-modals (have to go) and modals (must go); they can thus be treated on a par. The use of a less greedy chunker would result in increased fragmentation of the data, and this, in turn, would lead to more data sparseness and constitute a potential disadvantage for our approach.

We only use tagging and chunking, not lemmatizing. This has the advantage that the approach can also spot irregularities involving distributions across the inflectional paradigms, and that the approach remains simpler and more corpus driven. At the same time, it may have the disadvantage that the potential level of abstraction of lemmatized data is not used.

2.2  Beheaded verb groups

Beheaded verb groups are verb groups whose lexical heads have been removed. The lexical head is usually the last word in the verb group. The beheaded verb
6.  Part-of-speech tagging is a standard technique in natural language processing. The accuracy of Treetagger with the Penn tagset is reported as 96.3% (Schmid 1994). There are a number of different tagsets that are used for English. We used the Penn Treebank tagset as it is the most widely used for chunker and parser input. With only 32 tags, this tagset is relatively small. Decisions that are difficult to take for taggers are left underspecified by this tagger, making the tagging more robust.
7.  By this we refer to the fact that taggers that use the same tagset (e.g. Penn Treebank) produce standardized output (i.e. standardization to tagsets across different taggers). We do not imply that standardization across different tagsets has been achieved (as this is not the case).
8.  http://sourceforge.net/projects/carafe. The performance of the Carafe random field chunker is 93.77% F-measure (Wellner & Vilain 2006). F-measure is a weighted harmonic mean of precision and recall.






group of the verb group going to sleep is thus going to, which unequivocally indicates going to future. The idea that beheading verb phrases or verb groups might be an interesting approach to TAM profiling in different Englishes occurred to us when we were exploring the heuristic value of parsing tools for the description of New Englishes (see Schneider & Hundt 2009). In initial experiments conducted for that paper, we used the entire verb group, but data sparseness proved problematic. Data sparseness typically affects open-class items. The only open-class word in a verb group is typically the lexical head, so we decided to cut the heads off the verb phrases. While tense information partly involves the head (with the exception of future tense), aspect and modality can largely be assessed on the basis of the beheaded verb group. In an additional step, it is also possible to analyse the inflected lexical heads in our corpus, for instance for a comparison of the use of simple past tense VPs with those that make use of the present or past perfect. This is usually possible by considering the part-of-speech tag of the chunk head. But in some New Englishes verbs do not inflect for past tense in past tense contexts; such uses obviously require manual analysis. The relation between beheaded verb groups and TAM profiling is not as straightforward as it may at first appear. On the one hand, not all beheaded verb groups are unambiguous with respect to TAM. Beheading is eating and is eaten, for instance, leads to the same beheaded verb chunk is. As the mapping from the phenomenon that we observe (beheaded verb groups) to the linguistic features we investigate (TAM) can be one-to-many, the results of beheaded VPs sometimes need to be looked at in context (e.g. the beheaded verb chunk is can have heads with the tag VBG, expressing progressive, or with the tag VBN, expressing the non-TAM category passive). 
Even the combination is plus VBG is not necessarily always progressive (e.g. this is interesting),9 so for fine-grained analyses, automatically obtained data need to be supplemented by manual sifting of the top 20 or 50 entries of the ranked lists. On the other hand, a single linguistic category may correspond to several of the observed features (beheaded verb groups). For example, is often eating beheads to is often, while is eating beheads to is, although both express the same TAM category (present progressive indicative).10

.  The word interesting, for instance, ought to be tagged as an adjective (tag JJ) but, as this is an area of gradience, there is a substantial margin of error in the tagging. .  It can be argued that the adverb often changes the TAM profile slightly: it adds the ­semantic aspect of repetition, and the resulting present progressive thus actually refers to a habitual action.




2.3  Comparing observed and expected frequencies

We observe the frequencies that one can expect in a homogeneous distribution11 across our ICE subsets and compare them to the frequencies in the individual corpora. We use a measure from the class of Observed over Expected (O/E). The simplest measure m that we use is

$$m_{\mathit{corp}} = O_{\mathit{corp}} - E_{\mathit{corp}} = f(\mathit{corp}) - \frac{f(\mathit{icefiji}) + f(\mathit{iceindia}) + f(\mathit{icegb}) + f(\mathit{icenz}) + f(\mathit{iceghana})}{5}$$

where corp is an ICE component and f(corp) is the frequency of a given beheaded verb chunk in the corpus. For example, $m_{\mathit{icefiji}}$ is:

$$m_{\mathit{icefiji}} = O_{\mathit{icefiji}} - E_{\mathit{icefiji}} = f(\mathit{icefiji}) - \frac{f(\mathit{icefiji}) + f(\mathit{iceindia}) + f(\mathit{icegb}) + f(\mathit{icenz}) + f(\mathit{iceghana})}{5}$$

The total number of verb groups varies, for example due to sentence length, and the values of m in the formula partly reflect the total number of verb chunks in the corpora. We thus use a version that normalizes for corpus size.

$$m_{\mathit{corp}} = O_{\mathit{corp}} - E_{\mathit{corp}} = f(\mathit{corp}) - \frac{f(\mathit{icefiji}) + f(\mathit{iceindia}) + f(\mathit{icegb}) + f(\mathit{icenz}) + f(\mathit{iceghana})}{n(\mathit{icefiji}) + n(\mathit{iceindia}) + n(\mathit{icegb}) + n(\mathit{icenz}) + n(\mathit{iceghana})}$$

where n is the total number of verb chunks in the ICE subcomponent. Values where O is below 5 are filtered out. This measure gives particularly high scores to frequent events.12
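The normalized measure can be sketched in a few lines of Python. The sketch assumes that the observed value is the frequency of a chunk in one corpus divided by that corpus's number of verb chunks, and that both O and E are scaled to a common base of 10,000, which is what the O and E columns of the result tables suggest. The function name, the `per` and `min_raw` parameters, the reading of the low-frequency filter as a raw-count threshold, and the toy counts are all assumptions made for illustration.

```python
from typing import Dict, Tuple


def oe_scores(freq: Dict[str, int], n_chunks: Dict[str, int],
              per: int = 10_000, min_raw: int = 5) -> Dict[str, Tuple[float, float, float]]:
    """Return (O, E, m) per corpus for one beheaded verb chunk.

    freq[c]     -- raw frequency of the chunk in corpus c
    n_chunks[c] -- total number of verb chunks in corpus c
    E is the pooled rate over all corpora (homogeneous distribution).
    """
    expected = per * sum(freq.values()) / sum(n_chunks.values())
    scores = {}
    for corpus, f in freq.items():
        if f < min_raw:  # assumed reading of the low-frequency filter
            continue
        observed = per * f / n_chunks[corpus]
        scores[corpus] = (observed, expected, observed - expected)
    return scores


# Invented toy counts, not taken from the ICE data.
toy = oe_scores({"A": 30, "B": 10}, {"A": 1000, "B": 1000})
```

Because the difference O − E is not divided by E, a chunk that is frequent everywhere can reach a high score with a modest relative deviation, which matches the remark that the measure favours frequent events.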

3.  Corpus-driven results and analysis

This section reports the results obtained by comparing rankings of the O/E measure mcorp. The information in the tables below includes (a) rank (column 1), omitting non-TAM relevant ones; (b) O/E measure (column 2) by which the table is ranked; (c) beheaded verb chunk (column 3); (d) observed (O) and expected (E) frequency per 10,000 words (columns 4 and 5, respectively); (e) raw observed

.  The homogenous distribution that is the expected frequency E which is used in the ­chi-square contingency test is e.g. calculated from the marginals, for each cell as E = (row total * column total) / grand total. .  We also tested a measure with normalization by frequency. It calculates (O-E/E) and tends to give high scores to relatively rare events. We do not use it in this investigation.






frequency (column 6);13 (f) manually added comment (column 7). We present a full list for the example of ICE-FJ, and discuss parts of the list for the other corpora.

3.1  ICE-Fiji

For ICE-FJ, the results for our O/E measure are reported for values down to mcorp = 2 (see Section 2 for the definition) in Table 1.

Table 1.  O/E ranking of TAM-relevant beheaded chunks in ICE-FJ

#   mcorp   Beheaded chunk: word_tag sequence  O        E        Freq.  Potential TAM cat.
2   37.18   are_vbp                            189.08   151.90   421    progressive or passive?
3   25.61   should_md                          70.96    45.35    158    modality
4   23.43   will_md                            150.45   127.02   335    future
5   20.13   should_md be_vb                    47.16    27.03    105    modality
6   13.12   were_vbd                           133.84   120.72   298    past progressive or passive?
7   11.15   can_md                             92.07    80.92    205    modality
8   10.82   was_vbd                            196.71   185.89   438    past progressive or passive?
9   7.15    have_vbp                           101.50   94.35    226    perfect
10  7.03    could_md be_vb                     25.60    18.57    57     modality
11  6.49    are_vbp not_rb                     17.52    11.03    39     progressive or passive?
12  5.87    will_md be_vb                      36.38    30.51    81     future
13  5.01    need_vbp to_to                     11.23    6.22     25     modality
15  4.19    would_md be_vb                     24.25    20.06    54     modality
16  4.17    are_vbp being_vbg                  8.98     4.81     20     progressive
17  3.88    needs_vbz to_to                    6.29     2.40     14     modality
19  3.44    have_vbp to_to                     13.47    10.03    30     modality
20  3.39    should_md not_rb                   8.53     5.14     19     modality
21  3.19    needs_vbz to_to be_vb              5.84     2.65     13     modality
26  2.69    need_vbp to_to be_vb               5.84     3.15     13     modality
27  2.57    has_vbz to_to be_vb                5.39     2.82     12     modality
29  2.43    should_md also_rb                  3.59     1.16     8      modality
31  2.24    should_md not_rb be_vb             5.39     3.15     12     modality
32  2.09    has_vbz not_rb                     6.74     4.64     15     perfect

.  All frequencies below 6 are cut so that differences that could not be included in a ­chi-square test of statistical significance, for instance, are not reported.




The verb tags have the following significance: MD modal; VB verb base form; VBZ verb present third singular; VBP other present; VBD simple past; VBN past participle; VBG present participle. RB is used for adverbs including negation. TO is used to mark the infinitive particle.

Rank 2 and rank 11 conflate the progressive and passive plural forms. In order to distinguish them, we compared the frequencies of the possible participle chunk endings in our datasets (see Table 2).

Table 2.  are followed by present or past participle across ICE subcorpora14

Search expression   India    GB       NZ       Fiji     Ghana
counts
are_VBP *_VBN       300      235      269      331      321
are_VBP *_VBG       77       67       67       90       74
per verb chunk
are_VBP *_VBN       1.34%    0.98%    0.99%    1.49%    1.30%
are_VBP *_VBG       0.34%    0.28%    0.25%    0.40%    0.30%

ICE-FJ and, to a lesser degree, ICE-IND and ICE-GHA use both the passive (tag VBN) and the progressive construction (tag VBG) more frequently in the plural. This indicates increased use of the passive, of the progressive, or simply of the plural variants of these constructions. Table 3 shows the counts for the singular and the plural taken together. The figures indicate a slight preference for the present progressive in ICE-IND and in ICE-FJ (see Hundt & Vogel 2011).15

Table 3.  is or are followed by present participle across ICE subcorpora

Search expression   India    GB       NZ       Fiji     Ghana
counts
is|are_VB. *_VBG    174      170      171      172      173
per verb chunk
is|are_VB. *_VBG    0.78%    0.71%    0.63%    0.77%    0.70%
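The search expressions in Tables 2 and 3 can be approximated with ordinary regular expressions over text in which every token is written as word_TAG. The Python sketch below is one possible reading of a pattern such as is|are_VB. *_VBG; the token format, the function name and the example sentence are assumptions made for illustration, not the retrieval setup of the study.

```python
import re
from typing import Iterable


def count_aux_plus_head(tagged_text: str, aux_forms: Iterable[str],
                        aux_tag: str, head_tag: str) -> int:
    """Count auxiliary + participle sequences in word_TAG text.

    aux_tag and head_tag are regular-expression fragments,
    e.g. "VB." for any verb tag with one further character.
    """
    aux = "|".join(re.escape(form) for form in aux_forms)
    pattern = re.compile(
        rf"(?<!\S)(?:{aux})_{aux_tag}\s+\S+_{head_tag}(?!\S)",
        re.IGNORECASE,
    )
    return len(pattern.findall(tagged_text))


# Invented example sentence in word_TAG format.
sample = "they_PRP are_VBP eating_VBG and_CC it_PRP is_VBZ eaten_VBN ._."
n_progressive = count_aux_plus_head(sample, ["is", "are"], "VB.", "VBG")
n_passive = count_aux_plus_head(sample, ["is", "are"], "VB.", "VBN")
```

The pattern only matches an auxiliary that is directly followed by the participle, so intervening adverbs (is often eating) would need an extended expression.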

A look at the ICE-FJ data reveals an interesting example. In sentence (1), the present progressive is used in a habitual context:

(1) Attitudes of tourists that are traveling nowadays are changing. (ICE-FJ W1A-001)

Recent corpus-based studies (Mair & Hundt 1995; Leech et al. 2009) did not observe an increase of this established but infrequent use of the progressive. The use of the progressive for habitual context has often been claimed to be a typical Indian feature (e.g. Mesthrie 2005: 322). Rank 16 also points to frequent use of the progressive, which we discuss in Section 4.3.

14.  The per verb chunk percentage expresses how many of all verb chunks have this pattern.
15.  Note, however, that ICE-FJ may have a preference for plural forms.

Ranks 3, 5, 7, 10, 13, 15, 17, 19, 20, 21 illustrate that ICE-FJ has a modal verb system that is significantly different from other Englishes. The FijE modal system has been described in detail in Biewer (2009), who finds that should, need to and have to, in particular, are more frequent in ICE-FJ than in other South Pacific Englishes. Concerning rank 15, for instance, Deuber et al. (this volume) show that in FijE relatively many uses of would are habitual.

Ranks 4 and 12 indicate that will is relatively frequent in ICE-FJ. The relative frequency of will does not mean that the going to future is used more rarely in ICE-FJ. In fact, the going to future also ranks in the top third. Table 4 compares raw frequencies of will and the going to future. Deuber et al. (this volume) show that some uses of will are also habitual in FijE.

Table 4.  will and going to future across ICE subcorpora

Search expression   India    GB       NZ       Fiji     Ghana
counts
will_VB             437      570      535      552      523
going_VBG to_TO     12       13       26       16       22
per verb chunk
will_VB             1.95%    2.38%    1.96%    2.48%    2.12%
going_VBG to_TO     0.05%    0.05%    0.10%    0.07%    0.09%

Rank 6 conflates past passive (were plus tag VBN) and past progressive (were plus tag VBG). In order to distinguish them, the frequencies of the possible participle chunk endings were investigated. They are given in Table 5 for our ICE data. In order to abstract over singular and plural form, singular counts (=rank 8) are also included.

Table 5.  was or were followed by participle across ICE subcorpora

Search expression    India    GB       NZ       Fiji     Ghana
counts
was|were_VB. *_VBN   578      510      995      662      527
was|were_VB. *_VBG   56       110      100      74       72
per verb chunk
was|were_VB. *_VBN   2.58%    2.13%    3.65%    2.97%    2.13%
was|were_VB. *_VBG   0.25%    0.46%    0.37%    0.33%    0.29%

Table 5 shows that past passives are particularly frequent in ICE-NZ. We will come back to the use of tense in Section 4.1, where we will see that past tense VPs are overrepresented in our ICE-NZ data as a whole.




Ranks 9 and 32 may indicate that the present perfect is more frequent in ICE-FJ. Table 6 gives the numbers for singular and plural perfect.

Table 6.  has or have followed by past participle

Search expression   India    GB       NZ       Fiji     Ghana
counts
have_VBP *_VBN      238      200      258      221      192
has_VBZ *_VBN       366      289      323      240      247
has|have *_VBN      604      489      581      461      439
per verb chunk
have_VBP *_VBN      1.06%    0.83%    0.95%    0.99%    0.78%
has_VBZ *_VBN       1.63%    1.21%    1.18%    1.08%    1.00%
has|have *_VBN      2.69%    2.04%    2.13%    2.07%    1.78%

While the results for plural perfects seem to indicate that the perfect is more frequent in ICE-FJ than in other components (except ICE-IND and ICE-NZ), the results for the singular perfects differ considerably. We will pursue the question whether the perfect form is more frequent in ICE-FJ and especially in ICE-IND in Section 4.1.

3.2  ICE-India

Table 7 lists the results from ICE-IND.

Table 7.  O/E ranking of TAM-relevant beheaded chunks in ICE-IND

#   mcorp   Beheaded chunk: word_tag sequence  O        E        Freq.  Potential TAM cat.
2   47.75   is_vbz                             280.32   232.57   629    passive or progressive?
3   43.52   has_vbz                            166.23   122.71   373    perfect
4   20.94   has_vbz been_vbn                   65.96    45.02    148    perfect passive or progressive?
5   16.12   are_vbp                            168.01   151.90   377    passive or progressive?
6   13.05   have_vbp                           107.40   94.35    241    perfect
8   9.17    have_vbp been_vbn                  42.34    33.17    95     perfect passive or progressive?
11  5.51    can_md be_vb                       51.70    46.18    116    modality
12  5.09    can_md not_rb                      21.84    16.75    49     modality
13  5.04    having_vbg                         11.59    6.55     26     non-finite past progressive
14  4.86    is_vbz to_to be_vb                 6.68     1.82     15     modality
16  4.66    may_md not_rb                      9.80     5.14     22     modality
17  4.63    can_md not_rb be_vb                12.92    8.29     29     modality
19  4.23    have_vbp to_to                     14.26    10.03    32     modality
20  4.11    should_md                          49.47    45.35    111    modality
22  3.68    has_vbz to_to                      11.14    7.46     25     modality
23  3.31    can_md                             84.23    80.92    189    modality
24  3.05    Would_md not_rb                    11.59    8.54     26     modality
26  2.88    should_md not_rb                   8.02     5.14     18     modality
29  2.46    will_md have_vb to_to              6.68     4.23     15     modality
39  2.10    may_md                             53.92    51.82    121    modality
43  2.00    may_md be_vb                       19.16    17.16    43     modality

ICE-IND and ICE-FJ show many similarities. Ranks 2 (singular) and 5 (plural) are the potential passive and progressive VPs (see close-up on Tables 4 and 5). Ranks 3 (singular perfect), 4 (progressive perfect and singular passive), 6 (plural perfect) and 8 (plural progressive perfect and passive) are variants of rank 9 in ICE-FJ, which we mentioned in connection with Table 6. Many of the high ranks are related to modality (11, 12, 14, 16, 19, 20, 22, etc.). Rank 13 is a non-finite verb phrase, which is overused in IndE. It is an aspectually marked subordinator, as example (2) shows:

(2)  Having lived it Gandhiji knew for certain that unless and until the village economy of India was boosted, Swayaraja would be meaningless. (ICE-IND W1A-010)

Although ICE-FJ and ICE-IND can be expected to pattern similarly because of the sizeable number of speakers whose first language is a variety of Hindi, they also show differences. For example, obligation in ICE-IND seems to follow British patterns (have to, be to), where FijE has a strong tendency towards need to, like ICE-JAM (see Nelson 2003; Mair 2009a, 2009b).

3.3  ICE-New Zealand

Our measure delivers relatively low values for ICE-NZ, as we can see in Table 8. For example, already at rank 8 mcorp is below 5, which only happens at rank 14 in ICE-IND. Thus, as O/E is a measure of surprise, there are few surprising features. Also, the use of modal verbs (be to, have to, may, must) is similar to that in ICE-GB (see Collins 2009; Collins & Yao this volume).




Table 8.  O/E ranking of TAM-relevant beheaded chunks in ICE-NZ

#   mcorp   Beheaded chunk: word_tag sequence  O        E        Freq.  Potential TAM cat.
1   67.22   was_vbd                            253.11   185.89   690    past passive or progressive?
3   30.59   had_vbd                            114.08   83.49    311    past perfect
4   29.31   were_vbd                           150.03   120.72   409    past passive or progressive?
5   9.67    had_vbd been_vbn                   30.81    21.14    84     past perfect
8   4.98    do_vbp n`t_rb                      13.94    8.95     38     contraction
9   4.90    could_md be_vb                     23.48    18.57    64     modality
10  4.06    ca_md n`t_rb                       7.70     3.65     21     contraction
11  3.38    were_vbd not_rb                    8.44     5.06     23     past passive or progressive?
12  2.99    was_vbd to_to                      9.54     6.55     26     modality
14  4.06    ca_md n`t_rb                       7.70     3.65     21     contraction
15  4.06    ca_md n`t_rb                       7.70     3.65     21     contraction
16  2.69    may_md have_vb                     7.34     4.64     20     modality
17  2.68    would_md                           104.91   102.23   286    modality
19  2.31    would_md be_vb                     22.38    20.06    61     modality
22  2.09    `re_vbp                            3.67     1.58     10     contraction
23  2.07    had_vbd to_to                      12.11    10.03    33     modality
24  2.05    was_vbd to_to be_vb                4.04     1.99     11     modality
26  1.94    did_vbd n`t_rb                     5.50     3.57     15     contraction

A large fraction of the top of the list contains contractions (ranks 8, 10, 14, 15, 22, 26), as ICE-NZ is the only corpus of the ICE components in our investigation in which contractions are frequent. In future applications of our approach to TAM profiling it would be useful to group contracted and uncontracted forms together. Because contractions are frequent only in our ICE-NZ data, they are unlikely to have had a skewing effect on the overall outcome of our study. Moreover, tokenization ensures that negative contractions are annotated as separate words, i.e. they do not amplify the slight skewing effect. For a more detailed analysis of some TAM categories, we include both contracted and uncontracted forms in the regular expressions used to retrieve the relevant data (see Section 4.3). Ranks 1 and 4 are largely the past progressive and passive (see Table 5). Ranks 3 and 5 are the relatively complex past perfect constructions; the past perfect is attested less frequently in all OC varieties in our data. Table 9 gives the values for the active and the passive variants.


Table 9.  had or had been followed by past participle

Search expression        India    GB       NZ       Fiji     Ghana
counts
had_VBD *_VBN            139      253      310      148      148
had_VBD been_VBN *_VBN   30       66       78       22       29
per verb chunk
had_VBD *_VBN            0.62%    1.06%    1.14%    0.66%    0.60%
had_VBD been_VBN *_VBN   0.13%    0.28%    0.29%    0.10%    0.12%

The mcorp measures for the past perfect are compared in Table 10.

Table 10.  Values of mcorp for had or had been followed by past participle

Search expression        India    GB       NZ       Fiji     Ghana
had_VBD *_VBN            –20.6    +22.5    +30.6    –16.1    –22.3
had_VBD been_VBN *_VBN   –6.4     +10.6    +9.6     –8.1     –7.7

These measures suggest that the past perfect is used less frequently in OC varieties, for example ICE-IND. This is in contrast to Sedlatschek (2009) who claims that the past perfect is used comparatively frequently in IndE. How can this contrast be explained? The phenomenon is unequally distributed in Sedlatschek’s data, and not all differences are significant; moreover, those that prove significant do so only at a relatively low level (Sedlatschek 2009: 260).16 In our data, the dispersion in the use of the past perfect is also very high: about half of the samples do not contain any past perfect active form, whereas others have a high text frequency of past perfect VPs. In ICE-GB, for example, the mean is 2.6 and the standard deviation is 3.9 (which is bigger than the mean);17 10 of the 97 samples contribute over half of the forms. This means that the chi-square test of significance is not reliable because the independence assumption is seriously violated.18 Figure 1 thus

.  Note that Sedlatschek’s corpora each contain 40 articles, each containing 2,000 words. The compared corpora are from the press genre. .  Most grammatical phenomena are not completely homogenously distributed, but here the dispersion is extreme. The central modals, for comparison, are distributed as follows in our ICE-GB subset: all 97 documents contain forms, the lowest count per article is 2 and the highest 54. The average is 36, the standard deviation 12 (which is less than half of the average). .  The chi-square test does not assume normal distribution, but independence of the data: “The most common mistake of the chi-square statistics and yet the most critical for its correct application is the violation of independence between measures or events” (Sharma 2005: 152).




shows a distribution in which the forms cluster in certain documents instead of the assumed normal distribution required for the chi-square test. Totally different results are obtained if only a few documents are replaced by other texts: the differences between ICE-GB and ICE-IND in Table 9 are highly significant (chi-square p < 0.1%), but if only the two documents containing most past perfects are replaced by two documents without any past perfect forms (there are already 27 such documents, see Figure 1) the difference is not significant (p > 5%). In other words, we have to be very cautious in interpreting statistically significant differences as differences between varieties, especially when the comparison is based on a small corpus.

[Figure: histogram titled “Past perfect forms: frequency per document”; x-axis: # of past perfect forms (0 to 17); y-axis: # of documents (0 to 30)]

Figure 1.  Histogram of past perfect distributions in ICE-GB
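To make the dispersion argument concrete, the following Python sketch summarizes per-document counts (mean, standard deviation, share contributed by the most productive documents) and computes a Pearson chi-square test for a 2×2 table, with expected cell frequencies derived from the marginals. The function names and the example counts are invented for illustration and are not the ICE figures; the p-value formula used here holds for one degree of freedom only.

```python
import math
from statistics import mean, pstdev
from typing import List, Sequence, Tuple


def dispersion_summary(per_doc: Sequence[int], top_k: int) -> Tuple[float, float, float]:
    """Mean, population SD and share of all hits found in the top_k documents."""
    ranked = sorted(per_doc, reverse=True)
    share = sum(ranked[:top_k]) / sum(ranked)
    return mean(per_doc), pstdev(per_doc), share


def chi_square_2x2(table: List[List[int]]) -> Tuple[float, float]:
    """Pearson chi-square statistic and p-value (1 degree of freedom)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected cell frequency from the marginals.
            expected = row_totals[i] * col_totals[j] / grand
            chi2 += (observed - expected) ** 2 / expected
    p_value = math.erfc(math.sqrt(chi2 / 2))  # survival function for df = 1
    return chi2, p_value


# Invented example: hits vs. non-hits in two corpora of 100 chunks each.
chi2, p = chi_square_2x2([[10, 90], [20, 80]])
# Invented per-document counts with two dominant documents.
avg, sd, top_share = dispersion_summary([0, 0, 0, 1, 1, 2, 8, 12], top_k=2)
```

In the toy document list, two of eight documents hold most of the hits and the standard deviation exceeds the mean; a token-based test on such data can flip when those documents are exchanged, which is the risk discussed in this section.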

The non-homogenous distribution, on the one hand, calls the results of significance tests into question. On the other hand, it raises the suspicion that constructions like modals or the past perfect depend on the style, content and narrator’s perspective, which were not aimed at being homogenously represented across the ICE corpora, and the number of available texts is likely to be too small to lead to a natural randomization. Finally, when comparing the press genre only, our dataset actually shows more past perfect active VPs in ICE-IND (93) than in ICE-GB (82), although fewer than in ICE-NZ (131). On the one hand, this illustrates the point just made: due to the high dispersion it is quite easily possible to obtain datasets that suggest opposite

A typical independence violation is a hidden factor in the data. In this case, the document is a factor that seriously violates the independence assumption, as the past perfect tokens are not spread homogenously across the documents.


results. On the other hand, a qualitative analysis (which is beyond the scope of this paper) could be used to investigate the underlying differences, for example that IndE sometimes uses the past perfect to denote remote pastness (Comrie 1985; Sharma 2001). Our corpus-driven method has not confirmed previous findings here, but stressed a methodological danger that is typical in corpus linguistics: in essence, only a document is an independent token, not the words that are contained in that document. For grammatical phenomena that are largely dependent on semantic factors that vary considerably between documents (e.g. according to the narrator’s point of view in the use of past perfect constructions), this poses a challenge to significance testing.

3.4  ICE-Ghana

Table 11 lists the results from ICE-GHA.

Table 11.  O/E ranking of TAM-relevant beheaded chunks in ICE-GHA

#   mcorp   Beheaded chunk: word_tag sequence  O        E        Freq.  Potential TAM cat.
2   34.76   is_vbz                             267.33   232.57   660    passive or progressive?
5   11.41   must_md                            40.10    28.69    99     modality
6   8.90    are_vbp                            160.80   151.90   397    passive or progressive?
7   6.88    can_md be_vb                       53.06    46.18    131    modality
9   5.98    are_vbp not_rb                     17.01    11.03    42     passive or progressive?
10  5.93    can_md not_rb                      22.68    16.75    56     modality
11  5.80    shall_md                           10.53    4.73     26     shall-future
12  5.66    could_md                           44.96    39.30    111    modality
13  5.15    could_md not_rb                    13.77    8.62     34     modality
16  3.00    will_md                            130.02   127.02   321    future
18  2.69    need_vbp to_to                     8.91     6.22     22     modality
20  2.29    is_vbz not_rb                      11.75    9.45     29     passive or progressive?
21  2.24    can_md not_rb be_vb                10.53    8.29     26     modality
25  2.11    should_md not_rb be_vb             5.27     3.15     13     modality
27  2.05    are_vbp to_to                      4.46     2.40     11     modality
28  2.05    `ll_md                             4.46     2.40     11     future

Ranks 2, 6 and 9 could indicate that progressives are frequent, but, as we discussed in connection with Tables 2 and 3, this is not the case: it rather seems that passives are slightly more frequent in ICE-GHA. Ranks 5, 7, 10, 12, 13, 18, etc.




are related to modality. The frequent use of shall is perhaps the most remarkable result, although many of the instances come from religious contexts discussing the Bible and are thus likely to be a result of the sampling for this corpus.

3.5  ICE-Great Britain

Our comparative frequency measure delivers low values for ICE-GB, as we can see in Table 12. In other words, there are few surprising features.

Table 12.  O/E ranking of TAM-relevant beheaded chunks in ICE-GB

#   mcorp   Beheaded chunk: word_tag sequence  O        E        Freq.  Potential TAM cat.
1   25.41   may_md                             77.23    51.82    185    modality
2   22.54   had_vbd                            106.04   83.49    254    past perfect
4   12.15   would_md                           114.39   102.23   274    modality
5   11.86   might_md                           23.80    11.94    57     modality
6   10.58   had_vbd been_vbn                   31.73    21.14    76     past perfect
7   10.32   will_md                            137.35   127.02   329    future
8   8.29    could_md                           47.59    39.30    114    modality
9   7.47    may_md be_vb                       24.63    17.16    59     modality
10  4.56    will_md be_vb                      35.07    30.51    84     future
12  3.28    were_vbd to_to                     6.26     2.98     15     modality
13  3.28    might_md be_vb                     6.26     2.98     15     modality
16  2.50    would_md have_vb                   13.78    11.28    33     modality
18  2.35    would_md have_vb been_vbn          4.59     2.24     11     modality
19  2.35    are_vbp to_to be_vb                4.59     2.24     11     modality

Ranks 2 and 6 point to the high observed frequency of the past perfect, which we discussed in connection with Tables 9 and 10 (and the problem of dispersion). Most results, namely ranks 1, 4, 5, 8, 9, 13, 16, 18 and 19, indicate again that the modal system of the ICE subsets included in our data shows strong variation.

4.  Analysis of selected features

Based on the rankings presented in the previous section, we singled out a number of interesting TAM features, particularly tense, modals and progressive aspect. Concerning modals, we can largely confirm the findings presented in previous


studies, for example Collins (2007: 490) on may; Biewer (2009) on should, need to, have to; and new uses of will and would (Balasubramanian 2009: 104; Deuber et al. this volume). We discuss modals in Section 4.2. Concerning tense and the progressive aspect, however, our findings do not always fully match previously reported results. The distribution of past perfects observed in Section 3.3 also merits further analysis. We discuss tense in Section 4.1 and progressive aspect in Section 4.3.

4.1  Tense

We give a quantitative investigation of tense in Section 4.1.1 followed by a qualitative analysis in Section 4.1.2.

4.1.1  Lexical heads and tense

The chopped-off heads can also be profitably investigated, e.g. to compare those tense features which are encoded in the part-of-speech tag of the chunk head. Classifying the heads by tags leads to the distribution shown in Table 13. The highest value in each column is in bold print, the lowest in bold italics.

Table 13.  Distribution of verb chunk heads by Penn Treebank tag

Corpus\TAGS   VBN      VBD      VB       VBG      VBZ      VBP
Fiji          22.76%   17.00%   23.93%   12.20%   15.13%   8.95%
India         24.28%   14.63%   21.56%   12.22%   19.25%   8.04%
GB            22.38%   16.64%   23.15%   11.78%   17.56%   8.46%
NZ            24.39%   19.36%   20.97%   11.98%   15.04%   8.24%
Ghana         20.66%   14.49%   25.63%   11.22%   17.96%   10.00%
Average       22.89%   16.42%   23.05%   11.88%   16.99%   8.74%

Perhaps the strongest difference between the varieties is that ICE-NZ has very many simple past tense tags (VBD), many past participles (VBN) and few present tense tags (VBZ, VB;19 VBP to a lesser degree). This indicates that the simple past as a whole is frequent in ICE-NZ, whereas the simple present is used less frequently than in the other corpora; this supports earlier findings (see Table 5) on the differences between present and past passive. Other differences are smaller,

.  Note that 3rd person singular forms are relatively frequent in ICE-IND (e.g. the tag VBZ marking 3rd person singular present). Plural forms, on the other hand, are more frequent in ICE-FJ (e.g. VBP marking plural present).




but some indications of possible variation emerge. VBG is rare in ICE-GHA, but quite frequent in ICE-IND and ICE-FJ. This difference might indicate a slight preference for the progressive in ICE-IND and in ICE-FJ (see also Hundt & Vogel 2011). However, present participles also function as a subordinating device. Other instances are in fact deverbal nouns, as the discussion in Section 4.3 shows. VBN is marginally more frequent in ICE-IND and in ICE-NZ. For ICE-IND this is in line with the high frequency of are plus past participle (Table 2) and the high frequency of the perfect that we suggested in Section 3. The high frequency of VBD in ICE-NZ could indicate high frequency of the simple past (and inversely for ICE-GHA and ICE-IND) but qualitative analyses are needed to verify this.

In order to get a clearer picture, we also investigated “simplex” verbs separately, i.e. verb groups that only consist of one word and thus lead to an empty beheaded chunk. The tag distributions are given in Table 14.

Table 14.  Distribution of simplex verb group heads by Penn Treebank tag

Corpus\TAGS   VBN      VBD      VB       VBG      VBZ      VBP
Fiji          8.78%    29.35%   5.01%    15.25%   26.17%   15.43%
India         9.58%    24.17%   4.66%    15.83%   32.25%   13.50%
GB            8.69%    28.27%   3.91%    14.87%   29.95%   14.32%
NZ            9.54%    32.37%   3.81%    15.29%   25.19%   13.79%
Ghana         7.57%    24.38%   6.69%    14.19%   30.29%   16.87%
Average       8.83%    27.71%   4.82%    15.09%   28.77%   14.78%
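How a tag distribution of this kind can be derived from chunked data may be sketched as follows. This is a minimal illustration, not the pipeline used for Table 14: the input format (each verb group as a list of word/tag pairs), the function name simplex_head_distribution and the toy data are all assumptions made for the example.

```python
from collections import Counter

def simplex_head_distribution(verb_groups):
    """Percentage share of each head tag among one-word (simplex) verb groups.

    Assumed input: every verb group is a list of (word, tag) pairs.
    A simplex group has exactly one token, so removing its lexical
    head leaves an empty beheaded chunk.
    """
    heads = Counter(
        group[0][1]              # Penn Treebank tag of the only token
        for group in verb_groups
        if len(group) == 1       # keep simplex groups only
    )
    total = sum(heads.values())
    if total == 0:
        return {}
    return {tag: 100 * n / total for tag, n in heads.items()}

# Toy data, invented purely for illustration.
groups = [
    [("went", "VBD")],
    [("goes", "VBZ")],
    [("has", "VBZ"), ("gone", "VBN")],   # two tokens: not simplex, ignored
    [("saw", "VBD")],
    [("running", "VBG")],
]
shares = simplex_head_distribution(groups)
for tag, pct in sorted(shares.items()):
    print(f"{tag}: {pct:.2f}%")
```

Because only one-word groups are counted, each tag maps onto a single form rather than onto a position inside a larger construction, which is why the simplex distribution is less ambiguous than the distribution over all verb group heads.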

The tendencies are similar to those in Table 13, but slightly stronger, and there is less ambiguity.²⁰ For example, a simplex verb with VBD tag almost always maps onto the linguistic feature of simple past (modulo tagging errors). High frequency of simple past in ICE-NZ opposed to low frequency of simple past in ICE-IND thus receives additional supportive evidence from this analysis. In all ICE components investigated here, VBD is either the most frequent simplex verb tag or almost as frequent as VBZ; but in ICE-IND and ICE-GHA there is a large difference: VBD is considerably rarer than in the other corpora. There may thus either be more present point of view texts in ICE-IND and ICE-GHA or the differences may be due to lack of tense marking on the verb.

20.  Columns 1 (VBN) and 4 (VBG) express non-finite participial uses and are thus not relevant to TAM. Here Table 13 gives a better approximation to tense and aspect forms. We discuss progressives in Section 4.3.


We reported in Table 6 that the frequency of some perfect constructions (has|have *_VBN) is high in ICE-IND. As simple past and perfect are often in the envelope of variation, this can also account for the low simple past frequency in ICE-IND. Zero-marked past VPs could be another explanation for the low occurrence of simple past forms. In order to substantiate the claim that the present perfect may be more frequent in ICE-FJ (Hundt & Biewer 2007), we counted all surface forms by regular expressions.²¹ The results are given in Table 15.

Table 15.  Perfect form counts in comparison

Perfect forms                      India    GB       NZ       Fiji     Ghana
counts           active            788      656      731      618      607
                 progressive       30       20       38       38       20
                 passive           294      230      229      208      194
                 TOTAL             1112     906      977      864      821
per verb chunk   active            3.5%     2.7%     2.7%     2.8%     2.5%
                 progressive       0.1%     0.1%     0.1%     0.2%     0.1%
                 passive           1.3%     1.0%     0.8%     0.9%     0.8%
                 TOTAL             5.0%     3.8%     3.6%     3.9%     3.3%
perfect/past     per simple past   34.9%    23.1%    18.8%    23.1%    23.4%

Although these figures offer a strong indication that the perfect is particularly frequent in ICE-IND and ICE-NZ (the difference is highly significant), the following three observations complicate the facts.

i. Again, dispersion is relatively high, although not as extreme as shown for the past perfect in the discussion of Table 9. The average of active forms per sample is 6.3; the standard deviation is 5.7. The sample with most active perfect forms contains 31 forms, while there are 9 samples that do not contain any active perfect. Significance tests on the observed differences can thus only offer relative reliability.

21.  The regular expressions are “(has|have|`s|`ve)_VB. ([^_]+_RB)*[^_]+_VBN” excluding “been_VBN” for the active, “(has|have|`s|`ve)_VB. ([^_]+_RB)*been_VBN ([^_]+_RB)*[^_]+_VBG” for the progressive form, and “(has|have|`s|`ve)_VB. ([^_]+_RB)*been_VBN ([^_]+_RB )*[^_]+_VBN” for the passive form. Tokenization ensures that negative contractions are included.
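A search of this kind can be sketched in a few lines of Python. The patterns below are an adaptation, not a verbatim copy, of the expressions quoted in the footnote: they assume that the tagged text is a single string of space-separated word_TAG tokens, that apostrophes appear as plain ' rather than `, and that adverbs and negation carry the tag RB. The names AUX, ADV, END, PERFECT_PATTERNS and count_perfect_forms are hypothetical.

```python
import re

# Assumed input format: space-separated word_TAG tokens (Penn Treebank tags).
AUX = r"(?<!\S)(?:has|have|'s|'ve)_VB[A-Z]"   # e.g. has_VBZ, have_VBP, 've_VBP
ADV = r"(?: [^_\s]+_RB)*"                      # optional adverbs or negation
END = r"(?!\S)"                                # the tag must close the token

PERFECT_PATTERNS = {
    # auxiliary + participle, where the participle is not "been"
    "active": re.compile(AUX + ADV + r" (?!been_)[^_\s]+_VBN" + END),
    # auxiliary + been + ing-form
    "progressive": re.compile(AUX + ADV + r" been_VBN" + ADV + r" [^_\s]+_VBG" + END),
    # auxiliary + been + participle
    "passive": re.compile(AUX + ADV + r" been_VBN" + ADV + r" [^_\s]+_VBN" + END),
}

def count_perfect_forms(tagged_text):
    """Count surface matches of each perfect pattern in a tagged string."""
    return {name: len(pattern.findall(tagged_text))
            for name, pattern in PERFECT_PATTERNS.items()}

# Invented sample, for illustration only.
sample = (
    "she_PRP has_VBZ already_RB taken_VBN it_PRP ._. "
    "they_PRP have_VBP been_VBN working_VBG ._. "
    "it_PRP has_VBZ not_RB been_VBN seen_VBN ._."
)
print(count_perfect_forms(sample))
```

As with any surface search, forms that the tagger mislabels or that lack an overt auxiliary are not found, so counts obtained in this way are lower bounds for non-standard data.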




ii. ICE-IND and ICE-GHA are known to contain zero perfect marking. We do not detect these perfect forms with our method.

iii. The fact that the high frequency of the simple past (VBD) in ICE-NZ does not lead to a lower frequency of the perfect in ICE-NZ merits discussion. Table 15 shows that ICE-NZ has a medium proportion of potential perfect VPs, while VBZ (third person present) is least frequent, and VBP (other present tense forms) is also less frequent. The most likely explanation for this variation in the use of tense is that the past tense, i.e. the past point of view, is on the whole overrepresented in the analysed subsample from ICE-NZ. This could again be due to the document-dependence of the writer’s point of view, a semantic factor which affects tense tokens.

To sum up, our hypothesis is that the profiles indicate an increased use of present and perfect constructions in ICE-IND at the expense of simple past tense VPs. In ICE-NZ, however, an increased use of simple past tense VPs does not appear to be directly linked to a less frequent use of perfect constructions (both present and past perfects); instead, past tense VPs occur more frequently than present tense VPs. We investigate perfect constructions in ICE-FJ and ICE-IND in more detail in the following section.

4.1.2  Qualitative analysis: A case study on perfect constructions

To complement our quantitative investigation we now present a brief qualitative analysis of perfect constructions. The perfect has not yet been widely researched in New Englishes. Van Rooy (2009) reports a higher frequency of the perfect in East African English (EAfE) and Hong Kong English (HKE): he (2009: 316) finds a lower frequency of the present perfect but a higher frequency of the past perfect in ICE-EA; and a higher frequency of the present perfect and a similar frequency of past perfects in ICE-HK (in comparison with ICE-GB). Hundt and Biewer (2007: 256f.) report low overall frequencies of the present perfect in FijE, Philippine (PhilE) and Singapore English (SingE) in comparison to past tense forms. Hundt and Biewer (2007) found that the results on the use of the present perfect as an alternative to the simple past depended to a large extent on the definition of the variable, i.e. whether the overall perfective-friendliness of the varieties was studied (on the basis of the Mossé-coefficient) or whether the co-occurrence with temporal adverbials such as yet, already and just was investigated; Biewer (2008) discusses co-variance with certain verbs. Furthermore, van Rooy (2009) included have got while Hundt and Biewer (2007) did not. In our automatically annotated data, finally, non-standard forms are typically tagged incorrectly and thus not reported. For example, van Rooy (2009: 319) finds examples with uninflected participles like the following:

(3) Yeah they they had take some photographs uh and they showed to me and I I think uh it’s (ICE-HK S1A-042)

Despite these methodological caveats, we find considerable quantitative and qualitative differences. We first discuss the past perfect, then the present perfect.

4.1.2.1  Past perfect.  Sharma (2001: 355) found an overall higher number of past perfect constructions in the press sections of the Kolhapur Corpus than in those of the BROWN and LOB corpora. She also found that the functions of the past perfect in IndE were markedly different from those in the American and British data: the past perfect was used with preterite and present perfect meanings and the pragmatic function of a general marking of remoteness and completion in narrative contexts. The same function, i.e. general marking of remoteness, is also attested in ICE-IND in non-narrative contexts in sentences (4) and (5):

(4) Against sixty one districts which had experienced communal trouble in 1661 there are now eighty one districts identified as hypersensitive […]. (ICE-IND W1A-005)



(5) Previously, we had reported glottographic data only on bilabial stops and glottal fricative of Hindi (Dixit & MacNeilage 1974; Dixit 1989). (ICE-IND W2A-004)

The past perfect is sometimes used in contexts in which current relevance can still be assumed, i.e. contexts in which BrE would prefer a present perfect, like (6) and (7):

(6) We are still in the process of dismantling what we had inherited. (ICE-IND W1A-005)

(7) The government’s problems are due to the fact that it had taken a tough stand until now on the subject of product patenting. (ICE-IND W2E-008)

There are also instances in which the sequence of present and past perfect constructions is different from BrE, as in (8):

(8) Many of the people have come from distant country to India and have settled here and had got in to the culture of India. (ICE-IND W1A-011)




4.1.2.2  Present perfect.  In IC varieties, the present perfect has been decreasing (see Elsness 2009);²² Bergs and Pfaff (2009) attribute this loss partly to the past progressive. In ICE-IND, however, the present perfect is used in contexts in which either the simple past or even a past perfect would be preferred or required in BrE, as in sentences (9) to (11):

(9) It has received the attention of people like Gandhiji, Tagore and others much before independence. (ICE-IND W1A-010)

(10) But my loyalties lie with Bengali Sweet House whose chaat has hastened my growth into manhood. (ICE-IND W2B-017)

(11) There are at least two reasons for this (I) during 1981–91 there has been rapid urbanisation and this tends to increase the coverage error of the census (ICE-IND W2B-013)

A surprising combination of present perfect with the progressive can be found in example (12):

(12) Thus the concept of rural developoment [sic!] has been changing from time to time and according to the need of people, the Government has launched various rural development programmes. (ICE-IND W1A-010)

In (12), the present perfect progressive refers to a series of distinct instances in the past, a context in which BrE would use the simple past tense. We discuss progressive and simple aspect in Section 4.3.

4.2  Modality

We discussed the increased use of will and would in ICE-FJ in Section 3.1. We now turn to a more general overview of modals. The modals of the ICE components of this investigation are juxtaposed in Table 16 by counts, and in Table 17 by percentage. Table 17 is a first step in the direction of observing variable contexts rather than absolute counts. It is only a partial step as the semantics only partly overlap; speakers typically do not have the full choice between all of them. A manual separation into the different modalities is also necessary. As many other factors, for example idiomaticity, restrain the envelope of variation, a categorization into choice context cases and others will allow one to obtain more reliable data. Even then, text type factors will still skew the picture.

22.  Note that Hundt and Smith (2009) found stable regional variation rather than ongoing change in the BROWN family of corpora.


Table 16.  Modal verb counts

Modal verb #           India    GB       NZ       Fiji     Ghana
will                   441      583      556      560      539
would                  412      489      536      422      429
can                    488      447      479      439      526
could                  185      234      285      219      255
may                    241      349      269      140      208
should                 242      153      198      360      249
must                   136      125      155      107      176
might                  37       113      72       28       51
shall                  22       17       12       6        41
be_to                  50       83       78       45       48
have_to                162      151      137      119      116
need_to                20       40       46       91       54
ought_to               1        6        4        4        10
have_got_to            0        3        2        1        0
TOTAL                  2437     2793     2829     2541     2702
TOTAL per verb chunk   10.86%   11.66%   10.38%   11.41%   10.94%

Table 17.  Modal verb percentages

Modal verb     India     GB        NZ        Fiji      Ghana
will           18.10%    20.87%    19.65%    22.04%    19.95%
would          16.91%    17.51%    18.95%    16.61%    15.88%
can            20.02%    16.00%    16.93%    17.28%    19.47%
could          7.59%     8.38%     10.07%    8.62%     9.44%
may            9.89%     12.50%    9.51%     5.51%     7.70%
should         9.93%     5.48%     7.00%     14.17%    9.22%
must           5.58%     4.48%     5.48%     4.21%     6.51%
might          1.52%     4.05%     2.55%     1.10%     1.89%
shall          0.90%     0.61%     0.42%     0.24%     1.52%
be to          2.05%     2.97%     2.76%     1.77%     1.78%
have to        6.65%     5.41%     4.84%     4.68%     4.29%
need to        0.82%     1.43%     1.63%     3.58%     2.00%
ought to       0.04%     0.21%     0.14%     0.16%     0.37%
have got to    0.00%     0.11%     0.07%     0.04%     0.00%
TOTAL          100.00%   100.00%   100.00%   100.00%   100.00%
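The step from raw counts to shares can be made explicit with a short calculation. The sketch below takes the counts reported in Table 16 and divides each by the column total of its corpus; the names CORPORA, MODAL_COUNTS and modal_shares are introduced here for illustration, and the result should agree with Table 17 up to rounding.

```python
# Counts as reported in Table 16 (column order: India, GB, NZ, Fiji, Ghana).
CORPORA = ["India", "GB", "NZ", "Fiji", "Ghana"]
MODAL_COUNTS = {
    "will":        [441, 583, 556, 560, 539],
    "would":       [412, 489, 536, 422, 429],
    "can":         [488, 447, 479, 439, 526],
    "could":       [185, 234, 285, 219, 255],
    "may":         [241, 349, 269, 140, 208],
    "should":      [242, 153, 198, 360, 249],
    "must":        [136, 125, 155, 107, 176],
    "might":       [37, 113, 72, 28, 51],
    "shall":       [22, 17, 12, 6, 41],
    "be_to":       [50, 83, 78, 45, 48],
    "have_to":     [162, 151, 137, 119, 116],
    "need_to":     [20, 40, 46, 91, 54],
    "ought_to":    [1, 6, 4, 4, 10],
    "have_got_to": [0, 3, 2, 1, 0],
}

def modal_shares(counts):
    """Express each modal as a percentage of all modals in the same corpus."""
    totals = [sum(column) for column in zip(*counts.values())]
    return {
        modal: [round(100 * n / total, 2) for n, total in zip(row, totals)]
        for modal, row in counts.items()
    }

shares = modal_shares(MODAL_COUNTS)
print(f"{'':<12}" + "".join(f"{name:>9}" for name in CORPORA))
for modal, row in shares.items():
    print(f"{modal:<12}" + "".join(f"{value:>8.2f}%" for value in row))
```

Normalizing by the modal total rather than by corpus size removes the effect of differing text lengths, but it treats all modals as alternatives to one another, which is only partly true of their semantics.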




Biewer (2009) finds that should, need to and have to, in particular, are relatively frequent in a Web-derived corpus of Fijian newspapers and the ICE-FJ press genre. Our data confirm the high frequencies of the first two compared to other varieties, whereas have to is not particularly frequent. Biewer (2009: 49) found that ought to, which is marginalized in IC varieties (Mair 2006: 111), is almost nonexistent in the South Pacific English varieties. Our prediction that will is frequent in ICE-FJ (Table 4) is not confirmed by Deuber et al. (this volume), which may well be due to the fact that they look at spoken, not written data (i.e. there is no overlap between our data sets). However, Deuber et al. (this volume) describe new uses of will and would, which we can confirm. They show that in FijE relatively many uses of would are habitual. A writing manual published by a language teacher at the University of the South Pacific lists will and would as the only modals that are difficult for the local students: “These two words cause a lot of problems. This section shows the correct usage” (Pene 2003: 35). The manual does not mention the specific problems that students have with these modals, for instance whether they overuse one modal where the other would be more appropriate. The following uses of would ((13) to (14)) are not habitual but seem to be examples of a textual function, i.e. a context in which speakers of English as a first language are likely to use will rather than would:

(13) Jane Eyre would be used to show how these oppositions operate originally after which, an analysis would be done to show how the Wide Sargasso Sea overturns that hierarchy and gives precedence to whatever was considered not important. (ICE-FJ W1A-007)

(14) This essay, that I would be writing down onto this piece of paper, would be divided into three parts. (ICE-FJ W1A-016)

Deuber et al. (this volume) also report that, in addition to many uses of would as habitual, will can also have habitual or hypothetical interpretation in FijE. Examples (15) to (16) in our data support this suggestion:

(15) Less intensive fishing will allow the fisheries resource to build up to the point where the harvest is balanced with the natural replenishment of the population (Chakalall). (ICE-FJ W1A-008)

(16) A Muslim will make this pilgrimage to the holy city of Mecca only if he is free of debt, has no dependent children, has enough money for his trip (i.e. he should not borrow), and is free of any other burden. (ICE-FJ W2B-018)


Balasubramanian (2009: 104) describes the same feature for IndE: out of the 2,057 instances of the will future in her spoken corpus of IndE, 253 were used in sentences where the verb described habitual actions.

We also notice that may and might are considerably rarer in the OC varieties. In fact, BrE shows the highest counts, a general tendency that Collins (2007: 490) also observes in comparison with other varieties: “The frequency of may, which is still the primary exponent of epistemic possibility, has declined more markedly in AmE and AusE than it has in BrE.”

Turning to ICE-GHA, we note that it is quite conservative in that it uses shall relatively frequently. Many occurrences, however, come from religious texts that discuss and quote from the Bible.

4.3  The progressive

We reported in Table 3 that the frequency of some potential progressive constructions (is|are *_VBG) is high in ICE-IND (see ranks 2 and 5) and ICE-FJ (see ranks 2 and 11), and in Table 14 that the frequency of the VBG tag is also high in these two ICE components. As a first approximation to the progressive in our corpus-driven approach, verb chunks ending with the VBG tag can be used. Only a fraction of the verb chunks ending in VBG are progressive forms, however. First, over two-thirds are simplex chunks, many of them ing-forms functioning as prepositions (according (to), including), subordinators of participial clauses, nominalizations, adjectives, etc. Second, as many VBG forms function as nouns, many non-simplex VBG chunks contain determiners (e.g. the meeting, the supreme being, another helping), i.e. only a fraction of the non-simplex VBG chunks are actually progressive forms. In order to see if the progressive is, indeed, more frequent in ICE-IND and ICE-FJ (Hundt & Vogel 2011), we counted all surface forms of progressive VPs by regular expressions.²³ The results are listed in Table 18. For comparison, the counts for simplex and non-simplex verb chunks ending in VBG are also listed. Absolute numbers are not very reliable indicators because the compilers of ICE-NZ chose to include longer texts in the corpus. Therefore,

23.  The regular expressions are “(be|am|is|are|`m|`s|`re)_VB. ([^_]+_RB)*[^_]+_VBG” excluding “going_VBG to_TO” for the present forms, “(has|have|`s|`ve)_VB. ([^_]+_RB)*been_VBN [^_]+_VBG” for the perfect forms, “(was|were)_VB. ([^_]+_RB)*[^_]+_VBG” excluding “being_VBG” and “going_VBG to_TO” for the past forms, and “_VB. being_VBG [^_]+_VBN” for the passive forms. The tag _RB again is a negation or intervening adverb or a combination thereof.




percentages per verb chunk are also given. ICE-IND has many VBG chunks, its percentage for simplex VBG chunks is highest, and its non-simplex percentage is lowest.

Table 18.  Progressive form frequencies

Progressive      Form                      India    GB       NZ       Fiji     Ghana
counts           present                   286      293      297      347      301
                 perfect                   27       19       36       37       20
                 past                      76       144      152      100      88
                 passive                   26       36       39       42       29
                 TOTAL PROGRESSIVE         415      492      524      526      438
                 non-simplex VBG chunks    655      761      815      776      726
                 simplex VBG chunks        2088     2061     2452     1941     2045
per verb chunk   present                   1.27%    1.22%    1.09%    1.56%    1.22%
                 perfect                   0.12%    0.08%    0.13%    0.17%    0.08%
                 past                      0.34%    0.60%    0.56%    0.45%    0.36%
                 passive                   0.12%    0.15%    0.14%    0.19%    0.12%
                 TOTAL PROGRESSIVE         1.85%    2.05%    1.92%    2.36%    1.77%
                 non-simplex VBG chunks    2.92%    3.18%    2.99%    3.49%    2.94%
                 simplex VBG chunks        9.31%    8.60%    8.99%    8.71%    8.28%

The percentage results suggest that the present progressive is slightly more frequent in ICE-FJ and in ICE-IND. If one counts all forms, the progressive is still more frequent in ICE-FJ than in ICE-GB (see also Hundt & Vogel’s 2011 study based on lexical string searches). In ICE-IND, however, the overall frequency of progressives is not higher than in ICE-GB. Especially past forms are considerably rarer. The situation is complicated by the following five facts. First, the dispersion is quite high, although a little lower than in the past perfect. For example, in ICE-IND the average present progressive form count per document is 3.04 and the standard deviation is 2.89 (in other words, the standard deviation is almost as big as the mean). The sample with most present progressive forms contains 12 forms, while there are 21 samples that do not contain any present progressive. It may thus not be possible to draw quantitative conclusions even if the differences


are significant. The graphical representation in Figure 2 shows a distribution that is partly Zipfian and partly normal.

[Figure: “Present progressive per document”; x-axis: # of present progressive forms (0 to 12); y-axis: # of documents (0 to 25)]

Figure 2.  Histogram of present progressive distribution in ICE-IND
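The dispersion measures discussed here (mean and standard deviation per document, the number of documents without any hit, and the histogram itself) can be obtained from a list of per-document counts, as in the following sketch. The function name dispersion_profile and the toy counts are assumptions; the per-document counts for ICE-IND are not reproduced in the text, so invented values are used.

```python
from collections import Counter
from statistics import mean, stdev

def dispersion_profile(counts_per_document):
    """Summarize how evenly a form is spread across the documents of a corpus."""
    counts = list(counts_per_document)
    average = mean(counts)
    spread = stdev(counts)                      # sample standard deviation
    return {
        "mean": average,
        "sd": spread,
        # coefficient of variation: sd relative to the mean
        "cv": spread / average if average else float("nan"),
        "zero_documents": sum(1 for c in counts if c == 0),
        "max": max(counts),
        # number of documents for each observed count, as in a histogram
        "histogram": dict(sorted(Counter(counts).items())),
    }

# Invented per-document counts, for illustration only.
toy_counts = [0, 0, 1, 2, 2, 3, 4, 12]
profile = dispersion_profile(toy_counts)
for key, value in profile.items():
    print(key, value)
```

A coefficient of variation close to or above 1, together with a sizeable number of documents without any occurrence, signals that a few documents contribute a large share of the tokens, so that tests assuming independent observations should be read with caution.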

Second, not all OC varieties tend to overuse the progressive (see Hundt & Vogel 2011 for ICE-SIN vs. ICE-FJ). Furthermore, there is substantial genre variation. In the student essay genre, we get the counts shown in Table 19.

Table 19.  Progressive forms in student writing

Progressive      Form                 India    GB       Fiji
counts           present              71       45       115
                 perfect              9        1        7
                 simple past          9        27       6
                 passive              2        6        20
                 TOTAL PROGRESSIVE    91       79       148
per verb chunk   TOTAL PROGRESSIVE    1.86%    1.57%    2.94%
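Whether the totals in Table 19 differ reliably between two corpora can be checked with a chi-square contingency test on a 2×2 table of progressive versus other verb chunks. The sketch below is an approximation under stated assumptions: the number of verb chunks per corpus is not given in the text and is back-calculated here from the rounded percentages, and the test is computed without continuity correction, since the text does not say whether one was applied. The p-value for ICE-IND vs. ICE-GB should therefore fall in the neighbourhood of a non-significant result rather than reproduce a published figure exactly. The function name chi_square_2x2 is hypothetical.

```python
import math

def chi_square_2x2(hits_a, total_a, hits_b, total_b):
    """Pearson chi-square test (no continuity correction) for two proportions."""
    a, b = hits_a, total_a - hits_a
    c, d = hits_b, total_b - hits_b
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With one degree of freedom the upper tail equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Progressive totals and shares per verb chunk as listed in Table 19.
progressives = {"India": 91, "GB": 79, "Fiji": 148}
share = {"India": 0.0186, "GB": 0.0157, "Fiji": 0.0294}

# Assumption: verb chunk totals reconstructed from the rounded percentages.
chunks = {name: round(progressives[name] / share[name]) for name in progressives}

chi2_ind, p_ind = chi_square_2x2(progressives["India"], chunks["India"],
                                 progressives["GB"], chunks["GB"])
chi2_fj, p_fj = chi_square_2x2(progressives["Fiji"], chunks["Fiji"],
                               progressives["GB"], chunks["GB"])
print(f"ICE-IND vs ICE-GB: chi2 = {chi2_ind:.2f}, p = {p_ind:.3f}")
print(f"ICE-FJ  vs ICE-GB: chi2 = {chi2_fj:.2f}, p = {p_fj:.2g}")
```

The test treats every verb chunk as an independent observation. Given the document-dependence of the tokens, a significant result of this kind is best read together with the dispersion across documents.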

Compared to ICE-GB, the progressive is more frequent in both ICE-IND and ICE-FJ student essays, but the difference between the totals for ICE-IND and ICE-GB is not significant (chi-square contingency test, p = 0.267), while the difference between ICE-FJ and ICE-GB is highly significant.

Third, IndE yields progressives without an auxiliary, i.e. instances that were missed in our search strings, like example (17).

(17) Thus revolution going on. (ICE-IND W1A-004)




This entails the possibility that some simplex VBG chunks in our OC data are progressive constructions. In other words, the fact that ICE-IND has the highest simplex VBG chunk count could be indicative of progressive forms being used without an auxiliary. However, zero auxiliary constructions are not very frequent: our manual sifting of 200 simplex VBG chunks delivered only two instances. A more comprehensive qualitative investigation would show how (in)frequent they really are.

Fourth, a number of examples seem to suggest that nominalizations, which form the biggest class of simplex VBG chunks, tend to be overused in ICE-IND, as examples (18) and (19) show.

(18) Mother would take care of the children with working also. (ICE-IND W1A-004)

(19) Opposition helps in making the people politically educated. (ICE-IND W1A-005)

Finally, some examples, such as (20), clearly suggest that the progressive is overgeneralized to contexts where BrE prefers the simple form:

(20) For this essay, it will be focussing on how women are being treated or the hardships being faced. (ICE-FJ W1A-013)

5.  Conclusion

The main aim of our paper was to test a new methodology of mapping unity and diversity in (lexico-)grammatical variation in New Englishes. We used a frequency-based approach and beheaded verb chunks to obtain TAM profiles for selected ICE subcorpora and discussed the ranked lists, speculating on the potential linguistic reasons for the rank differences. Our partly corpus-driven approach revealed some potentially interesting differences in the TAM profiling of IC and OC Englishes, some of which appear to fit in well with previous, corpus-based studies that were mainly based on unannotated corpora, and others which do not fit in. The latter have guided us to a methodological danger of significance testing in corpus linguistics: the high dependence of the tokens on the document constitutes a serious violation of the independence assumption in some investigations. Especially in studies that make use of small corpora, the role of the individual document has to be given due attention, and the results of significance tests need to be treated with caution. We have shown that differences which are claimed to be significant may also arise from minor corpus compilation coincidences. We suggest that significance claims should be supplemented


by data on distribution across documents and genres. This fits in with a recent development, i.e. an increasing awareness of the important role of register variation in corpus-based studies of New Englishes. We have also pointed out that constructions like modals or the past perfect depend on the style, content and author’s perspective, which were not aimed at being homogeneously represented across the ICE corpora, and the number of available texts may be too small to lead to a natural randomization. This entails that studies which investigate the envelope of variation instead of raw numbers suggest themselves as a next step for the data analysis, if a working hypothesis on the envelope of variation with high inter-annotator agreement can be found. When over- and underuse partly compensate each other, which seems to be the case in the use of the past perfect, purely quantitative analyses fail to pick up a signal, and qualitative analyses are vital.

As far as qualitative differences are concerned, our data confirm the finding that the present and past perfect are used differently in IndE. With respect to modals, our results also support previous findings, particularly that may, should, need to, have to, will and would are used differently across the varieties studied here. The close-up on progressives confirms that they are most frequently used in ICE-FJ and ICE-NZ, even when a broader range of text types is used than in Hundt and Vogel (2011).

It has to be stressed, however, that beheading verb chunks is only a first step towards detailed profiling of the TAM dimension in different Englishes. This works better for simplex verb chunks (i.e. simple tenses and VBD) and modals but poses challenges for more complex constructions. So far, our profiling method only provides us with a relatively coarse picture and one, moreover, which is sometimes not straightforward to interpret. At the same time, it has the advantage that the risk of missing striking differences between varieties is minimal, and it has the potential of revealing previously undetected differences. With a view to obtaining more fine-grained analyses, we disambiguated some beheaded verb chunks (e.g. the different forms of the auxiliary have for the perfect construction, as well as combinations of modals with aspectual categories). Explanations for different TAM profiles might have to go beyond the VP as individual varieties may well resort to sentential strategies or nominal groups to express, e.g. modal meaning, especially in the case of potentially face-threatening acts with modals of obligation. Nelson (2003: 31), for instance, found that in ICE-EA constructions such as there BE need (also in combination with to-infinitives) were used and that this partly explained the low counts for VPs with need in the corpus.

TAM profiling of the kind proposed in this paper is not meant to replace the previous corpus-based approaches to the description of New Englishes. It is a




supplementary methodology, one that may be seen as a way into the investigation of variability at other levels. Sharma (2009), for instance, found that use of zero past tense forms was similar in both IndE and SingE on a functional level whereas there were marked differences in the use of the progressive. Chambers (2004: 28 as quoted in Sharma) cautions that:

The degree and distribution of a given feature must be understood in relation to the substrate before any universal claims can be made regarding “processes [that] recur in vernaculars wherever they are spoken …” (Sharma 2009: 191)

Quantitative profiling constitutes a possible initial step. A subsequent qualitative interpretation of samples is crucial. The relation between frequency of verb chunks and their function is not necessarily direct. There are limitations to the uncovering of variety-specific patterns: for example, regional patterns might be camouflaged by the overuse of one particular function and the underuse of another function, cancelling each other out (e.g. if the past perfect is underused in back-shifting, but overused in expressing remoteness this would not show up in a purely frequency-based approach).²⁴

Methodologically, there are two opposing, simplistic views on the validity of the hypothesis-driven vs. corpus-driven approach. The one, assuming previous hypothesis-driven research as the gold standard, speaks for the validity of the corpus-driven approach if it confirms previous findings (as our approach did in most instances). The other, assuming the corpus-driven view as the gold standard, interprets every difference between two corpora as a valid signal and refutes the existence of noise, postulating that, as far as statistical tests can confirm it, the frequency of the past point-of-view in ICE-NZ, for instance, is likely to indicate a feature of this variety. On the one hand, such a view neglects the possibility of skewed data (see Section 3.3) and does not give qualitative, common-sense investigation its due role. On the other hand, it respects the subtle nature of linguistic variation: as the vast majority of sentences in corpora of New Englishes show structural overlap with the patterns found in, for instance, ICE-GB, frequency differences need to come from a cause (whether linguistic, textual or even extralinguistic). The upshot of such discussions is usually that the two opposing approaches supplement rather than compete with each other.

24.  Theoretically, if data are unlimited, two features will never have exactly the same frequency and will thus not cancel each other out completely, so it can be seen as a sparse data problem.


References

Arppe, A., Gilquin, G., Glynn, D., Hilpert, M. & Zeschel, A. 2010. Cognitive corpus linguistics: Five points of debate on current theory and methodology. Corpora 5(1): 1–27.
Balasubramanian, C. 2009. Register Variation in Indian English [Studies in Corpus Linguistics 37]. Amsterdam: John Benjamins.
Bergs, A. & Pfaff, M. 2009. I was just reading this article. Is the perfect of the recent past on its way out? Paper presented at the SEU Symposium Current Change in the English Verb Phrase, 14 July 2009.
Biber, D., Finegan, E., Johansson, S., Conrad, S. & Leech, G. 1999. Longman Grammar of Spoken and Written English. London: Longman.
Biewer, C. 2008. South Pacific Englishes: Unity and diversity in the usage of the present perfect. In Dynamics of Linguistic Variation: Corpus Evidence on English Past and Present [Studies in Language Variation 2], T. Nevalainen, I. Taavitsainen, P. Pahta & M. Korhonen (eds), 203–219. Amsterdam: John Benjamins.
Biewer, C. 2009. Modals and semi-modals of obligation and necessity in South Pacific Englishes. Anglistik 20(2): 41–55.
Bybee, J., Perkins, R. & Pagliuca, W. 1994. The Evolution of Grammar: Tense, Aspect and Modality in the Languages of the World. Chicago IL: University of Chicago Press.
Collins, P. 2007. Can/could and may/might in British, American and Australian English: A corpus-based account. World Englishes 26(4): 474–491.
Collins, P. 2009. Modals and quasi-modals in world Englishes. World Englishes 28: 281–292.
Comrie, B. 1985. Tense. Cambridge: CUP.
Elsness, J. 2009. The perfect and the preterite in Australian and New Zealand English. In Comparative Studies in Australian and New Zealand English: Grammar and Beyond [Varieties of English around the World G39], P. Peters, P. Collins & A. Smith (eds), 89–114. Amsterdam: John Benjamins.
Facchinetti, R., Krug, M. & Palmer, F. (eds). 2003. Modality in Contemporary English. Berlin: Mouton de Gruyter.
Hopper, P.J. (ed.). 1982. Tense-Aspect: Between Semantics and Pragmatics [Typological Studies in Language 1]. Amsterdam: John Benjamins.
Huddleston, R. & Pullum, G.K. 2002. The Cambridge Grammar of the English Language. Cambridge: CUP.
Hundt, M. 2009. Global feature – local norms? A case study on the progressive passive. In World Englishes – Problems, Properties and Prospects [Varieties of English around the World G40], T. Hoffmann and L. Siebers (eds), 287–308. Amsterdam: John Benjamins.
Hundt, M. & Biewer, C. 2007. The dynamics of inner and outer circle varieties in the South Pacific and East Asia. In Corpus Linguistics and the Web, M. Hundt, N. Nesselhauf & C. Biewer (eds), 249–269. Amsterdam: Rodopi.
Hundt, M. & Smith, N. 2009. The present perfect in British and American English: Has there been any change, recently? ICAME Journal 33: 45–63.
Hundt, M. & Vogel, K. 2011. Overuse of the progressive in ESL and Learner Englishes – fact or fiction? In Exploring Second-Language Varieties of English and Learner Englishes: Bridging a Paradigm Gap [Studies in Corpus Linguistics 44], J. Mukherjee & M. Hundt (eds), 145–166. Amsterdam: John Benjamins.
Hunston, S. & Francis, G. 2000. Pattern Grammar. A Corpus-Driven Approach to the Lexical Grammar of English [Studies in Corpus Linguistics 4]. Amsterdam: John Benjamins.
Kortmann, B. & Szmrecsanyi, B. 2004. Global synopsis: Morphological and syntactic variation in English. In A Handbook of Varieties of English, Vol. 2: Morphology and Syntax, B. Kortmann, K. Burridge, R. Mesthrie, E.W. Schneider & C. Upton (eds), 1122–1182. Berlin: Mouton de Gruyter.
Labov, W. 1969. Contraction, deletion, and inherent variability of the English copula. Language 45(4): 715–762.
Leech, G., Hundt, M., Mair, C. & Smith, N. 2009. Changes in Contemporary English: A Grammatical Study. Cambridge: CUP.
Mair, C. 2006. Twentieth-Century English. Studies in English Language. Cambridge: CUP.
Mair, C. 2009a. Corpus linguistics meets sociolinguistics: Studying educated spoken usage in Jamaican on the basis of the International Corpus of English. In World Englishes – Problems, Properties and Prospects [Varieties of English around the World G40], T. Hoffmann and L. Siebers (eds), 39–60. Amsterdam: John Benjamins.
Mair, C. 2009b. Corpus linguistics meets sociolinguistics: The role of corpus evidence in the study of sociolinguistic variation and change. In Corpus Linguistics: Refinements and Reassessments, A. Renouf & A. Kehoe (eds), 7–32. Amsterdam: Rodopi.
Mair, C. & Hundt, M. 1995. Why is the progressive becoming more frequent in English? A corpus-based investigation of language change in progress. Zeitschrift für Anglistik und Amerikanistik 2(2): 111–122.
Mesthrie, R. 2005. Assessing representations of South African Indian English in writing: An application of variation theory. Language Variation and Change 17(3): 303–326.
Nelson, G. 2003. Modals of obligation and necessity in varieties of English. In From Local to Global English – Proceedings of Style Council 2001/2, P. Peters (ed.), 25–32. Sydney: Dictionary Research Centre, Macquarie University.
Pene, F. 2003. Write it Right. Fiji: Institute of Education, University of the South Pacific.
Quirk, R., Greenbaum, S., Leech, G. & Svartvik, J. 1985. A Comprehensive Grammar of the English Language. London: Longman.
Schmid, H. 1994. Probabilistic part-of-speech tagging using decision trees. In Proceedings of International Conference on New Methods in Language Processing, Manchester, 44–49.
Schneider, G. 2008. Hybrid Long-Distance Functional Dependency Parsing. Ph.D. dissertation, University of Zurich.
Schneider, G. & Hundt, M. 2009. Using a parser as a heuristic tool for the description of New Englishes. In The Fifth Corpus Linguistics Conference, Liverpool, 20–23 July 2009. (Online).
Sedlatschek, A. 2009. Contemporary Indian English. Variation and Change [Varieties of English around the World G38]. Amsterdam: John Benjamins.
Sharma, A.K. 2005. Text Book Of Chi-Test And Experimental Designs. New Delhi: Discovery Publishing House.
Sharma, D. 2001. The pluperfect in native and non-native English: A comparative corpus study. Language Variation and Change 13(3): 343–373.
Sharma, D. 2009. Typological diversity in New Englishes. English World-Wide 30(2): 170–195.
Tognini-Bonelli, E. 2001. Corpus Linguistics at Work [Studies in Corpus Linguistics 6]. Amsterdam: John Benjamins.
van Rooy, B. 2009. The shared core of the perfect across Englishes: A corpus-based analysis. In World Englishes – Problems, Properties and Prospects [Varieties of English around the World G40], T. Hoffmann & L. Siebers (eds), 309–330. Amsterdam: John Benjamins.
Wellner, B. & Vilain, M. 2006. Leveraging machine readable dictionaries in discriminative sequence models. In Proceedings of Language Resources and Evaluation Conference (LREC) 2006. Genoa, Italy, 22–28 May 2006, online.
Yngve, V. 1996. From Grammar to Science: New Foundations for General Linguistics. Amsterdam: John Benjamins.

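Appendix

Table 1a.  Text categories sampled for the tailored ICE corpora

| Text | Used Texts | Unused | Genre |
|---|---|---|---|
| W1A | 20 | 0 | Essay |
| W1B | 0 | 20 | |
| W2A | 28 | 12 | Academic |
| W2B | 20 | 20 | Non-Academic |
| W2C | 20 | 0 | Press |
| W2D | 0 | 20 | |
| W2E | 9 | 1 | Press |
| W2F | 0 | 20 | |
| TOTAL | 97 | 93 | |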

Modals and quasi-modals in New Englishes

Peter Collins & Xinyue Yao
University of New South Wales

Recent research on modals and quasi-modals has identified two complementary trends: a rise in the popularity of quasi-modals and a decline in that of modals. There is a strong tendency for American rather than British English to be leading the way in these developments. Furthermore, quasi-modals are thriving in speech, their modal counterparts in writing. This chapter investigates the distribution of a set of semantically similar modals and quasi-modals in a set of matching components of the International Corpus of English. The findings suggest, inter alia, that in the “Inner Circle” it is American English that is predominantly in the box seat in the rise of the quasi-modals and the decline of the modals, and that in the “Outer Circle” it is the more established Englishes that tend to be more advanced in these trends.

Keywords:  modal; quasi-modal; New Englishes; corpus

1.  Introduction

The focus of the present study is on the four quasi-modals (often also referred to as “semi-modals”) have to, have got to, be going to and want to, which are compared to four modals with which they share some semantic affinities and into whose semantic territory they appear to be making inroads, must, should, will and shall.1 The English quasi-modals are a rather heterogeneous set of periphrastic forms that are semantically similar to modal auxiliaries, but formally distinguishable from them. Some have been regarded as serving suppletive roles in the defective morphological paradigms of the modals; for example, infinitival to have to in the absence of *to must (see for example Coates 1983: 57).

1.  Have got to and have to are often treated as variants in the literature, but are treated here as separate quasi-modals on the grounds of their syntactic and semantic differences (see Collins 2009: 68).


While the inflectional and syntactic properties of the modals have been discussed extensively in the literature (see for example Palmer 1990; Huddleston & Pullum 2002), the properties of the quasi-modals have attracted less attention. The most comprehensive account is that of Westney (1995: 11), who posits the three criteria presented below for determining quasi-modal status.

i. Grammaticalization. Quasi-modals are characteristically subject to grammaticalization, the diachronic process by which a periphrastic lexical unit is transformed into a more grammatical one, which typically involves phonological weakening (as for example in gotta and wanna) and semantic bleaching (as for example in the loss of the originally comparative meaning of had better); see Krug (2000).

ii. Idiomaticity. Most of the quasi-modals are idiomatic, expressing meanings that transcend those of their constituent parts (for example got, the primary semantic element of have got to, is used elsewhere with possessive meaning; going, the primary semantic unit of be going to, elsewhere with motional meaning).

iii. Semantic relatedness to a central modal auxiliary. A number of the quasi-modals have a close semantic affiliation with a modal, although there may be subtle differences between them (for instance have to and must both express strong deontic necessity, or ‘obligation’, but have to tends to be more ‘objective’ than must). Furthermore a number of the quasi-modals (including have to, have got to and be bound to) exhibit the semantic “root” versus “epistemic” duality that is a characteristic feature of the modals (see for example Coates 1983).

The analysis presented is essentially form based. When meanings are discussed, the approach is merely exemplary and qualitative. Furthermore, discussion is underpinned by Palmer’s (2001) tripartite distinction between “epistemic”, “deontic” and “dynamic” modality. Epistemic modality is concerned with the speaker’s judgement of the likelihood that the proposition on which the utterance is based is true, while both deontic and dynamic modality are concerned with the conditions for the actualization of a situation, typically deriving in the case of deontic modality from an external source, and in the case of dynamic modality from an internal source (namely, the subject referent).

2.  Recent diachronic trends

According to Leech (2003), Mair & Leech (2006), and Leech et al. (2009), there has been a rise in the frequency of the quasi-modals and a fall in the frequency of the modals in British and American English in recent decades. Their findings are based on two written corpora with data originating in the 1960s (the British LOB corpus and the American BROWN corpus), and two parallel corpora with data originating in the 1990s (the British FLOB corpus and American FROWN corpus). Table 1 quantifies these rises and falls, presenting differences between the frequency of items in the 1960s corpora and those in the 1990s corpora, as, in each case, a percentage of the former.2

The frequency of all four of the quasi-modals has increased in American writing, ranging from minimally in the case of have to, to dramatically in the case of want to. By contrast in British writing be going to has not changed much, have got to has declined considerably, and have to and want to have undergone a moderate increase. All four of the modals have suffered a decline, in both Englishes, with a quite high level of consistency between the Englishes in the degree of the decline for each modal, ordered from sharpest to mildest as follows: shall > must > should > will.

Table 1.  Changes in the frequency of some quasi-modals and modals in British and American writing from the 1960s to the 1990s

| Quasi-modals | BrE | AmE | Modals | BrE | AmE |
|---|---|---|---|---|---|
| have to | +9.1% | +2.8% | must | –29.0% | –33.8% |
| have got to | –34.1% | +16.6% | should | –11.7% | –12.8% |
| be going to | –1.1% | +55.0% | will | –3.9% | –10.3% |
| want to | +18.6% | +72.4% | shall | –43.6% | –43.3% |
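The percentages in Table 1 express the 1960s-to-1990s difference as a proportion of the 1960s frequency. A minimal sketch of that calculation follows; the function name and the raw frequencies are hypothetical and are not taken from LOB/FLOB or BROWN/FROWN.

```python
def percent_change(freq_1960s: float, freq_1990s: float) -> float:
    """Difference between the two frequencies, as a percentage of the 1960s value."""
    if freq_1960s <= 0:
        raise ValueError("the 1960s frequency must be positive")
    return (freq_1990s - freq_1960s) / freq_1960s * 100.0


# Hypothetical raw frequencies, for illustration only.
rise = percent_change(200, 250)
fall = percent_change(400, 300)
print(f"{rise:+.1f}%  {fall:+.1f}%")
```

Because the 1960s figure is the denominator, a rise and a fall of the same absolute size yield different percentages whenever the two starting frequencies differ.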

In the absence of matching spoken corpora, Leech et al. (2009) cannot provide comparable information about spoken English. However, using non-parallel corpora, they note that in British speech the rise of the quasi-modals and the fall of the modals have been sharper than in British writing. In the absence of any spoken American corpus for the 1960s, Leech et al. draw on 1990s data from the 4-million-word Longman Corpus of Spoken American English, and note that quasi-modals dominate modals to such an extent that spoken American English confirms its status as “the most advanced variety” (2009: 104) in the rise of the quasi-modals. While the present study is not diachronic, diachronically relevant insights can be gleaned from the patterns of regional and stylistic variation identified.

2.  The figures in Table 1 are based on those reported in Leech et al. (2009).


3.  The Englishes

The study compares the distribution and frequency of the set of quasi-modals and modals identified in Section 1 above across four Englishes representing Kachru’s (1985) “Inner Circle” (henceforth IC) – British English (BrE), American English (AmE), Australian English (AusE) and New Zealand English (NZE) – and nine representing his “Outer Circle” (henceforth OC) – Jamaican English (JamE), Singapore English (SingE), Philippine English (PhilE), Indian English (IndE), Nigerian English (NigE), Malaysian English (MalE), Hong Kong English (HKE), Kenyan English (KenE) and Fijian English (FijE). In the IC varieties English is the first language for the majority of the population and the language in which almost all public and private interaction is conducted. In the OC varieties English, though an official language, is usually the second language, and is learnt at school.

There is no shortage of surveys and commentaries upon the grammatical features that are distinctive to particular World Englishes (e.g. Schneider 2007, 2008; Burridge & Kortmann 2008; Kortmann & Upton 2008; Mesthrie 2008; Mesthrie & Bhatt 2008), but in order to provide empirical support for the observations and claims therein we need comprehensive corpus-based investigations of the type reported in this paper. The study seeks, furthermore, to interpret patterns of grammatical variation identified in the light of such generally attested trends as the tendency for AmE to lead the way in many developments (cf. Trudgill 1986: 130).

In pursuing the intervarietal comparisons of the study, reference is made to the evolutionary model developed by Schneider (2007). According to Schneider, a uniform underlying process, involving a cyclic series of characteristic phases determined by extralinguistic conditions, has operated across the diverse range of contact situations in which Postcolonial Englishes have emerged. The process involves identity reconstruction shaped by both sociocultural factors and by a linguistic dimension to which Schneider refers as “structural nativization”. The detection of such patterns on the level of grammar is more challenging than that on the levels of lexis and pronunciation. This is because, as Mukherjee and Gries (2009) point out, the differences that exist in grammar are rarely categorical in nature: more typically they involve uses of structures of differing frequency that may operate below the level of linguistic awareness. Appropriately, then, the favoured approach has become the corpus-based analysis of large amounts of natural data. This approach is particularly suited to the investigation of the quasi-modals, which display quantitative differences of an often subtle nature that lie largely below the level of consciousness across a range of varieties.

According to Mukherjee and Gries (2009: 36) the evolutionary advancement of a New English can be measured by the degree of its dissimilarity from its primary English parent. This assumption may well be appropriate when one is considering phenomena such as verb complementation patterns of the type studied by Mukherjee and Gries, where there is no evidence of major diachronic change in parent varieties. However, with items that appear to be undergoing increases and decreases in frequency, it may be more valid to measure the evolutionary advancement of New Englishes in terms of their similarity to, rather than difference from, those IC varieties that are most likely to be leading the way.3

Schneider (2007) proposes that there are five phases relevant to the evolution of the New Englishes, progression through which is motivated by group-identity and identity-construction. The phases are: (i) foundation (transportation); (ii) exonormative stabilization; (iii) nativization; (iv) endonormative stabilization; and (v) differentiation. There follows a brief discussion of the evolutionary history and current status of the twelve Postcolonial varieties examined in this study, ordered from most to least advanced. Table 2 summarizes the discussion.

Table 2.  Onset dates for the phases of Schneider’s (2007) evolutionary cycle of Postcolonial Englishes

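| English | Phase 1 | Phase 2 | Phase 3 | Phase 4 | Phase 5 |
|---|---|---|---|---|---|
| AmE | c.1587 | c.1670 | c.1773 | 1828/1848 | 1898 |
| AusE | 1788 | c.1830s | 1901 | 1942 | 1980s |
| NZE | c.1790s | 1840 | 1907 | 1973 | 1990s |
| JamE | 1655 | c.1690s | | 1962 | + |
| SingE | 1819 | c.1867 | 1942 | 1970s | |
| PhilE | 1898 | | 1946 | + | |
| IndE | 1600 | 1757 | c.1905 | + | |
| NigE | early C19 | c.1900 | late 1940s | + | |
| MalE | 1786 | | 1957 | | |
| HKE | 1841 | 1898 | 1960s | | |
| KenE | 1860s | c.1920 | late 1940s | | |
| FijE | early C19 | 1930s | + | | |

+ = weak/disputable signs of the onset of the phase are in evidence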

3.  The editors of this volume wisely caution that it may be a difficult challenge to measure how similar to or different from leading IC varieties an OC variety is, noting that nativization in ESL varieties may be different from usage profiles of patterns that are part of Global English.


i. AmE: Although AmE is now regarded as an established “reference” variety rather than a new Postcolonial English, its evolution – as Britain’s first overseas colony – was similar to that of AusE, NZE and South African English. According to Schneider (2007: 251) AmE, insofar as it differs from the other Postcolonial Englishes in the chronological extent of its evolution (c.400 years), “provides an almost unique opportunity to observe the entire developmental cycle in hindsight”.

ii. AusE: AusE has reached Phase 5, but only quite recently (from the 1980s). The onsets of a number of phases are associated with specific historical events: Phase 1 with the arrival of the First Fleet in 1788 and the establishment of a penal colony, Phase 3 with the formation of the Commonwealth of Australia in 1901, and Phase 4 with the fall of Singapore in World War II, as a result of which Australia found itself unprotected in the face of Japanese attack and a new sense of national identity began to emerge. Today there is increasing evidence of the regional, social and ethnic diversification that is characteristic of Phase 5.

iii. NZE: NZE is similar to AusE in many ways: it has also reached the final phase of differentiation (see for example Bauer & Bauer 2002), albeit slightly later than AusE. Again there are several major historical events that represent evolutionary landmarks. The Treaty of Waitangi in 1840 saw Maori chiefs yield sovereignty to Britain, resulting in an influx of British settlers in Phase 2. The achievement of full independence in 1907 is associated with the ensuing nativization of the pronunciation and vocabulary. The year that Britain joined the European Economic Community and New Zealand found it necessary to reorient to the Asia-Pacific region, 1973, is identified by Schneider as marking the beginning of a move towards linguistic homogeneity, codification and literary creativity.

iv. JamE: JamE is in Phase 4, dating from 1962, when independence ushered in a new era of nationalism and pan-ethnic identity which saw Jamaican Creole spread widely, attract political support for its constitutional recognition, and even begin its appearance in serious literature. A significant difference between JamE on the one hand, and most of the Asian and African Englishes on the other, is that JamE is the first language of most of its speakers.

v. SingE: SingE is also in Phase 4. The British colonial tradition, having prevailed from 1819, crumbled in the aftermath of Japanese occupation during World War II. The drive for independence that achieved success in 1965 signalled the emergence of a new sense of identity which, from the 1970s onwards, was given further momentum from Singapore’s rapid economic growth, industrialization and modernization. The ethnic neutrality of SingE has become a strong badge of identity for the members of this exceptionally multicultural nation. While linguists generally agree that Phase 5 linguistic diversification of the type found in AmE, AusE and NZE is not yet in evidence, some (for example Lim 2001) have nevertheless suggested that SingE is moving from the OC to the IC.

vi. PhilE: The evolutionary status of PhilE dates from 1898, when the United States was granted authority over the Philippines. Thus PhilE, a by-product of America’s colonial expansion, spans little more than a single century. In recent years it has been losing ground to Filipino, which is being promoted as a national language, while English has become associated with colonization and political elitism. Some weak signs of Phase 4 are in evidence, including proposals for the codification and standardization of language education and the growth of a Philippine literature written in English.

vii. IndE: While IndE has a large number of speakers, it is restricted to particular domains and social strata. The twentieth century saw a weakening of the previously dominant exonormative linguistic orientation, aided arguably in part by the granting of independence in 1947. Although English has not developed into a bearer of national identity or become accessible to the majority of the population, IndE is associated with literary creativity, and surveys indicate the emergence of endonormative attitudes.

viii. NigE: NigE, which dates from the early nineteenth century and is currently spoken by approximately 20 per cent of the population, is in Phase 3, with some preliminary signs of Phase 4. These signs include debate over the acceptance of a local standard Nigerian form of English, perceptions of a British accent as affected, and the use of Nigerian Pidgin and English in literature.

ix. MalE: MalE can be dated from the establishment of the colony of Penang in 1786. With the constitution of 1957 came a nationalist language policy that saw English disestablished as an official language in favour of Bahasa Malaysia (officially in 1976). Despite the consequent marginalization of MalE it has been steadily nativized, and its use as well as code-mixing/shifting are now widespread. Though MalE is squarely in Phase 3, some traces of Phase 4 are already discernible (in the form of complaints about falling standards of English).

x. HKE: The beginnings of HKE can be traced to the occupation of Hong Kong Island in 1841, the beginning of the British colony. HKE is currently in Phase 3, which began in the 1960s with the growth of Hong Kong into an entrepreneurial commercial powerhouse. Traces of Phase 2 still persist, with a continuing tradition of complaint about falling language standards promoting a re-evaluation of the role and status of English.

xi. KenE: KenE has not moved beyond Phase 3. Following the Phase 2 establishment of a British colony in 1920 and the resultant promotion of English as the language of business, law, administration and education (initially only for children of the elite), Phase 3 commenced in the post-World War II years, when widespread teaching of English was undertaken as part of the British attempt to modernize Kenya in preparation for independence.

xii. FijE: Fijian English is in Phase 2, which may be dated from the 1930s, when a policy to use English as the medium of instruction in schools was implemented. Today English is de facto the official language. There are several signs of progression to Phase 3, including codification of the lexis in the Macquarie Dictionary for the Fiji Islands.

4.  The data

The study is based on an analysis of all relevant tokens in the following currently available corpora of the International Corpus of English (ICE) collection: ICE-GB, ICE-AUS, ICE-NZ, ICE-JAM, ICE-SIN, ICE-PHI, ICE-IND, ICE-NIG, ICE-MAL, ICE-HK, ICE-EA(Ken) and ICE-FJ.4 We extracted all tokens of the modals and quasi-modals listed in Section 1 above. Each complete ICE corpus contains approximately one million words of text, dating from the early 1990s, and conforms to a common design, comprising 500 texts of 2,000 words each (300 spoken texts – 180 dialogic and 120 monologic; and 200 written texts – 50 non-printed and 150 printed). ICE-EA(Ken), which contains around 788,000 words, has a bigger proportion of written text (386,000 words for spoken, 402,000 for written). For three of the corpora there are no spoken and only partial written data available (approximately 324,000 words for NigE, 165,000 words for MalE and 298,000 words for FijE).5

The compilation of ICE-US is as yet incomplete, so we assembled a smaller makeshift corpus called C-US of around 210,000 words, using texts constituting Part A of the (spoken) Santa Barbara Corpus (SBC) and selected texts from the Freiburg-Brown Corpus of written American English (FROWN), both of which have data also from the early 1990s, in comparable text categories.6

4.  For the latest information on the ICE corpora, visit: http://www.ucl.ac.uk/english-usage/ice/. All ICE corpora comprise the following text types: dialogues (S1A private, S1B public); monologues (S2A unscripted, S2B scripted); non-printed writing (W1A student writing, W1B letters); printed writing (W2A academic, W2B popular, W2C reportage, W2D instructional, W2E persuasive, W2F creative).

5.  We wish to express our thanks to three people for supplying us with the available (written) data from three as-yet-incomplete ICE corpora: Hajar Abdul Rahim for ICE-MAL, Ulrike Gut for ICE-NIG, and Lena Zipp for ICE-FJ.

6.  The name C-US was suggested by Edgar Schneider: see Collins (2005).




In the selection of texts from FROWN the primary objective was to match the written ICE categories as closely as possible, the parallels being as follows:

| ICE | C-US |
|---|---|
| Non-printed (50 texts) | G1-3; P1-7 (10 texts) |
| Printed: informational (100 texts) | J1-8; F1-8; A1-4 (20 texts) |
| Printed: instructional (20 texts) | H1-2; E1-2 (4 texts) |
| Printed: persuasive (10 texts) | B1-2 (2 texts) |
| Printed: creative (20 texts) | K1-4 (4 texts) |

5.  The Englishes compared

Table 3 presents frequencies for all the quasi-modals investigated in our spoken data. All frequencies were normalized to tokens per million words.7

Table 3.  Frequencies of the quasi-modals in speech

| | BrE | AmE | AusE | NZE | Av.IC | JamE | SingE | PhilE | IndE | HKE | KenE | Av.OC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| have to | 1254 | 2049 | 1672 | 1436 | 1603 | 2015 | 1663 | 1597 | 1732 | 1756 | 1273 | 1673 |
| have got to | 358 | 278 | 548 | 479 | 416 | 75 | 309 | 70 | 48 | 109 | 54 | 111 |
| be going to | 1386 | 3232 | 1616 | 1589 | 1956 | 1288 | 958 | 1296 | 605 | 771 | 673 | 932 |
| want to | 1457 | 2536 | 1770 | 1557 | 1830 | 2027 | 2264 | 1716 | 1281 | 1825 | 1503 | 1769 |
| Total | 4454 | 8096 | 5607 | 5062 | 5805 | 5406 | 5194 | 4679 | 3666 | 4462 | 3503 | 4485 |
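Normalization to tokens per million words, as used for Table 3, can be sketched as follows. This is an illustrative helper only: the function name and the raw count are hypothetical, and the 600,000-word figure is simply the size of the spoken component implied by the common ICE design (300 texts of 2,000 words).

```python
def per_million(raw_count: int, corpus_words: int) -> float:
    """Scale a raw token count to a frequency per 1,000,000 words."""
    if corpus_words <= 0:
        raise ValueError("corpus size must be positive")
    return raw_count * 1_000_000 / corpus_words


# Hypothetical raw count of 430 tokens in a 600,000-word spoken component.
spoken_words = 300 * 2_000
normalized = per_million(430, spoken_words)
print(round(normalized))
```

Normalizing in this way is what allows corpora of different sizes, such as the roughly 210,000-word C-US and the one-million-word ICE corpora, to be compared on the same scale.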

7.  In the first version of this paper a scheme was devised for normalizing the frequencies for the three incomplete corpora to tokens per million words, bearing in mind the likely differences in frequency between writing and speech, and using the following calculations. Step 1: normalize the frequency to tokens per 400,000 words; step 2: normalize the resulting frequency to tokens per 1,000,000 words; step 3: using the speech/writing ratio for the other OC ICE corpora, calculate a frequency per 1,000,000 words for speech; step 4: normalize the resulting frequency to tokens per 600,000 words; step 5: add the frequencies resulting from step 1 and step 4. While we ultimately decided to “play it safe” and compare the written fragments of the incomplete corpora only with the (complete) written components of the completed ICE corpora, in this note we supply the normalized frequencies for the modal expressions resulting from the five steps described above: in ICE-NIG, ICE-MAL and ICE-FJ respectively: have to (985, 1577, 1790), have got to (94, 94, 72), be going to (984, 343, 847), want to (945, 907, 1007), must (812, 806, 845), should (1582, 1435, 2298), will (4856, 3936, 5183) and shall (248, 70, 50).


The quasi-modals are, as we have seen above, known to be on the rise in BrE and AmE. Given that it is likely that they are also on the rise in English worldwide, what the frequencies presented in Table 3 suggest is that AmE is leading the way in this change. AmE has a considerably higher total number of tokens than the other nine varieties (and, within the IC, leads in the frequency of every quasi-modal selected except have got to), and has over twice as many tokens as the least “advanced” variety in sheer frequency terms, KenE. The IC varieties as a group have a stronger predilection for the quasi-modals overall than do the OC varieties, although it should be noted that in the case of have to the OC varieties are slightly ahead of the IC. Within the IC it is BrE that is the most “conservative” and least similar to AmE, followed by NZE and then AusE. Within the OC varieties, KenE and IndE are considerably more conservative than the Southeast Asian and Caribbean varieties.

Table 4 presents frequencies for all the quasi-modals investigated in our written data. AmE again leads overall, its affinity for be going to and want to being strikingly stronger than any other variety’s. The IC varieties were marginally ahead of the OC. The ordering within the latter (SingE>HKE>KenE>JamE>FijE>PhilE>MalE>NigE>IndE) is difficult to explain, especially the low frequency for JamE and the high frequency for HKE and KenE.

Tables 5 and 6 present frequencies for the modals investigated. Given that the modals appear to be in decline, AmE – with the smallest total number of modal tokens – can again lay claim to being the most advanced of the IC Englishes in speech (and with the exception of spoken JamE, of all the ten Englishes). The OC varieties are considerably more conservative than the IC. Again the OC ordering is hard to explain (JamE>KenE>PhilE>HKE>IndE>SingE), particularly the conservative result for SingE and the high ranking for KenE.

In writing, AmE is again the most advanced of the IC Englishes, whose ordering follows the pattern (AmE>AusE>BrE>NZE). The OC varieties are again more conservative than the IC. Again the OC ordering is well-nigh impossible to interpret (IndE>PhilE>JamE>FijE>NigE>MalE>KenE>SingE>HKE), particularly the conservative result again for SingE and the high ranking for IndE.

The ordering within the IC when we consider both speech and writing together is almost identical for the quasi-modals (AmE>AusE>NZE>BrE) and the modals (AmE>AusE>BrE>NZE). In both cases AmE leads the way, followed by AusE, with NZE and BrE behaving conservatively.

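Table 4.  Frequencies of the quasi-modals in writing

| | BrE | AmE | AusE | NZE | Av.IC | JamE | SingE | PhilE | IndE | NigE | MalE | HKE | KenE | FijE | Av.OC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| have to | 820 | 782 | 646 | 793 | 760 | 928 | 930 | 757 | 763 | 488 | 808 | 777 | 757 | 834 | 782 |
| have got to | 34 | 50 | 48 | 60 | 48 | 24 | 63 | 13 | 7 | 12 | 18 | 18 | 17 | 13 | 21 |
| be going to | 144 | 348 | 180 | 226 | 224 | 155 | 153 | 103 | 88 | 120 | 55 | 197 | 192 | 97 | 129 |
| want to | 537 | 807 | 575 | 505 | 606 | 508 | 721 | 627 | 444 | 738 | 534 | 748 | 707 | 616 | 627 |
| Total | 1534 | 1987 | 1449 | 1584 | 1638 | 1615 | 1867 | 1500 | 1303 | 1358 | 1415 | 1740 | 1672 | 1561 | 1559 |

Table 5.  Frequencies of the modals in speech

| | BrE | AmE | AusE | NZE | Av.IC | JamE | SingE | PhilE | IndE | HKE | KenE | Av.OC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| must | 507 | 278 | 483 | 602 | 468 | 583 | 955 | 480 | 853 | 520 | 779 | 695 |
| should | 977 | 766 | 1023 | 1000 | 942 | 1002 | 1437 | 1092 | 1636 | 1407 | 1617 | 1365 |
| will | 3828 | 4361 | 4306 | 4022 | 4129 | 3016 | 6412 | 4820 | 5361 | 5345 | 3834 | 4708 |
| shall | 211 | 108 | 51 | 57 | 107 | 61 | 89 | 203 | 137 | 82 | 329 | 150 |
| Total | 5523 | 5513 | 5863 | 5681 | 5645 | 4662 | 8893 | 6595 | 7987 | 7354 | 6558 | 7008 |

Table 6.  Frequencies of the modals in writing

| | BrE | AmE | AusE | NZE | Av.IC | JamE | SingE | PhilE | IndE | NigE | MalE | HKE | KenE | FijE | Av.OC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| must | 853 | 522 | 742 | 750 | 717 | 845 | 992 | 734 | 726 | 793 | 1190 | 761 | 983 | 724 | 861 |
| should | 1213 | 969 | 1255 | 1215 | 1163 | 1513 | 1296 | 1070 | 1068 | 1460 | 1269 | 1759 | 2031 | 1870 | 1482 |
| will | 3805 | 3266 | 3077 | 4022 | 3543 | 3075 | 4215 | 3380 | 2719 | 3651 | 3681 | 3973 | 3046 | 3485 | 3469 |
| shall | 218 | 87 | 177 | 144 | 157 | 483 | 130 | 298 | 295 | 293 | 73 | 164 | 192 | 47 | 219 |
| Total | 6089 | 4843 | 5251 | 6131 | 5579 | 5917 | 6634 | 5482 | 4808 | 6198 | 6213 | 6656 | 6252 | 6125 | 6032 |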
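The OC ordering reported for the quasi-modals in writing can be checked with a short sketch that ranks the varieties by their Table 4 totals and recomputes the OC average. The figures are the written totals per million words for the nine OC varieties; the variable names are illustrative assumptions.

```python
# Table 4 totals (quasi-modals in writing, tokens per million words), OC varieties.
written_totals = {
    "JamE": 1615, "SingE": 1867, "PhilE": 1500, "IndE": 1303, "NigE": 1358,
    "MalE": 1415, "HKE": 1740, "KenE": 1672, "FijE": 1561,
}

# Rank from highest to lowest frequency, mirroring the ">" notation in the text.
ranking = sorted(written_totals, key=written_totals.get, reverse=True)
ordering = ">".join(ranking)

# Unweighted mean over the nine varieties, comparable to the Av.OC column.
mean_oc = round(sum(written_totals.values()) / len(written_totals))

print(ordering)
print(mean_oc)
```

The average here is unweighted, so each variety counts equally regardless of the size of its corpus.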


6.  Speech and writing compared

It was anticipated that a comparison of frequencies of the modal expressions under examination in speech and writing would yield relevant insights into the contrasting diachronic fortunes of the modals and quasi-modals, for several reasons. One is Mair and Leech’s (2006) finding, noted in Section 2 above, that these diachronic trends tend to be more extreme in speech than in writing, and another is the attested tendency for many linguistic innovations to spread rapidly in informal spoken genres before becoming established more broadly in the language. Yet another is the tendency for items that become limited to the written word to ossify.

Table 7 presents the speech versus writing ratios for the quasi-modals in all the Englishes for which both spoken and written data were available. The results provide one possible explanation for our previous finding that AmE is leading the way in the rise of the quasi-modals: the preference of the quasi-modals in AmE for occurrence in speech over writing is greater than that in the other three IC varieties. Not surprisingly the ordering of the varieties from the highest to the lowest ratio matches that attested above for the quasi-modals in speech and writing, namely AmE>AusE>NZE>BrE. Within the OC JamE clearly has the highest ratio, and KenE the lowest, the ordering being: JamE>PhilE>IndE>SingE>HKE>KenE. The biggest surprise here is the relatively low ratio for SingE, given its high evolutionary ranking. The IC varieties exhibit a considerably stronger speech preference than the OC. The most striking contrast that emerges with the individual items is with have got to, its usage in speech being much more frequent in the IC than in the OC varieties.

Table 7.  Speech versus writing ratios for the quasi-modals

| | BrE | AmE | AusE | NZE | Av.IC | JamE | SingE | PhilE | IndE | HKE | KenE | Av.OC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| have to | 1.53 | 2.62 | 2.59 | 1.81 | 2.11 | 2.17 | 1.79 | 2.11 | 2.27 | 2.26 | 1.68 | 2.04 |
| have got to | 10.53 | 5.56 | 11.42 | 7.98 | 8.67 | 3.13 | 4.90 | 5.38 | 6.86 | 6.06 | 3.18 | 4.69 |
| be going to | 9.63 | 9.29 | 8.98 | 7.03 | 8.73 | 8.31 | 6.26 | 12.58 | 6.88 | 3.91 | 3.51 | 6.30 |
| want to | 2.71 | 3.14 | 3.08 | 3.08 | 3.02 | 3.99 | 3.14 | 2.74 | 2.89 | 2.44 | 2.13 | 2.83 |
| Total | 2.90 | 4.07 | 3.87 | 3.20 | 3.54 | 3.35 | 2.78 | 3.12 | 2.81 | 2.56 | 2.10 | 2.78 |
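The ratios in Table 7 appear to be the spoken frequency divided by the written frequency for the same item and variety. The sketch below works under that assumption, using the have to figures for BrE and AmE from Tables 3 and 4; the function and variable names are hypothetical.

```python
def speech_writing_ratio(speech_pmw: float, writing_pmw: float) -> float:
    """Ratio of the spoken to the written frequency (both per million words)."""
    if writing_pmw <= 0:
        raise ValueError("the written frequency must be positive")
    return speech_pmw / writing_pmw


# (speech, writing) frequencies of "have to" taken from Tables 3 and 4.
have_to = {"BrE": (1254, 820), "AmE": (2049, 782)}

ratios = {
    variety: round(speech_writing_ratio(speech, writing), 2)
    for variety, (speech, writing) in have_to.items()
}
print(ratios)
```

A ratio above 1 indicates a preference for speech and a ratio below 1 a preference for writing, which is how the contrast between the quasi-modals and a modal such as must shows up in Tables 7 and 8.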

Consider next the modals in Table 8. All bar shall tend to be more popular in speech in the OC varieties than in the IC, suggesting that they retain a degree of vitality, and consequently that their rate of decline may be less marked – or at least delayed. Interestingly, the results for the IC varieties are rather equivocal with respect to the advanced status of the decline of the modals in AmE, where the popularity of will and shall in speech increases their ratio to an unanticipated level. The ratios suggest that the IC Englishes are in general slightly more advanced in the decline of the modals than the OC.

Table 8. Speech versus writing ratios for the modals

Modal   BrE    AmE    AusE   NZE    Av.IC  JamE   SingE  PhilE  IndE   HKE    KenE   Av.OC
must    0.59   0.53   0.65   0.80   0.65   0.69   0.96   0.65   1.17   0.68   0.79   0.83
should  0.81   0.79   0.82   0.82   0.81   0.66   1.11   1.02   1.53   0.80   0.80   0.94
will    1.01   1.34   1.40   1.00   1.17   0.98   1.52   1.43   1.97   1.35   1.26   1.38
shall   0.97   1.24   0.29   0.40   0.68   0.13   0.68   0.68   0.46   0.50   1.71   0.58
Total   0.91   1.14   1.12   0.93   1.01   0.79   1.34   1.20   1.66   1.10   1.05   1.18
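The claim that all modals bar shall show a stronger speech preference in the OC than in the IC can be checked against the two average columns of Table 8. The Python sketch below is an illustration only: the names are hypothetical, and the assignment of the first three value rows to must, should and will is an assumption based on how the flattened table reads (the should row agrees with the token averages of 942 and 1163 reported for the IC in Section 7.1).

```python
# (Av.IC, Av.OC) speech-versus-writing ratios for the modals, read from Table 8.
# The row-to-modal assignment for must, should and will is an assumption.
AVERAGES = {
    "must": (0.65, 0.83),
    "should": (0.81, 0.94),
    "will": (1.17, 1.38),
    "shall": (0.68, 0.58),
}

# Modals whose average ratio is higher in the OC than in the IC.
higher_in_oc = [modal for modal, (ic, oc) in AVERAGES.items() if oc > ic]
# Modals for which the OC average does not exceed the IC average.
exceptions = [modal for modal, (ic, oc) in AVERAGES.items() if oc <= ic]

print(higher_in_oc)  # ['must', 'should', 'will']
print(exceptions)    # ['shall']
```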

7. The individual quasi-modals

In this section we focus on the individual quasi-modals (and their semantically related modal counterparts).

7.1 have to

We can glean from combining results from Tables 3 and 4 above that have to, which we have observed above (Table 1) to have enjoyed increasing popularity in recent British and American usage, is the second most popular of the quasi-modals investigated in the IC varieties, the most popular in the OC, and the most popular of those investigated overall. Within the IC have to follows the general trend for AmE to be leading the way in the rise of the quasi-modals and for NZE and BrE to be the most conservative. Within the OC have to is strikingly popular in spoken JamE, and in writing comparatively unpopular in the two African Englishes, KenE and NigE. Not surprisingly, given its recent increase in frequency, have to is more commonly found in spoken than written registers (with AmE marginally ahead of the other Englishes in this regard), but the extent of its popularity in speech is less than that for be going to and want to, suggesting that the others may be increasing at a faster rate. Have to is almost twice as frequent as the modal that is closest to it in meaning, must. The contrasting fortunes of the quasi-modal and modal are reflected in their speech/writing preferences: the greater use of have to in spoken genres noted above contrasts with the lesser use of must. The dominant meaning of both have to and must in contemporary English is (strong) deontic necessity, or ‘obligation’ (see Collins 2009: 34, 60). They typically differ, however, on the dimension of objectivity/subjectivity (i.e. in terms of whether the source of the obligation is external to the speaker, or whether the deontic source is the speaker): have to is generally objective, as in example (1), while must is more often subjective than have to, as in example (2).

(1) In many other countries like Melanesia women are culturally bound they are not even given time to go freely if they have to – live – leave their husband but instead have to pay for the bride-price. (ICE-FJ W1A-011 34)



(2) If you want my boys to surrender, you must remove your askaris so the boys get a chance to come out, so that the war may be stopped, so that we may negotiate. (ICE-EA(Ken) W2F)

It has been suggested that one reason for the increasing popularity of have to at the expense of must might be the less overbearing tone that is generated by its objective deontic meaning (e.g. by Myhill 1995 for American English, and by Collins 2005 for British, American, Australian and New Zealand English; it remains for scholars to test whether the suggestion is valid for the OC Englishes as well). The same objective/subjective tendency applies to dynamic have to and must. Dynamic have to expresses a need that characteristically derives from external circumstances, in the case of (3) the insufficient spiciness of the food in Hong Kong restaurants. Dynamic must by contrast expresses a need that typically derives from internally driven factors, in the case of (4) the speaker’s inner compulsion.

(3) So uh from young I am used to eating all the Malay food and Indian food and uh Chinese food uh so uh I I really like to eat uh s spicy hot curry uh so so whenev whenever in Hong Kong I I find that sometimes the food here is not hot enough for me so I have to I have to go to a a Malaysian restaurant or Indian restaurant. Uh maybe once every two weeks to to uh satisfy my my yeah my my need uh (ICE-HK S1A-008 213)



(4) In my life, everything I do must have reasons and justification. (ICE-MAL W2B-004)

Both have to and must express epistemic necessity; however, with have to this is a very minor meaning, an indication of its incomplete grammaticalization (see Collins 2009: 34, 60). Thus, while had to could in principle be substituted for epistemic must in (5), its informality – and perhaps objectivity – would be considered inappropriate by many speakers.

(5) She thought these must have been frightful creatures (ICE-PHI W1A)

Like have to, the modal should is approximately twice as popular as must, a numerical superiority attributable in part at least to should’s milder subjectivity and consequently less forceful and authoritarian tone. In (6), for example, must could not easily be substituted for should: its modal strength, in particular, would not combine readily with the past time situation.

(6) After all, if he should have intended, but did not intend, to produce an effect in the audience, then presumably his rings on the bell don’t mean anything at all. (ICE-NIG W1A 012)

If, as the diachronic pattern discussed in Section 2 above suggests, should is currently in slow decline, then the IC varieties (with 942 tokens on average in speech, and 1163 in writing) are more advanced than the OC (with 1365 and 1482 respectively). Of the varieties it is AmE that is the most advanced in this regard.

7.2 have got to

Have got to is the least popular of the quasi-modals analysed in the study. It differs from have to grammatically (see Collins 2009: 68) and stylistically in being avoided in written genres (in the present study the writing to speech ratio for have got to was approximately 1:9 for the IC varieties, and 1:5 for the OC). This dispreference for have got to in writing is not surprising in view of the traditional censure of got, with its familiar application to the written word, by prescriptive grammarians. Have got to, like have to, serves primarily to express strong deontic necessity. What distinguishes the two quasi-modals semantically is have got to’s less consistent expression of objectivity. Thus in (7), where the speaker is the deontic source, substitution of have to might suggest the greater likelihood of a deontic source other than the speaker.

(7) Oh you have got to wait for your marks first (ICE-IND S1A)

Have got to is far more popular in the IC varieties: about four times more so in speech and about two times in writing. As the ratios noted above show, the dispreference for have got to in writing is far stronger in the IC than it is in the OC varieties. This difference may suggest greater awareness of and sensitivity to traditional proscriptions of the verb get amongst IC speakers, insofar as avoiding the use of have got to in writing might well indicate a desire to avoid attracting the censure of prescriptivists.

7.3 be going to

Be going to is significantly more frequent in the IC varieties than in the OC, a difference most likely related to the frequencies for will, where the positions of the IC and OC groupings are reversed. What this suggests is that the incursion of be going to into the territory of will may be more advanced in the IC than the OC varieties.⁸ The far greater affinity of be going to for the spoken word than the written word in the present study, taken in conjunction with the findings of Leech et al. (2009) reported above for AmE, is certainly suggestive of an item whose popularity is rising. The two main meanings of be going to, epistemic futurity as in (8) and dynamic volition as in (9), set it up for competition with will.

(8) But it’s going to be very difficult for foreign people to accept the rules of China after nineteen ninety-seven like I think in Chini a policeman can just arrest you for no reason at all (ICE-HK S1A-009 652)

(9) Okay we shall start now talk about are you going to watch Annie or not? (ICE-SIN S1A-013 3)

The forays that be going to is making into the semantic territory of will are more selective than across the board. Thus for example epistemic be going to can only be used with relation to future situations, not present or past situations as in the case of “central epistemic” will (see Huddleston & Pullum 2002: 188). Thus, be going to could not be substituted for will in (10):

(10) The final tally comes after the competition draws to a close at the weekend in Victoria Park. The girls will be awarded points for their enthusiasm, routine, smile and presentation. But the bulk of what they know about cheerleading will have been gleaned from watching corn-fed blonds do rah-rah choruses in American movies. (ICE-HK W2B-008 70)

Another difference is that dynamic be going to typically expresses weak intentionality rather than the stronger sense of willingness. Thus in (11) not gonna means ‘doesn’t intend to’. Substitution with won’t would have the effect of strengthening the volitionality to refusal (‘refuses to’):

(11) Yeah I’ve heard that he’s not gonna have any more fancy hairstyle no more tattoos no more earrings and all that. He wants to be a good boy now not a bad boy. (ICE-PHI S1A-001 57)

Finally, be going to often suggests a greater degree of immediacy than will (derivable from the motional sense of the originally progressive construction, the modal idiom suggesting that events leading up to the actualization of a situation are in train), hence the compatibility of ’re going to with just in (12):

(12) But we’re just going to prepare the the tomato basil and the and the and the onion soup right (ICE-PHI S1A-014 2) [I think MH is wrong here: it means ‘right away/right now’ rather than ‘simply’]

8. Obviously the question of diachronic developments in the expression of future time is more complicated than the mere trade-off between will and be going to that is being discussed here. For a recent examination of the development of future time expressions in Late Modern English that considers how the different options are related to each other see Nesselhauf 2010.

7.4 want to

Want to has similar frequencies in the IC and OC corpora, with the familiar pattern again of AmE exhibiting the highest frequency in the IC group, followed by AusE, with NZE and BrE the most conservative, and the results for the OC well-nigh inexplicable in evolutionary terms. The greater affinity of want to for the spoken word than the written word, by an approximate ratio of three to one – taken in conjunction with the findings of Leech et al. (2009) reported above for BrE and AmE – is certainly suggestive of an item whose popularity is rising. Want to expresses, dominantly, the dynamic modal meaning of volition, as in (13):

(13) All people, not only Fijians, want to be recognised and to be identified with their roots. (ICE-FJ W2B-004 61)

Even though the status of want to as a quasi-modal is somewhat controversial, the position adopted here is that it is an emergent member of the category. One piece of evidence for this claim is the morphological incorporation of the infinitival to into a single compound form that is commonly found in speech with want to, and represented orthographically in informal written styles as wanna, as in (14):

(14) Yes I still have the tickets wanna see (ICE-SIN S1A-068 173)

Another type of evidence, as noted by Krug (2000: 147–151), is the emergence of modal senses additional to its dominantly volitional meaning. These meanings are represented in the following examples: deontic modality in (15) (although admittedly somewhat ambivalent between deontic and dynamic modality) and (16) (arguably epistemic):

(15) We cannot accept crime as a fact of life. We must make Malaysia a better place to live in. We want to cut the crime rate in our cities, our streets and or neighbourhood. Failure is not an option. (ICE-MAL W2E-010)

(16) Tough games for Agassi now. He wouldn’t wanna get behind two sets to love against a big serve volleyer like Martin who’s got some good groundies too (ICE-AUS S2A-004 138)

8. Conclusion

The findings of the present study suggest that, of the thirteen Englishes analysed, it is AmE that is most advanced in the rise of the quasi-modals. AmE has a higher total number of tokens than the other varieties, and in fact within the IC it leads in the frequency of every quasi-modal selected except have got to. At the other extreme, within the IC BrE and NZE are the most “conservative” varieties when both speech and writing are considered together. With the single exception of have to, the quasi-modals are more prevalent in the IC than in the OC. Within the OC there was a mild tendency for more advanced varieties (e.g. JamE) to be those with the highest evolutionary status, and for less advanced varieties (e.g. KenE and NigE) to be those with the lowest evolutionary status. The influence of registers was also noted: the quasi-modals are thriving in speech, their modal counterparts holding on in the written word. That there may be a degree of covariation between region and register is suggested by the fact that within the IC it is AmE that both leads the way in the frequency of quasi-modal usage and at the same time evidences the largest gulf between speech and writing in quasi-modal usage. If we can assume that the modals are in the process of decline – albeit mild – then the present study suggests that within the IC it is again AmE, with the lowest frequency of modals, that is the most advanced variety, and that the IC varieties are generally more advanced than the OC. To conclude, we have found some evidence to suggest that the diachronic trends with the English quasi-modal and modal categories may be influenced by the phenomenon of Americanization – although of course it is difficult to prove that the findings for AmE and the other varieties cannot alternatively be interpreted as merely parallel developments – and, although the results are admittedly less clear-cut, the evolutionary status of the New Englishes.

References

Bauer, L. & Bauer, W. 2002. Can we watch regional dialects developing in colonial English? The case of New Zealand. English World-Wide 23: 169–193.
Burridge, K. & Kortmann, B. 2008. Varieties of English: The Pacific and Australasia. Berlin: Mouton de Gruyter.
Coates, J. 1983. The Semantics of the Modal Auxiliaries. London: Croom Helm.
Collins, P. 2005. The modals and quasi-modals of obligation and necessity in Australian English and other Englishes. English World-Wide 26: 249–273.
Collins, P. 2009. Modals and Quasi-modals in English. Amsterdam: Rodopi.
Huddleston, R. & Pullum, G. 2002. The Cambridge Grammar of the English Language. Cambridge: CUP.
Kachru, B. 1985. Standards, codification, and sociolinguistic realism: The English language in the outer circle. In English in the World, R. Quirk & H. Widdowson (eds), 11–30. Cambridge: CUP.
Kortmann, B. & Upton, C. 2008. Varieties of English: The British Isles. Berlin: Mouton de Gruyter.
Krug, M. 2000. Emerging English Modals. A Corpus-based Study of Grammaticalization. Berlin: Mouton de Gruyter.
Leech, G. 2003. Modality on the move: The English modal auxiliaries 1961–1992. In Modality in Contemporary English, R. Facchinetti, M. Krug & F. Palmer (eds), 223–240. Berlin: Mouton de Gruyter.
Leech, G., Hundt, M., Mair, C. & Smith, N. 2009. Change in Contemporary English: A Grammatical Study. Cambridge: CUP.
Lim, L. 2001. Ethnic group varieties of Singapore English: Melody or harmony? In Evolving Identities. The English Language in Singapore and Malaysia, V. Ooi (ed), 53–68. Singapore: Times Academic Press.
Mair, C. & Leech, G. 2006. Current changes in English syntax. In The Handbook of English Linguistics, B. Aarts & A. McMahon (eds), 318–342. Oxford: Blackwell.
Mesthrie, R. 2008. Varieties of English: Africa, South and Southeast Asia. Berlin: Mouton de Gruyter.
Mesthrie, R. & Bhatt, R. 2008. World Englishes: The Study of New Linguistic Varieties. Cambridge: CUP.
Mukherjee, J. & Gries, S. 2009. Collostructural nativisation in New Englishes. English World-Wide 30: 27–51.
Myhill, J. 1995. Change and continuity in the functions of the American English modals. Linguistics 33: 157–211.
Nesselhauf, N. 2010. The development of future time expressions in Late Modern English: Redistribution of forms or change in discourse? English Language and Linguistics 14: 163–186.
Palmer, F. 1990. Modality and the English Modals, 2nd edn. London: Longman.
Palmer, F. 2001. Mood and Modality, 2nd edn. Cambridge: CUP.
Schneider, E. 2007. Postcolonial English: Varieties of English Around the World. Cambridge: CUP.
Schneider, E. 2008. Varieties of English: The Americas and the Caribbean. Berlin: Mouton de Gruyter.
Trudgill, P. 1986. Dialects in Contact. Oxford: Blackwell.
Westney, P. 1995. Modals and Periphrastics in English: An Investigation into the Semantic Correspondence between Certain English Modal Verbs and their Periphrastic Equivalents. Tübingen: Max Niemeyer.

The diverging need (to)’s of Asian Englishes

Johan van der Auwera¹, Dirk Noël² & Astrid De Wit¹
¹University of Antwerp / ²University of Hong Kong

This paper proffers a corpus-based study of the modal auxiliary need and its lexical counterpart need to in four Asian varieties of English, viz. Hong Kong, Singapore, Philippine and Indian English, in comparison to British and American English. We investigate the distribution of need and need to in positive and negative polarity contexts, their recent frequency evolution in these contexts, and the interregional differences. We characterize the changes and the differences in terms of five properties and observe that these properties change differently in different varieties. This calls for a more comprehensive inquiry into the evolution of the frequency of the modals of necessity. The paper also provides methodological justification for drawing diachronic conclusions from a comparison of synchronic written and spoken data.

Keywords: need (to); polarity; Asian Englishes; American and British English

1. Introduction

English has two need verbs, one an auxiliary, the other a lexical verb. The auxiliary is a negative polarity verb: it only occurs in negative contexts, in questions, in conditionals and in a few other non-affirmative contexts, the negative context being the most frequent one. In this paper we will use “negative polarity” as a shorthand term for this collection of contexts. As a negative polarity verb, auxiliary need is negated and questioned without do, the third person present tense takes no -s, and the verb complement is the bare infinitive (see examples (1) to (3)).

(1) a. *He need do that.
    b. *He need to do that.

(2) a. He needn’t do that.
    b. *He doesn’t need do that.
    c. *He needsn’t do that.
    d. *He needn’t to do that.

(3) a. Need he do that?
    b. *Does he need do that?
    c. *Needs he do that?
    d. *Need he to do that?

The lexical verb is neutral to polarity. It uses do in questions and negative contexts, the third person present takes -s, and the verb complement is the to infinitive, as illustrated in (4) to (6).

(4) a. He needs to do that.
    b. *He needs do that.

(5) He doesn’t need to do that.

(6) Does he need to do that?

Also, only the lexical verb accepts (pro)nominal complementation:

(7) a. He doesn’t need coffee.
    b. *He needn’t coffee.

We will henceforth refer to the auxiliary as need, and to the lexical verb as need to. The more versatile of the two verbs is need to, since it can occur in both positive and negative polarity contexts. In other words, everything need does, need to can do too, with no or only a minimal difference in meaning. Given this observation it would seem that English should be able to dispense with need, and indeed, various studies have documented the current decrease in use of need and the increase of need to in both British and American English in the context of a general decline in the frequency of the modals and a rise in the frequency of the so-called semi- or quasi-modals (Hundt 1998: 63–64; Leech 2003; Smith 2003; Taeymans 2004: 223; Mair & Leech 2006; Collins 2009a: 59, 2009b: 285–286, 289; Leech et al. 2009: 73, 94; Millar 2009). Van der Auwera & Taeymans (2009), however, argue that, at least in British English, there is not only a tendency for need to disappear, but also one for need to to become increasingly associated with positive polarity, which, if need does not completely disappear and hangs on to negative polarity, could lead to a division of labour between the two.

Work on need and need to in other varieties of English exists as well: Collins (1978) studies Australian English, Hundt (1998) studies New Zealand English set against the background of British and American English, Lee (2001) deals with Australian as well as Hong Kong English, Tagliamonte & D’Arcy (2007) look at Canadian English (but they only study need to), Collins (2009a) compares Australian English to British and American English (with Collins 2009c adding New Zealand English), and Biewer (2009) focuses on Fiji, Samoa and Cook Islands English, to which she adds Singapore, Philippine and Ghanaian English in Biewer (2011).

In this paper we will focus on the need verbs in four Asian Englishes and compare them to British and American English, varieties also covered by Collins (2009b), who furthermore includes New Zealand and Kenyan English. Like Collins (2009b) we will deal with the English of Hong Kong, Singapore, India and the Philippines on the basis of their ICE corpora, and for British English and American English we will use LOB, FLOB, BROWN and FROWN, and the British ICE corpus. However, our study differs from Collins (2009b) in that we will include data from a second Hong Kong corpus (HKCSE), subject LOB, FLOB, BROWN and FROWN to renewed scrutiny – for Collins (2009b) it suffices to rely on Leech (2003) – and, most importantly, unlike previous research we will focus on issues of polarity.

The following four restrictions are worth mentioning with relation to the research we report on here. First, the use of need is in part a function of the uses of the full array of modal expressions, and while Collins (2009a) goes some way in considering this, we do not. Second, we do not distinguish between the various uses or meanings of the need modals (see Nokkonen 2006 for a recent study). Third, we also do not look at the nature of the transmission of English in the four Asian territories considered, nor at possible linguistic substrate influence (see Lim & Gisborne 2009). Fourth, as to methodology, we refrain from statistics, the reason being that the numbers that we discuss are (in most cases) quite low, yielding a statistical power that does not meet the conventional requirements.
Instead, our aim is to test the hypothesis resulting from the study on British English by van der Auwera & Taeymans (2009) referred to above that need to is increasingly associated with positive polarity, leading to a polarity split between need to and need, in case the latter does not completely evaporate. We will do so, first of all, by comparing British English with American English, since it has been observed with relation to the general decline of the modals and the rise of the quasi-modals that British English “is following in the track of ” American English (Mair & Leech 2006: 327) (Section 2). The focus of this chapter, however, is on the fate of need and need to in a geographically coherent set of “other” Englishes, namely the four Asian Englishes mentioned above, with the aim of testing whether the hypothesized evolution, if it can be taken to hold for British and American English, may be generalized over all Englishes (Section 4). Unlike for British and American English, no monitor corpora are in existence today with which one could trace the frequency evolution of need and need to in these Englishes, but we will argue (in Section 3) that a comparison of data from the written and spoken parts of the ICE corpora provides a good basis for the formulation of hypotheses on diachronic change.

2. Need and need to in 1960s and 1990s British and American English

Table 1 summarizes some of the findings reported by van der Auwera & Taeymans (2009) for British English based on the LOB and FLOB corpora and it adds findings for American English based on the BROWN and FROWN corpora. Each of these corpora contains around one million words of the written register. For each corpus we supply absolute figures as well as rounded off percentages for the uses of both need and need to in positive as well as negative polarity contexts. As explained above, the bulk of the negative polarity contexts consists of negative sentences, such as (2) and (5), but they include other contexts as well, such as interrogative ones, as in (3) and (6), and especially sentences with only, as in (8) below.

(8) To be reminded of this we need only glance at the world map and note the extent to which religious divisions … (BROWN)

Constructions manifesting a blending of the properties of the auxiliary and the lexical verb, such as (9), which shows the do periphrasis of the lexical verb but the bare infinitive of the auxiliary, have been classified on the basis of the infinitive (failing any indication to the contrary, we assume this to have been the practice in the work on the frequency evolution of need versus need to referred to above, which this study wants to link up with). Example (9) is therefore counted as an instance of need.

(9) You don’t need worry, Angelo (BROWN)

Table 1. need and need to with positive and negative polarity in British and American English of the 1960s and the 1990s

Verb     Polarity  BrEng 1960s (LOB)  BrEng 1990s (FLOB)  AmEng 1960s (BROWN)  AmEng 1990s (FROWN)
need     +pol      0 (0%)             0 (0%)              0 (0%)               0 (0%)
need     –pol      74 (100%)          40 (100%)           40 (100%)            37 (100%)
need     total     74 (100%)          40 (100%)           40 (100%)            37 (100%)
need to  +pol      40 (71%)           173 (89%)           47 (69%)             132 (81%)
need to  –pol      16 (29%)           21 (11%)            21 (31%)             30 (19%)
need to  total     56 (100%)          194 (100%)          68 (100%)            162 (100%)

From the figures in Table 1 we can extract the following five observations. First, what is absolutely stable across the two varieties is the negative polarity of need: neither of the two varieties has even a single occurrence of need in a positive polarity context.




Second, if one compares British English of the 1990s to that of the 1960s, one can see a strong decrease of need: the 1960s feature 74 attestations and in the 1990s we only have 40, a reduction by almost half. This is the decline of need already referred to in the introduction and observed previously by Hundt (1998: 63–64), Leech (2003: 229), Smith (2003: 248), Taeymans (2004: 223), Leech et al. (2009: 73, 94) and others.¹ It has been suggested by Collins (2009a: 59) that American English is “leading the way”, and indeed, we see that the total number of instances of need in 1990s British English is the same as the total in 1960s American English: both amount to 40. Note that not much seems to have happened to American English since then: there are 40 instances of need in the 1960s corpus and 37 in the 1990s corpus. This could mean, and this is a point that has not been made yet, that if American English is still leading the way, it could well be that need’s decline in British English has virtually stopped. That is to say, although need has indeed become less frequent in the British variety, the decrease may not be “life-threatening” yet. Of course, British English might also follow a non-American course (see also Smith 2003: 256).

Third, if one compares the frequencies of instances of need to in 1960s and 1990s British English, one can see a strong increase: for the 1960s we counted 56 occurrences and for the 1990s we counted 194, which is an increase by a ratio of 3.46:1. The American materials also show an increase: from 68 instances to 162, which is high too (2.38:1), but not quite as high as the one registered for British English. The point about the increase of need to is one that has been made by earlier scholars as well (in the wake of Hundt 1998: 63–84, by Leech 2003; Smith 2003 & Taeymans 2004: 223), and here, too, it has been suggested that American English is “leading the way”. An additional observation that can be made from our data, however, is that if American English is leading the way for British English, and given the lower rate of increase in American English, it may be the case that the increase for British English will slow down now.

Fourth, the figures show that the increase of need to is not simply a compensation for the decrease of need, in both varieties. In British English need went from 74 occurrences to 40, so it “lost” 34. Need to, however, went from 56 to 194 occurrences and thus “gained” 138, far more than 34. In American English need went from 40 to 37 occurrences, “losing” 3, while need to went from 68 to 162, an increase of 94, again far more than required to make up for the loss of 3 instances of need. Of course, need to could really only compensate for need in negative polarity contexts, and whereas for both British and American English negatively polar need to did indeed increase in the 1960s-to-1990s time span (from 16 to 21 instances for British English, and from 21 to 30 for American English), the bigger increase occurred in positive polarity contexts (from 40 to 173 instances for British English and from 47 to 132 for American English). Table 2 extracts the absolute figures from Table 1 and adds the difference between the “scores” for the 1960s and the 1990s, as well as the ratios.

1. The frequencies reported by Hundt (1998), Leech (2003), Smith (2003) and Leech et al. (2009) for the decrease of need and – next observation – for the increase of need to are also based on the LOB, FLOB, BROWN and FROWN corpora, but they are not identical and, moreover, differ from our findings as well. For example, Hundt (1998: 63) attests 73 occurrences of need in LOB vs. 40 in FLOB, Leech (2003: 228) has 87 vs. 52 attestations, Smith (2003: 248) arrives at 78 and 44, and we settle on 74 and 40. Why this difference should exist may have to do with the status of blends. At least it is reassuring that the tendencies are identical.

Table 2. The increase of need to in British and American English of the 1990s compared to the 1960s

Variety  need to  60s  90s  Difference 60s–90s  Ratio 60s:90s
BrEng    +pol     40   173  +133                1:4.32
BrEng    –pol     16   21   +5                  1:1.31
BrEng    total    56   194  +138                1:3.46
AmEng    +pol     47   132  +85                 1:2.80
AmEng    –pol     21   30   +9                  1:1.42
AmEng    total    68   162  +94                 1:2.38

To conclude, since there is no one-to-one correlation between the increase of need to and the decrease of need, need to must be increasing at the expense of other modal expressions as well, such as must or have to. This point has also been made by Smith (2003: 249), who suspects that need to has gained terrain on must. Fifth, even though the increase of need to happened primarily in positive polarity contexts, negative polarity contexts register an increase too, as we have just pointed out in connection with Table 2. We can also compare the totals for negative polarity and compare the shares of need and need to in both varieties. In British English of the 1960s the total number of negative polarity contexts was 90, of which 74, or 82%, took need and 16 instances, or 18%, need to. In the 1990s, the total number of negative contexts was 61, of which 40 instances, or 66%, featured need and 21 instances, or 34%, took need to. In absolute figures the increase in negative polarity contexts for need to is small (from 16 to 21 instances), but in relative terms (relative to the total number of negative polarity contexts) the share of need to almost doubled (from 18% to 34%). These figures can be calculated from Table 1, but for ease of reference we have listed them in a separate table (Table 3) where we have added the figures for American English as well. The latter equally show an increase in the share of need to in negative polarity contexts.

The diverging need (to)’s of Asian Englishes 



Table 3.  Relative distribution of need and need to in negative polarity contexts in British and American English of the 1960s and the 1990s

                       BrEng 1960s   BrEng 1990s   AmEng 1960s   AmEng 1990s
–pol need              74 (82%)      40 (66%)      40 (66%)      37 (55%)
–pol need to           16 (18%)      21 (34%)      21 (34%)      30 (45%)
–pol need & need to    90 (100%)     61 (100%)     61 (100%)     67 (100%)
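The shares in Table 3 follow directly from the counts. The sketch below, with hypothetical helper names, recomputes the need to share and the change between the two periods in percentage points.

```python
# Negative polarity counts from Table 3: (need, need to).
NEG_POLARITY = {
    "BrEng 1960s": (74, 16),
    "BrEng 1990s": (40, 21),
    "AmEng 1960s": (40, 21),
    "AmEng 1990s": (37, 30),
}


def need_to_share(need, need_to):
    """Percentage of negative polarity contexts taken by need to."""
    return 100 * need_to / (need + need_to)


def share_gain(variety):
    """Percentage-point change in the need to share, 1960s to 1990s."""
    before = need_to_share(*NEG_POLARITY[f"{variety} 1960s"])
    after = need_to_share(*NEG_POLARITY[f"{variety} 1990s"])
    return after - before


for label, counts in NEG_POLARITY.items():
    print(f"{label}: need to share {need_to_share(*counts):.1f}%")
```

On the unrounded shares the British gain is larger than the American one, in line with the comparison drawn in the text.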

Note, however, that the increase in American English is smaller in relative terms than in British English, which – assuming that the former variety is indeed “leading the way” – could mean that the increase is slowing down. Note as well that negative polarity still has need as the first choice, even in American English, although we are almost at the breaking point (55% for need versus 45% for need to).2 Since we will keep returning to these five observations in what follows, we summarily list them in (10) for convenience.

(10) Observations on the use of need and need to in written 1960s and 1990s British and American English
i. in both varieties need is negatively polar;
ii. in British English the use of need has decreased to a point reached by American English earlier, while there is hardly a decrease in American English;
iii. in both varieties the use of need to is increasing, more strongly so in British English;
iv. in both varieties the increase of need to is strong in positive polarity contexts, a little less so in American English;
v. in both varieties need to also increases in negative polarity contexts, yet need remains the more frequent strategy, and in American English the percentage increase is smaller.

2.  Sadly, Millar (2009: 197) chose not to include need in his study of the evolution of the modals and semi-modals in American English using Mark Davies’s Time Magazine Corpus (http://corpus.byu.edu/time/) because “it was found to be extremely infrequent” in the corpus. The study does record a dramatic 66% increase in the frequency of need to between the last decade of the previous and the first decade of the present century, however, which could mean that need to continued to take over the negative polarity role of need. Leech (2010) has observed an ongoing steep increase in the frequency of need to in British English during the same period, as well as a sharp drop in the frequency of need (through a comparison of FLOB data with data from Paul Baker’s BE06 Corpus, containing texts sampled between 2003 and 2008). This may suggest that need is completely on the way out, both in American and in British English. These evolutions postdate the 1990s data we are comparing in this contribution, however, and we will therefore not include them in the observations listed below that will form the basis of comparison with the Asian Englishes data.

In van der Auwera & Taeymans (2009) it is claimed on the basis of British English data only, but going back all the way to Old English, that the competition between need and need to may be resolved in two ways. Either need to will take over completely from need or, given that the enormous increase of positively polar need to is only minimally related to the decline of need, need to will become increasingly positive and leave need for negative polarity. Figures 1 and 2, taken from van der Auwera & Taeymans (2009: 324) visualize both tendencies, at least to some extent. Figure 1 shows how need to was once predominantly used in positive polarity contexts, then in negatively polar contexts only, and in its third life it is progressing towards preferring positive polarity again. Not shown in the figure is that need did not change: it started in negative polarity contexts and remained there. If need does not give up and need to becomes positively polar, the result is a division of labour. Figure 2 shows the second tendency, that of the frequency drop of need and the prospect of its extinction.

The American English data we add in this study are compatible with the double tendency hypothesis and they are particularly relevant given the assumption that American English is the more progressive variant paving the way for British English. On the one hand, need to might indeed take over completely from need: Table 3 shows that in American English need to is threatening the majority position of need for negative polarity. On the other hand, we see that the decline of need is not progressing at the same speed as it does in British English and may even be stabilizing and that the percentage increase of need to is lower than in British English.

In the next, methodological, section we prepare for a study of the four Asian Englishes. In doing so, however, we first have to turn to British English once more.

[Figure 1.  need to in positively and negatively polar contexts in British English. Horizontal axis: periods; vertical axis: percentage (0%–100%); series: negative polarity, positive polarity.]

[Figure 2.  Frequency of need and need to in British English: An overview. Horizontal axis: periods; vertical axis: percentage (0%–100%); series: need to, need.]

[lME = late Middle English (1350–1500), eMod1 = early Modern English 1 (1500–1570), eMod2 = early Modern English 2 (1570–1640), eMod3 = early Modern English 3 (1640–1710), lMod1 = late Modern English 1 (1710–1780), lMod2 = late Modern English 2 (1780–1850), lMod3 = late Modern English 3 (1850–1920), lPd1 = late Present-day English 1 (1960s), lPd2 = late Present-day English 2 (1990s)]

3.  A methodological preliminary: British English once more

For the Asian Englishes of Hong Kong, India, the Philippines and Singapore, we will use the ICE corpora. Like the corpora of the LOB family, they contain about one million words, but unlike them, they document both written and spoken language (at a 40:60 ratio), and for each variety only one corpus is available, recording the usage of the 1990s or later. There is thus no direct way of using the ICE corpora for investigating recent changes within these four Englishes. However, the very fact that the ICE corpora include both written and spoken materials can be used to our advantage, given the assumption that spoken language is ceteris paribus more progressive than written language. Previous research on modality in World Englishes (Collins 2009a, 2009b) has drawn diachronic conclusions from the comparison of synchronic written and spoken data, without, however, testing the validity of the underlying assumption with relation to the matter at hand. In this section, we will do exactly that. We will first turn to the ICE corpus of British English and see whether the assumption holds true with respect to the “battle between need and need to”. In particular, we expect that the tendencies we witness when comparing LOB and FLOB will reappear in the comparison between ICE-GBw[written] and ICE-GBs[spoken]. Table 4 shows the uses of need and need to in the written and spoken parts of ICE-GB. Since the two parts are not equal in size, we have normalized the figures to a frequency per million words, and we have rounded them off to the nearest integer (the norm was chosen to allow comparison with the earlier figures).

Table 4.  Frequency per million words of need and need to with positive and negative polarity in ICE-GBw and ICE-GBs

                 ICE-GBw       ICE-GBs
need     +pol      0 (0%)        0 (0%)
         –pol     52 (100%)     23 (100%)
         total    52 (100%)     23 (100%)
need to  +pol    127 (71%)     176 (89%)
         –pol     30 (29%)      51 (11%)
         total   157 (100%)    227 (100%)
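Normalization of the kind applied in Table 4 can be sketched as follows. The chapter reports only normalized frequencies, so the raw hit counts below are hypothetical, and the subcorpus sizes are an assumption derived from the stated 40:60 split of roughly one million words.

```python
def per_million(raw_count, corpus_words):
    """Normalize a raw count to a frequency per million words (nearest integer)."""
    if corpus_words <= 0:
        raise ValueError("corpus_words must be positive")
    return round(raw_count * 1_000_000 / corpus_words)


# Assumed subcorpus sizes (approximate 40:60 written:spoken split).
WRITTEN_WORDS = 400_000
SPOKEN_WORDS = 600_000

# Hypothetical raw counts, for illustration only.
hypothetical_written_hits = 20
hypothetical_spoken_hits = 15

print(per_million(hypothetical_written_hits, WRITTEN_WORDS))
print(per_million(hypothetical_spoken_hits, SPOKEN_WORDS))
```

Python's built-in round resolves exact halves to the nearest even integer, which may differ from the rounding convention the authors used.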

Let us consider these figures from the perspective of the five observations listed in (10). First, the negative polarity of need has proved very stable in the earlier corpora. The same is true here: positive polarity need occurs in neither the written nor the spoken part of the corpus. Second, since we expect need to decrease in the more progressive register, its frequency in ICE-GBs should be lower than in ICE-GBw. This is the case: the figure for ICE-GBs (23) is less than half the one for ICE-GBw (52). Third, we expect need to to increase in the more progressive register. This expectation is borne out as well: ICE-GBw has 157 occurrences per million words of lexical need and ICE-GBs has 227. Fourth, we expect need to to continue growing in positive polarity contexts, to the extent that the biggest number of attestations with which ICE-GBs exceeds ICE-GBw should occur in positive polarity contexts.

In absolute terms, this is true: as shown in Table 5, of the 70 normalized instances that make up the difference between 227 (ICE-GBs) and 157 (ICE-GBw), 49 are positively polar. Note, however, that the ratio of increase is higher in negative polarity contexts (1:1.70) than in positive polarity contexts (1:1.38), which is not expected.

Table 5.  The distribution of need to in ICE-GBs as compared to ICE-GBw (frequency per million words)

need to   ICE-GBw   ICE-GBs   Difference ICE-GBw – ICE-GBs   Ratio ICE-GBw:ICE-GBs
+pol      127       176       +49                            1:1.38
–pol       30        51       +21                            1:1.70
total     157       227       +70                            1:1.44

This immediately takes us to the fifth point: need to encroaches upon need in the negative polarity domain. As shown in Table 6, while the 30 (normalized) occurrences of negative polarity need to in ICE-GBw amount to a share of 37% of the joint total of negative polarity need and need to instances in this subcorpus, the 51 instances in ICE-GBs represent a share of 69%. In other words, the share of need to in negative polarity contexts in the more progressive register is not far from double that in the more conservative one, or, conversely, the share of need in these contexts in the more progressive register is less than half of its share in the more conservative one.

Table 6.  Relative distribution of need and need to in negative polarity contexts in ICE-GBw and ICE-GBs (frequency per million words)

                       ICE-GBw      ICE-GBs
–pol need              52 (63%)     23 (31%)
–pol need to           30 (37%)     51 (69%)
–pol need & need to    82 (100%)    74 (100%)

We can thus conclude that a comparison of the written and spoken materials in ICE-GB indeed reveals most of the tendencies earlier discovered by comparing two diachronically different written corpora, which provides justification for using the same technique for the discovery of evolutionary tendencies in the Asian ICE corpora.3

3.  An anonymous reviewer has remarked that not all ingredients of the spoken part of the ICE corpora are good examples of spoken language, since they also include news readings, scripted speeches, etc. However, this fact does not alter the observation that the findings of a comparison of the written and spoken ICE-GB data mirror those of a comparison of LOB and FLOB data.

 Johan van der Auwera, Dirk Noël & Astrid De Wit

A word of caution is necessary, though. The comparison of the two ICE-GB subcorpora does not quite display all of the tendencies discussed earlier. This is particularly true for some of the tendencies revealed by a comparison of the differences between the British LOB and FLOB corpora with the differences between the American BROWN and FROWN ones. The changes noted between BROWN and FROWN, both written corpora, were assumed to augur further (i.e. post-1990s) changes in written British English.4 Since the differences between ICE-GBs and ICE-GBw are also assumed to be indicative of future tendencies in written British English, we thus have two partial indicators.

Let us, once again, return to the five observations in (10) and see to what extent these two indicators point in the same direction. First, both indicators demonstrate the strong stability of the negative polarity of need. Second, the BROWN–FROWN comparison gives us reason to think that the decrease of need may come to a halt, with the drop from 40 instances in BROWN to 37 in FROWN being minimal.5 The difference between ICE-GBw and ICE-GBs, however, is of a different order, with ICE-GBw having 52 and ICE-GBs 23 (normalized) instances. In other words, the ICE-GB indicator does not confirm the prediction based on the BROWN–FROWN difference here. Third, the BROWN–FROWN comparison suggests that the increase of need to may be slowing down: between BROWN and FROWN the increase happened with a ratio of 1:2.38, whereas the increase between LOB and FLOB happened with a ratio of 1:3.46. The difference between ICE-GBw and ICE-GBs indicates a modest 1:1.44 ratio of change, which is in agreement with the BROWN–FROWN indicator.

Fourth, the BROWN–FROWN comparison showed that the increase of need to vastly surpasses the decrease of need in absolute terms: need to goes from 68 to 162 instances, a difference of 94, which is much more than the figure for the decrease of need, which is 3, the difference between 40 and 37 instances. Here the figures for ICE-GB again offer a different picture. The difference between spoken need to, which has 51 (normalized) occurrences, and written need to, with 30 occurrences, is 21. The difference between written need, with 52 occurrences, and spoken need, with 23, is 29. Clearly, 29 is much closer to 21 than 94 is to 3 and, as shown in Table 7, whereas the rate of the increase of need to suggested by the BROWN–FROWN indicator is higher than the rate of the decrease of need, the rate of the increase of need to suggested by the ICE-GB indicator is lower than the rate of the decrease of need. In other words, unlike the BROWN–FROWN indicator, the ICE-GB one could suggest that the rise of need to is essentially a (partial) compensation for the fall of need.

Table 7.  The frequency difference between need to and need in BROWN and FROWN compared to ICE-GBw and ICE-GBs (frequencies per million words)

            AmEng                                   BrEng
            BROWN   FROWN   Difference   Ratio      ICE-GBw   ICE-GBs   Difference   Ratio
need to     68      162     +94          1:2.38     30        51        +21          1:1.70
need        40       37     −3           1.08:1     52        23        −29          2.26:1

4.  This assumption has been confirmed in the research by Leech (2010) referred to in Note 2.
5.  But see Note 2.
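The contrast between the two indicators in Table 7 comes down to comparing a rate of increase with a rate of decrease. A minimal sketch, using the table's figures and hypothetical function names:

```python
# Frequencies from Table 7 as (first, second) pairs:
# (BROWN, FROWN) for AmEng and (ICE-GBw, ICE-GBs) for BrEng.
TABLE_7 = {
    "BROWN-FROWN": {"need to": (68, 162), "need": (40, 37)},
    "ICE-GB": {"need to": (30, 51), "need": (52, 23)},
}


def rise_factor(counts):
    """Growth factor of a form that becomes more frequent (second / first)."""
    first, second = counts
    return second / first


def fall_factor(counts):
    """Shrink factor of a form that becomes less frequent (first / second)."""
    first, second = counts
    return first / second


def rise_outpaces_fall(indicator):
    """True if need to rises at a higher rate than need falls."""
    row = TABLE_7[indicator]
    return rise_factor(row["need to"]) > fall_factor(row["need"])


for name in TABLE_7:
    print(name, rise_outpaces_fall(name))
```

A result of True for BROWN-FROWN and False for ICE-GB corresponds to the reading that only the ICE-GB figures are compatible with need to merely compensating for the loss of need.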

Fifth, the BROWN–FROWN comparison showed that the increase in the share of need to in negative polarity contexts may be slowing down: from LOB to FLOB the share of negatively polar need to increased by 16 percentage points (from 18% to 34%, see Table 3), whereas from BROWN to FROWN the increase was only 11 percentage points (from 34% to 45%, see Table 3). The relevant figures for ICE-GB can be found in Table 6. They suggest that the increase in the share of need to in negative polarity contexts does not slow down, since there is a difference of 32 percentage points (between 37% and 69%). The latter data in effect suggest that need to will dethrone need as the first choice for negative polarity. This metacomparison warns us to exercise caution: the two predictors sketch similar scenarios, but they are not fully identical. It is not clear which predictor should be trusted more. Basically, not every feature of spoken British English might find its way into written British English and not every feature of written American English might find its way into written British English either. On the positive side, the two predictors need not be taken as contradictory. It is possible, for instance, that written British English will initially experience a slowdown in the demise of need, like written American English did, but will eventually kick out need altogether, on the road laid out by spoken British English.6

6.  The latter half of this hypothesis has been confirmed by Leech (2010). The first half is awaiting further research.

Let us come to the methodological side of the exercise. We brought in the comparison of the written and the spoken parts of ICE-GB to find out whether this comparison has diachronic relevance. The answer proved positive: comparisons of spoken and written ICE data do indeed have indirect diachronic relevance. However, they do not form a substitute for the comparison of data sets of different periods. Unfortunately, for the Asian Englishes, we will have to content ourselves with the indirect data, for the time being. Though it might very well be that there is less difference between spoken and written language in some of these outer-circle varieties, there is no reason to assume the spoken register to be less progressive than the written one in any of them.7 Of course, independent of the value of the study of the spoken for detecting tendencies in the written, the study of the spoken is interesting in its own right, as is the study of the written.

4.  The needs of four Asian Englishes

Table 8 is a summary of the Asian English data (which can be put next to Table 1) with reference to which we will review the five observations listed in (10). As with ICE-GB, the figures in this table are normalized to a frequency per million words, and rounded off to the nearest integer.

Table 8.  Frequency per million words of need and need to with positive and negative polarity in written and spoken ICE-HK, ICE-IND, ICE-PHI and ICE-SIN

                 ICE-HKw      ICE-HKs      ICE-INDw     ICE-INDs
need     +pol      2 (7%)      13 (65%)      0 (0%)       2 (4%)
         –pol     27 (93%)      7 (35%)      7 (100%)    51 (96%)
         total    29 (100%)    20 (100%)     7 (100%)    53 (100%)
need to  +pol    332 (85%)    475 (78%)     72 (91%)    126 (93%)
         –pol     57 (15%)    133 (22%)      7 (9%)      10 (7%)
         total   389 (100%)   608 (100%)    79 (100%)   136 (100%)

                 ICE-PHIw     ICE-PHIs     ICE-SINw     ICE-SINs
need     +pol      0 (0%)       0 (0%)       5 (8%)       0 (0%)
         –pol     25 (100%)    18 (100%)    57 (92%)     38 (100%)
         total    25 (100%)    18 (100%)    62 (100%)    38 (100%)
need to  +pol    177 (84%)     82 (73%)    470 (75%)    156 (78%)
         –pol     42 (16%)     30 (27%)    155 (25%)     43 (22%)
         total   219 (100%)   112 (100%)   625 (100%)   199 (100%)

7.  Xiao (2009: 443–444) has shown spoken Indian English to display fewer typical features of “interactive casual discourse” compared to British English and the other Asian varieties considered in this paper, and spoken Hong Kong, Singapore and Philippine English to be somewhere in between British and Indian English on what, for convenience, we could call a formality scale. The study does not show the written register to be less formal than the spoken one in any of these Asian varieties, however.




According to the first observation, need is solidly negatively polar in British and American English. This is less clearly the case in the Asian Englishes. Of the eight relevant cells that should have a zero, half of them have something else; i.e. need can be used in positive polarity contexts in some of these Englishes. Of course, a frequency of 2 instances per million words is not to be taken too seriously, but the 5 instances in written Singapore English and the 13 in spoken Hong Kong English may be meaningful. Further evidence that positively polar need is not uncommon in the latter variety, for instance, is that we also found 4 instances of it in the close to one-million-word Hong Kong Corpus of Spoken English (HKCSE). Typical of spoken Hong Kong English as well are blends like A gentleman needs not keep his word (ICE-HKs) and I think every every [sic] business needs have research (HKCSE), i.e. instances that combine morphosyntactic features of need and need to, which are (or used to be) warned against in local normative grammars like Bunton’s (1989) Common English Errors in Hong Kong. Such blends, as well as the fact that positively polar need regularly occurs only in spoken Hong Kong English, might be related to the stage Hong Kong English has reached in its development as a New English, i.e. the “nativization” stage in Schneider’s (2003, 2007) developmental model. Similarly, the occurrence of positive polarity need in written Singapore English could be related to the fact that Singapore English has developed one stage further to the “endonormative stabilization” stage. In what way has it progressed towards “endonormative stabilization”? Of course, Hong Kong English and Singapore English have a common prevailing substrate language, viz. Chinese, either predominantly Cantonese or Mandarin/Putonghua (it goes beyond our expertise, though, to evaluate whether this could account for this particular kind of morphosyntactic patterning, let alone how), but Hong Kong English is less developed as a New English than Singapore English (Schneider 2003, 2007). Singaporeans may have made positively polar need “their own” at some stage, hence its attestation in written texts (endonormative stabilization) – but then why does it not occur in the spoken data? Hongkongers, on the other hand, may still be struggling with the “Inner Circle” (Kachru 1985) norm that forbids need in positive polarity contexts, so that positive polarity need is (still?) kept out of the written language (nativization). Likewise, the blended forms have not (yet?) become part of the written language, i.e. they have not become the norm. As to the remaining two Asian Englishes considered here, it may already have been settled somehow that Indian English and Philippine English will not “go native” in respect of the polarity of need in that they will stick to the inner circle norm of restricting need to negative polarity contexts.

In line with the second observation, we expect that need is less used in the spoken register. This holds for three of the four varieties, not, however, for Indian English (with as many as 53 normalized instances of need in the spoken part and only 7 in the written register). This is bizarre if we take the use of need to be a conservative feature, since this would situate a conservative choice in the more progressive register. However, the case of Indian English may alert us to the danger of treating certain linguistic choices as inherently conservative (or inherently progressive, for that matter): one variety’s conservatisms may be another’s progressivisms.

Third, one would expect more need to in the spoken register. This holds true for two of the four varieties, but not for Philippine and Singapore English (with spoken 112 vs. written 219 for the Philippines, and spoken 199 vs. written 625 for Singapore). Interestingly, Indian English, which, “conservatively”, uses more need in the spoken variety, now appears “progressive” in also using more need to in the spoken variety. This nicely underscores two points made earlier, viz. that (i) the frequency of need to need not have an impact on the frequency of need, and that (ii) a comprehensive study will have to look at the relative frequencies of all necessity modals.

Fourth, in the two varieties that have more need to in the spoken register (Hong Kong and Indian English), one would expect the dominance of need to to be greatest for positive polarity. This is the case in Indian English which has 136 (normalized) occurrences of need to in its spoken register and only 79 in its written one, and, as shown in Table 9, the ratio of the difference between written and spoken is higher in positive polarity contexts. In Hong Kong English, on the other hand, the ratio of the difference between the spoken and the written variety is the highest in the case of negative polarity contexts, even though, as far as the absolute frequencies are concerned, the bigger increase has been attested in positive polarity contexts.

Table 9.  Ratios of the differences in the frequency of need to between the written and spoken registers in ICE-HK and ICE-IND (frequencies per million words)

need to   ICE-HK                                        ICE-IND
          w     s     Difference w-s   Ratio w:s        w     s     Difference w-s   Ratio w:s
+pol      332   475   +143             1:1.43           72    126   +54              1:1.75
–pol       57   133   +76              1:2.33            7     10   +3               1:1.42
total     389   608   +219             1:1.56           79    136   +57              1:1.72
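The fourth observation turns on whether the increase of need to is larger in positive or in negative polarity contexts, and the answer can depend on whether growth ratios or absolute gains are compared. The sketch below applies both measures to the figures of Tables 2, 5 and 9; the dataset labels and function names are illustrative assumptions.

```python
# need to frequencies as (first, second) pairs: (1960s, 1990s) for the
# LOB/FLOB and BROWN/FROWN comparisons, (written, spoken) for the ICE corpora.
NEED_TO = {
    "LOB-FLOB": {"+pol": (40, 173), "-pol": (16, 21)},
    "BROWN-FROWN": {"+pol": (47, 132), "-pol": (21, 30)},
    "ICE-GB": {"+pol": (127, 176), "-pol": (30, 51)},
    "ICE-HK": {"+pol": (332, 475), "-pol": (57, 133)},
    "ICE-IND": {"+pol": (72, 126), "-pol": (7, 10)},
}


def growth_ratio(pair):
    """Second figure divided by the first."""
    first, second = pair
    return second / first


def absolute_gain(pair):
    """Second figure minus the first."""
    first, second = pair
    return second - first


def leading_context(dataset, measure=growth_ratio):
    """Polarity context with the larger increase under the given measure."""
    rows = NEED_TO[dataset]
    return max(rows, key=lambda polarity: measure(rows[polarity]))


for name in NEED_TO:
    print(name, leading_context(name), leading_context(name, absolute_gain))
```

For ICE-HK and ICE-GB the two measures disagree, which is the pattern the text describes: the larger absolute gain is in positive polarity contexts, but the higher ratio is in negative polarity contexts.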

Fifth, we expect that in negative polarity contexts the share of need to relative to need would be bigger in the spoken than in the written register. As shown in Table 10, this is confirmed only for Hong Kong English: the percentage for negative polarity need to as compared to negative polarity need is 68% for the written register and 95% for the spoken. In the three other varieties the spoken percentages for negative polarity need to are lower than the written percentages.

Table 10.  Relative distribution of need and need to in negative polarity contexts in the written and spoken registers of ICE-HK, ICE-IND, ICE-PHI and ICE-SIN (frequencies per million words)

                       ICE-HKw       ICE-HKs       ICE-INDw      ICE-INDs
–pol need              27 (32%)       7 (5%)        7 (50%)      51 (84%)
–pol need to           57 (68%)     133 (95%)       7 (50%)      10 (16%)
–pol need & need to    84 (100%)    140 (100%)     14 (100%)     61 (100%)

                       ICE-PHIw      ICE-PHIs      ICE-SINw      ICE-SINs
–pol need              25 (37%)      18 (38%)      57 (27%)      38 (47%)
–pol need to           42 (63%)      30 (62%)     155 (73%)      43 (53%)
–pol need & need to    67 (100%)     48 (100%)    212 (100%)     81 (100%)
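Which varieties confirm the fifth expectation can be checked mechanically from the Table 10 frequencies. The following is a minimal sketch with hypothetical names; it compares unrounded shares rather than the rounded percentages printed in the table.

```python
# Negative polarity frequencies per million words from Table 10:
# variety -> register ("w" written, "s" spoken) -> (need, need to).
TABLE_10 = {
    "HK": {"w": (27, 57), "s": (7, 133)},
    "IND": {"w": (7, 7), "s": (51, 10)},
    "PHI": {"w": (25, 42), "s": (18, 30)},
    "SIN": {"w": (57, 155), "s": (38, 43)},
}


def need_to_share(need, need_to):
    """Proportion of negative polarity contexts taken by need to."""
    return need_to / (need + need_to)


def spoken_share_higher(variety):
    """True if the need to share is higher in speech than in writing."""
    registers = TABLE_10[variety]
    return need_to_share(*registers["s"]) > need_to_share(*registers["w"])


confirming = [variety for variety in TABLE_10 if spoken_share_higher(variety)]
print(confirming)
```

Only the Hong Kong figures pass the check; the Philippine shares are almost level, so that result is sensitive to small changes in the underlying counts.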

5.  Conclusion

In view of our tentative assumption that spoken language is more progressive than written language and that, consequently, a comparison of frequencies in the written and the spoken registers can give an indication of grammatical change, we can summarize the hypothesized changes in the use of need and need to (or the lack of them) in the four Asian Englishes we have looked at as follows, again with reference to the observations listed in (10). First, as in British and American English, need remains negatively polar in Indian and Philippine English. It may not stay that way in Hong Kong English and may not have been that way, but could return to that state, in Singapore English. Second, as in British but not really in American English (when comparing 1960s with 1990s data), the frequency of need “decreases” in Hong Kong, Philippine and Singapore English, but not in Indian English, where it “increases” (the scare quotes indicating, once again, that these are hypotheses about changes based on synchronic data). Third, as in British English and, to a lesser extent, in American English, the frequency of need to “rises” in Indian and Hong Kong English, but not in Philippine and Singapore English, where it “drops” drastically. Fourth, as in British and American English, the frequency of need to “rises” especially in positive polarity contexts in Indian English. In the other variety where need to “gains” in frequency, Hong Kong English, it does so considerably in both positive and negative polarity contexts, but it “rises” at a sharper rate in the latter. Fifth, as in British and American English, the share of need to in negative polarity contexts “increases” in Hong Kong English, but not in Indian, Philippine and Singapore English. All of this leads to the summary given in Table 11.

Table 11.  Summary

                                                                          Br   Am   Ind   HK   Phil   Sin
need stays negatively polar                                               +    +    +     –    +      +
the frequency of need decreases                                           +    ±    –     +    +      +
the frequency of need to increases                                        +    +    +     +    –      –
the frequency of need to especially rises in positive polarity contexts   +    +    +     ±    –      –
the share of need to in negative polarity contexts increases              +    +    –     +    –      –
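Two rows of the summary, those on the frequency of need and of need to, can be derived from the totals in Table 8 by comparing the written and spoken figures. The sketch below is a simplification under stated assumptions: it uses a binary mark, so a graded judgement such as "±" is not reproduced, and the names are hypothetical.

```python
# Total frequencies per million words from Table 8:
# variety -> form -> (written, spoken).
TABLE_8_TOTALS = {
    "Ind": {"need": (7, 53), "need to": (79, 136)},
    "HK": {"need": (29, 20), "need to": (389, 608)},
    "Phil": {"need": (25, 18), "need to": (219, 112)},
    "Sin": {"need": (62, 38), "need to": (625, 199)},
}


def mark(condition):
    """Render a boolean as a plus or minus sign."""
    return "+" if condition else "-"


def summary_rows():
    """Derive the two frequency rows of the summary for the Asian varieties."""
    rows = {"need decreases": {}, "need to increases": {}}
    for variety, forms in TABLE_8_TOTALS.items():
        need_written, need_spoken = forms["need"]
        need_to_written, need_to_spoken = forms["need to"]
        rows["need decreases"][variety] = mark(need_spoken < need_written)
        rows["need to increases"][variety] = mark(need_to_spoken > need_to_written)
    return rows


for label, cells in summary_rows().items():
    print(label, cells)
```

As in the text, "decreases" and "increases" here stand for lower or higher frequencies in the spoken register, taken as a proxy for change.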

It is obvious from Table 11 that the notable unity between British and American English in respect to the phenomena we have considered is not extended to the Asian varieties, though there are two varieties (Indian and Hong Kong English) that resemble the inner circle varieties more closely. But they do so in diverging ways. The two other Asian varieties (Philippine and Singapore English), on the other hand, display unity in their considerable divergence from the two inner circle varieties. However, it is too soon, in many ways, to tell whether there is any system in the unity and diversity of the “changes” observed in the Asian varieties. Naturally, although we have argued for the validity of a diachronic interpretation of stylistic differences, these changes must remain hypothetical rather than proven, not just because of the non-diachronic nature of the data, but also in view of the relatively low frequencies on which our observations are based. Moreover, even though we have flirted with Schneiderian terms like “nativization” and “endonormative stabilization” in our discussion of the results, a study of a single modal and a single cognate quasi-modal provides an insufficient basis for explanations of why certain changes are occurring in one variety but not in another.

However, one thing we can conclude with some certainty is a theoretical point regarding frequency changes in the use of the modal and the cognate quasi-modal considered here, namely that the very fact that it is difficult to detect a clear pattern in the hypothesized changes in the use of need and need to across the four Asian varieties we have examined demonstrates that these changes are not necessarily connected. The fact that there are different configurations of changes affecting need and need to in the Asian varieties underscores what we already observed on the basis of the British and American data: one change does not necessarily effectuate another. The New Englishes data have confirmed, for instance, that a decline in the frequency of need does not necessarily lead to a rise of the frequency of need to (which is most obviously shown in Philippine and Singapore English), nor to a higher frequency of need to in negative polarity contexts (most obvious in Indian English). Our contribution therefore calls attention to a point of methodological importance for diachronic modality research, i.e. the necessity to consider the division of semantic labour between the available modal expressions. Rather than merely affecting the behaviour of a cognate verb, the changes we have observed in this paper probably need to be considered as changes in the way sets of different modals and quasi-modals partition certain areas of “modality’s semantic map” (van der Auwera & Plungian 1998), which are likely to have diverged in different Englishes. The larger picture would make it more straightforward to detect the system behind such changes within individual varieties and sets of varieties, which, if any, could then much more easily be accounted for, either with reference to the explanatory factors we have hinted at, i.e. the developmental stage of the variety and linguistic substrate influence, or with reference to other factors, such as learner strategies and certain universal tendencies.

Acknowledgements

This paper was presented at the weekly seminar of the School of English of the University of Hong Kong (October 2009), and at the 15th conference of the International Association of World Englishes in Cebu City, Philippines (October 2009). We are thankful for the comments received from our audiences on these two occasions, and for the feedback from the editors of the volume. Part of the analysis and data derive from a University of Antwerp Master Seminar in the Spring of 2008 and from Taeymans (2004, 2006). The work was made possible through the financial assistance of the Belgian Federal Science Ministry (within the programme of inter-university attraction poles, Grant P6/44), the University of Hong Kong Seed Funding Programme for Basic Research (contract no. 200911159051) and the Research Foundation Flanders (doctoral grant De Wit).

References Biewer, C. 2009. Modals and semi-modals of obligation and necessity in South Pacific Englishes. Anglistik 20: 41–55. Biewer, C. 2011. Modal auxiliaries in second language varieties of English: A learner’s perspective. In Exploring Second-Language Varieties of English and Learner Englishes: Bridging a Paradigm Gap [Studies in Corpus Linguistics 44], J. Mukherjee & M. Hundt (eds), 7–34. Amsterdam: John Benjamins. Bunton, E. 1989. Common English Errors in Hong Kong. Hong Kong: Longman.

Collins, P. 1978. Dare and need in Australian English: A study of divided usage. English Studies 59: 434–441.
Collins, P. 2009a. Modals and Quasi-Modals in English. Amsterdam: Rodopi.
Collins, P. 2009b. Modals and quasi-modals in world Englishes. World Englishes 28: 281–292.
Collins, P. 2009c. Modals and quasi-modals. In Comparative Studies in Australian and New Zealand English. Grammar and Beyond [Varieties of English around the World G39], P. Peters, P. Collins & A. Smith (eds), 73–87. Amsterdam: John Benjamins.
Hundt, M. 1998. New Zealand Grammar. Fact or Fiction? A Corpus-Based Study in Morphosyntactic Variation [Varieties of English around the World G23]. Amsterdam: John Benjamins.
Kachru, B. 1985. Standards, codification, and sociolinguistic realism: The English language in the outer circle. In English in the World, R. Quirk & H. Widdowson (eds), 11–30. Cambridge: CUP.
Lee, J. 2001. Functions of need in Australian and Hong Kong English. World Englishes 20: 133–143.
Leech, G. 2003. Modality on the move: The English modal auxiliaries 1961–1992. In Modality in Contemporary English, R. Facchinetti, M. Krug & F. Palmer (eds), 223–240. Berlin: Mouton de Gruyter.
Leech, G. 2010. Where have all the modals gone: On the recent loss of frequency of English modal auxiliaries. Plenary lecture delivered at the Fourth International Conference on Modality in English, Universidad Complutense Madrid, 9–11 September 2010.
Leech, G., Hundt, M., Mair, C. & Smith, N. 2009. Change in Contemporary English: A Grammatical Study. Cambridge: CUP.
Lim, L. & Gisborne, N. 2009. The typology of Asian Englishes. Setting the agenda. English World-Wide 30: 123–132.
Mair, C. & Leech, G. 2006. Current changes in English syntax. In Handbook of English Linguistics, B. Aarts & A. McMahon (eds), 318–342. Oxford: Blackwell.
Millar, N. 2009. Modal verbs in TIME. International Journal of Corpus Linguistics 14: 191–220.
Nokkonen, S. 2006. The semantic variation of NEED TO in four recent British English corpora. International Journal of Corpus Linguistics 11: 29–71.
Schneider, E. 2003. The dynamics of New Englishes: From identity construction to dialect birth. Language 79: 233–281.
Schneider, E. 2007. Postcolonial English: Varieties around the World. Cambridge: CUP.
Smith, N. 2003. Changes in the modals and semi-modals of strong obligation and epistemic necessity in recent British English. In Modality in Contemporary English, R. Facchinetti, M. Krug & F. Palmer (eds), 241–267. Berlin: Mouton de Gruyter.
Taeymans, M. 2004. DARE and NEED in British and American Present-day English: 1960s–1990s. In New Perspectives on English Historical Linguistics, Vol. 1: Syntax and Morphology [Current Issues in Linguistic Theory 251], C. Kay, S. Horobin & J. Smith (eds), 215–228. Amsterdam: John Benjamins.
Taeymans, M. 2006. An Investigation into the Emergence and Development of the Verb Need from Old to Present-Day English: A Corpus-Based Approach. Ph.D. dissertation, University of Antwerp.
Tagliamonte, S. & D’Arcy, A. 2007. The modals of obligation/necessity in Canadian perspective. English World-Wide 28: 47–87.
van der Auwera, J. & Plungian, V. 1998. Modality’s semantic map. Linguistic Typology 2: 79–124.




van der Auwera, J. & Taeymans, M. 2009. The need modals and their polarity. In Corpora and Discourse – and Stuff. Papers in Honour of Karin Aijmer, R. Bowen, M. Mobärg & S. Ohlander (eds), 317–326. Gothenburg: University of Gothenburg.
Xiao, R. 2009. Multidimensional analysis and the study of world Englishes. World Englishes 28: 421–450.

Corpora
BROWN = Kucera, H. and W. N. Francis, compilers (1961) Brown University Standard Corpus of American English. Brown University, Providence RI.
FLOB = Mair, C., compiler (1997) Freiburg/LOB Corpus of British English. University of Freiburg, Freiburg.
FROWN = Mair, C., compiler (1999) Freiburg – Brown Corpus of American English. University of Freiburg, Freiburg.
HKCSP = Hong Kong Corpus of Spoken English. Research Centre for Professional Communication in English of the Hong Kong Polytechnic University, Hong Kong.
ICE = International Corpus of English, http://ice-corpora.net/ice/index.htm
LOB = Leech, G., Johansson S. and Hofland K., compilers (1978) Lancaster/Oslo-Bergen Corpus. Norwegian Computing Centre for the Humanities, Bergen.

Will and would in selected New Englishes
General and variety-specific tendencies

Dagmar Deuber¹, Carolin Biewer², Stephanie Hackert³ & Michaela Hilbert⁴
¹University of Muenster / ²University of Zurich / ³Ludwig Maximilian University of Munich / ⁴University of Bamberg

This paper presents a quantitative and qualitative investigation of the use of the modal verbs will and would in six New Englishes (Fiji, Indian, Singapore, Trinidadian, Jamaican and Bahamian English), with British English considered for comparison; will/would in their future use are also compared to other markers of futurity. The database consists of conversations from the respective components of the International Corpus of English or comparable data. The results show that the use of will versus would tends to be more variable in all New Englishes than in British English but that there are differences between the New Englishes in the type and degree of variation. Thus, both general and variety-specific tendencies seem to be at work in our data.

Keywords: New Englishes; International Corpus of English; will/would; frequency; semantics

1.  Introduction and previous research
The modals and semi-modals are an area of English grammar which has seen recent diachronic change and evidences considerable synchronic variation (Krug 2000; Collins 2009b; Leech et al. 2009). Much work in this area has concentrated exclusively on native varieties of English (ENL), but in recent years there has also been a growing number of studies on expressions of modality in New Englishes (e.g. Nelson 2003; Bautista 2004; Nkemleke 2007; Biewer 2009; Collins 2009a, 2009c; Deuber 2010a). One tendency in these has been to look at those aspects known to be subject to variation and change in native varieties to see to what extent the New Englishes follow recent changes in those varieties. Another angle has been to try to identify aspects of modal verb usage that seem to be specific to one or several New English varieties. Will, which has turned up as either the most frequent modal verb, or the second most frequent one after would in corpus-based studies of native


varieties and New Englishes alike (Coates 1983; Leitner 1991; Biber et al. 1999; Nkemleke 2007; Leech et al. 2009), is of interest from both of these perspectives. First, in native varieties at least – American English in particular – it is one of the modals that has been affected by the rise in use of semi-modals, in this case be going to (Szmrecsanyi 2003; Leech et al. 2009). Collins (2009c: 290) found in this regard that “the incursion of be going to into the territory of will is more advanced in the IC [inner circle] than in the OC [outer circle] varieties”. Second, it has been reported for many New Englishes that will can be replaced with would in certain contexts where this would not be possible in Standard British or American English. In Standard British or American English, the domains of will and would are fairly clearly demarcated in relation to each other. Would as the past tense member of the pair is used in four types of contexts (see Coates 1983; Quirk et al. 1985; Huddleston & Pullum 2002; Collins 2009b):

a. as past time equivalent of will:

(1) Later, he would learn his error.

(Quirk et al. 1985: 232)

b. as past tense equivalent of will in indirect speech constructions where backshift applies:

(2) I felt sure that the plan would succeed.

(Quirk et al. 1985: 231)

c. in the hypothetical sense of the past tense (in conditional sentences or where an unfulfilled condition is implied): 

(3) If you really worked hard you would soon get promoted. (Quirk et al. 1985: 188)



(4) Don’t bother to read all these papers. It would take too long. (Quirk et al. 1985: 234)

d. to express tentativeness or politeness in pragmatically specialized uses related to the hypothetical sense: 

(5) … and, I would suggest, it’s too expensive anyway. (Huddleston & Pullum 2002: 200)

(6) I would like to see him tomorrow. (Huddleston & Pullum 2002: 200)

(7) Would you lend me a dollar? (Quirk et al. 1985: 233)

Against this background, consider the following examples from Trinidadian, Philippine and Singapore English:

(8) Smelter protesters are planning what they hope would be a massive demonstration to again highlight their total rejection of government’s plans (ICE-T&T S2B-008)






(9) While our neighbors are bogged down by the complexities of the fall-out from severe currency depreciation, the Philippines has showed remarkable resiliency and its attractiveness as an investment haven has been enhanced. Ford’s return to the Philippines engenders the hope that other big foreign investors would follow. (ICE-PHI W2E-007, from Bautista 2004: 124)

(10) So to have optimal development you must work with all areas and so early childhood teachers try to provide a holistic uhm curriculum that would cater for all development (ICE-T&T S2A-030)

(11) The aim of this paper is to throw light on how the singles problem and social development is being tackled in MINDEF. The framework of analysis would be geared to look at the policy context of the singles problem within MINDEF. (ICE-SIN W1A-007, from Collins 2009a)

Collins (2009a) has introduced the label “extended would” for such uses of would in non-past, non-hypothetical contexts. Different reasons have been cited for the extension of would in New Englishes. Studies of Trinidadian English (Youssef 1990, 2004; Solomon 1993) have emphasized the role of language contact, as the English-based Creole with which English coexists in Trinidad has a modal verb system where would is equivalent to English will. Collins (2009a), in contrast, argues on the basis of an analysis of data from Asian countries – in particular Singapore and India, as well as Hong Kong and three countries where English constitutes a foreign language¹ – that “[t]he development of extended would in the New Englishes is most likely motivated by the desire that speakers have to exploit the capacity of this form to convey a high level of polite and tactful unassuredness”. Bautista in her study of would in Philippine English also mentions “non-assertiveness” (2004: 126) as a factor, in addition to imperfect learning and simplification. Extended would has been reported for other New Englishes as well, including Indian English (Nihalani, Tongue & Hosali 1979: 196; Trudgill & Hannah 2002: 132), Ghanaian English (Huber & Dako 2004: 856), Nigerian English (Kujore 1985: 38) and Cameroonian English (Nkemleke 2007). In the case of the last three varieties, substitution of will for would has been mentioned as well (Sey 1973: 35–36; Alo & Mesthrie 2004: 815; Nkemleke 2007). Sand (2005) in her study of shared morphosyntactic features in contact varieties of English based on those parts of the International Corpus of English

1. Collins (2009a) had a grammaticality judgement test with items from Bautista’s (2004) study administered to students in Singapore and India as well as in Hong Kong, Indonesia, Korea and Japan, but the corpus examples supplied in the study are all from ICE-Singapore and ICE-India.


(ICE) available at the time found non-standard uses of will and would in New Englishes mainly in conditional sentences. She observes that “[t]he combination of would in the matrix clause and present tense in the conditional clause appears to be most frequent, but any combination of modal and tense can be encountered” (2005: 149). Deterding in his observations on the use of will and would in Singapore English also refers specifically to conditional sentences. He argues that “the grammatical category of hypothetical conditional does not generally exist” (2007: 50), will being used instead of hypothetical would, and points out further that “[i]nstead of occurring in hypothetical conditionals, would is often used in Singapore to indicate that something is tentative” (2007: 50). Another phenomenon that has been noted in connection with will/would in some New Englishes is that the habitual use of will that is also possible in native varieties is extended. The following is an example of what Quirk et al. (1985: 228) describe as the “habitual predictive meaning” of will:

(12) She’ll sit on the floor quietly all day. She’ll just play with her toys, and you won’t hear a murmur from her. [of a good baby] (Quirk et al. 1985: 228)

According to Deterding (2003, 2007: 48–49), habitual will is very common in Singapore English; he cites substrate influence from Chinese (Mandarin/Hokkien) as a possible source. While the habitual predictive use described by Quirk et al. is possible only in non-past contexts, habitual will in Singapore English also occurs in past contexts, as in the following example:

(13) last time, erm … she will um babysit for other people (Deterding 2007: 48)

Deuber (2010a) found habitual will to be prominent in Trinidadian English, arguing that this could be due to indirect influence from Trinidadian Creole, which has an overt marker of present habitual aspect (does, see 4.2 below). This study further found that would is widely used in Trinidadian English in present habitual contexts. Deterding (2007: 51) also reports the use of would as a variant of will in present habitual contexts in Singapore English. For Indian English, it has been suggested by Balasubramanian (2009: 104) that habitual will is more commonly used than in British or American English. Previous research thus indicates a considerable degree of variation in the use of will and would in New Englishes, but the extent to which general or variety-specific tendencies and factors are at work remains insufficiently understood, as most studies concentrate on only one variety. The present study aims to shed more light on this issue by investigating the use of these modals across a selection of New Englishes. Six New Englishes, all represented in ICE, are considered in the present study. Three of them function as a second language (ESL) for most of their speakers




(Indian, Singapore and Fiji English), while the other three (Trinidadian, Jamaican and Bahamian English) coexist with an English-based Creole, a situation which has been described as “English as a second dialect” (ESD) (Görlach 1991). British English is taken as a basis for comparison (all countries considered are former British colonies). In selecting the New Englishes for investigation we have placed a special focus on the Caribbean region, for two reasons. First, the language contact situation with English-based Creoles where the same form as in English can occur with a different meaning (e.g. would for ‘will’ in Trinidadian Creole, see above) raises important issues for the study of modal verbs (Deuber 2010a). Second, this region has so far been given less consideration in corpus-based comparative research on New Englishes than the Asian region since until recently most available ICE corpora of New Englishes represented the latter region (see e.g. Collins 2009c); ICE-Jamaica became available only in 2009, and ICE-Trinidad & Tobago and ICE-Bahamas are still being compiled (see Deuber 2010b & Hackert 2010, respectively, for details of these projects). To further broaden the scope of research, we include data from Fiji, the only New English from the South Pacific region for which an ICE corpus is being compiled (see Biewer, Hundt & Zipp 2010). Finally, we have selected Singapore and Indian English from the varieties for which complete ICE corpora are currently available as several studies indicate that there is variation in the use of will/would in these varieties that merits closer investigation. The data are drawn from the text category “private conversations” in the respective ICE corpora where available; where the conversation component of the respective ICE corpus has not yet been compiled comparable other data are used.
We analyse the quantitative distribution of will/would as well as uses and meanings; will/would in their future use are moreover considered in comparison to be going to and other markers of futurity. Details of the varieties selected and the data and method are provided in Sections 2 and 3, respectively. Section 4 presents and discusses the results, and Section 5, finally, draws the conclusions and gives an outlook for further research.

2.  New Englishes selected
Indian and Singapore English are often considered typical ESL varieties, though in the latter variety there are ongoing changes towards ENL status. In both cases, the language was introduced during the colonial period not primarily via face-to-face communication but via the education system and now plays a key role not only in that domain, but also in national government and politics, the judiciary, business and the media.


Even though, in terms of speaker numbers, Indian English constitutes one of the most important varieties of English worldwide, locally its status and function are fairly tightly circumscribed. English constitutes “a minority lect, largely restricted to utilitarian functions and certain domains and strata of society” (Schneider 2007: 161). On the one hand, its role is that of a neutral lingua franca in an ethnically, religiously and linguistically diverse nation; on the other, the language clearly functions as a marker of education and higher social status. Recent work (e.g. Vaish 2008; Chand 2009; Sedlatschek 2009) does suggest that Indian English is becoming increasingly important in the lives of a broadening segment of the urban population, but this development largely post-dates the compilation of ICE-India in the early 1990s. In the city state of Singapore, English “occupies a place, enjoys a status, and performs roles which it does not in any other Asian country” (Tickoo 1996: 431). It is, in fact, a home language for a growing number of speakers (see e.g. Lim & Foley 2004). Singapore may thus be considered “an ESL>ENL transition country” (Schneider 1999: 193). It must be noted, though, that many of these speakers have in fact acquired colloquial Singapore English (Lim & Foley 2004), a restructured variety (popularly known as Singlish) that may be considered as creolized (Ansaldo 2004). English in the Fiji Islands, a group of more than 300 islands situated in the South Sea between Vanuatu and New Caledonia to the west and Samoa and Tonga to the east, is again a typical ESL variety. Like India and Singapore, Fiji is a multiethnic, multicultural and multilingual country. Besides the two biggest ethnic groups, the Fijians and Indo-Fijians, who comprise 57% and 37% of the population respectively, there are also many other ethnic minorities in the country such as Rotumans, Chinese, other Pacific Islanders, Europeans and Part-Europeans (www.statsfiji.gov.fj).
Most people speak English as a second language (Tent 2001: 210) and one important function of English is that it serves as a lingua franca between the different ethnic groups (Tent & Mugler 2008: 236). English is not only the medium of instruction in school and the predominant language in literature and the media (Tent & Mugler 2008: 235); young urban Fijians also tend to use English with their friends and their siblings after school, and business correspondence and business transactions may take place in English depending on the language skills of the customer (Biewer in preparation). In the urban centres English is very much part of everyday life. There are different lects of Fiji English that can be distinguished in terms of schooling, social class or rural versus urban upbringing (Tent & Mugler 2008: 236; Biewer in preparation). Fiji English may be influenced by various external models such as British English through its history, New Zealand English through its economy and geographical closeness, and American English through the media. In addition, second language acquisition, transfer from the




mother-tongue and distinct cultural habits shape Fiji English into a recognizably distinct variety of English (Biewer in preparation). Even though the other three New Englishes selected all represent the category ESD, Jamaica, Trinidad and the Bahamas not only have a somewhat different linguistic make-up and different histories but they also represent different settings economically and socially. The most populous state in the anglophone Caribbean, Jamaica (population 2.8 million), builds on tourism and bauxite/alumina as main industries but is currently facing a number of severe economic challenges (see Central Intelligence Agency 2010). Ethnoculturally, the country represents the typical situation on the Caribbean islands, with over 90% of the population being of African descent. Trinidad, the larger and more populous constituent of the two-island republic of Trinidad and Tobago (population 1.2 million), in contrast, has been described as being “like no other Caribbean island” (Blouet 2002: 352). Reasons for this include its oil and natural gas-based economy, due to which it enjoys considerable prosperity, and the fact that people of African and East Indian descent make up about equal proportions of the population. The Bahamas is one of the wealthiest Caribbean countries, its economy being largely dependent on tourism and offshore banking. The country is heavily urbanized, with roughly two thirds of all Bahamians living in the capital, Nassau. Some 85% of the Bahamian population of ca. 345,000 (http://statistics.bahamas.gov.bs/download/093911800.pdf) are black. The 2000 census registered 21,000 Haitians, but some estimates including illegal immigrants put the current number as high as 78,000, or 25% of the population. In each of these three Caribbean countries, English coexists with a lexically related Creole in what has most often been described as a continuum with Creole and English poles and a range of varieties in between (see e.g.
Winford 1997; Hackert 2004). However, while in Jamaica a conservative or basilectal variety constitutes the Creole extreme, in Trinidad and in the Bahamas only intermediate or mesolectal Creoles are spoken. Attitudes towards the Creole varieties have been substantially transformed over the past few decades. Whereas formerly they were simply and uniformly designated as “bad” or “broken” English, they are now much more positively valued. However, as language attitude surveys (e.g. Beckford Wassink 1999; Mühleisen 2001) have also revealed, while speakers for the most part now hold positive attitudes towards their vernaculars, perceptions of a functional division between them and English remain fairly strong. As Youssef (2004: 44) observes for Trinidad and Tobago, “the Creole is the language of solidarity, national identity, emotion and humour, and Standard the language of education, religion, and officialdom”. Nevertheless, speakers balance the varieties according to the dynamics of the context, and some degree of “mixing” is characteristic of all but the most


formal spoken language situations. Contexts in which such “mixing” increasingly occurs include the media and the classroom, as Youssef (1996: 11) notes. Similar observations have been made by Shields-Brodber (1997) for Jamaica, by Hackert (2004: 54–64) for the Bahamas, and by Carrington (2001) for the Caribbean region as a whole. In the context of ICE, the interaction between English and Creole varieties is evident in particular in some of the public dialogues and most especially in the private dialogues (see e.g. Sand 1999; Deuber 2009a, 2009b, 2009c, 2010b).

3.  Data and method
The quantitative distribution of modal verbs can easily be analysed in large data sets (see e.g. Collins 2009c), but the study of uses and meanings necessitates close attention to the context of each token. Both for this reason and also because for three of the varieties considered the respective ICE corpora are still being compiled, we have opted for a relatively small sample consisting of 15 ICE texts or equivalent data per variety. The data are spoken, private, dialogic texts. In ICE terms, these data make up the category of conversations, with texts coded as S1A-001 to S1A-090. Arguably, this category is the most heterogeneous of all ICE text categories, as conversations can be anything from intimate to fairly formal; i.e. they range from the type of excited exchange that occurs between two best friends cursing their husbands to discussions between colleagues at work or even more interview-like exchanges between relative strangers. In native English-speaking contexts it seems natural and even desirable to record informal conversations between people who know each other well (see Holmes 1996: 169). In the case of New Englishes, in contrast, the contexts of use of these varieties tend to lead to the inclusion of mainly relatively formal interactions in this category (see Schmied 1996: 185–186; Deuber 2009a, 2009c, 2010b; Hilbert & Krug 2010: 60).
However, a considerable range including some rather more informal conversations may still be represented. For example, the Bahamian texts analysed in this study consist of three types. There are, first, sociolinguistic interviews, i.e. interactions between a fieldworker and an interviewee designed to elicit casual speech (see Labov 1984: 32–33). Another set comprises interviews with linguistically sensitive professionals, such as teachers or journalists, about language use and attitudes in the community. Both sets of interviews were conducted by one of the authors of this paper (SH), who was a student researcher at the time. Obviously, such interactions with outsiders to the community represent precisely the kind of situation in which recourse to the standard variety or the upper lectal levels of one’s linguistic competence is called for. Finally, two conversations involved only community members, with the




researcher (SH) as participant observer. These conversations obviously were the most private, informal ones and produced the most vernacular speech. A similar range of data is found in the conversation category of the two other Caribbean components of ICE, those for Jamaica and Trinidad & Tobago (see Deuber 2009a, 2009c, 2010b). For four of the six varieties selected, namely Singapore, Indian, Jamaican and Trinidadian English,² a fully or partially complete ICE conversation category was available at the time of the study. In these cases, we have used the first 15 texts (codes S1A-001 to S1A-015). The amount of data is in each case about 31,000 words.³ The Bahamian data described above, which are yet to be integrated into ICE-Bahamas, also amount to about 31,000 words. In the case of ICE-Fiji, conversation data are not yet available for research (and had not even been collected when the analyses for this paper were conducted). The analysis of will and would in Fiji English in this paper is based on 14 interviews with Fijians which were recorded in Suva in 2007 as part of field research on South Pacific Englishes (see Biewer in preparation).⁴ As each interviewee also filled in a questionnaire giving sociolinguistic background information, their age, ethnicity and regional upbringing is known as well as the highest education level attained and current occupation. The 14 interviews were chosen out of 28 to give a good cross-section of age and gender. As a result eight of the interviewees are women, and nine are men. While two of the interviewees are pupils, another seven are students from the University of the South Pacific, and the remaining eight are adults who finished their education and are part of Fiji’s workforce – apart from one woman who was already retired at the time the interviews were recorded.
The interviews consist of a total of 37,519 words; in order to achieve comparability with the data retrieved for the other varieties in question the results were normalized to 31,000 words. We consider each token of will/would (and of the other markers of futurity) in the selected data, including negative and contracted forms, but excluding

2. All the data from ICE-Trinidad & Tobago analysed in this study are from Trinidad.
3. The standard text length in ICE is approximately 2,000 words but many texts are slightly longer so that the end of the text occurs at an appropriate point in the discourse (see Nelson 1996: 27).
4. The 14 interviews are raw transcriptions which have been checked once but still await their final correction; they are therefore liable to minor changes. The direct conversations in ICE-Fiji will include speakers from the various ethnic groups residing in Fiji. In particular, following the demographic proportions given in the 1996 census, an equal amount of data from both Fijians and Indo-Fijians is attempted (see Biewer, Hundt & Zipp 2010: 8–9).


tokens of will/would that are part of perfect forms (will have/would have)⁵ and all tokens that occur in uncertain parts of transcriptions (text enclosed by ICE mark-up symbols 〈?〉 〈/?〉), quotations (text enclosed by ICE mark-up symbols 〈quote〉 〈/quote〉), extra-corpus text (text enclosed by ICE mark-up symbols 〈X〉 〈/X〉), and repetitions and self-corrections (text enclosed by ICE mark-up symbols 〈-〉 〈/-〉). The framework for the analysis of uses and meanings has been adopted from Deuber (2010a); it is informed by descriptions of the meanings of the modal verbs in the major reference grammars of English as well as specialized studies such as Coates (1983) and Palmer (1990) but adapted to the particular research issue of the choice between will and would. Accordingly, a basic distinction is drawn in the categorization between the past or hypothetical uses associated with would in Standard British or American English (see Section 1) and non-past, non-hypothetical uses. Instances of the modals in conditional sentences were analysed separately since previous research had identified conditional sentences as a special area of variation. A separate category was further set up for pragmatically specialized uses. In the non-past and non-hypothetical category, three types of meaning were distinguished, the first two exclusively and the third primarily associated with will in Standard British or American English. The first type is futurity accompanied by various shades of modal meaning, either epistemic (prediction) or dynamic (volition). The second is the present habitual meaning already described in Section 1, and the third, epistemic will with present rather than future reference.⁶ The following is an example of the third type:

(14) That’ll be the postman. [on hearing the doorbell ring] (Quirk et al. 1985: 228)

Would can also be used in the epistemic sense with present reference; it is often considered to be more tentative than will (see Palmer 1990: 58; Huddleston & Pullum 2002: 200), though Ward, Birner and Kaplan (2003) have shown that this is not necessarily the case.
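The token selection and normalization described in this section can be approximated in a few lines of Python. What follows is only a minimal sketch under stated assumptions, not the procedure actually used for this study. It assumes that the ICE mark-up symbols appear in the transcripts as plain ASCII tags (<?>, <quote>, <X>, <->) and that apostrophes are straight; the names strip_excluded, count_forms and normalize are hypothetical. The exclusion of perfect forms (will have/would have) and the ambiguous contraction 'd are not handled and would still require manual coding in context.

```python
import re

# Mark-up elements whose content is excluded from the counts (assumed ASCII
# rendering of the ICE symbols): uncertain transcription, quotations,
# extra-corpus text, repetitions and self-corrections.
EXCLUDED_TAGS = ["?", "quote", "X", "-"]


def strip_excluded(text: str) -> str:
    """Remove every span enclosed by one of the excluded mark-up pairs."""
    for tag in EXCLUDED_TAGS:
        t = re.escape(tag)
        text = re.sub(rf"<{t}>.*?</{t}>", " ", text, flags=re.DOTALL)
    return text


def count_forms(text: str) -> tuple[int, int]:
    """Return (will, would) counts, including negative and contracted forms."""
    cleaned = strip_excluded(text).lower()
    will = len(re.findall(r"\bwill\b|\bwon't\b|'ll\b", cleaned))
    would = len(re.findall(r"\bwould\b|\bwouldn't\b", cleaned))
    return will, would


def normalize(count: float, corpus_words: int, base: int = 31_000) -> float:
    """Scale a raw count to a common basis of `base` words."""
    return count * base / corpus_words


# Invented sample line, for illustration only (not taken from any ICE corpus).
sample = (
    "she'll come <?> he will </?> and I would go "
    "<quote> would </quote> but they won't <-> will </-> will"
)
will_n, would_n = count_forms(sample)
print(will_n, would_n)

# A hypothetical raw count of 100 tokens in the 37,519-word Fiji interviews,
# scaled to the 31,000-word basis used for the other varieties.
print(round(normalize(100, 37_519), 1))
```

A pattern-based count of this kind can only pre-select candidates: the classification into future, habitual, epistemic or hypothetical uses depends on reading each token in context.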

5. As explained in detail in Deuber (2010a), the perfect forms, which are normally in a minority by far, are best treated separately when Caribbean varieties of English are considered because they are potentially subject to the influence of different forms in the Creole than are non-perfect forms (in the case of the modals analysed here, the Creole past form woulda).
6. There are a couple of examples, mainly in the Trinidadian data, where the reference is clearly not to the future but where it is not entirely clear whether habitual or epistemic meaning is intended. These have been classified as “habitual or epistemic”.



Will and would in selected New Englishes 

4. Results and discussion

This section first presents the results and highlights the main findings, and then goes on to further discuss the results for each variety.

4.1 Results and general findings

The quantitative distribution of will/would, the uses and meanings of these two modals and will/would in comparison to other future markers are shown in Tables 1 to 3, respectively. Some of the main general findings from these tables are as follows: in Table 1, Indian and Singapore English stand out in having high ratios of will to would of 3.2:1 and 5.5:1 respectively in comparison to all other varieties, where the ratios are around 0.7:1 to 0.8:1. With regard to uses and meanings, Table 2 shows that the distinction between will and would is less clear-cut in the New Englishes than in British English, with tokens of would appearing where only will occurs in British English and vice versa,⁷ but there are differences among the New Englishes both in the categories in which variation occurs and in the extent to which the respective other auxiliary is used. Interesting specific findings from this table include the following. First, future would is attested but rare in all New Englishes considered. Second, there are more tokens of present habitual will in the data from all the New Englishes than in the British data; this use of will is especially prominent in Trinidadian and Singapore English, followed by Bahamian and Indian English. Third, some tokens of habitual would with present reference are found in most varieties but this usage seems to be especially common in Trinidadian and Bahamian English. Fourth, in the data for all New Englishes a few tokens of will occur in past habitual contexts; the three ESL varieties additionally exhibit a tendency for will to appear in contexts where it does not occur elsewhere (e.g. hypothetical contexts, protases of conditional sentences).
As regards future markers (Table 3), would, shall and the Creole forms go and a go, in the Trinidadian and Jamaican data respectively, turned out to be minor variants in the data analysed. Will and (be) going to/gonna are the main future markers in all data sets, although the Bahamian one also has Creole gon’ as a common form. In the Trinidadian and Jamaican data, will and (be) going to/gonna make up practically the same shares of the future marking domain as in the British data, whereas in the data from Fiji, Indian and Singapore English, the proportion of will is higher, which is in line with Collins’ (2009c) findings for ESL varieties.

7. The single instance of would in a non-past context in the British data that could be considered habitual seems to be highly exceptional (note the occurrences of will in the same context): Uhm 〈,〉 what is a common occurrence is you’ll have somebody coming into a college to do a workshop on work with the disabled or dance with the disabled and you’ll go along to 〈}〉 〈-〉 that 〈/-〉 〈=〉 that 〈/=〉 〈/}〉 workshop and it would be full of able-bodied students 〈,〉 who are on the course wanting to find out how you do 〈,〉 this new thing 〈#〉 Uhm there will be no disabled dancers in the class (ICE-GB S1A-001).

Dagmar Deuber, Carolin Biewer, Stephanie Hackert & Michaela Hilbert

Table 1.  Quantitative distribution of will/would in conversation data from selected New Englishes

                                GB     Trin   Jam    Bah    Fiji   India  Sing
total number of tokens  will    89     125    79     71     74     145    191
                        would   114    175    107    104    97     46     35
ratio will:would                0.8:1  0.7:1  0.7:1  0.7:1  0.8:1  3.2:1  5.5:1
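The ratios in Table 1 follow directly from the token counts. As a minimal sketch (the dictionary layout and the function name are illustrative choices, not part of the study), they can be recomputed like this:

```python
# Token counts (will, would) per variety, as reported in Table 1.
COUNTS = {
    "GB": (89, 114),
    "Trin": (125, 175),
    "Jam": (79, 107),
    "Bah": (71, 104),
    "Fiji": (74, 97),
    "India": (145, 46),
    "Sing": (191, 35),
}


def will_would_ratio(will: int, would: int) -> float:
    """Return the will:would ratio rounded to one decimal place."""
    return round(will / would, 1)


ratios = {variety: will_would_ratio(*pair) for variety, pair in COUNTS.items()}
for variety, ratio in ratios.items():
    print(f"{variety}: {ratio}:1")
```

A ratio below 1 means that would outnumbers will in the sample, which holds for British English and for the Caribbean and Fijian data; only the Indian and Singaporean data show the reverse.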

Table 2.  Uses and meanings of tokens of will/would in conversation data from selected New Englishes

non-past, non-hypothetical uses

GB

Trin

Jam

Bah

Fiji

India

Sing

62

41

42

25

26

92

95

2

1

1

3

2

1

7

37

20

29

16

24

35

20

7

2

5

4

2

1

4

future (prediction/ volition)

will

habitual

will would

1

45

epistemic

will

4

5

4

would

5

1

will

7

would

3

habitual or epistemic past/ hypothetical past time uses (except habitual)/ backshift Habitual

would

would

pragmatically specialized uses

2

1

1

15 13

2 3

1

11

4

1

3

8

19

9

39

43

1

3

will

4

9

will would

past hypothetical

2

will

would hypothetical

2 2

2 52

36

37

11

4

12 5

will

1

would

1

1

will would

8

1 18

26

21

17

11

1 28

12




Table 2.  (Continued) uses in apodoses of conditional sentences

protasis: present tense

GB

Trin

Jam

will

6

12

7

would

1

14

5

protasis: past tense

will

protasis: past tense (backshift)

will

protasis: tense-neutral verb form

will

Bah

Fiji

India

Sing

15

13

14

7

2

4

4

1

would

7

would

1

7

4

1 2

would

2

2

1

will

uses in protases of conditional sentences

would

unclear/indeterminate

3

2

1

will

10

20

5

2

6

8

6

would

13

19

20

3

14

3

0

Table 3.  Future markers in conversation data from selected New Englishes

                        GB         Trin       Jam        Bah        Fiji       India      Sing
                         n    %     n    %     n    %     n    %     n    %     n    %     n    %
will                    62   55    41   55    42   56    25   32    26   72    92   71    95   62
would                               2    3     1    1     1    1     3    8     2    2     1    1
shall                    6    5                                                 1    1     3    2
(be) going to/gonna*    45   40    30   41    31   41    27   35     7   19    34   26    53   35
go                                  1    1
a go                                           1    1
gon’                                                     24   31
TOTAL                  113         74         75         77         36        129        152

*Excluding non-finite forms.
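The percentages in Table 3 are each marker's share of all future markers counted for a variety. The following sketch, which uses only the will, (be) going to/gonna and TOTAL figures from the table, shows how the two main shares can be recomputed; the data layout and the function name are assumptions made for illustration.

```python
# (will, (be) going to/gonna, total future markers) per variety, from Table 3.
FUTURE = {
    "GB": (62, 45, 113),
    "Trin": (41, 30, 74),
    "Jam": (42, 31, 75),
    "Bah": (25, 27, 77),
    "Fiji": (26, 7, 36),
    "India": (92, 34, 129),
    "Sing": (95, 53, 152),
}


def share(n: int, total: int) -> int:
    """Percentage share of a marker, rounded to the nearest whole number."""
    return round(100 * n / total)


for variety, (will, going_to, total) in FUTURE.items():
    print(
        f"{variety}: will {share(will, total)}%, "
        f"(be) going to/gonna {share(going_to, total)}%"
    )
```

Python's built-in round() resolves exact halves to the nearest even integer, which matters for the Singapore share of will (exactly 62.5); a different rounding convention would give 63 there.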

4.2 Trinidadian English

Previous studies have highlighted the future use of would in Trinidadian English, based on the function of would as a future marker in Trinidadian Creole, as in example (15):

(15) The meeting would be held at ten o’ clock. (Youssef 2004: 48)

In the present data, however, this is not a prominent feature. The most striking findings from the Trinidadian data are that habitual will is very frequent and that habitual would is very strongly represented in non-past, non-hypothetical contexts. Example (16) below shows the use of habitual will, (17) the use of habitual would, and (18) variation between the two forms; in connection with the use of would, it must be emphasized that in no case is the context past or hypothetical.

(16) the only Standard English they exposed to is the one that they read not in the papers but the one they will read in the textbook right (ICE-T&T S1A-014)

(17) That’s why at least we here in La Romaine you know we would lock the gates at during break and lunchtime and make the students stay in the foyer (ICE-T&T S1A-011)

(18) But what we will talk about more so is the actual structure of the reproductive systems and then we would go on to S T Ds (ICE-T&T S1A-012)

The frequent expression of habituality in these data has to do with the topics: the conversations from ICE-Trinidad & Tobago analysed here often involve questions about habitual behaviour. As described in more detail in Deuber (2009a, 2010b), these texts are all discussions among teachers about school-related topics, in particular language use in the schools. However, although the topics promote the expression of habituality, the frequency with which will is used, and even more so its apparent interchangeability with would in this type of context, is still striking. In British English, for example, the normal form would be the simple present – which is often found in the Trinidadian data alongside will/would (see example (16) above) – and habituality is only a minor meaning of will (see also Mindt 1995: 59). Deuber (2010a) has suggested that the Trinidadian Creole preverbal marker does, which indicates present habitual aspect, may have an influence on these patterns of usage. In this connection she cites Hodge (1997), who draws attention to interference from Trinidadian Creole does in the use of habitual will/would. Hodge argues that in Trinidad “[s]ome people feel very uncomfortable using just one word for this tense [present habitual]. When they use the habitual present tense in English, they feel the need to slide in another word before the verb” (1997: 100); the examples she gives are similar to those in (16) to (18) above, and involve the use of will/would (or adverbs such as usually) in contexts where a simple present would have sufficed (1997: 101).

With regard to conditional sentences, Table 2 shows a comparatively large number of instances of the combination of present tense and would in the Trinidadian data. In this connection it must also be noted, however, that many of the instances of would in apodoses of conditional sentences with a present tense verb in the protasis are due to the habitual use of would commonly observed in the Trinidadian data, as in (19):

(19) if I am delivering a lesson I would use Standard English (ICE-T&T S1A-006)

4.3 Jamaican English

Jamaican English shows a quantitative distribution of will/would similar to that of the other two Caribbean varieties, Trinidadian and Bahamian English (and of British English), but there are some differences on the level of uses and meanings. Habitual will is not as common in the Jamaican as in the Trinidadian and Bahamian data and habitual would does not occur in non-past contexts at all. While the topics of the texts may have an influence on the degree to which non-past habitual actions are referred to, it should also be noted that Jamaican Creole, unlike Trinidadian Creole (and Bahamian Creole as well), does not possess a preverbal marker of present habitual aspect, and also that Jamaican Creole does not have would as a non-past form but rather uses wi, equivalent to English will (see Bailey 1966: 45).

4.4 Bahamian English

Bahamian English shows striking parallels with Trinidadian English in terms of meanings and uses of will/would. Both varieties evidence a high frequency of tokens of will with habitual meaning in the data analysed; in other words, whereas in British English, for example, will clearly functions mainly as a future marker, in the present Trinidadian and Bahamian data the marking of habituality seems to be at least as important. Furthermore, habitual would also commonly occurs in non-past contexts in the Bahamian data, though it is not quite as strong as in the Trinidadian data. What could be the reasons for these parallels? For one thing, interview topics might have promoted the expression of habituality in general, because just as in the Trinidadian data, language use and attitudes constituted one of the central points of interest in the Bahamian conversations, and this topic, of course, fosters questions and answers about habitual behaviour. For another thing, just like Trinidadian Creole, Bahamian Creole possesses a preverbal marker of non-past habitual aspect, does, as in (20):

(20) We does go to church every Sunday.

As explained above in the section on Trinidadian English (4.2), the presence of this preverbal marker in the Creole seems to have a favouring effect on the occurrence of habitual will (which, just like does, could be described as a preverbal marker followed by a verb in the base form), and it might be surmised that the same effect holds in Bahamian English. As for the strong presence of would in non-past contexts, this, too, might have to be attributed to Creole influence, as Bahamian Creole also has would as a non-past form. What we seem to be witnessing here is a case of indirect Creole influence in that a form which is also present in Standard British or American English is used with a Creole meaning.

As for the past/hypothetical category in Bahamian English, habituality once more constitutes the primary use of will and would, with would almost four times as frequent here as will. In contrast to that, hypothetical uses and pragmatically specialized contexts constitute minor environments for the occurrence of would. In sum, the primary function of will and would in Bahamian English, at least on the level of conversations, seems to be the expression of habituality, with will dominant in non-past contexts and would in past ones. Both markers have secondary functions as well, these being futurity in the case of will and hypothetical uses and pragmatically specialized uses in the case of would.

With regard to future markers, Bahamian English presents a striking exception, because whereas in all other varieties in our sample future temporal reference is split mainly between will and (be) going to/gonna,⁸ with will the more frequently employed option, in Bahamian English we see a tripartite division between will, (be) going to/gonna, and gon’, which are all employed at almost identical rates. The following example illustrates the use of the latter two markers within a single utterance:

(21) I – if I going to beat him, I sit him down and tell him what I gon’ beat him for. From the time I beat him. No, some people just lash out at their kids and beat them.

One reason for the frequent employment of gon’⁹ in the Bahamian data may be the fact that these conversations include not only a number of semi-formal discussions or interviews with teachers or other professionals on topics such as language use and attitudes, but also a few excited exchanges between Bahamian friends in which the researcher took part but did not have a main role as interviewer (see Section 3). It is in these conversations that we find almost all of the tokens of gon’ extant in the sample. What this goes to show is that Standard English in the Bahamas – and in the Caribbean in general, as shown by Deuber (2009c) for Jamaica and Deuber (2009a, 2010b) for Trinidad – if it is defined solely by its usage by educated speakers comprises a fairly large range of the continuum – a range which in some situations includes a fair amount of Creole or Creole-influenced forms.

8. Copula deletion in going-to-futures (and other contexts) is a feature of the Caribbean varieties; see e.g. Deuber (2009c) on Jamaican English.

9. Incidentally, except in the case of one speaker, gon’ always occurs without a preceding copula, in contrast to going to/gonna, where the copula is mostly used, though there are also a few tokens without the copula, as in example (21).

4.5 Fiji English

In comparison to the other New Englishes one notes in the Fijian data a relatively high usage of habitual would in connection with past events, which can be found to a similar extent only in the Bahamian English data. For the Fijian data this is partly triggered by the content of the interviews, and is therefore topic-related, as the interviewees were, for instance, asked to recite a legend or to talk about their memories of their first day at school or their experience of story-telling in the village; in a way it is also age-related as this usage mostly occurs with older interviewees, naturally with those for whom the first day at school or the listening to a story is a matter of the distant past.¹⁰ In four cases the non-past modal will is used as a habitual marker to describe past events, as for instance in (22):

(22) so when you lived in a Fijian bure there was certain things that you knew about parts of the house and your parents or whoever will teach you/ the the magimagi where it was made from/ you actually see them doing it … (SaFiRa-s/fc_me.doc)

This may be influenced by a narrative technique typical of the Fijian substrate. In particular as part of an informal narrative, past tense does not have to be marked in Fijian if from the context, or a previous sentence, it is already clear that the event that is being narrated took place in the past (see Dixon 1988: 69f.). In the Fijian data will was also used twice instead of would in a hypothetical context. In both cases it was used by pupils. See, for instance, example (23):

(23) […] we never use it during English classes/ our teachers will kill us/ … (SaFiRa-s/fc_meal.doc)

Here, the context is an implied hypothetical situation; it is not part of a conditional clause in which the if-construction explicitly marks a situation as unreal. Bautista (2004) also found in the case of Philippine English that Filipino students had difficulties with the different functions of would. As one typical scenario of a learner mistake, she identifies, in accordance with Lock (1996), an under-usage of would in contexts in which a hypothetical situation is implied but not explicitly stated; the learners do not seem to be aware that would is required in these cases (Lock 1996: 201; Bautista 2004: 122–123). Bautista (2004: 122) links this to classroom teaching in the Philippines, where hypothetical would is usually taught in combination with if-constructions. Something similar seems to be happening in the case of the Fijian data. While the pupils do not seem to recognize that the implied hypothetical situation requires would in Standard British English, they show a high usage of will in the apodosis of conditional sentences. The pupils used will within an if-construction in nearly every second construction they formed with will; in addition, two of the student groups that were interviewed only used will in the context of a conditional sentence. It seems that these younger speakers somehow strongly identify the usage of will with conditional sentences, a kind of restricted usage of will which may have been reinforced by a restrictive way of teaching these modals in school in which a limited number of constructions tend to be over-emphasized.¹¹ It is also worth noting that among the pragmatically specialized uses the collocation I would like to was prominent; it was mainly used by one student who was shy and more reluctant to talk, and not as proficient in English as his fellow student with whom he took part in the interview. This was also the only context in which he used would. See for instance example (24):

(24) I would like to earn my major in geography and language/ … (SaFiRa-s/fc_jojo.doc)

10. This usage is found particularly often in SaFiRa-s/fc_me.doc and SaFiRa-s/fc_ape.doc, i.e. in the conversation of two women over 55; both women are talking a lot about their childhood.

What Hasselgren (1994) calls the “teddy bear principle” may be applicable here, the principle that learners tend to “clutch to what they feel is safe and familiar” (Tschichold 2002: 133; see also Hasselgren 1994). Once again these idiomatic expressions may have been rehearsed and automatized in school (also see Georgieva 1993: 161). There seems to be a link between a high usage of these formulaic expressions and lower language proficiency. The findings can also be interpreted from a text-linguistic perspective. One way to create a cohesive text is the use of repetition (see Schiffrin 2006: 185). In the Fijian data the repetition of phrases is relatively frequent, e.g. in (25):

(25) […] on eighteen June eh/ that’s my birthday eighteen June/ we will be starting in Bau/ Bau island the 〈chiefly?〉 island of Fiji/ we will be starting there/ … (SaFiRa_s/fc_il.doc)

This technique of course increases the number of tokens of will (or would) of a similar type. This is not necessarily the teddy bear principle at work but a matter of emphasizing a point for some of the speakers. It also occurs frequently in passages in which a story is narrated; again it helps to emphasize important information within the story. Whether this reflects the influence of a rhetorical style used in Fijian remains to be seen.

11. In the Fijian data only one adult used will almost exclusively in if-constructions; interestingly, he grew up in a more rural area and had originally been a primary school teacher.

The results for Fiji English show that, again, the topics of the interviews play a role in the usage of will and would but also that narrative techniques typical of the local substrate language may have an effect. Moreover, language proficiency and the teaching methods used in school may have an influence. It was also shown that sociolinguistic background information is vital to understand the use of will and would by different speakers.

4.6 Indian English

The overall frequencies of will and would in Indian English differ substantially from the frequencies in most of the other varieties under discussion. Will is used more often, while the overall occurrence of would is considerably lower. The former tendency is mostly based on the high use of will with future meaning. Accordingly, Indian English uses going to or gonna less frequently than most of the other varieties in future contexts. Parallel to the other varieties, will is commonly used with habitual meaning in Indian English, but this is practically restricted to the present tense. Will hardly features in past (or hypothetical) contexts. A distinctive usage of will occurs in the protasis of conditional sentences, in which the present tense is usually used in other varieties, as in example (26):

(26) It takes six hours maximum six hours if we will travel by a state transport bus service (ICE-IND S1A-008)

Would in Indian English is, as mentioned above, considerably less frequent compared to other varieties. It occurs almost exclusively in pragmatically specialized uses, i.e. in this case polite requests and similar indirect speech acts. In all other contexts in which would features more or less prominently in other varieties, such as past or backshift, past habitual and, most importantly, hypothetical, would rarely appears in the present Indian English data.

4.7 Singapore English

As to the overall occurrence of the two modal verbs, the observations made for Indian English above can also be made for Singapore English. If anything, the trends are more pronounced: will is even more frequent and would is even less frequent in the Singaporean than in the Indian data (and obviously the data from the other varieties under discussion). This high use of will is, on the one hand, due to similar tendencies as in Indian English. Firstly, there is a tendency to prefer will to going to or gonna in future contexts, but not to the same extent as in Indian English. And secondly, will is also used in present habitual contexts, and, in individual cases, in protases of conditional sentences. On the other hand, will is used in additional contexts in Singapore English. Most prominently, it is used with past time meaning, as in example (27):

(27) And then there’re there’re the first batch that reached there they’ll waiting for us because they don’t have all the charcoal and everything Anyway raining they can’t even set it up Then after that it was drizzling they so they tried setting up you know they would hold umbrellas all over shutting in the the pit Ya then after that about six-thirty my friend and I we left ha (ICE-SIN S1A-001)

Equally frequent is the use of will with hypothetical meaning as in example (28):

(28) uh after going to Australia and coming back uh I’ll be dead broke You know I mean I would dearly loved to go with you Cecilia but the problem is … (ICE-SIN S1A-011)

Taken together, these contexts, in addition to the ones also present in British English as well as in the other New Englishes, account for the comparatively high overall occurrence of will in Singapore English. Would is very infrequent in the data from ICE-Singapore. As in the Indian data it tends to be mainly restricted to pragmatically specialized uses, though not quite to the same extent. Would is rarely used in past, past habitual or hypothetical contexts, which mainly accounts for its low overall frequency. There are cases of would with a present habitual meaning, but not with the same productivity as in other varieties.

The results for Singapore English thus display several parallels with those for Indian English: firstly, the high overall frequency of will and the low overall frequency of would; secondly, the preference for will over going to/gonna in future contexts; thirdly, the frequent use of will with present habitual meaning; fourthly, the use of will in protases of conditional sentences; and, fifthly, the use of would mainly in pragmatically specialized contexts. In addition, Singapore English displays two distinctive uses of will: first, in past time contexts; and, second, with hypothetical meaning. The results support Deterding’s (2007) observations that will is regularly used in habitual contexts in Singapore English, both for present and past time, though the former is far more common in the present data. The results of this study are also in line with Deterding’s (2007) findings on the limited use of would in Singapore English. As in his study, the speakers sampled for the present data from ICE-Singapore prefer to use will in hypothetical contexts.




Deterding’s (2007) observation that would is increasingly used in Singapore English to convey tentativeness is supported in the sense that this modal is indeed used mainly in pragmatically specialized contexts in the corpus data, though it does not display a higher number of tokens in these contexts than in the other varieties under study.

5. Conclusion and outlook

The present study started out from observations in previous studies that, on the one hand, New Englishes lag behind in the development of going to/gonna as an alternative future marker to will, and that, on the other hand, the use of will versus would tends to be quite variable, with a usage described as “extended would” apparently being common across a range of varieties, and other areas such as marking of habitual aspect showing variation as well. The analysis of each token of will and would in a selected data set from six New Englishes in comparison to British English and the quantitative comparison of future markers across the seven varieties have partly confirmed the findings of previous research but have also revealed a complex picture that can only be captured by a combination of quantitative and careful qualitative analysis.

The tendency for be going to to be less common in New Englishes than in native varieties was confirmed in the present study for conversations in the ESL varieties selected, in particular Fiji and Indian English. Singapore English occupied a middle ground between these varieties and British English as well as Jamaican and Trinidadian English, where (be) going to took up a similarly large proportion of the future marking domain. Bahamian English was exceptional in showing a high frequency of the Creole future marker gon’. This could be due to the composition of the sample, since, in the Creole continuum situation that obtains in the anglophone Caribbean, the level of Creole use varies strongly according to level of formality, in terms of which the conversation category is rather heterogeneous. It would be worth investigating the use of (be) going to versus will in larger data sets from the ESD corpora to see to what extent they differ systematically from ESL varieties with regard to this feature, as the present results suggest. Such research would have to bear in mind that, in some text types at least, a large share of the attested tokens of will may be used in the habitual rather than the future sense.

The findings for non-past habitual uses are an interesting aspect of our results. Although the topics of conversation definitely play a role in influencing the degree to which habituality is expressed, it is very notable that habitual will is quite prominent in all New Englishes analysed here, but especially in three varieties where language contact is a plausible factor in supporting this usage, namely Trinidadian, Bahamian and Singapore English. This means that we are probably dealing with a case of parallel influence from unrelated varieties (Caribbean Creoles on the one hand and Chinese varieties on the other hand). It is also noteworthy that non-past habitual would is widely used in the two Caribbean varieties where the Creole has would as a non-past form, Trinidadian and Bahamian English.

The overall picture for will/would is one where all New Englishes selected in the study show greater variation in the use of the two modals than British English: will or would appear in categories of meaning in which they do not occur in the British data, but different varieties show different tendencies in the form that this variation takes. In other words, the modals under investigation in this study are a case where New Englishes display both unity and diversity. The affinities between the meanings of the members of this pair of modals are probably one of the factors responsible for the variation found in the New Englishes (see Deuber 2010a for more detailed discussion). Also, there seems to be a general tendency in the New Englishes for the sequence of tense rules, which determine the use of will versus would in indirect speech constructions and conditional sentences, to be relaxed. For reasons such as these, the uses and meanings of these modals seem to be prone to extensions and reinterpretations, which in each context are also influenced by a variety of factors including influence from related or unrelated language varieties in the local contact situation, politeness conventions, narrative strategies and second language acquisition. Diversity among New Englishes has become especially evident in the present study through the inclusion of ESD varieties in addition to ESL varieties, since these groups of varieties showed different behaviour to some extent, though the three varieties in each group certainly did not behave in a uniform way either.
Future research on unity and diversity among New Englishes should pay attention to ESD varieties, for which one complete ICE corpus is now available and two more are being compiled (see Section 1), as these are shaped by a special language contact situation with English-related Creoles; if comparative research on New Englishes is limited to varieties used in similar sociolinguistic contexts there might appear to be more unity than there actually is. This study on the use of will and would in selected New Englishes has also shown that even a small amount of data can yield interesting results which supplement large-scale studies such as Collins (2009c). While Collins (2009c) can give us information on the general frequency of modal auxiliaries in different varieties of English, it is only with the use of a more limited amount of data that a careful qualitative analysis of the different functions of will and would in different New Englishes and a consideration of the speakers’ sociolinguistic background becomes possible. The study has also proved that it is worthwhile studying a linguistic variable across different varieties, as only with this approach do common trends and individual differences become visible. For future research it will be interesting to extend this approach to modal verb use to other text types in the ICE corpora and to consider other New Englishes represented in ICE as well; for example, the analysis of will/would could be extended to African varieties as well as Philippine and Hong Kong English, varieties for which previous studies have also identified the use of these modal verbs as an area of variation.

References

Alo, M.A. & Mesthrie, R. 2004. Nigerian English: Morphology and syntax. In A Handbook of Varieties of English, Vol. II: Morphology and Syntax, B. Kortmann & E.W. Schneider (eds), 813–827. Berlin: Mouton de Gruyter.
Ansaldo, U. 2004. The evolution of Singapore English: Finding the matrix. In Singapore English: A Grammatical Description [Varieties of English around the World G33], L. Lim (ed.), 127–149. Amsterdam: John Benjamins.
Bailey, B.L. 1966. Jamaican Creole Syntax: A Transformational Approach. Cambridge: CUP.
Balasubramanian, C. 2009. Register Variation in Indian English [Studies in Corpus Linguistics 37]. Amsterdam: John Benjamins.
Bautista, M.L. 2004. The verb in Philippine English: A preliminary analysis of modal would. World Englishes 23: 113–128.
Beckford Wassink, A. 1999. Historic low prestige and seeds of change: Attitudes toward Jamaican Creole. Language in Society 28: 57–92.
Biber, D., Johansson, S., Leech, G., Conrad, S. & Finegan, E. 1999. Longman Grammar of Spoken and Written English. London: Longman.
Biewer, C. 2009. Modals and semi-modals of obligation and necessity in South Pacific Englishes. Anglistik 20: 41–55.
Biewer, C. In preparation. South Pacific Englishes. The Dynamics of Second-Language Varieties in Fiji, Samoa and the Cook Islands. Ph.D. dissertation, University of Zurich.
Biewer, C., Hundt, M. & Zipp, L. 2010. ‘How’ a Fiji corpus? Challenges in the compilation of an ESL ICE component. ICAME Journal 34: 5–23.
Blouet, O.M. 2002. The Caribbean. In Latin America and the Caribbean: A Systematic and Regional Survey, 4th edn, B.W. Blouet & O.M. Blouet (eds), 311–366. New York NY: Wiley.
Carrington, L.D. 2001. The status of Creole in the Caribbean. In Due Respect: Essays on English and English-Related Creoles in the Caribbean in Honour of Professor Robert Le Page, P. Christie (ed.), 25–36. Mona: University of the West Indies Press.
Central Intelligence Agency. 2010. Jamaica. The World Factbook. 〈https://www.cia.gov/library/publications/the-world-factbook/geos/jm.html〉 (19 August 2010).
Chand, V. 2009. [v]at is going on? Local and global ideologies about Indian English. Language in Society 38: 393–419.
Coates, J. 1983. The Semantics of the Modal Auxiliaries. London: Croom Helm.
Collins, P.C. 2009a. Extended uses of would in some Asian Englishes. Asian Englishes 12(2).
Collins, P.C. 2009b. Modals and Quasi-Modals in English. Amsterdam: Rodopi.
Collins, P.C. 2009c. Modals and quasi-modals in world Englishes. World Englishes 28: 281–292.

Deterding, D. 2003. Tenses and will/would in a corpus of Singapore English. In English in Singapore: Research on Grammar, D. Deterding, E.L. Low & A. Brown (eds), 31–38. Singapore: McGraw-Hill Education (Asia).
Deterding, D. 2007. Singapore English. Edinburgh: Edinburgh University Press.
Deuber, D. 2009a. Caribbean ICE corpora: Some issues for fieldwork and analysis. In Corpora: Pragmatics and Discourse. Papers from the 29th International Conference on English Language Research on Computerized Corpora (ICAME 29), A.H. Jucker, M. Hundt & D. Schreier (eds), 425–450. Amsterdam: Rodopi.
Deuber, D. 2009b. Standard English in the secondary school in Trinidad: Problems – properties – prospects. In World Englishes – Problems, Properties and Prospects: Selected Papers from the 13th IAWE Conference [Varieties of English around the World G40], T. Hoffmann & L. Siebers (eds), 83–104. Amsterdam: John Benjamins.
Deuber, D. 2009c. “The English we speaking”: Morphological and syntactic variation in educated Jamaican speech. Journal of Pidgin and Creole Languages 24(1): 1–52.
Deuber, D. 2010a. Modal verb usage at the interface of English and a related Creole: A corpus-based study of can/could and will/would in Trinidadian English. Journal of English Linguistics 38: 105–142.
Deuber, D. 2010b. Standard English and situational variation: Sociolinguistic considerations in the compilation of ICE-Trinidad and Tobago. ICAME Journal 34: 24–40.
Dixon, R.M.W. 1988. A Grammar of Boumaa Fijian. Chicago IL: University of Chicago Press.
Georgieva, M. 1993. A cognitive approach to the acquisition of English modals by Bulgarian learners. In Current Issues in European Second Language Acquisition Research, B. Kettemann & W. Wieden (eds), 151–163. Tübingen: Narr.
Görlach, M. 1991. English as a world language – the state of the art. In Englishes: Studies in Varieties of English 1984–1988 [Varieties of English around the World G9], 10–35. Amsterdam: John Benjamins.
Hackert, S. 2004. Urban Bahamian Creole: System and Variation. Amsterdam: John Benjamins.
Hackert, S. 2010. ICE Bahamas: Why and how? ICAME Journal 34: 41–53.
Hasselgren, A. 1994. Lexical teddy bears and advanced learners: A study into the ways Norwegian students cope with English vocabulary. International Journal of Applied Linguistics 4(2): 237–258.
Hilbert, M. & Krug, M. 2010. The compilation of ICE-Malta: State of the art and challenges along the way. ICAME Journal 34: 54–63.
Hodge, M. 1997. The Knots in English: A Manual for Caribbean Users. Wellesley: Calaloux Publications.
Holmes, J. 1996. The New Zealand spoken component of ICE: Some methodological challenges. In Comparing English Worldwide: The International Corpus of English, S. Greenbaum (ed.), 163–181. Oxford: Clarendon.
Huber, M. & Dako, K. 2004. Ghanaian English: Morphology and syntax. In A Handbook of Varieties of English, Vol. II: Morphology and Syntax, B. Kortmann & E.W. Schneider (eds), 854–865. Berlin: Mouton de Gruyter.
Huddleston, R. & Pullum, G.K. 2002. The Cambridge Grammar of the English Language. Cambridge: CUP.
Krug, M. 2000. Emerging English Modals: A Corpus-Based Study of Grammaticalization. Berlin: Mouton de Gruyter.
Kujore, O. 1985. English Usage: Some Notable Nigerian Variations. Ibadan: Evans.




Labov, W. 1984. Field methods of the Project on Linguistic Change and Variation. In Language in Use: Readings in Sociolinguistics, J. Baugh & J. Sherzer (eds), 28–53. Englewood Cliffs NJ: Prentice-Hall.
Leech, G., Hundt, M., Mair, C. & Smith, N. 2009. Change in Contemporary English: A Grammatical Study. Cambridge: CUP.
Leitner, G. 1991. The Kolhapur Corpus of Indian English: Intra-varietal description and/or intervarietal comparison. In English Computer Corpora: Selected Papers and Research Guide, S. Johansson & A. Stenström (eds), 215–232. Berlin: Mouton de Gruyter.
Lim, L. & Foley, J.A. 2004. English in Singapore and Singapore English: Background and methodology. In Singapore English: A Grammatical Description [Varieties of English around the World G33], L. Lim (ed.), 1–18. Amsterdam: John Benjamins.
Lock, G. 1996. Functional English Grammar: An Introduction for Second Language Teachers. Cambridge: CUP.
Mindt, D. 1995. An Empirical Grammar of the English Verb: Modal Verbs. Berlin: Cornelsen.
Mühleisen, S. 2001. Is “Bad English” dying out? A diachronic comparative study of attitudes towards Creole versus Standard English in Trinidad. PhiN 15: 43–78. 〈http://web.fu-berlin.de/phin/phin15/p15t3.htm〉 (19 August 2010).
Nelson, G. 1996. The design of the corpus. In Comparing English Worldwide: The International Corpus of English, S. Greenbaum (ed.), 27–53. Oxford: Clarendon.
Nelson, G. 2003. Modals of obligation and necessity in varieties of English. In From Local to Global English: Proceedings of the Style Council 2001/2, P. Peters (ed.), 25–32. Sydney: Macquarie University, Dictionary Research Centre.
Nihalani, P., Tongue, R.K. & Hosali, P. 1979. Indian and British English: A Handbook of Usage and Pronunciation. Delhi: OUP.
Nkemleke, D. 2007. Frequency and use of modals in Cameroon English and application to language education. Indian Journal of Applied Linguistics 33: 87–105.
Palmer, F.R. 1990. Modality and the English Modals. 2nd ed. London: Longman.
Quirk, R., Greenbaum, S., Leech, G. & Svartvik, J. 1985. A Comprehensive Grammar of the English Language. London: Longman.
Sand, A. 1999. Linguistic Variation in Jamaica: A Corpus-Based Study of Radio and Newspaper Usage. Tübingen: Narr.
Sand, A. 2005. Angloversals? Shared Morpho-Syntactic Features in Contact Varieties of English. Postdoctoral thesis, University of Freiburg.
Schiffrin, D. 2006. Discourse. In An Introduction to Language and Linguistics, R.W. Fasold & J. Connor-Linton (eds), 169–203. Cambridge: CUP.
Schmied, J. 1996. Second-language corpora. In Comparing English Worldwide: The International Corpus of English, S. Greenbaum (ed.), 182–196. Oxford: Clarendon.
Schneider, E.W. 1999. Notes on Singaporean English. In Form, Function and Variation in English: Studies in Honour of Klaus Hansen, U. Carls & P. Lucko (eds), 193–205. Bern: Peter Lang.
Schneider, E.W. 2007. Postcolonial English: Varieties around the World. Cambridge: CUP.
Sedlatschek, A. 2009. Contemporary Indian English: Variation and Change [Varieties of English around the World G38]. Amsterdam: John Benjamins.
Sey, K. 1973. Ghanaian English: An Exploratory Survey. London: Macmillan.
Shields-Brodber, K. 1997. Requiem for English in an “English-speaking” community: The case of Jamaica. In Englishes around the World, Vol. II: Caribbean, Africa and Australasia. Studies in Honour of Manfred Görlach [Varieties of English around the World G19], E.W. Schneider (ed.), 57–67. Amsterdam: John Benjamins.
Solomon, D. 1993. The Speech of Trinidad: A Reference Grammar. St Augustine: UWI School of Continuing Studies.
Szmrecsanyi, B. 2003. BE GOING TO versus WILL/SHALL: Does syntax matter? Journal of English Linguistics 31: 295–323.
Tent, J. 2001. A profile of the Fiji English lexis. English World-Wide 22(2): 209–245.
Tent, J. & Mugler, F. 2008. Fiji English: Phonology. In Varieties of English, Vol. III: The Pacific and Australasia, K. Burridge & B. Kortmann (eds), 234–266. Berlin: Mouton de Gruyter.
Tickoo, M.L. 1996. Fifty years of English in Singapore: All gains, (a) few losses? In Post-Imperial English: Status Change in Former British and American Colonies, 1940–1990, J.A. Fishman, A.W. Conrad & A. Rubal-Lopez (eds), 431–455. Berlin: Mouton de Gruyter.
Trudgill, P. & Hannah, J. 2002. International English: A Guide to Varieties of Standard English. London: Arnold.
Tschichold, C. 2002. Learner English. In Perspectives on English as a World Language, D.J. Allerton, P. Skandera & C. Tschichold (eds), 125–133. Basel: Schwabe.
Vaish, V. 2008. Biliteracy and Globalization: English Language Education in India. Clevedon: Multilingual Matters.
Ward, G., Birner, B.J. & Kaplan, J.P. 2003. A pragmatic analysis of the epistemic would construction in English. In Modality in Contemporary English, R. Facchinetti, M. Krug & F. Palmer (eds), 71–79. Berlin: Mouton de Gruyter.
Winford, D. 1997. Re-examining Caribbean English creole continua. World Englishes 16: 233–279.
Youssef, V. 1990. The Development of Linguistic Skills in Some Trinidadian Children: An Integrative Approach to Verb Phrase Development. Ph.D. dissertation, University of the West Indies, St Augustine.
Youssef, V. 1996. Varilingualism: The competence underlying codemixing in Trinidad and Tobago. Journal of Pidgin and Creole Languages 11(1): 1–22.
Youssef, V. 2004. “Is English we speaking”: Trinbagonian in the twenty-first century. Some notes and comments on the English usage of Trinidad and Tobago. English Today 80: 42–49.

Progressives in Maltese English
A comparison with spoken and written text types of British and American English

Michaela Hilbert & Manfred Krug
University of Bamberg

In investigating progressives in Maltese English we use data from a sub-sample of ICE-Malta and parallel British and American texts, as well as questionnaires from these countries. Progressives in written Maltese English, with one exception, broadly follow the British exonormative standard. However, substantial differences are evident in spoken texts, including the extension of progressives to stative verbs. But the differences cannot generally be explained in terms of overuse and extension. Our data confirm previous comparative studies in that the use of progressives in Maltese English, too, is subject to text type and formality considerations. Two additional important factors are language contact (e.g. the existence of parallel constructions and imperfectives in Maltese) and language-internal constraints (e.g. a cline between dynamic and stative verbs).

Keywords: Maltese English; progressive; stative verb; English as a Second Language; language contact

1.  Introduction

Non-standard uses of the progressive be V-ing construction have often been reported to occur in varieties of English, including both the major English as a Native Language (ENL) varieties, British (BrE) and American (AmE) English, and English as a Second Language (ESL) varieties such as Indian English and Singapore English. Variation in the ENL varieties often seems to involve an extension of the construction to future meaning, whereas ESL varieties – apart from reportedly using an overall higher frequency of the progressive – seem to use the construction not only with dynamic verbs but also with stative verbs (see Platt, Weber & Ho 1984). An extension to stative verbs could thus be assumed to be the underlying reason for a higher overall use of the progressive construction in ESL varieties. However, in more recent comparative studies, this tendency proves to be far from straightforward: Hundt and Vogel (2011), for instance, show that the progressive does not occur significantly more frequently in written texts in ESL varieties than in some ENL varieties. The complicating factor with regard to non-standard uses of the progressive in ESL varieties seems to be that they are often still subject to exonormative influence, not only in written text types.

In this paper, we will look at the use of the progressive in Maltese English (MaltE), a comparatively small and young variety of English. Results from Maltese English could contribute some interesting insights into this topic: Maltese English is an ESL and thus a contact variety of English, with widespread bilingualism throughout the speech community. It has a strong exonormative tradition on the one hand and shows incipient development of variety-specific features on the other. Among the latter, non-standard uses of the progressive have already been reported (see Mazzon 1992). In Schneider’s (2003, 2007) dynamic model of English varieties, Maltese English is classified as being in the “nativization” stage (Thusat et al. 2009). This enables us to study the development of variety-specific features and perhaps the very beginnings of endonormative stabilization. One possible outcome of this stage is that some non-standard features show interesting distributions in Maltese English: whereas speakers seem to be conservative across all genres with regard to the use of widespread lexical and syntactic non-standard and colloquial features of Present-day English (such as contractions, invariant tag questions, innovative second person plural pronouns), variety-specific features resulting from transfer from Maltese seem to be more sensitive to text type and are avoided in more formal written but used in informal spoken Maltese English (see Krug, Fabri & Hilbert forthcoming).
Our aim in this paper is to explore whether Maltese English uses progressives more frequently than or differently from British English and from other ESL varieties of English. Also, in order to assess the range of traditional exonormative and incipient endonormative influences, we will compare the occurrence of progressives in both written and spoken texts (on the basis of news reports, editorials and private conversations). The results can be expected to have important descriptive implications regarding the nature of Maltese English and, more generally, regarding the identification of typical characteristics of ESL varieties.

2.  Outline and contextualization of Maltese English

English in Malta is a co-official language alongside Maltese, a Semitic language written in the Latin alphabet, with a high proportion of Italian and Sicilian loanwords and an increasing number of English borrowings. In 2002, two years before Malta’s accession to the European Union, Maltese became an official language of the EU. The variety of English spoken in Malta, Maltese English, can be and has often been described as an “Outer-Circle” or “English as a Second Language” variety of English in Kachru’s (1986) well-known concentric-circles model of English varieties, which is based on the criteria of historical spread, acquisition patterns and its current status and functional allocation in Malta. Maltese English is situated between the “Inner Circle” or “English as a Native Language” and “Expanding Circle” or “English as a Foreign Language” (EFL) varieties of English (see e.g. Görlach 2002: 211).

Maltese is the native language for over 95% of the population, which currently totals an estimated 410,000 (roughly 380,000 of whom live on the main island, Malta, and some 30,000 on Gozo). Overall, some 90% of the population claim competence in English, although degrees of proficiency vary considerably and can be said to range from ENL to EFL.1 This is a common phenomenon in ESL varieties and has been cited as one of the problems inherent in Kachru’s model of concentric circles (Jenkins 2003: 17). Bongartz and Buschfeld (2011) discuss this problem for Cyprus English (CyE), another young island variety of English which (although not an official language) bears some similarity to Maltese English regarding its history. Bongartz and Buschfeld (2011: 35) claim “hybrid ESL–EFL status” for the variety and tentatively suggest the possibility of CyE undergoing a reversal from ESL to EFL. This suggestion is primarily based on their observation that the linguistic behaviour of older and younger speakers seems to display notable differences. The same (anecdotal) observation could be made for Malta and must be looked at carefully in future studies. Hence, while we are sceptical regarding the universal applicability of the categories ESL, ENL and EFL, we use them as broad labels for convenience and for expository purposes, always keeping in mind that qualifications might be necessary.
To anticipate some of our conclusions, we will see that different usage conventions apply to different styles and codes (e.g. spontaneous spoken vs. edited press language).

The vast majority of the population of Malta acquire English as a second language. Some five per cent use both English and Maltese as their main languages at home.2 This will often come in the form of the one-parent-one-language strategy or in the form of varying language use with parents, siblings and grandparents. L1-speakers of English are found primarily in the higher socio-economic strata; in addition, usage rates of English vary regionally along the typical urban–rural cline. Particularly high rates are reported for such districts as Sliema and neighbouring St Julian’s, where both tourism and affluent households are concentrated.

1.  Needless to say, the term EFL itself represents an extremely heterogeneous class of levels ranging from incipient learner to near-native competence (where “near-native” in some cases and registers can indeed be higher than the average native-speaker’s level).
2.  The proportion of people claiming English to be their only native language is roughly 1%.

Table 1.  Native languages and multilingualism in Malta

Languages spoken in Malta*             Native languages in Malta**
Language    Speakers      %            Language      %
Maltese      354,664   97.9            Maltese    98.6
English      318,354   87.9            English     1.2
Italian      205,375   56.7            Italian     0.2
French        75,914   20.9
German        20,110    5.5
Arabic        14,046    3.9
Other         15,159    4.2

*2005 census data; all degrees of proficiency considered
**Representative sample of 500 speakers. Data collected by Sciriha & Vassallo in 2001, adapted from Sciriha & Vassallo 2006: 26.
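As a rough plausibility check on the census panel of Table 1, each speaker count can be divided by its percentage to recover the population base that the figures imply. The following is a minimal sketch in Python; the names CENSUS and implied_base are ours, and the base itself is not stated in the source, so the result is only an inference from rounded percentages.

```python
# Speaker counts and percentages from the census panel of Table 1
# (2005 census data; all degrees of proficiency considered).
CENSUS = {
    "Maltese": (354_664, 97.9),
    "English": (318_354, 87.9),
    "Italian": (205_375, 56.7),
    "French": (75_914, 20.9),
    "German": (20_110, 5.5),
    "Arabic": (14_046, 3.9),
    "Other": (15_159, 4.2),
}


def implied_base(speakers, percent):
    """Return the population base implied by a count and its percentage."""
    return speakers / (percent / 100)


# The percentages are rounded to one decimal place, so the implied
# bases can only agree approximately.
bases = {lang: implied_base(n, pct) for lang, (n, pct) in CENSUS.items()}

for lang, base in bases.items():
    print(f"{lang:<8} {base:>10,.0f}")
```

All seven rows point to a base of somewhat over 360,000, which is below the estimated total population of 410,000 cited above. This would be compatible with the census percentages referring to a subset of the population, but the source does not say so, and that reading remains an assumption.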

As can be seen from Table 1, bi-, tri- and multilingualism are widespread in Malta, an aspect that is well documented in the relevant literature (e.g. Camilleri 1991; Sciriha & Vassallo 2001). This is primarily due to historical language contact and to the importance of tourism in the country. Obviously, language and education policies play a role as well: schooling in Malta is mandatory until the age of 16. There is no official policy on the classroom use of languages, but the National Minimum Curriculum from 1999 issued by the Ministry of Education emphasizes the importance of English and Maltese as official languages and states further that pupils in secondary schools are expected to learn a third or fourth language. Both Maltese and English are used from school entry, but Maltese is naturally more prominent in primary schools, while English is more prominent in secondary schools and dominant at tertiary level. There is a tendency for state schools to use less English than Catholic and private (so-called independent) schools. The language of instruction also depends to a great extent on the language of the textbooks, most of which are in English. In particular, most reading and writing is done in English. Conversely, the subjects Maltese and History are almost exclusively taught in Maltese. In spoken interaction, code-switching is widespread among both pupils and teachers, although a change in teacher education has shifted the balance somewhat towards Maltese: until the 1970s, teachers were trained by British religious orders, but more recently teachers have been trained by Maltese native-speaker scholars at the University of Malta, which has generally led to an increasing proportion of Maltese in spoken classroom interaction.




A second problem with categorizing Maltese English as an Outer-Circle variety based on Kachru’s criteria is that it should be “norm-developing”. Personal observation and the linguistic evidence we will give below, however, show Maltese English to be fundamentally influenced by the British English norm. Also, there is no sign of any institutionalization of Maltese English yet.

The current status of Maltese English as a young ESL variety with variety-specific features in their initial stages makes it difficult to locate it in other models of global English. Often, it will occupy a position at the very margins of the categories reserved for “other local varieties” (other than the ENL and older ESL varieties). Modiano’s (1999) model of international English, for instance, distinguishes only between non-localized and localized sets of features: non-localized features define English as an International Language (EIL), which is made up of the core features of English (see Figure 1). A second layer around EIL includes features that are not yet part of this core. Localized features then determine the Outer Circle, which consists of the major varieties (British English, American English, etc.) and other local varieties, each with specific features that are not understood by speakers of other local varieties. Maltese English might be currently developing its own variety-specific features, most of which, however, will be understood by speakers of other varieties of English, apart from borrowings from Maltese. Also, with regard to number, pervasiveness and codification of these features, Maltese English has not yet developed variety-specific features to the extent that other local varieties such as Indian English have.

[Figure 1: diagram with the labels “The common core”, “EIL”, “American English”, “British English”, “Major varieties CAN, AUS, NZ, SA”, “Other varieties” and “Foreign language speakers”]

Figure 1.  Modiano’s (1999) model of international English

Since Maltese English seems to be very much at the margins of categories in Kachru’s and Modiano’s models, Schneider’s (2003, 2007) dynamic model is more suitable for categorizing this developing and smaller variety of English. Schneider’s model distinguishes five distinct phases in the development of English varieties: (1) foundation, (2) exonormative stabilization, (3) nativization, (4) endonormative stabilization and (5) differentiation. The parameters of classification involve socio-political background, identity constructions, sociolinguistic conditions and linguistic effects. With regard to the first two of these criteria, Maltese English can be classified between Phases 4 and 5, between endonormative stabilization and differentiation: socio-politically, Malta is an independent and culturally mostly self-reliant country, with a regionally-based identity and developing or partly established internal socio-political heterogeneity. Linguistically, on the other hand, Maltese English seems to be at least one stage behind in Phase 3, nativization. Bilingualism is widespread and, most importantly, there is a noticeable continuum between innovative speakers who adopt local forms of English, and conservative speakers who uphold the external norm. Linguistically, we find local phonological and syntactic innovations due to transfer from Maltese (see Mazzon 1992; Vella 1994; Camilleri 2004; Schembri 2005), such as varying prepositional usage and innovative verb complementation patterns, in addition to lexical innovation and code-mixing. The next phase in Schneider’s model would be the acceptance of a local norm as a carrier of identity. This might well be under way in Malta, but the other criteria in this stage, such as the stabilization and coding of the local variety, e.g. in dictionaries, and literary creativity in the new variety, do not appear to have started yet.

Thus, for the following analysis of the progressive in Maltese English, we hypothesize its use to be very close to the British English norm, at least in written texts. Qualitative and quantitative differences are likely to occur mostly in informal spoken English, with possible influence and transfer from Maltese. In the following section, we will give an outline of previous research into the use of the progressive in other ESL varieties of English, as well as ways in which we can expect Maltese to have an effect on Maltese English in the relevant grammatical domain.

3.  Previous research on progressives

L2 varieties of English have often been reported to use the progressive more frequently than most ENL varieties, and the argument usually revolves around the extension of the construction to stative verbs (see Platt, Weber & Ho 1984). Using exam scripts and essays, Hundt and Vogel (2011) show that, contrary to this belief, speakers of ESL and EFL varieties do not use more progressives than speakers of ENL varieties across the board. Moreover, no gradient pattern ranging from ENL via ESL to EFL varieties emerges from their study. As to the combination of the progressive with stative verbs, Hundt and Vogel (2011) find that this is relatively rare in all three types of English varieties, except in Fiji English. In the few reported cases, the stative verbs adopted a more dynamic meaning, which is in line with standard grammars (see Section 4 below). They also show, however, that it is in spoken rather than written data that more stative progressives occur in ESL Englishes (Singapore is mentioned as an example; Hundt & Vogel 2011).

The apparent text-type sensitivity aside, what prima facie looks like a “universal” overuse of the progressive can pattern very differently in individual varieties: Sharma (2009), for instance, shows that even though Indian English and Singapore English both overuse the progressive, the constructions differ fundamentally in the two varieties and can only be understood against the backdrop of the respective substrate languages. Her conclusion therefore is that “a universalist interpretation of such levelling cannot explain the discrepancy between IndE and SgE usage” (Sharma 2009: 183).

Unlike ESL varieties, EFL varieties do not seem to extend the progressive to new contexts such as stative verbs, but tend to overuse the prototypical constructions, based on verb types that are most commonly associated or taught with the progressive (Westergren Axelsson & Hahn 2001; van Rooy 2006; Hundt & Vogel 2011). Hundt and Vogel (2011) conclude that “the tolerance towards combinations of the progressive with stative verbs seems to be stretched in the ESL varieties. This is not the case in Learner English”. They doubt that this stretched tolerance is the reason for the high number of progressives found in some varieties, however, and suggest instead that the more influential change in this respect is the use of the progressive in ESL varieties in contexts in which ENL varieties would prefer simple or perfect aspect (e.g. after adverbials such as ever since, this is the first time).

With regard to Maltese English, Mazzon confirms Platt, Weber & Ho’s (1984) claim regarding the extension of the progressive to stative verbs but she suggests that it is primarily a phenomenon of speech (Mazzon 1992: 139) as seen in examples (1) and (2).

(1) As I was seeing… (Swithun 1961: 43; quoted from Mazzon 1992: 139)

(2) Are you understanding my point? (Navarro & Grech 1984: 10 [unpublished manuscript]; quoted from Mazzon 1992: 139)

Mazzon relates this phenomenon to the Maltese substrate, which has a construction involving the Maltese verb for stay followed by the imperfective form of the verb. She provides anecdotal evidence of Maltese English speakers producing constructions on the basis of this (examples (3) and (4)).

(3) I stayed playing. (Mazzon 1992: 139)

(4) Don’t stay walking on the grass. (Dimech 1973: 11; quoted from Mazzon 1992: 139)

Maltese has been described as an “aspect prominent language” (Spagnol 2009: 55), and aspect is closely linked to semantic distinctions in the verb, notably stative vs. dynamic and durative vs. punctual (Spagnol 2009: 81). The difference between stative and dynamic verbs is salient, for instance, when they occur in the imperfective, in which case dynamic verbs are interpreted as habituals, whereas stative verbs are interpreted as having present time reference (Spagnol 2009: 64, 79). Maltese distinguishes between three aspects: imperfective (unmarked), perfective and progressive, with the last being “a type of imperfective” (Spagnol 2009: 79). The progressive is constructed with qiegħed (the active participle of qagħad ‘stay’) or qed (a shortened form of qiegħed), followed by the imperfective form of the verb. Semantically, in addition to indicating an ongoing and temporary process, the Maltese construction can also have a habitual meaning “expressing a habit restricted in terms of the period of time to which the action applies” (Spagnol 2009: 80). This interpretation of the progressive in terms of a temporary habit can also be found in English, as in The professor is typing his own letters while his secretary is ill (Quirk et al. 1985: 199).

A synthetic alternative to this construction is the expression of the progressive with active participles, which are derived from verbs of motion (Spagnol 2009: 79ff.) and occur in verbless sentences. These cannot express temporally restricted habituality (which the analytic construction can), but participles derived from achievement (telic, punctual) verbs like publish can indicate future time (Spagnol 2009: 80f.). This does not apply to other participles, such as those derived from durative verbs (e.g. walk around). Spagnol relates this to English verbs in the progressive with future time reference, such as She is coming to my birthday party (Spagnol 2009: 81).

Spagnol observes that stative verbs do not occur in the progressive unless they express some dynamic meaning in a given context. Punctual verbs (unlike durative verbs) do not normally occur in progressive constructions either; hence the ungrammaticality of *ħija qed jasal id-dar ‘my brother is arriving home’. There are some exceptions, however. Firstly, achievement verbs are acceptable in the progressive when the argument of the verb expresses an indefinite plural (as in qed jaslulna ħafna ittri ‘we are receiving many letters’). Secondly, semelfactives like għatas (‘to sneeze’) or nfaqa’ (‘to burst’) in the progressive either receive a multiple-event reading (e.g. ‘sneeze a number of times’), or again their arguments express an indefinite plural (as in il-bżieżaq qed jinfaqgħu ‘the balloons are bursting’; Spagnol 2009: 76ff.). Finally, like the progressive in English, the Maltese progressive freely combines with the passive, but not with the imperative (Spagnol 2009: 81).3

.  Thanks are owed to Mike Spagnol for providing us with comments on this paper and with Maltese examples beyond the ones given in Spagnol (2009).




The hypotheses to be inferred from this are complex. On the one hand, Maltese English has been reported to follow the tendency of New Englishes to extend the use of the progressive to stative verbs. On the other hand, the substrate language Maltese should reinforce the semantic distinction between stative and dynamic verbs in Maltese English, including the semantic changes that stative verbs undergo when they are used in the progressive. If transfer figures as a relevant factor, we can also expect Maltese English speakers to use the progressive with future time reference, at least with achievement verbs.

Our central research questions, then, are the following:

–– Does MaltE use more progressives than BrE?
–– Does MaltE use progressives differently from BrE? More specifically, does the progressive figure in different contexts from those known for Standard British and American English (e.g. co-occurrence with different tenses, aspects, aktionsarten, modals, diatheses, verb types, subject types, adverbials)?
–– If MaltE differs qualitatively or quantitatively from BrE, is it Maltese substrate influence that can account for the differences or are there alternative and additional explanatory factors?
–– How do MaltE progressives fit with Hundt and Vogel’s (2011) findings for other varieties of English?
–– Are there differences between text types and between the spoken and written codes?

Due to the text-type sensitivity of contact features in Maltese English found in another study (Krug, Fabri & Hilbert forthcoming), our working hypothesis regarding the last question is: the more formal the text type, the fewer non-standard progressives there will be. We do not expect the recent spread of progressives to written genres in ENL varieties as shown in Leech et al. (2009: ch. 6) to figure prominently in printed Maltese English, for this variety has proved rather resistant to the adoption of what were until recently informal features, like contractions, which are spreading to more formal genres in British and American English (see also Krug, Fabri & Hilbert forthcoming).

The answers to the above questions can be expected to have important descriptive implications regarding the nature of Maltese English and, more generally, regarding the identification of typical characteristics of ESL, ENL and EFL varieties. We want to explore, for instance, whether it is possible for a variety like MaltE to exhibit features of both ESL and EFL. Ideally, the present findings can contribute to a theoretical discussion on the adequacy of available classifications of varieties of English including the question to what extent ESL status is compatible with exonormative pressures.

 Michaela Hilbert & Manfred Krug

4.  The variable: Definition and constraints

The progressive in Standard English is formed with a form of be and the -ing form of the main verb. Its semantics is typically described in terms of the contrast to the second major aspect category, the perfective, and can thus be described as “imperfective”. In actual fact, however, this definition is not always applicable. Quirk et al. state that

aspect is so closely connected in meaning with tense, that the distinction in English grammar between tense and aspect is little more than a terminological convenience which helps us to separate in our minds two different kinds of realisation: the morphological realization of tense and the syntactic realisation of aspect. (Quirk et al. 1985: 189)

Generally, the progressive aspect in English denotes that the given event

a. has duration;
b. is limited in its duration;
c. is not necessarily completed.
(Quirk et al. 1985: 197–198; see also König & Lutzeier 1973)

Uses differ depending on the tense of the verb phrase, though. In the present tense, the progressive indicates that something is true at a given particular point in time rather than a permanent characteristic or state (for which the simple present is used; Quirk et al. 1985: 197). In the past tense, however, the difference between simple and progressive aspect is rather one of focus or perspective: the simple aspect focuses on the event as a whole, the progressive stresses the activity in process, regardless of possible results, outcomes or consequences (see Quirk et al. 1985: 197).

On the basis of these semantic characteristics of the progressive aspect, Quirk et al. find constraints on the use of the progressive with certain types of verbs. The most commonly reported and frequent constraint concerns the combination of the progressive aspect with stative verbs (Quirk et al. 1985: 198) as in examples (5) to (7).

(5) *He is knowing English.

(Quirk et al. 1985: 75)



(6) *We are owning a house in the country.

(Quirk et al. 1985: 198)



(7) *Sam’s wife was being well-dressed.

(Quirk et al. 1985: 198)

The distinction between stative and dynamic verbs is not clear-cut in all contexts. Firstly, many stative verbs can be used with a dynamic meaning (examples (8) and (9)).






(8) I have a car. (stative meaning: ‘possess’)



(9) I am having breakfast. (dynamic meaning: ‘eat’)

Secondly, a stative verb can be used with the progressive to stress the temporary nature of the situation (examples (10) and (11) from Quirk et al. 1985: 198–199):

(10) We are living in the country. (temporary)

(11) We live in the country. (permanent)

These two contextual meanings may also figure with the copula be. Consider the contrast in examples (12) and (13).

(12) Peter is awkward. (permanent characteristic)

(Quirk et al. 1985: 200)

(13) Peter is being awkward. (temporary behaviour)

(Quirk et al. 1985: 200)

According to Quirk et al. (1985: 201–202), verbs denoting qualities and states (be, have, love, resemble, own, think) need the dynamic interpretation in addition to the temporariness of the situation for them to be able to combine with progressive aspect. For the subgroup “private states” (i.e. states of mind, volition and attitude), the progressive can indicate politeness or tentativeness (examples (14) and (15) from Quirk et al. 1985: 202–203):

(14) What were you wanting?

(15) I was hoping you would give me some advice.

On a similar note, certain types of verbs are unlikely to occur with the progressive, for instance verbs of perception in the copular construction such as look, sound, feel, smell, taste (Quirk et al. 1985: 203–204). This contrasts sharply with their dynamic uses, as is exemplified by (16) and (17).

(16) This looks nice. (copular)

(17) He is looking at me. (dynamic)

So-called “verbs of being and having” (Quirk et al. 1985: 205) like contain, hold, matter, depend, resemble, belong represent yet another type of verb that is unlikely to occur with the progressive aspect (see examples (18) to (20)).

(18) *The box is containing a necklace.

(Quirk et al. 1985: 205)

(19) ?He is holding a degree in linguistics. (stative possession or stable characteristic)

(20) He is holding a stick. (dynamic)


A similar type of constraint relating to aktionsart affects the perfect progressive. As Quirk et al. (1985: 211) note, the perfect progressive is only marginally acceptable with punctual (or momentary) verbs (see example (21) below). This is rather unsurprising because in past contexts such verbs, at least under default conditions, fail to exhibit imperfective aspect, which is one of the typical characteristics of progressives in English given in (a) to (c) at the beginning of this section. Consider the contrast between present and past contexts in examples (21) and (22).

(21) ?He has been starting the engine.

(22) He is starting the engine.

In addition to these semantic factors, other constraints have been found to influence the occurrence of the progressive aspect in English. Firstly, there is an overall low frequency of the perfect progressive: in Biber et al. (1999: 461–462) the perfect progressive is rare across all registers and occurs only in 0.5% of all verb phrases. Secondly, highly complex verb constructions involving the progressive seem to be avoided, particularly the perfect progressive passive, which according to Quirk et al. (1985: 213) “is felt to be awkward” (see examples (23) and (24)).

(23) The road has been being repaired for months.

(Quirk et al. 1985: 213)

(24) Seats have not been being won by the Conservatives lately.  (Quirk et al. 1985: 167)

The progressive passive, however, has been increasing in British English, whereas it has decreased in American English, possibly due to American style guides shunning the construction (see Hundt 2004; Smith & Rayson 2007; Hundt 2009; Hundt & Dose forthcoming). Thirdly, constraints with regard to tense have been shown to be largely genre-dependent: the progressive tends to combine with the present tense in the large majority of texts in conversation, news and academic prose, whereas in fiction the past tense dominates (Biber et al. 1999; Leech et al. 2009; Hundt & Dose forthcoming). In the latter, first-person narration has been shown to include more progressives, possibly due to its closeness to speech (or the representation thereof) and the notion that innovations originating in speech surface first in this particular genre (Hundt & Dose forthcoming). Fourthly and finally, constraints seem to occur with regard to dialect: Biber et al. (1999: 462) find that the progressive is more frequent in American English than in British English, particularly in conversation. In news reportage, however,




the frequencies of the progressive aspect in the two varieties are roughly the same (see Hundt & Dose forthcoming).

5.  Data

The written material used in this study includes the press reports and editorials compiled for the Maltese component of the International Corpus of English (hereafter ICE-MTA). They were collected from the Internet websites of four major Maltese newspapers published in English: Malta Today, The Independent, Business Today and The Times of Malta. The individual texts were selected on the basis of the following ICE criteria: (a) publication after 2005; (b) exclusion of (parts of) news agency publications; (c) preference for the Local News section; and (d) the author is Maltese. The press report and editorial material has been expanded to 100,000 words for an extra-ICE-MTA newspaper corpus (60,000 words of press reportage, 40,000 words of editorials), for which parallel corpora for the UK (The Times, The Guardian, The Independent) and the Channel Islands (Jersey Evening Post) have been compiled. These extended newspaper corpora are the basis of this study.

The spoken material used is part of the “private conversations” data of ICE-MTA. Since compilation and transcription are still in progress, a subset of approximately 30,000 words has been used here. Private conversations were recorded during a field trip, when German students recorded half-hour stretches of English conversations between Maltese university students and members of their families as well as their friends. The drawback of this method is that in both these contexts the language of choice would be Maltese for the overwhelming majority of the population, so that a certain unnaturalness attaches to these conversations in English. This is a problem for ESL corpora in general. In the presence of a speaker with no competence in Maltese, however, the language of choice in Malta is English, so that the situation is not entirely artificial.
The collection of spoken material is subject to the challenges arising from the sociolinguistic situation in Malta (see above). Most everyday conversations among the Maltese speech community take place in Maltese, which is also the language of the government and the court. Most TV stations broadcast exclusively in Maltese, since English programmes are available from British channels such as BBC, ITV and others (the same is true for Italian programmes).

In addition to the corpus material, we use questionnaire-based data. The present questionnaire for investigating the use of lexical and grammatical items was developed in the English Linguistics Department at the University


of Bamberg in 2007 and in its grammatical section contains some of the 76 features from Kortmann and Schneider (2004) as well as features that have been reported for English in Malta, Gibraltar and the Channel Islands, which form part of our larger project on (pen)insular varieties with Romance–English language contact.

Progressive aspect constructions were retrieved from the database by means of the Wordsmith software. We included all forms of be + V-ing, allowing for up to five elements in between. In order to produce figures comparable to the studies by Hundt and Dose (forthcoming), and Hundt and Vogel (2011), the results were normalized per million words (pmw) and manually post-edited to exclude examples like the following (all examples from our material):

–– Non-finite verbal constructions except the progressive infinitive
   I think there’s a danger of us couching this as ‘no to men’ and stopping talented men, when it should be about recognising talented women are being put off going into politics.
   John is a father, and going along and seeing the fathers of youngsters who had been killed, obviously he could empathise with that
–– be going to with future reference
   We are not going to be able to do our mission in Afghanistan through tanks and helicopters alone.
–– adjectival participles
   She has set out her stall as a radical, but her record is unconvincing.
   The move was both promising and reassuring.
–– be + gerund
   Whether it is pioneering open primaries to select our parliamentary candidates, or using new technology to give the public power through access to government information, we are the ones setting the progressive pace in politics.
–– appositive participles
   Unison general secretary Dave Prentis warned that social work vacancies were at “danger level”, running at an average of 12% across the UK.
–– incomplete verb phrases (e.g. be + coordinated adjective and -ing participle)
   He is now 61 and working as a computer programmer.
–– Coordinated progressives were only counted once. An example is given below.
   I wish they showed a fraction of this concern for the living ‘dead’ – the population in Gaza, who their Jewish brothers have been bombing and starving to death.

In the remainder of this paper, we will look at how progressives are used in MaltE press language and spoken MaltE conversations. In addition, we will compare our




news data with parallel British corpus data, and we will contrast MaltE conversations with findings from other varieties of English.
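The retrieval step described in Section 5 (all forms of be followed by a V-ing form, with up to five intervening elements, then normalization per million words) can be illustrated with a short Python sketch. This is not the Wordsmith procedure used for the study: the tokenizer, the list of be forms, the crude -ing test and the function names `progressive_candidates` and `per_million` are all assumptions made for illustration. Like any pattern search of this kind, it deliberately over-generates, which is why manual post-editing of the hits is needed.

```python
import re

# Forms of "be" that can introduce a progressive (illustrative list, an assumption).
# "being" is left out on purpose: it does not itself introduce a progressive.
BE_FORMS = {"am", "are", "is", "was", "were", "be", "been", "'m", "'re", "'s"}


def tokenize(text):
    """Lower-case the text and split it into words and clitics ('m, 're, 's)."""
    text = text.lower().replace("\u2019", "'")
    return re.findall(r"[a-z]+|'[a-z]+", text)


def progressive_candidates(text, max_gap=5):
    """Return (be form, -ing form) pairs with at most max_gap tokens in between."""
    tokens = tokenize(text)
    hits = []
    for i, tok in enumerate(tokens):
        if tok not in BE_FORMS:
            continue
        # Look at the next max_gap + 1 tokens, i.e. up to max_gap intervening elements.
        for nxt in tokens[i + 1 : i + 2 + max_gap]:
            # Crude test for an -ing form; nouns like "thing" will also match.
            if nxt.endswith("ing") and len(nxt) > 4:
                hits.append((tok, nxt))
                break
    return hits


def per_million(hits, corpus_words):
    """Normalize a raw frequency to occurrences per million words (pmw)."""
    return round(hits / corpus_words * 1_000_000)


if __name__ == "__main__":
    print(progressive_candidates("we have been receiving hundreds of telephone calls"))
    print(progressive_candidates("We are not going to be able to do our mission"))
    print(progressive_candidates("He is now 61 and working as a computer programmer."))
    print(per_million(296, 60_000))
```

The first sentence yields a genuine perfect progressive (been + receiving). The second and third yield are + going and is + working, which correspond to the be going to future and the incomplete verb phrase that the exclusion list above removes by hand. The last line normalizes the 296 MaltE press progressives reported in Table 2 against the 60,000-word press subcorpus.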

6.  Quantitative analysis

6.1  Maltese and British newspaper corpora

The overall numbers of progressives in the written data from Malta and the UK are very similar, both in total and with regard to the two text types, news reports and editorials (see Table 2). Unsurprisingly, then, statistically significant differences between the two varieties cannot be observed: a chi-square test (at one degree of freedom) produces a p-value close to 1.0 for press and editorials across the two varieties (p = 0.893; χ2 = 0.018).

Table 2.  Progressives in MaltE and BrE newspaper corpora

| | Raw frequencies: MaltE | Raw frequencies: BrE | Percent: MaltE | Percent: BrE | pmw: MaltE | pmw: BrE |
|---|---|---|---|---|---|---|
| Total | 454 | 420 | 100 | 100 | 4,540 | 4,200 |
| Press | 296 | 272 | 65 | 65 | 4,933 | 4,533 |
| Editorials | 158 | 148 | 35 | 35 | 3,950 | 3,700 |

This supports the hypothesis that for the overall text frequency of progressives, the Maltese usage conforms to the exonormative standard in this text type. Both varieties use the progressive more often than Leech et al. (2009) found for the BROWN family of corpora for the 1960s and 1990s. This suggests that the use of the progressive might have increased even further in the last 20 years in this text type. The results are also parallel to those in Hundt and Vogel’s (2011) study, in which ESL varieties do not universally feature more progressives than ENL varieties in students’ writing.4 Striking parallels occur also in the distribution of tenses in the two varieties, with the exception of one factor: the progressive combines with modal auxiliaries significantly more often in our Maltese data. This is supported by statistical

4.  The matrix is, of course, more complex. New Zealand English, for instance, behaves similarly to some ESL and EFL varieties with regard to the use of progressives (for details, see Hundt & Vogel 2011; but see van Rooy 2006 and Sharma 2009 discussed in Section 3 above).


tests: a chi-square test across all contexts yields significant differences between the two varieties (p = 0.008; χ2 (3df) = 11.950). If the modals are excluded, however, no significant differences between the varieties remain (p = 0.936; χ2 (2df) = 0.132), which seems obvious from the similarity of the absolute figures for past, present and non-finite in Table 3.5 In other words, the fact that the two varieties differ significantly overall is solely due to the drastically higher usage rate of progressives after modals in MaltE.

Table 3.  Progressive aspect and tense in MaltE and BrE newspaper corpora

| Tense | Raw frequencies: MaltE | Raw frequencies: BrE | Percent: MaltE | Percent: BrE |
|---|---|---|---|---|
| present | 276 | 271 | 61 | 65 |
| past | 110 | 113 | 24 | 27 |
| modal | 58 | 25 | 13 | 6 |
| non-finite | 10 | 11 | 2 | 3 |
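The chi-square statistics quoted for Tables 2 and 3 can be recomputed from the raw frequencies alone. The sketch below is a plain Pearson chi-square test of independence without continuity correction, written with the Python standard library only. The function names `pearson_chi2` and `chi2_p_value` are illustrative assumptions, and the text does not say which software the authors used. The p-values rely on the closed-form upper-tail probabilities of the chi-square distribution for one, two and three degrees of freedom.

```python
import math


def pearson_chi2(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


def chi2_p_value(chi2, df):
    """Upper-tail probability of the chi-square distribution for df = 1, 2 or 3."""
    half = chi2 / 2
    if df == 1:
        return math.erfc(math.sqrt(half))
    if df == 2:
        return math.exp(-half)
    if df == 3:
        return math.erfc(math.sqrt(half)) + math.sqrt(2 * chi2 / math.pi) * math.exp(-half)
    raise ValueError("closed form implemented for df = 1, 2, 3 only")


# Raw frequencies, columns = (MaltE, BrE).
table2 = [[296, 272], [158, 148]]                      # press, editorials
table3 = [[276, 271], [110, 113], [58, 25], [10, 11]]  # present, past, modal, non-finite
table3_no_modals = [table3[0], table3[1], table3[3]]

for name, table in [("Table 2", table2),
                    ("Table 3", table3),
                    ("Table 3 without modals", table3_no_modals)]:
    chi2, df = pearson_chi2(table)
    print(f"{name}: chi2({df}df) = {chi2:.3f}, p = {chi2_p_value(chi2, df):.3f}")
```

Under these assumptions the three results agree, within rounding, with the values reported in the text. The contrast between the second and third result also illustrates the point made above: almost the entire chi-square value for Table 3 comes from the modal row.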

Let us now take a closer look at the individual modal auxiliaries and modal constructions6 which precede the progressives (see Table 4). Will is the preferred option in both varieties, followed by may and would. Taken together, these three account in both varieties for exactly the same high proportion (84% of all occurrences) of the progressives that form part of a modal construction. Notably, will is used in almost two-thirds of all cases in the Maltese data. It is difficult to subject the figures from Table 4 to an adequate significance test. We first conflated the modals and constructions other than will, may and would under the category “other”, but still the expected absolute frequencies were for some cells too low for an ordinary chi-square test. We therefore applied Fisher’s Exact Test, which addresses such problems.7 The result is a p-value of 0.184 (χ2 (3df) = 4.675 for Fisher’s Exact Test), which suggests that the differences between MaltE and BrE press language are not statistically significant for modal usage.

5.  Due to rounding-off, not all percentages total exactly 100.

6.  For their functional equivalence with central modals and modal constructions, we include the constructions have to and be going to in our investigation, which have been labelled in the literature, inter alia, as semi- or quasi-modals, semi-auxiliaries or emerging modals (see Krug 2000: ch.1 for a detailed discussion).

7.  Special thanks are owed to Ole Schützler for his help with statistical issues.




Table 4.  Progressives following modal auxiliaries and modal constructions in MaltE and BrE newspaper corpora

| Modal | MaltE (%) (Total: 58 occurrences) | BrE (%) (Total: 25 occurrences) |
|---|---|---|
| will | 62 | 40 |
| may | 12 | 20 |
| would | 10 | 24 |
| should | 5 | 4 |
| could | 2 | 4 |
| have to | 2 | 4 |
| shall | 2 | 0 |
| ought (to) | 2 | 0 |
| must | 2 | 0 |
| going to | 2 | 0 |
| might | 0 | 4 |
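The remark that some expected frequencies remain too low for an ordinary chi-square test, even after conflating the rarer modals, can be checked with a few lines of Python. Two assumptions are made here and should be read as such. First, Table 4 reports percentages only, so the absolute counts are reconstructed by applying each percentage to the totals of 58 and 25 and rounding. Second, the threshold of 5 for an expected cell frequency is the common rule of thumb; the text does not state which criterion the authors applied.

```python
# Percentages from Table 4; the totals (58 and 25) are given in the table header.
MALTE_PERCENT = {"will": 62, "may": 12, "would": 10, "should": 5, "could": 2,
                 "have to": 2, "shall": 2, "ought (to)": 2, "must": 2,
                 "going to": 2, "might": 0}
BRE_PERCENT = {"will": 40, "may": 20, "would": 24, "should": 4, "could": 4,
               "have to": 4, "shall": 0, "ought (to)": 0, "must": 0,
               "going to": 0, "might": 4}


def to_counts(percent, total):
    """Reconstruct absolute counts from rounded percentages (an assumption)."""
    return {modal: round(p * total / 100) for modal, p in percent.items()}


def conflate(counts, keep=("will", "may", "would")):
    """Keep the three frequent modals and pool everything else as 'other'."""
    row = [counts[modal] for modal in keep]
    row.append(sum(c for modal, c in counts.items() if modal not in keep))
    return row


def expected_counts(table):
    """Expected cell frequencies under independence of rows and columns."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


malte_row = conflate(to_counts(MALTE_PERCENT, 58))  # will, may, would, other
bre_row = conflate(to_counts(BRE_PERCENT, 25))
expected = expected_counts([malte_row, bre_row])

# Count the cells that fall below the rule-of-thumb threshold of 5.
low_cells = sum(1 for row in expected for value in row if value < 5)
print(malte_row, bre_row)
print([[round(v, 2) for v in row] for row in expected])
print("cells with expected frequency below 5:", low_cells)
```

With the reconstructed counts, the rows sum to the reported totals of 58 and 25, and three of the eight cells, all in the smaller BrE sample, have expected frequencies below 5. That is the kind of situation in which an exact test is usually preferred over the chi-square approximation.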

Notice, however, that the difference in individual (but rare) modal constructions is disguised by the significance test applied. We would argue, therefore, that what is quantitatively not significant in this case is nevertheless qualitatively of importance. This is the wider range of modals other than will, would and may in the Maltese data, only half of which occur in the British data. The Maltese data, thus, display a wider variety than the UK data, and this may shed some light on the issue of classifying Maltese English as ENL, ESL or EFL. Previous research suggests that EFL varieties tend to overuse the dominant construction, as a default construction as it were, and to display less variability and creativity than ENL users (see Lorenz 1999 on the use of intensifiers like very, terribly, etc. in native and learner English; or the summary in Hundt & Vogel 2011 for progressives quoted in Section 3 above). On that count, Maltese English press language, however, displays strategies of both EFL and ENL: a higher proportion of the dominant construction (will-progressives) and a higher number of alternative patterns, i.e. more variability than the ENL variety which is (or at least was until recently) the exonormative standard. Overall, this suggests that Maltese English press language is close to the ENL pole of varieties. In addition, however, it seems tempting to speculate that the observed pattern might be typical of ESL varieties more generally – a speculation that obviously needs (and in our view deserves) further investigation.

As Table 5 shows – and as anticipated by the literature survey of Section 4 – the perfect progressive and the passive progressive are relatively rare in both varieties, with the former occurring more frequently in the UK data, and the latter more


frequently in the Maltese data. While the differences between MaltE and BrE are statistically significant at the 5% level for simple vs. perfect progressive (p = 0.013; χ2 (1df) = 6.164), the more frequent use of passive progressives in MaltE fails to reach statistical significance at p = 0.082 (χ2 (1df) = 3.024), though not by a wide margin. The trend to avoid the passive progressive in BrE could be due to an increasing prescriptive bias against the passive, which has been reported especially for the USA, and this may have reached the UK earlier than Malta (see Hundt & Dose forthcoming). Also, these results are in line with Hundt’s (2009: 301–304) finding that Outer-Circle Englishes tend to use the progressive passive relatively more frequently than Inner-Circle varieties. Since the occurrences of perfect and passive progressives are rather low, however, and since the differences regarding voice only represent a trend, rather than yielding statistically significant results, we would caution against far-reaching conclusions at this stage and would instead encourage further research into both issues. In fact, more noteworthy than the differences between the two varieties is probably the overall similarity of the absolute figures, which yield fairly similar text frequencies of the constructions in press language in the UK and Malta.

Table 5.  Perfect vs. present progressive and passive vs. active progressive in MaltE and BrE newspaper corpora

| | MaltE: Raw frequencies | MaltE: Percent | BrE: Raw frequencies | BrE: Percent |
|---|---|---|---|---|
| Total | 454 | 100 | 420 | 100 |
| simple progressive | 433 | 95 | 383 | 91 |
| perfect progressive | 21 | 5 | 37 | 9 |
| active voice | 390 | 86 | 377 | 90 |
| passive voice | 64 | 14 | 43 | 10 |

To sum up, the overall distributions of progressives in the Maltese and UK newspaper corpora are very similar and support the claim of strong exonormative influence from British English on Maltese English. The most important significant difference between the two varieties appears to be that in MaltE progressives are more frequent after modals, which in their turn are also more varied in type than in BrE press language.

6.2  Comparison of spoken and written corpus data

This section compares the use of progressives in written and spoken corpora. For this, 30,000 words from conversational data of ICE-MTA and ICE-GB (the first




15 texts of spoken private conversations, i.e. dialogues, S1A-001 to S1A-015) were analysed. The often-quoted more frequent use of the progressive in L2 varieties of English does not feature in our Maltese English newspaper data, but could be expected to do so in text types that are less subject to exonormative influence, like spoken language, notably private conversations. It is to this aspect that we now turn.

As can be seen from Table 6, however, there is again a surprising congruence to be observed for the two varieties in terms of overall frequencies of progressives in the spoken data. Approximately 5,800 progressives per million words occur in Maltese English; these compare with 5,770 progressives per million words in the British data.8 An expected finding emerges from Table 6, viz. that the progressive indeed occurs significantly more often in private conversations than in the newspaper texts in each variety. Across the two varieties, however, the major result is that both press language and conversation behave similarly in British and Maltese English.

Table 6.  Progressives in MaltE and BrE written and spoken texts

| | Written: MaltE | Written: BrE | Spoken: MaltE | Spoken: BrE |
|---|---|---|---|---|
| absolute total | 454 | 420 | 166 | 234 |
| per million words | 4,540 | 4,200 | c.5,800 | 5,770 |

Table 7 differentiates between different tenses and constructions in which the progressive occurs in our MaltE data. The tendency for the progressive to combine with modal auxiliaries found in our written data does not hold for the spoken MaltE data. As Table 7 also shows, progressives in MaltE conversations occur, at 90%, overwhelmingly (and not surprisingly) in present tense verb phrases. Needless to say, the differences between the two text types of spoken and written Maltese English are highly significant. Fisher’s Exact Test, which was again chosen for the low expected frequencies in non-finite verb phrases, produces a p-value of less than 0.001 (p = 0.000; χ2 (3df) = 52.92).

8.  The normalized text frequency for spoken MaltE is an approximate value because the data are not yet available in the final format. The preliminary finding, however, that progressives in MaltE are neither heavily over- nor underrepresented when compared to British spoken data seems noteworthy even at this early stage.


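Table 7.  Progressive aspect and tense in MaltE written and spoken texts

| | Written MaltE: Raw frequencies | Written MaltE: Percent | Spoken MaltE: Raw frequencies | Spoken MaltE: Percent |
|---|---|---|---|---|
| Total | 454 | 100 | 166 | 100 |
| present | 276 | 61 | 149 | 90 |
| past | 110 | 24 | 13 | 8 |
| modal | 58 | 13 | 4 | 2 |
| non-finite | 10 | 2 | 0 | 0 |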

Table 8 suggests that the perfect progressive and the passive voice are slightly less frequent in the spoken than in the written data, but the differences are not statistically significant at the 5% level. The results obtained are, for simple vs. perfect: χ2 (1df) = 0.047; p = 0.828; and for active vs. passive: χ2 (1df) = 2.800; p = 0.094.

Table 8.  Perfect vs. present progressive and passive vs. active progressive in MaltE written and spoken texts

| | Written MaltE: Raw frequencies | Written MaltE: Percent | Spoken MaltE: Raw frequencies | Spoken MaltE: Percent |
|---|---|---|---|---|
| Total | 454 | 100 | 166 | 100 |
| simple progressive | 433 | 95 | 159 | 96 |
| perfect progressive | 21 | 5 | 7 | 4 |
| active voice | 390 | 86 | 151 | 91 |
| passive voice | 64 | 14 | 15 | 9 |
| (of which be-passives) | (64) | (100) | (12) | (80) |
| (of which get-passives) | (0) | (0) | (3) | (20) |

Among the passives in the spoken data, one-fifth (3 out of 15, listed as examples (25) to (27) below) involves the get-passive. Once more, this supports the hypothesis that exonormative influence is weak in the realm of private conversations, as otherwise it might have prevented this relatively new and colloquial construction from making its way into Maltese English. At the same time, the comparison with the printed data (which contain not a single get-passive among 64 passive progressives) shows, as expected, that this written text type behaves more conservatively.

(25) I think it’s changing # it’s being more no uhm it’s getting changed to go to older uh to older homes yeah to old people ho




(26) now I can say from from my work experience it’s getting introduced to have old peoples homes huh and it’s ge it’s it is

(27) so that’s why we’re getting populated around the harbour

In summary, the progressive is significantly more frequent in spoken than in written Maltese and British English. However, in terms of text frequency, we find no significant differences between the two varieties and thus no indication of an overrepresentation of the progressive. This result is noteworthy because it applies not only to edited, formal written language in Malta where conservative British English is considered the model norm, but also to an informal spoken genre.

The next section will explore qualitatively whether the use of the progressive in MaltE is being expanded. We will focus on two aspects: first, do we witness an extension of the progressive to certain verb types (notably stative verbs) and, secondly, can we observe an extension to different aspectual and tense contexts (e.g. future or perfect contexts)? For the second question, it is the adverbials used in combination with progressives that are of particular relevance.

7.  Qualitative analysis

Qualitatively, MaltE differs from BrE markedly in the use of the progressive with stative verbs. In the British data, hardly any stative verbs can be found in the progressive. In fact, what we are finding is that be and stative find do not occur in the progressive aspect at all. Furthermore, most uses of have clearly display dynamic meaning, as in having a holiday, having a few bottles, having a discussion. The same is true for example (28), since have an influence allows a dynamic interpretation and a temporary state of affairs, which is emphasized in the co-text by currently:

(28) dance can be a wider uh can have a wider influence and use than it’s currently having (ICE-GB:S1A-001)

The semantic and pragmatic nuances of the following examples are more difficult to interpret, even though the deontic modal construction have to is very common in Present-day BrE, though not with progressive aspect as in (29) below:

(29) you’re having to build up the muscles (ICE-GB:S1A-003)

More frequent in the British data is the use of the progressive for future contexts, as in examples (30) to (32) below:

(30) When’s your Mum coming back – Friday I think (ICE-GB S1A-006)


(31) If I knew for a fact that if you weren’t knowing that you’re not going dancing tomorrow night that this would mean that a lot of people wouldn’t know I could go to tomorrow night (ICE-GB S1A-011)

(32) I’m graduating in June uhm (ICE-GB S1A-002)

In the Maltese newspaper data, by contrast, there are some progressives occurring with what are stative verbs in unmarked contexts, notably be. Crucially, however, each occurrence in our corpus allows for an interpretation in terms of a meaning change from stative to dynamic or transitory be, and thus conforms to the grammatical constraints of Standard English laid out in Section 4 above. Consider the be being-progressives in examples (33) to (37), for which the interpretation ‘temporary behaviour’ or ‘transitory state’ is preferable to one in terms of ‘permanent characteristic’. This is stressed by the additional use of here in example (33).

(33) Muscat may well score crucial political points by accusing Gonzi of misleading the public. But the sad truth of the matter is both sides are being deceptive here: Gonzi, by promising the impossible, when he knows only too well that the State simply can’t afford such luxury without cutting…

(34) “… that they are not granted legal protection,” she commented yesterday. “The general assumption is that this must be their fault, because they are being dishonest about their true identity, or because they are not cooperating with the authorities’ legitimate attempts to remove them.”

(35) The Prime Minister reminded his counterpart that the country was still committed to meeting its financial targets by 2015. The government was being responsible in doing our utmost to reach these targets; not with populist rhetoric, but by gradually introducing a change in how energy is used in our country.

(36) The sites are Wied Rini and Hal Far for land-based wind farms, and the Sikka l-Bajda for the offshore site. Lawrence Gonzi said the government was being “serious” about the project, estimated to cost some € 80 million, after experts reportedly said that …

(37) … the first directive, in which people are being urged not to pay their utility bills before the end of the 45-day window. “We’re being cautious and reasonable. About 7,000 bills are being issued every day, and we have been receiving hundreds of telephone calls from…”

Similar cases occur in the UK press corpus, but less frequently (example (38)):

(38) … when you are slumped on the sofa reading an Andy McNab novel, you are being anything but a couch potato.




As to the motivation of such be being-progressives, it is striking that in most of the MaltE passages there is a high density of -ing forms (emphasized in the above examples) in the co-text. This seems to point to the triggering function of the same and similar-sounding grammatical constructions, which has been amply demonstrated by Szmrecsanyi (2006). Particularly illuminating is example (37) with three progressives involving be in the immediate co-text. Notice, though, that their function differs from the temporary behaviour expressed in we’re being cautious: two are passive constructions (people are being urged; bills are being issued), while the third is a perfect progressive (we have been receiving). It is obvious that such a triggering function plays a more important role in spontaneous spoken language, but one might argue that such a natural discourse principle is less inhibited in the context of a smaller speech community where there is less editorial interference relating to subtle grammatical and stylistic nuances.

Finally, apart from the lower frequency of be-progressives expressing temporary behaviour in British press language, a stylistic and text-type related difference seems to be in evidence. While the one British example (38) occurs in a rather informal text passage (notice the lexical items slump, sofa, couch potato in the co-text), most of the Maltese English examples occur in rather formal passages of direct or indirect speech and thus seem neither stylistically marked nor restricted to the spoken or written mode.

Let us next turn to progressive be having, which occurs three times in the Maltese newspaper corpus. Once it has a clearly dynamic meaning (I was having a cup of tea). The two other cases, however, do not seem to involve any such semantic change away from the stative verb ‘possess’, but seem to stress the imperfective aspect of possessive have (see example (39)).
(39) One of the main problems we are currently facing involves biological parents who take drugs. We are having mothers who injecting [sic] themselves till their last day of their pregnancy.

Other stative verbs clearly fulfil the ‘dynamic meaning’ criterion or the ‘temporariness’ criterion, as live does in example (40).

(40) Tortell added that this approach was necessary because several rejected asylum seekers have been living at open centres for a number of years – some since 2003 – even though it is impossible for them to legally find work, …

Apart from the extension of the progressive to stative verbs, Hundt and Vogel (2011) note the occurrence of the progressive in contexts in which the present perfect would be expected, a development that in their data is not limited

 Michaela Hilbert & Manfred Krug

to ESL varieties, but also occurs in ENL varieties. These findings can be confirmed by the newspaper data from Malta and the UK, even though none of these examples would be judged ungrammatical; they are rather a matter of choice (see example (41)).

(41) He has paid attention to the European parliament, persuading members to speed up their decisions. These were all important decisions that are already leaving their mark on economies throughout the European Union. (Malta)

In the spoken data, stative progressives are much more frequent. Be occurs six times in the progressive. In example (42), it is possible to infer from the context that the speaker is stressing the temporary nature of a state.

(42) even the cars that are are being reported # that’s that’s another problem tha that we’re having # construction material cranes # it’s it’s being quite difficult because our policy how it’s entertained…

Notice that in the above example, the immediate context contains two more progressives prior to it’s being: are being reported and we’re having. These progressives might be stimuli triggering the use of it’s being. Interestingly, the three verbs progress on a cline from dynamic to stative: report – have – be. It is not unlikely, therefore, that we are indeed dealing with a speaker overusing the progressive, i.e. a speaker who extends the progressive to stative contexts. In the remaining cases of be being in our MaltE corpus, however, not even traces of temporariness or a semantic change towards a more dynamic meaning can be identified (examples (43) to (46)).

(43) … so that’s why I think we’re having the the older population that we’re having apart from uh science which nowadays getting more into act which the older people are being uh more healthier

(44) Gozo now it’s ge it’s being out because of the ferry – yeah

(45) the the rent is like it’s not worth to rent your home you’re your your house – yeah – whilst it’s being the same as having a loan and instead of renting it yeah we jus

(46) [About a speaker’s younger brother] 〈$D〉 wow # so he’s famous on Malta 〈$A〉 yeah 〈$B〉 yes # he’s being at about one yes 〈$A〉 two one one and a half 〈$B〉 one and a half years yes one and a half years

Find as a verb of perception also occurs with the progressive in our Maltese English data – given below as (47) and (48) – and while this is also a possible




context in British English, no such examples can be found in our parallel UK press corpus:

(47) At least what I am finding is that each local plan has its own uh way of writing uh and its own characteristics.

(48) Uhm the other thing is that we are finding from a planning point of view now uh that Gozo it’s quite difficult to they want to be on their own.

Progressive have occurs 30 times in our spoken data, and there are cases in which there is no semantic change involved and the meaning of have is indeed the stative ‘possess’. Interestingly, this use of be having frequently figures in combination with an adverbial like now and at the moment. These adverbials thus seem to trigger the use of progressives, very much like in Standard BrE and AmE, while their triggering function is being extended to possessive (i.e. stative) have in MaltE. See examples (49) to (63) from the Maltese data.

(49) to prevent mhm the main problem in our local plans that we are having

(50) but we have so much jargon here and there that the winding roads that we are having it’s quite expensive yeah and the government would be uhm

(51) so that’s why I think we’re having the the older population that we’re having apart from uh science which nowadays getting more into act which the old

(52) yeah instead of including it within the the the policy that we’re having at the moment I see and believe me it’s no joking

(53) that’s that’s another problem tha that we’re having

(54) interesting so you’re saying we’re having fifty-seven schools in secondary mhm

(55) because it w would be just at home hm but again nowadays we’re having internet

(56) nowadays in our culture that we’re having most of mothers and fathers are taking their children to home themselves

(57) We’re having much more populations around the Grand Harbour

(58) okay you’re having that house for instance if if you go back to these old houses that that …

(59) so we’re having like you’re having the whole map of Malta

(60) And now we’re having the Free Port and that’s why we have the second industry most largest industry


(61) Around the Grand Harbour is where the we’re having the the the harbor harbour goods

(62) but it’s shortened down yeah that’ why we’re having these these changes…

(63) Now Gozo’s having a large large banging in its head because of this thing.

In addition, we find in the MaltE examples the adverb nowadays, which has a habitual meaning aspect (‘regular event or action’, ‘these days’) and which typically co-occurs with simple aspect in Standard BrE and AmE. The temporary meaning ‘right now’ associated with the progressive that can be expressed by now is not common for nowadays in BrE and AmE. We might therefore be dealing with a case of underdifferentiation in Maltese English. From a Maltese speaker’s perspective, however, another important aspect is that both ‘these days’ and ‘right now’ are imperfective and would in Maltese typically feature in a morphological form that overlaps with the progressive function and contrasts with perfective aspect (see Section 3 above).

Apart from the extension of the progressive to stative verbs, we can find the progressive also in contexts in which the simple aspect would be expected, as in the habitual contexts in (64) and (65).

(64) we do need to have an area which is being like permitted mhm each year

(65) so every month we’re just gathering this all this this data

More frequently than in the habitual present, however, the progressive is used in future contexts, as for instance in examples (66) and (67).

(66) I’m more forward for New Year’s Eve cause we are going to a party uhm and i it’s not so that important Christmas Eve

(67) we’re leaving Saturday

To sum up, in written Maltese English texts progressive aspect marking with stative verbs occurs very infrequently. The same is true for progressives expressing the functions normally associated with present perfect contexts. In the few cases we found, the constraints are very much the same as in the UK data. A notable exception is the construction be having with a stative ‘possession’ meaning, which does occur in the Maltese English data, but not in the UK texts. In the spoken data, stative be having is much more frequent than in the written data, and it clearly dominates among the stative verbs used with morphological progressive aspect marking overall. The dynamic meaning of have in fact occurs less frequently than the stative ‘possession’ meaning. Other stative progressives involve the verb be, and less frequently, the verb find in its use as a verb of perception.




Apart from the extension of verb types, we can observe an extension of time reference with the progressive in MaltE. This applies most notably to the future and, less frequently, to the present habitual. Hundt and Vogel’s (2011) hypothesis of an extension of the progressive to present perfect cannot be confirmed on the basis of our data, though. The very few cases in which the present perfect would probably be the default option in Standard BrE or AmE are far from unambiguous in our Maltese English data.

8. Questionnaire data

Our questionnaires include three examples of the progressive with stative verbs given as (68) to (70) below. In addition, we investigated an analogue of the Maltese progressive qed ‘stay’ construction (71).

(68) Are you understanding my point?

(69) I’m really liking this film.

(70) What are you wanting?

(71) My sister told me to stop, but I stayed playing.

We had Maltese, US and UK students (i.e. educated speakers of their varieties) rate the frequency of these sentences in their speech community according to the scale given below. Each sentence was rated separately for an informal spoken register and a semi-formal written register by selecting one of the following options:

This sentence could be
a. said in Malta/the UK/the US in an informal conversation by
b. written in Malta/the UK/the US in an email to a former teacher by
Everyone = 5; Most = 4; Many = 3; Some = 2; Few = 1; No-one = 0

The information was elicited independently for spoken and written Maltese English, typically not on the same day and in a different order. The questionnaire sentences relating to the spoken register were heard twice and the judgement was solely auditory-based. More precisely, the subjects were exposed twice to the same sound file spoken by a native speaker of the investigated variety. The usage judgements relating to the written register were entered into a printed form of the questionnaire (see Krug, Fabri & Hilbert forthcoming for the full questionnaire and the rationale behind the methodology).


Table 9. Average ratings of progressives in spoken and written contexts

                                                          Spoken               Written
                                                      Malta   UK    US     Malta   UK    US    Average
(68) Are you understanding my point?                   3.9    3.2   3.6     3.2    2.9   3.4     3.4
(69) I’m really liking this film.                      3.2    3.7   4.0     2.5    2.7   3.4     3.3
(70) What are you wanting?                             0.7    1.9   1.3     0.9    1.9   1.4     1.4
(71) My sister told me to stop, but I stayed playing.  2.6    1.7   1.2     1.9    1.3   1.1     1.6
Overall average                                        2.6    2.6   2.5     2.1    2.2   2.3
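The overall averages in the bottom row of Table 9 follow directly from the four item ratings in each column. As a quick arithmetic check, the following minimal Python sketch recomputes them from the rounded values printed in the table; the variable and function names are illustrative and not part of the study.

```python
# Item ratings transcribed from Table 9, in the column order Malta, UK, US.
# These are the rounded published values, not the raw questionnaire data.
VARIETIES = ("Malta", "UK", "US")

SPOKEN = {
    "(68) understanding": (3.9, 3.2, 3.6),
    "(69) liking": (3.2, 3.7, 4.0),
    "(70) wanting": (0.7, 1.9, 1.3),
    "(71) stayed playing": (2.6, 1.7, 1.2),
}
WRITTEN = {
    "(68) understanding": (3.2, 2.9, 3.4),
    "(69) liking": (2.5, 2.7, 3.4),
    "(70) wanting": (0.9, 1.9, 1.4),
    "(71) stayed playing": (1.9, 1.3, 1.1),
}


def column_means(ratings):
    """Return the mean rating per variety across all test items."""
    n_items = len(ratings)
    return {
        variety: sum(row[i] for row in ratings.values()) / n_items
        for i, variety in enumerate(VARIETIES)
    }


spoken_means = column_means(SPOKEN)
written_means = column_means(WRITTEN)

for mode, means in (("spoken", spoken_means), ("written", written_means)):
    # Round to one decimal place, as in the table.
    print(mode, {variety: round(mean, 1) for variety, mean in means.items()})
```

Rounded to one decimal place, the recomputed means match the "Overall average" row of the table for both modes.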

Table 9 shows that across the four contexts, all three varieties score very similarly (between 2.5 and 2.6 for spoken English, and consistently lower, between 2.1 and 2.3, for written English). Similar overall averages for each variety and each mode, however, mask interesting individual differences for the different grammatical constructions. In each variety, I’m really liking… and Are you understanding…? are rated to be very common. Given the marginal status of progressives occurring with stative verbs and in view of the fact that average ratings, even for neutral control sentences, in our questionnaires rarely exceed 4.0, it comes as a surprise just how high these two actually rank for all varieties: essentially between “many” and “most” speakers of all three speech communities are reported to use them. With average ratings of about 1.5 (i.e. between “few” and “some”), What are you wanting? and I stayed playing rank considerably lower. But while in MaltE What are you wanting? is the item with the lowest usage ratios, in BrE and AmE the lowest scores are obtained for I stayed playing. The relatively high scores for I stayed playing in MaltE (2.6 and 1.9 for spoken and written text types, respectively) are best explained in terms of language contact, i.e. the existence of a parallel construction in Maltese. Of the stative progressives, What are you wanting? scores lowest for each of the three varieties. Interestingly, unlike the remaining test items, which score consistently higher in spoken English, each individual variety scores similarly low on this item for the informal spoken and semi-formal written text types. This strongly suggests that it is not a stylistic but a grammatical factor regulating its (non-)use, and we believe this is the fact that want is the least dynamic of the three verbs understand, like and want. Another explanatory factor may play a role here. A more recent phenomenon of British and American English seems to be the usage of the progressive




with stative verbs of liking, disliking and desire for emphatic purposes. Consider example (72):

(72) I’m (so/really) loving/liking/hating/wanting/craving it.

Using the expanded form for emphasis and emotional involvement is not new to the progressive but an integral part of its origin (see Hübler 1998: ch. 4; Leech et al. 2009: ch. 6) and this may at least partly explain the robust tendency for the construction to occur more frequently in informal spoken than formal written types of English around the world. What we seem to be witnessing in contexts like (72), then, is a case of retention of an older meaning facet or, more likely, the revival of an older function in a new grammatical niche, i.e. with a special type of stative verb (see Lehmann 1995: 164; Traugott 2004 on general principles in related phenomena). It is noteworthy that with ratings in MaltE between “no-one” (0) and “few” (1) What are you wanting? has the lowest individual usage scores of all the cells in Table 9. We submit that this is due to an older exonormative prescriptive rule, which has lost some of its force (and concomitantly developed new emphatic and stylistic functions) in both British and American English, but apparently not yet to the same extent in the Maltese English context. In support of this hypothesis, we can point to the other semantically related verb liking, which too has consistently lower usage rates in spoken and written Maltese English than in British and American English. We assume that the strikingly higher rates for I’m liking across all varieties can be largely accounted for by the presence of intensifier really, which – like so in (72) – in our view raises the acceptability of the progressive with verbs of personal (dis)preference considerably. Let us finally integrate these individual findings into more general approaches to and models of World Englishes. It is interesting to see that consistent differences between informal spoken and semi-formal written text types are to be found with the Maltese speakers of English. 
Overall, MaltE displays in fact the greatest stylistic differentiation of all three varieties for the items under investigation. This strongly supports the position that stylistic differentiation may precede Phase 5 of Schneider’s model, which indeed explicitly allows for such situations (2007: 54). On a more general note, we can detect certain patterns for our spoken and written questionnaire data: the ENL varieties of BrE and AmE differ least from each other in the spoken questionnaire data under discussion here. On average, each item differs by 0.4 points between the two ENL varieties, while each of them differs from the MaltE spoken data by 0.8 points, i.e. twice as much. This is confirmed by statistical testing: T-tests show that the averages of all four progressive sentences displayed in Table 9 differ significantly at the 5% level between spoken Maltese English and spoken British English, and all except one (the exception


being understanding) show significant differences between spoken Maltese and spoken American English. None of our four progressive items, however, produces significant differences between spoken American and spoken British English. The situation is different for the written data, though, where the differences between the three varieties are less pronounced. Only one item each differs significantly between MaltE and BrE (wanting), and MaltE and AmE (stayed playing), while again no significant differences obtain between BrE and AmE. Overall, then, the two ENL varieties of British and American English are judged to be rather similar in terms of their usage of the progressive, while Maltese English differs significantly from each of them. The differences are more pronounced in the spoken mode than in the written mode, but this is rather unsurprising. After all, the semi-formal written text type for which we elicited our questionnaire information is more subject to exonormative pressures than informal conversations.
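The per-item distances between varieties reported above for the spoken data can be approximated from Table 9. The sketch below is only an illustration under the assumption that "differs by" means the mean absolute difference across the four items; it uses the rounded table values, so its results (about 0.45 for BrE versus AmE, and about 0.8 for MaltE versus either ENL variety) can deviate slightly from figures computed on the unrounded questionnaire data. All names are hypothetical.

```python
from itertools import combinations

# Spoken ratings from Table 9, one list per variety, in the item order
# (68) understanding, (69) liking, (70) wanting, (71) stayed playing.
SPOKEN_BY_VARIETY = {
    "MaltE": [3.9, 3.2, 0.7, 2.6],
    "BrE": [3.2, 3.7, 1.9, 1.7],
    "AmE": [3.6, 4.0, 1.3, 1.2],
}


def mean_abs_diff(a, b):
    """Mean absolute per-item difference between two rating lists."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Compute the distance for every pair of varieties.
pairwise = {
    (v1, v2): mean_abs_diff(SPOKEN_BY_VARIETY[v1], SPOKEN_BY_VARIETY[v2])
    for v1, v2 in combinations(SPOKEN_BY_VARIETY, 2)
}

for (v1, v2), diff in pairwise.items():
    print(f"{v1} vs {v2}: {diff:.3f}")
```

On these inputs the two ENL varieties are closest to each other, and MaltE is roughly twice as far from each of them, which agrees with the pattern described in the text. The sketch does not reproduce the significance tests, which require the individual respondents' ratings.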

9. Conclusion

From a quantitative perspective, press language from the early third millennium differs very little between Maltese English and British English with respect to the use of progressives, both overall and with regard to the two text categories investigated, reportage and editorials. This suggests that the two regions have similar genre-specific norms, which are essentially the ones developed in Britain, with the Maltese press following an exonormative standard. From a bird’s eye perspective, even the Maltese and British English uses of progressives in conversations display striking quantitative similarities. As expected, and in a parallel fashion, both varieties display significantly higher text frequencies of progressives in spoken conversations than in journalistic writing. As to the description of MaltE as an ESL, ENL or EFL variety, we believe that the classification of MaltE as an ESL variety is, all things considered, indeed the most adequate one. We also agree with previous analysts like Thusat et al. (2009) that within Schneider’s (2003, 2007) dynamic model, Phase 3 (nativization) seems to be the best fit for MaltE, but we would apply important qualifications (e.g. due to the existence of systematic stylistic variation). At the same time, we believe that the heterogeneous character of MaltE is reflected in our data. Speakers in Malta vary significantly in their use of English and their attitudes to what is perceived as the codified British Standard. As far as progressives are concerned (and we assume that this is true more generally), MaltE press language is close to the native-language pole of varieties of English, notably to its historical exonormative standard, BrE. In this study, the more substantial




differences between the varieties were repeatedly found for spoken English, where the exonormative standard exerts lower pressures. It is here that we expect the potential for the development of variety-specific norms to be greatest and most imminent. Our data show that the recurring “overuse” claim from the literature on progressives in non-ENL varieties is not easily or universally applicable. No straightforward higher frequency of occurrence in our ESL data can be observed. (Perhaps the normative influence discourages the use of the progressive in less formal genres as well, a pattern we found in our investigation on contractions, but it is one which we find less likely for the use of progressives, see Section 6.2 above and Krug, Fabri & Hilbert forthcoming.) Nor do the differences figure as simple reductions or expansions of usage contexts. What we do find are stative verbs used in the progressive aspect, though primarily with a limited number of verbs, notably have and be. These are high-frequency verbs and thus statistically to be expected to figure prominently. Future research will have to show whether they are anchor constructions, as it were, from which the progressive – in what are still marked contexts – may spread, whether the situation remains stable or whether indeed the use of progressives in these contexts declines again in a development towards a perhaps converging, more regulated use of the progressive in global Standard English. Given the complexity of our data, we can certainly confirm Hundt and Vogel’s (2011) finding that ESL varieties cannot simply be placed in the middle of a continuum between ENL and EFL varieties. However, we also found interesting patterns of divergence in progressive usage between the varieties investigated, both at a quantitative and qualitative level. 
For instance, we noticed a higher frequency of be-progressives (of the type the government was being serious, both sides are being deceptive here), indicating temporary behaviour or qualities – a construction which in MaltE does not seem to be stylistically marked. The most notable difference we could find in journalistic prose was a significantly higher frequency of progressives after modal verbs and modal constructions in MaltE, where in addition the range of modals co-occurring with the progressive was greater. Another difference in press language is the use of progressive aspect marking with stative verbs, which, although rare, is more frequent in MaltE than in BrE, notably with possessive have. In the spoken MaltE data, this construction is far more frequent still. The MaltE use of the progressive is more conservative than in comparable British and American English text types because, we would argue, MaltE has not been affected to the same extent by two trends found in AmE and BrE. In one case – the trend (reported in Hundt & Dose forthcoming) to avoid the passive – this leads to higher usage rates of the progressive in MaltE when compared to


the ENL varieties. In the other case – the trend towards using the progressive (in sentences like I’m wanting/liking it) for emphatic purposes with verbs of personal (dis)preference and desire in the two major (and probably most other) ENL varieties – this conservatism results in lower usage rates of the progressive. Here we argued that an older constraint applies more consistently in MaltE than in BrE and AmE. The questionnaire data revealed that a contact feature (I stayed playing for marking progressive aspect) and the above-mentioned more rigorously applied older standard (“No progressive for markedly non-dynamic verbs like want”) make Maltese English distinct from the two ENL varieties of BrE and AmE. It would be interesting to investigate whether such a pattern is typical of ESL varieties more generally.

References

Biber, D., Finegan, E., Johansson, S., Conrad, S. & Leech, G. 1999. Longman Grammar of Spoken and Written English. London: Longman.
Bongartz, C.M. & Buschfeld, S. 2011. English in Cyprus: Second language variety or learner English? In Exploring Second-Language Varieties of English and Learner Englishes: Bridging a Paradigm Gap [Studies in Corpus Linguistics 44], J. Mukherjee & M. Hundt (eds), 35–54. Amsterdam: John Benjamins.
Camilleri, A. 1991. Crosslinguistic influence in a bilingual classroom. The example of Maltese and English. Edinburgh Working Papers in Applied Linguistics 2: 101–111.
Camilleri, G. 2004. Negative transfer in Maltese students’ writing in English. Journal of Maltese Education Research 2(1): 3–12.
Dimech, W. 1973. Theoretical Approaches towards the Use of the Language Laboratory as an Aid to the Teaching of English in Malta. MA thesis, University of Malta.
Görlach, M. 2002. Still more Englishes [Varieties of English Around the World G28]. Amsterdam: John Benjamins.
Hübler, A. 1998. The Expressivity of Grammar: Grammatical Devices Expressing Emotion across Time. Berlin: Mouton de Gruyter.
Hundt, M. 2004. The passival and the progressive passive: A case study of layering in the English aspect and voice systems. In Corpus Approaches to Grammaticalization in English [Studies in Corpus Linguistics 13], H. Lindquist & C. Mair (eds), 79–120. Amsterdam: John Benjamins.
Hundt, M. 2009. Global feature – local norms? A case study on the progressive passive. In World Englishes – Problems, Properties and Prospects [Varieties of English around the World G40], S. Hoffmann & L. Siebers (eds), 287–308. Amsterdam: John Benjamins.
Hundt, M. & Dose, S. Forthcoming. Differential change in British and American English: Comparing pre- and post-war data. In Looking Back – Moving Forward. Papers from the Thirtieth Conference of the International Archive of Modern and Medieval English [ICAME 30], S. Hoffmann, P. Rayson & G. Leech (eds). Amsterdam: Rodopi.




Hundt, M. & Vogel, K. 2011. Overuse of the progressive in ESL and Learner Englishes – fact or fiction? In Exploring Second-Language Varieties of English and Learner Englishes: Bridging a Paradigm Gap [Studies in Corpus Linguistics 44], J. Mukherjee & M. Hundt (eds), 205–222. Amsterdam: John Benjamins.
Jenkins, J. 2003. World Englishes. London: Routledge.
Kachru, B.B. 1986. The power and politics of English. World Englishes 5: 121–140.
König, E. & Lutzeier, P. 1973. Bedeutung und Verwendung der Progressivform im heutigen Englisch. Lingua 32: 277–308.
Kortmann, B. & Schneider, E.W. (eds). 2004. A Handbook of Varieties of English: A Multimedia Reference Tool. Vol. 1: Phonology. Vol. 2: Morphology and Syntax. Berlin: Mouton de Gruyter.
Krug, M. 2000. Emerging English Modals: A Corpus-Based Study of Grammaticalization. Berlin: Mouton de Gruyter.
Krug, M., Fabri, R. & Hilbert, M. Forthcoming. Chapter 5: Aspects of Maltese English morphosyntax: Corpus-based and questionnaire-based studies. Il-Lingwa Taghna (Special Issue; A. Vella & R. Fabri, eds) Towards a Description of Maltese English.
Leech, G., Hundt, M., Mair, C. & Smith, N. 2009. Change in Contemporary English: A Grammatical Study. Cambridge: CUP.
Lehmann, C. 1995. Thoughts on Grammaticalization. Munich: Lincom.
Lorenz, G. 1999. Adjective Intensification – Learners versus Native Speakers: A Corpus Study of Argumentative Writing. Amsterdam: Rodopi.
Mazzon, G. 1992. L’inglese di Malta. Naples: Liguori Editore.
Modiano, M. 1999. Standard English(es) and educational practices for the world’s lingua franca. English Today 15(4): 3–13.
Platt, J., Weber, H. & Ho, M.L. 1984. The New Englishes. London: Routledge & Kegan Paul.
Quirk, R., Greenbaum, S., Leech, G. & Svartvik, J. 1985. A Comprehensive Grammar of the English Language. London: Longman.
Schembri, N. 2005. Noun Phrase Structures in Maltese University Students’ Commerce Texts. A Study in Academic Maltese English. Munich: Lincom.
Schneider, E.W. 2003. The dynamics of New Englishes: From identity construction to dialect birth. Language 79(2): 233–281.
Schneider, E.W. 2007. Postcolonial English: Varieties around the World. Cambridge: CUP.
Sciriha, L. & Vassallo, M. 2001. (repr. 2003). Malta: A Linguistic Landscape. Malta: Socrates.
Sciriha, L. & Vassallo, M. 2006. Living Languages in Malta. Malta: Print It Printing Services.
Sharma, D. 2009. Typological diversity in New Englishes. English World-Wide 30: 170–195.
Smith, N. & Rayson, P. 2007. Recent change and variation in the British English use of the progressive passive. ICAME Journal 31: 107–137.
Spagnol, M. 2009. Lexical and grammatical aspect in Maltese. In Ilsienna, T. Stolz (ed.), 51–86. Bochum: Universitätsverlag Brockmeyer.
Swithun, B. 1961. Some Aspects of the Teaching of English in the Maltese Schools. Malta.
Szmrecsanyi, B. 2006. Morphosyntactic Persistence in Spoken English. A Corpus Study at the Intersection of Variationist Sociolinguistics, Psycholinguistics, and Discourse Analysis [Trends in Linguistics: Studies and Monographs 177]. Berlin: Mouton de Gruyter.
Thusat, J., Anderson, E., Davis, S., Ferris, M., Javed, A., Laughlin, A., McFarland, C., Sangsiri, R., Sinclair, J., Vastalo, V., Whelan, W. & Wrubel, J. 2009. Maltese English and the nativization phase of the Dynamic Model. English Today 25: 25–32.

Traugott, E.C. 2004. Exaptation and grammaticalization. In Linguistic Studies Based on Corpora, M. Akimoto (ed.), 133–156. Tokyo: Hituzi Syobo.
van Rooy, B. 2006. The extension of the progressive aspect in Black South African English. World Englishes 25(1): 37–64.
Vella, A. 1994. Prosodic Structure and Intonation in Maltese and its Influence on Maltese English. Ph.D. dissertation, University of Edinburgh.
Westergren Axelsson, M. & Hahn, A. 2001. The use of the progressive in Swedish and German advanced learner English – a corpus-based study. ICAME Journal 25: 5–30.

Mapping unity and diversity in South Asian English lexicogrammar
Verb-complementational preferences across varieties

Marco Schilk, Tobias Bernaisch & Joybrato Mukherjee
Justus Liebig University, Giessen

It has been noted that the study of the interface between lexis and grammar in general and verb-complementational patterns and preferences in particular offers new insights into distinctive and so far largely neglected structures of varieties of English. Based on data from the International Corpus of English and large Web-derived newspaper corpora, we explore the verbs CONVEY, SUBMIT and SUPPLY, which are typically associated with the transfer-caused-motion construction, and their complementation patterns to discuss the unity and diversity found in Indian and Sri Lankan English as two prominent and institutionalized South Asian Englishes. Our findings suggest that the (degree of) homogeneity and heterogeneity across South Asian Englishes is a complex issue and a matter of the level of descriptive granularity.

Keywords: transfer-caused-motion (TCM) construction; Sri Lankan English; verb-complementation patterns (of TCM-related verbs); Web-derived newspaper corpus; transitivity trends

1. Introduction: Unity and diversity in and across South Asian Englishes

The label “South Asian English(es)” has been used by various scholars to refer to the localized forms of English used on the Indian subcontinent, going back to the colonial past of British India. At a merely geographical level, the core of what can be subsumed under “South Asian English(es)” can be found in India, Pakistan, Nepal, Bangladesh and Sri Lanka; other countries in the region, e.g. the Maldives, pose special cases for historical reasons and due to the present-day status and use of English and are more on the periphery of what is covered by “South Asian English(es)”.


In research into World Englishes, geographically motivated category labels are usually used not only to refer to an otherwise unrelated group of varieties that happen to exist in neighbouring countries, but also because the varieties are believed to share essential characteristic features. For example, in an early attempt to systematize the unity and diversity across Englishes world-wide in a model with concentric circles representing geographically oriented families of Englishes, McArthur (1987) uses the term “South Asian Standard(izing) English” to refer to all variants of English that share similar colonial roots, that have emerged in South Asian countries after their independence, that are still in the process of standardization and norm-development and that are, thus, linguistically similar to each other. In a similar vein, Kachru (2005: 43) argues that there is a South Asian identity that manifests itself in the English language: he explicitly speaks of the “South Asianness of English [that] has to be characterized both in terms of its linguistic characteristics and in terms of its contextual and pragmatic functions”. It should not go unmentioned, however, that in Kachru’s (passim) work the distinction between the meta-category of South Asian English(es) on the one hand and Indian English (IndE) as a distinct nation-based variety on the other is not very clear; in fact, at times he seems to consider the English language in India as a kind of core or lead variety for South Asian Englishes. There are, indeed, good reasons to assume that IndE plays a particularly important role in the family of South Asian Englishes, for example because it is by far the largest anglophone speech community with approximately 50 million regular speakers of acrolectal standard Indian English. 
Also, Kachru (2005: 57) notes that for a number of reasons, the use of English in creative fiction writing (and, thus, as a literary means of postcolonial identity construction) is particularly widespread and visible in India, which “has the largest, most vibrant, productive and articulate writers of English”. Given also that the English language in India has an official status at the federal level and in various states and union territories and is an institutionalized means of communication in a wide range of contexts (e.g. as the language of the Supreme Court and as a medium of instruction at the leading institutions of higher education), it is no surprise that Leitner (1992) has referred to IndE as a potential “epicentre”, i.e. as a model variety for South Asia (in a similar vein to Australian English, which Leitner (2004) and Peters (2009) consider to have developed into an epicentre for neighbouring countries in the Pacific region). The shared past of the British Raj, the resulting South Asian identity and the potential epicentral role of IndE as a lead variety for South Asia may be considered reasons why the English language in South Asia can be viewed as relatively homogeneous. It is against this background that Baumgardner (1996), Kachru (2005) and others seem to prefer the singular label “South Asian English” to refer to the linguistic unity of the English language across the Indian subcontinent,



Mapping unity and diversity in South Asian English lexicogrammar 

which also shows in a range of shared linguistic features and tendencies at virtually all linguistic levels (cf. Kachru 2005). While there certainly is a high degree of unity across South Asian Englishes, for example in phonetics and phonology (e.g. monophthongization of diphthongs as in coat and fate) and in syntax (e.g. the use of invariant question tags like no?), it is obvious that there are also historical and functional differences between South Asian Englishes contributing to the manifestation of linguistic variation across varieties of English on the subcontinent:

– In some South Asian countries, the beginning of British influence and dominance set in later than in the heartland of British India. In Sri Lanka, for example, British colonization represented the third wave of colonization (following the Dutch and Portuguese periods); it is thus no surprise, for example, that lexical items taken over from the previous superstrates found their way into Sri Lankan English (SLE).
– In each individual South Asian country, there is a unique constellation of local first languages, ranging from largely monolingual settings with basically only one local language and English as an additional language (e.g. in Bangladesh) to very complex multilingual settings with a great number of local languages with sizeable speech communities and English (e.g. in India). It is obvious that depending on the unique language settings, different South Asian languages have led to different transfer effects in the local variants of English, with interference effects being stronger in basilectal variants than in the acrolect (and stronger, in general, at the level of accent than, say, in grammar).
– While in some South Asian countries there are hardly any native speakers of English at all (e.g. Pakistan), in other countries there are native English speech communities that exist alongside the (larger) communities of L2 speakers of English (e.g. Sri Lanka with its distinct Burgher community).
– The English language has played very different roles in the national language policies of South Asian countries after their independence. While in India, for example, English has been a (co-)official language of the Union ever since 1947, in Sri Lanka the Sinhala-only policy implemented in the 1950s aimed at replacing English with Sinhala in all (official) communication situations. The role and status of English in the national language policies has had an effect on the process of variety formation in general and the emergence of local standards and norms in English in particular.
– Depending on the degree of multilingualism and the national language policy in a given South Asian country, there is a more or less pressing need for a link language that can serve as a neutral communicative vehicle across linguistic and ethnic barriers. In India, English has always fulfilled this link language

 Marco Schilk, Tobias Bernaisch & Joybrato Mukherjee

function, and the re-introduction of English as an official language in the Sri Lankan Constitution in the 1980s was also motivated by establishing an interethnic link language. On the other hand, in countries like Pakistan and Bangladesh, there have never been national language policies with a focus on English as an intranational and interethnic lingua franca. It is, therefore, no surprise that South Asian Englishes are also marked by linguistic diversity, resulting, inter alia, from differences between the individual linguistic ecologies in which English forms part of the local linguistic repertoire. For example, while in IndE the hybrid compounds ticket wallah and lathi charge (with items taken over from Hindi) are widespread, this is not the case in SLE. On the other hand, it is only in Sri Lankan English that the construction take a (phone) call can be used with the meaning of make a (phone) call, which stems from corresponding uses of the cognate verbs in Sinhala and Tamil (cf. Hoffmann, Hundt & Mukherjee 2011). At the level of national varieties of English, we thus find manifestations of both unity and diversity across South Asian Englishes. It goes without saying that the level of (national) varieties of English refers to a relatively high level of abstraction; at this level we operate with what Lyons (1981: 24) has called “the fiction of homogeneity: the belief or assumption that all members of the same speech-community speak exactly the same language”. As in all other speech communities there is, of course, considerable variation within each individual South Asian variety of English. Recent studies, including corpus-based analyses, have shed new light on the internal variation within South Asian Englishes, be it between acrolectal, mesolectal and basilectal forms (see e.g. Hosali 2004 on South Asian “Butler English”), between speech and writing (see e.g. Gries & Mukherjee 2010 on IndE), between individual registers (see e.g. 
Balasubramanian 2009 on IndE) or between different speaker types such as the minority of local L1 speakers of English and the majority of competent L2 speakers (see e.g. Rajapakse 2008 on the English of Sri Lanka’s Burgher community). It is for this reason that scholars have introduced plural labels also for individual varieties of English, e.g. “Indian Variant(s) of English” (IVE; Nihalani et al. 2004) and “Sri Lankan Englishes” (Mendis & Rambukwella 2010). Notwithstanding the need for a detailed analysis of intravarietal variation, it remains important – and useful – to capture aspects of linguistic homogeneity and variation between (national) varieties of English in South Asia. Unity and diversity across Englishes can also be described and modelled at even higher levels of abstraction, with South Asian English potentially functioning as a family of interrelated varieties being contrasted with other groups of varieties and/or subsumed into larger families of varieties. For example, South Asian English can be viewed as one particular



manifestation of what McArthur (2003), Kachru (2005) and Bolton (2008) have labelled “English as an Asian language”, “Anglophone Asia” and “Asian Englishes”, respectively. They thus view all variants of English in Asia as a linguistically noteworthy category, including all forms of English as a postcolonial link language in multilingual speech communities, as a pan-Asian communicative vehicle and a key to international communication. In the present paper, we are interested in describing unity and diversity at the level of national varieties of English in South Asia, focusing on acrolectal variants of these varieties. More specifically, we compare IndE with SLE and relate our findings to British English (BrE), i.e. the shared historical input variety. India and Sri Lanka are particularly interesting because the present-day status and functions of English in the two countries are marked both by common features and by clear differences. For example, in both countries, English today is a co-official language and is used as an intranational means of communication. However, while in India there are hundreds of local languages and 17 languages officially recognized by the Constitution as regional languages (with Hindi only spoken by a third of the total population as their first language), in Sri Lanka the two major indigenous languages are Sinhala (the majority language) and Tamil (the minority language). India and Sri Lanka also provide an interesting combination of South Asian varieties because the government of Sri Lanka has launched a large-scale initiative recently, called “Speak English our way”, which is intended to teach “English as a life skill” to Sri Lankan pupils and students (see Meyler 2010). 
At an early stage, the coordinator of this initiative, Sunimal Fernando, propagated Indian English as a model variety for the development of a teaching model for the English language classroom in Sri Lanka: “India has emerged as the country which now has the most successful methods for teaching job-oriented English – English without the social and cultural baggage.” (The Guardian Weekly – Learning English, 23 May 2008). From a linguistic perspective, this view of Indian English cannot be upheld, of course – it is a variety which is used as a linguistic means of Indian identity construction and which has developed into a distinctly Indian medium of communication. This notwithstanding, the controversial debate that the English-as-a-life-skill programme has triggered in Sri Lanka reveals that there is a growing anticipation (and, for some, a niggling worry) that SLE norms might be influenced in future to a much larger extent by IndE (see Mukherjee forthcoming). Whether this is true or not, however, remains to be seen. In the present study, we will restrict ourselves to a so far under-researched area of variation between the two South Asian varieties of English, namely complementational preferences of verbs. After briefly characterizing the relevance of the lexis-grammar interface in general and verb complementation in particular for the description of the structural nativization of varieties of English (Section 2), we

will introduce the group of verbs that we will focus on (i.e. convey, submit, supply) and that are habitually associated with the so-called transfer-caused-motion construction (see Section 3). Then we will describe the corpus data and our methodology (see Section 4). The results of the corpus analysis will be discussed in detail both from a quantitative and from a qualitative perspective (see Section 5). In the light of our findings we will readdress the complexity of the issue of unity and diversity in South Asian English(es) (see Section 6).

2.  Verb-complementational patterns as parameters of variation

In recent years, there has been a growing interest in a so far neglected area of “structural nativisation, understood as the emergence of locally characteristic linguistic patterns” (Schneider 2007: 5f.), namely the interface between lexis and grammar, both in research into New Englishes in general and into South Asian Englishes in particular. Referring, inter alia, to Olavarría de Ersson and Shaw’s (2003) and Mukherjee and Hoffmann’s (2006) corpus-based studies of quantitative differences between IndE and BrE at the level of verb complementation, Schneider notes:

These are stable and noteworthy results, and it is worth pointing out that they operate way below the level of linguistic awareness: without quantitative methodology no observer would have expected such differences to exist. (Schneider 2007: 87)

It is thus no surprise that with the availability of large and representative corpora such as the International Corpus of English (ICE, see Section 4), new perspectives have emerged for the description of such quantitative differences at the lexis-grammar interface between New Englishes and their historical input varieties. In the present paper, we will focus on the complementation patterns of a semantically defined group of verbs in IndE, SLE and BrE, namely verbs that are typically used in the transfer-caused-motion (TCM) construction (see Goldberg 1995). From a construction-grammar perspective, the TCM construction is closely related to the ditransitive construction, as they are considered to be semantically synonymous (cf. Goldberg 1995: 91), albeit pragmatically distinct. The dative alternation of ditransitive verbs represents, in essence, the alternation between the ditransitive construction (e.g. give someone something) and the TCM construction (e.g. give something to someone). Mukherjee and Schilk (2008) have introduced the label “TCM-related verb” for verbs that are typically used in the TCM construction, although they may also sporadically occur in the ditransitive construction. Focusing on the TCM-related verbs convey, submit and supply in IndE and BrE,



Mukherjee and Schilk (2008) have shown that there are identifiable differences between the two varieties at the level of verb-complementational preferences for this class of verbs. In the present paper, we will (1) look more closely at the distribution and usage of the verb-complementational patterns of the three aforementioned TCM-related verbs, (2) take into account SLE data as well and (3) combine the relevant components of ICE with data obtained from Web-derived newspaper corpora.

3.  Verb complementation of TCM-related verbs in South Asian Englishes

3.1  The patterns of CONVEY, SUBMIT and SUPPLY¹

We classify the verb-complementational patterns of the TCM-related verbs CONVEY, SUBMIT and SUPPLY along the lines of the descriptive framework introduced by Mukherjee (2005) for ditransitive verbs. In general, we distinguish between five basic patterns, as described and exemplified in (1) to (10).

(1) I  (S) SUPPLY [Oi:NP] [Od:NP]
(2) I use the vendors from my neighbours who supply me fresh vegetables (DN 2003-05-02)
(3) II  (S) CONVEY [Od:NP] [Oi:PPto]
(4) the authorities conveyed incorrect information to the Ministry (DN 2003-02-13)
(5) III  (S) SUBMIT Oi [Od:NP]
(6) some students from those schools have already submitted the forms. (ToI 37540)
(7) IV  (S) SUPPLY Oi Od
(8) Do we have adequate sources to supply (DM 2005-01-07)
(9) V  (S) SUBMIT [Oi] Od
(10) I submit to the customary kiss on both cheeks (BNC AA8)

From these five basic patterns, various related patterns, such as passive constructions, participle constructions, constructions featuring relative clauses, etc. can be derived.² In our data, we opted for a simplified coding system of these related patterns: the passive patterns that can be derived from the pattern types I, II, III and IV are given the labels IP, IIP, IIIP and IVP, respectively; all other structurally related derivative patterns are merged under the labels Ider, IIder, IIIder, IVder and Vder, respectively.

¹ In the following sections, the abstract lemma of a verb will be given in capital letters. The word forms of the lemma will be given in lower case and italics.

3.2  TCM-related verbs: Previous studies of verb-complementational variation

In Mukherjee and Hoffmann’s (2006) study, different verb-complementational trends and preferences for the ditransitive verbs GIVE and SEND are described on the basis of corpus data. They also list a number of so-called new ditransitives in IndE, i.e. verbs which are not admissible in the ditransitive construction in BrE, but are – at least sporadically – used ditransitively in Present-day IndE. With regard to the TCM-related verbs in the present study, they show that CONVEY and SUBMIT are attested marginally in the type-I pattern in IndE, while SUPPLY is used relatively often in the ditransitive construction in IndE. They conclude that “verb complementation in general and ditransitive verb complementation in particular represent core areas in which different varieties of English are marked by diverging preferences and structural options” (Mukherjee & Hoffmann 2006: 167). Because of the close semantic and cognitive relation between ditransitive verbs and TCM-related verbs it can be safely assumed that TCM-related verbs also display different verb-complementational preferences across varieties. TCM-related verbs in IndE and BrE are at the centre of Mukherjee and Schilk’s (2008) analysis, based on a large Web-derived newspaper corpus consisting of material collected from the online archives of the Times of India (c. 110 million words) and the periodical part of the British National Corpus (c. 30 million words); from these corpora random samples of 500 concordance lines (generated by WordSmith Tools) for each verb in each variety were analysed. They find, inter alia, that the type-II pattern is used more frequently in IndE than in BrE with the three TCM-related verbs CONVEY, SUBMIT and SUPPLY.
With regard to the (monotransitive) type-III pattern, however, no such overall trend can be observed: CONVEY is used more frequently with this pattern in BrE, while SUBMIT is more frequently attested with it in IndE, and for SUPPLY there are no identifiable distributional differences. In line with Hopper and Thompson (1980), who view transitivity as a continuum from low to high, the degree of transitivity of the above verbs is discussed. On the basis of the number of profiled arguments in the complementation patterns of the three TCM-related verbs, the IndE data display a tendency to syntactically realize more arguments, and, thus, a higher degree of transitivity, than the BrE data. This is mirrored in the fact that, for each verb, the scores for patterns in which all three arguments (i.e. the subject, the direct object and the indirect object) are profiled are higher in the IndE data, while the respective scores for one profiled element are higher in the BrE data. From these observations the conclusion can be drawn that the degree of transitivity might vary between different varieties of English not only at the level of individual verbs but also at the level of semantically defined verb classes such as TCM-related verbs. As pointed out by Mukherjee and Schilk (2008), this conclusion, however, has to be taken with a measure of caution as the amount of the coded data was limited and as a number of aspects relevant to (the degree of) transitivity (see Hopper & Thompson 1980) were not taken into account. This notwithstanding, TCM-related verbs certainly represent a group of verbs that are relevant to the description of structural nativization of New Englishes at the lexis-grammar interface.

² For a comprehensive overview of the complementation patterns of ditransitive verbs, see Mukherjee (2005).

4.  Corpus data

In the present study, we investigate the complementation patterns of the TCM-related verbs CONVEY, SUBMIT and SUPPLY in IndE, SLE and BrE. While IndE and SLE provide interesting cases of South Asian varieties of English (see Section 1), BrE represents the historical input variety for all South Asian Englishes and, thus, remains a relevant reference point for any description of the process of variety formation in South Asia. The corpus material used is listed in Table 1.

Table 1.  The corpus data

| Variety | Corpus | Words |
|---|---|---|
| Indian English | ICE-INDw [200] | 400,000 |
| | Times of India (ToI) Corpus | 1,521,388 |
| | The Statesman (ST) Corpus | 1,511,753 |
| Sri Lankan English | ICE-SLw [200] | 400,000 |
| | Daily Mirror (DM) Corpus | 1,518,726 |
| | Daily News (DN) Corpus | 1,528,917 |
| British English | ICE-GBw [200] | 400,000 |
| | BNC news | 8,992,587 |

As can be seen in Table 1, we used the complete written (w) parts of the Indian, Sri Lankan and British components of ICE, each including 200 texts with 2,000 words each (thus amounting to approximately 400,000 words each). These ICEw components are comparable in that they display exactly the same corpus design (see Section 4.1). There are two main reasons why we also included larger newspaper corpora in our analysis: (1) the ICEw corpora are comparatively limited in size, especially with regard to quantitative descriptions of verb-complementational profiles of verbs that are not very frequent in language use, and (2) it has been pointed out elsewhere (e.g. Schilk 2011) that newspaper language provides a strong normative influence on language users in postcolonial settings in which English is a widespread second language. The Indian and Sri Lankan newspaper corpora listed in Table 1 were compiled at the University of Giessen while the British newspaper data were obtained from the British National Corpus (BNC, see Section 4.2).³

4.1  The International Corpus of English (ICE)

The International Corpus of English (ICE) was launched at the end of the 1980s (see Greenbaum 1996) by Sidney Greenbaum. The project aims at collecting sample text corpora representing varieties of English as a Native Language (ENL) as well as English as a Second Language (ESL). The various national ICE components represent an unprecedented database which “will undoubtedly provide valuable information on the use of English in many countries, in most of which there have never been systematic studies, and it will provide the basis for international comparisons” (Greenbaum 1991: 91). Currently, 23 teams are involved in the ICE project, each collecting components featuring one million words, of which 60% are speech and 40% writing. The sample texts are taken from the same genres; all texts are from the 1990s (or later). Some components have already been completed (e.g. ICE-GB, ICE-HK), while others are still being compiled (e.g. ICE-USA, ICE-GHA). The ICE components relevant to the present study are ICE-India (ICE-IND), ICE-Sri Lanka (ICE-SL) and ICE-Great Britain (ICE-GB). As ICE-SL is still in the process of being compiled, with the written part already completed

.  The Indian and Sri Lankan newspaper corpora form part of the South Asian Varieties of English (SAVE) Corpus which has been compiled in the context of the research project Verb Complementation in South Asian Englishes: A Study of Ditransitive Verbs in Web-Derived Corpora funded by the German Research Foundation (DFG MU 1683/3–1, 2008–2011).



(see Mukherjee, Schilk & Bernaisch 2010), only the written parts of ICE-IND, ICE-SL and ICE-GB have been used (see Table 1).⁴

4.2  Web-derived newspaper corpora

Being fully aware of the limitations of one-million-word corpora for a wide range of research questions, Greenbaum and Nelson (1996: 6) make four suggestions as to how to complement ICE data with additional material: (1) an expanded corpus with more material from each text category, (2) a specialized corpus based on more material from one text category only, (3) a non-standard corpus with a less restrictive approach as regards speaker selection, and (4) a monitor corpus founded on continuous input of new material. In the present study, we opted for (2) in that we complemented the ICE data for each variety with genre-specific data from acrolectal standard newspaper language. For each South Asian variety, two newspaper corpora containing approximately 1.5 million words each (see Table 1) were compiled along the lines of a slightly adapted version of Hoffmann’s (2007) webpage-to-megacorpus method. There are various problems that need to be solved when applying this method to online newspaper archives, for example the vast number of news agency reports. For our Indian and Sri Lankan newspaper corpora, we generated a list of 250 news agency names (and abbreviations) in order to automatically delete texts marked with any of the names (or abbreviations) from the corpus. With the help of the webpage-to-megacorpus method, a three-million-word offline newspaper corpus of SLE (SAVE-SL) was created from the online archives of the print versions of the Daily Mirror (DM) and the Daily News (DN). The IndE newspaper corpus (SAVE-India) includes three million words from the online archives of The Statesman (ST) and The Times of India (ToI) (see Bernaisch et al. 2011). The daily newspapers taken from the periodicals section of the BNC (BNC news) provide comparable data for Present-day BrE.
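The agency-report filter described above can be sketched roughly as follows. This is a minimal illustration, not the authors' actual procedure: the marker list is a short, purely illustrative stand-in for their list of 250 names and abbreviations (which is not reproduced in the text), and the function names `build_agency_pattern` and `drop_agency_texts` are hypothetical.

```python
import re

# Illustrative markers only (assumption); the actual list of 250 agency
# names and abbreviations is not given in the text.
AGENCY_MARKERS = ["Reuters", "AFP", "PTI", "Press Trust of India"]


def build_agency_pattern(markers):
    """Compile one regex that matches any marker as a whole word or phrase."""
    # Longer markers first, so that full names win over shorter alternatives.
    ordered = sorted(markers, key=len, reverse=True)
    alternatives = "|".join(re.escape(marker) for marker in ordered)
    # Lookarounds prevent matches inside longer words (e.g. "PTI" in "OPTIMAL").
    return re.compile(r"(?<!\w)(?:" + alternatives + r")(?!\w)")


def drop_agency_texts(texts, markers=AGENCY_MARKERS):
    """Return only those texts that carry none of the agency markers."""
    pattern = build_agency_pattern(markers)
    return [text for text in texts if not pattern.search(text)]


if __name__ == "__main__":
    sample = [
        "COLOMBO (Reuters) - Tea exports rose last month.",
        "The committee will submit its report to the minister next week.",
        "NEW DELHI, PTI: The board conveyed its decision to the players.",
    ]
    for text in drop_agency_texts(sample):
        print(text)
```

Matching is case-sensitive here, on the assumption that agency credits appear in a fixed spelling; a real archive would probably need markup-aware matching on the byline rather than on the running text.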
While the newspaper corpora are a very useful database, one needs to be aware of certain limitations. The most significant restriction is the genre-specificity of the data. Although a newspaper may be seen as a relatively diverse collection of various text types (e.g. editorials, comments, obituaries) covering a rich array of topics, the language as it is used in this narrowly defined context can hardly be regarded as representing a certain variety of English with regard to all written genres, let alone spoken language. Furthermore, there is a complex process of editing and re-editing the text on its way from the original manuscript

.  For a detailed description of the ICE corpus design and the design of the written component in particular, see Nelson (1996).

to the final article (just as in other written texts to be published). In spite of these restrictions, Web-derived offline newspaper corpora are no doubt valuable in that they provide very large collections of text representing the acrolectal standard variant. Schilk (2011) also argues in favour of employing newspaper corpora for corpus-linguistic analyses since the authors (and editors) of these newspapers can be considered very proficient users of the English language – whatever can be found in a published article is presumably not considered a learner mistake. This benefit of newspaper corpora might be regarded as the other side of the coin of the editing process. Note also that in many postcolonial Englishes, language use is much more strongly oriented towards written norms than it is in ENL contexts (see Hundt 2006: 223), which also turns newspaper corpora into particularly attractive corpus-linguistic resources for the description of South Asian Englishes. In this context, Schilk (2011) puts forward that, in the absence of variety-specific dictionaries and grammars, newspapers also serve as starting points for the standardization process in ESL contexts as in India. A corpus environment consisting of a well-defined small corpus of a range of genres and a genre-specific large corpus of newspaper language thus seems to provide an adequate and easily accessible database for the description of South Asian Englishes.

5.  Analysis and results

In the following, the use of the TCM-related verbs CONVEY, SUBMIT and SUPPLY will be analysed from a quantitative and qualitative point of view. The first part of this section gives an introduction to the prototypical semantics of the verbs and their frequencies in the different varieties (see Section 5.1). Afterwards, quantitative differences with regard to the preferred complementation patterns and qualitative differences with regard to variety-specific usage patterns of the verbs under scrutiny will be described (see Sections 5.2–5.4).

5.1  Verbs under scrutiny: CONVEY, SUBMIT and SUPPLY

The three selected verbs have been chosen because they are typically used in the TCM construction and each of them encodes a transfer process; but in contrast to other verbs of transfer, such as GIVE, they are usually not used in the semantically synonymous ditransitive construction. Although all three verbs can be viewed as typical members of the semantically defined class of TCM-related verbs, there are clear differences in the meaning and use of each of them, which we will briefly discuss.



To begin with, CONVEY is relatively infrequent in all three varieties while SUBMIT and SUPPLY are used fairly frequently. The overall frequencies of all verbs in the corpus data are shown in Table 2.

Table 2.  CONVEY, SUBMIT and SUPPLY in Sri Lankan, Indian and British English

| Verb (lemma) | ICE-SLw total | ICE-SLw pmw | ICE-INDw total | ICE-INDw pmw | ICE-GBw total | ICE-GBw pmw |
|---|---|---|---|---|---|---|
| CONVEY | 21 | 43.5 | 40 | 91.1 | 9 | 20.7 |
| SUBMIT | 53 | 109.8 | 49 | 111.6 | 15 | 34.5 |
| SUPPLY | 36 | 74.6 | 32 | 72.6 | 48 | 110.4 |

| Verb (lemma) | SAVE-SL total | SAVE-SL pmw | SAVE-India total | SAVE-India pmw | BNC news total | BNC news pmw |
|---|---|---|---|---|---|---|
| CONVEY | 110 | 35.4 | 92 | 29.9 | 82 | 9.1 |
| SUBMIT | 478 | 153.9 | 514 | 166.9 | 256 | 28.5 |
| SUPPLY | 214 | 68.9 | 197 | 64.0 | 506 | 56.4 |
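The pmw columns normalize raw hits to occurrences per million words, which is what makes corpora of very different sizes comparable. The following sketch shows the arithmetic for two BNC news cells, using the BNC news word count from Table 1; the function name `per_million_words` is an illustrative choice. Other cells may not be reproduced exactly from the rounded Table 1 figures, presumably because the token counts underlying the published values differ slightly.

```python
def per_million_words(hits, corpus_size):
    """Normalize a raw frequency to occurrences per million words (pmw)."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return hits / corpus_size * 1_000_000


# Word count of BNC news as listed in Table 1.
BNC_NEWS_WORDS = 8_992_587

# Raw hits for CONVEY and SUBMIT in BNC news as listed in Table 2.
for lemma, hits in {"CONVEY": 82, "SUBMIT": 256}.items():
    print(lemma, round(per_million_words(hits, BNC_NEWS_WORDS), 1))
```

Rounded to one decimal place, these two values correspond to the 9.1 and 28.5 reported for BNC news in Table 2.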

Especially in the case of SUBMIT, not only significant variety-based differences between BrE and the South Asian varieties, but also genre-specific differences seem to play a role in the South Asian varieties. While in BrE the frequencies of SUBMIT in ICE and BNC news are comparable, in the South Asian corpora the verb is used significantly more frequently in the newspaper corpora than in the ICE components. It thus seems plausible to assume that SUBMIT may display genre-specific usage patterns in South Asian English newspaper language. When it comes to the description of verb complementation as a part of the lexis-grammar interface, it is useful to distinguish between several levels of granularity. On a very high level of abstraction, it is possible to analyse sentences with regard to their level of transitivity (see Hopper & Thompson 1980). If transitivity of verbs is seen as a cline rather than a static feature, higher levels of transitivity may be assumed if more arguments in a sentence are explicitly profiled, whereas lower levels of transitivity are assumed when fewer argument elements of a sentence are made explicit. In their operationalization of transitivity differences between IndE and BrE, Mukherjee and Schilk (2008) restrict themselves to the most essential of the ten parameters that Hopper and Thompson (1980) posit for the description of transitivity, namely “the number of participants [which] is central to the traditional notion of Transitivity” (Thompson & Hopper 2001: 32). In order to capture differences on the overall level of transitivity according to the number of profiled arguments, we opted for the coding schema used by Mukherjee and Schilk (2008): they coded each sentence that they analysed according to its complementation

pattern and subsumed all complementation patterns with the same number of profiled arguments into three categories (including all patterns in which one, two and three arguments are made explicit, respectively). Table 3 shows which pattern types include which and how many profiled arguments according to this logic. Note in this context that all passive patterns (IP, IIP, IIIP, VP) are treated here as including one profiled argument less than the active pattern from which it is derived as in the vast majority of all cases the by-agent is not made explicit in a passive sentence.

Table 3.  Profiled arguments in the complementation patterns I to V and derivative patterns (Mukherjee & Schilk 2008: 176)

| Number of arguments | S profiled? | Od profiled? | Oi profiled? | Pattern type |
|---|---|---|---|---|
| 1 | Yes | No | No | IV |
| 1 | No | Yes | No | IIIP |
| 1 | No | No | Yes | VP |
| 2 | Yes | Yes | No | III |
| 2 | Yes | No | Yes | V |
| 2 | No | Yes | Yes | IP, IIP |
| 3 | Yes | Yes | Yes | I, II |
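The mapping in Table 3 can be applied mechanically once each concordance line has been coded for its pattern. The sketch below is one possible way to do this: the dictionary transcribes Table 3, while the function name `transitivity_profile` and the pattern counts in the example are hypothetical and serve only as an illustration. Derivative patterns (Ider, IIder, etc.) are deliberately left out, since Table 3 does not list how they are to be counted.

```python
from collections import Counter

# Number of profiled arguments per pattern type, transcribed from Table 3.
PROFILED_ARGUMENTS = {
    "IV": 1, "IIIP": 1, "VP": 1,
    "III": 2, "V": 2, "IP": 2, "IIP": 2,
    "I": 3, "II": 3,
}


def transitivity_profile(pattern_counts):
    """Sum pattern frequencies by the number of profiled arguments.

    Raises KeyError for any pattern label that Table 3 does not cover.
    """
    totals = Counter()
    for pattern, count in pattern_counts.items():
        totals[PROFILED_ARGUMENTS[pattern]] += count
    return dict(totals)


# Hypothetical counts for one verb in one corpus (not taken from the study).
hypothetical_counts = {"II": 12, "IIP": 5, "III": 30, "IIIP": 3}
print(transitivity_profile(hypothetical_counts))
```

Raising an error on unlisted labels, rather than silently skipping them, keeps any decision about derivative patterns explicit.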

The comparison of varieties of English at the level of transitivity trends, i.e. with regard to overall preferences for more or fewer profiled arguments, refers to a very abstract level of descriptive granularity. In the present study, we will also look at individual complementation patterns of CONVEY, SUBMIT and SUPPLY and describe at a more concrete level of description how the profiled arguments are filled with lexical items, also in order to capture variety-specific usage patterns for the three TCM-related verbs. For IndE and BrE, Schilk (2011) has shown that there are very often significant correlations between the preference of a verb for a specific complementation pattern and the choice of particular lexical items (in the sense of collocational routines) in the pattern. In the following sections, each of the three TCM-related verbs CONVEY, SUBMIT and SUPPLY will be analysed in three steps: (1) quantitative differences between the three varieties at the level of verb-complementational differences, (2) overarching transitivity differences between the three varieties, and (3) qualitative (and semantic) differences with regard to potential variety-specific usage patterns.

5.2  CONVEY in the ICE and SAVE corpora

CONVEY is the least frequent of the three TCM-related verbs under scrutiny. A detailed quantitative analysis is not feasible for the ICE data and our analysis will
thus be limited to the much larger newspaper dataset. The SAVE/BNC news dataset provides some interesting insights into the usage of CONVEY in the three varieties. Table 4 shows the distribution of the complementation patterns of CONVEY in the newspaper corpora.

Table 4.  CONVEY in BNC news and SAVE corpora

Pattern    BNC news   BNC news pmw   SAVE-SL   SAVE-SL pmw   SAVE-India   SAVE-India pmw
I          n.a.       0.00           1         0.32          n.a.         0.00
Ider       n.a.       0.00           n.a.      0.00          1            0.32
II         10         1.11           28        9.02          15           4.87
IIder      4          0.45           8         2.58          10           3.25
IIP        2          0.22           16        5.15          7            2.27
IIPder     1          0.11           4         1.29          1            0.32
III        52         5.80           36        11.59         45           14.61
IIIder     9          1.00           6         1.93          7            2.27
IIIP       1          0.11           8         2.58          4            1.30
IIIPder    3          0.33           3         0.97          2            0.65
IV         n.a.       0.00           n.a.      0.00          n.a.         0.00
V          n.a.       0.00           n.a.      0.00          n.a.         0.00
sum        82         9.14           110       35.42         92           29.88
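The pmw columns in Table 4 are frequencies normalised per million words. The corpus sizes are not stated in this section, so the following minimal sketch back-calculates them from the sum row; the resulting sizes are therefore implied estimates, not figures reported by the authors. The function names `per_million` and `implied_corpus_size` are illustrative only.

```python
# Sum row of Table 4: (raw count, frequency per million words).
TABLE4_SUMS = {
    "BNC news": (82, 9.14),
    "SAVE-SL": (110, 35.42),
    "SAVE-India": (92, 29.88),
}


def per_million(count, corpus_size):
    """Normalise a raw count to occurrences per million words."""
    return count / corpus_size * 1_000_000


def implied_corpus_size(count, pmw):
    """Invert the normalisation to estimate the corpus size in words."""
    return count / pmw * 1_000_000


# ASSUMPTION: sizes are inferred from the table, not taken from the source.
implied_sizes = {
    name: implied_corpus_size(count, pmw)
    for name, (count, pmw) in TABLE4_SUMS.items()
}

for name, size in implied_sizes.items():
    print(f"{name}: about {size / 1e6:.2f} million words (implied)")

# Cross-check against a single cell: type III in BNC news has 52 hits.
check = per_million(52, implied_sizes["BNC news"])
print(f"type III in BNC news: {check:.2f} pmw")
```

Because the two South Asian corpora are roughly a third the size of the British one on this estimate, similar raw counts translate into much higher normalised frequencies, which is why the comparison in the text relies on the pmw columns.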

Table 4 reveals that, at first sight, there is a tendency to use the type-II pattern, its derivative and passive patterns more frequently in SAVE-SL and – to a lesser extent – in SAVE-India compared to the British data. In the BNC, on the other hand, the type-III pattern is used more frequently than in SAVE-SL and about as frequently as in SAVE-India. However, it needs to be noted that the expected frequencies are very low in many cells so that significant differences between the three varieties are difficult to pinpoint. When focusing on overall transitivity trends according to the number of profiled elements, there are neither statistically significant differences between BrE and IndE nor between IndE and SLE. At this level, significant differences are only attested between BrE and SLE. Table 5 shows that in SAVE-SL significantly more elements tend to be profiled than in the BNC news corpus. This reflects the higher frequency of use of the type-II pattern, since in the type-II pattern all three elements are profiled, while in the type-III pattern (which is more frequently used in BrE) only two elements of the argument structure of CONVEY are made explicit.

 Marco Schilk, Tobias Bernaisch & Joybrato Mukherjee

Table 5.  Number of profiled elements for CONVEY in BNC news and SAVE-SL

Profiled arguments   BNC news   BNC news exp.   χ²     SAVE-SL   SAVE-SL exp.   χ²
1                    4          6.41            0.90   11        8.59           0.67
2                    64         53.81           1.93   62        72.19          1.44
3                    14         21.78           2.78   37        29.22          2.07
sum                  82         82.00           5.61   110       110.00         4.18
total χ²/p           9.79/p < .05
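The step from Table 4 to Table 5 can be sketched as follows: the raw pattern counts are grouped by number of profiled arguments (using the mapping in Table 3), and a Pearson chi-square test is run on the resulting 3 × 2 table. This is a minimal illustration, not the authors' own procedure. It assumes that derivative patterns ("der") are counted with their base pattern, and the function names are hypothetical.

```python
import math

# Profiled arguments per pattern type (Table 3).
ARGS = {"I": 3, "II": 3, "III": 2, "V": 2, "IP": 2, "IIP": 2,
        "IV": 1, "IIIP": 1, "VP": 1}

# Raw CONVEY counts from Table 4 (n.a. cells omitted).
BNC = {"II": 10, "IIder": 4, "IIP": 2, "IIPder": 1, "III": 52,
       "IIIder": 9, "IIIP": 1, "IIIPder": 3}
SAVE_SL = {"I": 1, "II": 28, "IIder": 8, "IIP": 16, "IIPder": 4,
           "III": 36, "IIIder": 6, "IIIP": 8, "IIIPder": 3}


def by_profiled_arguments(counts):
    """Return [one-argument, two-argument, three-argument] totals."""
    totals = [0, 0, 0]
    for pattern, n in counts.items():
        # ASSUMPTION: a derivative pattern profiles as many arguments
        # as the base pattern it is derived from.
        base = pattern[:-3] if pattern.endswith("der") else pattern
        totals[ARGS[base] - 1] += n
    return totals


def chi_square(col_a, col_b):
    """Pearson chi-square for two columns of observed frequencies."""
    n_a, n_b = sum(col_a), sum(col_b)
    total = 0.0
    for a, b in zip(col_a, col_b):
        exp_a = (a + b) * n_a / (n_a + n_b)  # expected frequency, column A
        exp_b = (a + b) * n_b / (n_a + n_b)  # expected frequency, column B
        total += (a - exp_a) ** 2 / exp_a + (b - exp_b) ** 2 / exp_b
    return total


observed_bnc = by_profiled_arguments(BNC)
observed_sl = by_profiled_arguments(SAVE_SL)
chi2 = chi_square(observed_bnc, observed_sl)

# For 2 degrees of freedom the chi-square survival function is exp(-x/2).
p_value = math.exp(-chi2 / 2)

print(observed_bnc, observed_sl)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

The grouped counts match the observed columns of Table 5, and the test statistic agrees with the reported total up to rounding of the individual cell values.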

It has to be borne in mind, however, that at the level of overall transitivity trends, the profiling and non-profiling of any single argument is treated in very much the same way in the coding system, e.g. the non-profiling of the agent (as frequently done in passive patterns in which the by-agent is optional) and the non-profiling of the recipient (as in the type-III pattern). Thus, if we examine pattern distribution from this macroscopic perspective, not much can be said about which differences between which complementation patterns are primarily responsible for the different transitivity trends and which variety-specific usages of the verbs lead to verb-complementational differences across varieties in the first place.

If specific usage patterns are scrutinized, a clear difference in the case of CONVEY between BrE, on the one hand, and the two South Asian varieties, on the other, can be shown for the lexical items that encode the direct object in the underlying transfer process, i.e. the conveyed “entities”. In the British corpora (ICE-GB and BNC news) CONVEY is most frequently used in combination with direct objects that refer to feelings or mental states; see examples (11) and (12).

(11) The author also conveys the strong feeling of nationalism, so much so that Jaruzelski, although Moscow-trained, tried to remain faithful to the Polish people […]. (ICE-GB W2B-005#39:1)

(12) The colliding conversations are neatly synchronised but the main problem is that each part needs to convey a sense of tough experience with some firm characterisation which the self-conscious and rather tense cast couldn’t find in this patchy production. (BNC K57)

While in the South Asian corpora, this meaning-group is also frequently attested for the direct object of CONVEY, the vast majority of the lexical elements in the direct-object position represent a wide range of verbal messages (including the expression of emotions, gratitude, etc.). Consider examples (13) to (16):

(13) Naya Daur, Naya Zamana, Deshpremi, Leader, Inquilab or Main Azaad Hoon, they have all conveyed a message. (ToI 37330)

(14) “It is particularly sad for all of us in the BJP that she met with a untimely death while on a campaign journey for the party,” the prime minister said in a message conveying his heartfelt condolences to the bereaved members of the families of the deceased. (ToI 38094)

(15) “I hope he conveyed to the press my suggestions on improving the prison conditions.” (ST 2004-12-20).

(16) Since we leave tomorrow morning I write to convey my sincere thanks to you for all that you did to us during our sojourn at Diyatalawa. (ICE-SL W2C-002#89: 2)

The South Asian corpus data thus show that the verb CONVEY is used frequently in combination with lexical items encoding verbal messages in the direct-object position – this is not the case in BrE. Together with the fact that in BrE the direct object of complementation patterns of CONVEY usually refers to feelings or states of mind, these observations may also explain why the type-II pattern is preferred in South Asian varieties and the type-III pattern is more frequent in BrE: the more concrete nature of verbal messages (as compared to feelings or states of mind) requires the explicit mention of a recipient (→ type II), while this is not the case for feelings or states of mind, as for example in convey an atmosphere (→ type III).

5.3  SUBMIT in ICE and SAVE corpora

Table 6 gives the numbers for the occurrence of the verb SUBMIT in the ICE data.

Table 6.  SUBMIT in ICE

Pattern    ICE-GBw   ICE-SLw   ICE-INDw
I          n.a.      n.a.      n.a.
II         1         6         8
IIder      1         n.a.      1
IIP        2         1         4
IIPder     n.a.      6         2
III        5         19        18
IIIder     n.a.      n.a.      1
IIIP       1         11        3
IIIPder    2         6         8
IV         n.a.      4         3
V          3         n.a.      1
sum        15        53        49

Against the background of the much higher frequency of SUBMIT in ICE-SL and ICE-IND compared to ICE-GB, the emergence of distinctly South Asian usage patterns of the verb seems plausible. The overall quantitative difference between British English and the South Asian Englishes, which is in line with the results for CONVEY, is corroborated by the figures obtained from the much larger newspaper corpora; consider Table 7.

Table 7.  SUBMIT in BNC news and SAVE corpora

Pattern    BNC news   BNC news pmw   SAVE-SL   SAVE-SL pmw   SAVE-India   SAVE-India pmw
I          n.a.       0.00           2         0.64          n.a.         0.00
II         35         3.90           63        20.29         143          46.44
IIder      6          0.67           21        6.76          7            2.27
IIP        27         3.01           34        10.95         23           7.47
IIPder     9          1.00           29        9.34          24           7.79
III        73         8.14           135       43.47         150          48.71
IIIder     32         3.57           100       32.20         75           24.36
IIIP       26         2.90           35        11.27         28           9.09
IIIPder    31         3.46           53        17.07         56           18.19
IV         5          0.56           1         0.32          2            0.65
V          12         1.34           5         1.61          6            1.95
sum        256        28.53          478       153.91        514          166.93

On the basis of the newspaper corpora, we compared the overall transitivity trends across the three varieties by grouping all instances of SUBMIT into categories defined by the number of profiled arguments. The results are given in Table 8, taking into account all intervarietal comparisons.

Table 8.  Number of profiled arguments for SUBMIT

Profiled arguments   BNC news   SAVE-SL      BNC news exp.   χ²      SAVE-SL exp.      χ²
1                    62         89           52.66           1.65    98.34             0.89
2                    153        303          159.04          0.23    296.96            0.12
3                    41         86           44.29           0.25    82.71             0.13
sum                  256        478          256.00          2.13    478.00            1.14
total χ²/p           3.27/p > .05

Profiled arguments   BNC news   SAVE-India   BNC news exp.   χ²      SAVE-India exp.   χ²
1                    62         86           49.21           3.33    98.79             1.66
2                    153        278          143.29          0.66    287.71            0.33
3                    41         150          63.50           7.97    127.50            3.97
sum                  256        514          256.00          11.96   514.00            5.96
total χ²/p           17.92/p < .05

Profiled arguments   SAVE-SL    SAVE-India   SAVE-SL exp.    χ²      SAVE-India exp.   χ²
1                    89         86           84.32           0.26    90.68             0.24
2                    303        278          279.96          1.90    301.04            1.76
3                    86         150          113.72          6.76    122.28            6.28
sum                  478        514          478.00          8.91    514.00            8.29
total χ²/p           17.2/p < .05

As shown in Table 8, there is no significant difference between BrE and SLE. However, there are significant differences between BNC news and SAVE-India as well as between SAVE-India and SAVE-SL. The differences between the British and the Indian data are due to the fact that SUBMIT is generally used with more profiled arguments in SAVE-India compared to BNC news. While in SAVE-India there are 150 instances of sentences with three profiled arguments, in the British corpus only 41 such instances are attested. For the group of complementation patterns with one profiled argument, the reverse is true, i.e. these patterns are much more prominent in BrE than in IndE in relative terms. These observations are in line with the tentative hypothesis of Mukherjee and Schilk (2008) that in IndE more arguments seem to be profiled with SUBMIT than in BrE.⁵ As shown in Table 8, in IndE there is also a tendency towards profiling more arguments than in SLE, especially when it comes to the three-argument patterns.
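Since the three corpora yield different numbers of SUBMIT tokens, the phrase "in relative terms" is easiest to follow when the observed counts in Table 8 are converted into within-corpus percentages. The sketch below does only that; the percentages are derived here for illustration and are not reported in the source.

```python
# Observed SUBMIT counts by number of profiled arguments (Table 8).
OBSERVED = {
    "BNC news": {1: 62, 2: 153, 3: 41},
    "SAVE-SL": {1: 89, 2: 303, 3: 86},
    "SAVE-India": {1: 86, 2: 278, 3: 150},
}


def shares(counts):
    """Convert raw counts into percentages of the corpus total."""
    total = sum(counts.values())
    return {args: 100 * n / total for args, n in counts.items()}


percentages = {corpus: shares(counts) for corpus, counts in OBSERVED.items()}

for corpus, pct in percentages.items():
    row = ", ".join(f"{args} arg.: {value:.1f}%" for args, value in pct.items())
    print(f"{corpus}: {row}")
```

On these figures, three-argument patterns make up a clearly larger share of the SAVE-India tokens than of the BNC news tokens, while the share of one-argument patterns is larger in BNC news.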

5.  At first glance, this does not seem surprising as there is an overlap between the datasets used by Mukherjee and Schilk (2008) and in the present study. However, this overlap is only partial: firstly, SAVE-India in the present study includes two newspapers while in Mukherjee and Schilk (2008) only data from one newspaper were used; secondly, BNC news in the present study only includes the newspaper part of the BNC periodical section that was used in Mukherjee and Schilk (2008).

In contrast to the comparison of BrE and IndE, however, there is no marked difference between IndE and SLE with regard to the preference for one-argument patterns; rather, IndE and SLE differ with regard to their preference for patterns with two profiled arguments. In order to gain a more detailed picture of the verb-complementational differences between the three varieties, it is useful to compare the various pattern types. In Table 9, the distributional differences between the varieties according to pattern type are shown.

Table 9.  Pattern distribution of SUBMIT in newspaper corpora

Pattern         BNC news   SAVE-SL      BNC news exp.   χ²      SAVE-SL exp.      χ²
II + IIder      41         84           43.22           0.11    81.78             0.06
IIP + IIPder    36         63           34.23           0.09    64.77             0.05
III             73         135          71.91           0.02    136.09            0.01
IIIder          32         100          45.64           4.07    86.36             2.15
IIIP            26         35           21.09           1.14    39.91             0.60
IIIPder         31         53           29.04           0.13    54.96             0.07
V               12         5            5.88            6.38    11.12             3.37
sum             251        475          251.00          11.95   475.00            6.31
total χ²/p      18.26/p < .05

Pattern         BNC news   SAVE-India   BNC news exp.   χ²      SAVE-India exp.   χ²
II + IIder      41         150          62.83           7.59    128.17            3.72
IIP + IIPder    36         47           27.30           2.77    55.70             1.36
III             73         150          73.36           0.00    149.64            0.00
IIIder          32         75           35.20           0.29    71.80             0.14
IIIP            26         28           17.76           3.82    36.24             1.87
IIIPder         31         56           28.62           0.20    58.38             0.10
V               12         6            5.92            6.24    12.08             3.06
sum             251        512          251.00          20.90   512.00            10.25
total χ²/p      31.15/p < .05

Pattern         SAVE-SL    SAVE-India   SAVE-SL exp.    χ²      SAVE-India exp.   χ²
II + IIder      84         150          112.61          7.27    121.39            6.75
IIP + IIPder    63         47           52.94           1.91    57.06             1.77
III             135        150          137.16          0.03    147.84            0.03
IIIder          100        75           84.22           2.96    90.78             2.74
IIIP            35         28           30.32           0.72    32.68             0.67
IIIPder         53         56           52.46           0.01    56.54             0.01
V               5          6            5.29            0.02    5.71              0.02
sum             475        512          475.00          12.92   512.00            11.98
total χ²/p      24.9/p < .05
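The per-cell χ² values in Table 9 indicate which pattern types drive an overall difference. As a hedged illustration, the sketch below recomputes the row contributions for the upper panel (BNC news vs. SAVE-SL) and ranks them. The helper name `row_contributions` is hypothetical, and the critical value is the standard chi-square table value for 6 degrees of freedom at the .05 level rather than a figure given in the source.

```python
# Observed SUBMIT counts, upper panel of Table 9: (BNC news, SAVE-SL).
OBSERVED = {
    "II + IIder": (41, 84),
    "IIP + IIPder": (36, 63),
    "III": (73, 135),
    "IIIder": (32, 100),
    "IIIP": (26, 35),
    "IIIPder": (31, 53),
    "V": (12, 5),
}


def row_contributions(observed):
    """Chi-square contribution of each row, summed over both corpora."""
    n_a = sum(a for a, _ in observed.values())
    n_b = sum(b for _, b in observed.values())
    result = {}
    for pattern, (a, b) in observed.items():
        exp_a = (a + b) * n_a / (n_a + n_b)  # expected count, first corpus
        exp_b = (a + b) * n_b / (n_a + n_b)  # expected count, second corpus
        result[pattern] = (a - exp_a) ** 2 / exp_a + (b - exp_b) ** 2 / exp_b
    return result


contributions = row_contributions(OBSERVED)
total_chi2 = sum(contributions.values())
ranked = sorted(contributions, key=contributions.get, reverse=True)

# ASSUMPTION: standard table value for df = (7 - 1) * (2 - 1) = 6, alpha = .05.
CRITICAL_05_DF6 = 12.592

for pattern in ranked:
    print(f"{pattern:14s} {contributions[pattern]:6.2f}")
print(f"total {total_chi2:.2f}, exceeds critical value: {total_chi2 > CRITICAL_05_DF6}")
```

In this panel, the type-V and type-IIIder rows account for most of the total, which suggests that the overall difference between BNC news and SAVE-SL rests on a small number of pattern types.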

The upper panel in Table 9 shows the differences in the distribution of complementation patterns between BrE and SLE, which is significant at p