[Journal] Applied Linguistics. 2010. Vol. 31. No 4



APPLIED LINGUISTICS

ISSN 0142-6001 (PRINT) ISSN 1477-450X (ONLINE)

Applied Linguistics

Volume 31 Number 4 September 2010


Published in cooperation with AAAL American Association for Applied Linguistics AILA International Association of Applied Linguistics BAAL British Association for Applied Linguistics

OXFORD

www.applij.oxfordjournals.org

EDITORS Ken Hyland, Director, Centre for Applied English Studies, KK Leung Building, The University of Hong Kong, Pokfulam Road, Hong Kong; Jane Zuengler, Nancy C. Hoefs Professor of English, University of Wisconsin-Madison, 6103 Helen C. White Hall, 600 North Park Street, Madison, WI 53706, USA. Assistant to Jane Zuengler: Heather Carroll, University of Wisconsin-Madison

REVIEWS AND FORUM EDITOR Stef Slembrouck, Professor of English Linguistics and Discourse Analysis, Universiteit Gent, Vakgroep Engels, Rozier 44, B-9000 Gent, Belgium. Assistant to Stef Slembrouck: Tine Defour, Universiteit Gent

ADVISORY BOARD Guy Cook, British Association for Applied Linguistics Aneta Pavlenko, American Association for Applied Linguistics Martin Bygate, International Association for Applied Linguistics Huw Price, Oxford University Press

EDITORIAL PANEL Karin Aronsson, Linköping University David Block, London University Institute of Education Jan Blommaert, University of Jyväskylä Deborah Cameron, University of Oxford Lynne Cameron, Open University (BAAL Representative) Tracey Derwing, University of Alberta Zoltán Dörnyei, University of Nottingham Patricia Duff, University of British Columbia Diana Eades, University of New England, Australia ZhaoHong Han, Columbia University (AAAL representative) Gabriele Kasper, University of Hawai'i at Manoa Claire Kramsch, University of California at Berkeley Angel Lin, University of Hong Kong Janet Maybin, Open University, UK Tim McNamara, University of Melbourne Junko Mori, University of Wisconsin-Madison Greg Myers, Lancaster University Susanne Niemeier, University of Koblenz-Landau (AILA Representative) Lourdes Ortega, University of Hawai'i at Manoa Alastair Pennycook, University of Technology, Sydney Ben Rampton, King's College, University of London Steven Ross, Kwansei Gakuin University Alison Sealey, University of Birmingham Antonella Sorace, University of Edinburgh Lionel Wee, National University of Singapore Applied Linguistics is published five times a year in February, May, July, September and December by Oxford University Press, Oxford, UK. Annual subscription price is £254/US$457/€381. Applied Linguistics is distributed by Mercury International, 365 Blair Road, Avenel, NJ 07001, USA. Periodicals postage paid at Rahway, NJ and at additional entry points. US Postmaster: send address changes to Applied Linguistics (ISSN 0142-6001), c/o Mercury International, 365 Blair Road, Avenel, NJ 07001, USA. © Oxford University Press 2010 All rights reserved; no part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise without prior written permission of the Publishers, or a licence permitting restricted copying issued in the UK by the Copyright Licensing Agency Ltd, 90 Tottenham Court Road, London W1P 9HE, or in the USA by the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923. Typeset by Glyph International, Bangalore, India. Printed by Bell and Bain Ltd, Glasgow

Applied Linguistics Journal online
The full text of Applied Linguistics is available online to journal subscribers. Online access has a number of advantages:
- quality PDFs ensure articles look the same as the print original and are easy to print out
- access is easy—all you need is your subscription number or institutional IP address (see below)
- online access is available ahead of print publication—so view while you await your print version!
- access the text wherever you are (or from any part of your institution network if you have a library subscription)
- perform searches by word or author across the full text of the articles of any part of the journal
- download articles whenever you choose—you will be able to access past online issues as long as you have a current subscription
- free sample copy available online
- fully searchable abstracts/titles going back to volume 1
- Table of Contents email alerting service.
The print version will continue to be available as previously. Institutions may choose to subscribe to the print edition only, online only, or both. Individual subscribers automatically receive both.

CONTRIBUTORS There is no need for contributors to format their articles any differently; online files are produced automatically from the final page proofs of the journal. However, if you know that an item in your list of references is available online, please supply the URL. If you have your own website, you are welcome to include the URL with your contact address in your biodata.

ADVANCE ACCESS Applied Linguistics now has Advance Access articles. These are papers that have been copyedited and typeset but not yet paginated for inclusion in an issue of the journal. More information, including how to cite Advance Access papers, can be found online at http://www.applij.oxfordjournals.org.

Applied Linguistics Subscription Information

A subscription to Applied Linguistics comprises 5 issues. Annual Subscription Rate (Volume 31, 5 issues, 2010) Institutional. Print edition and site-wide online access: £254.00/US$457.00/€381.00; Print edition only: £233.00/US$419.00/€350.00; Site-wide online access only: £212.00/US$382.00/€318.00.

Oxford University Press, Great Clarendon Street, Oxford OX2 6DP, UK. Email: [email protected]. Tel (and answerphone outside normal working hours): +44 (0)1865 353907. Fax: +44 (0)1865 353485. In Japan, please contact: Journals Customer Services, Oxford Journals, Oxford University Press, Tokyo, 4-5-10-8F Shiba, Minato-ku, Tokyo 108-8386, Japan. Tel: +81 3 5444 5858. Fax: +81 3 3454 2929.

Personal. Print edition and individual online access: £82.00/US$164.00/€123.00.

subscribe to applied linguistics

Please note: US$ rate applies to US & Canada, Euros applies to Europe, UK£ applies to UK and Rest of World.

For new subscriptions and recent single issues only. Current subscribers will automatically receive a renewal form.

Prices include postage by surface mail, or for subscribers in the USA and Canada by airfreight, or in India, Japan, Australia and New Zealand, by Air Speeded Post. Airmail rates are available on request. There are other subscription rates available for members of AAAL, BAAL, AILA, and LSA; for a complete listing please visit www.applij.oxfordjournals.org/subscriptions. Full prepayment, in the correct currency, is required for all orders. Orders are regarded as firm and payments are not refundable. Subscriptions are accepted and entered on a complete volume basis. Claims cannot be considered more than FOUR months after publication or date of order, whichever is later. All subscriptions in Canada are subject to GST. Subscriptions in the EU may be subject to European VAT. If registered, please supply details to avoid unnecessary charges. For subscriptions that include online versions, a proportion of the subscription price may be subject to UK VAT. Personal rate subscriptions are only available if payment is made by personal cheque or credit card and delivery is to a private address. The current year and two previous years' issues are available from Oxford Journals. Previous volumes can be obtained from the Periodicals Service Company at http://www.periodicals.com/oxford.html or Periodicals Service Company, 11 Main Street, Germantown, NY 12526, USA. Email: [email protected]. Tel: +1 (518) 537 4700. Fax: +1 (518) 537 5899. For further information, please contact the Journals Customer Service Department at the Oxford address given above.

Please complete the form below and return it to: Journal Customer Service Department (please see above). Please record my subscription to Applied Linguistics, starting with Volume__________ (Subscriptions start with the March issue and can be accepted for complete volumes only.) Please send me the following single issue(s) Volume_________ Issue_________ Name (BLOCK CAPITALS please) _________________________________________ Address __________________________________ _________________________________________ _________________________________________ City _____________________________________ Country _________________________________ Postcode _________________________________ I enclose the correct payment of (see rates above): £/US$/€ __________________________________ Please debit my credit card: American Express / Mastercard / Visa (delete as appropriate) Card number: __|__|__|__|__|__|__|__|__|__|__|__|__|__|__|__|__ Expiry date: |__|__|__| Signature ________________________________ [ ] Please tick this box if you do NOT wish to receive details of related products and services of OUP and other companies that we think may be of interest.

Aims Applied Linguistics publishes research into language with relevance to real-world problems. The journal is keen to help make connections between fields, theories, research methods, and scholarly discourses, and welcomes contributions which critically reflect on current practices in applied linguistic research. It promotes scholarly and scientific discussion of issues that unite or divide scholars in applied linguistics. It is less interested in the ad hoc solution of particular problems and more interested in the handling of problems in a principled way by reference to theoretical studies. Applied linguistics is viewed not only as the relation between theory and practice, but also as the study of language and language-related problems in specific situations in which people use and learn languages. Within this framework the journal welcomes contributions in such areas of current enquiry as: bilingualism and multilingualism; computer-mediated communication; conversation analysis; corpus linguistics; critical discourse analysis; deaf linguistics; discourse analysis and pragmatics; first and additional language learning, teaching, and use; forensic linguistics; language assessment; language planning and policies; language for special purposes; lexicography; literacies; multimodal communication; rhetoric and stylistics; and translation. The journal welcomes both reports of original research and conceptual articles. The Journal’s Forum section is intended to enhance debate between authors and the wider community of applied linguists (see Editorial in 22/1) and affords a quicker turnaround time for short pieces. Forum pieces are typically responses to a published article, a shorter research note or report, or a commentary on research issues or professional practices. The Journal also contains a Reviews section. Applied Linguistics is covered by the following abstracting/indexing services: Bibliographie Linguistique/Linguistic Bibliography, BLonline, British Education Index, Current Index to Journals in Education, ERIC (Education Resources Information Centre), International Bibliography of the Social Sciences, ISI: Social Sciences Citation Index, Research Alert, Current Contents/Social and Behavioral Sciences, Social Scisearch, Sociological Abstracts: Language and Linguistics Behaviour Abstracts, Language Teaching, MLA Directory of Periodicals, MLA International Bibliography, PsycINFO, Sociological Abstracts, Zeitschrift fu¨r Germanistische Linguistik.

ADVERTISING Inquiries about advertising should be sent to Linda Hann, Oxford Journals Advertising, 60 Upper Broadmoor Road, Crowthorne, RG45 7DE, UK. Email: [email protected]. Tel/Fax: +44 (0)1344 779945.

PERMISSIONS For information on how to request permissions to reproduce articles/information from this journal, please visit www.oxfordjournals.org/permissions.

DISCLAIMER Statements of fact and opinion in the articles in Applied Linguistics are those of the respective authors and contributors and not of Applied Linguistics or Oxford University Press. Neither Oxford University Press nor Applied Linguistics make any representation, express or implied, in respect of the accuracy of the material in this journal and cannot accept any legal responsibility or liability for any errors or omissions that may be made. The reader should make his/her own evaluation as to the appropriateness or otherwise of any experimental technique described.

NOTES TO CONTRIBUTORS Articles submitted to Applied Linguistics should represent outstanding scholarship and make original contributions to the field. The Editors will assume that an article submitted for their consideration has not previously been published and is not being considered for publication elsewhere, either in the submitted form or in a modified version. Articles must be written in English and not include libelous or defamatory material. Manuscripts accepted for publication must not exceed 8,000 words including all material for publication in the print version of the article, except for the abstract, which should be no longer than 175 words. Additional material can be made available in the online version of the article. Such additions will be indexed in the print copy. Applied Linguistics operates a double-blind peer review process. To facilitate this process, authors are requested to ensure that all submissions, whether first or revised versions, are anonymized. Authors’ names and institutional affiliations should appear only on a detachable cover sheet. Submitted manuscripts will not normally be returned. Forum pieces are usually reviewed by the journal Editors and are not sent for external review. Items for the Forum section are normally 2,000 words long. Contributions to the Forum section and offers to review book publications should be addressed to the Forum and Reviews Editor. For more detailed guidelines, see our website http://www.oxfordjournals.org/applij/for_authors/index.html

PROOFS Proofs will be sent to the author for correction, and should be returned to Oxford University Press by the deadline given.

OFFPRINTS On publication of the relevant issue, if a completed offprint form has been received stating gratis offprints are requested, 25 offprints of an article, forum piece or book review will be sent to the authors free of charge. Orders from the UK will be subject to a 17.5% VAT charge. For orders from elsewhere in the EU you or your institution should account for VAT by way of a reverse charge. Please provide us with your or your institution’s VAT number.

COPYRIGHT Acceptance of an author’s copyright material is on the understanding that it has been assigned to the Oxford University Press subject to the following conditions. Authors are free to use their articles in subsequent publications written or edited by themselves, provided that acknowledgement is made of Applied Linguistics as the place of original publication. Except for brief extracts the Oxford University Press will not give permission to a third party to reproduce material from an article unless two months have elapsed without response from the authors after the relevant application has been made to them. It is the responsibility of the author to obtain permission to reproduce extracts, figures, or tables from other works.


APPLIED LINGUISTICS
Volume 31 Number 4 September 2010

CONTENTS

Articles
An Academic Formulas List: New Methods in Phraseology Research
RITA SIMPSON-VLACH and NICK C. ELLIS  487
The Role of Phonological Decoding in Second Language Word-Meaning Inference
MEGUMI HAMADA and KEIKO KODA  513
Dynamic Patterns in Development of Accuracy and Complexity: A Longitudinal Case Study in the Acquisition of Finnish
MARIANNE SPOELMAN and MARJOLIJN VERSPOOR  532
Investigating L2 Performance in Text Chat
SHANNON SAURO and BRYAN SMITH  554

Forum
Making it Real: Authenticity, Process and Pedagogy
RICHARD BADGER and MALCOLM MACDONALD  578

Reviews
Barbara Köpke, Monika S. Schmid, Merel Keijzer, and Susan Dostert (eds): Language Attrition: Theoretical Perspectives
BENJAMIN SHAER  583
Zoltán Dörnyei: Research Methods in Applied Linguistics
HONG ZHONG and HUHUA OUYANG  586
V. Samuda and M. Bygate: Tasks in Second Language Learning
MACHTELD VERHELST  589
H. Spencer-Oatey and P. Franklin: Intercultural Interaction: A Multidisciplinary Approach to Intercultural Communication
ALIREZA JAMSHIDNEJAD  592

NOTES ON CONTRIBUTORS  595

Applied Linguistics: 31/4: 487–512 © Oxford University Press 2010 doi:10.1093/applin/amp058 Advance Access published on 12 January 2010

An Academic Formulas List: New Methods in Phraseology Research
RITA SIMPSON-VLACH and NICK C. ELLIS
University of Michigan

This research creates an empirically derived, pedagogically useful list of formulaic sequences for academic speech and writing, comparable with the Academic Word List (Coxhead 2000), called the Academic Formulas List (AFL). The AFL includes formulaic sequences identified as (i) frequent recurrent patterns in corpora of written and spoken language, which (ii) occur significantly more often in academic than in non-academic discourse, and (iii) inhabit a wide range of academic genres. It separately lists formulas that are common in academic spoken and academic written language, as well as those that are special to academic written language alone and academic spoken language alone. The AFL further prioritizes these formulas using an empirically derived measure of utility that is educationally and psychologically valid and operationalizable with corpus linguistic metrics. The formulas are classified according to their predominant pragmatic function for descriptive analysis and in order to marshal the AFL for inclusion in English for Academic Purposes instruction.

AN ACADEMIC FORMULAS LIST

The aim of this research is to create an empirically derived and pedagogically useful list of formulaic sequences for academic speech and writing, comparable with the Academic Word List (hereafter AWL; Coxhead 2000). It is motivated by current developments in language education, corpus linguistics, cognitive science, second language acquisition (SLA), and English for academic purposes (EAP). Research and practice in SLA demonstrates that academic study puts substantial demands upon students because the language necessary for proficiency in academic contexts is quite different from that required for basic interpersonal communicative skills. Recent research in corpus linguistics analyzing written and spoken academic discourse has established that highly frequent recurrent sequences of words, variously called lexical bundles, chunks, or multiword expressions (inter alia), are not only salient but also functionally significant. Cognitive science demonstrates that knowledge of these formulas is crucial for fluent processing. And finally, current trends in SLA and EAP demand ecologically valid instruction that identifies and prioritizes the most important formulas in different genres. The AFL includes formulaic sequences, identifiable as frequent recurrent patterns in written and spoken corpora, that are significantly more common in academic discourse than in non-academic discourse and which occupy a range of academic genres. It separately lists formulas that occur frequently in both academic spoken and academic written language, as well as those that are more common in either written or spoken genres. A major novel development this research brings to the arena is a ranking of the formulas in these lists according to an empirically derived, psychologically valid measure of utility, called 'formula teaching worth' (FTW). Finally, the AFL presents a classification of these formulas by pragma-linguistic function, with the aim of facilitating their inclusion in EAP curricula.

BACKGROUND

Functional, cognitive linguistic, and usage-based theories of language suggest that the basic units of language representation are constructions—form-meaning mappings, conventionalized in the speech community, and entrenched as language knowledge in the learner's mind (Langacker 1987; Tomasello 1998, 2003; Barlow and Kemmer 2000; Croft and Cruse 2004; Goldberg 2006; Robinson and Ellis 2008). Constructions are associated with particular semantic, pragmatic, and discourse functions, and are acquired through engaging in meaningful communication. Constructions form a structured inventory of a speaker's knowledge of the conventions of their language, as independently represented units in a speaker's mind. Native-like selection and fluency rely on knowledge and automatized processing of these forms (Pawley and Syder 1983; Ellis 2009). Corpus linguistics confirms the recurrent nature of these formulas (Hunston and Francis 1996; McEnery and Wilson 1996; Biber et al. 1998). Large stretches of language are adequately described as collocational streams where patterns flow into each other. Sinclair (1991, 2004) summarizes this in his 'idiom principle': 'a language user has available to him or her a large number of semi-preconstructed phrases that constitute single choices, even though they might appear to be analyzable into segments' (1991: 110). Rather than being a minor feature compared with grammar, Sinclair suggests that for normal texts the first mode of analysis to be applied is the idiom principle, as most text is interpretable by this principle. Comparisons of written and spoken corpora demonstrate that more collocations are found in spoken language (Brazil 1995; Biber et al. 1999; Leech 2000). Speech is constructed in real time and this imposes greater working memory demands than writing, hence the greater need to rely on formulas: it is easier to retrieve something from long-term memory than to construct it anew (Kuiper 1996; Bresnan 1999). Many formulaic constructions are non-compositional or idiomatic, like 'once upon a time' or 'on the other hand', with little scope for substitution ('twice upon a time', 'on the other foot') (Simpson and Mendis 2003). Even those that appear to be more openly constructed may nevertheless be preferred over alternatives (in speech, 'in other words' is preferred over 'to say it differently',
'in paraphrase', 'id est') with the demands of native-like selection entailing that every utterance be chosen from a wide range of possible expressions, to be appropriate for that idea, for that speaker, for that genre, and for that time. Natives and experts in particular genres learn these sequential patterns through repeated usage (Pawley and Syder 1983; Ellis 1996, 2009; Wray 1999, 2002). Psycholinguistic analyses demonstrate that they process collocations and formulas with greater facility than 'equivalent' more open constructions (Bybee and Hopper 2001; Ellis 2002a, 2002b; Jurafsky 2002; Bod et al. 2003; Schmitt 2004; Ellis et al. 2008, 2009). For example, in speech production, 'items that are used together, fuse together' (Bybee 2003: 112): words that are commonly uttered in sequence become pronounced as a whole that is shortened and assimilated ('give + me' → 'gimme'; 'I + am + going + to' → 'I'm gonna', etc.). The phenomenon is graded—the degree of reduction is a function of the frequency of the target word and the conditional probability of the target given the surrounding words (Bybee and Hopper 2001; Jurafsky et al. 2001). EAP research (e.g. Swales 1990; Flowerdew and Peacock 2001; Hyland 2004, 2008; Biber and Barbieri 2006) focuses on determining the functional patterns and constructions of different academic genres. These analyses have increasingly come to be based on corpora representative of different academic fields and registers, such as the Michigan Corpus of Academic Spoken English (Simpson et al. 2002), with qualitative investigation of patterns, at times supported by computer software for analysis of concordances and collocations. But these studies need to be buttressed with quantitative information too, as in the case of vocabulary, where there have been longstanding attempts to identify the more frequent words specific to academic discourse and to determine their frequency profile, harking back, for example, to the University Word List (West 1953). The logic for instruction and testing is simple—the more frequent items have the highest utility and should therefore be taught and tested earlier (Nation 2001). The most significant recent developments in this direction have been those of Coxhead (2000). Her development of the AWL has had a significant impact on EAP teaching and testing because it collects words that have high currency in academic discourse by applying specific criteria of frequency and range of distribution in a 3.5-million-word corpus of academic writing representing a broad spectrum of disciplines. Because academic study puts unique demands on language learners, the creation of the AWL as a teaching resource filled a substantial gap in language education by providing a corpus-based list of lexical items targeted specifically for academic purposes. Can the same principles of academic vocabulary analysis be applied to other lexical units characterizing academic discourse? Can the theoretical research on formulaic language, reviewed above, which demonstrates that contiguous multiword phrases are important units of language, be likewise transformed
into practical pedagogical uses (Nattinger and DeCarrico 1992; Lewis 1993; Wray 2000; Schmitt 2004)? Is an AFL equally viable? A crucial factor in achieving this goal lies in the principles for identifying and classifying such units. The lexical bundle approach of Biber and colleagues (1998, 2004), based solely on frequency, has the advantage of being methodologically straightforward, but results in long lists of recurrent word sequences that collapse distinctions that intuition would deem relevant. For example, few would argue with the intuitive claim that sequences such as ‘on the other hand’ and ‘at the same time’ are more psycholinguistically salient than sequences such as ‘to do with the’, or ‘I think it was’, even though their frequency profiles may put them on equivalent lists. Selection criteria that allow for intuitive weeding of purely frequency-based lists, as used by Simpson (2004) in a study of formulaic expressions in academic speech, yield much shorter lists of expressions that may appeal to intuitive sensibilities, but they are methodologically tricky and open to claims of subjectivity. In this paper, we present a method for deriving a list of formulaic expressions that uses an innovative combination of quantitative and qualitative criteria, corpus statistics and linguistic analyses, psycholinguistic processing metrics, and instructor insights. Long lists of highly frequent expressions are of minimal use to instructors who must make decisions about what content to draw students’ attention to for maximum benefit within limited classroom time. The fact that a formula is above a certain frequency threshold and distributional range does not necessarily imply either psycholinguistic salience or pedagogical relevance; common sequences of common words, such as ‘and of the,’ are expected to occur frequently. Psycholinguistically salient sequences, on the other hand, like ‘on the other hand’, cohere much more than would be expected by chance; they are ‘glued together’ and thus measures of association, rather than raw frequency, are likely more relevant. Our primary aim in this research is to create a pedagogically useful list of formulaic sequences for academic speech and writing. A secondary aim, however, is to discuss the statistical measures beyond frequency counts available for ranking formulaic sequences extracted from a corpus. The departure point for our research was dissatisfaction with a strictly frequency-based rank ordering of multiword phrases on the one hand, and a frequency plus intuition-based ordering on the other hand, coupled with a need for relatively contained, manageable sets of multiword expressions for use in classroom applications and teaching materials development. We used frequency as a starting point, but our approach is substantially more robust than the previous corpus-based methods for classifying multiword formulas; it encompasses a statistical measure of cohesiveness—mutual information (MI)—that has heretofore not been used in related research, in conjunction with validation and prioritization studies designed to provide insights into which formulas are perceived to be the important ones for teaching.
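
To make this concrete before the Methods are laid out, the sketch below (Python) shows one way such an extraction-and-filtering pipeline could be implemented: it collects 3- to 5-grams above a 10-per-million threshold and keeps those significantly more frequent in an academic corpus than in a non-academic comparison corpus, using the log-likelihood statistic of Rayson and Garside (2000) cited in the Methods section that follows. It is a minimal illustration under stated assumptions, not the authors' implementation (they used the Collocate program); the function names, tokenization, and corpus handling are hypothetical simplifications.

# Minimal sketch of the pipeline described in the Methods below; not the authors' code.
import math
from collections import Counter

def ngram_counts(tokens, n_min=3, n_max=5):
    # Count all 3-, 4-, and 5-grams in a list of lower-cased tokens.
    counts = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts

def log_likelihood(a, b, c, d):
    # LL for an item occurring a times in a corpus of c tokens and b times in a corpus of d tokens
    # (Rayson and Garside 2000).
    e1 = c * (a + b) / (c + d)   # expected frequency in corpus 1
    e2 = d * (a + b) / (c + d)   # expected frequency in corpus 2
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll

def academic_formula_candidates(acad_tokens, nonacad_tokens,
                                per_million=10, critical=6.63):  # 6.63 approximates p = .01, 1 df
    acad, nonacad = ngram_counts(acad_tokens), ngram_counts(nonacad_tokens)
    n_acad, n_nonacad = len(acad_tokens), len(nonacad_tokens)
    threshold = per_million * n_acad / 1_000_000
    keep = {}
    for phrase, freq in acad.items():
        if freq < threshold:
            continue  # below 10 per million in the academic corpus
        ll = log_likelihood(freq, nonacad.get(phrase, 0), n_acad, n_nonacad)
        # retain only phrases significantly MORE frequent in the academic corpus
        if ll > critical and freq / n_acad > nonacad.get(phrase, 0) / n_nonacad:
            keep[phrase] = (freq, ll)
    return keep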


METHODS

The corpora

Target corpora

The target corpora of academic discourse included 2.1 million words each of academic speech and academic writing. The academic speech corpus was comprised of MICASE (1.7 million words) (Simpson et al. 2002) plus BNC files of academic speech (431,000 words) (British National Corpus 2006). The academic writing corpus consisted of Hyland's (2004) research article corpus (1.2 million words), plus selected BNC files (931,000 words) sampled across academic disciplines using Lee's (2001) genre categories for the BNC.1 The speech corpus was broken down into five subcorpora and the writing corpus into four subcorpora by academic discipline, as shown in Table 1.

Table 1: Word counts by discipline for the Academic subcorpora

Academic speech                        Word count    Academic writing                     Word count
Humanities and Arts                       559,912    Humanities and Arts                     360,520
Social Sciences                           710,007    Social Sciences                         893,925
Biological Sciences                       357,884    Natural Sciences/Medicine               513,586
Physical Sciences                         363,203    Technology and Engineering              349,838
Non-departmental/other                    159,592
Total                                   2,153,770    Total                                 2,117,869

Comparison corpora

For comparative purposes, two additional corpora were used. For non-academic speech, we used the Switchboard (2006) corpus (2.9 million words), and for non-academic writing we used the FLOB and Frown corpora (1.9 million words) which were gathered in 1991 to reflect British and American English over 15 genres and to parallel the original LOB and Brown collections (ICAME 2006). FLOB and Frown were favored over their predecessors because the age of the texts is closer to the target corpus texts. The Switchboard corpus was chosen because it contains unscripted casual telephone conversations, and thus lies near the opposite end of the style spectrum from academic speech.2

Formula identification and MI

The first decision was what length of formulas we would include in the data. It is well known that 2-word phrases (bi-grams) are highly frequent and include many phrases that are subsumed in 3- or 4-word phrases; so we excluded 2-word sequences, to keep the data set to a more manageable size. Although recurrent 5-word sequences are comparatively rare, we decided to include them for the sake of thoroughness, thus including strings of 3, 4, and 5 words into the data set. The next decision was what frequency level to use as a cutoff. Previous research uses cutoff ranges between 10 and 40 instances per million words. Since our research goals included using other statistical measures to cull and rank the formulas, we wanted a less restricted data set to start with, and so opted for the lowest frequency range used in previous research, namely 10 per million (Biber et al. 1999). We began by extracting all 3-, 4-, and 5-grams occurring at least 10 times per million from the two target and two comparison corpora, using the program Collocate (Barlow 2004). These four data sets naturally included a great deal of overlap, but also substantial numbers of phrases unique to each corpus. The next step then was to collapse the overlapping data and collect frequency counts for each phrase appearing in any one of those four corpora (at the threshold level of 10 per million) for all the other corpora, for comparison purposes. The total number of formulas in this list was approximately 14,000. From this master list, we wanted to determine which formulas were more frequent in the academic corpora than in their non-academic counterparts, because our goal was to identify those formulas that are characteristic of academic discourse in particular, in contrast to high-frequency expressions occurring in any genre. This is an important step that warrants additional justification. Just as the AWL omitted words that were in the most frequent 2,000 words of English, we needed a way to sift out the most frequent formulas occurring in both academic and non-academic genres. To accomplish this, we used the log-likelihood (LL) statistic to compare the frequencies of the phrases across the academic and non-academic corpora. The LL ratio is useful for comparing the relative frequency of words or phrases across registers and determining whether the frequency of an item is statistically higher in one corpus or subcorpus than another (Oakes 1998; Jurafsky and Martin 2000; Rayson and Garside 2000). Those expressions found to occur statistically more frequently in academic discourse, using the LL statistic with a significance level of p = 0.01, comprise the basis for the academic formulas list (AFL). We separately compared academic vs. non-academic speech, resulting in over 2,000 items, and academic vs. non-academic writing, resulting in just under 2,000 items. The overlapping items from these two lists were identified as the core formulas that appear frequently in both academic speech and writing. Once these lists were obtained, cutoff values for distributional range across the academic subdivisions of the corpora had to be established. The subcorpora for academic speech were (Table 1): Humanities and Arts, Social Sciences, Biological and Health Sciences, Physical Sciences and Engineering, and Other/non-disciplinary. For academic writing, the subcorpora were: Humanities and Arts, Social Sciences, Natural Sciences and Medicine, and
Technology and Engineering. The cutoff values we used were as follows: Expressions occurring primarily in speech had to occur at the 10 tokens per million level or above in four out of five of the academic divisions, resulting in a Spoken AFL of 979 items; expressions occurring primarily in writing had to occur at least 10 times per million words in three out of four academic divisions, resulting in a Written AFL of 712 items; and expressions occurring in both speech and writing had to occur at a level of 10 per million in at least six out of all nine subcorpora, resulting in a Core AFL of 207 items.3 These range thresholds ensure that the AFL formulas are found across the breadth of academic spoken or written language and are thus relevant to general EAP, rather than to particular disciplines. Furthermore, the range ensures that the formulas on the list are not attributable to the idiosyncrasies of particular speakers or speech events.

Another important statistic we calculated for each of the strings was the MI score. MI is a statistical measure commonly used in the field of information science, designed to assess the degree to which the words in a phrase occur together more frequently than would be expected by chance (Oakes 1998; Manning and Schuetze 1999). A higher MI score means a stronger association between the words, while a lower score indicates that their co-occurrence is more likely due to chance. MI is a scale, not a test of significance, so there is no minimum threshold value; the value of MI scores lies in the comparative information they provide.

The question we then posed is: To what extent are these corpus metrics of frequency and MI useful for ranking the formulas on a list? High frequency n-grams occur often. But this does not imply that they have clearly identifiable or distinctive functions or meanings; many of them occur simply by dint of the high frequency of their component words, often grammatical functors. In addition, relying solely on frequency means that some distinctively useful but lower frequency phrases whose component words are highly unlikely to occur together by chance will not make it to the top of the frequency-ordered list. So frequency alone is not a sufficient metric. High MI n-grams are those with much greater coherence than is expected by chance, and this tends to correspond with distinctive function or meaning. But this measure tends, in contrast to frequency, to identify rare phrases comprised of rare constituent words, such as many subject-specific phrases. So MI alone is not a perfect metric either for extracting phrases that are highly noteworthy for teachers, since it privileges low-frequency items. Tables 2 and 3 present a simple re-ordering by frequency and MI of the top 10 and bottom 10 phrases of the approximately 2,000 original Academic speech and original Academic writing items to illustrate these points.

Table 2: The top 10 and bottom 10 phrases of the original Academic speech items prioritized by frequency and by MI

Top 10 by frequency: this is the; be able to; and this is; you know what; you have a; you can see; look at the; you need to; so this is; you want to
Top 10 by MI: blah blah blah; trying to figure out; do you want me to; for those of you who; we're gonna talk about; talk a little bit; does that make sense; thank you very much; the university of Michigan; you know what i mean
Bottom 10 by frequency: if you haven't; so what we're; as well but; cuz if you; right okay and; um and this; think about how; we're interested in; will give you; we can we
Bottom 10 by MI: okay and the; is like the; so in the; and so the; the um the; is what the; this in the; that it's the; is it the; of of of

Table 3: The top 10 and bottom 10 phrases of the original Academic writing items prioritized by frequency and by MI

Top 10 by frequency: on the other; in the first; the other hand; on the other hand; in the united; but it is; can be seen; it has been; is likely to; it is possible
Top 10 by MI: due to the fact that; it should be noted; on the other hand the; it is not possible to; there are a number of; in such a way that; a wide range of; take into account the; on the other hand; as can be seen
Bottom 10 by frequency: is sufficient to; weight of the; of the relevant; by the use of; the assessment of; by the use; of the potential; it is obvious that; in the present study; is obvious that
Bottom 10 by MI: to the case; of each of; with which the; as in the; it is of; is that of; to that of; as to the; to be of; that as the

For the speech data in Table 2, we see that frequency prioritizes such phrases as 'and this is' and 'this is the', which seem neither terribly functional nor pedagogically compelling, while it satisfactorily relegates to the bottom the phrases 'cuz if you' and 'um and this'. Instructors might, however, be interested in other low frequency neighbors such as 'we're interested in' and 'think about how'. MI, on the other hand, privileges functional formulas such as 'does that make sense' and 'you know what I mean', though 'blah blah blah' and 'the University of Michigan' are high on the list too. The low priority items by MI such as 'the um the' and 'okay and the' do indeed seem worthy of relegation. For the written data in Table 3, frequency highlights such strings as 'on the other hand' and 'it is possible' (we think appropriately), alongside 'it has been' and 'but it is' (we think inappropriately), and pushes 'by the use' and 'of the relevant' to the bottom (appropriately), alongside 'it is obvious that' and 'in the present study' (inappropriately). MI, in contrast, prioritizes such items as 'due to the fact that' and 'there are a number of' (appropriately; indeed all of the top ten seem reasonable), and it (appropriately) relegates generally non-functional phrases such as 'to be of', 'as to the', 'of each of', etc. These tables represent just a glimpse of what is revealed by the comparison of a given list of formulas ordered by these two measures. Our intuitive impressions of the prioritizations produced by these measures on their own, as illustrated here, thus led us to favor MI over frequency. Ideally, though, we wanted to combine the information provided by both metrics to better approximate our intuitions and those of instructors, and thus to rank the academic formulas for use in pedagogical applications. Our efforts to achieve this synthesis were part of a large validation study which triangulated corpus linguistic measures, educator insights, and psycholinguistic processing measures. A full description of these investigations is available in Ellis et al. (2008). Because these details are available elsewhere, and because the primary aim of the present paper is to present the AFL items and their functional categorizations, we simply summarize the relevant parts of the procedures here.

Determining a composite metric to index FTW

We selected a subset of 108 of these academic formulas, 54 from the spoken and 54 from the written list. These were chosen by stratified random sampling to represent three levels on each of three factors: n-gram length (3, 4, 5), frequency band (High, Medium, and Low; means 43.6, 15.0, and 10.9 per million, respectively), and MI band (High, Medium, and Low; means 11.0, 6.7, and 3.3, respectively). There were two exemplars in each of these cells. We then asked twenty experienced EAP instructors and language testers at the English Language Institute of the University of Michigan to rate these formulas, given in a random order of presentation, for one of three judgements using a scale of 1 (disagree) to 5 (agree):

A. whether or not they thought the phrase constituted 'a formulaic expression, or fixed phrase, or chunk'. There were six raters with an inter-rater α = 0.77.
B. whether or not they thought the phrase has 'a cohesive meaning or function, as a phrase'. There were eight raters with an inter-rater α = 0.67.
C. whether or not they thought the phrase was 'worth teaching, as a bona fide phrase or expression'. There were six raters with an inter-rater α = 0.83.

Formulas which scored high on one of these measures tended to score high on another: r(AB) = 0.80, p < 0.01; r(AC) = 0.67, p < 0.01; r(BC) = 0.80, p < 0.01. The high alphas of the ratings on these dimensions and their high inter-correlation reassured us of the reliability and validity of these instructor insights. We then investigated which of frequency or MI better predicted these instructor insights. Correlation analysis suggested that while both of these dimensions contributed to instructors valuing the formula, it was MI which more strongly influenced their prioritization: r(frequency, A) = 0.22, p < 0.05; r(frequency, B) = 0.25, p < 0.05; r(frequency, C) = 0.26, p < 0.01; r(MI, A) = 0.43, p < 0.01; r(MI, B) = 0.51, p < 0.01; r(MI, C) = 0.54, p < 0.01. A multiple regression analysis predicting instructor insights regarding whether an n-gram was worth teaching as a bona fide phrase or expression from the corpus metrics gave a standardized solution whereby teaching worth = 0.56 × MI + 0.31 × frequency (standardized β weights). That is to say, when instructors judge n-grams in terms of whether they are worth teaching, although both frequency and MI factor into their judgements, it is the MI of the string—the degree to which the words are bound together—that is the major determinant. These beta coefficients, derived from the 108 formula subset for which we had obtained instructor ratings, could then be used over the population of academic formulas which they represented to estimate from the two corpus statistics available for all formulas—the combined measures of MI and frequency—a FTW score that is a prediction of how instructors would judge their teaching worth. This score, like the MI statistic, does not provide a threshold cutoff score, but enables a reliable and valid rank ordering of the formulas, which in turn provides instructors and materials developers with a basis for prioritizing formulaic expressions for instructional uses. The FTW score, with its use of both frequency rank and MI score, is thus a methodologically
innovative approach to the classification of academic formulas, as it allows for a prioritization based on statistical and psycholinguistic measures, which a purely frequency-based ordering does not.
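
For readers who want to operationalize these metrics, the sketch below (Python, continuing the earlier illustration) computes a mutual information score for an n-gram and combines MI and frequency into an FTW-style ranking score using the standardized weights reported above (0.56 and 0.31). The n-gram generalization of MI and the z-scoring of both predictors are our assumptions; the paper derives the weights from a regression on instructor ratings but does not spell out the exact scaling used for the published FTW values, so this is an illustrative approximation rather than the authors' procedure, and the example phrases and numbers at the end are invented.

# Illustrative only: one common generalization of (pointwise) mutual information to an n-gram,
# and a composite score using the standardized regression weights reported in the paper.
import math
from statistics import mean, stdev

def mi_score(ngram_freq, word_freqs, corpus_size):
    # MI = log2( P(w1..wn) / (P(w1) * ... * P(wn)) ), with probabilities estimated from counts.
    p_ngram = ngram_freq / corpus_size
    p_indep = 1.0
    for f in word_freqs:
        p_indep *= f / corpus_size
    return math.log2(p_ngram / p_indep)

def ftw_scores(formulas):
    # formulas: dict mapping phrase -> (frequency per million, MI score).
    # z-scoring both predictors is an assumption on our part.
    freqs = [v[0] for v in formulas.values()]
    mis = [v[1] for v in formulas.values()]
    f_mean, f_sd = mean(freqs), stdev(freqs)
    m_mean, m_sd = mean(mis), stdev(mis)
    return {
        phrase: 0.56 * ((mi - m_mean) / m_sd) + 0.31 * ((freq - f_mean) / f_sd)
        for phrase, (freq, mi) in formulas.items()
    }

# Example with invented figures: rank three candidate phrases by the composite score.
candidates = {
    "on the other hand": (55.0, 11.2),
    "and of the": (60.0, 1.8),
    "due to the fact that": (12.0, 12.5),
}
ranked = sorted(ftw_scores(candidates).items(), key=lambda kv: kv[1], reverse=True)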


In the appendix (see supplementary material available at Applied Linguistics online), we present the AFL grouped into the three sublists—the Core AFL in its entirety, and the first 200 formulas of the Spoken AFL and the Written AFL. Since all three lists are sorted by the two-factor FTW score, providing the top 200 formulas for the two longer AFL components effectively distills them into the most relevant formulas. Scrutiny of the lists also shows substantial overlap among some of the entries. Thus, for example, in Appendix 1, the Core AFL listing includes the n-grams from the point of view, the point of view of, point of view, point of view of, the point of view, etc. Since this degree of redundancy is not especially useful, and moreover takes up extra space, in our functional categorizations we collapsed incidences like these together into their common schematic core—in this case, (from) (the) point of view (of). We retained the original formulas in the Appendix 1 tables, but only collapsed them in Table 4, the functional categorization. We acknowledge that in so doing we have sacrificed some detail as to the specific configurations and functions of component phrases; however, the differences in pragmatic function of these formula variations are generally minor and the detail lost can easily be retrieved by looking at the fuller lists in the appendix. The final stage of the analysis involved grouping the formulas into categories according to their primary discourse-pragmatic functions. For purposes of expediency as well as the anticipated pedagogical applications, we again included only those formulas from the Core AFL list and the top 200 from the Written AFL and the Spoken AFL lists. These functional categories—determined after examining the phrases in context using a concordance program—are not meant to be taken as definitive and exclusive, since many of the formulas have multiple functions, but rather as indications of the most salient function the phrases fulfill in academic contexts. In the following section, we present an overview of the functional analysis, providing examples to illustrate some of the more important functions in context.

RESULTS: THE AFL AND FUNCTIONAL CATEGORIZATION

Rationale and overview of the functional categories

The purpose of the following classification is primarily pedagogical. An ordered list of formulas sorted according to major discourse-pragmatic functions allows teachers to focus on functional language areas which, ideally, will dovetail with functional categories already used in EAP curricula. The creation of a functional taxonomy for formulaic sequences is an inherently problematic endeavor, as Wray and Perkins (2000: 8) point out, arguing that typologies such as those offered by Nattinger and DeCarrico (1992), among others, suffer from a proliferation of types and subtypes. This proliferation of categories does indeed make it difficult to distill the data into a compact functional model applicable across corpora and domains of use. In spite of these difficulties, however, we maintain that for pedagogical purposes, a functional taxonomy, however multilayered or imprecise because of overlapping functions and multifunctional phrases, is nevertheless crucial to enhancing the usefulness of the AFL for teachers. As for pedagogical applications, this functional categorization of the AFL is intended primarily as a resource for developing teaching materials based on further contextual research around the items rather than a resource for teaching itself. Due to space constraints, we cannot present specific teaching suggestions here, but do reiterate that the formula in context is what is pedagogically relevant. The functional categorization of the AFL is an important resource, but nevertheless only a starting point. Previous researchers have in fact already paved the way in this area; in particular, we credit the work of Biber et al. (2004) in this aspect of our study. The current classification scheme is an adaptation of the functional taxonomy outlined in their article, but with some important extensions and

Table 4: The AFL categorized by function

Group A. Referential expressions

(1) Specification of attributes (a) Intangible framing attributes spoken) (in) such a (way) (in) terms of (the) in which the is based on (the) nature of the of the fact (on) the basis (of) the ability to the concept of the context of the definition of the development of

the distribution of the existence of (the) extent to which (the) fact that (the) the idea that the issue of the meaning of the nature of (the) the notion of the order of the presence of (a)

the problem of the process of the question of the role of the structure of the study of (the) way(s) in (which) the way that the work of the use of with respect to (the)

the idea of

the kind of

this kind of

in accordance with (the) (in) such a way that in terms of a in the absence of

in the course of in the form of in this case the insight into the

on the basis of the on the part of to the fact that with regard to

the change in the frequency of the level of

(the) part(s) of the the rate of the sum of

(the) size of (the) (the) value of (the)

High levels of

over a period of

Primarily spoken it in terms of Primarily written an attempt to [are/was] based on by virtue of degree to which depend([ing/s]) on the

(b) Tangible framing attributes Core AFL (written & spoken) (as) part of [a/the] the amount of the area of Written AFL an increase in the

(c) Quantity specification Core AFL (written & spoken) a list of a series of a set of

[a/large/the] number of And the second

both of these each of [the/these] of [the/these] two

of the second the first is there are three

little or no in a number of in both cases in most cases

in some cases there are no (the) total number (of) there are several (there) are a number (of) two types of

Primarily spoken all sorts of Primarily written a high degree a large number (of) (a) small number (of) (a) wide range (of)


Core AFL (written & [a/the] form of (as) a function (of) based on [a/the] focus on the form of the (from) (the) point of view (of) in relation to in response to (in) the case (of) in the context (of) in the sense (that)


Table 4: Continued (2) Identification and focus Core AFL (written & spoken) a variety of is for the [an/the] example of (a) is not [a/the] as an example is that [it/the/there] different types of is the case here is that is to be if this is it can be it does not

it is not means that the referred to as such as the that in [a/the] that is the that there [are/is (a)]

that this is that we are there is [a/an/no] this is [a/an/not] this type of this would be which is [not/the]

how many of you nothing to do with one of these

so this is the best way to there was a

this is the this is this is those of you who

it has been none of these that it is not

that there is no there has been they [did/do] not

this does not this means that which can be

different from the exactly the same have the same [in/of/with] the same

is much more related to the the same as

(the) difference between (the) the relationship between

the same thing

to each other

(on) the other (hand) (the) similar to those

the difference between (the) same way as the to distinguish between

Primarily spoken

Primarily written (as) can be seen (in) does not have has also been his or her

(3) Contrast and comparison Core AFL (written & spoken) and the same as opposed to associated with the between the two Primarily spoken (nothing) to do with (the) Primarily written be related to the is more likely

(4) Deictics and locatives Core AFL (written & spoken) a and b

the real world

of the system

Primarily spoken (at) the end (of) (the) at this point

(at) (the) University of in Ann Arbor Michigan

piece of paper

Primarily written at the time of

at this stage

the United Kingdom

b and c

(5) Vagueness markers Core AFL (written & spoken) and so on Primarily spoken and so forth

and so on and so

blah blah blah


[has/have] to do with it’s gonna be and this is for those of you (who)


Table 4: Continued Group B. Stance expressions (1) Hedges Core AFL (written & spoken) (more) likely to (be) [it/there] may be

may not be

to some extent

it could be it looks like

it might be little bit about

might be able (to) you might want to

Primarily written appear(s) to be are likely to as a whole

at least in does not appear

is likely to (be) it appears that

it is likely that less likely to

to show that

we can see

how do we how do you know I think this is

trying to figure (out) to figure out (what) you think about it okay I don’t know

what do you mean what does that mean (you) know what I (mean)

be seen as been shown to can be considered

be considered as have shown that if they are

is determined by we assume that we have seen

Primarily spoken do you want (me) (to) doesn’t have to be don’t worry about has to be

I want you to it has to be keep in mind take a look (at)

tell me what (to) make sure (that) we have to we need to

you you you you

Primarily written (it should) be noted (that)

need not be needs to be

should also be should not be

take into account (the) to ensure that (the)

(2) Epistemic stance Core AFL (written & spoken) according to the assume that the be the case out that the Primarily spoken [and/as] you can (see) do you know what (does) that make sense Primarily written assumed to be be argued that be explained by be regarded as (3) Obligation and directive don’t need to need to (do) want me to want to

(4) Expressions of ability and possibility Core AFL (written & spoken) can be used (to) to use the Primarily spoken (gonna) be able (to) so you can (see)

that you can to think about

(you) can look at you can see ([that/the])

you could you could you’re trying to

Primarily written allows us to are able to be achieved by [be/been/was] carried out carried out [by/in]

be used as a be used to can also be can be achieved can be expressed

can easily be can be found (in) could be used has been used (it) is not possible (to)

it is possible ([that/to]) most likely to their ability to to carry out


Primarily spoken a kind of a little bit about in a sense


(5) Evaluation
Core AFL (written & spoken): the importance of
Primarily spoken: it doesn't matter
Primarily written: important role in, is consistent with, it is difficult, it is important (to), it is impossible to, it is interesting to, it is necessary (to), it is obvious that, it is worth, (it) is clear (that), the most important

(6) Intention/volition, prediction
Primarily spoken: if you wanna, if you want(ed) (to), if you were (to), I'm gonna go, I'm not gonna, let me just, um let me, I just wanted to, I wanted to
Primarily written: to do so, we do not

Group C: Discourse organizing functions

(1) Metadiscourse and textual reference
Primarily spoken: come back to, go back to the, gonna talk about, I was gonna say, (I) was talking about, I'll talk about, I'm talking about, talk a little bit, talk(ing) about the, to talk about, wanna talk about, we talk(ed) about, we were talking (about), we'll talk about, we're gonna talk (about), we're talking about, we've talked about, what I'm saying, what I'm talking about, what you're saying, you're talking about
Primarily written: as shown in, at the outset, in table 1, in the next section, in the present study, in this article, (in) this paper (we), shown in figure, shown in table, the next section

(2) Topic introduction and focus
Core AFL (written & spoken): For example [if/in/the], what are the
Primarily spoken: a look at, first of all, I have a question, I'll show you, if you have (a), if you look (at) (the), if you've got, let's look at, look at [it/the/this], looking at the, to look at (the), wanna look at, we look(ed) at, we're looking at, what I mean, what I want to, when you look at, you have a, you look at (the), you're looking at, you've got a

(3) Topic elaboration
(a) Non-causal
Core AFL (written & spoken): But this is
Primarily spoken: any questions about, came up with, come up with (a), I mean if (you), (it) turns out (that), see what I'm saying, so if you, what happens is, you know what I'm
Primarily written: are as follows, factors such as, in more detail, see for example, such as those, so that the, the effect(s) of, the reason for, whether or not (the)
(b) Cause and effect
Core AFL (written & spoken): [a/the] result of, due to the, (as) a result (of), in order to, because it is, in order to get, the reason why
Primarily spoken: end up with
Primarily written: as a consequence, as a result of the, due to the fact (that), for the purposes of, for this purpose, for this reason, give rise to, is affected by, it follows that, to determine whether

(4) Discourse markers
Core AFL (written & spoken): and in the, as well as, at the same (time), (in) other words (the)
Primarily spoken: and if you, and then you, but if you, by the way, no no no (no), thank you very (much), oh my god, yes yes yes
Primarily written: even though the, in conjunction with

The table includes all 207 formulas of the Core List, the top 200 items of the Written AFL, and the top 200 items of the Spoken AFL lists.

such as those offered by Nattinger and DeCarrico (1992), among others, suffer from a proliferation of types and subtypes. This proliferation of categories does indeed make it difficult to distill the data into a compact functional model applicable across corpora and domains of use. In spite of these difficulties, however, we maintain that for pedagogical purposes, a functional taxonomy, however multilayered or imprecise because of overlapping functions and multifunctional phrases, is nevertheless crucial to enhancing the usefulness of the AFL for teachers. As for pedagogical applications, this functional categorization of the AFL is intended primarily as a resource for developing teaching materials based on further contextual research around the items rather than a resource for teaching itself. Due to space constraints, we cannot present specific teaching suggestions here, but do reiterate that the formula in context is what is pedagogically relevant. The functional categorization of the AFL is an important resource, but nevertheless only a starting point. Previous researchers have in fact already paved the way in this area; in particular, we credit the work of Biber et al. (2004) in this aspect of our study. The current classification scheme is an adaptation of the functional taxonomy outlined in their article, but with some important extensions and

modifications. As in their study, we grouped the formulas into three primary functional groups: referential expressions, stance expressions, and discourse organizers. Several functional categories in our classification scheme, however, are not in the Biber et al. taxonomy, and these should be mentioned here. Within the referential expressions group, we have added one category—namely, that of contrast and comparison. This is a common functional category in EAP curricula, and with over 20 formulas it represents an important functional group of the AFL. For the category of stance expressions, a number of formulas represent two essential categories not explicitly named by Biber et al.: These are hedges and boosters, and evaluation. In addition, we have collapsed two of their categories (desire and intention/prediction) into one, called volition/intention, since the AFL formulas in the two categories did not seem distinct enough in their discourse functions to warrant splitting them. Finally, the discourse organizers group is substantially expanded and modified from the Biber et al. grouping, with three important additional subcategories: metadiscourse and textual reference, cause and effect expressions, and discourse markers. Our functional classification is thus considerably more extensive than Biber et al.'s; we suspect that this may be due primarily to the fact that there are close to 500 formulas in this portion of the AFL, compared with fewer than 150 phrases included in their list of the most common lexical bundles. Finally, we reiterate that even though some of the formulas are multifunctional, we have nevertheless tried to align all of them with their most probable or common function.

Description and examples of the functional categories The following section outlines the pragmatic functional taxonomy. Numbers in brackets refer to the total number of formulas in that category from the combined Core AFL and top 200 each from the Written AFL and Spoken AFL.

Group A: Referential expressions The largest of the three major functional groupings, the referential expressions category encompasses five subcategories: specification of attributes, identification and focus, contrast and comparison, deictics and locatives, and vagueness markers. (1) Specification of attributes (a) Intangible framing attributes [66]. The largest pragmatic subcategory for all AFL phrases is the specification of attributes—intangible framing devices. The majority of these phrases appear on the Core AFL list, indicating that these are clearly important academic phrases across both spoken and written genres. This category includes phrases that frame both concrete entities (as in A.1) and abstract concepts or categories (as in A.2).


(A.1) . . . based on the total volume passing through each cost center
(A.2) so even with the notion of eminent domain and fair market value . . .
There are close to 70 formulas in this category, and roughly half are composed of the structure 'a/the N of', sometimes with a preceding preposition, as in as a function of, on the basis of, and in the context of. Most of these formulas frame an attribute of a following noun phrase, but some frame an entire clause (A.3), or function as a bridge between a preceding verb and a following clause (A.4).
(A.3) But another clear example of the way in which domestic and foreign policy overlaps is of course in economic affairs.
(A.4) human psychology has evolved in such a way, as to allow us to make those kinds of judgements that would normally be reliable.

(b) Tangible framing attributes [14]. The second subcategory of attribute specifiers is that of tangible framing attributes such as the amount of, the size of, the value of, which refer to physical or measurable attributes of the following noun. (A.5) this is uh, what she found in terms of the level of shade and yield of coffee . . . (c) Quantity specification [26]. The final subcategory of attribute specifiers is closely related to the category of tangible framing attributes, and includes primarily cataphoric expressions enumerating or specifying amounts of a following noun phrase, as in a list of, there are three, little or no, all sorts of. Some of the quantity specifiers, however—for example, both of these, of these two—are anaphoric, referring to a prior noun phrase (e.g. A.7). (A.6) From an instrumental viewpoint, there are three explanations worth considering. (A.7) It is the combination of these two that results in higher profits to the EDLP store. (2) Identification and focus [53]. The second most common functional category, with 53 formulas, is the subcategory of identification and focus, which includes typical expository phrases such as as an example, such as the, referred to as, and means that the, and also a number of stripped-down sentence or clause stems with a copula, auxiliary verb, or modal construction, such as it is not, so this is, this would be. It is not surprising that this functional category figures prominently in academic discourse, since exemplification and identification are basic pragmatic functions in both academic speech and writing. In fact, these phrases often occur in clusters, as in example A.9. (A.8) So many religions, such as the religion of Ancient Egypt, for instance . . . (A.9) so this would be an example of peramorphosis.



(3) Contrast and comparison [23]. Many of the contrast and comparison phrases included explicit markers of comparison such as same, different, or similar. As mentioned earlier, this category is not included in Biber et al., but constitutes an important language function for EAP teaching purposes. (A.10) that’s probably a prefix code as opposed to a suffix code.

(4) Deictics and locatives [12]. The deictic and locative expressions are a small but important functional category, referring to physical locations in the environment (e.g. the real world) or to temporal or spatial reference points in the discourse (e.g. a and b, at this point). These formulas obviously reflect the provenance of the corpus, so the University of Michigan, Ann Arbor, and the United Kingdom all appear on this list because of the inclusion of both MICASE and BNC texts.

(5) Vagueness markers [4]. There are only four phrases included in the AFL that are classified as vagueness markers, making it the smallest functional category. Furthermore, three of these phrases are limited to the Spoken AFL; only the phrase and so on appears in the Core AFL. Nevertheless, the frequency rates and FTW scores show that these phrases are important; making vague references with these particular extenders is a common discourse function in academic speech. Interestingly, Biber et al. (2004) also only list three phrases in this category (which they call imprecision bundles), yet claim that it is a major subcategory of referential bundles; perhaps this claim is also based on frequencies. Note that the three phrases they list in this category (or something like that, and stuff like that, and things like that) do not appear in the AFL, because although they may indeed be frequent in academic speech, they were not sufficiently more frequent in academic speech as compared with non-academic speech to make the cut for the AFL.

Group B: Stance expressions Stance formulas include six functional subcategories, two of which—hedges and evaluative formulas—are additions to the Biber et al. taxonomy. (1) Hedges [22]. This category includes a number of phrases that have multiple functions, but whose hedging function seems paramount (e.g. there may be, to some extent, you might want to). All of these formulas express some degree of qualification, mitigation, or tentativeness (Hyland 1998). Other examples of hedges show clearly the tendency of these formulas to co-occur with other hedge words or phrases, as in B.1, where the formula is preceded by ‘I mean, uh, you know’. (B.1) but the, there are the examples of, and and the examples in the Renaissance I mean, uh, you know Copernicus is to some extent a figure of the Renaissance.



(2) Epistemic stance [32]. Epistemic stance formulas have to do with knowledge claims or demonstrations, expressions of certainty or uncertainty, beliefs, thoughts, or reports of claims by others. (B.2) so we’re just gonna be saying let’s assume that the two variabilities in the two populations are the same . . . (3) Obligation and directive [23]. Obligation and directive formulas are generally verb phrases directing readers or listeners to do or not do something, or to recall or attend to some observation, fact, or conclusion.

(B.3) Why? Tell me what your thought process is.

(4) Ability and possibility [29]. The ability and possibility formulas frame or introduce some possible or actual action or proposition. In the spoken genres, these formulas are often interactive phrases with the second person pronoun, as in you can see, you can look at, and you're trying to. (B.4) We aren't gonna be able to predict all behaviors because chance variables play a big role. (5) Evaluation [13]. The subcategory of evaluation is another addition to the Biber et al. taxonomy. Biber et al. included only two of these phrases and listed them under the category of impersonal obligation/directive (i.e. it is important to, it is necessary to). The AFL, however, includes several phrases that are clearly evaluative, without necessarily being directive, such as the importance of, is consistent with, it is obvious that, it doesn't matter. Furthermore, even those that are also directive we maintain function primarily as evaluators. Interestingly, of the thirteen phrases in this category, most are on the Written AFL; only one appears on the Core AFL (the importance of), and one on the Spoken AFL (it doesn't matter). (B.5) Much macrosociological theory emphasizes the importance of societal variation. (6) Intention/volition [11]. Most of the phrases in this category occur in the spoken genres, and express either the speaker's intention to do something, or the speaker's questioning of the listener's intention. (B.6) So let me just take this off momentarily and put my other chart back on.

Group C: Discourse organizing expressions Discourse organizers in the AFL fall into four main subcategories: metadiscourse, topic introduction, topic elaboration, and discourse markers. Each of these functions involves either signaling or referring to prior or upcoming



discourse. With the exception of the cause–effect subcategory of topic elaboration, all the discourse organizing expressions are more frequent in the spoken genres. This is consistent with Biber’s (2006) finding that discourse markers are rare in written compared with spoken academic genres.

(1) Metadiscourse and textual reference [31]. The subcategory of discourse organizers with the largest number of phrases is the metadiscourse and textual reference category. As mentioned earlier, this functional category was not included in the Biber et al. taxonomy; most of the phrases we classified in this category were grouped in their study with the topic introduction/focus category (2004: 386). With no phrases on the Core AFL, these phrases are clearly differentiated between the spoken and written lists, thus indicating that metadiscourse formulas tend to be genre-specific.

(C.1) The seven studies are summarized in the next section. (C.2) Yeah I was gonna say something similar to that. (2) Topic introduction and focus [23]. This category overlaps functionally to a certain degree with the referring expressions identification and focus category. The main difference is that the global discourse organizing function of introducing a topic is primary here, with the phrase often framing an entire clause or upcoming segment of discourse, while the local referential function of identification is more salient for the other category. (C.3) so the first thing we wanted to do was take a look at and see if in fact this compound can kill cancer cells. (3) Topic elaboration. The topic elaboration subcategory includes two groups: non-causal topic elaboration, and cause and effect elaboration. Both categories function to signal further explication of a previously introduced topic. (a) Non-causal [15]. Non-causal topic elaboration includes any phrase that is used to mark elaboration without any explicit causal relationship implied. This includes phrases that summarize or rephrase, as in it turns out that and what happens is, as well as interactive formulas and questions such as see what I'm saying, and any questions about. (C.4) and let's just look at birth rate, and what happens is we have inverse, density dependence . . . (b) Cause and effect [22]. The cause and effect formulas signal a reason, effect, or causal relationship. Although these are grouped as a subset of the topic elaboration formulas, they are an important functional group in and of themselves in academic discourse and for EAP teaching. (C.5) at this point in order to get fired you have to do something really awful.



(C.6) As a result, research on the imposition of the death penalty in the United States has a long and distinguished history. (4) Discourse markers [14]. The discourse markers category includes two subtypes. Connectives, such as as well as, at the same time, and in other words, connect and signal transitions between clauses or constituents. Interactive devices and formulas, such as thank you very much, yes yes yes, and no no no, are phrases that stand alone and function as responses expressing agreement, disagreement, thanks, or surprise.

(C.7) Material data as well as functional principles must be taken into account for the physical design.

DISCUSSION AND CONCLUSIONS Our methods and results suggest that formulaic sequences can be statistically defined and extracted from corpora of academic usage in order to identify those that have both high currency and functional utility. First, as in prior research with lexis (Nation 2001) and lexical bundles (Biber et al. 2004; Biber 2006), we used frequency of occurrence to identify constructions that appear above a baseline threshold frequency and which therefore have a reasonable currency in the language as a whole. Second, as in prior research defining academic lexis (Coxhead 2000), we identified those that appear more frequently in academic genres and registers and across a range of disciplines as being particular to EAP. But currency alone does not ensure functional utility. However frequent in our coinage, nickels and dimes aren’t worth as much as dollar bills. So too with formulas. When we assessed the educational and psycholinguistic validity of the items so selected, we found that they vary in worth as judged by experienced instructors, and in their processability by native speakers. In the present article, we show that experienced EAP and ESL instructors judge multiword sequences to be more formulaic, to have more clearly defined functions, and to be more worthy of instruction if they measure higher on the two statistical metrics of frequency and MI, with MI being the major determinant. In our companion paper (Ellis et al. 2008) we report experiments which showed how processing of these formulas varies in native speakers and in advanced second language learners of English. Next, therefore, we used these findings to prioritize the formulas in our AFL for inclusion in EAP instruction using an empirically derived measure of utility that is both educationally valid and operationalizable with corpus linguistic metrics. Our FTW score weighs MI and frequency in the same way that EAP instructors did when judging a sample of these items for teaching worth. When we rank ordered the formulas according to this metric, the items which rose to the top did indeed appear to be more formulaic, coherent, and perceptually salient than those ordered by mere frequency or MI alone, thus providing


intuitive confirmation of the value of the FTW score. We used this ordering to inform the selection and prioritization for inclusion in EAP instruction of the Core and the top 200 Written and Spoken AFL formulas. This inclusion of MI for prioritizing such multiword formulas represents an important advance over previous research. We then analyzed these formulas for discourse function to show that many of them fall into coherent discourse-pragmatic categories with enough face validity to encourage their integration into EAP instruction when discussing such functions as framing, identification and focus, contrast and comparison, evaluation, hedging, epistemic stance, discourse organization, and the like. Our AFL is categorized in this way in Table 4, with the functions further explained and exemplified in our Results section. It is our hope that this functional categorization, along with the FTW rank-ordered lists, will facilitate the inclusion of AFL formulas into EAP curricula, and that further work on the pedagogical value of the AFL will take these results as a starting point. We recognize that there are other possible ways of going about this task, each with particular advantages and disadvantages. Biber et al.'s groundbreaking work in defining lexical bundles on the basis of frequency alone has served as a contrast for us throughout this paper. It showed how corpus analysis could be used to identify interesting EAP constructions. But it also showed how frequency alone generates too many items of undifferentiated value. Biber et al. (2004) included only four-word bundles because the same frequency cutoff would generate far too many lexical bundles to deal with if three- and five-word bundles were included; yet, as we show here, many of the important (and high FTW) words on our AFL are actually tri-grams. So too, many of the phrases in their high-frequency lexical bundles list don't appear in the AFL because while they gathered all strings of frequency in university teaching and textbooks, we used comparison non-academic corpora and the LL statistic to pull out only those phrases that are particularly frequent in academic discourse. Our conclusions also stand in contrast to those of Hyland (2008) who argues that there are not enough lexical bundles common to multiple disciplines to constitute a core academic phrasal lexicon, and therefore advocates a strictly discipline-specific pedagogical approach to lexical bundles. Although we would not deny that disciplinary variation is important and worthy of further analysis, by using the metrics we did, we were able to derive a common core of academic formulas that do transcend disciplinary boundaries. Several factors that explain our divergent claims warrant mentioning. First, Hyland also analyzed only four-word bundles, whereas a glance at the top 50 Core AFL phrases shows the majority to be three-word phrases (e.g. in terms of, in order to, in other words, whether or not, as a result). Second, he used a higher cutoff threshold, whereas we started with a lower cutoff frequency; since our FTW score incorporates another statistic (MI) to ensure relevance, the lower frequency range allowed us to cast a wider net without prioritizing numerous less relevant

formulas. Our research thus finds quite a number of core formulas common to all academic disciplines. In closing, we are left with important conclusions relating to the complementarity of corpus, theoretical, and applied linguistics. Whatever the extraction method, there are so many constructions that there is ever a need for prioritization and organization. The current research persuades us that we will never be able to do without linguistic insights, both intuitive and academic. While some of these can be computationally approximated, as in the use of range of coverage of registers, and statistics such as MI and frequency in our FTW metric here, functional linguistic classification and the organization of constructions according to academic needs and purposes is essential in turning a list into something that might usefully inform curriculum or language testing materials.

SUPPLEMENTARY DATA

Supplementary material is available at Applied Linguistics online.

NOTES 1 MICASE speech events include lectures, seminars, student presentations, office hours, and study groups; for further details about the specific genres in MICASE, see Simpson-Vlach and Leicher (2006). BNC spoken academic files include primarily lectures and tutorials. BNC written academic texts include research articles and textbooks. 2 Furthermore, this was the only corpus of conversational American English speech available to us; although telephone conversations are not necessarily

ideal, they were quite adequate for comparison purposes. 3 Because these formulas appeared frequently in both spoken and written genres, the minimum threshold was set at six out of nine of the disciplinary sub-corpora, which had to include both written and spoken corpora. In fact, over 100 of the Core AFL formulas appeared in at least eight out of nine, and furthermore most of them occurred at frequencies well over 20 times per million.

REFERENCES

Barlow, M. 2004. Collocate. Athelstan Publications.
Barlow, M. and S. Kemmer (eds). 2000. Usage-Based Models of Language. CSLI Publications.
Biber, D. 2006. University Language. John Benjamins.
Biber, D. and F. Barbieri. 2006. 'Lexical bundles in university spoken and written registers,' English for Specific Purposes 26: 263–86.
Biber, D., S. Conrad, and V. Cortes. 2004. '"If you look at ...": Lexical bundles in university teaching and textbooks,' Applied Linguistics 25: 371–405.
Biber, D., S. Conrad, and R. Reppen. 1998. Corpus Linguistics: Investigating Language Structure and Use. Cambridge University Press.
Biber, D., S. Johansson, G. Leech, S. Conrad, and E. Finegan. 1999. Longman Grammar of Spoken and Written English. Pearson Education.


Bod, R., J. Hay, and S. Jannedy (eds). 2003. Probabilistic Linguistics. MIT Press.
Brazil, D. 1995. A Grammar of Speech. Oxford University Press.
Bresnan, J. 1999. 'Linguistic theory at the turn of the century.' Plenary address to the 12th World Congress of Applied Linguistics. Tokyo, Japan.
British National Corpus. 2006. Available from http://www.natcorp.ox.ac.uk/. Accessed 1 October 2007.
Bybee, J. 2003. 'Sequentiality as the basis of constituent structure' in T. Givón and B. F. Malle (eds): The Evolution of Language out of Pre-language. John Benjamins.
Bybee, J. and P. Hopper (eds). 2001. Frequency and the Emergence of Linguistic Structure. Benjamins.
Coxhead, A. 2000. 'A new Academic Word List,' TESOL Quarterly 34: 213–38.
Croft, W. and A. Cruse. 2004. Cognitive Linguistics. Cambridge University Press.
Ellis, N. C. 1996. 'Sequencing in SLA: Phonological memory, chunking, and points of order,' Studies in Second Language Acquisition 18/1: 91–126.
Ellis, N. C. 2002a. 'Frequency effects in language processing: A review with implications for theories of implicit and explicit language acquisition,' Studies in Second Language Acquisition 24/2: 143–88.
Ellis, N. C. 2002b. 'Reflections on frequency effects in language processing,' Studies in Second Language Acquisition 24/2: 297–339.
Ellis, N. C. 2009. 'Optimizing the input: Frequency and sampling in usage-based and form-focussed learning' in M. H. Long and C. Doughty (eds): Handbook of Second and Foreign Language Teaching. Blackwell, pp. 139–58.
Ellis, N. C., E. Frey, and I. Jalkanen. 2009. 'The psycholinguistic reality of collocation and semantic prosody (1): Lexical access' in U. Römer and R. Schulze (eds): Exploring the Lexis-Grammar Interface. John Benjamins, pp. 89–114.
Ellis, N., R. Simpson-Vlach, and C. Maynard. 2008. 'Formulaic language in native and second-language speakers: Psycholinguistics, corpus linguistics, and TESOL,' TESOL Quarterly 42/3: 375–96.
Flowerdew, J. and M. Peacock (eds). 2001. Research Perspectives on English for Academic Purposes. Cambridge University Press.
Goldberg, A. E. 2006. Constructions at Work: The Nature of Generalization in Language. Oxford University Press.
Hunston, S. and G. Francis. 1996. Pattern Grammar: A Corpus Driven Approach to the Lexical Grammar of English. Benjamins.
Hyland, K. 1998. Hedging in Scientific Research Articles. John Benjamins.
Hyland, K. 2004. Disciplinary Discourses: Social Interactions in Academic Writing. University of Michigan Press.
Hyland, K. 2008. 'As can be seen: Lexical bundles and disciplinary variation,' English for Specific Purposes 27: 4–21.
ICAME. 2006. Available from http://icame.uib.no/. Accessed 1 October 2007.
Jurafsky, D. 2002. 'Probabilistic modeling in psycholinguistics: Linguistic comprehension and production' in R. Bod, J. Hay, and S. Jannedy (eds): Probabilistic Linguistics. MIT Press.
Jurafsky, D. and J. H. Martin. 2000. Speech and Language Processing: An Introduction to Natural Language Processing, Speech Recognition, and Computational Linguistics. Prentice-Hall.
Jurafsky, D., A. Bell, M. Gregory, and W. D. Raymond. 2001. 'Probabilistic relations between words: Evidence from reduction in lexical production' in J. Bybee and P. Hopper (eds): Frequency and the Emergence of Linguistic Structure. Benjamins.
Kuiper, K. 1996. Smooth Talkers: The Linguistic Performance of Auctioneers and Sportscasters. Erlbaum.
Langacker, R. W. 1987. Foundations of Cognitive Grammar, Vol. 1: Theoretical Prerequisites. Stanford University Press.
Lee, D. 2001. 'Genres, registers, text types, domains and styles: Clarifying the concepts and navigating a path through the BNC jungle,' Language Learning & Technology 5/3: 37–72.
Leech, G. 2000. 'Grammars of spoken English: New outcomes of corpus-oriented research,' Language Learning 50: 675–724.
Lewis, M. 1993. The Lexical Approach: The State of ELT and The Way Forward. Language Teaching Publications.
Manning, C. D. and H. Schuetze. 1999. Foundations of Statistical Natural Language Processing. The MIT Press.
McEnery, T. and A. Wilson. 1996. Corpus Linguistics. Edinburgh University Press.
Nation, P. 2001. Learning Vocabulary in Another Language. Cambridge University Press.
Nattinger, J. R. and J. DeCarrico. 1992. Lexical Phrases and Language Teaching. Oxford University Press.
Oakes, M. 1998. Statistics for Corpus Linguistics. Edinburgh University Press.
Pawley, A. and F. H. Syder. 1983. 'Two puzzles for linguistic theory: Nativelike selection and nativelike fluency' in J. C. Richards and R. W. Schmidt (eds): Language and Communication. Longman.
Rayson, P. and R. Garside. 2000. 'Comparing corpora using frequency profiling.' Proceedings of the workshop on Comparing Corpora, held in conjunction with the 38th annual meeting of the Association for Computational Linguistics (ACL 2000). Hong Kong.
Robinson, P. and N. C. Ellis (eds). 2008. A Handbook of Cognitive Linguistics and Second Language Acquisition. Routledge.
Schmitt, N. (ed.). 2004. Formulaic Sequences. Benjamins.
Simpson, R. 2004. 'Stylistic features of academic speech: The role of formulaic expressions' in T. Upton and U. Connor (eds): Discourse in the Professions: Perspectives from Corpus Linguistics. John Benjamins.
Simpson, R. and D. Mendis. 2003. 'A corpus-based study of idioms in academic speech,' TESOL Quarterly 3: 419–41.
Simpson, R., S. Briggs, J. Ovens, and J. M. Swales. 2002. The Michigan Corpus of Academic Spoken English. The Regents of the University of Michigan.
Simpson-Vlach, R. and S. Leicher. 2006. The MICASE Handbook: A resource for users of the Michigan Corpus of Academic Spoken English. University of Michigan Press.
Sinclair, J. 1991. Corpus, Concordance, Collocation. Oxford University Press.
Sinclair, J. 2004. Trust the Text: Language, Corpus and Discourse. Routledge.
Swales, J. M. 1990. Genre Analysis: English in Academic and Research Settings. Cambridge University Press.
Switchboard. 2006 (August 5, 2006). 'A user's manual.' Available from http://www.ldc.upenn.edu/Catalog/docs/switchboard/.
Tomasello, M. (ed.). 1998. The New Psychology of Language: Cognitive and Functional Approaches to Language Structure. Erlbaum.
Tomasello, M. 2003. Constructing a Language. Harvard University Press.
West, M. 1953. A General Service List of English Words. Longman.
Wray, A. 1999. 'Formulaic sequences in learners and native speakers,' Language Teaching 32: 213–31.
Wray, A. 2000. 'Formulaic sequences in second language teaching: Principle and practice,' Applied Linguistics 21: 463–89.
Wray, A. 2002. Formulaic Language and the Lexicon. Cambridge University Press.
Wray, A. and M. R. Perkins. 2000. 'The functions of formulaic language: An integrated model,' Language and Communication 20: 1–28.

Applied Linguistics: 31/4: 513–531 © Oxford University Press 2010 doi:10.1093/applin/amp061 Advance Access published on 3 February 2010

The Role of Phonological Decoding in Second Language Word-Meaning Inference

MEGUMI HAMADA and KEIKO KODA

Ball State University and Carnegie Mellon University
E-mail: [email protected]

Two hypotheses were tested: Similarity between first language (L1) and second language (L2) orthographic processing facilitates L2-decoding efficiency; and L2-decoding efficiency contributes to word-meaning inference to different degrees among L2 learners with diverse L1 orthographic backgrounds. The participants were college-level English as a second language (ESL) learners with either alphabetic or logographic L1 backgrounds. Response speed and accuracy of English real- and pseudoword naming served as the decoding efficiency measure. The participants read three passages that contained pseudowords and inferred their meanings. Results showed that (i) alphabetic, as opposed to logographic, L1 background was associated with better decoding; (ii) the groups did not differ in meaning-inference performance; and (iii) the relationship between decoding efficiency and meaning-inference was stronger in the alphabetic group.

INTRODUCTION Word knowledge is crucial in all aspects of L2 learning. In the past four decades, there has been increasing interest in the nature of this knowledge and its acquisition. One growing area of research focuses on incidental word learning, whereby L2 learners expand their word knowledge through inferring the meaning of unknown words they encounter while reading. Although a number of findings have been reported on this topic (e.g. Paribakht and Wesche 1999; Nassaji 2003), there is a need for more empirical data offering insight into the mechanisms of word-meaning inference; a point made by Haastrup and Henriksen (2001). Consequently, the primary objective of this study was to reduce this void by providing a psycholinguistic glimpse of complex cross-linguistic interplay between reading and word-meaning inference among L2 learners with diverse L1 backgrounds. Specifically, the study examined phonological-decoding efficiency (i.e. the ability to extract phonological information from printed words) and its relation to word-meaning inference among ESL learners with contrasting L1 orthographic backgrounds. The sections that follow illustrate the interconnection between phonological decoding and word-meaning inference through an overview of relevant theories and empirical findings, including ones relating to the role of phonology in reading comprehension, to mechanisms of word-meaning inference, and to


cross-linguistic transfer of reading subskills. On the basis of the overview, the investigative framework is described and the hypotheses are presented.

THE ROLE OF PHONOLOGY IN READING COMPREHENSION


One might wonder why speedy access to phonological information is critical in silent reading for comprehension, wherein overt vocalization is not required. The most plausible answer lies in the way that phonology is thought to facilitate the integration of segmental information for text-meaning construction in working memory. Among the three components of working memory, a central module (central executive) and two subcomponents (phonological loop and visuo-spatial sketch pad), the phonological loop, a speech-based system, plays a critical role in executing cognitive tasks, including reading and word learning (Baddeley 1986). In reading, the first stage of phonological involvement is seen in lower-level processing such as lexical processing, namely, the retrieval of semantic and grammatical information of words. Because phonologically encoded information is more durable than any other form of representation in working memory (Kleiman 1975; Levy 1975; Just and Carpenter 1980), it is more efficient to convert visually presented words into their phonological forms for use in the higher-order operations, such as phrase construction and sentence processing. The phonological conversion of visually presented words is called decoding. Once extracted from print, the semantic and grammatical information of individual words must be integrated incrementally into larger, meaningful chunks, such as phrases and sentences. Working memory plays a pivotal role in the integration process by providing mental space for storing and processing the extracted information. Successful performance of cognitive tasks, moreover, requires decoding to be automated. This is because human cognition has limited capacity and cannot simultaneously handle all of the mental operations at all levels of text processing, such as phonological, morphological, lexical, syntactic, and discourse processing (LaBerge and Samuels 1974; Schneider and Shiffrin 1977). As noted above, reading requires the continuous integration of text information at virtually all levels of processing. Therefore, many interlinked processors must become operative at the same time. To accomplish this within capacity limitations, several components must become automated (Daneman and Carpenter 1980, 1983; Perfetti 1985; Daneman 1991). Automaticity, however, is not easily accomplished in the higher-order processes entailing meaning construction because they require unique, rather than routine, operations in each processing instance. In contrast, phonological decoding involves systematic operations because it entails mappings between two finite sets of features (i.e. phonological values and the graphic symbols representing them) and therefore it can be automated more easily than higher-order conceptual processes. Hence, it is essential that phonological decoding be highly efficient to the point of automaticity, so as to leave sufficient capacity for more resource

demanding higher-order operations (LaBerge and Samuels 1974; Perfetti and Lesgold 1977, 1979; Adams 1994). Phonological decoding is related not only to reading, but also to word learning. This is because, like reading comprehension, word learning is also a complex task that involves multiple cognitive operations whose performance relies on working memory. In fact, studies investigating the effects of phonological suppression on short-term recall of newly learned words have consistently demonstrated that paired associative word learning is seriously impaired when participants are asked to vocalize something irrelevant to the words being learned—a procedure often used by experimenters wishing to make the phonological loop (a central component of working memory whose function is to provide temporary storage of phonological forms) unavailable (Gathercole and Baddeley 1989, 1990; Papagno et al. 1991; Gathercole et al. 1999). For example, in Papagno et al. (1991), Italian-speaking college students learned Italian–Italian word pairs and Italian–Russian word pairs through auditory input and visual input, using the phonological suppression technique. The suppression effect was found only in Italian–Russian pairs, suggesting that the phonological loop plays a critical role in word learning, even learning through visual input. In sum, learning new words involves working memory, and its success must rely on the ability to extract phonological information efficiently from printed words. As in reading, therefore, phonological decoding is a vital competence in new word learning and remembering.

MECHANISMS OF WORD-MEANING INFERENCE As noted earlier, our goal was to clarify how L1 orthographic processing experience affects the development of L2 phonological-decoding efficiency, as well as how differences in decoding efficiency, if any, relate to L2 word-meaning inference. Word-meaning inference is a complex construct because it entails multiple operations such as analyzing, extracting, and integrating sub-lexical information with the reader’s background knowledge. Sternberg (1987) identified major sequential operations involved in learning the meaning of a new word from context: (i) separating relevant from irrelevant text information for the purpose of inferring the meaning of an unknown word, (ii) combining relevant contextual cues to formulate a workable definition, and (iii) evaluating the hypothesized meaning against information from the subsequent context. The Sternberg model thus underscores the utility of word-external information, such as context and background knowledge, in word-meaning inference. However, such an emphasis is not uniformly endorsed. Nagy and Gentner (1990), for example, have contended that the contribution of word-external resources is limited to the formulation of a specific scenario described in the text segment where the word to be inferred appears. Such scenarios are helpful in forming workable hypotheses of what the word might mean, but they are generally too broad to provide sufficient semantic constraints. In this latter view, the centrality of word-internal

resources—morphological information, in particular—has been highlighted. As such, the core operations in word-meaning inference involve intraword segmentation and morphological analysis in identifying known morphological elements in an unknown word. According to Nagy and Anderson (1984), more than 60 per cent of new words children encounter in printed school materials are morphologically complex words, which are structurally transparent and functionally salient, as evident in un-lady-like. In principle, therefore, the meanings of more than half of new words can be deduced on the basis of their morphological constituents. Viewed as a whole, word-meaning inference involves information extraction from word-external (e.g. context and background knowledge) and word-internal (e.g. morphological information) resources and subsequent integration of the extracted information. Undoubtedly, it entails multi-layered operations involving various linguistic and conceptual processes and integration of their outputs. For this reason, it seems reasonable to assume that word-meaning inference, like reading comprehension and new word learning, places relatively heavy demands on working memory, and that successful inference performance depends upon efficient phonological decoding. Oddly, however, there has been little exploration of the extent and manner to which phonological decoding contributes to word-meaning inference either in L1 or L2 reading.

CROSS-LINGUISTIC TRANSFER OF READING SUBSKILLS Transfer is a major concern in second language acquisition (SLA) research. A considerable number of studies have shown systematic L1 influences on virtually all aspects of L2 learning and processing, including vocabulary knowledge (Hancin-Bhatt and Nagy 1994; Nagy et al. 1997; Paribakht 2007), syntactic processing (Kilborn and Ito 1989; Koda 1993) and spelling (James et al. 1993; Fashola et al. 1996). In order to examine cross-linguistic impacts on L2 reading, it is vital to clarify how previously learned reading subskills are incorporated into L2 print information processing, and then, to clarify how such incorporation affects L2 reading development. A recent theory of transfer postulates that well-established L1 reading subskills are activated automatically by L2 print input regardless of the learner’s intent (Koda 2007). The theory further posits that transferred L1 subskills are gradually adjusted to L2 properties through cumulative L2 print processing experience. In what follows, we discuss, within this theory, how decoding skills transfer across languages, and how transferred L1 skills affect the development of L2 phonological-decoding efficiency. Inasmuch as no language has perfect one-to-one symbol-to-sound correspondences (DeFrancis 1989), we can logically assume that multiple procedures are employed in phonological decoding in all languages. Although the range of procedures employed by readers is unlikely to vary greatly across languages, readers are likely to tend to form different preferences for particular


methods in accordance with the way phonology is graphically represented in their writing system (Frost et al. 1987; Perfetti and Zhang 1995; Ziegler and Goswami 2005). Under the theory of transfer, the basic assumption is that once established in one language, these preferences affect the development of L2-decoding skills by serving as the basis for forming preferences for particular methods in the new language. According to the theory, transferred L1 skills are gradually adjusted to L2 properties to maximize their utility. It follows then that the degree of similarity between L1 and L2 orthographic properties—how phonology relates to the grapheme, in particular—is an important factor determining how much accommodation is necessary for transferred decoding skills to become functional in the L2. When L1 and L2 orthographic systems are dissimilar, transferred L1 skills must be substantially modified. Conversely, when the two systems share major properties, L1 skills can be utilized in extracting L2 phonological information with minimum adjustment. By logical extension, then, orthographic similarity should be a powerful predictor of L2 phonological-decoding efficiency. It has been hypothesized that learners whose L1 is orthographically similar to their L2 would be faster and more accurate in extracting phonological information from printed words at any given point in time than their proficiency-matched counterparts whose L1 is orthographically dissimilar. The existence of the orthographic distance effect has been generally supported in experimental studies measuring phonological-decoding efficiency among adult L2 learners of English with similar (alphabetic) and dissimilar (logographic) L1 orthographic backgrounds (Muljani et al. 1998; Wang et al. 2003; Hamada and Koda 2008). Although these studies tested the specific impact of L1 orthographic features on L2-decoding procedures, in all of them, L2 learners with alphabetic L1 backgrounds performed significantly faster across stimulus conditions than their logographic counterparts. Given that transferred L1 skills are a source of variance in L2 phonologicaldecoding efficiency, the next step is to clarify precisely how L1-based variance relates to performance in cognitive tasks in an L2 such as text comprehension and word learning, which are more complex than the L2 tasks researched in the studies mentioned above. The central issues are how, and to what extent, initial efficiency differences among learners with diverse L1 backgrounds predict success in complex L2 tasks whose performance necessitates working memory. Given that phonological-decoding efficiency plays a critical role in working memory, the logical prediction would be that, all other things being equal, the accelerated L2 phonological decoding among learners with similar L1 orthographic backgrounds would yield better performance on complex tasks than those with dissimilar L1 backgrounds. Previous cross-linguistic studies have empirically tested this prediction involving two proficiency-matched ESL groups—one with an alphabetic (similar) and the other with a logographic (dissimilar) L1 background. Although the prediction seems simple and straightforward, the picture emerging from the studies is far from being either in that it exhibits complex cross-linguistic interplay.


In general, transfer studies have shown that (i) the hypothesized advantage of L2 learners with similar L1 backgrounds is observable in decoding speed, that is, the time required to pronounce visually presented words (Muljani et al. 1998; Hamada and Koda 2008), but not in decoding accuracy (Koda 1998); (ii) L2 word properties, such as frequency and spelling regularity, have measurable impacts on L2 print processing uniformly across learners regardless of their L1 background (Muljani et al. 1998; Akamatsu 2002; Wang et al. 2003; Hamada and Koda 2008); and (iii) decoding skills are closely related to reading comprehension among L2 learners of English with alphabetic L1 backgrounds, but no systematic relationships have been found between performance in these two constructs among their logographic counterparts (Koda 1998, 1999). In sum, similarity of L1 and L2 orthographic properties appears to accelerate L2-decoding development. However, the way that decoding efficiency relates to reading comprehension seems to differ fundamentally between learners with alphabetic and learners with logographic L1 backgrounds. A recent study (Leong et al. 2008) involving a large number of Chinese-speaking school-age children (Grades 3 to 5; N = 518) sheds substantial light on the issue. Applying a sophisticated statistical procedure (structural equation modeling) to their data, the researchers found that the ability to manipulate sub-lexical phonological information contributes minimally to phonologicaldecoding performance in Chinese; and that phonological decoding explained only a small portion of the variance of individual differences in reading comprehension ability. The most important implication of their study is that phonological skills, including phonological awareness and decoding, may play a far less prominent role in literacy development in Chinese than has been reported in reading acquisition studies in English (e.g. Perfetti 1985; Shankweiler 1989; Stanovich 2000; Perfetti 2003). It is not clear, at this point, what explains the seemingly script-related divergence of the role of phonological decoding in reading comprehension. However, variation appears to have a long-term impact on the extent to which L2 decoding relates to reading comprehension among L2 learners with alphabetic vs. logographic L1 backgrounds. Taking all the above into account, it seems reasonable to conclude that L1 and L2 factors jointly shape L2 phonological-decoding competence. Apparently, L2 factors, such as the frequency of L2 words that a learner encounters in print and the amount of L2 print experience that the learner has, explain partly the reported variation in L2-decoding efficiency among learners sharing the same L1 background. Also, L1–L2 orthographic distance is likely to affect the rate at which decoding ability develops in L2 among learners with diverse L1 backgrounds. It is highly probable, moreover, that L1 orthographic properties explain the differential ways in which decoding relates to reading comprehension. The observed patterns of L1 influence on the role of decoding in reading are far more complex than have been assumed in cross-linguistic transfer research. If, indeed, L2 reading development is jointly constrained by L1 and L2 linguistic/orthographic properties, such cross-linguistic interplays should be taken into account in designing L2 reading


instruction. However, lack of evidence has so far made it impossible to say exactly what instructional changes, if any, might be appropriate. Thus, more data are clearly desirable to disentangle these and other related issues stemming from the dual-language involvement in L2 reading development.

INVESTIGATIVE FRAMEWORK AND HYPOTHESES

The insights gained from the above survey can be summarized as follows:

(i) Because phonologically encoded information is more durable than any other form of representation in working memory, it enhances the information integration required for complex cognitive tasks.
(ii) In order to permit the simultaneous activation of multiple processors within capacity limitations, phonological decoding and other computational operations must become automated.
(iii) Word-meaning inference entails a number of component operations and their simultaneous activations. Hence, it is reasonable to assume that word-meaning inference depends on working memory, and effective word-meaning inference depends on highly efficient phonological decoding.
(iv) L1–L2 orthographic distance is responsible in large part for efficiency differences in L2 phonological decoding among learners with similar and dissimilar L1 backgrounds.
(v) Decoding efficiency differentially correlates with reading comprehension among L2 learners with alphabetic and logographic L1 backgrounds. The correlation appears to be stronger for L2 learners with alphabetic L1 backgrounds.

Building on these findings, our goal was to extend the documentation of impacts of L1-decoding skills on reading comprehension to another subskill, that of word-meaning inference. In the current study, the insights listed above served as the premises, collectively providing the basis for the following hypotheses: (i) L2 learners of English with alphabetic L1 orthographic backgrounds are more efficient (i.e. faster and more accurate) in phonological decoding of English than their proficiency-matched counterparts with logographic backgrounds. (ii) L2 phonological-decoding efficiency differentially relates to L2 word-meaning inference among ESL learners with alphabetic and logographic L1 backgrounds. Specifically, L2 phonological-decoding efficiency is correlated with L2 word-meaning inference success among alphabetic ESL learners, but its impact is far more limited among their logographic counterparts. These hypotheses were tested empirically by comparing phonological-decoding efficiency and its relation to word-meaning inference among proficiency-matched L2 learners of English with alphabetic and logographic L1 backgrounds.
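Hypothesis (ii) is, in effect, a claim about group-wise correlations between two measures. As a minimal sketch of the comparison it implies, using invented per-participant scores rather than the study's data, one could compute a Pearson correlation within each group (Python with SciPy; all values and variable names below are illustrative only):

from scipy import stats

# Invented per-participant scores; not the study's data.
alphabetic = {"decoding": [0.82, 0.91, 0.75, 0.88, 0.69, 0.95, 0.79, 0.86],
              "inference": [14, 17, 12, 16, 10, 18, 13, 15]}
logographic = {"decoding": [0.64, 0.72, 0.81, 0.59, 0.77, 0.68, 0.74, 0.66],
               "inference": [13, 15, 16, 11, 17, 12, 14, 15]}

for label, group in (("alphabetic L1", alphabetic), ("logographic L1", logographic)):
    r, p = stats.pearsonr(group["decoding"], group["inference"])
    print(f"{label}: r = {r:.2f}, p = {p:.3f}")
# Hypothesis (ii) predicts a markedly stronger correlation in the alphabetic group.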


THE STUDY

Participants


The participants were two groups of college-level ESL learners with contrasting L1 orthographic backgrounds. One was an alphabetic L1 group that consisted of a group of ESL learners with non-Roman alphabetic L1 orthographic backgrounds who were likely to exhibit congruity of L1 and L2 orthographic processing experiences. This group consisted of 15 native speakers of Korean and one native speaker of Turkish (N = 16). The other group, a logographic L1 group (N = 17) consisted of 13 native speakers of Chinese and four native speakers of Japanese.1 The majority of the Chinese participants were from Taiwan, and the rest were from China. Several points were taken into consideration to ensure that the participants shared similar characteristics except for their L1 orthographic backgrounds. The participants were required to have a minimum of a high school education in their home countries in their L1-medium institutions, which demonstrated that they were fluent readers in their L1s. They were enrolled in either advanced classes at a university-affiliated intensive English institute or regular university programs in mid-size universities in the United States. Most participants were from level 4 and 5 intensive English classes, where level 5 is the highest proficiency at the ESL institute and graduate school admission level. The purpose of studying English was academic for all the participants. The mean length of residency in the United States was 4.08 months (SD = 2.82) for the alphabetic L1 group and 5.09 months (SD = 3.11) for the logographic L1 group. These mean numbers were tested using a two-tailed t test, and the difference was found to be non-significant, t (31) = .974, p = .337. To determine whether the two groups were similar in their English proficiency, a reading section of the Test of English as a Foreign Language (TOEFL) was administered to the participants. A retired copy of TOEFL was provided by the Educational Testing Service for the study. The mean score was 31.63 (SD = 8.28) for the alphabetic L1 group and 31.06 (SD = 8.49) for the logographic L1 group, where the maximum score was 50. These group mean scores were tested using a two-tailed t-test, and this difference was found to be non-significant, t (31) = .194, p = .848. A numerical digit memory span test was also administered to determine whether the two groups were similar in their working memory capacity. Following the procedure used in Harrington and Sawyer (1992), the participants listened to digit strings ranging from three to six digits, which were recorded by a native speaker of English. The participants then orally recalled the strings in the same order (forward recall) and in reverse order (backward recall). The test began with a three-digit string. Upon successful recall, the number of digits was increased by one, and the test ended when the participants failed to recall the string accurately. Two sets of digit strings were administered in the test, and the average maximum number of digits recalled was


recorded as data. The mean forward recall number was 4.81 (SD = 0.68) for the alphabetic L1 group and 4.76 (SD = 0.47) for the logographic L1 group. The mean backward recall number was 4.56 (SD = 0.83) for the alphabetic L1 group and 4.50 (SD = 0.56) for the logographic L1 group. The group mean numbers were tested using two-tailed t tests, and these differences were found to be non-significant in both the forward recall task [t(31) = .236, p = .815] and the backward recall task [t(31) = .254, p = .801].
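All of the group comparisons reported above rest on two-tailed independent-samples t tests computed over the group means, standard deviations, and sample sizes. A minimal sketch of that computation using scipy, plugging in the length-of-residency figures quoted above (the variable layout and code are ours, not the authors'):

```python
# Two-tailed independent-samples t test from summary statistics,
# as used for the group comparisons above (residency, TOEFL, digit span).
# The values below are the length-of-residency figures reported in the text.
from scipy.stats import ttest_ind_from_stats

t, p = ttest_ind_from_stats(
    mean1=4.08, std1=2.82, nobs1=16,   # alphabetic L1 group
    mean2=5.09, std2=3.11, nobs2=17,   # logographic L1 group
    equal_var=True,                    # pooled-variance t test, df = 16 + 17 - 2 = 31
)
print(f"t(31) = {abs(t):.3f}, p = {p:.3f}")
# Prints approximately t(31) = 0.975, p = .337, matching the reported values.
```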

Materials and tasks

Naming task

A naming task was administered to measure phonological-decoding efficiency, defined as the speed and accuracy of phonological information extraction from printed words. The stimuli were 20 English real words and 20 English pseudowords (Appendix A can be found as supplementary material available at Applied Linguistics online), selected from the Word Identification sub-test and the Word Attack sub-test, respectively, in Woodcock Reading Mastery Test-Revised (Woodcock 1987). The selection criterion for the stimulus items was appropriateness for the current participants' reading proficiency level. The 20 real words were selected from Grades 4 through 7–10 items out of the total 100 items. An informal interview with the participants after task completion confirmed that they were familiar with (i.e. knew the meaning and were able to pronounce) all of the real word stimulus items. The 20 pseudowords were selected from the middle items in the list, ordered from easiest to hardest, of the total 45 pseudowords. The appropriateness of these items was checked by two of the participants' ESL instructors. In the naming task, the participants read aloud visually presented English real words and pseudowords as quickly and accurately as possible. This task was administered to each participant individually by a computer-programmed instrument. The stimuli were randomized and presented on the computer screen one at a time for a maximum of 2,500 ms. The lapse between the onset of the stimulus presentation and the participant's voice articulation within the 2,500 ms presentation time was recorded as the reaction time (RT). Any responses after the maximum duration were considered to be incorrect. Participants' responses were also tape-recorded for an analysis of response accuracy. Before beginning the naming task, to ensure that the procedure was clear, a practice session consisting of five pseudowords that were not included in the stimuli was administered to all of the participants.

Meaning-inference task

This task involved two steps: a reading session and a definition writing session. The purpose of the reading session was to provide an environment where word-meaning inference could occur. Before detailing the reading session procedure, a thorough description of the materials is in order.


For the reading session, a medium-length passage was selected from an advanced ESL textbook (Insights by Brinton et al. 1997) and divided into three short passages, which were then modified to be used as test passages. First, 10 words common to the three original passages were identified as target words, and then each target word was replaced by a pseudoword in the test passages (Appendix B can be found as supplementary material available at Applied Linguistics online). This was done in order to simulate a situation where learners encounter unknown words (pseudowords in this case) three times while reading. The pseudowords were created by following English orthographic rules, with a maximum of two syllables and seven letters. Since it was not possible to find the exact same 10 words common across the passages, words that have similar meanings (e.g. communicate and interact) were regarded as the same target word. The inflectional and derivational morphemes of the target words were kept in their original forms in the target pseudowords (e.g. noked in Passage 1; noke in Passage 2; nokes in Passage 3). Some of the target words were different parts of speech in the original passages (e.g. ‘feelings’ noun in Passage 1; ‘feel’ verb in Passage 2). For these target words, the morphological information was kept in its original form in the target pseudowords (e.g. hoakings in Passage 1; hoak in Passage 2). The two ESL instructors checked the test passages and made minor modifications of syntactic structures, to minimize the influence of structural unfamiliarity in meaning-inference. For example, some of the complex sentences were made into compound sentences or two simple sentences in order to minimize the use of subordinating clauses. This adjustment targeted a learner population of slightly lower proficiency than the present participants. That is, the target level was level 3, in order to make the passages structurally comprehensible to everyone. Familiarity of vocabulary was also adjusted by changing some words into synonyms, based on the ESL instructors’ judgement, again targeting level 3 proficiency. After this adjustment, it was assumed that all the words except the 10 pseudowords were familiar to the participants, giving 96 per cent known word coverage, above the minimum known word coverage (95 per cent) for word-meaning inference to be successful suggested by Hu and Nation (2000). The pilot study participants and several of the current study participants randomly selected after this experiment confirmed that all of the words and syntactic structures in the passages were familiar to them. Finally, the target pseudowords were underlined to keep the participants from ignoring them. The topic of the passages was ‘folk objects’. To familiarize the participants with this topic, a short introductory passage was created. The reading session started with the participants being instructed to read the introductory passage. The participants were then instructed to read the three test passages. The time allotted for reading was 5 minutes per passage. So that the participants’ attention would be directed to comprehension of the passages, they were asked to give a brief oral summary of the passage in English after reading. During a brief interview conducted after reading the passages, none of the participants indicated that they had specialized knowledge of folk objects.
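The coverage criterion cited from Hu and Nation (2000) reduces to simple token arithmetic. A quick sketch of that calculation; the token counts below are illustrative assumptions (the article reports only the resulting 96 per cent figure), not counts taken from the actual test passages:

```python
# Known-word coverage = proportion of running words (tokens) assumed known to the reader.
# Hypothetical counts: the 10 target pseudowords each appear once per passage across
# three passages, and the combined passages are assumed to total 750 tokens.
unknown_tokens = 10 * 3        # pseudoword occurrences (assumption)
total_tokens = 750             # combined passage length in tokens (assumption)

coverage = 1 - unknown_tokens / total_tokens
print(f"Known-word coverage: {coverage:.0%}")
# 96% under these assumed counts, above the 95% minimum cited from Hu and Nation (2000).
```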

Immediately after the reading session, the second step of the task, a definition writing session, was given to measure individual word-meaning inference success. The participants inferred and wrote down the meanings of the target pseudowords in English. In the pilot study, the participants had difficulty matching the inferred meanings with the correct target pseudowords when the pseudowords were presented as a list. Therefore, the participants in the present study were allowed to view the third passage while performing this task. A maximum of 10 min was given for the definition writing session, which was judged to be an appropriate duration based on the pilot study. Ten native speakers performed both the reading and the definition writing tasks, and their definitions were used to judge the acceptability of the participants' inferred meanings. The reason for doing this was to elicit any acceptable answers other than the original meanings of the target pseudowords in order to reduce the risk that the researchers would mismark acceptable inferences.

RESULTS

Phonological decoding

The naming task was used to measure individual phonological-decoding efficiency. Since decoding efficiency was defined as speed and accuracy of phonological information extraction, RTs of only the accurate responses served as an index of decoding efficiency and were used for subsequent analyses.2 The answer key from Woodcock Reading Mastery Test-Revised (Woodcock 1987) was used for the judgement of response accuracy. Two raters independently listened to the tape recording of the participants' responses and determined the accuracy of the responses. Interrater reliability was .97. For the responses whose judgement was disagreed upon, a third rater's judgement was used to determine accuracy. Since it was not the purpose of this task to assess pronunciation accuracy, foreign accent was treated leniently in determining response accuracy. For example, a pronunciation of /ʒ/ or /dʒ/ for the phoneme /z/ was considered to be correct for the Korean L1 participants.3 Typical incorrect responses were misidentification of letters (e.g. 'journery' for 'journey' and 'slow' for 'sloy') and mispronunciation of vowels (e.g. the vowel in 'shum' was incorrectly pronounced as /u/ and the second vowel in 'depine' as /ɪ/). The mean RT of accurate responses in the alphabetic L1 group was 869.45 ms (SD = 120.71) for the real words and 1001.10 ms (SD = 127.03) for the pseudowords. The mean RT of accurate responses in the logographic L1 group was 969.27 ms (SD = 94.41) for the real words and 1138.57 ms (SD = 114.13) for the pseudowords.4 A two-way analysis of variance (ANOVA) was performed with L1 orthographic background (alphabetic vs. logographic) as the between-subject factor and word type (real words vs. pseudowords) as the within-subject factor. The main effect for L1 orthographic background was significant, F (1, 29) = 16.465, MSE = 217,944.823, p < .0002, ηp² = .221, with the alphabetic

group being faster than the logographic group. The main effect for word type was also significant, F (1, 34) = 26.487, MSE = 350,592.285, p < .0001, ηp² = .314, with the real words showing shorter RTs than the pseudowords. The interaction between L1 orthographic background and word type was not significant, F (1, 29) = .415, p = .522.
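For readers who want to set up this kind of mixed design (one between-subject factor, one within-subject factor), the sketch below shows one possible Python workflow. It assumes per-participant mean RTs in long format and relies on the pingouin package; the column names and the small illustrative data frame are our assumptions, not the authors' data or analysis software.

```python
# Mixed two-way ANOVA: L1 background (between) x word type (within) on mean naming RTs.
# Requires: pip install pandas pingouin
import pandas as pd
import pingouin as pg

# Long-format data: one row per participant per word type (illustrative values only).
rt = pd.DataFrame({
    "participant": [1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6],
    "l1_group":    ["alphabetic"] * 6 + ["logographic"] * 6,
    "word_type":   ["real", "pseudo"] * 6,
    "mean_rt_ms":  [870, 1000, 905, 1040, 860, 995, 965, 1135, 980, 1150, 955, 1120],
})

aov = pg.mixed_anova(
    data=rt,
    dv="mean_rt_ms",
    within="word_type",
    subject="participant",
    between="l1_group",
)
print(aov[["Source", "F", "p-unc", "np2"]])  # np2 = partial eta squared, as reported above
```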

Word-meaning inference

The meaning-inference task (reading and definition writing sessions) measured word-meaning inference success. In this task, the participants read the passages and wrote down inferred meanings of the target pseudowords. Accuracy of the inferred meanings was judged based on two sources: the original words that were replaced in the passages and the definitions given by the ten native speakers who performed the same task (see Appendix C which can be found as supplementary material available at Applied Linguistics online). The participants' definitions were scored following the definition evaluation procedure used in Haynes and Carr (1990). One point was awarded (i) when the definition for a particular pseudoword matched that of the original word; (ii) when the definition matched one of the native speakers' definitions; or (iii) when the definition was semantically identical (or synonymous) to the original word or one of the native speakers' definitions. A half point was awarded when the inferred meaning was semantically close to the original word or one of the native speakers' definitions (e.g. 'participate' or 'unite' for 'interact'). Scoring was done by two independent raters (interrater reliability .90), and items whose scoring was disagreed upon were resolved by discussion. The mean score was 3.16 (SD = 1.98) for the alphabetic L1 group and 3.97 (SD = 2.03) for the logographic L1 group, where 10 was the maximum. These group means were tested using a two-tailed t-test, and this difference was found to be non-significant, t (31) = 1.166, p = .252.

Phonological decoding and word-meaning inference

The following section presents results regarding the relationship between phonological-decoding efficiency and word-meaning inference for the two groups. Because RTs are not normally distributed, non-parametric correlations were used to analyze the relationship. Individual mean RTs of correct naming responses were entered as an index of phonological-decoding efficiency, and individual scores in the meaning-inference task were input as word-meaning inference accuracy. TOEFL reading scores were included as a measure of reading comprehension in the correlation analysis. The intercorrelations among phonological-decoding efficiency (for real words and pseudowords), TOEFL reading scores, and meaning-inference accuracy for the alphabetic L1 group appear in Table 1. Spearman rank correlation coefficients indicated that phonological-decoding efficiency of real words was significantly correlated with TOEFL reading scores, rs(16) = .537, p = .032, and with meaning-inference accuracy, rs(16) = .710, p = .002. Also, TOEFL reading scores were significantly correlated with meaning-inference accuracy, rs(16) = .788, p < .001. Phonological-decoding efficiency of pseudowords also showed a correlation with meaning-inference accuracy that approached significance, rs(16) = .472, p = .065. The intercorrelations among phonological-decoding efficiency, TOEFL reading scores, and meaning-inference accuracy for the logographic L1 group appear in Table 2.5 Spearman rank correlation coefficients indicated that TOEFL reading scores were significantly positively correlated with meaning-inference accuracy, rs(17) = .493, p = .044. However, in contrast to the alphabetic L1 group, phonological-decoding efficiency of real words and pseudowords showed significant correlations with neither meaning-inference nor TOEFL reading for the logographic L1 group. As predicted, the data from the logographic L1 group indicated a lack of association between faster decoding RTs and either meaning-inference accuracy or TOEFL reading scores.
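The correlational analysis reported here is a straightforward Spearman rank correlation between per-participant decoding RTs and meaning-inference scores. A minimal sketch using scipy; the arrays are placeholders standing in for per-participant values, not the study's data:

```python
# Spearman rank correlation between decoding efficiency and meaning-inference accuracy.
from scipy.stats import spearmanr

# Placeholder per-participant values (one entry per learner in a group).
mean_rt_real_words = [812, 840, 855, 870, 902, 931, 960, 1010]  # ms; lower = more efficient
meaning_inference  = [6.5, 5.0, 4.5, 4.0, 3.5, 3.0, 2.0, 1.5]   # definition scores (max 10)

rho, p = spearmanr(mean_rt_real_words, meaning_inference)
print(f"rs = {rho:.3f}, p = {p:.3f}")
# Note: depending on whether efficiency is coded as raw RT or its inverse, the sign of
# rho flips; only the strength of the association is at issue in the discussion above.
```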


Table 1: Intercorrelations among phonological-decoding efficiency, TOEFL reading scores, and meaning-inference accuracy for the alphabetic L1 group

                           Decoding        Decoding        TOEFL      Meaning
                           (Real words)    (Pseudowords)   reading    inference

Decoding (Real words)      –               .785**          .537*      .710**
Decoding (Pseudowords)                     –               .216       .472
TOEFL reading                                              –          .788**
Meaning inference                                                     –

*p < .05; **p < .01.

Table 2: Intercorrelations among phonological-decoding efficiency, TOEFL reading scores, and meaning-inference accuracy for the logographic L1 group

                           Decoding        Decoding        TOEFL      Meaning
                           (Real words)    (Pseudowords)   reading    inference

Decoding (Real words)      –               .473            .220       .068
Decoding (Pseudowords)                     –               .009       .396
TOEFL reading                                              –          .493*
Meaning inference                                                     –

*p < .05; **p < .01.



DISCUSSION


The results demonstrated that the alphabetic L1 group performed significantly more efficiently (i.e. faster and more accurately) than the logographic L1 group in the word naming task, suggesting that congruity in L1 and L2 print processing experiences may expedite L2-decoding development. The finding confirms the hypothesis regarding the L1 orthographic distance effect on L2 print processing, providing further support for the cross-linguistic transfer of reading subskills (e.g. Muljani et al. 1998; Wang et al. 2003). In addition, the significant main effect of word type (real words vs. pseudowords) indicates that L2 word properties have measurable impacts on L2 print processing uniformly across learners regardless of their L1 background, a finding which is consistent with that from previous studies (e.g. Muljani et al. 1998; Wang et al. 2003). However, it is worth noting that some studies seem to have reported otherwise. For example, using a pseudoword phonological judgement task, Koda (1998) found no performance difference in terms of accuracy between Korean and Chinese ESL learners' ability to discriminate English phonemes. Akamatsu (2002) reported that Iranian (alphabetic) ESL learners did not outperform their logographic counterparts in the speed of English word naming. A possible explanation for the discrepant finding reported in Koda's study is that her task involved neither a speed measure nor phonological production. Hence, the task may not have been demanding enough to discriminate between L2 learners with similar and dissimilar L1 orthographic backgrounds. Akamatsu's study manipulated L2 word properties, such as frequency and regularity. Given his participants' high proficiency in English, facilitation from L2 word properties may have overridden possible advantages stemming from congruent L1 orthographic experience. Needless to say, more studies are necessary to clarify the nature of the L1 orthographic distance effect. Although our results suggest that a wide difference between L1 and L2 print processing experiences inhibits naming efficiency of learners with logographic L1 backgrounds, their word-meaning inference performance is unaffected by this difference. Regarding the second research question, the results indicate that decoding efficiency is differentially related to word-meaning inference in the two groups. The alphabetic L1 group exhibited a significant correlation between efficiency of phonological decoding of real words and meaning-inference accuracy and between efficiency of phonological decoding of real words and TOEFL reading scores, and a marginally significant correlation between phonological-decoding efficiency of pseudowords and meaning-inference accuracy. These correlations, as predicted, are suggestive of the critical role of phonological decoding in reading comprehension and word-meaning inference, via enhanced working memory functioning. In contrast, the logographic L1 group showed no reliable correlations between decoding efficiency of either word type and meaning-inference accuracy or TOEFL reading scores, which is consistent with Leong et al.'s study (2008) described earlier. The same contrasting patterns of correlations between the two groups


were also found in Korean and Chinese ESL learners in Koda (1998). The present results indicate that phonological-decoding efficiency measured through a naming task had little impact on the word-meaning inference accuracy of the logographic L1 group. Although the present data do not yield any explanation for the observed contrast found between the alphabetic and the logographic groups, two explanations are logically possible. First, ESL learners with logographic L1 backgrounds may rely less on phonology in L2 reading comprehension and word-meaning inference than their alphabetic counterparts. This explanation calls for a different L1-specific framework to be developed to fully understand logographic ESL learners' word-meaning inference behavior. Second, the procedures for phonological information extraction used by logographic ESL learners may be different from those used by alphabetic readers. Should this be the case, a naming task may not be appropriate for measuring logographic ESL learners' phonological-decoding efficiency. It may be necessary to examine the hypothesized relationship using a different task that does not involve letter-by-letter grapheme-to-phoneme conversions. Lastly, our tentative explanation for the different correlation patterns observed in phonological-decoding efficiency of real words and pseudowords for the alphabetic L1 group is as follows. Although both TOEFL reading scores and meaning-inference accuracy were significantly correlated with phonological-decoding efficiency of real words, such a strong correlation was not found between phonological-decoding efficiency of pseudowords and TOEFL reading, rs = .216, or between phonological-decoding efficiency of pseudowords and meaning-inference accuracy, rs = .472, p = .065. Presumably, a real word's phonology grants access to its semantic information. This being the case, performance on real word naming should be a reliable indicator of efficiency in lexical access, including word meaning. In contrast, pseudowords, by definition, are phonological entities conveying no semantic information. These differences may explain the contrasting ways in which real word and pseudoword naming efficiency related to word-meaning inference among alphabetic ESL learners. It may well be that real word and pseudoword naming efficiency among L2 learners represent distinct capabilities despite their high correlation. Needless to say, these speculations require finely tuned experiments for verification.

CONCLUSION

This study provides a psycholinguistic account of L2 word-meaning inference during reading, an account which focuses on the contribution of phonological-decoding efficiency in reading comprehension and word learning as well as on the cross-linguistic transfer of reading subskills. The results provide additional evidence supporting the hypothesis that the phonological decoding of the participating ESL learners with alphabetic L1 orthographic backgrounds is more efficient than that of their logographic counterparts.


The results also provide a new perspective regarding the relationship between decoding and meaning-inference, by supporting the hypothesis that decoding efficiency is differentially related to meaning-inference between the groups, with the alphabetic L1 group showing a stronger correlation, although the two groups did not differ in meaning-inference performance. Further research disentangling the L1-based difference in the role of phonological decoding on meaning-inference and other cross-linguistic issues in L2 reading and word learning development is warranted. Overall, the present study has provided a new perspective concerning contextual L2 word learning. Given the scarcity of psycholinguistic investigations of L2 word learning pointed out by Haastrup and Henriksen (2001), the findings from the present study should be beneficial for understanding the underlying factors that influence L2 word learning, in particular L2 word-meaning inference while reading. It should be noted that the role of phonology is widely acknowledged in L1 and L2 reading research, but it has not yet received much attention in L2 vocabulary research. It is hoped that the present study will stimulate theoretical discussion regarding the contribution of phonological decoding to word learning. A critical issue has been identified for further research. Considering that the present findings derive from small samples, it would be highly desirable to replicate the study with larger samples and participant groups that consist of learners with homogeneous L1 backgrounds (e.g. Korean ESL vs. Chinese ESL) in order to observe the L1 orthographic effect more clearly. Moreover, it would be necessary to control the exposure to Pinyin, a Roman-alphabet pronunciation aid designed for children, for participants from China. Specifically, it is not clear at this point whether the contrasting correlational patterns are indicative of an L1-based difference in the role of phonology in complex cognitive tasks, such as word-meaning inference among learners with diverse L1 orthographic backgrounds. Alternatively, the different correlational patterns found between the groups may suggest that logographic learners use qualitatively different procedures in extracting phonological information from printed words, and the procedures perhaps cannot be captured through a naming task. These issues need to be addressed in further research, in order to advance our understanding of the role of phonological decoding in reading comprehension and word-meaning inference processes.

ACKNOWLEDGEMENTS

This article was based on the first author's PhD dissertation submitted to Carnegie Mellon University. Partial results were presented at the meeting of the World Congress of Applied Linguistics (AILA), in Madison, WI, in July 2005. The study was partially funded by the GuSH funds at Carnegie Mellon University granted to the first author. A retired copy of Test of English as a Foreign Language was also granted to the first author by the Educational Testing Service. We thank Richard Tucker, Isabel Beck, and the anonymous reviewers and the editor for their comments on the earlier drafts. We also would like to express our gratitude to Dorolyn Smith for her


assistance in recruiting participants, Holmes Finch for his assistance in statistical analysis, and Harold Walls for manuscript editing.

NOTES

1 Chinese employs a so-called logographic writing system, in which primary correspondences are established between morphemes and graphemes, to which syllables are also holistically assigned (DeFrancis 1989; Mattingly 1992). Japanese employs distinct logographic (Kanji) and syllabic (Kana) writing systems, with the logographic writing system having been borrowed from Chinese. Although two distinct writing systems exist for Japanese, Japanese readers are generally considered to be more experienced in processing Kanji because these are the symbols used for content words (Taylor and Taylor 1995). Korean also has logographic characters, Hancha, but the experience of reading Hancha is far from comparable with the experience of reading Chinese characters or Japanese Kanji because of the small proportion (approximately 10 per cent) of Hancha used in Korean texts (Taylor and Taylor 1995).
2 Response accuracy was 92.2 per cent (real words) and 79.7 per cent (pseudowords) for the alphabetic L1 group, and 76.55 per cent (real words) and 57.95 per cent (pseudowords) for the logographic L1 group. We averaged the RTs of only the accurate responses to estimate individual mean RTs.
3 The phonetic symbols used in this article follow the International Phonetic Alphabet (International Phonetic Association 1996).
4 Two outliers were detected in the logographic L1 group and excluded from the analysis.
5 The two outliers' phonological-decoding data were treated as missing data in the intercorrelations.

SUPPLEMENTARY DATA

Supplementary data is available at Applied Linguistics online.

REFERENCES

Adams, M. J. 1994. 'Modeling the connections between word recognition and reading' in Ruddell R. B., M. R. Ruddell, and H. Singer (eds): Theoretical Models and Processes of Reading. International Reading Association, pp. 830–63. Akamatsu, N. 2002. 'A similarity in word-recognition procedures among second language readers with different first language background,' Applied Psycholinguistics 23: 117–33.

Baddeley, A. D. 1986. Working Memory. Oxford University Press. Brinton, D., L. Jensen, L. Repath-Martos, J. Frodesen, and C. Holten. 1997. Insights 1. A Content-Based Approach to Academic Preparation. Longman. Daneman, M. 1991. ‘Individual differences in reading skills’ in Barr R., M. L. Kamil, P. Mosenthal, and P. D. Pearson (eds): Handbook of Reading Research Vol. 2. Longman, pp. 512–38.




Haynes, M. and T. H. Carr. 1990. ‘Writing system background and second language reading: a component skills analysis of English reading by native speaker-readers of Chinese’ in Carr T. H. and B. A. Levy (eds): Reading and Its Development. Component Skills Approaches. Academic Press, pp. 375–421. Hu, M. H-C. and P. Nation. 2000. ‘Unknown vocabulary density and reading comprehension,’ Reading in a Foreign Language 13: 403–30. James, C., P. Scholfield, P. Garrett, and Y. Griffiths. 1993. ‘Welsh bilinguals’ English spelling: an error analysis,’ Journal of Multilingual and Multicultural Development 14: 287–306. Just, M. A. and P. A. Carpenter. 1980. ‘A theory of reading: from eye fixation to comprehension,’ Psychological Review 87: 329–54. Kilborn, K. and T. Ito. 1989. ‘Sentence processing strategies in adult bilinguals’ in MacWhinney B. and E. Bates (eds): The Cross-Linguistic Study of Sentence Processing. Cambridge University Press, pp. 256–91. Kleiman, G. M. 1975. ‘Speech recording in reading,’ Journal of Verbal Learning and Verbal Behavior 14: 323–39. Koda, K. 1993. ‘Transferred L1 strategies and L2 syntactic structure during L2 sentence comprehension,’ The Modern Language Journal 77: 490–500. Koda, K. 1998. ‘The role of phonemic awareness in second language reading,’ Second Language Research 14: 194–215. Koda, K. 1999. ‘Development of L2 intraword structural sensitivity and decoding skills,’ The Modern Language Journal 83: 51–64. Koda, K. 2007. ‘Reading and language learning: crosslinguistic constraints on second language reading development,’ Language Learning 57: 1–44. LaBerge, D. and S. J. Samuels. 1974. ‘Toward a theory of automatic information processing in reading,’ Cognitive Psychology 6: 293–323. Leong, C. K., S. K. Tse, K. Y. Loh, and K. T. Hau. 2008. ‘Text comprehension in Chinese children: relative contribution of verbal working memory, pseudoword reading, rapid automatized naming, and onset-rime phonological segmentation,’ Journal of Educational Psychology 100: 135–49. Levy, B. A. 1975. ‘Vocalization and suppression effects in sentence memory,’


Daneman, M. and P. A. Carpenter. 1980. ‘Individual difference in working memory and reading,’ Journal of Verbal Learning and Verbal Behavior 19: 450–66. Daneman, M. and P. A. Carpenter. 1983. ‘Individual differences in integrating information between and within sentences,’ Journal of Experimental Psychology: Learning, Memory, and Cognition 9: 561–83. DeFrancis, J. 1989. Visible Speech. The Diverse Oneness of Writing System. University of Hawaii Press. Fashola, O. S., P. A. Drum, R. E. Mayer, and S-J. Kang. 1996. ‘A cognitive theory of orthographic transitioning: Predictable errors in how Spanish-speaking children spell English words,’ American Educational Research Journal 33: 825–43. Frost, R., L. Katz, and S. Bentin. 1987. ‘Strategies for visual word recognition and orthographic depth: A multilingual comparison,’ Journal of Experimental Psychology: Human Perception and Performance 13: 104–15. Gathercole, S. E. and A. D. Baddeley. 1989. ‘Evaluation of the role of phonological STM in the development of vocabulary in children,’ Journal of Memory and Language 28: 200–13. Gathercole, S. E. and A. D. Baddeley. 1990. ‘Phonological memory deficits in language-disordered children: is there a causal connection?,’ Journal of Memory and Language 29: 336–60 Gathercole, S. E., G. J. Hitch, A.-M. Adams, and A. J. Martin. 1999. ‘Phonological shortterm memory and vocabulary development: Further evidence on the nature of the relationship,’ Applied Cognitive Psychology 13: 65–77. Haastrup, K. and B. Henriksen. 2001. ‘The interrelationship between vocabulary acquisition theory and general SLA research,’ EUROSLA Yearbook 1: 69–78. Hamada, M. and K. Koda. 2008. ‘Influence of first language orthographic experience on second language decoding and word learning,’ Language Learning 58: 1–31. Hancin-Bhatt, B. and W. Nagy. 1994. ‘Lexical transfer and second language morphological development,’ Applied Psycholinguistics 15: 289–310. Harrington, M. and M. Sawyer. 1992. ‘L2 working memory capacity and L2 reading skill,’ Studies in Second Language Acquisition 14: 25–38.


Perfetti, C. A. and A. M. Lesgold. 1979. ‘Coding and comprehension in skilled reading and implications for reading instruction’ in Resnick L. B. and P. Weaver (eds): Theory and Practice of Early Reading. Erlbaum, pp. 57–84. Perfetti, C. A. and S. Zhang. 1995. ‘Very early phonological activation in Chinese reading,’ Journal of Experimental Psychology: Learning, Memory, and Cognition 21: 24–33. Schneider, W. and R. M. Shiffrin. 1977. ‘Controlled and automatic human information processing: detection, search, and attention,’ Psychological Review 84: 1–66. Shankweiler, D. 1989. ‘Phonology and reading disability: solving the reading puzzle’ in Shankweiler D. and I. Liberman (eds): International Academy for Research in Learning Disabilities Monograph Series vol. 6. The University of Michigan Press, pp. 35–68. Stanovich, K. E. 2000. Progress in Understanding Reading: Scientific Foundations and New Frontiers. Guilford Press. Sternberg, R. J. 1987. ‘Most vocabulary is learned from context’ in McKeown M. G. and M. E. Curtis (eds): The Nature of Vocabulary Acquisition. Lawrence Erlbaum, pp. 89–105. Taylor, I. and M. M. Taylor. 1995. Writing and Literacy in Chinese, Korean, and Japanese. John Benjamins. Wang, M. and K. Koda. 2005. ‘Commonalities and differences in word identification skills among learners of English as a second language,’ Language Learning 55: 71–98. Wang, M., K. Koda, and C. A. Perfetti. 2003. ‘Alphabetic and nonalphabetic L1 effects in English word identification: A comparison of Korean and Chinese English L2 learners,’ Cognition 87: 129–49. Woodcock, R. W. 1987. Woodcock Reading Mastery Tests – Revised. American Guidance Service. Ziegler, J. C. and U. Goswami. 2005. ‘Reading acquisition, developmental dyslexia, and skilled reading across languages: A psycholinguistic grain size theory,’ Psychological Bulletin 131: 3–29.


Journal of Verbal Learning and Verbal Behavior 14: 304–16. Mattingly, I. G. 1992. ‘Linguistic awareness and orthographic form’ in Frost R. and L. Katz (eds): Orthography, Phonology, Morphology and Meaning. Elsevier, pp. 11–26. Muljani, M., K. Koda, and D. Moates. 1998. ‘Development of L2 word recognition: a connectionist approach,’ Applied Psycholinguistics 19: 99–114. Nagy, W. E. and R. C. Anderson. 1984. ‘How many words are there in printed school English?,’ Reading Research Quarterly 19: 304–30. Nagy, W. E. and D. Gentner. 1990. ‘Semantic constraints on lexical categories,’ Language and Cognitive Processes 5: 169–201. Nagy, W. E, E. F. McClure, and M. Mir. 1997. ‘Linguistic transfer and the use of context by Spanish-English bilinguals,’ Applied Psycholinguistics 18: 431–452. Nassaji, H. 2003. ‘L2 vocabulary learning from context: Strategies, knowledge sources, and their relationship with success in L2 lexical inferencing,’ TESOL Quarterly 37: 645–70. Papagno, G., T. Valentine, and A. D. Baddeley. 1991. ‘Phonological STM and foreign-language vocabulary learning,’ Journal of Memory and Language 30: 331–47. Paribakht, T. S. 2007. ‘The influence of first language lexicalization on second language lexical inferencing: a study of Farsi-speaking learners of English as a foreign language,’ Language Learning 55: 701–48. Paribakht, T. S. and M. Wesche. 1999. ‘Reading and ‘‘incidental’’ L2 vocabulary acquisition: an introspective study of lexical inferencing,’ Studies in Second Language Acquisition 21: 195–224. Perfetti, C. A. 1985. Reading Ability. Oxford University Press. Perfetti, C. A. 2003. ‘The universal grammar of reading,’ Scientific Studies of Reading 7: 3–24. Perfetti, C. A. and A. M. Lesgold. 1977. ‘Discourse comprehension and source of individual differences’ in Just M. A. and P. A. Carpenter (eds): Cognitive Process in Comprehension. Erlbaum, pp. 141–84.


Applied Linguistics: 31/4: 532–553 © Oxford University Press 2010
doi:10.1093/applin/amq001
Advance Access published on 9 February 2010

Dynamic Patterns in Development of Accuracy and Complexity: A Longitudinal Case Study in the Acquisition of Finnish

MARIANNE SPOELMAN1 and MARJOLIJN VERSPOOR2,*
1University of Oulu and 2University of Groningen
*E-mail: [email protected]

Within a Dynamic System Theory (DST) approach, it is assumed that language is in constant flux, but that differences in the degree of variability can give insight into the developmental process. This longitudinal case study focuses on intra-individual variability in accuracy rates and complexity measures in Finnish learner language. The study illustrates the use of several useful DST methods and techniques such as min–max graphs and regression analyses to gauge whether different degrees of variability are meaningful, and Monte Carlo analyses to test for significance. Error rates were found to decrease rapidly in most cases except in four notoriously troublesome ones. Both word complexity and sentence complexity, and word complexity and NP complexity develop simultaneously and can be seen as connected growers, but NP complexity and sentence complexity alternate in developing and can be considered competitors. The study clearly shows that the interaction of different complexity measures changes over time. Quite surprisingly, no meaningful relationship was found between accuracy and complexity measures over time.

INTRODUCTION

In a theme session on Complexity, Accuracy and Fluency (CAF) at the 2008 AAAL convention, Norris and Ortega argued that CAF measures do not remain constant over time and are not collinear. They argued that CAF is multivariate and dynamic and is therefore best understood integratively and across the full developmental trajectory. They pointed out that the variability found in studies on information processing theory (Skehan 2003; Robinson 2005; DeKeyser 2007) was usually attributed to inter-individual differences in learner abilities or task requirements, but that variability within individuals in studies within a complexity theory perspective was seen as the motor of development (Larsen-Freeman 2006; De Bot et al. 2007). Therefore, they advocate integrating CAF studies with more dynamic descriptions as in Larsen-Freeman (2006) and with more focus on items not usually covered in CAF research such as lexis, formulae, fluency, and morphology. Ortega and Byrnes (2008) also point to the need for longitudinal investigations to obtain a full view of developmental trajectories.


The present article is an attempt to meet some of these suggestions. Analyzing the longitudinal data of a Dutch student learning Finnish, we will trace the dynamic development of accuracy rates in the acquisition of morphology on the one hand and the development of several complexity measures on the other. Then we will look at the interaction between complexity and accuracy over time. To do so, it will make use of several methods developed by van Geert and van Dijk (2002) and exemplified in Verspoor et al. (2008). The study will make three points: (1) degrees of variability in the learner’s language can indeed give insight into the developmental process, (2) the interaction of several complexity measures shows that even within one linguistic domain there seems to be competition of resources and (3) there is no interaction over time between accuracy and complexity in the measures that we examined. As Verspoor et al. (2008) point out, variability in developmental studies has until recently not been considered as a meaningful developmental phenomenon in its own right. Traditional statistical methods see variability as ‘noise’ in the data due to measurement error, and not something that has to be analyzed. In the 1980s, a great number of studies looked at variability in L2 learners’ language, not to see what variability can tell about development but what the direct causes were of variability (cf. Tarone 1988). However, one of the earliest exceptions was Ellis (1994: 137), who concluded with reference to the longitudinal multiple case study by Cancino et al. (1978) that ‘free variation occurs during an early stage of development and then disappears as learners develop better organized L2 systems’. Within the field of developmental psychology, Thelen and Smith (1994) proposed that variability is needed for a learner to explore and select. They conclude that development can take place only when learners have access to a variety of forms and they are able to select those that help them develop. Within a similar vein, Berenthal (1999), in a study on crawling patterns in infants, suggests that variability offers flexibility, driving development following Darwinian principles. Principles of variation and selection lead to storing and repeating behaviors that were successful more frequently than behaviors that were less successful. Therefore, within-subject variability is functional and occurs continuously in any developing complex system, but the degree of variability may change depending on how stable or unstable the system is at a given moment. A relatively more unstable period is often a sign that the system is moving from one phase to another. By looking at the degree of variability over time within the different sub-systems in the learners’ L2 system, we may discover how and when these sub-systems are developing. Therefore, both intra- and inter-individual variability should not be ignored, but rather be treated as data and be analyzed. Taking an emergentist and dynamic systems perspective, this article describes a longitudinal case study on writing development in Finnish learner language.


EMERGENTISM AND DYNAMIC SYSTEMS THEORY


Emergentism is a general approach to cognition that emphasizes the interaction between organism and environment and denies the need for predetermined, domain-specific faculties or capacities. An emergentist approach assumes that complexity of language emerges from relatively simple developmental processes being exposed to a massive and complex environment (Ellis 1998). The adjective emergent is taken literally as a continual movement toward structure. A structure that is emergent, is taken to be constantly open and in flux (Hopper 1998). Many emergentists assume that language learning is purely usage-based. Within the framework of usage-based learning, it is assumed that language development is sensitive to external factors such as frequency, contextual contingencies, attention and perceptual salience, among others, because of the input-dependent nature of usage-based learning (Ellis 2006a, 2006b). However, frequency is especially assumed to play an important role. For example, Tomasello (2003) suggests that the process whereby language structures emerge from language use depends crucially on the type and token frequency with which certain structures appear in the input. Therefore, in usage-based theory, constructions (units involving form, meaning, and use) can basically be considered as chunks that are established through practice and processed as single units (Bybee 2008). Although it is assumed that first language development is sensitive to many external factors (Ellis 2006a, 2006b), second language learning presents an even more complicated situation due to additional factors such as prior knowledge of another language, possible critical period effects, type of instruction, feedback providing negative evidence, aptitude and individual differences in motivation, among others (O’Grady 2008a, 2008b). Furthermore, the first language should be considered as both a help and a hindrance to Second Language Development (SLD). To the extent that constructions in a second language are similar to those of the first language, L1 constructions can serve as the basis for L2 constructions, with only the particular lexical or morphological material changed. In contrast, the acquisition of the L2 patterns is hindered by the L1 pattern, if similar constructions across languages differ in detail (Ringbom 1987; Bybee 2008). Because Dynamic Systems Theory (DST) (Thelen and Smith 1994; Van Geert 1994) is compatible with the view that there is no need for a pre-existing universal grammar in the mind of any individual, but rather supposes that a human disposition for language learning is required (De Bot et al. 2007), DST can be used in conjunction with Emergentism, and DST principles can be applied to complement the emergentist approach. According to Van Geert (2008), DST should not be considered a specific theory, but rather as a general view on change. DST has been used to study complex dynamic systems. As several recent publications have pointed out (Larsen-Freeman 1997; De Bot et al. 2005;


De Bot et al. 2007; Larsen-Freeman and Cameron 2008; Verspoor et al. 2008; Verspoor et al. forthcoming) many of the phenomena of interest to SLD can be seen as complex systems as well. L2 complex systems are also likely to contain many subsystems, nested one within another. Furthermore, as for example Larsen-Freeman and Long (1991) have pointed out, learning linguistic items is not a linear process, as learners do not master one item before they move on to another. In fact, the learning curve for a single item is not linear either, but it is filled with peaks and regressions, progress and backsliding. Because SLD involves complex dynamic systems, it is argued that DST offers a helpful way of thinking about matters and processes concerning SLD. As they develop over time, complex subsystems appear to settle in specific states, which are called attractor states. However, even when a complex system has reached a temporary attractor state, the system is still continually changing as a result of change in its constituent elements and their interaction. So a complex system will even show degrees of variability around stabilities, because variability is an inherent property of any complex, self-organizing system. However, the degree of variability is greatest, when a system or subsystem moves from one attractor state to another. Basically, it can be assumed that the degree of variability depends on how stable the complex system is at a given moment. Therefore, it is assumed that variability is an important developmental phenomenon. Both the degree of variability and fluctuations in the degree of variability are assumed to provide us with information about stages in the developmental processes of SLD. A DST approach assumes that the growth process is related to the system’s resources. As van Geert (1994) explains, resources are required to keep the process of growth going. Resources in growth systems have two main characteristics: they are interlinked in a complex system and they are limited. In van Geert’s work, ‘resources’ are a complex of internal or external factors that a learner may use or that may affect the learner. For example, in his 2008 work, he states that ‘those resources might imply the frequency with which high-quality (native) L2 is heard by the L2-learner, the frequency of use, the speaker’s linguistic talent (whatever that may mean) and so forth.’ Because resources are limited and growth is resource-dependent, growth is by definition limited. Based on its resources, the growth process has a limited carrying capacity, which refers to the state of knowledge that can be attained in a given interlinked structure of resources. However, even though the limited amount of resources has to be distributed over the different subsystems that grow, not all subsystems require an equal amount of resources. Basically, compensatory relations between different types of resources can be found. Some connected growers support each other’s growth, so that they need fewer resources than two growers that are unconnected. Such relationships are called supportive relations. In contrast, competitive relations between growers can be found. Such a relationship implies that an increase in the use of a more advanced grower is related to a decrease in the use of a less


advanced grower, which will contribute to the latter's decline. The support or competition between variables can be strong or weak. Growth or developmental change can take many shapes. One important distinction among these different shapes of development concerns the difference between continuity and discontinuity (Van Dijk and Van Geert 2007). Discontinuities are of special interest, because they may indicate increasing self-organization in the system, that is, a spontaneous organization from a lower level to a higher level of development. Discontinuities occur during stage transitions. In a discontinuity, the growing variable basically jumps from one stage to the next without intermediary points (Van Geert et al. 1999). It is assumed that variability increases in the vicinity of a developmental jump (Van Dijk 2004). More specifically, it was found that increased variability (Verspoor et al. 2008) or anomalous variance (Van Dijk 2004) often precedes a developmental jump. Emergentism and DST are promising theories, but the development of an emergentist view on SLD is still in its infancy. However, frequency is, among other factors, assumed to play an important role in L2 development because repetition and repeated exposure strengthen linguistic representations in the L2, at the same time making them more accessible (Bybee 2008). In addition, although DST acknowledges intra-individual variability as an important developmental phenomenon that should be analyzed, not many studies have been conducted to investigate the role of variability in SLD. One of these studies was a longitudinal case study of an advanced learner of English, conducted by Verspoor et al. (2008). This case study revealed an interesting dynamic interaction of subsystems at the lexical and syntactic level. In accordance with the assumption of limited resources, the learner showed a variable development for some related measurements in the course of the trajectory. A strong trend towards a competitive relationship was found between the Type Token Ratio (in texts of equal lengths of 200 words) and the average sentence length, suggesting that the learner either focused on vocabulary or syntactic complexity. In addition, a supportive pattern became apparent for two complexity measures at different levels of granularity, number of words to finite verb ratio (a general complexity measure) and NP length. The positive correlation between the scores for these measures and their coinciding local highs and lows supported the idea that these measures were connected growers. In the current study, a longitudinal case study was conducted to investigate writing development in Finnish learner language. The purpose of this study was to provide additional insight into the role of intra-individual variability in SLD by investigating a language quite different from English and looking at development starting from an absolute beginner's level.

THE CASE STUDY

The Finnish language is particularly well known for its rich and complex morphology (Moscoso del Prado Martin et al. 2004) because in Finnish, a synthetic


language of the agglutinating type, inflections are numerous and in most cases phonologically distinct (Holmberg and Nikanne 1993). The Finnish language is different from most Indo-European languages, especially in that it has more cases than most of them (Karlsson 1999). The Finnish case system comprises 15 cases (Leino 1997; Helasvuo 2008) and is traditionally taken to consist of two types of cases: structural and semantic cases (Nikanne 1993). Nominative, genitive, partitive, and accusative are considered structural or grammatical cases. Semantic cases are divided into locative and marginal cases. Locative cases are subdivided into internal locative cases, external locative cases, and general locative cases (Helasvuo 2008). All marginal cases are rare and appear mainly in fixed expressions like idioms (Karlsson 1999). Appendix 1 provides an overview of the Finnish case system (see Supplementary Data available at Applied Linguistics online). The data of the longitudinal case study comprise 54 writing samples. The study focuses on two aspects of linguistic performance: accuracy and complexity. Accuracy concerns the extent to which the language produced conforms to the target language norms. Complexity concerns the elaboration of the language that is produced and reflects the learner's preparedness to restructure and to try new constructions (Skehan 1996). It was decided to focus on accuracy and complexity, because both aspects of linguistic performance will change and develop over time. Because of a limited ability to coordinate and control attentional resources (Skehan and Foster 1997, 2001), learners find it difficult to attend to different aspects at the same time, and thus have to make decisions about how to allocate their attentional resources by prioritizing one language subsystem over others. However, the greatest influence of these limited attentional resources can be found at the beginning of L2 development, when the learner has only limited proficiency (Skehan 1996). In order to take the complex and rich morphology of the Finnish language into account, the accuracy analyses focus on the use of cases, whereas word, Noun Phrase (NP) and sentence constructions were examined for developmental patterns of complexity. The reason for looking at complexity at these different levels is that complexity is 'a multi-dimensional construct with several measurable sub-constructs that relate to distinct sources of complexification, each in need of measurement by application of distinct metrics' (Norris and Ortega forthcoming). In the following sections, the method and results of the case study will be discussed.

Subject and data collection

This longitudinal case study involves data of a native speaker of Dutch (aged 19–21) who learned Finnish as a foreign language. The subject in this study majored in theoretical linguistics and took a minor in Finnish language at the University of Groningen for three years. She had no previous knowledge of Finnish and had never been to Finland.


The data consist of 54 writing samples on academic topics, written over the course of three years. Task conditions were rather similar: the writing samples result from homework assignments for several Finnish proficiency courses, written at home on academic topics, without time pressure and with reference materials available. All samples were marked by a Finnish language teacher (a native speaker of Finnish) and checked by an advanced learner of Finnish. Because the texts were originally not equal in length, a random sample of approximately 100 words was selected from each text (a 10 per cent deviation was permitted). The data of some of the earliest texts were normalized because they did not consist of a hundred words. In other words, if the text contained only 50 words, the number of errors, and so on, was multiplied by two.

Design and analysis

All writing samples were transcribed based on Codes for the Human Analysis of Transcripts (CHAT) and analyzed by Computerized Language Analysis (CLAN), both developed as part of the Child Language Data Exchange System (CHILDES) project (MacWhinney and Snow 1990). Several techniques developed by Van Geert and Van Dijk (2002) were used to gain insight into the dynamic developmental processes involved in foreign language learning. First, min–max graphs were used to visualize the bandwidth of observed scores. The moving min–max graph plots the score range for each measurement occasion by using a moving window, a time frame that moves up one position (i.e. one measurement occasion) each time. Each window largely overlaps with the preceding windows, using all the same measurement occasions minus the first and plus the next. For every predetermined set of consecutive measurements, the maximum and the minimum values are calculated. This is done by way of a predetermined moving window of, for instance, five positions, so that we obtain the following series:

min(t1 . . . t5), min(t2 . . . t6), min(t3 . . . t7), etc.
max(t1 . . . t5), max(t2 . . . t6), max(t3 . . . t7), etc.

By using a moving window, min–max graphs highlight the score ranges so that subtleties in the developmental process can be distinguished (Verspoor et al. 2008). In addition, detailed insight into dynamic developmental processes was achieved by analyzing the interactions between developmental variables. Both raw and detrended correlations were calculated. Raw correlations (based on the actual data) may give skewed results because the incline of the slopes of the two different variables may differ. Therefore, detrended correlations were calculated by subtracting the incline of the slope from the raw data, preventing overestimation of local variability related to the developmental slope (Verspoor et al. 2008).
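The moving-window descriptives and the detrending step just described, together with the resampling logic behind the Monte Carlo procedure introduced in the next paragraph, can be sketched in a few lines of Python. This is only an illustrative analogue (the study itself used Microsoft Excel with the Poptools add-in); the function names, the window size of five, and the short made-up series are our assumptions:

```python
# Moving min-max window, detrending, and a simple resampling check,
# corresponding to the procedures described in this section.
import numpy as np

def moving_min_max(scores, window=5):
    """Min and max over each window of `window` consecutive measurement occasions."""
    s = np.asarray(scores, dtype=float)
    mins = np.array([s[i:i + window].min() for i in range(len(s) - window + 1)])
    maxs = np.array([s[i:i + window].max() for i in range(len(s) - window + 1)])
    return mins, maxs

def detrend(scores):
    """Subtract the least-squares linear trend (the 'incline of the slope') from a series."""
    s = np.asarray(scores, dtype=float)
    t = np.arange(len(s))
    slope, intercept = np.polyfit(t, s, 1)
    return s - (slope * t + intercept)

def permutation_p(x, y, iterations=10_000, seed=0):
    """Estimate how often a random reshuffling of y yields a correlation
    at least as large as the observed one (a simple resampling analogue
    of the Monte Carlo test described in the text)."""
    rng = np.random.default_rng(seed)
    observed = np.corrcoef(x, y)[0, 1]
    hits = sum(np.corrcoef(x, rng.permutation(y))[0, 1] >= observed
               for _ in range(iterations))
    return hits / iterations

# Illustrative series only (not the study's data): two complexity measures over ten texts.
word_cx = [1.1, 1.3, 1.2, 1.5, 1.4, 1.6, 1.8, 1.7, 1.9, 2.0]
np_cx   = [1.2, 1.2, 1.4, 1.5, 1.6, 1.5, 1.7, 1.9, 1.8, 2.1]

mins, maxs = moving_min_max(word_cx, window=5)
r = np.corrcoef(detrend(word_cx), detrend(np_cx))[0, 1]   # detrended correlation
p = permutation_p(detrend(word_cx), detrend(np_cx))       # resampling estimate of p
```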



statistical simulations of variability, named Monte Carlo simulations, were carried out in order to analyze the interactions between developmental variables. All statistical simulations were carried out in Microsoft Excel, with a statistical add-in that performs random sampling (Poptools, a set of statistical tools by Hood 2004). A Monte Carlo simulation basically analyzes whether a correlation is based on coincidence. By simulating variability under the null hypothesis, the 95% level of statistical variability can be calculated. The number of times the simulated sets are equal to or greater than the observed sets (exceeding the 95% boundary) divided by the number of statistical simulations gives an estimation of the probability (p-value) (Bassano and Van Geert 2007). It was decided to run 10,000 iterations to calculate estimated p-values. The a decision level was determined at 0.05. Furthermore, the Progressive Maximum and the Regressive Minimum were calculated in order to be able to detect possible developmental jumps. The Progressive Maximum was calculated by determining maximum values based on a window starting from the first five data points and extending with one data point at the time (1–5, 1–6, 1–7, 1–8, 1–9, etc.) Similarly, the Regressive Minimum was computed by specifying a window with a period of 5 starting from the last data point and moving backwards (54–50, 54–49, 54–48, 54–47, etc.) (Van Geert and Van Dijk 2002). The Progressive Maximum and Regressive Minimum were plotted in so-called Progmax-Regmin graphs. The point at which maximal distances between the Progressive Maximum and the Regressive Minimum are observed indicates the position of the assumed developmental jump. Here too, a series of Monte Carlos were run to test for significance. All writing samples were analyzed on two central aspects: accuracy and complexity. For accuracy, an overall case accuracy rate was calculated: all cases were taken together, and for each text, the number of incorrect cases was subtracted from the total number of cases. The difference between the total number of cases and the number of incorrect cases was then divided by the total number of cases. Furthermore, for each of the 11 most frequently occurring cases, error rates were calculated. In the 3 cases with the highest error rates, types of errors were studied in detail. For complexity, the developmental patterns of complexity were investigated at the word, NP and the sentence level. Word level complexity was calculated by counting the number of morphemes each word contains: for each text the number of single morpheme words, two morpheme words, three morpheme words, and words consisting of more than three morphemes was counted. The word complexity ratio was operationalized by calculating the difference between the average sentence length in morphemes (AVSLm) and the average sentence length in words (AVSLw). NP complexity was calculated by counting the number of words in each NP: for each writing sample the number of NPs containing one word, NPs containing two words, and NPs containing three or more words was counted. The NP complexity ratio was calculated by averaging NP length in words for each text. Sentence level complexity was calculated by


RESULTS To calculate case accuracy rates, all case errors were taken together. Figure 1 provides the case accuracy rate and its min–max graph. The figure shows that most case accuracy rates lie between 0.80 and 1.00. However, a lower case accuracy rate of 0.65 was found in the eighth text. The min–max graph illustrates that the degree of variability is relatively high within the earliest texts. After the 11th text, variability decreases, although there is still a considerable degree of variability. From the 28th text onward, variability decreases even further and seems to stabilize. On the whole, case error rates were low and four cases did not occur often enough to be calculated individually. Therefore, separate error rates for the 11 cases that were used most frequently within the 54 writing samples were calculated. In eight cases (nom sg: 0.07, gen sg: 0.03, elat sg: 0.09, iness sg: 0.07, ill sg: 0.08, adess sg: 0.01, ess sg: 0.07, nom pl: 0.06), the error rates were below 0.10. Only in three cases—the partitive singular, accusative singular, and partitive plural—did the error rate exceed 0.10. These errors were examined qualitatively. Table 1 shows that most errors related to the partitive singular and accusative singular were found to be semantic use errors, whereas most errors related to partitive plural appeared to be purely form errors. Some examples of semantic errors are provided in Table 2. Several complexity levels were analyzed to trace their development over time. Figure 2 shows the developmental patterns of (a) word complexity, (b) NP complexity, and (c) sentence complexity. These graphs show that for each level of complexity, the simpler categories are used most frequently. For word complexity, single morpheme words and words consisting of two morphemes occur most frequently. Words consisting of three or more than three morphemes are used rather infrequently. For NPs, those containing one or two words occur most frequently, whereas NPs containing three words or more occur rather infrequently. For sentence complexity, simple sentences and


counting the number of simple, compound, complex, and compound–complex (a compound sentence with one or more dependent clauses or a complex sentence containing two or more dependent clauses joined by a coordinating conjunction) sentences the texts contained. The classification of sentence types was based on Verspoor and Sauter (2000). The sentence complexity ratio was calculated by averaging number of dependent clauses per text. Appendix 2 provides an overview of the formulas used to calculate the different complexity ratios (see Supplementary Data available at Applied Linguistics online). For the interaction between accuracy and complexity, the development of case errors was compared with the development of word complexity, not only because they are closely interrelated as both measures are at the word level but also because word-level complexity turned out to be a good predictor for general complexity.
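The exact formulas are given in Appendix 2; the sketch below shows one plausible reading of the three ratios in code (the variable and function names are ours, and we assume the sentence complexity ratio averages dependent clauses over the sentences of a text):

    # Word complexity ratio: average sentence length in morphemes (AVSLm)
    # minus average sentence length in words (AVSLw).
    def word_complexity_ratio(morphemes_per_sentence, words_per_sentence):
        avslm = sum(morphemes_per_sentence) / len(morphemes_per_sentence)
        avslw = sum(words_per_sentence) / len(words_per_sentence)
        return avslm - avslw

    # NP complexity ratio: average NP length in words for one text.
    def np_complexity_ratio(np_lengths_in_words):
        return sum(np_lengths_in_words) / len(np_lengths_in_words)

    # Sentence complexity ratio: average number of dependent clauses
    # (assumed: per sentence) for one text.
    def sentence_complexity_ratio(dependent_clauses_per_sentence):
        return sum(dependent_clauses_per_sentence) / len(dependent_clauses_per_sentence)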


Table 1: Detailed analysis of the error rates of partitive singular, accusative singular and partitive plural

Case                    Error rate    Error type
                                      Form     Use      Form/use
Partitive singular      0.177         39%      54%      7%
Accusative singular     0.184         15%      85%      0%
Partitive plural        0.230         83%      17%      0%

Table 2: Examples of semantic errors concerning the partitive singular and accusative singular

Realized structure                              Target-like structure
Minä tee-n laivamatka-n                         Minä tee-n laivamatka-a
I-NomSg make-1Sg boat trip-AccSg                I-NomSg make-1Sg boat trip-PartSg
                                                'I am making a boat trip' (atelic)

Isä ve-i koira-a ulos                           Isä ve-i koira-n ulos
father bring-Past-3Sg dog-PartSg outside        father bring-Past-3Sg dog-AccSg outside
                                                'Father brought the dog outside' (telic)

complex sentences are used most frequently, and compound and compound complex sentences occur rather infrequently. Each of the complexity levels shows rather similar patterns. With respect to word complexity, from the seventh text onward, the number of single


Figure 1: Moving min–max graph of the developmental pattern of case accuracy. (Window size of five data points)
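The moving min–max band plotted in Figure 1 can be reproduced from the 54 per-text scores in a few lines; a minimal sketch (the five-point window follows the Method section, and the scores shown are invented for illustration only):

    # Moving min-max band: for each window of five consecutive texts, record the
    # lowest and highest observed score, then slide the window one text forward.
    def moving_min_max(scores, window=5):
        mins, maxs = [], []
        for start in range(len(scores) - window + 1):
            chunk = scores[start:start + window]
            mins.append(min(chunk))
            maxs.append(max(chunk))
        return mins, maxs

    accuracy = [0.82, 0.91, 0.88, 0.65, 0.93, 0.90, 0.95]   # illustrative values only
    lower, upper = moving_min_max(accuracy)
    # lower = [0.65, 0.65, 0.65], upper = [0.93, 0.93, 0.95]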



Figure 2: Development of complexity on the (a) morphological level, (b) the Noun Phrase level and (c) the sentence level



morpheme words seems to go up when the number of two morpheme words goes down, and vice versa, a logical inverse relationship. Words consisting of three morphemes are used on a more regular basis from the seventh text onward and they show an increase toward the end. It is not until the 52nd text that words consisting of more than three morphemes are used more frequently. With respect to NP complexity, two word NPs are not used very frequently until the 11th text. After the 11th text, however, the number of two word NPs seems to go up when the number of one word NPs goes down, and vice versa. After the 11th text, the development of one word NPs and two word NPs coincides with an increased degree of variability. From the 42nd text onward, the number of one word NPs decreases and the degree of variability decreases. From these most advanced texts onward, the development of the different NP categories seems to stabilize. With respect to sentence complexity, simple sentences are used very frequently within the early texts. The number of simple sentences decreases slightly over time. However, the development of simple sentences coincides with a large degree of variability, not stabilizing until the 28th text. Complex sentences are already used in the first text, but from the 9th to the 11th text onward, they are used more widely (although a regression can be observed in the 12th text). After the 12th text, the use of complex sentences increases with a relatively small degree of variability. Both compound and compound–complex sentences show a floor effect, since these sentence types are used very infrequently. Compound sentences are used from the beginning onward, whereas the first compound–complex sentence emerges in the 15th text. Compound–complex sentences are used on a more regular basis from the 30th text onward. To test the observations, several correlations were run. Indeed, a moderately strong negative correlation (R = 0.368) was found between the single morpheme words and the two morpheme words and also a detrended correlation was moderately strong (R = 0.357). A Monte Carlo simulation (10,000 iterations) revealed an estimated p-value of 0.005, suggesting a significant competition between the use of one and two morpheme words. A moderately strong negative correlation (R = 0.3364) was found between the number of one word NPs and the number of two word NPs. In addition, the detrended correlation appeared to be moderately strong (R = 0.357) and a Monte Carlo simulation (10,000 iterations) revealed an estimated p-value of 0.003, suggesting a significant competition between the use of one word and two word NPs. More detailed analyses of the less frequently used categories also appeared to be useful. A detailed analysis of NPs containing more than three words provided more insight into the developmental pattern of NP complexity. Figure 3 provides a Progmax–Regmin graph of the development of NPs containing more than three words and illustrates a possible developmental jump at point 0, indicating that the number of NPs containing more than three words increases


discontinuously between the 44th and the 45th text. A Monte Carlo simulation (10,000 iterations) revealed an estimated p-value of 0.015, so the developmental jump can be considered significant. In addition, interactions between word complexity, NP complexity, and sentence complexity were analyzed. The raw correlation between the NP complexity ratio and the sentence complexity ratio appeared to be weak (R = 0.185), but the detrended data showed a strong negative correlation (R = 0.451). Monte Carlo simulations revealed an estimated p-value of 0.000, indicating a strong competitive relationship between NP and sentence complexity. Furthermore, strong positive correlations were found between the word complexity ratio and the sentence complexity ratio and between the word complexity ratio and the NP complexity ratio (R = 0.605; R = 0.604,

Figure 4: Moving window of correlation between word complexity rates and case accuracy rates. (Window size of five data points)


Figure 3: Moving Progmax–Regmin graph representing the development of NPs containing more than three words
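The two series underlying a Progmax–Regmin graph can be computed as below; a sketch following the windowing described in the Method section (starting window of five data points; the function names are ours):

    # Progressive Maximum: maxima over expanding windows (1..5), (1..6), (1..7), ...
    # Regressive Minimum: minima over windows anchored at the last text and
    # extending backwards: (54..50), (54..49), (54..48), ...
    def progressive_maximum(scores, start_size=5):
        return [max(scores[:end]) for end in range(start_size, len(scores) + 1)]

    def regressive_minimum(scores, start_size=5):
        return [min(scores[begin:]) for begin in range(len(scores) - start_size, -1, -1)]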


DISCUSSION In spite of the fact that a very complex case system comprising fifteen cases had to be learned, most case accuracy rates were very high. As one would expect in the earliest stages, the degree of variability was relatively high in the early texts, but already from the 28th text onward, variability stabilized, indicating that the system settled. A more detailed look at the errors showed that for the 11 most frequently used cases, the error rates were less than 10 per cent. Relatively high error rates were found only for the partitive singular, the accusative singular, and the partitive plural. The high error rates in the partitive case are in line with literature findings that the partitive case is often problematic for learners of Finnish as a foreign language (Denison 1957; Schot-Saikku 1990; Martin 1995). A more detailed analysis of the errors showed that most errors related to partitive and accusative singular were semantic use errors, whereas most errors related to partitive plural were form errors. These findings can be explained in terms of DeKeyser’s (2005) claim that lack of linguistic transparency is an important factor in acquisition difficulty. In other words, lack of saliency seems to interact with factors such as frequency and opacity. First, the high number of semantic use errors made with respect to the partitive and accusative singular can be explained by taking into account the accusative–partitive alternation that exists in Finnish. The Finnish partitive case expresses unknown identities, partialness, and irresultative actions. The partitive has two functions, which can be termed aspectual and NP-related (Kiparsky 1998). The NP-related function of the partition implies that quantitatively indeterminate objects are partitive regardless of the verb. In line with the aspectual function, partitive case is assigned by most atelic verbs and accusative is assigned by most telic verbs (Kiparsky 2005). Table 3 illustrates the


respectively). Detrended correlations between word complexity and sentence complexity and word complexity and NP complexity were weaker (R = 0.228; R = 0.226, respectively). Monte Carlo simulations showed strong trends (p = 0.054; p = 0.054, respectively). Finally, for the interaction between accuracy and complexity, we looked at the development of case errors and the development of word complexity, as the latter proved to be a good indicator of general complexity. The detrended correlation between the case accuracy rate and word complexity ratio appeared very weak (R = 0.022). A weak correlation could be due to a changing interaction over time between the two variables. Therefore, we created a moving window of correlations, which shows that the periods of negative and positive correlations alternate. As Figure 4 shows, there is a negative correlation until text 5, from text 5–15 there is a positive correlation, then from text 15–23 there is a negative correlation. From text 23 onward, there is a positive correlation with a few outliers. A Monte Carlo simulation suggests that this pattern is random (p-value of 0.5704 with 10.000 simulations).
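The Monte Carlo estimates reported in this section can be approximated with a simple resampling loop. The sketch below permutes one series to simulate the null hypothesis and counts how often the simulated correlation is at least as large as the observed one; this is one common implementation, not necessarily the exact routine provided by Poptools, and all names are ours (10,000 iterations and alpha = 0.05, as in the study):

    import random

    def pearson_r(x, y):
        n = len(x)
        mx, my = sum(x) / n, sum(y) / n
        sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
        sxx = sum((a - mx) ** 2 for a in x)
        syy = sum((b - my) ** 2 for b in y)
        return sxy / (sxx * syy) ** 0.5

    # Estimated p-value: proportion of shuffled data sets whose correlation is
    # at least as extreme as the observed (detrended) correlation.
    def monte_carlo_p(x, y, iterations=10_000, seed=1):
        rng = random.Random(seed)
        observed = abs(pearson_r(x, y))
        hits = 0
        y_shuffled = list(y)
        for _ in range(iterations):
            rng.shuffle(y_shuffled)
            if abs(pearson_r(x, y_shuffled)) >= observed:
                hits += 1
        return hits / iterations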


Table 3: Accusative (a) – partitive (b) alternation; examples adapted from Kiparsky (2005)

(a) Ammu-i-n karhu-n.               (b) Ammu-i-n karhu-a.
    shoot-Past-1Sg bear-AccSg           shoot-Past-1Sg bear-PartSg
    'I shot the (a) bear.'              'I shot at the (a) bear.'

Table 4: Overview of the main vowel changes before the plural ending -i

Morphological rule                                                Stem       Translation    Part pl
the short vowels /o/, /ö/, /u/, /y/ (the round vowels)
  do not change                                                   talo-      'house'        talo-j-a
a long vowel shortens                                             puu-       'tree'         pu-i-ta
the first vowel of the diphthongs /ie/, /uo/, /yö/ is dropped     yö-        'night'        ö-i-tä
/i/ is dropped in diphthongs ending in -i                         hai-       'shark'        ha-i-ta
short /e/ is dropped                                              kiele-     'language'     kiel-i-ä
short /i/ changes to /e/                                          lasi-      'glass'        lase-j-a
/ä/ is dropped                                                    päivä-     'day'          päiv-i-ä
in two-syllable words, /a/ changes to /o/ if the first vowel
  is /a/, /e/, or /i/, but is dropped if the first vowel          kirja-     'book'         kirjo-j-a
  is /u/ or /o/                                                   kohta-     'point'        koht-i-a


accusative–partitive alternation in Finnish, by providing examples using the ambivalent verb ampua (‘to shoot’). In addition to ambivalent verbs as illustrated in Table 3, there are verbs that always assign aspectual partitive case (Kiparsky 2005). As can be inferred from these rules and exceptions, the use of the partitive and accusative case is highly complex and difficult in Finnish. We may conclude that the difficulty lies in a lack of consistency and/or frequency of relevant instances to entrench the instances. Most errors related to partitive plural were form errors. These findings can be explained by taking into account the stem changes that are caused by the addition of the plural marker /i/ (Karlsson 1999). The plural marker is realized either as /i/ or as /j/ depending on the morphophonological environment (Karttunen 2006). The plural marker causes eight main changes within a noun’s stem vowels (Karlsson 1999), as illustrated in Table 4. It may very well be the case that the high percentage of form errors in the partitive plural was caused by the large number of stem alternations that had to be learned and applied.



In brief, we can conclude that the accuracy rate settled rather rapidly. The greatest amount of variability was found in the earliest texts (see Figure 1); the bandwidth then narrowed and settled after the 28th text. These outcomes are in line with our expectation that the highest degree of variability occurs at the early stages, when the morphological system still needs to be discovered, and is followed by stabilization in the more advanced texts. We were surprised, however, by the relatively quick acquisition of most cases. There were only three cases that remained troublesome: the use of the partitive and accusative singular, both of which involve complex semantic rules, and the formation of the partitive plural, which concerns complex morphological rules. Because of these complex rules, the linguistic structures of the partitive singular, the accusative singular, and the partitive plural are likely to coincide with a lack of rule transparency. Looking at the developmental patterns of word complexity, NP complexity and sentence complexity, we can conclude that for each complexity level, a nice distribution of the different complexity types emerged in the more advanced texts. Towards the end, each complexity level appeared to remain within a steady bandwidth, indicating that the system has found an attractor state and stabilized (cf. De Bot et al. 2007). However, even then some variability remained, which is in line with the assumption that variability is an inherent property of the self-organizing system (Thelen and Smith 1994; Larsen-Freeman and Cameron 2008). For the development of word complexity, we can see a nice distribution of the different subtypes in the more advanced texts. There is an overall competition between the use of one and two morpheme words, which may be explained by the fact that they are both very frequent and they naturally alternate. In contrast with the rather stable patterns in the more advanced texts, the earlier texts clearly showed periods of increased variability and many peaks, with progress as well as regression and backsliding. Even though single and two morpheme words remain very frequent, the distribution at the end can still be considered target-like, because in Finnish, only a few words within a sentence can consist of three or more morphemes. (In 10 native-speaker texts, the average number of single-morpheme words was 37.80, and the average numbers of words consisting of two morphemes, three morphemes, and more than three morphemes were 40.20, 22.50, and 2.90, respectively. The samples were selected from the academic component of the Native Finnish Corpus, which is a subset of the Corpus of Translated Finnish (Mauranen 2000).) This principle, illustrated in Table 5, is due to the fact that function words like conjunctions and certain adverbs consist of one morpheme, that adjectives cannot take possessive suffixes, and that verbs usually consist of two morphemes. Relatively strong positive correlations were found between the word complexity ratio and the NP complexity ratio, and between the word complexity ratio and the sentence complexity ratio. These findings seem to indicate a supportive relation between both word complexity and NP complexity,


Table 5: Morpheme analysis of a sentence taken from one of the advanced texts

On arvioitu, että Suomen saamelaisista yli kolmannes osa asuu saamelaisalueen ulkopuolella.
'It has been estimated that more than a third part of the Sámi living in Finland lives outside the borders of the Sápmi area.'

on                    be-PRES-3Sg              2 morphemes
arvioi-tu             estimate-Past participle 2 morphemes
että                  that                     1 morpheme
Suome-n               Finland-GenSg            2 morphemes
saamelais-i-sta       Sámi-plural-Elat         3 morphemes
yli                   more than                1 morpheme
kolmanne-s            three-Ordinal            2 morphemes
osa                   part-NomSg               1 morpheme
asu-u                 live-PRES-3Sg            2 morphemes
saamelais-aluee-n     Sápmi area-GenSg         3 morphemes
ulko-puole-lla        outside-AdessSg          3 morphemes

and word complexity and sentence complexity. In other words, these complexity measures are connected growers. The findings are compatible with the assumption that the growth processes of word complexity and NP and sentence complexity are interrelated to some extent, because every word corresponds to at least one morpheme, implying that word complexity increases if NP or sentence complexity increases (because the number of words increases). Although we expected the variables to be interrelated, we did not expect them to support each other to such a high extent because they do reflect complexity at different levels. Because the word complexity ratio and the NP complexity ratio, and the word complexity ratio and the sentence complexity ratio were found to be connected growers, the development of word complexity in Finnish might even be taken as an indicator of overall complexity up to at least an intermediate level of proficiency. However, it must be taken into account that there is a very complex interaction between the different complexity levels: a supportive relation was found between word complexity and NP complexity and between word complexity and sentence complexity, but at the same time, a competitive relation was found between NP complexity and sentence complexity. The competitive relation between the NP complexity ratio and the sentence complexity ratio indicates that the development of one complexity level goes at the expense of the other. A similar relationship was found in the advanced ESL writer in (Verspoor et al. 2008: 226) where at one point in the process the NP length increases while W/FV decreases. The question is whether this is a natural result of embedding strategies, both used to explain more complex and extensive ideas. More specifically, it can be




assumed that embedding is expressed by nominalization (increase in NP complexity) or by the formation of dependent clauses (increase in sentence complexity). It is likely that when nominalization is used to embed, there is no need to use dependent clauses, and vice versa. If so, the competition between NP and sentence complexity could be considered merely a consequence of different strategies in embedding. Although this is a plausible explanation, it does not explain the entire relation between NP complexity and sentence complexity. To understand this particular competitive relation (whereas all other complexity levels appeared to be connected growers), we studied the correlation between NP complexity and sentence complexity in more detail. Figure 5 provides a representation of the correlation between NP and sentence complexity as it changes over time. From this graph, we can see that the competitive relation between these complexity levels appeared from the eighth text onward. The strong positive correlation in the earliest texts can be explained by taking into account that both NPs and sentences were very short and simple in the beginning. From the eighth text onward, both NP complexity and sentence complexity started to grow and a competitive relationship between the different complexity levels emerged. This competitive relationship disappeared after the 45th text, that is, from the 46th text onward. Because the competition between NP and sentence complexity disappears in the most advanced texts, we may assume that the earlier competition was not due to a natural, inverse distribution. This finding supports the observation by Norris and Ortega (forthcoming: 17) that complexification does not occur only at the subordination level but also at the NP level; therefore, complexity measures should include both, especially at the more advanced levels.


Figure 5: Moving Window of correlation between Noun Phrase and sentence complexity. (Window size of five data points)
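Figure 5, like Figure 4, is a moving window of correlations: the correlation is simply recomputed inside every five-text window. The sketch below illustrates this, reusing the pearson_r helper from the Monte Carlo sketch above; detrending is shown as removal of a least-squares linear trend, which is one way to read the description in the Method section (all names are ours):

    # Remove a least-squares linear trend from a series (one reading of 'detrending').
    def detrend(series):
        n = len(series)
        xs = list(range(n))
        mean_x, mean_y = sum(xs) / n, sum(series) / n
        slope = (sum(x * y for x, y in zip(xs, series)) - n * mean_x * mean_y) / (
            sum(x * x for x in xs) - n * mean_x ** 2)
        intercept = mean_y - slope * mean_x
        return [y - (slope * x + intercept) for x, y in zip(xs, series)]

    # Correlation within each moving five-text window (pearson_r as defined above).
    def moving_correlation(x, y, window=5):
        return [pearson_r(x[i:i + window], y[i:i + window])
                for i in range(len(x) - window + 1)]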


It is thus strong evidence that syntactic complexity should not be considered as one single construct, but as multi-dimensional. Within one construct the different sub-systems can compete.

Finally, the relationship between accuracy and complexity was examined. The moving window of correlations shows a rather up-and-down relationship. At the very early stages (first five texts), the data show a negative correlation, which means that the two constructs seem to compete for attentional resources. Then comes a slightly longer period (texts 5 through 15) where the correlation is positive, meaning the learner is able to pay attention to both constructs at the same time. After that, there is a period (texts 16 through 23) where the two constructs compete again. After text 23, the constructs do not seem to compete anymore, even though there are two outliers. The Monte Carlo simulation suggests that this pattern is rather random (p-value of 0.5704 with 10,000 simulations). Considering the fact that both the subject (same learner) and task conditions (the learner was asked to write an essay in academic style without time pressure) remained constant over time, the variable pattern of interaction between accuracy and complexity cannot be attributed to task condition or individual differences. Therefore, we assume that the interaction is mainly due to the proficiency level of the writer, which increased over time. This finding suggests that the level of proficiency is a major variable in the complexity vs. accuracy debate, but that especially at the early stages the interaction may be quite random.

To summarize, the development of certain complexity types was characterized by developmental jumps, both significant and insignificant ones. In addition, competitive relationships were found between certain complexity types, sometimes disappearing in the more advanced texts. From these findings, we can conclude that intra-individual variability behaves according to the principles of Dynamic Systems. This is in line with the outcomes of the more advanced learner in Verspoor et al. (2008). In the current study, furthermore, many developmental patterns showed 'classic' examples of a step-wise developmental process because these patterns showed a period of increased variability in the vicinity of a developmental jump and as a transition phase between two stages. These outcomes affirm once again the assumption that a relatively more unstable period can be seen as a sign that the system is changing. Finally, this study shows that the interaction between accuracy and complexity measures changes over time and that within this study the interaction seemed rather random, especially at the early stages.

CONCLUSION

If we go back to the suggestions and observations that Norris and Ortega made at the 2008 AAAL convention, we may indeed conclude that complexity and accuracy measures do not remain constant over time and are not collinear. Complexity and accuracy are indeed multivariate and dynamic and should be studied across the full developmental trajectory. This study has shown that the variability in competition found in studies on information processing theory (Skehan 2003; Robinson 2005; DeKeyser 2007) should perhaps not be attributed solely to differences among learners, but also to individual developmental trajectories. This study has also shown that it is interesting to look at complexity not as a single construct but as a complex one. From this longitudinal case study, we may conclude that the developmental patterns of case accuracy and several complexity levels are characterized by peaks and regressions, progress and backsliding, and by complex interactions between variables. In other words, L2 development is clearly non-linear. Accuracy rates improved surprisingly fast and problems remained with only a few notoriously difficult cases, supporting the view that complex and opaque rules are the most difficult to acquire. Complexity measures all increased over time and three of them seemed to support each other: word complexity correlated with both NP complexity and sentence complexity. However, NP complexity and sentence complexity competed strongly with each other during the acquisitional path. Because these measures no longer competed at the end, we concluded that the two types of embedding competed during part of the acquisition process. In addition, a few 'classic' examples of a step-wise developmental process were found, with periods of variability in the vicinity of a developmental jump and with periods of increased variability as transition phases between two stages. From these outcomes, it can be concluded that intra-individual variability behaves according to the principles of Dynamic Systems. To conclude, analyzing different degrees of intra-individual variability in different sub-systems of accuracy and complexity has again been found to provide important insights into the process of Second Language Development.




SUPPLEMENTARY DATA

Supplementary material is available at Applied Linguistics online.

REFERENCES Bassano, D. and P. van Geert. 2007. ‘Modeling continuity and discontinuity in utterance length: a quantitative approach to changes, transitions and intra-individual variability in early grammatical development,’ Developmental Science 10: 588–612. Bertenthal, B. I. 1999. ‘Variation and selection in the development of perception and action’ in Savelsbergh G., H. van der Maas, and P. van Geert (eds): Non-linear Developmental Processes. Koninklijke Nederlandse Academie van Wetenschappen, pp. 105–21. Bybee, J. 2008. ‘Usage-based grammar and second language acquisition’ in Robinson P. and N. C. Ellis (eds): Handbook of Cognitive

Linguistics and Second Language Acquisition. Routledge, pp. 216–36. Cancino, H., E. Rosansky, and J. Schumann. 1978. ‘The acquisition of English negatives and interrogatives by native Spanish speakers’ in Hatch E. M. (ed.): Second Language Acquisition: A Book of Readings. Newbury House, pp. 207–30. De Bot, K., W. Lowie, and M. Verspoor. 2005. Second Language Acquisition: An Advanced Resource Book. Routledge. De Bot, K., W. Lowie, and M. Verspoor. 2007. ‘A Dynamic Systems Theory approach to second language acquisition,’ Bilingualism, Language and Cognition 10: 7–21.




Kiparsky, P. 2005. ‘Absolutely a matter of degree: The semantics of structural case in Finnish,’ handout of talk presented at CLS 41. Larsen-Freeman, D. 1997. ‘Chaos/complexity science and second language acquisition,’ Applied Linguistics 18: 141–65. Larsen-Freeman, D. 2006. ‘The emergence of complexity, fluency, and accuracy in the oral and written production of five Chinese learners of English,’ Applied Linguistics 27: 590–619. Larsen-Freeman, D. and L. Cameron. 2008. ‘Research methodology on language development from a complex theory perspective,’ Modern Language Journal 92: 200–13. Larsen-Freeman, D. and M. H. Long. 1991. An Introduction to Second Language Acquisition Research. Longman. Leino, P. 1997. Suomen Kielioppi. Kustannusosakeyhtio¨ Otava. Verspoor, M., W. Lowie, and K. de Bot. in preparation. A Dynamic Approach to Second Language Development: Methods and Techniques. Benjamins. MacWhinney, B. and C. E. Snow. 1990. ‘The child language data exchange system: An update,’ Journal of Child Language 17: 457–72. Martin, M. 1995. ‘The Map and the Rope. Finnish Nominal Inflection as a Learning Target. Studia Philologica Jyva¨skyla¨ensia 38. University of Jyva¨skyla¨. Mauranen, A. 2000. ‘Strange strings in translated language: A study on corpora’ in Olohan M. (ed.): Intercultural Faultlines. Research Models in Translation Studies 1: Textual and Cognitive Aspects. St. Jerome Publishing, pp. 119–41. Moscoso del Prado Martı´n, F., R. Bertram, T. Ha¨ikio¨, R. Schreuder, and R. H. Baayen. 2004. ‘Morphological family size in a morphologically rich language: The case of Finnish compared with Dutch and Hebrew,’ Journal of Experimental Psychology: Learning, Memory, and Cognition 30: 1271–8. Nikanne, U. 1993. ‘On assigning semantic cases in Finnish’ in Holmberg A. and U. Nikanne (eds): Case and Other Functional Categories in Finnish Syntax, Studies in Generative Grammar. Mouton de Gruyter. Norris, J. M. and L. Ortega. 2008. Measurement for understanding: The case of CAF’. Paper


DeKeyser, R. 2007. ‘Skill acquisition theory’ in Van Patten B and J. Williams (eds): Theories in Second Language Acquisition: An Introduction. Lawrence Erlbaum, pp. 97–112. DeKeyser, R.M. 2005. ‘What makes learning Second-language grammar difficult? A review of issues,’ Language Learning 55: 1–25. Denison, N. 1957. The Partitive in Finnish. AnnalesAcademiae Scientiarum Fennicae B 108, 262. Suomalainen Tiedeakatemia. Ellis, N. C. 1998. ‘Emergentism, connectionism and language learning,’ Language Learning 48: 631–64. Ellis, N. C. 2006a. ‘Cognitive perspectives on SLA: The associative cognitive CREED,’ AILA Review 19: 100–21. Ellis, N. C. 2006b. ‘Language acquisition as rational contingency learning,’ Applied Linguistics 27: 1–24. Ellis, N. C. and D. Larsen-Freeman. 2006. ‘Language emergence: Implications for applied linguistics. Introduction to the special issue,’ Applied Linguistics 27: 558–59. Ellis, R. 1994. The Study of Second Language Acquisition. Oxford University Press. Helasvuo, M.-L. 2008. ‘Aspects of the structure of Finnish’ in Klippi A. and K. Launonen (eds): Research in Logopedics, Speech and Language Therapy in Finland. Multilingual Matters, pp. 9–18. Holmberg, A. and U. Nikanne. 1993. Case and other functional categories in Finnish syntax: Introduction Studies in Generative Grammar. Mouton de Gruyter. Hood, G. 2004. Poptools [Computer software], Pest Animal Control Co-operative Research Center (CSIRO). Hopper, P. 1998. ‘Emergent grammar’ in Tomasello M. (ed.): The New Psychology of Language. Erlbaum, pp. 155–75. Karlsson, F. 1999. Finnish, An Essential Grammar. Routledge. Karttunen, L. 2006. ‘Numbers and Finnish numerals,’ A Man of measure: Festschrift in honour of Fred Karlsson on his 60th birthday a special supplement to SKY Journal of Linguistics 19: 407–21. Kiparsky, P. 1998. ‘Partitive case and aspect’ in Butt M. and W. Geuder (eds): The Projection of Arguments: Lexical and Compositional Factors. CSLI Publications, pp. 265–308.


accuracy and complexity in task-based learning,’ Language Teaching Research 1: 185–211. Skehan, P. and P. Foster. 2001. ‘Cognition and tasks’ in Robinson P. (ed.): Cognition and Second Language Instruction. Cambridge University Press, pp. 183–205. Tarone, E. 1988. Variation in Interlanguage. Edward Arnold. Thelen, E. and L. B. Smith. 1994. A Dynamic Systems Approach to the Development of Cognition and Action. MIT Press. Tomasello, M. 2003. Constructing a Language: A Usage-Based Theory of Language Acquisition. Harvard University Press. Van Dijk, M. 2004. Child Language Cuts Capers: Variability and Ambiguity in Early Child Development. University of Groningen. Van Dijk, M. and P. van Geert. 2007. ‘Wobbles, humps and sudden jumps: A case study of continuity, discontinuity and variability in early language development,’ Infant and Child Development 16: 7–33. Van Geert, P. 1994. Dynamic Systems of Development: Change between Complexity and Chaos. Harvester. Van Geert, P. 2008. ‘The dynamic systems approach in the study of L1 and L2 acquisition: An introduction,’ Modern Language Journal 92: 179–99. Van Geert, P. and M. van Dijk. 2002. ‘Focus on variability: New tools to study intra-individual variability in developmental data,’ Infant Behavior and Development 25: 340–75. Van Geert, P., G. Savelsbergh, and H. van der Maas. 1999. Non-linear Developmental Processes. Royal Netherlands Academy of Arts and Sciences. Verspoor, M. and K. Sauter. 2000. English Sentence Analysis: An Introductory Course. John Benjamins. Verspoor, M., W. Lowie, and M. van Dijk. 2008. ‘Variability in L2 development from a dynamic systems perspective,’ Modern Language Journal 92: 214–31.


presented at the invited colloquium ‘‘Fluency, accuracy and complexity in second language acquisition: Theoretical and methodological perspectives,’’ convened by Alex Housen and Folkert Kuiken. 30th American Association for Applied Linguistics Annual Conference, Washington, DC, March 29–April 1. Norris, J. M. and L. Ortega. 2009. ‘Towards an organic approach to investigating CAF in instructed SLA: The case of complexity,’ Applied Linguistics 30: 555–78. [Special Issue on Complexity, accuracy, and fluency in second language acquisition: Theoretical and methodological perspectives, co-edited by A. Housen & F. Kuiken.]. O’Grady, W. 2008a. ‘Language without grammar’ in Robinson P. and N.C. Ellis (eds): Handbook of Cognitive Linguistics and Second Language Acquisition. Routledge, pp. 139–67. O’Grady, W. 2008b. ‘The emergentist program,’ Lingua 118: 447–64. Ortega, L. and H. Byrnes. 2008. ‘Theorizing advancedness, setting up the longitudinal research agenda’ in Ortega L. and H. Byrnes (eds): The Longitudinal Study of Advanced L2 Capacities. Routledge, pp. 281–300. Ringbom, H. 1987. The role of the First Language in Foreign Languge Learning. Mulitlingual Matters. Robinson, P. 2005. ‘Aptitude and second language acquisition,’ Annual Review of Applied Linguistics 25: 46–73. Schot-Saikku, P. 1990. Der Partitiv und die Kasusalternation: Zum Fall Partitiv in der Finnischen Syntax. Buske. Skehan, P. 1996. ‘Second language acquisition research and task-based instruction’ in Willis J. and D. Willis (eds): Challenge and Change in Language Teaching. Heinemann, pp. 17–30. Skehan, P. 2003. ‘Task based instruction,’ Language Teaching 36: 1–14. Skehan, P. and P. Foster. 1997. ‘The influence of planning and post-task activities on


Applied Linguistics: 31/4: 554–577 © Oxford University Press 2010 doi:10.1093/applin/amq007 Advance Access published on 3 March 2010

Investigating L2 Performance in Text Chat

1SHANNON SAURO and 2BRYAN SMITH

1University of Texas at San Antonio and 2Arizona State University
E-mail: [email protected]; [email protected]

INTRODUCTION

Computer-mediated environments represent a growing context for L2 learners to study and use the target language (Chapelle 2008). This can be seen, for example, in the use of computer-mediated communication (CMC) among students in the same classroom (e.g. Abrams 2003; Shekary and Tahririan 2006) or as part of larger collaborative class projects that link learners with target language speakers in other cities and countries (e.g. Belz 2002; Lee 2004). Such technology-enhanced L2 environments call for research on second language learning processes and outcomes that arise during or are influenced by CMC (Skehan 2003). A small but growing body of research has begun to examine claims and constructs from SLA during written synchronous computer-mediated communication (SCMC), text chat in particular. These include, for example, uptake following negotiated interaction (Smith 2005), the influence of task type on negotiation episodes during chat (Blake 2000; Pellettieri 2000), the amount and type of negotiation strategies that occur during chat (Kötter 2003), the effectiveness of computer-mediated corrective feedback on the development of L2 grammar (Loewen and Erlam 2006; Sachs and Suh 2007; Sauro 2009), the provision of and responses to linguistic affordances during NS/NNS telecollaboration (Darhower 2008), the use of dynamic assessment to observe learner development during chat (Oskoz 2005), and the nature of CMC self-initiated self-repair (Smith 2008). While findings have largely supported certain trends found in face-to-face interaction, more dynamic data capture technologies have recently been used


This study examines the linguistic complexity and lexical diversity of both overt and covert L2 output produced during synchronous written computer-mediated communication, also referred to as chat. Video enhanced chatscripts produced by university learners of German (N = 23) engaged in dyadic task-based chat interaction were coded and analyzed for syntactic complexity (ratio of clauses to c-units), productive use of grammatical gender, and lexical diversity (Index of Guiraud). Results show that chat output that exhibits evidence of online planning in the form of post-production monitoring displays significantly greater linguistic complexity and lexical diversity than chat output that does not exhibit similar evidence of online planning. These findings suggest that L2 learners do appear to use the increased online (i.e. moment-by-moment) planning time afforded by chat to engage in careful production and monitoring.


WRITTEN SYNCHRONOUS CMC Written synchronous CMC, often referred to as chat or text-chat, is typically characterized by multiple overlapping turns, an enduring as opposed to ephemeral trace, and greater lag time between turns than afforded by spoken interaction. Early work on small group interaction in L2 classrooms identified overlapping turns as a characteristic of chat that permitted greater participation for members in small group interaction than permitted during spoken small group interaction, in which only one person at a time can hold the floor (Kern 1995). That is to say, in chat, multiple interlocutors may be composing messages simultaneously and hitting the return key within


to identify features of learner strategies and L2 performance that may be a function of the chat environment. This can be seen in Smith’s (2008) use of screen capture software to record all mouse movement, typing, and deletion to examine self-initiated self-repair (SISR) strategies during chat. By examining videos of the participants’ computer screens, Smith was able to document all instances of self-repair, including CMCovert self-repair, defined as self-repair attempts that were deleted during the editing and rewording of chat turns prior to transmission. Smith’s findings indicated that accounting for both overt SISR (that which is ‘sent’ to the interlocutor and appears on traditional chat logs) and CMCovert SISR (that which is not sent and which does not appear on traditional chat logs) revealed far more self-repair during chat than had been previously assumed. Furthermore, examination of the video revealed that a substantial portion of L2 output was deleted during the composition of each message and did not appear in the final transmitted turn. Smith’s findings raised questions concerning the deleted portion of the L2 output; in particular, what was being deleted (e.g. simple spelling errors, developmentally more advanced but less automatic IL forms), and how the deleted text compared linguistically to the text that was eventually transmitted. In other words, did the deleted text show evidence of hypothesis testing and risk-taking through the use of more varied lexical items and more complex morphosyntax relative to that found in the text that was eventually transmitted? As has been demonstrated by research on planning time and L2 performance, increased online planning time has been found to benefit the complexity and accuracy of L2 performance (Yuan and Ellis 2003; Ellis and Yuan 2004). Synchronous text chat has been argued to afford learners more processing time (Pellettieri 1999; Shehadeh 2001; Smith 2004) and, by extension, increased online planning time during these ‘conversations in slow motion’ (Beauvois 1992). Certainly, learners do engage in significant monitoring of target language output during text chat (Smith 2008). Accordingly, this study uses screen capture video records of learner interaction in order to more closely explore the relationship between planning time and L2 performance in SLA—specifically, the linguistic complexity and lexical diversity of L2 output in a chat environment.
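For orientation, the two performance measures examined in this study can be expressed as simple ratios; a hedged sketch with our own names (Guiraud's index divides the number of word types by the square root of the number of tokens, which corrects the raw type–token ratio for text length):

    import math

    # Syntactic complexity: ratio of clauses to c-units (communication units).
    def clauses_per_c_unit(n_clauses, n_c_units):
        return n_clauses / n_c_units

    # Lexical diversity: Index of Guiraud = types / sqrt(tokens).
    def guiraud_index(tokens):
        return len(set(tokens)) / math.sqrt(len(tokens))

    sample = "der mann gibt dem mann ein portemonnaie".split()
    print(round(guiraud_index(sample), 2))   # 6 types / sqrt(7 tokens) = 2.27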


Turn #   Turns with Time Stamps                                                        Lag Time
3        CW: Der Mann mit dem weiss hembt gibt der andere Mann ein Portmonet 11:58:31    —
         CW: The Man with the white shirt gives the other man a wallet
4        CW: Eine Frau sagt etwas 11:59:01                                              0:30
         CW: A woman says something
5        CW: Der anderer Mann sagt etwas und gibt der Portmonet zum Mann mit
         weiss Hembt 11:59:48                                                           0:47
         CW: The other man says something and gives the wallet to the man with white shirt
6        MT: In der ersten bild ein mann offnen sein Portemonnai 12:00:30               0:42
         MT: In the first picture a man opens his wallet
7        CW: Der Mann mit weiss Hemd fragt die Frau 12:00:59                            0:29
         CW: The man with white shirt asks the woman
8        CW: dass sind meine 4 Bilder 12:01:10                                          0:11
         CW: those are my 4 pictures
9        MT: ok 12:01:18                                                                0:08
         MT: ok

Figure 1: Lag time between turns in NNS/NNS chat
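The lag times in Figure 1 follow directly from the turn time stamps; a minimal sketch (time stamps as hh:mm:ss strings; the names are ours):

    from datetime import datetime

    # Lag time between consecutive chat turns, from hh:mm:ss time stamps.
    def lag_times(timestamps):
        times = [datetime.strptime(t, "%H:%M:%S") for t in timestamps]
        return [str(later - earlier) for earlier, later in zip(times, times[1:])]

    stamps = ["11:58:31", "11:59:01", "11:59:48", "12:00:30",
              "12:00:59", "12:01:10", "12:01:18"]
    print(lag_times(stamps))
    # ['0:00:30', '0:00:47', '0:00:42', '0:00:29', '0:00:11', '0:00:08']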


moments of each other without having to wait for a break in conversation to reply to prior comments or introduce new content. This overlapping of turn-taking in chat is supported by the enduring nature of the interaction, which, unlike spoken conversation, is captured and preserved in the chat window on the computer screen. Thus, in contrast to the fleeting nature of spoken interaction, the chat window provides interlocutors with a more enduring and reviewable visual record of the exchange. According to Smith (2005), this accessible record affords L2 learners the opportunity to reread or scroll back to prior turns in a manner that may mirror the benefits of repetition and redundancy. Although unfolding in real time, chat interaction supports longer pauses between turns than commonly found in spoken interaction. Figure 1 illustrates the length of pauses between turns that occurred during the NNS/NNS chat examined in this study. In this particular exchange, wait time between turns ranged from 8 to 47 s. Protracted wait time between turns in chat results in part from typing, which is intrinsically slower than speaking, as well as from a delay in message transmission inherent in most pedagogical (e.g. Blackboard Chat) and commercial (e.g. Skype, Yahoo! Instant Messenger) chat clients. Whereas in spoken conversation interlocutors can listen to their partners' utterances as they unfold, interlocutors in most chat programs are not privy to unfolding utterances but must wait until the return key is pressed and the completed and edited message is transmitted in full.


Turn #   Transmitted and Deleted Output with Time Stamps                                 Lag Time
3        CW: Der Mann mit dem weiss hembt gibt der andere Mann ein Poerrtmonet 11:58:31     —
4        CW: Eine Frau sagt etwas 11:59:01                                                0:30
5        CW: Der andere maner Mann sagt etwas und gibt der Portmonet zum Mann mit
         He h weiss Hembt 11:59:48                                                        0:47
6        MT: In der ersten bild ein mann offnen sein Portemonnai 12:00:30                 0:42
7        CW: Der mMann mit weiss Hembd [6] sagt etwas Der Mann mit weidss Hemd fragt
         die Frau 12:00:59                                                                0:29
8        CW: dallss sind meine 4 Bilder 12:01:10                                          0:11
9        MT: ok 12:01:18                                                                  0:08

Figure 2: Overt and covert output during chat

Taken together, the speed of typing relative to speaking as well as the software-induced lag time between turns mean that interlocutors in chat have more time both to process incoming messages and to produce and monitor their output. Indeed, Smith's (2008) screen capture of self-repair provided evidence that L2 learners often do, in fact, monitor and edit their output during this period of lag time. Examples of this output monitoring can be seen in Figure 2, which depicts the coded video-enhanced chatscript of the exchange shown initially in Figure 1. In this coding system (see online supplementary material for Appendix 1), a strikethrough indicates text that was typed and then deleted before the message was sent (e.g. turn 5), while underlined text (e.g. turn 7) signifies text with embedded deletions, that is, a larger string of text which is deleted and which contains subcomponents that were edited prior to the full deletion. While certain turns show no evidence of message editing and monitoring (e.g. turn 4), others show evidence of deletion of words or word parts (e.g. turns 5 and 8), and still others contain deletion of full phrases or sentences (e.g. turn 7). As Figure 2 illustrates, whether at the word or sentence level, these deleted segments comprise a portion of the total learner output during chat. Thus L2 chat interaction includes the production of both overt and covert output, with the former comprising the output that is transmitted to the interlocutor and the latter encompassing the deleted elements. The significance of the production of overt and covert output during chat for L2 development can be understood in terms of planning time and L2 performance.
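As a rough illustration of the overt/covert distinction (this is not the authors' coding procedure; we assume a simplified keystroke log in which every event is either a typed character or a backspace):

    # Replay a simplified keystroke log: characters that survive until transmission
    # are overt output; characters typed but later erased are covert output.
    def replay(events):
        kept, erased = [], []
        for event in events:
            if event == "<BS>":        # a backspace removes the last pending character
                if kept:
                    erased.append(kept.pop())
            else:
                kept.append(event)
        return "".join(kept), "".join(reversed(erased))

    keystrokes = list("dalls") + ["<BS>", "<BS>", "<BS>"] + list("ss sind meine 4 Bilder")
    overt, covert = replay(keystrokes)
    print(overt)    # dass sind meine 4 Bilder
    print(covert)   # lls  (the erased string, in typing order)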

PLANNING AND L2 PERFORMANCE

Research on planning time and L2 performance has focused in particular on two overarching types of planning which are differentiated based on when the





planning occurs: pre-task planning, and online planning. As the name suggests, pre-task planning is either rehearsal (i.e. a practice run-through of the task) or strategic planning (i.e. deliberation of content and code) which occurs during a preparation period prior to the performance of a language production task (Ortega 1999; Ellis 2005). In contrast, online planning refers to ‘the moment-by-moment planning during the task performance’ (Yuan and Ellis 2003: 4). Both types of planning have been hypothesized to contribute to L2 performance by freeing up attentional resources, thereby enabling L2 learners to attend to linguistic form. Pre-task planning, it has been suggested, affords learners the opportunity to consider message content in advance (Ellis 2005) while online planning enables learners to attend carefully to message formulation and to engage in pre- and post-production monitoring (Yuan and Ellis 2003). Research on planning time has primarily investigated the impact of pre-task and online planning time on three aspects of L2 performance, hypothesized to map on to phases of the learning process (Skehan 2003). The first of these, complexity, which describes the use of more advanced or diverse target language features, corresponds to the growth and subsequent restructuring of the learner’s interlanguage system (Skehan and Foster 1999). Accuracy, or the avoidance of error during production, has been hypothesized to reflect the increase of control over newly acquired language features while fluency, defined as real-time rapid language production, reflects more advanced or native-like control of target language structures (Skehan and Foster 1999). In studies of planning time, a variety of measures have been used to evaluate these three aspects of L2 performance. Typical indicators of complexity include measures of syntactic complexity (e.g. ratio of clauses to T-units) (Ellis and Yuan 2004, 2005; Kawauchi 2005), syntactic variety (e.g. range of verb forms used) (Ellis and Yuan 2005), lexical diversity (e.g. modified type-token ratios) (Daller et al. 2003), and length of c-units (Elder and Iwashita 2005). Indicators of accuracy include percentage of error-free clauses (Skehan and Foster 2005), error rates (e.g. per 100 words, per T-unit) (Sangarun 2005), and target-like use analysis of select morphology (e.g. article usage, verbal morphology) (Pica 1983). Indices of fluency include speech rate (e.g. syllables per minute) (Ellis and Yuan 2005), mean length of run (Towell, et al. 1996; Tavakoli and Skehan 2005), and measurements of pausing (e.g. number of pauses, total pausing time) (Kormos and De´nes 2004). Studies have demonstrated a benefit for pre-task planning on these three aspects of L2 performance. In particular pre-task planning has been found to consistently benefit fluency (Foster and Skehan 1996; Mehnert 1998; Kawauchi 2005) and syntactic complexity (Crookes 1989; Mehnert 1998; Ortega 1999; Yuan and Ellis 2003; Kawauchi 2005) during oral production. Though limited, research on planning time and written production (Ellis and Yuan 2004) has also found an advantage for pre-task planning on fluency in writing and complexity as measured by syntactic variety (the number of verb forms used). However, findings regarding the effect of pre-task planning on


ONLINE PLANNING AND CHAT As Skehan and Foster (2005) have argued, ‘[p]lanning is an unobservable activity’ (p. 197) and claims regarding the influence of planning time on L2 performance rely on the assumption that learners are in fact making use of allotted time to actually plan or monitor language production. Studies of pre-task planning have looked to notes taken during the pre-task planning phase or to post-task stimulated recall (Ortega 1999) to support this assumption. However, in the case of online planning, operationalized as additional or unlimited time for language task performance, no similar product or evidence is generated to support the assumption that learners are in fact making use of additional online time to plan and monitor their utterances (Skehan and Foster 2005). It is here that the case of covert output generated by learners during chat may provide evidence of online planning in the form of post-production monitoring (Yuan and Ellis 2003). By post-production monitoring, we simply mean learners’ self-correction of their own output that occurs after a message is produced. In the current study such post-production monitoring is apparent in the form of (covert) self-repair. If, as suggested by prior research on online planning during speech and writing, online planning time also benefits L2 production during chat, then comparison of covert and overt output may reveal differences in L2 performance. In particular, if learners do in fact engage in more careful production1 and monitoring during synchronous chat, we may expect a qualitative difference in the use of developmentally more advanced or varied TL features in the target language produced prior to, during and following covert output.

RESEARCH QUESTIONS

The present study, therefore, uses a portion of the data from Smith's (2008) study and delves more deeply into describing the L2 performance in chat by


accuracy are less consistent and suggest that, other factors such as the type and complexity of the language production task (Skehan and Foster 1997, 1999), the target language features used to measure accuracy (Ellis 2005), the competing processing demands of fluency and complexity (Foster and Skehan 1996), as well as the proficiency of the learner (Kawauchi 2005) may mitigate the advantage of pre-task planning on accuracy. In contrast, studies of online planning in both oral and written production, though still quite limited, have found a benefit for accuracy and complexity but not for fluency. Relative to non-planning conditions, these findings also point to the following trends: (i) similar advantages for both pre-task and online planning on complexity over non-planning conditions, (ii) an advantage for online planning over pre-task planning for accuracy, and (iii) an advantage for pre-task planning over online planning for fluency.

560 INVESTIGATING L2 PERFORMANCE IN TEXT CHAT

examining whether there is a difference in the linguistic complexity and variety of covert and overt output produced by learners. The following research questions were posed:

1. Is there a difference in the linguistic complexity of chat output that shows evidence of on-line planning in the form of post-production monitoring versus chat output that does not?

2. Is there a difference in the lexical diversity of chat output that shows evidence of on-line planning in the form of post-production monitoring versus chat output that does not?

METHODOLOGY

Participants and tasks

For this analysis, one task was chosen from the larger study reported in Smith (2008). Data from this task consisted of 23 usable chat and Camtasia records of beginner-high level learners of German-as-a-foreign-language. Twelve dyads completed the task; however, one participant experienced a technical problem and did not initiate the screen capture software correctly, resulting in the odd number of usable records. These students participated in this study as part of their regularly scheduled German language course at a major southwestern university in the United States. In the larger study students were required to meet once every other week in the foreign language micro-computing lab over the course of the semester. All students were undergraduates and all were native speakers of English. None were German majors. Their proficiency level and placement in the German sequence were determined by an in-house online placement test. All participants were characterized by the instructor as roughly at the ACTFL Novice-High proficiency level and indicated verbally to the researcher that they were familiar with the chat function in Blackboard. However, participants did complete one 50-min training session prior to data collection to ensure they were familiar with the chat interface as well as the task type and procedures, since they were not necessarily accustomed to performing similar task-based CMC activities in their German class.

Materials

The current task was a sequential ordering task, a type of jigsaw. As with all of the tasks in the larger study, this task type was chosen because of its structural requirement of two-way information exchange by participants who are striving to reach a convergent goal (Pica et al. 1993). The current task provided each learner with a task sheet that contained a series of color stills from a two-minute dramatic video clip that corresponded to the week's assigned course content. Care was taken to select a series of stills,

which, when viewed without having seen the video clip, had no obvious sequential order. These same stills were easily sequenced upon viewing the full video clip, however. Learners were instructed to describe their series of pictures to their partner and collaboratively arrive at a proposed correct and logical ordering for the series of eight pictures. After the pairs had a solution, they were instructed to view the short video clip and then reconvene to evaluate their solution/sequence and make any changes before proposing a final sequence/solution. Given the length of the class, participants realistically had about 40 min to complete the task. All students worked collaboratively online with a partner. Each participant was given task sheet A or B. All of those holding task sheet A were grouped in one area of the computer lab while those holding task sheet B were grouped in another. This was done in order to reduce the chance that any participant would gain visual access to their interlocutor's (partner's) task sheet. Participants interacted with one another via the chat function in Blackboard and were assigned to one of various paired 'groups' under Blackboard's Communication Tool, Virtual Classroom.


DATA COLLECTION AND ANALYSIS

Capturing the interaction

The dynamic screen capture software Camtasia 3 recorded exactly what appeared on each participant's computer screen in real time. The Camtasia files were recorded to a networked drive and copied by one of the researchers for later analysis. The chat logs of these interactions were saved automatically in Blackboard.

Coding the interaction

Hard copies of the chat transcripts were converted to individual MS Word documents. One copy of each chat transcript was renamed in preparation for coding the interaction with the screen capture. These versions are referred to as the video-enhanced chatscripts. The video-enhanced chatscripts were coded by one of the researchers using the procedure outlined in Smith (2008), which used the coding conventions in Appendix 1 (see online supplementary material) to signify text that was typed and subsequently deleted before being sent, text that was inserted later in the composition phase of any given message, as well as to show the timing/location of each of these moves. In order to code the chat interaction in this way, each participant's video file (a screen capture of the chat interaction) was played back in its entirety, pausing the playback where necessary. The second researcher coded each of the chatscripts for c-units,2 clauses, and instances of productive use of grammatical gender. To establish inter-rater reliability,



the first researcher also coded two full chat transcripts for c-units, clauses, and instances of productive use of grammatical gender. The initial overall agreement statistic was .93 and was considered sufficiently high. In the few cases where there was initial disagreement the researchers discussed the item until 100% agreement was reached. Each segment of text from these coded chatscripts was then placed into one of five columns, representing the different sub-categories of output produced with and without evidence of post-production monitoring.

Operationalizations

Figure 3: Examples of text categories (excerpt of a coded dyadic exchange presented in two columns: Column A, the video-enhanced SCMC chatscript, in which deleted and inserted text and the timing of each move are marked; Column B, the hard copy of the transcript with an English translation)


Output generated without evidence of post-production monitoring included the following two categories: (i) pristine text, and (ii) pre-deleted text. Output that showed evidence of post-production monitoring included the following three categories: (iii) post-deleted text, (iv) deleted text and (v) post-deleted deleted text. Descriptions of each follow as does an example from the coded data in Figure 3. Pristine text was a complete stretch of text that contained no deletions of any kind. Such text was written from beginning to end without any self-corrections or alterations and sent to the interlocutor. Pristine text is not necessarily error-free, however. Pristine text can be considered overt in nature.


When considering this category of text we may say that learners have had the regular benefit of increased processing time afforded by the SCMC interface. Deleted text was that text which appears only on the video-enhanced chatscripts. That is to say, this category of text would not appear on traditional chat transcripts. We consider such deleted text to be covert in nature. This text was captured by examining the Camtasia video files of the chat interaction in a line by line fashion. Post-deleted text was that text learners typed immediately following any deletions/corrections participants made and as such can be considered overt in nature. We counted as post-deleted text that text which started immediately after a deletion/correction and ended with sentence-ending punctuation, or the sending of the message. Post-deleted deleted text was that post-deleted text that was subsequently deleted and as such can be considered covert in nature. A discussion of this category follows in the Data analysis section below. Pre-deleted text was developed simply to account for all of the text written by an individual. Since Pristine text required that there be absolutely no deletions or corrections in a given stretch of discourse, the pre-deleted text category was needed in order to code and account for that text which came immediately before a deletion or correction in the same turn. The pre-deleted text category was not used explicitly in the data analysis but is considered overt in nature. There were, then, actually two kinds of deleted text; deleted text and post-deleted deleted text. Deleted text of both sorts (deleted text and post-deleted deleted text) are argued here to constitute evidence that learners are indeed making use of the additional time afforded by SCMC for online planning in the form of post-production monitoring of their ‘utterances’. We may expect that output which comes immediately after such post-production monitoring will show the benefits of this planning time in the way of more sophisticated language production since there is arguably a heightened degree of attention to form immediately following the execution of deleted text. When one deletes text, one’s attention is drawn to the fact that this text was unacceptable in some way. Thus, it seems reasonable to assume that learners will focus on form (and arguably content) more in their writing immediately following self-repair.3 In summary, then, we argue that the methodology presented here, which captures text that is subsequently deleted gets us closer to accounting for the possible positive effects of online planning. Likewise, this deleted text allows us to distinguish between evidence of online planning and the possible effects of online planning as measured by post-deleted text production. As such the current (CMC) study accesses information about the nature of learner–learner interaction that is not readily available in comparable face-to-face studies.
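To make the five-way coding scheme concrete, the sketch below represents it as a small data structure. This is purely illustrative and uses names of our own choosing; the study itself coded the video-enhanced chatscripts manually rather than with software.

```python
from dataclasses import dataclass
from enum import Enum, auto

class TextCategory(Enum):
    """The five sub-categories of learner output described above."""
    PRISTINE = auto()               # overt: turn sent with no deletions of any kind
    PRE_DELETED = auto()            # overt: typed before a deletion in the same turn
    DELETED = auto()                # covert: typed but removed before the message was sent
    POST_DELETED = auto()           # overt: typed immediately after a deletion/correction
    POST_DELETED_DELETED = auto()   # covert: post-deleted text that was itself later deleted

# Categories treated as showing evidence of post-production monitoring
MONITORED = {TextCategory.DELETED, TextCategory.POST_DELETED,
             TextCategory.POST_DELETED_DELETED}

@dataclass
class Segment:
    text: str                # the stretch of learner output
    category: TextCategory   # assigned from the video-enhanced chatscript
```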


Data analysis

Because there were two categories of deleted text in the chat interaction data examined here, two separate comparisons of these data were required. That is, as mentioned above, occasionally text was at the same time 'post-deleted text' and 'deleted text'. Such a coding dilemma occurs when text which is 'post-deleted' in nature is subsequently deleted (post-deleted deleted text) before being sent. The problem was what to do with this post-deleted deleted text. Of course, such an 'overlap' or 'double counting' of the data would be problematic for any statistical analysis, so simultaneously counting post-deleted deleted text as both deleted text and post-deleted text in the same analysis was not an option. In order to address this problem, we decided to conduct two separate levels of analysis of the data. Pristine text was not affected by this problem since by definition there was never any overlap between pristine text and deleted or post-deleted text. We shall call the first level of analysis the deleted text focus (Analysis 1) and the second level of analysis the post-deleted text focus (Analysis 2).

Analysis 1

In the deleted text focus we coded and quantified all deleted text and assigned it to the deleted text category irrespective of whether this text could also be coded as post-deleted deleted text. Thus, in this analysis only that post-deleted text which was not then subsequently deleted remained in the post-deleted text category. In summary then, for this first level of analysis, all text that was typed and then subsequently deleted was assigned to the deleted text category.

Analysis 2

In the post-deleted text focus all text that was initially assigned to the post-deleted text category remained there irrespective of whether it was subsequently deleted (post-deleted deleted text). In this way we were able to avoid any overlaps in the statistical comparisons of the data since we conducted two independent and parallel statistical analyses. Figures 4 and 5 illustrate the analyses conducted.

Calculating measures of linguistic complexity and lexical diversity

We calculated the linguistic complexity and lexical diversity for each category of text (pristine text, deleted text, and post-deleted text) for each participant (n = 23) and for each focus (deleted text focus and post-deleted text focus). Thus, participants served as their own controls. We operationalized linguistic complexity in two ways. First, we calculated the syntactic complexity for each text category for each learner, and second, we calculated the complexity of form by productive use of grammatical gender.4 We arrive at the syntactic complexity of a text by dividing the number of clauses by the number of c-units. The measure of grammatical gender simply reflected the number of occurrences in the chatscript of grammatical gender in German. According to DeKeyser (2005), grammatical gender is among those elements of a second language that are notoriously hard to acquire for native speakers of L1s that do not have them or that use a very different system.5 These elements of grammar also seem to be strongly resistant to instructional treatments. In addition, due to the nature of the communication tasks in which no specific complex German form was deemed task essential (Loschky and Bley-Vroman 1993), it was determined that grammatical gender would be the target language feature most likely to occur in sufficient quantity for statistical analysis. For our measure of lexical diversity we employed the Index of Guiraud (or Guiraud value/coefficient), which is the number of lexical types, operationalized as unique content and function words, divided by the square root of tokens. This measure reduces the influence of the token length or the length of the text under consideration. The higher the score of the Guiraud value (G), the greater the variety of vocabulary included in a text. (See Daller et al. 2003 for a detailed discussion.)

Figure 4: Illustration of the relationship between deleted text, post-deleted text, and post-deleted deleted text (post-deleted deleted text, PDDT, is the area of overlap between deleted text, DT, and post-deleted text, PDT; Analysis 1, the deleted text focus, assigns it to DT, while Analysis 2, the post-deleted text focus, assigns it to PDT)

                             Pristine text   Deleted text   Post-deleted text
1. Deleted text focus        All PT          DT + PDDT      PDT - PDDT
2. Post-deleted text focus   All PT          DT - PDDT      PDT + PDDT

Note: PT = Pristine text; DT = Deleted text; PDT = Post-deleted text; and PDDT = Post-deleted deleted text.

Figure 5: Two separate analyses of DT focus and PDT focus data
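For readers who wish to reproduce the measures, the following minimal sketch shows how the two indices defined above can be computed. The function names and the toy example are ours, not part of the original study, and tokenization decisions (e.g. lowercasing) are left to the analyst.

```python
from math import sqrt

def syntactic_complexity(num_clauses: int, num_c_units: int) -> float:
    """Syntactic complexity: number of clauses divided by number of c-units."""
    return num_clauses / num_c_units if num_c_units else 0.0

def guiraud_index(tokens: list[str]) -> float:
    """Index of Guiraud (G): unique word types divided by the square root of tokens."""
    return len(set(tokens)) / sqrt(len(tokens)) if tokens else 0.0

# Toy example: 10 tokens, 9 types ('mann' repeats), so G = 9 / sqrt(10) ≈ 2.85
sample = "der mann mit dem blau hemd sieht den mann an".split()
print(round(guiraud_index(sample), 2))
```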

RESULTS

Analysis 1: Deleted text focus

For comparisons of the syntactic complexity, grammatical gender, and lexical diversity of pristine text, deleted text, and post-deleted text, a series of


pair-wise Wilcoxon Matched Pairs tests were conducted. In all cases alpha was set at .017 (Bonferroni adjustment) to account for the fact that multiple comparisons of the same data were made. Table 1 shows the descriptive data for analysis 1. Figures 6, 7, and 8 show graphical representations of these comparisons. Results for this pair-wise comparison for syntactic complexity showed that syntactic complexity was significantly higher for post-deleted text than pristine text (z = 3.458, p = .001, r = .72)6 as well as deleted text (z = 4.010, p < .001, r = .84). Syntactic complexity for pristine text was also significantly higher than deleted text (z = 2.581, p = .010, r = .54). Thus, the syntactic complexity for post-deleted text was the highest followed by the pristine text. Deleted text had the lowest degree of syntactic complexity. A similar analysis was run for the measure of grammatical gender. Results of the multiple Wilcoxon Matched Pairs tests showed that grammatical gender was significantly higher for post-deleted text than pristine text (z = 2.471, p = .013, r = .52) as well as deleted text (z = 3.656, p < .001, r = .76). There was no significant difference found between pristine text and deleted text (z = 0.179, p = .858). Post-deleted text, then, showed significantly more use of grammatical gender than did pristine text and deleted text. Finally, a similar analysis was performed on the measure of lexical diversity (G). Results of the Wilcoxon Matched Pairs tests showed that post-deleted text had significantly higher scores than pristine text (z = 2.464, p = .014, r = .51) as well as deleted text (z = 4.197, p < .001, r = .88). Pristine text scores were also significantly higher than deleted text scores as well (z = 2.433, p = .015, r = .51). We can say, then, that the post-deleted text showed the highest degree of lexical diversity followed by the pristine text. Deleted text showed the lowest relative lexical diversity score.

Table 1: Descriptive data—deleted text focus (n = 23)

Variable               Text type   Mean    SD      Min.   Max.
Syntactic complexity   PDT         0.811   0.815   0      4.29
                       PT          0.396   0.269   0      1.0
                       DT          0.205   0.208   0      0.80
Grammatical gender     PDT         9.65    6.18    1.0    25.0
                       PT          4.60    5.48    0      21.0
                       DT          4.39    3.11    0      14.0
Lexical diversity      PDT         4.85    0.890   2.71   6.36
                       PT          4.07    1.35    1.0    6.55
                       DT          3.20    0.779   1.0    4.57

PDT = post-deleted text; PT = pristine text; DT = deleted text.
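As a rough illustration of the statistical procedure reported above, the sketch below runs one matched-pairs comparison with a Bonferroni-adjusted alpha and derives the effect size r. It is a schematic reconstruction under our own assumptions, not the authors' analysis script; in particular, N is taken here as the number of paired cases (23 learners in this study), and z is recovered from the two-sided p-value.

```python
import numpy as np
from scipy import stats

def wilcoxon_comparison(a, b, n_comparisons=3):
    """Wilcoxon Matched Pairs test for one pair of text categories (e.g. PDT vs. PT).
    a, b: per-learner scores on the same measure for the two text categories."""
    alpha = 0.05 / n_comparisons          # Bonferroni adjustment: .05 / 3 = .017
    _, p = stats.wilcoxon(a, b)           # two-sided matched-pairs test
    z = stats.norm.isf(p / 2)             # z score implied by the two-sided p-value
    r = z / np.sqrt(len(a))               # effect size r = z / sqrt(N)
    return {"p": p, "adjusted_alpha": alpha, "significant": p < alpha, "r": r}
```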

Analysis 2: Post-deleted text focus

The second level of analysis focused on the nature of the post-deleted text in the chat interaction. As mentioned above, all text that was composed after some deleted text was assigned to the post-deleted text category for this analysis regardless of whether this text was eventually deleted prior to being sent (see coding explanation in the Operationalizations section). The rationale here stemmed from the working hypothesis that immediately following a deletion, learners would have a heightened state of attention to form. For this comparison, whether post-deleted text eventually becomes post-deleted deleted text is irrelevant. For comparisons of the syntactic complexity, grammatical gender, and lexical diversity of pristine text, deleted text, and post-deleted text in this second analysis, a series of pair-wise Wilcoxon Matched Pairs tests were conducted. In all cases alpha was set at .017 (Bonferroni adjustment) to account for the fact that multiple comparisons of the same data were made. Table 2 shows the descriptive data for the post-deleted text focus (analysis 2), with Figures 9, 10, and 11 depicting the graphical representations of these same comparisons.


Figure 6: Analysis 1—comparisons of syntactic complexity across the variable text type


Results showed that syntactic complexity was significantly higher for post-deleted text than pristine text (z = 2.902, p = .004, r = .61) as well as deleted text (z = 3.702, p < .001, r = .77). There was no significant difference found between pristine text and deleted text (z = 2.062, p = .039). Results for grammatical gender showed a similar outcome. Grammatical gender scores were significantly higher for post-deleted text than pristine text (z = 3.079, p = .002, r = .64), as well as deleted text (z = 4.109, p < .001, r = .86). No significant difference was found in the comparison of deleted text and pristine text (z = 1.730, p = .084). Results for lexical diversity (G), showed that post-deleted text scores were significantly higher than pristine text (z = 2.403, p = .016, r = .50), as well as deleted text (z = 4.167, p < .001, r = .87). Pristine text scores were also significantly higher than deleted text scores (z = 2.768, p = .006, r = .58).



Figure 7: Analysis 1—comparisons of grammatical gender across the variable text type


Table 2: Descriptive data—post-deleted text focus (n = 23)

Variable               Text type   Mean    SD      Min.   Max.
Syntactic complexity   PDT         0.642   0.285   0      1.5
                       PT          0.396   0.269   0      1.0
                       DT          0.215   0.260   0      1.0
Grammatical gender     PDT         12.82   9.65    1.0    41.0
                       PT          4.60    5.48    0      21.0
                       DT          2.26    2.19    0      8.0
Lexical diversity      PDT         4.80    0.879   2.59   6.39
                       PT          4.07    1.35    1.0    6.55
                       DT          2.89    0.777   1.0    4.166

PDT = post-deleted text; PT = pristine text; DT = deleted text.


Figure 8: Analysis 1—comparisons of lexical diversity across the variable text type


Figure 9: Analysis 2—comparisons of syntactic complexity across the variable text type


Figure 10: Analysis 2—comparisons of grammatical gender across the variable text type

DISCUSSION

In answer to research question 1, 'Is there a difference in the linguistic complexity of chat output, which shows evidence of online planning in the form of post-production monitoring, versus chat output that does not?', the present

data suggest a partial advantage for one subset of chat output that shows evidence of online planning. For both measures of linguistic complexity (syntactic complexity and grammatical gender) post-deleted text was significantly higher than both pristine text and deleted text. The results regarding the relationship between pristine text and deleted text for syntactic complexity are less clear. In answer to research question 2, ‘Is there a difference in the lexical diversity of chat output, which shows evidence of online planning in the form of post-production monitoring, versus chat output that does not?’, the present data suggest a partial advantage for one subset of output that shows evidence of online planning. The results show that post-deleted text was significantly more lexically diverse than both pristine text and deleted text. Likewise, pristine text was significantly more lexically diverse than deleted text. Table 3 illustrates the nature of the relationships between the variables in both the deleted text focus (Analysis 1) and post-deleted text focus (Analysis 2). Most interesting is the fact that post-deleted text was significantly higher than the pristine text in every comparison made. We might expect the post-deleted text and the pristine text to be higher than the deleted text since the learner decides to delete the deleted text because it is arguably faulty in some way. Furthermore, we notice that for lexical diversity pristine

text is also significantly higher than deleted text in both the deleted text focus and post-deleted text focus. Overall, then, the results suggest that learners create more complex or sophisticated language in the post-deleted text environment. That is to say, subsequent to the deletion of covert chat output, an indicator of post-production monitoring, this group of learners used more complex target language features. In most cases, dyads did not arrive at the 'correct' sequential order for the video stills. Indeed, the task was intentionally constructed to be challenging for this group in order to force them to sufficiently stretch their interlanguage resources. The task itself is best viewed as a means to an end, namely, to lead learners to engage in meaningful and arguably beneficial target language interaction with another learner. Success was not measured in terms of correctly ordering the picture sequence, but rather in terms of whether each dyad was able to arrive at a mutually acceptable joint resolution to the task. Keeping this in mind, we can say that although only two of the dyads arrived at the prescribed 'correct' sequence (17%), all dyads were successful in completing the task (100%).

Figure 11: Analysis 2—comparisons of lexical diversity across the variable text type


Table 3: Overview of relationship between each variable

                            Syntactic complexity   Grammatical gender   Lexical diversity
Deleted text focus          PDT > PT > DT          PDT > PT & DT        PDT > PT > DT
Post-deleted text focus     PDT > PT & DT          PDT > PT & DT        PDT > PT > DT

Limitations of the current study

The current study has several limitations. The first of these is the small sample size (n = 23) and the resulting possibility of a Type II7 error. In addition to sample size limitations, the use of a specific intact population makes it difficult to make generalizations about the nature of L2 performance during chat. That said, these 'limitations' can also be viewed as design strengths. That is, if we are to posit any strong pedagogical relevance to our findings, it is important to make the study as naturalistic as possible (rather than artificially simulated). It is our view that such a design has a high degree of 'ecological validity' in that learners were engaged in a communicative task with another learner in a familiar setting (the language computer lab). Data collection took place during their regularly scheduled class time, and the underlying 'tracking' software employed in no way interfered with their typical experience with the learning management system (Blackboard). A second limitation is the restricted proficiency range: investigation of chat produced by learners of varying proficiencies, particularly by more advanced L2 learners, may have produced more robust effects. A third limitation stems from characteristics of the communication task used and the subsequent nature of the




language this task elicited. Although promoting a two-way flow of interaction, this particular jigsaw task (Pica et al. 1993) did not require the use of specific task-essential or task-useful TL forms (Loschky and Bley-Vroman 1993). In some instances, learners were able to avoid complex syntax by relying on letters to agree on the order of their pictures (e.g. ‘Vielleicht, F, E, B?’) instead of describing the actions depicted in the pictures. A task that necessitated the use of more sophisticated syntax or specifically targeted a complex TL form may have generated different effects.

CONCLUSIONS AND FUTURE DIRECTIONS

The results of this study suggest that learners do appear to use the increased online planning time afforded by chat to engage in careful production that results in more complex language. If post-deleted text output is considered a product of online planning, these findings support the results of other (face-to-face) studies which also found an effect for increased online planning time and complexity (Yuan and Ellis 2003; Ellis and Yuan 2004). These findings hold pedagogical implications for the use of synchronous text-chat in language classrooms. Instructors in face-to-face contexts wishing to provide learners with output opportunities that allow learners to produce complex language may consider incorporating task-based chat into their lessons. Similarly, instructors of online distance language courses may also consider incorporating a synchronous task-based chat component to their course to allow remote language learners this opportunity for careful target language production. Additionally, and perhaps most importantly, pedagogical implications deriving from these findings call for the design and selection of communication tasks that take advantage of the additional online planning time afforded by chat. Optimally, such tasks should be designed to elicit complex target language features or precise and descriptive vocabulary from learners. Thus a picture sequencing task, such as the one used in this study, might be replaced with a video sequencing task, in which learners view and narrate different segments from a larger video clip in order to determine the sequence of segments. Successful completion of such a video reconstruction task (see, for example, Sullivan and Caplan 2003) may call upon more complex syntax as learners must narrate the events of each clip instead of relying on abbreviated labels for pictures, as was observed in this study. In addition, chat may also be an optimal tool for the use of more specifically form-focused activities that require careful production or monitoring of complex or difficult language features. Understanding the nature of L2 performance during chat and the role of increased online planning time requires research which examines L2 chat interaction among learners of varying proficiencies. This includes studies which examine the complexity of covert and overt output generated by learners of differing proficiencies while completing tasks that necessitate or are at least more likely to elicit more complex language during chat interaction. It is


also worth investigating whether differences in online planning time afford an advantage for L2 performance in chat relative to face-to-face interaction and to what degree any such advantage is mitigated by typing ability, digital literacy, and target language proficiency. In addition, the current study made no attempt to investigate whether there might be a relationship between the amount and complexity of post-deleted text with the complexity of language production (whether written or spoken) subsequent to the chat sessions. Such research might uncover potential benefits from chat for continued L2 development. While the current study investigated syntactic complexity and lexical diversity, future research should also examine the relative accuracy of both covert and overt output. Finally, further inquiry into the nature of chat interaction should investigate factors influencing deletion during chat, including identifying under what conditions learners choose to either delete or transmit their output, what elements of the chat tool or interlocutor responses facilitate noticing or prompt more complex and accurate output. In particular, the use of stimulated recall could help us better understand other underlying factors that prompt learners to delete output during chat. Chat environments remain a promising site for research on second language development; however, to date, few studies have examined the full range of target language output learners produce during chat. The results of this study demonstrate how capturing the full range of learner output through screen capture technology can be used to investigate how the online planning time afforded by the medium is in fact used by L2 learners to produce more complex language during dyadic interaction.

SUPPLEMENTARY DATA

Supplementary material is available at Applied Linguistics online.

NOTES

1 Yuan and Ellis (2003) contrast careful production with rapid production, which lacks opportunity for self-repair.

2 In keeping with prior L2 research on CMC (e.g. Gonzáles-Lloret 2003), c-units (communication units) were used instead of T-units to account for isolated words and phrases that did not necessarily contain verbs yet conveyed information. c-Units, unlike T-units, include the single words and phrases that characterize a great deal of chat discourse.

3 See Smith and Sauro (2009) for an exploration of additional factors in the chat context that might also influence a learner's decision to delete text during chat.

4 These numbers reflect learner attempts at target-like use and do not imply that these instances were accurate, grammatical, or target-like. Rate of target-like use, which corresponds to accuracy, the second dimension of L2 performance according to Skehan (2003), was beyond the scope of this current study.

5 As one anonymous reviewer pointed out, grammatical gender is a feature introduced early in formal German language instruction. Despite its inclusion in introductory texts, grammatical gender remains a cognitively complex form for second language learners, who are faced with recognizing and learning a complex underlying system of semantic, morphological and phonological rules that guide gender assignment, so that this feature of German continues to pose a challenge for even advanced learners (Menzel 2005).

6 There does not seem to be an 'industry standard' for calculating effect sizes for many nonparametric measures. More commonly used (parametric) effect size measures such as Cohen's d are not appropriate when employing nonparametric analyses since such effect size estimates are adversely affected by departures from normality and heterogeneity of variances (as is largely the case with the present data). In order to provide some indication of the meaningfulness of observed significant differences in these data, effect sizes (r) were calculated by dividing the relevant z score by the square root of N. An effect size of r = 0.10 was defined as small, r = 0.30 as medium, and r = 0.50 or larger as large. We also chose to provide confidence intervals along with graphical representations of each statistic comparison in Figures 6–11.

7 Type II error results when a researcher fails to reject a null hypothesis that is false. This contrasts against Type I error, which results when a researcher rejects a null hypothesis that is true.

REFERENCES Abrams, Z. 2003. ‘The effect of synchronous and asynchronous CMC on oral performance in German,’ The Modern Language Journal 87: 157–67. Beauvois, M. H. 1992. ‘Computer-assisted classroom discussion in the foreign language classroom: Conversation in slow motion,’ Foreign Language Annals 25: 455–64. Belz, J. A. 2002. ‘Social dimensions of telecolaborative foreign language study,’ Language Learning and Technology 6/1: 60–81. Blake, R. 2000. ‘Computer mediated communication: a window on L2 Spanish interlanguage,’ Language Learning and Technology 4/1: 120–36. Chapelle, C. A. 2008. ‘Technology and second language acquisition,’ Annual Review of Applied Linguistics 27: 98–114. Cohen, L. 1988. Statistical Power Analysis for the Behavioral Sciences. Lawrence Erlbaum Associates. Crookes, G. 1989. ‘Planning and interlanguage variation,’ Studies in Second Language Acquisition 11: 367–83.

Daller, H., R. Van Hout, and J. TreffersDaller. 2003. ‘Lexical richness in the spontaneous speech of bilinguals,’ Applied Linguistics 24: 197–222. Darhower, M. 2008. ‘The role of linguistic affordances in telecollaborative chat,’ CALICO Journal 26/1: 48-69. DeKeyser, R. M. 2005. ‘What makes learning second language grammar difficult? A review of issues,’ Language Learning 55/ Suppl 1: 1–25. Elder, C. and N. Iwashita. 2005. ‘Planning for test performance: does it make a difference?’ in Ellis R. (ed.): Planning and Task Performance in a Second Language. Lawrence Erlbaum Associates, pp. 219–38. Ellis, R. 2005. ‘Planning and task-based performance: Theory and research’ in Ellis R. (ed.): Planning and Task Performance in a Second Language. Lawrence Erlbaum Associates, pp. 3–34. Ellis, R. and F. Yuan. 2004. ‘The effects of planning on fluency complexity and accuracy in


second language narrative writing,' Studies in Second Language Acquisition 26: 59–84. Ellis, R. and F. Yuan. 2005. 'The effects of careful within-task planning on oral and written task performance' in Ellis R. (ed.): Planning and Task Performance in a Second Language. Lawrence Erlbaum Associates, pp. 167–92. Foster, P. and P. Skehan. 1996. 'The influence of planning and task type on second language performance,' Studies in Second Language Acquisition 18: 299–323. Gonzáles-Lloret, M. 2003. 'Designing task-based CALL to promote interaction: En busca de Esmeraldas,' Language Learning and Technology 7/1: 86–104. Kawauchi, C. 2005. 'The effects of strategic planning on the oral narratives of learners with low and high intermediate L2 proficiency' in Ellis R. (ed.): Planning and Task Performance in a Second Language. Lawrence Erlbaum Associates, pp. 143–64. Kern, R. G. 1995. 'Restructuring classroom interaction with networked computers: Effects on quantity and quality of language production,' The Modern Language Journal 79: 457–6. Kormos, J. and M. Dénes. 2004. 'Exploring measures and perceptions of fluency in the speech of second language learners,' System 32/2: 145–64. Kötter, M. 2003. 'Negotiation of meaning and codeswitching in online tandems,' Language Learning and Technology 7/2: 145–72. Lee, L. 2004. 'Learners' perspectives on networked collaborative interaction with native speakers of Spanish in the US,' Language Learning and Technology 8/1: 83–100. Loewen, S. and R. Erlam. 2006. 'Corrective feedback in the chatroom: An experimental study,' Computer Assisted Language Learning 19/1: 1–14. Loschky, L. and R. Bley-Vroman. 1993. 'Creating structure-based communication tasks for second language development' in Crookes G. and S. Gass (eds): Tasks and Language Learning: Integrating Theory and Practice. Multilingual Matters, pp. 123–67. Mehnert, U. 1998. 'The effects of different lengths of time for planning on second language performance,' Studies in Second Language Acquisition 20: 83–108.


Menzel, B. 2005. 'Psycholinguistic aspects of gender acquisition in instructed GFL learning' in Housen A. and M. Pierrard (eds): Investigations in Instructed Second Language Acquisition. Walter de Gruyter, pp. 51–97. Ortega, L. 1999. 'Planning and focus on form in L2 performance,' Studies in Second Language Acquisition 21: 109–48. Oskoz, A. 2005. 'Students' dynamic assessment via online chat,' CALICO Journal 22/3: 513–36. Pellettieri, J. 1999. 'Why-talk? Investigating the role of task-based interaction through synchronous network-based communication among classroom learners of Spanish.' Unpublished doctoral dissertation. University of California, Davis. Pellettieri, J. 2000. 'Negotiation in cyberspace: The role of chatting in the development of grammatical competence' in Warschauer M. and R. Kern (eds): Network-based Language Teaching: Concepts and Practice. Cambridge University Press, pp. 59–86. Pica, T. 1983. 'Methods of morpheme quantification: The effect on the interpretation of second language data,' Studies in Second Language Acquisition 6: 69–79. Pica, T., R. Kanagy, and J. Falodun. 1993. 'Choosing and using communication tasks for second language instruction and research' in Crookes G. and S. Gass (eds): Tasks and Language Learning: Integrating Theory and Practice. Multilingual Matters, pp. 9–34. Sachs, R. and B. Suh. 2007. 'Textually enhanced recasts, learner awareness, and L2 outcomes in synchronous computer-mediated interaction' in Mackey A. (ed.): Conversational Interaction in Second Language Acquisition: A Collection of Empirical Studies. Oxford University Press, pp. 197–227. Sangarun, J. 2005. 'The effects of focusing on meaning and form in strategic planning' in Ellis R. (ed.): Planning and Task Performance in a Second Language. Lawrence Erlbaum Associates, pp. 111–41. Sauro, S. 2009. 'Computer-mediated corrective feedback and the development of L2 grammar,' Language Learning and Technology 13/1: 96–120. Shehadeh, A. 2001. 'Self- and other-initiated modified output during task-based interaction,' TESOL Quarterly 35/3: 433–57.


Shekary, M. and H. M. Tahririan. 2006. 'Negotiation of meaning and noticing in text-based online chat,' The Modern Language Journal 90/4: 557–73. Skehan, P. 2003. 'Focus on form, tasks and technology,' Computer Assisted Language Learning 16/5: 391–411. Skehan, P. and P. Foster. 1999. 'The influence of task structure and processing conditions on narrative retellings,' Language Learning 49/1: 93–120. Skehan, P. and P. Foster. 2005. 'Strategic and on-line planning: The influence of surprise information and task time on second language performance' in Ellis R. (ed.): Planning and Task Performance in a Second Language. Lawrence Erlbaum Associates, pp. 193–216. Smith, B. 2004. 'Computer-mediated negotiated interaction and lexical acquisition,' Studies in Second Language Acquisition 26: 365–98. Smith, B. 2005. 'The relationship between negotiated interaction, learner uptake and lexical acquisition in task-based computer-mediated communication,' TESOL Quarterly 39/1: 33–58.


Smith, B. 2008. 'Methodological hurdles in capturing CMC data: The case of the missing self-repair,' Language Learning and Technology 12/1: 85–103. Smith, S. and S. Sauro. 2009. 'Interruptions in chat,' Computer Assisted Language Learning 22/3: 229–47. Sullivan, J. and N. Caplan. 2003. 'Beyond the dictogloss: Learner-generated attention to form in a collaborative, communicative classroom activity,' Working Papers in Educational Linguistics 19/1: 65–89. Tavakoli, P. and P. Skehan. 2005. 'Strategic planning, task structure, and performance testing' in Ellis R. (ed.): Planning and Task Performance in a Second Language. Lawrence Erlbaum Associates, pp. 239–73. Towell, R., R. Hawkins, and N. Bazergui. 1996. 'The development of fluency in advanced learners of French,' Applied Linguistics 17/1: 84–115. Yuan, F. and R. Ellis. 2003. 'The effects of pre-task planning and on-line planning on fluency, complexity and accuracy in L2 monologic oral production,' Applied Linguistics 24/1: 1–27.


Applied Linguistics: 31/4: 578–582 © Oxford University Press 2010 doi:10.1093/applin/amq021 Advance Access published on 1 July 2010

FORUM

Making it Real: Authenticity, Process and Pedagogy

1RICHARD BADGER and 2MALCOLM MACDONALD

1University of Leeds, UK and 2University of Warwick, UK
E-mail: [email protected]; [email protected]

Authenticity has been a part of the intellectual resources of language teaching since the 1890s but its precise meaning and implications are contested. This commentary argues for a view of authenticity which recognizes the limits of the concept as a guide for pedagogic practice and acknowledges the fact that texts are processes rather than products. First, authenticity may help to decide what texts not to use in class but provides no guidance about which authentic texts are, for example, motivating. Secondly, the term authenticity is misleading because it leads us to conceptualize authenticity as the bringing of a text from a communicative event into a classroom. Texts are the result of an interaction between what we might term a proto-text, sound waves, or marks on paper or screen, and a language user. The authenticity of a text in the classroom depends on the similarity between the way it is used in the classroom and the way it was used in its original communicative context.

Authenticity has been a part of the intellectual resources of language education for many years. Gilmore (2007) traces this back to the 1890s (Sweet 1964) but the term moved to a central, if contested, position with the development of communicative language teaching from the 1970s onwards (Widdowson 1979; Breen 1985; Nunan 1989). The debate about authenticity has become more visible in recent years (Badger et al. 2006; Gilmore 2007; O’Donnell 2009; Roberts and Cooke 2009) and was a key part of the conversation in this journal between Waters (2009a, 2009b) and Simpson (2009). Waters argues that the imposition of authenticity by applied linguists on language teaching has, in some sense, led to a disempowerment of language teachers. We agree that there is something slightly dispiriting in the view of the language classroom as a second rate version of what happens outside the classroom. However, we share Simpson’s doubts about whether this is the influence of applied linguistics (2009: 432). There is relatively little applied linguistic research on the impact of authentic language on language learning and much second language acquisition research seems to draw on constructed language data (e.g. Pienemann 2006). However, we do think that the discourse related to authenticity is problematic. The views that we want to develop here


are that, first, the concept of authenticity is used to justify more than it should and secondly, and more fundamentally, it is based on a product view of language which leads to a lack of clarity when the term is used in language education. Both of these factors mean that the role of pedagogic decisions in the use of authentic language can be obscured. Waters comments on the dangers of treating authenticity as a moral imperative and there is a sense in which authenticity has a kind of halo effect. Waters identifies commentators who link authenticity to native speaker texts and motivation and he himself sees authenticity as obliging teachers to use texts that are too hard for their learners. The principle of authenticity for language samples is that we should use texts which are not designed for the purposes of language teaching. This notion emerged from concerns with the constructed texts that were produced as part of audio-lingual and situational methods of language teaching which now read as slightly odd. Language samples which come from non-language learning contexts are a better representation of language use outside the classroom. We find it hard to argue against this view but it is important to recognize the limits of the principle. For example, it says nothing about whether the producers of the language are native or non-native speakers. Authentic language is produced by both groups of language users. A similar point can be made about motivation and level of difficulty. Both motivation and level of difficulty are a function of the interaction between particular texts and particular language learners. Authentic texts which are motivating for some users will be boring for others; authentic texts which are easy for some language learners will be difficult for others. Authenticity says nothing about the motivational properties or the level of difficulty of a language sample. More generally, the principle of authenticity indicates that contrived texts are less useful for language teaching but does not indicate which authentic material language teachers should use in the classroom. When teachers select a particular authentic text, they will consider factors such as whether a particular text is motivating or at the right level of difficulty or whether learners will need to deal with native, non-native speakers or some combination of these. The principle of authenticity does not preclude pedagogic decisions by language teachers. Indeed, we would argue that a more sophisticated understanding of authenticity highlights where pedagogic principles should be applied. Our second argument relates to the conceptualization of authentic language samples as products. This is a less obvious issue but this conceptualization means that we see teachers as simply taking authentic texts from one context and moving them into the classroom. This view has become so normalized that it has not been explored to any great extent but we feel that it has been reinforced by the success of corpus linguists’ investigations of authentic text products in producing descriptions of the grammar and vocabulary of many languages, particularly English (e.g. Sinclair 1987;


Rundell 2002; Biber et al. 2003; Carter and McCarthy 2006). These descriptions represent one of the major, if not the major, advances in language description over the last quarter of a century, but, while some teachers will give their learners authentic language products so that they can produce their own language descriptions, generally authentic language samples are brought into the classroom as a way of using the language. Learners are primarily expected to read or listen to such texts and only secondarily, if at all, to exploit them as the basis of the development of their knowledge of grammar and vocabulary. Knowledge is more easily related to product views of language, and skills to process views. The belief that authentic texts are to do with the development of knowledge of language may have made it harder to see authenticity as a process. This is reflected in the paucity of discussions of authenticity as a process, although Widdowson (1979) was clearly thinking on these lines in the late 1970s, albeit in a rather negative way.


I am not sure that it is meaningful to talk about authentic language as such at all. I think it is probably better to consider authenticity not as a quality residing in instances of language but as a quality which is bestowed upon them, created by the response of the receiver (1979: 165). Widdowson saw the central aspect of this as what the writer or speaker intends. Authenticity, then, is achieved when the reader realizes the intentions of the writer by reference to a set of shared conventions (1979: 166). We would want to query the extent to which reading, or listening, can be seen as the realization of the writers' or speakers' intentions, rather than as the outcome of some kind of negotiation between writers/speakers and readers/listeners. But, leaving this point aside, readers and listeners do more than interpret their interlocutors' intentions. Field points out that, when we listen: what reaches our ears is not a string of words or phrases or even a sequence of phonemes. It is a group of acoustic features . . . We must not think of the words or phonemes of connected speech as transmitted from speaker to listener. It is the listener who has to turn the signal into units of language (2008: 127). Similarly, what we think of as letters on a page or on a screen are just marks until we bring our knowledge of language to those marks. The process by which we treat 'g' and 'g' as the same and 'p' and 'q' as different has become so automatic that we do not even recognize that there is a process. Bauman and Briggs (1990: 120) point out 'that verbal art forms are so susceptible to treatment as self-contained, bounded objects separable from their


social and cultural contexts of production and reception’ that we do not even notice the process of what they call entextualization. This process makes ‘a stretch of linguistic production into a unit – a text – that can be lifted out of its interactional setting’ (1990: 73). We think of texts as simply physical objects. Rather, texts are created by an interaction between the physical marks on the paper or the sound waves in the air, what we might call the proto-text, and language users. When a teacher brings an authentic proto-text into the classroom and learners read it or listen to it, there is a new text and the authenticity is to be found in the degree of similarity between the text process in its original context and the text process in the classroom. There will almost always be a difference between the text processes inside and outside the classroom and teachers need to consider the aspects of the text process outside the classroom that they want to replicate inside the classroom. This implies a greater role for teachers than simply that of porters. So, White (1998: 61–62) suggests that a teacher reading a newspaper article might be pedagogically more effective than playing the recording of someone telling a story because this enables a degree of interactivity that is more similar to how the story was originally told. The reading is in some ways more authentic than the recording. In a different way, authenticity can serve to identify pedagogic gaps in language classes. Field (2008) describes the pre-listening stage of a typical listening class as having a focus on providing linguistic and world knowledge. These kinds of knowledge are elements in many psycholinguistic models of the listening process (e.g. Field 2008) and can be seen as an attempt to make the listening more similar to listening outside the classroom, that is, making it more authentic. However, this analysis also reveals that there is relatively little teaching of listening going on in such classes. In many reading and listening classes, there is too much focus on making what happens in the classroom as authentic as possible and not enough on helping learners to develop their skills so that they can read and listen independently. Our conceptualization of authenticity also has wider implications as it sees language users as a necessary part of language and so is hard to reconcile with a Saussurean (1974) view of language as comprising a signifier and a signified. It fits in better with a Piercean view (Pierce 1965; Young 2008) of language as something which stands to somebody for something in some respect or capacity (Pierce 1965: 135). This change in the conceptualization of language moves us towards a view of the language classroom not as a kind of second rate version of the outside world but as a place with its own legitimacy. ‘The classroom has its own communicative potential and its own authentic metacommunicative purpose’ (Breen 2001: 138), in which learners and teachers may work towards the development of what Simpson (2009: 432) describes as ‘authentic voices’.


REFERENCES

Badger, R. G., M. MacDonald, and M. Dasli. 2006. 'Authenticity, culture and language learning,' Language and Intercultural Communication 6/3–4: 1–12. Bauman, R. and C. L. Briggs. 1990. 'Poetics and performances as critical perspectives on language and social life,' Annual Review of Anthropology 19/1: 59–88. Biber, D., G. Leech, and S. Conrad. 2003. Longman Student Grammar of Spoken and Written English. Longman. Breen, M. 1985. 'Authenticity in the language classroom,' Applied Linguistics 6/1: 60–70. Breen, M. 2001. 'The social context for language learning: a neglected situation' in C. Candlin and N. Mercer (eds): The Social Context for Language Learning: A Neglected Situation. Routledge, pp. 122–43. Carter, R. and M. McCarthy. 2006. The Cambridge Grammar of English: A Comprehensive Guide. Cambridge University Press. Field, J. 2008. Listening in the Language Classroom. Cambridge University Press. Gilmore, A. 2007. 'Authentic materials and authenticity in foreign language learning,' Language Teaching 40/02: 97–118. Nunan, D. 1989. Designing Tasks for the Communicative Classroom. Cambridge University Press. O'Donnell, M. E. 2009. 'Finding middle ground in second language reading: pedagogic modifications that increase comprehensibility and vocabulary acquisition while preserving

Downloaded from applij.oxfordjournals.org by guest on December 31, 2010

Badger, R. G., M. MacDonald, and M. Dasli. 2006. ‘Authenticity, culture and language learning,’ Language and Intercultural Communication 6/3–4: 1–12. Bauman, R. and C. L. Briggs. 1990. ‘Poetics and performances as critical perspectives on language and social life,’ Annual Review of Anthropology 19/1: 59–88. Biber, D., G. Leech, and S. Conrad. 2003. Longman Student Grammar of Spoken and Written English. Longman. Breen, M. 1985. ‘Authenticity in the language classroom,’ Applied Linguistics 6/1: 60–70. Breen, M. 2001. ‘The social context for language learning: a neglected situation’ in C. Candlin and N. Mercer (eds): The Social Context for Language Learning: A Neglected Situation. Routledge, pp. 122–43. Carter, R. and M. McCarthy. 2006. The Cambridge Grammar of English: A Comprehensive Guide. Cambridge University Press. Field, J. 2008. Listening in the Language Classroom. Cambridge University Press. Gilmore, A. 2007. ‘Authentic materials and authenticity in foreign language learning,’ Language Teaching 40/02: 97–118. Nunan, D. 1989. Designing Tasks for the Communicative Classroom. Cambridge University Press. O’Donnell, M. E. 2009. ‘Finding middle ground in second language reading: pedagogic modifications that increase comprehensibility and vocabulary acquisition while preserving


REVIEWS

Barbara Köpke, Monika S. Schmid, Merel Keijzer, and Susan Dostert (eds): LANGUAGE ATTRITION: THEORETICAL PERSPECTIVES. John Benjamins, 2007.


This volume might be seen as a sequel to the 2004 Benjamins volume on language attrition, with which it shares three editors and a number of contributors. Comprising 14 chapters, the volume presents new theoretical directions in attrition research, as well as offering further explorations of some of the research reported in the earlier volume. The volume begins with a brief introduction by Barbara Köpke and Monika Schmid, which situates L1 attrition research within the context of research on bilingualism and advances an approach that integrates it with L2 acquisition research.

The next three chapters, by Barbara Köpke, Michael Sharwood Smith, and Kees de Bot, respectively, present different integrative approaches to the study of L1 attrition. Köpke lays out a range of factors implicated in L1 attrition, including brain mechanisms, cognitive processes, and factors external to the individual, and suggests that a 'multi-component view' provides the most promising account of this phenomenon. Sharwood Smith presents the MOGUL (Modular Online Growth and Use of Language) framework, which relates attrition and acquisition to each other and integrates linguistic knowledge and linguistic processing. On this account, language acquisition involves building up a bank of (permissible) structures in long-term memory, with language attrition arising from the lesser availability of structures as a result of disuse. De Bot's chapter offers two theoretical perspectives on the question of how and why people lose their language: that of 'lifespan developmental psychology', which considers changes in an individual over different age ranges, and 'Dynamic Systems Theory', which treats language as a dynamic system. The former approach addresses the 'why' of L1 attrition by focusing on language-related major life events, which include events that 'lead to a reduction of the use of one language' (p. 58). The latter approach addresses the 'how' of L1 attrition by seeing L1 and L2 systems as constantly changing over time and displaying various other properties of dynamic systems, such as being 'self-reorganizing' (p. 60).

Another three chapters in the volume, by Carol Myers-Scotton, Ianthi Tsimpli, and Ayşe Gürel, describe L1 attrition and related phenomena with the tools of formal linguistic theory. Myers-Scotton draws on her 4-M model, which distinguishes four types of morphemes in terms of their grammatical role, to investigate grammatical properties of Xhosa-English bilingual language use among Xhosa L1 speakers in South Africa. She concludes that the Xhosa of these speakers contains no critical grammatical morphemes from English, signalling a shift to English that is grammatically abrupt rather than gradual.



Tsimpli investigates the L1s of native speakers of Greek and Italian living in English, Swedish, and German L2 environments. She adopts the framework of Chomsky's Minimalist Program to test the hypothesis that the selective vulnerability of the L1 to attrition involves only non-core features of the L1, such as the availability of different discourse-conditioned word orders, instead of core grammatical features such as case. The results that Tsimpli reports appear to show L1 attrition in both of these aspects of L1 attriters' language, with the unexpected attrition related to case features tentatively attributed to language performance factors. Gürel also investigates selective attrition from a Chomskyan perspective—in this case, of binding properties in the English of L1 English speakers who live in Turkey—but draws additionally on the psycholinguistic subset principle to refine her predictions about L1 properties liable to L2 interference and attrition. As it happens, Gürel finds no L1 attrition effects in the group she has tested.

Chapters by Michel Paradis and Monika Schmid explore some implications of a neurolinguistic theory of bilingualism for L1 attrition. In his chapter, Paradis lays out such a theory, according to which the neural substrate of a language item requires neural impulses to reach a certain level of activation and use of this item lowers its activation threshold. This leads to three key predictions: that 'language disuse gradually leads to language loss'; that 'the most frequently used elements of L2 will tend to replace their (less used) L1 counterparts'; and that 'comprehension will be retained longer than production because self-activation requires a lower threshold than comprehension' (p. 125). Schmid considers the first two of Paradis's predictions in her chapter, which reports on a study of language attrition among L1 German speakers living in Canada and the Netherlands. This study investigated the role of linguistic mode in L1 use and the impact of frequency of L1 use on overall L1 performance. Schmid's study found, surprisingly, that frequency of L1 use did not predict observed attrition effects; this leads her to conclude that attrition may not be as dependent on frequency of continued L1 use as is generally believed.

Two chapters about L1 attrition in early bilinguals, those by Christophe Pallier and Rosalie Footnick, revolve around the question of whether a speaker's L1 can actually be lost. Pallier reports on studies that sought to determine whether Korean-born individuals who had been adopted as children into French families had completely forgotten their L1 and whether they displayed native-like use of their L2. Results to date have provided an affirmative answer to both research questions, suggesting that language systems remain plastic for far longer than is commonly assumed. Footnick reports on the case of an apparently lost language, or 'hidden language', recovered through hypnosis. The case in question involved a French-born speaker of French and the Togolese language Mina, the latter acquired during a childhood stay in Togo. Footnick suggests that the initial inaccessibility of this language might have resulted from an 'involuntary conflictual situation' associated with the hidden language.



The last three chapters of the volume, by Petra Prescher, Miriam Ben-Rafael and Monika Schmid, and Antonio Jiménez Jiménez, respectively, turn to more subjective aspects of L1 attrition. Prescher reports on the results of interviews with long-term German immigrants in the Netherlands, finding that these individuals are conscious of their L1 attrition but have generally passed through stages of assimilation and bicultural identity and created a 'transcultural' identity. Ben-Rafael and Schmid report on their study of Israelis of French and Russian L1 backgrounds, which compared the attitudes of these two groups toward Israeli society, the Hebrew language and their respective use of their L1. This study found that the French-speaking group, which had a stronger integrative motivation to learn Hebrew, displayed more language mixing (signalling the possibility of greater L1 attrition) than the Russian-speaking group, whose motivation to learn Hebrew was more instrumental. Finally, Jiménez Jiménez describes the use of a stimulated recall protocol in his investigation of compensatory strategies among speakers of Spanish as a heritage language, suggesting its utility in language attrition research, particularly as a means for identifying avoidance strategies.

As these brief summaries suggest, this volume presents a wealth of new perspectives on and insights into L1 attrition and its relation to bilingualism, making it indispensable reading for students of L1 attrition. Yet the status of this volume as a sequel to the 2004 volume means that considerable familiarity with L1 attrition research is taken for granted; as a result, the volume may be less inviting to those new to this field. One remedy to this problem would be to consult the excellent introduction to the earlier volume, also written by Barbara Köpke and Monika Schmid. Also worth noting is that the individual chapters themselves are quite brief, averaging 15 pages. As such, they are best seen as reports of ongoing research, presenting preliminary results, rather than as more authoritative studies of particular phenomena. This is, however, entirely in keeping with the exploratory nature of the volume, which seeks to expand the theoretical and methodological scope of language attrition research and to generate new hypotheses.

Indeed, it is arguably a great virtue of the volume that it raises so many questions about L1 attrition and associated research that, to my knowledge, have yet to receive detailed answers. One basic question is whether the term 'L1 attrition' is best confined to 'classic' cases of the phenomenon or should encompass a range of attrition effects. The latter possibility appears to be endorsed by Köpke and Schmid, who observe that these 'classic' cases may well be 'a more visible version' of a process 'that all bilinguals undergo to some degree' (p. 3). Another, related, question is whether the 'micro' phenomenon of L1 attrition can be productively compared with the 'macro' phenomenon of language shift, which involves larger social processes not explored in this volume. Still further questions concern the theoretical frameworks and methodologies represented in the volume. These include a question that arises for any research that seeks to apply existing theories to new phenomena: whether such applied research must hold the theory 'constant' rather than treating the phenomenon under investigation as a potential source of new theoretical insights. Indeed, the phenomenon of L1 attrition, given its manifest complexity, might well lead researchers to revisit cherished assumptions and develop richer, more integrated theories of language knowledge, acquisition, and attrition. Such an endeavour might require not merely the integration of different approaches, but also greater collaboration between proponents of these approaches. This volume, in bringing together a diverse body of research and group of researchers, arguably points us in the right direction.

Reviewed by Benjamin Shaer
Department of Law, Carleton University, Ottawa
E-mail: [email protected]
doi:10.1093/applin/amq022
Advance Access published on 29 June 2010

Zoltán Dörnyei: RESEARCH METHODS IN APPLIED LINGUISTICS. Oxford University Press, 2007.

The research scope of applied linguistics is open to multiple interpretations, which makes the undertaking of compiling a handy-sized volume on its research methods extremely difficult. The author of this monograph takes on this thorny issue, and the volume draws its cutting edge from its perspective on data and its sub-categories. With expertise and erudition, he guides research students at a measured pace through the process of conducting quantitative, qualitative, and, most importantly, mixed-method research, in an attempt to nurture the younger generation to become 'good enough researchers' within the ambit of applied linguistics. This volume, which argues for a combined use of quantitative and qualitative data analysis, is structured into five parts comprising a total of 14 chapters, plus a preface and afterword. In the preface, the author uses his own and his PhD students' paths of academic growth to introduce the concept of the 'paradigm war'. The short preface serves to pave the way for the two warring parties to join forces in future research.

Key issues in research methodology are addressed in Part I. Chapter 1 is an introduction to the author's own research and approach. Doing research is compared with seeking a music CD at the lowest possible price; the message conveyed is that disciplined inquiry into something unknown is within the reach of the novice researcher. By pivoting on the two data types, qualitative and quantitative, and blurring the distinction between them with no particular concern given to language data, the author delimits his scope by making clear just how wide the net of data is cast in this volume. Chapter 2 in its turn offers a brief historical overview of quantitative, qualitative, and mixed-method
research, their main characteristics and their inherent strengths and weaknesses. The author's personal views on how far each of these three approaches can push a given research topic bring this insightful chapter to a close. Next, Chapter 3 discusses four topics that are important for data collection and analysis: (i) criteria of quality, (ii) research ethics, (iii) the relationship between topic, research question, hypothesis and research design, and (iv) issues of data management such as the use of a pilot study and the keeping of timely research logs. Impractical as the last two strategies may sound (p. 59), Dörnyei's elucidation opens a window for readers to sense the logic internal to the research process. The last chapter in Part I, which is relatively short, dwells on the topics of longitudinal and cross-sectional research. The author emphasizes the significance of the former and its power to probe deep into an issue, as well as its intrinsic weaknesses. He points out that these limitations can be compensated for by transforming longitudinal designs into corresponding cross-sectional designs. His brief discussion of such conversions will stimulate readers to turn to complementary references.

Part II, made up of four chapters, expands on data collection. Chapter 5 gives priority to collecting quantitative data. Following an introduction to the know-how of sampling, the author unfolds the data-collecting techniques for questionnaire surveys, (quasi-)experimental studies, and Internet-based studies. Data from language testing, though important, are excluded from the volume due to their complexity and the extensive literature which already exists (p. 95). What a pity! Qualitative data collection is discussed in Chapter 6. In a similar vein, the reader is first exposed to the practical skills for sampling data. Guidelines and procedures are set out explicitly, so that readers are equipped to collect, derive and synthesize data when doing ethnographic research, (focus group) interviews, introspective methods, case studies, diary studies, and when relying on research journals. In Chapter 7, the author elaborates on the purpose of mixed-method research. In addition to a justification of the compatibility of quantitative and qualitative methods, nine main types of mixed-method design are listed. The overview of these nine combination types, plus the author's own separate, concise exemplar-based illustration (pp. 169–173), will be extremely helpful to students. If they are still uncertain about how to collect certain kinds of quantitative or qualitative data, they can go back to the detail provided in the preceding two chapters; such an organization is space-saving. Chapter 8, the last of Part II, concentrates on methods for collecting classroom data and discloses some ongoing problems in classroom and action research. I agree that in classroom observation, the use of coding schemes for recording classroom performances will put a large body of data in order.

Data analysis is the theme of Part III. Its organization runs parallel to that of the first three chapters in Part II. Chapter 9 centres on the selection of statistical tests and the procedures needed to prepare data and run them in SPSS. As to data preparation, the author proffers step-by-step guidelines on coding frames, creating and naming data files, screening and cleaning raw data, manipulating
missing data, standardizing data, creating fewer but broader variables, as well as conducting internal consistency and reliability analysis. His user-tailored suggestions on these points contribute to the realization of research validity and reliability. With regard to statistical inference, the author addresses cluster analysis, meta-analysis, and structural equation modelling, which are, for instance, not discussed in Hatch and Farhady (1982). This indicates that new energy is being invested in quantitative data analysis. Chapter 10 specifies the main principles of qualitative data analysis: it presents the strategies used in content analysis, gives an impressive overview of grounded theory, digresses into computer-aided qualitative data analysis, and analyzes its pros and cons. Implanted in the reader is the idea that theory formation through systematically tiered or iterative data analysis is a must in qualitative data study. Rigorous thinking is evidenced in Chapter 11, in which the author delineates the techniques used to analyze mixed-method data. A telling example is his citing of factor analysis, cluster analysis, and the template method of coding as evidence that nuance must be added to the one-sided belief that mixed-method analysis proceeds independently in its quantitative and qualitative phases and that 'mixing' occurs only in the final stage of interpretation; this widely held belief is only partially true (p. 268).

Part IV deals with the reporting of research results and highlights facets of variation in academic writing. The author's experience-based account, with detailed recommendations for the three kinds of research, will benefit student researchers who seek membership of academic discourse communities. Chapter 12 summarizes the functions of academic writing and provides an exhaustive list of suggestions for writing up the various sections of a quantitative research report; Chapter 13 does the same for qualitative and mixed-method reports. Novice researchers from non-English-speaking countries will find these two chapters extremely practical. Part V concludes the main body of the volume by pointing out some principles for choosing appropriate research methods. In a two-page afterword the author reflects on his journey of completing this publication.

This volume succeeds in making research methods in applied linguistics accessible to research students in at least three respects: (i) the principles, caveats, and procedures provided throughout guide students in their research, facilitating their progress from 'legitimate peripheral participation' towards the 'core' (Lave and Wenger 1991). (ii) The cogent reasoning for the potential of mixed methods builds heavily on intertextual dialogue: the voices of other researchers are engaged with, assumed, questioned, rejected, etc. in various ways, and readers are encouraged to explore the differences among these voices as part of constructing their own voice in their research practice. (iii) The author approaches the question of research methodology in applied linguistics by taking data as a starting point. This is commendable in a field which lacks consensus on its domains and limits (Berns and Matsuda 2008: 394). The complex, dynamic nature of contemporary applied linguistics might account for the lack of a definition of the field in this volume. However, while being guided through their own research projects, readers will detect both
profundity and value in Dörnyei's argument in favour of the application of mixed methods.

Reviewed by Hong Zhong and Huhua Ouyang
Guangdong University of Foreign Studies, China
E-mail: [email protected]; [email protected]
doi:10.1093/applin/amq023
Advance Access published on 29 June 2010

REFERENCES

Berns, M. and K. Matsuda. 2008. 'Applied linguistics: overview and history' in A. Anderson et al. (eds): Encyclopedia of Language & Linguistics. Vol. 1. 2nd edn. Shanghai Foreign Language Education Press.
Hatch, E. and H. Farhady. 1982. Research Design and Statistics for Applied Linguistics. Newbury House.
Lave, J. and E. Wenger. 1991. Situated Learning: Legitimate Peripheral Participation. Cambridge University Press.

V. Samuda and M. Bygate: TASKS IN SECOND LANGUAGE LEARNING. Palgrave Macmillan, 2008.

During the past few decades, there has been a steady increase in the number of Second Language Acquisition (SLA) studies in which the use of tasks is a central theme. Many of these studies try to reveal clear-cut and direct relationships between task features (e.g. complexity) and specific aspects of language learning. In order to test specific hypotheses, the conditions of task implementation are kept under control, so as to prevent all kinds of disturbing variables from intervening between task features and the dependent variable. 'Task' is treated as a fixed variable: all learners carry out one and the same task, and the assumption is that it has a fixed effect on language learning. In authentic classrooms, however, tasks are far less under the control of teachers and learners. Tasks should therefore not be perceived as fixed entities, but rather as highly flexible materials that take on different guises as they pass through the minds and mouths of their users. A central question in task-based research is what this entails for the relationship between tasks (in the classroom) and language learning. In other words, to what extent do the many variables that are intrinsic to natural interaction in real classrooms have an impact on a task's potential to stimulate language learning? Samuda and Bygate's book tackles these topics theoretically and through research. The research selected for discussion comes from both perspectives (experimental, controlled designs versus natural, 'real-life' contexts), with a strong focus on classroom-based studies of pedagogic tasks. Although the book focuses on tasks in classrooms, the intended audience is researchers rather than teachers or practitioners.

The introductory chapters describe theoretical and conceptual aspects of task-based language teaching (TBLT) from a historical viewpoint. The fact that tasks require holistic language use is something that educational theorists like Dewey and Freinet acknowledged a long time ago. The major challenge in education, and a central issue in this book, is how to find a balance between focusing on certain aspects of language to enhance learning and not losing sight of the holistic qualities of 'normal' language use. Here, the authors could have elaborated more on the fact that, especially in Freinet pedagogy and contemporary experience-based education, finding this balance is a particular pressure point. To illustrate what tasks are, and the forms they can take in classrooms, the authors use, for example, the 'things in pockets' task. Defining what tasks are does not seem to be simple and straightforward, and more attention could have been paid to the distinction between a task-as-workplan and a task-in-process (Breen 1987). The way in which the authors compare different definitions and describe critical task features is interesting. They arrive at their own definition: 'A task is a holistic activity which engages language use in order to achieve some non-linguistic outcome while meeting a linguistic challenge, with the overall aim of promoting language learning, through process or product or both' (p. 69). This definition captures the essence of tasks, but it does not tell us much about the (social) learning environment in which learners are confronted with particular tasks, an aspect which remains neglected throughout the book. Tasks can only be powerful if learners face them in a positive and safe language learning environment and receive fine-tuned interactional support (Verhelst 2006).

In the next chapters, the authors deal with real classroom settings by expanding on tasks in pedagogical contexts. They do so first of all from a research perspective, by setting a particular agenda. Their overview of different approaches to the study of tasks is indicative of different paradigms: systemic versus process-oriented research, group versus case studies, and quantitative versus qualitative studies. Samuda and Bygate evaluate the paradigms without favouring one in particular. They try to reconcile them, and the nuance this results in is certainly a strength throughout this book. A 'balance sheet' identifies relevant studies of pedagogic tasks, leading to the conclusion that there is too little research and that this research tends to be undertaken in a rather unsystematic way.

The second part of the book is devoted to the interaction between research and practice. Empirical studies are explored critically and described in terms of their theoretical and practical implications. The eight studies selected rely on different methodologies in distinct paradigms. The authors are right to advocate studies that are contextualized, that deal with the actual use of tasks (by teachers and learners in authentic settings) and that also involve classroom processes (not only outcomes). Several examples are given of projects implementing TBLT. One of them is the project in Flanders, in which the Leuven Centre for Language and Education developed and implemented a task-based programme in regular and adult education (Van den Branden 2006). Although the authors state that the Flemish project is extremely well documented, they also highlight the fact that very little is known about its impact and effects on language learning. This lack of effect studies of TBLT is a widespread point of criticism. Norris (2009) points out that effect studies should be embedded in programme evaluation and be related to the situated realities of task-based teaching and learning. At the same time, as the authors mention, there are not many fully developed TBLT programmes to study.

Samuda and Bygate make a distinction between TBLT and task-supported language teaching (TSLT). In TSLT, the use of tasks is only one element in instruction, whereas in TBLT tasks are the only pedagogic activities. The authors describe some problems with TSLT and the risk of devaluing tasks. An existing danger for TBLT is indeed the fuzziness and creative use of the concept in daily practice (cf. Van den Branden and Verhelst 2006). In the 'real' world of language education, TBLT today comes in many shapes and forms. 'Task' serves as a basic unit which guides the identification and selection of goals, syllabus design, classroom methodology and language assessment, and it will be defined by practitioners in numerous ways, ranging from the things that people do in real life to focused grammar exercises designed to automate the learner's knowledge of particular isolated rules. Samuda and Bygate make clear that tasks should be described as activities that people engage in to achieve certain (real-life) objectives and which necessitate the meaningful use of language. The authors further explore the strength of tasks as pedagogic tools.

In the last chapters, Samuda and Bygate provide some research directions and list possible resources. Potential directions for future research are, for example, the relation between task design, use and grammar, the interactive processes at work while tasks are carried out, the need for differentiated teaching strategies for different language aspects, and the views of both teachers and learners on tasks. Although the research agenda is already very broad, some topics are not listed: future research should also deal with TBLT and young children, the impact of social relations on TBLT, the functional use of the home language in task-based second language teaching, and so on. The list of resources is very interesting, because it brings together relevant books and journals, expert organizations, databases and materials, all of them interesting enough to be put on a TBLT website (which is under construction). Overall, this book is inspiring material for anyone in the field of language teaching. The richness of well-chosen examples and the clarity and balance in the description of the benefits and pressure points of TBLT make this book a worthwhile publication.

Reviewed by Machteld Verhelst
The Catholic University of Leuven, Belgium
E-mail: [email protected]
doi:10.1093/applin/amq020
Advance Access published on 29 June 2010


REFERENCES

Breen, M. 1987. 'Learner contributions to task design' in C. Candlin and D. Murphy (eds): Language Learning Tasks. Prentice Hall, pp. 23–46.
Norris, J. M. 2009. 'Understanding and improving language education through program evaluation: Introduction to the special issue,' Language Teaching Research 13/1: 7–13.
Van den Branden, K. 2006. 'Training teachers: Task-based as well?' in K. Van den Branden (ed.): Task-based Language Education: From Theory to Practice. Cambridge University Press, pp. 217–48.
Van den Branden, K. and M. Verhelst. 2006. 'Task-based language education: forms and functions,' ITL International Journal of Applied Linguistics 152: 1–6.
Verhelst, M. 2006. 'A box full of feelings: Promoting infants' second language acquisition all day long' in K. Van den Branden (ed.): Task-based Language Education: From Theory to Practice. Cambridge University Press, pp. 197–216.

H. Spencer-Oatey and P. Franklin: INTERCULTURAL INTERACTION: A MULTIDISCIPLINARY APPROACH TO INTERCULTURAL COMMUNICATION. Palgrave Macmillan, 2009.

Using examples of intercultural interaction from disciplines such as anthropology, communication, psychology, marketing, management, and applied linguistics, the authors of this book have managed to present an exciting exploration of the theme of 'becoming intercultural' for both academic and non-academic readers. The main purpose of the book is to develop a multidisciplinary approach which brings insights from applied linguistics, pragmatics, and discourse analysis to the analysis of intercultural behaviours in interactions. However, the book comes with a distinct focus on the 'intercultural' rather than on 'interaction'. It consists of four parts: Conceptualizing Intercultural Interaction (Chapters 2–7), Promoting Competence in Intercultural Interaction (Chapters 8 and 9), Researching Intercultural Interaction (Chapters 10 and 11), and Resources (Chapter 11).

The first part, which begins with a multidisciplinary approach to the notions of culture and intercultural interaction competence, explores the role of (mis)understandings and rapport management in effective intercultural interaction and culminates in a discussion of two outcomes of cultural difference: the encounter with an unfamiliar culture and impression management in intercultural interactions. In Chapter 2, the authors review different frameworks and ways of identifying and comparing 'culture' in different societies. A social construction perspective is invoked only as a justification for the variability of cultures, constructed by sets of regularities in different contexts. Without these regularities, social construction itself cannot explain the emergence of culture in societies and social groups. In fact, the authors would not agree with Blommaert that 'culture . . . is always situational' (Blommaert 1998: 37).
However, social constructivism, as a way of understanding the world, can be used to explain how cultures, and even regularities, are socially constructed. Therefore, the authors' attempt at integrating social construction ('explaining variability') with a positivist approach ('observing regularity') into a single framework ends up downgrading the former approach to a justification for the 'variability' in cultures, so that it is no longer central to understanding the construction of culture itself and its components in different contexts. 'Communication' rather than 'interculturality' provides the main orientation for the remaining chapters in this part of the book. Chapter 3 provides a multidisciplinary approach to the nature of 'competence'. The authors criticize current work for its conceptual rather than research orientation and try to compensate for this lack by paying attention to what applied linguistic research can offer to a description of the component competencies for effective intercultural interaction. Chapter 4 describes the factors which facilitate the effective use of language, the achievement of 'understanding' in intercultural interactions, and the role of culture in meaning-making processes. All are illustrated using a single extract from an oral interaction between a Korean tutor and an American undergraduate student. It would appear that the authors in this chapter have treated meaning transfer as the main purpose of intercultural communication, at the expense of 'interactional goals'. Interactional goals do feature in the next chapter, where they are discussed as part of awareness, rather than as one of the main goals that people orient to when facing communication problems. However, the authors argue that in real-life situations, in addition to the wide range of competencies needed to manage interpersonal rapport, contextual and individual variability should be taken into consideration when studying rapport management in intercultural communication. In the authors' opinion, contextual and individual variation may impact negatively on participants' perception of their own face and that of others, and this may end up with people misjudging each other and being unfairly disadvantaged by stereotypes, prejudice, conscious discrimination, and deliberate domination. However, the authors have not paid much attention to the positive effects of contextual and individual variation in interpersonal interaction; for example, variation can become an interesting topic in its own right and in this way help keep the interaction going. Again, the question can be raised whether the various issues could not have been dealt with more efficiently under the heading of 'interactional goals', rather than by moving to and fro between 'intercultural' and 'communication' to cover the different issues.

The second part of the book has a practice-oriented focus. The authors discuss the practical steps needed to promote the competencies people need to adapt to a new culture. Chapter 8 introduces assessment instruments for measuring culture-related value orientations and intercultural competence in different settings such as schools, universities, and organizations. Although at the end of this chapter the authors criticize current assessment instruments for their lack of validity and reliability in assessing a learner's progress in intercultural interaction competence in educational settings, they support their usefulness in measuring educational needs. It is suggested that teachers and assessors use observable indicators of competence instead, but it is not made clear how teachers and educators can develop such observable quality indicators. In Chapter 9, the authors describe the goals and methods used in intercultural development and training programmes designed for two important contexts: professional contexts and education. The authors propose that, because students in the school context are available for a longer period than adults in professional contexts, the development of intercultural competencies will be easier to accomplish in schools. However, I think the lack of time in the latter context might be compensated for by adults' more intensive experience of the intercultural context, which probably facilitates the development of intercultural interaction competencies.

The third part of this book adopts a research-oriented approach to intercultural interaction and attends to key research topics, associated sample studies, and the various steps needed to carry out a research project, including the cultural considerations that need to be taken into account. The authors rely on sample studies from different disciplines to discuss the themes and issues being investigated (the methods which prevail are those of discourse analysis and the use of questionnaires). The last part of the book lists the main resources for researching intercultural interaction, and this will be welcomed by scholars who research intercultural interaction. There are separate sections for books, journals, conference articles, internet resources, assessment instruments, and materials for developing intercultural competence.

In conclusion, this book provides a general review of the concepts, issues, and themes relevant to understanding theory, research, and practice in the field of intercultural interaction. New concepts are defined and explained clearly first, and this is followed by a sample of relevant studies. The book reads very coherently, with due attention paid to the transitions across chapters. However, intercultural communication is a multidimensional field of great complexity, and the authors' ambition to cover as much territory as possible sometimes results in a lack of depth in tackling the complexities of the phenomena addressed.

Reviewed by Alireza Jamshidnejad
Canterbury Christ Church University, UK
E-mail: [email protected]
doi:10.1093/applin/amq019
Advance Access published on 29 June 2010

REFERENCE

Blommaert, J. 1998. Different Approaches to Intercultural Communication: A Critical Survey. Plenary lecture, IPMI, Bremen.


NOTES ON CONTRIBUTORS

Richard Badger is a senior lecturer in the School of Education at the University of Leeds, where he co-ordinates the MA TESOL. His research interests are in the areas of academic literacies and TESOL methodology, particularly the teaching of listening and writing. Address for correspondence: School of Education, University of Leeds, Leeds LS2 9JT, UK.

Megumi Hamada is an Assistant Professor in the Applied Linguistics/TESOL program in the English Department at Ball State University. She teaches TESOL and SLA courses. Her primary research includes psycholinguistics of L2 reading and word learning. Alireza Jamshidnejad holds an MA in Applied Linguistics for Language Teaching from the University of Southampton and is currently a PhD researcher at Canterbury Christ Church University, Kent, UK. His main research interests are second/foreign language (L2) learning and communication, problems of orality and language strategies, interactional discourse analysis, pragmatics, sociolinguistics, and intercultural communication. Address for correspondence: Department of English and Language Studies, Canterbury Christ Church University, North Holmes Road, Canterbury, Kent CT1 1QU, UK.

Keiko Koda is a Professor of Second Language Acquisition and Japanese in the Department of Modern Languages at Carnegie Mellon University. Her major research interests include second language reading, biliteracy development and psycholinguistics. Her publications include Insights into Second Language Reading (Cambridge University Press, 2005), Reading and Language Learning (Blackwell, 2007) and Learning to Read across Languages (Routledge, 2008).

Malcolm MacDonald is an Associate Professor at the Centre for Applied Linguistics in the University of Warwick. He is engaged in three areas of research: intercultural communication, discourse analysis, and English language teaching. Address for correspondence: Centre for Applied Linguistics, Room 1.79, University of Warwick, Coventry CV4 7AL, UK.

Nick C. Ellis is a Research Scientist in the English Language Institute, Professor of Psychology, and Professor of Linguistics at the University of Michigan. His research interests include language acquisition, cognition, corpus linguistics, cognitive linguistics, psycholinguistics, and emergentism. He is the author of more than 150 scientific papers and chapters and has edited books on Implicit and Explicit Learning of Languages (Academic Press, 1994), Handbook of Spelling: Theory, Process and Intervention (John Wiley, 1994, with Gordon Brown), and Handbook of Cognitive Linguistics and Second Language Acquisition (Routledge, 2008, with Peter Robinson). He served as the editor of Language Learning from 1998 to 2002 and is currently the General Editor.

Shannon Sauro is Assistant Professor of Applied Linguistics in the Department of Bicultural–Bilingual Studies at the University of Texas at San Antonio. Her research explores second language acquisition processes within the context of computer-mediated communication.

Benjamin Shaer is a practising lawyer and an adjunct research professor in the Department of Law at Carleton University in Ottawa, where his research involves applying the tools of formal linguistic analysis to the study of legal texts. Prior to embarking on a legal career, he obtained an MEd in TESL and a PhD in linguistics from McGill University and worked as a linguist at universities and research institutes in Canada, France, and Germany, most recently at the Centre for General Linguistics in Berlin. His most recent publication is Dislocated Elements in Discourse: Syntactic, Semantic, and Pragmatic Perspectives (Routledge, 2009), co-edited with Philippa Cook, Werner Frey, and Claudia Maienborn. Address for correspondence: Department of Law, Carleton University, 1125 Colonel By Drive, Ottawa, Ontario K1S 5B6, Canada.

Rita Simpson-Vlach has been a lecturer in the Department of Linguistics and Language Development at San Jose State University, and a researcher in the School of Education at Stanford University. She was a project director of the Michigan Corpus of Academic Spoken English (MICASE) from its inception until 2006. Her research interests lie mainly in the areas of corpus linguistics, spoken discourse analysis, and English for Academic Purposes. She co-authored the MICASE Handbook (University of Michigan Press, 2006, with Sheryl Leicher) as well as several articles on idioms and formulaic expressions in MICASE.


Huhua Ouyang, PhD, is presently Director of the English Education Research Center and Professor of English in the Faculty of English Language and Culture, Guangdong University of Foreign Studies. He also serves on China's national standing committees for teacher education and development and for the teaching of writing and research. He has published extensively on anthropology and education, Chinese socio-psychology, contrastive rhetoric, and teacher education in prestigious journals such as Anthropology & Education Quarterly, in addition to two books published by Peking University Press. He has given guest lectures at over 30 universities and keynote speeches at a dozen international conferences. Address for correspondence: Faculty of English Language and Culture, Guangdong University of Foreign Studies, Guangzhou, 510420, P. R. China.



Bryan Smith is an Assistant Professor of Linguistics in the Department of English at Arizona State University. His main research interests focus on the intersection of SLA theory and CALL. He has published his research in journals such as CALICO Journal, Computer Assisted Language Learning, Language Learning & Technology, The Modern Language Journal, Studies in Second Language Acquisition, System, and TESOL Quarterly among others.

Machteld Verhelst is on the directory board of the Centre for Language and Education at the Catholic University of Leuven. She conducted her PhD research on the acquisition of Dutch as a second language by young infants in Brussels. She has published a wide range of articles and syllabuses with regard to task-based language education. She co-edited a special issue on TBLT for The International Journal of Applied Linguistics and the book Tasks in Action, published by Cambridge Scholars Publishing. Her main research interests are early second language acquisition (by young children), multilingualism and task-based language teaching. Address for correspondence: Centrum voor Taal en Onderwijs, KUL, Blijde-Inkomststraat 7 - bus 03319, BE-3000 Leuven, Belgium. Marjolijn Verspoor (PhD 1991, University of Leiden, The Netherlands) is an Associate Professor and Chair of the Applied Linguistics Department at the University of Groningen. Her research is focused on second language development from a usage-based, dynamic systems perspective and on second language instruction based on cognitive linguistic insights.

Hong Zhong is an Associate Professor of English in the Faculty of English Language and Culture at Guangdong University of Foreign Studies. She does part-time PhD research in sociolinguistics in the National Key Research Center for Linguistics and Applied Linguistics. She has published in journals such as Language in Society, Discourse Studies, and Discourse & Communication. She participates in the following four research projects: (i) Contemporary Social Changes in China and its Discursive Construction funded by China Ministry of Education, (ii) Forging Learners’ Critical Thinking Awareness in Foreign Language Teaching, (iii) Formative Evaluation in Higher Education: Autonomous Learning, (iv) Speech Act Grammar: A Study Based on American Presidential Elections and their Inauguration Speeches. Address for correspondence: Faculty of English Language and Culture, Guangdong University of Foreign Studies, Guangzhou, 510420, P. R. China.


Marianne Spoelman (1985) obtained a Master's degree in General Linguistics and a Master's degree in Applied Linguistics from the University of Groningen, the Netherlands, in 2008. She is currently working as a PhD student in the Department of Finnish as a Second and Foreign Language at the University of Oulu, Finland.