Language Processing in Advanced Learners of English (Bilingual Processing and Acquisition) 902720540X, 9789027205407

Table of contents:
Language Processing in Advanced Learners of English
Editorial page
Title page
Copyright page
Dedication page
Table of contents
List of figures
List of tables
Acknowledgements
Part I. Empirical analysis of language production and language processing: Aspects of corpus linguistics and experimental psycholinguistics
Chapter 1. Introduction and overview
1.1 Introduction
1.2 Aims and scope of this study
1.3 Structure
Chapter 2. Aspects of corpus linguistics
2.1 Introduction
2.2 Corpus linguistics as an emergent discipline
2.3 Standard native-speaker corpora
2.3.1 First generation corpora and corpus families
2.3.2 Corpora in lexicography: The COBUILD corpus and the Bank of English
2.3.3 The British National Corpus (BNC)
2.4 Corpora in the 21st century: Web as corpus and specialized corpora
2.4.1 Corpora in the 21st century: Introduction
2.4.2 Web as corpus and web-derived corpora
2.4.3 Learner corpora as an example of specialized corpora
2.5 Aspects of corpus linguistics: A brief summary
Chapter 3. Aspects of experimental data in psycholinguistics
3.1 Introduction
3.2 Grammaticality judgment in empirical linguistics
3.3 Lexical decision tasks in empirical linguistics
3.4 Eye-tracking studies in empirical linguistics
3.5 Neuroimaging in empirical linguistics: ERP and fMRI data
3.6 Data in experimental linguistics and psycholinguistics: A brief summary
Part II. Language processing of intermediate and advanced learners of English: A multi-method approach
Chapter 4. Interference collocations of advanced German learners of English
4.1 Introduction
4.2 Collocation and collocability
4.3 Quantitative and phraseological approaches to collocation
4.4 Collocation in Contrastive Interlanguage Analysis (CIA)
Chapter 5. Measuring eye movements for the study of language processing and comprehension
5.1 Introduction
5.2 Early comprehension measures
5.3 Late comprehension measures
5.4 Eye-tracking studies for reading comprehension of non-native speakers
5.5 Measuring eye-movements for the study of language processing and comprehension: A brief summary
Chapter 6. Processing semantic mismatch and unexpected lexical items
6.1 Introduction
6.2 Interference collocations between semantic mismatch and expectation
6.3 EEG/ERP studies with non-native speaker subjects
6.4 Processing semantic mismatch and unexpected lexical items: A brief summary
Chapter 7. Methodology
7.1 Introduction
7.2 Corpus based creation of input stimuli
7.2.1 Background
7.2.2 Learner corpora: ICLE and LINDSEI
7.2.3 Identification of interference collocations
7.2.4 Significant native speaker collocations and fully incongruent collocations
7.2.5 Creating input stimuli from collocation data
7.3 Participants
7.3.1 General information on participants
7.3.2 Socio-biographic metadata of participants
7.4 Eye-tracking and EEG procedures
7.4.1 Introduction
7.4.2 Eye-tracking and EEG recording: Hardware and software
7.4.3 Eye-tracking: Measured variables
Chapter 8. Results 1: Evidence from eye-tracking
8.1 Introduction
8.2 Eye-tracking: Area of Interest 1 – Verbs and adjectives
8.2.1 AoI1: Introduction
8.2.2 AoI1: First Fixation Duration
8.2.3 AoI1: Fixation Time (ms)
8.2.4 AoI1: Fixation Count
8.2.5 AoI1: Revisits
8.2.6 Eye-tracking: Area of Interest 1 – Summary
8.3 Eye-tracking: Area of Interest 2 – Nouns
8.3.1 AoI2: Introduction
8.3.2 AoI2: First Fixation Duration
8.3.3 AoI2: Fixation Time (ms)
8.3.4 AoI2: Fixation Count
8.3.5 AoI2: Revisits
8.3.6 Eye-tracking: Area of Interest 2 – summary
8.4 Eye-tracking: Joint analysis of AoI1 and AoI2
8.4.1 Introduction
8.4.2 AoI1+AoI2: First Fixation Duration
8.4.3 AoI1+AoI2: Fixation Time (absolute)
8.4.4 AoI1+AoI2: Fixation Count
8.4.5 AoI1+AoI2: Revisits
8.4.6 AoI1+AoI2: MANOVA for a combination of dependent variables
8.4.7 Eye-tracking: Joint analysis of AoI1 and AoI2 – summary
Chapter 9. Results 2: Evidence from EEG/ERP
9.1 Introduction
9.2 Recording at sentence onset
9.2.1 Introduction
9.2.2 Sentence onset: Grand averages / student group (N1, P2, N300, N400)
9.2.3 Sentence onset: Word class / student group (N1, P2, N300, N400)
9.2.4 Sentence onset: Sentence condition / student group (N1, P2, N300, N400)
9.2.5 Recording at sentence onset – summary
9.3 Recording time-locked to AoI1 (adjectives/verbs)
9.3.1 Introduction
9.3.2 AoI1: Grand averages / student group (N1, P2, N300/N400)
9.3.3 AoI1: Word class / student group (N1, P2, N300/N400)
9.3.4 AoI1: Condition / student group (N1, P2, N300/N400)
9.3.5 Recording time-locked to AoI1 (adjectives / verbs) – summary
9.4 Recording time-locked to AoI2 (nouns)
9.4.1 AoI2: Introduction
9.4.2 AoI2: Grand averages / student group (N1, P2, N300/N400)
9.4.3 AoI2: Word class / student group (N1, P2, N300/N400)
9.4.4 AoI2: Condition / student group (N1, P2, N300/N400)
9.4.5 Recording time-locked to AoI2 (nouns) – summary
Chapter 10. Evaluation and discussion
10.1 Introduction
10.2 Evaluation of the analysis
10.2.1 Introduction
10.2.2 Corpus-based language inputs
10.2.3 Participant groups
10.2.4 Experiment design: Combined eye-tracking and EEG-recording
10.2.5 Data interpretation and statistics
10.3 Discussion
10.3.1 Introduction
10.3.2 Corpus linguistic and experimental data
10.3.3 Discussion of select results
10.3.4 Discussion – summary
Chapter 11. Conclusions and outlook
11.1 Summary and conclusions
11.2 Outlook
References
Index


Bilingual Processing and Acquisition 9

Language Processing in Advanced Learners of English
A multi-method approach to collocation based on corpus linguistic and experimental data

Marco Schilk

John Benjamins Publishing Company

Language Processing in Advanced Learners of English

Bilingual Processing and Acquisition (BPA) issn 2352-0531

Psycholinguistic and neurocognitive approaches to bilingualism/multilingualism and language acquisition continue to gain momentum and uncover valuable findings explaining how multiple languages are represented in and processed by the human mind. With these intensified scholarly efforts come thought-provoking inquiries, pioneering findings, and new research directions. The Bilingual Processing and Acquisition book series seeks to provide a unified home, unlike any other, for this enterprise by providing a single forum and home for the highest-quality monographs and collective volumes related to language processing issues among multilinguals and learners of non-native languages. These volumes are authoritative works in their areas and should not only interest researchers and scholars investigating psycholinguistic and neurocognitive approaches to bilingualism/multilingualism and language acquisition but also appeal to professional practitioners and advanced undergraduate and graduate students. For an overview of all books published in this series, please see benjamins.com/catalog/bpa

Executive Editor
John W. Schwieter, Wilfrid Laurier University

Associate Editor
Aline Ferreira, University of California, Santa Barbara

Editorial Advisory Board
Jeanette Altarriba, University at Albany, State University of New York
Panos Athanasopoulos, Lancaster University
Laura Bosch, Universitat de Barcelona
Marc Brysbaert, Ghent University
Kees de Bot, University of Groningen
Yanping Dong, Zhejiang University
Mira Goral, Lehman College, The City University of New York
Roberto R. Heredia, Texas A&M International University
Arturo E. Hernandez, University of Houston
Ludmila Isurin, Ohio State University
Iring Koch, RWTH Aachen University
Gerrit Jan Kootstra, Radboud University Nijmegen & Windesheim University of Applied Sciences
Gary Libben, Brock University
Li Wei, UCL IOE
Silvina Montrul, University of Illinois at Urbana-Champaign
Kara Morgan-Short, University of Illinois at Chicago
Greg Poarch, University of Münster
Leah Roberts, University of York
Norman Segalowitz, Concordia University
Antonella Sorace, University of Edinburgh
Janet G. van Hell, Pennsylvania State University
Walter J.B. van Heuven, University of Nottingham

Volume 9 Language Processing in Advanced Learners of English. A multi-method approach to collocation based on corpus linguistic and experimental data by Marco Schilk

Language Processing in Advanced Learners of English
A multi-method approach to collocation based on corpus linguistic and experimental data

Marco Schilk
University of Hildesheim

John Benjamins Publishing Company Amsterdam / Philadelphia


The paper used in this publication meets the minimum requirements of the American National Standard for Information Sciences – Permanence of Paper for Printed Library Materials, ansi z39.48-1984.

doi 10.1075/bpa.9
Cataloging-in-Publication Data available from Library of Congress:
lccn 2020002549 (print) / 2020002550 (e-book)
isbn 978 90 272 0540 7 (Hb)
isbn 978 90 272 6134 2 (e-book)

© 2020 – John Benjamins B.V. No part of this book may be reproduced in any form, by print, photoprint, microfilm, or any other means, without written permission from the publisher. John Benjamins Publishing Company · https://benjamins.com

To Manuela


List of figures

Figure 3.1   A schematic diagram of the major processes in reading comprehension (Just & Carpenter 1980: 331)
Figure 3.2   Idealized waveform of the computer-averaged auditory event-related potential (ERP) to a brief sound (Hillyard & Kutas 1983: 35)
Figure 3.3   N400 effect at moderate and strong semantic mismatch (Kutas & Hillyard 1980: 203)
Figure 3.4   N400, ELAN and P600 components in the ERP (Friederici 2002: 81)
Figure 6.1   Brain potentials in relation to contextual constraint, Cloze probability and semantic relatedness (Kutas & Hillyard 1984: 152)
Figure 6.2   Differences in the processing of literal and figurative collocations (Molinaro & Carreiras 2010: 183)
Figure 6.3   The Revised Hierarchical Model (RHM) of Bilingual Lexical Activation (Kroll & Stewart 1994 as adapted in Sunderman & Kroll 2006: 392)
Figure 6.4   Subtraction N400 of native speakers, intermediate and advanced Japanese learners of English (Ojima et al. 2005: 1218)
Figure 8.1   AoI1: First Fixation Duration by condition (all)
Figure 8.2   AoI1: First Fixation Duration by Word Class
Figure 8.3   AoI1: First Fixation Duration by Degree Course and Word Class
Figure 8.4   AoI1: First Fixation Duration by Sentence Condition and Word Class
Figure 8.5   AoI1: Fixation Time (ms) by Condition (median)
Figure 8.6   AoI1: Fixation Time (ms) by Word Class (mean)
Figure 8.7   Revisits by Word Class and Condition: Word Class (all) / Mean values
Figure 9.1   Processing in word reading (Sereno & Rayner 2003: 490)
Figure 9.2   EEG-Reading at sentence onset – Grand Average/Group (Fz, Cz, Pz)
Figure 9.3   EEG-Reading at sentence onset – Word Class/Group (Fz, Cz, Pz)
Figure 9.4   EEG-Reading at sentence onset – Condition/Group (Fz, Cz, Pz)
Figure 9.5   EEG-Reading at sentence onset – Degree Course/Condition (Fz, Cz, Pz)
Figure 9.6   EEG-Reading at AoI1 – Grand average/Group (Fz, Cz, Pz)
Figure 9.7   EEG-Reading at AoI1 – Word Class/Group (Fz, Cz, Pz)
Figure 9.8   EEG-Reading at AoI1 – Condition/Deg_Course (Fz, Cz, Pz)
Figure 9.9   EEG-Reading at AoI1 – Deg_Course/Condition (Fz, Cz, Pz)
Figure 9.10  EEG-Reading at AoI2 – Grand average/Group (Fz, Cz, Pz)
Figure 9.11  EEG-Reading at AoI2 – Degree course: Word Class (Fz, Cz, Pz)
Figure 9.12  EEG-Reading at AoI2 – Condition/Deg_Course (Fz, Cz, Pz)
Figure 9.13  EEG-Reading at AoI2 – Deg_Course/Condition (Fz, Cz, Pz)
Figure 10.1  EEG-Reading at sentence onset – Deg_Course/Condition (Fz)
Figure 10.2  EEG-Reading at AoI1 – Deg_Course/Condition (Fz)
Figure 10.3  EEG-Reading at AoI2 – Deg_Course/Condition (Cz)

List of tables

Table 2.1   The Brown family of corpora
Table 3.1   Data gathering techniques (Tummers et al. 2005: 232)
Table 4.1   Sub-categories of word-like combinations (Cowie 1998: 7)
Table 4.2   Syntagmatic fixedness and semantic transparency as a double continuum
Table 5.1   Approximate mean fixation duration and saccade length for different tasks (Rayner 1998: 372)
Table 7.1   ICLE-Ger and Lindsei-Ger learner corpora
Table 7.2   Interference-based Verb + Noun collocations
Table 7.3   Interference-based Adjective + Noun collocations
Table 7.4   Significant native speaker V + N collocations (BNC / N = 98,313,429)
Table 7.5   AoI Statistics – Measured variables (analysed variables in boldface)
Table 8.1   First Fixation Duration by Condition and Degree Course (all)
Table 8.2   Fixation Time (ms) by Condition and Degree Course (all) / median values
Table 8.3   Fixation Time (ms) by Condition and Degree Course (all) / mean values
Table 8.4   AoI1: Fixation Count by Condition and Degree Course (all) / mean values
Table 8.5   AoI1: Revisits by Condition and Degree Course (all) / median values
Table 8.6   AoI1: Revisits by Condition: Word Class (all) / Mean values
Table 8.7   AoI2: Fixation Time by Condition: WordClass (desc. stat. / mean)
Table 8.8   AoI1:2: First Fixation Duration (MANOVA/median)
Table 8.9   AoI1:2: First Fixation Duration / Condition: Deg_Course (descriptive)
Table 8.10  AoI1:2: First Fixation Duration / Condition: Deg_Course (post-hoc t-test)
Table 8.11  AoI1:2: First Fixation Duration (MANOVA/mean)
Table 8.12  AoI1:2: First Fixation Duration / Deg_Course: WordClass (post-hoc test)
Table 8.13  AoI1:2: Fixation Time / ms (MANOVA/median)
Table 8.14  AoI1:2: Fixation Time / ms / Condition (descriptive statistics)
Table 8.15  AoI1:2: Fixation Count (MANOVA/median)
Table 8.16  AoI1:2: Fixation Count by Deg_Course (descriptive/median)
Table 8.17  AoI1:2: Fixation Count (MANOVA/mean)
Table 8.18  AoI1:2: Fixation Count by Deg_Course (descriptive/mean)
Table 8.19  AoI1:2: Revisits (MANOVA/median)
Table 8.20  AoI1:2: Revisits by Deg_Course (descriptive/median)
Table 8.21  AoI1:2: All dependent variables (MANOVA/median)
Table 8.22  AoI1:2: All variables / Condition & Condition:WordClass (post-hoc tests)
Table 8.23  AoI1:2: All variables by Condition (descriptive/median)
Table 8.24  AoI1:2: All variables by Word Class (descriptive/median)
Table 8.25  AoI1:2: All variables by Degree Course (descriptive/median)
Table 8.26  AoI1:2: All dependent variables (MANOVA/mean)
Table 9.1   P2 – Word Class/Deg_Course (Cz) – desc. stat.
Table 9.2   N300 – Word Class/Deg_Course (Fz) – desc. stat.
Table 9.3   N400 – Word Class/Deg_Course (Cz) – ANOVA
Table 9.4   N400 – Word Class/Deg_Course (Pz) – ANOVA
Table 9.5   N400 – Word Class: Deg_Course (Pz) – desc. stat.
Table 9.6   N1 – Condition/Deg_Course (Fz) – ANOVA
Table 9.7   N1 – Condition/Deg_Course (Fz) – post-hoc test
Table 9.8   N1 – Condition/Deg_Course (Fz) – desc. stat.
Table 9.9   P2 – Condition/Deg_Course (Cz) – ANOVA
Table 9.10  P2 – Condition/Deg_Course (Cz) – post-hoc test
Table 9.11  P2 – Condition/Deg_Course (Cz) – desc. stat.
Table 9.12  N300 – Condition/Deg_Course (Fz, Cz) – ANOVA
Table 9.13  N300 – Condition/Deg_Course (Fz, Cz) – post-hoc test
Table 9.14  N300 – Condition/Deg_Course (Fz) – desc. stat.
Table 9.15  N300 – Condition/Deg_Course (Cz) – desc. stat.
Table 9.16  N400 – Condition/Deg_Course (Cz) – ANOVA
Table 9.17  N400 – Condition/Deg_Course (Cz) – post-hoc test
Table 9.18  N400 – Condition/Deg_Course (Cz) – desc. stat.
Table 9.19  N400 – Condition/Deg_Course (Fz) – ANOVA
Table 9.20  N400 – Condition/Deg_Course (Fz) – post-hoc test
Table 9.21  N400 – Condition/Deg_Course (Fz) – desc. stat.
Table 9.22  AoI1: N1 – WordClass/Deg_Course (Fz) – ANOVA
Table 9.23  AoI1: N1 – WordClass/Deg_Course (Fz) – post-hoc tests
Table 9.24  AoI1: N1 – WordClass/Deg_Course (Fz) – desc. stat.
Table 9.25  AoI1: N300/400 – WordClass/Deg_Course (Fz) – ANOVA
Table 9.26  AoI1: N300/400 – WordClass/Deg_Course (Fz) – post-hoc test
Table 9.27  AoI1: N300/400 – WordClass/Deg_Course (Fz) – desc. stat.
Table 9.28  AoI1: P2 – WordClass/Deg_Course (Pz) – ANOVA
Table 9.29  AoI1: N1 – Condition/Deg_Course (Fz) – ANOVA
Table 9.30  AoI1: N1 – Condition/Deg_Course (Fz) – post-hoc test
Table 9.31  AoI1: N1 – Condition/Deg_Course (Fz) – desc. stat.
Table 9.32  AoI1: N300/400 – Condition/Deg_Course (Fz) – ANOVA
Table 9.33  AoI1: N300/400 – Condition/Deg_Course (Fz) – post-hoc test
Table 9.34  AoI1: N300/400 – Condition/Deg_Course (Fz) – desc. stat.
Table 9.35  AoI1: P2 – Condition/Deg_Course (Fz) – ANOVA
Table 9.36  AoI1: P2 – Condition/Deg_Course (Fz) – post-hoc test
Table 9.37  AoI1: P2 – Condition/Deg_Course (Fz) – desc. stat.
Table 9.38  AoI2: N1 – WordClass/Deg_Course (Fz) – ANOVA
Table 9.39  AoI2: N1 – WordClass/Deg_Course (Fz) – post-hoc tests
Table 9.40  AoI2: N1 – WordClass/Deg_Course (Fz) – desc. stat.
Table 9.41  AoI2: N300/400 – WordClass/Deg_Course (Fz) – ANOVA
Table 9.42  AoI2: N300/400 – WordClass/Deg_Course (Fz) – post-hoc test
Table 9.43  AoI2: N300/400 – WordClass/Deg_Course (Fz) – desc. stat.
Table 9.44  AoI2: P2 – WordClass/Deg_Course (Fz) – ANOVA
Table 9.45  AoI2: P2 – WordClass/Deg_Course (Fz) – post-hoc test
Table 9.46  AoI2: P2 – WordClass/Deg_Course (Fz) – post-hoc test
Table 9.47  AoI2: N1 – Condition/Deg_Course (Fz) – ANOVA
Table 9.48  AoI2: N1 – Condition/Deg_Course (Fz) – post-hoc test
Table 9.49  AoI2: N1 – Condition/Deg_Course (Fz) – desc. stat.
Table 9.50  AoI2: N300/400 – Condition/Deg_Course (Fz, Cz) – ANOVA
Table 9.51  AoI2: N300/400 – Condition/Deg_Course (Fz, Cz) – post-hoc test
Table 9.52  AoI2: N300/400 – Condition/Deg_Course (Fz) – desc. stat.
Table 9.53  AoI2: N300/400 – Condition/Deg_Course (Cz) – desc. stat.
Table 9.54  AoI2: P2 – Condition/Deg_Course (Pz) – ANOVA
Table 9.55  AoI2: P2 – Condition/Deg_Course (Pz) – post-hoc test
Table 9.56  AoI2: P2 – Condition/Deg_Course (Pz) – desc. stat.

Acknowledgements

I wish to express my gratitude to the great number of people who assisted me in various selfless ways during the preparation and writing of this book. First and foremost, I would like to thank Friedrich Lenz for his support throughout all stages of this project. Without your advice, wisdom and mentorship this project would not have been possible. I would also like to thank Ulrich Heid, Rolf Kreyer and Kristin Kersten, who read earlier versions of the manuscript, gave me invaluable feedback and freely offered their help in ways too numerous to mention here. For comments, discussions and interdisciplinary advice I am deeply grateful to Christina Womser-Hacker, Elke Montanari, the late Anette Sabban and Klaus Schubert, as well as the members and delegates at various ICAME conferences. Special thanks go to Kristian Folta-Schoofs and the members and staff of his neurodidactic team for their immense support during the experimental stages of this project. The interdisciplinary approach this book takes was only possible due to your generous help, the insightful feedback from a psychologist’s perspective and your material contribution at considerable expense to your resources. In a similar vein I would like to thank Janet-Marie McLaughlin, who proofread the original manuscript; all remaining blunders and infelicities are naturally mine. Thanks go also to the participants in the study, my colleagues at the Department of English Language and Literature at the University of Hildesheim, the native speaker raters, the series editor of Bilingual Processing and Acquisition and two anonymous reviewers. Finally, I would like to thank my family: my parents Ulrike and Bernd-Rüdiger for their unfailing love and support and my daughter Mara Moësha for being the greatest daughter anyone could wish for. Last but not least, I wish to thank my beloved wife Manuela for her patience, love and support and for remaining a constant center of gravity throughout all the ups and downs in a scholar’s life. To her this book is dedicated.

Part I

Empirical analysis of language production and language processing: Aspects of corpus linguistics and experimental psycholinguistics

Chapter 1. Introduction and overview

1.1 Introduction

Empirical and usage-based descriptions of language representation and processing are cornerstones of many contemporary linguistic approaches. Driven by technological development, both have increased rapidly in number over the past four decades, using new and ever more refined methodologies. Two of the areas within linguistics most strongly influenced by these advances are corpus linguistics and experimental psycholinguistics. The first of these, corpus linguistics, can be defined by the concept of gathering large computer-readable collections of language that are then used for language description. This definition indicates that any development in computerized language processing technology has an enormous impact on the discipline. The second, experimental psycholinguistics, has profited greatly from the increasingly sophisticated possibilities offered by cognitive neuroscience. Corpus linguistics and experimental psycholinguistics often focus on very similar questions, as their initial focus is on the speakers of a language, who can then provide information on the system itself. However, they differ greatly in their perspectives, with corpus linguistics focusing on the language production of large numbers of speakers and experimental psycholinguistics placing far more emphasis on the language processing of small sets of informants in very controlled settings. While the similarity of the research questions should therefore encourage researchers to use both corpus linguistic and experimental data, the differences in perspective and methodology often impose great restrictions on complementary approaches that use both types of speaker data. For example, the irregularities and messiness of natural language data are accepted within corpus linguistics, but within the experimental branches these factors are seen as detrimental to rigorous experiment design. Interestingly, it seems that, while experimental linguists embrace the results of empirical corpus linguistics, corpus linguists tend to be much more conservative in accepting experimental methods that could potentially help to explain their descriptive findings. This has been shown, for example, by Gilquin and Gries (2009), who compared the use of combined corpus linguistic and experimental data from these two areas.


In their metastudy, which focused on the use of corpus data in experimental linguistics and experimental data in corpus linguistics, they state: [A] sample of recent studies […] shows (i) that psycholinguists regularly exploit the benefits of combining corpus and experimental data, whereas corpus linguists do so much more rarely and (ii) that psycholinguists and corpus linguists use corpora in different ways in terms of the dichotomy of exploratory/descriptive vs. hypothesis-testing as well as the corpus linguistic methods being used.  (Gilquin & Gries 2009: 1)

From a methodological perspective, the present study addresses the desideratum of a closer integration of corpus and experimental methods by using corpus-based observations as source data for experimental analysis. From a conceptual perspective, the research conducted is based on corpus-based observations of collocation and interference phenomena in learner language and second language acquisition. In the field of corpus linguistics, the study of collocation is one of the most researched areas and many explanations of this phenomenon draw on psycholinguistic concepts of lexical storage and retrieval. However, only comparatively few studies exist that actually test the assumed psycholinguistic facilitation of collocation empirically (see e.g. Ellis et al. 2009 and Schmitt et al. 2004). Currently, even less research exists on the use of collocation and collocation-like combinations by non-native speakers of English and the psycholinguistic background of corpus-based observations of L2 production by learners. The present research, therefore, has two main aims. Firstly, the methodology aims at integrating corpus-based observation with experimental methods within the psycholinguistic framework of cognitive neuroscience in order to match descriptions of language production with insights into language processing. Secondly, on the empirical level, it attempts to demonstrate how German learners, with English as their L2, process significant L1 collocations and collocation-like L1 transfers differently, depending on their level of L2 proficiency. In general terms, it uses corpus-based observations of significant English collocations and typical German transfer mistakes within the area of collocation as stimuli for an experimental setting. This setting uses two paradigms within the framework of cognitive neuroscience, that is, eye-tracking and electroencephalographic analysis, in order to investigate:



1.2 Aims and scope of this study

As mentioned in Section 1.1, collocation has been well researched within the field of corpus linguistics but is not fully understood at the level of language processing. Although corpora demonstrate that speakers use specific lexical combinations more frequently than semantically synonymous combinations, and that the former are preferred over the latter, claims of holistic storage and retrieval of these combinations (see e.g. Wray 2002 or Sinclair 1991) are rarely based on empirical evidence. Only recently have attempts been made to put those claims on a more solid psycholinguistic foundation (see e.g. Ellis et al. 2009 and Ellis & Frey 2009). While the lack of empirical evidence for conjectures about holistic storage and retrieval is still problematic for the description of native speaker language production and processing of collocations, some slight progress has been made towards bridging this conceptual gap. With regard to non-native speakers, however, even less research exists that focuses on the processing of collocation as well as the processing of L1 transfer. Corpus-based studies have shown that learners of foreign languages often use lexical combinations that would not be typical of native speakers (see e.g. Nesselhauf 2005). One reason for this is collocational transfer: a combination that collocates in the learners’ L1 is transferred directly to their L2, although native speakers of the learners’ L2 would use a different combination. In addition, other corpus-based studies have shown that L2 learners of English sometimes have difficulty understanding collocations or do not understand the additional component of meaning that is added by the collocation of the separate items (see e.g. Schmitt et al. 2004). The present research investigates collocation and collocational interference from the perspective of language processing by advanced learners of English at different stages of their language-learning career. It uses learner corpus data from the International Corpus of Learner English (ICLE) and the Louvain International Database of Spoken English Interlanguage (LINDSEI) to isolate typical collocational transfers by German L1 learners of English as an L2. Additionally, data from the British National Corpus (BNC) is used to isolate comparable significant English L1 collocations. These combinations, together with a group of non-collocating control items, are used within an experimental setting to account for processing differences between lower-level advanced learners and highly proficient advanced learners of English, roughly corresponding to the B2 and C1 levels of the Common European Framework of Reference for Languages. In terms of methodology, the study attempts to discover how to productively use corpus-based observation for experimental analysis and, conceptually, it attempts to provide more solid evidence on lexical storage and retrieval of L2 speakers with regard to collocation.
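To make the notion of a ‘significant collocation’ concrete, the sketch below computes two association measures commonly used in corpus linguistics to separate habitual combinations from chance co-occurrence. It is an illustrative aside rather than the extraction procedure of the present study (which is described in Chapter 7), and the counts are invented placeholders, not BNC figures.

import math

# Illustrative sketch only: pointwise mutual information (PMI) and the
# t-score for a candidate word pair, computed from raw corpus frequencies.
# The counts used below are invented placeholders, not data from this study.
def association_scores(f_pair, f_w1, f_w2, n):
    """Return (PMI, t-score) for a word pair in a corpus of n tokens."""
    expected = (f_w1 * f_w2) / n          # co-occurrences expected by chance
    pmi = math.log2(f_pair / expected)    # log ratio of observed to expected
    t_score = (f_pair - expected) / math.sqrt(f_pair)
    return pmi, t_score

# Hypothetical verb + noun pair in a corpus of roughly 98 million words.
pmi, t = association_scores(f_pair=120, f_w1=4500, f_w2=9000, n=98_000_000)
print(f"PMI = {pmi:.2f}, t-score = {t:.2f}")

Pairs whose observed frequency far exceeds the chance expectation score highly on both measures; combining such scores with frequency thresholds is one common way of operationalizing ‘significant’ collocation.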


1.3 Structure

The present work has a two-part structure. The first part, Chapters 2 and 3, focuses on general assumptions and methodologies within the paradigms of corpus linguistics and experimental psycholinguistics. Chapter 2 describes the area of corpus linguistics with regard to the history of the discipline, methodological considerations in the creation of representative standard corpora and a general discussion of the use of corpora for learner data research. Chapter 3 introduces different types of experimental methodologies used for the description of language processing. Chapter 3 is structured along a double continuum of the ease of application of the methodology used and the level of conscious awareness of the participants within different experiment types. Although not all methodologies discussed in Chapter 3 are used for the analysis in the second part of this discussion, this overview is essential in order to explain the choice of the procedure used for the present work. The second part starts with three theoretical chapters (Chapters 4–6) that draw on the discussion of the first part and apply it to the task at hand, that is, the selection of corpus-based collocations and interference collocations and the development of the experimental apparatus. Chapter 4 places emphasis on the concept of collocation and the collocational transfer phenomena displayed by L2 learners of English from the perspective of contrastive interlanguage analysis (CIA). Chapters 5 and 6 describe the experimental methods selected for the subsequent analysis, that is, the measurement of eye-movement and electroencephalographic data respectively. After these theoretical considerations, Chapter 7 focuses on the research methodology. This chapter includes an in-depth description of the non-native speaker participants, which provides information on different learner variables. In addition, the corpus-based creation of input stimuli for the experimental analysis and the setting of the eye-tracking/EEG experiment are discussed. Chapters 8 and 9 present the respective results of the eye-movement and electroencephalographic study. Chapter 10 further discusses these results and evaluates the methodology used for the present study with regard to the empirical findings. Chapter 11 presents a summary of the most important results and offers some perspectives for future research.

Chapter 2. Aspects of corpus linguistics

2.1 Introduction

A main objective of the present research is to investigate possibilities of combining language production data and language processing data. Within this framework, the area of language production data or, more precisely, of the results of language production is represented by the insights that large corpora of language and the descriptive apparatus provided by corpus linguistic methodology offer. The area of language processing, on the other hand, is represented by psycholinguistic and experimental methods that focus on the reactions of speakers to language data. The first of these areas (language production, language corpora and corpus linguistic methodology) is the focus of Chapter 2, while the second, experimental psycholinguistics, is treated in Chapter 3. Corpus linguistics, a relatively young discipline within linguistics, builds on the far older idea that, by studying the linguistic output of a large number of speakers, generalizations about the respective language as a whole become possible. While this approach was applied relatively early in some areas of linguistics, such as lexicography, it long represented a minority position in the area of grammatical description, particularly in early post-Saussurean linguistics. The reason for this lies in the different types of questions lexicographers and grammarians tend to ask. While lexicographers are generally interested in the typical meaning of lexical items, prompting questions such as who uses a particular lexical item in which context, grammarians are generally more interested in the possibility and probability of using specific structures within a language. It is therefore unsurprising that the focus on language usage within lexicography was a driving factor for corpus linguistic approaches, while traditional grammarians relied more heavily on the introspection of the competent native speaker. However, with the new possibilities of collecting increasingly large samples of data and processing them electronically, this focus has shifted somewhat, with typicality and probabilistic measures being regarded as increasingly important for grammatical judgements (see e.g. Bresnan 2007 or Bresnan & Hay 2008). Sinclair’s (1991) famous quotation that “[t]he language looks rather different when you look at a lot of it at once” (Sinclair 1991: 100) illustrates this shift in perspective.


This approach, however, is not without its downsides. In an interview from 2006, Chomsky, probably the scholar most critical of the corpus linguistic approach, points out that: My judgment, if you like, is that we learn more about language by following the standard method of the sciences. The standard method of the sciences is not to accumulate huge masses of unanalyzed data and to try to draw some generalization from them. The modern sciences, at least since Galileo, have been strikingly different. What they have sought to do was to construct refined experiments which ask, which try to answer specific questions that arise within a theoretical context as an approach to understanding the world. (Chomsky in Andor 2006: 97)

While this extreme position is debatable and in all likelihood at least partially intended as a polemic, it raises an important issue for corpus linguistics, namely the question of how far large amounts of data, accumulated by recording the linguistic performance of many different speakers, can help to answer questions about the individual speaker in particular or the language in general. The present research addresses this question by trying to integrate the results of corpus linguistic research, i.e. research that tries to find patterns in the language of large numbers of speakers, with a more experimental approach that aims at identifying the processes involved in the language processing of individual speakers. In order to do so, Chapter 2 initially focuses on corpus linguistics as a linguistic discipline, its approach to language data and the core paradigms of corpus-based description. The first part of this chapter provides a short historical overview of the discipline by showing how early usage-based approaches slowly evolved into modern corpus linguistics. The discipline’s historical connection with lexicography is particularly relevant to the present research, as the notion of collocation, the object of inquiry of Part II, is very closely linked to this development. The second part of Chapter 2 introduces some of the modern standard corpora of the English language. Here two corpus families are of major importance. The first of these is the LOB/Brown family. Apart from including some of the oldest standard corpora, some of the members of this family have also emerged as a quasi-standard point of reference for many psycholinguistic studies, particularly in relation to word frequency. In a final part, Chapter 2 deals with recent advances in corpus creation and also puts more emphasis on specialized corpora. Apart from methodological considerations, the present research is mainly interested in non-native speaker and learner language, which also makes it necessary to focus more closely on corpora that contain this type of language.



2.2 Corpus linguistics as an emergent discipline

Precursors to modern corpus linguistics emerged as early as the 18th and 19th centuries, with Cruden’s (1737) concordance of the Bible (see e.g. Stubbs 2009) and James Murray’s method for the collection of data for the Oxford English Dictionary being prime examples (see e.g. Murray 1977). The main connection of the 18th- and 19th-century developments to the discipline of corpus linguistics that developed in the 20th century is primarily found in the idea that lexical meaning is best described when viewed in the context of real-world data. However, there are a number of additional factors that were influential for the new perspectives of corpus linguistics as an emergent discipline. Two of the most important ones are technological advances in computer-based language processing and the constant re-evaluation of earlier concepts through the analysis of actual linguistic data. When considering these two factors it is possible to define two intertwined research strands in the emergent discipline of corpus linguistics. The first of these strands is theoretical in nature. Rooted in the philosophical and empirical traditions of Wittgenstein and Malinowski, corpus linguistic theory is an empirical discipline whose main objective is the study of meaning in context, in particular from the point of view of its origin. The second strand is strongly connected with technological advancements and the collection of texts that can be seen as the contextual repertoire on which the theory is based. The points of origin of these two strands can be illustrated by two examples of classical corpus linguistic concepts that have driven the development of the discipline in the course of the 20th century: one of them, the notion of collocation, is a theoretical concept; the other, the key-word-in-context concordance, is a practical method. As mentioned above, both these concepts were already conceptually inherent in the 18th-century concordances and the work on the OED in the 19th century. It was not until the beginning and middle of the 20th century, however, that the study of meaning in context, and thus the study of collocational meaning, emerged in linguistic schools of thought and the creation of concordances became one of the preferred methods of investigating text in context. Influenced by Wittgenstein and Malinowski, Firth can be seen as the first linguist to provide a theoretical background to this approach to the description of language. Firth’s (1957) statement that “the main concern of descriptive linguistics is to make statements of meaning” (Firth 1957: 190) challenged concepts that were primarily interested in the structural aspect of language, such as the structuralist schools based on Saussurean and later Chomskyan thought. By means of the concepts of collocation and collocability, Firth (1957) introduces a level of meaning that is on the borderline between lexical semantics and grammatical meaning:


It must be pointed out that meaning by collocation is not at all the same thing as contextual meaning, which is the functional relation of a context of situation in the context of culture. […] Meaning by collocation is an abstraction on the syntagmatic level and is not directly concerned with the conceptual or idea approach to the meaning of words. (Firth 1957: 195–196)

This demonstrates that Firth’s notion of collocation does not merely define lexical meaning by placing words in a linguistic context, but introduces a level of meaning that lies beyond lexical meaning. This view can be seen as a starting point for grammatical and cognitive theories, such as functional grammar and construction grammar, that do not divide linguistic description between one area that includes syntagmatic relations and another that is largely defined by lexical meaning. In other words, Firth’s notion of meaning by collocation anticipated later notions of a lexico-syntactic cline (Langacker 1999: 122) or the Sinclairian idiom principle (Sinclair 1991). However, the implications of this level of meaning were not completely apparent when Firth introduced it, because large amounts of data are needed to account for habitually co-occurring words. Thus, when the study of linguistic context started to become ever more feasible in the light of the technological advancements that took place in the second half of the 20th century, the concept of meaning by (habitual) collocation became empirically observable, as the following quotation from Sinclair’s seminal work on corpora illustrates: As the evidence started to accumulate it became clear that the accepted categories of lexicography were not suitable; […] Three major areas of language patterning, besides grammar could not be comprehensively presented in a dictionary, […]. These are collocation, semantics, and pragmatics. (Sinclair 1991: 2–3)

Therefore, when collecting and resorting texts started to become increasingly feasible with advances in typesetting and computer technology, many of the early notions of contextualism were observable not only by means of a handful of illustrative examples, but could also be shown to be a pervasive part of the language. While it is no surprise that these approaches were mainly embraced by scholars who had been greatly influenced by Firth, it is only in combination with the technological advances of the second half of the 20th century that the compilation of corpora was seen to provide more information on language beyond the level of lexical meaning that was prevalent in the 19th century. Stubbs (2009) shows how the idea of the Keyword in Context (KWIC) concordance, originally conceived as an indexing tool for librarians, worked well with the punch-card input method of early computers, and since “early programming languages […] assumed that information was given on the cards in fixed columns […] [t]he idea of aligning keywords followed naturally from this aspect of technology” (Stubbs 2009: 18). Linguists started using this new technology relatively early on, and in the mid-1960s the first concordance programs were used by linguists such as John McH. Sinclair. And although data-processing by means of punch-cards has been outdated for over half a century, many modern corpus linguistic tools still make use of the KWIC format.

The brief discussion of some of the historical background in the previous section has shown how the use of language data in corpus linguistics was largely influenced by three factors. The first factor was the interest of lexicographers in describing lexical meaning in relation to its context of use. The second factor was the emphasis contextualists and functional linguists placed on linguistic use as opposed to linguistic structure. The third factor was the increasing feasibility of collecting, storing and processing language data which became possible during the second half of the 20th century. Together these three factors were responsible for the creation of the first computer-readable linguistic corpora in the 1960s.
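Because the KWIC format remains the workhorse of corpus tools, a minimal concordancer is easy to sketch. The following toy example is added here purely for illustration and is not software used in this study; the fixed-width columns imitate the aligned-keyword layout described above.

# Minimal keyword-in-context (KWIC) concordancer: a toy illustration.
# Tokenization is deliberately naive (whitespace splitting).
def kwic(tokens, keyword, window=4):
    """Print one aligned concordance line per occurrence of the keyword."""
    for i, token in enumerate(tokens):
        if token.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            print(f"{left:>30}  {token}  {right}")

text = "the language looks rather different when you look at a lot of it at once"
kwic(text.split(), "at")

Real concordancers add sorting on the left or right context, which is what makes recurrent collocational patterns visually apparent.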

2.3 Standard native-speaker corpora

2.3.1 First generation corpora and corpus families

While there are a number of precursors to the 20th-century computer-based corpora, text collections that Francis (1992: 17) calls “Language corpora B.C. [before computers]”, it seems fairly obvious to date the beginning of corpus linguistics as a linguistic discipline to the time when the first computer-readable corpora were created. As described in Section 2.2, there has conceptually been a longer tradition of using actual linguistic data, as opposed to intuitive and invented examples, for linguistic analysis in Great Britain than in North America. It is therefore somewhat ironic that the first linguistic corpus specifically designed to be used as a linguistic resource that can be accessed automatically – the Brown University Corpus of Present-Day American English – was created at Brown University in Rhode Island (Francis & Kucera 1964). Accordingly, this approach was met with some resistance in the US research landscape of the 1960s, as Francis (1982: 7–8) reports. The arguments against creating a corpus of present-day American English notwithstanding, the Brown corpus was in many ways a seminal step for linguistic description in the 20th and early 21st centuries in general, and for corpus linguistics in particular. Furthermore, a number of corpora followed the original design of the Brown corpus, resulting in a family of corpora representing different time periods and varieties, the most well-known probably being the FLOB and Frown corpora, which recreated the original corpora in the 1990s so that the corpus family could be used for variety research as well as diachronic research. Table 2.1 illustrates the Brown family with regard to timeframe and variety:


Table 2.1  The Brown family of corpora

Time / Variety   American English   British English   Australian English   Indian English    New Zealand English
1900s            –                  BLOB-1901         –                    –                 –
1930s            B-Brown 1931       BLOB-1931         –                    –                 –
1960s            Brown              LOB               –                    –                 –
1978             –                  –                 –                    Kolhapur corpus   –
1986             –                  –                 ACE                  –                 Wellington
1991/92          Frown              FLOB              –                    –                 –

Three main points are specifically noteworthy about the Brown Corpus (and the other corpora that followed in the LOB/Brown family). The first point relates to the size of the corpus and the included text samples. With one million words of data from over 500 texts containing approximately 2,000 words each, the LOB/Brown family set several standards for many later corpus projects. The first of these standards is basing the size of a corpus on graphemic word count. This practice has, until now, been so widely accepted that even mentioning it as a standardizing factor of early corpora might seem superfluous. However, a number of other types of standardization are, of course, conceivable and could even be advantageous for answering many linguistic questions. Thus, basing the size of a corpus on a non-minimal, often not clearly defined linguistic unit may have been influenced more by the fact that the corpus was supposed to be computer-readable than by any linguistic consideration. The second standardizing element of the Brown Corpus is the selection of sample texts of a standardized size from different text-type based populations, where the selection of the text-types is also seen as a representative sample of a population of an idealized general language. Thus, the Brown Corpus was not only the first computer-readable corpus, but also the first corpus that aimed at containing a representative sample of a more general linguistic population. Corpus sampling in the 1960s was highly arbitrary, being based on the decisions of a conference of “corpus-wise scholars” (Francis 1982: 16). With regard to this decision process it is remarkable that Leech (2007), the scholar most strongly involved in creating newer corpora based on the Brown design, writes that: [u]nfortunately, the deliberations of these corpus-wise scholars have not come down to us: we do not know how far considerations of ‘balance’ led to their conclusion that 80 text samples were needed for the learned genre, and only 6 for that of science fiction. Although design of corpora has made considerable advances since that time, what makes a corpus ‘balanced’ or ‘unbalanced’ has remained obscure. (Leech 2007: 137–138)




As can be seen, the question as to whether the Brown corpus – or any linguistic corpus for that matter – succeeds in being a representative sample of any ‘universe’ of given texts or language is hard to answer. Surely being solely based on written texts it cannot claim to be representative of American English of the 1960s but mainly of written texts that a group of academics thought to be representative of written American English in that period, a claim that cannot easily be verified. These questions also have some bearings when considering that many newer studies use the data collected within the Brown and LOB corpus as being representative of word frequency within the English language. Particularly in experimental psycholinguistics, the second framework used for this present work, these corpora have emerged into a quasi-standard reference tool regarding word frequencies (see e.g. Federmeier et al. 2005 or Sereno et al. 2003) and it remains at least doubtful if this is completely merited fifty years after their publication. The third point in which the Brown corpus can be seen as the major pioneering work for the corpus linguistic discipline is that a schema for the development of other corpora was conceived that made it possible to create databases that enabled scholars to compare language use over different regions and times. The Brown corpus served as a design model, and a whole family of corpora has been created that made frequency-based observations of synchronic and diachronic variations in different English-speaking regions of the world possible, which in turn may also be seen as one of the driving factors of both diachronic corpus linguistics and variational corpus linguistics. In the United Kingdom and continental Europe the Survey of English Usage can be seen as the crossover point from corpus-linguistics before computers to computer-based corpus linguistics. Initiated by Randolph Quirk in 1959, it was based on the concept of investigating linguistic structure and use by means of analyzing samples of naturally occurring language. While this is clearly also the objective of modern corpus linguistics, the SEU data was not originally converted to a computer readable format but was a collection of reel-to-reel recordings of spoken language data and the transcripts of these. While the general size of the SEU corpus is identical to the data collections of the Brown family of corpora (by containing 1 million words of data), the data collected by the SEU and the use of this data in corpus linguistics is different from Brown family of corpora in many respects. Firstly, text sizes in the SEU corpus differ from the Brown corpus. SEU texts are 5,000 words in length and therefore, only 200 texts are included in the original corpus. This design has been replaced in later corpora compiled by the Survey by a 500/2,000 design. More importantly, however, the SEU data includes many texts of spoken English from a variety of different speech situations. This inclusion of spoken data is probably the main reason


why the data collection and publication of the SEU data took far longer than the creation of any of the members of the Brown family. The data collected by the SEU covers a timespan of approximately 30 years, and many of the collected texts no longer represent present-day English. While the data contained in the Brown and LOB corpora are currently primarily used to track diachronic developments of the English language,1 parts of the spoken SEU corpus are still used for the description of present-day English, regardless of the fact that diachronic change might be even quicker in spoken English. Since the compilation of spoken corpora is extremely time consuming, and because the collection of spoken data itself is an ongoing process that can cover many years, often without the possibility of accessing older material, spoken corpora often include a diachronic gap and are sometimes outdated by the time of their publication. However, the investment put into the collection and transcription of such spoken data collections is often so large that these issues tend to be neglected in view of the advantages of actually having access to computer readable spoken data.

2.3.2 Corpora in lexicography: The COBUILD corpus and the Bank of English

While the 1960s and 1970s saw the collection of the first balanced million-word corpora of English, designed for the purpose of syntactic description, the main influence on corpus collection and corpus-based description of language in the 1980s was lexicographic. There is probably no area in descriptive linguistics that was as greatly influenced by the use of corpus data as lexicography, and the first corpus-based dictionary, the Collins COBUILD English Language Dictionary, has been so influential that currently virtually all newly published dictionaries are corpus-based. It is, therefore, not surprising that Sinclair, as editor-in-chief of this project, is now seen as one of the pioneers of modern corpus linguistics. In Sinclairian terms, lexis can only be described through a reference to text, and the study of vocabulary needs to take into account the co-occurrences of lexical items in text, which ideally is a “spontaneously produced, continuous stretch of natural language” (Sinclair et al. 1970: 25). His notion of the study of lexis is, thus, greatly dependent on the study of whole texts. This is, of course, not possible when the meaning of lexical items is defined on the grounds of a lexicographer’s intuition.

1. While this is true for corpus linguistics, these corpora are still widely used as a point of reference for lexical frequency within psycholinguistics, as mentioned above.




In its final form the original Cobuild corpus contained about 7.2 million words of written and spoken English, consisting of complete texts from a number of different text-type categories in both the written and the spoken mode.2 During the 1990s and until at least 2005, texts were constantly added to the original Cobuild corpus so that a much larger text-collection than originally envisioned was created. This text-collection, The Bank of English, jointly owned by HarperCollins and the University of Birmingham, currently contains approximately 500 million words of running text and is part of the 2.5 billion-word Collins corpus. Despite this enormous increase in size, there are, however, several issues that influence the general usefulness of these corpora for linguistic analysis. The Cobuild corpus was planned as a lexicographer’s tool and, because of its ownership, it is usually not possible to use this corpus in its entirety. Further, public documentation of the material that the Bank of English and the Collins corpus contain is scarce. A recent online publication by HarperCollins states that the Bank of English has a total size of 553 million words.

2.3.3 The British National Corpus (BNC)

Since the development of corpora and language databases is heavily dependent on the technical development of computer-based text processing, it is relatively unsurprising that some of the major advances in corpus linguistic data collection were made in the 1990s. This decade can be seen as a crossroads for corpus linguistics: the increased power of desktop computing made the collection of truly large language corpora possible, while at the same time the internet was not, as yet, as influential in everyday academia and internet-based text collection as it would be in the future. The largest project of the 1990s and early 2000s was the creation of the British National Corpus (BNC). Apart from contributing to the general overview of the developments within corpus linguistics provided in the present chapter, the description of this corpus also bears more specific relevance to the present study, as the British English significant collocations used in the analysis in Part II of this research are created on the basis of the BNC. With regard to corpus design, the BNC can be seen as a combination of the different methodologies used in the earlier 1-million-word corpora and the much larger Cobuild corpus. Meyer (2002) explains this thus:

2. For a complete overview of this original corpus and the procedure of corpus design and lexicographic use of the corpus, see Sinclair (1987).


As more and more corpora have been created, we have gained considerable knowledge of how to construct a corpus that is balanced and representative and that will yield reliable grammatical information. We know, for instance, that what we plan to do with a corpus greatly determines how it is constructed: vocabulary studies necessitate larger corpora, grammatical studies (at least of relatively frequently occurring grammatical constructions) shorter corpora. The British National Corpus is the culmination of all the knowledge we have gained since the 1960s about what makes a good corpus. (Meyer 2002: 138)

While the first-generation corpora were created in order to account for the use of frequent grammatical constructions and the Cobuild corpus was designed as a lexicographical resource, the BNC can be seen as the first large-scale multi-functional corpus. As such it neither contains the controlled 2,000-word text-samples represented in the first-generation corpora, nor does it comply with the whole-text philosophy on which the Sinclairian model is based. Rather, it contains a variety of text samples whose size is dependent on several factors, such as the represented text-type (“school essays or office memoranda are very short” (Aston & Burnard 1998: 28)), or the grouping of texts of a single speaker for the spoken demographic section. In total the corpus consists of about 100 million words of present-day English containing texts from 1960 to 1993, the vast majority, however, dating from 1975 onwards. Of these 100 million words, approximately 10% represent spoken language. Although this proportion is still not representative of the average language user’s linguistic experience, the inclusion of about 5 million words of spoken demographic data, that is, a “component of informal encounters recorded by a socially-stratified sample of respondents, selected by age group, sex, social class and geographic region” (Aston & Burnard 1998: 31), can be seen as a very large step towards the inclusion of informal speech in English corpora. Although the sampling process of the BNC was finished in 1994 and no new texts have been added to the collection, the corpus has been updated several times with respect to its annotation and user interface. The original corpus was created in SGML format, a frequently used mark-up language in the 1990s, and contained part-of-speech annotation created semi-automatically with an early version of the automatic CLAWS part-of-speech tagger (CLAWS4) (Leech et al. 1994). Apart from the publication of two smaller versions of the corpus, the BNC Sampler and the BNC Baby, the major updates concern accessibility and the mentioned mark-up. While the original SGML-annotated corpus was accessed with the SGML-based software tool SARA, later versions facilitated access greatly and also updated the mark-up to XML. The two main new editions of the BNC currently are the online-searchable BNCweb version (Hoffmann et al. 2008) and the 2007 edition of BNCworld, in which the mark-up was converted to XML,




including an updated version of SARA, called Xaira (Berglund-Prytz 2007). At the time of writing, work on an updated version of the British National Corpus (BNC 2014) that is based on data collected between 2012 and 2016 is well under way. However, so far only the spoken component of this corpus has been released, in 2017 (Love et al. 2017). The written component, compiled at the ESRC Centre for Corpus Approaches to Social Science (CASS) at Lancaster University, is scheduled to be released in 2019.3 Although using this updated corpus would have been preferable for the current study, it was not available when the native-like collocations used in the experimental design were extracted. Therefore, BNCweb is the version of the corpus underlying the collocation extraction used in this present research, as described in Chapter 7.
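The extraction procedure itself is documented in Chapter 7. Purely as an illustration of what identifying significant collocations in BNC-style frequency data involves, the sketch below computes the widely used log-likelihood association score (Dunning 1993) from four frequencies; all figures in the example call are invented, and this is not a reproduction of the exact procedure used in the present study.

```python
from math import log

def log_likelihood(pair_freq, node_freq, coll_freq, corpus_size):
    """Log-likelihood (G2) association score for a node-collocate pair."""
    n = corpus_size
    observed = {
        (0, 0): pair_freq,                              # node + collocate
        (0, 1): node_freq - pair_freq,                  # node, no collocate
        (1, 0): coll_freq - pair_freq,                  # collocate, no node
        (1, 1): n - node_freq - coll_freq + pair_freq,  # neither
    }
    row = (node_freq, n - node_freq)
    col = (coll_freq, n - coll_freq)
    g2 = 0.0
    for (i, j), o in observed.items():
        expected = row[i] * col[j] / n  # expected count under independence
        if o > 0:
            g2 += 2 * o * log(o / expected)
    return g2

# Invented example: a pair co-occurring 120 times in a 100-million-word corpus
print(round(log_likelihood(120, 30_000, 5_000, 100_000_000), 1))
```

The higher the score, the less plausible it is that the two words co-occur by chance alone; candidate pairs above a chosen threshold can then be inspected manually.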

2.4 Corpora in the 21st century: Web as corpus and specialized corpora

2.4.1 Corpora in the 21st century: Introduction

The late 20th century was the heyday of traditional corpus linguistics in the sense that text corpora were being compiled by teams of researchers along the lines of standardized corpus design profiles and predefined speaker biographies. As discussed in Section 2.3.3, the BNC on the one hand is the “culmination of all the knowledge we have gained since the 1960s about what makes a good corpus” (Meyer 2002: 138) but, on the other hand, it also marks something of an end-point in traditional corpus collection, in the sense that most corpora in the 21st century are becoming increasingly specialized and traditional notions of creating a corpus that reliably represents the textual universe of a general language have to some extent been replaced by a combination of specialized genre corpora. As with all corpus linguistic developments, this transition from monolithic standard corpora to modular special corpora is very closely connected to technological advances in language processing. While the development of traditional text corpora was closely tied to the development of text processing and the personal computer, advances within corpus linguistics at the beginning of the 21st century have been greatly influenced by the development of the world-wide web. While a computerized collection of text of one million words or more was considered an incredibly powerful tool for the description of natural language in the 1960s and 1970s, today everybody has literally billions of words of written and spoken text at

3. For further information on the status of this ongoing project see the relevant webpage of the CASS at: .


their disposal, accessible with computers and mobile phones at all times. Today a very large number of small-scale text collections for increasingly specialized fields exists, and almost every day new niches are covered. Section 2.4 covers these two different developments by focusing on two exemplary types of corpora developed within these traditions. After a brief discussion of the use of web-based data in general, a huge web-derived corpus project, the Corpus of Global Web-based English (GloWbE), is described as an example of large-scale web-derived corpora. Secondly, and more importantly for the research at hand, specialized corpora for the description of non-native English and learner Englishes are introduced. As two of these learner language corpora are used as source data for the analysis in Part II of this research, the description of the learner corpora is slightly more extensive than the brief overview of the use of web-derived data, which has less bearing on the research at hand.

2.4.2 Web as corpus and web-derived corpora

Section 2.4.1 briefly discussed a paradigm shift in corpus linguistics that is closely connected to the development of the world-wide web, which made huge amounts of language data accessible far beyond the traditional proponents of corpus linguistics. This development has promoted the analysis of natural language data far beyond a relatively small area within linguistics, almost into the general, even non-linguistic, mainstream. With regard to the question of how far the web can be seen as a corpus and how it is different from classic standard corpora, Kilgarriff and Grefenstette (2003) illustrate two different definitions of corpora, one from a traditional corpus linguistic point of view provided by McEnery and Wilson (1996), the other from a quantitative linguistic perspective (Manning & Schütze 1999). McEnery and Wilson (1996) define corpora along four “main headings: sampling and representativeness, finite size, machine-readable form, a standard reference” (McEnery & Wilson 1996: 21). The quantitative position of Manning & Schütze (1999), on the other hand, is more concerned with the amount of training data available, pointing out that one should simply use all the data that is available regardless of concerns of corpus balance (Manning & Schütze 1999: 120). By comparing these two positions, Kilgarriff and Grefenstette (2003) anticipate a certain dividing line between the two schools of thought of traditional corpus linguists and quantitative and computational linguists that seems to have emerged and become increasingly defined over the last decade. Kilgarriff and Grefenstette (2003) can be seen as arguing the quantitative position, challenging traditional definitions of a balanced linguistic corpus as follows:




We wish to avoid a smuggling of values into the criterion for corpus-hood. McEnery and Wilson (following others before them) mix the question “What is a corpus?” with “What is a good corpus (for certain kinds of linguistic study)?” muddying the simple question “Is corpus x good for task y?” with the semantic question “Is x a corpus at all?” The semantic question then becomes a distraction, all too likely to absorb energies that would otherwise be addressed to the practical one. So that the semantic question may be set aside, the definition of corpus should be broad.  (Kilgarriff & Grefenstette 2003: 334)

Over the years, following Kilgarriff & Grefenstette’s (2003) argument for defining the term corpus broadly and therefore treating the web as a corpus, the suitability of web-based data for linguistic research has been under constant scrutiny. Currently the entire world-wide web is rarely treated as a linguistic corpus, as even those analyses and tools that include vast proportions of it cannot evidently use the entire web as a corpus. Moreover, even when the definition of a corpus is applied in the broadest sense, as only being a collection of texts, the world-wide web does not really fit this definition, since it is hardly a collection but an ongoing network of data that is in constant flux. In this way the world-wide web itself does not seem to be a corpus in the traditional sense, as it is neither planned nor limited in size. A possibility for avoiding these issues is the use of data from the world-wide web in a more indirect fashion. For this, the world-wide web is treated as a data repository that is used to access data that in turn is post-edited and organized into a stand-alone database, treating the web much as standard corpora treated hard-copy publications. Hoffmann (2007) offers a good example of such an approach. In this specific case, that is, the web-page-to-mega-corpus method, the data from publicly available transcripts of CNN News are downloaded to a hard drive, cleaned of undesirable content (e.g. advertising and superfluous mark-up) and indexed in a relational database that provides further metadata about the specific texts (such as author information, date of publication etc.). An advanced use of the web-page-to-mega-corpus method can be seen in the corpus collection of the Brigham Young University (BYU) corpus project (e.g. Davies 2009). In this project a number of different linguistic corpora have been created. In principle the process of the creation of these corpora does not differ significantly from Hoffmann’s (2007) method previously described. However, these corpora aim far more at resembling balanced standard corpora by using a large variety of different sources. Moreover, with these corpora availability is far less restricted than is the case for, for example, the CNN transcripts. The reason for this lies in the use of a web-based interface that allows access to the data within the corpora at a level where copyright infringements are not an issue, which does of course entail other issues, such as restricted customization and annotation possibilities.
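Reduced to its essentials, the web-page-to-mega-corpus method described above is a three-step pipeline of downloading, cleaning and indexing. The following sketch illustrates those three steps only; the URL is a placeholder, the cleaning is reduced to stripping mark-up, and the metadata fields are merely examples of the kind of information mentioned above, not a reconstruction of Hoffmann’s (2007) actual implementation.

```python
import re
import sqlite3
import urllib.request

def fetch(url):
    """Download one transcript page (no error handling for brevity)."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")

def clean(html):
    """Minimal clean-up: drop script/style blocks and tags, fix whitespace."""
    html = re.sub(r"(?s)<(script|style).*?</\1>", " ", html)
    text = re.sub(r"<[^>]+>", " ", html)
    return re.sub(r"\s+", " ", text).strip()

# Index the cleaned text together with its metadata in a relational database.
db = sqlite3.connect("megacorpus.db")
db.execute("""CREATE TABLE IF NOT EXISTS texts
              (url TEXT, pub_date TEXT, source TEXT, body TEXT)""")
page = "<html><body><p>Example transcript.</p><script>ad()</script></body></html>"
db.execute("INSERT INTO texts VALUES (?, ?, ?, ?)",
           ("https://example.org/transcript.html",  # placeholder URL
            "2007-01-01", "news transcript", clean(page)))
db.commit()
```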


In contrast to using the world-wide web as a linguistic corpus in its entirety, the BYU corpora apply what could be called a controlled data-mining approach in order to arrive at corpora that closely resemble traditional corpora, such as the BNC. This approach, however, is not entirely free of specific problems. While in these corpora the included texts can be traced to their origin, and authorship issues or issues of an unknown search algorithm do not weigh as heavily as they would in the search engine queries of early-day web-based linguistic studies, there is a certain bias towards specific genres, text-types and media. As with many pre-web corpora (such as e.g. the Bank of English), news-related genres are overrepresented because these texts are most readily available.4 Finally, a central problem of corpus collection remains, namely the inclusion of spontaneous, private, spoken data, regardless of the new possibilities that the growing accessibility of linguistic data on the world-wide web offers. Especially with regard to the spoken medium, the use of internet data is still in its infancy. The problem here seems to lie less in the general availability of data, but rather in the question of whether it can be accessed in a useful fashion. In the 20th century, machine readability of text could be seen as the main driving factor behind the development of corpus linguistics. In the 21st century machine readability is not so much of an issue: basically every text-type, including multimedia texts, is machine-readable. The main question that needs to be addressed seems to lie in the possibilities of interpretation of multimedia texts. Until now, many corpus linguists use tools that are highly dependent on orthographic realizations of text. The problem here is, of course, that there do not yet seem to be reliable possibilities of automatically converting spoken language to an orthographic format. Once this problem has been solved, the bottleneck of manual transcription that prevents the inclusion of large amounts of spoken data into linguistic corpora may cease to exist.

2.4.3 Learner corpora as an example of specialized corpora

The second major trend in 21st-century corpus linguistics, and perhaps the one more relevant to the issues at hand, is a move away from the more classic standard corpus that is supposed to represent a language as a whole, and towards smaller corpora that only contain very specific types of language. Today a multitude of specific corpora exist for a large number of purposes, reflecting an equally great

4. The BYU corpora exist in a free-to-use online version as well as an upgraded premium version. While the latter facilitates corpus searches, the fundamental issues inherent in web-derived corpora also exist for the premium version. For example, full-text access is not possible for legal reasons and correct author identification is sometimes problematic due to the automated collection process.




variety of language types. In general, they either follow a specific corpus design that differs from traditional designs, contain language and text types or genres that are chosen to answer specific questions, or use a combination of both of these approaches. A typical example of a corpus design that is different from the standard corpora previously described is the structure of parallel corpora, which include similar texts from two different languages or a source text and a translation of this source text. In addition to theoretical applications in corpus-based translation studies, parallel corpora are also frequently used as language learning and teaching tools and may also provide translators with numerous natural language examples (see e.g. Frankenberg-Garcia (2004) for an introduction to parallel corpora in terms of their design criteria and their function in language teaching). A typical example of a specialized corpus that consists of text types that are restricted to a specific community of practice is the Michigan Corpus of Academic Spoken English (MICASE) and its written counterpart, the Michigan Corpus of Upper-Level Student Papers (MICUSP). MICASE follows a modular design that contains a variety of spoken text-types from academic settings and offers a web-based interface that allows the user online access to either the whole corpus or preselected parts. Due to the increasing ease of accessibility of various texts as well as the general availability of tools for corpus design and access, such as SketchEngine (Kilgarriff et al. 2004; Kilgarriff et al. 2014), and apart from the growing number of published and public-access corpora, there is an ever-increasing number of highly specific corpora that are designed by researchers for specific projects. As the present work is interested in learner language production and processing, probably the most important type of specialized corpus in this context is the learner corpus, which is the focus for the remainder of Section 2.4.3. While parallel corpora are frequently used for the purpose of language teaching by giving the learners access to the corpus, learner corpora are generally used in order to describe the language production of language learners at various stages of their language learning development. This allows for the isolation of specific problems that language learners typically have and might also help improve language teaching and the development of teaching materials. Generally the collection of data for learner corpora follows one of two possible procedures: either a predefined set of learners is asked to produce text specifically for the corpus, or general, corpus-independent assignments within classroom settings are collected and converted to an electronic format. Either procedure has its own set of advantages and disadvantages. The first alternative allows for a more balanced corpus design by ensuring that the learner population represented follows specific criteria; the texts are, therefore, generally more easily compared to one another. However, the data is also likely to suffer from a certain amount of observer’s paradox, as informants are


acutely aware that their language production is recorded for later use in a corpus. The second alternative is usually dependent on existing class composition and assignments that are covered by the respective curriculum. While this leads to more natural language production from the learners because it reflects their every-day learning process, data generated in this way is also less easily comparable to that of other groups of learners. This is particularly true when the learners follow different curricula, which is often the case when comparing learners of a specific L2 who come from different L1 settings and can be based in different countries. A third option, which could be seen as a quasi-standard for the compilation of written learner corpora, is based on a combination of the previous alternatives. Within existing classroom compositions and integrated into the normal teaching practice, students are asked to produce written assignments (e.g. essays about predefined topics that are suitable for integration into the respective corpus). Subsequently, permission for inclusion into the corpus is sought. The data in the International Corpus of Learner English described below is a good example of this type of data collection. A caveat, however, is that students might feel pressured to give consent for inclusion into the corpora. As with other language corpora (such as the BNC and the Bank of English), there are also two types of learner corpora in terms of ownership and accessibility. On the one hand there are typical research corpora that are usually compiled by university-based research teams and are relatively easy to access. On the other hand, the commercial potential of learner corpora as a tool for language teachers has also caught the attention of major publishing houses such as Cambridge University Press. To date, two of the largest learner corpora fall into these two categories. The International Corpus of Learner English (ICLE), and its spoken counterpart, the Louvain International Database of Spoken English Interlanguage (LINDSEI), is an example of a corpus project that is primarily university-based, with its core team situated at the Université Catholique de Louvain (UCL) in Belgium. Although the corpora from these projects are published commercially, corpus creation was done within the confines of publicly funded universities. In practice a number of different corpus teams, based at many different universities, collect data from a wide range of EFL countries. The data is gathered according to corpus guidelines provided by the Belgian core team, which also coordinates publication of the finished corpora. This production concept has the advantage that the published corpora are cheaper in general and, perhaps more importantly, that the source data for the corpora is more easily accessed by scholars, which also allows for corpus customization, such as adding specific annotation. The Cambridge Learner Corpus (CLC) is an example of a commercially owned learner corpus. Compiled from the responses to Cambridge English Language Assessment tests, this corpus has arguably one of the best types of access to learners of




English worldwide. It contains approximately 16 million words of learner English collected from learners who speak 86 different L1s. Of these 16 million words, 6 million are error-tagged, which means that the corpus is a very useful resource for both the description of learner language and the production of teaching material (Nicholls 2003).5 This corpus is also a very valuable resource for learner language research. However, this very value makes it difficult, if not impossible, to access the corpus data, as it is the property of Cambridge University Press and access is limited to CUP’s in-house scholars for the development of CUP teaching materials. This is a general dilemma within corpus linguistics: because the production of large-scale corpora is immensely time-consuming and expensive, the largest corpora are frequently commercially owned and general access is severely restricted. The CLC is therefore of limited use to general research into learner language (as is the Bank of English for research into ENL). Non-commercial learner corpora, on the other hand, have a different set of issues and problems. One major problem lies in the economic difficulties involved in the creation of a large-scale corpus within a publicly funded university system and the fact that corpus creation (in particular that of corpora which include spoken data) is usually a long-term project that often does not match university goals. The collection of learner data over a period of 20 years that would mirror the practice of, for example, the CLC is not easily undertaken when the returns on initial investments for such programs would be too far in the future for many universities. A feasible solution to this problem appears to be the compilation of modular corpora, where each single module is provided by a different team of researchers within a shorter time span. This is why many corpora, such as, for example, ICLE and LINDSEI and the International Corpus of English, are created in this way. A further issue for non-commercial corpus design is access to learner data. While commercial testing institutions such as Cambridge English Language Assessment have access to a large variety of different learners of English, including students and professionals from a large number of different fields, university-based non-profit corpora are often limited to data from the student population connected to the language departments of the respective universities. However, these students do not represent the typical L2 learner of English but rather a very specific subset of this learner type, that is, those who have decided to pursue a career in languages in general, and English in particular. Apart from the fact that these are generally more likely to be advanced learners of English, who are also frequently trainees in language pedagogy, they also tend to represent a somewhat limited demographic.

5. As this corpus is constantly growing, the numbers provided by Nicholls (2003) are somewhat out-dated. To date the corpus contains far more data, collected from more than 180,000 speakers from around 200 countries, as stated on the website of Cambridge University Press.


This is because typical English students (at least at European universities) are females in their early twenties. Although these problems are frequently ignored because they are not easily overcome, in the context of non-commercial corpus collection users of learner corpora should at least be aware that these corpora are generally only representative of a small proportion of language learners. This reflects the restrictions faced by experimental linguists, because experiment participation is also very frequently limited to student participants situated within the respective departments. For the present research this is, somewhat ironically, helpful, as these limitations to the access of data for the creation of learner corpora mean that corpora based on relatively restricted student data are easily comparable to the data of the participants that are usually included in experimental settings. It should be borne in mind, however, that any results based on such data are not broadly generalizable beyond this specific group of informants. Another factor in the compilation of learner corpora that is not limited to non-commercial corpora is access to data from learners in the early stages of their learning process, which is why “[m]ost of the existing learner corpora are based on the writing of fairly advanced learners” (Barlow 2005: 357). As English is taught relatively early on in school in many countries, early-stage learners tend to be school children who are at different stages of their general linguistic development than both the competent adult native speakers and the advanced adult learners represented in most corpora. Kreyer (2014) describes the situation regarding corpora of early and intermediate learners and finds that the few existing English learner corpora (that include German learners of English) are all relatively small:

Although the last three projects have to be regarded as a step in the right direction, the small number of words (a little over 200,000 in total) makes it abundantly clear that the representation of intermediate learners of English has been neglected so far. The most likely reason, in addition to problems relating to bureaucracy and data-protection that arise from collecting data from adolescents, is that hand written exams are much less accessible to digitalisation than university prose, which is usually accessible in digital format. (Kreyer 2014: 16–17)

Issues of digitalization are also the reason why very little spoken data for both intermediate and advanced learners is available, underlining Barlow’s (2005) second point that learner corpora are predominantly based on writing. For the present research, commercial corpora such as the CLC are not available and, because the participants in the psycholinguistic analysis are university students, corpora containing the language of advanced learners rather than that of lower-level learners seem to represent the language spoken by those participants most closely. For this reason the German components of the International Corpus of Learner




English and the Louvain International Database of Spoken English Interlanguage are used for the analysis in Part II of the current work. These corpora are described in more detail in Sections 4.4 and 7.2.2. For the current discussion, however, two points about these corpora are of some importance. The first of these points has already been mentioned: ICLE-Ger and LINDSEI-Ger contain data collected from students that match the participants in the experimental part of this study relatively closely and should therefore include the typical interference phenomena that German university students display. While this is obviously a point in favour of using these corpora, their general size is not. As the present analysis is interested in lexical co-occurrence, both of these corpora could be far too small to provide fully satisfactory descriptions of interference phenomena at this level. As the discussion of learner corpora in Section 2.4.3 has shown, size as well as accessibility are the two major issues that researchers who work with learner corpora have to face, a fact that is somewhat hard to understand considering the enormous potential such corpora could have from both descriptive and pedagogical perspectives. While commercial publishers such as Cambridge University Press and Longman seem to have embraced this potential, publicly funded projects are still few and too frequently face the kind of obstacles previously described.
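The size problem can be made concrete with some simple arithmetic. Assuming, purely for illustration, a collocation that occurs twice per million words in native-speaker usage, and taking rough, assumed component sizes for the two learner corpora, the expected number of occurrences can be computed as follows:

```python
def expected_tokens(per_million_rate, corpus_size):
    """Expected number of occurrences in a corpus of a given size."""
    return per_million_rate * corpus_size / 1_000_000

rate = 2.0  # assumed rate: two occurrences per million words
for name, size in [("BNC", 100_000_000),
                   ("ICLE-Ger (assumed size)", 250_000),
                   ("LINDSEI-Ger (assumed size)", 90_000)]:
    print(f"{name}: {expected_tokens(rate, size):.1f} expected occurrences")
```

At such rates, a collocation that is well attested in a 100-million-word reference corpus (200 expected tokens) is expected to occur less than once in either learner component, which is why corpus-based evidence at this level has to be interpreted with caution.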

2.5 Aspects of corpus linguistics: A brief summary

The present chapter has provided an overview of the development of corpus linguistics, describing the first approaches to using actual language data for linguistic analysis, the advent of the discipline in the 20th century, and the present discussion of using data available on the world-wide web. As Chapter 2 has shown, the use of data in corpus linguistics has always been inextricably intertwined with technological developments in data collection and data processing. In the second half of the 20th century the main objective of corpus linguistics was to gather sizable amounts of data that were deemed to be representative of a predefined larger population. However, due to the differences between the technological advancements of computer-based written text processing and spoken text processing, a discrepancy in the representation of these media seems to have emerged over the past 30 years. While databases containing written language have steadily increased in size, to the point that it is very simple to generate hundreds of millions of words of written data for many given genres, the possibilities for the collection of spoken genres are far fewer. In light of the increasing availability of many different types of data from the world-wide web, corpus linguists need to consider a number of questions that go beyond the issues of corpus design which are important for the collection of


standard corpus data. The most pressing question, from my point of view, concerns the availability of larger amounts of spoken data. With regard to web-based data in particular, the accessibility of such data has vastly improved compared to what was possible ten years ago. The increasing availability of multimedia data in portals such as YouTube and private podcasts offers unprecedented possibilities. However, the classic barrier to corpus processing, that is, the dependency on orthographic transcriptions, still exists and, so far, there seems to be no viable alternative to the manual transcription of spoken text. The answer to this might lie in decentralizing corpus projects, moving away from the monolithic standard corpus towards modular specific corpora that can be freely combined. This model is already widely applied to the collection of newer corpora (as discussed in Section 2.4.3). A second, albeit connected, issue is the question of text type. As the present chapter has shown, many traditional corpora define text type intuitively, and corpus designers try to fill the respective text type categories with texts that assumedly fit them well. However, as Kilgarriff and Grefenstette (2003) point out, the increasing availability of many different types of texts forces the issue of a new definition of text type. While text type was a criterion that was hard to define even before the advent of the internet, the steady increase of emergent text types (e.g. short message texts, tweets, or – even more difficult – multimedia internet memes) and the shift from paper-based to electronic communication often render the attempt of pigeonholing a specific type of communication into a corpus category basically moot. Finally, the reproducibility of results of corpus studies is a very strong argument for the use of standardized databases. However, it seems that the reproduction of corpus studies, with possible adjustment of parameters to gain more solid results, is very rarely done in the field of corpus linguistics. The dominant approach seems to be the production of original and new studies with a certain disregard for the results of earlier studies. While this seems to be gradually changing, especially in the quantitative branch of corpus linguistics as, for example, pointed out by Gries (2006), there would still appear to be room for improvement in terms of the reproducibility of corpus results. In conclusion, corpus linguistics seems to be at a crossroads with respect to future corpora. The decades of work that have resulted in the large classic standard corpora have had a great impact on the understanding of natural language. However, new corpus design needs to embrace the changing technological possibilities of the 21st century with the same enthusiasm as when the discipline began to develop 50 years ago. Simply adding more standard corpora along the existing, often 30-year-old, ideas of text might not adequately answer questions of a textual universe of English that is changing with ever increasing speed. Within the context of the present work this also creates its own set of challenges. The first of these is the improvement of the interface between corpus linguistics and




experimental linguistics. All too often, one of these approaches seems to be relatively unaware of the developments within the other. Experimental linguists, for example, still use first-generation corpora (such as the Brown corpus) as a quasi-standard source of information on lexical frequency, all of the above points notwithstanding. Corpus linguists, on the other hand, far too often interpret their findings without using experimental data to support their claims, Sinclair’s (1991) idiom principle, which speculates on the ease of lexical access of formulaic combinations, being a prime example.

Chapter 3

Aspects of experimental data in psycholinguistics

3.1 Introduction

While Chapter 2 has given an overview of the development in the collection of linguistic data in corpora, in Chapter 3 the main focus lies less on language data per se and more on additional data that are used in experimental and psycholinguistic settings. Tummers et al. (2005) characterize the differences between these types of data in terms of the epistemological basis (i.e. introspection versus use), the spontaneity (i.e. elicited versus non-elicited), and the processual status (i.e. product versus process) of the gathered data. Table 3.1 (Tummers et al. 2005: 232) illustrates this for different research frameworks:

Table 3.1  Data gathering techniques (Tummers et al. 2005: 232)

                       Introspection   Survey                         Experiment        Corpus
Epistemological basis  introspection   introspection / language use   language use      language use
Spontaneity            elicited        elicited                       elicited          non-elicited
Processual status      product         product                        process/product   product

Therefore, the main differences between the use of corpus data and the use of data in experimental settings are in the degree of elicitation of the data as well as in the processual status of the data. That is, “do the data focus on linguistic products […][,] do they focus on the process of language production” (Tummers et al. 2005: 232), or do they focus on language processing. Chapter 2 has shown that not all corpus data is completely free of elicitation. However, the type of data elicitation in experimental settings is fundamentally different from the procedures used in corpus linguistics, as linguistic output is far less the focus than responses to linguistic stimuli. The same holds true for the dimension of the processual status of the data. Although there is a certain degree of focus on the parameters of the process of language production in corpus linguistics (at least in the interpretation of the results), corpus linguists primarily focus on the product, that is, the linguistic output that is being collected.


Experimental settings, on the other hand, place a sharper focus on the situation of language production and processing by systematically manipulating independent variables, such as speaker age or degree of language exposure in the case of non-native speakers (NNS), and by using different types of language input, often organized in categorical variables (e.g. words vs. pseudowords). This is done in order to analyse changes in the relevant dependent variables, such as scaled grammaticality judgments, response time in lexical decision tasks or other response types. In the following, a number of different experimental methodologies are introduced, focusing on the type of data used, the experimental settings applied, the quality of the respective data and the ease of application of each method in order to answer specific linguistic questions. These experimental methodologies can be arranged along a continuum of sophistication and ease of application, as well as of the degree of naturalness with which the data is elicited. Therefore, the following discussion starts with a review of grammaticality judgement tasks, characterized by a relatively simple experiment design, a low degree of technological prerequisites and a strong focus on linguistic intuition on the part of the informants. This methodology has been widely used in many linguistic studies and can be seen as lying somewhere on the border between intuition- and introspection-based data gathering and experimental data gathering. Whereas a very simple grammaticality judgement task performed by one or many speakers can be seen as an introspective approach to language data, this approach increasingly focuses on language use when a larger number of independent variables is taken into account. This is the case, for example, when the questions that are being asked focus less on the actual introspection of the test subject, and more on the factors that might influence this introspection (see e.g. Schütze 1996). The remaining three methodologies under discussion in Chapter 3 all appear within a stimulus-response framework, but vary in the degree of conscious influence the subject has on the response to a stimulus. Lexical decision tasks are related to grammaticality judgements in that subjects make conscious linguistic judgements about the lexical status of the stimulus. In contrast to grammaticality judgments, however, the actual judgement is far less important than the factors that influence the decisions made. This is why these tasks are often used to gain insight into questions of language processing. Furthermore, lexical decision tasks often include specific features that are designed to elicit spontaneous results and minimize the effect of the participants’ conscious judgement. Examples of such features are time constraints or dual-performance tasks that distract the conscious attention of the participants. Eye-tracking, the next methodology described in this chapter, is also a response-time method. Unlike lexical decision tasks, however, the oculomotor events




of eye fixations, saccades and microsaccades operate below the level of conscious awareness so that, in these cases, response times do not include any conscious decisions but (with a certain amount of simplification) might more closely reflect cognitive processes. As with lexical decision tasks, the actual language data is used as a stimulus in a controlled experimental setting while the eye-movement data is the dependent variable. Closely related to this methodology are two paradigms in cognitive neuroscience: electroencephalography (EEG) studies, which measure changes in the electrical activity on the scalp of the participant, and functional magnetic resonance imaging (fMRI), a neuroimaging procedure that uses blood oxygenation levels in the brain in order to differentiate more active from less active brain regions. As is the case in eye-tracking studies, these studies use linguistic stimuli and interpret a subconscious physiological response. Although the present chapter covers these methodologies in some detail, it is less focused on the specific results generated by these approaches and more sharply on the connection between linguistic input data and the possibilities of interpreting collections of language data. Two questions therefore arise: how can we use the mass of linguistic data that has been collected in linguistic corpora and other databases to improve experimental design, and how can results from linguistic experiments help us to better interpret the patterns observed in quantitative language studies based on corpus linguistic databases?

3.2 Grammaticality judgment in empirical linguistics

Grammaticality judgement has a long history in several different areas of linguistics, for example, theoretical linguistics, first language acquisition, second language acquisition, the study of communicative disorders and, more recently, in the field of first language attrition studies (Altenberg & Vago 2004). Schütze (1996) supplies four main reasons for the popularity of grammaticality judgements in linguistics studies. The first is the possibility of examining “reactions to sentence types that occur only very rarely in spontaneous speech or recorded corpora” (Schütze 1996: 2) which, “given the Zipfian distribution so characteristic of linguistic data, [… ] are more than one might expect” (Gilquin & Gries 2009: 9). The second reason is that grammaticality judgments can give information on negatives, that is, sequences that are not part of the language. The third reason is to avoid interpreting performance errors such as false starts or unfinished utterances as grammatical production. The “fourth and more controversial reason is to minimize the extent to which the communicative and representative function of language skill obscure our insight into its mental nature” (Schütze 1996: 2).


A more practical reason for the sustained popularity of such tasks is their ease of applicability: “it can typically be carried out without special equipment, in a short amount of time, and with groups of subjects at once” (Altenberg & Vago 2004: 105–106). While the first four reasons also hold for the other types of experimental design discussed in this chapter, ‘ease of applicability’ is one of the great advantages that grammaticality judgements have to offer. However, it can also be problematic when taken to extremes, for example, by relying on the intuitions or introspection of a single test subject (in the most extreme cases, the actual researcher). In these cases, Wasow and Arnold (2005: 1485) urge caution:

With respect to the different methodologies outlined in Table 3.1, the approach of grammaticality judgments is, thus, largely introspective because the respondent is asked to intuitively judge the grammaticality of a given structure. While the general introspective nature of this approach has remained stable over the course of time, the application of the methodology has been increasingly refined by using a larger number of informants and increasingly (and painstakingly) controlling independent variables. In particular, when taking seriously Wasow and Arnold’s (2005) caveat about paying attention to speaker selection and stimuli, the method of applying judgment data generates sound results. It is also a useful method for gaining insights into the interface between conscious linguistic introspection and factors of linguistic production that are below the level of linguistic awareness. That is, interpretations of production data by means of linguistic judgement can help to identify possible reasons that underlie the production. In order to interpret data, it should, however, be examined how far the intuitions about language in judgement performances differ from linguistic production. “In particular, do grammaticality judgement tasks differ from those used in on-line, normal sentence production [?]” (Altenberg & Vago 2004: 108). At first glance there appears to be a number of similarities between conversation and a judgement task. In normal conversation, speakers use self-monitoring and self-correction and are also usually able to tell when an encountered string of language is not well-formed. Because of these similarities, Altenberg and Vago (2004) argue that “that the most parsimonious system is one in which the knowledge source is the same for each of these operations” (Altenberg & Vago 2004: 109). Schütze (1996) sees this perspective as only one of two possible positions, and allows for the other extreme: because there could be a number of crucial differences




between both processes, it might be that “judging is nothing at all like understanding and involves none of the same cognitive mechanisms” (Schütze 1996: 82). He continues by explaining that

[G]rammaticality judgments seem to have much more in common with other psychological judgments. Graeme Hirst (personal communication) has suggested the following analogy to food tasting. If someone asks you what you remember about last night’s dinner, chances are you will not have much to say […] you will likely have only a general impression […]. On the other hand if someone offers you food and asks you for your impression before you taste it, you will pay particular attention to the flavours, textures, aromas etc.; perhaps you will chew more slowly and you might be able to give much more detailed comments.  (Schütze 1996: 82)

This difference could lead to a type of judgment that is quite markedly different from the self-monitoring processes that speakers apply in general conversation. This is because, when asked to consciously judge the well-formedness of a structure, subjects might try to apply language rules of which they are consciously aware. However, “the only rules about language that most nonlinguists have conscious access to are those learned in grade school, which tend to be of the prescriptive variety [… and] it is not yet clear whether we can induce them [i.e. the subjects] to exclude prescriptive knowledge from their judgments” (Schütze 1996: 83). Therefore, while there are some similarities between on-line performance and linguistic judgment in specifically designed judgment tasks, the nature of the task could also influence the results in ways that are not accounted for and, from corpus linguistic studies, we are well aware that intuitive judgments and actual usage can differ greatly. In order to truly treat judgment data as experimental data, these experiments need to be controlled along three different dimensions: the input (i.e. the structures that the informants judge, including control structures); the informants; and the setting of the judgement task (i.e. possible answers, time restrictions etc.). Considering the first, that is, the linguistic input, it is, for the purposes of the current work, useful to distinguish between those linguistic structures that are frequently encountered in actual data (i.e. corpus-based input) and structures that are intuitively conceived by the analyst preparing the task. There are advantages and disadvantages to both types of input data. As previously described, many structures are rarely encountered in corpus data because of the Zipfian distribution of language reflected in corpora, but are still part of the language system. Moreover, even if a construction does occur in natural language data, the complexity of these data can draw attention away from the construction and involve several other features that might influence acceptability.


To illustrate, for example, that a mono-transitive realization of a transfer process seems to be more acceptable when the transferred entity is informational rather than concrete, sentences such as (1a) and (1b) might be more informative than sentences (2a) and (2b):

(1) a. ?John gave a book
    b. John gave a talk

(2) a. If we assume the maximum of 5 per cent this gives a revalued pension of £17,000 a year at age 65
    b. […] a European space telescope called Hipparcos, which is starting to give an accurate geography of our local region of the Milky Way galaxy.

Examples (1a) and (1b) are simplified examples based on a statistical analysis of actual language data in the BNC which demonstrated that speakers have a greater propensity to use monotransitive complementation with the verb give when the transferred entity is informational rather than concrete. Examples (2a) and (2b) are part of the source data themselves (Schilk et al. 2013). In a grammaticality judgment task, it seems far more likely that subjects would assign ungrammatical or doubtful status to (1a) than to (2a). (2a) is a rare example from newspaper data from the BNC that displays actual use of monotransitive complementation with a concrete direct object of give and thus may help to illustrate under which circumstances speakers might find this use acceptable. In contrast, (1a) is a generalization of the fact that, statistically, this complementation type hardly ever occurs. As the degree of naturalness of the linguistic input data has these effects on the informants’ responses, it is necessary to consider which type of data best matches the focus of the research in question. In order to test whether the cognitive restrictions on the applicability of monotransitive complementation of ditransitive verbs in the above example are shared by NNSs with a number of different L1s, fabricated input samples such as the ones in (1) may be better suited than the natural ones in (2). This is because informants focus more strongly on the construction at hand and are not distracted by the complexity of natural data. As this complexity, however, is part of the linguistic reality any speaker is confronted with, it might be best to treat results from simplified fabrications as abstractions rather than interpreting them as prima facie evidence that is suitable for all types of generalizations. Additionally, both types of input may raise concerns regarding their ecological validity. Generalized inputs may not ideally reflect the real-world reading and processing of participants and may therefore be abstractions of the process in general. Corpus-based natural language inputs, on the other hand, also only reflect the typical natural reading process of any participant to a certain degree. As discussed in Chapter 2, language corpora are relatively artificial representations



Chapter 3.  Aspects of experimental data in psycholinguistics 35

of natural language and may not reflect the textual universe any speaker in particular may have experienced. Consider Example (2). Both corpus-based extracts obviously reflect natural language use in a specific text type. It remains questionable, however, in how far these types of texts are represented in the typical reading environment of the participants. Schmidtke et al. 2018, for example, show how individual reading experience and prior exposure to reading material of participants has an effect on the processing of transparent/intransparent compound words. Similar effects may, in all likelihood, be observed for L2 speakers with limited exposure to specific L1 text types. Informant selection, the second point previously mentioned, should ideally also be primarily governed by the specific research question being asked, but is in reality often greatly influenced by other considerations with the availability of informants often being the foremost. While this is understandable and it may be considered better to compromise than not to conduct any research at all due to the lack of ideal informants, taking this compromise to an extreme will, almost without exception, lead to a ‘garbage-in, garbage-out’ situation. That is, if the basis of the experiment is flawed by using unsuitable informants, even by taking the greatest care in language sample selection and using the greatest degree of sophistication in quantitative and qualitative interpretative methodology, unusable results will still be generated. Due to the imbalance between informant availability and representativity of the sample composition, many results in experimental linguistics may, thus, have to be taken with a certain amount of scepticism. The setting of the experimental task, the third point previously mentioned seems to be far less problematic. While the first two points – selection of sample structures and informant selection – often include the concessions described, experiment setting is usually far easier to control and the documentation of the different parameters becomes the cornerstone of replicability. Finally it should again be pointed out that the three dimensions discussed above are, with slight modifications, equally important for all experimental settings and are also not limited to the grammaticality judgment tasks discussed in this section. The difference between the different types of experimental studies primarily lies in the degree of importance that each of these three dimensions has. These three dimensions are, therefore, also considered in the description of the other descriptive frameworks, albeit more briefly.
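As a simple illustration of these three dimensions, a judgment experiment might be specified as a data structure like the one in the following minimal sketch before it is administered. All names and values are hypothetical, and no particular experiment software is assumed.

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class JudgmentTask:
        """A judgment experiment controlled along the three dimensions above."""
        items: list          # input: structures to judge, including controls
        informants: list     # informant metadata (L1, age, education, ...)
        # Setting: the range of possible answers and an optional time limit.
        scale: tuple = ("acceptable", "doubtful", "unacceptable")
        time_limit_s: Optional[float] = None

    task = JudgmentTask(
        items=["?John gave a book", "John gave a talk"],   # cf. (1a) and (1b)
        informants=[{"id": 1, "L1": "German", "age": 24}],
        time_limit_s=10.0,
    )
    print(task.scale)

Documenting the task in such an explicit form also serves the replicability concern raised above, since all setting parameters are recorded in one place.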


3.3  Lexical decision tasks in empirical linguistics

As in the case of grammaticality judgments, the term lexical decision task can be seen as a cover term that includes a variety of different experimental settings. Therefore, when positioning these tasks on the cline of conscious awareness of the informants involved, these tasks form a cline in themselves. Early applications of lexical decision tasks date back to the 1970s and, at that time, they were primarily used to provide information on lexical storage and retrieval and to answer questions about the organization of what is often called the ‘mental lexicon’ (see e.g. Meyer & Schvaneveldt 1971). Generally speaking, lexical decision tasks consist of a set of lexical or pseudolexical stimuli and a response-time-measured decision task in which test subjects judge the lexical status of the stimulus. The response data is therefore two-fold: response content and response time. These tasks are also often combined with priming tasks, so that the input stimuli can consist of a prime and a second stimulus that the subjects judge. Typical effects that influence response time in these experiments are word-frequency effects (i.e. frequent words are identified faster than rare words), neighbourhood effects (i.e. “nonwords with more [orthographical] neighbours taking longer to reject” (Grainger 1990: 229)) and priming effects. Independent variables in these early experiments are, thus, often word frequency (Rubinstein et al. 1970), lexical neighbourhood size (Grainger 1990) or membership in predefined semantic categories (Meyer & Ellis 1970).6 As with the other methodologies under discussion, there are various possibilities to modify the basic experiment setting. Theoretically, three main types of experiment modifications are possible, and they can be used independently or in combination. The first of these possibilities relates to the source data, and researchers have several options here. Apart from presenting participants with words and non-words it is, for example, also possible to present them with semantically related or unrelated words and ask them whether the words fall into the same category (Schaeffer & Wallace 1970). Other possibilities include a change in medium, that is, presenting participants with auditory input of spoken data instead of visual input of written data, or going beyond what is traditionally assumed to be the word level. The most basic type of these tasks is to present informants with visual or auditory input, that is, a string of letters on a screen or a sequence of sounds, and take the response time for a word versus non-word decision. On the level of selection of input data, we can thus distinguish between different modes of linguistic input: auditory, visual or cross-modal. The choice between these inputs is important for a number of reasons. Firstly, auditory input is, by definition, purely sequential and therefore has different constraints than visual input concerning sequentiality. Secondly, visual activation includes a second level of word recognition, while auditory activation does not. Auditory word recognition is part of the general language acquisition process of all speakers. All adult speakers can be expected to be able to analyse auditory strings and decide upon their word status, whereas visual word recognition tasks include further layers of recognition that are used in combination with the auditory representation of lexical items (see e.g. Coltheart et al. 2001). However, the implications of speech primacy exceed these theoretical considerations of sequentiality and phonotactics when considering the choice of input data. As mentioned at the beginning of this section, the choice of input data is often influenced by considerations of word frequency, and linguistic corpora are often used to gain frequency information. This choice may, however, be unsound for two reasons. Firstly, as has been discussed at some length in Chapter 2, most linguistic corpora do not include a sufficient amount of spoken data and, if they do, the type of spoken data is usually not of the kind that most speakers are likely to experience on a daily basis, or the types of data represented in the corpora only make up a very small subset of the spoken linguistic functions that form part of the speaker’s linguistic input. Secondly, although the frequency measures provided by corpora give solid information about average word frequencies in a particular genre of the language, they do not reflect the far more individual nature of the linguistic repertoire of the informants in lexical decision experiments. In simpler terms, what is frequent in a corpus may only be frequent in some written genres, which is in contrast to the speech primacy axiom discussed above, and may not represent the frequency structures in the individual participant’s linguistic repertoire (see e.g. Hoey’s (2005) work on lexical priming). Therefore, the second level of experiment design that needs to be taken into account is the composition of the field of participants and the biographical information that is needed in order to avoid a mismatch between the chosen input data and the selected experiment participants. Participant selection could be influenced by factors such as age, gender, education and so forth, but could also include linguistic factors such as different L1s or the total number of languages spoken by the participants. Furthermore, there might be special additional necessities in the selection when lexical decision tasks are performed by aphasic patients, specifically with regard to comprehension-impaired aphasics (as e.g. in Milberg & Blumstein 1981; Blumstein et al. 1982).

6. Interestingly, psychologists and psycholinguists have been making use of corpus-derived frequencies from very early on. Meyer and Schvaneveldt (1971), for example, base their stimulus words on frequencies in the Brown corpus. Corpus linguists, on the other hand, seem to be far more reluctant to include psycholinguistic methods and results in their work, a point also raised by Gilquin and Gries (2009).
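To make the basic word versus non-word design concrete, the following minimal sketch implements a console-based decision task that records the two-fold response data described above (response content and response time). It is purely illustrative: reaction times collected this way include keyboard and terminal latency, real studies use dedicated experiment software and hardware, and the stimuli are invented.

    import random
    import time

    # Illustrative stimulus list: (letter string, is_word) pairs.
    stimuli = [("bread", True), ("blick", False), ("house", True), ("plome", False)]
    random.shuffle(stimuli)

    results = []
    for item, is_word in stimuli:
        input("Press Enter for the next trial...")
        print(item.upper())                       # present the visual stimulus
        start = time.perf_counter()
        answer = input("Word (w) or non-word (n)? ").strip().lower()
        rt_ms = (time.perf_counter() - start) * 1000
        correct = (answer == "w") == is_word      # response content
        results.append({"item": item, "correct": correct, "rt_ms": rt_ms})

    for r in results:
        print(r)

Frequency or neighbourhood effects of the kind mentioned above would then show up as systematic differences in the recorded response times across stimulus categories.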


As has been shown in the discussion of grammaticality judgments in Section 3.2, participant selection is an issue that involves a trade-off between the availability of participants and the representativity of the group composition. Especially in linguistic university contexts, for example, group composition tends to be slanted towards young female subjects. Apart from the concerns about the representativity of the participant population that are inherent in many psychological and psycholinguistic studies, researchers need to be aware of the individual differences between the participants when frequency effects are taken into account and when frequency is a factor in input data selection. As previously mentioned, lexical exposure is highly individual, so that overgeneralizations of frequency could have a negative influence on research results. Basing input data on corpus-derived frequencies could therefore ignore the fact that corpora do not reflect the individual subject’s history of lexical exposure and use. Considering the relatively low number of informants that most studies have, this can potentially lead to non-replicability or contradictory results from identical experiments carried out with different participant populations. To mitigate this constraint, it would be advisable to include as much subject-based metadata as possible, as a thorough knowledge of the included informants and their linguistic biography could help to identify underlying factors of inter-subject differences in the performance of the task. Interestingly, participant meta-information is often very sparse. A third possibility to adjust lexical-decision-based experiments lies within the experiment setting. In this case, the previously mentioned cline between conscious and less conscious decisions is probably the main motivation for variation. Thus, most variations in the general setting concern inserting elements of time pressure or memory constraints into the design. Possibilities here include, but are not limited to, monetary penalization of test subjects for slow or wrong answers (e.g. Meyer & Schvaneveldt 1971), general time limits for answers, or the inclusion of distractors that overload working memory (e.g. Schmitt et al. 2004).

3.4  Eye-tracking studies in empirical linguistics

In eye-tracking studies, eye-movement data is measured and interpreted while the participants are presented with a visual stimulus (in linguistic experiments, often written words or sentences). These eye movements happen below the level of conscious awareness, and the participants have very little influence on their reaction to the stimulus. In this way, these studies fundamentally differ from the judgment and decision tasks discussed previously in this chapter, where at least some amount of conscious decision making from the participants was included.




Eye-tracking methodologies are widely applied in a number of different fields, and only a small proportion of these applications are specifically focused on linguistic tasks. Section 3.4 can therefore provide a comprehensive description neither of the different eye-tracking methodologies nor of the complete scope of their application in psychology and cognitive neuroscience; a certain level of simplification is adopted at this point by concentrating on linguistic tasks only. Bearing this in mind, the two main poles in eye-movement studies are those periods where the eyes are relatively still, usually called fixation periods, and those periods where the eyes are in motion, called saccades. Rayner (1998) describes the function of saccadic movement in relation to the visual field: We make saccades so frequently because of acuity limitations. As we look straight ahead, the visual field can be divided into three regions: foveal, parafoveal and peripheral. Although acuity is very good in the fovea […], it is not so good in the parafovea […], and it is even poorer in the periphery […]. Hence, we move our eyes so as to place the fovea on that part of the stimulus we want to see clearly. […] For example, if a word of normal-print is presented in parafoveal vision, it is identified more quickly and accurately when a saccade is made.  (Rayner 1998: 374)

During saccades, however, information flow is inhibited because the eyes are moving very quickly and only blurred input is perceived. It is also not entirely clear whether these visual input suppressions during a saccade carry over to suspended cognitive activities (Rayner 1998: 373). Considering the alternation between saccadic movement and fixation, there are different types of data that can be interpreted when these movements are monitored. On the level of saccades, it is possible to measure saccade size (i.e. the amplitude of movement), saccade duration and saccade direction, where saccade duration measures the time that passes between two fixations, usually in milliseconds. Saccade duration is, thus, influenced by target distance and saccade distance (Abrams et al. 1989: 536). With regard to fixation, the factors measured are the fixation point, that is, where the fixation occurs, and the fixation duration (i.e. how long the interval between two saccades is). Both of these measures are dependent on the task, specifically on the difference between reading and non-reading tasks. For reading tasks in particular, fixation duration allows the researcher to draw conclusions about cognitive processing. Readers choose the pace at which they access a text; it is therefore assumed that readers “take in information at a pace that matches the internal comprehension process” (Just & Carpenter 1980: 329). Just and Carpenter (1980) illustrate the cognitive process at a fixation point with regard to working memory and long-term memory (see Figure 3.1):


Figure 3.1  A schematic diagram of the major processes in reading comprehension (Just & Carpenter 1980: 331). [Figure not reproduced: a flow chart running from ‘Get next input: move eyes’ through extracting physical features, encoding the word and accessing the lexicon, assigning case roles, and integrating with the representation of the previous text, to sentence wrap-up at the end of the sentence; working memory (‘activated representations’) mediates between these processes and long-term memory, which stores productions representing orthography, phonology, syntax, semantics, pragmatics, discourse structure, scheme of domain and episodic knowledge. Solid lines denote data flow paths, and dashed lines indicate canonical flow of control (Just & Carpenter 1980: 331).]

The flow chart on the left-hand side of the diagram in Figure 3.1 illustrates the sequence of the cognitive processes at different fixation points within a sentence. It starts at the saccade before each fixation and proceeds downwards. As lexical information alone is not sufficient for text comprehension, this information has to be related to other elements (in English predominantly on the syntagmatic level) in order to assign case roles to the different syntactic units. The main piece of data interpreted in eye-tracking studies according to Just and Carpenter (1980) is, thus, the position and duration of fixation points, and most researchers have followed this approach (Rayner 1998: 378). This is, however, not entirely uncontroversial, as it has been argued that saccade durations should be added to the computation (Inhoff & Radach 1998; Rayner 1998). These finer points concerning the correct measure are not discussed in the present section, but it should be noted that many of them are still under scrutiny. In particular, in terms of the definition of functional oculomotor events, there does not seem to be unambiguous agreement within the field (Radach & Kennedy 2004: 14). Drawing on a survey study amongst 32 researchers who use oculomotor techniques (Inhoff & Radach 1998), Radach and Kennedy (2004) report that “all the researchers believed that there is a need for increased discussion of measurement and methodological issues. More strikingly, two thirds of them considered the definition of oculomotor events […] to be controversial” (Radach & Kennedy 2004: 14). However, even if there is still room for methodological improvement, and even though the progress of the theoretical and methodological apparatus is also driven by technological advancements in measuring techniques, in particular when it comes to the combination of eye-movement and neurolinguistic methods, the remainder of this section does not continue this discussion but instead focuses on different types of linguistic application. Apart from the core data, that is, the results of measuring, for example, fixation duration, there is a certain amount of metadata that also needs to be taken into consideration. As with the other experimental methodologies, this relates mainly to the composition of the group of participants and the selection of the stimuli. Both are, of course, strongly related to the research question and are also linked to a number of different considerations which are often interdependent. The interdependence between input stimulus and participant profile is well illustrated by language acquisition studies that focus on pre-readers. In the brief discussion above, only monomodal written input in reading tasks was considered, because oculomotor monitoring focuses on the visual field and, thus, visual input can be seen as the most typical input. However, cross-modal stimuli are also frequently used, especially with participants that cannot be subjected to reading tasks. This is usually the case in language acquisition research, where participants are frequently below the age of being able to read competently. The possibility of including very young participants in eye-movement studies is one of the great advantages of this methodology over the conscious decision tasks previously discussed. Concerning the applicability of eye-movement studies to a large scope of possible participants, Sedivy reports that: Because of the task simplicity involved, the naturalness of the behavioral measure and the variety of possibly technical implementations […] this method is suitable for a broad range of subjects. […] It has been productively used with infants as young as 14 months of age, with normal adults and with all age ranges of childhood between infancy and adolescence. In addition, various studies have used the technique with bilinguals […], as well as brain-damaged populations such as aphasics […], apraxics and their age-matched (often elderly) controls. (Sedivy 2010: 122)


This greater variety of possible informants, therefore, results in a need for a number of methodological decisions in the planning phase of the specific experiments. Fernald et al. (2008), for example, describe an experiment design tailored to the specific requirements of infants from the age of about 14 months upwards, that is, from the earliest point at which eye-movement data can plausibly be related to linguistic input.8 In this study they apply a cross-modal approach, referred to as the “looking-while-listening paradigm” (Fernald et al. 2008: 190). The input in this framework is cross-modal, that is, children are presented with visual stimuli that consist of “[r]ealistic images of common objects judged to be prototypical for children at each age” (Fernald et al. 2008: 191), including target and distractor items, and auditory stimuli that consist of sentences containing target words “understood by all participants according to parental report” (Zangl & Fernald 2007: 204). As was the case in the previous study, the age of informants also plays an important role for eye-tracking studies focusing on adult participants, as working memory capacity limitations may be a function of maturation and decrement (Just & Carpenter 1992). Bearing in mind that working memory plays a crucial part in reading comprehension, individual differences in working memory capacity could influence the possibilities for storing information on alternative possible interpretations of, for example, functional syntactic roles, as a “larger capacity of some individuals permits them to maintain multiple interpretations” (Just & Carpenter 1992: 122). Regarding the correlation between speaker age and working memory restrictions in the interpretation of syntactic constructions, Just and Carpenter (1992: 129) explain that: Syntactic constructions that make large demands on working memory capacity […] are the very types of constructions that produce age-related decrements in elderly people. […] For example, older adults (65–79 years of age) show relatively greater deficits than do younger adults, when they must make an inference that requires integrating information across sentences.

Based on the observation of age effects on working memory, Kemper et al. (2004) use garden-path structures in which the verb of a reduced relative clause may be misinterpreted as the main verb, as exemplified in (3):

(3) The experienced soldiers warned about the dangers conducted the midnight raid  (Kemper et al. 2004: 157)

8. Studies at an earlier age do not yield interpretable data: “[w]hile 12-months-olds are happy to look at the pictures displayed, they are likely to fixate on one picture and ignore the other on any given trial, shifting less frequently overall than infants just two months older” (Fernald et al. 2008: 191).




In order to draw conclusions about working memory capacity they measure “first pass fixation times, leftward regressions to previous parts of the sentence, and total fixation times plus subsequent fixations arising from regressions and re-readings of the sentence in whole or in part” (Kemper et al. 2004). Their results are congruent with the initial hypothesis that “older adults, as a group, resemble low span readers” (Kemper et al. 2004). While neither this experiment nor its results shall be discussed in detail at this point, the examples of experiment design described above (Fernald et al. 2008; Kemper et al. 2004) illustrate well the possibilities of using eye-tracking methodologies across the whole age spectrum of speakers and, at the same time, help to point out the importance of creating an age-controlled participant field as well as controlled linguistic input stimuli.
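Before turning to neuroimaging, the core oculomotor measures discussed in this section (fixation duration, saccade duration and amplitude, and leftward regressions) can be illustrated with a minimal sketch. It assumes an idealized, pre-segmented fixation record; real systems work from raw gaze samples and, as noted above, the event-detection criteria are themselves debated. All values are invented.

    from math import dist  # Euclidean distance, Python 3.8+

    # Idealized, pre-segmented fixation record: (start_ms, end_ms, x_px, y_px).
    fixations = [
        (0, 240, 100, 300),
        (270, 490, 180, 300),
        (520, 760, 265, 300),
        (800, 1020, 150, 300),   # leftward regression to earlier material
    ]

    # Fixation duration: the interval between two saccades.
    durations = [end - start for start, end, _, _ in fixations]

    saccades = []
    for (s1, e1, x1, y1), (s2, e2, x2, y2) in zip(fixations, fixations[1:]):
        saccades.append({
            "duration_ms": s2 - e1,                  # time the eyes are in motion
            "amplitude_px": dist((x1, y1), (x2, y2)),
            "regression": x2 < x1,                   # leftward movement in reading
        })

    print("fixation durations:", durations)
    print("number of regressions:", sum(s["regression"] for s in saccades))

Measures such as Kemper et al.’s first-pass fixation times and regression counts are aggregations of exactly this kind of record over predefined regions of the sentence.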

3.5  Neuroimaging in empirical linguistics: ERP and fMRI data

The discussion of different types of data and the resulting methodologies in this chapter has been organized along a continuum of the conscious influence over, and awareness of, the participants’ response to the linguistic data or stimuli. In the first type of experimental settings (i.e. grammaticality judgments, lexical decisions), informants have a great degree of conscious influence over their response. This has been shown to be increasingly less the case for the eye-tracking studies discussed in Section 3.4, where fixation is generally below the level of conscious awareness. Section 3.5 discusses the extreme point of this cline, where participants have no conscious control over their response whatsoever, since the response measures physiological changes at the neurological level. Two methodologies have become increasingly popular over the last couple of decades. The first of these non-invasive measuring techniques of brain functions in relation to language input is the electroencephalographic measurement of electric potentials on the scalp as a stimulus-induced response (event-related potential (ERP) type studies). The second is neuroimaging of brain functions based on functional magnetic resonance imaging (fMRI-type studies). It is also possible to use a combination of these techniques. ERP-type studies measure “the electrical activity of the brain time-locked to the presentation of a stimulus” (Friederici et al. 1993: 184). This activity is measured on the surface of the scalp, usually by means of electrode-equipped elastic caps fitted to the participant’s head, which record an electroencephalographic response to a stimulus. However, the “ERP is generally too small to be detected in the ongoing EEG […] and requires computer averaging over many stimulus presentations to achieve adequate signal/noise ratios” (Hillyard & Kutas 1983: 35), so that the basic EEG data is further amplified and averaged, as illustrated in Figure 3.2:


Figure 3.2  Idealized waveform of the computer-averaged auditory event-related potential (ERP) to a brief sound (Hillyard & Kutas 1983: 35). [Figure not reproduced: the ongoing EEG is amplified and passed through a signal averager; the resulting idealized waveform, plotted on a logarithmic time axis from 10 to 1000 msec after stimulus onset and on a scale from -5 µV to +5 µV, shows the early components (I to VI), the midlatency components (N0, P0, Na, Pa, Nb), and the later components P1, N1, P2, N2, P3 (P300), Nd and SW.]

As shown in Figure 3.2, ERP measurement has a high temporal resolution, usually measured in milliseconds from the onset of the stimulus. As a high-resolution real-time measure of neural response, primarily auditory stimuli are used, as speech “unrolls a sequence in time” (Friederici et al. 1993: 184) and stimulus onset points can thus be measured with a great degree of precision. This strict linearity is not given for visual stimuli of written language data, although it can be enforced when ERP-type studies present the visual stimulus in a predefined sequential order. A second possibility, chosen for the present research, is to combine ERP measurement with an eye-tracking approach, which allows for temporal monitoring of the reading process and for time-locking the EEG to the fixations of the participants. Typical fields of linguistic application of ERP measurement are similar to those discussed in the previous section on eye-tracking studies, concentrating on phonological, lexical, semantic or syntactic processes. Friederici et al. (1993), for example, illustrate the sensitivity of ERPs to semantic and syntactic processing of auditory inputs. Somewhat simplified, certain ERP components are sensitive to violations of morphosyntactic, semantic or functional syntactic expectations. That is, if the stimulus deviates from the expected linguistic input, the cognitive process in reaction to the deviating stimulus results in a measurable change in the EEG profile.
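The averaging step shown in Figure 3.2 can be illustrated with a minimal sketch: epochs of a simulated single-channel EEG are cut time-locked to known stimulus onsets, baseline-corrected, and averaged so that the stimulus-locked response emerges from the noise. Sampling rate, onsets and the injected ‘effect’ are all invented toy values.

    import numpy as np

    srate = 1000                              # 1000 Hz: one sample per millisecond
    rng = np.random.default_rng(0)
    eeg = rng.normal(0.0, 10.0, 60_000)       # 60 s of noisy single-channel EEG (µV)
    onsets = np.arange(1_000, 59_000, 1_500)  # known stimulus onsets (sample indices)

    # Simulate a small stimulus-locked negativity buried in the noise.
    for on in onsets:
        eeg[on + 350 : on + 450] -= 4.0       # deflection around 400 ms post-onset

    def average_erp(eeg, onsets, pre=100, post=800):
        """Cut epochs time-locked to each onset, baseline-correct, and average."""
        epochs = []
        for on in onsets:
            epoch = eeg[on - pre : on + post].copy()
            epoch -= epoch[:pre].mean()       # baseline correction (pre-stimulus mean)
            epochs.append(epoch)
        return np.mean(epochs, axis=0)        # averaging improves the signal/noise ratio

    erp = average_erp(eeg, onsets)
    print(f"mean amplitude 350-450 ms: {erp[450:550].mean():.2f} µV")  # approx. -4 µV

The single-trial response is invisible against the background activity; only the average over the roughly forty presentations recovers it, which is exactly why averaging is indispensable in ERP research.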




Friederici et al. (1993) use four different types of auditory input: incorrect semantic input as shown in (4), incorrect morphological input (5), incorrect syntactic input (6), and correct controls (7):

(4) Die Wolke wurde begraben. (The cloud was buried)
(5) Das Parkett wurde bohnere. (The parquet was polish)
(6) Der Freund wurde im besucht. (The friend was in the visited)
(7) Der Finder wurde belohnt. (The finder was rewarded)
 (Friederici et al. 1993: 186)

For (4), a correct form would either need a buriable object (‘Die Leiche wurde begraben’ / ‘The body was buried’) or a verb that can take the cloud as an object in a passive construction (‘Die Wolke wurde betrachtet’ / ‘The cloud was watched/viewed’). The correct morphological form in (5) would be ‘Das Parkett wurde gebohnert’, where gebohnert is the past participle of the German verb bohnern (to polish) that is obligatory in the passive construction. Bohnere is a first person singular active present tense form of the verb bohnern, which would be correct in the sentence ‘Ich bohnere das Parkett’ (‘I polish the parquet’). Example (6) either lacks the object of the preposition im (e.g. ‘Der Freund wurde im Krankenhaus besucht’ / ‘The friend was visited in the hospital’) or contains a superfluous preposition (‘Der Freund wurde besucht’ / ‘The friend was visited’). These types of sentences with different violation conditions were used as auditory input during EEG recordings of 16 adult test participants to account for the temporal and topographic distribution of the ERP response to the different violation conditions (Friederici et al. 1993: 189). Friederici et al. (1993) isolate three different effects that correlate with the violation condition of the input sentence. Semantic restriction violations result in “a negative going effect in the time domain at around 400ms with a broad distribution over both hemispheres” (Friederici et al. 1993: 190). This N400 effect is consistent with earlier research on semantic mismatch by, for example, Kutas and Hillyard (1980), who showed that the amplitude of the N400 is also influenced by the degree of semantic mismatch, as illustrated in Figure 3.3. The grand average ERP’s (across all subjects) to the seventh word showed the N400 component to be substantially larger after the strong semantic mismatches […] than the moderate mismatches […]. This comparison was significant at all electrode sites, with the N400 quantified as the area in the latency zone between 400 and 600 msec […]. (Kutas & Hillyard 1980: 203–204)




Figure 3.3  N400 effect at moderate and strong semantic mismatch (Kutas & Hillyard 1980: 203). [Figure not reproduced: panel (a) shows the word-by-word presentation of the stimulus sentences ‘It was his first day at work.’, ‘He spread the warm bread with socks.’ and ‘She put on her high heeled shoes.’; panels (b) and (c) show the N400 at electrode sites Fz, Cz and Pz for moderate and strong semantic mismatches, with the N400 area (in µV-msec) larger for strong mismatches; panel (d) shows a physically deviant word eliciting a P560 rather than an N400.]




Unlike the N400 effect shown for semantic mismatches, syntactic argument structure mismatches of the type shown in (6) evoke an EEG peak at approximately 180ms after the stimulus, while morphosyntactic mismatches evoke a slightly later peak in the same region (at approximately 300–500ms after the stimulus). Unlike the equitopographical distribution of the N400, these peaks are predominantly identified “over frontal and anterior lateral electrode sites, most salient over the electrode ‘Broca left’” (Friederici et al. 1993: 190). The timeframe and positioning of these peaks are reflected in the terms early left anterior negativity (ELAN) for the peak correlated with argument structure violations at 180ms past stimulus, and left anterior negativity (LAN) for the peak correlated with morphosyntactic mismatches. An additional peak can be identified at approximately 600ms post stimulus. This positive going effect (P600) “correlates with outright syntactic violations (following the ELAN), with ‘garden-path’ sentences that require syntactic revision, and with processing of syntactically complex sentences” (Friederici 2002: 81). Figure 3.4 illustrates these different language-related components in the ERP in relation to the processed sentences:

Figure 3.4  N400, ELAN and P600 components in the ERP (Friederici 2002: 81). [Figure not reproduced: ERP curves contrasting a correct sentence (Das Hemd wurde gebügelt, ‘The shirt was ironed’) with a semantic violation (Das Gewitter wurde gebügelt, ‘The thunderstorm was ironed’), which elicits the N400 at Cz, and with a syntactic violation (Die Bluse wurde am gebügelt, ‘The blouse was on ironed’), which elicits the ELAN at F7 and the P600 at Pz, plotted over 1.5 s on a scale from -5 µV to +5 µV.]

This type of ERP response data helps to gain insights into the cognitive processes involved in language comprehension. One of the great advantages of this methodology is that it can be applied to participants who are not able to make or articulate metalinguistic judgments, as is, for example, the case for very young children in language acquisition research or for speakers undergoing language attrition or suffering language loss. Molfese and Molfese (1997), for example, have used ERP recordings of newborn infants as young as 48 hours in “response to a series of 9 consonant-vowel syllables” (Molfese & Molfese 1997: 135). Moreover, ERP-type studies offer a very high temporal resolution and are relatively easily applicable to a large variety of participants.


A problem of ERP-type studies, when compared to other neuroimaging techniques, is, however, that the localization of brain activity on the basis of electrical activity on the scalp is unreliable. Rispens and Krikhaar (2010: 97) explain that: […] localization of brain activation is difficult due to the so-called inverse problem. This refers to the problem that in general there is no unique solution to compute the sources of the neural electric activity within the brain that is measured at the scalp. Thus, it is not the case that the place on the scalp, where an ERP component is measured, corresponds to the place where the neural activity is generated.

ERP-type studies can, therefore, provide sound information on the different types of processing that seem to be involved in language comprehension and can also be seen as an effective complementary method to the methodologies described earlier, such as eye-tracking. However, they do not provide linguists with direct insights into the actual brain functions but use epidermal electrical fluctuation as an indication of internal neural activity. In other words, ERP-type studies provide the researcher with insights into which phenomena of language are processed when, but they are not as effective in answering questions about localization. Questions of localization, however, have played an important role in psycholinguistic research, and classic localization theories such as the Broca-Wernicke-Lichtheim (BWL) model have only recently become subject to rethinking, as brain imaging technology constantly improves and makes non-invasive studies of healthy participants increasingly feasible. One of the methodologies that has become fairly popular over the last couple of decades is functional magnetic resonance imaging (fMRI), a technique based on monitoring changes in blood oxygenation resulting from neural activity (Ogawa et al. 1990; Huettel et al. 2014). With regard to resolution, this method can be described as complementary to ERP studies, as the spatial resolution of active brain areas during language production and comprehension is relatively high, while the temporal resolution is far less precise. As the areas of application of this methodology are relatively broad, ranging from the localization of primary sensory and motor areas (e.g. Kim et al. 1993) to studies of higher cognitive functions (Binder et al. 1997: 353), the present section is restricted to a brief presentation of two different applications in neurolinguistics. The first application is to identify active areas in the brain during speech production and comprehension with a higher degree of precision than the classical models (based on post-mortem dissection of patients with pathological conditions) offer, by measuring neuronal activity in the healthy brain. The second study presented in this section is more closely connected with a specific issue in psycholinguistic research, that is, the question of lexical storage and retrieval, and thus ties in directly with issues addressed in Sections 3.3 and 3.4.




Based on early studies conducted with brain-damaged patients, many studies in neurolinguistics have been concerned with the identification of language areas. Although this term is not very precise and might be seen as something of a misnomer, there is relatively unambiguous agreement on the idea that linguistic functions are seriously impeded when specific areas within the human brain are damaged. However, as Bates (2003) points out, the mere fact that impeded linguistic capacity correlates with damage to specific areas does not automatically mean that human language capacity is restricted to these areas. Based on the unanimous agreement that language perception is a cognitive function and that cognitive functions are located in the brain, better monitoring of the healthy brain should make it easier to locate those parts of the brain that are dominantly used during the performance of linguistic tasks. The problem with this approach, however, is that it is not yet clear whether brain activity is domain specific, that is, whether linguistic tasks use different parts of the brain than non-linguistic tasks, or task specific, indicating that tasks from different domains that are similar in their dependence on “such factors as the amount and kind of memory required for a task, its relative level of difficulty and familiarity to the subject its demands on attention, the presence or absence of a need to suppress a competing response, whether covert motor activity is required, and so forth” (Bates 2003: 255) could depend on similar structures within the brain. Accordingly, input data for fMRI studies interested in isolating ‘language areas’ within the brain often consists of linguistic data as well as control data that is similar in task type and complexity. Binder et al. (1997), for example, had participants perform a linguistic task “which required meaning-based distinctions about aurally presented words […] to elicit receptive language processing at both phonetic […] and semantic […] levels” as well as a “baseline task, which require pitch based decisions about tone sequences” (Binder et al. 1997: 353) to identify those areas within the brain that are active in the linguistic task but not in the baseline task. Binder et al. (1997) address the problem of domain versus task specificity in the processing of aural stimuli by subtracting the baseline results from the results of the linguistic task, isolating those areas that are highly active during the latter but not during the former. However, this procedure is not entirely unproblematic with regard to the input data for both the baseline task and the linguistic task. Considering the number of factors that could influence linguistic reception and comprehension, it seems to be very difficult to create input data that reflects the complexity of natural language while at the same time identifying non-linguistic baseline tasks that are equal, or at least comparable, in their setting and complexity. Therefore, more recent studies tend to focus more narrowly on specific linguistic tasks and often combine fMRI with other experimental methods in order to create solid topographical imaging combined with a higher temporal resolution in relation to task performance. Fiebach et al. (2002) is an interesting example of a combination of a lexical decision task with neuroimaging of active brain areas. Since the use of lexical decision tasks is so widespread within experimental linguistics, the issue of suitable non-words is fairly well documented and it is possible to illustrate differences in brain activity in the processing of words and pseudowords. In Fiebach et al.’s (2002) study, theories of dual-route access to the mental lexicon are tested with a combination of a lexical decision test and functional neuroimaging. For the lexical decision task the input data is visual and consists of “a pseudorandomly ordered sequence of high- and low-frequency words (i.e. 68 inanimate nouns of each type) and […] the same number of phonological legal pseudowords” (Fiebach et al. 2002: 12). Unlike the lexical decision tasks discussed in Section 3.3, in addition to response time an additional layer of data is added by scanning the brain activity of the participants. The resulting images are contrasted in a way similar to that previously discussed, that is, by subtracting activity data of non-linguistic processing from brain activity in the linguistic task: Reported are direct contrasts (a) between the activations elicited by words and pseudowords, as well as (b) between words of low- and high frequency and (c) between pseudowords and low- and high-frequency words, respectively.  (Fiebach et al. 2002: 12)
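The subtraction logic shared by these studies can be sketched as follows, with an invented toy voxel grid standing in for real imaging data: activation in the baseline task is subtracted voxel by voxel from activation in the linguistic task, and the difference map is thresholded. Real analyses involve normalization, smoothing and corrections for multiple comparisons that are omitted here.

    import numpy as np

    rng = np.random.default_rng(1)
    shape = (16, 16, 8)                       # toy voxel grid, not real dimensions
    baseline = rng.normal(100.0, 5.0, shape)  # mean signal during the baseline task
    linguistic = baseline + rng.normal(0.0, 5.0, shape)  # independent task noise
    linguistic[4:8, 6:10, 2:5] += 20.0        # simulated 'language-related' cluster

    contrast = linguistic - baseline          # voxelwise subtraction
    z = (contrast - contrast.mean()) / contrast.std()
    active = z > 3.0                          # crude threshold; real analyses also
    print(int(active.sum()), "voxels above threshold")  # correct for multiple tests

The sketch makes the underlying assumption of the method visible: whatever survives the subtraction is attributed to the linguistic component of the task, which is exactly why the comparability of task and baseline discussed above is so critical.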

Based on the distinct input data and their contrasts, Fiebach et al. (2002) can show different brain activations depending on lexicality and word frequency. Inter alia, their results show that there are “significant activation differences between words and pseudowords” (Fiebach et al. 2002: 14) and that low-frequency words caused stronger activation than high-frequency words in specific brain regions.9 Moreover, they can corroborate claims of the dual-route theory (Coltheart et al. 1993) by showing that there are clear differences between lexical and non-lexical stimuli, in contrast to the computation of a visual-orthographic unit, which should elicit comparable activation for words and pseudowords (Fiebach et al. 2002: 18). The example of Fiebach et al. (2002) illustrates well how neuroimaging techniques have a particularly high explanatory power when they are used in combination with other experimental methods. In other words, since the participants are usually performing linguistic tasks while their brain activity is measured by either EEG or fMRI, the possibilities of interpreting the activation data greatly increase when the tasks are narrowly defined. This is particularly the case when these tasks are derived from a well-researched background that allows the researcher to control the input data for a large variety of factors, lexical frequency being only a very simple example.

9. A concise discussion of the differences with respect to the exact location of higher brain activation for words compared to non-words and for low-frequency versus high-frequency words is beyond the scope of this methodological discussion; the reader is referred to Fiebach et al. (2002) for details and illustration.

3.6  Data in experimental linguistics and psycholinguistics: A brief summary

Chapter 3 has shown the differences between the types of data used in corpus linguistics and in psycholinguistic research. While in corpus linguistics specific language data is of interest per se, in psycholinguistic research linguistic data is used as principal input data, whereas the interpreted data has a primarily metalinguistic character. The focus lies on the speakers rather than on language itself; therefore, speaker response data to linguistic stimuli plays a major role in the different types of experimental design that have been the focus of Chapter 3. The stimulus-response character of most psycholinguistic studies necessitates language input data that is strictly controlled in order to achieve a high degree of comparability between individual speakers’ responses. This controlled nature of the input data limits the usefulness of natural language data of the kind that corpus linguists are interested in, because natural language data is often very complex and, especially in the spoken medium, often rather ‘messy’, containing elements of repair, self-correction, false starts and so forth. In Chapter 3, the different experimental methods under discussion have been approximately organized according to the degree of conscious decision that is made by the participant. This organization, however, also reflects a certain cline in the necessity for input data control by the researcher. For grammaticality judgment tasks, it is quite possible to use natural language data, which may also be one of the reasons for their popularity in many research areas. Moving along the cline of conscious response outlined throughout the chapter, the need for less natural data seems to increase. In the case of lexical decision tasks, it is, for example, necessary to create word lists in which the individual types correspond to categories of frequency, phonetic accessibility and so on. Even if the decision tasks exceed the level of individual words by using longer stretches of language as linguistic primes, it is often necessary to adjust these to fit specific criteria, which limits the usefulness of natural language input, while at the same time the information about the frequency categories may come from observations of natural language.


While the need for artificially ‘cleaned’ input data increases even further in the types of experiments where the response data contains a high amount of temporal or spatial resolution, that is, in eye-tracking and neuroimaging studies, the observations of corpus linguistics are still particularly relevant in psycholinguistic research settings. Especially at the interface of natural ‘real world’ data on the one hand and experimental input data and participant selection on the other, much may be gained from the different types of evidence that corpus linguistic data may provide.

Part II

Language processing of intermediate and advanced learners of English: A multi-method approach

Chapter 4

Interference collocations of advanced German learners of English

4.1  Introduction

Part I of the present work introduced two different approaches to data in empirical linguistics and illustrated how these different approaches are useful to answer different questions on language use and language processing. In summary, it showed how corpus linguistic data can be used to answer questions of what is frequent and typical within specific types and varieties of language. Psycholinguistic approaches, on the other hand, are almost always experimental, asking test subjects to perform a number of linguistic tasks and measuring their performance in a predefined way. In Part II of this work I describe an attempt to integrate these different data types into a multi-method approach. In line with the first part of this work, the pilot study in the second part uses the types of data and methodologies previously described. For several reasons, focusing on non-native English speakers’ use of collocation seems to be a suitable testing ground for the integration of corpus data and experimental data within language processing. Firstly, the study of collocations is a well-established area in the field of corpus linguistics, and there are a number of earlier studies that focus on integrating the corpus linguistic description of this seemingly universal language phenomenon with notions of biographically defined language learner types and typical ways of lexical language processing (see e.g. Ellis et al. 2009; Ellis & Frey 2009). Secondly, it has been shown that the use of collocations seems to be influenced by L1 factors, so that L1 interference often leads to the use of unusual or wrong collocations in the target language (see e.g. Nesselhauf 2004). Thirdly, albeit well described from a descriptive point of view, the processing of collocation has so far not been fully analysed from a psycholinguistic point of view. There seems to be relatively strong agreement among many linguists that collocations and larger, potentially semi-prefabricated language units are used as a way of facilitating language processing and production (see e.g. Sinclair 1991; Wray 2002). However, there are still relatively few studies that test these notions empirically, particularly when it comes to learners and multilingual speakers. Chapter 4 provides a starting point for the analysis by focusing on collocation and collocability across languages and illustrates some of the difficulties advanced learners have in producing collocations that are typical and frequent in the target language. After a brief introduction to collocation, the specific problems of contrastive collocations and interference collocations are described in order to define the basis for the empirical analysis that is undertaken in Chapters 8 and 9.

4.2  Collocation and collocability

The notion of collocation and the collocability of lexical items is a relatively old concept in descriptive linguistics. Most of the seminal work in this regard can be traced back to Firth (1957) and his followers within the framework of British contextualism. Firth’s original work on collocation is based on his statement that “the main concern of descriptive linguistics is to make statements of meaning” (Firth 1957: 190). Challenging the traditional dichotomy of lexical versus grammatical meaning, Firth introduced the level of meaning by collocation, which can be argued to be on the border between those two elements of meaning (Schilk 2009). In his early definition, Firth describes collocation as an “abstraction on the syntagmatic level” (Firth 1957: 196), as discussed in Chapter 2. Halliday (1966) illustrates this abstraction on the syntagmatic level by demonstrating that there are restrictions on the paradigmatic choices allowed for at the syntagmatic level that cannot be accounted for by structural restrictions, so that in “lexicogrammatical statements collocational restrictions intersect with structural ones” (Halliday 1966: 163) without being dependent on these. However, although in Firth’s terms meaning by collocation is not the same as contextual meaning, there is also an element of contextual meaning to the concept of meaning by collocation. Using Halliday’s framework of item, set and collocation, this can be illustrated for different contexts of cultures and languages. Halliday (1966) uses the examples of strong argument and powerful car to illustrate the point that items such as strong are members of a specific set that also includes powerful. While some items within the set can be used in collocation with items of different sets, such as argument, others cannot and are instead used with items of other sets, such as car. As I have shown elsewhere (Schilk 2009), different cultural contexts have an influence on the collocability of items within different sets. An example from Schilk (2009) would be the different collocability of the verb offer in Indian English when compared to British English. While offer does not collocate particularly strongly with the noun prayer in British English, it does in Indian English. Although the collocation offer + prayer is possible in both varieties, a cultural contextual meaning makes the combination more likely in Indian English. In more extreme cases, a collocation of items that can be found in one cultural setting would not be possible within another, as is the case when we abstract away from varieties of the same language and take into account contrastive collocational possibilities across languages.




This can be illustrated by using Halliday’s example of the set that includes strong and powerful. While in the English language hot beverages such as coffee or tea only collocate with strong but not with powerful, in German the collocational range includes both stark (strong) and kräftig (powerful). The differences in the collocational range of strong and powerful in English and German can even be taken somewhat further when using, for example, ein starker Raucher (a heavy smoker) in German. While substance abuse and overindulgence can be used with the adjective for strong in German, the item that collocates in English is heavy (schwer), so that members of the set “abusive/indulgent user” (smoker, eater, drinker etc.) collocate with different adjectives in English than in German. The concept of an interference-based collocation can, therefore, be defined in terms of the collocational ranges of source and target lexemes, the statistical strength of the respective collocation in the L1 and L2, and their respective semantic opacity. When treated in isolation, there is usually one or more target lexemes that differ in frequency, range of use and range of collocability. While learners of a language often have access to a set of possible target lexemes, they might lack the complete knowledge of the context of situation these lexemes tend to be used in in the L2, and of the collocational range and strength of the item within the target language. They do, however, have this information for their L1 and may transfer this knowledge to the L2, although it may not hold in this language. There are, therefore, different levels of transfer involved in the choice of a specific collocation, which can lead to a correct target collocation, an incorrect L1 interference collocation or (at least theoretically) a complete collocational mismatch in both languages. Interestingly, these levels coincide with the approaches to the description of collocation traditionally taken in most earlier works on collocation, namely the phraseological approach and the frequency approach, where information about the collocational range and semantic opacity is of greater interest in the former, and information on the frequency of the single lexemes and their likelihood of collocating is of greater interest in the latter. In Section 4.3 these different approaches and their relation to interference collocation are discussed in more detail, because the methodology of the present study combines both approaches in the selection of the test collocations that are used in the experimental design.
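Before turning to these approaches, the contrastive situation described above can be made concrete with a small sketch. The miniature ‘collocational ranges’ below are invented toy data modelled on the strong/stark and heavy/schwer examples, not corpus-derived facts; the function simply flags L1 combinations whose literal translations fall outside the recorded L2 range and are thus potential interference collocations.

    # Toy collocational ranges (hypothetical): nouns each adjective combines with.
    english = {
        "strong": {"argument", "coffee", "tea"},
        "heavy": {"smoker", "drinker", "rain"},
        "powerful": {"car", "engine"},
    }
    german = {
        "stark": {"Argument", "Kaffee", "Tee", "Raucher", "Regen"},
        "kräftig": {"Kaffee", "Farbe"},
        "schwer": {"Unfall", "Fehler"},
    }
    # Literal translation pairs linking L1 and L2 items.
    adj_gloss = {"stark": "strong", "kräftig": "powerful", "schwer": "heavy"}
    noun_gloss = {"Raucher": "smoker", "Kaffee": "coffee", "Tee": "tea",
                  "Argument": "argument", "Regen": "rain"}

    def interference_candidates(german, english, adj_gloss, noun_gloss):
        """Flag L1 combinations whose literal translation is not an L2 collocation."""
        for g_adj, g_nouns in german.items():
            e_adj = adj_gloss.get(g_adj)
            for g_noun in g_nouns:
                e_noun = noun_gloss.get(g_noun)
                if e_adj and e_noun and e_noun not in english.get(e_adj, set()):
                    yield f"{g_adj} {g_noun} -> ?{e_adj} {e_noun}"

    print(sorted(interference_candidates(german, english, adj_gloss, noun_gloss)))
    # e.g. 'stark Raucher -> ?strong smoker' (target collocation: heavy smoker)

In a real study, such ranges would of course be derived from corpus data and weighted by collocational strength rather than treated as simple sets.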

4.3  Quantitative and phraseological approaches to collocation

Quantitative and phraseological approaches to collocation can be seen as complementary approaches to the phenomenon of word co-occurrence and to the added semantic information that is based not in the semantics of the single words but in what Firth (1957) called meaning by collocation. Both approaches have been described in some detail in earlier work on collocation, such as Nesselhauf (2004), Schilk (2009) and Schilk (2011); therefore only a summary of the main differences between the approaches is given here in order to illustrate how both approaches contributed to the selection of collocations for the present study.

Quantitative approaches are usually based on the concept that words tend to co-occur more frequently with some words than with others. It is therefore possible to create a collocational profile of a specific word by counting the occurrences of the word and the frequencies of its co-occurring words in a corpus and calculating the likelihood of their co-occurrence in relation to co-occurrence by chance. Most quantitative approaches start at the level of a single word, define an arbitrary span around this node word, count all tokens of the different word types that occur within this span, and relate the frequency of the node word, the frequency of the collocate and the frequency of the co-occurrence of these words. This allows for a potential rejection of the null hypothesis of chance co-occurrence. If the two items co-occur significantly more frequently than could be expected by chance, this co-occurrence of two lexical items is treated as a statistically significant collocation. Currently there is a relatively large body of work discussing the different statistical procedures to determine the likelihood and strength of collocation (e.g. Evert 2008) as well as the question of using arbitrary predefined collocational spans (e.g. Daudaravičius & Marcinkevičienė 2004). While still under frequent discussion, a practicality-based quasi-standard has been established by using a collocational span of three to ten words to the left and the right of the node word and employing either exact testing or, more frequently, Dunning’s (1993) log-likelihood ratio (G2), which has been shown to give a good approximation of exact values across the entire range of frequency signatures (Evert 2008: 1237). However, the quantitative approach gives no answers about the meaning profile of the component lexical items and the collocation, and relies on post-hoc qualitative analysis to account for any additional collocational semantics. Furthermore, quantitative approaches are often only useful for relatively frequent words and only feasible when large language corpora are available.
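To illustrate the quantitative procedure just described, the following minimal sketch computes Dunning’s log-likelihood ratio from the four cells of a 2 x 2 contingency table. It simplifies by treating every corpus token as a potential co-occurrence slot, ignoring the finer corrections for span size discussed in the literature (e.g. Evert 2008), and the counts used in the example are invented.

    from math import log

    def g2(joint, f_node, f_coll, corpus_size):
        """Dunning's (1993) log-likelihood ratio for a 2x2 contingency table.

        joint: co-occurrences of node and collocate within the chosen span
        f_node, f_coll: total frequencies of node word and collocate
        corpus_size: total number of tokens
        """
        o = [joint,
             f_node - joint,
             f_coll - joint,
             corpus_size - f_node - f_coll + joint]
        row = [o[0] + o[1], o[2] + o[3]]
        col = [o[0] + o[2], o[1] + o[3]]
        e = [row[0] * col[0] / corpus_size,
             row[0] * col[1] / corpus_size,
             row[1] * col[0] / corpus_size,
             row[1] * col[1] / corpus_size]
        # Observed-vs-expected comparison; cells with zero observations drop out.
        return 2 * sum(obs * log(obs / exp) for obs, exp in zip(o, e) if obs > 0)

    # Hypothetical counts for a node-collocate pair in a 100-million-word corpus.
    score = g2(joint=150, f_node=20_000, f_coll=8_000, corpus_size=100_000_000)
    print(f"G2 = {score:.1f}")

Since G2 is asymptotically chi-square distributed with one degree of freedom, values above roughly 3.84 and 10.83 correspond to significance at p < .05 and p < .001 respectively, licensing the rejection of the null hypothesis of chance co-occurrence.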




Table 4.1  Sub-categories of word-like combinations (Cowie 1998: 7)

Author | General category | Opaque, invariable unit | Partially motivated unit | Phraseologically bound unit
Vinogradov (1947) | Phraseological unit | Phraseological fusion | Phraseological unity | Phraseological combination
Amosova (1963) | Phraseological unit | Idiom | Idiom (not differentiated) | Phraseme, or Phraseoloid
Cowie (1981) | Composite | Pure Idiom | Figurative Idiom | Restricted collocation
Mel’čuk (1988) | Semantic Phraseme | Idiom | Idiom (not differentiated) | Collocation
Gläser (1986) | Nomination | Idiom | Idiom (not differentiated) | Restricted collocation
Howarth (1996) | Composite unit | Pure Idiom | Figurative Idiom | Restricted collocation

Schilk (2011) describes the types of combinations to which the authors (column 1) apply the different terms: The middle column shows the different terms for semantically opaque, unmotivated word combinations. Examples of those would be spill the beans or kick the bucket whose meanings can be construed neither compositionally nor by semantic extension. […] Partially motivated units are word combinations with meanings that are still not construable compositionally, but can be assigned by means of metaphorical extension: examples here would be blow off steam […] or hit rock bottom […]. (Schilk 2011: 25)

In the present context the most interesting phraseological units are those in column 5. These are most frequently referred to as collocations or restricted collocations, and their definition differs notably from quantitative definitions in that it requires arbitrary restrictions on collocability. As opposed to “free combinations” (Cowie 1998), restricted collocations contain an element of restriction that is not purely semantic. Nesselhauf (2004: 14) illustrates this using the examples of drink tea and strong tea: in the first example the restriction is purely semantic and all elements are used in a literal sense; in the second there are arbitrary limitations on substitution (i.e. strong cannot be substituted by close synonyms such as powerful) and at least one element has a non-literal meaning. Table 4.2 illustrates these points by treating the fixedness of a lexical combination and its semantic transparency as a double continuum, ranging from minimal restrictions to complete structural fixedness on the first level and from total analytic transparency to complete opacity of meaning on the second.

Table 4.2  Syntagmatic fixedness and semantic transparency as a double continuum

Restriction type | only syntactic restrictions | word class-based and semantic restrictions | arbitrary restrictions | pragmatic restrictions | idiomatic expressions
Combination type | free combinations | free combinations | preferred combinations | formulaic expressions (relatively fixed) | idioms, sayings, proverbs (fixed)
Patterns and examples | C->NP+VP: the dog swims / the house is red; NP->(det)+N: the (little) dog / Bill | drink+N[liquid]: drink tea; V[ditrans]+Oi[anim]+Od: give Mary the book | Adj+N: strong tea / powerful car; prep+N/V: on the bus / in the shower | Happy Birthday / Merry Christmas / Will you marry me? | kick the bucket
Transparency | analytically transparent | analytically transparent | analytically transparent, selection opaque | analytically transparent, selection opaque | semantically and selectionally opaque




Distinctions in collocations are most obvious in columns three and four. Column three contains restricted collocations (as opposed to the free combinations in column two), here including both lexical restrictions between two open-class lexical items and structurally arbitrary combinations of closed-class prepositions and open-class NP complements (which are also frequently subject to L1 interference due to their arbitrary nature). The combinations represented in column three are semantically relatively transparent, while the basis for item selection is opaque or collocationally restricted. Combinations in column four are relatively similar, but here the restrictions are pragmatic in nature. Specific contexts of situation influence the selection of near-synonymous lexical candidates (there is, for example, no inherent semantic reason not to say ?Merry Birthday or ?Happy Christmas). Completely idiomatically fixed expressions (column five) are not used in the current analysis. However, they do play a role in some of the earlier work on the processing of lexical combinations (e.g. Siyanova-Chanturia et al. 2011) that some of the results are compared with. These combinations contain a large element of semantic opacity (i.e. their meaning cannot be derived from the meaning of their component parts) and they are syntactically highly restricted (e.g. passivization is frequently not possible, consider: ?Hundreds of buckets were kicked in the earthquake disaster).

For the present work this definition of a restricted collocation is very useful, as the arbitrary restrictions on collocability differ contrastively and can be seen as a source of transfer effects. In other words, if the L2 speaker transfers the set of arbitrary limitations of restricted collocations from his or her L1 to the L2, the result may be incongruent with the arbitrary restrictions at work in the L2. This transfer of collocability restrictions forms the basis for the interference collocations that are described in more detail in the following section.

4.4 Collocation in Contrastive Interlanguage Analysis (CIA)

Over the past two decades there has been an increasing interest in corpus-based descriptions of Learner English as a specific EFL variant. The concept of Learner English differs from the more general term, English as a Foreign Language (EFL). It concentrates on speakers who are in the process of learning English and, thus, mainly describes EFL speakers who are actively learning English in an institutional environment, such as primary school, secondary school or university. While there are relatively few databases that focus on early and intermediate learners (such as the Marburg Corpus of Intermediate Learner English, MILE (Kreyer 2014) or the Flensburg English Classroom Corpus, FLECC (Jäkel 2010)), most learner corpora focus on more advanced learners who are in tertiary education at the time of corpus data collection.


Two of the major corpora compiled from tertiary English learners are the International Corpus of Learner English (ICLE), which focuses on argumentative essays written by advanced undergraduate university students from sixteen mainly European mother-tongue backgrounds, and the Louvain International Database of Spoken English Interlanguage (LINDSEI), which can be seen as the spoken counterpart to ICLE in terms of speaker population. Since the publication of ICLE in 2002 and LINDSEI in 2010, there have been many studies within the framework of Contrastive Interlanguage Analysis (CIA) based on these corpora (for a survey, see Granger 2015) focusing on phraseology. Granger (2015) describes this phraseological view of CIA as follows:

The dominant view of lexis in CIA has been phrasal rather than single-word-based. The literature abounds in studies of collocations, colligations, lexical bundles and collostructions, which have considerably enhanced our knowledge of the L2 phrasicon. (Granger 2015: 10)

Sinclair (2008) describes this phrasal focus by pointing out that the “traps for the unwary” (xvii) that exist for the learner on the word level, e.g. in the form of “false friends between pairs of languages” (xviii), are also expected to exist on the level of the phrase. On the phrasal level, however, these difficulties are harder to describe for the scholar and harder to master for the learner, since “there are all sorts of differences in scope, range, connotation and usage conventions, and the influence of wider culture to take into account” (Sinclair 2008: xviii). In a seminal article published in 1983, Pawley and Syder show that nativelike selection and fluency is potentially also based on the phrasal level, rather than only on the individual word level. They show, for example, that the information contained in the preferred native-like phrase I want to marry you may be conveyed by numerous alternatives, yet these alternatives would not be “likely to be accepted as ordinary, idiomatic uses by English speakers” (Pawley & Syder 1983: 196). As these usage conventions may differ between individual languages (as pointed out by Sinclair 2008), the learner is faced with a dual (or multiple) convention system not only on the level of the word but also on the level of the phrase. Preferences for individual word combinations (i.e. collocations) may therefore be viewed as embedded within larger units of meaning (e.g. the phrase). Furthermore, as Hunston’s (2002: 175) example of ‘pattern flow’ illustrates, there is an additional potential of co-selection of these larger units of meaning across longer stretches of text, which lends credence to the idea of texts consisting of combinations of (semi-)pre-fabricated units or, in Wray’s (2002) terms, ‘formulaic sequences’ (Wray 2002: 9). This, in turn, raises questions of potential facilitation of fluency in production and processing within and across these semi-pre-fabricated units, as pointed out by Ellis et al. (2009).




In a research survey of studies of formulaic language in learner corpora, Paquot and Granger (2012) present an outline of the body of learner corpus research with a focus on what Granger (2015) calls the phrasicon. They show that there are some major difficulties for EFL learners in the field of collocation. Problematic issues here include collocational overuse and underuse when compared to native speaker use (i.e. learners will use a small set of frequent collocations in a wide range of situations where their native speaker counterparts have access to a larger scope of phraseological and collocational expressions), and collocational errors:

Collocations are also particularly error-prone. Nesselhauf (2005) made use of dictionaries, corpora, and native-speaker informants to investigate the acceptability of around 2,000 verb-noun collocations and found that approximately one third could be considered unacceptable or questionable […]. Nesselhauf (2005) showed that the most frequent type of errors in verb-noun combinations involved the wrong choice of verb (e.g. *carry out races). (Paquot & Granger 2012: 137)

The deviations described by Nesselhauf (2005) are not only on the level of transfer of German collocability patterns of the verb with the respective noun, but also include prepositional errors, determiner errors and completely unacceptable combinations (Paquot & Granger 2012: 137; Nesselhauf 2005: 238–239), although it should be noted that, particularly in the case of prepositional errors, transfer effects are also frequent because prepositional use is also highly arbitrary across languages. Regarding transfer of collocability from German to English, that is, interference collocations, Nesselhauf (2005) explains that: As to intralinguistic factors correlating with collocation difficulty, congruence (i.e. word-for-word equivalency of a collocation in the learners’ L1 and the L2) clearly emerged as the most important factor. Non-congruence between what the learner wishes to express in the L2 and the corresponding L1 expression […] was shown to lead to a deviation in around 50% of the cases. (Nesselhauf 2005: 238)

These findings do not seem to be limited to advanced German learners of English. Other studies have demonstrated similar effects for learners with different L1s, for example, Laufer and Waldman (2011) for Hebrew-speaking EFL learners and Gilquin and Shortall (2007) for French-speaking EFL learners. While this indicates that collocational proficiency is only partially influenced by different foreign language teaching methods, as these may vary considerably within the different learner communities under scrutiny, the factors that influence the acquisition of native-like use of collocation are currently somewhat unclear. Reasons for this might be found in the heterogeneous concept of the advanced learner of English. Although learner corpora suggest an ideally homogeneous advanced learner, the actual learners represented in the various corpora may be far more heterogeneous than the monolithic term advanced learner implies.


Across the different corpora this is partially based on the distinctions between the different L1s and the different cultural settings (see e.g. Jarvis 2000 on the different influences of L1 and cultural setting for Swedish and Finnish learners). Furthermore, the aforementioned differences in school systems and teaching methods may focus on different types of language proficiency, and while there are attempts towards unifying the concept of the advanced learner (e.g. by applying the CEFR scale), these are still limited to specific regional and cultural settings (in the case of the CEFR this is mainly Europe). Even if these scales are applied consistently, learners rated at, for example, C1 may still show substantial differences in their individual proficiencies in specific fields, such as collocational proficiency. Moreover, the inclusion of an individual learner’s data in most learner corpora is mainly restricted by time-based variables (e.g. years of tertiary education in English) and does not necessarily involve individual proficiency testing based on the CEFR or other language ability scales. Within the different L1 subcorpora, further heterogeneity may be introduced by including speakers from different educational systems. Considering the German subcorpora of ICLE and LINDSEI used for the present study, Germany alone has sixteen different educational systems due to its federal organization, and ICLE Germany additionally includes data from Salzburg (Austria) and Basel (Switzerland). Finally, the inclusion of data in corpora is often based on availability rather than on more rigorous standards, as, for example, the age band of speakers included in LINDSEI-Ger (19–33 years, see Chapter 7) or the general overrepresentation of female speakers (compared to the general population rather than to university students of languages) in almost all current learner corpora suggests.

Chapter 5

Measuring eye movements for the study of language processing and comprehension

5.1 Introduction

Section 3.5 provided a general introduction to studies of eye movements in psycholinguistics. In Chapter 5 this information is broadened and applied to the use of eye-tracking data in relation to language processing and comprehension in the silent reading of sentences (as applied in the current study). The use of eye-tracking data to gain insights into language processing is a relatively well-covered field, with early applications dating back to the late 19th and early 20th century, and typical modern applications have been popular since at least the mid-1970s (Rayner 1998: 372). In general, two different types of eye movement are the main focus of eye-tracking experiments: fixations on specific areas on the one hand and rapid movements between areas, called saccades, on the other. Both of these types of movement are dependent on the nature of the task as well as on the specific individual being tested. Table 5.1, adapted from Rayner (1998), gives an overview of average fixation times and saccade lengths for the different tasks, silent reading being the relevant task for the current analysis.

Table 5.1  Approximate mean fixation duration and saccade length for different tasks (Rayner 1998: 372)

Task | Mean fixation duration (ms) | Mean saccade length (degrees)
Silent reading | 225 | 2 (about 8 letters)
Oral reading | 275 | 1.5 (about 6 letters)
Visual search | 275 | 3
Scene perception | 330 | 4
Music reading | 375 | 1
Typing | 400 | 1 (about 4 letters)


The figures provided in Table 5.1 are relatively rough approximations, because a number of additional factors influence fixation times and saccade lengths (Rayner 1984). In the present study, fixation times are slightly higher than the above-mentioned 225ms, as all readers are non-native English speakers with varying degrees of language competence. Underwood et al. (2004: 159) illustrate these differences between native and non-native speakers and report mean fixation times on all words in a passage of 201ms for native speakers (SD 25.6ms) and 228ms (SD 29.2ms) for non-native speakers. It should be noted that these numbers differ from those reported by Rayner (1998), possibly because they are averages across all words, making no distinction between word lengths or between fixation times on function words and content words. It does, however, seem relatively safe to assume that native speakers read somewhat faster than their non-native counterparts. As Underwood et al. (2004) explain, concerning relatively advanced L2 speakers:

The advantage for the natives was consistent across the various measures, including fewer and shorter fixations on all words in the twenty contexts, and fewer and shorter fixations on the terminal words. Although it is unsurprising that the natives would be more proficient readers, the non-natives were relatively advanced in their English, studying at the same university as the natives and having passed the university’s language requirements. (Underwood et al. 2004: 161)

This shows that there seem to be significant differences between the fixation durations of native speakers and advanced non-native speakers. While these differences are not of central importance for the present study, as all participants are non-native speakers, they should still be borne in mind for two reasons. Firstly, the speakers included in the reading experiment are non-native speakers whose levels of second-language competence differ both from those of the participants in Underwood et al.’s (2004) study and between the two participant groups. They differ from the subjects included in Underwood et al.’s (2004) study because they were not enrolled at an English-speaking university at the time of testing, and therefore the overall time spent reading English material is likely to be considerably lower. They also differ according to participant group because one group of participants had a higher proficiency level than the other. Secondly, there are comparatively few eye-tracking studies of English reading comprehension of (German) L2 learners of English, so that comparison with native speaker data is sometimes the only documented reference point. In these cases the differences mentioned above should be borne in mind. While this introduction has covered only a few findings concerning fixation times, the remainder of Chapter 5 discusses a more varied set of eye-tracking variables. These can be approximately divided into early measures, such as, for example, first fixation duration and first pass reading time, and late measures, such as the total reading time spent on a specific area of interest (Siyanova-Chanturia et al. 2011). In the following sections, each measure is introduced from a general perspective and applied to the specifics of the task of the present study. Early measures are the subject of Section 5.2; later measures are discussed in Section 5.3. Section 5.4 then discusses the issue of describing the reading behaviour of non-native speakers, followed by a brief summary in Section 5.5.

5.2 Early comprehension measures

Early comprehension measures, such as first fixation duration and first gaze duration, are measures considered to reflect lexical activation in reading, i.e. the identification of the target word and its semantic mapping, independent of higher cognitive processes. First fixation duration is the duration of only the initial fixation on a specific word; first gaze duration is the sum of all fixations on a word before the eyes leave that specific word by a (usually rightward) saccade (illustrated in the short sketch below). As mentioned in the introduction to Chapter 5, these data are influenced by a number of factors. There are factors relating to individual differences between readers as well as factors concerning the items read, such as word class, word length and word frequency. Particularly in the case of high-frequency function words there is a parafoveal preview effect, as these words tend to be very short and highly predictable because of syntactic constraints. Parafoveal preview occurs when a word is not fixated but previewed in the parafoveal field of vision. Activation of shorter and highly constrained words due to parafoveal preview is therefore more likely than identification and activation of longer words or words that are less constrained and have a higher number of possible alternatives. This preview effect also correlates with the probability of skipping the word altogether, that is, not fixating on the specific word but skipping it in favour of the next, non-previewed word. In addition to the skipping of high-frequency function words, a greater percentage of skipping has also been shown for highly predictable content words, for example in the case of terminal words in formulaic sequences (Underwood et al. 2004). These latter effects in particular are important for the early processing measures in the current analysis, because both interference collocations and significant English collocations might be expected to have similar effects on skipping as the items described by Underwood et al. (2004). Part of the underlying hypothesis therefore reasons that more advanced learners might experience considerable facilitation effects from significant collocations while less advanced learners might do so only to a lesser extent, and both types of speakers might also experience some facilitation effects from collocational interferences, at least when compared to the semantically incongruous controls (see Chapter 7).
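As a minimal illustration of how these two early measures are derived from an eye-tracking record, the following Python sketch computes them from a hypothetical, simplified fixation log (word indices and durations are invented for the example; real analysis software records far richer data):

# Hypothetical fixation log for one trial: (word_index, duration_ms), in temporal order
fixations = [(0, 210), (1, 185), (1, 140), (3, 230), (2, 175), (3, 95)]

def first_fixation_duration(fixations, word):
    """Duration of only the initial fixation on the word."""
    for w, dur in fixations:
        if w == word:
            return dur
    return None  # the word was skipped

def gaze_duration(fixations, word):
    """Sum of all fixations on the word before the eyes first leave it."""
    total, seen = 0, False
    for w, dur in fixations:
        if w == word:
            total += dur
            seen = True
        elif seen:  # first saccade away from the word ends the first pass
            break
    return total if seen else None

print(first_fixation_duration(fixations, 1))  # 185
print(gaze_duration(fixations, 1))            # 185 + 140 = 325
print(first_fixation_duration(fixations, 4))  # None: word 4 was skipped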


In addition to the influence of parafoveal previews, Rayner and Duffy (1986) illustrate the effects of word frequency, verb complexity and lexical ambiguity on early fixation duration measures. They find a facilitation effect for high-frequency words concerning first fixation duration and gaze duration that is not restricted to the specific area of interest of the target word, but also carries over to the position directly to the right of the target. This effect is potentially problematic for the present analysis, because the selection of input stimuli is based on learner (L2) and native speaker corpus data and a strict control for word frequency is not possible: non-native corpora are too small to give reliable information on word frequency in non-native lexical use and, in the case of interference collocations, word frequency in the L1 would also have to be taken into account. This is one of the reasons for using both median and mean data for the respective groups, because the median data should compress these effects at least on a within-group level, while the mean data is more likely to illustrate specific differences (see Section 7.4.3). Concerning verb complexity and verb semantics (causative vs. factive vs. negative verbs), Rayner and Duffy (1986) found no effect on first fixation duration and gaze duration; lexical access therefore does not appear to be dependent on verb class. The same holds true for lexical ambiguities, in that non-ambiguous controls were not accessed faster than the ambiguous test cases. For the present study it should be remembered, however, that we are dealing with parallel lexical access to two language systems and that possible ambiguities are not restricted to one language. Libben and Titone (2009) report interference effects of interlingual homographs as well as facilitation effects of interlingual cognates; these effects might carry over to possible ambiguities, so that the situation here may be somewhat more complex and not as easily controlled as in a study based on monolingual subjects.

5.3 Late comprehension measures

Section 5.2 dealt with early measures of reading comprehension, i.e. those measures that are taken to reflect early parts of reading comprehension such as word recognition and lexical access. Other measures are frequently referred to as late measures, since they reflect parts of reading comprehension that occur after initial word recognition and lexical access, processes sometimes referred to as higher-order processes (Rayner et al. 1989: SI39). However, Clifton et al. (2007) caution that:




[t]he terms ‘early’ and ‘late’ may be misleading, if they are taken to line up directly with first-stage vs. second-stage processes that are assumed in some models of sentence comprehension […]. Nonetheless, careful examination of when effects appear may be able to shed some light on the underlying processes. Effects that appear only in the ‘late’ measures are in fact unlikely to directly reflect first-stage processes; effects that appear in the ‘early’ measures may reflect processes that occur in the initial stages of sentence processing […]. (Clifton et al. 2007: 349)

Typical late measures are second pass time measures and total reading times, where second pass time is the time spent on a word (or area of interest) during revisits to the specific area and total reading time is the sum of first pass time and all revisits. Paterson et al. (1999) summarize the combination of early and late measures and their proposed effect on sentence comprehension:

[Q]uite often the region of the text that is predicted to cause processing difficulties is fixated more than once before the eyes move on (or backwards) in the text. As all of these fixations can contribute to total reading time, first-pass reading time is generally considered to be a better measure of the time spent initially to read a word. […] If an effect is obtained for total reading time on a region of text but not for earlier measures, such as first-fixation duration or first pass reading time, then this is generally taken as an indication of the experimental manipulation having a relatively late effect on processing. (Paterson et al. 1999: 726)
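Continuing the simplified fixation-log illustration from Section 5.2 (again with invented data), the following sketch shows how the late measures relate to the first pass: total reading time sums all visits to a word, and second pass time is the remainder once the first pass is subtracted:

# Hypothetical fixation log for one trial: (word_index, duration_ms), in temporal order
fixations = [(0, 210), (1, 185), (2, 230), (1, 150), (2, 120), (3, 190)]

def first_pass_time(fixations, word):
    """Sum of fixations on the word before the eyes first leave it."""
    total, seen = 0, False
    for w, dur in fixations:
        if w == word:
            total += dur
            seen = True
        elif seen:
            break
    return total

def total_reading_time(fixations, word):
    """First pass plus all revisits to the word."""
    return sum(dur for w, dur in fixations if w == word)

def second_pass_time(fixations, word):
    """Time spent on the word during revisits only."""
    return total_reading_time(fixations, word) - first_pass_time(fixations, word)

print(first_pass_time(fixations, 1))     # 185
print(total_reading_time(fixations, 1))  # 185 + 150 = 335
print(second_pass_time(fixations, 1))    # 150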

Studies on the effects of syntactic, semantic and pragmatic factors in reading comprehension describe typical applications of late measure designs (Clifton et al. 2007: 348). Syntactic manipulations include, for example, typical garden path structures where the reader is led up a syntactic garden path, that is, the structure has (at least) two different interpretations and the reader initially (probabilistically) assumes one structure but is then surprised by items further to the right and needs to reinterpret early assumptions about the sentence. For example, the sequence Since Jay always jogs a mile and a half seems like a very short distance to him initially allows the reader to assume that a mile and a half is the object of the verb in the first clause, while in fact it is the subject of the second clause. At the point when the reader encounters the verb form seems in a position where a preposition would (most likely) be expected, a reinterpretation of the sentence is necessary that might lead to longer fixation times on seems as well as to possible revisits of the first parts of the sentence. Semantic factors of reading comprehension are factors related to word meaning beyond initial word recognition. Unlike syntactic factors, the constraints are not on the level of word order and word class but on the level of meaning expectations. That is, an item may be adequate at the syntactic level but still peculiar at the level of meaning. As the present study is concerned with word combinatory possibilities across linguistic systems, the expected effects are mainly on the semantic level.


Keeping in mind that different collocational bases have a different range of possible and probable collocators in different linguistic systems, multilingual speakers with different levels of proficiency in their second language might show differences in their range of collocational expectations. These effects might also be connected to word frequency and salience in the individual mental lexicon and are thus also related to early measures, since word recognition is also facilitated by collocational expectations. A typical early measure that may be influenced by collocational facilitation is, for example, skipping ratio. A typical late measure that is influenced by semantic mismatch would be total reading time. In other words, a reader might skip a highly expected item, especially if it is short, whereas they might revisit a semantically odd item and spend a greater amount of total time on the respective item.10

Pragmatic factors are of only peripheral interest for the current research, but are also typically seen as higher-level factors because contextual match and mismatch most likely require higher-level cognitive functions than word recognition. Pragmatic mismatch might occur, for example, when lexical items are anomalous on the level of register. Highly complex and specific items are, for example, more likely to be expected in academic registers, whereas (certain) personal pronouns are more likely to be expected in narrative. There is, of course, also a length effect between these two types of words, which will be reflected in the early measures, but pragmatic anomalies are more inclined to influence later measures, as the cognitive process involved is of a higher level than basic word recognition.

10. Semantic mismatch and its relevance for the present study is discussed in more detail in Chapter 6.

5.4 Eye-tracking studies for reading comprehension of non-native speakers

As described in Sections 5.2 and 5.3, eye-tracking methodology has been successfully applied, for several decades, to the study of native speaker word and sentence comprehension. However, studies concerning non-native speaker behaviour have only recently been gaining in frequency. Dussias (2010) provides a relatively comprehensive overview of second language sentence processing research. The few studies that have been conducted focus mainly on two different levels: the word level and the level of syntactic processing. Very little research exists concerning semantic factors beyond the level of the word. Notable exceptions are Underwood et al. (2004) and Siyanova-Chanturia et al. (2011), which focus on the processing of formulaic sequences and idioms respectively. For the purposes of the present study, it should be considered that the work of Underwood et al. (2004) and Siyanova-Chanturia et al. (2011) focuses on a comparison of native speaker and non-native speaker data, while the present analysis compares non-native speakers who have different levels of L2 proficiency. However, the results of Underwood et al. (2004) and Siyanova-Chanturia et al. (2011) are discussed in some detail, as there are, to my knowledge, no studies that are more closely related to the focus of the present study.

Underwood et al. (2004) investigate the processing of formulaic sequences of different types, “including lexical phrases (Nattinger & DeCarrico, 1992), transparent metaphors, sayings/proverbs and idioms” (Underwood et al. 2004: 156). They show, inter alia, that native speakers made fewer fixations overall than non-native speakers and that fixation durations were also significantly shorter for the native speaker group. Since the end of a formulaic sequence is more predictable than the terminal item in a non-formulaic combination, they also compared fixation times on terminal words in the sequences. Here they discovered an interesting interaction effect: the native speakers displayed a significant difference between terminal words in formulaic sequences and non-formulaic sequences, whereas this difference was not observed for the non-native speaker group (Underwood et al. 2004: 160–161). Thus, the native speakers had an advantage over the non-native speakers concerning both early and late measures. This in itself is not surprising; the more relevant finding, however, lies in the interaction concerning terminal words of formulaic and non-formulaic sequences. The fixation durations of native speakers on terminal words of formulaic sequences were significantly shorter compared to non-formulaic sequences. It can therefore be hypothesised that these sequences are more salient than non-formulaic sequences or, in fact, even stored holistically. The finding that no comparable significant effect was observed for the non-native speakers suggests that there was no deep entrenchment or holistic storage of these formulaic sequences. These findings are relevant for the current analysis, since a similar disparity might be observed for the different types of collocations (i.e. word combinations that are closely related to formulaic sequences) between the intermediate and the advanced non-native English speakers in the different subject groups.

Siyanova-Chanturia et al. (2011) deal with a similar question by analysing the eye movements of native and non-native speakers to gain insight into possible differences in the processing of idioms with a figurative meaning, idioms with a literal meaning and novel phrases. They focus on three different variables: first pass reading time, total reading time and fixation count. For the native speakers they find no significant effect in the early measure, that is, first pass reading time, between the different phrase types, but a significant effect for the two other variables:


“comparisons for the two late measures revealed that idioms used figuratively and literally were read significantly faster and elicited fewer fixations than novel phrases” (Siyanova-Chanturia et al. 2011: 259). No effect was attested when comparing figurative and literal idioms. For the non-native speakers they also observed no effect in the early measure. Unlike the native speakers, non-native speakers displayed no significant differences concerning idiomatic versus novel expressions. However, on the level of figurative versus literal interpretation of idiomatic expressions, the former were processed more slowly and elicited a greater number of fixations than the latter. Comparing native speakers and non-native speakers directly, they observe:

[…] a significant main effect of Phrase Type, as well as a significant interaction between Phrase Type and Proficiency in the two late measures. This suggests that not only are non-native speakers overall slower than natives, but that the nature of their processing differs. Namely, when native speakers tend to slow down (reading novel strings compared to idioms), non-native speakers do not. On the other hand, where non-natives show a significant processing cost (figurative renderings vs. literal ones), natives do not. (Siyanova-Chanturia et al. 2011: 261)

Siyanova-Chanturia et al.’s (2011) results thus largely agree with the main results of Underwood et al. (2004). They are, however, not as directly relatable to the present study due to their focus on figurative idioms, which are not taken into account in this study. Their data might still prove useful as a reference point for processing times of non-native speakers compared to their native speaker counterparts. It might also provide a yardstick for the native speaker target model, since the current analysis does not involve native speaker participants.

5.5 Measuring eye-movements for the study of language processing and comprehension: A brief summary

Chapter 5 introduced different types of eye-movement measurements that have been shown to be related to language processing in reading comprehension. There are two different types of measurement, related to different types of cognitive tasks. Early measures, such as first fixation duration, are related to word recognition and lexical activation. In contrast, late measures, such as total reading time, are more closely related to higher-level processes such as syntactic parsing and semantic and/or pragmatic form-meaning mappings. The vast majority of the literature deals exclusively with native speaker data, and studies that include data from non-native speakers have gained attention only relatively recently. These studies frequently compare native speaker processes to non-native speaker processes but do not focus on different levels of proficiency of non-native speakers.




If native-like competence is the target of the language learner, there are two main points of interest: firstly, in which areas do intermediate and advanced learners differ and, secondly, do the cognitive processes of advanced learners more closely resemble those of native speakers? The present analysis focuses on the first of these questions, because no native speaker participants were involved in the research. Furthermore, with its focus on interference collocation, it is also interested in what could be called a ‘bilingual collocation mental lexicon’, as it tests for the salience of collocational L1 and L2 structures and their influence on reading comprehension in L2 reading. Chapter 6 further examines these points and relates them to the second type of psycholinguistic data related to language comprehension and reading comprehension, that is, EEG and event-related potential data.

Chapter 6

Processing semantic mismatch and unexpected lexical items

6.1 Introduction

Over the course of the past two decades there has been an increasing interest in the study of language processing and language comprehension within the field of cognitive neuroscience. As discussed in Chapter 3 (especially in Section 3.5), two main methods are used when investigating comprehension and processing phenomena on a level below conscious awareness. The first of these methods, fMRI-based neuroimaging, does not play a role in the present study, as this methodology has numerous disadvantages for the current research, low temporal resolution and extremely high demands on laboratory settings being among the most important. For these reasons the present work uses electroencephalographic (EEG) data, focusing specifically on event-related potentials, that is, “[t]he electrical activity of the brain time-locked to the presentation of a stimulus […], [which] has been shown to be sensitive to a variety of sensory and cognitive processes” (Friederici et al. 1993: 184). In a seminal article, Friederici et al. (1993) illustrate how the EEG patterns of test subjects differ in relation to specific types of unexpected stimulus items and their relevant expected control items. These different types of unexpected stimuli can be defined in terms of semantic, morphological and syntactic mismatch, as explained in Section 3.5. The present work focuses mainly on semantic mismatch, where the averaged EEG data frequently display a “negative going effect in the time domain at around 400ms with a broad distribution over both hemispheres” (Friederici et al. 1993: 190), in the following called the N400 effect. Although other ERP components, such as an early negative component (N1), associated with initial attention, and an early positive component (P2), are compared in the analysis in Chapter 9, the N400 component (and the related N300 component, which is specifically relevant with respect to visual presentation) is the component most frequently associated with semantic mismatch and the unexpected occurrence of lexical items. As collocational transfer errors and the processing of semantically incongruous items, the main objects of inquiry, are expected to be most influential at the level of the N400 component, Chapter 6 focuses mainly on this component.
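As a minimal illustration of what time-locked averaging means in practice, the following Python sketch computes an ERP and a mean amplitude in an N400-type window from simulated data (the 300–500ms window, the 500 Hz sampling rate and the noise-only epochs are assumptions for the example, not the settings of the present study):

import numpy as np

# Hypothetical epoched EEG for one electrode: 40 trials, epochs from -200ms to +800ms
rng = np.random.default_rng(0)
sampling_rate = 500                                   # Hz (assumed)
times = np.arange(-0.2, 0.8, 1 / sampling_rate)       # seconds relative to stimulus onset
epochs = rng.normal(0.0, 5.0, size=(40, times.size))  # simulated single-trial EEG (in microvolts)

# Time-locked averaging: the ERP is the mean over trials at each sample
erp = epochs.mean(axis=0)

# Mean amplitude in an N400-type window (here 300-500ms post-onset)
window = (times >= 0.3) & (times <= 0.5)
n400_mean_amplitude = erp[window].mean()
print(f"Mean amplitude 300-500 ms: {n400_mean_amplitude:.2f} microvolts")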


In order to relate ERP findings to the concepts of collocation and interference collocation, Section 6.2 focuses on different mismatch types, arguing that interference collocations are a more moderate type of mismatch than mismatches where the bogus collocator is incongruent both semantically and with regard to frequency statistics. Section 6.3 investigates the application of these research methods when the participants are non-native speakers. Section 6.4 summarizes these findings and illustrates their importance for the current research.

6.2 Interference collocations between semantic mismatch and expectation

As has been previously discussed, the N400 ERP component is highly sensitive to the presentation of semantically unexpected stimuli. It has been shown to be more pronounced when the semantic mismatch is strong and less pronounced when the mismatch is more moderate. Kutas and Hillyard (1980: 203) illustrate this with the example sentences ‘He took a sip from a waterfall’ in contrast to ‘He took a sip from the transmitter’. In the first example the mismatch is moderate, as it is theoretically possible to drink from a waterfall, although, specifically in combination with sip (indicating a very small amount of liquid being ingested), this combination is unlikely to occur in natural language. The second example, in contrast, is semantically completely incongruous, as transmitters are not generally defined as liquid-filled objects. Although they do not provide information on the semantically correct control inputs, it can be assumed that, in these cases, the adverbial verb complement would be similar to the instances that have been described by the term free combination (see Section 4.3). That is, in this case the noun phrase of the adverbial would contain a lexical noun that denotes a typical container, such as a cup, glass or bottle. As expectations of stimuli play a role in the processing of mismatch, collocations may differ slightly from typical free combinations, both in terms of frequency effects and in terms of priming effects. Kutas and Hillyard (1984) illustrate the effect of expectations in terms of two parameters: contextual constraint and Cloze probability. Contextual constraint is the limitation the semantics of the sentence place on the paradigm of expectable target words, whereas “Cloze probability is defined as the proportion of subjects using [… a specific] word to complete a particular sentence” (Kutas & Hillyard 1984: 161). The following examples (from Kutas & Hillyard 1984) illustrate these two dimensions:

(8) He mailed the letter without a stamp. (high contextual constraint/high Cloze probability)
(9) The bill was due at the end of the hour. (high contextual constraint/low Cloze probability)
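Cloze probability itself is straightforward to compute from norming data; the following minimal sketch (with invented responses for the sentence frame in (8)) illustrates the definition quoted above:

from collections import Counter

def cloze_probabilities(completions):
    """Proportion of subjects completing the sentence frame with each word."""
    counts = Counter(word.lower() for word in completions)
    total = len(completions)
    return {word: count / total for word, count in counts.items()}

# Hypothetical norming responses for "He mailed the letter without a ___"
responses = ["stamp"] * 18 + ["thought"] + ["care"]
print(cloze_probabilities(responses))  # {'stamp': 0.9, 'thought': 0.05, 'care': 0.05}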




In example (8) semantic and morphosyntactic restrictions limit the possibilities of selection to a very narrow set of words, as there are not many items/objects a letter can lack, and the item address is eliminated by the a-allomorph of the indefinite article. This is matched by a high Cloze probability, that is, a very high proportion of subjects finished the sentence with the item stamp. In example (9), by contrast, we also find a strong contextual restriction, as members of the set periods of time would be part of the expected paradigm, but a different member of this set is anticipated by the Cloze probability test subjects (possibly week or month).

When viewing the concept of collocation along the two dimensions of contextual constraint and Cloze probability, contextual constraint is of less importance for highly idiomatic expressions that are semantically opaque. Frequency-based collocations, on the other hand, usually score high on both levels; that is, their relative transparency usually conforms with the type of contextual constraint, and the strength and frequency of the collocation make them highly predictable in terms of Cloze probability. Howarth (1998: 28) illustrates the continuum of idiomaticity of collocating items:

(10) under the table
(11) under attack
(12) under the microscope (figurative)
(13) under the weather

Part of the contextual constraint of the preposition under is that it prototypically requires a relatively concrete noun, as is the case in (10), which Howarth (1998) refers to as a free combination. However, it is relatively difficult to form an expectation as to which noun might follow under, so that table would only be expected in sentences that profile subjects that can typically be found under tables or that collocate with table (i.e. pieces of furniture). Example (11), under attack, is a typical restricted collocation; in this case the meaning of under is figurative. Concerning expectations, it is interesting to note that the figurative meaning of under is typical when looking at under’s most frequent collocates (the top nominal collocates of under in the BNC include pressure, control, section and circumstances, all of which are primarily used figuratively). Due to the discrepancy between the prototypical meaning of under and the high frequency and collocational strength of figurative meanings, contextual constraint and also Cloze probabilities for these restricted collocations might be relatively low and depend on other parts of the sentence. However, expectations concerning the occurrence of a non-figurative combination or a restricted combination are not necessarily fulfilled. Both literal and figurative meanings might be quite possible and likely, as is shown in (14) and (15):


(14) Israeli soldiers came under attack in the northern Har Dov area near the border with Lebanon on Wednesday morning. (Times of Israel 28.01.2015)
(15) An artillery shell struck here at breakfast time, but by now the base was on red alert, its soldiers under cover. (BNC: K6G 118)

In (14) the predictability of attack is primarily based on the verb and less so on the noun, as the non-figurative use of under in combination with soldiers in (15) illustrates. In other words, even if the noun phrase in subject position denotes somebody who is frequently under attack, it is still not easy to predict the figurative expression with certainty. For this to be the case, other contextual constraints, such as the verb come in (14), are needed.

Example (12), under the microscope, is even more complex. Howarth (1998) uses this as an example of a figurative idiom, meaning that the noun microscope is meant in a figurative sense. The border between restricted collocations such as under attack and figurative idioms is, however, very permeable. Looking at the use of under attack in the BNC, it can be seen that under attack is used in a figurative sense far more frequently than in the sense of being under actual physical attack, as Example (16) illustrates:

(16) Traditional English teaching is also under attack from the rainbow coalition of the left. (BNC: A1A 1091)

There is no contextual restraint preventing a relatively abstract concept from coming under attack figuratively, and it is only the frame [NP Vcop (MOD) under _____ (PP-from (NP))] that makes the item attack predictable; without the prepositional phrase, other figurative items such as pressure would also be possible, taxing Cloze probability. The final example, (13) under the weather, is referred to as a pure idiom by Howarth (1998). These combinations should be most easily predictable for native speakers, as there is little variability within pure idioms. However, there still needs to be a certain amount of contextual constraint to activate and predict the purely idiomatic meaning, as with the verb feel in (17).

(17) I feel a bit under the weather. (BNC: CR6 3726)

Figure 6.1 from Kutas & Hillyard (1984) illustrates how differences in contextual restriction and Cloze probability are related to the N400 component in averaged EEG data.

Figure 6.1  Brain potentials in relation to contextual constraint, Cloze probability and semantic relatedness (Kutas & Hillyard 1984: 152). [Panel A lists sample sentences with their constraint/Cloze ratings, e.g. He mailed the letter without a stamp (hi/hi), The bill was due at the end of the hour (hi/lo), She locked the valuables in the safe (med/hi), The dog chased our cat up the ladder (med/lo), He was soothed by the gentle wind (lo/lo); panels B and C plot the corresponding averaged waveforms (0–600ms), panel C contrasting best completions (Don’t touch the wet paint; He liked lemon and sugar in the tea) with unrelated (… the wet dog) and related (… in his coffee) completions.]

The left panel in Figure 6.1 illustrates how the N400 component is influenced by different degrees of contextual constraint and Cloze probability. From the data, Kutas and Hillyard (1984: 162) draw the following conclusion:

The systematic decline in the N400 amplitude as a function of Cloze probability indicates that semantic incongruity is not a necessary condition for N400 elicitation. Instead N400 amplitude appears to vary systematically as an inverse function of word expectancy, operationally defined here in terms of Cloze probability. (Kutas & Hillyard 1984: 162)

The right panel in Figure 6.1 illustrates the processing differences between the most expected (best) completion of a sentence and the alternatives of semantically unrelated (paint vs. dog) and semantically related sentence completions (coffee vs. tea). Kutas and Hillyard (1984: 162) ascribe this effect to semantic priming:

The influence of context on word recognition has been attributed to the automatic priming or activation of semantic networks […]. Within such a framework a sentence fragment primes (that is, activates for faster access) semantically related words whether or not they form acceptable sentence completions. […] [The] results are in agreement with the hypothesis that the N400 component reflects the extent to which a word is semantically primed rather than its being a specific response to contextual violations. (Kutas & Hillyard 1984: 162–163)


Broadening this perspective, Molinaro and Carreiras (2010) also take into account the expectancy differences between highly idiomatic figurative collocations and literal collocations. Based on data from a group of thirty-six Spanish native speakers, they account for differences in the processing of literal and idiomatic collocations as well as near-synonymous substitutes of one item within the respective collocation. They show that in both cases a priming effect of the collocation exists; that is, the level of expectation towards the occurrence of the collocating item influences processing and results in an N400 effect, although the near-synonym is not semantically incongruous. Figure 6.2 illustrates these differences for frontal, central and parietal electrodes.

Figure 6.2  Differences in the processing of literal and figurative collocations (Molinaro & Carreiras 2010: 183). [Difference waves (–200 to 600ms) and mean amplitude differences for figurative and literal collocations at the frontal (Fz), central (Cz) and parietal (Pz) electrode sites, with P300 and N400 components marked.]




The difference concerning the N400 component when comparing literal and idiomatic collocations, that is, the N400 effect in the processing of the near-synonymous substitute of the fixed collocation compared to the collocation baseline, is most markedly observable at the frontal electrodes, while central and parietal areas are less strongly affected. However, as has been shown in the discussion of Examples (10)–(17), it is not easy to completely separate figurative from non-figurative collocations in natural language data. In other words, when basing the stimuli on performance-based corpus data, it is not often possible to draw as exact a dividing line between figurative and idiomatic collocations as is the case in the stimuli underlying Molinaro and Carreiras’ (2010) study. In addition, it is possible that the differences between electrode sites that Molinaro and Carreiras (2010) show are less notable in the case of non-native speakers.

So far the works under discussion have focused solely on the expectations of native speakers and their reactions to unexpected sentence elements. However, non-native speakers tasked with the processing of L2 stimuli might well have a different set of expectations concerning the co-occurrence of items, because the two distinct language systems could influence priming and activation. Lexical activation in bilingual and multilingual speakers is probably different from lexical activation in monolingual speakers (Kroll & Tokowicz 2005); the set of lexical expectations might therefore be influenced cross-systematically. The Revised Hierarchical Model (RHM, see Figure 6.3) of bilingual lexical activation proposed by Kroll and Stewart (1994) suggests that “lexical access is non-selective with respect to language” (Kroll et al. 2010: 374). Furthermore, Sunderman and Kroll (2006) provide evidence in support of the hypothesis that the activation of the translation equivalent of lexical items in the L1 depends on proficiency in the L2 (Kroll et al. 2010: 375).

Figure 6.3  The Revised Hierarchical Model (RHM) of Bilingual Lexical Activation (Kroll & Stewart 1994 as adapted in Sunderman & Kroll 2006: 392). [The L1 and L2 lexicons are connected by bidirectional lexical links, and each lexicon is connected to a shared conceptual store by conceptual links.]


The RHM as depicted in Figure 6.3 was originally proposed to account for the translation asymmetry between forward translation (L1 to L2) and backward translation. Kroll and Stewart (1994) explain:

According to the model, both lexical and conceptual links are active in bilingual memory, but the strength of the links differ as a function of fluency in L2 and relative dominance of L1 to L2. […] L1 is represented as larger than L2 because for most bilinguals, even those who are relatively fluent, more words are known in the native language than in the second language. Lexical associations from L2 and (sic) L1 are assumed to be stronger than those from L1 to L2 because L2 to L1 is the direction in which second language learners first acquire the translations of new L2 words. The links between words and concepts, however, are assumed to be stronger for L1 than for L2. (Kroll & Stewart 1994: 157–158)

Based on this theoretical framework, Sunderman and Kroll (2006) focus on the influence of language proficiency with regard to the conceptual links between L1 and L2 items and their respective concepts. They show that English L1 speakers with Spanish as their L2 differ in lexical recognition times for different types of distractors, that is, lexical neighbours (cara – card), translation neighbours (cara – fact) and meaning-related words (cara – head). Generally, lexical neighbours can be defined orthographically or phonologically and may occur in the same language or across languages. In visual word recognition they are usually defined orthographically, so that both “work and cord are neighbors of cork” (Dijkstra 2005: 185), as each differs in one letter from its neighbour/competitor. The pair cara and card is an example of cross-lingual lexical neighbours, as the two words differ in only one letter but are part of two different language systems. Translation neighbours, on the other hand, are lexical neighbours of the translation of the original word: in the example, fact is a lexical neighbour of face, which is the English translation of the Spanish noun cara. Meaning-related words involve semantic relations of the translation, so that in the example head is in a holonymic relationship to face, the translation of cara. While both less proficient and more proficient speakers display interference latencies for lexical neighbours and meaning-related distractors, more proficient learners display almost no interference effect for translation neighbours (Sunderman & Kroll 2006: 417–418). Based on these results, they assume that “the translation equivalent in L1 is salient during early stages of L2 acquisition” (Sunderman & Kroll 2006: 418), supporting the hypothesis of the RHM that advanced L2 speakers need not draw on the lexical translation link to activate the conceptual link, whereas less advanced speakers activate L2 concepts via translation.
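The orthographic definition of neighbourhood is easy to operationalize. The following minimal sketch implements the one-letter-substitution criterion described by Dijkstra (2005); the function name and the test pairs beyond those cited above are my own:

def is_orthographic_neighbour(word_a, word_b):
    """Orthographic neighbours: same length, differing in exactly one letter."""
    if len(word_a) != len(word_b):
        return False
    differences = sum(1 for a, b in zip(word_a.lower(), word_b.lower()) if a != b)
    return differences == 1

# Within one language and across languages (Spanish cara vs. English card)
print(is_orthographic_neighbour("cork", "cord"))  # True
print(is_orthographic_neighbour("cork", "work"))  # True
print(is_orthographic_neighbour("cara", "card"))  # True: cross-lingual neighbours
print(is_orthographic_neighbour("cara", "face"))  # False: a translation, not a neighbour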




With regard to interference collocations, it is likely that the collocational profile of a lexical item is part of its conceptual range. If this is the case, then the difference in bilingual lexical activation between less proficient and more proficient L2 learners should be reflected in their expectancies towards the co-occurrence of words. As has been shown in Section 4.2, the collocational range of items differs between language systems, and expectations of collocations will therefore also differ. An illustration of this difference in collocational range can be seen in the learner-interference collocation in Example (18):

(18) […] when you yourself will have ?founded a family, you will give your children as much love as you have […]  (ICLE-GER: GESA 5021)

The German native speaker in (18) creates the interference collocation found + family because found is the direct translation of the German verb gründen, which is used in combination with Familie in the same sense in which English native speakers would use the verb start. The expected object paradigm for gründen therefore includes Familie (family) in German, but the same does not hold for found in English. If this expectancy is carried over to the L2, non-native speakers may display cross-system expectancies, leading to a reduced N400 effect for items that are direct translations of items expected to occur in the L1, so that (19) and (20) could be processed in a similar way in both languages:

(19) a. John und Mary gründen eine Firma.
     b. John and Mary found a company.

(20) a. John und Mary gründen eine Familie.
     b. ?John and Mary found a family.

It would, therefore, seem plausible to assume that the level of expectation for the occurrence of an L2 target collocate increases with the learner's proficiency in the L2. That is, deeply entrenched native-like collocations in the learner's L2 lexicon are more likely to be expected by the learner than those with a low level of entrenchment. If there is a competing L1 collocation that is deeply entrenched in the learner's L1, its direct translation might be more strongly expected than the less familiar, weakly entrenched target collocation. Bearing in mind the findings of Sunderman and Kroll (2006) regarding the differences in conceptual activation between more proficient and less proficient L2 speakers, more proficient L2 speakers should be more likely to activate the collocational range of the L2 item, while less proficient L2 speakers rely on the translation and are therefore likely to expect the non-collocating translational equivalent rather than the correct target collocation. Following this reasoning, it may be assumed that more advanced learners display a stronger N400 effect for Example (20b) than less advanced learners, because less advanced learners might activate the collocational range of gründen rather than the collocational range of found.
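Entrenchment is treated here as a theoretical notion, but in corpus-linguistic practice the strength of a collocation is commonly operationalized through an association measure over co-occurrence counts (cf. the quantitative approaches to collocation discussed in Section 4.3). The Python sketch below computes pointwise mutual information (PMI) for verb + object pairs; all frequencies are invented placeholders, serving only to illustrate why start + family should generate a far stronger expectation for an English reader than found + family:

    import math

    # Hypothetical corpus counts, invented for illustration only.
    N = 1_000_000  # total number of verb-object pairs in the toy corpus
    pair_freq = {("start", "family"): 120, ("found", "family"): 1,
                 ("found", "company"): 80}
    verb_freq = {"start": 4_000, "found": 900}
    noun_freq = {"family": 6_000, "company": 5_000}

    def pmi(verb: str, noun: str) -> float:
        """Pointwise mutual information: log2( p(v,n) / (p(v) * p(n)) )."""
        p_pair = pair_freq.get((verb, noun), 0.5) / N  # 0.5: crude smoothing for unseen pairs
        return math.log2(p_pair / ((verb_freq[verb] / N) * (noun_freq[noun] / N)))

    for v, n in [("start", "family"), ("found", "company"), ("found", "family")]:
        print(f"{v} + {n}: PMI = {pmi(v, n):.2f}")

With these toy counts, start + family and found + company come out with clearly positive PMI, whereas found + family scores negative, i.e. it co-occurs less often than chance would predict: one plausible corpus-based correlate of the low entrenchment, and hence low expectancy, hypothesized above.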


It should also be borne in mind, however, that the analysis of Sunderman and Kroll (2006) differs slightly from the approach taken in the present work. Firstly, they looked at reaction times to isolated word pairs. While this approach offers insights into the differences between L2 speakers on the level of lexical activation of single items, the processing of larger strings of language could be influenced by a large number of additional factors, for example, constraints on the morphosyntactic level. That is, although expectancies might well be influenced by the activation of lexical items and their collocational range in the different language systems, other factors will also play a role in the participants' level of expectation. Furthermore, the L2 speakers in Sunderman and Kroll's (2006) study are relatively heterogeneous, ranging from speakers with 3 semesters of classroom Spanish to speakers with 16 semesters of Spanish (Sunderman & Kroll 2006: 397). These speakers are divided into only two groups, and the higher-proficiency group is far more heterogeneous than the lower-proficiency group. While this may be acceptable for a reaction-time-based study, heterogeneous groups are less advisable for EEG studies, since the data need to be averaged over a set of stimuli and over groups of participants. These issues are further investigated in Section 6.3.

6.3  EEG/ERP studies with non-native speaker subjects

While the neurophysiological reactions of native speakers to semantic mismatch have been in focus for quite some time, bilingual speakers and language learners have only recently been given more attention within this framework. One incentive for including bilingual speakers and learners is to account for age-based differences in language acquisition, one of the central topics within psycholinguistics over the past five decades. One of the earlier studies that included these speakers (Weber-Fox & Neville 1996) focused mainly on the age of earliest exposure of (in this case) Chinese/English bilinguals in order to shed light on questions of maturational constraints on the development of the neural systems relevant for language (Weber-Fox & Neville 1996: 231). By comparing the ERP patterns of monolinguals and different groups of bilinguals, stratified by age of exposure, they showed that, when exposed to semantic mismatch, the reactions of bilinguals with earlier exposure (age bands 1–3, 4–7 and 7–10) differed significantly from those of bilingual speakers who were exposed to the second language later in life (age bands 11–13 and >16):

    Also no differences were found for the peak latencies of responses for the 1–3, 4–7 and 7–10 bilinguals compared to monolinguals, however peak latencies of the responses of the 11–13 and >16 groups were longer compared to monolinguals.  (Weber-Fox & Neville 1996: 240)




Drawing on the earlier results of Ardal et al. (1990), who linked ERP latencies to fluency in the second language, Weber-Fox and Neville (1996) interpret their findings as being related to fluency and language proficiency:

    Previous investigators have linked the N400 latency to fluency (Ardal et al. 1990, Kutas & Kluender 1993). We did not measure fluency directly, however our measure of self-rated proficiency was consistent with the idea that 'fluency' may be important in determining N400 latency. The bilinguals who were exposed to English [in the 11–13 and >16 age bands] indicated reduced proficiency in speaking English and both of these groups showed a later N400 peak latency.  (Weber-Fox & Neville 1996: 249)

These results, as well as results from further testing that included syntactic processing, strengthen the view of sensitive periods for second language acquisition that are differentiated across subsystems specialized in processing different language aspects (Weber-Fox & Neville 1996: 231). A more recent approach to the description of language processing in post-childhood language learners is Ojima et al. (2005), who use Weber-Fox and Neville's (1996) observations of differences in processing between early-exposure and late-exposure groups as a starting point. As Weber-Fox and Neville (1996) did not describe significant differences between early-exposure learners and monolinguals at the level of peak latencies, Ojima et al. (2005) focus specifically on a group of late-exposure learners. Since language proficiency was deemed an explanatory factor for ERP variation in earlier studies, this group was further sub-divided according to proficiency: two groups of Japanese speakers with intermediate and advanced English language proficiency were created, alongside a native-speaker control group. Ojima et al. (2005) tested reactions to both syntactic and semantic violations, the results for semantic violations being of greater relevance for the present discussion. Using subtraction N400 waveforms, that is, waveforms computed "by subtracting the waveform in the congruous condition from that of the incongruous condition" (Ojima et al. 2005: 1214), they showed a significant group effect "confirming that the subtraction N400 peaked earliest in ENG [English native speakers] and latest in J-Low with that of J-High in between" (Ojima et al. 2005: 1214).11 Figure 5.3 illustrates the differences between the three groups for the subtraction N400 in the semantically incongruous condition when compared to the congruous condition.
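The subtraction logic can be stated compactly in code. The following NumPy sketch uses simulated single-trial epochs (all array shapes and values are invented for illustration; this is not the analysis pipeline of Ojima et al. 2005): trials are averaged per condition, the congruous ERP is subtracted from the incongruous one, and the peak latency of the resulting difference wave is read off within a typical N400 window:

    import numpy as np

    rng = np.random.default_rng(0)

    # Simulated single-trial EEG epochs: 40 trials x 500 samples per condition,
    # spanning -100..900 ms around the onset of the critical word.
    times = np.linspace(-100, 900, 500)  # ms
    congruous = rng.normal(0.0, 1.0, (40, 500))
    incongruous = rng.normal(0.0, 1.0, (40, 500))
    # Add a schematic negative deflection around 400 ms to the incongruous trials.
    incongruous += -2.0 * np.exp(-((times - 400.0) ** 2) / (2 * 60.0 ** 2))

    # ERPs: average over trials, then form the subtraction (difference) wave.
    erp_congruous = congruous.mean(axis=0)
    erp_incongruous = incongruous.mean(axis=0)
    difference_wave = erp_incongruous - erp_congruous

    # Peak latency of the subtraction N400: most negative point in 300-500 ms.
    window = (times >= 300) & (times <= 500)
    peak_latency = times[window][np.argmin(difference_wave[window])]
    print(f"subtraction N400 peak latency: {peak_latency:.0f} ms")

Comparing such peak latencies across participant groups is what yields the ENG < J-High < J-Low ordering reported above, and it also makes clear why heterogeneous groups are problematic: the averaging steps presuppose that trials and participants within a group are comparable.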

11. ENG vs. J-High, p = .001; ENG vs. J-Low, p